pfritz

November 29, 2008

The Lists and their quirks

Filed under: Uncategorized — pfritz @ 22:24

When I started to write this article I planed to write about the API of the two major list implementations and their weak points. But since this article is already long enough, I only focusing on evas_list. Don’t worry, an article about ecore_list will follow.

Both API are widely used so you should assume that they are proved to be suited for daily use. And that is right in most of the cases where I have seen them to be used. But they have some weak points, which I want to show you here. Creating a list is in both cases very simple, where it gets hairy is removing items from the list. Evas_List (now Eina_List) doesn’t have a real destroy function, so you have to figure out the best way to free a complete list.

But this is an relative easy job and you’ll find thousands of examples in the efl. The real tricky point is if you do selective destruction, i.e. you remove a list item under some conditions.

Let’s say we have a list of filenames and want to remove the files that doesn’t have the extension “.jpg”. This shouldn’t be a hard job you think, shouldn’t it? Well it could be easier.

Let’s start with Eina_List. I mimic here the unaware new user, of course I know already how to do it right. We start with the normal approach. Iterate over all items and remove the items we don’t like.

#include <Eina.h>
#include <Ecore_Str.h>
#include <string.h>
#include <stdio.h>

int
main(void)
{
        Eina_List *list = NULL;
        Eina_List *l;
        const char *file;

        eina_init();

        list = eina_list_append(list, "dsc01.jpg");
        list = eina_list_append(list, "blah");
        list = eina_list_append(list, "dsc02.jpg");
        list = eina_list_append(list, "dsc03.jpg");
        list = eina_list_append(list, "dsc04.jpg");
        list = eina_list_append(list, "dsc05.jpg");
        list = eina_list_append(list, "foooo");
        list = eina_list_append(list, "dsc07.jpg");
        list = eina_list_append(list, "dsc08.jpg");
        list = eina_list_append(list, "dada");
        list = eina_list_append(list, "dsc10.jpg");

        EINA_LIST_FOREACH(list, l, file)
        {
                if (!ecore_str_has_extension(file, ".jpg"))
                        list = eina_list_remove(list, file);
        }

        /* and now print the rest */
        while (list)
        {
                file = eina_list_data_get(list);

                printf("%s\n", file);
                list = eina_list_remove(list, file);
        }

        eina_shutdown();
        return 0;
}

This doesn't look that bad on the first look and if you test it does indeed work. But it is wrong! Let me first explain why it is wrong, I'll explain you later why it does work nevertheless. Let's take a look on what the EINA_LIST_FOREACH-macro does. Here is the same loop but the macro expanded:

        for (l = list, file = eina_list_data_get(l); l; l = eina_list_next(l),
                        file = eina_list_data_get(l))
        {
                if (!ecore_str_has_extension(file, ".jpg"))
                        list = eina_list_remove(list, file);
        }

eina_list_remove() will, remove the node where l is pointing to and frees the node. So with the following l = eina_list_next(l) we are accessing an already freed node. But why doesn't it crash? Or why doesn't valgrind show any error? Because eina uses an memory pool, to reduce the free()s and malloc()s, so the given node isn't in fact freed. Ok, but if eina is working like that, why shouldn't we take it as a hidden feature, since it does the job as we need it?

Imagine we want now not only remove the non-jpg-files, but want to put them in a second list:

#include <Eina.h>
#include <Ecore_Str.h>
#include <string.h>
#include <stdio.h>

int
main(void)
{
        Eina_List *list = NULL;
        Eina_List *list2 = NULL;
        Eina_List *l;
        const char *file;

        eina_init();

        list = eina_list_append(list, "dsc01.jpg");
        list = eina_list_append(list, "blah");
        list = eina_list_append(list, "dsc02.jpg");
        list = eina_list_append(list, "dsc03.jpg");
        list = eina_list_append(list, "dsc04.jpg");
        list = eina_list_append(list, "dsc05.jpg");
        list = eina_list_append(list, "foooo");
        list = eina_list_append(list, "dsc07.jpg");
        list = eina_list_append(list, "dsc08.jpg");
        list = eina_list_append(list, "dada");
        list = eina_list_append(list, "dsc10.jpg");

        for (l = list, file = eina_list_data_get(l); l; l = eina_list_next(l),
                        file = eina_list_data_get(l))
        {
                if (!ecore_str_has_extension(file, ".jpg"))
                {
                        list = eina_list_remove(list, file);
                        list2 = eina_list_append(list2, file);
                }
        }

        /* and now print the rest */
        printf("first list:\n");
        while (list)
        {
                const char *file = eina_list_data_get(list);

                printf("\t%s\n", file);
                list = eina_list_remove(list, file);
        }

        /* and now print the rest */
        printf("second list:\n");
        while (list2)
        {
                const char *file = eina_list_data_get(list2);

                printf("\t%s\n", file);
                list2 = eina_list_remove(list2, file);
        }

        eina_shutdown();
        return 0;
}

And suddenly it doesn't work anymore. The reason is pretty simple the removed node is put into the memory pool, but then before we iterate to the next node it is appended to the second list. So in fact we are looping after the first remove over the second list.

So how can we do it right? We need to go to the next node before we are removing the node, i.e. we need to keep a reference of the node that is to be removed and do the next step:

        for (l = list; l;)
        {
                Eina_List *tmp = l;

                l = eina_list_next(l);
                file = eina_list_data_get(tmp);
                if (!ecore_str_has_extension(file, ".jpg"))
                {
                        list = eina_list_remove(list, file);
                        list2 = eina_list_append(list2, file);
                }
        }

Nice that works! So are we finally done? Not quite. We are using eina_list_remove(). Although there is nothing wrong with this function, you should avoid to use it, because it does a linear search to find the node to remove, i.e. the complexity is O(n). Better use eina_list_remove_list(). Unlike the name suggests, it doesn't remove a list, but it removes a list node, actually exactly what we want.

        for (l = list; l;)
        {
                Eina_List *tmp = l;

                l = eina_list_next(l);
                file = eina_list_data_get(tmp);
                if (!ecore_str_has_extension(file, ".jpg"))
                {
                        list = eina_list_remove_list(list, tmp);
                        list2 = eina_list_append(list2, file);
                }
        }

Puh, that was harden then expected, but it was doable, and we even kept it to be an O(n) algorithim. The actually bad thing was that our first shot worked by accident. So keep your eyes open, if you have ever to do a selective remove with eina_list.

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.