How to cope with slow vector<string> destructor?

Is there a well-known solution to the following problem?

  • You have a vector of a lot of strings

  • You happily populate it with a few hundred thousand strings, and it's fast

  • You manipulate your strings in arbitrary ways; life is good.

  • You're done with the vector; vector goes out of scope, and now you have to go grab some coffee and sit back while each string gets destroyed one-by-one.

Edit: Problem solved!

I just ran the code below on Linux on the same computer and it was fine, which led me to figure out the solution. It turned out to be with my system -- something I'd caused myself, a long time ago, but which I'd forgotten. Upon fixing the problem, the time decreased dramatically, to even better than with GCC!

It's a good puzzle though, so instead of posting the answer, I'll do something else: I'm not allowed to place a bounty on this question right now, but if you think you know the cause, give it a shot. If it's correct, I'll accept it and give you a nice bounty. (Remind me if I forget to give you the bounty!) If no one knows then I'll just post it myself after some time passes.

Sample code:

I used to be as skeptical as anybody, but now I guess people do have a point when they the STL is slow! This took 3.95 seconds on my laptop: (the shuffling is critical)

#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <string>
#include <vector>

int main()
{
    using namespace std;
    srand((unsigned)time(NULL));
    clock_t start;
    {
        vector<string> v(400000);
        for (size_t i = 0; i < v.size(); i++)
        {
            v[i].resize(16 + rand() % 32);  // some variation
        }

        // shuffle
        for (size_t i = 0; i < (size_t)pow((double)v.size(), 1.15); i++)
        {
            size_t j = rand() * (RAND_MAX + 1) + rand();
            swap(v[i % v.size()], v[j % v.size()]);
        }

        printf("Going out of scope...\n"); fflush(stdout);
        start = clock();
    }
    clock_t end = clock();
    printf("%u ms\n", (end - start) * 1000 / CLOCKS_PER_SEC);
    return 0;
}

It looks to me like this program is using some O(n2) algorithm internally, either in Visual C++ or in Windows. Not sure what's happening, but it's interesting...

Answers


Use custom allocator with deallocation en masse.


Okay, since nobody figured it out...

It was because heap tail checking was turned on in my system. Once I removed it, the code finished quickly.


Why don't you create the vector itself dynamically so you can manage it with a reference counting smart pointer. Then, you can ensure the thread that has the last reference to it isn't the UI thread so the UI thread isn't the one doing the processing when it goes out of scope.

You could even manipulate the priority of the thread that does the processing so it's lower and doesn't affect the rest of your threads badly - it'll make sure the UI thread is scheduled in front of the lower priority thread.

Note: I've never tried that, but I don't see why it wouldn't work - but spend time on it at your own risk! :)


Need Your Help

Avoid repetition and print unique ordered sets

java arrays

I am trying to take in some integers through command line arguments and then generate all the possible ordered sets from the set of numbers. The following code does that but it repeats the ordered ...

R "magic": file can be found via 'source' and cannot via 'make'

r build gnu-make rstudio rdata

Maybe it's something trivial and I simply was looking for too long at the same code... When sourcing R module getFLOSSmoleDataXML.R via RStudio, the code correctly detects .Rdata files in cache dir...