Is the Visual C++ implementation of std::async using a thread pool legal?

Visual C++ uses the Windows thread pool (Vista's CreateThreadpoolWork if available and QueueUserWorkItem if not) when calling std::async with std::launch::async.

The number of threads in the pool is limited. If you create several tasks that run for a long time without sleeping (including doing I/O), the upcoming tasks in the queue won't get a chance to run.
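To make the starvation mechanism concrete, here is a minimal sketch built on a hypothetical fixed-size pool. `TinyPool` and `release_task_is_starved` are names invented for this illustration; this is not the Windows thread pool and not how VC++ implements std::async. The function queues a task that would raise a flag behind tasks that spin until the flag is raised, and reports whether that task failed to finish within a timeout.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical fixed-size pool, for illustration only.
class TinyPool {
public:
    explicit TinyPool(unsigned n) {
        assert(n > 0);
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~TinyPool() {
        {
            std::lock_guard<std::mutex> lk(m_);
            done_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }
    // Queue a job in FIFO order; it runs only once a worker becomes free.
    std::future<void> submit(std::function<void()> f) {
        auto task = std::make_shared<std::packaged_task<void()>>(std::move(f));
        std::future<void> fut = task->get_future();
        {
            std::lock_guard<std::mutex> lk(m_);
            q_.push([task] { (*task)(); });
        }
        cv_.notify_one();
        return fut;
    }
private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !q_.empty(); });
                if (q_.empty()) return;  // shutting down, nothing left to run
                job = std::move(q_.front());
                q_.pop();
            }
            job();
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> q_;
    std::vector<std::thread> workers_;
    bool done_ = false;
};

// Returns true if the flag-raising task did not finish within the timeout.
bool release_task_is_starved(unsigned pool_size, unsigned spinners) {
    std::atomic<bool> flag{false};
    TinyPool pool(pool_size);
    std::vector<std::future<void>> futs;
    for (unsigned i = 0; i < spinners; ++i)
        futs.push_back(pool.submit([&flag] {
            while (!flag.load()) std::this_thread::yield();  // busy task
        }));
    std::future<void> release = pool.submit([&flag] { flag.store(true); });
    bool starved = release.wait_for(std::chrono::milliseconds(300)) ==
                   std::future_status::timeout;
    flag.store(true);  // raise the flag from outside so the demo terminates
    for (auto& f : futs) f.wait();
    release.wait();
    return starved;
}
```

Under these assumptions, `release_task_is_starved(2, 2)` should report starvation because both workers are occupied by spinning tasks, while `release_task_is_starved(2, 1)` leaves one worker free to run the flag-raising task.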

The standard (I'm using N4140) says that using std::async with std::launch::async

... calls INVOKE(DECAY_COPY(std::forward<F>(f)), DECAY_COPY(std::forward<Args>(args))...) (20.9.2, 30.3.1.2) as if in a new thread of execution represented by a thread object with the calls to DECAY_COPY() being evaluated in the thread that called async.

(§30.6.8p3, Emphasis mine.)

std::thread's constructor, in turn, starts a new thread of execution.
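One possible reading of "as if in a new thread of execution represented by a thread object" is sketched below. `async_on_new_thread` is a hypothetical helper written for this illustration; it is neither the standard's specification nor the VC++ implementation. It accepts only a nullary callable, and unlike a future returned by std::async, the future it returns does not block in its destructor.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Hypothetical helper: run f on a freshly created std::thread.
// Taking f by value means the copy is made in the calling thread.
template <class F>
auto async_on_new_thread(F f) -> std::future<decltype(f())> {
    using R = decltype(f());
    std::packaged_task<R()> task(std::move(f));
    std::future<R> fut = task.get_future();
    assert(fut.valid());
    // A dedicated thread per call: no task ever waits in a queue
    // for some other task to finish.
    std::thread(std::move(task)).detach();
    return fut;
}
```

A call such as `async_on_new_thread([] { return 6 * 7; }).get()` runs the lambda on its own thread and retrieves the result through the future.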

About threads in general it says (§1.10p3):

Implementations should ensure that all unblocked threads eventually make progress. [Note: Standard library functions may silently block on I/O or locks. Factors in the execution environment, including externally-imposed thread priorities, may prevent an implementation from making certain guarantees of forward progress. —end note]

If I create a bunch of OS threads or std::threads, all performing some very long (perhaps infinite) tasks, they'll all be scheduled (at least on Windows; without messing with priorities, affinities, etc.). If we schedule the same tasks to the Windows thread pool (or use std::async(std::launch::async, ...), which does that), the tasks scheduled later won't run until the earlier tasks finish.
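For contrast, here is a small sketch of the std::thread case. The function name `run_spinners_with_threads` is made up for this example, and an atomic flag stands in for the lock. Every spinner gets its own thread, so the flag-raising thread started last is still scheduled and the function returns for any reasonable number of spinners.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical demo: returns how many spinner threads saw the flag and finished.
unsigned run_spinners_with_threads(unsigned spinners) {
    std::atomic<bool> flag{false};
    std::atomic<unsigned> finished{0};
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < spinners; ++i)
        threads.emplace_back([&] {
            // Atomic loads are side effects, so this loop is not the kind
            // the implementation may assume terminates.
            while (!flag.load()) std::this_thread::yield();
            ++finished;
        });
    // Started after all the spinners, yet it still gets to run.
    std::thread release([&] { flag.store(true); });
    for (auto& t : threads) t.join();
    release.join();
    assert(finished.load() == spinners);
    return finished.load();
}
```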

Is this legal, strictly speaking? And what does "eventually" mean?


The problem is that if the tasks scheduled first are de-facto infinite, the rest of the tasks won't run. So the other threads (not OS threads, but "C++-threads" according to the as-if rule) won't make progress.

One may argue that if the code has infinite loops the behavior is undefined, and thus it's legal.

But I argue that we don't need an infinite loop of the problematic kind, the kind the standard says causes UB, to make that happen. Accessing volatile objects and performing atomic operations or synchronization operations are all side effects that "disable" the assumption about loops terminating.

(I have a bunch of async calls executing the following lambda

auto lambda = [&] {
    while (m.try_lock() == false) {
        for (size_t i = 0; i < (2 << 24); i++) {
            vi++;
        }
        vi = 0;
    }
};

and the lock is released only upon user input. But there are other legitimate kinds of infinite loops.)

If I schedule a couple of such tasks, tasks I schedule after them don't get to run.

A really wicked example would be launching too many tasks that run until a lock is released or a flag is raised, and then using std::async(std::launch::async, ...) to schedule a task that raises the flag. Unless the word "eventually" means something very surprising, this program has to terminate. But under the VC++ implementation it won't!

To me it seems like a violation of the standard. What makes me wonder is the second sentence in the note: factors may prevent implementations from making certain guarantees of forward progress. So how are these implementations conforming?

It's like saying there may be factors preventing implementations from providing certain aspects of memory ordering, atomicity, or even the existence of multiple threads of execution. Great, but conforming hosted implementations must support multiple threads. Too bad for them and their factors. If they can't provide them, that's not C++.

Is this a relaxation of the requirement? If it is interpreted that way, it's a complete withdrawal of the requirement, since it specifies neither what the factors are nor, more importantly, which guarantees implementations may fail to provide.

If not, what does that note even mean?

I recall footnotes being non-normative according to the ISO/IEC Directives, but I'm not sure about notes. I did find in the ISO/IEC directives the following:

24 Notes

24.1 Purpose or rationale

Notes are used for giving additional information intended to assist the understanding or use of the text of the document. The document shall be usable without the notes.

Emphasis mine. If I consider the document without that unclear note, it seems to me that threads must make progress, and that std::async(std::launch::async, ...) has the effect as if the functor were executed on a new thread, as if it had been created using std::thread. Thus functors dispatched using std::async(std::launch::async, ...) must make progress. In the VC++ implementation with the thread pool they don't. So VC++ is in violation of the standard in this respect.


Full example, tested using VS 2015U3 on Windows 10 Enterprise 1607 on i5-6440HQ:

#include <cstddef>
#include <future>
#include <iostream>
#include <mutex>
#include <vector>

int main() {
    volatile int vi{};
    std::mutex m{};
    m.lock();

    // Each task spins until it manages to acquire the mutex.
    auto lambda = [&] {
        while (m.try_lock() == false) {
            for (size_t i = 0; i < (2 << 10); i++) {
                vi++;
            }
            vi = 0;
        }
        m.unlock();
    };

    std::vector<decltype(std::async(std::launch::async, lambda))> v;

    int threadCount{};
    std::cin >> threadCount;
    for (int i = 0; i < threadCount; i++) {
        v.emplace_back(std::async(std::launch::async, lambda));
    }

    // Scheduled last: the only task that releases the mutex.
    auto release = std::async(std::launch::async, [&] {
        __asm int 3;  // MSVC x86 inline assembly: debugger breakpoint
        std::cout << "foo" << std::endl;
        vi = 123;
        m.unlock();
    });

    return 0;
}

With 4 or fewer tasks it terminates. With more than 4 it doesn't.


Answers


The situation has been clarified somewhat in C++17 by P0296R2. Unless the Visual C++ implementation documents that its threads do not provide concurrent forward progress guarantees (which would be generally undesirable), the bounded thread pool is not conforming (in C++17).

The note about "externally imposed thread priorities" has been removed, perhaps because it is already always possible for the environment to prevent the progress of a C++ program (if not by priority, then by being suspended, and if not that, then by power or hardware failure).

There is one remaining normative "should" in that section, but it pertains (as conio mentioned) only to lock-free operations, which can be delayed indefinitely by frequent concurrent access by other threads to the same cache line (not merely the same atomic variable). (I think that in some implementations this can happen even if the other threads are only reading.)

