Update the lib to include polling on the pwqrfd. This introduces an overcommit_count. This counter is set by a poller thread (or an event loop, see pwqr_overcommit_poll_loop). On the other side, normal job consumers check this counter; when it is non-zero they XCHG it with 0 (to be sure they are alone evaluating the overcommit ratio), ask the kernel for the current overcommit, subtract one, store the result as the new counter and go park. Of course, a thread on its way to parking may actually find overcommit jobs or similar; in that case the polling thread will set the overcommit_count again and the cycle starts over. In the more common case, the thread will be parked directly and we hope that will be enough. When a thread leaves PARK mode without signaling an EDQUOT condition, we forcefully set the overcommit_count to zero. This should hopefully take care of downsizing the pool when it has been overcommitting for too long. As a side note, the kernel only signals overcommit when it has lasted for more than PWQR_OC_DELAY (which is 1/20 of a second as of this commit), which leaves plenty of time for the overcommit to be reduced in other more "natural" ways. Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>
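The poller/consumer handshake on overcommit_count described in the commit above can be sketched with C11 atomics. This is only an illustration under assumptions, not the code from lib/libpwqr.c: every function name below is made up, query_kernel_overcommit() is a stand-in for the real kernel query, and the handling of a zero or negative answer from the kernel is a guess.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Counter shared between the poller and the job consumers. */
static atomic_int overcommit_count;

/* Hypothetical stand-in for "ask the kernel for the current overcommit". */
static int fake_kernel_overcommit;
static int query_kernel_overcommit(void) { return fake_kernel_overcommit; }

/* Poller side: publish the overcommit that was signaled on the pwqr fd. */
static void poller_set_overcommit(int oc)
{
    assert(oc >= 0);
    atomic_store(&overcommit_count, oc);
}

/* Consumer side: returns true when the calling thread should go park. */
static bool consumer_should_park(void)
{
    /* Cheap check first: nothing to do when no overcommit is pending. */
    if (atomic_load(&overcommit_count) == 0)
        return false;
    /* XCHG with 0: only one thread observes the non-zero value and
     * goes on to evaluate the overcommit. */
    if (atomic_exchange(&overcommit_count, 0) == 0)
        return false;
    int oc = query_kernel_overcommit();
    if (oc <= 0)            /* assumption: nothing left to shed */
        return false;
    /* This thread accounts for one unit by parking itself. */
    atomic_store(&overcommit_count, oc - 1);
    return true;
}

/* Called when a thread leaves PARK mode without an EDQUOT condition. */
static void on_unpark_without_edquot(void)
{
    atomic_store(&overcommit_count, 0);
}
```

A consumer loop would call consumer_should_park() between jobs and issue the park request when it returns true; the exchange guarantees that two consumers never both act on the same counter value.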
implement poll and read. Whether or not we implement some other kind of dirty notification mechanism, it feels right to have pwqr pollable for overcommit. Documentation: - drop the "in kernel unpark" method, it sucks - migrate to using non-blocking "read" for the probing method - document the pollability and how read works in the pwqr_create "manpage". lib: - implement epoll_create with flags. It requires a kernel supporting the O_NONBLOCK/O_CLOEXEC flags to open(); I've been too lazy to implement the emulation yet. Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>
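The "poll, then non-blocking read" probing pattern mentioned in the commit above looks roughly like this with plain POSIX calls. This is a generic sketch: what read() actually returns on a pwqr fd is not stated in this log, so the payload is treated as opaque bytes, and the helper names set_nonblocking() and probe_fd() are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Switch an already open fd to non-blocking mode. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Wait up to timeout_ms for the fd to become readable, then read once.
 * Returns the number of bytes read, 0 when nothing was available
 * (timeout, EAGAIN, or end of file), and -1 on any other error. */
static ssize_t probe_fd(int fd, int timeout_ms, void *buf, size_t len)
{
    assert(buf != NULL);
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;                       /* timed out, nothing pending */
    ssize_t n = read(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;                       /* raced with another reader */
    return n;
}
```

With a timeout of 0 the helper degrades to a pure non-blocking probe, which is the shape an event loop would use once poll() or epoll has already reported the fd as readable.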
do not forbid other processes from using the pwqr fd. It could actually even make sense to have stuff running in the background that wants to be accounted for in the overall load of this process group (for example some kind of snapshot procedure that would be fork()ed in the background). Plus this restriction was kind of un-kernelish. Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>
Prepare code to plug in the overcommit notification. Let the pwqr_sb have a state (instead of a yes/no "dead" flag), one of: - DEAD - NONE (normal) - UC (undercommit) - OC (overcommit). In the last two modes a timer is armed. In the UC case, if the timer fires, we unpark a thread (if any, and if no overcommit unpark is pending) as before. In the OC case for now we do nothing. Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>
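The four states and the timer rule from the commit above can be written down as a small decision function. This is a userland sketch of the logic only: the enum and function names are hypothetical and are not the identifiers used in the kernel driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the states listed above. */
enum sb_state {
    SB_DEAD,
    SB_NONE,    /* normal */
    SB_UC,      /* undercommit */
    SB_OC,      /* overcommit */
};

/* A timer is armed only in the undercommit and overcommit states. */
static bool sb_state_arms_timer(enum sb_state st)
{
    return st == SB_UC || st == SB_OC;
}

/* Decide what to do when the timer fires. Returns true when one parked
 * thread should be unparked. */
static bool sb_timer_should_unpark(enum sb_state st, int parked_threads,
                                   bool oc_unpark_pending)
{
    assert(parked_threads >= 0);
    if (st == SB_UC)
        return parked_threads > 0 && !oc_unpark_pending;
    /* OC: nothing is done for now; DEAD and NONE have no timer at all. */
    return false;
}
```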
rework the quarantine: it doesn't really need to be accounted for. The notion of quarantine is purely virtual and we only care about it at WAKE time. It significantly simplifies the fastpath of our code, namely __pwqr_sb_update_state. Also allow the WAKE command to directly unpark threads if that fits with the concurrency level. I don't expect userland to really use that, but it doesn't break anything and makes sense. Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>
Implement the reluctance to unpark threads. This means that a pool needs to be undercommitted for 0.1s before we allow it to grow its number of in-pool threads. Document the last todo: how to reduce the pool when we're overcommitting. Right now we only pray that userland will put some threads to WAIT. But frankly it's less than ideal. With the reluctance to start a new thread we hope though that the overcommit will never grow out of proportion for now. Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>
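The 0.1s reluctance in the commit above amounts to a gate that only opens once undercommit has lasted long enough. The sketch below models it with explicit timestamps instead of a kernel timer; the struct, the function and the UC_DELAY_MS macro are invented names, and what happens after the gate opens (here it stays open while undercommit persists) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UC_DELAY_MS 100   /* 0.1s, as stated above; macro name is made up */

struct uc_gate {
    bool     in_uc;       /* currently undercommitted? */
    uint64_t since_ms;    /* when the current undercommit period began */
};

/* Feed the gate the current condition and a monotonic timestamp.
 * Returns true when the pool may grow by unparking a thread. */
static bool uc_gate_may_grow(struct uc_gate *g, bool undercommitted,
                             uint64_t now_ms)
{
    assert(g != NULL);
    if (!undercommitted) {
        g->in_uc = false;         /* any interruption resets the delay */
        return false;
    }
    if (!g->in_uc) {
        g->in_uc = true;          /* undercommit just started */
        g->since_ms = now_ms;
    }
    return now_ms - g->since_ms >= UC_DELAY_MS;
}
```

Resetting on every interruption is what gives the hysteresis: short undercommit blips never accumulate into a reason to start another thread.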
Pthread Workqueue Regulator (pwqr) initial commit. This contains an alpha/beta quality kernel proof-of-concept driver, plus some code concepts to wrap the syscall in lib/libpwqr.c Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>