While working on uspace-plus, I caused a number of failures on the buildbot (kernel 4.1 RT-PREEMPT, 2 CPUs), which I later reproduced on bare metal with kernel 4.4 RT-PREEMPT. The hang shows up in at least 1 out of 1000 runs of the ‘flipflop.0’ test. With debugging messages enabled, the hang appears to be inside preempt_cancel(), and CPU0 (the one not running realtime code) goes to 100% CPU usage.
I bisected it to (the moral equivalent of) this commit, the purpose of which is to clean up all tasks and destroy the App object before exiting rtapi_app. I think I can just drop this change from the uspace-plus series and leave the problem for another day, but we really should do this cleanup at exit!
0001-uspace-stop-threads-and-then-destroy-the-RtapiApp-at.patch.txt
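
For reference, here is a minimal sketch of the shutdown ordering that this cleanup is meant to enforce: stop every task thread first, and only then destroy the object that owns them. This is not the code in the attached patch. `App`, `start_task`, `stop_all`, and `run_demo` are hypothetical names, and the sketch assumes a cooperative stop flag plus `join()`, which avoids thread cancellation altogether and so may not match how rtapi_app actually stops its tasks.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real App object; names are illustrative only.
class App {
public:
    // Destroying the App always stops and joins its tasks first.
    ~App() { stop_all(); }

    // Start one worker that loops until asked to stop.
    void start_task() {
        workers_.emplace_back([this] {
            while (!stop_.load()) {
                ++cycles_;
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }
        });
    }

    // Ask every worker to stop, then wait for each one to finish.
    void stop_all() {
        stop_.store(true);
        for (auto &t : workers_) {
            if (t.joinable()) t.join();
        }
        workers_.clear();
    }

    std::size_t running() const { return workers_.size(); }

private:
    std::atomic<bool> stop_{false};
    std::atomic<long> cycles_{0};
    std::vector<std::thread> workers_;
};

// Start two tasks, stop them, then destroy the App.
// Returns the number of tasks still registered after stop_all().
std::size_t run_demo() {
    auto app = std::make_unique<App>();
    app->start_task();
    app->start_task();
    assert(app->running() == 2);
    app->stop_all();                 // step 1: stop and join all tasks
    std::size_t remaining = app->running();
    app.reset();                     // step 2: destroy the App
    return remaining;
}
```

Because the workers poll a flag instead of being cancelled, `stop_all()` can only block for as long as one loop iteration takes. A real realtime task that blocks in a sleep or wait call would need its own wake-up mechanism for this approach to work.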