Yes, they need to fix it so it spawns spinning threads until 100% usage on all cores is achieved. Then we can be sure it's running efficiently.
That's a lot of effort for a redundant feature.
Windows already does this for you, free of charge. When no application has a thread ready to run on a core, Windows assigns that core the Idle thread. The Idle thread, as the name suggests, does no useful work: it sits in a loop that, on current CPUs, mostly parks the core in a low-power halt state until the next interrupt arrives. Task Manager, however, deducts the time spent in the Idle thread from the total CPU time used. If it did not, all cores would display 100% usage all of the time.
The idea is an old one: the Z80 already had a HALT instruction, after which it would be completely idle until an external interrupt (from a keyboard or other external device) woke it up again.
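To make the Task Manager accounting concrete, here is a minimal sketch of the arithmetic, not Windows' actual code. The function name and the sample numbers are made up for illustration; the only idea taken from the above is "usage = total time minus Idle thread time".

```python
def cpu_usage_percent(idle_delta: float, total_delta: float) -> float:
    """Share of a sampling interval NOT spent in the Idle thread, in percent."""
    if total_delta <= 0:
        raise ValueError("total_delta must be positive")
    busy = total_delta - idle_delta
    # Clamp the result to guard against small sampling jitter.
    return max(0.0, min(100.0, 100.0 * busy / total_delta))


# Hypothetical sample: over 1000 ms, one core spent 750 ms in the Idle thread.
print(cpu_usage_percent(idle_delta=750.0, total_delta=1000.0))  # 25.0
```

If the idle time were not subtracted, busy would equal total and every core would read 100%.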
As for most threads ending up on one core instead of being evenly spread: that, too, is due to the Windows thread scheduler. It prefers to run threads belonging to the same process on the same core, as that places a lesser burden on the cache controllers inside the CPU. Each core has a private level 1 cache, and the cores share the last-level cache (level 2 or level 3, depending on the CPU). Windows (rightly) assumes that threads of the same process use the same memory pool. When these threads run on multiple cores, the CPU needs to put in extra time to keep the private level 1 caches coherent, slowing the cores down.
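If you want to experiment with keeping everything on one core yourself, you can pin a process by hand. The sketch below is an illustration only and uses Python's os.sched_setaffinity, which exists on Linux but not on Windows; on Windows the equivalent would be Task Manager's "Set affinity" option or the Win32 SetProcessAffinityMask call (not shown here). The helper name pin_to_one_core is made up.

```python
import os


def pin_to_one_core():
    """Pin the caller to a single allowed core; return it, or None if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # this API is not available on e.g. Windows or macOS
    allowed = os.sched_getaffinity(0)  # cores the caller may currently run on
    core = min(allowed)                # arbitrary choice: the lowest-numbered one
    os.sched_setaffinity(0, {core})    # 0 means "the caller itself"
    return core


if __name__ == "__main__":
    print("pinned to core:", pin_to_one_core())
```

This only restricts where the scheduler may place the process. It says nothing about what the scheduler would have chosen on its own.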
Edit: I don't know if the Windows scheduler actively supports it, but keeping most threads on one core when possible makes the most of Intel's Turbo Boost feature. With this feature, cores that aren't doing very much get their clock speed reduced (thus drawing less power and emanating less heat), and the headroom freed up in power draw and heat generation is spent on the one core that's busy by cranking its clock speed above the rated base frequency. In short, a workload that puts two cores at 50% can actually run slower than the same workload on one single core.
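A back-of-the-envelope version of that last claim, as a rough sketch. The clock speeds (3.0 GHz base, 3.6 GHz boosted) and the amount of work are made-up numbers, and the model assumes the work is serial, so that "two cores at 50%" means only one core is busy at any moment.

```python
def run_time_seconds(cycles: float, clock_hz: float) -> float:
    """Time to execute `cycles` of serial work on a core running at `clock_hz`."""
    return cycles / clock_hz


WORK_CYCLES = 9e9   # hypothetical amount of work
BASE_HZ = 3.0e9     # assumed base clock
TURBO_HZ = 3.6e9    # assumed single-core boost clock

# Work bouncing between two cores at base clock: each core is ~50% busy,
# but only one of them is working at any given moment.
spread = run_time_seconds(WORK_CYCLES, BASE_HZ)

# Same work kept on one core, which is then allowed to boost.
single = run_time_seconds(WORK_CYCLES, TURBO_HZ)

print(spread, single)  # 3.0 2.5
```

With these assumed numbers, the single boosted core finishes in 2.5 s against 3.0 s. Real gains depend on the CPU, the cooling, and how much of the work is really serial.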