After yesterdays blog, g2boojum asked if it would be faster to actually use the tsort C binary if available. My response was that I would expect it to be faster, but not by much as most of the work is making the dependancy tree to start with. Being the silly sort I am I decided to back myself up with benchmarks! :)
C tsort stopping
real 0m0.343s user 0m0.196s sys 0m0.144s
C tsort starting
real 0m0.243s user 0m0.132s sys 0m.112s
bash tsort stopping
real 0m0.353s user 0m0.248s sys 0m0.108s
bash tsort starting
real 0m0.239s user 0m0.164s sys 0m0.076s
So there we have it! But what does this mean? Why is C tsort faster at stopping than bash? Why is bash tsort faster at starting than C? More importantly does this so that I’m a really good coder? :evil:
This is easily explained- when we “stop” we load all modules available. When we start we filter out duplicates based on user and then system preference, which means we’re dealing with less modules. My bash code is more efficient, purely because I don’t have to rebuild the entire PROVIDES array at the end of the sort as it gets changed during the sort- something which I cannot do in the C tsort. However, bash loops are notoriously slow, hence the more modules you throw at it the C tsort will overtake.
Attached is a patch (will apply to baselayout-1.12.0_pre10) that enables the C tsort. This is not being comitted to the tree as it adds more code, complexity and the C tsort may not be available all the time (/usr may be net mounted) and as proved, can actually be slower. What’s more, it’s unlikely that everyone will use the number of modules that I use (I have all of them “loaded” bar adsl) by the time we reach the sort, so bash is faster. :P So there. :P
However, there’s not stopping you from seeing if you can improve it and tempt us into comitting it … ;)