moar, and moar, and moar debsources stats
A while ago I announced the availability of several stats about Debian source code on http://sources.debian.net. Since then the statistical basis of those stats has grown a lot, and now includes all historical Debian releases, from hamm (July 1998) onward. This makes it possible to appreciate macro-level evolution trends in Free Software, over a period of more than 15 years, through the eyes of a distro that sits at the nice intersection of the oldest, largest, and most reputable distros.
To get there I've added support for sticky suites to the plumbing layer of debsources, and then injected historical releases from http://archive.debian.org. The injection process took about a week (with no parallelism of any sort, on pretty slow disks, and while computing sha256 checksums, ctags, and sloccount on all source files) and has been an "interesting" experience.
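To give a feeling of what the checksum part of such an injection involves, here is a minimal sketch in Python. It is not the actual debsources code: the function names `sha256_of_file` and `checksum_tree` are made up for illustration, and the real thing also runs ctags and sloccount and stores results in a database. It only shows the sequential "walk the tree, hash every file" pattern that makes the process slow on slow disks.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_tree(root):
    """Map each regular file under root (relative path) to its SHA-256.

    Hypothetical helper: strictly sequential, one file at a time,
    symlinks skipped.
    """
    checksums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            checksums[os.path.relpath(full, root)] = sha256_of_file(full)
    return checksums
```

Calling `checksum_tree("/path/to/unpacked/source")` returns a plain dictionary. Since every file is read in full, the cost is dominated by disk throughput, which is presumably why parallelism and faster disks would help.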
When you go back decades in technology time, bit
rot is just around the corner, and I've found my
sources.d.n. In both cases the respective maintainers
(Guillem and Ganneff, kudos) have been positive about and helpful
in improving the situation, despite the low impact of the bugs I've
found on the average user. That's quite important for the
long-term preservation of digital information in
general, and for the perennity of access to Free Software in the
specific case of Debian.
While we are at it, I'm now maintaining a list of
sources.d.n but belonging to other
packages, in case you fancy helping out but are not a Python
hacker. Interestingly enough, quite a few of those bugs are related
to the fact that tools debsources uses (e.g. ctags, sloccount) are
also starting to show their age.
You might wonder why buzz, rex, and bo are still missing from
sources.d.n. That's in fact for similar reasons.
Before hamm, Debian didn't have complete archive coverage in terms
of Sources indexes and .dsc files. Given
that debsources relies on both to extract source packages, it first
needs to grow an additional abstraction layer that can cope with
their absence. It's SMOP, and planned.
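To make the dependency on Sources indexes more concrete, here is a rough sketch of how one can go from a Sources index to the name of each package's .dsc file. This is an illustration under assumptions, not how debsources does it: `parse_stanzas` and `dsc_file` are hypothetical helpers, the parser handles only the simplest form of the control-file syntax, and the sample stanzas (including the checksums) are invented. Real code would more likely rely on an existing parser such as the deb822 module of python-debian.

```python
# Invented sample data in the style of a Sources index; not real archive content.
SAMPLE = """\
Package: hello
Version: 2.10-1
Directory: pool/main/h/hello
Files:
 0123456789abcdef0123456789abcdef 1234 hello_2.10-1.dsc
 fedcba9876543210fedcba9876543210 725946 hello_2.10.orig.tar.gz

Package: bye
Version: 1.0
"""


def parse_stanzas(text):
    """Split control-file text into a list of {field: value} dicts.

    Simplified: stanzas are separated by blank lines, and lines starting
    with whitespace continue the previous field.
    """
    stanzas, current, last_key = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            if current:
                stanzas.append(current)
            current, last_key = {}, None
        elif line[0] in " \t" and last_key is not None:
            current[last_key] += "\n" + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            current[last_key] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


def dsc_file(stanza):
    """Return the .dsc file name listed in a stanza's Files field, or None."""
    for entry in stanza.get("Files", "").splitlines():
        parts = entry.split()  # expected shape: checksum, size, file name
        if len(parts) == 3 and parts[2].endswith(".dsc"):
            return parts[2]
    return None
```

With this, `[dsc_file(s) for s in parse_stanzas(SAMPLE)]` gives the .dsc name for the first stanza and `None` for the second. The `None` case is the interesting one here: it is roughly the situation an extra abstraction layer would have to handle for releases where the index or the .dsc is simply not there.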
And now let's have fun with ctags bombs.
Stefano “Indiana” Zacchiroli