Saturday, June 20, 2009

Google Usenet archive data lossage

According to Google Groups today, there were only 227 articles posted to comp.lang.c between 1981 and 1995.

The effect seems to extend broadly across the comp.lang.* hierarchy. I haven't investigated beyond that. I hope this is a temporary glitch and not permanent data loss.

It's a little scary how much the world has come to rely on Google for historical data archiving.

2 comments:

Alonso said...

This seems to be a search-related issue.

Here it is listing messages not only in comp.lang.c, but also in comp.unix.bsd, comp.unix.programmer, etc.

If you look at the archive page (http://groups.google.com/group/comp.lang.c/about) you will see there are tons of messages each month since 1986 (it also lists 1 in 1969, another glitch I guess).

In the topics page (http://groups.google.com/group/comp.lang.c/topics) it shows there are 137441 messages.

Anyway, yes, it is a little scary. Are those Usenet messages stored only by Google?

Ron said...

> Are those Usenet messages stored only by Google?

AFAIK, Google has the only complete collection.