The US National Archives and Records Administration (NARA) has apparently decided to end its policy of taking a “digital snapshot” of all public congressional and federal web sites after each congressional and presidential term. According to NARA, which is understandably drawing heat for the policy change, it shouldn’t need to archive those web sites because federal agencies and Congress should be doing their own archiving. I read about NARA after reading a very timely piece from Leland Rucker about the nature of information archiving in a totally digital world, and it got me wondering: what happens to all this content on the web 250 years in the future?
Last year Google’s archives touched 100 exabytes of data from the web. To put that in perspective, that’s about 107 billion gigabytes (or more than half a billion 200 GB hard drives). The entire catalog of the Library of Congress is about 136 terabytes, which makes Google’s archive the data equivalent of about 771,000 Libraries of Congress.
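For anyone who wants to check those comparisons, here is a quick back-of-the-envelope sketch in Python. It assumes binary units (1 exabyte = 1024^3 gigabytes, 1 terabyte = 1024 gigabytes), which is a guess based on the 107 billion figure, and the constant and function names are made up purely for illustration.

```python
# Back-of-the-envelope check of the storage comparisons.
# Assumption: binary units, i.e. 1 EB = 1024**3 GB and 1 TB = 1024 GB.
GB_PER_EB = 1024 ** 3
GB_PER_TB = 1024

ARCHIVE_GB = 100 * GB_PER_EB   # 100 exabytes expressed in gigabytes
LOC_GB = 136 * GB_PER_TB       # 136 terabytes expressed in gigabytes


def drives_needed(total_gb, drive_gb=200):
    """Number of drives of size drive_gb needed to hold total_gb."""
    return total_gb / drive_gb


def libraries_of_congress(total_gb, loc_gb=LOC_GB):
    """How many Library of Congress catalogs fit into total_gb."""
    return total_gb / loc_gb


if __name__ == "__main__":
    print(f"Archive size: {ARCHIVE_GB / 1e9:.1f} billion GB")
    print(f"200 GB drives: {drives_needed(ARCHIVE_GB) / 1e6:.0f} million")
    print(f"Libraries of Congress: {libraries_of_congress(ARCHIVE_GB):,.0f}")
```

With decimal units instead (1 EB = 10^9 GB), the totals come out a few percent lower, but the orders of magnitude stay the same.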

Considering that most books are printed on acid-based paper, even their permanence is widely debated. Many books will slowly disintegrate as the acid in the paper reacts with the air.
As for the web, I think that what is truly unique must be cared for, along with what people will want to research in the future. Keeping a good archive, physical or digital, is often likened to keeping a well-organized closet. Government agencies should also share some records-management duties. Having an electronic footprint on the web is one thing; maintaining it for the future is another matter entirely. The point is that we don’t always know what we should be archiving and saving. Agencies should ask themselves what researchers will be looking for 10, 20 and 30 years down the road.
As for NARA, hats off to them. I have always thought their task was daunting. Their mission is solid and should serve as a blanket over each agency’s own archiving. We are saving this stuff for scholars, and most of them will be interested in the role of government and big business, economics, and preservation of the environment, as well as consumption habits and attitudes.
Dynamic web pages need good spiders and bots to help keep them relevant; hopefully the technology will get better in time.