Digital Information 250 Years from Now

The US National Archives and Records Administration (NARA) has apparently decided to end its policy of taking a “digital snapshot” of all public congressional and federal web sites after each congressional and presidential term. According to NARA, which is understandably drawing heat for the policy change, it shouldn’t need to archive those web sites because federal agencies and Congress should be doing their own archiving. I read about NARA after reading a very timely piece from Leland Rucker about the nature of information archiving in a totally digital world, and it got me wondering: what happens to all this content on the web 250 years in the future? Last year Google’s archives touched 100 exabytes of data from the web. To put that in perspective, that’s about 107 billion gigabytes (or over half a billion 200 GB hard drives). The entire catalog of the Library of Congress is about 136 terabytes, which makes Google’s archive the data equivalent of roughly 771,000 Libraries of Congress.
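For anyone who wants to check the arithmetic, here is a rough sketch of the conversion. It assumes binary prefixes (1 GB = 2^30 bytes, 1 TB = 2^40, 1 EB = 2^60), since that is the convention that reproduces the 107 billion gigabyte figure; the function and variable names are made up for illustration.

```python
# Assumption: binary prefixes, which is what yields ~107 billion GB for 100 EB.
GB = 2**30
TB = 2**40
EB = 2**60


def archive_comparison(archive_eb=100, drive_gb=200, loc_tb=136):
    """Express an archive size in gigabytes, hard drives, and Libraries of Congress."""
    archive_bytes = archive_eb * EB
    return {
        # Whole gigabytes in the archive
        "gigabytes": archive_bytes // GB,
        # Number of full hard drives of the given size
        "drives": archive_bytes // (drive_gb * GB),
        # Number of full Library of Congress catalogs
        "libraries_of_congress": archive_bytes // (loc_tb * TB),
    }


result = archive_comparison()
for label, value in result.items():
    print(f"{label}: {value:,}")
```

Under those assumptions, 100 exabytes works out to about 107.4 billion gigabytes, roughly 537 million 200 GB drives, and just over 771,000 Library of Congress catalogs. With decimal prefixes the ratios between same-scale units stay close, but the gigabyte count would be a round 100 billion instead.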


1 Response to “Digital Information 250 Years from Now”


  1. Carla, Apr 30th, 2008 at 3:32 pm

    Considering that most books are printed on acid-based paper, even their permanence is widely debated. Many books will literally burn up because of the acid in the paper and its chemical interaction with air.

    As for the web, I think that what is truly unique must be cared for, along with what people will want to research in the future. Keeping a good archive, physical or digital, is often likened to keeping a well-organized closet. Government agencies should also share some records-management duties. Making or having an electronic footprint on the web is one thing; maintaining it for the future is another matter entirely. The point is that we don’t always know what we should be archiving and saving. Agencies should ask themselves what researchers will be looking for 10, 20, and 30 years down the road.

    As for NARA, hats off to them. I have always thought their task was daunting. Their mission is solid and should serve as a blanket over each agency’s archiving. We are saving this stuff for scholars, and most of them in the future will be interested in the role of government and big business, economics, and preservation of the environment, as well as consumption habits and attitudes.

    Dynamic web pages need good spiders and bots to help them stay relevant; hopefully the technology will get better in time.

Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 License