Today I Learned - Rocky Kev

TIL how big the Internet Archive is

The Internet Archive, a 501(c)(3) non-profit, is building a digital library of Internet sites and other cultural artifacts in digital form.

They were incredibly helpful as a resource for my presentation about an obscure 2000s game, NetMonster, as I was literally digging through the 2000s internet and poring over Geocities source code.

It took me down some interesting rabbit holes.

How does data get stored?

IA does everything in-house rather than having its storage and processing hosted by, for example, AWS. The reasons given: lower cost, greater control, and greater confidence that their users are not being tracked.
via https://news.ycombinator.com/item?id=26312389

How much storage is used?

(1 Petabyte === 1024 Terabytes)
2010: 5.8 Petabytes
2014: 50 Petabytes
2020: 70 Petabytes
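
To get a feel for those figures, here is a minimal sketch that converts them to terabytes. It assumes the 1 PB = 1024 TB convention noted above, and the names `TB_PER_PB`, `reported_pb`, and `pb_to_tb` are made up for this illustration.

```python
# Assumption: binary convention from the note above, 1 PB = 1024 TB.
TB_PER_PB = 1024

# Figures as quoted in the list above (year -> petabytes).
reported_pb = {2010: 5.8, 2014: 50, 2020: 70}


def pb_to_tb(petabytes: float) -> float:
    """Convert petabytes to terabytes using 1 PB = 1024 TB."""
    return petabytes * TB_PER_PB


for year, pb in reported_pb.items():
    # Print each reported figure alongside its terabyte equivalent.
    print(f"{year}: {pb} PB is about {pb_to_tb(pb):,.0f} TB")
```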

These numbers look off.

Via Jonah Edwards, who is part of the Internet Archive team:
As of 2021:

Jonah Edwards - Internet Archive Infrastructure
https://archive.org/details/jonah-edwards-presentation

The 'Petabox':


https://archive.org/web/petabox.php


I don't actually have an answer. But as of a 2021 presentation, they grow by about 5-6 PB per quarter.
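
As a back-of-the-envelope check, the sketch below turns that quarterly rate into a yearly one and compares it with the average growth implied by the 2014 and 2020 figures quoted earlier. It assumes the 2021 rate is roughly steady across a year, which may not hold, and the names are hypothetical.

```python
QUARTERS_PER_YEAR = 4


def yearly_growth_pb(per_quarter_pb: float) -> float:
    """Scale a per-quarter growth figure (in PB) to a per-year figure."""
    return per_quarter_pb * QUARTERS_PER_YEAR


# 5-6 PB per quarter, as stated in the 2021 presentation.
low = yearly_growth_pb(5)
high = yearly_growth_pb(6)
print(f"2021 rate: roughly {low:.0f} to {high:.0f} PB added per year")

# Average yearly growth implied by the quoted totals:
# 50 PB in 2014 and 70 PB in 2020.
implied_pb_per_year = (70 - 50) / (2020 - 2014)
print(f"2014-2020 figures imply about {implied_pb_per_year:.1f} PB per year")
```

The gap between the two results is one reason the older figures look off to me.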
