The Minecraft Archive Project

Hi, I'm Leonard Richardson. When I was growing up in the 1990s, my favorite computer game was a blocky little thing called ZZT. Lots of games had level editors, but ZZT came with its own programming language, allowing you to script your own adventures and puzzles.

The kids who grew up playing ZZT are now artists, game designers, and programmers. But many of the worlds they created are gone. ZZT worlds were shared through BBSes and online services like CompuServe. When the Internet took over, those services shut down and the worlds were lost. It's estimated that only half the ZZT worlds ever created still survive.

In the early 2010s I realized that history was repeating itself. This time, the blocky game with the embedded programming language was Minecraft. Kids and teenagers were creating worlds, putting a lot of work into them, and sharing them on unreliable file-hosting sites. Some of those kids are now artists, game designers, and programmers. Soon they'll get nostalgic, start thinking back on the game that showed them how fun it was to create their own worlds... and it'll all be gone.

The Minecraft Archive Project is my attempt to stop that from happening. Minecraft is much more popular than ZZT ever was, and I don't think I can save more than a fraction of one percent of its cultural history, but without this project pretty much all of it is doomed.

In the Collection

I periodically refresh the MAP by capturing new data from a couple different sites:

My focus is on packaged binaries (Minecraft maps, resource packs, and mods) but I also capture images (screenshots and skins).

I've split the Minecraft Archive Project into a large number of ZIP files of about 50 gigabytes each. Each file has an HTML finding aid that gives you an overview of what's in the ZIP file. Here are the links:

Mods

Worlds by Month

2011 January February March April May June July August September October November December
2012 January February March April May June July August September October November December
2013 January February March April May June July August September October November December
2014 January February March April May June July August September October November December
2015 January February March April May June July August September October November December
2016 January February March April May June July August September October November December
2017 January February March April May June July August September October November December
2018 January February March April May June July August September October November December

Worlds – Other dates

Modpacks

Resource Packs

Minecraft Pocket Edition

Git repositories - “minecraft”

I clone every Git repository I learn about in the course of performing a capture. I've also run a GitHub search for "minecraft" and cloned every repository that showed up. These are organized by the creation date of the Git repository.
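To make "organized by creation date" concrete, here is a minimal sketch of one way to file a clone under a year/month directory. It is an illustration, not the actual MAP script: the `git/YYYY/MM/` layout, the function names, and the use of a mirror clone are all assumptions. The only outside detail it relies on is the ISO 8601 `created_at` timestamp that GitHub's API reports for a repository.

```python
from datetime import datetime
from pathlib import PurePosixPath


def archive_path(full_name, created_at, root="git"):
    """Return a year/month path for a repository (hypothetical layout).

    full_name  -- "owner/name", as a GitHub search result reports it
    created_at -- ISO 8601 UTC timestamp, e.g. "2013-07-04T12:30:00Z"
    """
    created = datetime.strptime(created_at, "%Y-%m-%dT%H:%M:%SZ")
    # Flatten "owner/name" so each repository gets a single directory.
    safe_name = full_name.replace("/", "__")
    return PurePosixPath(root) / f"{created:%Y}" / f"{created:%m}" / f"{safe_name}.git"


def clone_command(clone_url, destination):
    """Build (but do not run) a git command that makes a full mirror clone."""
    return ["git", "clone", "--mirror", clone_url, str(destination)]
```

Passing the result of `clone_command` to `subprocess.run` would perform the clone. A mirror clone keeps every branch and tag, which seems like a reasonable default when the goal is preservation rather than development.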

Git repositories - “craft”

This section is a mixed bag. It mostly contains Minecraft clones or games inspired by Minecraft, and projects that have nothing to do with games at all—they just have a name like "FooCraft" that sounds Minecraft-ish.

Git repositories - “bukkit”

As you'd expect, this contains mostly repositories relating to the Bukkit project.

Miscellaneous

Wiki archives

Archives made by other people

These items are in the "Minecraft Archive Project" collection on the Internet Archive, and they'll be useful to anyone who's interested in the MAP, but I didn't make these archives. Other people did.

How?

There's no secret, really. I just wrote a lot of Python scripts and let them run for a really long time. When one script finishes, I run the next one in the sequence. I go into some detail about my process in a 2015 blog post.
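For a sense of what "run the next one in the sequence" can look like when automated, here is a minimal sketch of a runner that executes a list of commands in order and stops at the first failure. It is an assumption about the general shape, not the project's real code, and the script names in `EXAMPLE_STEPS` are hypothetical.

```python
import subprocess
import sys


def run_in_sequence(commands):
    """Run each command in order, stopping at the first one that fails.

    Returns the list of commands that exited with status 0.
    """
    completed = []
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            # Later steps usually depend on earlier ones, so stop here.
            break
        completed.append(command)
    return completed


# Hypothetical script names, for illustration only.
EXAMPLE_STEPS = [
    [sys.executable, "capture_listings.py"],
    [sys.executable, "download_files.py"],
    [sys.executable, "build_finding_aids.py"],
]
```

Calling `run_in_sequence(EXAMPLE_STEPS)` would run the three scripts back to back. Stopping on failure keeps a half-finished capture from being packaged as if it were complete.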

Spin-off projects

Using the data from the initial MAP capture in 2014, I created these projects:

ESC: the Ephemeral Software Collection

As the Minecraft Archive Project grew, I started getting data from sites like CurseForge and GitHub which contain both Minecraft and non-Minecraft stuff. I started the Ephemeral Software Collection to hold the non-Minecraft stuff. Before long, the ESC became larger than the Minecraft Archive Project that spawned it—over four terabytes as of February 2016.

You can see an overview of the ESC here. It's an eclectic collection, but I generally think of it as containing software that's at risk of being lost, forgotten or destroyed. It contains the equivalent of the Minecraft Archive Project for games other than Minecraft. It also contains mods, add-ons, software created for one-off events like game jams, implementations of classic computer games, software that exists in a copyright or trademark grey area, experimental code, and stuff I just think might be interesting or useful later.

I'm not even going to try to archive a comprehensive collection of ephemeral software, but I figure I might as well collect what I can, since it's mostly a matter of letting a script run and filling up old hard drives.

What I Didn't Capture

I could spend my whole life archiving this stuff, but... I don't want to. Twice a year I run the basic Minecraft capture scripts on Planet Minecraft and CurseForge. Everything else I get, I consider a bonus.

Whenever I discover or hear about some new dataset of ephemeral software, I put it on the following list and then forget about it. If you're inspired by what I've done with the Minecraft Archive Project and the Ephemeral Software Collection, a great way to show your appreciation would be to tackle one of these projects. Otherwise we'll see if these sites are still around when I retire.

If you happen to run one of these sites and would like to contribute a mirror to the MAP or ESC or the Internet Archive, and make sure your users' creations don't get lost, please send me email at leonardr@segfault.org.

Adding to the Minecraft Archive Project

The holy grail of the Minecraft Archive Project is a way to automatically archive active public Minecraft servers. There's no technical obstacle to doing this: walking around on a server streams the chunks to the client, and there are even mods for archiving the streamed chunks. But I've never gotten these mods to work, and getting them to work automatically, across hundreds of thousands of servers running different versions of Minecraft, requires work and resources far beyond what I can bring to the project. Thinking of applying for a digital preservation grant? Try this project out.

I would like to set up a dead-drop email address where people can send their zipped-up Minecraft worlds to explicitly put them in the MAP without publishing them anywhere else. This creates a lot of problems that I don't have time to deal with, so I haven't made any serious attempt at this.

Getting into the more achievable goals, there are more Minecraft maps at Minecraft Maps, MinecraftDL, 9Minecraft, etc. I don't even know if these sites have anything new or if it's all duplicates of things I already have. I haven't gone through them because adding a new site to the rotation is a lot of work, and these collections are very small compared to the Minecraft forum or Planet Minecraft.

Back in May 2014 I archived maps from Minecraft World Share and Minecraft World Map, but I haven't been back. It's a similar situation—they have a couple thousand maps but the collection is relatively small and doesn't grow quickly the way Planet Minecraft does.

The Technic Platform hosts thousands (not sure of the exact number) of Minecraft mod packs.

Adding to the Ephemeral Software Collection

My top wishlist item for the Ephemeral Software Collection is a way to archive all the Super Mario Maker levels. I have no idea how to do this—I suspect you need to mod a Wii U.

Why am I concerned about Super Mario Maker? Because of what happened to WarioWare D.I.Y. Four years after this DS game was released, Nintendo shut down the servers that allowed you to share your minigames. Now the only way to collect old D.I.Y. levels is to buy old cartridges and rip them.

Steam Workshop hosts millions of add-ons for over 300 games, as well as screenshots and links to hosted videos. It seems extraordinarily difficult to download the files, though. I think it's impossible if you don't own the games, and you'll probably need to hack a Steam client if you want to download the add-ons in a systematic way.

I'd like someone to archive all the board game rules and other files on BoardGameGeek. I regularly archive BGG game metadata for the Loaded Dice project, but getting the files is a much trickier proposition.

YouTube hosts petabytes of gaming videos, and there's no way to save it all, but it should be possible to archive a gameplay video for every game in MobyGames. It's especially important to archive gameplay videos for mobile and online games, which can die as soon as the game studio shuts down a server.

Since ZZT started me on this project in the first place, I should make sure to mirror the ZZT archive, as well as the archive of its cousin MegaZeux.

The Terraria forums have links to mods and maps.

Hacked console ROMs (Super Mario World, Sonic the Hedgehog, etc.). There's a big collection at Romhacking.net. I'm sure other people have private collections of these, so it's not as big a deal.

A wide variety of mods (and prerelease versions of games in development) at ModDB. Similarly, mobile games at SlideDB.

Civilization add-ons at CivFanatics.

The Sims mods at Mod the Sims.

Kerbal Space Program mods at Kerbal Stuff and KSP mods.

Nexus Mods hosts over 100,000 add-ons for over 200 games.

Glorious Trainwrecks archives thousands of quickly-created games.

In general

Whenever we humans create a new art form, the early stuff gets lost. It's not considered "art", it doesn't fit into the existing archives, it's a pain to collect, expensive to keep around, and nobody's in charge of saving it. So it gets lost. This is especially true for art forms favored by children or other people who aren't considered artists.

Time passes, and we regret the loss. We cherish every scrap that survives. Ninety percent of humanity's early films are gone, and a lot of the ten percent is crap, but we preserve it all because there's nothing else like it. Sometimes the crap turns out to be pretty good after all: pulp sci-fi and noir. Even ephemera, things that never get raised to the level of "art", become valuable as windows into the past: account books, restaurant menus, road maps, receipts.

I believe all this stuff is art and I want to save it. But even if history disagrees with me, and the MAP and the ESC are classified as ephemera, that's fine too. In the long run, it's all ephemera.