The Minecraft Archive Project
The kids who grew up playing ZZT are now artists, game designers, and programmers. But many of the worlds they created are gone. ZZT worlds were shared through BBSes and online services like CompuServe. When the Internet took over, those services shut down and the worlds were lost. It's estimated that only half the ZZT worlds ever created still survive.
In the early 2010s I realized that history was repeating itself. This time, the blocky game with the embedded programming language was Minecraft. Kids and teenagers were creating worlds, putting a lot of work into them, and sharing them on unreliable file-hosting sites. Some of those kids are now artists, game designers, and programmers. Soon they'll get nostalgic, start thinking back on the game that showed them how fun it was to create their own worlds... and it'll all be gone.
The Minecraft Archive Project is my attempt to stop that from
happening. Minecraft is much more popular than ZZT ever was, and I
don't think I can save more than a fraction of one percent of its
cultural history, but without this project pretty much all of it is
In the Collection
I periodically refresh the MAP by capturing new data from a couple different sites:
My focus is on packaged binaries (Minecraft maps, resource packs, and mods) but I also capture images (screenshots and skins).
I've split the Minecraft Archive Project into a large number of ZIP files of about 50 gigabytes each. Each file has an HTML finding aid that gives you an overview of what's in the ZIP file. Here are the links:
Worlds by Month
Worlds – Other dates
Minecraft Pocket Edition
Git repositories - “minecraft”
I clone every Git repository I learn about in the course of performing a capture. I've also run a GitHub search for "minecraft" and cloned every repository that showed up. These are organized by the creation date of the Git repository.
- 1970-September 2011
- -April 2012
- -November 2012
- -December 2012
- -March 2013
- -May 2013
- -August 2013
- -October 2013
- -November 2012
- -January 2014
- -February 2014
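Sorting clones into date ranges like the ones listed above could be sketched roughly as follows. This is only an illustration: the bucket end dates, the label format, and the function names are all assumptions made for the example, not the layout the archive actually uses.

```python
from bisect import bisect_left
from datetime import date

# Hypothetical bucket end dates (inclusive), loosely modeled on the list above.
BUCKET_ENDS = [date(2011, 9, 30), date(2012, 4, 30), date(2012, 11, 30)]


def bucket_label(end):
    """Build a directory-style label for a bucket from its end date."""
    return end.strftime("through-%Y-%m")


def assign_bucket(created, bucket_ends=BUCKET_ENDS):
    """Return the label of the first bucket whose end date is on or after `created`."""
    i = bisect_left(bucket_ends, created)
    if i == len(bucket_ends):
        # Newer than every known bucket; a real run would open a new bucket.
        return "unsorted"
    return bucket_label(bucket_ends[i])


def group_repos(repos):
    """Group (name, creation_date) pairs into {bucket_label: [names]}."""
    groups = {}
    for name, created in repos:
        groups.setdefault(assign_bucket(created), []).append(name)
    return groups
```

Because the bucket ends are kept sorted, `bisect_left` finds the right range without scanning the whole list, and a repository created exactly on an end date lands in that bucket rather than the next one.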
Git repositories - “craft”
This section is a mixed bag. It mostly contains Minecraft clones or games inspired by Minecraft, and projects that have nothing to do with games at all—they just have a name like "FooCraft" that sounds Minecraft-ish.
- 1970-October 2012
- -March 2013
- -September 2013
- -April 2014
- -August 2014
- -November 2014
- -January 2015
- -April 2015
- -June 2015
- -August 2015
- -September 2015
- -November 2015
- -January 2016
- -June 2016
- -September 2016
- -November 2016
- -February 2017
- -March 2017
- -April 2017
- -June 2017
- -July 2017
Git repositories - “bukkit”
As you'd expect, this contains mostly repositories relating to the Bukkit project.
Archives made by other people
These items are in the "Minecraft Archive Project" collection on the Internet Archive, and they'll interest anyone who cares about the MAP, but I didn't make these archives. Other people did.
There's no secret, really. I just wrote a lot of Python scripts and let them run for a really long time. When one script finishes, I run the next one in the sequence. I go into some detail about my process in a 2015 blog post.
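For a rough idea of what "run one script, then the next" can look like, here is a minimal sketch. The function name and the stand-in commands are assumptions made for illustration; the real capture scripts are not shown here and may be chained quite differently.

```python
import subprocess
import sys


def run_in_sequence(commands):
    """Run each command to completion before starting the next.

    Stops at the first command that exits with a nonzero status and
    returns the list of commands that finished successfully.
    """
    completed = []
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            break
        completed.append(command)
    return completed


# Stand-in steps so the sketch runs anywhere. In practice each entry would
# invoke one capture script (the names here are hypothetical placeholders).
steps = [
    [sys.executable, "-c", "print('step 1: fetch listings')"],
    [sys.executable, "-c", "print('step 2: download files')"],
]
finished = run_in_sequence(steps)
print(f"{len(finished)} of {len(steps)} steps finished")
```

Stopping at the first failure is a design choice for the sketch: a later step usually depends on what the earlier one downloaded, so carrying on after an error would mostly produce noise.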
Using the data from the initial MAP capture in 2014, I created these projects:
ESC: the Ephemeral Software Collection
As the Minecraft Archive Project grew, I started getting data from sites like CurseForge and GitHub which contain both Minecraft and non-Minecraft stuff. I started the Ephemeral Software Collection to hold the non-Minecraft stuff. Before long, the ESC became larger than the Minecraft Archive Project that spawned it—over four terabytes as of February 2016.
You can see an overview of the ESC here. It's an eclectic collection, but I generally think of it as containing software that's at risk of being lost, forgotten or destroyed. It contains the equivalent of the Minecraft Archive Project for games other than Minecraft. It also contains mods, add-ons, software created for one-off events like game jams, implementations of classic computer games, software that exists in a copyright or trademark grey area, experimental code, and stuff I just think might be interesting or useful later.
I'm not even going to try to archive a comprehensive collection of ephemeral software, but I figure I might as well collect what I can, since it's mostly a matter of letting a script run and filling up old hard drives.
What I Didn't Capture
I could spend my whole life archiving this stuff, but... I don't want to. Twice a year I run the basic Minecraft capture scripts on Planet Minecraft and CurseForge. Everything else I get, I consider a bonus.
Whenever I discover or hear about some new dataset of ephemeral software, I put it on the following list and then forget about it. If you're inspired by what I've done with the Minecraft Archive Project and the Ephemeral Software Collection, a great way to show your appreciation would be to tackle one of these projects. Otherwise we'll see if these sites are still around when I retire.
If you happen to run one of these sites and would like to contribute a mirror to the MAP or ESC or the Internet Archive, and make sure your users' creations don't get lost, please send me email at email@example.com.
Adding to the Minecraft Archive Project
The holy grail of the Minecraft Archive Project is a way to automatically archive active public Minecraft servers. There's no technical obstacle to doing this: walking around on a server streams the chunks to the client, and there are even mods for archiving the streamed chunks. But I've never gotten these mods to work, and getting them to work automatically, across hundreds of thousands of servers running different versions of Minecraft, requires work and resources far beyond what I can bring to the project. Thinking of applying for a digital preservation grant? Try this project out.
I would like to set up a dead-drop email address where people can send their zipped-up Minecraft worlds to explicitly put them in the MAP without publishing them anywhere else. This creates a lot of problems that I don't have time to deal with, so I haven't made any serious attempt at this.
Getting into the more achievable goals, there are more Minecraft maps at Minecraft Maps, MinecraftDL, 9Minecraft, etc. I don't even know if these sites have anything new or if it's all duplicates of things I already have. I haven't gone through them because adding a new site to the rotation is a lot of work, and these collections are very small compared to the Minecraft forum or Planet Minecraft.
Back in May 2014 I archived maps from Minecraft World Share and Minecraft World Map, but I haven't been back. It's a similar situation—they have a couple thousand maps but the collection is relatively small and doesn't grow quickly the way Planet Minecraft does.
The Technic Platform hosts thousands (not sure of the exact number) of Minecraft mod packs.
Adding to the Ephemeral Software Collection
My top wishlist item for the Ephemeral Software Collection is a way to archive all the Super Mario Maker levels. I have no idea how to do this—I suspect you need to mod a Wii U.
Why am I concerned about Super Mario Maker? Because of what happened to WarioWare D.I.Y. Four years after this DS game was released, Nintendo shut down the servers that allowed you to share your minigames. Now the only way to collect old D.I.Y. levels is to buy old cartridges and rip them.
Steam Workshop hosts millions of add-ons for over 300 games, as well as screenshots and links to hosted videos. It seems extraordinarily difficult to download the files, though. I think it's impossible if you don't own the games, and you'll probably need to hack a Steam client if you want to download the add-ons in a systematic way.
I'd like someone to archive all the board game rules and other files on BoardGameGeek. I regularly archive BGG game metadata for the Loaded Dice project, but getting the files is a much trickier proposition.
YouTube hosts petabytes of gaming videos, and there's no way to save it all, but it should be possible to archive a gameplay video for every game in MobyGames. It's especially important to archive gameplay videos for mobile and online games, which can die as soon as the game studio shuts down a server.
The Terraria forums have links to mods and maps.
Hacked console ROMs (Super Mario World, Sonic the Hedgehog, etc.): there's a big collection at Romhacking.net. I'm sure other people have private collections of these, so it's not as big a deal.
Civilization add-ons at CivFanatics.
The Sims mods at Mod the Sims.
Nexus Mods hosts over 100,000 add-ons for over 200 games.
Glorious Trainwrecks archives thousands of quickly-created games.
Whenever we humans create a new art form, the early stuff gets lost. It's not considered "art", it doesn't fit into the existing archives, it's a pain to collect, expensive to keep around, and nobody's in charge of saving it. So it gets lost. This is especially true for art forms favored by children or other people who aren't considered artists.
Time passes, and we regret the loss. We cherish every scrap that survives. Ninety percent of humanity's early films are gone, and a lot of the ten percent is crap, but we preserve it all because there's nothing else like it. Sometimes the crap turns out to be pretty good after all: pulp sci-fi and noir. Even ephemera, things that never get raised to the level of "art", become valuable as windows into the past: account books, restaurant menus, road maps, receipts.
I believe all this stuff is art and I want to save it. But even if history disagrees with me, and the MAP and the ESC are classified as ephemera, that's fine too. In the long run, it's all ephemera.