The nice folks at Stack Overflow publish their entire data set in XML format. It’s tons of fun for demos, but you need a way to get it into a relational database.
The Stack Overflow Data Dump Importer (SODDI) makes this point-and-click easy. Just download the latest release, install it, download the XML exports, and you can import them into MySQL, SQLite, or Microsoft SQL Server.
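If you're curious what an import like this involves, here is a rough Python sketch of the general idea: stream the XML one row at a time and batch the inserts into SQLite. This is not how SODDI itself is implemented. The function name `import_rows`, the column list you pass in, and the batch size are all made up for illustration, and it assumes each export file is a flat list of `<row ... />` elements whose attributes are the column values.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def import_rows(xml_source, conn, table, columns, batch_size=1000):
    """Stream <row .../> elements from xml_source into a SQLite table.

    Hypothetical helper for illustration only: Id is declared INTEGER,
    every other column is TEXT, and missing attributes become NULL.
    """
    col_defs = ", ".join(
        f'"{c}" INTEGER' if c == "Id" else f'"{c}" TEXT' for c in columns
    )
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in columns)
    insert_sql = f'INSERT INTO "{table}" VALUES ({placeholders})'

    batch, total = [], 0
    # iterparse yields each element as its closing tag is read, so the
    # whole file never has to be loaded as one tree up front.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != "row":
            continue
        batch.append(tuple(elem.get(c) for c in columns))
        elem.clear()  # drop the row's attributes once they are copied
        if len(batch) >= batch_size:
            conn.executemany(insert_sql, batch)
            total += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        conn.executemany(insert_sql, batch)
        total += len(batch)
    conn.commit()
    return total


# Tiny in-memory example standing in for a real export file.
sample = b'<posts><row Id="1" Title="Hello" /><row Id="2" /></posts>'
conn = sqlite3.connect(":memory:")
count = import_rows(io.BytesIO(sample), conn, "Posts", ["Id", "Title"])
```

For a real export you would pass a file path instead of the `BytesIO` object and a fuller column list. A real importer also has to deal with proper data types, indexes, and much larger files, which is the work SODDI saves you.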
Last week, we sponsored @AndrewBrobston to make a few improvements and release v1.4:
- Add support for the PostHistory, PostLinks, and Tags tables
- Enable you to set the Id fields as identities (really useful for the workloads in my training classes)
- Re-enable MySQL and SQLite support (it had broken in prior releases)
@BennetElder also added a database connection dialog to make it easier to pick & choose the target database.
- To import Stack Overflow (or any of the other data exports, like the smaller ones), download the latest SODDI and read the instructions on GitHub
- If you run into issues, check the list of known issues, and file a new one if yours isn't there
- To just start querying fast, download the Stack Overflow database via BitTorrent (but be aware that it’s big)
- Learn about the Stack Overflow database schema
- Get sample Stack Overflow queries from data.stackexchange.com for your performance tuning training