StackOverflow, ServerFault, and SuperUser are Q&A sites for IT professionals. I’ve blogged about why I like ServerFault before, but perhaps the coolest thing for database people is that they make their data public. Every month, StackOverflow dumps its data out to XML. You can import the data dump into SQL Server, and the whole thing is less than 10 GB as of this writing.
But you haven’t done that because you’re lazy.
You just want to open up SQL Server Management Studio 2008 or Toad for SQL Server and connect to the database. Alright, you got it:
- SQL Server: brentozar.dyndns.org (as of this writing, it’s 18.104.22.168 – if the name doesn’t work for you, try the IP)
- Username: StackOverflow_Reader
- Password: c0mm0ns
- Databases you can access: ServerFault, StackOverflow, StackOverflowMeta, SuperUser
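If you would rather connect from a script than from a GUI, the details above map onto an ODBC-style connection string. Here is a minimal Python sketch that only builds the string. The function name `build_connection_string` and the driver name are my assumptions, not something this server requires, so swap in whichever SQL Server ODBC driver you actually have installed.

```python
# The four databases listed above.
DATABASES = ("ServerFault", "StackOverflow", "StackOverflowMeta", "SuperUser")


def build_connection_string(
    database,
    server="brentozar.dyndns.org",
    port=1433,
    driver="ODBC Driver 17 for SQL Server",  # assumption: use your installed driver
):
    """Return an ODBC-style connection string for one of the public databases."""
    if database not in DATABASES:
        raise ValueError(f"unknown database: {database!r}")
    return (
        f"DRIVER={{{driver}}};"
        f"SERVER={server},{port};"  # SQL Server separates host and port with a comma
        f"DATABASE={database};"
        "UID=StackOverflow_Reader;"
        "PWD=c0mm0ns;"
    )


print(build_connection_string("StackOverflow"))
```

The resulting string can be handed to an ODBC library such as pyodbc (`pyodbc.connect(...)`), assuming you have that package and a matching driver installed.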
The data is not a “live” copy – it’s just the monthly Creative Commons data dump, and the schema is the raw output from Sam Saffron’s data dump tool linked above. The server is a desktop-class machine, nothing fancy, and it’s using my home internet connection. (Yes, that’s why I’m posting this halfway through the day on Friday – easing into the load.) You can get a snapshot of my current desk gear and my servers at Flickr.
If you can’t connect to the SQL Server, there’s probably a firewall blocking port 1433 between your workstation and my lab. Please don’t leave a note to complain – try accessing it from another location, like from your house instead of your work.
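A quick way to tell a firewall problem from a login problem is to check whether a plain TCP connection to port 1433 opens at all. The sketch below uses only Python's standard library. The helper name `can_reach` and the five-second timeout are arbitrary choices of mine, and a successful check only means the port is reachable, not that your login will work.

```python
import socket


def can_reach(host, port=1433, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


# Example: can_reach("brentozar.dyndns.org") returns False when something
# between you and the server is blocking port 1433 (or the server is down).
```

If this returns False at work but True from home, the block is on your side of the connection.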
To learn more about the schema and how to query it, check out these articles at SQLServerPedia: