I challenged you to decide:
- What data needs to be stored to make all this work?
- What database or persistence layer should we use?
So let’s talk through each of ’em.
1. What data needs to be stored?
The 3 major components are:
- Rooms – with attributes like the room name, URL, room owner email
- Messages – including the message, person’s name, person’s email address. Initially, we didn’t want email address, but we added it and made it optional just so we could show Gravatar icons next to peoples’ names.
- Votes – and this is where your opinions will play out in the comments. The data professional in you wants to capture the details of every vote, and you might want to add a separate normalized Votes table for it. However, for the sake of this app, we don’t need granular tracking of votes. We just need to increment the number of votes on a message – and for that, it makes sense to just have a VoteCount column on the Messages object. I know. You’re not happy about that. But remember, we have a goal to track as little data as possible.
Here’s one of the early data model sketches from Richie’s notebook:
Richie’s sketch adds a fourth data object, connection, that’s required for tracking each of the browsers connected at the moment. We don’t need that for evil purposes – we just need to push updates back to them so their screen can refresh in real time.
Just for clarity’s sake, this isn’t the exact structure of PollGab’s current data model – but I like showing this early draft because it shows the thought process. Data models change over time, especially when you’re building a brand new app like this one where we can iterate quickly, and we don’t need to deal with queries coming from other systems, like reporting tools.
There are additional things to think about as well, like banning users and making payments, but I’ll keep it simple for the sake of this discussion. (And really, in a modern app, you don’t want to hold the payments yourself – that’s what services like Stripe are for.)
In terms of the way we load and query this data, here are a few sample use cases:
- View the list of rooms here
- View the list of messages for one room (like Office Hours)
- Insert a new message
- Upvote a message
- Flag a message
- Delete all messages in a room
You’ll notice that the views are fairly tightly scoped: we’re not doing things like analytical reports.
2. What database or persistence layer should we use?
I know, most of us here use the Microsoft data platform to make a living. However, dating back several years ago when Richie built PasteThePlan.com, then when he built SQL ConstantCare®, we’ve long believed that where we can avoid using a relational database, we should.
NoSQL systems are generally cheaper and easier to manage than relational databases. Yeah, I said it. Come at me. Try hosting production-quality, auto-scaling relational databases for $3.43 per month. It’s damn hard.
So like PasteThePlan (but unlike SQL ConstantCare®), we ended up using Amazon DynamoDB. I know it’s marketing, but the main boxes on the DynamoDB product page do a pretty good job of summing up its advantages:
DynamoDB does have drawbacks for most of us in the audience: we can’t just whip open SSMS or Azure Data Studio, write a query with joins and grouping, and get reports out. However, for the sake of this web site, we just don’t need reporting across multiple rooms or questions.
During the design phase, I specifically said that any kind of marketing analytics were completely out of scope. I recognize that ad agencies and podcasting companies might want to do centralized purchases for lots of podcasts or rooms, and then get analytical data to know which users are asking the most questions, or what kinds of questions get the most upvotes. They’re just not our target market for the initial product release.
This is a cloud-native, locked-in app.
We designed this from the ground up to work exclusively with Amazon DynamoDB. This means we’re stuck there, for better or worse. If Amazon DynamoDB goes down, PollGab’s going down with it. As a small business owner, that’s a risk I’m totally willing to take. DynamoDB has much higher uptime than any server I’ve personally been involved with building and maintaining.
I’m sharing this with you because I want you, dear reader, to understand why startups and small businesses might choose to go cloud-native and build their applications without traditional relational database systems. I love databases! I still absolutely love ’em. But I’d rather not be on call for fixing them when (not if) they break.
I’m also being fine locked into AWS because we chose to keep building with Amazon Lambda, the same hosting and processing architecture that we chose for PasteThePlan and SQL ConstantCare®. Lambda lets you write short-lived functions, and you don’t have to know anything about the operating system or servers that run your code. When someone hits a URL, the code runs, period. When lots of people come in, AWS will gradually spin up more… somethings… that run your code. That’s just not really our problem.
Could we have used Azure or Google services instead? Absolutely! But at the time we started building this, Richie had already built up 5 solid years of experience with DynamoDB and Lambda, so we were able to start building more quickly without learning new tooling.
Could we have used a conventional server-based application, like a web site in IIS or a .NET app? Absolutely! But we wanted to be able to completely outsource uptime management and troubleshooting to Amazon, and that’s where Lambda makes things easier. The best evidence for that is that as Richie continues to build stuff, he’s able to spend an absolute minimum of time managing our already existing apps like PasteThePlan and SQL ConstantCare.
What questions do you have about PasteThePlan’s data model or persistence layer? Let us know in the comments.