Right now, someone in your company is thinking about moving your operations up into the cloud. If your company is like 95% of companies out there, “the cloud” is really a synonym for Amazon Web Services (AWS). Thinking about a move to AWS is happening across many companies, and it’s not something that you need to be scared of – nobody is going to lose their job. However, it’s important to understand what you and your company can do to make this transition successful, painless, and an opportunity for growth.
How fast is your application right now? Do you know when you’ve hit peak load or even which metrics signify peak load?
Having a performance baseline is the only way to make sure that changes are helping. A general feeling that things are getting better isn’t enough, you need to know. During our engagements with customers, we evaluate how applications are performing today. We gather metrics around disk and CPU performance and look for signs of existing and potential bottlenecks. Once we know where things are, it’s easy to figure out where things are going.
As you’re looking to move to AWS, take a look at your current performance baseline. Are you happy with those numbers? Things probably won’t get better after the move; virtualization has overhead and everything in AWS is virtualized. Once you’ve got a baseline of your performance, it’s easy to test your theories about performance and make decisions about how aggressively you need to tune your application to meet your promises to your customers.
Service Level Agreements
SLAs are so important that near the beginning of every engagement I ask, “Do you have an SLA with your customers? What about internal SLAs between your teams?”
An SLA can be as simple as responding to outages within a certain window or as complex as measuring the performance of specific actions in the application. Having an SLA in place early can make it easy to evaluate your ability to move up to AWS and still maintain acceptable performance. If you know that a specific set of web pages must return faster than an average of 250ms during peak load, you have an easy measurement to determine if performance is acceptable.
Distilling performance down to a set of metrics makes it easy to make decisions, spot problems, and design for the future. In addition to a performance baseline, SLAs also give guarantees. If query performance drifts outside of agreed upon norms, the SLA can describe who is going to work on improving performance and how quickly that work gets scheduled.
SLAs aren’t just performance related – they relate to how fast you can bring a system online in the event of an outage, how you will respond to potential performance issues, and what the shape of that response looks like. Everyone aspires to five 9s of uptime, but what’s important is how you handle unforeseen outages.
An SLA shouldn’t be a detailed, painful document that resembles of a software license. A well designed SLA will serve to drive customer interaction and push teams to take responsibility for making the application perform well.
What does your monitoring environment look like today? If you have monitoring in place, I suspect that you’re looking for problems as they’re happening. To maintain good performance in AWS, you need look for performance problems before they occur.
You can use your existing monitoring, or you can use Amazon’s CloudWatch to create alerts around trends in resource utilization. If you know your performance baselines, you can configure alerts to notify you when things are out of the ordinary. If CPU utilization has never gone higher than 55%, even under peak load, it’s helpful to set up an alarm to fire when CPU utilization has been higher than 60% over a period of 15 minutes.
It’s better to be aware of potential problems than to respond to a fire. Make sure performance warnings are different from alerts about actual problems, but also make sure that you’re doing something about warnings as they arrive. Being proactive about monitoring performance does more than help you keep things running smoothly; proactive monitoring gives you insight into where you can tune your applications and make things better in the long run.
What are your thoughts around AWS? What’s the big picture? Having a vision around AWS is going to be critical to a successful transition.
A vision around AWS can’t be a simple statement like “We’re going to move our operations into the cloud and save a lot of money.” While saving money is a nice goal, the vision around AWS needs to be something more than fork lifting your existing infrastructure into the cloud.
Having a strong vision about how AWS can help your organization meets its goals is critical. This doesn’t have to be highly detailed and include specific features, but make sure that the vision includes an understanding about how you will be deploying and refining your application over time. Moving to AWS is not a quick fix for any problem. Your vision needs to include what today looks like, what your goals look like, and how you’ll be working toward them over the next 3, 6, 9, and 12 months.
Taking full advantage of the rich feature set in AWS takes time, and your vision should reflect how you will make the move and monitor application behavior in order to make good decisions about direction and functionality.
Most important is buy in. Everyone involved needs to buy into the idea that a move to AWS makes sense for the organization. This may be the hardest item to accomplish, but it’s worth making sure that your team is on the same page.
Having a team that’s accepting of the move will make the transition easier. Part of this acceptance is the realization that things will not be the same as they were before. Gone are the days of giant back end servers with hundreds of gigabytes of memory and multiple network cards. Performance is no longer a purchase order away. Your team needs to accept these statements as facts; you can’t easily upgrade your way out of performance problems in AWS.
Instead of buying their way out of problems, the team needs to be committed to investigating new ways to solve their problems. With AWS it’s simple to design a rapid prototype and direct load to the new prototype system. It’s important that everyone is on board with testing changes at any level – from a single function to the entire infrastructure. In AWS these sweeping changes are easy to do, but you need buy in from everyone involved that things don’t need to be the same as they were before.
Drifting Away Into Cloudy Success
If you don’t have all of these factors in place, are you going to fail? Probably not. These are distinct traits that I’ve found in companies who have successfully moved their infrastructure into AWS. The more of these that a company has possessed, the happier I’ve found them to be with AWS. As you’re considering a move into AWS remember that there are more than technical challenges to moving into AWS – there are human and organizational challenges that need to be met in order to ensure success.