Standard architecture for modern webapps
It's interesting to see that new web applications are still being built that are not prepared for the cloud. If they get hit by what was once called "the slashdot effect", they simply struggle to serve requests, which is unacceptable in 2013. So I decided to share what I see as a standard architecture for modern REST-based webapps. It is heavily influenced by Amazon's reference architecture called "Web Application Hosting", but with two differences:
- CloudFront is in front of all requests, including dynamic content.
- Databases holding critical data are separated out and sit behind specialized applications.
The first item is a trick I learned at the last AWS Summit. Its main purpose is to keep AWS close to the end user even when you don't have servers in a region near them (Japan, for instance). In this case, the user connects to CloudFront, and CloudFront connects to your servers in your region. In the worst case, the speed is the same as the user would experience without CloudFront, but the reasoning is that AWS's connection to its own resources is better than another ISP's connection. The trick is to keep serving static content from CloudFront as usual, and set the TTL to 0 for dynamic resources.
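One common way to get that split is to let the origin decide through its `Cache-Control` response headers. The sketch below is only an illustration: the path prefixes and the one-day lifetime are made-up values, and it assumes the CloudFront distribution is configured with a minimum TTL of 0 so that it honours the origin's headers.

```python
# Hypothetical path prefixes for static assets; adjust to your application's layout.
STATIC_PREFIXES = ("/static/", "/images/", "/css/", "/js/")
STATIC_MAX_AGE = 86400  # one day in seconds, an arbitrary illustrative value


def cache_control_for(path: str) -> str:
    """Return the Cache-Control header the origin would send for a request path."""
    if path.startswith(STATIC_PREFIXES):
        # Static content: let CloudFront (and browsers) keep a copy.
        return f"public, max-age={STATIC_MAX_AGE}"
    # Dynamic content: a TTL of 0, so every request is forwarded to the origin.
    return "no-cache, max-age=0"


print(cache_control_for("/static/app.js"))
print(cache_control_for("/api/users/42"))
```

The application (or the web server in front of it) would attach this value to each response, so the static and dynamic behaviours live in one place instead of being spread over several CloudFront settings.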
The second point might seem unnecessary, but given how many applications are still being compromised on a daily basis, I think it should help in securing critical data, like user information (name, email, password, ...) and third-party application data (for applications which manipulate user data on behalf of the user). Some other advantages include:
- You can change the data-store type for the user data and application data. For instance, you can use a NoSQL database for them.
- You can easily implement new security controls, without affecting the main application. For instance, it's trivial to add extensive auditing and logging to the user info part, without affecting the performance of your backend.
- You can scale just the parts that need scaling.
- You can afford to do things that are not perfect in the "performance" area, because only this small service is affected. And if this becomes a problem, either change the algorithm or let it scale up.
- And of course, the most obvious one is that your users' data will not get leaked if you get a SQL injection in your biggest code base (the backend or frontend). Your users' data is better protected.
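To make the idea more concrete, here is a minimal sketch of what such a specialized user-info application could look like. Everything in it is hypothetical (the class name, the method names, the in-memory dictionary standing in for the real data store, the PBKDF2 parameters). The point is only that the main application gets a narrow set of operations, each one audited, and never direct access to the underlying table.

```python
import hashlib
import hmac
import logging
import os

# Dedicated audit logger; in a real setup it would ship to its own log store.
audit_log = logging.getLogger("user_info.audit")

PBKDF2_ROUNDS = 100_000  # illustrative value, not a recommendation


class UserInfoService:
    """Hypothetical stand-in for the small service that owns user data."""

    def __init__(self):
        # In-memory store for the sketch; it could be SQL, NoSQL, anything.
        self._users = {}

    def create_user(self, email, password, caller):
        salt = os.urandom(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, PBKDF2_ROUNDS)
        self._users[email] = (salt, digest)
        audit_log.info("create_user email=%s caller=%s", email, caller)

    def check_password(self, email, password, caller):
        # Every access is audited, without touching the main application.
        audit_log.info("check_password email=%s caller=%s", email, caller)
        record = self._users.get(email)
        if record is None:
            return False
        salt, digest = record
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, PBKDF2_ROUNDS)
        # Constant-time comparison; the stored digest never leaves the service.
        return hmac.compare_digest(candidate, digest)


service = UserInfoService()
service.create_user("alice@example.com", "s3cret", caller="front-end")
print(service.check_password("alice@example.com", "s3cret", caller="front-end"))
```

Because the front end can only ask "is this password correct?", a SQL injection there has nothing to dump: the password data sits behind an interface that never returns it.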
And to spice it up a bit, you can bundle some security features around it, using AWS's Security Groups:
- Only instances in the sg-front-end can talk HTTP with instances in sg-user-info (and similar for sg-back-end > sg-app-info)
- Machines in sg-user-info or sg-app-info cannot start a conversation (other than, perhaps, to talk with Chef servers and/or software update repositories)
- SSH is accepted only if it comes from sg-ssh-bastion (not shown in the picture)
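The three rules above can be written down as a small allow-list, which is handy for reasoning about them (or for unit-testing a firewall policy before applying it). This is a plain Python model, not a call to any AWS API. It covers only the rules listed here, so other traffic in the picture and the Chef/update-repository exceptions are left out.

```python
# Group names follow the text; the model itself is purely illustrative.
GROUPS = ("sg-front-end", "sg-back-end", "sg-user-info", "sg-app-info")

# Each entry is (source group, destination group, protocol).
ALLOWED = {
    ("sg-front-end", "sg-user-info", "http"),
    ("sg-back-end", "sg-app-info", "http"),
}
# SSH is accepted on every group, but only when it comes from the bastion.
ALLOWED |= {("sg-ssh-bastion", group, "ssh") for group in GROUPS}


def is_allowed(source, destination, protocol):
    """Return True if the modelled rules let `source` open a connection."""
    return (source, destination, protocol) in ALLOWED


print(is_allowed("sg-front-end", "sg-user-info", "http"))  # front end may ask for user info
print(is_allowed("sg-user-info", "sg-front-end", "http"))  # user-info cannot start a conversation
```

Since sg-user-info and sg-app-info never appear as a source, the second rule ("cannot start a conversation") falls out of the allow-list for free.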
And the price is still not bad: for a development environment (without the ELBs), you'd spend somewhere between 0.50 USD per month (40 hours/week) and around 50 USD per month (24x7), with micro instances for each layer. For a production setup, as shown in the image, you'd spend around 150 USD/month with 3-year "reserved" instances (1200 USD upfront). Certainly not the cheapest option out there, but surely the cheapest option for this type of highly available and secure architecture.
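To compare this with other offers, it helps to spread the upfront payment over the reservation term. The small calculation below assumes the 150 USD/month figure is the recurring charge and does not already include the 1200 USD upfront payment; if it does, the monthly number simply stays at 150.

```python
def amortized_monthly_cost(recurring_per_month, upfront, term_months):
    """Recurring monthly charge plus the upfront payment spread over the term."""
    return recurring_per_month + upfront / term_months


# Figures from the text: 150 USD/month, 1200 USD upfront, 3-year term.
print(round(amortized_monthly_cost(150, 1200, 36), 2))  # 183.33
```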