When working with my customers on AWS, I keep seeing one problem come up again and again. I call it “suboptimal account setup”.
Starting with AWS is very easy. Opening an account is a fast process. The full power of AWS becomes reachable by anyone in your organization. Experimentation and quick POCs are done by different people and internal teams. Excitement is rising. New services are being explored. The first applications are moved to that account, and the gains in speed to market and agility are visible. People are being praised for the good work they’ve done. More stuff is being deployed to AWS. Sky is the limit…
…and then reality hits. Your account becomes unmanageable. Operational issues show up everywhere. You have lost track of who can access what, resources are publicly accessible, backups are non-existent or misplaced somewhere, logs are all over the place, services were created by hand through the web console, there is no infrastructure as code anywhere, and security is at the mercy of God.
When you find yourself in such a state, migration is inevitable: you open a new account and do it properly from scratch. That takes time, so if possible it is better to prevent it from happening.
I am currently engaged by 3 customers to fix exactly these kinds of issues. If you think you might be in a similar situation, feel free to contact me. We can do an assessment of your current environment together and see where you are at.
But to be fair to all of you who are in such a situation, it really isn’t your fault. I partially blame AWS for not putting enough focus on this topic. There are numerous blog posts about how quickly you can spin up some cool data analytics or AI/ML solution, but very few words on the actual prework that needs to be done to have a healthy and optimal environment to deploy such solutions into. Yesterday I was searching for books on AWS. There are about 15-20 books available and all of them talk about the same things – how to spin up an EC2 instance, how to use Lambda etc. But none of them discusses how to set up an optimal AWS account before launching any of those services. I am actually tempted to write such an eBook. Talk me out of it please. 🙂
You might ask now: “But wait, what about Control Tower? Isn’t that service meant to be used exactly for this purpose?”
AWS Control Tower is a service that tries to solve this issue by setting up an AWS account structure according to the best practices of the AWS Well-Architected Framework. It automates the creation of a Landing Zone in your management account.
In my opinion, AWS Control Tower is not a viable solution. It comes with limitations:
- inability to create sub-OUs. If a new Organizational Unit is created through Control Tower, it always has Root as a parent.
- guardrails are fixed; you can’t add new ones
- when you update Control Tower to a new version, you often need to update individual accounts as well (either re-register the OU or update the account through Service Catalog)
- fragility – it is easy to break Control Tower or push it out of sync, for example by manually changing AWS Config rules. Control Tower governs your accounts by running CloudFormation StackSets in the accounts. Any accidental deletion of those StackSets puts Control Tower out of sync.
Also, AWS Control Tower does not actually bring any new functionality of its own. It configures existing services for you, such as AWS Organizations, AWS SSO, AWS Config, AWS CloudTrail etc. The issue I see here is that for any change of users, groups or permissions, you are redirected to the other services where these things are actually configured. So, you still need to know how to operate AWS Organizations, AWS SSO and the other services. If you know enough to operate those services on their own, you really don’t need AWS Control Tower. You can do it yourself, without the limitations listed above and with much more control and flexibility.
So, how do you create an optimal AWS account setup?
These are the steps that I follow to assess existing accounts and set up new ones:
- Create a new AWS account – set up billing, contact and security verification information so that you receive proper billing/tax invoices and AWS is able to contact you if necessary. This is also where you properly protect the root user and the Admin IAM user. Create an IAM user with access to billing only.
- Create AWS Organizations – set up a hierarchy of Organizational Units and create individual accounts in each of those OUs. For each new account, set up budgets and alarms and delete the default VPCs in each region of interest. At a minimum, you should always have separate Security, Network and Logging accounts. The Sandbox OU should have a set of individual accounts for developers. The Suspended OU should hold accounts that will be deleted. The Workloads OU should hold all dev/test/prod accounts of each application.
- Set up SSO – enable AWS SSO or integrate a third-party identity provider (Okta, Auth0, OneLogin, JumpCloud etc.). Set up groups and roles and configure access over SSO. Each user should receive instructions on how to access AWS via the console and/or CLI.
- Set up SCP guardrails – as mentioned in my previous blog post, I usually set up 12 common guardrails to start with.
- Set up networking services – in the Network account, set up Transit Gateway/VPN/Direct Connect.
- Set up log shipping – in the Logging account, create S3 buckets that will receive logs (Config and CloudTrail) from all other member accounts.
- Set up Delegated Admin – make the Security account a delegated admin for the security services: AWS Config, AWS CloudTrail, Amazon GuardDuty, AWS Security Hub and IAM Access Analyzer. Your security teams should have access to the Security account, from where they can see the aggregated findings and issues of all member accounts.
- Set up AWS Config rules – for each individual account, decide which Config rules should be applied. There are 37 preconfigured rules that can be extended by your custom rules.
- Set up an inspection tool for CloudTrail logs – CloudTrail logs are the most important source of information about who did what and when in your AWS environment. However, CloudTrail logs are written to S3 as an endless list of gzipped JSON files. To inspect them and find what we need, it is important to have a proper analysis tool. I like to use Amazon Athena and CloudWatch Logs Insights as the CloudTrail inspection tools.
- Notification setup – incidents happen and we need to notify admin teams about them. For that purpose I like to set up bots (Marbot or AWS Chatbot) to notify me on Slack channels if something goes wrong.
- Remote access – if your developers/third-party consultants need to access some of your instances, the best way is to set up a Client VPN endpoint in the Network account and configure access to the instances. Additionally, we can set up WorkSpaces and Cloud9 environments for developers to access AWS accounts without giving them access to the rest of the AWS environment.
- Service Catalog – any kind of infrastructure code that you script through CDK/CloudFormation should be uploaded to Service Catalog. Service Catalog can share these resources with other member accounts so it’s easy for developers to spin up e.g. new CI/CD pipelines by using predefined products in Service Catalog.
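To make the SCP guardrails step a bit more concrete, here is a minimal Python sketch that builds a service control policy document from two commonly used deny statements. These two are illustrative only and are not the 12 guardrails from my previous post. The function names, the allowed regions and the list of exempted global services are all assumptions that you would adapt to your own organization.

```python
from __future__ import annotations

import json


def deny_leave_organization() -> dict:
    """Statement that stops member accounts from leaving the organization."""
    return {
        "Sid": "DenyLeaveOrganization",
        "Effect": "Deny",
        "Action": ["organizations:LeaveOrganization"],
        "Resource": "*",
    }


def deny_outside_regions(allowed_regions: list[str]) -> dict:
    """Statement that denies requests made to regions outside the allowed list.

    Global services are exempted through NotAction. The exemption list below
    is a hypothetical starting point, not a complete one.
    """
    return {
        "Sid": "DenyOutsideAllowedRegions",
        "Effect": "Deny",
        "NotAction": ["iam:*", "organizations:*", "sts:*", "support:*"],
        "Resource": "*",
        "Condition": {
            "StringNotEquals": {"aws:RequestedRegion": allowed_regions}
        },
    }


def build_scp(statements: list[dict]) -> str:
    """Wrap the statements in a policy document and return it as JSON text."""
    return json.dumps({"Version": "2012-10-17", "Statement": statements}, indent=2)


if __name__ == "__main__":
    # Example regions are assumptions; use the regions you actually operate in.
    print(build_scp([
        deny_leave_organization(),
        deny_outside_regions(["eu-west-1", "eu-central-1"]),
    ]))
```

The resulting JSON is what you would attach to an OU or account in AWS Organizations. Keeping each guardrail in its own small function makes it easier to review and to decide per OU which ones to combine.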
That would be my optimal AWS account setup. Now you are ready to deploy all those AI/ML and other workloads.
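For the CloudTrail inspection step in the list above, it helps to see what those gzipped JSON files actually contain. The following Python sketch is not a replacement for Athena or CloudWatch Logs Insights; it only reads a single log file that you have already downloaded from S3 and lists who did what and when. The function names and the file name in the usage example are hypothetical.

```python
import gzip
import json


def load_records(path):
    """Read one gzip-compressed CloudTrail log file and return its records.

    Each file holds a JSON object with a top-level "Records" list.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle).get("Records", [])


def who_did_what(records, event_name=None):
    """Return (eventTime, principal ARN, eventSource, eventName) tuples.

    If event_name is given, only matching events are kept. Rows are sorted
    by event time, which is an ISO 8601 string and therefore sorts correctly
    as text.
    """
    rows = []
    for record in records:
        if event_name and record.get("eventName") != event_name:
            continue
        identity = record.get("userIdentity", {})
        rows.append((
            record.get("eventTime"),
            identity.get("arn", "unknown"),
            record.get("eventSource"),
            record.get("eventName"),
        ))
    return sorted(rows, key=lambda row: row[0] or "")


if __name__ == "__main__":
    import sys

    # Usage (file name is a placeholder): python inspect_trail.py trail.json.gz
    for row in who_did_what(load_records(sys.argv[1])):
        print(*row, sep="  ")
```

This works for a quick look at one file. Across thousands of files and many accounts it does not scale, which is exactly why a query tool on top of the log bucket is worth setting up.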