Dev team experience
Let's imagine that you've started a team with two developers, Ann and Bob. The two of them are starting a brand new app. This app will be a web service called search that your customers will be able to access at https://search.your-company.com. Here is how Ann and Bob can bring this app to life:
Scaffold the app
- First, Ann browses your company's Service Catalog to see what languages and frameworks you support for web services. She sees a list that includes:
  - Java on Spring Boot.
  - Ruby on Rails.
- Ann and Bob are both familiar with Ruby on Rails, so Ann picks that, enters `search` as the name, and clicks "Create new Ruby on Rails app". The Service Catalog then:
  - Creates a new repo called `search` in your company's GitHub org.
  - Generates the scaffolding for the Ruby on Rails app and commits the code to the `search` repo. The scaffolding includes all the basic libraries, packaging tools, monitoring code, automated test scaffolding, and everything else a web service needs to have to meet your company's requirements.
  - Configures a CI / CD pipeline for the `search` repo so that you get automatic builds and tests (but not yet deployment) for the app.
  - Configures required code reviews for every merge to the `main` branch.
  - Provides Ann with instructions on how to check out the repo and get started.
Run the app
- Ann follows the instructions to `git clone` the repo.
- Inside, she finds a `README` with further instructions.
- Following the `README`, she runs `docker-compose up` to fire up the app.
- Now she can code locally and use `localhost:3000` to test manually.
- The app scaffold includes tests, so Ann can use `rails test` to test automatically.
Scaffold new features in the app
- Occasionally, Ann will come back to the Service Catalog to look up how to accomplish common tasks in the Ruby on Rails app.
  - Example #1: The `search` service may need to read a secret, such as a database password. Secrets management should be one of the core use cases in your Service Catalog, so when Ann looks it up, the Service Catalog generates some example code for her showing, for example, how to read secrets from HashiCorp Vault while the app is booting, and checks that example code directly into a branch of the `search` repo for Ann to build on top of.
  - Example #2: While Ann is working on the `search` service, she realizes that the service should only be accessible to logged-in users, so she browses the Service Catalog for how to handle auth. The Service Catalog generates and checks in some code that shows Ann how to make API calls to your company's `auth` service, how to extract data from the response, and how to redirect the user to a login page if they aren't logged in.
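To make Example #1 more concrete, here is a minimal sketch of the kind of code the Service Catalog might generate for reading a secret at boot. It assumes Vault's KV version 2 secrets engine over the HTTP API, with the address and token supplied via the conventional `VAULT_ADDR` and `VAULT_TOKEN` environment variables. The method names, the `secret` mount, and the example path are hypothetical, and a real scaffold would more likely use a Vault client library and a proper auth method rather than a raw token.

```ruby
require "json"
require "net/http"
require "uri"

# Pulls one field out of a Vault KV version 2 read response.
# KV v2 nests the secret's key/value pairs under "data" -> "data".
def extract_secret(response_body, key)
  payload = JSON.parse(response_body)
  payload.fetch("data").fetch("data").fetch(key)
end

# Reads a single secret field over Vault's HTTP API.
# The mount name and path are hypothetical placeholders.
def read_secret(path, key, mount: "secret")
  uri = URI.join(ENV.fetch("VAULT_ADDR"), "/v1/#{mount}/data/#{path}")
  request = Net::HTTP::Get.new(uri)
  request["X-Vault-Token"] = ENV.fetch("VAULT_TOKEN")

  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
  raise "Vault returned HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  extract_secret(response.body, key)
end
```

In a Rails app, a call such as `read_secret("search/database", "password")` could live in an initializer so that the app fails fast at boot if the secret is missing.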
Deploy the infrastructure
- Meanwhile, Bob heads over to the Service Catalog to set up the infrastructure for the new app. First, he sees the clouds your company uses:
  - AWS
  - Azure
- Due to the way the ACLs are set up in your Service Catalog, the organization Ann and Bob are in only has access to AWS, so that's what Bob picks.
- Next, he sees a list of services available:
  - App deployment (EKS)
  - Database (RDS)
  - Cache (ElastiCache)
  - (... etc ...)
- Bob wants to deploy just their app for now, so he picks "App deployment (EKS)".
- The Service Catalog prompts him to enter some information, such as:
  - The name of the app.
  - The repo the app's code is in.
  - How many replicas to run.
  - How much CPU and memory the app needs.
  - What domain name to configure for the app.
  - Who the service owners are.
- Bob enters all this info and clicks "create."
- Let's assume your company uses Terraform and Terragrunt to manage your infrastructure. At this point, the Service Catalog does the following:
  - Deploys a new EKS cluster in each environment. Under the hood, this could be done by going into the `infrastructure-live` repo and creating a new `terragrunt.hcl` to deploy an EKS cluster in each of the `dev`, `stage`, and `prod` environments. The EKS cluster would be deployed using an `eks-cluster` module in the Service Catalog that has been tested to meet all your company's requirements out-of-the-box: e.g., it's configured with Istio as a service mesh, uses self-managed, hardened EC2 instances as worker nodes, has pod and network security policies built in, and so on.
  - Creates a new ECR repo to store the Docker image for the app. Under the hood, this could be done by going into the `infrastructure-live` repo, adding `search` to the list of repos in `ecr-repos/terragrunt.hcl` in the `shared` environment, committing the changes to `main`, and allowing the CI / CD pipeline to run `terragrunt apply`.
  - Creates the code to deploy the app in each environment. Under the hood, this could be done by going into the `infrastructure-live` repo and creating a new `terragrunt.hcl`, as a thin wrapper around a Helm chart, in each of the `dev`, `stage`, and `prod` environments to deploy the Docker image into the EKS cluster.
  - Updates the CI / CD configuration for the app to automatically deploy it into this EKS cluster. Under the hood, this could be done by going into the `search` repo and updating the CI / CD config to build a Docker image, push it to the new ECR repo, and deploy it to the EKS cluster in the proper environment based on the branch and tags.
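The "thin wrapper" step above can be sketched roughly as follows: take the values Bob entered and render one `terragrunt.hcl` per environment. This is only an illustration under stated assumptions. The `DeployRequest` fields, the directory layout, the module source URL, and the input names are all hypothetical; only the `terraform`, `include`, and `inputs` blocks and `find_in_parent_folders()` are standard Terragrunt constructs.

```ruby
# Hypothetical shape of the form Bob fills in; field names are illustrative.
DeployRequest = Struct.new(
  :app_name, :repo, :replicas, :cpu, :memory, :domain, :owners,
  keyword_init: true
)

ENVIRONMENTS = %w[dev stage prod].freeze

# Renders a thin Terragrunt wrapper for one environment.
# The module source and input names are placeholders, not a real module.
# Domain handling is left out because it would usually differ per environment.
def render_terragrunt(request, environment)
  <<~HCL
    terraform {
      source = "git::git@github.com:your-company/service-catalog.git//modules/k8s-service"
    }

    include {
      path = find_in_parent_folders()
    }

    inputs = {
      application_name = "#{request.app_name}"
      environment      = "#{environment}"
      replicas         = #{request.replicas}
      cpu              = "#{request.cpu}"
      memory           = "#{request.memory}"
      owners           = #{request.owners.inspect}
    }
  HCL
end

# Returns a hash of file path => file contents, one entry per environment.
# The path layout inside infrastructure-live is an assumption.
def plan_files(request)
  ENVIRONMENTS.map do |environment|
    path = "#{environment}/services/#{request.app_name}/terragrunt.hcl"
    [path, render_terragrunt(request, environment)]
  end.to_h
end
```

Returning the planned files as data, rather than writing them directly, would let the Service Catalog commit them through a pull request so the change goes through the same review and CI / CD process as any other infrastructure change.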
Iterate on the app
- Ann makes some changes to the `search` app.
- She commits those changes to a branch.
- Ann opens a pull request (PR).
- The CI / CD pipeline runs the automated tests and updates the PR with the results.
- The CI / CD pipeline deploys the app into a preview environment (e.g., `dev`).
- Bob reviews the code, and if everything looks good, gives it the "ship it!"
- Ann merges the PR to `main`.
- The CI / CD pipeline deploys the changes to `stage`.
- Ann, Bob, and other team members test in `stage`.
- If everything looks good, Ann creates a `release-xxx` tag in the `search` repo.
- The CI / CD pipeline deploys the app to `prod`.
- The deployment is fully automated, handled via Kubernetes, and configured via the Helm chart to do a rolling, zero-downtime deployment.
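The branch-and-tag routing in this workflow boils down to a small decision rule. Here is a minimal sketch of it, assuming a hypothetical event hash with `:type` and `:ref` keys; a real pipeline would express the same rule in its CI / CD system's own configuration rather than in application code.

```ruby
# Maps a CI event to the environment it should deploy to, or nil for none.
# The event shape ({ type:, ref: }) is a hypothetical illustration.
def target_environment(event)
  case event[:type]
  when :pull_request
    "dev" # preview environment for open PRs
  when :push
    event[:ref] == "main" ? "stage" : nil
  when :tag
    event[:ref].start_with?("release-") ? "prod" : nil
  end
end
```

For example, `target_environment(type: :tag, ref: "release-nov-12-2021")` would select `prod`, while a push to a feature branch returns `nil` and deploys nothing.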
Maintain the app
- One day, Bob gets an alert that the `search` app is down, and jumps in to debug the issue.
- The alert that notified Bob comes from CloudWatch, which was configured by the Helm chart, out-of-the-box, to notify service owners if the health check suddenly starts failing.
- To figure out what's going on, Bob logs into the company's AWS accounts using his Google account. This is possible because those AWS accounts were set up with an account baseline from the Service Catalog that configured Single Sign-On (SSO) with Google as the Identity Provider (IdP).
- Bob is able to find the metrics and logs for the `search` app in CloudWatch, as the Rails app scaffolding and the Helm chart were configured to send all logs and metrics there.
- Using the logs, Bob is able to figure out what the error is, and get it fixed.
Automatically update the app
- Over the next few months, maintainers release new versions of the code Ann and Bob depend on.
  - Example #1: The maintainer of the Ruby on Rails app scaffold and libraries releases a new version of the authentication code with a critical bug fix.
    - Immediately, a new PR is automatically opened in the `search` repo.
    - The CI / CD pipeline runs the tests, which pass.
    - The repo is configured to auto-merge critical security patches and minor releases, so the PR merges to `main` automatically.
    - The CI / CD pipeline automatically builds a Docker image and deploys it to `dev`.
    - Smoke tests run in the `dev` environment and pass.
    - The CI / CD pipeline automatically deploys to `stage`.
    - Ann and Bob review what happened and create a new `release-nov-12-2021` tag to trigger a deploy to prod.
  - Example #2: The maintainer of the Helm chart Ann and Bob use releases a few new versions with minor new features.
    - Once per month, the auto-update system opens a PR to update the `search` app to use the new version of the Helm chart.
    - The changes go through the same release process as above.
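One way to picture the auto-merge rule from Example #1 is as a small policy function. The sketch below is an assumption about how such a policy could look, not a description of any particular tool: it treats versions as `MAJOR.MINOR.PATCH` strings, never merges when tests fail, always merges updates flagged as critical security fixes, and otherwise merges only minor or patch bumps. Including patch bumps alongside minor ones is my reading of "minor releases", and the method and parameter names are hypothetical.

```ruby
# Classifies the bump between two "MAJOR.MINOR.PATCH" version strings.
def bump_type(from, to)
  old_parts = from.split(".").map(&:to_i)
  new_parts = to.split(".").map(&:to_i)
  return :major if new_parts[0] != old_parts[0]
  return :minor if new_parts[1] != old_parts[1]
  return :patch if new_parts[2] != old_parts[2]

  :none
end

# Decides whether an automatically opened update PR may merge unattended.
def auto_merge?(from:, to:, critical_security_fix:, tests_passed:)
  return false unless tests_passed
  return true if critical_security_fix

  %i[minor patch].include?(bump_type(from, to))
end
```

Under this policy a major version bump without a security flag stays open for Ann and Bob to review by hand, which matches the idea that only low-risk or urgent updates should skip human review.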