|
1 | 1 | # Easy EKS (Pre-Alpha) |
2 | 2 |
|
3 | | -## What problem does this solve? |
| 3 | +## What is Easy EKS? |
| 4 | +**Here are 3 useful answers to that question based on different perspectives:** |
| 5 | +(Each answer is clarified further in later sections on this page.) |
| 6 | +1. **A Solution:** |
| 7 | + Setting up and learning how to implement EKS according to best practices is often said to be hard, |
| 8 | + so much so that it crosses the threshold of being problematically difficult and a barrier to |
| 9 | + adoption. From this perspective Easy EKS can be seen as a solution to EKS's difficulty problem. |
| 10 | +2. **Summarized Technical Description:** |
| 11 | + An opinionated bundling of automation & IaC (Infrastructure as Code) that aims to:
| 12 | + 1. Make it easy to provision EKS clusters that are nearly production ready by default. |
| 13 | + 2. Maintain a heavily standardized opinionated set of IaC, which makes automation maintainable. |
| 14 | + 3. Apply useful design patterns from Helm and Kustomize to IaC based on AWS CDK. |
| 15 | +3. **General Description:** |
| 16 | + A user experience optimized approach to EKS, that aims to make using EKS `simpler`, `accessible`, |
| 17 | + and `enjoyable`. |
| 18 | + |
| 19 | +------------------------------------------------------------------------------------------------------- |
| 20 | + |
| 21 | +### What problem does Easy EKS solve? |
4 | 22 | * EKS is a double-edged sword: |
5 | 23 | * Good: |
6 | 24 | * It simplifies the setup of a Kubernetes Cluster on AWS. |
7 | 25 | * It works great after it's set up. |
8 | | - * Bad: |
9 | | - * Set up is left to end users who have a high risk of setting it up poorly, taking months, or |
10 | | - both. |
| 26 | + * Bad: |
| 27 | + * EKS by itself is far from being production ready by default. EKS is more like the virtual
| 28 | + equivalent of receiving a custom PC build project, while wanting a push button server.
11 | 29 | * It has a terrible FTUX (first time user experience) and OOTB (out of the box) UX (user |
12 | | - experience). |
13 | | -* Easy EKS is a solution to EKS's problems related to its slow and flawed set up process, FTUX, and |
14 | | - OOTB UX. |
| 30 | + experience), because end users are left to figure out how to re-invent a production ready setup, |
| 31 | + and there's a high risk that they'll set it up poorly, need months to figure it out, or both. |
| 32 | +* Easy EKS can be seen as a solution to EKS's problems related to its slow and flawed set up process, |
| 33 | + FTUX, and OOTB UX: |
15 | 34 | * https://www.reddit.com/r/aws/comments/qpw36d/aws_eks_rant/ |
16 | 35 | * https://www.reddit.com/r/devops/comments/y5am95/why_is_eks_and_aws_in_general_so_much_more/ |
17 | 36 | * https://matduggan.com/aws-eks/ |
18 | 37 |
|
19 | | -## What is Easy EKS? |
20 | | -* Easy EKS is a user experience optimized approach to EKS, where using it becomes `simpler`, `accessible`, and `enjoyable`. |
| 38 | +------------------------------------------------------------------------------------------------------- |
| 39 | + |
| 40 | +### What Specific Technical Benefits does Easy EKS Offer? |
| 41 | +* **Currently Available in Pre-Alpha:** |
| 42 | + 1. `Useful elements of Helm's design pattern are used:` |
| 43 | + * A nice feature of Helm over say Kustomize, Terraform, or common CDK/Pulumi design patterns, is |
| 44 | + that it's intuitively clear what parts of the IaC are fine to change vs shouldn't be changed. |
| 45 | + * Configuration input parameters have sensible defaults, but can be overridden. |
| 46 | + * Some IaC complexity can be hidden, which allows users to focus on well organized config, which |
| 47 | + in turn significantly lowers cognitive overhead and improves ease of management and accessibility.
| 48 | + * Supports the deployment of Multiple Instances: It's very easy to have multiple clusters per |
| 49 | + environment (dev1-eks, dev2-eks, etc.) |
| 50 | + * Helm popularized a convention of mixing config values with |
| 51 | + [heavy commentary](https://artifacthub.io/packages/helm/prometheus-community/prometheus?modal=values) |
| 52 | + which improves accessibility and general user experience, by explaining what a config flag will |
| 53 | + do and documenting commented out examples of alternative possible values with correct syntax. |
| 54 | + 1. `Useful elements of Kustomize's design pattern are used:` |
| 55 | + * Kustomize popularized the [config overlay design pattern](https://kubectl.docs.kubernetes.io/guides/introduction/kustomize/#2-create-variants-using-overlays), |
| 56 | + which offers multiple advantages: |
| 57 | + * It allows config shared between multiple environments, to be deduplicated which makes it much |
| 58 | + easier to avoid unwanted config drift between environments, which improves maintainability. |
| 59 | + * It keeps the config well organized, which makes it easier to quickly navigate. |
| 60 | + 1. `Two well configured AWS VPCs` |
| 61 | + * The VPCs are dualstack (IPv4/IPv6), and EKS clusters use IPv6 mode to eliminate the problem of
| 62 | + running out of IPs.
| 63 | + * fck-nat: The (f)easible (c)ost (k)onfigurable NAT, is an alternative to AWS's Managed NAT GW, |
| 64 | + that's an order of magnitude cheaper. |
| 65 | + * lower-envs-vpc defaults to 1 fck-NAT instance |
| 66 | + * higher-envs-vpc defaults to 2 fck-NAT instances, and can optionally be set to 3 AWS Managed NAT |
| 67 | + GWs. |
| 68 | + * node-local-dns-cache and S3 Gateway endpoints are also enabled by default. |
| 69 | + 1. `Heavily cost optimized:` |
| 70 | + * Easy EKS gives the benefits of EKS's Auto Mode (and more), without Auto Mode's additional costs. |
| 71 | + * The baseline cost of a dev cluster is under $100/month.
| 72 | + * EKS control plane cost is $73/month. |
| 73 | + * lower-env-vpc's fck-NAT defaults to $3.06/month, and is meant to be shared by multiple clusters. |
| 74 | + * 2x t4g.small spot baseline nodes are $10.22/month |
| 75 | + * karpenter's lower-envs default config is weighted to prefer spot based ARM bottlerocket nodes. |
| 76 | + 1. `UX optimizations:` |
| 77 | + * EKS clusters have useful tags. |
| 78 | + * Name tags of EC2 instances are nicely organized. |
| 79 | + * IAM admins are given EKS viewer access by default for both the EKS web console and kubectl. |
| 80 | + * kubectl onboarding is streamlined. |
| 81 | + 1. `Production Readiness optimizations:` |
| 82 | + * kubernetes secrets stored in etcd get KMS encrypted by default. |
| 83 | + * EKS Addons are all installed by default. |
| 84 | + * CoreDNS's config is optimized by default in terms of node affinity and autoscaling. |
| 85 | + * AWS Load Balancer Controller is installed by default and configured using eks-pod-identity-agent, |
| 86 | + which means it doubles as a great IaC reference for pod level IAM rights. |
| 87 | + * Karpenter is installed by default and preconfigured to provision spot, on-demand, AMD, or ARM |
| 88 | + bottlerocket based worker nodes. |
| 89 | +* **Planned for Alpha:** |
| 90 | + 1. `The default storage class is preconfigured to provide kms encrypted gp3 ebs volumes.` |
| 91 | + 1. `Additional streamlining of kubectl access onboarding` |
| 92 | + 1. `Metric Level Observability` |
| 93 | + 1. `Log Level Observability` |
| 94 | + 1. `Standardize Variable Naming Conventions` |
21 | 95 |
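The "under $100/month" baseline in the cost list above can be checked by adding up the three quoted figures. A minimal sketch, assuming those three items are the whole baseline; the variable names are illustrative and not part of Easy EKS:

```typescript
// Monthly baseline figures quoted in the cost list, stored as integer cents
// so the sum is exact (no floating point rounding).
const baselineCents: number[] = [
  7300, // EKS control plane: $73.00/month
  306,  // lower-envs fck-NAT instance: $3.06/month (shared by multiple clusters)
  1022, // 2x t4g.small spot baseline nodes: $10.22/month
];

const totalCents = baselineCents.reduce((sum, cents) => sum + cents, 0);
const totalDollars = (totalCents / 100).toFixed(2);

console.log(`dev cluster baseline: $${totalDollars}/month`); // prints "dev cluster baseline: $86.28/month"
```

Because the fck-NAT instance is meant to be shared, the per-cluster share of that line item shrinks as more clusters use the same VPC.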
|
22 | 96 | ------------------------------------------------------------------------------------------------------- |
23 | 97 |
|
24 | 98 | ### Simpler EKS |
25 | 99 | 1. **Deployment <u>and baseline configuration</u> are both automated:** |
26 | | - * `cdk` is used to automate the provisioning of production ready EKS Clusters. |
27 | | -2. **The administrative overhead associated with managing multiple clusters is minimized:** |
28 | | - * A `kustomize inspired` design pattern is used to make the deployment and management over time of multiple clusters much easier. |
29 | | -3. **Complexity is simplified, by shielding the end user engineers from unnecessary complexity that's practical to hide away:** |
| 100 | + * `cdk` is used to automate the provisioning of nearly production ready EKS Clusters. |
| 101 | +2. **The administrative overhead associated with managing multiple clusters is lower:** |
| 102 | + * A `kustomize inspired` design pattern is used to make the deployment and management over time of |
| 103 | + multiple clusters much easier. |
| 104 | +3. **Complexity is simplified, by shielding the end user engineers from unnecessary complexity that can be practically hidden away:** |
30 | 105 | * A `helm inspired` design pattern to abstract away complexity. |
31 | | - * helm hides complexity in templatized yaml files, and helm values.yaml files, which represent sane default values of input parameters to feed into the templating engine. |
| 106 | + * helm hides complexity in templatized yaml files, and helm values.yaml files, which represent |
| 107 | + sane default values of input parameters to feed into the templating engine. |
32 | 108 | * Here's an example of how helm allows end users to see a significantly simplified interface:
33 | 109 | * A 15 line long `kps.helm-values.yaml` file (of values representing overrides of |
34 | 110 | kube-prometheus-stack helm chart's default input parameters) |
|
39 | 115 | * /lib/ (a cdk library) |
40 | 116 | * /.flox/ (a recommended, yet optional method of automating dev shell dependencies with `flox activate`) |
41 | 117 | * Easy EKS presents a simplified workflow to end users: |
42 | | - * Edit /config/ (which is an intuitive and simplified end user interface inspired by kustomize and helm values) |
| 118 | + * Edit /config/ (which is an intuitive and simplified end user interface inspired by kustomize |
| 119 | + and helm values) |
43 | 120 | * `cdk list` |
44 | 121 | * `cdk deploy dev1-eks` |
45 | 122 |
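The idea behind `/config/` (helm-style defaults that can be overridden, layered like kustomize overlays) can be sketched in a few lines of TypeScript. This is an illustration only: `ClusterConfig`, `applyOverlays`, and every field name and value below are hypothetical and are not taken from Easy EKS's actual config schema.

```typescript
// Hypothetical config shape, for illustration only.
interface ClusterConfig {
  natInstances: number;
  preferSpot: boolean;
  tags: Record<string, string>;
}

type Overlay = Partial<ClusterConfig>;

// Apply overlays left to right: later layers override earlier ones.
// Tags are merged key by key instead of being replaced wholesale.
function applyOverlays(defaults: ClusterConfig, ...overlays: Overlay[]): ClusterConfig {
  return overlays.reduce<ClusterConfig>(
    (acc, layer) => ({
      ...acc,
      ...layer,
      tags: { ...acc.tags, ...(layer.tags ?? {}) },
    }),
    defaults,
  );
}

// Shared defaults (assumed values), written once and reused by every cluster.
const globalDefaults: ClusterConfig = {
  natInstances: 1,
  preferSpot: true,
  tags: { managedBy: "cdk" },
};

// Environment-level overlay (assumed values) shared by all higher-env clusters.
const higherEnvs: Overlay = { natInstances: 2, preferSpot: false };

// Cluster-level overlays only state what is unique to that cluster.
const dev1 = applyOverlays(globalDefaults, { tags: { cluster: "dev1-eks" } });
const prod1 = applyOverlays(globalDefaults, higherEnvs, { tags: { cluster: "prod1-eks" } });

console.log(dev1, prod1);
```

Keeping shared values in one layer is what deduplicates config across environments: a change to `globalDefaults` reaches every cluster unless an overlay explicitly overrides it.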
|
|
57 | 134 | ------------------------------------------------------------------------------------------------------- |
58 | 135 |
|
59 | 136 | ### Enjoyable EKS |
60 | | - |
61 | 137 | * User Experience is what makes cars enjoyable products. The same is true for Easy EKS. |
62 | 138 | * Cars have complexity, |
63 | 139 | * But it's the car maker that deals with the complexity. |
64 | 140 | * You the end user get a simplified turn key user experience.
65 | 141 | * It's designed to be intuitive, learning how to drive isn't hard. |
66 | 142 | * Easy EKS has complexity, |
67 | 143 | * But you will be shielded from the majority of the complexity; it's abstracted away where practical.
68 | | - * `You get to enjoy a turn key, batteries included, production ready user experience`. |
| 144 | + * `You get to enjoy a turn key, batteries included, nearly production ready user experience`. |
69 | 145 | * It's designed to be intuitive, and even FTUX (first time user experience) and OUX (onboarding UX) |
70 | 146 | are prioritized to make it easy to learn. |
71 | 147 | * You can enjoy: |
72 | 148 | * Being able to get meaningful work done quickly:
73 | 149 | * Learn the basics within a day. |
74 | | - * Deploy a cluster in under an hour, with a production ready baseline configuration. |
| 150 | + * Deploy a cluster in under an hour, with a nearly production ready baseline configuration. |
75 | 151 | * Develop working proficiency in under a week. |
76 | 152 | * Not having to think through engineering toil: |
77 | 153 | * Instead of choices that make engineers stress over identifying the best choice.
|
86 | 162 | * ADRs (Architectural Decision Records) are available to verify reasoning behind all choices.
87 | 163 | * This isn't just a platform that claims to follow best practices. |
88 | 164 | * It's a platform that includes justifications of why its practices are best practices.
89 | | - |
90 | | -------------------------------------------------------------------------------------------------------- |
91 | | - |
92 | | -## Why Easy EKS Exists |
93 | | -| **Basic Functionality you'd expect to see, for normal usage and production readiness:** | **GCP's GKE AutoPilot:**<br> (a point of reference of what good looks like) | **AWS EKS:**<br> (The default out of the box user experience is a collection of dumb problems to have) | **Easy EKS** <br> (Smart solutions to dumb problems that make EKS easier, brought to you by doit.com) | |
94 | | -|-----------------------------------------------------------------------------------------|-----------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| |
95 | | -| A well configured VPC | Default VPC ships with Cloud NAT | Default VPC doesn't ship with a NAT GW, and Managed NAT GW is so bad [(link 1)](https://www.lastweekinaws.com/blog/the-aws-managed-nat-gateway-is-unpleasant-and-not-recommended/), that fck-NAT exists [(link 2)](https://fck-nat.dev/stable/). | Ships with fck-NAT (order of magnitude cheaper), and dualstack VPC for IPv6 based EKS, which eliminates potential problem of running out of IPs. | |
96 | | -| Optimized DNS | DNS is optimized by default via Node Local DNS Cache and Cloud DNS | Ships with nothing. A relatively easy install won't be Fault Tolerant, won't have a dns auto-scaler, nor node-local-dns-cache. Figuring out production grade optimizations takes days. | Alpha ships with Node Local DNS Cache, core dns autoscaler, and anti affinity rules for increased fault tolerance.<br> Planned for Beta: verify/optimize core dns autoscaler's config. | |
97 | | -| Easily populate ~/.kube/config for Kubectl Access | A blue connect button at the top of the Web GUI, shows a command. | Access tends to be a multistep process, so you look up docs for something that should be trivially easy. | When cdk eks blueprints finishes, it outputs a config command. | |
98 | | -| Teammates can easily access to kubectl and Web Console | GCP IAM roles map to GKE's rbac rights by default. | In general, access needs to be explicitly configured per cluster, nuanced limitations make it hard. | Pragmatic workarounds to access limitations are set by default to make access easier. | |
99 | | -| Metric Level Observability | Ships with preconfigured working dashboards | Ships with nothing, figuring out how to set up takes days. | PLANNED (alpha) | |
100 | | -| Log Level Observability | Ships with intuitive centralized logging | Ships with nothing, figuring out how to set up takes days. | PLANNED (alpha) | |
101 | | -| Automatically Provisions storage for stateful workloads | Ships with a preconfigured storageclass | Ships with broken implementation, fixing is relatively easy, but how/why is this not a default functionality baked into the platform? | Ships with KMS Encrypted EBS storageclass | |
102 | | -| Automatically Provision Load Balancers for Ingress | Ships with GKE Ingress Controller and GKE's Gateway API controller | Ships with nothing, and the solution: AWS Load Balancer Controller, is considered a 3rd party add-on, with a complex installation that can take days to figure out. | Ships with AWS Load Balancer Controller | |
103 | | -| Pod Level IAM Identity | Ships with Workload Identity (pod level IAM roles) | Ships with nothing, making it work is relatively easy, seems reasonable to have this be a default baked into the platform. | Ships with Amazon EKS Pod Identity Agent | |
104 | | -| Worker Node Autoscaling | Ships with NAP (Node Auto Provisioner) | Ships with nothing, figuring out how to install cluster autoscaler or karpenter.sh can take days. | Ships with Karpenter.sh (Note: currently an outdated version to avoid compatibility issues, waiting for Karpenter 1.2.x / stable version planned for alpha) | |
105 | | - |
106 | | -------------------------------------------------------------------------------------------------------- |
107 | | - |
108 | | -## How do I get started? |
109 | | -[Check the docs page](/docs) |
|