Amazon EKS Anywhere

EKS Anywhere provides a means of managing Kubernetes clusters using the same operational excellence and practices that Amazon Web Services uses for its Amazon Elastic Kubernetes Service (Amazon EKS). Based on EKS Distro, EKS Anywhere adds methods for deploying, using, and managing Kubernetes clusters that run in your own data centers. Its goal is to include full lifecycle management of multiple Kubernetes clusters that are capable of operating completely independently of any AWS services.

The tenets of the EKS Anywhere project are:

  • Simple: Make using a Kubernetes distribution simple and boring (reliable and secure).
  • Opinionated Modularity: Provide opinionated defaults about the best components to include with Kubernetes, but give customers the ability to swap them out.
  • Open: Provide open source tooling backed, validated, and maintained by Amazon.
  • Ubiquitous: Enable customers and partners to integrate a Kubernetes distribution in the most common tooling.
  • Stand Alone: Provided for use anywhere without AWS dependencies.
  • Better with AWS: Enable AWS customers to easily adopt additional AWS services.

1 - Overview

Provides an overview of EKS Anywhere

EKS Anywhere creates a Kubernetes cluster on premises on a provider that you choose. Supported providers include Bare Metal (via Tinkerbell), CloudStack, and vSphere. To manage that cluster, you can run cluster create and delete commands from an Ubuntu or Mac Administrative machine.

Creating a cluster involves downloading EKS Anywhere tools to an Administrative machine, then running the eksctl anywhere create cluster command to deploy that cluster to the provider. A temporary bootstrap cluster runs on the Administrative machine to direct the target cluster creation. For a detailed description, see Cluster creation workflow .

Here’s a diagram that explains the process visually.

[Figure: EKS Anywhere create cluster overview]



2 - Getting started

The Getting started section includes information on starting to set up your own EKS Anywhere local or production environment.

EKS Anywhere can be deployed as a simple, unsupported local environment or as a production-quality environment that can become a supported on-premises Kubernetes platform. This section lists the different ways to set up and run EKS Anywhere. When you install EKS Anywhere, choose an installation type based on: ease of maintenance, security, control, available resources, and expertise required to operate and manage a cluster.

Install EKS Anywhere

To create an EKS Anywhere cluster you’ll need to download the command line tool that is used to create and manage a cluster. You can install it using the installation guide.

Local environment

If you just want to try out EKS Anywhere, there is a single-system method for installing and running EKS Anywhere using Docker. See EKS Anywhere local environment .

Production environment

When evaluating a solution for a production environment consider deploying EKS Anywhere on providers listed on the Create production cluster page.

2.1 - Install EKS Anywhere

EKS Anywhere will create and manage Kubernetes clusters on multiple providers. Currently we support creating development clusters locally using Docker and production clusters from providers listed on the Create production cluster page.

Creating an EKS Anywhere cluster begins with setting up an Administrative machine where you will run Docker and add some binaries. From there, you create the cluster for your chosen provider. See Create cluster workflow for an overview of the cluster creation process.

To create an EKS Anywhere cluster you will need eksctl and the eksctl-anywhere plugin. This will let you create a cluster in multiple providers for local development or production workloads.

NOTE: For Snow provider, the Snow devices will come with a pre-configured Admin AMI which can be used to create an Admin instance with all the necessary binaries, dependencies and artifacts to create an EKS Anywhere cluster. Skip the below steps and see Create Snow production cluster to get started with EKS Anywhere on Snow.

Administrative machine prerequisites

  • Docker 20.x.x
  • Mac OS 10.15 / Ubuntu 20.04.2 LTS (See Note on newer Ubuntu versions)
  • 4 CPU cores
  • 16GB memory
  • 30GB free disk space
  • Administrative machine must be on the same Layer 2 network as the cluster machines (Bare Metal provider only).
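The numeric minimums above can be checked before anything is installed. The following is a minimal sketch and not part of the EKS Anywhere tooling: the helper name check_min is hypothetical, and the commands in the usage comments assume a Linux machine (macOS needs different commands to read these values).

```shell
# check_min LABEL ACTUAL REQUIRED
# Hypothetical helper: prints "ok" or "FAIL" for one resource and returns
# non-zero when ACTUAL is below REQUIRED. Values are plain integers.
check_min() {
  if [ "$2" -ge "$3" ]; then
    echo "ok:   $1 ($2 >= $3)"
  else
    echo "FAIL: $1 ($2 < $3)"
    return 1
  fi
}

# Example usage on Linux (commented out so the block has no side effects).
# Memory is rounded because a 16GB machine usually reports slightly less.
# check_min "CPU cores" "$(nproc)" 4
# check_min "memory (GB)" "$(awk '/^MemTotal/ {printf "%.0f", $2/1024/1024}' /proc/meminfo)" 16
# check_min "free disk (GB)" "$(df -Pk . | awk 'NR==2 {printf "%d", $4/1024/1024}')" 30
```

The thresholds passed in the usage lines are the ones from the list above; the Docker version and Layer 2 network requirements still need to be checked separately.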

If you are using Ubuntu, use the Docker CE installation instructions to install Docker and not the Snap installation, as described here.

If you are using Ubuntu 21.10 or 22.04, you will need to switch from cgroups v2 to cgroups v1. For details, see Troubleshooting Guide.
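To find out which cgroup version a Linux machine is currently using, one common approach is to look at the filesystem type mounted at /sys/fs/cgroup. The sketch below assumes GNU stat and uses a hypothetical helper name, cgroup_version; it only reports the version and does not perform the switch described in the Troubleshooting Guide.

```shell
# cgroup_version FSTYPE
# Hypothetical helper: maps the filesystem type of /sys/fs/cgroup to a
# cgroup version. "cgroup2fs" indicates v2 and "tmpfs" indicates v1.
cgroup_version() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

# Example usage on Linux with GNU stat (commented out):
# cgroup_version "$(stat -fc %T /sys/fs/cgroup/)"
```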

If you are using Docker Desktop, you need to know that:

  • For EKS Anywhere Bare Metal, Docker Desktop is not supported
  • For EKS Anywhere vSphere, if you are using Docker Desktop 4.4.2 or newer on macOS, "deprecatedCgroupv1": true must be set in ~/Library/Group\ Containers/group.com.docker/settings.json.
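If you are unsure whether the Docker Desktop setting is already in place, a simple text search of the settings file is usually enough. This is a rough sketch with a hypothetical helper name, cgroupv1_enabled; it matches the key and value as text rather than parsing the JSON.

```shell
# cgroupv1_enabled FILE
# Hypothetical helper: succeeds when FILE contains "deprecatedCgroupv1": true.
# This is a plain text match, not a JSON parse.
cgroupv1_enabled() {
  grep -Eq '"deprecatedCgroupv1"[[:space:]]*:[[:space:]]*true' "$1"
}

# Example usage on macOS (commented out):
# cgroupv1_enabled "$HOME/Library/Group Containers/group.com.docker/settings.json" \
#   || echo "deprecatedCgroupv1 is not set to true"
```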

Install EKS Anywhere CLI tools

Via Homebrew (macOS and Linux)

You can install eksctl and eksctl-anywhere with homebrew . This package will also install kubectl and the aws-iam-authenticator which will be helpful to test EKS Anywhere clusters.

brew install aws/tap/eks-anywhere

Manually (macOS and Linux)

Install the latest release of eksctl. The EKS Anywhere plugin requires eksctl version 0.66.0 or newer.

curl "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" \
    --silent --location \
    | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin/
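To confirm that the installed eksctl satisfies the 0.66.0 minimum, the version strings can be compared field by field. The helper below, version_ge, is a hypothetical name and a minimal sketch: it handles only dotted numeric versions, and the usage line assumes that eksctl version prints a bare version string.

```shell
# version_ge A B
# Hypothetical helper: succeeds when dotted numeric version A >= B.
# Runs in a subshell so its variables do not leak into the caller.
version_ge() (
  a=$1 b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    # Take the leading field of each version, defaulting to 0.
    x=${a%%.*}; y=${b%%.*}
    x=${x:-0}; y=${y:-0}
    [ "$x" -gt "$y" ] && exit 0
    [ "$x" -lt "$y" ] && exit 1
    # Drop the field that was just compared.
    case "$a" in *.*) a=${a#*.} ;; *) a= ;; esac
    case "$b" in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  exit 0
)

# Example usage (commented out):
# version_ge "$(eksctl version)" 0.66.0 || echo "eksctl is older than 0.66.0"
```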

Install the eksctl-anywhere plugin.

RELEASE_VERSION=$(curl https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml --silent --location | yq ".spec.latestVersion")
EKS_ANYWHERE_TARBALL_URL=$(curl https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml --silent --location | yq ".spec.releases[] | select(.version==\"$RELEASE_VERSION\").eksABinary.$(uname -s | tr A-Z a-z).uri")
curl $EKS_ANYWHERE_TARBALL_URL \
    --silent --location \
    | tar xz ./eksctl-anywhere
sudo mv ./eksctl-anywhere /usr/local/bin/

Install the kubectl Kubernetes command line tool. This can be done by following the instructions here .

Or you can install the latest kubectl directly with the following.

export OS="$(uname -s | tr A-Z a-z)" ARCH=$(test "$(uname -m)" = 'x86_64' && echo 'amd64' || echo 'arm64')
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/${OS}/${ARCH}/kubectl"
sudo mv ./kubectl /usr/local/bin
sudo chmod +x /usr/local/bin/kubectl

Upgrade eksctl-anywhere

If you installed eksctl-anywhere via homebrew you can upgrade the binary with

brew update
brew upgrade aws/tap/eks-anywhere

If you installed eksctl-anywhere manually you should follow the installation steps to download the latest release.

You can verify your installed version with

eksctl anywhere version

Prepare for airgapped deployments (optional)

When creating an EKS Anywhere cluster, there may be times where you need to do so in an airgapped environment. In this type of environment, cluster nodes are connected to the Admin Machine, but not to the internet. In order to download images and artifacts, however, the Admin machine needs to be temporarily connected to the internet.

An airgapped environment is especially important if you require the most secure networks. EKS Anywhere supports airgapped installation for creating clusters using a registry mirror. For airgapped installation to work, the Admin machine must have:

  • Temporary access to the internet to download images and artifacts
  • Ample space (80 GB or more) to store artifacts locally

To create a cluster in an airgapped environment, perform the following:

  1. Download the artifacts and images that will be used by the cluster nodes to the Admin machine using the following command:

    eksctl anywhere download artifacts
    

    A compressed file eks-anywhere-downloads.tar.gz will be downloaded.

  2. To decompress this file, use the following command:

    tar -xvf eks-anywhere-downloads.tar.gz
    

    This will create an eks-anywhere-downloads folder that we’ll be using later.

  3. In order for the next command to run smoothly, ensure that Docker has been pre-installed and is running. Then run the following:

    eksctl anywhere download images -o images.tar
    

    For the remaining steps, the Admin machine no longer needs to be connected to the internet or the bastion host.

  4. Next, you will need to set up a local registry mirror to host the downloaded EKS Anywhere images. In order to set one up, refer to Registry Mirror configuration.

  5. Now that you’ve configured your local registry mirror, import the images into it using the following command (be sure to replace <registryUrl> with the URL of the local registry mirror you created in step 4):

    eksctl anywhere import images -i images.tar -r <registryUrl> \
       --bundles ./eks-anywhere-downloads/bundle-release.yaml
    

You are now ready to deploy a cluster by following instructions to Create local cluster or Create production cluster. See text below for specific provider instructions.
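Before disconnecting the Admin machine, it can be useful to confirm that the files produced by the download steps are where the later commands expect them. The following sketch checks only for the two paths named in the steps above; the helper name airgap_ready is hypothetical, and the check says nothing about whether the files are complete.

```shell
# airgap_ready [DIR]
# Hypothetical helper: succeeds when images.tar and the extracted
# bundle-release.yaml exist under DIR (default: current directory).
airgap_ready() (
  dir=${1:-.}
  status=0
  for f in images.tar eks-anywhere-downloads/bundle-release.yaml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      status=1
    fi
  done
  exit "$status"
)

# Example usage (commented out):
# airgap_ready . && echo "artifacts present"
```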

For Bare Metal (Tinkerbell)

You will need to have hookOS and its OS artifacts downloaded and served locally from an HTTP file server. You will also need to modify the hookImagesURLPath and the osImageURL in the cluster configuration files. Ensure that structure of the files is set up as described in hookImagesURLPath.

For vSphere

If you are using the vSphere provider, be sure that the requirements in the Prerequisite checklist have been met.

Deploy a cluster

Once you have the tools installed you can deploy a local cluster or production cluster in the next steps.

2.2 - Create local cluster

EKS Anywhere docker provider deployments

EKS Anywhere supports a Docker provider for development and testing use cases only. This allows you to try EKS Anywhere on your local system before deploying to a supported provider to create either:

  • A single, standalone cluster or
  • Multiple management/workload clusters on the same provider, as described in Cluster topologies . The management/workload topology is recommended for production clusters and can be tried out here using both eksctl and GitOps tools.

Create a standalone cluster

Prerequisite Checklist

To install the EKS Anywhere binaries and see system requirements please follow the installation guide .

Steps

  1. Generate a cluster config

    CLUSTER_NAME=mgmt
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider docker > $CLUSTER_NAME.yaml
    

    The command above creates a file named $CLUSTER_NAME.yaml (mgmt.yaml in this example) with the contents below in the directory where it is executed. The configuration specification is divided into two sections:

    • Cluster
    • DockerDatacenterConfig
    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
       name: mgmt
    spec:
       clusterNetwork:
          cniConfig:
             cilium: {}
          pods:
             cidrBlocks:
                - 192.168.0.0/16
          services:
             cidrBlocks:
                - 10.96.0.0/12
       controlPlaneConfiguration:
          count: 1
       datacenterRef:
          kind: DockerDatacenterConfig
          name: mgmt
       externalEtcdConfiguration:
          count: 1
       kubernetesVersion: "1.25"
       managementCluster:
          name: mgmt
       workerNodeGroupConfigurations:
          - count: 1
            name: md-0
    ---
    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: DockerDatacenterConfig
    metadata:
       name: mgmt
    spec: {}
    
    
    • Apart from the base configuration, you can add additional optional configuration to enable supported features:
  2. Configure Curated Packages

    The Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. To request a free trial, talk to your Amazon representative or connect with one here . Cluster creation will succeed if authentication is not set up, but some warnings may be generated. Detailed package configurations can be found here .

    If you are going to use packages, set up authentication. These credentials should have limited capabilities :

    export EKSA_AWS_ACCESS_KEY_ID="your_access_id"
    export EKSA_AWS_SECRET_ACCESS_KEY="your_secret_key"
    export EKSA_AWS_REGION="us-west-2"
    

    NOTE: The Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. Due to this there might be some warnings in the CLI if proper authentication is not set up.

  3. Create Cluster:

    For a regular cluster create (with internet access), type the following:

    # To install curated packages at cluster creation, append: --install-packages packages.yaml
    eksctl anywhere create cluster \
       -f $CLUSTER_NAME.yaml
    

    For an airgapped cluster create, follow Preparation for airgapped deployments instructions, then type the following:

    # To install curated packages at cluster creation, append: --install-packages packages.yaml
    eksctl anywhere create cluster \
       -f $CLUSTER_NAME.yaml \
       --bundles-override ./eks-anywhere-downloads/bundle-release.yaml
    

    Example command output

    Performing setup and validations
    ✅ validation succeeded {"validation": "docker Provider setup is valid"}
    Creating new bootstrap cluster
    Installing cluster-api providers on bootstrap cluster
    Provider specific setup
    Creating new workload cluster
    Installing networking on workload cluster
    Installing cluster-api providers on workload cluster
    Moving cluster management from bootstrap to workload cluster
    Installing EKS-A custom components (CRD and controller) on workload cluster
    Creating EKS-A CRDs instances on workload cluster
    Installing GitOps Toolkit on workload cluster
    GitOps field not specified, bootstrap flux skipped
    Deleting bootstrap cluster
    🎉 Cluster created!
    ----------------------------------------------------------------------------------
    The Amazon EKS Anywhere Curated Packages are only available to customers with the
    Amazon EKS Anywhere Enterprise Subscription
    ----------------------------------------------------------------------------------
    Installing curated packages controller on management cluster
    secret/aws-secret created
    job.batch/eksa-auth-refresher created
    

    NOTE: to install curated packages during cluster creation, use --install-packages packages.yaml flag

  4. Use the cluster

    Once the cluster is created you can use it with the generated KUBECONFIG file in your local directory

    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    kubectl get ns
    

    Example command output

    NAME                                STATUS   AGE
    capd-system                         Active   21m
    capi-kubeadm-bootstrap-system       Active   21m
    capi-kubeadm-control-plane-system   Active   21m
    capi-system                         Active   21m
    capi-webhook-system                 Active   21m
    cert-manager                        Active   22m
    default                             Active   23m
    eksa-packages                       Active   23m
    eksa-system                         Active   20m
    kube-node-lease                     Active   23m
    kube-public                         Active   23m
    kube-system                         Active   23m
    

    You can now use the cluster like you would any Kubernetes cluster. Deploy the test application with:

    kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
    

    Verify the test application in the deploy test application section .
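Step 4 derives the kubeconfig location from the cluster name. If you expect to switch between several clusters, a small helper can build that path for you. The function name kubeconfig_for is hypothetical; the path layout is the one shown in step 4.

```shell
# kubeconfig_for CLUSTER_NAME [BASE_DIR]
# Hypothetical helper: prints the kubeconfig path that step 4 uses for the
# given cluster. BASE_DIR defaults to the current directory.
kubeconfig_for() {
  echo "${2:-$PWD}/$1/$1-eks-a-cluster.kubeconfig"
}

# Example usage (commented out):
# export KUBECONFIG="$(kubeconfig_for mgmt)"
# kubectl get ns
```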

Create management/workload clusters

To try the recommended EKS Anywhere topology, you can create a management cluster and one or more workload clusters on the same Docker provider.

Prerequisite Checklist

To install the EKS Anywhere binaries and see system requirements please follow the installation guide .

Create a management cluster

  1. Generate a management cluster config (named mgmt for this example):

    CLUSTER_NAME=mgmt
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider docker > eksa-mgmt-cluster.yaml
    
  2. Modify the management cluster config (eksa-mgmt-cluster.yaml). You could use the same one described earlier or modify it to use GitOps, as shown below:

    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
      name: mgmt
      namespace: default
    spec:
      bundlesRef:
        apiVersion: anywhere.eks.amazonaws.com/v1alpha1
        name: bundles-1
        namespace: eksa-system
      clusterNetwork:
        cniConfig:
          cilium: {}
        pods:
          cidrBlocks:
          - 192.168.0.0/16
        services:
          cidrBlocks:
          - 10.96.0.0/12
      controlPlaneConfiguration:
        count: 1
      datacenterRef:
        kind: DockerDatacenterConfig
        name: mgmt
      externalEtcdConfiguration:
        count: 1
      gitOpsRef:
        kind: FluxConfig
        name: mgmt
      kubernetesVersion: "1.25"
      managementCluster:
        name: mgmt
      workerNodeGroupConfigurations:
      - count: 1
        name: md-1
    
    ---
    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: DockerDatacenterConfig
    metadata:
      name: mgmt
      namespace: default
    spec: {}
    
    ---
    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: FluxConfig
    metadata:
      name: mgmt
      namespace: default
    spec:
      branch: main
      clusterConfigPath: clusters/mgmt
      github:
        owner: <your github account, such as example for https://github.com/example>
        personal: true
        repository: <your github repo, such as test for https://github.com/example/test>
      systemNamespace: flux-system
    
    ---
    
  3. Configure Curated Packages

    The Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. To request a free trial, talk to your Amazon representative or connect with one here . Cluster creation will succeed if authentication is not set up, but some warnings may be generated. Detailed package configurations can be found here .

    If you are going to use packages, set up authentication. These credentials should have limited capabilities :

    export EKSA_AWS_ACCESS_KEY_ID="your_access_id"
    export EKSA_AWS_SECRET_ACCESS_KEY="your_secret_key"
    
  4. Create cluster:

    For a regular cluster create (with internet access), type the following:

    # To install curated packages at cluster creation, append: --install-packages packages.yaml
    eksctl anywhere create cluster \
       -f eksa-mgmt-cluster.yaml
    

    For an airgapped cluster create, follow Preparation for airgapped deployments instructions, then type the following:

    # To install curated packages at cluster creation, append: --install-packages packages.yaml
    eksctl anywhere create cluster \
       -f eksa-mgmt-cluster.yaml \
       --bundles-override ./eks-anywhere-downloads/bundle-release.yaml
    
  5. Once the cluster is created you can use it with the generated KUBECONFIG file in your local directory:

    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    
  6. Check the initial cluster’s CRD:

    To ensure you are looking at the initial cluster, list the CRD to see that the name of its management cluster is itself:

    kubectl get clusters mgmt -o yaml
    

    Example command output

    ...
    kubernetesVersion: "1.25"
    managementCluster:
      name: mgmt
    workerNodeGroupConfigurations:
    ...
    

Create separate workload clusters

Follow these steps to have your management cluster create and manage separate workload clusters.

  1. Generate a workload cluster config:

    CLUSTER_NAME=w01
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider docker > eksa-w01-cluster.yaml
    

    Refer to the initial config described earlier for the required and optional settings.

    NOTE: Ensure workload cluster object names (Cluster, DockerDatacenterConfig, etc.) are distinct from management cluster object names. Be sure to set the managementCluster field to identify the name of the management cluster.

  2. Create a workload cluster in one of the following ways:

    • GitOps: See Manage separate workload clusters with GitOps

    • Terraform: See Manage separate workload clusters with Terraform

    • Kubernetes CLI: The cluster lifecycle feature lets you use kubectl to manage a workload cluster. For example:

      kubectl apply -f eksa-w01-cluster.yaml 
      
    • eksctl CLI: Useful for temporary cluster configurations. To create a workload cluster with eksctl, do one of the following. For a regular cluster create (with internet access), type the following:

      # To install curated packages at cluster creation, append: --install-packages packages.yaml
      eksctl anywhere create cluster \
         -f eksa-w01-cluster.yaml \
         --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
      

      For an airgapped cluster create, follow Preparation for airgapped deployments instructions, then type the following:

      # To install curated packages at cluster creation, append: --install-packages packages.yaml
      eksctl anywhere create cluster \
         -f eksa-w01-cluster.yaml \
         --bundles-override ./eks-anywhere-downloads/bundle-release.yaml \
         --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
      

      As noted earlier, adding the --kubeconfig option tells eksctl to use the management cluster identified by that kubeconfig file to create a different workload cluster.

  3. To check the workload cluster, get the workload cluster credentials and run a test workload:

    • If your workload cluster was created with eksctl, change your credentials to point to the new workload cluster (for example, w01), then run the test application with:

      export CLUSTER_NAME=w01
      export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
      kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
      
    • If your workload cluster was created with GitOps or Terraform, you can get credentials and run the test application as follows:

      kubectl get secret -n eksa-system w01-kubeconfig -o jsonpath='{.data.value}' | base64 --decode > w01.kubeconfig
      export KUBECONFIG=w01.kubeconfig
      kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
      

      NOTE: For Docker, you must modify the server field of the kubeconfig file by replacing the IP with 127.0.0.1 and the port with its value. The port’s value can be found by running docker ps and checking the workload cluster’s load balancer.

  4. Add more workload clusters:

    To add more workload clusters, go through the same steps for creating the initial workload, copying the config file to a new name (such as eksa-w02-cluster.yaml), modifying resource names, and running the create cluster command again.
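The NOTE in step 3 asks you to point the server field of the Docker workload cluster's kubeconfig at 127.0.0.1 and the load balancer's port. One way to script that edit is sketched below. The helper name use_local_endpoint is hypothetical, and the sketch assumes that every server: line in the file should be rewritten and that the server URL uses https.

```shell
# use_local_endpoint KUBECONFIG PORT
# Hypothetical helper: rewrites each "server:" line in KUBECONFIG to
# https://127.0.0.1:PORT, keeping the original indentation.
# Writes to a temporary file first so a failed sed leaves the file untouched.
use_local_endpoint() (
  file=$1 port=$2
  tmp=$(mktemp) || exit 1
  sed "s|^\([[:space:]]*server:\).*|\1 https://127.0.0.1:${port}|" "$file" > "$tmp" &&
    mv "$tmp" "$file"
)

# Example usage (commented out); take PORT from the docker ps output
# for the workload cluster's load balancer:
# use_local_endpoint w01.kubeconfig "$PORT"
```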

Next steps:

  • See the Cluster management section for more information on common operational tasks like scaling and deleting the cluster.

  • See the Package management section for more information on post-creation curated packages installation.

2.3 - Create production cluster

EKS Anywhere allows you to provision and manage Amazon EKS on your own infrastructure. To get started with different production-quality EKS Anywhere providers, choose from the providers below:

2.3.1 - Create Bare Metal production cluster

Create a production-quality cluster on Bare Metal

EKS Anywhere supports a Bare Metal provider for production grade EKS Anywhere deployments. EKS Anywhere allows you to provision and manage Kubernetes clusters based on Amazon EKS software on your own infrastructure.

This document walks you through setting up EKS Anywhere on Bare Metal as a standalone, self-managed cluster or combined set of management/workload clusters. See Cluster topologies for details.

Prerequisite checklist

EKS Anywhere needs:

Also, see the Ports and protocols page for information on ports that need to be accessible from control plane, worker, and Admin machines.

Steps

The following steps are divided into two sections:

  • Create an initial cluster (used as a management or self-managed cluster)
  • Create zero or more workload clusters from the management cluster

Create an initial cluster

Follow these steps to create an EKS Anywhere cluster that can be used either as a management cluster or as a self-managed cluster (for running workloads itself).

  1. Set an environment variable for your cluster name:

    export CLUSTER_NAME=mgmt
    
  2. Generate a cluster config file for your Bare Metal provider (using tinkerbell as the provider type).

    eksctl anywhere generate clusterconfig $CLUSTER_NAME --provider tinkerbell > eksa-mgmt-cluster.yaml
    
  3. Modify the cluster config (eksa-mgmt-cluster.yaml) by referring to the Bare Metal configuration reference documentation.

  4. Set License Environment Variable

    If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):

    export EKSA_LICENSE='my-license-here'
    

    After you have created your eksa-mgmt-cluster.yaml and set your credential environment variables, you will be ready to create the cluster.

  5. Configure Curated Packages

    The Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. To request a free trial, talk to your Amazon representative or connect with one here . Cluster creation will succeed if authentication is not set up, but some warnings may be generated. Detailed package configurations can be found here .

    If you are going to use packages, set up authentication. These credentials should have limited capabilities :

    export EKSA_AWS_ACCESS_KEY_ID="your_access_id"
    export EKSA_AWS_SECRET_ACCESS_KEY="your_secret_key"
    export EKSA_AWS_REGION="us-west-2"
    
  6. Create the cluster, using the hardware.csv file you made in Bare Metal preparation .

    For a regular cluster create (with internet access), type the following:

    # To install curated packages at cluster creation, append: --install-packages packages.yaml
    eksctl anywhere create cluster \
       --hardware-csv hardware.csv \
       -f eksa-mgmt-cluster.yaml
    

    For an airgapped cluster create, follow Preparation for airgapped deployments instructions, then type the following:

    # To install curated packages at cluster creation, append: --install-packages packages.yaml
    eksctl anywhere create cluster \
       --hardware-csv hardware.csv \
       -f eksa-mgmt-cluster.yaml \
       --bundles-override ./eks-anywhere-downloads/bundle-release.yaml
    
  7. Once the cluster is created you can use it with the generated KUBECONFIG file in your local directory:

    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    
  8. Check the cluster nodes:

    To check that the cluster completed, list the machines to see the control plane and worker nodes:

    kubectl get machines -A
    

    Example command output:

    NAMESPACE     NAME                        CLUSTER   NODENAME        PROVIDERID                              PHASE     AGE   VERSION
    eksa-system   mgmt-47zj8                  mgmt      eksa-node01     tinkerbell://eksa-system/eksa-node01    Running   1h    v1.23.7-eks-1-23-4
    eksa-system   mgmt-md-0-7f79df46f-wlp7w   mgmt      eksa-node02     tinkerbell://eksa-system/eksa-node02    Running   1h    v1.23.7-eks-1-23-4
    ...
    
  9. Check the cluster:

    You can now use the cluster as you would any Kubernetes cluster. To try it out, run the test application with:

    export CLUSTER_NAME=mgmt
    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
    

    Verify the test application in Deploy test workload .
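Step 8 relies on reading the PHASE column by eye. For larger clusters, the same check can be scripted by filtering the kubectl get machines -A output. The sketch below uses a hypothetical helper name, not_running, and assumes the tabular output format shown in the example above; it treats any row without a Running field as not ready.

```shell
# not_running
# Hypothetical helper: reads `kubectl get machines -A` output on stdin and
# prints the NAME (second column) of every row that has no "Running" field.
# The header row is skipped.
not_running() {
  awk 'NR > 1 {
         ok = 0
         for (i = 1; i <= NF; i++) if ($i == "Running") ok = 1
         if (!ok) print $2
       }'
}

# Example usage (commented out):
# kubectl get machines -A | not_running
```

An empty result means every listed machine reported the Running phase.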

Create separate workload clusters

Follow these steps if you want to use your initial cluster to create and manage separate workload clusters.

  1. Generate a workload cluster config:

    CLUSTER_NAME=w01
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider tinkerbell > eksa-w01-cluster.yaml
    

    Refer to the initial config described earlier for the required and optional settings. Ensure workload cluster object names (Cluster, TinkerbellDatacenterConfig, TinkerbellMachineConfig, etc.) are distinct from management cluster object names. Keep the tinkerbellIP of workload cluster the same as tinkerbellIP of the management cluster.

  2. Be sure to set the managementCluster field to identify the name of the management cluster.

    For example, the management cluster, mgmt is defined for our workload cluster w01 as follows:

    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
      name: w01
    spec:
      managementCluster:
        name: mgmt
    
  3. Set License Environment Variable

    Add a license to any cluster for which you want to receive paid support. If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):

    export EKSA_LICENSE='my-license-here'
    
  4. Create a workload cluster

    To create a new workload cluster from your management cluster run this command, identifying:

    • The workload cluster YAML file
    • The initial cluster’s credentials (this causes the workload cluster to be managed from the management cluster)
    With hardware CSV
    # Optional flags to append:
    #   --install-packages packages.yaml    (install curated packages at cluster creation)
    #   --bundles-override ./eks-anywhere-downloads/bundle-release.yaml    (airgapped install)
    eksctl anywhere create cluster \
        -f eksa-w01-cluster.yaml \
        --hardware-csv <hardware.csv> \
        --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
    
    Without hardware CSV
    # Optional flags to append:
    #   --install-packages packages.yaml    (install curated packages at cluster creation)
    #   --bundles-override ./eks-anywhere-downloads/bundle-release.yaml    (airgapped install)
    eksctl anywhere create cluster \
        -f eksa-w01-cluster.yaml \
        --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
    

    As noted earlier, adding the --kubeconfig option tells eksctl to use the management cluster identified by that kubeconfig file to create a different workload cluster.

  5. Check the workload cluster:

    You can now use the workload cluster as you would any Kubernetes cluster. Change your credentials to point to the new workload cluster (for example, w01), then run the test application with:

    export CLUSTER_NAME=w01
    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
    

    Verify the test application in the deploy test application section.

  6. Add more workload clusters:

    To add more workload clusters, go through the same steps for creating the initial workload, copying the config file to a new name (such as eksa-w02-cluster.yaml), modifying resource names, and running the create cluster command again.
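The copy-and-rename step for additional workload clusters can be done with a text substitution. The following is a naive sketch with a hypothetical helper name, clone_cluster_config: it replaces every occurrence of the old cluster name, so review the result before use, especially if the old name also appears inside unrelated values.

```shell
# clone_cluster_config SRC OLD NEW
# Hypothetical helper: prints SRC with every occurrence of OLD replaced
# by NEW. OLD is used as a sed pattern, so keep it to a plain name.
clone_cluster_config() {
  sed "s/$2/$3/g" "$1"
}

# Example usage (commented out):
# clone_cluster_config eksa-w01-cluster.yaml w01 w02 > eksa-w02-cluster.yaml
```

Because the managementCluster name (mgmt in this example) does not contain the workload cluster name, it is left unchanged by the substitution.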

Next steps:

  • See the Cluster management section for more information on common operational tasks like deleting the cluster.

  • See the Package management section for more information on post-creation curated packages installation.

2.3.2 - Create CloudStack production cluster

Create a production-quality cluster on CloudStack

EKS Anywhere supports a CloudStack provider for production grade EKS Anywhere deployments. This document walks you through setting up EKS Anywhere on CloudStack in a way that:

  • Deploys an initial cluster on your CloudStack environment. That cluster can be used as a standalone cluster (to run workloads) or a management cluster (to create and manage other clusters)
  • Deploys zero or more workload clusters from the management cluster

If your initial cluster is a management cluster, it is intended to stay in place so you can use it later to modify, upgrade, and delete workload clusters. Using a management cluster makes it faster to provision and delete workload clusters. It also lets you keep CloudStack credentials for a set of clusters in one place: on the management cluster. The alternative is to simply use your initial cluster to run workloads. See Cluster topologies for details.

Prerequisite Checklist

EKS Anywhere needs to:

Also, see the Ports and protocols page for information on ports that need to be accessible from control plane, worker, and Admin machines.

Steps

The following steps are divided into two sections:

  • Create an initial cluster (used as a management or standalone cluster)
  • Create zero or more workload clusters from the management cluster

Create an initial cluster

Follow these steps to create an EKS Anywhere cluster that can be used either as a management cluster or as a standalone cluster (for running workloads itself).

  1. Generate an initial cluster config (named mgmt for this example):

    export CLUSTER_NAME=mgmt
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider cloudstack > eksa-mgmt-cluster.yaml
    
  2. Create credential file

    Create a credential file (for example, cloud-config) and add the credentials needed to access your CloudStack environment. The file should include:

    • api-key: Obtained from CloudStack
    • secret-key: Obtained from CloudStack
    • api-url: The URL to your CloudStack API endpoint

    For example:

    [Global]
    api-key     =  -Dk5uB0DE3aWng
    secret-key  =  -0DQLunsaJKxCEEHn44XxP80tv6v_RB0DiDtdgwJ
    api-url     =  http://172.16.0.1:8080/client/api
    
    

    You can have multiple credential entries. To match this example, you would enter global as the credentialsRef in the cluster config file for your CloudStack availability zone. You can configure multiple credentials for multiple availability zones.
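
Before encoding the credential file, it can help to confirm that it contains the three keys listed above. The function below is an illustrative sketch only: `check_cloud_config` is a hypothetical helper, not an EKS Anywhere command, and it performs a plain text match rather than validating the values.

```shell
# Return 0 if the file defines api-key, secret-key, and api-url; otherwise
# report each missing key on stderr and return 1.
check_cloud_config() {
    file=$1
    missing=0
    for key in api-key secret-key api-url; do
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
            echo "missing key: ${key}" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_cloud_config cloud-config` on the example file above would return success.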

  3. Modify the initial cluster config (eksa-mgmt-cluster.yaml) as follows:

    • Refer to CloudStack configuration for information on configuring this cluster config for a CloudStack provider.
    • Add Optional configuration settings as needed.
    • Create at least two control plane nodes, three worker nodes, and three etcd nodes for a production cluster, to provide high availability and rolling upgrades.
  4. Set Environment Variables

    Convert the credential file into base64 and set the following environment variable to that value:

    export EKSA_CLOUDSTACK_B64ENCODED_SECRET=$(base64 -i cloud-config)
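
Some `base64` implementations (GNU coreutils, for example) wrap their output at 76 columns by default, which would place line breaks inside the variable. The sketch below strips them so the value is a single line; `encode_credentials` is a hypothetical helper name, and whether EKS Anywhere tolerates wrapped input is not stated here, so treat this as a precaution rather than a requirement.

```shell
# Encode a file as base64 on a single line, regardless of the platform's
# default line wrapping.
encode_credentials() {
    base64 < "$1" | tr -d '\n'
}

# Usage (assumes the cloud-config file from the previous step exists):
# export EKSA_CLOUDSTACK_B64ENCODED_SECRET=$(encode_credentials cloud-config)
```

To confirm the value round-trips, decode it with `printf '%s' "$EKSA_CLOUDSTACK_B64ENCODED_SECRET" | base64 --decode` and compare the result with the original file.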
    
  5. Set License Environment Variable

    Add a license to any cluster for which you want to receive paid support. If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):

    export EKSA_LICENSE='my-license-here'
    
  6. Configure Curated Packages

    The Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. To request a free trial, talk to your Amazon representative or connect with one here. Cluster creation will succeed if authentication is not set up, but some warnings may be generated. Detailed package configurations can be found here.

    If you are going to use packages, set up authentication. These credentials should have limited capabilities:

    export EKSA_AWS_ACCESS_KEY_ID="your_access_id"
    export EKSA_AWS_SECRET_ACCESS_KEY="your_secret_key"
    export EKSA_AWS_REGION="us-west-2"
    
  7. Disable Kubevip load balancer

    Skip this step if you want to use the Kubevip load balancer with your cluster. If you want to use a different load balancer, you can disable Kubevip as follows:

    export CLOUDSTACK_KUBE_VIP_DISABLED=true
    
  8. Create cluster

    eksctl anywhere create cluster \
       -f eksa-mgmt-cluster.yaml
    # To install curated packages at cluster creation, end the line above with a backslash and add:
    #   --install-packages packages.yaml
    
  9. Once the cluster is created you can use it with the generated KUBECONFIG file in your local directory:

    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    
  10. Check the cluster nodes:

    To check that cluster creation completed, list the machines to see the control plane, etcd, and worker nodes:

    kubectl get machines -A
    

    Example command output

    NAMESPACE   NAME                PROVIDERID           PHASE    VERSION
    eksa-system mgmt-b2xyz          cloudstack:/xxxxx    Running  v1.23.1-eks-1-21-5
    eksa-system mgmt-etcd-r9b42     cloudstack:/xxxxx    Running  
    eksa-system mgmt-md-8-6xr-rnr   cloudstack:/xxxxx    Running  v1.23.1-eks-1-21-5
    ...
    

    The etcd machine doesn’t show the Kubernetes version because it doesn’t run the kubelet service.
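
To spot machines that are not yet in the Running phase without reading the whole table, the listing can be filtered. This is a sketch under assumptions: `not_running` is a hypothetical helper, and the `kubectl` invocation in the comment assumes the Machine objects expose `.status.phase`, as Cluster API machines do.

```shell
# Read "name<TAB>phase" lines on stdin and print the names of machines
# whose phase is anything other than Running.
not_running() {
    awk -F '\t' '$2 != "Running" { print $1 }'
}

# Usage against a live cluster (requires kubectl and a valid KUBECONFIG):
# kubectl get machines -A \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' | not_running
```

An empty result means every listed machine reports Running.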

  11. Check the initial cluster’s CRD:

    To ensure you are looking at the initial cluster, list the CRD to see that the name of its management cluster is itself:

    kubectl get clusters mgmt -o yaml
    

    Example command output

    ...
    kubernetesVersion: "1.23"
    managementCluster:
      name: mgmt
    workerNodeGroupConfigurations:
    ...
    

Create separate workload clusters

Follow these steps if you want to use your initial cluster to create and manage separate workload clusters.

  1. Generate a workload cluster config:

    CLUSTER_NAME=w01
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider cloudstack > eksa-w01-cluster.yaml
    
  2. Modify the workload cluster config (eksa-w01-cluster.yaml) as follows. Refer to the initial config described earlier for the required and optional settings. In particular:

    • Ensure workload cluster object names (Cluster, CloudStackDatacenterConfig, CloudStackMachineConfig, etc.) are distinct from management cluster object names.
  3. Be sure to set the managementCluster field to identify the name of the management cluster.

    For example, the management cluster, mgmt, is defined for our workload cluster w01 as follows:

    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
      name: w01
    spec:
      managementCluster:
        name: mgmt
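
Before creating the workload cluster, a quick text check can confirm which management cluster the config file names. The helper below is a rough sketch, not a YAML parser: `management_cluster_name` is a hypothetical function that prints the first `name:` value following a `managementCluster:` line.

```shell
# Print the name listed under "managementCluster:" in a cluster config
# read from stdin. Relies on simple line matching, not YAML parsing.
management_cluster_name() {
    awk '
        /^[[:space:]]*managementCluster:/ { found = 1; next }
        found && /^[[:space:]]*name:/     { print $2; exit }
    '
}

# Usage:
# management_cluster_name < eksa-w01-cluster.yaml
```

For the example above, the function prints the management cluster name rather than the workload cluster name under `metadata`.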
    
  4. Set License Environment Variable

    Add a license to any cluster for which you want to receive paid support. If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):

    export EKSA_LICENSE='my-license-here'
    
  5. Create a workload cluster

    To create a new workload cluster from your management cluster run this command, identifying:

    • The workload cluster YAML file
    • The initial cluster’s credentials (this causes the workload cluster to be managed from the management cluster)
    eksctl anywhere create cluster \
        -f eksa-w01-cluster.yaml \
        --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
    # To install curated packages at cluster creation, end the line above with a backslash and add:
    #   --install-packages packages.yaml
    

    As noted earlier, adding the --kubeconfig option tells eksctl to use the management cluster identified by that kubeconfig file to create a different workload cluster.

  6. Check the workload cluster:

    You can now use the workload cluster as you would any Kubernetes cluster. Change your credentials to point to the new workload cluster (for example, w01), then run the test application with:

    export CLUSTER_NAME=w01
    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
    

    Verify the test application in the deploy test application section.

  7. Add more workload clusters:

    To add more workload clusters, go through the same steps for creating the initial workload, copying the config file to a new name (such as eksa-w02-cluster.yaml), modifying resource names, and running the create cluster command again.
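
The copy-and-rename step can be sketched as a small helper. This is an assumption-laden illustration: `clone_cluster_config` is a hypothetical function that does a blunt textual substitution of the old cluster name, so review the resulting file (IP addresses, endpoints, and any other cluster-specific values) before using it.

```shell
# Copy a cluster config to a new file, replacing every occurrence of the
# old cluster name with the new one. Refuses to overwrite an existing file.
clone_cluster_config() {
    src=$1
    old=$2
    new=$3
    dest=$4
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    sed "s/${old}/${new}/g" "$src" > "$dest"
}

# Usage:
# clone_cluster_config eksa-w01-cluster.yaml w01 w02 eksa-w02-cluster.yaml
```

Because the substitution is purely textual, it also renames any object whose name merely contains the old cluster name, which is usually what you want for distinct object names but should still be checked.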

Next steps:

  • See the Cluster management section for more information on common operational tasks like scaling and deleting the cluster.

  • See the Package management section for more information on post-creation curated packages installation.

2.3.3 - Create Nutanix production cluster

Create a production-quality cluster on Nutanix Cloud Infrastructure with AHV

EKS Anywhere supports a Nutanix Cloud Infrastructure (NCI) provider for production grade EKS Anywhere deployments. This document walks you through setting up EKS Anywhere on Nutanix Cloud Infrastructure with AHV in a way that:

  • Deploys an initial cluster in your Nutanix environment. That cluster can be used as a self-managed cluster (to run workloads) or a management cluster (to create and manage other clusters)
  • Deploys zero or more workload clusters from the management cluster

If your initial cluster is a management cluster, it is intended to stay in place so you can use it later to modify, upgrade, and delete workload clusters. Using a management cluster makes it faster to provision and delete workload clusters. It also lets you keep NCI credentials for a set of clusters in one place: on the management cluster. The alternative is to simply use your initial cluster to run workloads. See Cluster topologies for details.

Prerequisite Checklist

EKS Anywhere needs to:

Also, see the Ports and protocols page for information on ports that need to be accessible from control plane, worker, and Admin machines.

Steps

The following steps are divided into two sections:

  • Create an initial cluster (used as a management or self-managed cluster)
  • Create zero or more workload clusters from the management cluster

Create an initial cluster

Follow these steps to create an EKS Anywhere cluster that can be used either as a management cluster or as a self-managed cluster (for running workloads itself).

  1. Generate an initial cluster config (named mgmt for this example):

    CLUSTER_NAME=mgmt
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider nutanix > eksa-mgmt-cluster.yaml
    
  2. Modify the initial cluster config (eksa-mgmt-cluster.yaml) as follows:

    • Refer to Nutanix configuration for information on configuring this cluster config for a Nutanix provider.
    • Add Optional configuration settings as needed.
    • Create at least three control plane nodes and three worker nodes for a production cluster, to provide high availability and rolling upgrades.
  3. Set Credential Environment Variables

    Before you create the initial cluster, you will need to set and export these environment variables for your Nutanix Prism Central user name and password. Make sure you use single quotes around the values so that your shell does not interpret the values:

    export EKSA_NUTANIX_USERNAME='billy'
    export EKSA_NUTANIX_PASSWORD='t0p$ecret'
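
The reason for single quotes can be seen in a short demonstration. In this sketch the variable `ecret` is hypothetical and is set only to make the effect visible: inside double quotes the shell treats `$ecret` as a variable reference and replaces it.

```shell
# Hypothetical variable whose name matches the text after "$" in the password.
ecret=OOPS

single_quoted='t0p$ecret'   # single quotes: the value is kept literally
double_quoted="t0p$ecret"   # double quotes: $ecret is expanded by the shell

printf '%s\n' "$single_quoted" "$double_quoted"
# prints:
# t0p$ecret
# t0pOOPS
```

If `ecret` were unset, the double-quoted form would silently become `t0p`, which is why the single-quoted form is the safer choice for passwords.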
    
  4. Set License Environment Variable

    Add a license to any cluster for which you want to receive paid support. If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):

    export EKSA_LICENSE='my-license-here'
    

    After you have created your eksa-mgmt-cluster.yaml and set your credential environment variables, you will be ready to create the cluster.

  5. Configure Curated Packages

    The Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. To request a free trial, talk to your Amazon representative or connect with one here. Cluster creation will succeed if authentication is not set up, but some warnings may be generated. Detailed package configurations can be found here.

    If you are going to use packages, set up authentication. These credentials should have limited capabilities:

    export EKSA_AWS_ACCESS_KEY_ID="your_access_id"
    export EKSA_AWS_SECRET_ACCESS_KEY="your_secret_key"
    export EKSA_AWS_REGION="us-west-2"
    
  6. Create cluster

    eksctl anywhere create cluster \
       -f eksa-mgmt-cluster.yaml
    # To install curated packages at cluster creation, end the line above with a backslash and add:
    #   --install-packages packages.yaml
    
  7. Once the cluster is created, you can access it with the generated KUBECONFIG file in your local directory:

    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    
  8. Check the cluster nodes:

    To check that the cluster is ready, list the machines to see the control plane and worker nodes:

    kubectl get machines -n eksa-system
    

    Example command output

       NAME              CLUSTER  NODENAME                                 PROVIDERID       PHASE     AGE   VERSION
       mgmt-4gtt2        mgmt     mgmt-control-plane-1670343878900-2m4ln   nutanix://xxxx   Running   11m   v1.24.7-eks-1-24-4
       mgmt-d42xn        mgmt     mgmt-control-plane-1670343878900-jbfxt   nutanix://xxxx   Running   11m   v1.24.7-eks-1-24-4
       mgmt-md-0-9868m   mgmt     mgmt-md-0-1670343878901-lkmxw            nutanix://xxxx   Running   11m   v1.24.7-eks-1-24-4
       mgmt-md-0-njpk2   mgmt     mgmt-md-0-1670343878901-9clbz            nutanix://xxxx   Running   11m   v1.24.7-eks-1-24-4
       mgmt-md-0-p4gp2   mgmt     mgmt-md-0-1670343878901-mbktx            nutanix://xxxx   Running   11m   v1.24.7-eks-1-24-4
       mgmt-zkwrr        mgmt     mgmt-control-plane-1670343878900-jrdkk   nutanix://xxxx   Running   11m   v1.24.7-eks-1-24-4
    
  9. Check the initial cluster’s CRD:

    To ensure you are looking at the initial cluster, list the cluster CRD to see that the name of its management cluster is itself:

    kubectl get clusters mgmt -o yaml
    

    Example command output

    ...
    kubernetesVersion: "1.25"
    managementCluster:
      name: mgmt
    workerNodeGroupConfigurations:
    ...
    

Create separate workload clusters

Follow these steps if you want to use your initial cluster to create and manage separate workload clusters.

  1. Generate a workload cluster config:

    CLUSTER_NAME=w01
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider nutanix > eksa-w01-cluster.yaml
    

    Refer to the initial config described earlier for the required and optional settings. Ensure workload cluster object names (Cluster, NutanixDatacenterConfig, NutanixMachineConfig, etc.) are distinct from management cluster object names.

  2. Be sure to set the managementCluster field to identify the name of the management cluster.

    For example, the management cluster, mgmt, is defined for our workload cluster w01 as follows:

    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
      name: w01
    spec:
      managementCluster:
        name: mgmt
    
  3. Set License Environment Variable

    Add a license to any cluster for which you want to receive paid support. If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):

    export EKSA_LICENSE='my-license-here'
    
  4. Create a workload cluster

    To create a new workload cluster from your management cluster run this command, identifying:

    • The workload cluster YAML file
    • The initial cluster’s kubeconfig (this causes the workload cluster to be managed from the management cluster)
    eksctl anywhere create cluster \
        -f eksa-w01-cluster.yaml \
        --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
    # To install curated packages at cluster creation, end the line above with a backslash and add:
    #   --install-packages packages.yaml
    

    As noted earlier, adding the --kubeconfig option tells eksctl to use the management cluster identified by that kubeconfig file to create a different workload cluster.

  5. Check the workload cluster:

    You can now use the workload cluster as you would any Kubernetes cluster. Change your kubeconfig to point to the new workload cluster (for example, w01), then run the test application with:

    export CLUSTER_NAME=w01
    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
    

    Verify the test application in the deploy test application section.
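
When switching between the management cluster and several workload clusters, the kubeconfig path pattern used above can be wrapped in a helper. This is a sketch: `kubeconfig_path` is a hypothetical function, and its optional second argument (the directory where `eksctl anywhere` was run) is an assumption that defaults to the current directory.

```shell
# Print the kubeconfig path for a cluster name, following the
# <dir>/<name>/<name>-eks-a-cluster.kubeconfig pattern used in this guide.
kubeconfig_path() {
    printf '%s/%s/%s-eks-a-cluster.kubeconfig\n' "${2:-$PWD}" "$1" "$1"
}

# Usage:
# export KUBECONFIG=$(kubeconfig_path w01)
# export KUBECONFIG=$(kubeconfig_path mgmt)
```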

  6. Add more workload clusters:

    To add more workload clusters, repeat the steps used to create the first workload cluster: copy the config file to a new name (such as eksa-w02-cluster.yaml), modify the resource names, and run the create cluster command again.

Next steps:

  • See the Cluster management section for more information on common operational tasks like scaling and deleting the cluster.

  • See the Package management section for more information on post-creation curated packages installation.

2.3.4 - Create Snow production cluster

Create a production-quality cluster on AWS Snow

EKS Anywhere supports an AWS Snow provider for production grade EKS Anywhere deployments.

This document walks you through setting up EKS Anywhere on Snow as a standalone, self-managed cluster or combined set of management/workload clusters. See Cluster topologies for details.

Prerequisite checklist

EKS Anywhere on Snow needs:

Also, see the Ports and protocols page for information on ports that need to be accessible from control plane, worker, and Admin machines.

Steps

The following steps are divided into two sections:

  • Create an initial cluster (used as a management or standalone cluster)
  • Create zero or more workload clusters from the management cluster

Create an initial cluster

Follow these steps to create an EKS Anywhere cluster that can be used either as a management cluster or as a standalone cluster (for running workloads itself).

  1. Set an environment variable for your cluster name

    export CLUSTER_NAME=mgmt
    
  2. Generate a cluster config file for your Snow provider

    eksctl anywhere generate clusterconfig $CLUSTER_NAME --provider snow > eksa-mgmt-cluster.yaml
    
  3. Optionally import images to private registry

    This optional step imports EKS Anywhere artifacts and release bundle to a local registry. This is required for air-gapped installation.

    eksctl anywhere import images \
       --input /usr/lib/eks-a/artifacts/artifacts.tar.gz \
       --bundles /usr/lib/eks-a/manifests/bundle-release.yaml \
       --registry $PRIVATE_REGISTRY_ENDPOINT \
       --insecure=true
    
  4. Modify the cluster config (eksa-mgmt-cluster.yaml) as follows:

    • Refer to the Snow configuration for information on configuring this cluster config for a Snow provider.
    • Add Optional configuration settings as needed.
  5. Set License Environment Variable

    If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):

    export EKSA_LICENSE='my-license-here'
    
  6. Configure Curated Packages

    The Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. To request a free trial, talk to your Amazon representative or connect with one here. Cluster creation will succeed if authentication is not set up, but some warnings may be generated. Detailed package configurations can be found here.

    If you are going to use packages, set up authentication. These credentials should have limited capabilities:

    export EKSA_AWS_ACCESS_KEY_ID="your_access_id"
    export EKSA_AWS_SECRET_ACCESS_KEY="your_secret_key"
    export EKSA_AWS_REGION="us-west-2"
    

    Curated packages are not yet supported on air-gapped installation.

  7. Set Credential Environment Variables

    Before you create the initial cluster, you will need to use the credentials and ca-bundles files that are in the Admin instance, and export these environment variables for your AWS Snowball device credentials. Make sure you use single quotes around the values so that your shell does not interpret the values:

    export EKSA_AWS_CREDENTIALS_FILE='/PATH/TO/CREDENTIALS/FILE'
    export EKSA_AWS_CA_BUNDLES_FILE='/PATH/TO/CABUNDLES/FILE'
    

    After you have created your eksa-mgmt-cluster.yaml and set your credential environment variables, you will be ready to create the cluster.
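
A small preflight check can catch a mistyped path before cluster creation starts. The function below is a sketch: `check_snow_files` is a hypothetical helper that only verifies the two variables from this step point at readable files; it does not inspect their contents.

```shell
# Return 0 if both Snow credential variables are set and point at readable
# files; otherwise report the problem on stderr and return 1.
check_snow_files() {
    status=0
    for f in "${EKSA_AWS_CREDENTIALS_FILE:-}" "${EKSA_AWS_CA_BUNDLES_FILE:-}"; do
        if [ -z "$f" ] || [ ! -r "$f" ]; then
            echo "not set or not readable: '${f}'" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage:
# check_snow_files && echo "credential files look usable"
```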

  8. Create cluster

    a. For a non-air-gapped environment

    eksctl anywhere create cluster \
       -f eksa-mgmt-cluster.yaml
    

    b. For an air-gapped environment

    eksctl anywhere create cluster \
       -f eksa-mgmt-cluster.yaml \
       --bundles-override /usr/lib/eks-a/manifests/bundle-release.yaml
    
  9. Once the cluster is created you can use it with the generated KUBECONFIG file in your local directory:

    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    
  10. Check the cluster nodes:

    To check that cluster creation completed, list the machines to see the control plane and worker nodes:

    kubectl get machines -A
    

    Example command output:

    NAMESPACE    NAME                        CLUSTER  NODENAME                    PROVIDERID                                       PHASE    AGE    VERSION
    eksa-system  mgmt-etcd-dsxb5             mgmt                                 aws-snow:///192.168.1.231/s.i-8b0b0631da3b8d9e4  Running  4m59s  
    eksa-system  mgmt-md-0-7b7c69cf94-99sll  mgmt     mgmt-md-0-1-58nng           aws-snow:///192.168.1.231/s.i-8ebf6b58a58e47531  Running  4m58s  v1.24.9-eks-1-24-7
    eksa-system  mgmt-srrt8                  mgmt     mgmt-control-plane-1-xs4t9  aws-snow:///192.168.1.231/s.i-8414c7fcabcf3d7c1  Running  4m58s  v1.24.9-eks-1-24-7
    ...    
    
  11. Check the cluster:

    You can now use the cluster as you would any Kubernetes cluster. To try it out, run the test application with:

    export CLUSTER_NAME=mgmt
    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
    

    Verify the test application in Deploy test workload.

Create separate workload clusters

Follow these steps if you want to use your initial cluster to create and manage separate workload clusters.

  1. Generate a workload cluster config:

    CLUSTER_NAME=w01
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider snow > eksa-w01-cluster.yaml
    

    Refer to the initial config described earlier for the required and optional settings.

    NOTE: Ensure workload cluster object names (Cluster, SnowDatacenterConfig, SnowMachineConfig, etc.) are distinct from management cluster object names.

  2. Be sure to set the managementCluster field to identify the name of the management cluster.

    For example, the management cluster, mgmt, is defined for our workload cluster w01 as follows:

    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
      name: w01
    spec:
      managementCluster:
        name: mgmt
    
  3. Set License Environment Variable

    Add a license to any cluster for which you want to receive paid support. If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):

    export EKSA_LICENSE='my-license-here'
    
  4. Create a workload cluster in one of the following ways:

    • GitOps: See Manage separate workload clusters with GitOps

    • Terraform: See Manage separate workload clusters with Terraform

      NOTE: snowDatacenterConfig.spec.identityRef and a Snow bootstrap credentials secret need to be specified when provisioning a cluster through GitOps or Terraform, as the EKS Anywhere Cluster Controller will not create a Snow bootstrap credentials secret like the eksctl CLI does when the field is empty.

      snowMachineConfig.spec.sshKeyName must be specified to SSH into your nodes when provisioning a cluster through GitOps or Terraform, as the EKS Anywhere Cluster Controller will not generate the keys like eksctl CLI does when the field is empty.

    • eksctl CLI: To create a workload cluster with eksctl, run:

      eksctl anywhere create cluster \
          -f eksa-w01-cluster.yaml  \
          --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
      

      As noted earlier, adding the --kubeconfig option tells eksctl to use the management cluster identified by that kubeconfig file to create a different workload cluster.

    • kubectl CLI: The cluster lifecycle feature lets you use kubectl, or other tools that can talk to the Kubernetes API, to create a workload cluster. To use kubectl, run:

      kubectl apply -f eksa-w01-cluster.yaml
      
  5. Check the workload cluster:

    You can now use the workload cluster as you would any Kubernetes cluster.

    • If your workload cluster was created with eksctl, change your credentials to point to the new workload cluster (for example, w01), then run the test application with:

      export CLUSTER_NAME=w01
      export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
      kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
      
    • If your workload cluster was created with GitOps or Terraform, the kubeconfig for your new cluster is stored as a secret on the management cluster. You can get credentials and run the test application as follows:

      kubectl get secret -n eksa-system w01-kubeconfig -o jsonpath='{.data.value}' | base64 --decode > w01.kubeconfig
      export KUBECONFIG=w01.kubeconfig
      kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
      

    Verify the test application in the deploy test application section.

  6. Add more workload clusters:

    To add more workload clusters, repeat the steps used to create the first workload cluster: copy the config file to a new name (such as eksa-w02-cluster.yaml), modify the resource names, and run the create cluster command again.

Next steps:

  • See the Cluster management section for more information on common operational tasks like deleting the cluster.

  • See the Package management section for more information on post-creation curated packages installation.

2.3.5 - Create vSphere production cluster

Create a production-quality cluster on VMware vSphere

EKS Anywhere supports a VMware vSphere provider for production grade EKS Anywhere deployments. This document walks you through setting up EKS Anywhere on vSphere in a way that:

  • Deploys an initial cluster on your vSphere environment. That cluster can be used as a self-managed cluster (to run workloads) or a management cluster (to create and manage other clusters)
  • Deploys zero or more workload clusters from the management cluster

If your initial cluster is a management cluster, it is intended to stay in place so you can use it later to modify, upgrade, and delete workload clusters. Using a management cluster makes it faster to provision and delete workload clusters. It also lets you keep vSphere credentials for a set of clusters in one place: on the management cluster. The alternative is to simply use your initial cluster to run workloads. See Cluster topologies for details.

Prerequisite Checklist

EKS Anywhere needs to:

Also, see the Ports and protocols page for information on ports that need to be accessible from control plane, worker, and Admin machines.

Steps

The following steps are divided into two sections:

  • Create an initial cluster (used as a management or self-managed cluster)
  • Create zero or more workload clusters from the management cluster

Create an initial cluster

Follow these steps to create an EKS Anywhere cluster that can be used either as a management cluster or as a self-managed cluster (for running workloads itself).

  1. Generate an initial cluster config (named mgmt for this example):

    CLUSTER_NAME=mgmt
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider vsphere > eksa-mgmt-cluster.yaml
    
  2. Modify the initial cluster config (eksa-mgmt-cluster.yaml) as follows:

    • Refer to vSphere configuration for information on configuring this cluster config for a vSphere provider.
    • Add Optional configuration settings as needed. See GitHub provider to see how to identify your Git information.
    • Create at least two control plane nodes, three worker nodes, and three etcd nodes for a production cluster, to provide high availability and rolling upgrades.
  3. Set Credential Environment Variables

    Before you create the initial cluster, you will need to set and export these environment variables for your vSphere user name and password. Make sure you use single quotes around the values so that your shell does not interpret the values:

    export EKSA_VSPHERE_USERNAME='billy'
    export EKSA_VSPHERE_PASSWORD='t0p$ecret'
    
  4. Set License Environment Variable

    Add a license to any cluster for which you want to receive paid support. If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):

    export EKSA_LICENSE='my-license-here'
    

    After you have created your eksa-mgmt-cluster.yaml and set your credential environment variables, you will be ready to create the cluster.
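
To confirm the credential variables are set before running the create command, a generic check can be used. This is a minimal sketch: `require_env` is a hypothetical helper that reports which of the named variables are empty or unset, without printing their values.

```shell
# Return 0 if every named environment variable is set and non-empty;
# otherwise list the missing names on stderr and return 1.
require_env() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (value is not echoed).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "environment variable not set: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
# require_env EKSA_VSPHERE_USERNAME EKSA_VSPHERE_PASSWORD
```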

  5. Configure Curated Packages

    The Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. To request a free trial, talk to your Amazon representative or connect with one here. Cluster creation will succeed if authentication is not set up, but some warnings may be generated. Detailed package configurations can be found here.

    If you are going to use packages, set up authentication. These credentials should have limited capabilities:

    export EKSA_AWS_ACCESS_KEY_ID="your_access_id"
    export EKSA_AWS_SECRET_ACCESS_KEY="your_secret_key"
    export EKSA_AWS_REGION="us-west-2"
    
  6. Create cluster

    eksctl anywhere create cluster \
       -f eksa-mgmt-cluster.yaml
    # To install curated packages at cluster creation, end the line above with a backslash and add:
    #   --install-packages packages.yaml
    
  7. Once the cluster is created you can use it with the generated KUBECONFIG file in your local directory:

    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    
  8. Check the cluster nodes:

    To check that cluster creation completed, list the machines to see the control plane, etcd, and worker nodes:

    kubectl get machines -A
    

    Example command output

    NAMESPACE   NAME                PROVIDERID        PHASE    VERSION
    eksa-system mgmt-b2xyz          vsphere:/xxxxx    Running  v1.24.2-eks-1-24-5
    eksa-system mgmt-etcd-r9b42     vsphere:/xxxxx    Running  
    eksa-system mgmt-md-8-6xr-rnr   vsphere:/xxxxx    Running  v1.24.2-eks-1-24-5
    ...
    

    The etcd machine doesn’t show the Kubernetes version because it doesn’t run the kubelet service.

  9. Check the initial cluster’s CRD:

    To ensure you are looking at the initial cluster, list the CRD to see that the name of its management cluster is itself:

    kubectl get clusters mgmt -o yaml
    

    Example command output

    ...
    kubernetesVersion: "1.25"
    managementCluster:
      name: mgmt
    workerNodeGroupConfigurations:
    ...
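
The machine check in step 8 can also be scripted. The following is a minimal sketch, not part of the EKS Anywhere tooling: the helper name `not_running` is hypothetical, and the usage comment assumes that Cluster API Machine objects report their phase at `.status.phase`. The helper reads `name<TAB>phase` lines and prints every machine whose phase is not `Running`.

```shell
# Hypothetical helper: read "name<TAB>phase" lines on stdin and print the
# lines whose phase is anything other than Running.
# Returns 0 when every machine is Running, 1 otherwise.
not_running() {
  awk -F '\t' 'NF && $2 != "Running" { print; bad = 1 } END { exit bad + 0 }'
}

# Usage against a live cluster (assumes KUBECONFIG points at it and that
# the phase is exposed at .status.phase):
#   kubectl get machines -A \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' \
#     | not_running && echo "all machines are Running"
```

Reading the phase from structured output avoids depending on column positions, which shift when a cell such as the etcd machine's version is empty.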
    

Create separate workload clusters

Follow these steps if you want to use your initial cluster to create and manage separate workload clusters.

  1. Generate a workload cluster config:

    CLUSTER_NAME=w01
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider vsphere > eksa-w01-cluster.yaml
    

    Refer to the initial config described earlier for the required and optional settings.

    NOTE: Ensure workload cluster object names (Cluster, vSphereDatacenterConfig, vSphereMachineConfig, etc.) are distinct from management cluster object names.

  2. Be sure to set the managementCluster field to identify the name of the management cluster.

    For example, the management cluster, mgmt, is defined for our workload cluster w01 as follows:

    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
      name: w01
    spec:
      managementCluster:
        name: mgmt
    
  3. Set License Environment Variable

    Add a license to any cluster for which you want to receive paid support. If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):

    export EKSA_LICENSE='my-license-here'
    
  4. Create a workload cluster in one of the following ways:

    • GitOps: See Manage separate workload clusters with GitOps

    • Terraform: See Manage separate workload clusters with Terraform

      NOTE: spec.users[0].sshAuthorizedKeys must be specified to SSH into your nodes when provisioning a cluster through GitOps or Terraform, as the EKS Anywhere Cluster Controller will not generate the keys like eksctl CLI does when the field is empty.

    • eksctl CLI: To create a workload cluster with eksctl, run:

      # To install curated packages at cluster creation, add: --install-packages packages.yaml
      eksctl anywhere create cluster \
          -f eksa-w01-cluster.yaml  \
          --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
      

      As noted earlier, adding the --kubeconfig option tells eksctl to use the management cluster identified by that kubeconfig file to create a different workload cluster.

    • kubectl CLI: The cluster lifecycle feature lets you use kubectl, or other tools that can talk to the Kubernetes API, to create a workload cluster. To use kubectl, run:

      kubectl apply -f eksa-w01-cluster.yaml 
      
  5. To check the workload cluster, get the workload cluster credentials and run a test workload:

    • If your workload cluster was created with eksctl, change your credentials to point to the new workload cluster (for example, w01), then run the test application with:

      export CLUSTER_NAME=w01
      export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
      kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
      
    • If your workload cluster was created with GitOps or Terraform, the kubeconfig for your new cluster is stored as a secret on the management cluster. You can get credentials and run the test application as follows:

      kubectl get secret -n eksa-system w01-kubeconfig -o jsonpath='{.data.value}' | base64 --decode > w01.kubeconfig
      export KUBECONFIG=w01.kubeconfig
      kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
      
  6. Add more workload clusters:

    To add more workload clusters, go through the same steps used to create the initial workload cluster: copy the config file to a new name (such as eksa-w02-cluster.yaml), modify the resource names, and run the create cluster command again.
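
Before creating each workload cluster, it can help to confirm that its config file names the intended management cluster (step 2). The following is a minimal sketch under stated assumptions: the helper name `management_cluster_of` is hypothetical, and the helper is a line-oriented check rather than a YAML parser, so it assumes `name:` appears on the line directly after `managementCluster:` as in the example above.

```shell
# Hypothetical helper: print the name that a cluster config file sets under
# spec.managementCluster. Line-oriented sketch, not a YAML parser: it assumes
# "name:" is on the line immediately following "managementCluster:".
management_cluster_of() {
  awk '
    prev_is_mc && $1 == "name:" { print $2; exit }
    { prev_is_mc = ($1 == "managementCluster:") }
  ' "$1"
}

# Usage:
#   [ "$(management_cluster_of eksa-w01-cluster.yaml)" = "mgmt" ] \
#     || echo "eksa-w01-cluster.yaml does not point at mgmt" >&2
```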

Next steps:

  • See the Cluster management section for more information on common operational tasks like scaling and deleting the cluster.

  • See the Package management section for more information on post-creation curated packages installation.

3 - Concepts

The Concepts section will describe the components and overall architecture of EKS Anywhere.

Most of the content of this section will cover how EKS Anywhere deploys, upgrades and otherwise manages Kubernetes clusters. It will point to Kubernetes documentation for specifics on how Kubernetes itself works.

3.1 - Compare EKS Anywhere and EKS

Comparing Amazon EKS Anywhere features to Amazon EKS

Amazon EKS Anywhere is a new deployment option for Amazon EKS that enables you to easily create and operate Kubernetes clusters on-premises. EKS Anywhere provides an installable software package for creating and operating Kubernetes clusters on-premises and automation tooling for cluster lifecycle support. To learn more, see EKS Anywhere.

Amazon Elastic Kubernetes Service (Amazon EKS) is a managed Kubernetes service that makes it easy for you to run Kubernetes on the AWS cloud. Amazon EKS is certified Kubernetes conformant, so existing applications that run on upstream Kubernetes are compatible with Amazon EKS. To learn more about Amazon EKS, see Amazon Elastic Kubernetes Service.

Comparing Amazon EKS Anywhere to Amazon EKS

| Feature | Amazon EKS Anywhere | Amazon EKS |
|---|---|---|
| Control plane | | |
| K8s control plane management | Managed by customer | Managed by AWS |
| K8s control plane location | Customer’s datacenter | AWS cloud |
| Cluster updates | Manual CLI updates for control plane. Flux supported rolling updates for data plane | Managed in-place updates for control plane and managed rolling updates for data plane. |
| Compute | | |
| Compute options | CloudStack, VMware vSphere, Bare Metal servers | Amazon EC2, AWS Fargate |
| Supported node operating systems | Bottlerocket, Ubuntu, and RHEL | Amazon Linux 2, Windows Server, Bottlerocket, Ubuntu |
| Physical hardware (servers, network equipment, storage, etc.) | Managed by customer | Managed by AWS |
| Serverless | Not supported | Amazon EKS on AWS Fargate |
| Management | | |
| Command line interface (CLI) | eksctl (OSS command line tool) | eksctl (OSS command line tool) |
| Console view for Kubernetes objects | Optional EKS console connection using EKS Connector (public preview) | Native EKS console connection |
| Infrastructure-as-code | Cluster manifest, Kubernetes controllers, 3rd-party solutions | AWS CloudFormation, 3rd-party solutions |
| Logging and monitoring | 3rd-party solutions | CloudWatch, CloudTrail, 3rd-party solutions |
| GitOps | Flux controller | Flux controller |
| Functions and tooling | | |
| Networking and Security | Cilium CNI and network policy supported | Amazon VPC CNI supported. Calico supported for network policy. Other compatible 3rd-party CNI plugins available. |
| Load balancer | Metallb | Elastic Load Balancing including Application Load Balancer (ALB), and Network Load Balancer (NLB) |
| Service mesh | Community or 3rd-party solutions | AWS App Mesh, community, or 3rd-party solutions |
| Community tools and Helm | Works with compatible community tooling and helm charts. | Works with compatible community tooling and helm charts. |
| Pricing and support | | |
| Control plane pricing | Free to download, paid support subscription option | Hourly pricing per cluster |
| AWS Support | Additional annual subscription (per cluster) for AWS support | Basic support included. Included in paid AWS support plans (developer, business, and enterprise) |

Comparing Amazon EKS Anywhere to Amazon EKS on Outposts

Like EKS Anywhere, Amazon EKS on Outposts provides a means of running Kubernetes clusters using EKS software on-premises. The main differences are that:

  • Amazon provides the hardware with Outposts, while most EKS Anywhere providers leverage the customer’s own hardware.
  • With Amazon EKS on Outposts, the Kubernetes control plane is fully managed by AWS. With EKS Anywhere, customers are responsible for managing the lifecycle of the Kubernetes control plane with EKS Anywhere automation tooling.
  • Customers can use Amazon EKS on Outposts with the same console, APIs, and tools they use to run Amazon EKS clusters in AWS Cloud. With EKS Anywhere, customers can use the eksctl CLI to manage their clusters, optionally connect their clusters to the EKS console for observability, and optionally use infrastructure as code tools such as Terraform and GitOps to manage their clusters. However, the primary interfaces for EKS Anywhere are the EKS Anywhere Custom Resources. Amazon EKS does not have a CRD-based interface today.
  • Amazon EKS on Outposts is a regional AWS service that requires a consistent, reliable connection from the Outpost to the AWS Region. EKS Anywhere is a standalone software offering that can run entirely disconnected from AWS Cloud, including air-gapped environments.

Outposts have two deployment methods available:

  • Extended clusters: With extended clusters, the Kubernetes control plane runs in an AWS Region, while Kubernetes nodes run on Outpost hardware.

  • Local clusters: With local clusters, both the Kubernetes control plane and nodes run on Outpost hardware.

For more information, see Amazon EKS on AWS Outposts.

3.2 - Cluster creation workflow

Explanation of the process of creating an EKS Anywhere cluster

Each EKS Anywhere cluster is built from a cluster specification file, with the structure of the configuration file based on the target provider for the cluster. Currently, Bare Metal, CloudStack, Nutanix, Snow, and VMware vSphere are the recommended providers for supported EKS Anywhere clusters. Docker is available as an unsupported provider. We step through the cluster creation workflow for Bare Metal and vSphere providers here.

Management and workload clusters

EKS Anywhere offers two cluster deployment topology options:

  • Standalone cluster: If you want only a single EKS Anywhere cluster, you can deploy a self-managed, standalone cluster. This type of cluster contains all Cluster API (CAPI) management components needed to manage itself, including managing its own upgrades. It can also run workloads.

  • Management cluster with workload clusters: If you plan to deploy multiple clusters, the project recommends you first deploy a management cluster. The management cluster can then be used to deploy, upgrade, delete, and otherwise manage a fleet of workload clusters.

For further details about the different cluster topologies, see Cluster topologies.

Before cluster creation

Some assets need to be in place before you can create an EKS Anywhere cluster. You need to have an Administrative machine that includes the tools required to create the cluster. Next, you need to get the software tools and artifacts used to build the cluster. Then you also need to prepare the provider, such as a vCenter environment or a set of Bare Metal machines, on which to create the resulting cluster.

Administrative machine

The Administrative machine is needed to provide:

  • A place to run the commands to create and manage the target cluster.
  • A Docker container runtime to run a temporary, local bootstrap cluster that creates the resulting target cluster on the vSphere provider.
  • A place to hold the kubeconfig file needed to perform administrative actions using kubectl. (The kubeconfig file is stored in the root of the folder created during cluster creation.)

See the Install EKS Anywhere guide for Administrative machine requirements.

EKS Anywhere software

To obtain EKS Anywhere software, you need Internet access to the repositories holding that software. EKS Anywhere software is divided into two types of components: the CLI for managing clusters, and the cluster components and controllers used to run workloads and configure clusters. The software you need to obtain includes:

  • Command line tools: Binaries to install on the Administrative machine, including eksctl, eksctl-anywhere, kubectl, and aws-iam-authenticator.
  • Cluster components and controllers: These components are listed on the artifacts page for each provider.
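
A quick way to confirm that the command line tools are installed is to look for each one on the PATH. The following is a minimal sketch, not part of EKS Anywhere: the helper name `missing_tools` is hypothetical, and the tool list in the usage comment is taken from the list above plus docker, which the Administrative machine needs for the bootstrap cluster.

```shell
# Hypothetical helper: print each named tool that is not found on PATH.
# Returns 0 when all tools are present, 1 otherwise.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Usage on the Administrative machine:
#   missing_tools docker eksctl eksctl-anywhere kubectl aws-iam-authenticator \
#     || echo "install the tools listed above before creating a cluster" >&2
```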

If you are operating behind a firewall that limits access to the Internet, you can configure EKS Anywhere to identify the location of the proxy service you choose to connect to the Internet.

For more information on the software used in EKS Distro, which includes the Kubernetes release and related software in EKS Anywhere, see the EKS Distro Releases GitHub page.

Providers

EKS Anywhere uses an infrastructure provider model for creating, upgrading, and managing Kubernetes clusters that leverages the Kubernetes Cluster API project.

Like Cluster API, EKS Anywhere runs a kind cluster on the local Administrative machine to act as a bootstrap cluster. However, instead of using CAPI directly with the clusterctl command to manage the workload cluster, you use the eksctl anywhere command, which abstracts that process for you, including calling clusterctl under the covers.

With your Administrative machine in place, you need to prepare your Bare Metal, vSphere, CloudStack, or Snow provider for EKS Anywhere. The following sections describe how to create a Bare Metal or vSphere cluster.

Creating a Bare Metal cluster

The following diagram illustrates what happens when you start the cluster creation process for a Bare Metal provider, as described in the Bare Metal Getting started guide.

Start creating a Bare Metal cluster

Start creating EKS Anywhere Bare Metal cluster

1. Generate a config file for Bare Metal

Identify the provider (--provider tinkerbell) and the cluster name to the eksctl anywhere generate clusterconfig command and direct the output into a cluster config .yaml file.

2. Modify the config file and hardware CSV file

Modify the generated cluster config file to suit your situation. Details about this config file are contained on the Bare Metal Config page. Create a hardware configuration file (hardware.csv) as described in Prepare hardware inventory.

3. Launch the cluster creation

Run the eksctl anywhere create cluster command, providing the cluster config and hardware CSV files. To see details on the cluster creation process, increase verbosity (-v=9 provides maximum verbosity).
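
To make the launch step concrete, the following sketch assembles the Bare Metal create command from its parts and prints it for review instead of running it. The helper name `baremetal_create_cmd` is hypothetical; the flags (`-f`, `--hardware-csv`, and `-v=9`) are the ones shown elsewhere in this section.

```shell
# Hypothetical helper: print (do not run) the create command for a
# Bare Metal cluster.
#   $1: cluster config file
#   $2: hardware CSV file
#   $3: optional verbosity level (for example 9)
baremetal_create_cmd() {
  cmd="eksctl anywhere create cluster --hardware-csv $2 -f $1"
  if [ -n "${3:-}" ]; then
    cmd="$cmd -v=$3"
  fi
  printf '%s\n' "$cmd"
}

# Usage:
#   baremetal_create_cmd eksa-mgmt-cluster.yaml hardware.csv 9
```

Printing the command first gives you a chance to check the file names before the bootstrap cluster is created.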

4. Create bootstrap cluster and provision hardware

The cluster creation process starts by creating a temporary Kubernetes bootstrap cluster on the Administrative machine. Containerized components of the Tinkerbell provisioner run either as pods on the bootstrap cluster (Hegel, Rufio, and Tink) or directly as containers on Docker (Boots). Those Tinkerbell components drive the provisioning of the operating systems and Kubernetes components on each of the physical computers.

With the information gathered from the cluster specification and the hardware CSV file, three types of custom resources are created:

  • Hardware custom resources, which store hardware information for each machine
  • Template custom resources, which store the tasks and actions
  • Workflow custom resources, which put together the complete hardware and template information for each machine. There are different workflows for control plane and worker nodes.

As the bootstrap cluster comes up and Tinkerbell components are started, you should see messages like the following:

$ eksctl anywhere create cluster --hardware-csv hardware.csv -f eksa-mgmt-cluster.yaml
Performing setup and validations
 Tinkerbell Provider setup is valid
 Validate certificate for registry mirror
 Create preflight validations pass
Creating new bootstrap cluster
Provider specific pre-capi-install-setup on bootstrap cluster
Installing cluster-api providers on bootstrap cluster
Provider specific post-setup
Creating new workload cluster

At this point, Tinkerbell will try to boot up the machines in the target cluster.

Continuing cluster creation

Tinkerbell takes over the activities for provisioning the Bare Metal machines to become the new target cluster. See Overview of Tinkerbell in EKS Anywhere for examples of commands you can run to watch over this process.

Continue creating EKS Anywhere Bare Metal cluster

1. Tinkerbell network boots and configures nodes

  • Rufio uses BMC information to set the power state for the first control plane node it wants to provision.
  • When the node boots from its NIC, it talks to the Boots DHCP server, which fetches the kernel and initramfs (HookOS) needed to network boot the machine.
  • With HookOS running on the node, the operating system identified by IMG_URL in the cluster specification is copied to the identified DEST_DISK on the machine.
  • The Hegel component provides data stores that contain information used by services such as cloud-init to configure each system.
  • Next, the workflow is run on the first control plane node, followed by network booting and running the workflow for each subsequent control plane node.
  • Once the control plane is up, worker nodes are network booted and workflows are run to deploy each node.

2. Tinkerbell components move to the target cluster

Once all the defined nodes are added to the cluster, the Tinkerbell components and associated data are moved to run as pods on worker nodes in the new workload cluster.

Deleting Tinkerbell from Admin machine

All Tinkerbell-related pods and containers are then deleted from the Admin machine. Further management of Tinkerbell and related information can be done from the new cluster, using tools such as kubectl.

Delete Tinkerbell pods and container

Creating a vSphere cluster

The following diagram illustrates what happens when you start the cluster creation process, as described in the vSphere Getting started guide.

Start creating a vSphere cluster

Start creating EKS Anywhere cluster

1. Generate a config file for vSphere

To the eksctl anywhere generate clusterconfig command, you identify the name of the provider (-p vsphere) and a cluster name, and redirect the output to a file. The result is a config file template that you need to modify for the specific instance of your provider.

2. Modify the config file

Using the generated cluster config file, make modifications to suit your situation. Details about this config file are contained on the vSphere Config page.

3. Launch the cluster creation

Once you have modified the cluster configuration file, running eksctl anywhere create cluster -f $CLUSTER_NAME.yaml starts the cluster creation process. To see details on the cluster creation process, increase verbosity (-v=9 provides maximum verbosity).

4. Authenticate and create bootstrap cluster

After authenticating to vSphere and validating the assets there, the cluster creation process starts off creating a temporary Kubernetes bootstrap cluster on the Administrative machine. To begin, the cluster creation process runs a series of govc commands to check on the vSphere environment:

  • Checks that the vSphere environment is available.

  • Using the URL and credentials provided in the cluster spec files, authenticates to the vSphere provider.

  • Validates that the datacenter and the datacenter network exist.

  • Validates that the identified datastore (to store your EKS Anywhere cluster) exists, that the folder holding your EKS Anywhere cluster VMs exists, and that the resource pools containing compute resources exist. If you have multiple VSphereMachineConfig objects in your config file, you will see these validations repeated.

  • Validates the virtual machine templates to be used for the control plane and worker nodes (such as ubuntu-2004-kube-v1.20.7).

If all validations pass, you will see this message:

✅ Vsphere Provider setup is valid
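
Because these validations authenticate with credentials that you export beforehand, it can save a failed run to confirm that the variables are set before launching cluster creation. The following is a minimal sketch: the helper name `require_env` is hypothetical, and the variable names in the usage comment are assumptions, so substitute whatever the getting started guide for your provider specifies.

```shell
# Hypothetical helper: print the name of each listed variable that is unset,
# empty, or not exported. Returns 0 when all are set, 1 otherwise.
require_env() {
  rc=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      printf '%s\n' "$name"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (variable names are assumptions, not taken from this section):
#   require_env EKSA_VSPHERE_USERNAME EKSA_VSPHERE_PASSWORD \
#     || echo "export the variables listed above first" >&2
```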

Next, the process runs the kind command to build a single-node Kubernetes bootstrap cluster on the Administrative machine. This includes pulling the kind node image, preparing the node, writing the configuration, starting the control-plane, installing CNI, and installing the StorageClass.

After this point the bootstrap cluster is installed, but not yet fully configured.

Continuing cluster creation

The following diagram illustrates the activities that occur next:

Continue creating EKS Anywhere cluster

1. Add CAPI management

Cluster API (CAPI) management is added to the bootstrap cluster to direct the creation of the target cluster.

2. Set up cluster

Configure the control plane and worker nodes.

3. Add Cilium networking

Add Cilium as the CNI plugin to use for networking between the cluster services and pods.

4. Add storage

Add the default storage class to the cluster

5. Add CAPI to target cluster

Add the CAPI service to the target cluster in preparation for it to take over management of the cluster after the cluster creation is completed and the bootstrap cluster is deleted. The bootstrap cluster can then begin moving the CAPI objects over to the target cluster, so it can take over the management of itself.

With the bootstrap cluster running and configured on the Administrative machine, the creation of the target cluster begins. It uses kubectl to apply a target cluster configuration as follows:

  • Once etcd, the control plane, and the worker nodes are ready, it applies the networking configuration to the target cluster.

  • The default storage class is installed on the target cluster.

  • CAPI providers are configured on the target cluster, in preparation for the target cluster to take over responsibilities for running the components needed to manage itself.

  • With CAPI running on the target cluster, CAPI objects for the target cluster are moved from the bootstrap cluster to the target cluster’s CAPI service (done internally with the clusterctl command).

  • Kubernetes CRDs and other add-ons that are specific to EKS Anywhere are added.

  • The cluster configuration is saved.

Once etcd, the control plane, and the worker nodes are ready, it applies the networking configuration to the workload cluster:

Installing networking on workload cluster

Next, the default storage class is installed on the workload cluster:

Installing storage class on workload cluster

After that, the CAPI providers are configured on the workload cluster, in preparation for the workload cluster to take over responsibilities for running the components needed to manage itself.

Installing cluster-api providers on workload cluster

With CAPI running on the workload cluster, CAPI objects for the workload cluster are moved from the bootstrap cluster to the workload cluster’s CAPI service (done internally with the clusterctl command):

Moving cluster management from bootstrap to workload cluster

At this point, the cluster creation process will add Kubernetes CRDs and other addons that are specific to EKS Anywhere. That configuration is applied directly to the cluster:

Installing EKS-A custom components (CRD and controller) on workload cluster
Creating EKS-A CRDs instances on workload cluster
Installing GitOps Toolkit on workload cluster

If you did not specify GitOps support, starting the flux service is skipped:

GitOps field not specified, bootstrap flux skipped

The cluster configuration is saved:

Writing cluster config file

With the cluster up, and the CAPI service running on the new cluster, the bootstrap cluster is no longer needed and is deleted:

Delete EKS Anywhere bootstrap cluster
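
The progress messages quoted above can be used to tell how far a creation run got, for example when reviewing a saved log. The following is a minimal sketch under stated assumptions: the helper name `last_phase` is hypothetical, it recognizes only the messages quoted in this section, and the exact wording of the messages may differ between EKS Anywhere versions.

```shell
# Hypothetical helper: read cluster creation output on stdin and print the
# last line that contains one of the progress messages quoted in this
# section. Prints an empty line if none was seen.
last_phase() {
  last=""
  while IFS= read -r line; do
    case $line in
      *"Creating new bootstrap cluster"* | \
      *"Installing networking on workload cluster"* | \
      *"Installing storage class on workload cluster"* | \
      *"Installing cluster-api providers on workload cluster"* | \
      *"Moving cluster management from bootstrap to workload cluster"* | \
      *"Writing cluster config file"*)
        last=$line ;;
    esac
  done
  printf '%s\n' "$last"
}

# Usage: save the output while creating the cluster, then inspect it.
#   eksctl anywhere create cluster -f eksa-mgmt-cluster.yaml 2>&1 | tee create.log
#   last_phase < create.log
```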

At this point, cluster creation is complete. You can now use your target cluster as either:

  • A standalone cluster (to run workloads) or
  • A management cluster (to optionally create one or more workload clusters)

Creating workload clusters (optional)

As described in Create separate workload clusters, you can use the cluster you just created as a management cluster to create and manage one or more workload clusters on the same vSphere provider as follows:

  • Use eksctl to generate a cluster config file for the new workload cluster.
  • Modify the cluster config with a new cluster name and different vSphere resources.
  • Use eksctl to create the new workload cluster from the new cluster config file and credentials from the initial management cluster.

3.3 - Cluster topologies

Explanation of standalone vs. management/workload cluster topologies

For trying out EKS Anywhere or for times when a single cluster is needed, it is fine to create a standalone cluster and run your workloads on it. However, if you plan to create multiple clusters for running Kubernetes workloads, we recommend you create a management cluster. Then use that management cluster to manage a set of workload clusters.

This document describes those two different EKS Anywhere cluster topologies.

What is an EKS Anywhere management cluster?

An EKS Anywhere management cluster is a long-lived, on-premises Kubernetes cluster that can create and manage a fleet of EKS Anywhere workload clusters. The workload clusters are where you run your applications. The management cluster can only be created and managed by the Amazon CLI eksctl.

The management cluster runs on your on-premises hardware and it does not require any connectivity back to AWS to function. Customers are responsible for operating the management cluster including (but not limited to) patching, upgrading, scaling, and monitoring the cluster control plane and data plane.

What’s the difference between a management cluster and a standalone cluster?

From a technical point of view, they are the same. Regardless of which deployment topology you choose, you always start by creating a singleton, standalone cluster that’s capable of managing itself. This shows examples of separate, standalone clusters:

Standalone clusters self-manage and can run applications

Once a standalone cluster is created, you have the option to use it as a management cluster to create separate workload cluster(s) under it, hence making this cluster a long-lived management cluster. You can only use eksctl to create or delete the management cluster or a standalone cluster. This shows examples of a management cluster that deploys and manages multiple workload clusters:

Management clusters can create and manage multiple workload clusters

With the management cluster in place, you have a choice of tools for creating, upgrading, and deleting workload clusters. Check each provider to see which tools it currently supports. Supported workload cluster creation, upgrade and deletion tools include:

  • eksctl CLI
  • Terraform
  • GitOps
  • kubectl CLI to communicate with the Kubernetes API

What’s the difference between a management cluster and a bootstrap cluster for EKS Anywhere?

A management cluster is a long-lived entity you have to actively operate. The bootstrap cluster is a temporary, short-lived kind cluster that is created on a separate Administrative machine to facilitate the creation of an initial standalone or management cluster.

The kind cluster is automatically deleted by the end of the initial cluster creation.

When should I deploy a management cluster?

If you want to run three or more EKS Anywhere clusters, we recommend that you choose a management/workload cluster deployment topology because of the advantages listed in the table below. The EKS Anywhere Curated Packages feature recommends deploying certain packages such as the container registry package or monitoring packages on the management cluster to avoid circular dependency.

| | Standalone cluster topology | Management/workload cluster topology |
|---|---|---|
| Pros | Save hardware resources | Isolation of secrets |
| | Reduced operational overhead of maintaining a separate management cluster | Resource isolation between different teams. Reduced noisy-neighbor effect. |
| | | Isolation between development and production workloads. |
| | | Isolation between applications and fleet management services, such as monitoring server or container registry. |
| | | Provides a central control plane and API to automate cluster lifecycles |
| Cons | Shared secrets, such as SSH credentials or VMware credentials, across all teams who share the cluster. | Consumes extra resources. |
| | Without a central control plane (such as a parent management cluster), it is not possible to automate cluster creation/deletion with advanced methods like GitOps or IaC. | The creation/deletion of the management cluster itself can’t be automated through GitOps or IaC. |
| | Circular dependencies arise if the cluster has to host a monitoring server or a local container registry. | |

Which EKS Anywhere features support the management/workload cluster deployment topology today?

| Features | Supported |
|---|---|
| Create/update/delete a workload cluster on… | |
| • VMware via CLI | Y |
| • CloudStack via CLI | Y |
| • Bare Metal via CLI | Y |
| • Docker via CLI (non-production only) | Y |
| Update a workload cluster on… | |
| • VMware via GitOps/Terraform | Y |
| • CloudStack via GitOps/Terraform | Y |
| • Bare Metal via GitOps/Terraform | N |
| • Docker via CLI (non-production only) | Y |
| Create/delete a workload cluster on… | |
| • VMware via GitOps/Terraform | Y |
| • CloudStack via GitOps/Terraform | N |
| • Bare Metal via GitOps/Terraform | N |
| • Docker via GitOps/Terraform (non-production only) | Y |
| Install a curated package on the management cluster | Y |
| Install a curated package on the workload cluster from the management cluster | Y |
| Install a curated package on the management cluster during a workload cluster creation | N |

3.4 - EKS Anywhere curated packages

All information you may need for EKS Anywhere curated packages

Overview

Amazon EKS Anywhere Curated Packages are Amazon-curated software packages that extend the core functionalities of Kubernetes on your EKS Anywhere clusters. If you operate EKS Anywhere clusters on-premises, you probably install additional software to ensure the security and reliability of your clusters. However, you may be spending a lot of effort researching the right software, tracking updates, and testing them for compatibility. Now with the EKS Anywhere Curated Packages, you can rely on Amazon to provide trusted, up-to-date, and compatible software that is supported by Amazon, reducing the need for multiple vendor support agreements.

  • Amazon-built: All container images of the packages are built from source code by Amazon, including the open source (OSS) packages. OSS package images are built from the open source upstream.
  • Amazon-scanned: Amazon scans the container images including the OSS package images daily for security vulnerabilities and provides remediation.
  • Amazon-signed: Amazon signs the package bundle manifest (a Kubernetes manifest) for the list of curated packages. The manifest is signed with AWS Key Management Service (AWS KMS) managed private keys. The curated packages are installed and managed by a package controller on the clusters. Amazon provides validation of signatures through an admission control webhook in the package controller and the public keys distributed in the bundle manifest file.
  • Amazon-tested: Amazon tests the compatibility of all curated packages including the OSS packages with each new version of EKS Anywhere.
  • Amazon-supported: All curated packages including the curated OSS packages are supported under the EKS Anywhere Support Subscription.

The main components of EKS Anywhere Curated Packages are the package controller, the package build artifacts, and the command line interface. The package controller will run in a pod in an EKS Anywhere cluster. The package controller will manage the lifecycle of all curated packages.

Curated packages

Please check out curated package list for the complete list of EKS Anywhere curated packages.

Workshop

Please check out workshop for curated packages.

FAQ

  1. Can I install software not from the curated package list?

    Yes. You can install any optional software of your choice. Be aware you cannot use EKS Anywhere tooling to install or update your self-managed software. Amazon does not provide testing, security patching, software updates, or customer support for your self-managed software.

  2. Can I install software that’s on the curated package list but not sourced from EKS Anywhere repository?

    If, for example, you deploy a Harbor image that is not built and signed by Amazon, Amazon will not provide testing or customer support for your self-built images.

3.4.1 - EKS Anywhere curated package controller

Overview

The package controller will install, upgrade, configure and remove packages from the cluster. The package controller will watch the packages and packagebundle custom resources for the packages to run and their configuration values. The package controller only runs on the management cluster and manages packages on the management cluster and on the workload clusters.

Package release information is stored in a package bundle manifest. The package controller will continually monitor and download new package bundles. When a new package bundle is downloaded, it will show up as update available and users can use the CLI to activate the bundle to upgrade the installed packages.

Any change to a package custom resource will trigger an install, upgrade, configuration, or removal of that package. The package controller will use ECR or a private registry to get all resources, including the bundle, helm charts, and container images.

Installation

Please check out create local cluster and create production cluster for how to install the package controller at cluster creation time.

Please check out package management for how to install the package controller after cluster creation and manage curated packages.

3.4.2 - EKS Anywhere curated package build artifacts

There are three types of build artifacts for packages: the container images, the helm charts and the package bundle manifests. The container images, helm charts and bundle manifests for all of the packages will be built and stored in EKS Anywhere ECR repository. Each package may have multiple versions specified in the packages bundle. The bundle will reference the helm chart tag in the ECR repository. The helm chart will reference the container images for the package.

3.4.3 - EKS Anywhere curated package CLI

Overview

The Curated Packages CLI provides the user experience required to manage curated packages. Through the CLI, a user can discover, create, delete, and upgrade curated packages on a cluster. These operations are available both during and after EKS Anywhere cluster creation.

The CLI provides both imperative and declarative mechanisms to manage curated packages. These packages will be included as part of a packagebundle that will be provided by the EKS Anywhere team. Whenever a user requests a package creation through the CLI (eksctl anywhere create package), a custom resource is created on the cluster indicating the existence of a new package that needs to be installed. When a user executes a delete operation (eksctl anywhere delete package), the custom resource will be removed from the cluster indicating the need for uninstalling a package. An upgrade through the CLI (eksctl anywhere upgrade packages) upgrades all packages to the latest release.

Installation

Please check out Install EKS Anywhere to install the eksctl anywhere CLI on your machine.

Also check out Create local cluster and Create production cluster for how to use the CLI during and after cluster creation.

Check out EKS Anywhere curated package management for how to use the CLI after a cluster is created and manage curated packages.

4 - Tasks

Common actions and set-up you may need for EKS Anywhere

4.1 - Workload management

Common tasks for managing workloads.

4.1.1 - Deploy test workload

How to deploy a workload to check that your cluster is working properly

We’ve created a simple test application for you to verify your cluster is working properly. You can deploy it with the following command:

kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"

To see the new pod running in your cluster, type:

kubectl get pods -l app=hello-eks-a

Example output:

NAME                                     READY   STATUS    RESTARTS   AGE
hello-eks-a-745bfcd586-6zx6b   1/1     Running   0          22m

To check the logs of the container to make sure it started successfully, type:

kubectl logs -l app=hello-eks-a

There is also a default web page being served from the container. You can forward the deployment port to your local machine with:

kubectl port-forward deploy/hello-eks-a 8000:80

Now you should be able to open your browser or use curl against http://localhost:8000 to view the example application page.

curl localhost:8000

Example output:

⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢

Thank you for using

███████╗██╗  ██╗███████╗
██╔════╝██║ ██╔╝██╔════╝
█████╗  █████╔╝ ███████╗
██╔══╝  ██╔═██╗ ╚════██║
███████╗██║  ██╗███████║
╚══════╝╚═╝  ╚═╝╚══════╝

 █████╗ ███╗   ██╗██╗   ██╗██╗    ██╗██╗  ██╗███████╗██████╗ ███████╗
██╔══██╗████╗  ██║╚██╗ ██╔╝██║    ██║██║  ██║██╔════╝██╔══██╗██╔════╝
███████║██╔██╗ ██║ ╚████╔╝ ██║ █╗ ██║███████║█████╗  ██████╔╝█████╗  
██╔══██║██║╚██╗██║  ╚██╔╝  ██║███╗██║██╔══██║██╔══╝  ██╔══██╗██╔══╝  
██║  ██║██║ ╚████║   ██║   ╚███╔███╔╝██║  ██║███████╗██║  ██║███████╗
╚═╝  ╚═╝╚═╝  ╚═══╝   ╚═╝    ╚══╝╚══╝ ╚═╝  ╚═╝╚══════╝╚═╝  ╚═╝╚══════╝

You have successfully deployed the hello-eks-a pod hello-eks-a-c5b9bc9d8-qp6bg

For more information check out
https://anywhere.eks.amazonaws.com

⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢

If you would like to expose your applications with an external load balancer or an ingress controller, you can follow the steps in Adding an external load balancer.

4.1.2 - Add an external load balancer

How to deploy a load balancer controller to expose a workload running in EKS Anywhere

While you are free to use any load balancer you like with your EKS Anywhere cluster, AWS currently only supports the MetalLB load balancer. For information on how to configure a MetalLB curated package for EKS Anywhere, see the Add MetalLB page.

4.1.3 - Add an ingress controller

How to deploy an ingress controller for simple host or URL-based HTTP routing into workloads running in EKS Anywhere

While you are free to use any Ingress Controller you like with your EKS Anywhere cluster, AWS currently only supports Emissary Ingress. For information on how to configure an Emissary Ingress curated package for EKS Anywhere, see the Add Emissary Ingress page.

Setting up Emissary-ingress for Ingress Controller

  1. Deploy the Hello EKS Anywhere test application.

    kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
    
  2. Set up a load balancer: Set up MetalLB Load Balancer by following the instructions here

  3. Install Emissary Ingress: Follow the instructions here Add Emissary Ingress

  4. Create Emissary Listeners on your cluster (This is a one time setup).

    kubectl apply -f - <<EOF
    ---
    apiVersion: getambassador.io/v3alpha1
    kind: Listener
    metadata:
      name: http-listener
      namespace: default
    spec:
      port: 8080
      protocol: HTTPS
      securityModel: XFP
      hostBinding:
        namespace:
          from: ALL
    ---
    apiVersion: getambassador.io/v3alpha1
    kind: Listener
    metadata:
      name: https-listener
      namespace: default
    spec:
      port: 8443
      protocol: HTTPS
      securityModel: XFP
      hostBinding:
        namespace:
          from: ALL
    EOF
    
  5. Create a Mapping on your cluster. This Mapping tells Emissary-ingress to route all traffic inbound to the /backend/ path to the Hello EKS Anywhere Service. Set the hostname value to the IP address of the LoadBalancer resource that MetalLB deployed for you.

    kubectl apply -f - <<EOF
    ---
    apiVersion: getambassador.io/v2
    kind: Mapping
    metadata:
      name: hello-backend
    spec:
      prefix: /backend/
      service: hello-eks-a
      hostname: "195.16.99.65"
    EOF
    
  6. Store the Emissary-ingress load balancer IP address to a local environment variable. You will use this variable to test accessing your service.

    export EMISSARY_LB_ENDPOINT=$(kubectl get svc ambassador -o "go-template={{range .status.loadBalancer.ingress}}{{or .ip .hostname}}{{end}}")
    
  7. Test the configuration by accessing the service through the Emissary-ingress load balancer.

    curl -Lk http://$EMISSARY_LB_ENDPOINT/backend/
    

    NOTE: URL base path will need to match what is specified in the prefix exactly, including the trailing ‘/’

    You should see something like this in the output

    ⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢
    
    Thank you for using
    
    ███████╗██╗  ██╗███████╗                                             
    ██╔════╝██║ ██╔╝██╔════╝                                             
    █████╗  █████╔╝ ███████╗                                             
    ██╔══╝  ██╔═██╗ ╚════██║                                             
    ███████╗██║  ██╗███████║                                             
    ╚══════╝╚═╝  ╚═╝╚══════╝                                             
    
     █████╗ ███╗   ██╗██╗   ██╗██╗    ██╗██╗  ██╗███████╗██████╗ ███████╗
    ██╔══██╗████╗  ██║╚██╗ ██╔╝██║    ██║██║  ██║██╔════╝██╔══██╗██╔════╝
    ███████║██╔██╗ ██║ ╚████╔╝ ██║ █╗ ██║███████║█████╗  ██████╔╝█████╗  
    ██╔══██║██║╚██╗██║  ╚██╔╝  ██║███╗██║██╔══██║██╔══╝  ██╔══██╗██╔══╝  
    ██║  ██║██║ ╚████║   ██║   ╚███╔███╔╝██║  ██║███████╗██║  ██║███████╗
    ╚═╝  ╚═╝╚═╝  ╚═══╝   ╚═╝    ╚══╝╚══╝ ╚═╝  ╚═╝╚══════╝╚═╝  ╚═╝╚══════╝
    
    You have successfully deployed the hello-eks-a pod hello-eks-a-c5b9bc9d8-fx2fr
    
    For more information check out
    https://anywhere.eks.amazonaws.com
    
    ⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢⬡⬢
    
    

4.1.4 - Secure connectivity with CNI and Network Policy

How to validate the setup of Cilium CNI and deploy network policies to secure workload connectivity.

EKS Anywhere uses Cilium for pod networking and security.

Cilium is installed by default as a Kubernetes CNI plugin and so is already running in your EKS Anywhere cluster.

This section provides information about:

  • Understanding Cilium components and requirements

  • Validating your Cilium networking setup.

  • Using Cilium to secure workload connectivity using Kubernetes Network Policy.

Cilium Components

The primary Cilium Agent runs as a DaemonSet on each Kubernetes node. Each cluster also includes a Cilium Operator Deployment to handle certain cluster-wide operations. For EKS Anywhere, Cilium is configured to use the Kubernetes API server as the identity store, so no etcd cluster connectivity is required.

In a properly working environment, each Kubernetes node should have a Cilium Agent pod (cilium-WXYZ) in “Running” and ready (1/1) state. By default there will be two Cilium Operator pods (cilium-operator-123456-WXYZ) in “Running” and ready (1/1) state on different Kubernetes nodes for high-availability.

Run the following command to confirm that all Cilium-related pods are in a healthy state.

kubectl get pods -n kube-system | grep cilium

Example output for this command in a 3 node environment is:

kube-system   cilium-fsjmd                                1/1     Running           0          4m
kube-system   cilium-nqpkv                                1/1     Running           0          4m
kube-system   cilium-operator-58ff67b8cd-jd7rf            1/1     Running           0          4m
kube-system   cilium-operator-58ff67b8cd-kn6ss            1/1     Running           0          4m
kube-system   cilium-zz4mt                                1/1     Running           0          4m
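On a larger cluster, scanning this listing by eye gets tedious. The following is a minimal sketch, not part of EKS Anywhere, of a filter that prints only the pods that are not fully ready or not in the Running state. The function name unhealthy_pods is hypothetical, and the sketch assumes the READY column is the first field shaped like N/N with STATUS immediately after it.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# every pod line that is not fully ready (N/N) or not in the Running state.
# Returns non-zero if at least one such pod is found.
unhealthy_pods() {
  awk '
    {
      ready = ""; status = ""
      for (i = 1; i <= NF; i++) {
        # Assumption: READY is the first field shaped like "1/1".
        if ($i ~ /^[0-9]+\/[0-9]+$/) { ready = $i; status = $(i + 1); break }
      }
      if (ready == "") next            # header row or unrelated line
      split(ready, parts, "/")
      if (parts[1] != parts[2] || status != "Running") { print; bad = 1 }
    }
    END { exit (bad ? 1 : 0) }
  '
}
```

To use it, pipe the command above into the function: kubectl get pods -n kube-system | grep cilium | unhealthy_pods. No output and a zero exit status mean every listed pod is Running and ready.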

Network Connectivity Requirements

To provide pod connectivity within an on-premises environment, the Cilium agent implements an overlay network using the GENEVE tunneling protocol. As a result, UDP port 6081 connectivity MUST be allowed by any firewall running between Kubernetes nodes running the Cilium agent.

Allowing ICMP Ping (type = 8, code = 0) as well as TCP port 4240 is also recommended in order for Cilium Agents to validate node-to-node connectivity as part of internal status reporting.

Validating Connectivity

Cilium includes a connectivity check YAML that can be deployed into a test namespace in order to validate proper installation and connectivity within a Kubernetes cluster. If the connectivity check passes, all pods created by the YAML manifest will reach “Running” and ready (1/1) state. We recommend running this test only once you have multiple worker nodes in your environment to ensure you are validating cross-node connectivity.

It is important that this test is run in a dedicated namespace, with no existing network policy. For example:

kubectl create ns cilium-test
kubectl apply -n cilium-test -f https://docs.isovalent.com/v1.10/public/connectivity-check-eksa.yaml

Once all pods have started, simply checking the status of pods in this namespace will indicate whether the tests have passed:

kubectl get pods -n cilium-test

Successful test output will show all pods in a “Running” and ready (1/1) state:

NAME                                                     READY   STATUS    RESTARTS   AGE
echo-a-d576c5f8b-zlfsk                                   1/1     Running   0          59s
echo-b-787dc99778-sxlcc                                  1/1     Running   0          59s
echo-b-host-675cd8cfff-qvvv8                             1/1     Running   0          59s
host-to-b-multi-node-clusterip-6fd884bcf7-pvj5d          1/1     Running   0          58s
host-to-b-multi-node-headless-79f7df47b9-8mzbp           1/1     Running   0          58s
pod-to-a-57695cc7ff-6tqpv                                1/1     Running   0          59s
pod-to-a-allowed-cnp-7b6d5ff99f-4rhrs                    1/1     Running   0          59s
pod-to-a-denied-cnp-6887b57579-zbs2t                     1/1     Running   0          59s
pod-to-b-intra-node-hostport-7d656d7bb9-6zjrl            1/1     Running   0          57s
pod-to-b-intra-node-nodeport-569d7c647-76gn5             1/1     Running   0          58s
pod-to-b-multi-node-clusterip-fdf45bbbc-8l4zz            1/1     Running   0          59s
pod-to-b-multi-node-headless-64b6cbdd49-9hcqg            1/1     Running   0          59s
pod-to-b-multi-node-hostport-57fc8854f5-9d8m8            1/1     Running   0          58s
pod-to-b-multi-node-nodeport-54446bdbb9-5xhfd            1/1     Running   0          58s
pod-to-external-1111-56548587dc-rmj9f                    1/1     Running   0          59s
pod-to-external-fqdn-allow-google-cnp-5ff4986c89-z4h9j   1/1     Running   0          59s

Afterward, delete the namespace to clean up the connectivity test:

kubectl delete ns cilium-test

Kubernetes Network Policy

By default, all Kubernetes workloads within a cluster can talk to any other workloads in the cluster, as well as any workloads outside the cluster. To enable a stronger security posture, Cilium implements the Kubernetes Network Policy specification to provide identity-aware firewalling / segmentation of Kubernetes workloads.

Network policies are defined as Kubernetes YAML specifications that are applied to a particular namespace to describe which connections should be allowed to or from a given set of pods. These network policies are “identity-aware” in that they describe workloads within the cluster using Kubernetes metadata like namespace and labels, rather than by IP Address.

Basic network policies are validated as part of the above Cilium connectivity check test.
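As an illustration of the label-based model described above, here is a minimal sketch of a standard Kubernetes NetworkPolicy for the hello-eks-a test workload. The policy name and the role=frontend client label are hypothetical. The app=hello-eks-a label and port 80 are taken from the test workload shown earlier. The sketch writes the manifest to a file so it can be reviewed before it is applied.

```shell
# Write a sample NetworkPolicy to a file for review (nothing is applied here).
policy_file="${TMPDIR:-/tmp}/hello-eks-a-netpol.yaml"
cat > "$policy_file" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: hello-eks-a-allow-frontend   # hypothetical policy name
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: hello-eks-a               # pods this policy protects
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend             # hypothetical label on allowed clients
    ports:
    - protocol: TCP
      port: 80
EOF
echo "Wrote $policy_file"
```

After reviewing the file, you could apply it with kubectl apply -f "$policy_file". Once a pod is selected by an ingress policy, only the connections that the policy lists are allowed in, so pods in the default namespace without the role=frontend label would no longer reach hello-eks-a on port 80.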

For next steps on leveraging Network Policy, we encourage you to explore:

Additional Cilium Features

Many advanced features of Cilium are not yet enabled as part of EKS Anywhere, including: Hubble observability, DNS-aware and HTTP-Aware Network Policy, Multi-cluster Routing, Transparent Encryption, and Advanced Load-balancing.

Please contact the EKS Anywhere team if you are interested in leveraging these advanced features along with EKS Anywhere.

4.2 - Cluster management

Common tasks for managing clusters.

4.2.1 - Cluster management overview

Overview of tools and interfaces for managing EKS Anywhere clusters

The content in this page describes the tools and interfaces available to an administrator after an EKS Anywhere cluster is up and running. It also describes which administrative actions can be done:

  • Directly in Kubernetes itself (such as adding nodes with kubectl).
  • Through the EKS Anywhere API (such as deleting a cluster with eksctl).
  • Through tools that interface with the Kubernetes API (such as managing a cluster with terraform).

Note that direct changes to OVAs before nodes are deployed are not yet supported. However, we are working on a solution for that issue.

4.2.2 - Scale cluster

How to scale your cluster

4.2.2.1 - Scale Bare Metal cluster

How to scale your Bare Metal cluster

Scaling nodes on Bare Metal clusters

When you are horizontally scaling your Bare Metal EKS Anywhere cluster, consider the number of nodes you need for your control plane and for your data plane.

See the Kubernetes Components documentation to learn the differences between the control plane and the data plane (worker nodes).

Horizontally scaling the cluster is done by increasing the number for the control plane or worker node groups under the Cluster specification.

NOTE: If etcd is running on your control plane (the default configuration) you should scale your control plane in odd numbers (3, 5, 7…).

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: test-cluster
spec:
  controlPlaneConfiguration:
    count: 1     # increase this number to horizontally scale your control plane
...    
  workerNodeGroupsConfiguration:
  - count: 1     # increase this number to horizontally scale your data plane
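The odd-number recommendation in the note above follows from etcd's majority quorum: a cluster of N members needs floor(N/2) + 1 of them available, so an even member count adds a machine without adding failure tolerance. A small sketch of that arithmetic (the function name etcd_quorum is hypothetical):

```shell
# Hypothetical helper: quorum size and tolerated member failures
# for an etcd cluster with the given number of members.
etcd_quorum() {
  quorum=$(( $1 / 2 + 1 ))
  echo "members=$1 quorum=$quorum tolerated_failures=$(( $1 - quorum ))"
}

# Compare a few control plane sizes.
for n in 1 2 3 4 5; do
  etcd_quorum "$n"
done
```

Three and four members both tolerate a single failure, which is why stepping from 3 to 5 is preferred over stepping from 3 to 4.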

Next, you must ensure you have enough available hardware for the scale-up operation to function. Available hardware could have been fed to the cluster as extra hardware during a prior create command, or could be fed to the cluster during the scale-up process by providing the hardware CSV file to the upgrade cluster command (explained in detail below). For a scale-down operation, you can skip directly to the upgrade cluster command.

To check whether you have enough available hardware for the scale-up, use the kubectl command below to look for hardware objects that have the selector labels corresponding to the control plane or worker node group and that do not have the ownerName label.

kubectl get hardware -n eksa-system --show-labels

For example, if you want to scale a worker node group with selector label type=worker-group-1, then you must have an additional hardware object in your cluster with the label type=worker-group-1 that doesn’t have the ownerName label.

In the output shown below, eksa-worker2 matches the selector label and it doesn’t have the ownerName label. Thus, it can be used to scale up worker-group-1 by 1.

kubectl get hardware -n eksa-system --show-labels 
NAME                STATE       LABELS
eksa-controlplane               type=controlplane,v1alpha1.tinkerbell.org/ownerName=abhnvp-control-plane-template-1656427179688-9rm5f,v1alpha1.tinkerbell.org/ownerNamespace=eksa-system
eksa-worker1                    type=worker-group-1,v1alpha1.tinkerbell.org/ownerName=abhnvp-md-0-1656427179689-9fqnx,v1alpha1.tinkerbell.org/ownerNamespace=eksa-system
eksa-worker2                    type=worker-group-1
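The same check can be scripted. The following is a minimal sketch, assuming the listing format shown above (a NAME header row, with the labels in the last column). The function name available_hardware is hypothetical. It prints the names of hardware objects that carry the given selector label and have no ownerName label.

```shell
# Hypothetical helper: reads `kubectl get hardware -n eksa-system --show-labels`
# output on stdin and prints hardware that matches the selector label ($1)
# and has no ownerName label, meaning it is free for a scale-up.
available_hardware() {
  awk -v sel="$1" '
    NR == 1 && $1 == "NAME" { next }             # skip the header row
    {
      labels = $NF                               # assumption: LABELS is the last column
      if (index(labels, "ownerName=") > 0) next  # already owned by a machine
      n = split(labels, kv, ",")
      for (i = 1; i <= n; i++) {
        if (kv[i] == sel) { print $1; break }
      }
    }
  '
}
```

For example: kubectl get hardware -n eksa-system --show-labels | available_hardware type=worker-group-1. With the listing above, this would print eksa-worker2.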

If you don’t have any available hardware that matches this requirement in the cluster, you can set up a new hardware CSV. You can feed this hardware inventory file to the upgrade cluster command.

Upgrade Cluster Command for Scale Up/Down

With Hardware CSV File
eksctl anywhere upgrade cluster -f cluster.yaml --hardware-csv <hardware.csv>
Without Hardware CSV File
eksctl anywhere upgrade cluster -f cluster.yaml

Autoscaling

EKS Anywhere supports autoscaling of worker node groups using the Kubernetes Cluster Autoscaler, which is available as a curated package.

See here for details on how to configure your cluster spec to autoscale worker node groups.

4.2.2.2 - Scale CloudStack cluster

How to scale your CloudStack cluster

Autoscaling

EKS Anywhere supports autoscaling of worker node groups using the Kubernetes Cluster Autoscaler, which is available as a curated package.

See here for details on how to configure your cluster spec to autoscale worker node groups.

4.2.2.3 - Scale Nutanix cluster

How to scale your Nutanix cluster

When you are scaling your Nutanix EKS Anywhere cluster, consider the number of nodes you need for your control plane and for your data plane. Each plane can be scaled horizontally (add more nodes) or vertically (provide nodes with more resources). In each case you can scale the cluster manually, semi-automatically, or automatically.

See the Kubernetes Components documentation to learn the differences between the control plane and the data plane (worker nodes).

Manual cluster scaling

Horizontally scaling the cluster is done by increasing the number for the control plane or worker node groups under the Cluster specification.

NOTE: If etcd is running on your control plane (the default configuration) you should scale your control plane in odd numbers (3, 5, 7…).

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: test-cluster
spec:
  controlPlaneConfiguration:
    count: 1     # increase this number to horizontally scale your control plane
...    
  workerNodeGroupsConfiguration:
  - count: 1     # increase this number to horizontally scale your data plane

Vertically scaling your cluster is done by updating the machine config spec for your infrastructure provider. For a Nutanix cluster an example is

NOTE: Not all providers can be vertically scaled (e.g. bare metal)

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: NutanixMachineConfig
metadata:
  name: test-machine
  namespace: default
spec:
  systemDiskSize: 50    # increase this number to make the VM disk larger
  vcpuSockets: 8        # increase this number to add vCPUs to your VM
  memorySize: 8192      # increase this number to add memory to your VM

Once you have made configuration updates you can apply the changes to your cluster. If you are adding or removing a node, only the terminated nodes will be affected. If you are vertically scaling your nodes, then all nodes will be replaced one at a time.

eksctl anywhere upgrade cluster -f cluster.yaml

Semi-automatic scaling

Scaling your cluster in a semi-automatic way still requires changing your cluster manifest configuration. In a semi-automatic mode you change your cluster spec and then have automation make the cluster changes.

You can do this by storing your cluster config manifest in Git and then having a CI/CD system deploy your changes. Or you can use a GitOps controller to apply the changes. To read more about making changes with the integrated Flux GitOps controller, see Manage a cluster with GitOps.

Autoscaling

EKS Anywhere supports autoscaling of worker node groups using the Kubernetes Cluster Autoscaler, which is available as a curated package.

See here for details on how to configure your cluster spec to autoscale worker node groups.

4.2.2.4 - Scale vSphere cluster

How to scale your vSphere cluster

When you are scaling your vSphere EKS Anywhere cluster, consider the number of nodes you need for your control plane and for your data plane. Each plane can be scaled horizontally (add more nodes) or vertically (provide nodes with more resources). In each case you can scale the cluster manually, semi-automatically, or automatically.

See the Kubernetes Components documentation to learn the differences between the control plane and the data plane (worker nodes).

Manual cluster scaling

Horizontally scaling the cluster is done by increasing the number for the control plane or worker node groups under the Cluster specification.

NOTE: If etcd is running on your control plane (the default configuration) you should scale your control plane in odd numbers (3, 5, 7…).

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: test-cluster
spec:
  controlPlaneConfiguration:
    count: 1     # increase this number to horizontally scale your control plane
...    
  workerNodeGroupsConfiguration:
  - count: 1     # increase this number to horizontally scale your data plane

Vertically scaling your cluster is done by updating the machine config spec for your infrastructure provider. For a vSphere cluster an example is

NOTE: Not all providers can be vertically scaled (e.g. bare metal)

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: test-machine
  namespace: default
spec:
  diskGiB: 25       # increase this number to make the VM disk larger
  numCPUs: 2        # increase this number to add vCPUs to your VM
  memoryMiB: 8192   # increase this number to add memory to your VM

Once you have made configuration updates you can apply the changes to your cluster. If you are adding or removing a node, only the terminated nodes will be affected. If you are vertically scaling your nodes, then all nodes will be replaced one at a time.

eksctl anywhere upgrade cluster -f cluster.yaml

Semi-automatic scaling

Scaling your cluster in a semi-automatic way still requires changing your cluster manifest configuration. In a semi-automatic mode you change your cluster spec and then have automation make the cluster changes.

You can do this by storing your cluster config manifest in Git and then having a CI/CD system deploy your changes. Or you can use a GitOps controller to apply the changes. To read more about making changes with the integrated Flux GitOps controller, see Manage a cluster with GitOps.

Autoscaling

EKS Anywhere supports autoscaling of worker node groups using the Kubernetes Cluster Autoscaler, which is available as a curated package.

See here for details on how to configure your cluster spec to autoscale worker node groups.

4.2.3 - Upgrade cluster

How to perform a cluster upgrade

4.2.3.1 - Upgrade Bare Metal cluster

How to perform a cluster upgrade for Bare Metal cluster

EKS Anywhere provides the command upgrade, which allows you to upgrade various aspects of your EKS Anywhere cluster. When you run eksctl anywhere upgrade cluster -f ./cluster.yaml, EKS Anywhere runs a set of preflight checks to ensure your cluster is ready to be upgraded. EKS Anywhere then performs the upgrade, modifying your cluster to match the updated specification. The upgrade command also upgrades core components of EKS Anywhere and lets the user enjoy the latest features, bug fixes and security patches.

NOTE: Currently only Minor Version Upgrades are supported for Bare Metal clusters. No other aspects of the cluster upgrades are currently supported.

Minor Version Upgrades

Kubernetes has minor releases three times per year and EKS Distro follows a similar cadence. EKS Anywhere will add support for new EKS Distro releases as they are released, and you are advised to upgrade your cluster when possible.

Cluster upgrades are not handled automatically and require administrator action to modify the cluster specification and perform an upgrade. You are advised to upgrade your clusters in development environments first and verify your workloads and controllers are compatible with the new version.

Cluster upgrades are performed using a rolling upgrade process (similar to Kubernetes Deployments). Upgrades can only happen one minor version at a time (e.g. 1.24 -> 1.25). Control plane components will be upgraded before worker nodes.
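Because only one minor version step is allowed per upgrade, it can help to check the intended version change before editing the cluster specification. The following is a minimal sketch of such a guard. The function name is_single_minor_step is hypothetical, and it only compares version strings of the form MAJOR.MINOR; it does not query the cluster.

```shell
# Hypothetical helper: succeeds only when the target version ($2) is exactly
# one minor version above the current version ($1) within the same major version.
is_single_minor_step() {
  cur_major=${1%%.*}; cur_minor=${1#*.}; cur_minor=${cur_minor%%.*}
  new_major=${2%%.*}; new_minor=${2#*.}; new_minor=${new_minor%%.*}
  [ "$cur_major" -eq "$new_major" ] && [ "$new_minor" -eq $(( cur_minor + 1 )) ]
}

if is_single_minor_step 1.24 1.25; then
  echo "1.24 -> 1.25 is a single minor version step"
fi
if ! is_single_minor_step 1.24 1.26; then
  echo "1.24 -> 1.26 skips a minor version; upgrade to 1.25 first"
fi
```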

Prerequisites

This type of upgrade requires you to have one spare hardware server for control plane upgrade and one for each worker node group upgrade. The spare hardware server is provisioned with the new version and then an old server is deprovisioned. The deprovisioned server is then reprovisioned with the new version while another old server is deprovisioned. This happens one at a time until all the control plane components have been upgraded, followed by worker node upgrades.

Core component upgrades

EKS Anywhere upgrade also supports upgrading the following core components:

  • Core CAPI
  • CAPI providers
  • Cilium CNI plugin
  • Cert-manager
  • Etcdadm CAPI provider
  • EKS Anywhere controllers and CRDs
  • GitOps controllers (Flux) - this is an optional component, will be upgraded only if specified

The latest versions of these core EKS Anywhere components are embedded into a bundles manifest that the CLI uses to fetch the latest versions and image builds needed for each component upgrade. The command detects both component version changes and new builds of the same versioned component. If there is a new Kubernetes version that is going to get rolled out, the core components get upgraded before the Kubernetes version. Irrespective of a Kubernetes version change, the upgrade command will always upgrade the internal EKS Anywhere components mentioned above to their latest available versions. All upgrade changes are backwards compatible.

Check upgrade components

Before you perform an upgrade, check the current and new versions of components that are ready to upgrade by typing:

eksctl anywhere upgrade plan cluster -f cluster.yaml

The output should appear similar to the following:

Worker node group name not specified. Defaulting name to md-0.
Warning: The recommended number of control plane nodes is 3 or 5
Worker node group name not specified. Defaulting name to md-0.
Checking new release availability...
NAME                     CURRENT VERSION                 NEXT VERSION
EKS-A                    v0.0.0-dev+build.1000+9886ba8   v0.0.0-dev+build.1105+46598cb
cluster-api              v1.0.2+e8c48f5                  v1.0.2+1274316
kubeadm                  v1.0.2+92c6d7e                  v1.0.2+aa1a03a
vsphere                  v1.0.1+efb002c                  v1.0.1+ef26ac1
kubadm                   v1.0.2+f002eae                  v1.0.2+f443dcf
etcdadm-bootstrap        v1.0.2-rc3+54dcc82              v1.0.0-rc3+df07114
etcdadm-controller       v1.0.2-rc3+a817792              v1.0.0-rc3+a310516

To format the output in json, add -o json to the end of the command line.

Check hardware availability

Next, you must ensure you have enough available hardware for the rolling upgrade operation to function. This type of upgrade requires you to have one spare hardware server for the control plane upgrade and one for each worker node group upgrade. Check prerequisites for more information. Available hardware could have been fed to the cluster as extra hardware during a prior create command, or could be fed to the cluster during the upgrade process by providing the hardware CSV file to the upgrade cluster command.

To check if you have enough available hardware for rolling upgrade, you can use the kubectl command below to check if there are hardware objects with the selector labels corresponding to the controlplane/worker node group and without the ownerName label.

kubectl get hardware -n eksa-system --show-labels

For example, to upgrade a cluster with one worker node group that uses the selector label type=worker-group-1, you need one additional hardware object with the label type=controlplane (for the control plane upgrade) and one with the label type=worker-group-1 (for the worker node group upgrade), neither of which has the ownerName label.

In the output shown below, eksa-worker2 matches the selector label and it doesn’t have the ownerName label. Thus, it can be used to perform the rolling upgrade of worker-group-1. Similarly, eksa-controlplane-spare will be used for the rolling upgrade of the control plane.

kubectl get hardware -n eksa-system --show-labels 
NAME                STATE       LABELS
eksa-controlplane               type=controlplane,v1alpha1.tinkerbell.org/ownerName=abhnvp-control-plane-template-1656427179688-9rm5f,v1alpha1.tinkerbell.org/ownerNamespace=eksa-system
eksa-controlplane-spare         type=controlplane
eksa-worker1                    type=worker-group-1,v1alpha1.tinkerbell.org/ownerName=abhnvp-md-0-1656427179689-9fqnx,v1alpha1.tinkerbell.org/ownerNamespace=eksa-system
eksa-worker2                    type=worker-group-1
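To confirm the spare-hardware prerequisite for every node group at once, a check along the following lines could be run before the upgrade. This is a sketch under the same assumptions as the listing above (a NAME header row, labels in the last column). The function name missing_spares is hypothetical. It prints each selector label for which no unowned hardware exists.

```shell
# Hypothetical helper: reads `kubectl get hardware -n eksa-system --show-labels`
# output on stdin. Arguments are the selector labels that each need a spare.
# Prints every selector with no unowned hardware; returns non-zero if any is missing.
missing_spares() {
  awk -v wanted="$*" '
    BEGIN { n = split(wanted, sels, " ") }
    $1 == "NAME" { next }                        # skip the header row
    index($NF, "ownerName=") == 0 {              # hardware with no owner is a spare
      m = split($NF, kv, ",")
      for (i = 1; i <= m; i++) spare[kv[i]] = 1
    }
    END {
      for (i = 1; i <= n; i++) {
        if (!(sels[i] in spare)) { print sels[i]; missing = 1 }
      }
      exit (missing ? 1 : 0)
    }
  '
}
```

For example: kubectl get hardware -n eksa-system --show-labels | missing_spares type=controlplane type=worker-group-1. With the listing above, nothing is printed and the exit status is zero, because both labels have a spare.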

If you don’t have any available hardware that matches this requirement in the cluster, you can set up a new hardware CSV. You can feed this hardware inventory file to the upgrade cluster command.

Performing a cluster upgrade

To perform a cluster upgrade you can modify your cluster specification kubernetesVersion field to the desired version.

As an example, to upgrade a cluster with version 1.24 to 1.25 you would change your spec as follows:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: dev
spec:
  controlPlaneConfiguration:
    count: 1
    endpoint:
      host: "198.18.99.49"
    machineGroupRef:
      kind: TinkerbellMachineConfig
      name: dev
      ...
  kubernetesVersion: "1.25"
      ...

NOTE: If you have a custom machine image for your nodes in your cluster config yaml, you may also need to update your TinkerbellDatacenterConfig with a new osImageURL.

Then run the upgrade cluster command.

Upgrade cluster command

With hardware CSV
eksctl anywhere upgrade cluster -f cluster.yaml --hardware-csv <hardware.csv>
Without hardware CSV
eksctl anywhere upgrade cluster -f cluster.yaml

This will upgrade the cluster specification (if specified), upgrade the core components to the latest available versions and apply the changes using the provisioner controllers.

Output

Example output:

✅ control plane ready
✅ worker nodes ready
✅ nodes ready
✅ cluster CRDs ready
✅ cluster object present on workload cluster
✅ upgrade cluster kubernetes version increment
✅ validate immutable fields
🎉 all cluster upgrade preflight validations passed
Performing provider setup and validations
Pausing EKS-A cluster controller reconcile
Pausing Flux kustomization
GitOps field not specified, pause flux kustomization skipped
Creating bootstrap cluster
Installing cluster-api providers on bootstrap cluster
Moving cluster management from workload to bootstrap cluster
Upgrading workload cluster
Moving cluster management from bootstrap to workload cluster
Applying new EKS-A cluster resource; resuming reconcile
Resuming EKS-A controller reconciliation
Updating Git Repo with new EKS-A cluster spec
GitOps field not specified, update git repo skipped
Forcing reconcile Git repo with latest commit
GitOps not configured, force reconcile flux git repo skipped
Resuming Flux kustomization
GitOps field not specified, resume flux kustomization skipped

During the upgrade process, EKS Anywhere pauses the cluster controller reconciliation by adding the paused annotation anywhere.eks.amazonaws.com/paused: true to the EKS Anywhere cluster, provider datacenterconfig and machineconfig resources, before the components upgrade. After upgrade completes, the annotations are removed so that the cluster controller resumes reconciling the cluster.

Though not recommended, you can manually pause the EKS Anywhere cluster controller reconciliation to perform extended maintenance work or interact with Cluster API objects directly. To do it, you can add the paused annotation to the cluster resource:

kubectl annotate clusters.anywhere.eks.amazonaws.com ${CLUSTER_NAME} -n ${CLUSTER_NAMESPACE} anywhere.eks.amazonaws.com/paused=true

After finishing the task, make sure you resume the cluster reconciliation by removing the paused annotation, so that EKS Anywhere cluster controller can continue working as expected.

kubectl annotate clusters.anywhere.eks.amazonaws.com ${CLUSTER_NAME} -n ${CLUSTER_NAMESPACE} anywhere.eks.amazonaws.com/paused-
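
To reduce the risk of forgetting the resume step, the pause and resume commands above can be wrapped around a maintenance task. The following is a minimal sketch, assuming a POSIX-style shell; the function name with_paused_reconcile and the KUBECTL override variable are hypothetical conveniences, not EKS Anywhere features.

```shell
# KUBECTL can be overridden (for example with "echo") to do a dry run.
KUBECTL="${KUBECTL:-kubectl}"
RESOURCE="clusters.anywhere.eks.amazonaws.com"
ANNOTATION="anywhere.eks.amazonaws.com/paused"

# with_paused_reconcile CLUSTER NAMESPACE COMMAND [ARGS...]
# Hypothetical helper: pauses reconciliation, runs COMMAND, then resumes.
with_paused_reconcile() {
  cluster=$1
  namespace=$2
  shift 2
  # Pause reconciliation; give up if the annotation cannot be added.
  $KUBECTL annotate "$RESOURCE" "$cluster" -n "$namespace" "$ANNOTATION=true" || return 1
  # Run the maintenance command and remember its exit code.
  "$@"
  rc=$?
  # Always remove the annotation so the controller resumes reconciling.
  $KUBECTL annotate "$RESOURCE" "$cluster" -n "$namespace" "$ANNOTATION-"
  return "$rc"
}

# Example usage (my-maintenance-script.sh is a placeholder):
#   with_paused_reconcile "${CLUSTER_NAME}" "${CLUSTER_NAMESPACE}" ./my-maintenance-script.sh
```

The wrapper returns the exit code of the maintenance command, so a failed task is still visible to the caller after reconciliation has been resumed.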

Upgradeable cluster attributes

Cluster:

  • kubernetesVersion

Advanced configuration for rolling upgrade

EKS Anywhere allows an optional configuration to customize the behavior of upgrades.

It allows you to specify two parameters that control the desired behavior of rolling upgrades:

  • maxSurge - The maximum number of machines that can be scheduled above the desired number of machines. When not specified, the current CAPI default of 1 is used.
  • maxUnavailable - The maximum number of machines that can be unavailable during the upgrade. When not specified, the current CAPI default of 0 is used.

Example configuration:

upgradeRolloutStrategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0    # only configurable for worker nodes

‘upgradeRolloutStrategy’ configuration can be specified separately for control plane and for each worker node group. This template contains an example for control plane under the ‘controlPlaneConfiguration’ section and for worker node group under ‘workerNodeGroupConfigurations’:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  clusterNetwork:
    cniConfig:
      cilium: {}
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneConfiguration:
    count: 1
    endpoint:
      host: "10.61.248.209"
    machineGroupRef:
      kind: TinkerbellMachineConfig
      name: my-cluster-name-cp
    upgradeRolloutStrategy:
      type: RollingUpdate
      rollingUpdate:
        maxSurge: 1
  datacenterRef:
    kind: TinkerbellDatacenterConfig
    name: my-cluster-name
  kubernetesVersion: "1.25"
  managementCluster:
    name: my-cluster-name 
  workerNodeGroupConfigurations:
  - count: 2
    machineGroupRef:
      kind: TinkerbellMachineConfig
      name: my-cluster-name 
    name: md-0
    upgradeRolloutStrategy:
      type: RollingUpdate
      rollingUpdate:
        maxSurge: 1
        maxUnavailable: 0

---
...

upgradeRolloutStrategy

Configuration parameters for upgrade strategy.

upgradeRolloutStrategy.type

Type of rollout strategy. Currently only RollingUpdate is supported.

upgradeRolloutStrategy.rollingUpdate

Configuration parameters for customizing rolling upgrade behavior.

upgradeRolloutStrategy.rollingUpdate.maxSurge

Default: 1

This cannot be 0 if maxUnavailable is 0.

The maximum number of machines that can be scheduled above the desired number of machines.

Example: When this is set to n, the new worker node group can be scaled up immediately by n when the rolling upgrade starts. Total number of machines in the cluster (old + new) never exceeds (desired number of machines + n). Once scale down happens and old machines are brought down, the new worker node group can be scaled up further ensuring that the total number of machines running at any time does not exceed the desired number of machines + n.

upgradeRolloutStrategy.rollingUpdate.maxUnavailable

Default: 0

This cannot be 0 if maxSurge is 0.

The maximum number of machines that can be unavailable during the upgrade.

Example: When this is set to n, the old worker node group can be scaled down by n machines immediately when the rolling upgrade starts. Once new machines are ready, the old worker node group can be scaled down further, followed by scaling up the new worker node group, ensuring that the total number of machines unavailable at any time during the upgrade never exceeds n.
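
The arithmetic behind these two parameters can be worked through with a small helper. This is only a sketch of the bounds described above, not the Cluster API implementation; the function name rollout_bounds is hypothetical.

```shell
# rollout_bounds DESIRED MAX_SURGE MAX_UNAVAILABLE
# Hypothetical helper: prints the largest machine count (old + new) and the
# smallest number of available machines that a rolling upgrade may reach.
rollout_bounds() {
  desired=$1
  max_surge=$2
  max_unavailable=$3
  # Both values cannot be 0, otherwise the rollout could never make progress.
  if [ "$max_surge" -eq 0 ] && [ "$max_unavailable" -eq 0 ]; then
    echo "invalid: maxSurge and maxUnavailable cannot both be 0" >&2
    return 1
  fi
  echo "max_total=$((desired + max_surge)) min_available=$((desired - max_unavailable))"
}

rollout_bounds 2 1 0   # defaults: one extra machine is needed, none go unavailable
rollout_bounds 2 0 1   # no extra machine is needed, one node may be unavailable
```

With the defaults (maxSurge 1, maxUnavailable 0) and a desired count of 2, up to 3 machines exist at once, which is why spare hardware is required.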

Rolling upgrades with no additional hardware

Setting maxSurge to 0 and maxUnavailable to 1 allows a rolling upgrade without the need for additional hardware. Use this configuration if your workloads can tolerate node unavailability.

NOTE: This could ONLY be used if unavailability of a maximum of 1 node is acceptable. For single node clusters, an additional temporary machine is a must. Alternatively, you may recreate the single node cluster for upgrading and handle data recovery manually.

With this kind of configuration, the rolling upgrade proceeds node by node: each node is fully deprovisioned and deleted, re-provisioned with the upgraded version, and then rejoined to the cluster. This means that at any point during the rolling upgrade, one node could be unavailable.

Troubleshooting

Attempting to upgrade a cluster by more than one minor release results in the following error.

✅ validate immutable fields
❌ validation failed    {"validation": "Upgrade preflight validations", "error": "validation failed with 1 errors: WARNING: version difference between upgrade version (1.21) and server version (1.19) do not meet the supported version increment of +1", "remediation": ""}
Error: failed to upgrade cluster: validations failed

For more errors, see the troubleshooting section.
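
Before editing kubernetesVersion, it can help to check the version jump locally. The sketch below is an illustration of the "+1 minor version" rule from the error message, not the CLI's actual validation code. The function name is_supported_increment is hypothetical, versions are assumed to be plain MAJOR.MINOR strings such as 1.24, and treating an unchanged version as acceptable is an assumption.

```shell
# is_supported_increment CURRENT TARGET
# Hypothetical helper: succeeds when TARGET has the same major version as
# CURRENT and a minor version that is equal to or exactly one higher.
is_supported_increment() {
  current_major=${1%%.*}
  current_minor=${1#*.}
  target_major=${2%%.*}
  target_minor=${2#*.}
  [ "$current_major" = "$target_major" ] || return 1
  diff=$((target_minor - current_minor))
  [ "$diff" -eq 0 ] || [ "$diff" -eq 1 ]
}

if is_supported_increment 1.19 1.21; then
  echo "1.19 -> 1.21: ok"
else
  echo "1.19 -> 1.21: upgrade one minor version at a time"
fi
```

For a jump such as 1.19 to 1.21, the supported path is two separate upgrades: first to 1.20, then to 1.21.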

4.2.3.2 - Upgrade vSphere, CloudStack, Nutanix, or Snow cluster

How to perform a cluster upgrade for vSphere, CloudStack, Nutanix, or Snow cluster

EKS Anywhere provides the command upgrade, which allows you to upgrade various aspects of your EKS Anywhere cluster. When you run eksctl anywhere upgrade cluster -f ./cluster.yaml, EKS Anywhere runs a set of preflight checks to ensure your cluster is ready to be upgraded. EKS Anywhere then performs the upgrade, modifying your cluster to match the updated specification. The upgrade command also upgrades core components of EKS Anywhere and lets the user enjoy the latest features, bug fixes and security patches.

NOTE: If an upgrade fails, it is very important not to delete the Docker containers running the KinD bootstrap cluster. During an upgrade, the bootstrap cluster contains critical EKS Anywhere components. If it is deleted after a failed upgrade, they cannot be recovered.

Minor Version Upgrades

Kubernetes has minor releases three times per year and EKS Distro follows a similar cadence. EKS Anywhere will add support for new EKS Distro releases as they are released, and you are advised to upgrade your cluster when possible.

Cluster upgrades are not handled automatically and require administrator action to modify the cluster specification and perform an upgrade. You are advised to upgrade your clusters in development environments first and verify your workloads and controllers are compatible with the new version.

Cluster upgrades are performed in place using a rolling process (similar to Kubernetes Deployments). Upgrades can only happen one minor version at a time (e.g. 1.24 -> 1.25). Control plane components will be upgraded before worker nodes.

A new VM is created with the new version and then an old VM is removed. This happens one at a time until all the control plane components have been upgraded.

Core component upgrades

EKS Anywhere upgrade also supports upgrading the following core components:

  • Core CAPI
  • CAPI providers
  • Cilium CNI plugin
  • Cert-manager
  • Etcdadm CAPI provider
  • EKS Anywhere controllers and CRDs
  • GitOps controllers (Flux) - this is an optional component, will be upgraded only if specified

The latest versions of these core EKS Anywhere components are embedded into a bundles manifest that the CLI uses to fetch the latest versions and image builds needed for each component upgrade. The command detects both component version changes and new builds of the same versioned component. If there is a new Kubernetes version that is going to get rolled out, the core components get upgraded before the Kubernetes version. Irrespective of a Kubernetes version change, the upgrade command will always upgrade the internal EKS Anywhere components mentioned above to their latest available versions. All upgrade changes are backwards compatible.

Specifically for the Snow provider, a new Admin instance is needed when upgrading to new versions of EKS Anywhere. See Upgrade EKS Anywhere AMIs in Snowball Edge devices to upgrade and use a new Admin instance in Snow devices. After that, upgrades of other components can be done as described in this document.

Check upgrade components

Before you perform an upgrade, check the current and new versions of components that are ready to upgrade by typing:

Management Cluster

eksctl anywhere upgrade plan cluster -f mgmt-cluster.yaml

Workload Cluster

eksctl anywhere upgrade plan cluster -f workload-cluster.yaml --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig

The output should appear similar to the following:

Worker node group name not specified. Defaulting name to md-0.
Warning: The recommended number of control plane nodes is 3 or 5
Worker node group name not specified. Defaulting name to md-0.
Checking new release availability...
NAME                     CURRENT VERSION                 NEXT VERSION
EKS-A                    v0.0.0-dev+build.1000+9886ba8   v0.0.0-dev+build.1105+46598cb
cluster-api              v1.0.2+e8c48f5                  v1.0.2+1274316
kubeadm                  v1.0.2+92c6d7e                  v1.0.2+aa1a03a
vsphere                  v1.0.1+efb002c                  v1.0.1+ef26ac1
kubeadm                  v1.0.2+f002eae                  v1.0.2+f443dcf
etcdadm-bootstrap        v1.0.2-rc3+54dcc82              v1.0.0-rc3+df07114
etcdadm-controller       v1.0.2-rc3+a817792              v1.0.0-rc3+a310516

To format the output as JSON, add -o json to the end of the command line.
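
If you only want to see which components would change, the tabular plan output can be filtered. The following sketch assumes the three-column layout shown above (NAME, CURRENT VERSION, NEXT VERSION) and reads the plan output from standard input; the function name changed_components is hypothetical.

```shell
# changed_components
# Hypothetical helper: reads `eksctl anywhere upgrade plan cluster` table
# output on stdin and prints components whose next version differs.
changed_components() {
  awk '
    $1 == "NAME" { in_table = 1; next }            # the header row starts the table
    in_table && NF == 3 && $2 != $3 { print $1 }   # versions differ: will be upgraded
  '
}

# Example usage (requires the EKS Anywhere CLI and a cluster spec):
#   eksctl anywhere upgrade plan cluster -f mgmt-cluster.yaml | changed_components
```

Lines printed before the table header, such as warnings, are ignored.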

Performing a cluster upgrade

To perform a cluster upgrade you can modify your cluster specification kubernetesVersion field to the desired version.

As an example, to upgrade a cluster with version 1.24 to 1.25 you would change your spec as follows:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: dev
spec:
  controlPlaneConfiguration:
    count: 1
    endpoint:
      host: "198.18.99.49"
    machineGroupRef:
      kind: VSphereMachineConfig
      name: dev
      ...
  kubernetesVersion: "1.25"
      ...

NOTE: If you have a custom machine image for your nodes, you may also need to update your VSphereMachineConfig with a new template.

Then run the upgrade command:

Management Cluster

eksctl anywhere upgrade cluster -f mgmt-cluster.yaml

Workload Cluster

eksctl anywhere upgrade cluster -f workload-cluster.yaml --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig

This will upgrade the cluster specification (if specified), upgrade the core components to the latest available versions and apply the changes using the provisioner controllers.

Example output:

✅ control plane ready
✅ worker nodes ready
✅ nodes ready
✅ cluster CRDs ready
✅ cluster object present on workload cluster
✅ upgrade cluster kubernetes version increment
✅ validate immutable fields
🎉 all cluster upgrade preflight validations passed
Performing provider setup and validations
Pausing EKS-A cluster controller reconcile
Pausing Flux kustomization
GitOps field not specified, pause flux kustomization skipped
Creating bootstrap cluster
Installing cluster-api providers on bootstrap cluster
Moving cluster management from workload to bootstrap cluster
Upgrading workload cluster
Moving cluster management from bootstrap to workload cluster
Applying new EKS-A cluster resource; resuming reconcile
Resuming EKS-A controller reconciliation
Updating Git Repo with new EKS-A cluster spec
GitOps field not specified, update git repo skipped
Forcing reconcile Git repo with latest commit
GitOps not configured, force reconcile flux git repo skipped
Resuming Flux kustomization
GitOps field not specified, resume flux kustomization skipped

During the upgrade process, EKS Anywhere pauses the cluster controller reconciliation by adding the paused annotation anywhere.eks.amazonaws.com/paused: true to the EKS Anywhere cluster, provider datacenterconfig and machineconfig resources, before the components upgrade. After upgrade completes, the annotations are removed so that the cluster controller resumes reconciling the cluster.

Though not recommended, you can manually pause the EKS Anywhere cluster controller reconciliation to perform extended maintenance work or interact with Cluster API objects directly. To do it, you can add the paused annotation to the cluster resource:

kubectl annotate clusters.anywhere.eks.amazonaws.com ${CLUSTER_NAME} -n ${CLUSTER_NAMESPACE} anywhere.eks.amazonaws.com/paused=true

After finishing the task, make sure you resume the cluster reconciliation by removing the paused annotation, so that EKS Anywhere cluster controller can continue working as expected.

kubectl annotate clusters.anywhere.eks.amazonaws.com ${CLUSTER_NAME} -n ${CLUSTER_NAMESPACE} anywhere.eks.amazonaws.com/paused-

Upgradeable Cluster Attributes

EKS Anywhere upgrade supports upgrading more than just the kubernetesVersion, allowing you to upgrade a number of fields simultaneously with the same procedure.

Upgradeable Attributes

Cluster:

  • kubernetesVersion
  • controlPlaneConfiguration.count
  • controlPlaneConfiguration.machineGroupRef.name
  • workerNodeGroupConfigurations.count
  • workerNodeGroupConfigurations.machineGroupRef.name
  • etcdConfiguration.externalConfiguration.machineGroupRef.name
  • identityProviderRefs (Only for kind:OIDCConfig, kind:AWSIamConfig is immutable)
  • gitOpsRef (Once set, you can’t change or delete the field’s content later)

VSphereMachineConfig:

  • datastore
  • diskGiB
  • folder
  • memoryMiB
  • numCPUs
  • resourcePool
  • template
  • users

NutanixMachineConfig:

  • vcpusPerSocket
  • vcpuSockets
  • memorySize
  • image
  • cluster
  • subnet
  • systemDiskSize

SnowMachineConfig:

  • amiID
  • instanceType
  • physicalNetworkConnector
  • sshKeyName
  • devices
  • containersVolume
  • osFamily
  • network

OIDCConfig:

  • clientID
  • groupsClaim
  • groupsPrefix
  • issuerUrl
  • requiredClaims.claim
  • requiredClaims.value
  • usernameClaim
  • usernamePrefix

AWSIamConfig:

  • mapRoles
  • mapUsers

EKS Anywhere upgrade also supports adding more worker node groups post-creation. To add more worker node groups, modify your cluster config file to define the additional group(s). Example:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: dev
spec:
  controlPlaneConfiguration:
     ...
  workerNodeGroupConfigurations:
  - count: 2
    machineGroupRef:
      kind: VSphereMachineConfig
      name: my-cluster-machines
    name: md-0
  - count: 2
    machineGroupRef:
      kind: VSphereMachineConfig
      name: my-cluster-machines
    name: md-1
      ...

Worker node groups can use the same machineGroupRef as previous groups, or you can define a new machine configuration for your new group.

Resume upgrade after failure

EKS Anywhere supports re-running the upgrade command post-failure as an experimental feature. If the upgrade command fails, the user can manually fix the issue (when applicable) and simply rerun the same command. At this point, the CLI will skip the completed tasks, restore the state of the operation, and resume the upgrade process. The completed tasks are stored in the generated folder as a file named <clusterName>-checkpoint.yaml.

This feature is experimental. To enable this feature, export the following environment variable:
export CHECKPOINT_ENABLED=true

Troubleshooting

Attempting to upgrade a cluster by more than one minor release results in the following error.

✅ validate immutable fields
❌ validation failed    {"validation": "Upgrade preflight validations", "error": "validation failed with 1 errors: WARNING: version difference between upgrade version (1.21) and server version (1.19) do not meet the supported version increment of +1", "remediation": ""}
Error: failed to upgrade cluster: validations failed

For more errors, see the troubleshooting section.

4.2.4 - Etcd Backup and Restore

How to backup and restore an EKS Anywhere cluster

NOTE: External etcd topology is supported for vSphere and CloudStack clusters, but not yet for Bare Metal or Nutanix clusters.

This page contains steps for backing up a cluster by taking an etcd snapshot, and restoring the cluster from a snapshot. These steps are for an EKS Anywhere cluster provisioned using the external etcd topology (selected by default) and Ubuntu OVAs.

Use case

EKS Anywhere clusters use etcd as the backing store. Taking a snapshot of etcd backs up the entire cluster data. The snapshot can later be used to restore a cluster to an earlier state if required. Etcd backups can be taken prior to a cluster upgrade, so if the upgrade doesn’t go as planned you can restore from the backup.

Backup

Etcd offers a built-in snapshot mechanism. You can take a snapshot using the etcdctl snapshot save command by following the steps given below.

  1. Log in to any one of the etcd VMs
ssh -i $PRIV_KEY ec2-user@$ETCD_VM_IP
  2. Run the etcdctl command to take a snapshot with the following steps
sudo su
source /etc/etcd/etcdctl.env
etcdctl snapshot save snapshot.db
chown ec2-user snapshot.db
  3. Exit the VM. Copy the snapshot from the VM to your local/admin setup where you can save snapshots in a secure place. Before running scp, make sure you don’t already have a snapshot file saved by the same name locally.
scp -i $PRIV_KEY ec2-user@$ETCD_VM_IP:/home/ec2-user/snapshot.db . 

NOTE: This snapshot file contains all information stored in the cluster, so make sure you save it securely (encrypt it).
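
One way to avoid overwriting an earlier snapshot when copying is to give each local copy a unique, timestamped name. The sketch below is only a suggestion: the function name snapshot_name and the naming scheme are assumptions, not an EKS Anywhere convention.

```shell
# snapshot_name CLUSTER
# Hypothetical helper: prints a local file name that embeds the cluster name
# and the current UTC time, for example snapshot-dev-20240101T120000Z.db.
snapshot_name() {
  printf 'snapshot-%s-%s.db\n' "$1" "$(date -u +%Y%m%dT%H%M%SZ)"
}

# Example usage (requires SSH access to the etcd VM):
#   scp -i $PRIV_KEY ec2-user@$ETCD_VM_IP:/home/ec2-user/snapshot.db "$(snapshot_name dev)"
```

Names in this format also sort chronologically, which makes it easier to find the most recent backup.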

Restore

Restoring etcd is a 2-part process. The first part is restoring etcd using the snapshot, creating a new data-dir for etcd. The second part is replacing the current etcd data-dir with the one generated after restore. During etcd data-dir replacement, we cannot have any kube-apiserver instances running in the cluster. So we will first stop all instances of kube-apiserver and other controlplane components using the following steps for every controlplane VM:

Pausing Etcdadm controller reconcile

During restore, it is required to pause the Etcdadm controller reconcile for the target cluster (whether it is management or workload cluster). To do that, you need to add a cluster.x-k8s.io/paused annotation to the target cluster’s etcdadmclusters resource. For example,

kubectl annotate etcdadmclusters workload-cluster-1-etcd cluster.x-k8s.io/paused=true -n eksa-system --kubeconfig mgmt-cluster.kubeconfig

Stopping the controlplane components

  1. Log in to a controlplane VM
ssh -i $PRIV_KEY ec2-user@$CONTROLPLANE_VM_IP
  2. Stop controlplane components by moving the static pod manifests under a temp directory:
sudo su
mkdir temp-manifests
mv /etc/kubernetes/manifests/*.yaml temp-manifests
  3. Repeat these steps for all other controlplane VMs

After this you can restore etcd from a saved snapshot using the etcdctl snapshot restore command by following the steps given below.

Restoring from the snapshot

  1. The snapshot file should be made available in every etcd VM of the cluster. You can copy it to each etcd VM using this command:
scp -i $PRIV_KEY snapshot.db ec2-user@$ETCD_VM_IP:/home/ec2-user
  2. To run the etcdctl snapshot restore command, you need to provide the following configuration parameters:
  • name: This is the name of the etcd member. The value of this parameter should match the value used while starting the member. This can be obtained by running:
export ETCD_NAME=$(cat /etc/etcd/etcd.env | grep ETCD_NAME | awk -F'=' '{print $2}')
  • initial-advertise-peer-urls: This is the advertise peer URL with which this etcd member was configured. It should be the exact value with which this etcd member was started. This can be obtained by running:
export ETCD_INITIAL_ADVERTISE_PEER_URLS=$(cat /etc/etcd/etcd.env | grep ETCD_INITIAL_ADVERTISE_PEER_URLS | awk -F'=' '{print $2}')
  • initial-cluster: This should be a comma-separated mapping of each etcd member name to its peer URL. To build it, get the ETCD_NAME and ETCD_INITIAL_ADVERTISE_PEER_URLS values for each member and join them, then use this exact value on all etcd VMs. For example, for a 3-member etcd cluster the value would look like the following (the command below is meant as an example and cannot be run directly without substituting the required variables):
export ETCD_INITIAL_CLUSTER=${ETCD_NAME_1}=${ETCD_INITIAL_ADVERTISE_PEER_URLS_1},${ETCD_NAME_2}=${ETCD_INITIAL_ADVERTISE_PEER_URLS_2},${ETCD_NAME_3}=${ETCD_INITIAL_ADVERTISE_PEER_URLS_3}
  • initial-cluster-token: Set this to a unique value and use the same value for all etcd members of the cluster. It can be any value such as etcd-cluster-1 as long as it hasn’t been used before.
  3. Gather the required env vars for the restore command
cat <<EOF >> restore.env
export ETCD_NAME=$(cat /etc/etcd/etcd.env | grep ETCD_NAME | awk -F'=' '{print $2}')
export ETCD_INITIAL_ADVERTISE_PEER_URLS=$(cat /etc/etcd/etcd.env | grep ETCD_INITIAL_ADVERTISE_PEER_URLS | awk -F'=' '{print $2}')
EOF

cat /etc/etcd/etcdctl.env >> restore.env
  4. Make sure you form the correct ETCD_INITIAL_CLUSTER value using all etcd members, and set it as an env var in the restore.env file created in the above step.
  5. Once you have obtained all the right values, run the following commands to restore etcd, replacing the required values:
sudo su
source restore.env
etcdctl snapshot restore snapshot.db --name=${ETCD_NAME} --initial-cluster=${ETCD_INITIAL_CLUSTER} --initial-cluster-token=etcd-cluster-1 --initial-advertise-peer-urls=${ETCD_INITIAL_ADVERTISE_PEER_URLS}
  6. This creates a new data-dir for the restored contents under a new directory ${ETCD_NAME}.etcd. To start using it, restart etcd with the new data-dir using the following steps:
systemctl stop etcd.service
mv /var/lib/etcd/member /var/lib/etcd/member.bak
mv ${ETCD_NAME}.etcd/member /var/lib/etcd/
  7. Perform this directory swap on all etcd VMs, and then start etcd again on those VMs
systemctl start etcd.service

NOTE: Until the etcd process is started on all VMs, it might appear stuck on the VMs where it was started first, but this should be temporary.
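
The initial-cluster value described in the restore steps above is easy to mistype by hand. As a minimal sketch, the helper below joins member names and peer URLs into the expected comma-separated form. The function name join_initial_cluster and the member names and URLs in the example are hypothetical; substitute the values gathered from each etcd VM.

```shell
# join_initial_cluster NAME1 URL1 [NAME2 URL2 ...]
# Hypothetical helper: prints NAME1=URL1,NAME2=URL2,... for use as the
# ETCD_INITIAL_CLUSTER value.
join_initial_cluster() {
  result=""
  while [ $# -ge 2 ]; do
    # Append "name=url", adding a comma only after the first pair.
    result="${result:+$result,}$1=$2"
    shift 2
  done
  printf '%s\n' "$result"
}

# Example with placeholder member names and peer URLs:
join_initial_cluster etcd-1 https://10.0.0.1:2380 etcd-2 https://10.0.0.2:2380 etcd-3 https://10.0.0.3:2380
```

The same resulting string must be used on every etcd VM, so it may be convenient to generate it once and append it to each restore.env file.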

Starting the controlplane components

  1. Log in to a controlplane VM
ssh -i $PRIV_KEY ec2-user@$CONTROLPLANE_VM_IP
  2. Start the controlplane components by moving the static pod manifests back from the temp directory to the /etc/kubernetes/manifests directory:
mv temp-manifests/*.yaml /etc/kubernetes/manifests
  3. Repeat these steps for all other controlplane VMs
  4. It may take a few minutes for the kube-apiserver and the other components to restart. After this you should be able to access all objects that were present in the cluster at the time the backup was taken.

Resuming Etcdadm controller reconcile

Resume Etcdadm controller reconcile for the target cluster by removing the cluster.x-k8s.io/paused annotation in the target cluster’s etcdadmclusters resource. For example,

kubectl annotate etcdadmclusters workload-cluster-1-etcd cluster.x-k8s.io/paused- -n eksa-system

4.2.5 - Verify cluster

How to verify an EKS Anywhere cluster is running properly

To verify that a cluster control plane is up and running, use the kubectl command to show that the control plane pods are all running.

kubectl get po -A -l control-plane=controller-manager
NAMESPACE                           NAME                                                             READY   STATUS    RESTARTS   AGE
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager-57b99f579f-sd85g       2/2     Running   0          47m
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager-79cdf98fb8-ll498   2/2     Running   0          47m
capi-system                         capi-controller-manager-59f4547955-2ks8t                         2/2     Running   0          47m
capi-webhook-system                 capi-controller-manager-bb4dc9878-2j8mg                          2/2     Running   0          47m
capi-webhook-system                 capi-kubeadm-bootstrap-controller-manager-6b4cb6f656-qfppd       2/2     Running   0          47m
capi-webhook-system                 capi-kubeadm-control-plane-controller-manager-bf7878ffc-rgsm8    2/2     Running   0          47m
capi-webhook-system                 capv-controller-manager-5668dbcd5-v5szb                          2/2     Running   0          47m
capv-system                         capv-controller-manager-584886b7bd-f66hs                         2/2     Running   0          47m

You may also check the status of the cluster control plane resource directly. This can be especially useful to verify clusters with multiple control plane nodes after an upgrade.

kubectl get kubeadmcontrolplanes.controlplane.cluster.x-k8s.io
NAME                       INITIALIZED   API SERVER AVAILABLE   VERSION              REPLICAS   READY   UPDATED   UNAVAILABLE
supportbundletestcluster   true          true                   v1.20.7-eks-1-20-6   1          1       1

To verify that the expected number of cluster worker nodes are up and running, use the kubectl command to show that the nodes are Ready. Worker nodes are named using the cluster name followed by the worker node group name (example: my-cluster-md-0).

kubectl get nodes
NAME                                           STATUS   ROLES                  AGE    VERSION
supportbundletestcluster-md-0-55bb5ccd-mrcf9   Ready    <none>                 4m   v1.20.7-eks-1-20-6
supportbundletestcluster-md-0-55bb5ccd-zrh97   Ready    <none>                 4m   v1.20.7-eks-1-20-6
supportbundletestcluster-mdrwf                 Ready    control-plane,master   5m   v1.20.7-eks-1-20-6
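
The node check can also be turned into a count that is easy to compare against the expected number of workers. This sketch assumes the column layout shown above and treats rows whose ROLES column is <none> as worker nodes, which is an assumption based on this example output; the function name ready_workers is hypothetical.

```shell
# ready_workers
# Hypothetical helper: reads `kubectl get nodes` output on stdin and prints
# the number of nodes that are Ready and have no role (assumed to be workers).
ready_workers() {
  awk 'NR > 1 && $2 == "Ready" && $3 == "<none>" { n++ } END { print n + 0 }'
}

# Example usage (requires access to the cluster):
#   [ "$(kubectl get nodes | ready_workers)" -eq 2 ] && echo "all workers ready"
```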

To test a workload in your cluster, you can try deploying the hello-eks-anywhere application.

4.2.6 - Add cluster integrations

How to add integrations to an EKS Anywhere cluster

EKS Anywhere offers AWS support for certain third-party vendor components, namely Ubuntu LTS, Cilium, and Flux. It also provides flexibility for you to integrate with your choice of tools in other areas. Below is a list of example third-party tools for your consideration.

For a full list of partner integration options, visit the Amazon EKS Anywhere Partner page.

Feature                       Example third-party tools
Ingress controller            Gloo Edge, Emissary-ingress (previously Ambassador)
Service type load balancer    MetalLB
Local container repository    Harbor
Monitoring                    Prometheus, Grafana, Datadog, or NewRelic
Logging                       Splunk or Fluentbit
Secret management             Hashi Vault
Policy agent                  Open Policy Agent
Service mesh                  Istio, Gloo Mesh, or Linkerd
Cost management               KubeCost
Etcd backup and restore       Velero
Storage                       Default storage class, any compatible CSI

4.2.7 - Reboot nodes

How to properly reboot a node in an EKS Anywhere cluster

If you need to reboot a node in your cluster for maintenance or any other reason, performing the following steps will help prevent possible disruption of services on those nodes:

  1. Cordon the node so no further workloads are scheduled to run on it:

    kubectl cordon <node-name>
    
  2. Drain the node of all current workloads:

    kubectl drain <node-name>
    
  3. Shut down. Using the appropriate method for your provider, shut down the node.

  4. Perform system maintenance or any other tasks you need to do on the node, then boot the node back up.

  5. Uncordon the node so that it can begin receiving workloads again.

    kubectl uncordon <node-name>
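
The cordon, drain, and uncordon steps above can be grouped into two small helpers that bracket the reboot. This is a sketch under stated assumptions: the function names and the KUBECTL override variable are hypothetical, and any extra arguments are passed through to kubectl drain unchanged, so choose drain flags that suit your workloads.

```shell
# KUBECTL can be overridden (for example with "echo") to do a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# prepare_node_for_reboot NODE [DRAIN_ARGS...]
# Hypothetical helper: cordons the node, then drains it; stops if cordon fails.
prepare_node_for_reboot() {
  node=$1
  shift
  $KUBECTL cordon "$node" && $KUBECTL drain "$node" "$@"
}

# finish_node_reboot NODE
# Hypothetical helper: lets the node receive workloads again after it boots.
finish_node_reboot() {
  $KUBECTL uncordon "$1"
}

# Example usage (requires access to the cluster):
#   prepare_node_for_reboot <node-name>
#   ... shut down, perform maintenance, boot the node ...
#   finish_node_reboot <node-name>
```

Drain often refuses to proceed when DaemonSet-managed pods are present on the node; kubectl drain provides the --ignore-daemonsets flag for that case.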
    

4.2.8 - Connect cluster to console

Connect a cluster to the EKS console

The AWS EKS Connector lets you connect your EKS Anywhere cluster to the AWS EKS console, where you can see your EKS Anywhere cluster, its configuration, workloads, and their status. EKS Connector is a software agent that can be deployed on your EKS Anywhere cluster, enabling the cluster to register with the EKS console.

Visit AWS EKS Connector for details.

4.2.9 - License cluster

How to license your cluster.

If you are licensing an existing cluster, apply the following secret to your cluster (replacing my-license-here with your license):

kubectl apply -f - <<EOF 
apiVersion: v1
kind: Secret
metadata:
  name: eksa-license
  namespace: eksa-system
stringData:
  license: "my-license-here"
type: Opaque
EOF

4.2.10 - Multus CNI plugin configuration

EKS Anywhere configuration for Multus CNI plugin

NOTE: Currently, Multus support is only available with the EKS Anywhere Bare Metal provider. The vSphere and CloudStack providers do not have multi-network support for cluster machines. Once multiple network support is added to those clusters, Multus CNI can be supported.

Multus CNI is a container network interface plugin for Kubernetes that enables attaching multiple network interfaces to pods. In Kubernetes, each pod has only one network interface by default, other than local loopback. With Multus, you can create multi-homed pods that have multiple interfaces. Multus acts as a ‘meta’ plugin that can call other CNI plugins to configure additional interfaces.

Pre-Requisites

Given that Multus CNI is used to create pods with multiple network interfaces, the cluster machines that these pods run on need to have multiple network interfaces attached and configured. The interfaces on multi-homed pods need to map to these interfaces on the machines.

For Bare Metal clusters using the Tinkerbell provider, the cluster machines need to have multiple network interfaces cabled in and appropriate network configuration put in place during machine provisioning.

Overview of Multus setup

The following diagrams show the result of two applications (app1 and app2) running in pods that use the Multus plugin to communicate over two network interfaces (eth0 and net1) from within the pods. The Multus plugin uses two network interfaces on the worker node (eth0 and eth1) to provide communications outside of the node.

Multus allows pods to have multiple network interfaces

Follow the procedure below to set up Multus as illustrated in the previous diagrams.

Install and configure Multus

Deploying Multus using a DaemonSet spins up pods that install a Multus binary and configure Multus for use on every node in the cluster. Here are the steps for doing that.

  1. Clone the Multus CNI repo:

    git clone https://github.com/k8snetworkplumbingwg/multus-cni.git && cd multus-cni
    
  2. Apply Multus daemonset to your EKS Anywhere cluster:

    kubectl apply -f ./deployments/multus-daemonset-thick-plugin.yml
    
  3. Verify that you have Multus pods running:

    kubectl get pods --all-namespaces | grep -i multus
    
  4. Check that Multus is running:

    kubectl get pods -A | grep multus
    

    Output:

    kube-system kube-multus-ds-bmfjs     1/1      Running      0      3d1h
    kube-system kube-multus-ds-fk2sk     1/1      Running      0      3d1h
    

Create Network Attachment Definition

You need to create a Network Attachment Definition for the CNI you wish to use as the plugin for the additional interface. You can verify that your intended CNI plugin is supported by ensuring that the binary corresponding to that CNI plugin is present in the node’s /opt/cni/bin directory.

Below is an example of a Network Attachment Definition yaml:

cat <<EOF | kubectl create -f -
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
   name: ipvlan-conf
spec:
   config: '{
      "cniVersion": "0.3.0",
      "type": "ipvlan",
      "master": "eth1",
      "mode": "l3",
      "ipam": {
         "type": "host-local",
         "subnet": "198.17.0.0/24",
         "rangeStart": "198.17.0.200",
         "rangeEnd": "198.17.0.216",
         "routes": [
             { "dst": "0.0.0.0/0" }
         ],
         "gateway": "198.17.0.1"
      }
 }'
EOF

Note that eth1 is used as the master parameter. This master parameter should match the interface name on the hosts in your cluster.
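One way to confirm the interface name before applying the definition is to list the interfaces on a cluster node. The following is a minimal sketch, assuming a Linux node with the iproute2 ip command and shell access to that node. The function names list_host_interfaces and has_interface are illustrative and are not part of EKS Anywhere or Multus.

```shell
# Print the names of the network interfaces on this host, one per line.
# `ip -o link show` prints one line per interface, for example:
#   "3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ..."
list_host_interfaces() {
  # Split on ": ", take the name field, and drop any "@ifN" peer suffix.
  ip -o link show | awk -F': ' '{ sub(/@.*/, "", $2); print $2 }'
}

# Return success if the interface named in $1 exists on this host.
has_interface() {
  list_host_interfaces | grep -qx "$1"
}
```

For example, running `has_interface eth1 && echo "eth1 is present"` on a worker node indicates whether eth1 is a suitable value for the master parameter on that node.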

Verify the configuration

Type the following to verify the configuration you created:

kubectl get network-attachment-definitions
kubectl describe network-attachment-definitions ipvlan-conf

Deploy sample applications with network attachment

  1. Create a sample application 1 (app1) with network annotation created in the previous steps:

    cat <<EOF | kubectl apply -f - 
    apiVersion: v1
    kind: Pod
    metadata:
      name: app1
      annotations:
        k8s.v1.cni.cncf.io/networks: ipvlan-conf
    spec:
      containers:
      - name: app1
        command: ["/bin/sh", "-c", "trap : TERM INT; sleep infinity & wait"]
        image: alpine
    EOF
    
  2. Create a sample application 2 (app2) with the network annotation created in the previous step:

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: app2
      annotations:
        k8s.v1.cni.cncf.io/networks: ipvlan-conf
    spec:
      containers:
      - name: app2
        command: ["/bin/sh", "-c", "trap : TERM INT; sleep infinity & wait"]
        image: alpine
    EOF
    
  3. Verify that the additional interfaces were created on these application pods using the defined network attachment:

    kubectl exec -it app1 -- ip a                            
    

    Output:

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host 
           valid_lft forever preferred_lft forever
    2: net1@if3: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UNKNOWN 
        link/ether 00:50:56:9a:84:3b brd ff:ff:ff:ff:ff:ff
        inet 198.17.0.200/24 brd 198.17.0.255 scope global net1
           valid_lft forever preferred_lft forever
        inet6 fe80::50:5600:19a:843b/64 scope link 
           valid_lft forever preferred_lft forever
    31: eth0@if32: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
        link/ether 0a:9e:a0:b4:21:05 brd ff:ff:ff:ff:ff:ff
        inet 192.168.1.218/32 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 fe80::89e:a0ff:feb4:2105/64 scope link 
           valid_lft forever preferred_lft forever
    
    kubectl exec -it app2 -- ip a
    

    Output:

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host 
           valid_lft forever preferred_lft forever
    2: net1@if3: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UNKNOWN 
        link/ether 00:50:56:9a:84:3b brd ff:ff:ff:ff:ff:ff
        inet 198.17.0.201/24 brd 198.17.0.255 scope global net1
           valid_lft forever preferred_lft forever
        inet6 fe80::50:5600:29a:843b/64 scope link 
           valid_lft forever preferred_lft forever
    33: eth0@if34: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
        link/ether b2:42:0a:67:c0:48 brd ff:ff:ff:ff:ff:ff
        inet 192.168.1.210/32 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 fe80::b042:aff:fe67:c048/64 scope link 
           valid_lft forever preferred_lft forever
    

    Note that both pods got the new interface net1. Also, the additional network interface on each pod got assigned an IP address out of the range specified by the Network Attachment Definition.

  4. Test the network connectivity across these pods for Multus interfaces:

    kubectl exec -it app1 -- ping -I net1 198.17.0.201 
    

    Output:

    PING 198.17.0.201 (198.17.0.201): 56 data bytes
    64 bytes from 198.17.0.201: seq=0 ttl=64 time=0.074 ms
    64 bytes from 198.17.0.201: seq=1 ttl=64 time=0.077 ms
    64 bytes from 198.17.0.201: seq=2 ttl=64 time=0.078 ms
    64 bytes from 198.17.0.201: seq=3 ttl=64 time=0.077 ms
    
    kubectl exec -it app2 -- ping -I net1 198.17.0.200
    

    Output:

    PING 198.17.0.200 (198.17.0.200): 56 data bytes
    64 bytes from 198.17.0.200: seq=0 ttl=64 time=0.074 ms
    64 bytes from 198.17.0.200: seq=1 ttl=64 time=0.077 ms
    64 bytes from 198.17.0.200: seq=2 ttl=64 time=0.078 ms
    64 bytes from 198.17.0.200: seq=3 ttl=64 time=0.077 ms
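
The address check described in step 3 can also be scripted. The following is a minimal sketch in plain shell arithmetic that tests whether an IPv4 address falls between the rangeStart and rangeEnd values of the Network Attachment Definition. The function names ip_to_int and ip_in_range are illustrative helpers, not part of Multus or EKS Anywhere.

```shell
# Convert a dotted-quad IPv4 address to its 32-bit integer value.
ip_to_int() {
  local a b c d
  # Split the address on "." into its four octets.
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

# Return success if address $1 lies within [$2, $3], inclusive.
ip_in_range() {
  local ip lo hi
  ip=$(ip_to_int "$1")
  lo=$(ip_to_int "$2")
  hi=$(ip_to_int "$3")
  [ "$ip" -ge "$lo" ] && [ "$ip" -le "$hi" ]
}
```

With the example definition above, `ip_in_range 198.17.0.201 198.17.0.200 198.17.0.216` succeeds, which matches the address assigned to net1 on app2.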
    

4.2.11 - Authenticate cluster with AWS IAM Authenticator

Configure AWS IAM Authenticator to authenticate user access to the cluster

AWS IAM Authenticator Support (optional)

EKS Anywhere supports configuring AWS IAM Authenticator as an authentication provider for clusters.

When you create a cluster with IAM Authenticator enabled, EKS Anywhere

  • Installs aws-iam-authenticator server as a DaemonSet on the workload cluster.
  • Configures the Kubernetes API Server to communicate with IAM Authenticator using a token authentication webhook.
  • Creates the necessary ConfigMaps based on user options.

Create IAM Authenticator enabled cluster

Generate your cluster configuration and add the necessary IAM Authenticator configuration. For a full spec reference, check AWSIamConfig.

Create an EKS Anywhere cluster as follows:

CLUSTER_NAME=my-cluster-name
eksctl anywhere create cluster -f ${CLUSTER_NAME}.yaml

Example AWSIamConfig configuration

This example uses a region in the default aws partition and EKSConfigMap as backendMode. Also, the IAM ARNs are mapped to the Kubernetes system:masters group.

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
   name: my-cluster-name
spec:
   ...
   # IAM Authenticator
   identityProviderRefs:
      - kind: AWSIamConfig
        name: aws-iam-auth-config
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: AWSIamConfig
metadata:
   name: aws-iam-auth-config
spec:
    awsRegion: us-west-1
    backendMode:
        - EKSConfigMap
    mapRoles:
        - roleARN: arn:aws:iam::XXXXXXXXXXXX:role/myRole
          username: myKubernetesUsername
          groups:
          - system:masters
    mapUsers:
        - userARN: arn:aws:iam::XXXXXXXXXXXX:user/myUser
          username: myKubernetesUsername
          groups:
          - system:masters
    partition: aws
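
Before creating the cluster, it can help to check that the ARNs you put in mapRoles and mapUsers are well formed. The following is a minimal sketch: the helper name is_iam_arn is hypothetical, and the pattern only checks the general shape of an IAM role or user ARN (partition, 12-digit account ID, role/ or user/ prefix). It does not confirm that the role or user exists.

```shell
# Return success if $1 looks like an IAM role or user ARN.
# Accepts the aws, aws-cn, and aws-us-gov partitions.
is_iam_arn() {
  printf '%s\n' "$1" |
    grep -Eq '^arn:(aws|aws-cn|aws-us-gov):iam::[0-9]{12}:(role|user)/.+$'
}
```

The placeholder arn:aws:iam::XXXXXXXXXXXX:role/myRole from the example fails this check until XXXXXXXXXXXX is replaced with a real 12-digit account ID.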

Authenticating with IAM Authenticator

After your cluster is created, you can use the mapped IAM ARNs to authenticate to the cluster.

EKS Anywhere generates a KUBECONFIG file in your local directory that uses aws-iam-authenticator client to authenticate with the cluster. The file can be found at

${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-aws.kubeconfig

Steps

  1. Ensure the IAM role/user ARN mapped in the cluster is configured on the local machine from which you are trying to access the cluster.

  2. Install the aws-iam-authenticator client binary on the local machine.

    • We recommend installing the binary referenced in the latest release manifest of the kubernetes version used when creating the cluster.
    • The commands below fetch the installation URI for clusters created with Kubernetes version 1.21 on a Linux OS.
    CLUSTER_NAME=my-cluster-name
    KUBERNETES_VERSION=1.21
    
    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    
    EKS_D_MANIFEST_URL=$(kubectl get bundles $CLUSTER_NAME -o jsonpath="{.spec.versionsBundles[?(@.kubeVersion==\"$KUBERNETES_VERSION\")].eksD.manifestUrl}")
    
    OS=linux
    curl -fsSL $EKS_D_MANIFEST_URL | yq e '.status.components[] | select(.name=="aws-iam-authenticator") | .assets[] | select(.os == '"\"$OS\""' and .type == "Archive") | .archive.uri' -
    
  3. Export the generated IAM Authenticator based KUBECONFIG file.

    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-aws.kubeconfig
    
  4. Run kubectl commands to check cluster access. For example:

    kubectl get pods -A
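
For step 1, one way to see which IAM identity the local machine is currently using is to ask AWS STS. The following is a minimal sketch that assumes the AWS CLI is installed and configured, which the steps above do not require explicitly. The function name current_iam_arn is illustrative.

```shell
# Print the ARN of the IAM identity that the local AWS credentials resolve to.
current_iam_arn() {
  aws sts get-caller-identity --query Arn --output text
}
```

Compare the printed ARN with the roleARN or userARN entries in your AWSIamConfig. Note that when a role is assumed, STS reports an assumed-role ARN (arn:aws:sts::...:assumed-role/...) rather than the arn:aws:iam::...:role/... form used in mapRoles, so compare the role name rather than the full string in that case.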
    

Modify IAM Authenticator mappings

EKS Anywhere supports modifying IAM ARNs that are mapped on the cluster. The mappings can be modified by either running the upgrade cluster command or using GitOps.

upgrade command

The mapRoles and mapUsers lists in AWSIamConfig can be modified when running the upgrade cluster command from EKS Anywhere.

As an example, let’s add another IAM user to the above example configuration.

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: AWSIamConfig
metadata:
   name: aws-iam-auth-config
spec:
    ...
    mapUsers:
        - userARN: arn:aws:iam::XXXXXXXXXXXX:user/myUser
          username: myKubernetesUsername
          groups:
          - system:masters
        - userARN: arn:aws:iam::XXXXXXXXXXXX:user/anotherUser
          username: anotherKubernetesUsername
    partition: aws

and then run the upgrade command

CLUSTER_NAME=my-cluster-name
eksctl anywhere upgrade cluster -f ${CLUSTER_NAME}.yaml

EKS Anywhere then updates the IAM Authenticator mappings in the cluster, and the new user gains access to the cluster.

GitOps

If the cluster was created with GitOps configured, then the mapRoles and mapUsers lists in AWSIamConfig can be modified by the GitOps controller. For GitOps configuration details, refer to Manage Cluster with GitOps.

  1. Clone your git repo and modify the cluster specification. The default path for the cluster file is:
    clusters/$CLUSTER_NAME/eksa-system/eksa-cluster.yaml
    
  2. Modify the AWSIamConfig object and add to the mapRoles and mapUsers object lists.
  3. Commit the file to your git repository
    git add eksa-cluster.yaml
    git commit -m 'Adding IAM Authenticator access ARNs'
    git push origin main
    

The EKS Anywhere GitOps controller then updates the IAM Authenticator mappings in the cluster, and the added users gain access to the cluster.

4.2.12 - Manage cluster with GitOps

Use Flux to manage clusters with GitOps

NOTE: GitOps support is available for vSphere clusters, but is not yet available for Bare Metal clusters.

GitOps Support (optional)

EKS Anywhere supports a GitOps workflow for the management of your cluster.

When you create a cluster with GitOps enabled, EKS Anywhere will automatically commit your cluster configuration to the provided GitHub repository and install a GitOps toolkit on your cluster which watches that committed configuration file. You can then manage the scale of the cluster by making changes to the version controlled cluster configuration file and committing the changes. Once a change has been detected by the GitOps controller running in your cluster, the scale of the cluster will be adjusted to match the committed configuration file.

If you’d like to learn more about GitOps and the associated best practices, check out this introduction from Weaveworks.

NOTE: Installing a GitOps controller can be done during cluster creation or through upgrade. In the event that GitOps installation fails, EKS Anywhere cluster creation will continue.

Supported Cluster Properties

Currently, you can manage a subset of cluster properties with GitOps:

Management Cluster

Cluster:

  • workerNodeGroupConfigurations.count
  • workerNodeGroupConfigurations.machineGroupRef.name

WorkerNodes VSphereMachineConfig:

  • datastore
  • diskGiB
  • folder
  • memoryMiB
  • numCPUs
  • resourcePool
  • template
  • users

Workload Cluster

Cluster:

  • kubernetesVersion
  • controlPlaneConfiguration.count
  • controlPlaneConfiguration.machineGroupRef.name
  • workerNodeGroupConfigurations.count
  • workerNodeGroupConfigurations.machineGroupRef.name
  • identityProviderRefs (Only for kind:OIDCConfig, kind:AWSIamConfig is immutable)

ControlPlane / Etcd / WorkerNodes VSphereMachineConfig:

  • datastore
  • diskGiB
  • folder
  • memoryMiB
  • numCPUs
  • resourcePool
  • template
  • users

OIDCConfig:

  • clientID
  • groupsClaim
  • groupsPrefix
  • issuerUrl
  • requiredClaims.claim
  • requiredClaims.value
  • usernameClaim
  • usernamePrefix

Any other changes to the cluster configuration in the git repository will be ignored. If an immutable field has been changed in a Git repository, there are two ways to find the error message:

  1. If a notification webhook is set up, check the error message in the notification channel.
  2. Check the Flux Kustomization Controller log: kubectl logs -f -n flux-system kustomize-controller-****** for error message containing text similar to Invalid value: 1: field is immutable
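
The log check in item 2 can be narrowed to the relevant lines. The following is a minimal sketch, assuming the controller runs as a Deployment named kustomize-controller in the flux-system namespace (inferred from the pod name prefix shown above). The function name find_immutable_errors and the 500-line tail are illustrative choices.

```shell
# Print recent kustomize-controller log lines that report an immutable-field error.
# grep exits non-zero when no such line is found.
find_immutable_errors() {
  kubectl logs -n flux-system deploy/kustomize-controller --tail=500 |
    grep -i 'field is immutable'
}
```

If the function prints nothing, no immutable-field rejection appears in the most recent 500 log lines.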

Getting Started with EKS Anywhere GitOps with GitHub

In order to use GitOps to manage cluster scaling, you need a couple of things:

Create a GitHub Personal Access Token

Create a Personal Access Token (PAT) to access your provided GitHub repository. It must be scoped for all repo permissions.

NOTE: GitOps configuration only works with hosted github.com and will not work on self-hosted GitHub Enterprise instances.

This PAT should have at least the following permissions:

GitHub PAT permissions

NOTE: The PAT must belong to the owner of the repository or, if using an organization as the owner, the creator of the PAT must have repo permission in that organization.

You need to set your PAT as the environment variable $EKSA_GITHUB_TOKEN to use it during cluster creation:

export EKSA_GITHUB_TOKEN=ghp_MyValidPersonalAccessTokenWithRepoPermissions
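
A small guard in a wrapper script can catch a missing token before cluster creation starts. This is a minimal sketch: the function name require_github_token is hypothetical, and it only checks that the variable is non-empty, not that the token is valid or has the right permissions.

```shell
# Fail early if the GitHub token expected by EKS Anywhere is not set.
require_github_token() {
  if [ -z "${EKSA_GITHUB_TOKEN:-}" ]; then
    echo "EKSA_GITHUB_TOKEN is not set" >&2
    return 1
  fi
}
```

For example, `require_github_token && eksctl anywhere create cluster -f ${CLUSTER_NAME}.yaml` runs the create command only when the variable is present.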

Create GitOps configuration repo

If you have an existing repo, you can set that as your repository name in the configuration. If you specify a repo in your FluxConfig that does not exist, EKS Anywhere will create it for you. If you would rather create the repo yourself, you can click here to create a new repo.

If your repository contains multiple cluster specification files, store them in sub-folders and specify the configuration path in your cluster specification.

In order to accommodate the management cluster feature, the CLI will now structure the repo directory following a new convention:

clusters
└── management-cluster
    ├── flux-system
    │   └── ...
    ├── management-cluster
    │   └── eksa-system
    │       ├── eksa-cluster.yaml
    │       └── kustomization.yaml
    ├── workload-cluster-1
    │   └── eksa-system
    │       └── eksa-cluster.yaml
    └── workload-cluster-2
        └── eksa-system
            └── eksa-cluster.yaml

By default, Flux kustomization reconciles at the management cluster’s root level (./clusters/management-cluster), so both the management cluster and all the workload clusters it manages are synced.
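
The layout above can be expressed as a small path helper, which is handy in scripts that add or edit cluster files in the repository. This is a sketch derived only from the directory tree shown; the function name cluster_spec_path is illustrative.

```shell
# Print the repository path of a cluster specification file.
#   $1: management cluster name
#   $2: workload cluster name (defaults to the management cluster itself)
cluster_spec_path() {
  printf 'clusters/%s/%s/eksa-system/eksa-cluster.yaml\n' "$1" "${2:-$1}"
}
```

For example, `cluster_spec_path management-cluster workload-cluster-1` prints clusters/management-cluster/workload-cluster-1/eksa-system/eksa-cluster.yaml.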

Example GitOps cluster configuration for GitHub

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: mynewgitopscluster
spec:
... # collapsed cluster spec fields
# Below added for gitops support
  gitOpsRef:
    kind: FluxConfig
    name: my-cluster-name
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: FluxConfig
metadata:
  name: my-cluster-name
spec:
    github:
      personal: true
      repository: mygithubrepository
      owner: mygithubusername

Create a GitOps enabled cluster

Generate your cluster configuration and add the GitOps configuration. For a full spec reference, see the Cluster Spec reference.

NOTE: After your cluster has been created the cluster configuration will automatically be committed to your git repo.

  1. Create an EKS Anywhere cluster with GitOps enabled.

    CLUSTER_NAME=gitops
    eksctl anywhere create cluster -f ${CLUSTER_NAME}.yaml
    

Enable GitOps in an existing cluster

You can also install Flux and enable GitOps in an existing cluster by running the upgrade command with an updated cluster configuration. For a full spec reference, see the Cluster Spec reference.

  1. Upgrade an EKS Anywhere cluster with GitOps enabled.

    CLUSTER_NAME=gitops
    eksctl anywhere upgrade cluster -f ${CLUSTER_NAME}.yaml
    

Test GitOps controller

After your cluster has been created, you can test the GitOps controller by modifying the cluster specification.

  1. Clone your git repo and modify the cluster specification. The default path for the cluster file is:

    clusters/$CLUSTER_NAME/eksa-system/eksa-cluster.yaml
    
  2. Modify the workerNodeGroupConfigurations[0].count field with your desired changes.

  3. Commit the file to your git repository

    git add eksa-cluster.yaml
    git commit -m 'Scaling nodes for test'
    git push origin main
    
  4. The Flux controller will automatically make the required changes.

    If you updated your node count, you can use this command to see the current node state.

    kubectl get nodes 
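
To compare the result against the count you committed, you can count the nodes that report Ready. The following is a minimal sketch; the function name count_ready_nodes is illustrative. It counts only nodes whose STATUS column is exactly Ready, so cordoned nodes (Ready,SchedulingDisabled) are not included, and the total covers control plane as well as worker nodes.

```shell
# Print the number of nodes whose STATUS column is exactly "Ready".
count_ready_nodes() {
  kubectl get nodes --no-headers |
    awk '$2 == "Ready" { n++ } END { print n + 0 }'
}
```

Scaling takes time, so the number may lag behind the committed value until the new machines have joined the cluster.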
    

Getting Started with EKS Anywhere GitOps with any Git source

You can configure EKS Anywhere to use a generic git repository as the source of truth for GitOps by providing a FluxConfig with a git configuration.

EKS Anywhere requires a valid SSH Known Hosts file and SSH Private key in order to connect to your repository and bootstrap Flux.

Create a Git repository for use by EKS Anywhere and Flux

When using the git provider, EKS Anywhere requires that the configuration repository be pre-initialized. You may re-use an existing repo or use the same repo for multiple management clusters.

Create the repository through your git provider and initialize it with a README.md documenting the purpose of the repository.

Create a Private Key for use by EKS Anywhere and Flux

EKS Anywhere requires a private key to authenticate to your git repository, push the cluster configuration, and configure Flux for ongoing management and monitoring of that configuration. The private key should have permissions to read and write from the repository in question.

It is recommended that you create a new private key for use exclusively by EKS Anywhere. You can use ssh-keygen to generate a new key.

ssh-keygen -t ecdsa -C "my_email@example.com"

Please consult the documentation for your git provider to determine how to add your corresponding public key; for example, if using GitHub Enterprise, you can find the documentation for adding a public key to your GitHub account here.

Add your private key to your SSH agent on your management machine

When using a generic git provider, EKS Anywhere requires that your management machine has a running SSH agent and the private key be added to that SSH agent.

You can start an SSH agent and add your private key by executing the following in your current session:

eval "$(ssh-agent -s)" && ssh-add $EKSA_GIT_PRIVATE_KEY

Create an SSH Known Hosts file for use by EKS Anywhere and Flux

EKS Anywhere needs an SSH known hosts file to verify the identity of the remote git host. A path to a valid known hosts file must be provided to the EKS Anywhere command line via the environment variable EKSA_GIT_KNOWN_HOSTS.

For example, if you have a known hosts file at /home/myUser/.ssh/known_hosts that you want EKS Anywhere to use, set the environment variable EKSA_GIT_KNOWN_HOSTS to the path to that file, /home/myUser/.ssh/known_hosts.

export EKSA_GIT_KNOWN_HOSTS=/home/myUser/.ssh/known_hosts

While you can use your pre-existing SSH known hosts file, it is recommended that you generate a new known hosts file for use by EKS Anywhere that contains only the known-hosts entries required for your git host and key type. For example, if you wanted to generate a known hosts file for a git server located at example.com with key type ecdsa, you can use the OpenSSH utility ssh-keyscan:

ssh-keyscan -t ecdsa example.com >> my_eksa_known_hosts

This will generate a known hosts file which contains only the entry necessary to verify the identity of example.com when using an ecdsa based private key file.

Example FluxConfig cluster configuration for a generic git provider

For a full spec reference, see the Cluster Spec reference.

NOTE: The repositoryUrl value is of the format ssh://git@provider.com/$REPO_OWNER/$REPO_NAME.git. This may differ from the default SSH URL given by your provider. For example, the github.com user interface provides an SSH URL containing a : before the repository owner, rather than a /. Make sure to replace this : with a /, if present.

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: mynewgitopscluster
spec:
... # collapsed cluster spec fields
# Below added for gitops support
  gitOpsRef:
    kind: FluxConfig
    name: my-cluster-name
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: FluxConfig
metadata:
  name: my-cluster-name
spec:
    git:
      repositoryUrl: ssh://git@provider.com/myAccount/myClusterGitopsRepo.git
      sshKeyAlgorithm: ecdsa
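
The URL adjustment described in the note (replacing the : before the repository owner with a /) can be scripted. The following is a minimal sketch; the function name to_ssh_url is illustrative, and it handles only the scp-style form user@host:owner/repo.git, leaving values that already start with ssh:// unchanged.

```shell
# Convert an scp-style git address (git@host:owner/repo.git) into the
# ssh://git@host/owner/repo.git form expected by repositoryUrl.
to_ssh_url() {
  case "$1" in
    ssh://*) printf '%s\n' "$1" ;;   # already in the expected form
    *@*:*)   printf 'ssh://%s\n' "$(printf '%s' "$1" | sed 's|:|/|')" ;;
    *)       printf '%s\n' "$1" ;;   # unrecognized form, leave as-is
  esac
}
```

For example, `to_ssh_url git@github.com:myAccount/myClusterGitopsRepo.git` prints ssh://git@github.com/myAccount/myClusterGitopsRepo.git.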

Manage separate workload clusters using GitOps

Follow these steps if you want to use your initial cluster to create and manage separate workload clusters via GitOps.

Prerequisites

  • An existing EKS Anywhere cluster with GitOps enabled. If your existing cluster does not have GitOps installed, see Enable GitOps in an existing cluster.

  • A cluster configuration file for your new workload cluster.

Create cluster using GitOps

  1. Clone your git repo and add the new cluster specification. Be sure to follow the directory structure defined here:

    clusters/<management-cluster-name>/$CLUSTER_NAME/eksa-system/eksa-cluster.yaml
    

    NOTE: Specify the namespace for all EKS Anywhere objects when you are using GitOps to create new workload clusters (even for the default namespace, use namespace: default on those objects).

    Ensure workload cluster object names are distinct from management cluster object names. Be sure to set the managementCluster field to identify the name of the management cluster.

    Make sure there is a kustomization.yaml file under the namespace directory for the management cluster. Creating a GitOps-enabled management cluster with eksctl should create the kustomization.yaml file automatically.

  2. Commit the file to your git repository.

    git add clusters/<management-cluster-name>/$CLUSTER_NAME/eksa-system/eksa-cluster.yaml
    git commit -m 'Creating new workload cluster'
    git push origin main
    
  3. The Flux controller will automatically make the required changes. You can list the workload clusters managed by the management cluster.

    export KUBECONFIG=${PWD}/${MGMT_CLUSTER_NAME}/${MGMT_CLUSTER_NAME}-eks-a-cluster.kubeconfig
    kubectl get clusters
    
  4. The kubeconfig for your new cluster is stored as a secret on the management cluster. You can get credentials and run the test application on your new workload cluster as follows:

    kubectl get secret -n eksa-system w01-kubeconfig -o jsonpath='{.data.value}' | base64 --decode > w01.kubeconfig
    export KUBECONFIG=w01.kubeconfig
    kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
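
If you manage several workload clusters, the credential retrieval in step 4 can be wrapped in a function. This is a minimal sketch: the function name fetch_workload_kubeconfig is illustrative, and the secret name pattern <cluster-name>-kubeconfig is an assumption generalized from the w01-kubeconfig example above.

```shell
# Write the kubeconfig for workload cluster $1 to ./$1.kubeconfig.
# Must be run with KUBECONFIG pointing at the management cluster.
fetch_workload_kubeconfig() {
  kubectl get secret -n eksa-system "$1-kubeconfig" \
    -o jsonpath='{.data.value}' | base64 --decode > "$1.kubeconfig"
}
```

For example, `fetch_workload_kubeconfig w01` produces the same w01.kubeconfig file as the command in step 4.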
    

Upgrade cluster using GitOps

  1. To upgrade the cluster using GitOps, modify the workload cluster yaml file with the desired changes.

  2. Commit the file to your git repository.

    git add eksa-cluster.yaml
    git commit -m 'Scaling nodes on new workload cluster'
    git push origin main
    

Delete cluster using GitOps

  1. To delete the cluster using GitOps, delete the workload cluster yaml file from your repository and commit those changes.
    git rm eksa-cluster.yaml
    git commit -m 'Deleting workload cluster'
    git push origin main
    

4.2.13 - Manage cluster with Terraform

Use Terraform to manage EKS Anywhere Clusters

NOTE: Support for using Terraform to manage and modify an EKS Anywhere cluster is available for vSphere, Snow and Nutanix clusters, but not yet for Bare Metal or CloudStack clusters.

Using Terraform to manage an EKS Anywhere Cluster (Optional)

This guide explains how you can use Terraform to manage and modify an EKS Anywhere cluster. The guide is meant for illustrative purposes and is not a definitive approach to building production systems with Terraform and EKS Anywhere.

At its heart, EKS Anywhere is a set of Kubernetes CRDs, which define an EKS Anywhere cluster, and a controller, which moves the cluster state to match these definitions. These CRDs, and the EKS-A controller, live on the management cluster or on a self-managed cluster. We can manage a subset of the fields in the EKS Anywhere CRDs with any tool that can interact with the Kubernetes API, like kubectl or, in this case, the Terraform Kubernetes provider.

In this guide, we’ll show you how to import your EKS Anywhere cluster into Terraform state and how to scale your EKS Anywhere worker nodes using the Terraform Kubernetes provider.

Prerequisites

  • An existing EKS Anywhere cluster

  • the latest version of Terraform

  • the latest version of tfk8s, a tool for converting Kubernetes manifest files to Terraform HCL

Guide

  0. Create an EKS-A management cluster, or a self-managed stand-alone cluster.
  1. Set up the Terraform Kubernetes provider. Make sure your KUBECONFIG environment variable is set:

    export KUBECONFIG=/path/to/my/kubeconfig.kubeconfig
    

    Set an environment variable with your cluster name:

    export MY_EKSA_CLUSTER="myClusterName"
    
    cat << EOF > ./provider.tf
    provider "kubernetes" {
      config_path    = "${KUBECONFIG}"
    }
    EOF
    
  2. Get tfk8s and use it to convert your EKS Anywhere cluster Kubernetes manifest into Terraform HCL:

    • Install tfk8s
    • Convert the manifest into Terraform HCL:
    kubectl get cluster ${MY_EKSA_CLUSTER} -o yaml | tfk8s --strip -o ${MY_EKSA_CLUSTER}.tf
    
  3. Configure the Terraform cluster resource definition generated in step 2

    • Set metadata.generation as a computed field. Add the following to your cluster resource configuration:
    computed_fields = ["metadata.generation"]
    
    field_manager {
      force_conflicts = true
    }
    
    • Add the namespace to the metadata of the cluster
    • Remove the generation field from the metadata of the cluster
    • Your Terraform cluster resource should look similar to this:
    computed_fields = ["metadata.generation"]
    field_manager {
      force_conflicts = true
    }
    manifest = {
      "apiVersion" = "anywhere.eks.amazonaws.com/v1alpha1"
      "kind" = "Cluster"
      "metadata" = {
        "name" = "MyClusterName"
        "namespace" = "default"
      }
      # ... remaining fields generated by tfk8s
    }
    
  4. Import your EKS Anywhere cluster into terraform state:

    terraform init
    terraform import kubernetes_manifest.cluster_${MY_EKSA_CLUSTER} "apiVersion=anywhere.eks.amazonaws.com/v1alpha1,kind=Cluster,namespace=default,name=${MY_EKSA_CLUSTER}"
    

    After you import your cluster, you will need to run terraform apply one time to ensure that the manifest field of your cluster resource is in-sync. This will not change the state of your cluster, but is a required step after the initial import. The manifest field stores the contents of the associated kubernetes manifest, while the object field stores the actual state of the resource.

  5. Modify Your Cluster using Terraform

    • Modify the count value of one of your workerNodeGroupConfigurations, or another mutable field, in the configuration stored in ${MY_EKSA_CLUSTER}.tf file.
    • Check the expected diff between your cluster state and the modified local state via terraform plan

    You should see in the output that the worker node group configuration count field (or whichever field you chose to modify) will be modified by Terraform.

  6. Now, actually change your cluster to match the local configuration:

    terraform apply
    
  7. Observe the change to your cluster. For example:

    kubectl get nodes
    

Manage separate workload clusters using Terraform

Follow these steps if you want to use your initial cluster to create and manage separate workload clusters via Terraform.

NOTE: If you choose to manage your cluster using Terraform, do not use kubectl to edit your cluster objects as this can lead to field manager conflicts.

Prerequisites

  • An existing EKS Anywhere cluster imported into Terraform state. If your existing cluster is not yet imported, see this guide.
  • A cluster configuration file for your new workload cluster.

Create cluster using Terraform

  1. Create the new cluster configuration Terraform file.

    tfk8s -f new-workload-cluster.yaml -o new-workload-cluster.tf
    

    NOTE: Specify the namespace for all EKS Anywhere objects when you are using Terraform to manage your clusters (even for the default namespace, use "namespace" = "default" on those objects).

    Ensure workload cluster object names are distinct from management cluster object names. Be sure to set the managementCluster field to identify the name of the management cluster.

  2. Ensure that this new Terraform workload cluster configuration exists in the same directory as the management cluster Terraform files.

    my/terraform/config/path
    ├── management-cluster.tf
    ├── new-workload-cluster.tf
    ├── provider.tf
    ├──  ... 
    └──
    
  3. Verify the changes to be applied:

    terraform plan
    
  4. If the plan looks as expected, apply those changes to create the new cluster resources:

    terraform apply
    
  5. You can list the workload clusters managed by the management cluster.

    export KUBECONFIG=${PWD}/${MGMT_CLUSTER_NAME}/${MGMT_CLUSTER_NAME}-eks-a-cluster.kubeconfig
    kubectl get clusters
    
  6. The kubeconfig for your new cluster is stored as a secret on the management cluster. You can get the workload cluster credentials and run the test application on your new workload cluster as follows:

    kubectl get secret -n eksa-system w01-kubeconfig -o jsonpath='{.data.value}' | base64 --decode > w01.kubeconfig
    export KUBECONFIG=w01.kubeconfig
    kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
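
The credential lookup in step 6 can be wrapped in a small helper so the same command works for other workload cluster names. This is a minimal sketch, assuming the secret follows the <cluster-name>-kubeconfig naming shown above for w01; the function name fetch_workload_kubeconfig is hypothetical and not part of EKS Anywhere.

```shell
# Hypothetical helper: write the kubeconfig of a workload cluster to a file.
# Assumes KUBECONFIG points at the management cluster and that the secret is
# named <cluster-name>-kubeconfig in the eksa-system namespace, as for w01 above.
fetch_workload_kubeconfig() {
  cluster="$1"
  out="${2:-${cluster}.kubeconfig}"
  kubectl get secret -n eksa-system "${cluster}-kubeconfig" \
    -o jsonpath='{.data.value}' | base64 --decode > "${out}"
  # An empty file means the secret was missing or held no data.
  [ -s "${out}" ]
}

# Usage:
#   fetch_workload_kubeconfig w01 && export KUBECONFIG=w01.kubeconfig
```

The final check makes the helper fail when the secret cannot be read, rather than leaving an empty kubeconfig behind silently.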
    

Upgrade cluster using Terraform

  1. To upgrade a workload cluster using Terraform, modify the desired fields in the Terraform resource file and apply the changes.
    terraform apply
    

Delete cluster using Terraform

  1. To delete a workload cluster using Terraform, you will need the name of the Terraform cluster resource. This can be found on the first line of your cluster resource definition.
    terraform destroy --target kubernetes_manifest.cluster_w01
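
To find the resource address without opening the file, the first line of each resource definition can be extracted with awk. This is a minimal sketch, assuming the file declares resources in the form resource "kubernetes_manifest" "<name>" {; the function name tf_resource_addresses is hypothetical.

```shell
# Hypothetical helper: print the Terraform address (<type>.<name>) of every
# resource declared in a .tf file.
tf_resource_addresses() {
  awk '$1 == "resource" { gsub(/"/, "", $2); gsub(/"/, "", $3); print $2 "." $3 }' "$1"
}

# Usage:
#   tf_resource_addresses new-workload-cluster.tf
#   terraform destroy --target kubernetes_manifest.cluster_w01
```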
    

Appendix

Terraform K8s Provider https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs

tfk8s https://github.com/jrhouston/tfk8s

4.2.14 - Delete cluster

How to delete an EKS Anywhere cluster

NOTE: EKS Anywhere Bare Metal clusters do not yet support separate workload and management clusters. Use the instructions for Deleting a management cluster to delete a Bare Metal cluster.

Deleting a workload cluster

Follow these steps to delete your EKS Anywhere cluster that is managed by a separate management cluster.

To delete a workload cluster, you will need:

  • name of your workload cluster
  • kubeconfig of your workload cluster
  • kubeconfig of your management cluster

Run the following commands to delete the cluster:

  1. Set up CLUSTER_NAME and KUBECONFIG environment variables:

    export CLUSTER_NAME=eksa-w01-cluster
    export KUBECONFIG=${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    export MANAGEMENT_KUBECONFIG=<path-to-management-cluster-kubeconfig>
    
  2. Run the delete command:

  • If you are running the delete command from the directory which has the cluster folder with ${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.yaml:

    eksctl anywhere delete cluster ${CLUSTER_NAME} --kubeconfig ${MANAGEMENT_KUBECONFIG}
    

Deleting a management cluster

Follow these steps to delete your management cluster.

To delete a cluster you will need:

  • cluster name or cluster configuration
  • kubeconfig of your cluster

Run the following commands to delete the cluster:

  1. Set up CLUSTER_NAME and KUBECONFIG environment variables:

    export CLUSTER_NAME=mgmt
    export KUBECONFIG=${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    
  2. Run the delete command:

  • If you are running the delete command from the directory which has the cluster folder with ${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.yaml:

    eksctl anywhere delete cluster ${CLUSTER_NAME}
    
  • Otherwise, use this command to manually specify the clusterconfig file path:

    export CONFIG_FILE=<path-to-config-file>
    eksctl anywhere delete cluster -f ${CONFIG_FILE}
    

Example output:

Performing provider setup and validations
Creating management cluster
Installing cluster-api providers on management cluster
Moving cluster management from workload cluster
Deleting workload cluster
Clean up Git Repo
GitOps field not specified, clean up git repo skipped
🎉 Cluster deleted!

For vSphere, CloudStack, and Nutanix, this will delete all of the VMs that were created in your provider. For Bare Metal, the servers will be powered off if BMC information has been provided. If your workloads created external resources such as external DNS entries or load balancer endpoints you may need to delete those resources manually.

4.3 - Cluster troubleshooting

Troubleshooting your EKS Anywhere Cluster

4.3.1 - Troubleshooting

Troubleshooting EKS Anywhere clusters

This guide covers EKS Anywhere troubleshooting.

You may want to search this document for a fragment of the error you are seeing.

General troubleshooting

Increase eksctl anywhere output

If you’re having trouble running eksctl anywhere, you may get more verbose output with the -v 6 option. The highest level of verbosity is -v 9, and the default logging level is equivalent to -v 0.

Cannot run docker commands

The EKS Anywhere binary requires access to run docker commands without using sudo. If you’re using a Linux distribution, you need Docker 20.x.x and your user needs to be part of the docker group.

To add your user to the docker group, run:

sudo usermod -a -G docker $USER

Now you need to log out and back in to get the new group permissions.
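
After logging back in, group membership can be confirmed before retrying. This is a minimal sketch; the helper name group_list_has is hypothetical, and it only inspects the group list passed to it.

```shell
# Hypothetical helper: succeed if a space-separated group list contains a group.
group_list_has() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   group_list_has "$(id -nG "$USER")" docker || echo "user is not in the docker group yet"
```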

Minimum requirements for docker version have not been met

Error: failed to validate docker: minimum requirements for docker version have not been met. Install Docker version 20.x.x or above

Ensure you are running Docker 20.x.x or above, for example:

% docker --version
Docker version 20.10.6, build 370c289
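
The version check can also be scripted. This is a minimal sketch, assuming docker --version prints a line in the format shown above; the helper name docker_major_version is hypothetical.

```shell
# Hypothetical helper: extract the major version from `docker --version` output.
# Prints nothing if the input does not look like a Docker version line.
docker_major_version() {
  printf '%s\n' "$1" | sed -n 's/^Docker version \([0-9][0-9]*\)\..*/\1/p'
}

# Usage:
#   major="$(docker_major_version "$(docker --version)")"
#   [ "${major:-0}" -ge 20 ] || echo "Docker 20.x.x or above is required"
```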

Minimum requirements for docker version have not been met on Mac OS

Error: EKS Anywhere does not support Docker desktop versions between 4.3.0 and 4.4.1 on macOS
Error: EKS Anywhere requires Docker desktop to be configured to use CGroups v1. Please  set `deprecatedCgroupv1:true` in your `~/Library/Group\\ Containers/group.com.docker/settings.json` file

Ensure you are running Docker Desktop 4.4.2 or newer and have set "deprecatedCgroupv1": true in your settings.json file:

% defaults read /Applications/Docker.app/Contents/Info.plist CFBundleShortVersionString
4.4.2
% docker info --format '{{json .CgroupVersion}}' 
"1"

cgroups v2 is not supported in Ubuntu 21.10+ and 22.04

ERROR: failed to create cluster: could not find a log line that matches "Reached target .*Multi-User System.*|detected cgroup v1"

It is recommended to use Ubuntu 20.04 for the Administrative machine because the EKS Anywhere bootstrap cluster requires cgroups v1. Since Ubuntu 21.10, cgroups v2 is enabled by default. You can use Ubuntu 21.10 or 22.04 for the Administrative machine if you configure Ubuntu to use cgroups v1 instead.

To verify the cgroups version:

% docker info | grep Cgroup
 Cgroup Driver: cgroupfs
 Cgroup Version: 2

To use cgroups v1, edit /etc/default/grub as root to set GRUB_CMDLINE_LINUX to "systemd.unified_cgroup_hierarchy=0", then update GRUB and reboot.

% sudo <editor> /etc/default/grub
GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=0"

sudo update-grub
sudo reboot now

Then verify you are using cgroups v1.

% docker info | grep Cgroup
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
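
The same check can be scripted, for example as a preflight step before creating a cluster. This is a minimal sketch, assuming docker info prints a Cgroup Version line as shown above; the helper name cgroup_version_from_info is hypothetical.

```shell
# Hypothetical helper: read `docker info` output on stdin and print the
# value of the "Cgroup Version" line.
cgroup_version_from_info() {
  awk -F': *' '/^ *Cgroup Version/ { print $2 }'
}

# Usage:
#   if [ "$(docker info 2>/dev/null | cgroup_version_from_info)" != "1" ]; then
#     echo "the bootstrap cluster requires cgroups v1"
#   fi
```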

ECR access denied

Error: failed to create cluster: unable to initialize executables: failed to setup eks-a dependencies: Error response from daemon: pull access denied for public.ecr.aws/***/cli-tools, repository does not exist or may require 'docker login': denied: Your authorization token has expired. Reauthenticate and try again.

All images needed for EKS Anywhere are public and do not need authentication. Old cached credentials could trigger this error. Remove cached credentials by running:

docker logout public.ecr.aws

error unmarshaling JSON: while decoding JSON: json: unknown field “spec”

Error: loading config file "cluster.yaml": error unmarshaling JSON: while decoding JSON: json: unknown field "spec"

Use eksctl anywhere create cluster -f cluster.yaml instead of eksctl create cluster -f cluster.yaml to create an EKS Anywhere cluster.

Error: old cluster config file exists under my-cluster, please use a different clusterName to proceed

Error: old cluster config file exists under my-cluster, please use a different clusterName to proceed

The my-cluster directory already exists in the current directory. Either use a different cluster name or move the directory.

failed to create cluster: node(s) already exist for a cluster with the name

Performing provider setup and validations
Creating new bootstrap cluster
Error create bootstrapcluster	{"error": "error creating bootstrap cluster: error executing create cluster: ERROR: failed to create cluster: node(s) already exist for a cluster with the name \"cluster-name\"\n, try rerunning with --force-cleanup to force delete previously created bootstrap cluster"}
Failed to create cluster	{"error": "error creating bootstrap cluster: error executing create cluster: ERROR: failed to create cluster: node(s) already exist for a cluster with the name \"cluster-name\"\n, try rerunning with --force-cleanup to force delete previously created bootstrap cluster"}

A bootstrap cluster already exists with the same name. If you are sure the cluster is not being used, you may use the --force-cleanup option to eksctl anywhere to delete the cluster or you may delete the cluster with kind delete cluster --name <cluster-name>. If you do not have kind installed, you may use docker stop to stop the docker container running the KinD cluster.

Memory or disk resource problem

Various disk and memory issues can cause problems. Make sure the system-wide Docker memory configuration provides enough RAM for the bootstrap cluster.

Make sure you do not have unneeded KinD clusters running by checking kind get clusters. You may want to delete unneeded clusters with kind delete cluster --name <cluster-name>. If you do not have kind installed, you may install it from https://kind.sigs.k8s.io/ or use docker ps to see the KinD clusters and docker stop to stop the cluster.

Make sure you do not have any unneeded Docker containers running with docker ps. Terminate any unneeded Docker containers.

Make sure Docker isn’t out of disk resources. If you don’t have any other docker containers running you may want to run docker system prune to clean up disk space.

You may want to restart Docker. On Ubuntu, run sudo systemctl restart docker.

Waiting for cert-manager to be available… Error: timed out waiting for the condition

Failed to create cluster {"error": "error initializing capi resources in cluster: error executing init: Fetching providers\nInstalling cert-manager Version=\"v1.1.0\"\nWaiting for cert-manager to be available...\nError: timed out waiting for the condition\n"}

This is likely a Memory or disk resource problem. You can also try using techniques from Generic cluster unavailable.

NTP Time sync issues

level=error msg=k8sError error="github.com/cilium/cilium/pkg/k8s/watchers/endpoint_slice.go:91: Failed to watch *v1beta1.EndpointSlice: failed to list *v1beta1.EndpointSlice: Unauthorized" subsys=k8s

You might notice authorization errors if the timestamps on your EKS Anywhere control plane nodes and worker nodes are out of sync. Ensure that all the nodes are configured with the same healthy NTP servers to avoid out-of-sync issues.

Error running bootstrapper cmd: error joining as worker: Error waiting for worker join files: Kubeadm join kubelet-start killed after timeout

You might also notice that the joining of nodes will fail if your admin machine differs in time compared to your nodes. Make sure to check the server time matches between the two as well.
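
A quick way to compare clocks is to take an epoch timestamp on each machine and look at the difference. This is a minimal sketch; the helper name clock_skew_seconds is hypothetical, and the SSH user and node address in the usage comment are placeholders.

```shell
# Hypothetical helper: print the absolute difference, in seconds, between two
# epoch timestamps.
clock_skew_seconds() {
  diff=$(( $1 - $2 ))
  if [ "$diff" -lt 0 ]; then
    diff=$(( 0 - diff ))
  fi
  echo "$diff"
}

# Usage:
#   local_ts="$(date +%s)"
#   node_ts="$(ssh <ssh-username>@<node-ip> date +%s)"
#   clock_skew_seconds "$local_ts" "$node_ts"
```

The SSH round trip adds a small delay, so a result of a second or two does not necessarily indicate drift.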

The connection to the server localhost:8080 was refused

Performing provider setup and validations
Creating new bootstrap cluster
Installing cluster-api providers on bootstrap cluster
Error initializing capi in bootstrap cluster	{"error": "error waiting for capi-kubeadm-control-plane-controller-manager in namespace capi-kubeadm-control-plane-system: error executing wait: The connection to the server localhost:8080 was refused - did you specify the right host or port?\n"}
Failed to create cluster	{"error": "error waiting for capi-kubeadm-control-plane-controller-manager in namespace capi-kubeadm-control-plane-system: error executing wait: The connection to the server localhost:8080 was refused - did you specify the right host or port?\n"}

This is likely a Memory or disk resource problem.

Generic cluster unavailable

Troubleshoot further by inspecting the bootstrap cluster or workload cluster (depending on the stage of failure) using kubectl commands.

kubectl get pods -A --kubeconfig=<kubeconfig>
kubectl get nodes -A --kubeconfig=<kubeconfig>
kubectl logs <podname> -n <namespace> --kubeconfig=<kubeconfig>
....

CAPV troubleshooting guide: https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/master/docs/troubleshooting.md#debugging-issues
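
On a busy cluster, it can help to narrow the pod listing to the pods that are not healthy. This is a minimal sketch, assuming the default column layout of kubectl get pods -A (NAMESPACE, NAME, READY, STATUS, ...); the helper name unhealthy_pods is hypothetical.

```shell
# Hypothetical helper: read `kubectl get pods -A` output on stdin and print
# namespace/name (status) for pods that are neither Running nor Completed.
unhealthy_pods() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2 " (" $4 ")" }'
}

# Usage:
#   kubectl get pods -A --kubeconfig=<kubeconfig> | unhealthy_pods
```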

Bootstrap cluster fails to come up

If your bootstrap cluster has problems, you may get detailed logs by looking at the files created under the ${CLUSTER_NAME}/logs folder. The capv-controller-manager log file will surface issues with vSphere-specific configuration, while the capi-controller-manager log file might surface other generic issues with the cluster configuration passed in.

You may also access the logs from your bootstrap cluster directly as below:

export KUBECONFIG=${PWD}/${CLUSTER_NAME}/generated/${CLUSTER_NAME}.kind.kubeconfig
kubectl logs -f -n capv-system -l control-plane="controller-manager" -c manager

It also might be useful to start a shell session on the Docker container running the bootstrap cluster by running docker ps to find the kind container and then docker exec -it <container-id> bash.

Bootstrap cluster fails to come up

Error: creating bootstrap cluster: executing create cluster: ERROR: failed to create cluster: node(s) already exist for a cluster with the name \"cluster-name\"
, try rerunning with --force-cleanup to force delete previously created bootstrap cluster

Cluster creation fails because a cluster of the same name already exists. Try running eksctl anywhere create cluster again, adding the --force-cleanup option.

If that doesn’t work, you can manually delete the old cluster:

kind delete cluster --name cluster-name

Cluster upgrade fails with management cluster on bootstrap cluster

If a cluster upgrade of a management (or self managed) cluster fails or is halted in the middle, you may be left in a state where the management resources (CAPI) are still on the KinD bootstrap cluster on the Admin machine. Right now, you will have to manually move the management resources from the KinD cluster back to the management cluster.

First create a backup:

CLUSTER_NAME=squid
KINDKUBE=${CLUSTER_NAME}/generated/${CLUSTER_NAME}.kind.kubeconfig
MGMTKUBE=${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
DIRECTORY=backup
# Substitute the version with whatever version you are using
CONTAINER=public.ecr.aws/eks-anywhere/cli-tools:v0.12.0-eks-a-19

rm -rf ${DIRECTORY}
mkdir ${DIRECTORY}

docker run -i --network host -w $(pwd) -v /var/run/docker.sock:/var/run/docker.sock -v $(pwd):/$(pwd) --entrypoint clusterctl ${CONTAINER} move \
        --namespace eksa-system \
        --kubeconfig $KINDKUBE \
        --to-directory ${DIRECTORY}

#After the backup, move the management cluster back
docker run -i --network host -w $(pwd) -v /var/run/docker.sock:/var/run/docker.sock -v $(pwd):/$(pwd) --entrypoint clusterctl ${CONTAINER} move \
        --to-kubeconfig $MGMTKUBE \
        --namespace eksa-system \
        --kubeconfig $KINDKUBE

Before you delete your bootstrap KinD cluster, verify there are no important custom resources left on it:

kubectl get crds | grep eks | while read crd rol
do
  echo $crd
  kubectl get $crd -A
done

Bare Metal troubleshooting

Creating new workload cluster hangs or fails

Cluster creation may appear to hang while waiting for the control plane to be ready. If the CLI remains on this message for over 30 minutes, something likely failed during OS provisioning:

Waiting for Control Plane to be ready

Or if cluster creation times out on this step and fails with the following messages:

Support bundle archive created {"path": "support-bundle-2022-06-28T00_41_24.tar.gz"}
Analyzing support bundle {"bundle": "CLUSTER_NAME/generated/bootstrap-cluster-2022-06-28T00:41:24Z-bundle.yaml", "archive": "support-bundle-2022-06-28T00_41_24.tar.gz"}
Analysis output generated {"path": "CLUSTER_NAME/generated/bootstrap-cluster-2022-06-28T00:43:40Z-analysis.yaml"}
collecting workload cluster diagnostics
Error: waiting for workload cluster control plane to be ready: executing wait: error: timed out waiting for the condition on clusters/CLUSTER_NAME

In either of those cases, the following steps can help you determine the problem:

  1. Export the kind cluster’s kubeconfig file:

    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/generated/${CLUSTER_NAME}.kind.kubeconfig
    
  2. If you have provided BMC information:

    • Check all of the machines that the EKS Anywhere CLI has picked up from the pool of hardware in the CSV file:

      kubectl get machines.bmc -A
      
    • Check if those nodes are powered on. If any of those nodes are not powered on after a while, the BMC credentials may be invalid. You can verify this by checking the logs:

      kubectl get tasks.bmc -n eksa-system
      kubectl get tasks.bmc <bmc-name> -n eksa-system -o yaml
      

    Validate BMC credentials are correct if a connection error is observed on the tasks.bmc resource. Note that “IPMI over LAN” or “Redfish” must be enabled in the BMC configuration for the tasks.bmc resource to communicate successfully.

  3. If the machine is powered on but you see linuxkit is not running, then Tinkerbell failed to serve the node via iPXE. In this case, you would want to:

    • Check the Boots service logs from the machine where you are running the CLI to see if it received and/or responded to the request:

      docker logs boots
      
    • Confirm no other DHCP service responded to the request and check for any errors in the BMC console. Other DHCP servers on the network can result in race conditions and should be avoided by configuring the other server to block all MAC addresses and exclude all IP addresses used by EKS Anywhere.

  4. If you see Welcome to LinuxKit, press Enter in the BMC console to access the LinuxKit terminal. Run the following commands to check if the tink-worker container is running.

    docker ps -a
    docker logs <container-id>
    
  5. If the machine has already started provisioning the OS and is in an irrecoverable state, get the workflow of the provisioning/provisioned machine using:

    kubectl get workflows -n eksa-system
    kubectl describe workflow/<workflow-name> -n eksa-system 
    

    Check all the actions and their status to determine whether every action executed successfully. If the stream-image action has failed, it’s likely due to a timeout or network-related issue. You can also provide your own image_url by specifying osImageURL under the datacenter spec.

vSphere troubleshooting

EKSA_VSPHERE_USERNAME is not set or is empty

❌ Validation failed	{"validation": "vsphere Provider setup is valid", "error": "failed setup and validations: EKSA_VSPHERE_USERNAME is not set or is empty", "remediation": ""}

Two environment variables need to be set and exported in your environment to create clusters successfully. Be sure to use single quotes around your user name and password to avoid shell manipulation of these values.

export EKSA_VSPHERE_USERNAME='<vSphere-username>'
export EKSA_VSPHERE_PASSWORD='<vSphere-password>'

vSphere authentication failed

❌ Validation failed	{"validation": "vsphere Provider setup is valid", "error": "error validating vCenter setup: vSphere authentication failed: govc: ServerFaultCode: Cannot complete login due to an incorrect user name or password.\n", "remediation": ""}
Error: failed to create cluster: validations failed

Two environment variables need to be set and exported in your environment to create clusters successfully. Be sure to use single quotes around your user name and password to avoid shell manipulation of these values.

export EKSA_VSPHERE_USERNAME='<vSphere-username>'
export EKSA_VSPHERE_PASSWORD='<vSphere-password>'

Issues detected with selected template

Issues detected with selected template. Details: - -1:-1:VALUE_ILLEGAL: No supported hardware versions among [vmx-15]; supported: [vmx-04, vmx-07, vmx-08, vmx-09, vmx-10, vmx-11, vmx-12, vmx-13].

Our upstream dependency on CAPV makes it a requirement that you use vSphere 6.7 update 3 or newer. Make sure your ESXi hosts are also up to date.

Waiting for external etcd to be ready

2022-01-19T15:56:57.734Z        V3      Waiting for external etcd to be ready   {"cluster": "mgmt"}

Debug this problem using techniques from Generic cluster unavailable.

Timed out waiting for the condition on deployments/capv-controller-manager

Failed to create cluster {"error": "error initializing capi in bootstrap cluster: error waiting for capv-controller-manager in namespace capv-system: error executing wait: error: timed out waiting for the condition on deployments/capv-controller-manager\n"}

Debug this problem using techniques from Generic cluster unavailable.

Timed out waiting for the condition on clusters/

Failed to create cluster {"error": "error waiting for workload cluster control plane to be ready: error executing wait: error: timed out waiting for the condition on clusters/test-cluster\n"}

This can be an issue with the number of control plane and worker node replicas defined in your cluster yaml file. Try to start off with a smaller number (3 or 5 is recommended for control plane) in order to bring up the cluster.

This error can also occur because your vCenter server is using self-signed certificates and you have insecure set to true in the generated cluster yaml. To check if this is the case, run the commands below:

export KUBECONFIG=${PWD}/${CLUSTER_NAME}/generated/${CLUSTER_NAME}.kind.kubeconfig
kubectl get machines

If all the machines are in Provisioning phase, this is most likely the issue. To resolve the issue, set insecure to false and thumbprint to the TLS thumbprint of your vCenter server in the cluster yaml and try again.
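
One way to read a server certificate's fingerprint is with openssl. This is a minimal sketch, assuming the thumbprint field expects the SHA-1 fingerprint of the vCenter certificate (check the vSphere configuration reference to confirm); the helper name fingerprint_value is hypothetical and the vCenter host is a placeholder.

```shell
# Hypothetical helper: strip the "SHA1 Fingerprint=" style label that openssl
# prints, leaving only the colon-separated hex digest.
fingerprint_value() {
  sed 's/^[^=]*=//'
}

# Usage:
#   openssl s_client -connect <vcenter-host>:443 </dev/null 2>/dev/null \
#     | openssl x509 -noout -fingerprint -sha1 | fingerprint_value
```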

"msg"="discovered IP address"

This log message can appear with the control plane’s address value in either the ${CLUSTER_NAME}/logs/capv-controller-manager.log file or the capv-controller-manager pod log, which can be extracted with the following commands:

export KUBECONFIG=${PWD}/${CLUSTER_NAME}/generated/${CLUSTER_NAME}.kind.kubeconfig
kubectl logs -f -n capv-system -l control-plane="controller-manager" -c manager

Make sure you are choosing an IP address in your network range that does not conflict with other VMs. https://anywhere.eks.amazonaws.com/docs/reference/clusterspec/vsphere/#controlplaneconfigurationendpointhost-required

Generic cluster unavailable

The first thing to look at is: were virtual machines created on your target provider? In the case of vSphere, you should see some VMs in your folder and they should be up. Check the console and if you see:

[FAILED] Failed to start Wait for Network to be Configured.

Make sure your DHCP server is up and working.

Workload VM is created on vSphere but can not power on

A similar issue is when the VM does power on but does not show any logs on the console and does not have any IPs assigned.

This issue can occur if the resourcePool that the VM uses does not have enough CPU or memory resources to run a VM. To resolve this issue, increase the CPU and/or memory reservations or limits for the resourcePool.

Workload VMs start but Kubernetes not working properly

If the workload VMs start, but Kubernetes does not start or is not working properly, you may want to log onto the VMs and check the logs there. If Kubernetes is at least partially working, you may use kubectl to get the IPs of the nodes:

kubectl get nodes -o=custom-columns="NAME:.metadata.name,IP:.status.addresses[2].address"

If Kubernetes is not working at all, you can get the IPs of the VMs from vCenter or using govc.

When you get the external IP you can ssh into the nodes using the private ssh key associated with the public ssh key you provided in your cluster configuration:

ssh -i <ssh-private-key> <ssh-username>@<external-IP>

create command stuck on Creating new workload cluster

There can be a few reasons why the create command is stuck on Creating new workload cluster for over 30 minutes. First, check the vSphere UI to see if any workload VMs were created.

If any VMs are created, check to see if they have any IPv4 IPs assigned to them.

If there are no IPv4 IPs assigned to them, this is most likely because you don’t have a DHCP server configured for the network configured in the cluster config yaml. Ensure that you have DHCP running and run the create command again.

If there are any IPv4 IPs assigned, check if one of the VMs has the control plane IP specified in Cluster.spec.controlPlaneConfiguration.endpoint.host in the clusterconfig yaml. If this IP is not present on any control plane VM, make sure the network has access to the following endpoints:

  • vCenter endpoint (must be accessible to EKS Anywhere clusters)
  • public.ecr.aws
  • anywhere-assets.eks.amazonaws.com (to download the EKS Anywhere binaries, manifests and OVAs)
  • distro.eks.amazonaws.com (to download EKS Distro binaries and manifests)
  • d2glxqk2uabbnd.cloudfront.net (for EKS Anywhere and EKS Distro ECR container images)
  • api.ecr.us-west-2.amazonaws.com (for EKS Anywhere package authentication matching your region)
  • d5l0dvt14r5h8.cloudfront.net (for EKS Anywhere package ECR container images)
  • api.github.com (only if GitOps is enabled)
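
Reachability of these endpoints can be checked from a machine on the same network as the cluster nodes. This is a minimal sketch using curl; the helper name unreachable_endpoints is hypothetical, and it treats any HTTP response as reachable, which says nothing about authentication or proxy rules.

```shell
# Hypothetical helper: print every host from the arguments that does not
# answer over HTTPS within 5 seconds.
unreachable_endpoints() {
  for host in "$@"; do
    curl --silent --output /dev/null --max-time 5 "https://${host}" || echo "${host}"
  done
}

# Usage:
#   unreachable_endpoints public.ecr.aws anywhere-assets.eks.amazonaws.com \
#     distro.eks.amazonaws.com api.github.com
```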

If the IPv4 IPs are assigned to the VM and you have the workload kubeconfig under <cluster-name>/<cluster-name>-eks-a-cluster.kubeconfig, you can use it to check vsphere-cloud-controller-manager logs.

kubectl logs -n kube-system vsphere-cloud-controller-manager-<xxxxx> --kubeconfig <cluster-name>/<cluster-name>-eks-a-cluster.kubeconfig

If you see this message in the logs, it means your cluster nodes do not have access to vSphere, which is required for the cluster to get to a ready state.

Failed to connect to <vSphere-FQDN>: connection refused

In this case, you need to enable inbound traffic from your cluster nodes on your vCenter’s management network.

If VMs are created, but they do not get a network connection and DHCP is not configured for your vSphere deployment, you may need to create your own DHCP server. If no VMs are created, check the capi-controller-manager, capv-controller-manager, and capi-kubeadm-control-plane-controller-manager logs using the commands mentioned in the Generic cluster unavailable section.

Cluster Deletion Fails

If cluster deletion fails, you may need to manually delete the VMs associated with the cluster. The VMs should be named with the cluster name. You can power off and delete from disk using the vCenter web user interface. You may also use govc:

govc find -type VirtualMachine --name '<cluster-name>*'

This will give you a list of virtual machines that should be associated with your cluster. For each of the VMs you want to delete run:

VM_NAME=vm-to-destroy
govc vm.power -off -force $VM_NAME
govc object.destroy $VM_NAME
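
The two steps above can be combined into a loop over the govc find results. This is a minimal sketch that defaults to a dry run; the helper name destroy_cluster_vms and the DRY_RUN variable are hypothetical. Because the match is a name prefix, review the dry-run output carefully before executing, since another cluster whose name starts with the same string would also match.

```shell
# Hypothetical helper: power off and destroy every VM whose name starts with
# the cluster name. By default it only prints the govc commands; set
# DRY_RUN=false to execute them.
destroy_cluster_vms() {
  cluster="$1"
  run=""
  if [ "${DRY_RUN:-true}" = "true" ]; then
    run="echo"
  fi
  govc find -type VirtualMachine --name "${cluster}*" | while read -r vm; do
    $run govc vm.power -off -force "$vm"
    $run govc object.destroy "$vm"
  done
}

# Usage:
#   destroy_cluster_vms <cluster-name>                 # print the commands only
#   DRY_RUN=false destroy_cluster_vms <cluster-name>   # actually delete the VMs
```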

Troubleshooting GitOps integration

Cluster creation failure leaves outdated cluster configuration in GitHub.com repository

Failed cluster creation can sometimes leave behind cluster configuration files committed to your GitHub.com repository. Make sure to delete these configuration files before you re-try eksctl anywhere create cluster. If these configuration files are not deleted, GitOps installation will fail but cluster creation will continue.

They’ll generally be located under the directory clusters/$CLUSTER_NAME if you used the default path in your flux gitops config. Delete the entire directory named $CLUSTER_NAME.

Cluster creation failure leaves empty GitHub.com repository

Failed cluster creation can sometimes leave behind a completely empty GitHub.com repository. This can cause the GitOps installation to fail if you re-try the creation of a cluster which uses this repository. If cluster creation failure leaves behind an empty github repository, please manually delete the created GitHub.com repository before attempting cluster creation again.

Changes not syncing to cluster

Please remember that the only fields currently supported for GitOps are:

Cluster

  • Cluster.workerNodeGroupConfigurations.count
  • Cluster.workerNodeGroupConfigurations.machineGroupRef.name

Worker Nodes

  • VsphereMachineConfig.diskGiB
  • VsphereMachineConfig.numCPUs
  • VsphereMachineConfig.memoryMiB
  • VsphereMachineConfig.template
  • VsphereMachineConfig.datastore
  • VsphereMachineConfig.folder
  • VsphereMachineConfig.resourcePool

If you’ve changed these fields and they’re not syncing to the cluster as you’d expect, check the logs of the pod in the source-controller deployment in the flux-system namespace. If flux is having a problem connecting to your GitHub repository, the problem will be logged here.

$ kubectl get pods -n flux-system
NAME                                       READY   STATUS    RESTARTS   AGE
helm-controller-7d644b8547-k8wfs           1/1     Running   0          4h15m
kustomize-controller-7cf5875f54-hs2bt      1/1     Running   0          4h15m
notification-controller-776f7d68f4-v22kp   1/1     Running   0          4h15m
source-controller-7c4555748d-7c7zb         1/1     Running   0          4h15m
$ kubectl logs source-controller-7c4555748d-7c7zb -n flux-system

A well behaved flux pod will simply log the ongoing reconciliation process, like so:

{"level":"info","ts":"2021-07-01T19:58:51.076Z","logger":"controller.gitrepository","msg":"Reconciliation finished in 902.725344ms, next run in 1m0s","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"flux-system","namespace":"flux-system"}
{"level":"info","ts":"2021-07-01T19:59:52.012Z","logger":"controller.gitrepository","msg":"Reconciliation finished in 935.016754ms, next run in 1m0s","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"flux-system","namespace":"flux-system"}
{"level":"info","ts":"2021-07-01T20:00:52.982Z","logger":"controller.gitrepository","msg":"Reconciliation finished in 970.03174ms, next run in 1m0s","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"flux-system","namespace":"flux-system"}

If there are issues connecting to GitHub, you’ll instead see exceptions in the source-controller log stream. For example, if the deploy key used by flux has been deleted, you’d see something like this:

{"level":"error","ts":"2021-07-01T20:04:56.335Z","logger":"controller.gitrepository","msg":"Reconciler error","reconciler group":"source.toolkit.fluxcd.io","reconciler kind":"GitRepository","name":"flux-system","namespace":"flux-system","error":"unable to clone 'ssh://git@github.com/youruser/gitops-vsphere-test', error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain"}
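
Because a healthy controller logs a reconciliation entry every minute, filtering for error-level entries makes problems easier to spot. This is a minimal sketch, assuming the JSON log format shown above; the helper name flux_error_lines is hypothetical.

```shell
# Hypothetical helper: read source-controller logs on stdin and keep only the
# error-level entries. Exits non-zero when no error entries are found.
flux_error_lines() {
  grep '"level":"error"'
}

# Usage:
#   kubectl logs deployment/source-controller -n flux-system | flux_error_lines
```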

Other ways to troubleshoot GitOps integration

If you’re still having problems after deleting any empty EKS Anywhere created GitHub repositories and looking at the source-controller logs, you can look for additional issues by checking the deployments in the flux-system and eksa-system namespaces to ensure they’re running and their log streams are free from exceptions.

$ kubectl get deployments -n flux-system
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
helm-controller           1/1     1            1           4h13m
kustomize-controller      1/1     1            1           4h13m
notification-controller   1/1     1            1           4h13m
source-controller         1/1     1            1           4h13m
$ kubectl get deployments -n eksa-system
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
eksa-controller-manager   1/1     1            1           4h13m

Snow troubleshooting

Device outage

These are some conditions that can cause a device outage:

  • Intentional outage (a planned power outage or an outage when moving devices, for example).
  • Unintentional outage (a subset of devices or all devices are rebooted or experience network disconnections from the LAN, which takes the devices offline or isolates them from the cluster).

NOTE: If all Snowball Edge devices are moved to a different place and connected to a different local network, make sure you use the same subnet, netmask, and gateway for your network configuration. After moving, devices and all node instances need to maintain the original IP addresses. Then, follow the recover cluster procedure to get your cluster up and running again. Otherwise, it might be impossible to resume the cluster.

To recover a cluster

If a subset of devices or all devices experience an outage, see Downloading and Installing the Snowball Edge client to get the Snowball Edge client and then follow these steps:

  1. Reboot and unlock all affected devices manually.

    // use reboot-device command to reboot device, this may take several minutes
    $ path-to-snowballEdge_CLIENT reboot-device --endpoint https://snowball-ip --manifest-file path-to-manifest-file --unlock-code unlock-code
    
    // use describe-device command to check the status of device
    $ path-to-snowballEdge_CLIENT describe-device --endpoint https://snowball-ip --manifest-file path-to-manifest-file --unlock-code unlock-code
    
    // when the State in the output of describe-device is LOCKED, run unlock-device
    $ path-to-snowballEdge_CLIENT unlock-device --endpoint https://snowball-ip --manifest-file path-to-manifest-file --unlock-code unlock-code
    
    // use describe-device command to check the status of device until device is unlocked
    $ path-to-snowballEdge_CLIENT describe-device --endpoint https://snowball-ip --manifest-file path-to-manifest-file --unlock-code unlock-code
    
  2. Get all instance IDs that were part of the cluster by looking up the impacted device IP in the PROVIDERID column.

    $ kubectl get machines -A --kubeconfig=cluster-name/cluster-name-eks-a-cluster.kubeconfig
    
    NAMESPACE     NAME            CLUSTER        NODENAME          PROVIDERID                                        PHASE     AGE   VERSION
    eksa-system   machine-name-1  cluster-name   node-name-1       aws-snow:///192.168.1.39/s.i-8319d8c75d54a32cc    Running   82s   v1.24.9-eks-1-24-7
    eksa-system   machine-name-2  cluster-name   node-name-2       aws-snow:///192.168.1.39/s.i-8d7d3679a1713e403    Running   82s   v1.24.9-eks-1-24-7
    eksa-system   machine-name-3  cluster-name   node-name-3       aws-snow:///192.168.1.231/s.i-8201c356fb369c37f   Running   81s   v1.24.9-eks-1-24-7
    eksa-system   machine-name-4  cluster-name   node-name-4       aws-snow:///192.168.1.39/s.i-88597731b5a4a9044    Running   81s   v1.24.9-eks-1-24-7
    eksa-system   machine-name-5  cluster-name   node-name-5       aws-snow:///192.168.1.77/s.i-822f0f46267ad4c6e    Running   81s   v1.24.9-eks-1-24-7          
    
  3. Start all instances on the impacted devices as soon as possible.

    $ aws ec2 start-instances --instance-id instance-id-1 instance-id-2 ... --endpoint http://snowball-ip:8008 --profile profile-name
    
  4. Check the balance status of the current cluster after the cluster is ready again.

    $ kubectl get machines -A --kubeconfig=cluster-name/cluster-name-eks-a-cluster.kubeconfig          
    
  5. Check if you have unstacked etcd machines.

    • If you have unstacked etcd machines, check the provision of unstacked etcd machines. You can find the device IP in the PROVIDERID column.

      • If more than one unstacked etcd machine is provisioned on the same device and there are devices with no unstacked etcd machine, you need to rebalance unstacked etcd nodes. Follow the rebalance nodes procedure to rebalance your unstacked etcd nodes in order to recover high availability.
      • If you have your etcd nodes evenly distributed with 1 device having at most 1 etcd node, you are done with the recovery.
    • If you don’t have unstacked etcd machines, check the provision of control plane machines. You can find the device IP in the PROVIDERID column.

      • If more than one control plane machine is provisioned on the same device and there are devices with no control plane machine, you need to rebalance control plane nodes. Follow the rebalance nodes procedure to rebalance your control plane nodes in order to recover high availability.
      • If you have your control plane nodes evenly distributed with 1 device having at most 1 control plane node, you are done with the recovery.
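
The per-device checks in steps 2 and 5 can be scripted. The following is a minimal sketch, not part of EKS Anywhere: it assumes that PROVIDERID is the fifth column of the kubectl get machines -A output and has the form aws-snow:///device-ip/instance-id, as in the sample output above. The function names instance_ids_on_device and machines_per_device are hypothetical.

```shell
# Sketch: helpers that read `kubectl get machines -A` output on stdin.
# Assumption: PROVIDERID is column 5 and looks like
# aws-snow:///<device-ip>/<instance-id>.

# Print the instance IDs of the machines hosted on the device IP given as $1.
instance_ids_on_device() {
  awk -v ip="$1" 'index($5, "aws-snow:///") == 1 {
    split($5, part, "/")          # part[4] = device IP, part[5] = instance ID
    if (part[4] == ip) print part[5]
  }'
}

# Print "<device-ip> <machine-count>" for every device, sorted as text.
machines_per_device() {
  awk 'index($5, "aws-snow:///") == 1 {
    split($5, part, "/")
    count[part[4]]++
  }
  END { for (ip in count) print ip, count[ip] }' | LC_ALL=C sort
}
```

For example, piping the output of kubectl get machines -A --kubeconfig=cluster-name/cluster-name-eks-a-cluster.kubeconfig into machines_per_device shows how many machines each device hosts. The count covers all machine types, so filter the input to etcd or control plane machines first if you only want to check those.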

How to rebalance nodes

  1. Confirm the machines you want to delete and get their node name from the NODENAME column.

    You can determine which machines need to be deleted by referring to the AGE column. Newly generated machines have a short AGE. Delete the new etcd or control plane machine nodes that are not the only etcd or control plane machine node on their device.

  2. Cordon each node so no further workloads are scheduled to run on it.

    $ kubectl cordon node-name --kubeconfig=cluster-name/cluster-name-eks-a-cluster.kubeconfig
    
  3. Drain machine nodes of all current workloads.

    $ kubectl drain node-name --ignore-daemonsets --kubeconfig=cluster-name/cluster-name-eks-a-cluster.kubeconfig
    
  4. Delete machine node.

    $ kubectl delete node node-name --kubeconfig=cluster-name/cluster-name-eks-a-cluster.kubeconfig
    
  5. Repeat this process until etcd/control plane machine nodes are evenly provisioned.
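
The cordon, drain, and delete steps above can be wrapped in a small helper so that each node is handled the same way. This is a minimal sketch: rebalance_node, run_or_print, and the DRY_RUN variable are hypothetical names, not EKS Anywhere features. With DRY_RUN=1 the commands are printed rather than executed, which lets you review them first.

```shell
# Sketch: run (or, with DRY_RUN=1, only print) a command.
run_or_print() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Cordon, drain, and delete one node. Stops at the first failing step.
# $1 = node name from the NODENAME column, $2 = path to the kubeconfig file.
rebalance_node() {
  node="$1"
  kubeconfig="$2"
  run_or_print kubectl cordon "$node" --kubeconfig="$kubeconfig" &&
  run_or_print kubectl drain "$node" --ignore-daemonsets --kubeconfig="$kubeconfig" &&
  run_or_print kubectl delete node "$node" --kubeconfig="$kubeconfig"
}
```

Call it once per node, for example rebalance_node node-name cluster-name/cluster-name-eks-a-cluster.kubeconfig, and check the machine list between calls so that nodes are removed one at a time.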

Device replacement

Device replacement might be required in the following cases:

  • When a subset of devices is determined to be broken and you want to join a new device to the current cluster.
  • When a subset of devices goes offline and comes back with a new device IP.

To upgrade a cluster with new devices:

  1. Add new certificates to the certificate file and new credentials to the credential file.

  2. Change the device list in your cluster yaml configuration file and use the eksctl anywhere upgrade cluster command.

    $ eksctl anywhere upgrade cluster -f eks-a-cluster.yaml
    

Node outage

Unintentional instance outage

When an instance is in an exception state (for example, terminated or stopped), Amazon EKS Anywhere discovers it automatically and creates a replacement instance node after 5 minutes. The new node is provisioned according to an even provision strategy: it goes to the device with the fewest machines of the same type. Because more than one device can have the same number of machines of that type, there is no guarantee that the node will be provisioned on the original device.

Intentional node replacement

If you want to replace an unhealthy node which didn’t get detected by Amazon EKS Anywhere automatically, you can follow these steps.

NOTE: Do not delete all worker machine nodes or control plane nodes or etcd nodes at the same time. Make sure you delete machine nodes one by one.

  1. Cordon the nodes so no further workloads are scheduled to run on them.

    $ kubectl cordon node-name --kubeconfig=cluster-name/cluster-name-eks-a-cluster.kubeconfig
    
  2. Drain machine nodes of all current workloads.

    $ kubectl drain node-name --ignore-daemonsets --kubeconfig=cluster-name/cluster-name-eks-a-cluster.kubeconfig              
    
  3. Delete machine nodes.

    $ kubectl delete node node-name --kubeconfig=cluster-name/cluster-name-eks-a-cluster.kubeconfig              
    
  4. New nodes will be provisioned automatically. You can check the provision result with the get machines command.

    $ kubectl get machines -A --kubeconfig=cluster-name/cluster-name-eks-a-cluster.kubeconfig              
    

Cluster Deletion Fails

If your Amazon EKS Anywhere cluster creation failed and the eksctl anywhere delete cluster -f eksa-cluster.yaml command does not run successfully, manually delete a few resources before trying the command again. Run the following commands from the computer on which you set up the AWS configuration and have the Snowball Edge Client installed. If you are using multiple Snowball Edge devices, run these commands for each device.

// get the list of instance ids that are created for Amazon EKS Anywhere cluster, 
// that can be identified by cluster name in the tag of the output
$ aws ec2 describe-instances --endpoint http://snowball-ip:8008 --profile profile-name

// the next two commands are for deleting DNI, this needs to be done before deleting instance
$ PATH_TO_Snowball_Edge_CLIENT/bin/snowballEdge describe-direct-network-interfaces --endpoint https://snowball-ip --manifest-file path-to-manifest-file --unlock-code unlock-code
  
// DNI arn can be found in the output of last command, which is associated with the specific instance id you get from describe-instances
$ PATH_TO_Snowball_Edge_CLIENT/bin/snowballEdge delete-direct-network-interface --direct-network-interface-arn DNI-ARN --endpoint https://snowball-ip --manifest-file path-to-manifest-file --unlock-code unlock-code

// delete instance
$ aws ec2 terminate-instances --instance-id instance-id-1,instance-id-2 --endpoint http://snowball-ip:8008 --profile profile-name

Generate a log file from the Snowball Edge device

You can also generate a log file from the Snowball Edge device for AWS Support. See AWS Snowball Edge Logs in this guide.

Nutanix troubleshooting

Error creating Nutanix client

Error: error creating nutanix client: username, password and endpoint are required

Verify that the required environment variables are set before creating the cluster:

export EKSA_NUTANIX_USERNAME="<Nutanix-username>"
export EKSA_NUTANIX_PASSWORD="<Nutanix-password>"

Also, make sure the spec.endpoint is correctly configured in the NutanixDatacenterConfig. The value of the spec.endpoint should be the IP or FQDN of Prism Central.
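
A quick pre-flight check can catch the missing environment variables before you run the create command. The following is a minimal sketch; check_required_env is a hypothetical helper name, not part of EKS Anywhere, and it only tests that the variables named above are non-empty, not that the credentials are valid.

```shell
# Sketch: report which of the named variables are unset or empty.
# Returns 0 when all are set, 1 otherwise. Pass variable names as arguments.
check_required_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example use:
#   check_required_env EKSA_NUTANIX_USERNAME EKSA_NUTANIX_PASSWORD || exit 1
```

The helper takes the variable names as arguments, so the same function can be reused for other sets of required variables.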

x509: certificate signed by unknown authority

If the "nutanix Provider setup is valid" validation fails with the x509: certificate signed by unknown authority message, the certificate of the Prism Central endpoint is not trusted. If Prism Central is configured with self-signed certificates, it is recommended to configure the additionalTrustBundle in the NutanixDatacenterConfig. More information can be found here.

4.3.2 - Generating a Support Bundle

Using the Support Bundle with your EKS Anywhere Cluster

This guide covers the use of the EKS Anywhere Support Bundle for troubleshooting and support. This allows you to gather cluster information, save it to your administrative machine, and perform analysis of the results.

EKS Anywhere leverages troubleshoot.sh to collect and analyze Kubernetes cluster logs, cluster resource information, and other relevant debugging information.

EKS Anywhere has two Support Bundle commands:

eksctl anywhere generate support-bundle will execute a support bundle on your cluster, collecting relevant information, archiving it locally, and performing analysis of the results.

eksctl anywhere generate support-bundle-config will generate a Support Bundle config yaml file for you to customize.

Do not add personally identifiable information (PII) or other confidential or sensitive information to your support bundle. If you provide the support bundle to get support from AWS, it will be accessible to other AWS services, including AWS Support.

Collecting a Support Bundle and running analyzers

eksctl anywhere generate support-bundle

generate support-bundle will allow you to quickly collect relevant logs and cluster resources and save them locally in an archive file. This archive can then be used to aid in further troubleshooting and debugging.

If you provide a cluster configuration file containing your cluster spec using the -f flag, generate support-bundle will customize the auto-generated support bundle collectors and analyzers to match the state of your cluster.

If you provide a support bundle configuration file using the --bundle-config flag, for example one generated with generate support-bundle-config, generate support-bundle will use the provided configuration when collecting information from your cluster and analyzing the results.

Flags:
      --bundle-config string   Bundle Config file to use when generating support bundle
  -f, --filename string        Filename that contains EKS-A cluster configuration
  -h, --help                   help for support-bundle
      --since string           Collect pod logs in the latest duration like 5s, 2m, or 3h.
      --since-time string      Collect pod logs after a specific datetime(RFC3339) like 2021-06-28T15:04:05Z
  -w, --w-config string        Kubeconfig file to use when creating support bundle for a workload cluster

Collecting and analyzing a bundle

You only need to run a single command to generate a support bundle, collect information and analyze the output: eksctl anywhere generate support-bundle -f myCluster.yaml

This command will collect the information from your cluster and run an analysis of the collected information.

The collected information will be saved to your local disk in an archive which can be used for debugging and obtaining additional in-depth support.

The analysis will be printed to your console.

Collect phase:

$ ./bin/eksctl anywhere generate support-bundle -f ./testcluster100.yaml
 Collecting support bundle cluster-info
 Collecting support bundle cluster-resources
 Collecting support bundle secret
 Collecting support bundle logs
 Analyzing support bundle

Analysis phase:

 Analyze Results
------------
Check PASS
Title: gitopsconfigs.anywhere.eks.amazonaws.com
Message: gitopsconfigs.anywhere.eks.amazonaws.com is present on the cluster

------------
Check PASS
Title: vspheredatacenterconfigs.anywhere.eks.amazonaws.com
Message: vspheredatacenterconfigs.anywhere.eks.amazonaws.com is present on the cluster

------------
Check PASS
Title: vspheremachineconfigs.anywhere.eks.amazonaws.com
Message: vspheremachineconfigs.anywhere.eks.amazonaws.com is present on the cluster

------------
Check PASS
Title: capv-controller-manager Status
Message: capv-controller-manager is running.

------------
Check PASS
Title: capv-controller-manager Status
Message: capv-controller-manager is running.

------------
Check PASS
Title: coredns Status
Message: coredns is running.

------------
Check PASS
Title: cert-manager-webhook Status
Message: cert-manager-webhook is running.

------------
Check PASS
Title: cert-manager-cainjector Status
Message: cert-manager-cainjector is running.

------------
Check PASS
Title: cert-manager Status
Message: cert-manager is running.

------------
Check PASS
Title: capi-kubeadm-control-plane-controller-manager Status
Message: capi-kubeadm-control-plane-controller-manager is running.

------------
Check PASS
Title: capi-kubeadm-bootstrap-controller-manager Status
Message: capi-kubeadm-bootstrap-controller-manager is running.

------------
Check PASS
Title: capi-controller-manager Status
Message: capi-controller-manager is running.

------------
Check PASS
Title: capi-controller-manager Status
Message: capi-controller-manager is running.

------------
Check PASS
Title: capi-kubeadm-control-plane-controller-manager Status
Message: capi-kubeadm-control-plane-controller-manager is running.

------------
Check PASS
Title: capi-kubeadm-control-plane-controller-manager Status
Message: capi-kubeadm-control-plane-controller-manager is running.

------------
Check PASS
Title: capi-kubeadm-bootstrap-controller-manager Status
Message: capi-kubeadm-bootstrap-controller-manager is running.

------------
Check PASS
Title: clusters.anywhere.eks.amazonaws.com
Message: clusters.anywhere.eks.amazonaws.com is present on the cluster

------------
Check PASS
Title: bundles.anywhere.eks.amazonaws.com
Message: bundles.anywhere.eks.amazonaws.com is present on the cluster

------------

Archive phase:

a support bundle has been created in the current directory:	{"path": "support-bundle-2021-09-02T19_29_41.tar.gz"}
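
When the analysis output is long, a short summary of the results can help you spot anything that did not pass. The following is a minimal sketch under one assumption: each result starts with a line of the form "Check STATUS", as in the sample output above. The function name summarize_checks and the file name analysis.txt are hypothetical.

```shell
# Sketch: count analyzer results by status.
# Reads saved analysis output on stdin and prints "<STATUS> <count>" lines.
# Assumption: each result begins with a two-word line "Check <STATUS>".
summarize_checks() {
  awk '$1 == "Check" && NF == 2 { count[$2]++ }
       END { for (status in count) print status, count[status] }' | LC_ALL=C sort
}

# Example use, with console output previously saved to analysis.txt:
#   summarize_checks < analysis.txt
```

The sketch makes no assumption about which status values other than PASS can appear; it simply groups by whatever follows "Check".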

Generating a custom Support Bundle configuration for your EKS Anywhere Cluster

EKS Anywhere will automatically generate a support bundle based on your cluster configuration; however, if you’d like to customize the support bundle to collect specific information, you can generate your own support bundle configuration yaml for EKS Anywhere to run on your cluster.

eksctl anywhere generate support-bundle-config will generate a default support bundle configuration and print it as yaml.

eksctl anywhere generate support-bundle-config -f myCluster.yaml will generate a support bundle configuration customized to your cluster and print it as yaml.

To run a customized support bundle configuration yaml file on your cluster, save this output to a file and run the command eksctl anywhere generate support-bundle using the flag --bundle-config.

eksctl anywhere generate support-bundle-config
Flags:
  -f, --filename string   Filename that contains EKS-A cluster configuration
  -h, --help              help for support-bundle-config

4.4 - EKS Anywhere curated package management

Common tasks for managing curated packages.

The main goal of EKS Anywhere curated packages is to make it easy to install, configure, and maintain operational components in an EKS Anywhere cluster. Curated packages let you run secure, tested operational components on EKS Anywhere clusters. See EKS Anywhere curated packages concepts and EKS Anywhere curated packages configurations for more details.

For proper curated package support, make sure the cluster Kubernetes version is v1.21 or above and the eksctl anywhere version is v0.11.0 or above (you can check this with the eksctl anywhere version command). Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. To request a free trial, talk to your Amazon representative or connect with one here.

Setup authentication to use curated-packages

When you have been notified that your account has been given access to curated packages, create an IAM user in your account with a policy that only allows ECR read access to the Curated Packages repository, similar to this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ECRRead",
            "Effect": "Allow",
            "Action": [
                "ecr:DescribeImageScanFindings",
                "ecr:GetDownloadUrlForLayer",
                "ecr:DescribeRegistry",
                "ecr:DescribePullThroughCacheRules",
                "ecr:DescribeImageReplicationStatus",
                "ecr:ListTagsForResource",
                "ecr:ListImages",
                "ecr:BatchGetImage",
                "ecr:DescribeImages",
                "ecr:DescribeRepositories",
                "ecr:BatchCheckLayerAvailability"
            ],
            "Resource": "arn:aws:ecr:*:783794618700:repository/*"
        },
        {
            "Sid": "ECRLogin",
            "Effect": "Allow",
            "Action": [
                "ecr:GetAuthorizationToken"
            ],
            "Resource": "*"
        }
    ]
}

Note: Curated Packages supports pulling images from the following regions. Set the corresponding EKSA_AWS_REGION before cluster creation to choose which region to pull from. If it is not set, images are pulled from us-west-2 by default.

"us-east-2",
"us-east-1",
"us-west-1",
"us-west-2",
"ap-northeast-3",
"ap-northeast-2",
"ap-southeast-1",
"ap-southeast-2",
"ap-northeast-1",
"ca-central-1",
"eu-central-1",
"eu-west-1",
"eu-west-2",
"eu-west-3",
"eu-north-1",
"sa-east-1"

Create credentials for this user and set and export the following environment variables:

export EKSA_AWS_ACCESS_KEY_ID="your*access*id"
export EKSA_AWS_SECRET_ACCESS_KEY="your*secret*key"
export EKSA_AWS_REGION="us-west-2"
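
Because an unsupported region only shows up later as a pull failure, it can be worth validating EKSA_AWS_REGION up front. The following is a minimal sketch that checks a value against the region list above; is_supported_region and SUPPORTED_REGIONS are hypothetical names, and the list is copied from this page, so it may lag behind newer releases.

```shell
# Regions listed on this page as supported for pulling curated packages.
SUPPORTED_REGIONS="us-east-2 us-east-1 us-west-1 us-west-2 ap-northeast-3 ap-northeast-2 ap-southeast-1 ap-southeast-2 ap-northeast-1 ca-central-1 eu-central-1 eu-west-1 eu-west-2 eu-west-3 eu-north-1 sa-east-1"

# Return 0 if the region in $1 is in the list. An empty or missing argument
# is treated as us-west-2, mirroring the documented default.
is_supported_region() {
  region="${1:-us-west-2}"
  case " $SUPPORTED_REGIONS " in
    *" $region "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use:
#   is_supported_region "$EKSA_AWS_REGION" || echo "unsupported region" >&2
```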

Make sure you are authenticated with the AWS CLI

export AWS_ACCESS_KEY_ID="your*access*id"
export AWS_SECRET_ACCESS_KEY="your*secret*key"
aws sts get-caller-identity

Login to docker

aws ecr get-login-password --region us-west-2 |docker login --username AWS --password-stdin 783794618700.dkr.ecr.us-west-2.amazonaws.com

Verify you can pull an image

docker pull 783794618700.dkr.ecr.us-west-2.amazonaws.com/emissary-ingress/emissary:v3.0.0-9ded128b4606165b41aca52271abe7fa44fa7109

If the image downloads successfully, it worked!

Discover curated packages

You can get a list of the available packages from the command line:

export CLUSTER_NAME=nameofyourcluster
export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
eksctl anywhere list packages --kube-version 1.23

Example command output:

Package                 Version(s)
-------                 ----------
hello-eks-anywhere      0.1.2-a6847010915747a9fc8a412b233a2b1ee608ae76
adot                    0.25.0-c26690f90d38811dbb0e3dad5aea77d1efa52c7b
cert-manager            1.9.1-dc0c845b5f71bea6869efccd3ca3f2dd11b5c95f
cluster-autoscaler      9.21.0-1.23-5516c0368ff74d14c328d61fe374da9787ecf437
harbor                  2.5.1-ee7e5a6898b6c35668a1c5789aa0d654fad6c913
metallb                 0.13.7-758df43f8c5a3c2ac693365d06e7b0feba87efd5
metallb-crds            0.13.7-758df43f8c5a3c2ac693365d06e7b0feba87efd5
metrics-server          0.6.1-eks-1-23-6-c94ed410f56421659f554f13b4af7a877da72bc1
emissary                3.3.0-cbf71de34d8bb5a72083f497d599da63e8b3837b
emissary-crds           3.3.0-cbf71de34d8bb5a72083f497d599da63e8b3837b
prometheus              2.41.0-b53c8be243a6cc3ac2553de24ab9f726d9b851ca

Generate a curated-packages config

The following example generates a configuration for the harbor package from the curated package list.

export CLUSTER_NAME=nameofyourcluster
eksctl anywhere generate package harbor --cluster ${CLUSTER_NAME} --kube-version 1.23 > packages.yaml

Available curated packages and troubleshooting guides are listed below.

Install package controller after installation

If you created a cluster without the package controller, or if the package controller was not properly configured, follow these steps to enable it.

Make sure you are authenticated with the AWS CLI. Use the credentials you set up for packages. These credentials should have limited capabilities:

export AWS_ACCESS_KEY_ID="your*access*id"
export AWS_SECRET_ACCESS_KEY="your*secret*key"
export EKSA_AWS_ACCESS_KEY_ID="your*access*id"
export EKSA_AWS_SECRET_ACCESS_KEY="your*secret*key"

Verify your credentials are working:

aws sts get-caller-identity

Login to docker

aws ecr get-login-password |docker login --username AWS --password-stdin 783794618700.dkr.ecr.us-west-2.amazonaws.com

Verify you can pull an image

docker pull 783794618700.dkr.ecr.us-west-2.amazonaws.com/emissary-ingress/emissary:v3.0.0-9ded128b4606165b41aca52271abe7fa44fa7109

If the image downloads successfully, it worked!

If you do not have the package controller installed (it is installed by default), install it now:

eksctl anywhere install packagecontroller -f cluster.yaml

If you had the package controller disabled, you may need to modify your cluster.yaml to enable it.

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: billy
spec:
  packages:
    disable: false

You may need to create or update your credentials which you can do with a command like this. Set the environment variables to the proper values before running the command.

kubectl delete secret -n eksa-packages aws-secret
kubectl create secret -n eksa-packages generic aws-secret \
   --from-literal=AWS_ACCESS_KEY_ID=${EKSA_AWS_ACCESS_KEY_ID} \
   --from-literal=AWS_SECRET_ACCESS_KEY=${EKSA_AWS_SECRET_ACCESS_KEY}  \
   --from-literal=REGION=${EKSA_AWS_REGION}

If you recreate secrets, you can manually re-enable the cronjob and run the job to update the image pull secrets:

kubectl get cronjob -n eksa-packages cron-ecr-renew -o yaml | yq e '.spec.suspend |= false' - | kubectl apply -f -
kubectl create job -n eksa-packages --from=cronjob/cron-ecr-renew run-it-now

Upgrade the packages controller

Starting with EKS-A v0.15.0 (packages controller v0.3.9+), the package controller upgrades automatically according to the selected bundle. For any version prior to v0.3.X, manual steps must be executed to upgrade.

  1. Ensure the namespace will be kept.

    kubectl annotate namespaces eksa-packages helm.sh/resource-policy=keep

  2. Uninstall the eks-anywhere-packages helm release.

    helm uninstall eks-anywhere-packages

  3. Remove the secret called aws-secret (credentials will be needed when installing the new version).

    kubectl delete secret -n eksa-packages aws-secret

  4. Install the new version using the latest eksctl-anywhere binary.

    eksctl anywhere install packagecontroller -f ${CLUSTER_NAME}.yaml

4.4.1 - Package Prerequisites

Prerequisites for using curated packages

Prerequisites

Before installing any curated packages for EKS Anywhere, do the following:

  • Check that the cluster Kubernetes version is v1.21 or above. For example, you could run kubectl get cluster -o yaml <cluster-name> | grep -i kubernetesVersion

  • Check that the version of eksctl anywhere is v0.11.0 or above with the eksctl anywhere version command.

  • It is recommended that the package controller is only installed on the management cluster.

  • Check the existence of package controller:

    kubectl get pods -n eksa-packages | grep "eks-anywhere-packages"
    

    If the returned result is empty, you need to install the package controller.

  • Install the package controller if it is not installed:

    Note: This command is temporarily provided to ease integration with curated packages. It will be deprecated in the future.

    eksctl anywhere install packagecontroller -f $CLUSTER_NAME.yaml
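
The two version prerequisites above are numeric comparisons, so a plain string comparison would give the wrong answer for values such as v0.9.5 and v0.11.0. The following is a minimal sketch of a dotted-version check; version_at_least is a hypothetical helper name, and the version strings are passed in by hand rather than parsed from any command output.

```shell
# Sketch: succeed if version $1 is greater than or equal to version $2.
# Accepts forms such as v0.11.0 or 1.23; a leading "v" is ignored and each
# dot-separated field is compared as a number.
version_at_least() {
  awk -v cur="${1#v}" -v min="${2#v}" 'BEGIN {
    nc = split(cur, c, ".")
    nm = split(min, m, ".")
    n = (nc > nm) ? nc : nm
    for (i = 1; i <= n; i++) {
      a = c[i] + 0                # missing fields count as 0
      b = m[i] + 0
      if (a > b) exit 0
      if (a < b) exit 1
    }
    exit 0
  }'
}

# Example use:
#   version_at_least v0.12.1 v0.11.0 && echo "eksctl anywhere version is new enough"
```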
    

4.4.2 - Curated Packages Troubleshooting

Troubleshooting specific to curated packages

General debugging

The major component of Curated Packages is the package controller. If the container is not running or not running correctly, packages will not be installed. Generally it should be debugged like any other Kubernetes application. The first step is to check that the pod is running.

kubectl get pods -n eksa-packages

You should see at least two pods: the package controller pod in the Running state and one or more refresher pods in the Completed state.

NAME                                     READY   STATUS      RESTARTS   AGE
eks-anywhere-packages-69d7bb9dd9-9d47l   1/1     Running     0          14s
eksa-auth-refresher-w82nm                0/1     Completed   0          10s

The describe command might help to get more detail on why there is a problem:

kubectl describe pods -n eksa-packages

Logs of the controller can be seen in a normal Kubernetes fashion:

kubectl logs deploy/eks-anywhere-packages -n eksa-packages controller

To get the general state of the package controller, run the following command:

kubectl get packages,packagebundles,packagebundlecontrollers -A

You should see an active packagebundlecontroller and an available bundle. The packagebundlecontroller should indicate the active bundle. It may take a few minutes to download and activate the latest bundle. In this example, both packages are in the installed state, and the packagebundlecontroller for sammy reports that an upgrade is available.

NAMESPACE              NAME                                          PACKAGE              AGE   STATE       CURRENTVERSION                                   TARGETVERSION                                             DETAIL
eksa-packages-sammy    package.packages.eks.amazonaws.com/my-hello   hello-eks-anywhere   42h   installed   0.1.1-bc7dc6bb874632972cd92a2bca429a846f7aa785   0.1.1-bc7dc6bb874632972cd92a2bca429a846f7aa785 (latest)   
eksa-packages-tlhowe   package.packages.eks.amazonaws.com/my-hello   hello-eks-anywhere   44h   installed   0.1.1-083e68edbbc62ca0228a5669e89e4d3da99ff73b   0.1.1-083e68edbbc62ca0228a5669e89e4d3da99ff73b (latest)   

NAMESPACE       NAME                                                 STATE
eksa-packages   packagebundle.packages.eks.amazonaws.com/v1-21-83    available
eksa-packages   packagebundle.packages.eks.amazonaws.com/v1-23-70    available
eksa-packages   packagebundle.packages.eks.amazonaws.com/v1-23-81    available
eksa-packages   packagebundle.packages.eks.amazonaws.com/v1-23-82    available
eksa-packages   packagebundle.packages.eks.amazonaws.com/v1-23-83    available

NAMESPACE       NAME                                                        ACTIVEBUNDLE   STATE               DETAIL
eksa-packages   packagebundlecontroller.packages.eks.amazonaws.com/sammy    v1-23-70       upgrade available   v1-23-83 available
eksa-packages   packagebundlecontroller.packages.eks.amazonaws.com/tlhowe   v1-21-83       active       active   

Package controller not running

If you do not see a pod or various resources for the package controller, it may be that it is not installed.

No resources found in eksa-packages namespace.

Most likely the cluster was created with an older version of the EKS Anywhere CLI. Curated packages became generally available with v0.11.0. Use the eksctl anywhere version command to verify that you are running a new enough release. If the cluster was created with an older release, use the eksctl anywhere install packagecontroller command to install the package controller.

Error: this command is currently not supported

Error: this command is currently not supported

Curated packages became generally available with version v0.11.0. Use the version command to make sure you are running version v0.11.0 or later:

eksctl anywhere version

Error: cert-manager is not present in the cluster

Error: curated packages cannot be installed as cert-manager is not present in the cluster

This is most likely caused by an attempt to install curated packages on a workload cluster with an eksctl anywhere version older than v0.12.0. To use packages on workload clusters, upgrade eksctl anywhere to v0.12 or later. The package manager will remotely manage packages on the workload cluster from the management cluster.

Package registry authentication

Error: ImagePullBackOff on Package

If a package fails to start with ImagePullBackOff:

NAME                                     READY   STATUS             RESTARTS   AGE
generated-harbor-jobservice-564d6fdc87   0/1     ImagePullBackOff   0          2d23h

If a package pod cannot pull images, you may not have your AWS credentials set up properly. Verify that your credentials are working properly.

Make sure you are authenticated with the AWS CLI. Use the credentials you set up for packages. These credentials should have limited capabilities:

export AWS_ACCESS_KEY_ID="your*access*id"
export AWS_SECRET_ACCESS_KEY="your*secret*key"
aws sts get-caller-identity

Login to docker

aws ecr get-login-password |docker login --username AWS --password-stdin 783794618700.dkr.ecr.us-west-2.amazonaws.com

Verify you can pull an image

docker pull 783794618700.dkr.ecr.us-west-2.amazonaws.com/emissary-ingress/emissary:v3.0.0-9ded128b4606165b41aca52271abe7fa44fa7109

If the image downloads successfully, it worked!

You may need to create or update your credentials which you can do with a command like this. Set the environment variables to the proper values before running the command.

kubectl delete secret -n eksa-packages aws-secret
kubectl create secret -n eksa-packages generic aws-secret --from-literal=AWS_ACCESS_KEY_ID=${EKSA_AWS_ACCESS_KEY_ID} --from-literal=AWS_SECRET_ACCESS_KEY=${EKSA_AWS_SECRET_ACCESS_KEY}  --from-literal=REGION=${EKSA_AWS_REGION}

If you recreate secrets, you can manually re-enable the cronjob and run the job to update the image pull secrets:

kubectl get cronjob -n eksa-packages cron-ecr-renew -o yaml | yq e '.spec.suspend |= false' - | kubectl apply -f -
kubectl create job -n eksa-packages --from=cronjob/cron-ecr-renew run-it-now

Warning: not able to trigger cron job

secret/aws-secret created
Warning: not able to trigger cron job, please be aware this will prevent the package controller from installing curated packages.

This is most likely caused by an action to install curated packages in a cluster that is running Kubernetes at version v1.20 or below. Note curated packages only support Kubernetes v1.21 and above.

Package on workload clusters

Starting with eksctl anywhere v0.12.0, packages on workload clusters are remotely managed by the management cluster. When you run the following commands against package resources for a workload cluster, make sure the kubeconfig points to the management cluster that was used to create the workload cluster.

Package manager is not managing packages on workload cluster

If the package manager is not managing packages on a workload cluster, make sure the management cluster has various resources for the workload cluster:

kubectl get packages,packagebundles,packagebundlecontrollers -A

You should see a PackageBundleController for the workload cluster named with the name of the workload cluster and the status should be set. There should be a namespace for the workload cluster as well:

kubectl get ns | grep eksa-packages

Create a PackageBundleController for the workload cluster if it does not exist (where billy is the cluster name):

cat <<! | kubectl apply -f -
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: PackageBundleController
metadata:
  name: billy
  namespace: eksa-packages
!
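The manifest above can be parameterized by cluster name. This is a sketch only: render_package_bundle_controller is a hypothetical helper name, and the manifest fields are copied from the example above rather than from any additional reference.

```shell
# Hypothetical helper: print a PackageBundleController manifest
# for the workload cluster named in $1.
render_package_bundle_controller() {
  cat <<EOF
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: PackageBundleController
metadata:
  name: $1
  namespace: eksa-packages
EOF
}

# Example use, with the kubeconfig pointing at the management cluster:
#   render_package_bundle_controller billy | kubectl apply -f -
```

Printing the manifest first makes it easy to review before piping it to kubectl.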

Workload cluster is disconnected

Cluster is disconnected:

NAMESPACE       NAME                                                        ACTIVEBUNDLE   STATE               DETAIL
eksa-packages   packagebundlecontroller.packages.eks.amazonaws.com/billy                   disconnected        initializing target client: getting kubeconfig for cluster "billy": Secret "billy-kubeconfig" not found

In the example above, the secret does not exist. This may mean that the management cluster is not managing the cluster, the PackageBundleController name is wrong, or the secret was deleted.

This may also happen if the management cluster cannot communicate with the workload cluster or the workload cluster was deleted, although the detail would be different.

Error: the server doesn’t have a resource type “packages”

All packages are remotely managed by the management cluster, and packages, packagebundles, and packagebundlecontrollers resources are all deployed on the management cluster. Please make sure the kubeconfig is pointing to the management cluster that was used to create the workload cluster while interacting with package-related resources.

Error: packagebundlecontrollers.packages.eks.amazonaws.com “clusterName” not found

This error occurs when a package command is run against a cluster that does not appear to be managed by the management cluster. To get a list of the clusters managed by the management cluster, run the following command:

eksctl anywhere get packagebundlecontroller
NAME     ACTIVEBUNDLE   STATE     DETAIL
billy    v1-21-87       active

There will be one packagebundlecontroller for each cluster that is being managed. The only valid cluster name in the above example is billy.
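To pull just the valid cluster names out of that listing, the table can be filtered with awk. This is a minimal sketch that assumes the output format shown above (a header row, then one row per controller with the name in the first column); list_managed_clusters is a hypothetical helper name.

```shell
# Hypothetical helper: read the table on stdin and print one
# cluster name per line, skipping the header and blank lines.
list_managed_clusters() {
  awk 'NR > 1 && NF > 0 { print $1 }'
}

# Example use:
#   eksctl anywhere get packagebundlecontroller | list_managed_clusters
```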

4.4.3 - Cert-Manager

Install/update/upgrade/uninstall Cert-Manager

If you have not already done so, make sure your cluster meets the package prerequisites. Be sure to refer to the troubleshooting guide in the event of a problem.

Install on workload cluster

NOTE: The cert-manager package can only be installed on a workload cluster

  1. Generate the package configuration

    eksctl anywhere generate package cert-manager --cluster <cluster-name> > cert-manager.yaml
    
  2. Add the desired configuration to cert-manager.yaml

    Please see complete configuration options for all configuration options and their default values.

    Example package file configuring a cert-manager package to run on a workload cluster.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
      name: my-cert-manager
      namespace: eksa-packages-<cluster-name>
    spec:
      packageName: cert-manager
      targetNamespace: <namespace-to-install-component>
    
  3. Install Cert-Manager

    eksctl anywhere create packages -f cert-manager.yaml
    
  4. Validate the installation

    eksctl anywhere get packages --cluster <cluster-name>
    

    Example command output

    NAME                          PACKAGE              AGE   STATE       CURRENTVERSION                                               TARGETVERSION                                                         DETAIL
    my-cert-manager               cert-manager         15s   installed   1.9.1-dc0c845b5f71bea6869efccd3ca3f2dd11b5c95f               1.9.1-dc0c845b5f71bea6869efccd3ca3f2dd11b5c95f (latest)
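When scripting this validation step, the STATE column can be extracted from the table. The sketch below assumes the column headers shown in the example output (NAME and STATE, with or without a leading NAMESPACE column); package_state is a hypothetical helper name, not an eksctl subcommand.

```shell
# Hypothetical helper: read the "get packages" table on stdin and
# print the STATE of the package named in $1.
package_state() {
  awk -v pkg="$1" '
    NR == 1 {
      # Locate the NAME and STATE columns from the header row.
      for (i = 1; i <= NF; i++) {
        if ($i == "NAME")  n = i
        if ($i == "STATE") s = i
      }
      next
    }
    $n == pkg { print $s }
  '
}

# Example use:
#   eksctl anywhere get packages --cluster <cluster-name> | package_state my-cert-manager
```

Looking the columns up by header name lets the same helper handle output that includes the NAMESPACE column.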
    

Update

To update package configuration, update cert-manager.yaml file, and run the following command:

eksctl anywhere apply package -f cert-manager.yaml

Upgrade

Cert-Manager will automatically be upgraded when a new bundle is activated.

Uninstall

To uninstall cert-manager, simply delete the package

eksctl anywhere delete package --cluster <cluster-name> my-cert-manager

4.4.4 - Cluster Autoscaler

Install/upgrade/uninstall Cluster Autoscaler

If you have not already done so, make sure your cluster meets the package prerequisites. Be sure to refer to the troubleshooting guide in the event of a problem.

Choose a Deployment Approach

Each Cluster Autoscaler instance can target one cluster for autoscaling.

There are three ways to deploy a Cluster Autoscaler instance:

  1. Cluster Autoscaler deployed in the management cluster to autoscale the management cluster itself
  2. Cluster Autoscaler deployed in the management cluster to autoscale a remote workload cluster
  3. Cluster Autoscaler deployed in the workload cluster to autoscale the workload cluster itself

To read more about the tradeoffs of these different approaches, see here.

Install Cluster Autoscaler in management cluster

  1. Ensure you have configured at least one WorkerNodeGroup in your cluster to support autoscaling as outlined here

  2. Generate the package configuration

    eksctl anywhere generate package cluster-autoscaler --cluster <cluster-name> > cluster-autoscaler.yaml
    
  3. Add the desired configuration to cluster-autoscaler.yaml

    Please see complete configuration options for all configuration options and their default values.

    Example package file configuring a cluster autoscaler package to run in the management cluster.

    Note: Here, the <cluster-name> value represents the name of the management or workload cluster you would like to autoscale.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
      name: cluster-autoscaler
      namespace: eksa-packages-<cluster-name>
    spec:
      packageName: cluster-autoscaler
      targetNamespace: <namespace-to-install-component>
      config: |-
          cloudProvider: "clusterapi"
          autoDiscovery:
            clusterName: "<cluster-name>"      
    
  4. Install Cluster Autoscaler

    eksctl anywhere create packages -f cluster-autoscaler.yaml
    
  5. Validate the installation

    eksctl anywhere get packages --cluster <cluster-name>
    

    Example command output

    NAMESPACE                  NAME                          PACKAGE              AGE   STATE       CURRENTVERSION                                               TARGETVERSION                                                         DETAIL
    eksa-packages-mgmt-v-vmc   cluster-autoscaler            cluster-autoscaler   18h   installed   9.21.0-1.21-147e2a701f6ab625452fe311d5c94a167270f365         9.21.0-1.21-147e2a701f6ab625452fe311d5c94a167270f365 (latest)
    

Update

To update package configuration, update cluster-autoscaler.yaml file, and run the following command:

eksctl anywhere apply package -f cluster-autoscaler.yaml

Upgrade

Cluster Autoscaler will automatically be upgraded when a new bundle is activated.

Uninstall

To uninstall Cluster Autoscaler, simply delete the package

eksctl anywhere delete package --cluster <cluster-name> cluster-autoscaler

Install Cluster Autoscaler in workload cluster

A few extra steps are required to install cluster autoscaler in a workload cluster instead of the management cluster.

First, retrieve the management cluster’s kubeconfig secret:

kubectl -n eksa-system get secrets <management-cluster-name>-kubeconfig -o yaml > mgmt-secret.yaml

Update the secret’s namespace to the namespace in the workload cluster that you would like to deploy the cluster autoscaler to. Then, apply the secret to the workload cluster.

kubectl --kubeconfig /path/to/workload/kubeconfig apply -f mgmt-secret.yaml
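The namespace change described above, which happens before the secret is applied, could be scripted rather than edited by hand. This is a sketch under assumptions: retarget_namespace is a hypothetical helper name, and it relies on the exported YAML having a metadata namespace line indented by exactly two spaces. It changes only that line and leaves every other field as exported.

```shell
# Hypothetical helper: rewrite the two-space-indented "namespace:" line
# of a manifest read from stdin to the namespace given in $1.
retarget_namespace() {
  sed "s/^  namespace: .*/  namespace: $1/"
}

# Example use (the target namespace is a placeholder you choose):
#   retarget_namespace <target-namespace> < mgmt-secret.yaml > mgmt-secret-workload.yaml
```

Writing to a second file keeps the original export intact for comparison.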

Now apply this package configuration to the management cluster:

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
    name: workload-cluster-autoscaler
    namespace: eksa-packages-<workload-cluster-name>
spec:
    packageName: cluster-autoscaler
    targetNamespace: <workload-cluster-namespace-to-install-components>
    config: |-
        cloudProvider: "clusterapi"
        autoDiscovery:
            clusterName: "<workload-cluster-name>"
        clusterAPIMode: "incluster-kubeconfig"
        clusterAPICloudConfigPath: "/etc/kubernetes/value"
        extraVolumeSecrets:
            cluster-autoscaler-cloud-config:
                mountPath: "/etc/kubernetes"
                name: "<management-cluster-name>-kubeconfig"        

4.4.5 - Metrics Server

Install/upgrade/uninstall Metrics Server

If you have not already done so, make sure your cluster meets the package prerequisites. Be sure to refer to the troubleshooting guide in the event of a problem.

Install

  1. Generate the package configuration

    eksctl anywhere generate package metrics-server --cluster <cluster-name> > metrics-server.yaml
    
  2. Add the desired configuration to metrics-server.yaml

    Please see complete configuration options for all configuration options and their default values.

    Example package file configuring a metrics server package to run on a management cluster.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
      name: metrics-server
      namespace: eksa-packages-<cluster-name>
    spec:
      packageName: metrics-server
      targetNamespace: <namespace-to-install-component>
      config: |-
        args:
          - "--kubelet-insecure-tls"    
    
  3. Install Metrics Server

    eksctl anywhere create packages -f metrics-server.yaml
    
  4. Validate the installation

    eksctl anywhere get packages --cluster <cluster-name>
    

    Example command output

    NAME                   PACKAGE              AGE   STATE        CURRENTVERSION                                                     TARGETVERSION                                                               DETAIL
    metrics-server         metrics-server       8h    installed    0.6.1-eks-1-23-6-b4c2524fabb3dd4c5f9b9070a418d740d3e1a8a2          0.6.1-eks-1-23-6-b4c2524fabb3dd4c5f9b9070a418d740d3e1a8a2 (latest)
    

Update

To update package configuration, update metrics-server.yaml file, and run the following command:

eksctl anywhere apply package -f metrics-server.yaml

Upgrade

Metrics Server will automatically be upgraded when a new bundle is activated.

Uninstall

To uninstall Metrics Server, simply delete the package

eksctl anywhere delete package --cluster <cluster-name> metrics-server

4.4.6 - AWS Distro for OpenTelemetry (ADOT)

Install/upgrade/uninstall ADOT

If you have not already done so, make sure your cluster meets the package prerequisites. Be sure to refer to the troubleshooting guide in the event of a problem.

Install

  1. Generate the package configuration

    eksctl anywhere generate package adot --cluster <cluster-name> > adot.yaml
    
  2. Add the desired configuration to adot.yaml

    Please see complete configuration options for all configuration options and their default values.

    Example package file with daemonSet mode and default configuration:

     apiVersion: packages.eks.amazonaws.com/v1alpha1
     kind: Package
     metadata:
       name: my-adot
       namespace: eksa-packages-<cluster-name>
     spec:
       packageName: adot
       targetNamespace: observability
       config: | 
         mode: daemonset
    

    Example package file with deployment mode and customized collector components to scrape the ADOT collector’s own metrics:

     apiVersion: packages.eks.amazonaws.com/v1alpha1
     kind: Package
     metadata:
       name: my-adot
       namespace: eksa-packages-<cluster-name>
     spec:
       packageName: adot
       targetNamespace: observability
       config: | 
         mode: deployment
         replicaCount: 2
         config:
           receivers:
             prometheus:
               config:
                 scrape_configs:
                   - job_name: opentelemetry-collector
                     scrape_interval: 10s
                     static_configs:
                       - targets:
                           - ${MY_POD_IP}:8888
           processors:
             batch: {}
             memory_limiter: null
           exporters:
             logging:
               loglevel: debug
             prometheusremotewrite:
               endpoint: "<prometheus-remote-write-end-point>"
           extensions:
             health_check: {}
             memory_ballast: {}
           service:
             pipelines:
               metrics:
                 receivers: [prometheus]
                 processors: [batch]
                 exporters: [logging, prometheusremotewrite]
             telemetry:
               metrics:
                 address: 0.0.0.0:8888
    
  3. Create the namespace (If overriding targetNamespace, change observability to the value of targetNamespace)

    kubectl create namespace observability
    
  4. Install adot

    eksctl anywhere create packages -f adot.yaml
    
  5. Validate the installation

    eksctl anywhere get packages --cluster <cluster-name>
    

    Example command output

    NAME      PACKAGE   AGE   STATE       CURRENTVERSION                                    TARGETVERSION                                              DETAIL
    my-adot   adot      19h   installed   0.25.0-c26690f90d38811dbb0e3dad5aea77d1efa52c7b   0.25.0-c26690f90d38811dbb0e3dad5aea77d1efa52c7b (latest)
    

Update

To update package configuration, update adot.yaml file, and run the following command:

eksctl anywhere apply package -f adot.yaml

Upgrade

ADOT will automatically be upgraded when a new bundle is activated.

Uninstall

To uninstall ADOT, simply delete the package

eksctl anywhere delete package --cluster <cluster-name> my-adot

4.4.7 - Prometheus

Install/upgrade/uninstall Prometheus

If you have not already done so, make sure your cluster meets the package prerequisites. Be sure to refer to the troubleshooting guide in the event of a problem.

Install

  1. Generate the package configuration

    eksctl anywhere generate package prometheus --cluster <cluster-name> > prometheus.yaml
    
  2. Add the desired configuration to prometheus.yaml

    Please see complete configuration options for all configuration options and their default values.

    Example package file with default configuration, which enables prometheus-server and node-exporter:

     apiVersion: packages.eks.amazonaws.com/v1alpha1
     kind: Package
     metadata:
       name: generated-prometheus
       namespace: eksa-packages-<cluster-name>
     spec:
       packageName: prometheus
    

    Example package file with prometheus-server (or node-exporter) disabled:

     apiVersion: packages.eks.amazonaws.com/v1alpha1
     kind: Package
     metadata:
       name: generated-prometheus
       namespace: eksa-packages-<cluster-name>
     spec:
       packageName: prometheus
       config: |
         # disable prometheus-server
         server:
           enabled: false
         # or disable node-exporter
         # nodeExporter:
         #   enabled: false     
    

    Example package file with prometheus-server deployed as a statefulSet with replicaCount 2 and a scrape config that collects only prometheus-server’s own metrics:

     apiVersion: packages.eks.amazonaws.com/v1alpha1
     kind: Package
     metadata:
       name: generated-prometheus
       namespace: eksa-packages-<cluster-name>
     spec:
       packageName: prometheus
       targetNamespace: observability
       config: |
         server:
           replicaCount: 2
           statefulSet:
             enabled: true
         serverFiles:
           prometheus.yml:
             scrape_configs:
               - job_name: prometheus
                 static_configs:
                   - targets:
                     - localhost:9090     
    
  3. Create the namespace (If overriding targetNamespace, change observability to the value of targetNamespace)

    kubectl create namespace observability
    
  4. Install prometheus

    eksctl anywhere create packages -f prometheus.yaml
    
  5. Validate the installation

    eksctl anywhere get packages --cluster <cluster-name>
    

    Example command output

    NAMESPACE                      NAME                   PACKAGE      AGE   STATE       CURRENTVERSION                                    TARGETVERSION                                              DETAIL
    eksa-packages-<cluster-name>   generated-prometheus   prometheus   17m   installed   2.41.0-b53c8be243a6cc3ac2553de24ab9f726d9b851ca   2.41.0-b53c8be243a6cc3ac2553de24ab9f726d9b851ca (latest)
    

Update

To update package configuration, update prometheus.yaml file, and run the following command:

eksctl anywhere apply package -f prometheus.yaml

Upgrade

Prometheus will automatically be upgraded when a new bundle is activated.

Uninstall

To uninstall Prometheus, simply delete the package

eksctl anywhere delete package --cluster <cluster-name> generated-prometheus

4.4.8 - Emissary Ingress

Install/upgrade/uninstall Emissary Ingress

If you have not already done so, make sure your cluster meets the package prerequisites. Be sure to refer to the troubleshooting guide in the event of a problem.

Install

  1. Generate the package configuration

    eksctl anywhere generate package emissary --cluster <cluster-name> > emissary.yaml
    
  2. Add the desired configuration to emissary.yaml

    Please see complete configuration options for all configuration options and their default values.

    Example package file with standard configuration.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
      name: emissary
      namespace: eksa-packages-<cluster-name>
    spec:
      packageName: emissary
    
  3. Install Emissary

    eksctl anywhere create packages -f emissary.yaml
    
  4. Validate the installation

    eksctl anywhere get packages --cluster <cluster-name>
    

    Example command output

    NAMESPACE     NAME       PACKAGE    AGE     STATE       CURRENTVERSION                                   TARGETVERSION                                              DETAIL
    eksa-packages emissary   emissary   2m57s   installed   3.0.0-a507e09c2a92c83d65737835f6bac03b9b341467   3.0.0-a507e09c2a92c83d65737835f6bac03b9b341467 (latest)
    

Update

To update package configuration, update emissary.yaml file, and run the following command:

eksctl anywhere apply package -f emissary.yaml

Upgrade

Emissary will automatically be upgraded when a new bundle is activated.

Uninstall

To uninstall Emissary, simply delete the package

eksctl anywhere delete package --cluster <cluster-name> emissary

4.4.9 - Harbor

Install/upgrade/uninstall Harbor

If you have not already done so, make sure your cluster meets the package prerequisites. Be sure to refer to the troubleshooting guide in the event of a problem.

Install

  1. Generate the package configuration

    eksctl anywhere generate package harbor --cluster <cluster-name> > harbor.yaml
    
  2. Add the desired configuration to harbor.yaml

    Please see complete configuration options for all configuration options and their default values.

    TLS example with auto certificate generation

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
       name: my-harbor
       namespace: eksa-packages-<cluster-name>
    spec:
       packageName: harbor
       config: |-
          secretKey: "use-a-secret-key"
          externalURL: https://harbor.eksa.demo:30003
          expose:
             tls:
                certSource: auto
                auto:
                   commonName: "harbor.eksa.demo"      
    

    Non-TLS example

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
       name: my-harbor
       namespace: eksa-packages-<cluster-name>
    spec:
       packageName: harbor
       config: |-
          secretKey: "use-a-secret-key"
          externalURL: http://harbor.eksa.demo:30002
          expose:
             tls:
                enabled: false      
    
  3. Install Harbor

    eksctl anywhere create packages -f harbor.yaml
    
  4. Check Harbor

    eksctl anywhere get packages --cluster <cluster-name>
    

    Example command output

    NAME        PACKAGE   AGE     STATE       CURRENTVERSION             TARGETVERSION        DETAIL
    my-harbor   harbor    5m34s   installed   v2.5.1                     v2.5.1 (latest)
    

    Harbor web portal is accessible at whatever externalURL is set to. See complete configuration options for all default values.

    Harbor web portal

Update

To update package configuration, update harbor.yaml file, and run the following command:

eksctl anywhere apply package -f harbor.yaml

Upgrade

  1. Verify a new bundle is available

    eksctl anywhere get packagebundle
    

    Example command output

    NAME         VERSION   STATE
    v1.25-120    1.25      active (upgrade available)
    v1.26-120    1.26      inactive
    
  2. Upgrade Harbor

    eksctl anywhere upgrade packages --bundle-version v1.26-120
    
  3. Check Harbor

    eksctl anywhere get packages --cluster <cluster-name>
    

    Example command output

    NAME        PACKAGE   AGE     STATE       CURRENTVERSION             TARGETVERSION        DETAIL
    my-harbor   harbor    14m     installed   v2.7.1                     v2.7.1 (latest)
    

Uninstall

  1. Uninstall Harbor

    eksctl anywhere delete package --cluster <cluster-name> my-harbor
    

4.4.10 - MetalLB

Install/upgrade/uninstall MetalLB

If you have not already done so, make sure your cluster meets the package prerequisites. Be sure to refer to the troubleshooting guide in the event of a problem.

Install

  1. Generate the package configuration

    eksctl anywhere generate package metallb --cluster <cluster-name> > metallb.yaml
    
  2. Add the desired configuration to metallb.yaml

    Please see complete configuration options for all configuration options and their default values.

    Example package file with BGP configuration:

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
      name: mylb
      namespace: eksa-packages-<cluster-name>
    spec:
      packageName: metallb
      config: |
        IPAddressPools:
          - name: default
            addresses:
              - 10.220.0.93/32
              - 10.220.0.97-10.220.0.120
        BGPAdvertisements:
          - ipAddressPools:
            - default
        BGPPeers:
          - peerAddress: 10.220.0.2
            peerASN: 65000
            myASN: 65002    
    

    Example package file with ARP configuration:

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
      name: mylb
      namespace: eksa-packages-<cluster-name>
    spec:
      packageName: metallb
      config: |
        IPAddressPools:
          - name: default
            addresses:
              - 10.220.0.93/32
              - 10.220.0.97-10.220.0.120
        L2Advertisements:
          - ipAddressPools:
            - default    
    
  3. Create the namespace (If overriding targetNamespace, change metallb-system to the value of targetNamespace)

    kubectl create namespace metallb-system
    
  4. Install MetalLB

    eksctl anywhere create packages -f metallb.yaml
    
  5. Validate the installation

    eksctl anywhere get packages --cluster <cluster-name>
    

    Example command output

    NAME   PACKAGE   AGE   STATE       CURRENTVERSION                                    TARGETVERSION                                              DETAIL
    mylb   metallb   22h   installed   0.13.5-ce5b5de19014202cebd4ab4c091830a3b6dfea06   0.13.5-ce5b5de19014202cebd4ab4c091830a3b6dfea06 (latest)
    

Update

To update package configuration, update metallb.yaml file, and run the following command:

eksctl anywhere apply package -f metallb.yaml

Upgrade

MetalLB will automatically be upgraded when a new bundle is activated.

Uninstall

To uninstall MetalLB, simply delete the package

eksctl anywhere delete package --cluster <cluster-name> mylb

5 - Reference

Reference documents for EKS Anywhere configuration

5.1 - Config

Config reference for EKS Anywhere clusters

5.1.1 - Bare metal configuration

Full EKS Anywhere configuration reference for a Bare Metal cluster.

This is a generic template with detailed descriptions below for reference. The following additional optional configuration can also be included:

To generate your own cluster configuration, follow the instructions in the Bare Metal Create production cluster section and modify it using the descriptions below. For information on how to add cluster configuration settings to this file for advanced node configuration, see Advanced Bare Metal cluster configuration.

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  clusterNetwork:
    cniConfig:
      cilium: {}
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneConfiguration:              
    count: 1
    endpoint:
      host: "<Control Plane Endpoint IP>"
    machineGroupRef:
      kind: TinkerbellMachineConfig
      name: my-cluster-name-cp
  datacenterRef:
    kind: TinkerbellDatacenterConfig
    name: my-cluster-name
  kubernetesVersion: "1.25"
  managementCluster:
    name: my-cluster-name
  workerNodeGroupConfigurations:
  - count: 1
    machineGroupRef:
      kind: TinkerbellMachineConfig
      name: my-cluster-name
    name: md-0

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellDatacenterConfig
metadata:
  name: my-cluster-name
spec:
  tinkerbellIP: "<Tinkerbell IP>"

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
  name: my-cluster-name-cp
spec:
  hardwareSelector: {}
  osFamily: bottlerocket
  templateRef: {}
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-rsa AAAAB3NzaC1yc2... jwjones@833efcab1482.home.example.com

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
  name: my-cluster-name
spec:
  hardwareSelector: {}
  osFamily: bottlerocket
  templateRef:
    kind: TinkerbellTemplateConfig
    name: my-cluster-name
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-rsa AAAAB3NzaC1yc2... jwjones@833efcab1482.home.example.com

Cluster Fields

name (required)

Name of your cluster (my-cluster-name in this example).

clusterNetwork (required)

Specific network configuration for your Kubernetes cluster.

clusterNetwork.cniConfig (required)

CNI plugin to be installed in the cluster. The only supported value at the moment is cilium.

clusterNetwork.pods.cidrBlocks[0] (required)

Subnet used by pods in CIDR notation. Please note that only 1 custom pods CIDR block specification is permitted. This CIDR block should not conflict with the clusterNetwork.services.cidrBlocks and network subnet range selected for the machines.

clusterNetwork.services.cidrBlocks[0] (required)

Subnet used by services in CIDR notation. Please note that only 1 custom services CIDR block specification is permitted. This CIDR block should not conflict with the clusterNetwork.pods.cidrBlocks and network subnet range selected for the machines.
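The no-conflict requirement for the pods and services CIDR blocks can be checked with integer arithmetic. The following is a minimal IPv4-only sketch; ip_to_int, cidr_range, and cidrs_overlap are hypothetical helper names and are not part of EKS Anywhere. It assumes addresses are written without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
  echo "$1" | {
    IFS=. read -r a b c d
    echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
  }
}

# Print the first and last address (as integers) of a CIDR block.
cidr_range() {
  ip=${1%/*}
  prefix=${1#*/}
  base=$(ip_to_int "$ip")
  size=$(( 1 << (32 - prefix) ))
  start=$(( base / size * size ))   # align to the block boundary
  echo "$start $(( start + size - 1 ))"
}

# Succeed when the two CIDR blocks share at least one address.
cidrs_overlap() {
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Example use with the values from the template above:
#   cidrs_overlap 192.168.0.0/16 10.96.0.0/12 && echo "conflict"
```

The same check can be repeated against the subnet used by your machines, which the field descriptions also require to be distinct.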

clusterNetwork.dns.resolvConf.path (optional)

Path to the file with a custom DNS resolver configuration.

controlPlaneConfiguration (required)

Specific control plane configuration for your Kubernetes cluster.

controlPlaneConfiguration.count (required)

Number of control plane nodes. This number needs to be odd to maintain etcd quorum.
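The reason for an odd count follows from how quorum works: a cluster of n etcd members needs a majority (n/2 + 1, rounded down) to stay available, so adding a fourth member to three does not let the cluster survive any more failures. The sketch below illustrates this, assuming one etcd member per control plane node; the function names are hypothetical, and the requirement that the count be greater than zero is an added assumption.

```shell
# Succeed when the count is a positive odd number.
is_valid_control_plane_count() {
  [ "$1" -gt 0 ] && [ $(( $1 % 2 )) -eq 1 ]
}

# Print how many members can fail while a majority remains.
etcd_fault_tolerance() {
  quorum=$(( $1 / 2 + 1 ))
  echo $(( $1 - quorum ))
}

# Example use:
#   is_valid_control_plane_count 3 && etcd_fault_tolerance 3
```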

controlPlaneConfiguration.endpoint.host (required)

A unique IP you want to use for the control plane in your EKS Anywhere cluster. Choose an IP in your network range that does not conflict with other machines.

NOTE: This IP should be outside the network DHCP range as it is a floating IP that gets assigned to one of the control plane nodes for kube-apiserver load balancing.

controlPlaneConfiguration.machineGroupRef (required)

Refers to the Kubernetes object with Tinkerbell-specific configuration for your nodes. See TinkerbellMachineConfig Fields below.

controlPlaneConfiguration.taints

A list of taints to apply to the control plane nodes of the cluster.

Replaces the default control plane taint (node-role.kubernetes.io/master for Kubernetes versions prior to 1.24, node-role.kubernetes.io/control-plane for versions 1.24 and later). The default control plane components will tolerate the provided taints.

Modifying the taints associated with the control plane configuration will cause new nodes to be rolled out, replacing the existing nodes.

NOTE: The taints provided will be used instead of the default control plane taint. Any pods that you run on the control plane nodes must tolerate the taints you provide in the control plane configuration.

controlPlaneConfiguration.labels

A list of labels to apply to the control plane nodes of the cluster. This is in addition to the labels that EKS Anywhere will add by default.

Modifying the labels associated with the control plane configuration will cause new nodes to be rolled out, replacing the existing nodes.

datacenterRef

Refers to the Kubernetes object with Tinkerbell-specific configuration. See TinkerbellDatacenterConfig Fields below.

kubernetesVersion (required)

The Kubernetes version you want to use for your cluster. Supported values: 1.25, 1.24, 1.23, 1.22, 1.21

managementCluster

Identifies the name of the management cluster. If this is a standalone cluster or if it serves as the management cluster for other workload clusters, this will be the same as the cluster name. Bare Metal EKS Anywhere clusters do not yet support the creation of separate workload clusters.

workerNodeGroupConfigurations

This takes in a list of node groups that you can define for your workers.

You can omit workerNodeGroupConfigurations when creating Bare Metal clusters. In this case, control plane nodes will not be tainted and all pods will run on the control plane nodes. This mechanism can be used to deploy Bare Metal clusters on a single server.

NOTE: Empty workerNodeGroupConfigurations is not supported when Kubernetes version <= 1.21.

workerNodeGroupConfigurations.count

Number of worker nodes. Optional if autoscalingConfiguration is used, in which case count will default to autoscalingConfiguration.minCount.

workerNodeGroupConfigurations.machineGroupRef (required)

Refers to the Kubernetes object with Tinkerbell-specific configuration for your nodes. See TinkerbellMachineConfig Fields below.

workerNodeGroupConfigurations.name (required)

Name of the worker node group (default: md-0)

workerNodeGroupConfigurations.autoscalingConfiguration.minCount

Minimum number of nodes for this node group’s autoscaling configuration.

workerNodeGroupConfigurations.autoscalingConfiguration.maxCount

Maximum number of nodes for this node group’s autoscaling configuration.

workerNodeGroupConfigurations.taints

A list of taints to apply to the nodes in the worker node group.

Modifying the taints associated with a worker node group configuration will cause new nodes to be rolled out, replacing the existing nodes associated with the configuration.

At least one node group must not have NoSchedule or NoExecute taints applied to it.

workerNodeGroupConfigurations.labels

A list of labels to apply to the nodes in the worker node group. This is in addition to the labels that EKS Anywhere will add by default.

Modifying the labels associated with a worker node group configuration will cause new nodes to be rolled out, replacing the existing nodes associated with the configuration.

TinkerbellDatacenterConfig Fields

tinkerbellIP

Required field to identify the IP address of the Tinkerbell service. This IP address must be a unique IP in the network range that does not conflict with other IPs. Once the Tinkerbell services move from the Admin machine to run on the target cluster, this IP address makes it possible for the stack to be used for future provisioning needs. When separate management and workload clusters are supported in Bare Metal, the IP address becomes a necessity.

osImageURL

Optional field to replace the default Bottlerocket operating system. EKS Anywhere can only auto-import Bottlerocket. To use Ubuntu or Red Hat instead, see building baremetal node images for details on building and using such an image with an EKS Anywhere cluster. This field is also useful if you want to provide a customized operating system image or simply host the standard image locally.

hookImagesURLPath

Optional field to replace the HookOS image. This field is useful if you want to provide a customized HookOS image or simply host the standard image locally. See Artifacts for details.

Example TinkerbellDatacenterConfig.spec

spec:
  tinkerbellIP: "192.168.0.10"                                          # Available, routable IP
  osImageURL: "http://my-web-server/ubuntu-v1.23.7-eks-a-12-amd64.gz"   # Full URL to the OS Image hosted locally
  hookImagesURLPath: "http://my-web-server/hook"                        # Path to the hook images. This path must contain vmlinuz-x86_64 and initramfs-x86_64 

This is the folder structure for my-web-server:

my-web-server
├── hook
│   ├── initramfs-x86_64
│   └── vmlinuz-x86_64
└── ubuntu-v1.23.7-eks-a-12-amd64.gz

skipLoadBalancerDeployment

Optional field to skip deploying the default load balancer for the Tinkerbell stack.

EKS Anywhere for Bare Metal uses the kube-vip load balancer by default to expose the Tinkerbell stack externally. You can disable this feature by setting this field to true.

NOTE: If you skip load balancer deployment, you will have to ensure that the Tinkerbell stack is available at tinkerbellIP once the cluster creation is finished. One way to achieve this is by using the MetalLB package.
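
For illustration, a TinkerbellDatacenterConfig spec that disables the default load balancer might look like this sketch. The IP address is reused from the example above; making the Tinkerbell stack reachable at that address is then up to you.

```yaml
spec:
  tinkerbellIP: "192.168.0.10"        # Must still be reachable after cluster creation
  skipLoadBalancerDeployment: true    # kube-vip is not deployed for the Tinkerbell stack
```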

TinkerbellMachineConfig Fields

In the example, there are TinkerbellMachineConfig sections for control plane (my-cluster-name-cp) and worker (my-cluster-name) machine groups. The following fields identify information needed to configure the nodes in each of those groups.

NOTE: Currently, you can only have one machine group for all machines in the control plane, although you can have multiple machine groups for the workers.

hardwareSelector

Use fields under hardwareSelector to add key/value pair labels to match particular machines that you identified in the CSV file where you defined the machines in your cluster. Choose any label name you like. For example, if you had added the label node=cp-machine to the machines listed in your CSV file that you want to be control plane nodes, the following hardwareSelector field would cause those machines to be added to the control plane:

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
  name: my-cluster-name-cp
spec:
  hardwareSelector:
    node: "cp-machine"

osFamily (required)

Operating system on the machine. For example, bottlerocket or ubuntu.

templateRef (optional)

Identifies the template that defines the actions that will be applied to the TinkerbellMachineConfig. See TinkerbellTemplateConfig fields below. EKS Anywhere will generate default templates based on osFamily during the create command. You can override this default template by providing your own template here.
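
As a sketch, a TinkerbellMachineConfig that points at a custom template could look like the following. It assumes templateRef takes kind and name keys in the same way machineGroupRef and datacenterRef do, and that a TinkerbellTemplateConfig named my-cluster-name exists in the same configuration file.

```yaml
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
  name: my-cluster-name
spec:
  hardwareSelector:
    node: "worker-machine"            # Hypothetical label from your hardware CSV file
  osFamily: ubuntu
  templateRef:
    kind: TinkerbellTemplateConfig
    name: my-cluster-name             # Must match metadata.name of your template
```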

users

The name of the user you want to configure to access your machines through SSH.

The default is ec2-user. Currently, only one user is supported.

users[0].sshAuthorizedKeys (optional)

The SSH public keys you want to configure to access your machines through SSH (as described below). Only 1 is supported at this time.

users[0].sshAuthorizedKeys[0] (optional)

This is the SSH public key that will be placed in authorized_keys on all EKS Anywhere cluster machines so you can SSH into them. The user will be what is defined under name above. For example:

ssh -i <private-key-file> <user>@<machine-IP>

If you do not specify a value, EKS Anywhere generates a key in your $(pwd)/<cluster-name> folder.
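
Putting the users fields together, a TinkerbellMachineConfig spec fragment might look like this sketch. The key value is a placeholder for your own public key.

```yaml
spec:
  osFamily: bottlerocket
  users:
  - name: ec2-user                    # Only one user is supported
    sshAuthorizedKeys:
    - "<your-SSH-public-key>"         # Placeholder; only one key is supported
```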

Advanced Bare Metal cluster configuration

When you generate a Bare Metal cluster configuration, the TinkerbellTemplateConfig is kept internally and not shown in the generated configuration file. TinkerbellTemplateConfig settings define the actions done to install each node, such as getting installation media, configuring networking, adding users, and otherwise configuring the node.

Advanced users can override the default values set for TinkerbellTemplateConfig. They can also add their own Tinkerbell actions to make personalized modifications to EKS Anywhere nodes.

The following shows two TinkerbellTemplateConfig examples that you can add to your cluster configuration file to override the values that EKS Anywhere sets: one for Ubuntu and one for Bottlerocket. Most actions used differ for different operating systems.

NOTE: For the stream-image action, DEST_DISK points to the device representing the entire hard disk (for example, /dev/sda). For UEFI-enabled images, such as Ubuntu, write actions use DEST_DISK to point to the second partition (for example, /dev/sda2), with the first being the EFI partition. For the Bottlerocket image, which has 12 partitions, DEST_DISK is partition 12 (for example, /dev/sda12). Device names will be different for different disk types.

Ubuntu TinkerbellTemplateConfig example

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellTemplateConfig
metadata:
  name: my-cluster-name
spec:
  template:
    global_timeout: 6000
    id: ""
    name: my-cluster-name
    tasks:
    - actions:
      - environment:
          COMPRESSED: "true"
          DEST_DISK: /dev/sda
          IMG_URL: https://my-file-server/ubuntu-v1.23.7-eks-a-12-amd64.gz
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/image2disk:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: stream-image
        timeout: 360
      - environment:
          DEST_DISK: /dev/sda2
          DEST_PATH: /etc/netplan/config.yaml
          STATIC_NETPLAN: "true"
          DIRMODE: "0755"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: write-netplan
        timeout: 90
      - environment:
          CONTENTS: |
            datasource:
              Ec2:
                metadata_urls: [<admin-machine-ip>, <tinkerbell-ip-from-cluster-config>]
                strict_id: false
            manage_etc_hosts: localhost
            warnings:
              dsid_missing_source: off            
          DEST_DISK: /dev/sda2
          DEST_PATH: /etc/cloud/cloud.cfg.d/10_tinkerbell.cfg
          DIRMODE: "0700"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0600"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: add-tink-cloud-init-config
        timeout: 90
      - environment:
          CONTENTS: |
            network:
              config: disabled            
          DEST_DISK: /dev/sda2
          DEST_PATH: /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg
          DIRMODE: "0700"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0600"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: disable-cloud-init-network-capabilities
        timeout: 90
      - environment:
          CONTENTS: | 
            datasource: Ec2
          DEST_DISK: /dev/sda2
          DEST_PATH: /etc/cloud/ds-identify.cfg
          DIRMODE: "0700"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0600"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: add-tink-cloud-init-ds-config
        timeout: 90
      - environment:
          BLOCK_DEVICE: /dev/sda2
          FS_TYPE: ext4
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/kexec:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: kexec-image
        pid: host
        timeout: 90
      name: my-cluster-name
      volumes:
      - /dev:/dev
      - /dev/console:/dev/console
      - /lib/firmware:/lib/firmware:ro
      worker: '{{.device_1}}'
    version: "0.1"

Bottlerocket TinkerbellTemplateConfig example

Pay special attention to the BOOTCONFIG_CONTENTS environment section below if you wish to set up console redirection for the kernel and systemd. If you are only using a direct attached monitor as your primary display device, no additional configuration is needed here. However, if you need all boot output to be shown via a server’s serial console for example, extra configuration should be provided inside BOOTCONFIG_CONTENTS.

An empty kernel {} key is provided below in the example; inside this key is where you will specify your console devices. You may specify multiple comma delimited console devices in quotes to a console key as such: console = "tty0", "ttyS0,115200n8". The order of the devices is significant; systemd will output to the last device specified. The console key belongs inside the kernel key like so:

kernel {
    console = "tty0", "ttyS0,115200n8"
}

The above example will send all kernel output to both consoles, and systemd output to ttyS0. Additional information about serial console setup can be found in the Linux kernel documentation.

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellTemplateConfig
metadata:
  name: my-cluster-name
spec:
  template:
    global_timeout: 6000
    id: ""
    name: my-cluster-name
    tasks:
    - actions:
      - environment:
          COMPRESSED: "true"
          DEST_DISK: /dev/sda
          IMG_URL: https://anywhere-assets.eks.amazonaws.com/releases/bundles/11/artifacts/raw/1-22/bottlerocket-v1.22.10-eks-d-1-22-8-eks-a-11-amd64.img.gz
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/image2disk:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: stream-image
        timeout: 360
      - environment:
          # An example console declaration that will send all kernel output to both consoles, and systemd output to ttyS0.
          # kernel {
          #     console = "tty0", "ttyS0,115200n8"
          # }
          BOOTCONFIG_CONTENTS: |
                        kernel {}
          DEST_DISK: /dev/sda12
          DEST_PATH: /bootconfig.data
          DIRMODE: "0700"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: write-bootconfig
        timeout: 90
      - environment:
          CONTENTS: |
            # Version is required, it will change as we support
            # additional settings
            version = 1
            # "eno1" is the interface name
            # Users may turn on dhcp4 and dhcp6 via boolean
            [eno1]
            dhcp4 = true
            # Define this interface as the "primary" interface
            # for the system.  This IP is what kubelet will use
            # as the node IP.  If none of the interfaces has
            # "primary" set, we choose the first interface in
            # the file
            primary = true            
          DEST_DISK: /dev/sda12
          DEST_PATH: /net.toml
          DIRMODE: "0700"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: write-netconfig
        timeout: 90
      - environment:
          HEGEL_URL: http://<hegel-ip>:50061
          DEST_DISK: /dev/sda12
          DEST_PATH: /user-data.toml
          DIRMODE: "0700"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: write-user-data
        timeout: 90
      - name: "reboot"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/reboot:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        timeout: 90
        volumes:
          - /worker:/worker
      name: my-cluster-name
      volumes:
      - /dev:/dev
      - /dev/console:/dev/console
      - /lib/firmware:/lib/firmware:ro
      worker: '{{.device_1}}'
    version: "0.1"

TinkerbellTemplateConfig Fields

The values in the TinkerbellTemplateConfig fields are created from the contents of the CSV file used to generate a configuration. The template contains actions that are performed on a Bare Metal machine when it first boots up to be provisioned. Advanced users can add these fields to their cluster configuration file if they have special requirements.

While there are fields that apply to all provisioned operating systems, actions are specific to each operating system. Examples below describe actions for Ubuntu and Bottlerocket operating systems.

template.global_timeout

Sets the timeout value for completing the configuration. Set to 6000 (100 minutes) by default.

template.id

Not set by default.

template.tasks

Within the TinkerbellTemplateConfig template under tasks is a set of actions. The following descriptions cover the actions shown in the example templates for Ubuntu and Bottlerocket:

template.tasks.actions.name.stream-image (Ubuntu and Bottlerocket)

The stream-image action streams the selected image to the machine you are provisioning. It identifies:

  • environment.COMPRESSED: When set to true, Tinkerbell expects IMG_URL to be a compressed image, which Tinkerbell will uncompress when it writes the contents to disk.
  • environment.DEST_DISK: The hard disk on which the operating system is deployed. The default is the first SCSI disk (/dev/sda), but can be changed for other disk types.
  • environment.IMG_URL: The operating system image (Ubuntu or other) to stream to the machine you are configuring.
  • image: Container image needed to perform the steps needed by this action.
  • timeout: Sets the amount of time (in seconds) that Tinkerbell has to stream the image, uncompress it, and write it to disk before timing out. Consider increasing this limit from the default 600 if this action is timing out. The example templates above set it to 360.
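
As an illustration of adjusting these values, the sketch below targets an NVMe disk and allows more time for a slow download. The device name /dev/nvme0n1 and the timeout of 1200 are assumptions; check the device names on your own hardware. On NVMe disks, Linux names partitions with a p suffix (for example, /dev/nvme0n1p2), so later actions that write to a partition would need matching changes.

```yaml
- environment:
    COMPRESSED: "true"
    DEST_DISK: /dev/nvme0n1           # Assumed NVMe device name, not the /dev/sda default
    IMG_URL: https://my-file-server/ubuntu-v1.23.7-eks-a-12-amd64.gz
  image: public.ecr.aws/eks-anywhere/tinkerbell/hub/image2disk:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
  name: stream-image
  timeout: 1200                       # Illustrative value, raised for a slow link
```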

Ubuntu-specific actions

template.tasks.actions.name.write-netplan (Ubuntu)

The write-netplan action writes Ubuntu network configuration information to the machine (see Netplan for details). It identifies:

  • environment.CONTENTS.network.version: Identifies the network version.
  • environment.CONTENTS.network.renderer: Defines the service to manage networking. By default, the networkd systemd service is used.
  • environment.CONTENTS.network.ethernets: Network interface to external network (eno1, by default) and whether or not to use dhcp4 (true, by default).
  • environment.DEST_DISK: Destination block storage device partition where the operating system is copied. By default, /dev/sda2 is used (sda1 is the EFI partition).
  • environment.DEST_PATH: File where the networking configuration is written (/etc/netplan/config.yaml, by default).
  • environment.DIRMODE: Linux directory permissions bits to use when creating directories (0755, by default)
  • environment.FS_TYPE: Type of filesystem on the partition (ext4, by default).
  • environment.GID: The Linux group ID to set on file. Set to 0 (root group) by default.
  • environment.MODE: The Linux permission bits to set on file (0644, by default).
  • environment.UID: The Linux user ID to set on file. Set to 0 (root user) by default.
  • image: Container image used to perform the steps needed by this action.
  • timeout: Time needed to complete the action, in seconds.
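
The Ubuntu example template shown earlier sets STATIC_NETPLAN instead of supplying CONTENTS. As a sketch of the CONTENTS fields listed here, a write-netplan action with an inline Netplan document might look like the following. The version value 2 is the standard Netplan format version, and eno1 with DHCP mirrors the defaults named above; treat the fragment as illustrative rather than as the generated default.

```yaml
- environment:
    CONTENTS: |
      network:
        version: 2
        renderer: networkd
        ethernets:
          eno1:                       # Adjust to the interface name on your hardware
            dhcp4: true
    DEST_DISK: /dev/sda2
    DEST_PATH: /etc/netplan/config.yaml
    DIRMODE: "0755"
    FS_TYPE: ext4
    GID: "0"
    MODE: "0644"
    UID: "0"
  image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
  name: write-netplan
  timeout: 90
```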

template.tasks.actions.add-tink-cloud-init-config (Ubuntu)

The add-tink-cloud-init-config action configures cloud-init features to further configure the operating system. See cloud-init Documentation for details. It identifies:

  • environment.CONTENTS.datasource: Identifies Ec2 (Ec2.metadata_urls) as the data source and sets Ec2.strict_id: false to prevent cloud-init from producing warnings about this datasource.
  • environment.CONTENTS.system_info: Creates the tink user, gives it administrative group privileges (wheel, adm) and passwordless sudo privileges, and sets the default shell (/bin/bash).
  • environment.CONTENTS.manage_etc_hosts: Updates the system’s /etc/hosts file with the hostname. Set to localhost by default.
  • environment.CONTENTS.warnings: Sets dsid_missing_source to off.
  • environment.DEST_DISK: Destination block storage device partition where the operating system is located (/dev/sda2, by default).
  • environment.DEST_PATH: Location of the cloud-init configuration file on disk (/etc/cloud/cloud.cfg.d/10_tinkerbell.cfg, by default)
  • environment.DIRMODE: Linux directory permissions bits to use when creating directories (0700, by default)
  • environment.FS_TYPE: Type of filesystem on the partition (ext4, by default).
  • environment.GID: The Linux group ID to set on file. Set to 0 (root group) by default.
  • environment.MODE: The Linux permission bits to set on file (0600, by default).
  • environment.UID: The Linux user ID to set on file. Set to 0 (root user) by default.
  • image: Container image used to perform the steps needed by this action.
  • timeout: Time needed to complete the action, in seconds.

template.tasks.actions.add-tink-cloud-init-ds-config (Ubuntu)

The add-tink-cloud-init-ds-config action configures the cloud-init datasource. This identifies the location of your metadata source once the machine is up and running. It identifies:

  • environment.CONTENTS.datasource: Sets the datasource. Uses Ec2, by default.
  • environment.DEST_DISK: Destination block storage device partition where the operating system is located (/dev/sda2, by default).
  • environment.DEST_PATH: Location of the datasource identification configuration file on disk (/etc/cloud/ds-identify.cfg, by default)
  • environment.DIRMODE: Linux directory permissions bits to use when creating directories (0700, by default)
  • environment.FS_TYPE: Type of filesystem on the partition (ext4, by default).
  • environment.GID: The Linux group ID to set on file. Set to 0 (root group) by default.
  • environment.MODE: The Linux permission bits to set on file (0600, by default).
  • environment.UID: The Linux user ID to set on file. Set to 0 (root user) by default.
  • image: Container image used to perform the steps needed by this action.
  • timeout: Time needed to complete the action, in seconds.

template.tasks.actions.kexec-image (Ubuntu)

The kexec-image action performs provisioning activities on the machine, then allows kexec to pivot the kernel to use the system installed on disk. This action identifies:

  • environment.BLOCK_DEVICE: Disk partition on which the operating system is installed (/dev/sda2, by default)
  • environment.FS_TYPE: Type of filesystem on the partition (ext4, by default).
  • image: Container image used to perform the steps needed by this action.
  • pid: Process ID namespace used by the action's container. Set to host, by default.
  • timeout: Time needed to complete the action, in seconds.
  • volumes: Identifies mount points that need to be remounted to point to locations in the installed system.

There are known issues related to drivers with some hardware that may make it necessary to replace the kexec-image action with a full reboot. If you require a full reboot, you can change the kexec-image setting as follows:

actions:
- name: "reboot"
  image: public.ecr.aws/l0g8r8j6/tinkerbell/hub/reboot-action:latest
  timeout: 90
  volumes:
  - /worker:/worker

Bottlerocket-specific actions

template.tasks.actions.write-bootconfig (Bottlerocket)

The write-bootconfig action identifies the location on the machine to put content needed to boot the system from disk.

  • environment.BOOTCONFIG_CONTENTS.kernel: Add kernel parameters that are passed to the kernel when the system boots.
  • environment.DEST_DISK: Identifies the block storage device that holds the boot partition.
  • environment.DEST_PATH: Identifies the file holding boot configuration data (/bootconfig.data in this example).
  • environment.DIRMODE: The Linux permissions assigned to the boot directory.
  • environment.FS_TYPE: The filesystem type associated with the boot partition.
  • environment.GID: The group ID associated with files and directories created on the boot partition.
  • environment.MODE: The Linux permissions assigned to files in the boot partition.
  • environment.UID: The user ID associated with files and directories created on the boot partition. UID 0 is the root user.
  • image: Container image used to perform the steps needed by this action.
  • timeout: Time needed to complete the action, in seconds.
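
For example, a write-bootconfig action with serial console redirection filled in might look like the sketch below. It combines the console declaration described earlier with the action from the Bottlerocket example; the console devices are the ones used in that description and may differ on your hardware.

```yaml
- environment:
    BOOTCONFIG_CONTENTS: |
      kernel {
          console = "tty0", "ttyS0,115200n8"
      }
    DEST_DISK: /dev/sda12
    DEST_PATH: /bootconfig.data
    DIRMODE: "0700"
    FS_TYPE: ext4
    GID: "0"
    MODE: "0644"
    UID: "0"
  image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
  name: write-bootconfig
  timeout: 90
```

With this declaration, kernel output goes to both consoles and systemd output goes to ttyS0, the last device listed.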

template.tasks.actions.write-netconfig (Bottlerocket)

The write-netconfig action configures networking for the system.

  • environment.CONTENTS: Add network values, including: version = 1 (version number), [eno1] (external network interface), dhcp4 = true (turns on dhcp4), and primary = true (identifies this interface as the primary interface used by kubelet).
  • environment.DEST_DISK: Identifies the block storage device that holds the network configuration information.
  • environment.DEST_PATH: Identifies the file holding network configuration data (/net.toml in this example).
  • environment.DIRMODE: The Linux permissions assigned to the directory holding network configuration settings.
  • environment.FS_TYPE: The filesystem type associated with the partition holding network configuration settings.
  • environment.GID: The group ID associated with files and directories created on the partition. GID 0 is the root group.
  • environment.MODE: The Linux permissions assigned to files in the partition.
  • environment.UID: The user ID associated with files and directories created on the partition. UID 0 is the root user.
  • image: Container image used to perform the steps needed by this action.
  • timeout: Time needed to complete the action, in seconds.

template.tasks.actions.write-user-data (Bottlerocket)

The write-user-data action configures the Tinkerbell Hegel service, which provides the metadata store for Tinkerbell.

  • environment.HEGEL_URL: The IP address and port number of the Tinkerbell Hegel service.
  • environment.DEST_DISK: Identifies the block storage device that holds the user data.
  • environment.DEST_PATH: Identifies the file holding user data (/user-data.toml in this example).
  • environment.DIRMODE: The Linux permissions assigned to the directory holding the user data file.
  • environment.FS_TYPE: The filesystem type associated with the partition holding the user data file.
  • environment.GID: The group ID associated with files and directories created on the partition. GID 0 is the root group.
  • environment.MODE: The Linux permissions assigned to files in the partition.
  • environment.UID: The user ID associated with files and directories created on the partition. UID 0 is the root user.
  • image: Container image used to perform the steps needed by this action.
  • timeout: Time needed to complete the action, in seconds.

template.tasks.actions.reboot (Bottlerocket)

The reboot action defines how the system restarts to bring up the installed system.

  • image: Container image used to perform the steps needed by this action.
  • timeout: Time needed to complete the action, in seconds.
  • volumes: The volume (directory) to mount into the container from the installed system.

version

Matches the current version of the Tinkerbell template.

Custom Tinkerbell action examples

By creating your own custom Tinkerbell actions, you can add to or modify the installed operating system so those changes take effect when the installed system first starts (from a reboot or pivot). The following example shows how to add a .deb package (openssl) to an Ubuntu installation:

      - environment:
          BLOCK_DEVICE: /dev/sda2  # Root partition on UEFI images (sda1 is the EFI partition); adjust to your disk layout
          CHROOT: "y"
          CMD_LINE: apt -y update && apt -y install openssl
          DEFAULT_INTERPRETER: /bin/sh -c
          FS_TYPE: ext4
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/cexec:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: install-openssl
        timeout: 90

The following shows an example of adding a new user (tinkerbell) to an installed Ubuntu system:

      - environment:
          BLOCK_DEVICE: <block device path> # E.g. /dev/sda2 for a UEFI Ubuntu image
          FS_TYPE: ext4
          CHROOT: "y"
          DEFAULT_INTERPRETER: "/bin/sh -c"
          CMD_LINE: "useradd --password $(openssl passwd -1 tinkerbell) --shell /bin/bash --create-home --groups sudo tinkerbell"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/cexec:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: "create-user"
        timeout: 90

Look for more examples as they are added to the Tinkerbell examples page.

5.1.2 - Nutanix configuration

Full EKS Anywhere configuration reference for a Nutanix cluster.

This is a generic template with detailed descriptions below for reference.

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
 name: mgmt
 namespace: default
spec:
 bundlesRef:
   apiVersion: anywhere.eks.amazonaws.com/v1alpha1
   name: bundles-2
   namespace: eksa-system
 clusterNetwork:
   cniConfig:
     cilium: {}
   pods:
     cidrBlocks:
       - 192.168.0.0/16
   services:
     cidrBlocks:
       - 10.96.0.0/16
 controlPlaneConfiguration:
   count: 3
   endpoint:
     host: ""
   machineGroupRef:
     kind: NutanixMachineConfig
     name: mgmt-cp-machine
 datacenterRef:
   kind: NutanixDatacenterConfig
   name: nutanix-cluster
 kubernetesVersion: "1.25"
 workerNodeGroupConfigurations:
   - count: 1
     machineGroupRef:
       kind: NutanixMachineConfig
       name: mgmt-machine
     name: md-0
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: NutanixDatacenterConfig
metadata:
 name: nutanix-cluster
 namespace: default
spec:
 endpoint: pc01.cloud.internal
 port: 9440
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: NutanixMachineConfig
metadata:
 annotations:
   anywhere.eks.amazonaws.com/control-plane: "true"
 name: mgmt-cp-machine
 namespace: default
spec:
 cluster:
   name: nx-cluster-01
   type: name
 image:
   name: eksa-ubuntu-2004-kube-v1.25
   type: name
 memorySize: 4Gi
 osFamily: ubuntu
 subnet:
   name: vm-network
   type: name
 systemDiskSize: 40Gi
 project:
   type: name
   name: my-project
 users:
   - name: eksa
     sshAuthorizedKeys:
       - ssh-rsa AAAA…
 vcpuSockets: 2
 vcpusPerSocket: 1
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: NutanixMachineConfig
metadata:
 name: mgmt-machine
 namespace: default
spec:
 cluster:
   name: nx-cluster-01
   type: name
 image:
   name: eksa-ubuntu-2004-kube-v1.25
   type: name
 memorySize: 4Gi
 osFamily: ubuntu
 subnet:
   name: vm-network
   type: name
 systemDiskSize: 40Gi
 project:
   type: name
   name: my-project
 users:
   - name: eksa
     sshAuthorizedKeys:
       - ssh-rsa AAAA…
 vcpuSockets: 2
 vcpusPerSocket: 1
---

Cluster Fields

name (required)

Name of your cluster (mgmt in this example).

clusterNetwork (required)

Specific network configuration for your Kubernetes cluster.

clusterNetwork.cniConfig (required)

CNI plugin configuration to be used in the cluster. The only supported configuration at the moment is cilium.

clusterNetwork.cniConfig.cilium.policyEnforcementMode

Optionally, you may specify a policyEnforcementMode of default, always, or never.
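
As a short sketch, setting the mode inside the Cluster spec might look like this. The value always is only one of the three permitted values and is chosen for illustration.

```yaml
spec:
  clusterNetwork:
    cniConfig:
      cilium:
        policyEnforcementMode: always   # One of: default, always, never
```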

clusterNetwork.pods.cidrBlocks[0] (required)

Subnet used by pods in CIDR notation. Please note that only 1 custom pods CIDR block specification is permitted. This CIDR block should not conflict with the network subnet range selected for the VMs.

clusterNetwork.services.cidrBlocks[0] (required)

Subnet used by services in CIDR notation. Please note that only 1 custom services CIDR block specification is permitted. This CIDR block should not conflict with the network subnet range selected for the VMs.

controlPlaneConfiguration (required)

Specific control plane configuration for your Kubernetes cluster.

controlPlaneConfiguration.count (required)

Number of control plane nodes.

controlPlaneConfiguration.machineGroupRef (required)

Refers to the Kubernetes object with Nutanix specific configuration for your nodes. See NutanixMachineConfig fields below.

controlPlaneConfiguration.endpoint.host (required)

A unique IP you want to use for the control plane VM in your EKS Anywhere cluster. Choose an IP in your network range that does not conflict with other VMs.

NOTE: This IP should be outside the network DHCP range, as it is a floating IP that gets assigned to one of the control plane nodes for kube-apiserver load balancing. Suggestions on how to ensure this IP does not cause issues during the cluster creation process are here.

workerNodeGroupConfigurations (required)

This takes in a list of node groups that you can define for your workers. You may define one or more worker node groups.

workerNodeGroupConfigurations.count

Number of worker nodes. Optional if autoscalingConfiguration is used, in which case count will default to autoscalingConfiguration.minCount.

workerNodeGroupConfigurations.machineGroupRef (required)

Refers to the Kubernetes object with Nutanix specific configuration for your nodes. See NutanixMachineConfig fields below.

workerNodeGroupConfigurations.name (required)

Name of the worker node group (default: md-0)

workerNodeGroupConfigurations.autoscalingConfiguration.minCount

Minimum number of nodes for this node group’s autoscaling configuration.

workerNodeGroupConfigurations.autoscalingConfiguration.maxCount

Maximum number of nodes for this node group’s autoscaling configuration.

datacenterRef

Refers to the Kubernetes object with Nutanix environment specific configuration. See NutanixDatacenterConfig fields below.

kubernetesVersion (required)

The Kubernetes version you want to use for your cluster. Supported values: 1.25, 1.24, 1.23, 1.22, 1.21

NutanixDatacenterConfig Fields

endpoint (required)

The Prism Central server fully qualified domain name or IP address. If the server IP is used, the PC SSL certificate must have an IP SAN configured.

port (required)

The Prism Central server port. (Default: 9440)

insecure (optional)

Set insecure to true if the Prism Central server does not have a valid certificate. This is not recommended for production use cases. (Default: false)
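
For a test environment, a NutanixDatacenterConfig spec that skips certificate validation might look like this sketch. The endpoint and port are reused from the template above.

```yaml
spec:
  endpoint: pc01.cloud.internal
  port: 9440
  insecure: true      # Skips certificate validation; not recommended for production
```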

additionalTrustBundle (optional; required if using a self-signed PC SSL certificate)

The PEM encoded CA trust bundle.

The additionalTrustBundle needs to be populated with the PEM-encoded x509 certificate of the Root CA that issued the certificate for Prism Central. Suggestions on how to obtain this certificate are here.

Example:

 additionalTrustBundle: |
    -----BEGIN CERTIFICATE-----
    <certificate string>
    -----END CERTIFICATE-----
    -----BEGIN CERTIFICATE-----
    <certificate string>
    -----END CERTIFICATE-----    

NutanixMachineConfig Fields

cluster

Reference to the Prism Element cluster.

cluster.type

Type to identify the Prism Element cluster. (Permitted values: name or uuid)

cluster.name

Name of the Prism Element cluster.

cluster.uuid

UUID of the Prism Element cluster.

image

Reference to the OS image used for the system disk.

image.type

Type to identify the OS image. (Permitted values: name or uuid)

image.name (name or UUID required)

Name of the image

image.uuid (name or UUID required)

UUID of the image

memorySize

Size of RAM on virtual machines (Default: 4Gi)

osFamily (optional)

Operating System on virtual machines. (Permitted values: ubuntu)

subnet

Reference to the subnet to be assigned to the VMs.

subnet.name (name or UUID required)

Name of the subnet.

subnet.type

Type to identify the subnet. (Permitted values: name or uuid)

subnet.uuid (name or UUID required)

UUID of the subnet.

systemDiskSize

Amount of storage assigned to the system disk. (Default: 40Gi)

vcpuSockets

Number of vCPU sockets. (Default: 2)

vcpusPerSocket

Number of vCPUs per socket. (Default: 1)

project (optional)

Reference to an existing project used for the virtual machines.

project.type

Type to identify the project. (Permitted values: name or uuid)

project.name (name or UUID required)

Name of the project

project.uuid (name or UUID required)

UUID of the project

users (optional)

The users you want to configure to access your virtual machines. Only one is permitted at this time.

users[0].name (optional)

The name of the user you want to configure to access your virtual machines through ssh.

The default is eksa if osFamily=ubuntu

users[0].sshAuthorizedKeys (optional)

The SSH public keys you want to configure to access your virtual machines through ssh (as described below). Only 1 is supported at this time.

users[0].sshAuthorizedKeys[0] (optional)

This is the SSH public key that will be placed in authorized_keys on all EKS Anywhere cluster VMs so you can ssh into them. The user will be what is defined under name above. For example:

ssh -i <private-key-file> <user>@<VM-IP>

If you do not specify a value, a key is generated in your $(pwd)/<cluster-name> folder.
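
As a minimal sketch of how the NutanixMachineConfig fields above fit together, the following fragment identifies the Prism Element cluster and the image by name and the subnet by UUID. It assumes the fields sit under spec, as in the other machine configs in this reference. Every name, the UUID, and the SSH key are placeholders, not values from a real environment.

```yaml
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: NutanixMachineConfig
metadata:
  name: my-cluster-machines          # placeholder name
spec:
  cluster:
    type: name                       # identify the Prism Element cluster by name
    name: "<prism-element-cluster-name>"
  image:
    type: name
    name: "<os-image-name>"
  subnet:
    type: uuid                       # identify the subnet by UUID instead of name
    uuid: "<subnet-uuid>"
  memorySize: 4Gi                    # documented default
  systemDiskSize: 40Gi               # documented default
  vcpuSockets: 2                     # documented default
  vcpusPerSocket: 1                  # documented default
  osFamily: ubuntu
  users:
  - name: eksa                       # documented default user for osFamily=ubuntu
    sshAuthorizedKeys:
    - "ssh-rsa AAAA..."              # placeholder public key
```

For each of cluster, image, subnet, and project, the type field selects whether the name or the uuid field is read.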

5.1.3 - Snow configuration

Full EKS Anywhere configuration reference for an AWS Snow cluster.

This is a generic template with detailed descriptions below for reference.

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  clusterNetwork:
    cniConfig:
      cilium: {}
    pods:
      cidrBlocks:
      - 10.1.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneConfiguration:
    count: 3
    endpoint:
      host: ""
    machineGroupRef:
      kind: SnowMachineConfig
      name: my-cluster-machines
  datacenterRef:
    kind: SnowDatacenterConfig
    name: my-cluster-datacenter
  externalEtcdConfiguration:
    count: 3
    machineGroupRef:
      kind: SnowMachineConfig
      name: my-cluster-machines
  kubernetesVersion: "1.25"
  workerNodeGroupConfigurations:
  - count: 1
    machineGroupRef:
      kind: SnowMachineConfig
      name: my-cluster-machines
    name: md-0
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: SnowDatacenterConfig
metadata:
  name: my-cluster-datacenter
spec: {}

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: SnowMachineConfig
metadata:
  name: my-cluster-machines
spec:
  amiID: ""
  instanceType: sbe-c.large
  sshKeyName: ""
  osFamily: ubuntu
  devices:
  - ""
  containersVolume:
    size: 25
  network:
    directNetworkInterfaces:
    - index: 1
      primary: true
      ipPoolRef:
        kind: SnowIPPool
        name: ip-pool-1
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: SnowIPPool
metadata:
  name: ip-pool-1
spec:
  pools:
  - ipStart: 192.168.1.2
    ipEnd: 192.168.1.14
    subnet: 192.168.1.0/24
    gateway: 192.168.1.1
  - ipStart: 192.168.1.55
    ipEnd: 192.168.1.250
    subnet: 192.168.1.0/24
    gateway: 192.168.1.1

Cluster Fields

name (required)

Name of your cluster (my-cluster-name in this example).

clusterNetwork (required)

Specific network configuration for your Kubernetes cluster.

clusterNetwork.cniConfig (required)

CNI plugin configuration to be used in the cluster. The only supported configuration at the moment is cilium.

clusterNetwork.cniConfig.cilium.policyEnforcementMode

Optionally, you may specify a policyEnforcementMode of default, always, or never.
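
For illustration, the field path above translates into the following Cluster spec fragment. This is a sketch derived from the field name only; the value always is picked arbitrarily from the three permitted values.

```yaml
spec:
  clusterNetwork:
    cniConfig:
      cilium:
        policyEnforcementMode: always   # one of: default, always, never
```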

clusterNetwork.pods.cidrBlocks[0] (required)

Subnet used by pods in CIDR notation. Please note that only 1 custom pods CIDR block specification is permitted. This CIDR block should not conflict with the network subnet range selected for the devices.

clusterNetwork.services.cidrBlocks[0] (required)

Subnet used by services in CIDR notation. Please note that only 1 custom services CIDR block specification is permitted. This CIDR block should not conflict with the network subnet range selected for the devices.

clusterNetwork.dns.resolvConf.path (optional)

Path to the file with a custom DNS resolver configuration.

controlPlaneConfiguration (required)

Specific control plane configuration for your Kubernetes cluster.

controlPlaneConfiguration.count (required)

Number of control plane nodes

controlPlaneConfiguration.machineGroupRef (required)

Refers to the Kubernetes object with Snow specific configuration for your nodes. See SnowMachineConfig Fields below.

controlPlaneConfiguration.endpoint.host (required)

A unique IP you want to use for the control plane VM in your EKS Anywhere cluster. Choose an IP in your network range that does not conflict with other devices.

NOTE: This IP should be outside the network DHCP range as it is a floating IP that gets assigned to one of the control plane nodes for kube-apiserver load balancing.

controlPlaneConfiguration.taints

A list of taints to apply to the control plane nodes of the cluster.

Replaces the default control plane taint. For k8s versions prior to 1.24, it replaces node-role.kubernetes.io/master. For k8s versions 1.24+, it replaces node-role.kubernetes.io/control-plane. The default control plane components will tolerate the provided taints.

Modifying the taints associated with the control plane configuration will cause new nodes to be rolled out, replacing the existing nodes.

NOTE: The taints provided will be used instead of the default control plane taint. Any pods that you run on the control plane nodes must tolerate the taints you provide in the control plane configuration.
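
The following sketch shows a custom control plane taint together with the toleration a pod would need in order to run on those nodes. The key dedicated and the value control-plane are hypothetical. The taint entry uses the key, value, and effect fields, and the toleration uses the standard Kubernetes pod spec syntax.

```yaml
# Cluster spec fragment: replaces the default control plane taint.
spec:
  controlPlaneConfiguration:
    taints:
    - key: "dedicated"            # hypothetical key
      value: "control-plane"      # hypothetical value
      effect: "NoSchedule"
---
# Pod spec fragment: toleration matching the taint above.
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "control-plane"
    effect: "NoSchedule"
```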

controlPlaneConfiguration.labels

A list of labels to apply to the control plane nodes of the cluster. This is in addition to the labels that EKS Anywhere will add by default.

Modifying the labels associated with the control plane configuration will cause new nodes to be rolled out, replacing the existing nodes.

workerNodeGroupConfigurations (required)

This takes in a list of node groups that you can define for your workers. You may define one or more worker node groups.

workerNodeGroupConfigurations.count

Number of worker nodes. Optional if autoscalingConfiguration is used, in which case count will default to autoscalingConfiguration.minCount.

workerNodeGroupConfigurations.machineGroupRef (required)

Refers to the Kubernetes object with Snow specific configuration for your nodes. See SnowMachineConfig Fields below.

workerNodeGroupConfigurations.name (required)

Name of the worker node group (default: md-0)

workerNodeGroupConfigurations.autoscalingConfiguration.minCount

Minimum number of nodes for this node group’s autoscaling configuration.

workerNodeGroupConfigurations.autoscalingConfiguration.maxCount

Maximum number of nodes for this node group’s autoscaling configuration.
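
As a minimal sketch, a worker node group that relies on autoscaling could omit count and set only the bounds. The maxCount of 5 is an arbitrary illustrative value; the machine config name matches the template above.

```yaml
spec:
  workerNodeGroupConfigurations:
  - name: md-0
    machineGroupRef:
      kind: SnowMachineConfig
      name: my-cluster-machines
    autoscalingConfiguration:
      minCount: 1        # count defaults to this value when it is omitted
      maxCount: 5        # illustrative upper bound
```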

workerNodeGroupConfigurations.taints

A list of taints to apply to the nodes in the worker node group.

Modifying the taints associated with a worker node group configuration will cause new nodes to be rolled out, replacing the existing nodes associated with the configuration.

At least one node group must not have NoSchedule or NoExecute taints applied to it.
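
One way to satisfy this rule is to keep one untainted group next to a tainted one, as in the sketch below. The second group name md-1 and the taint key and value are hypothetical.

```yaml
spec:
  workerNodeGroupConfigurations:
  - name: md-0                     # untainted group: any workload can schedule here
    count: 2
    machineGroupRef:
      kind: SnowMachineConfig
      name: my-cluster-machines
  - name: md-1                     # hypothetical group reserved through a taint
    count: 1
    machineGroupRef:
      kind: SnowMachineConfig
      name: my-cluster-machines
    taints:
    - key: "dedicated"             # hypothetical key
      value: "batch"               # hypothetical value
      effect: "NoSchedule"
```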

workerNodeGroupConfigurations.labels

A list of labels to apply to the nodes in the worker node group. This is in addition to the labels that EKS Anywhere will add by default.

Modifying the labels associated with a worker node group configuration will cause new nodes to be rolled out, replacing the existing nodes associated with the configuration.

externalEtcdConfiguration.count

Number of etcd members.

externalEtcdConfiguration.machineGroupRef

Refers to the Kubernetes object with Snow specific configuration for your etcd members. See SnowMachineConfig Fields below.

datacenterRef

Refers to the Kubernetes object with Snow environment specific configuration. See SnowDatacenterConfig Fields below.

kubernetesVersion (required)

The Kubernetes version you want to use for your cluster. Supported values: 1.25, 1.24, 1.23, 1.22, 1.21

SnowDatacenterConfig Fields

identityRef

Refers to the Kubernetes secret object with Snow devices credentials used to reconcile the cluster.

SnowMachineConfig Fields

amiID (optional)

AMI ID from which to create the machine instance. If the field is empty, the Snow provider looks up a suitable AMI ID based on the Kubernetes version and osFamily.

instanceType (optional)

Type of the Snow EC2 machine instance. See Quotas for Compute Instances on a Snowball Edge Device for supported instance types on Snow (Default: sbe-c.large).

osFamily

Operating System on instance machines. Permitted value: ubuntu.

physicalNetworkConnector (optional)

Type of Snow physical network connector to use for creating direct network interfaces. Permitted values: SFP_PLUS, QSFP, RJ45 (Default: SFP_PLUS).

sshKeyName (optional)

Name of the AWS Snow SSH key pair you want to configure to access your machine instances.

The default is eksa-default-{cluster-name}-{uuid}.

devices

A device IP list from which to bootstrap and provision machine instances.

network

Custom network setting for the machine instances. DHCP and static IP configurations are supported.

network.directNetworkInterfaces[0].index (optional)

Index number of a direct network interface (DNI) used to clarify the position in the list. Must be no smaller than 1 and no greater than 8.

network.directNetworkInterfaces[0].primary (optional)

Whether the DNI is primary or not. One and only one primary DNI is required in the directNetworkInterfaces list.

network.directNetworkInterfaces[0].vlanID (optional)

VLAN ID to use for the DNI.

network.directNetworkInterfaces[0].dhcp (optional)

Whether DHCP is to be used to assign IP for the DNI.

network.directNetworkInterfaces[0].ipPoolRef (optional)

Refers to a SnowIPPool object which provides a range of IP addresses. When specified, an IP address selected from the pool will be allocated to the DNI.
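
To illustrate how the DNI fields combine, the sketch below defines two interfaces in a SnowMachineConfig: a primary one that uses DHCP and a second one on a VLAN that takes a static address from a SnowIPPool. The VLAN ID 100 is an arbitrary example, and ip-pool-1 refers to the pool from the template above.

```yaml
spec:
  network:
    directNetworkInterfaces:
    - index: 1
      primary: true          # exactly one DNI in the list is primary
      dhcp: true             # address assigned through DHCP
    - index: 2               # index must be between 1 and 8
      vlanID: 100            # illustrative VLAN ID
      ipPoolRef:             # static address allocated from the pool
        kind: SnowIPPool
        name: ip-pool-1
```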

containersVolume (optional)

Configuration option for customizing containers data storage volume.

containersVolume.size

Size of the storage for containerd runtime in Gi.

The field is optional for Ubuntu and if specified, the size must be no smaller than 8 Gi.

containersVolume.deviceName (optional)

Containers volume device name.

containersVolume.type (optional)

Type of the containers volume. Permitted values: sbp1, sbg1. (Default: sbp1)

sbp1 is performance-optimized SSD. sbg1 is capacity-optimized HDD.

nonRootVolumes (optional)

Configuration options for the non root storage volumes.

nonRootVolumes[0].deviceName

Non root volume device name. Must be specified and cannot have prefix “/dev/sda” as it is reserved for root volume and containers volume.

nonRootVolumes[0].size

Size of the storage device for the non root volume. Must be no smaller than 8 Gi.

nonRootVolumes[0].type (optional)

Type of the non root volume. Permitted values: sbp1, sbg1. (Default: sbp1)

sbp1 is performance-optimized SSD. sbg1 is capacity-optimized HDD.
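
As a sketch of the volume fields above, the following SnowMachineConfig fragment sets a containers volume and one non root volume. The device name /dev/sdc and the sizes are illustrative, and the non root volume size is assumed to be expressed in Gi like containersVolume.size.

```yaml
spec:
  containersVolume:
    size: 25                 # in Gi; at least 8 when specified
    type: sbp1               # documented default type
  nonRootVolumes:
  - deviceName: /dev/sdc     # illustrative; must not start with /dev/sda
    size: 50                 # assumed to be in Gi; at least 8
    type: sbg1
```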

SnowIPPool Fields

pools[0].ipStart

Start address of an IP range.

pools[0].ipEnd

End address of an IP range.

pools[0].subnet

An IP subnet for determining whether an IP is within the subnet.

pools[0].gateway

Gateway of the subnet for routing purposes.

5.1.4 - vSphere configuration

Full EKS Anywhere configuration reference for a VMware vSphere cluster.

This is a generic template with detailed descriptions below for reference.

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
   name: my-cluster-name             # Name of the cluster (required)
spec:
   clusterNetwork:                   # Cluster network configuration (required)
      cniConfig:                     # Cluster CNI plugin - default: cilium (required)
         cilium: {}
      pods:
         cidrBlocks:                 # Subnet CIDR notation for pods (required)
            - 192.168.0.0/16
      services:
         cidrBlocks:                 # Subnet CIDR notation for services (required)
            - 10.96.0.0/12
   controlPlaneConfiguration:        # Specific cluster control plane config (required)
      count: 2                       # Number of control plane nodes (required)
      endpoint:                      # IP for control plane endpoint (required)
         host: "192.168.0.10"
      machineGroupRef:               # vSphere-specific Kubernetes node config (required)
        kind: VSphereMachineConfig
        name: my-cluster-machines
      taints:                        # Taints applied to control plane nodes 
      - key: "key1"
        value: "value1"
        effect: "NoSchedule"
      labels:                        # Labels applied to control plane nodes 
        "key1": "value1"
        "key2": "value2" 
   datacenterRef:                    # Kubernetes object with vSphere-specific config 
      kind: VSphereDatacenterConfig  
      name: my-cluster-datacenter
   externalEtcdConfiguration:
     count: 3                        # Number of etcd members 
     machineGroupRef:                # vSphere-specific Kubernetes etcd config
        kind: VSphereMachineConfig
        name: my-cluster-machines
   kubernetesVersion: "1.25"         # Kubernetes version to use for the cluster (required)
   workerNodeGroupConfigurations:    # List of node groups you can define for workers (required) 
   - count: 2                        # Number of worker nodes 
     machineGroupRef:                # vSphere-specific Kubernetes node objects (required) 
       kind: VSphereMachineConfig
       name: my-cluster-machines
     name: md-0                      # Name of the worker nodegroup (required) 
     taints:                         # Taints to apply to worker node group nodes 
     - key: "key1"                       
       value: "value1"
       effect: "NoSchedule"
     labels:                         # Labels to apply to worker node group nodes 
       "key1": "value1"
       "key2": "value2" 
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereDatacenterConfig
metadata:
   name: my-cluster-datacenter
spec:
  datacenter: "datacenter1"          # vSphere datacenter name on which to deploy EKS Anywhere (required) 
  disableCSI: false                  # Set to true to not have EKS Anywhere install and manage vSphere CSI driver
  server: "myvsphere.local"          # FQDN or IP address of vCenter server (required) 
  network: "network1"                # Path to the VM network on which to deploy EKS Anywhere (required) 
  insecure: false                    # Set to true if vCenter does not have a valid certificate 
  thumbprint: "1E:3B:A1:4C:B2:..."   # SHA1 thumbprint of vCenter server certificate (required if insecure=false)

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
   name: my-cluster-machines
spec:
  diskGiB:  25                       # Size of disk on VMs, if no snapshots
  datastore: "datastore1"            # Path to vSphere datastore to deploy EKS Anywhere on (required)
  folder: "folder1"                  # Path to VM folder for EKS Anywhere cluster VMs (required)
  numCPUs: 2                         # Number of CPUs on virtual machines
  memoryMiB: 8192                    # Size of RAM on VMs
  osFamily: "bottlerocket"           # Operating system on VMs
  resourcePool: "resourcePool1"      # vSphere resource pool for EKS Anywhere VMs (required)
  storagePolicyName: "storagePolicy1"    # Storage policy name associated with VMs
  template: "bottlerocket-kube-v1-25"    # VM template for EKS Anywhere (required for RHEL/Ubuntu-based OVAs)
  users:                             # Add users to access VMs via SSH
  - name: "ec2-user"                 # Name of each user set to access VMs
    sshAuthorizedKeys:               # SSH keys for user needed to access VMs
    - "ssh-rsa AAAAB3NzaC1yc2E..."
  tags:                              # List of tags to attach to cluster VMs, in URN format
  - "urn:vmomi:InventoryServiceTag:5b3e951f-4e1d-4511-95b1-5ba1ea97245c:GLOBAL"
  - "urn:vmomi:InventoryServiceTag:cfee03d0-0189-4f27-8c65-fe75086a86cd:GLOBAL"

Cluster Fields

name (required)

Name of your cluster (my-cluster-name in this example).

clusterNetwork (required)

Specific network configuration for your Kubernetes cluster.

clusterNetwork.cniConfig (required)

CNI plugin configuration to be used in the cluster. The only supported configuration at the moment is cilium.

clusterNetwork.cniConfig.cilium.policyEnforcementMode

Optionally, you may specify a policyEnforcementMode of default, always, or never.

clusterNetwork.pods.cidrBlocks[0] (required)

Subnet used by pods in CIDR notation. Please note that only 1 custom pods CIDR block specification is permitted. This CIDR block should not conflict with the network subnet range selected for the VMs.

clusterNetwork.services.cidrBlocks[0] (required)

Subnet used by services in CIDR notation. Please note that only 1 custom services CIDR block specification is permitted. This CIDR block should not conflict with the network subnet range selected for the VMs.

clusterNetwork.dns.resolvConf.path (optional)

Path to the file with a custom DNS resolver configuration.

controlPlaneConfiguration (required)

Specific control plane configuration for your Kubernetes cluster.

controlPlaneConfiguration.count (required)

Number of control plane nodes

controlPlaneConfiguration.machineGroupRef (required)

Refers to the Kubernetes object with vSphere-specific configuration for your nodes. See VSphereMachineConfig Fields below.

controlPlaneConfiguration.endpoint.host (required)

A unique IP you want to use for the control plane VM in your EKS Anywhere cluster. Choose an IP in your network range that does not conflict with other VMs.

NOTE: This IP should be outside the network DHCP range as it is a floating IP that gets assigned to one of the control plane nodes for kube-apiserver load balancing. Suggestions on how to ensure this IP does not cause issues during the cluster creation process are here.

controlPlaneConfiguration.taints

A list of taints to apply to the control plane nodes of the cluster.

Replaces the default control plane taint. For k8s versions prior to 1.24, it replaces node-role.kubernetes.io/master. For k8s versions 1.24+, it replaces node-role.kubernetes.io/control-plane. The default control plane components will tolerate the provided taints.

Modifying the taints associated with the control plane configuration will cause new nodes to be rolled out, replacing the existing nodes.

NOTE: The taints provided will be used instead of the default control plane taint. Any pods that you run on the control plane nodes must tolerate the taints you provide in the control plane configuration.

controlPlaneConfiguration.labels

A list of labels to apply to the control plane nodes of the cluster. This is in addition to the labels that EKS Anywhere will add by default.

Modifying the labels associated with the control plane configuration will cause new nodes to be rolled out, replacing the existing nodes.

workerNodeGroupConfigurations (required)

This takes in a list of node groups that you can define for your workers. You may define one or more worker node groups.

workerNodeGroupConfigurations.count

Number of worker nodes. Optional if autoscalingConfiguration is used, in which case count will default to autoscalingConfiguration.minCount.

workerNodeGroupConfigurations.machineGroupRef (required)

Refers to the Kubernetes object with vSphere-specific configuration for your nodes. See VSphereMachineConfig Fields below.

workerNodeGroupConfigurations.name (required)

Name of the worker node group (default: md-0)

workerNodeGroupConfigurations.autoscalingConfiguration.minCount

Minimum number of nodes for this node group’s autoscaling configuration.

workerNodeGroupConfigurations.autoscalingConfiguration.maxCount

Maximum number of nodes for this node group’s autoscaling configuration.

workerNodeGroupConfigurations.taints

A list of taints to apply to the nodes in the worker node group.

Modifying the taints associated with a worker node group configuration will cause new nodes to be rolled out, replacing the existing nodes associated with the configuration.

At least one node group must not have NoSchedule or NoExecute taints applied to it.

workerNodeGroupConfigurations.labels

A list of labels to apply to the nodes in the worker node group. This is in addition to the labels that EKS Anywhere will add by default.

Modifying the labels associated with a worker node group configuration will cause new nodes to be rolled out, replacing the existing nodes associated with the configuration.

externalEtcdConfiguration.count

Number of etcd members

externalEtcdConfiguration.machineGroupRef

Refers to the Kubernetes object with vSphere-specific configuration for your etcd members. See VSphereMachineConfig Fields below.

datacenterRef

Refers to the Kubernetes object with vSphere environment-specific configuration. See VSphereDatacenterConfig Fields below.

kubernetesVersion (required)

The Kubernetes version you want to use for your cluster. Supported values: 1.25, 1.24, 1.23, 1.22, 1.21

VSphereDatacenterConfig Fields

datacenter (required)

The name of the vSphere datacenter to deploy the EKS Anywhere cluster on. For example SDDC-Datacenter.

network (required)

The path to the VM network to deploy your EKS Anywhere cluster on. For example, /<DATACENTER>/network/<NETWORK_NAME>. Use govc find -type n to see a list of networks.

server (required)

The vCenter server fully qualified domain name or IP address. If the server IP is used, the thumbprint must be set or insecure must be set to true.

insecure (optional)

Set insecure to true if the vCenter server does not have a valid certificate. (Default: false)

thumbprint (required if insecure=false)

The SHA1 thumbprint of the vCenter server certificate. It is only required if you have a self-signed certificate.

There are several ways to obtain your vCenter thumbprint. If you have govc installed, the easiest way is to run:

govc about.cert -thumbprint -k

Alternatively, in the vCenter web UI, go to Administration/Certificate Management and click view details of the machine certificate. The thumbprint shown there does not exactly match the required format: you will need to add : to separate each pair of hexadecimal digits.

Another way to get the thumbprint is to use this command with your server's certificate in a file named ca.crt:

openssl x509 -sha1 -fingerprint -in ca.crt -noout

If you specify the wrong thumbprint, an error message will be printed with the expected thumbprint. If no valid certificate is being used, insecure must be set to true.

disableCSI (optional)

Set disableCSI to true if you don’t want to have EKS Anywhere install and manage the vSphere CSI driver for you. More details on the driver are here.

NOTE: If you upgrade a cluster and disable the vSphere CSI driver after it has already been installed by EKS Anywhere, you will need to remove the resources manually from the cluster. Delete the DaemonSet and Deployment first, as they rely on the other resources. This should be done after setting disableCSI to true and running upgrade cluster.

These are the resources you would need to delete:

  • vsphere-csi-controller-role (kind: ClusterRole)
  • vsphere-csi-controller-binding (kind: ClusterRoleBinding)
  • csi.vsphere.vmware.com (kind: CSIDriver)

These are the resources you would need to delete in the kube-system namespace:

  • vsphere-csi-controller (kind: ServiceAccount)
  • csi-vsphere-config (kind: Secret)
  • vsphere-csi-node (kind: DaemonSet)
  • vsphere-csi-controller (kind: Deployment)

These are the resources you would need to delete in the eksa-system namespace from the management cluster.

  • <cluster-name>-csi (kind: ClusterResourceSet)

Note: If your cluster is self-managed, you would delete <cluster-name>-csi (kind: ClusterResourceSet) from the same cluster.

VSphereMachineConfig Fields

memoryMiB (optional)

Size of RAM on virtual machines (Default: 8192)

numCPUs (optional)

Number of CPUs on virtual machines (Default: 2)

osFamily (optional)

Operating System on virtual machines. Permitted values: bottlerocket, ubuntu, redhat (Default: bottlerocket)

diskGiB (optional)

Size of disk on virtual machines if snapshots aren’t included (Default: 25)

users (optional)

The users you want to configure to access your virtual machines. Only one is permitted at this time

users[0].name (optional)

The name of the user you want to configure to access your virtual machines through ssh.

The default is ec2-user if osFamily=bottlerocket and capv if osFamily=ubuntu.

users[0].sshAuthorizedKeys (optional)

The SSH public keys you want to configure to access your virtual machines through ssh (as described below). Only 1 is supported at this time.

users[0].sshAuthorizedKeys[0] (optional)

This is the SSH public key that will be placed in authorized_keys on all EKS Anywhere cluster VMs so you can ssh into them. The user will be what is defined under name above. For example:

ssh -i <private-key-file> <user>@<VM-IP>

If you do not specify a value, a key is generated in your $(pwd)/<cluster-name> folder.

template (optional)

The VM template to use for your EKS Anywhere cluster. This template was created when you imported the OVA file into vSphere. This is a required field if you are using Ubuntu-based or RHEL-based OVAs.

datastore (required)

The path to the vSphere datastore to deploy your EKS Anywhere cluster on, for example /<DATACENTER>/datastore/<DATASTORE_NAME>. Use govc find -type s to get a list of datastores.

folder (required)

The path to a VM folder for your EKS Anywhere cluster VMs. This allows you to organize your VMs. If the folder does not exist, it will be created for you. If the folder is blank, the VMs will go in the root folder. For example /<DATACENTER>/vm/<FOLDER_NAME>/.... Use govc find -type f to get a list of existing folders.

resourcePool (required)

The vSphere resource pool for the VMs in your EKS Anywhere cluster. Examples of resource pool values include:

  • If there is no resource pool: /<datacenter>/host/<cluster-name>/Resources
  • If there is a resource pool: /<datacenter>/host/<cluster-name>/Resources/<resource-pool-name>
  • The wild card option */Resources also often works.

Use govc find -type p to get a list of available resource pools.
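
Putting the path formats for datastore, folder, and resourcePool together, a VSphereMachineConfig that uses full inventory paths might look like the sketch below. The angle-bracket segments are placeholders to be replaced with values reported by the govc find commands mentioned above.

```yaml
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: VSphereMachineConfig
metadata:
  name: my-cluster-machines
spec:
  datastore: "/<DATACENTER>/datastore/<DATASTORE_NAME>"
  folder: "/<DATACENTER>/vm/<FOLDER_NAME>"
  # Resource pool path for the case where a named resource pool exists.
  resourcePool: "/<datacenter>/host/<cluster-name>/Resources/<resource-pool-name>"
```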

storagePolicyName (optional)

The storage policy name associated with your VMs. Generally this can be left blank. Use govc storage.policy.ls to get a list of available storage policies.

tags (optional)

Optional list of tags to attach to your cluster VMs in the URN format.

Example:

  tags:
  - urn:vmomi:InventoryServiceTag:8e0ce079-0675-47d6-8665-16ada4e6dabd:GLOBAL

Optional VSphere Credentials

Use the following environment variables to configure Cloud Provider and CSI Driver with different credentials.

EKSA_VSPHERE_CP_USERNAME

Username for Cloud Provider (Default: $EKSA_VSPHERE_USERNAME).

EKSA_VSPHERE_CP_PASSWORD

Password for Cloud Provider (Default: $EKSA_VSPHERE_PASSWORD).

EKSA_VSPHERE_CSI_USERNAME

Username for CSI Driver (Default: $EKSA_VSPHERE_USERNAME).

EKSA_VSPHERE_CSI_PASSWORD

Password for CSI Driver (Default: $EKSA_VSPHERE_PASSWORD).

5.1.5 - CloudStack configuration

Full EKS Anywhere configuration reference for a CloudStack cluster.

This is a generic template with detailed descriptions below for reference.

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  clusterNetwork:
    cniConfig:
      cilium: {}
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneConfiguration:
    count: 3
    endpoint:
      host: ""
    machineGroupRef:
      kind: CloudStackMachineConfig
      name: my-cluster-name-cp
    taints:
    - key: ""
      value: ""
      effect: ""
    labels:
      "<key1>": ""
      "<key2>": ""
  datacenterRef:
    kind: CloudStackDatacenterConfig
    name: my-cluster-name-datacenter
  externalEtcdConfiguration:
    count: 3
    machineGroupRef:
      kind: CloudStackMachineConfig
      name: my-cluster-name-etcd
  kubernetesVersion: "1.23"
  managementCluster:
    name: my-cluster-name
  workerNodeGroupConfigurations:
  - count: 2
    machineGroupRef:
      kind: CloudStackMachineConfig
      name: my-cluster-name
    taints:
    - key: ""
      value: ""
      effect: ""
    labels:
      "<key1>": ""
      "<key2>": ""
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: CloudStackDatacenterConfig
metadata:
  name: my-cluster-name-datacenter
spec:
  availabilityZones:
  - account: admin
    credentialsRef: global
    domain: domain1
    managementApiEndpoint: ""
    name: az-1
    zone:
      name: zone1
      network:
        name: "net1"
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: CloudStackMachineConfig
metadata:
  name: my-cluster-name-cp
spec:
  computeOffering:
    name: "m4-large"
  users:
  - name: capc
    sshAuthorizedKeys:
    - ssh-rsa AAAA...
  template:
    name: "rhel8-k8s-118"
  diskOffering:
    name: "Small"
    mountPath: "/data-small"
    device: "/dev/vdb"
    filesystem: "ext4"
    label: "data_disk"
  symlinks:
    /var/log/kubernetes: /data-small/var/log/kubernetes
  affinityGroupIds:
  - control-plane-anti-affinity
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: CloudStackMachineConfig
metadata:
  name: my-cluster-name
spec:
  computeOffering:
    name: "m4-large"
  users:
  - name: capc
    sshAuthorizedKeys:
    - ssh-rsa AAAA...
  template:
    name: "rhel8-k8s-118"
  diskOffering:
    name: "Small"
    mountPath: "/data-small"
    device: "/dev/vdb"
    filesystem: "ext4"
    label: "data_disk"
  symlinks:
    /var/log/pods: /data-small/var/log/pods
    /var/log/containers: /data-small/var/log/containers
  affinityGroupIds:
  - worker-affinity
  userCustomDetails:
    foo: bar
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: CloudStackMachineConfig
metadata:
  name: my-cluster-name-etcd
spec:
  computeOffering:
    name: "m4-large"
  users:
  - name: "capc"
    sshAuthorizedKeys:
    - "ssh-rsa AAAAB3N..."
  template:
    name: "rhel8-k8s-118"
  diskOffering:
    name: "Small"
    mountPath: "/data-small"
    device: "/dev/vdb"
    filesystem: "ext4"
    label: "data_disk"
  symlinks:
    /var/lib: /data-small/var/lib
  affinityGroupIds:
  - etcd-affinity
---

Cluster Fields

name (required)

Name of your cluster (my-cluster-name in this example).

clusterNetwork (required)

Specific network configuration for your Kubernetes cluster.

clusterNetwork.cniConfig (required)

CNI plugin configuration to be used in the cluster. The only supported configuration at the moment is cilium.

clusterNetwork.cniConfig.cilium.policyEnforcementMode

Optionally, you may specify a policyEnforcementMode of default, always, or never.

clusterNetwork.pods.cidrBlocks[0] (required)

Subnet used by pods in CIDR notation. Please note that only 1 custom pods CIDR block specification is permitted. This CIDR block should not conflict with the network subnet range selected for the VMs.

clusterNetwork.services.cidrBlocks[0] (required)

Subnet used by services in CIDR notation. Please note that only 1 custom services CIDR block specification is permitted. This CIDR block should not conflict with the network subnet range selected for the VMs.

controlPlaneConfiguration (required)

Specific control plane configuration for your Kubernetes cluster.

controlPlaneConfiguration.count (required)

Number of control plane nodes

controlPlaneConfiguration.endpoint.host (required)

A unique IP you want to use for the control plane VM in your EKS Anywhere cluster. Choose an IP in your network range that does not conflict with other VMs.

NOTE: This IP should be outside the network DHCP range, as it is a floating IP that gets assigned to one of the control plane nodes for kube-apiserver load balancing. Suggestions on how to ensure this IP does not cause issues during the cluster creation process are here.

controlPlaneConfiguration.machineGroupRef (required)

Refers to the Kubernetes object with CloudStack specific configuration for your nodes. See CloudStackMachineConfig Fields below.

controlPlaneConfiguration.taints

A list of taints to apply to the control plane nodes of the cluster.

Replaces the default control plane taint, node-role.kubernetes.io/master. The default control plane components will tolerate the provided taints.

Modifying the taints associated with the control plane configuration will cause new nodes to be rolled out, replacing the existing nodes.

NOTE: The taints provided will be used instead of the default control plane taint node-role.kubernetes.io/master. Any pods that you run on the control plane nodes must tolerate the taints you provide in the control plane configuration.

controlPlaneConfiguration.labels

A list of labels to apply to the control plane nodes of the cluster. This is in addition to the labels that EKS Anywhere will add by default.

A special label value is supported by the CAPC provider:

    labels:
      cluster.x-k8s.io/failure-domain: ds.meta_data.failuredomain

The ds.meta_data.failuredomain value will be replaced with a failuredomain name where the node is deployed, such as az-1.

Modifying the labels associated with the control plane configuration will cause new nodes to be rolled out, replacing the existing nodes.

datacenterRef

Refers to the Kubernetes object with CloudStack environment specific configuration. See CloudStackDatacenterConfig Fields below.

externalEtcdConfiguration.count

Number of etcd members

externalEtcdConfiguration.machineGroupRef

Refers to the Kubernetes object with CloudStack specific configuration for your etcd members. See CloudStackMachineConfig Fields below.

kubernetesVersion (required)

The Kubernetes version you want to use for your cluster. Supported values: 1.24, 1.23, 1.22, 1.21

managementCluster (required)

Identifies the name of the management cluster. If this is a standalone cluster, or if it serves as the management cluster for other workload clusters, this will be the same as the cluster name.

workerNodeGroupConfigurations (required)

This takes in a list of node groups that you can define for your workers. You may define one or more worker node groups.

workerNodeGroupConfigurations.count

Number of worker nodes. Optional if autoscalingConfiguration is used, in which case count will default to autoscalingConfiguration.minCount.

workerNodeGroupConfigurations.machineGroupRef (required)

Refers to the Kubernetes object with CloudStack specific configuration for your nodes. See CloudStackMachineConfig Fields below.

workerNodeGroupConfigurations.name (required)

Name of the worker node group (default: md-0)

workerNodeGroupConfigurations.autoscalingConfiguration.minCount

Minimum number of nodes for this node group’s autoscaling configuration.

workerNodeGroupConfigurations.autoscalingConfiguration.maxCount

Maximum number of nodes for this node group’s autoscaling configuration.

workerNodeGroupConfigurations.taints

A list of taints to apply to the nodes in the worker node group.

Modifying the taints associated with a worker node group configuration will cause new nodes to be rolled out, replacing the existing nodes associated with the configuration.

At least one node group must not have NoSchedule or NoExecute taints applied to it.

workerNodeGroupConfigurations.labels

A list of labels to apply to the nodes in the worker node group. This is in addition to the labels that EKS Anywhere will add by default. A special label value is supported by the CAPC provider:

    labels:
      cluster.x-k8s.io/failure-domain: ds.meta_data.failuredomain

The ds.meta_data.failuredomain value will be replaced with a failuredomain name where the node is deployed, such as az-1.

Modifying the labels associated with a worker node group configuration will cause new nodes to be rolled out, replacing the existing nodes associated with the configuration.

CloudStackDatacenterConfig

availabilityZones.account (optional)

Account used to access CloudStack. This value is not required as long as you pass valid credentials through availabilityZones.credentialsRef.

availabilityZones.credentialsRef (required)

If you passed credentials through the environment variable EKSA_CLOUDSTACK_B64ENCODED_SECRET noted in Create CloudStack production cluster , you can identify those credentials here. For that example, you would use the profile name global. You can instead use a previously created secret on the Kubernetes cluster in the eksa-system namespace.

availabilityZones.domain (optional)

CloudStack domain to deploy the cluster. The default is ROOT.

availabilityZones.managementApiEndpoint (required)

Location of the CloudStack API management endpoint. For example, http://10.11.0.2:8080/client/api.

availabilityZones.zone.{id,name} (required)

Name or ID of the CloudStack zone on which to deploy the cluster.

availabilityZones.zone.network.{id,name} (required)

CloudStack network name or ID to use with the cluster.

CloudStackMachineConfig

In the example above, there are separate CloudStackMachineConfig sections for the control plane (my-cluster-name-cp), worker (my-cluster-name) and etcd (my-cluster-name-etcd) nodes.

computeOffering.{id,name} (required)

Name or ID of the CloudStack compute instance.

users[0].name (optional)

The name of the user you want to configure to access your virtual machines through ssh. You can add as many users objects as you want.

The default is capc.

users[0].sshAuthorizedKeys (optional)

The SSH public keys you want to configure to access your virtual machines through ssh (as described below). Only 1 is supported at this time.

users[0].sshAuthorizedKeys[0] (optional)

This is the SSH public key that will be placed in authorized_keys on all EKS Anywhere cluster VMs so you can ssh into them. The user will be what is defined under name above. For example:

ssh -i <private-key-file> <user>@<VM-IP>

If you do not specify a value, a key is generated in your $(pwd)/<cluster-name> folder.

template.{id,name} (required)

The VM template to use for your EKS Anywhere cluster. Currently, a VM based on RHEL 8.6 is required. This can be a name or ID. See the Artifacts page for instructions for building RHEL-based images.

diskOffering (optional)

Name representing a disk you want to mount into nodes for this CloudStackMachineConfig

diskOffering.mountPath (optional)

Mount point on which to mount the disk.

diskOffering.device (optional)

Device name of the disk partition to mount.

diskOffering.filesystem (optional)

File system type used to format the filesystem on the disk.

diskOffering.label (optional)

Label to apply to the disk partition.

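symlinks (optional)

Symbolic link of a directory or file you want to mount from the host filesystem to the mounted filesystem. In the example above, /var/log/kubernetes on the host is linked to /data-small/var/log/kubernetes on the mounted disk.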

userCustomDetails (optional)

Add key/value pairs to nodes in a CloudStackMachineConfig. These can be used for things like identifying sets of nodes that you want to add to a security group that opens selected ports.

affinityGroupIds (optional)

Group ID to attach to the set of host systems to indicate how affinity is done for services on those systems.

affinity (optional)

Allows you to set pro and anti affinity for the CloudStackMachineConfig. This field is mutually exclusive with the affinityGroupIds field.

5.1.6 - Optional configuration

Config reference to optional features for EKS Anywhere clusters

5.1.6.1 - Autoscaling configuration

EKS Anywhere cluster yaml autoscaling configuration specification reference

Cluster Autoscaling (Optional)

Cluster Autoscaler configuration in EKS Anywhere cluster spec

EKS Anywhere supports autoscaling worker node groups using the Kubernetes Cluster Autoscaler’s clusterapi cloudProvider.

  • Configure a worker node group to be picked up by a cluster autoscaler deployment by adding an autoscalingConfiguration block to the entry in workerNodeGroupConfigurations:
    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
      name: my-cluster-name
    spec:
      workerNodeGroupConfigurations:
        - autoscalingConfiguration:
            minCount: 1
            maxCount: 5
          machineGroupRef:
            kind: VSphereMachineConfig
            name: worker-machine-a
          name: md-0
        - count: 1
          autoscalingConfiguration:
            minCount: 1
            maxCount: 3
          machineGroupRef:
            kind: VSphereMachineConfig
            name: worker-machine-b
          name: md-1
    

Note that if no count is specified it will default to the minCount value.

EKS Anywhere will automatically apply the following annotations to your MachineDeployment objects:

cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: <minCount>
cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: <maxCount>

After deploying the Kubernetes Cluster Autoscaler from upstream or as a curated package, the deployment will pick up your MachineDeployment and scale the nodes as per your min and max count values.
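The defaulting and annotation rules described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not EKS Anywhere's implementation: the function name resolve_node_group is hypothetical, and the annotation keys are the min-size and max-size keys read by the upstream Cluster Autoscaler's clusterapi provider.

```python
from typing import Any, Dict

# Annotation keys used by the Cluster Autoscaler clusterapi provider.
MIN_SIZE = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"
MAX_SIZE = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"


def resolve_node_group(group: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: return the effective count and the
    annotations expected for one workerNodeGroupConfigurations entry."""
    autoscaling = group.get("autoscalingConfiguration")
    count = group.get("count")
    if autoscaling is None:
        # No autoscaling block: count is used as-is, no annotations.
        return {"count": count, "annotations": {}}
    if count is None:
        # count is optional with autoscaling and defaults to minCount.
        count = autoscaling["minCount"]
    return {
        "count": count,
        "annotations": {
            MIN_SIZE: str(autoscaling["minCount"]),
            MAX_SIZE: str(autoscaling["maxCount"]),
        },
    }


md0 = resolve_node_group(
    {"name": "md-0", "autoscalingConfiguration": {"minCount": 1, "maxCount": 5}}
)
print(md0["count"], md0["annotations"][MAX_SIZE])
```

For the md-0 group in the example spec, which omits count, the sketch resolves count to the minCount value.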

Cluster Autoscaler Deployment Topologies

The Kubernetes Cluster Autoscaler can only scale a single cluster per deployment.

This means that each cluster you want to scale will need its own cluster autoscaler deployment.

We support three deployment topologies:

  1. Cluster Autoscaler deployed in the management cluster to autoscale the management cluster itself
  2. Cluster Autoscaler deployed in the management cluster to autoscale a remote workload cluster
  3. Cluster Autoscaler deployed in the workload cluster to autoscale the workload cluster itself

If your cluster architecture supports management clusters with resources to run additional workloads, you may want to consider using deployment topologies (1) and (2). Instructions for using these topologies can be found here.

If you run small management clusters, however, you may want to follow deployment topology (3) and deploy the cluster autoscaler to run in a workload cluster.

5.1.6.2 - CNI plugin configuration

EKS Anywhere cluster yaml cni plugin specification reference

Specifying CNI Plugin in EKS Anywhere cluster spec

EKS Anywhere currently supports two CNI plugins: Cilium and Kindnet. Only one of them can be selected for a cluster, and the plugin cannot be changed once the cluster is created. Up until the 0.7.x releases, the plugin had to be specified using the cni field on cluster spec. Starting with release 0.8, the plugin should be specified using the new cniConfig field as follows:

  • For selecting Cilium as the CNI plugin:

    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
      name: my-cluster-name
    spec:
      clusterNetwork:
        pods:
          cidrBlocks:
          - 192.168.0.0/16
        services:
          cidrBlocks:
          - 10.96.0.0/12
        cniConfig:
          cilium: {}
    

    EKS Anywhere selects this as the default plugin when generating a cluster config.

  • Or for selecting Kindnetd as the CNI plugin:

    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
      name: my-cluster-name
    spec:
      clusterNetwork:
        pods:
          cidrBlocks:
          - 192.168.0.0/16
        services:
          cidrBlocks:
          - 10.96.0.0/12
        cniConfig:
          kindnetd: {}
    

NOTE: EKS Anywhere allows specifying only 1 plugin for a cluster and does not allow switching the plugins after the cluster is created.

Policy Configuration options for Cilium plugin

Cilium accepts a policy enforcement mode from the user to determine the allowed traffic between pods. The allowed values for this mode are: default, always, and never. Please refer to the official Cilium documentation for more details on how each mode affects communication within the cluster, and choose a mode accordingly. If you do not set this field, Cilium is launched with the default mode. Starting with release 0.8, Cilium’s policy enforcement mode can be set through the cluster spec as follows:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
    cniConfig:
      cilium: 
        policyEnforcementMode: "always"

Please note that if the always mode is selected, all communication between pods is blocked unless NetworkPolicy objects allowing communication are created. In order to ensure that the cluster gets created successfully, EKS Anywhere will create the required NetworkPolicy objects for all its core components. But it is up to the user to create the NetworkPolicy objects needed for the user workloads once the cluster is created.

Network policies created by EKS Anywhere for “always” mode

As mentioned above, if Cilium is configured with policyEnforcementMode set to always, EKS Anywhere creates NetworkPolicy objects to enable communication between its core components. EKS Anywhere will create NetworkPolicy resources in the following namespaces allowing all ingress/egress traffic by default:

  • kube-system
  • eksa-system
  • All core Cluster API namespaces:
    • capi-system
    • capi-kubeadm-bootstrap-system
    • capi-kubeadm-control-plane-system
    • etcdadm-bootstrap-provider-system
    • etcdadm-controller-system
    • cert-manager
  • Infrastructure provider’s namespace (for instance, capd-system OR capv-system)
  • If Gitops is enabled, then the gitops namespace (flux-system by default)

This is the NetworkPolicy that will be created in these namespaces for the cluster:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-ingress-egress
  namespace: test
spec:
  podSelector: {}
  ingress:
  - {}
  egress:
  - {}
  policyTypes:
  - Ingress
  - Egress
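Because workloads in always mode need their own NetworkPolicy objects, it can help to generate them programmatically. The following is a minimal sketch that builds the same allow-all policy shown above for a list of namespaces and wraps the result in a v1 List that kubectl apply -f can read as JSON. The helper names and the team-a and team-b namespaces are hypothetical, and an allow-all policy is only a starting point; real workloads usually warrant narrower rules.

```python
import json
from typing import Any, Dict, List


def allow_all_policy(namespace: str) -> Dict[str, Any]:
    """Build the allow-all ingress/egress NetworkPolicy for one namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-all-ingress-egress", "namespace": namespace},
        "spec": {
            "podSelector": {},          # empty selector matches every pod
            "ingress": [{}],            # one empty rule allows all ingress
            "egress": [{}],             # one empty rule allows all egress
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def policy_list(namespaces: List[str]) -> Dict[str, Any]:
    """Wrap one policy per namespace in a v1 List object."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [allow_all_policy(ns) for ns in namespaces],
    }


if __name__ == "__main__":
    # Hypothetical user namespaces; replace with your own.
    print(json.dumps(policy_list(["team-a", "team-b"]), indent=2))
```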

Switching the Cilium policy enforcement mode

The policy enforcement mode for Cilium can be changed as a part of cluster upgrade through the cli upgrade command.

  1. Switching to always mode: When switching from default/never to always mode, EKS Anywhere will create the required NetworkPolicy objects for its core components (listed above). This will ensure that the cluster gets upgraded successfully. But it is up to the user to create the NetworkPolicy objects required for the user workloads.

  2. Switching from always mode: When switching from always to default mode, EKS Anywhere will not delete any of the existing NetworkPolicy objects, including the ones required for EKS Anywhere components (listed above). The user must delete NetworkPolicy objects as needed.

Node IPs configuration option

Starting with release v0.10, the node-cidr-mask-size flag for the Kubernetes controller manager (kube-controller-manager) is configurable via the EKS Anywhere cluster spec. Because clusterNetwork.nodes is an optional field, it is not generated in the EKS Anywhere spec by the generate clusterconfig command. The nodes block must be added manually to the cluster spec under the clusterNetwork section:

  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
    cniConfig:
      cilium: {}
    nodes:
      cidrMaskSize: 24

If the user does not specify the clusterNetwork.nodes field in the cluster yaml spec, the value for this flag defaults to 24 for IPv4. Please note that this mask size needs to be greater than the pods CIDR mask size. In the above spec, the pod CIDR mask size is 16 and the node CIDR mask size is 24. This gives the cluster 256 blocks of /24 networks, one per node. For example, node1 will get 192.168.0.0/24, node2 will get 192.168.1.0/24, node3 will get 192.168.2.0/24, and so on.

To support more than 256 nodes, the pods CIDR block must be large enough, and the per-node blocks small enough (that is, a larger node CIDR mask size), to yield that many blocks. For instance, to support 1024 nodes, a user can do either of the following:

  • Set the pods cidr blocks to 192.168.0.0/16 and node cidr mask size to 26
  • Set the pods cidr blocks to 192.168.0.0/15 and node cidr mask size to 25

Please note that the node-cidr-mask-size needs to leave each node enough addresses for the number of pods you want to run on it. A size of 24 gives enough IP addresses for about 250 pods per node; a size of 26 gives only about 60 IPs. This is an immutable field, and the value can’t be updated once the cluster has been created.
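The arithmetic above can be checked with Python's standard ipaddress module. This is a small sketch for IPv4 only; the function name node_capacity is hypothetical, and the figures are raw block and address counts, before any addresses Kubernetes or the CNI may reserve.

```python
import ipaddress
from typing import Tuple


def node_capacity(pods_cidr: str, node_mask_size: int) -> Tuple[int, int]:
    """Return (number of per-node blocks, addresses in each block)."""
    pods = ipaddress.ip_network(pods_cidr)
    if node_mask_size <= pods.prefixlen:
        raise ValueError(
            "node CIDR mask size must be greater than the pods CIDR mask size"
        )
    node_blocks = 2 ** (node_mask_size - pods.prefixlen)
    addresses_per_node = 2 ** (pods.max_prefixlen - node_mask_size)
    return node_blocks, addresses_per_node


# Default case from the text: /16 pods CIDR, /24 per node.
print(node_capacity("192.168.0.0/16", 24))   # (256, 256)

# The two ways of reaching 1024 nodes listed above.
print(node_capacity("192.168.0.0/16", 26))   # (1024, 64)
print(node_capacity("192.168.0.0/15", 25))   # (1024, 128)

# First three per-node blocks for the default case.
blocks = ipaddress.ip_network("192.168.0.0/16").subnets(new_prefix=24)
print([str(next(blocks)) for _ in range(3)])
```

Both 1024-node options yield the same number of blocks, but the /15 option leaves each node twice as many addresses as the /26 option.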

5.1.6.3 - IAM Roles for Service Accounts configuration

EKS Anywhere cluster spec for Pod IAM (IRSA)

IAM Role for Service Account on EKS Anywhere clusters with self-hosted signing keys

IAM Roles for Service Accounts (IRSA) enables applications running in clusters to authenticate with AWS services using IAM roles. The current solution for leveraging this in EKS Anywhere involves creating your own OIDC provider for the cluster and hosting your cluster’s public service account signing key. The public keys, along with the OIDC discovery document, should be hosted somewhere that AWS STS can discover them. The steps below assume the keys will be hosted on a publicly accessible S3 bucket. Refer to this doc to ensure that the S3 bucket is publicly accessible.

The steps below are based on the guide for configuring IRSA for DIY Kubernetes, with modifications specific to EKS Anywhere’s cluster provisioning workflow. The main modification is the process of generating the keys.json document. As per the original guide, the user has to create the service account signing keys, and then use that to create the keys.json document prior to cluster creation. This order is reversed for EKS Anywhere clusters, so you will create the cluster first, and then retrieve the service account signing key generated by the cluster, and use it to create the keys.json document. The sections below show how to do this in detail.

Create an OIDC provider and make its discovery document publicly accessible

  1. Create an S3 bucket to host the public signing keys and OIDC discovery document for your cluster as per this section. Ensure you follow all the steps and save the $HOSTNAME and $ISSUER_HOSTPATH.

  2. Create the OIDC discovery document as follows:

    cat <<EOF > discovery.json
    {
        "issuer": "https://$ISSUER_HOSTPATH",
        "jwks_uri": "https://$ISSUER_HOSTPATH/keys.json",
        "authorization_endpoint": "urn:kubernetes:programmatic_authorization",
        "response_types_supported": [
            "id_token"
        ],
        "subject_types_supported": [
            "public"
        ],
        "id_token_signing_alg_values_supported": [
            "RS256"
        ],
        "claims_supported": [
            "sub",
            "iss"
        ]
    }
    EOF
    
  3. Upload it to the publicly accessible S3 bucket:

    aws s3 cp --acl public-read ./discovery.json s3://$S3_BUCKET/.well-known/openid-configuration
    
  4. Create an OIDC provider for your cluster. Set the Provider URL to https://$ISSUER_HOSTPATH, and audience to sts.amazonaws.com.

  5. Note down the Provider field of OIDC provider after it is created.

  6. Assign an IAM role to this OIDC provider.

    1. From the IAM console, select and click on the OIDC provider created from above, and click on Assign role at the top right.

    2. Select Create a new role.

    3. In the Select type of trusted entity section, choose Web identity.

    4. In the Choose a web identity provider section:

      • For Identity provider, choose the auto selected Identity Provider URL for your cluster.
      • For Audience, choose sts.amazonaws.com.
    5. Choose Next: Permissions.

    6. In the Attach Policy section, select the IAM policy that has the permissions that you want your applications running in the pods to use.

    7. Continue through the next sections, adding tags if desired and a suitable name for this role, and then create the role.

    8. Below is a sample trust policy of IAM role for your pods. Remember to replace Account ID and ISSUER_HOSTPATH with required values.

      {
       "Version": "2012-10-17",
       "Statement": [
        {
         "Effect": "Allow",
         "Principal": {
          "Federated": "arn:aws:iam::111122223333:oidc-provider/ISSUER_HOSTPATH"
         },
         "Action": "sts:AssumeRoleWithWebIdentity",
         "Condition": {
          "__doc_comment": "scope the role to the service account (optional)",
          "StringEquals": {
           "ISSUER_HOSTPATH:sub": "system:serviceaccount:default:my-serviceaccount"
          },
          "__doc_comment": "OR scope the role to a namespace (optional)",
          "StringLike": {
           "ISSUER_HOSTPATH/CLUSTER_ID:sub": ["system:serviceaccount:default:*","system:serviceaccount:observability:*"]
          }
         }
        }
       ]
      }
      
    9. After the role is created, note down the name of this IAM Role as OIDC_IAM_ROLE.

    10. Once the cluster is created, you can create service accounts and grant them this role by editing the trust relationship of this role. You can use the StringLike condition to add the required service accounts, as mentioned in the above sample. Please also refer to the section Configure the trust relationship for the OIDC provider’s IAM Role.
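If you prefer to generate the trust policy rather than edit the sample by hand, the sketch below builds a policy scoped to a single service account using only the StringEquals condition from the sample. The function name build_trust_policy and the issuer value oidc.example.com/my-cluster are hypothetical placeholders; substitute your own account ID and $ISSUER_HOSTPATH.

```python
import json
from typing import Any, Dict


def build_trust_policy(
    account_id: str, issuer_hostpath: str, namespace: str, service_account: str
) -> Dict[str, Any]:
    """Build an IAM trust policy scoped to one Kubernetes service account."""
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": (
                        f"arn:aws:iam::{account_id}:oidc-provider/{issuer_hostpath}"
                    )
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    # Scope the role to exactly one service account.
                    "StringEquals": {f"{issuer_hostpath}:sub": subject}
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = build_trust_policy(
        account_id="111122223333",
        issuer_hostpath="oidc.example.com/my-cluster",  # hypothetical issuer
        namespace="default",
        service_account="my-serviceaccount",
    )
    print(json.dumps(policy, indent=1))
```

Generating the document this way also avoids the repeated __doc_comment keys in the annotated sample, which are there for explanation only.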

Create the EKS Anywhere cluster

  1. When creating the EKS Anywhere cluster, you need to configure the kube-apiserver’s service-account-issuer flag so it can issue and mount projected service account tokens in pods. For this, use the value obtained in the first section for $ISSUER_HOSTPATH as the service-account-issuer. Configure the kube-apiserver by setting this value through the EKS Anywhere cluster spec as follows:
    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
      name: my-cluster-name
    spec:
      podIamConfig:
        serviceAccountIssuer: https://$ISSUER_HOSTPATH
    

Set the remaining fields in cluster spec as required and create the cluster using the eksctl anywhere create cluster command.

Generate keys.json and make it publicly accessible

  1. The cluster provisioning workflow generates a pair of service account signing keys. Retrieve the public signing key generated and used by the cluster, and create a keys.json document containing the public signing key.

    git clone https://github.com/aws/amazon-eks-pod-identity-webhook
    cd amazon-eks-pod-identity-webhook
    kubectl get secret "${CLUSTER_NAME}-sa" -n eksa-system -o jsonpath='{.data.tls\.crt}' | base64 --decode > "${CLUSTER_NAME}-sa.pub"
    go run ./hack/self-hosted/main.go -key ${CLUSTER_NAME}-sa.pub | jq '.keys += [.keys[0]] | .keys[1].kid = ""' > keys.json
    
  2. Upload the keys.json document to the S3 bucket.

    aws s3 cp --acl public-read ./keys.json s3://$S3_BUCKET/keys.json
    

Deploy pod identity webhook

  1. After hosting the service account public signing key and OIDC discovery documents, the applications running in pods can start accessing the desired AWS resources, as long as the pod is mounted with the right service account tokens. Configuring the pods with the right service account tokens and env vars is automated by the amazon pod identity webhook. Once the webhook is deployed, it mutates any pods launched using service accounts annotated with eks.amazonaws.com/role-arn.

  2. Clone amazon-eks-pod-identity-webhook if not done already.

  3. Set the $KUBECONFIG env var to the path of the kubeconfig file for the EKS Anywhere cluster.

  4. Create my-service-account.yaml with OIDC_IAM_ROLE and other annotations as mentioned in sample below.

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: my-serviceaccount
      namespace: default
      annotations:
        # set this with value of OIDC_IAM_ROLE      
        eks.amazonaws.com/role-arn: "arn:aws:iam::111122223333:role/s3-reader"
        # optional: Defaults to "sts.amazonaws.com" if not set
        eks.amazonaws.com/audience: "sts.amazonaws.com"
        # optional: When set to "true", adds AWS_STS_REGIONAL_ENDPOINTS env var
        #   to containers
        eks.amazonaws.com/sts-regional-endpoints: "true"
        # optional: Defaults to 86400 for expirationSeconds if not set
        #   Note: This value can be overwritten if specified in the pod 
        #         annotation as shown in the next step.
        eks.amazonaws.com/token-expiration: "86400"
    
  5. Run the following command to apply the manifests for the amazon-eks-pod-identity-webhook. The image used here will be pulled from docker.io. Optionally, the image can be imported into (or proxied through) your private registry. Change the IMAGE= argument here to your private registry if needed.

    make cluster-up IMAGE=amazon/amazon-eks-pod-identity-webhook:latest
    
  6. Finally, apply the my-service-account.yaml file to create your service account.

    kubectl apply -f my-service-account.yaml
    
  7. You can validate IRSA by using the test steps mentioned here. Ensure the awscli pod is deployed in the same namespace as the ServiceAccount pod-identity-webhook.

Configure the trust relationship for the OIDC provider’s IAM Role

In order to grant certain service accounts access to the desired AWS resources, edit the trust relationship for the OIDC provider’s IAM Role (OIDC_IAM_ROLE) created in the first section, and add in the desired service accounts.

  1. Choose the role in the console to open it for editing.

  2. Choose the Trust relationships tab, and then choose Edit trust relationship.

  3. Find the line that looks similar to the following:

    "$ISSUER_HOSTPATH:aud": "sts.amazonaws.com"
    
  4. Change the line to look like the following line. Replace aud with sub and replace KUBERNETES_SERVICE_ACCOUNT_NAMESPACE and KUBERNETES_SERVICE_ACCOUNT_NAME with the name of your Kubernetes service account and the Kubernetes namespace that the account exists in.

    "$ISSUER_HOSTPATH:sub": "system:serviceaccount:KUBERNETES_SERVICE_ACCOUNT_NAMESPACE:KUBERNETES_SERVICE_ACCOUNT_NAME"
    
  5. Refer to this doc for different ways of configuring one or multiple service accounts through the condition operators in the trust relationship.

  6. Choose Update Trust Policy to finish.

5.1.6.4 - etcd configuration

EKS Anywhere cluster yaml etcd specification reference

NOTE: Currently, the Unstacked etcd topology is not supported with the Amazon EKS Anywhere Bare Metal and Nutanix deployment options.

There are two types of etcd topologies for configuring a Kubernetes cluster:

  • Stacked: The etcd members and control plane components are colocated (run on the same node/machines)
  • Unstacked/External: With the unstacked or external etcd topology, etcd members have dedicated machines and are not colocated with control plane components

The unstacked etcd topology is recommended for a HA cluster for the following reasons:

  • External etcd topology decouples the control plane components and etcd member. So if a control plane-only node fails, or if there is a memory leak in a component like kube-apiserver, it won’t directly impact an etcd member.
  • Etcd is resource intensive, so it is safer to have dedicated nodes for etcd, since it could use more disk space or higher bandwidth. Having a separate etcd cluster for these reasons could ensure a more resilient HA setup.

EKS Anywhere supports both topologies. In order to configure a cluster with the unstacked/external etcd topology, you need to configure your cluster by updating the configuration file before creating the cluster. This is a generic template with detailed descriptions below for reference:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
   name: my-cluster-name
spec:
   clusterNetwork:
      pods:
         cidrBlocks:
            - 192.168.0.0/16
      services:
         cidrBlocks:
            - 10.96.0.0/12
      cniConfig:
         cilium: {}
   controlPlaneConfiguration:
      count: 1
      endpoint:
         host: ""
      machineGroupRef:
         kind: VSphereMachineConfig
         name: my-cluster-name-cp
   datacenterRef:
      kind: VSphereDatacenterConfig
      name: my-cluster-name
   # etcd configuration
   externalEtcdConfiguration:
      count: 3
      machineGroupRef:
        kind: VSphereMachineConfig
        name: my-cluster-name-etcd
   kubernetesVersion: "1.19"
   workerNodeGroupConfigurations:
      - count: 1
        machineGroupRef:
           kind: VSphereMachineConfig
           name: my-cluster-name
        name: md-0

externalEtcdConfiguration (under Cluster)

This field accepts any configuration parameters for running external etcd.

count (required)

This determines the number of etcd members in the cluster. The recommended number is 3.
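One common reason for choosing an odd member count such as 3 is etcd's majority quorum: the cluster stays available only while more than half of its members are healthy. The sketch below shows that general arithmetic; it describes etcd's quorum rule rather than anything specific to EKS Anywhere, and the function name is hypothetical.

```python
from typing import Tuple


def etcd_quorum(members: int) -> Tuple[int, int]:
    """Return (quorum size, member failures tolerated) for an etcd cluster."""
    if members < 1:
        raise ValueError("an etcd cluster needs at least one member")
    quorum = members // 2 + 1
    return quorum, members - quorum


for n in (1, 2, 3, 4, 5):
    quorum, tolerated = etcd_quorum(n)
    print(f"{n} members: quorum {quorum}, tolerates {tolerated} failure(s)")
```

Under this rule, adding a fourth member to a three-member cluster raises the quorum without increasing the number of failures tolerated.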

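machineGroupRef (required)

Refers to the Kubernetes object with provider-specific configuration for your etcd members (my-cluster-name-etcd in the template above).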

5.1.6.5 - AWS IAM Authenticator configuration

EKS Anywhere cluster yaml specification AWS IAM Authenticator reference

AWS IAM Authenticator support (optional)

EKS Anywhere can create clusters that support AWS IAM Authenticator-based api server authentication. In order to add IAM Authenticator support, you need to configure your cluster by updating the configuration file before creating the cluster. This is a generic template with detailed descriptions below for reference:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
   name: my-cluster-name
spec:
   ...
   # IAM Authenticator support
   identityProviderRefs:
      - kind: AWSIamConfig
        name: aws-iam-auth-config
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: AWSIamConfig
metadata:
   name: aws-iam-auth-config
spec:
    awsRegion: ""
    backendMode:
        - ""
    mapRoles:
        - roleARN: arn:aws:iam::XXXXXXXXXXXX:role/myRole
          username: myKubernetesUsername
          groups:
          - ""
    mapUsers:
        - userARN: arn:aws:iam::XXXXXXXXXXXX:user/myUser
          username: myKubernetesUsername
          groups:
          - ""
    partition: ""

identityProviderRefs (Under Cluster)

List of identity providers you want configured for the Cluster. This would include a reference to the AWSIamConfig object with the configuration below.

awsRegion (required)

  • Description: awsRegion can be any region in the AWS partition that the IAM roles exist in.
  • Type: string

backendMode (required)

  • Description: backendMode configures the IAM authenticator server’s backend mode (i.e. where to source mappings from). We support the EKSConfigMap and CRD modes of AWS IAM Authenticator. For more details, refer to backendMode.
  • Type: string

mapRoles, mapUsers

  • Description: When using EKSConfigMap backendMode, we recommend providing either mapRoles or mapUsers to set the IAM role mappings at the time of creation. This input is added to an EKS style ConfigMap. For more details, refer to EKS IAM.
  • Type: list object

    roleARN, userARN (required)

    • Description: IAM ARN to authenticate to the cluster. roleARN specifies an IAM role and userARN specifies an IAM user.
    • Type: string

    username (required)

    • Description: The Kubernetes username the IAM ARN is mapped to in the cluster. The ARN gets mapped to the Kubernetes cluster permissions associated with the username.
    • Type: string

    groups

    • Description: List of kubernetes user groups that the mapped IAM ARN is given permissions to.
    • Type: list string

partition

  • Description: This field is used to set the aws partition that the IAM roles are present in. Default value is aws.
  • Type: string
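As a rough illustration of how the partition field relates to the ARNs in mapRoles and mapUsers, the following Python sketch extracts the partition segment from an ARN and reports entries that do not match the configured partition. The helper names are hypothetical and are not part of EKS Anywhere; the only fact relied on is the general ARN layout arn:<partition>:<service>:<region>:<account>:<resource>.

```python
def arn_partition(arn: str) -> str:
    """Return the partition segment of an ARN (arn:<partition>:<service>:...)."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return parts[1]


def mismatched_arns(arns: list, partition: str = "aws") -> list:
    """List the ARNs whose partition differs from the configured partition.

    The default of "aws" mirrors the documented default of the partition field.
    """
    return [arn for arn in arns if arn_partition(arn) != partition]
```

An empty result means that every mapped ARN lives in the partition named in the AWSIamConfig.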

5.1.6.6 - OIDC configuration

EKS Anywhere cluster yaml specification OIDC reference

OIDC support (optional)

EKS Anywhere can create clusters that support API server OIDC authentication.

In order to add OIDC support, you need to configure your cluster by updating the configuration file to include the details below. The OIDC configuration can be added at cluster creation time, or introduced via a cluster upgrade in VMware and CloudStack.

This is a generic template with detailed descriptions below for reference:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
   name: my-cluster-name
spec:
   ...
   # OIDC support
   identityProviderRefs:
      - kind: OIDCConfig
        name: my-cluster-name
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: OIDCConfig
metadata:
   name: my-cluster-name
spec:
    clientId: ""
    groupsClaim: ""
    groupsPrefix: ""
    issuerUrl: "https://x"
    requiredClaims:
      - claim: ""
        value: ""
    usernameClaim: ""
    usernamePrefix: ""

identityProviderRefs (Under Cluster)

List of identity providers you want configured for the Cluster. This would include a reference to the OIDCConfig object with the configuration below.

clientId (required)

  • Description: ClientId defines the client ID for the OpenID Connect client
  • Type: string

groupsClaim (optional)

  • Description: GroupsClaim defines the name of a custom OpenID Connect claim for specifying user groups
  • Type: string

groupsPrefix (optional)

  • Description: GroupsPrefix defines a string to be prefixed to all groups to prevent conflicts with other authentication strategies
  • Type: string

issuerUrl (required)

  • Description: IssuerUrl defines the URL of the OpenID issuer, only HTTPS scheme will be accepted
  • Type: string

requiredClaims (optional)

List of RequiredClaim objects listed below. Only one is supported at this time.

requiredClaims[0] (optional)

  • Description: RequiredClaim defines a key=value pair that describes a required claim in the ID Token
    • claim
      • type: string
    • value
      • type: string
  • Type: object

usernameClaim (optional)

  • Description: UsernameClaim defines the OpenID claim to use as the user name. Note that claims other than the default (‘sub’) are not guaranteed to be unique and immutable.
  • Type: string

usernamePrefix (optional)

  • Description: UsernamePrefix defines a string to be prefixed to all usernames. If not provided, username claims other than ‘email’ are prefixed by the issuer URL to avoid clashes. To skip any prefixing, provide the value ‘-’.
  • Type: string
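The usernamePrefix rules above can be sketched as a small Python function. This is an illustration only, not code from EKS Anywhere or Kubernetes; the function name is hypothetical. It assumes the upstream Kubernetes API server convention that the default prefix is the issuer URL followed by a # character.

```python
def effective_username(claim_name: str, claim_value: str,
                       issuer_url: str, username_prefix: str = "") -> str:
    """Apply the usernamePrefix rules to a username claim value."""
    if username_prefix == "-":
        # "-" disables all prefixing.
        return claim_value
    if username_prefix:
        # An explicit prefix is prepended as-is.
        return username_prefix + claim_value
    if claim_name == "email":
        # The email claim is not prefixed by default.
        return claim_value
    # Assumption: default prefix is "<issuer URL>#", as in upstream Kubernetes.
    return f"{issuer_url}#{claim_value}"
```

For example, with issuerUrl set to https://x and no usernamePrefix, a sub claim of abc maps to the Kubernetes username https://x#abc under this sketch.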

5.1.6.7 - GitOpsConfig configuration

Configuration reference for GitOps cluster management.

GitOps Support (Optional)

EKS Anywhere can create clusters that support GitOps configuration management with Flux. To add GitOps support, reference a GitOps configuration in the gitOpsRef field of the cluster configuration file when creating or upgrading the cluster. We currently support two types of configurations: FluxConfig and GitOpsConfig.

Flux Configuration

The flux configuration spec has three optional fields, regardless of the chosen git provider.

Flux Configuration Spec Details

systemNamespace (optional)

  • Description: Namespace in which to install the gitops components in your cluster. Defaults to flux-system
  • Type: string

clusterConfigPath (optional)

  • Description: The path relative to the root of the git repository where EKS Anywhere will store the cluster configuration files. Defaults to the cluster name
  • Type: string

branch (optional)

  • Description: The branch to use when committing the configuration. Defaults to main
  • Type: string

EKS Anywhere currently supports two git providers for FluxConfig: GitHub and Git.

GitHub provider

For the Flux config to work with the GitHub provider, the environment variable EKSA_GITHUB_TOKEN must be set to a valid GitHub PAT. This is a generic template with detailed descriptions below for reference:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
  namespace: default
spec:
  ...
  #GitOps Support
  gitOpsRef:
    name: my-github-flux-provider
    kind: FluxConfig
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: FluxConfig
metadata:
  name: my-github-flux-provider
  namespace: default
spec:
  systemNamespace: "my-alternative-flux-system-namespace"
  clusterConfigPath: "path-to-my-clusters-config"
  branch: "main"
  github:
    personal: true
    repository: myClusterGitopsRepo
    owner: myGithubUsername

---

github Configuration Spec Details

repository (required)

  • Description: The name of the repository where EKS Anywhere will store your cluster configuration, and sync it to the cluster. If the repository exists, we will clone it from the git provider; if it does not exist, we will create it for you.
  • Type: string

owner (required)

  • Description: The owner of the GitHub repository; either a GitHub username or GitHub organization name. The Personal Access Token used must belong to the owner if this is a personal repository, or have permissions over the organization if this is not a personal repository.
  • Type: string

personal (optional)

  • Description: Is the repository a personal or organization repository? If personal, this value is true; otherwise, false. If using an organizational repository (i.e., personal is false), the owner field will be used as the organization when authenticating to github.com.
  • Default: true
  • Type: boolean

Git provider

Before you create a cluster using the Git provider, you will need to set and export the EKSA_GIT_KNOWN_HOSTS and EKSA_GIT_PRIVATE_KEY environment variables.

EKSA_GIT_KNOWN_HOSTS

EKS Anywhere uses the provided known hosts file to verify the identity of the git provider when connecting to it with SSH. The EKSA_GIT_KNOWN_HOSTS environment variable should be a path to a known hosts file containing entries for the git server to which you’ll be connecting.

For example, to provide a known hosts file that lets you connect to and verify the identity of github.com using a private key based on the ecdsa key algorithm, you can use the OpenSSH utility ssh-keyscan to obtain the known hosts entry that github.com uses for the ecdsa key type. EKS Anywhere supports the ecdsa, rsa, and ed25519 key types, which can be specified via the sshKeyAlgorithm field of the git provider config.

ssh-keyscan -t ecdsa github.com >> my_eksa_known_hosts

This will produce a file which contains known-hosts entries for the ecdsa key type supported by github.com, mapping the host to the key-type and public key.

github.com ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBEmKSENjQEezOmxkZMy7opKgwFB9nkt5YRrYMjNuG5N87uRgg6CLrbo5wAdT/y6v0mKV0U2w0WZ2YB/++Tpockg=

EKS Anywhere will use the content of the file at the path EKSA_GIT_KNOWN_HOSTS to verify the identity of the remote git server, and the provided known hosts file must contain an entry for the remote host and key type.
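The requirement above (an entry for the remote host and key type) can be checked ahead of time with a short Python sketch. This is a minimal illustration and not how EKS Anywhere performs the verification. The function name and the algorithm-to-key-type mapping are assumptions, and the sketch handles only plain entries: hashed host names, wildcards, and [host]:port forms are not supported.

```python
# Assumed mapping from sshKeyAlgorithm values to key-type names as they
# appear in the second column of a known hosts entry.
KEY_TYPE_PREFIXES = {
    "ecdsa": ("ecdsa-sha2-",),   # e.g. ecdsa-sha2-nistp256
    "rsa": ("ssh-rsa",),
    "ed25519": ("ssh-ed25519",),
}


def has_known_host_entry(known_hosts_text: str, host: str, algorithm: str) -> bool:
    """Return True if a plain known hosts entry exists for host and algorithm."""
    prefixes = KEY_TYPE_PREFIXES[algorithm]
    for line in known_hosts_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and marker lines such as @cert-authority.
        if len(fields) < 3 or fields[0].startswith(("#", "@")):
            continue
        hosts, key_type = fields[0].split(","), fields[1]
        if host in hosts and key_type.startswith(prefixes):
            return True
    return False
```

Reading the file named by EKSA_GIT_KNOWN_HOSTS and passing its content together with the git server host name and the configured sshKeyAlgorithm gives a quick pre-flight check.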

EKSA_GIT_PRIVATE_KEY

The EKSA_GIT_PRIVATE_KEY environment variable should be a path to the private key file associated with a valid SSH public key registered with your Git provider. This key must have permission to both read from and write to your repository. The key can use the key algorithms rsa, ecdsa, and ed25519.

This key file must have restricted file permissions, allowing only the owner to read and write, such as octal permissions 600.
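One way to confirm the permission requirement before creating a cluster is sketched below in Python. The function name is hypothetical, and the check (no permission bits set for group or others) is one reasonable reading of "only the owner can read and write"; it assumes a POSIX file system.

```python
import os
import stat


def has_owner_only_permissions(path: str) -> bool:
    """Return True if neither group nor others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # 0o077 masks the group and other permission bits; 0o600 passes this check.
    return mode & 0o077 == 0
```

If the check fails, running chmod 600 on the key file restores the expected permissions.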

If your private key file is passphrase protected, you must also set EKSA_GIT_SSH_KEY_PASSPHRASE with that value.

This is a generic template with detailed descriptions below for reference:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
  namespace: default
spec:
  ...
  #GitOps Support
  gitOpsRef:
    name: my-git-flux-provider
    kind: FluxConfig
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: FluxConfig
metadata:
  name: my-git-flux-provider
  namespace: default
spec:
  systemNamespace: "my-alternative-flux-system-namespace"
  clusterConfigPath: "path-to-my-clusters-config"
  branch: "main"
  git:
    repositoryUrl: ssh://git@github.com/myAccount/myClusterGitopsRepo.git
    sshKeyAlgorithm: ecdsa
---

git Configuration Spec Details

repositoryUrl (required)

NOTE: The repositoryUrl value for private SSH repositories is of the format ssh://git@provider.com/$REPO_OWNER/$REPO_NAME.git. This may differ from the default SSH URL given by your provider. For example, the github.com user interface provides an SSH URL containing a : before the repository owner, rather than a /. Make sure to replace this : with a /, if present.

  • Description: The URL of an existing repository where EKS Anywhere will store your cluster configuration and sync it to the cluster. For private repositories, the SSH URL will be of the format ssh://git@provider.com/$REPO_OWNER/$REPO_NAME.git
  • Type: string
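The URL adjustment described in the note above can be automated. The following Python sketch converts the scp-style address that many providers display (user@host:owner/repo.git) into the ssh:// form expected by repositoryUrl. The function name and the regular expression are illustrative assumptions and are not part of EKS Anywhere.

```python
import re

# Matches scp-style addresses such as git@github.com:owner/repo.git
_SCP_STYLE = re.compile(r"^(?P<user>[^@/\s]+)@(?P<host>[^:/\s]+):(?P<path>[^/\s]\S*)$")


def to_ssh_url(url: str) -> str:
    """Return url in the form ssh://user@host/owner/repo.git."""
    if url.startswith("ssh://"):
        return url  # already in the expected format
    match = _SCP_STYLE.match(url)
    if match is None:
        raise ValueError(f"unrecognized SSH repository address: {url!r}")
    # Replace the ':' before the repository owner with '/' and add the scheme.
    return f"ssh://{match['user']}@{match['host']}/{match['path']}"
```

For instance, git@github.com:myAccount/myClusterGitopsRepo.git becomes ssh://git@github.com/myAccount/myClusterGitopsRepo.git.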

sshKeyAlgorithm (optional)

  • Description: The SSH key algorithm of the private key specified via EKSA_GIT_PRIVATE_KEY. Defaults to ecdsa
  • Type: string

Supported SSH key algorithm types are ecdsa, rsa, and ed25519.

Be sure that this SSH key algorithm matches the private key file provided by EKSA_GIT_PRIVATE_KEY and that the known hosts entry for the key type is present in EKSA_GIT_KNOWN_HOSTS.

GitOps Configuration

For the GitOps config to work, the environment variable EKSA_GITHUB_TOKEN must be set to a valid GitHub PAT. This is a generic template with detailed descriptions below for reference:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
  namespace: default
spec:
  ...
  #GitOps Support
  gitOpsRef:
    name: my-gitops
    kind: GitOpsConfig
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: GitOpsConfig
metadata:
  name: my-gitops
  namespace: default
spec:
  flux:
    github:
      personal: true
      repository: myClusterGitopsRepo
      owner: myGithubUsername
      fluxSystemNamespace: ""
      clusterConfigPath: ""

GitOps Configuration Spec Details

flux (required)

  • Description: our supported gitops provider is flux. This is the only supported value.
  • Type: object

Flux Configuration Spec Details

github (required)

  • Description: github is the only currently supported git provider. This defines your github configuration to be used by EKS Anywhere and flux.
  • Type: object

github Configuration Spec Details

repository (required)

  • Description: The name of the repository where EKS Anywhere will store your cluster configuration, and sync it to the cluster. If the repository exists, we will clone it from the git provider; if it does not exist, we will create it for you.
  • Type: string

owner (required)

  • Description: The owner of the GitHub repository; either a GitHub username or GitHub organization name. The Personal Access Token used must belong to the owner if this is a personal repository, or have permissions over the organization if this is not a personal repository.
  • Type: string

personal (optional)

  • Description: Is the repository a personal or organization repository? If personal, this value is true; otherwise, false. If using an organizational repository (i.e., personal is false), the owner field will be used as the organization when authenticating to github.com.
  • Default: true
  • Type: boolean

clusterConfigPath (optional)

  • Description: The path relative to the root of the git repository where EKS Anywhere will store the cluster configuration files.
  • Default: clusters/$MANAGEMENT_CLUSTER_NAME
  • Type: string

fluxSystemNamespace (optional)

  • Description: Namespace in which to install the gitops components in your cluster.
  • Default: flux-system.
  • Type: string

branch (optional)

  • Description: The branch to use when committing the configuration.
  • Default: main
  • Type: string

5.1.6.8 - Proxy configuration

EKS Anywhere cluster yaml specification proxy configuration reference

Proxy support (optional)

You can configure EKS Anywhere to use a proxy to connect to the Internet. This is the generic template with proxy configuration for your reference:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
   name: my-cluster-name
spec:
   ...
   proxyConfiguration:
      httpProxy: http-proxy-ip:port
      httpsProxy: https-proxy-ip:port
      noProxy:
      - list of no proxy endpoints

Configuring Docker daemon

EKS Anywhere will proxy for you given the above configuration file. However, to successfully use EKS Anywhere you will also need to ensure your Docker daemon is configured to use the proxy.

This generally means updating your daemon to launch with the HTTPS_PROXY, HTTP_PROXY, and NO_PROXY environment variables.

For an example of how to do this with systemd, see Docker’s documentation.

Configuring EKS Anywhere proxy without config file

For commands using a cluster config file, EKS Anywhere will derive its proxy config from the cluster configuration file.

However, for commands that do not utilize a cluster config file, you can set the following environment variables:

export HTTPS_PROXY=https-proxy-ip:port
export HTTP_PROXY=http-proxy-ip:port
export NO_PROXY=no-proxy-domain.com,another-domain.com,localhost

Proxy Configuration Spec Details

proxyConfiguration (required)

  • Description: top level key; required to use proxy.
  • Type: object

httpProxy (required)

  • Description: HTTP proxy to use to connect to the internet; must be in the format IP:port
  • Type: string
  • Example: httpProxy: 192.168.0.1:3218

httpsProxy (required)

  • Description: HTTPS proxy to use to connect to the internet; must be in the format IP:port
  • Type: string
  • Example: httpsProxy: 192.168.0.1:3218
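Because httpProxy and httpsProxy must be in the format IP:port, a value can be sanity-checked before it goes into the cluster spec. The Python sketch below is a hypothetical helper, not EKS Anywhere's own validation, and it assumes IPv4 addresses only.

```python
import ipaddress


def is_ip_port(value: str) -> bool:
    """Return True if value looks like <IPv4 address>:<port>."""
    host, sep, port = value.rpartition(":")
    if not sep or not (port.isascii() and port.isdigit()):
        return False
    try:
        ipaddress.IPv4Address(host)
    except ValueError:
        return False
    # Valid TCP ports are 1 through 65535.
    return 1 <= int(port) <= 65535
```

The placeholder values in the template above, such as http-proxy-ip:port, are rejected by this check, while 192.168.0.1:3218 is accepted.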

noProxy (optional)

  • Description: list of endpoints that should not be routed through the proxy; can be an IP, CIDR block, or a domain name
  • Type: list of strings
  • Example
  noProxy:
   - localhost
   - 192.168.0.1
   - 192.168.0.0/16
   - .example.com
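To illustrate the three kinds of noProxy entries (IP, CIDR block, domain name), the following Python sketch classifies each entry with the standard ipaddress module. The function name and the returned labels are illustrative assumptions; anything that is not an IP address or CIDR block is simply treated as a domain name without further validation.

```python
import ipaddress


def classify_no_proxy_entry(entry: str) -> str:
    """Classify a noProxy entry as 'ip', 'cidr', or 'domain'."""
    try:
        ipaddress.ip_address(entry)
        return "ip"
    except ValueError:
        pass
    if "/" in entry:
        try:
            # strict=False tolerates host bits being set, e.g. 192.168.0.1/16.
            ipaddress.ip_network(entry, strict=False)
            return "cidr"
        except ValueError:
            pass
    return "domain"
```

Applied to the example list above, the entries are classified as domain, ip, cidr, and domain, in that order.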

5.1.6.9 - Registry Mirror configuration

EKS Anywhere cluster yaml specification for registry mirror configuration

Registry Mirror Support (optional)

You can configure EKS Anywhere to use a private registry as a mirror for pulling the required images.

The following cluster spec shows an example of how to configure registry mirror:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  ...
  registryMirrorConfiguration:
    endpoint: <private registry IP or hostname>
    port: <private registry port>
    ociNamespaces:
      - registry: <upstream registry IP or hostname>
        namespace: <namespace in private registry>
      ...
    caCertContent: |
      -----BEGIN CERTIFICATE-----
      MIIF1DCCA...
      ...
      es6RXmsCj...
      -----END CERTIFICATE-----        

Registry Mirror Configuration Spec Details

registryMirrorConfiguration (required)

  • Description: top level key; required to use a private registry.
  • Type: object

endpoint (required)

  • Description: IP address or hostname of the private registry for pulling images
  • Type: string
  • Example: endpoint: 192.168.0.1

port (optional)

  • Description: port for the private registry. This is an optional field. If a port is not specified, the default HTTPS port 443 is used
  • Type: string
  • Example: port: 443

ociNamespaces (optional)

  • Description: mapping from upstream registries to locations in the private registry. When specified, the artifacts pulled from an upstream registry are put in the corresponding location/namespace in the private registry. The target location/namespace must already exist.
  • Type: array
  • Example:
    ociNamespaces:
      - registry: "public.ecr.aws"
        namespace: "eks-anywhere"
      - registry: "783794618700.dkr.ecr.us-west-2.amazonaws.com"
        namespace: "curated-packages"
    

caCertContent (optional)

  • Description: Certificate Authority (CA) certificate for the private registry. When using self-signed certificates, you must pass this parameter in the cluster spec. This must be the individual public CA certificate used to sign the registry certificate. It is added to the cluster nodes so that they can pull images from the private registry.

    It is also possible to configure caCertContent by exporting an environment variable:
    export EKSA_REGISTRY_MIRROR_CA="/path/to/certificate-file"

  • Type: string

  • Example:

    caCertContent: |
      -----BEGIN CERTIFICATE-----
      MIIF1DCCA...
      ...
      es6RXmsCj...
      -----END CERTIFICATE-----  
    

authenticate (optional)

  • Description: optional field to authenticate with a private registry. When using private registries that require authentication, it is necessary to set this parameter to true in the cluster spec.
  • Type: boolean
  • Example: authenticate: true

To use an authenticated private registry, please also set the following environment variables:

export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>

insecureSkipVerify (optional)

  • Description: optional field to skip the registry certificate verification. Only use this solution for isolated testing or in a tightly controlled, air-gapped environment. Currently only supported for Ubuntu OS.
  • Type: boolean

Import images into a private registry

You can use the download images and import images commands to pull images from public.ecr.aws and push them to your private registry. Use the copy packages command if you want to copy EKS Anywhere Curated Packages to your registry mirror. The download images command also pulls the cilium chart from public.ecr.aws and pushes it to the registry mirror. It requires the registry credentials for performing a login. Set the following environment variables for the login:

export REGISTRY_ENDPOINT=<registry_endpoint>
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>

Download the EKS Anywhere artifacts to get the EKS Anywhere bundle:

eksctl anywhere download artifacts
tar -xzf eks-anywhere-downloads.tar.gz

Download and import EKS Anywhere images:

eksctl anywhere download images -o eks-anywhere-images.tar
docker login https://${REGISTRY_ENDPOINT}
...
eksctl anywhere import images -i eks-anywhere-images.tar --bundles eks-anywhere-downloads/bundle-release.yaml --registry ${REGISTRY_ENDPOINT}

Use the EKS Anywhere bundle to copy packages:

eksctl anywhere copy packages --bundle ./eks-anywhere-downloads/bundle-release.yaml --dst-cert rootCA.pem ${REGISTRY_ENDPOINT}

Docker configurations

It is necessary to add the private registry’s CA Certificate to the list of CA certificates on the admin machine if your registry uses self-signed certificates.

For Linux, you can place your certificate here: /etc/docker/certs.d/<private-registry-endpoint>/ca.crt

For Mac, you can follow this guide to add the certificate to your keychain: https://docs.docker.com/desktop/mac/#add-tls-certificates

Registry configurations

Depending on what registry you decide to use, you will need to create the following projects:

bottlerocket
eks-anywhere
eks-distro
isovalent
cilium-chart

For example, if a registry is available at private-registry.local, then the following projects will have to be created:

https://private-registry.local/bottlerocket
https://private-registry.local/eks-anywhere
https://private-registry.local/eks-distro
https://private-registry.local/isovalent
https://private-registry.local/cilium-chart
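The list of project locations for any registry host can be generated with a few lines of Python. This sketch only builds the URLs shown above; it does not create the projects, and the function name is a hypothetical choice.

```python
# Project names taken from the list in this section.
REQUIRED_PROJECTS = [
    "bottlerocket",
    "eks-anywhere",
    "eks-distro",
    "isovalent",
    "cilium-chart",
]


def project_urls(registry_host: str) -> list:
    """Return the project URLs that must exist on the given registry host."""
    return [f"https://{registry_host}/{project}" for project in REQUIRED_PROJECTS]
```

Calling project_urls("private-registry.local") reproduces the five URLs listed above.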

5.1.6.10 - Package controller configuration

EKS Anywhere cluster yaml specification for package controller configuration

Package Controller Configuration (optional)

You can configure the EKS Anywhere package controller.

The following cluster spec shows an example of how to configure the package controller:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  ...
  packages:
    disable: false
    controller:
      resources:
        requests:
          cpu: 100m
          memory: 50Mi
        limits:
          cpu: 750m
          memory: 450Mi


Package Controller Configuration Spec Details

packages (optional)

  • Description: Top level key; required to configure the package controller.
  • Type: object

packages.disable (optional)

  • Description: Disable the package controller.
  • Type: bool
  • Example: disable: true

packages.controller (optional)

  • Description: Settings for the package controller.
  • Type: object

packages.controller.resources (optional)

  • Description: Resources for the package controller.
  • Type: object

packages.controller.resources.limits (optional)

  • Description: Resource limits.
  • Type: object

packages.controller.resources.limits.cpu (optional)

  • Description: CPU limit.
  • Type: string

packages.controller.resources.limits.memory (optional)

  • Description: Memory limit.
  • Type: string

packages.controller.resources.requests (optional)

  • Description: Requested resources.
  • Type: object

packages.controller.resources.requests.cpu (optional)

  • Description: Requested cpu.
  • Type: string

packages.controller.resources.requests.memory (optional)

  • Description: Requested memory.
  • Type: string

packages.cronjob (optional)

  • Description: Settings for the package controller cron job.
  • Type: object

packages.cronjob.disable (optional)

  • Description: Disable the cron job.
  • Type: bool
  • Example: disable: true

packages.cronjob.resources (optional)

  • Description: Resources for the cron job.
  • Type: object

packages.cronjob.resources.limits (optional)

  • Description: Resource limits.
  • Type: object

packages.cronjob.resources.limits.cpu (optional)

  • Description: CPU limit.
  • Type: string

packages.cronjob.resources.limits.memory (optional)

  • Description: Memory limit.
  • Type: string

packages.cronjob.resources.requests (optional)

  • Description: Requested resources.
  • Type: object

packages.cronjob.resources.requests.cpu (optional)

  • Description: Requested cpu.
  • Type: string

packages.cronjob.resources.requests.memory (optional)

  • Description: Requested memory.
  • Type: string
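The cpu and memory values in the resources blocks are Kubernetes quantity strings such as 100m or 50Mi. The Python sketch below converts a subset of those forms to numbers and checks that requests do not exceed limits, which Kubernetes requires. It is a simplified illustration with hypothetical function names: only integer memory values with common suffixes are handled, and it is not the parser used by Kubernetes or EKS Anywhere.

```python
_SUFFIXES = {
    # Binary suffixes (powers of 1024)
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    # Decimal suffixes (powers of 1000)
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '100m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '50Mi' to bytes."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain number of bytes


def requests_within_limits(resources: dict) -> bool:
    """Return True if cpu and memory requests do not exceed their limits."""
    requests, limits = resources["requests"], resources["limits"]
    return (
        cpu_millicores(requests["cpu"]) <= cpu_millicores(limits["cpu"])
        and memory_bytes(requests["memory"]) <= memory_bytes(limits["memory"])
    )
```

With the values from the example spec (requests of 100m and 50Mi, limits of 750m and 450Mi), requests_within_limits returns True.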

5.2 - Bare Metal

Preparing a Bare Metal provider for EKS Anywhere

5.2.1 - Requirements for EKS Anywhere on Bare Metal

Bare Metal provider requirements for EKS Anywhere

To run EKS Anywhere on Bare Metal, you need to meet the hardware and networking requirements described below.

Administrative machine

Set up an Administrative machine as described in Install EKS Anywhere.

Compute server requirements

The minimum number of physical machines needed to run EKS Anywhere on bare metal is 1. To configure EKS Anywhere to run on a single server, set controlPlaneConfiguration.count to 1, and omit workerNodeGroupConfigurations from your cluster configuration.

The recommended number of physical machines for production is at least:

  • Control plane physical machines: 3
  • Worker physical machines: 2

The compute hardware you need for your Bare Metal cluster must meet the following capacity requirements:

  • vCPU: 2
  • Memory: 8GB RAM
  • Storage: 25GB

Upgrade requirements

If you are running a standalone cluster with only one control plane node, you will need at least one additional, temporary machine for each control plane node grouping. For clusters with multiple control plane nodes, you can perform a rolling upgrade with or without an extra temporary machine. For worker node upgrades, you can perform a rolling upgrade with or without an extra temporary machine.

When upgrading without an extra machine, keep in mind that your control plane and your workload must be able to tolerate node unavailability. When upgrading with extra machine(s), you will need additional temporary machine(s) for each control plane and worker node grouping. Refer to Upgrade Bare Metal Cluster and Advanced configuration for rolling upgrade.

NOTE: For single node clusters that require an additional temporary machine for upgrading, if you don’t want to set up the extra hardware, you may recreate the cluster for upgrading and handle data recovery manually.

Network requirements

Each machine should include the following features:

  • Network Interface Cards: At least one NIC is required. It must be capable of network booting.
  • BMC integration (recommended): An IPMI or Redfish implementation (such as Dell iDRAC, Redfish-compatible, legacy, or HP iLO) on the computer’s motherboard or on a separate expansion card. This feature is used to allow remote management of the machine, such as turning the machine on and off.

NOTE: BMC integration is not required for an EKS Anywhere cluster. However, without BMC integration, upgrades are not supported and you will have to physically turn machines off and on when appropriate.

Here are other network requirements:

  • All EKS Anywhere machines, including the Admin, control plane and worker machines, must be on the same layer 2 network and have network connectivity to the BMC (IPMI, Redfish, and so on).

  • You must be able to run DHCP on the control plane/worker machine network.

NOTE: If you have another DHCP service running on the network, you need to prevent it from interfering with the EKS Anywhere DHCP service. You can do that by configuring the other DHCP service to explicitly block all MAC addresses and exclude all IP addresses that you plan to use with your EKS Anywhere clusters.

  • The administrative machine and the target workload environment will need network access to:

    • public.ecr.aws
    • anywhere-assets.eks.amazonaws.com: To download the EKS Anywhere binaries, manifests and OVAs
    • distro.eks.amazonaws.com: To download EKS Distro binaries and manifests
    • d2glxqk2uabbnd.cloudfront.net: For EKS Anywhere and EKS Distro ECR container images
  • Two IP addresses routable from the cluster, but excluded from DHCP offering. One IP address is to be used as the Control Plane Endpoint IP. The other is for the Tinkerbell IP address on the target cluster. Below are some suggestions to ensure that these IP addresses are never handed out by your DHCP server. You may need to contact your network engineer to manage these addresses.

    • Pick IP addresses reachable from the cluster subnet that are excluded from the DHCP range or
    • Create an IP reservation for these addresses on your DHCP server. This is usually accomplished by adding a dummy mapping of this IP address to a non-existent MAC address.

NOTE: When you set up your cluster configuration YAML file, the endpoint and Tinkerbell addresses are set in the controlPlaneConfiguration.endpoint.host and tinkerbellIP fields, respectively.

  • Ports must be open to the Admin machine and cluster machines as described in Ports and protocols.
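The first suggestion for the two reserved addresses (pick addresses in the cluster subnet that fall outside the DHCP range) can be checked with the Python sketch below. It is a hypothetical helper under simplifying assumptions: the candidate address sits in the same subnet as the cluster, and the DHCP pool is a single contiguous range.

```python
import ipaddress


def is_outside_dhcp_pool(ip: str, subnet: str, dhcp_start: str, dhcp_end: str) -> bool:
    """Return True if ip is inside subnet but not in the DHCP pool."""
    address = ipaddress.ip_address(ip)
    network = ipaddress.ip_network(subnet, strict=False)
    in_pool = ipaddress.ip_address(dhcp_start) <= address <= ipaddress.ip_address(dhcp_end)
    return address in network and not in_pool
```

Running this check for both the Control Plane Endpoint IP and the Tinkerbell IP before filling in the cluster configuration helps catch overlaps with the DHCP range early.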

Validated hardware

Based on extensive testing in a variety of on-premises customer environments during our beta phase, we expect Amazon EKS Anywhere on bare metal to run on most generic hardware that meets the above requirements. In addition, we have collaborated with our hardware original equipment manufacturer (OEM) partners to provide you with a list of validated hardware:

| Bare metal servers | BMC | NIC | OS |
|---|---|---|---|
| Dell PowerEdge R740 | iDRAC9 | Mellanox ConnectX-4 LX 25GbE | Validated with Ubuntu v20.04.1 |
| Dell PowerEdge R7525 (NVIDIA Tesla™ T4 GPU’s) | iDRAC9 | Mellanox ConnectX-4 LX 25GbE & Intel Ethernet 10G 4P X710 OCP | Validated with Ubuntu v20.04.1 |
| Dell PowerFlex (R640) | iDRAC9 | Mellanox ConnectX-4 LX 25GbE | Validated with Ubuntu v20.04.1 |
| SuperServer SYS-510P-M | IPMI2.0/Redfish API | Intel® Ethernet Controller i350 2x 1GbE | Validated with Ubuntu v20.04.1 and Bottlerocket v1.8.0 |
| Dell PowerEdge R240 | iDRAC9 | Broadcom 57414 Dual Port 10/25GbE | Validated with Ubuntu v20.04 and Bottlerocket v1.8.0 |
| HPE ProLiant DL20 | iLO5 | HPE 361i 1G | Validated with Ubuntu v20.04 and Bottlerocket v1.8.0 |
| HPE ProLiant DL160 Gen10 | iLO5 | HPE Eth 10/25Gb 2P 640SFP28 A | Validated with Ubuntu v20.04.1 |
| Dell PowerEdge R340 | iDRAC9 | Broadcom 57416 Dual Port 10GbE | Validated with Ubuntu v20.04.1 and Bottlerocket v1.8.0 |
| HPE ProLiant DL360 | iLO5 | HPE Ethernet 1Gb 4-port 331i | Validated with Ubuntu v20.04.1 |
| Lenovo ThinkSystem SR650 V2 | XClarity Controller Enterprise v7.92 | Intel I350 1GbE RJ45 4-port OCP; Marvell QL41232 10/25GbE SFP28 2-Port PCIe Ethernet Adapter | Validated with Ubuntu v20.04.1 |

5.2.2 - Preparing Bare Metal for EKS Anywhere

Set up a Bare Metal cluster to prepare it for EKS Anywhere

After gathering the hardware described in Bare Metal Requirements, you need to prepare the hardware and create a CSV file describing that hardware.

Prepare hardware

To prepare your computer hardware for EKS Anywhere, you need to connect your computer hardware and do some configuration. Once the hardware is in place, you need to:

  • Obtain IP and MAC addresses for your machines' NICs.
  • Obtain IP addresses for your machines' BMC interfaces.
  • Obtain the gateway address for your network to reach the Internet.
  • Obtain the IP address for your DNS servers.
  • Make sure the following settings are in place:
    • UEFI is enabled on all target cluster machines, unless you are provisioning RHEL systems. Enable legacy BIOS on any RHEL machines.
    • Netboot (PXE or HTTP) is enabled for the NIC on each machine for which you provided the MAC address. This is the interface on which the operating system will be provisioned.
    • IPMI over LAN and/or Redfish is enabled on all BMC interfaces.
  • Go to the BMC settings for each machine and set the IP address (bmc_ip), username (bmc_username), and password (bmc_password) to use later in the CSV file.

Prepare hardware inventory

Create a CSV file to provide information about all physical machines that you are ready to add to your target Bare Metal cluster. This file will be used:

  • When you generate the hardware file to be included in the cluster creation process described in the Create Bare Metal production cluster Getting Started guide.
  • To provide information that is passed to each machine from the Tinkerbell DHCP server when the machine is initially network booted.

The following is an example of an EKS Anywhere Bare Metal hardware CSV file:

hostname,bmc_ip,bmc_username,bmc_password,mac,ip_address,netmask,gateway,nameservers,labels,disk
eksa-cp01,10.10.44.1,root,PrZ8W93i,CC:48:3A:00:00:01,10.10.50.2,255.255.254.0,10.10.50.1,8.8.8.8|8.8.4.4,type=cp,/dev/sda
eksa-cp02,10.10.44.2,root,Me9xQf93,CC:48:3A:00:00:02,10.10.50.3,255.255.254.0,10.10.50.1,8.8.8.8|8.8.4.4,type=cp,/dev/sda
eksa-cp03,10.10.44.3,root,Z8x2M6hl,CC:48:3A:00:00:03,10.10.50.4,255.255.254.0,10.10.50.1,8.8.8.8|8.8.4.4,type=cp,/dev/sda
eksa-wk01,10.10.44.4,root,B398xRTp,CC:48:3A:00:00:04,10.10.50.5,255.255.254.0,10.10.50.1,8.8.8.8|8.8.4.4,type=worker,/dev/sda
eksa-wk02,10.10.44.5,root,w7EenR94,CC:48:3A:00:00:05,10.10.50.6,255.255.254.0,10.10.50.1,8.8.8.8|8.8.4.4,type=worker,/dev/sda

The CSV file is a comma-separated list of values in a plain text file, holding information about the physical machines in the datacenter that are intended to be a part of the cluster creation process. Each line represents a physical machine (not a virtual machine).

The following sections describe each value.

hostname

The hostname assigned to the machine.

bmc_ip (optional)

The IP address assigned to the BMC interface on the machine.

bmc_username (optional)

The username assigned to the BMC interface on the machine.

bmc_password (optional)

The password associated with the bmc_username assigned to the BMC interface on the machine.

mac

The MAC address of the network interface card (NIC) that provides access to the host computer.

ip_address

The IP address providing access to the host computer.

netmask

The netmask associated with the ip_address value. In the example above, a /23 subnet mask is used, allowing you to use up to 510 IP addresses in that range.
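
The /23 arithmetic above can be checked with Python's standard ipaddress module. This is only a sketch that reuses the ip_address, netmask, and gateway values from the eksa-cp01 row of the example CSV; the variable names are illustrative.

```python
import ipaddress

# Values taken from the eksa-cp01 row of the example CSV above.
ip_address = "10.10.50.2"
netmask = "255.255.254.0"
gateway = "10.10.50.1"

# ip_interface accepts "address/netmask" and derives the enclosing network.
network = ipaddress.ip_interface(f"{ip_address}/{netmask}").network

# Usable host addresses exclude the network and broadcast addresses.
usable_hosts = network.num_addresses - 2

print(network)                                   # 10.10.50.0/23
print(usable_hosts)                              # 510
print(ipaddress.ip_address(gateway) in network)  # True
```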

gateway

IP address of the interface that provides access (the gateway) to the Internet.

nameservers

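The IP address of the server that you want to provide DNS service to the cluster. To list more than one server, separate the addresses with a pipe character (|), as the example above does with 8.8.8.8|8.8.4.4.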

labels

The optional labels field can consist of a key/value pair to use in conjunction with the hardwareSelector field when you set up your Bare Metal configuration. The key and value are joined with an equals sign (=).

For example, a TinkerbellMachineConfig with a hardwareSelector containing type: cp will match entries in the CSV containing type=cp in its label definition.
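
The matching idea can be pictured with a few lines of Python. This is a rough illustration only, not the code EKS Anywhere runs: parse_label and selector_matches are hypothetical helper names, and the sketch assumes a single key=value pair per CSV entry, as described above.

```python
def parse_label(field):
    """Turn a CSV labels value such as 'type=cp' into {'type': 'cp'}."""
    key, sep, value = field.partition("=")
    if not sep or not key.strip():
        raise ValueError(f"expected key=value, got {field!r}")
    return {key.strip(): value.strip()}


def selector_matches(selector, labels):
    """True when every key/value in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# A hardwareSelector of "type: cp" matches a CSV entry labeled type=cp,
# but not one labeled type=worker.
print(selector_matches({"type": "cp"}, parse_label("type=cp")))      # True
print(selector_matches({"type": "cp"}, parse_label("type=worker")))  # False
```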

disk

The device name of the disk on which the operating system will be installed. For example, it could be /dev/sda for the first SCSI disk or /dev/nvme0n1 for the first NVMe storage device.
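
Before handing the file to EKS Anywhere, it can be useful to sanity-check it yourself. The following Python sketch is one way to do that with the standard csv module. It is not part of the EKS Anywhere tooling: check_hardware_csv is a hypothetical helper, it assumes that every column not marked optional above must be filled in, and it assumes that a MAC address should not appear twice because machines are identified by MAC address.

```python
import csv
import io

EXPECTED_HEADER = [
    "hostname", "bmc_ip", "bmc_username", "bmc_password", "mac",
    "ip_address", "netmask", "gateway", "nameservers", "labels", "disk",
]
# Assumption: columns not described as optional above are required.
REQUIRED = ["hostname", "mac", "ip_address", "netmask", "gateway", "nameservers", "disk"]


def check_hardware_csv(text):
    """Return a list of human-readable problems found in the CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_HEADER:
        return [f"unexpected header: {reader.fieldnames}"]
    problems = []
    seen_macs = set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for column in REQUIRED:
            if not (row[column] or "").strip():
                problems.append(f"line {line_no}: missing {column}")
        mac = (row["mac"] or "").strip().lower()
        if mac and mac in seen_macs:
            problems.append(f"line {line_no}: duplicate mac {mac}")
        seen_macs.add(mac)
    return problems


# Two rows copied from the example CSV above.
SAMPLE = """\
hostname,bmc_ip,bmc_username,bmc_password,mac,ip_address,netmask,gateway,nameservers,labels,disk
eksa-cp01,10.10.44.1,root,PrZ8W93i,CC:48:3A:00:00:01,10.10.50.2,255.255.254.0,10.10.50.1,8.8.8.8|8.8.4.4,type=cp,/dev/sda
eksa-wk01,10.10.44.4,root,B398xRTp,CC:48:3A:00:00:04,10.10.50.5,255.255.254.0,10.10.50.1,8.8.8.8|8.8.4.4,type=worker,/dev/sda
"""

print(check_hardware_csv(SAMPLE))  # []
```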

5.2.3 - Netbooting and Tinkerbell for Bare Metal

Overview of Tinkerbell and network booting for EKS Anywhere on Bare Metal

EKS Anywhere uses Tinkerbell to provision machines for a Bare Metal cluster. Understanding what Tinkerbell is and how it works with EKS Anywhere can help you take advantage of advanced provisioning features or overcome provisioning problems you encounter.

As someone deploying an EKS Anywhere cluster on Bare Metal, you have several opportunities to interact with Tinkerbell:

  • Create a hardware CSV file: You are required to create a hardware CSV file that contains an entry for every physical machine you want to add at cluster creation time.
  • Create an EKS Anywhere cluster: By modifying the Bare Metal configuration file used to create a cluster, you can change some Tinkerbell settings or add actions to define how the operating system on each machine is configured.
  • Monitor provisioning: You can follow along with the Tinkerbell Overview on this page to monitor the progress of your hardware provisioning, as Tinkerbell finds machines and attempts to network boot, configure, and restart them.

Using Tinkerbell on EKS Anywhere

The sections below step through how Tinkerbell is integrated with EKS Anywhere to deploy a Bare Metal cluster. While based on features described in Tinkerbell Documentation , EKS Anywhere has modified and added to Tinkerbell components such that the entire Tinkerbell stack is now Kubernetes-friendly and can run on a Kubernetes cluster.

Create bare metal CSV file

The information that Tinkerbell uses to provision machines for the target EKS Anywhere cluster needs to be gathered in a CSV file with the following format:

hostname,bmc_ip,bmc_username,bmc_password,mac,ip_address,netmask,gateway,nameservers,labels,disk
eksa-cp01,10.10.44.1,root,PrZ8W93i,CC:48:3A:00:00:01,10.10.50.2,255.255.254.0,10.10.50.1,8.8.8.8,type=cp,/dev/sda
...

Each physical, bare metal machine is represented by a comma-separated list of information on a single line. It includes information needed to identify each machine (the NIC’s MAC address), network boot the machine, point to the disk to install on, and then configure and start the installed system. See Preparing hardware inventory for details on the content and format of that file.

Modify the cluster specification file

Before you create a cluster using the Bare Metal configuration file, you can make Tinkerbell-related changes to that file. In particular, TinkerbellDatacenterConfig fields , TinkerbellMachineConfig fields , and Tinkerbell Actions can be added or modified.

Tinkerbell actions vary based on the operating system you choose for your EKS Anywhere cluster. Actions are stored internally and not shown in the generated cluster specification file, so you must add those sections yourself to change from the defaults (see Ubuntu TinkerbellTemplateConfig example and Bottlerocket TinkerbellTemplateConfig example for details).

In most cases, you don’t need to touch the default actions. However, you might want to modify an action (for example to change kexec to a reboot action if the hardware requires it) or add an action to further configure the installed system. Examples in Advanced Bare Metal cluster configuration show a few actions you might want to add.

Once you have made all your modifications, you can go ahead and create the cluster. The next section describes how Tinkerbell works during cluster creation to provision your Bare Metal machines and prepare them to join the EKS Anywhere cluster.

Overview of Tinkerbell in EKS Anywhere

When you run the command to create an EKS Anywhere Bare Metal cluster, a set of Tinkerbell components start up on the Admin machine. One of these components runs in a container on Docker, while other components run as either controllers or services in pods on the Kubernetes kind cluster that is started up on the Admin machine. Tinkerbell components include Boots, Hegel, Rufio, and Tink.

Tinkerbell Boots service

The Boots service runs in a single container to handle the DHCP service and network booting activities. In particular, Boots hands out IP addresses, serves iPXE binaries via HTTP and TFTP, delivers an iPXE script to the provisioned machines, and runs a syslog server.

Boots is different from the other Tinkerbell services because the DHCP service it runs must listen directly to layer 2 traffic. (The kind cluster running on the Admin machine doesn’t have the ability to have pods listening on layer 2 networks, which is why Boots is run directly on Docker instead, with host networking enabled.)

Because Boots is running as a container in Docker, you can see the output in the logs for the Boots container by running:

docker logs boots

From the logs output, you will see iPXE try to network boot each machine. If the process doesn’t get all the information it wants from the DHCP server, it will time out. You can see iPXE loading variables, loading a kernel and initramfs (via DHCP), then booting into that kernel and initramfs: in other words, you will see everything that happens with iPXE before it switches over to the kernel and initramfs. The kernel, initramfs, and all images retrieved later are obtained remotely over HTTP and HTTPS.

Tinkerbell Hegel, Rufio, and Tink components

After Boots comes up on Docker, a small Kubernetes kind cluster starts up on the Admin machine. Other Tinkerbell components run as pods on that kind cluster. Those components include:

  • Hegel: Manages Tinkerbell’s metadata service. The Hegel service gets its metadata from the hardware specification stored in Kubernetes in the form of custom resources. The format that it serves is similar to the EC2 metadata format.
  • Rufio: Handles talking to BMCs (which manage things like starting and stopping systems with IPMI or Redfish). The Rufio Kubernetes controller sets things such as power state and persistent boot order. BMC authentication is managed with Kubernetes secrets.
  • Tink: The Tink service consists of three components: Tink server, Tink controller, and Tink worker. The Tink controller manages hardware data, templates you want to execute, and the workflows that each target specific hardware you are provisioning. The Tink worker is a small binary that runs inside of HookOS and talks to the Tink server. The worker sends the Tink server its MAC address and asks the server for workflows to run. The Tink worker will then go through each action, one-by-one, and try to execute it.

To see those services and controllers running on the kind bootstrap cluster, type:

kubectl get pods -n eksa-system
NAME                                      READY STATUS    RESTARTS AGE
hegel-sbchp                               1/1   Running   0        3d
rufio-controller-manager-5dcc568c79-9kllz 1/1   Running   0        3d
tink-controller-manager-54dc786db6-tm2c5  1/1   Running   0        3d
tink-server-5c494445bc-986sl              1/1   Running   0        3d

Provisioning hardware with Tinkerbell

After you start up the cluster create process, the following is the general workflow that Tinkerbell performs to begin provisioning the bare metal machines and prepare them to become part of the EKS Anywhere target cluster. You can set up kubectl on the Admin machine to access the bootstrap cluster and follow along:

export KUBECONFIG=${PWD}/${CLUSTER_NAME}/generated/${CLUSTER_NAME}.kind.kubeconfig

Power up the nodes

Tinkerbell starts by finding a node from the hardware list (based on MAC address) and contacting it to identify a baseboard management job (job.bmc) that runs a set of baseboard management tasks (task.bmc). To see that information, type:

kubectl get job.bmc -A
NAMESPACE    NAME                                           AGE
eksa-system  mycluster-md-0-1656099863422-vxvh2-provision   12m
kubectl get tasks.bmc -A
NAMESPACE    NAME                                                AGE
eksa-system  mycluster-md-0-1656099863422-vxh2-provision-task-0  55s
eksa-system  mycluster-md-0-1656099863422-vxh2-provision-task-1  51s
eksa-system  mycluster-md-0-1656099863422-vxh2-provision-task-2  47s

The following shows snippets from the tasks.bmc output that represent the three tasks: Power Off, enable network boot, and Power On.

kubectl describe tasks.bmc -n eksa-system mycluster-md-0-1656099863422-vxh2-provision-task-0
...
  Task:
    Power Action:  Off
Status:
  Completion Time:   2022-06-27T20:32:59Z
  Conditions:
    Status:    True
    Type:      Completed 
kubectl describe tasks.bmc -n eksa-system mycluster-md-0-1656099863422-vxh2-provision-task-1
...
  Task:
    One Time Boot Device Action:
      Device:
        pxe
      Efi Boot:  true
Status:
  Completion Time:   2022-06-27T20:33:04Z
  Conditions:
    Status:    True
    Type:      Completed   
kubectl describe tasks.bmc -n eksa-system mycluster-md-0-1656099863422-vxh2-provision-task-2
  Task:
    Power Action:  on
Status:
  Completion Time:   2022-06-27T20:33:10Z
  Conditions:
    Status:    True
    Type:      Completed   

Rufio converts the baseboard management jobs into task objects, then goes ahead and executes each task. To see Rufio logs, type:

kubectl logs -n eksa-system rufio-controller-manager-5dcc568c79-9kllz | less

Network booting the nodes

Next the Boots service netboots the machine and begins streaming the HookOS (vmlinuz and initramfs) to the machine. HookOS runs in memory and provides the installation environment. To watch the Boots log messages as each node powers up, type:

docker logs boots 

You can search the output for vmlinuz and initramfs to watch as the HookOS is downloaded and booted from memory on each machine.

Running workflows

Once the HookOS is up, Tinkerbell begins running the tasks and actions contained in the workflows. This is coordinated between the Tink worker, running in memory within the HookOS on the machine, and the Tink server on the kind cluster. To see the workflows being run, type the following:

kubectl get workflows.tinkerbell.org -n eksa-system
NAME                                TEMPLATE                            STATE
mycluster-md-0-1656099863422-vxh2   mycluster-md-0-1656099863422-vxh2   STATE_RUNNING

This shows the workflow for the first machine that is being provisioned. Add -o yaml to see details of that workflow template:

kubectl get workflows.tinkerbell.org -n eksa-system -o yaml
...
status:
  state: STATE_RUNNING
  tasks:
  - actions:
    - environment:
        COMPRESSED: "true"
        DEST_DISK: /dev/sda
        IMG_URL: https://anywhere-assets.eks.amazonaws.com/releases/bundles/11/artifacts/raw/1-22/bottlerocket-v1.22.10-eks-d-1-22-8-eks-a-11-amd64.img.gz
      image: public.ecr.aws/eks-anywhere/tinkerbell/hub/image2disk:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
      name: stream-image
      seconds: 35
      startedAt: "2022-06-27T20:37:39Z"
      status: STATE_SUCCESS
...

You can see that the first action in the workflow is to stream (stream-image) the operating system to the destination disk (DEST_DISK) on the machine. In this example, the Bottlerocket operating system that will be copied to disk (/dev/sda) is being served from the location specified by IMG_URL. The action was successful (STATE_SUCCESS) and it took 35 seconds.

Each action and its status is shown in this output for the whole workflow. To see details of the default actions for each supported operating system, see the Ubuntu TinkerbellTemplateConfig example and Bottlerocket TinkerbellTemplateConfig example .

In general, the actions include:

  • Streaming the operating system image to disk on each machine.
  • Configuring the network interfaces on each machine.
  • Setting up the cloud-init or similar service to add users and otherwise configure the system.
  • Identifying the data source to add to the system.
  • Setting the kernel to pivot to the installed system (using kexec) or having the system reboot to bring up the installed system from disk.

If all goes well, you will see all actions set to STATE_SUCCESS, except for the kexec-image action. That should show as STATE_RUNNING for as long as the machine is running.
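
That rule of thumb can be written down as a short Python check. This is a sketch under stated assumptions: unexpected_actions is a hypothetical helper, and it expects a list of dictionaries carrying the name and status keys seen in the workflow YAML output above. It also assumes the default kexec-image action is in use; a workflow that reboots instead would need a different expectation.

```python
def unexpected_actions(actions):
    """Return the names of actions whose status differs from the expected one.

    Every action is expected to be STATE_SUCCESS, except kexec-image,
    which is expected to stay STATE_RUNNING while the machine is running.
    """
    unexpected = []
    for action in actions:
        expected = "STATE_RUNNING" if action["name"] == "kexec-image" else "STATE_SUCCESS"
        if action["status"] != expected:
            unexpected.append(action["name"])
    return unexpected


# Hand-written stand-in for the actions list of one workflow.
healthy = [
    {"name": "stream-image", "status": "STATE_SUCCESS"},
    {"name": "kexec-image", "status": "STATE_RUNNING"},
]
print(unexpected_actions(healthy))  # []
```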

You can review the CAPT logs to see provisioning activity. For example, at the start of a new provisioning event, you would see something like the following:

kubectl logs -n capt-system capt-controller-manager-9f8b95b-frbq | less
..."Created BMCJob to get hardware ready for provisioning"...

You can follow this output to see the machine as it goes through the provisioning process.

After the node is initialized, completes all the Tinkerbell actions, and is booted into the installed operating system (Ubuntu or Bottlerocket), the new system starts cloud-init to do further configuration. At this point, the system will reach out to the Tinkerbell Hegel service to get its metadata.

If something goes wrong, viewing the Hegel logs can help you understand why a stuck system that has booted into Ubuntu or Bottlerocket has not joined the cluster yet. To see the Hegel logs, get the internal IP address for one of the new nodes. Then find the name of the Hegel pod and display its log, searching for the IP address of the node:

kubectl get nodes -o wide
NAME        STATUS   ROLES                 AGE    VERSION               INTERNAL-IP    ...
eksa-da04   Ready    control-plane,master  9m5s   v1.22.10-eks-7dc61e8  10.80.30.23
kubectl get pods -n eksa-system | grep hegel
hegel-n7ngs
kubectl logs -n eksa-system hegel-n7ngs
..."Retrieved IP peer IP..."userIP":"10.80.30.23...

If the log shows you are getting requests from the node, the problem is not a cloud-init issue.

After the first machine successfully completes the workflow, each other machine repeats the same process until the initial set of machines is all up and running.

Tinkerbell moves to target cluster

Once the initial set of machines is up and the EKS Anywhere cluster is running, all the Tinkerbell services and components (including Boots) are moved to the new target cluster and run as pods on that cluster. Those services are deleted on the kind cluster on the Admin machine.

Reviewing the status

At this point, you can change your kubectl credentials to point at the new target cluster to get information about Tinkerbell services on the new cluster. For example:

export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig

First check that the Tinkerbell pods are all running by listing pods from the eksa-system namespace:

kubectl get pods -n eksa-system
NAME                                        READY   STATUS    RESTARTS   AGE
boots-5dc66b5d4-klhmj                       1/1     Running   0          3d
hegel-sbchp                                 1/1     Running   0          3d
rufio-controller-manager-5dcc568c79-9kllz   1/1     Running   0          3d
tink-controller-manager-54dc786db6-tm2c5    1/1     Running   0          3d
tink-server-5c494445bc-986sl                1/1     Running   0          3d

Next, check the list of Tinkerbell machines. If all of the machines were provisioned successfully, you should see true under the READY column for each one.

kubectl get tinkerbellmachine -A
NAMESPACE    NAME                                                   CLUSTER    STATE  READY  INSTANCEID                          MACHINE
eksa-system  mycluster-control-plane-template-1656099863422-pqq2q   mycluster         true   tinkerbell://eksa-system/eksa-da04  mycluster-72p72

You can also check the machines themselves. Watch the PHASE change from Provisioning to Provisioned to Running. The Running phase indicates that the machine is now running as a node on the new cluster:

kubectl get machines -n eksa-system
NAME              CLUSTER    NODENAME    PROVIDERID                         PHASE    AGE  VERSION
mycluster-72p72   mycluster  eksa-da04   tinkerbell://eksa-system/eksa-da04 Running  7m25s   v1.22.10-eks-1-22-8
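
If you would rather script this check than watch the table, one option is to ask kubectl for JSON (kubectl get machines -n eksa-system -o json) and read each machine's status.phase field. The Python sketch below assumes that JSON has already been captured as a string; machines_not_running is a hypothetical helper, and the sample data is a trimmed, hand-written stand-in rather than real command output.

```python
import json


def machines_not_running(machines_json):
    """Return {machine name: phase} for every machine whose phase is not Running."""
    items = json.loads(machines_json).get("items", [])
    return {
        item["metadata"]["name"]: item.get("status", {}).get("phase", "Unknown")
        for item in items
        if item.get("status", {}).get("phase") != "Running"
    }


# Trimmed stand-in for the JSON list; the second machine name is made up.
SAMPLE = json.dumps({"items": [
    {"metadata": {"name": "mycluster-72p72"}, "status": {"phase": "Running"}},
    {"metadata": {"name": "mycluster-example"}, "status": {"phase": "Provisioning"}},
]})

print(machines_not_running(SAMPLE))  # {'mycluster-example': 'Provisioning'}
```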

Once you have confirmed that all your machines are successfully running as nodes on the target cluster, there is not much for Tinkerbell to do. It stays around to continue running the DHCP service and to be available to add more machines to the cluster.

5.2.4 - Customize HookOS for EKS Anywhere on Bare Metal

Customizing HookOS for EKS Anywhere on Bare Metal

To initially network boot bare metal machines used in EKS Anywhere clusters, Tinkerbell acquires a kernel and initial ramdisk that is referred to as the HookOS. A default HookOS is provided when you create an EKS Anywhere cluster. However, there may be cases where you want to override the default HookOS, such as to add drivers required to boot your particular type of hardware.

The following procedure describes how to build the Tinkerbell stack’s Hook/LinuxKit OS locally. For more information on Tinkerbell’s Hook Installation Environment, see the Tinkerbell Hook repo.

  1. Clone the hook repo or your fork of that repo:

    git clone https://github.com/tinkerbell/hook.git
    cd hook/
    
  2. Pull down the commit that EKS Anywhere is tracking for Hook:

    git checkout -b <new-branch> 03a67729d895635fe3c612e4feca3400b9336cc9
    

    NOTE: This commit hash can be obtained from the EKS-A build tooling repo.

  3. Make changes shown in the following diff in the Makefile located in the root of the repo using your favorite editor.

    diff --git a/Makefile b/Makefile
    index e7fd844..8e87c78 100644
    --- a/Makefile
    +++ b/Makefile
    @@ -2,7 +2,7 @@
     ### !!NOTE!!
     # If this is changed then a fresh output dir is required (`git clean -fxd` or just `rm -rf out`)
     # Handling this better shows some of make's suckiness compared to newer build tools (redo, tup ...) where the command lines to tools invoked isn't tracked by make
    -ORG := quay.io/tinkerbell
    +ORG := localhost:5000/tinkerbell
     # makes sure there's no trailing / so we can just add them in the recipes which looks nicer
     ORG := $(shell echo "${ORG}" | sed 's|/*$$||')
    
    

    Changes above change the ORG variable to use a local registry (localhost:5000)

  4. Make changes shown in the following diff in the rules.mk located in the root of the repo using your favorite editor.

    diff --git a/rules.mk b/rules.mk
    index b2c5133..64e32b1 100644
    --- a/rules.mk
    +++ b/rules.mk
    @@ -22,7 +22,7 @@ ifeq ($(ARCH),aarch64)
     ARCH = arm64
     endif
    
    -arches := amd64 arm64
    +arches := amd64
     modes := rel dbg
    
     hook-bootkit-deps := $(wildcard hook-bootkit/*)
    @@ -87,13 +87,12 @@ push-hook-bootkit push-hook-docker:
            docker buildx build --platform $$platforms --push -t $(ORG)/$(container):$T $(container)
    
     .PHONY: dist
    -dist: out/$T/rel/amd64/hook.tar out/$T/rel/arm64/hook.tar ## Build tarballs for distribution
    +dist: out/$T/rel/amd64/hook.tar ## Build tarballs for distribution
     dbg-dist: out/$T/dbg/$(ARCH)/hook.tar ## Build debug enabled tarball
     dist dbg-dist:
            for f in $^; do
            case $$f in
            *amd64*) arch=x86_64 ;;
    -       *arm64*) arch=aarch64 ;;
            *) echo unknown arch && exit 1;;
            esac
            d=$$(dirname $$(dirname $$f))
    
    

    Above changes are for the docker build command to only build for the immediately required platform (amd64 in this case) to save time.

  5. Modify the hook.yaml file located in the root of the repo with the following changes:

    diff --git a/hook.yaml b/hook.yaml
    index 0c5d789..b51b35e 100644
    --- a/hook.yaml
    +++ b/hook.yaml
    @@ -1,5 +1,5 @@
     kernel:
    - image: quay.io/tinkerbell/hook-kernel:5.10.85
    + image: localhost:5000/tinkerbell/hook-kernel:5.10.85
     cmdline: "console=tty0 console=ttyS0 console=ttyAMA0 console=ttysclp0"
     init:
     - linuxkit/init:v0.8
    @@ -42,7 +42,7 @@ services:
     binds:
     - /var/run:/var/run
     - name: docker
    - image: quay.io/tinkerbell/hook-docker:0.0
    + image: localhost:5000/tinkerbell/hook-docker:0.0
     capabilities:
     - all
     net: host
    @@ -64,7 +64,7 @@ services:
     - /var/run/docker
     - /var/run/worker
     - name: bootkit
    - image: quay.io/tinkerbell/hook-bootkit:0.0
    + image: localhost:5000/tinkerbell/hook-bootkit:0.0
     capabilities:
     - all
    

    The changes above are for using local registry (localhost:5000) for hook-docker, hook-bootkit, and hook-kernel.

    NOTE: You may also need to modify the hook.yaml file if you want to add or change components that are used to build up the image. So far, for example, we have needed to change versions of init and getty and inject SSH keys. Take a look at the LinuxKit Examples site for examples.

  6. Make any planned custom modifications to the files under hook, if you are only making changes to bootkit or tink-docker.

  7. If you are modifying the kernel, such as to change kernel config parameters to add or modify drivers, follow these steps:

    • Change into kernel directory and make a local image for amd64 architecture:
    cd kernel; make kconfig_amd64
    
    • Run the image
    docker run --rm -ti -v $(pwd):/src:z quay.io/tinkerbell/kconfig
    
    • You can now navigate to the source code and run the UI for configuring the kernel:
    cd linux-5-10
    make menuconfig
    
    • Once you have changed the necessary kernel configuration parameters, copy the new configuration:
    cp .config /src/config-5.10.x-x86_64
    

    Exit out of container into the repo’s kernel directory and run make:

    /linux-5.10.85 # exit
    user1 % make
    
  8. Install LinuxKit based on instructions from the LinuxKit page.

  9. Ensure that the linuxkit tool is in your PATH:

    export PATH=$PATH:/home/tink/linuxkit/bin
    
  10. Start a local registry:

    docker run -d -p 5000:5000 --name registry registry:2
    
  11. Compile by running the following in the root of the repo:

    make dist  
    
  12. Artifacts will be put under the dist directory in the repo’s root:

    ./initramfs-aarch64
    ./initramfs-x86_64
    ./vmlinuz-aarch64
    ./vmlinuz-x86_64
    
  13. To use the kernel (vmlinuz) and initial ram disk (initramfs) when you build your cluster, see the description of the hookImagesURLPath field in your Bare Metal configuration file.

5.3 - CloudStack

Preparing a CloudStack provider for EKS Anywhere

5.3.1 - Requirements for EKS Anywhere on CloudStack

CloudStack provider requirements for EKS Anywhere

To run EKS Anywhere, you will need:

Prepare Administrative machine

Set up an Administrative machine as described in Install EKS Anywhere .

Prepare a CloudStack environment

To prepare a CloudStack environment to run EKS Anywhere, you need the following:

  • A CloudStack 4.14 or later environment. CloudStack 4.16 is used for examples in these docs.

  • Capacity to deploy 6-10 VMs.

  • One shared network in CloudStack to use for the cluster. EKS Anywhere clusters need access to CloudStack through the network to enable self-managing and storage capabilities.

  • A Red Hat Enterprise Linux qcow2 image built using the image-builder tool as described in artifacts .

  • User credentials (CloudStack API key and Secret key) to create VMs and attach networks in CloudStack.

  • One IP address routable from the cluster but excluded from DHCP offering. This IP address is to be used as the Control Plane Endpoint IP. Below are some suggestions to ensure that this IP address is never handed out by your DHCP server. You may need to contact your network engineer.

    • Pick an IP address reachable from the cluster subnet which is excluded from DHCP range OR
    • Alter DHCP ranges to leave out one or more IP addresses at the top and/or the bottom of the range OR
    • Create an IP reservation for this IP on your DHCP server. This is usually accomplished by adding a dummy mapping of this IP address to a non-existent MAC address.
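
A quick way to double-check a candidate address is to compare it against your subnet and DHCP range. The Python sketch below uses the standard ipaddress module; endpoint_is_safe is a hypothetical helper, all addresses shown are documentation placeholders (192.0.2.0/24), and the sketch assumes the simplest case where the endpoint sits in the same subnet as the cluster.

```python
import ipaddress


def endpoint_is_safe(endpoint, subnet, dhcp_start, dhcp_end):
    """True when the endpoint is inside the subnet but outside the DHCP range."""
    ip = ipaddress.ip_address(endpoint)
    in_subnet = ip in ipaddress.ip_network(subnet)
    in_dhcp_range = ipaddress.ip_address(dhcp_start) <= ip <= ipaddress.ip_address(dhcp_end)
    return in_subnet and not in_dhcp_range


# Placeholder values: the DHCP server hands out .100 through .200.
print(endpoint_is_safe("192.0.2.10", "192.0.2.0/24", "192.0.2.100", "192.0.2.200"))   # True
print(endpoint_is_safe("192.0.2.150", "192.0.2.0/24", "192.0.2.100", "192.0.2.200"))  # False
```

A check like this only confirms the address arithmetic. It cannot tell you whether the DHCP server is actually configured with that range, so confirm the range with your network engineer.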

Each VM will require:

  • 2 vCPUs
  • 8GB RAM
  • 25GB Disk
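
To turn the per-VM figures into a rough capacity estimate for the 6-10 VMs mentioned above, multiply them out. The following Python sketch does only that arithmetic; total_capacity is an illustrative name, and the totals ignore any overhead your CloudStack environment adds.

```python
# Per-VM requirements listed above.
PER_VM = {"vcpus": 2, "ram_gb": 8, "disk_gb": 25}


def total_capacity(vm_count):
    """Multiply the per-VM requirements by the number of VMs."""
    return {resource: amount * vm_count for resource, amount in PER_VM.items()}


print(total_capacity(6))   # {'vcpus': 12, 'ram_gb': 48, 'disk_gb': 150}
print(total_capacity(10))  # {'vcpus': 20, 'ram_gb': 80, 'disk_gb': 250}
```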

The administrative machine and the target workload environment will need network access to:

CloudStack information needed before creating the cluster

You need at least the following information before creating the cluster. See CloudStack configuration for a complete list of options and Preparing CloudStack for instructions on creating the assets.

  • Static IP Addresses: You will need one IP address for the management cluster control plane endpoint, and a separate one for the control plane of each workload cluster you add.

Let’s say you are going to have the management cluster and two workload clusters. For those, you would need three IP addresses, one for each. All of those addresses will be configured the same way in the configuration file you will generate for each cluster.

A static IP address will be used for each control plane VM in your EKS Anywhere cluster. Choose IP addresses in your network range that do not conflict with other VMs and make sure they are excluded from your DHCP offering. An IP address will be the value of the property controlPlaneConfiguration.endpoint.host in the config file of the management cluster. A separate IP address must be assigned for each workload cluster.

  • CloudStack datacenter: You need the name of the CloudStack Datacenter plus the following for each Availability Zone (availabilityZones). Most items can be represented by name or ID:
    • Account (account): Account with permission to create a cluster (optional, admin by default).
    • Credentials (credentialsRef): Credentials provided in an ini file used to access the CloudStack API endpoint. See CloudStack Getting started for details.
    • Domain (domain): The CloudStack domain in which to deploy the cluster (optional, ROOT by default)
    • Management endpoint (managementApiEndpoint): Endpoint to which a CloudStack client makes API calls.
    • Zone network (zone.network): Either name or ID of the network.
  • CloudStack machine configuration: For each set of machines (for example, you could configure separate sets of machines for control plane, worker, and etcd nodes), obtain the following information. This must be predefined in the CloudStack instance and identified by name or ID:
    • Compute offering (computeOffering): Choose an existing compute offering (such as large-instance), reflecting the amount of resources to apply to each VM.
    • Operating system (template): Identifies the operating system image to use (such as rhel8-k8s-118).
    • Users (users.name): Identifies users and SSH keys needed to access the VMs.

5.3.2 - Preparing CloudStack for EKS Anywhere

Set up a CloudStack cluster to prepare it for EKS Anywhere

Before you can create an EKS Anywhere cluster in CloudStack, you must do some setup on your CloudStack environment. This document helps you get what you need to fulfill the prerequisites described in the Requirements and values you need for CloudStack configuration .

Set up a domain and user credentials

Either use the ROOT domain or create a new domain to deploy your EKS Anywhere cluster. One or more users are grouped under a domain. This example creates a user account for the domain with a Domain Administrator role. From the Apache CloudStack console:

  1. Select Domains.

  2. Select Add Domain.

  3. Fill in the Name for the domain (eksa in this example) and select OK.

  4. Select Accounts -> Add Account, then fill in the form to add a user with DomainAdmin role, as shown in the following figure:

    Add a user account with the DomainAdmin role

  5. To generate API credentials for the user, select Accounts -> (account name) -> View Users -> (user name), then select the Generate Keys button.

  6. Select OK to confirm key generation. The API Key and Secret Key should appear as shown in the following figure:

    Generate API Key and Secret Key

  7. Copy the API Key and Secret Key to a credentials file to use when you generate your cluster. For example:

    [Global]
    api-url = http://10.0.0.2:8080/client/api
    api-key = OI7pm0xrPMYjLlMfqrEEj...
    secret-key = tPsgAECJwTHzbU4wMH...
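
    Because the credentials file uses INI syntax, you can confirm that it parses and that no key is missing before you use it. The Python sketch below relies on the standard configparser module. It is not part of the EKS Anywhere tooling: load_profile is a hypothetical helper, and the key values are placeholders rather than real credentials.

```python
import configparser

REQUIRED_KEYS = ("api-url", "api-key", "secret-key")

# Same layout as the example above, with placeholder key values.
SAMPLE = """\
[Global]
api-url = http://10.0.0.2:8080/client/api
api-key = EXAMPLE-API-KEY
secret-key = EXAMPLE-SECRET-KEY
"""


def load_profile(text, profile="Global"):
    """Parse the INI text and return the named section as a dict."""
    # interpolation=None keeps any '%' characters in keys from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if profile not in parser:
        raise ValueError(f"section [{profile}] not found")
    section = parser[profile]
    missing = [key for key in REQUIRED_KEYS if not section.get(key)]
    if missing:
        raise ValueError(f"missing keys in [{profile}]: {missing}")
    return dict(section)


print(load_profile(SAMPLE)["api-url"])  # http://10.0.0.2:8080/client/api
```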
    

Import template

You need to build at least one operating system image and import it as a template to use for your cluster nodes. Currently, only Red Hat Enterprise Linux 8 images are supported. To build a RHEL-based image to use with EKS Anywhere, see Build node images .

  1. Make your image accessible from your local machine or from a URL that is accessible to your CloudStack setup.

  2. Select Images -> Templates, then select either Register Template from URL or Select local Template. The following figure lets you register a template from URL:

    Adding a RHEL-based EKS Anywhere image template

    This example imports a RHEL image (QCOW2), identifies the zone from which it will be available, uses KVM as the hypervisor, uses the osdefault Root disk controller, and identifies the OS Type as Red Hat Enterprise Linux 8.0. Select OK to save the template.

  3. Note the template name and zone so you can use it later when you deploy your cluster.

Create CloudStack configurations

Take a look at the following CloudStack configuration settings before creating your EKS Anywhere cluster. You will need to identify many of these assets when you create your cluster specification:

DatacenterConfig information

Here is how to get information to go into the CloudStackDatacenterConfig section of the CloudStack cluster configuration file:

  • Domain: Select Domains, then select your domain name from under the ROOT domain. Select View Users, note the user with the Domain Admin role, and consider setting limits on what each user can consume from the Resources and Configure Limits tabs.

  • Zones: Select Infrastructure -> Zones. Find a Zone where you can deploy your cluster or create a new one.

    Select from available Zones

  • Network: Select Network -> Guest networks. Choose a network to use for your cluster or create a new one.

Here is what some of that information would look like in a cluster configuration:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: CloudStackDatacenterConfig
metadata:
  name: my-cluster-name-datacenter
spec:
  availabilityZones:
  - account: admin
    credentialsRef: global
    domain: eksa
    managementApiEndpoint: ""
    name: az-1
    zone:
      name: Zone2
      network:
        name: "SharedNet2"

MachineConfig information

Here is how to get information to go into CloudStackMachineConfig sections of the CloudStack cluster configuration file:

  • computeOffering: Select Service Offerings -> Compute Offerings to see a list of available combinations of CPU cores, CPU, and memory to apply to your node instances. See the following figure for an example:

    Choose or add a compute offering to set node resources

  • template: Select Images -> Templates to see available operating system image templates.

  • diskOffering: Select Storage -> Volumes, then select Create Volume, if you want to create disk storage to attach to your nodes (optional). You can use this to store logs or other data you want saved outside of the nodes. When you later create the cluster configuration, you can identify things like where you want the device mounted, the type of file system, labels and other information.

  • affinityGroupIds: Select Compute -> Affinity Groups, then select Add new affinity group (optional). By creating an affinity group, you can tell all VMs from a set of instances to either all run on different physical hosts (anti-affinity) or just run anywhere they can (affinity).

Here is what some of that information would look like in a cluster configuration:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: CloudStackMachineConfig
metadata:
  name: my-cluster-name-cp
spec:
  computeOffering:
    name: "Medium Instance"
  template:
    name: "rhel8-kube-1.23-eksa"
  diskOffering:
    name: "Small"
    mountPath: "/data-small"
    device: "/dev/vdb"
    filesystem: "ext4"
    label: "data_disk"
  symlinks:
    /var/log/kubernetes: /data-small/var/log/kubernetes
  affinityGroupIds:
  - control-plane-anti-affinity

5.4 - Nutanix

Preparing a Nutanix Cloud Infrastructure provider for EKS Anywhere

See Create Nutanix production cluster to learn how to set up EKS Anywhere on Nutanix. Documents below describe how to prepare your Nutanix environment.

5.4.1 - Requirements for EKS Anywhere on Nutanix Cloud Infrastructure

Preparing a Nutanix Cloud Infrastructure provider for EKS Anywhere

To run EKS Anywhere, you will need:

Prepare Administrative machine

Set up an Administrative machine as described in Install EKS Anywhere .

Prepare a Nutanix environment

To prepare a Nutanix environment to run EKS Anywhere, you need the following:

  • A Nutanix environment running AOS 5.20.4+ with AHV and Prism Central 2022.1+

  • Capacity to deploy 6-10 VMs

  • DHCP service or Nutanix IPAM running in your environment in the primary VM network for your workload cluster

  • One network in AOS to use for the cluster. EKS Anywhere clusters need access to Prism Central through the network to enable self-managing and storage capabilities.

  • A VM image imported into the Prism Image Service for the workload VMs

  • User credentials to create VMs and attach networks, etc

  • One IP address routable from cluster but excluded from DHCP/IPAM offering. This IP address is to be used as the Control Plane Endpoint IP

    Below are some suggestions to ensure that this IP address is never handed out by your DHCP server.

    You may need to contact your network engineer.

    • Pick an IP address reachable from cluster subnet which is excluded from DHCP range OR
    • Alter DHCP ranges to leave out one or more IP addresses at the top and/or the bottom of the range OR
    • Create an IP reservation for this IP on your DHCP server. This is usually accomplished by adding a dummy mapping of this IP address to a non-existent MAC address OR
    • Block an IP address from the Nutanix IPAM managed network using aCLI
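
Before settling on a Control Plane Endpoint IP, it can help to confirm that the candidate address really sits outside the DHCP pool. The following is a minimal sketch, not part of EKS Anywhere: the function names and the example addresses are hypothetical, and it assumes the pool is a single contiguous IPv4 range.

```shell
# Convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  # Split the address on dots into $1..$4.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds (exit status 0) when IP lies inside the inclusive range START..END.
ip_in_range() {
  local ip start end
  ip=$(ip_to_int "$1")
  start=$(ip_to_int "$2")
  end=$(ip_to_int "$3")
  [ "$ip" -ge "$start" ] && [ "$ip" -le "$end" ]
}

# Hypothetical values: DHCP pool 10.0.0.100-10.0.0.200, candidate endpoint 10.0.0.50.
if ip_in_range 10.0.0.50 10.0.0.100 10.0.0.200; then
  echo "endpoint IP is inside the DHCP range: pick another address"
else
  echo "endpoint IP is outside the DHCP range"
fi
```

This only checks the range you pass in; reservations or IPAM blocks still need to be verified on the DHCP server or in Prism.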

Each VM will require:

  • 2 vCPUs
  • 4GB RAM
  • 40GB Disk

The administrative machine and the target workload environment will need network access to:

  • Prism Central endpoint (must be accessible to EKS Anywhere clusters)
  • Prism Element Data Services IP and CVM endpoints (for CSI storage connections)
  • public.ecr.aws (for pulling EKS Anywhere container images)
  • anywhere-assets.eks.amazonaws.com (to download the EKS Anywhere binaries and manifests)
  • distro.eks.amazonaws.com (to download EKS Distro binaries and manifests)
  • d2glxqk2uabbnd.cloudfront.net (for EKS Anywhere and EKS Distro ECR container images)
  • api.ecr.us-west-2.amazonaws.com (for EKS Anywhere package authentication matching your region)
  • d5l0dvt14r5h8.cloudfront.net (for EKS Anywhere package ECR container images)
  • api.github.com (only if GitOps is enabled)

Nutanix information needed before creating the cluster

You need to get the following information before creating the cluster:

  • Static IP Addresses: You will need one IP address for the management cluster control plane endpoint, and a separate one for the control plane of each workload cluster you add.

    Let’s say you are going to have the management cluster and two workload clusters. For those, you would need three IP addresses, one for each. All of those addresses will be configured the same way in the configuration file you will generate for each cluster.

    A static IP address will be used for control plane API server HA in each of your EKS Anywhere clusters. Choose IP addresses in your network range that do not conflict with other VMs and make sure they are excluded from your DHCP offering.

    An IP address will be the value of the property controlPlaneConfiguration.endpoint.host in the config file of the management cluster. A separate IP address must be assigned for each workload cluster.

  • Prism Central FQDN or IP Address: The Prism Central fully qualified domain name or IP address.

  • Prism Element Cluster Name: The AOS cluster to deploy the EKS Anywhere cluster on.

  • VM Subnet Name: The VM network to deploy your EKS Anywhere cluster on.

  • Machine Template Image Name: The VM image to use for your EKS Anywhere cluster.

  • additionalTrustBundle (required if using a self-signed PC SSL certificate): The PEM encoded CA trust bundle of the root CA that issued the certificate for Prism Central.

5.4.2 - Preparing Nutanix Cloud Infrastructure for EKS Anywhere

Set up a Nutanix cluster to prepare it for EKS Anywhere

Certain resources must be in place with appropriate user permissions to create an EKS Anywhere cluster using the Nutanix provider.

Configuring Nutanix User

You need a Prism Admin user to create EKS Anywhere clusters on top of your Nutanix cluster.

Build Nutanix AHV node images

Follow the steps outlined in artifacts to create an Ubuntu-based image for Nutanix AHV and import it into the AOS Image Service.

5.5 - VMware vSphere

Preparing a VMware vSphere provider for EKS Anywhere

5.5.1 - Requirements for EKS Anywhere on VMware vSphere

Preparing a VMware vSphere provider for EKS Anywhere

To run EKS Anywhere, you will need:

Prepare Administrative machine

Set up an Administrative machine as described in Install EKS Anywhere .

Prepare a VMware vSphere environment

To prepare a VMware vSphere environment to run EKS Anywhere, you need the following:

  • A vSphere 7+ environment running vCenter

  • Capacity to deploy 6-10 VMs

  • DHCP service running in vSphere environment in the primary VM network for your workload cluster

  • One network in vSphere to use for the cluster. EKS Anywhere clusters need access to vCenter through the network to enable self-managing and storage capabilities.

  • An OVA imported into vSphere and converted into a template for the workload VMs

  • User credentials to create VMs and attach networks, etc

  • One IP address routable from cluster but excluded from DHCP offering. This IP address is to be used as the Control Plane Endpoint IP

    Below are some suggestions to ensure that this IP address is never handed out by your DHCP server.

    You may need to contact your network engineer.

    • Pick an IP address reachable from cluster subnet which is excluded from DHCP range OR
    • Alter DHCP ranges to leave out one or more IP addresses at the top and/or the bottom of the range OR
    • Create an IP reservation for this IP on your DHCP server. This is usually accomplished by adding a dummy mapping of this IP address to a non-existent MAC address.

Each VM will require:

  • 2 vCPUs
  • 8GB RAM
  • 25GB Disk

The administrative machine and the target workload environment will need network access to:

  • vCenter endpoint (must be accessible to EKS Anywhere clusters)
  • public.ecr.aws (for pulling EKS Anywhere container images)
  • anywhere-assets.eks.amazonaws.com (to download the EKS Anywhere binaries, manifests and OVAs)
  • distro.eks.amazonaws.com (to download EKS Distro binaries and manifests)
  • d2glxqk2uabbnd.cloudfront.net (for EKS Anywhere and EKS Distro ECR container images)
  • api.ecr.us-west-2.amazonaws.com (for EKS Anywhere package authentication matching your region)
  • d5l0dvt14r5h8.cloudfront.net (for EKS Anywhere package ECR container images)
  • api.github.com (only if GitOps is enabled)

vSphere information needed before creating the cluster

You need to get the following information before creating the cluster:

  • Static IP Addresses: You will need one IP address for the management cluster control plane endpoint, and a separate one for the control plane of each workload cluster you add.

    Let’s say you are going to have the management cluster and two workload clusters. For those, you would need three IP addresses, one for each. All of those addresses will be configured the same way in the configuration file you will generate for each cluster.

    A static IP address will be used for each control plane VM in your EKS Anywhere cluster. Choose IP addresses in your network range that do not conflict with other VMs and make sure they are excluded from your DHCP offering.

    An IP address will be the value of the property controlPlaneConfiguration.endpoint.host in the config file of the management cluster. A separate IP address must be assigned for each workload cluster.

  • vSphere Datacenter Name: The vSphere datacenter to deploy the EKS Anywhere cluster on.

  • VM Network Name: The VM network to deploy your EKS Anywhere cluster on.

  • vCenter Server Domain Name: The vCenter server fully qualified domain name or IP address. If the server IP is used, the thumbprint must be set or insecure must be set to true.

  • thumbprint (required if insecure=false): The SHA1 thumbprint of the vCenter server certificate which is only required if you have a self-signed certificate for your vSphere endpoint.

    There are several ways to obtain your vCenter thumbprint. If you have govc installed, you can run the following command in the Administrative machine terminal and take note of the output:

    govc about.cert -thumbprint -k
    
  • template: The VM template to use for your EKS Anywhere cluster. This template was created when you imported the OVA file into vSphere.

  • datastore: The vSphere datastore to deploy your EKS Anywhere cluster on.

  • folder: The folder parameter in VSphereMachineConfig allows you to organize the VMs of an EKS Anywhere cluster. With this, each cluster can be organized as a folder in vSphere. You will have a separate folder for the management cluster and each cluster you are adding.

  • resourcePool: The vSphere resource pool for your VMs in the EKS Anywhere cluster. If there is a resource pool, use a path of the form /<datacenter>/host/<resource-pool-name>/Resources
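
For the thumbprint item in the list above, one alternative when govc is not installed is to read the certificate with openssl. This is a sketch under assumptions: the function names are hypothetical, vCenter is assumed to listen on port 443, and the colon-separated format printed by openssl is assumed to be acceptable wherever you record the thumbprint.

```shell
# Extract the SHA1 thumbprint from `openssl x509 -fingerprint -sha1` output.
# OpenSSL prints "SHA1 Fingerprint=AA:BB:..."; some releases print the
# digest name in lowercase, so both spellings are accepted.
parse_thumbprint() {
  sed -n 's/^[Ss][Hh][Aa]1 Fingerprint=//p'
}

# Hypothetical helper: fetch the certificate served on HOST:443 and print
# its SHA1 thumbprint.
vcenter_thumbprint() {
  host="$1"
  openssl s_client -connect "${host}:443" </dev/null 2>/dev/null \
    | openssl x509 -noout -fingerprint -sha1 \
    | parse_thumbprint
}

# Usage (placeholder host name):
#   vcenter_thumbprint my-vsphere.internal.acme.com
```

Compare the value against what your vSphere administrator reports before trusting it, since this reads whatever certificate the endpoint presents.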

5.5.2 - Preparing vSphere for EKS Anywhere

Set up a vSphere cluster to prepare it for EKS Anywhere

Certain resources must be in place with appropriate user permissions to create an EKS Anywhere cluster using the vSphere provider.

Configuring Folder Resources

Create a VM folder:

For each user that needs to create workload clusters, have the vSphere administrator create a VM folder. That folder will host:

  • The VMs of the Control plane and Data plane nodes of each cluster.
  • A nested folder for the management cluster and another one for each workload cluster.
  • Each cluster VM in its own nested folder under this folder.

Follow these steps to create the user’s vSphere folder:

  1. From vCenter, select the Menus/VM and Template tab.
  2. Select either a datacenter or another folder as a parent object for the folder that you want to create.
  3. Right-click the parent object and click New Folder.
  4. Enter a name for the folder and click OK. For more details, see the vSphere Create a Folder documentation.

Configuring vSphere User, Group, and Roles

You need a vSphere user with the right privileges to let you create EKS Anywhere clusters on top of your vSphere cluster.

Configure via EKSA CLI

To configure a new user via CLI, you will need two things:

  • a set of vSphere admin credentials with the ability to create users and groups. If you do not have the rights to create new groups and users, you can invoke govc commands directly as outlined here.
  • a user.yaml file:
apiVersion: "eks-anywhere.amazon.com/v1"
kind: vSphereUser
spec:
  username: "eksa"                # optional, default eksa
  group: "MyExistingGroup"        # optional, default EKSAUsers
  globalRole: "MyGlobalRole"      # optional, default EKSAGlobalRole
  userRole: "MyUserRole"          # optional, default EKSAUserRole
  adminRole: "MyEKSAAdminRole"    # optional, default EKSACloudAdminRole
  datacenter: "MyDatacenter"
  vSphereDomain: "vsphere.local"  # this should be the domain used when you login, e.g. YourUsername@vsphere.local
  connection:
    server: "https://my-vsphere.internal.acme.com"
    insecure: false
  objects:
    networks:
      - !!str "/MyDatacenter/network/My Network"
    datastores:
      - !!str "/MyDatacenter/datastore/MyDatastore2"
    resourcePools:
      - !!str "/MyDatacenter/host/Cluster-03/MyResourcePool" # NOTE: see below if you do not want to use a resource pool
    folders:
      - !!str "/MyDatacenter/vm/OrgDirectory/MyVMs"
    templates:
      - !!str "/MyDatacenter/vm/Templates/MyTemplates"

NOTE: if you do not want to create a resource pool, you can instead specify the cluster directly as /MyDatacenter/host/Cluster-03 in user.yaml, where Cluster-03 is your cluster name. In your cluster spec, you will need to specify /MyDatacenter/host/Cluster-03/Resources for the resourcePool field.

Set the admin credentials as environment variables:

export EKSA_VSPHERE_USERNAME=<ADMIN_VSPHERE_USERNAME>
export EKSA_VSPHERE_PASSWORD=<ADMIN_VSPHERE_PASSWORD>

If the user does not already exist, you can create the user and all the specified group and role objects by running:

eksctl anywhere exp vsphere setup user -f user.yaml --password '<NewUserPassword>'

If the user or any of the group or role objects already exist, use the force flag instead to overwrite Group-Role-Object mappings for the group, roles, and objects specified in the user.yaml config file:

eksctl anywhere exp vsphere setup user -f user.yaml --force

Note that there is one more manual step: setting global permissions, as described in the section Manually set Global Permissions role in Global Permissions UI below.

Configure via govc

If you do not have the rights to create a new user, you can still configure the necessary roles and permissions using the govc CLI:

#! /bin/bash
# govc calls to configure a user with minimal permissions
set -x
set -e

EKSA_USER='<Username>@<UserDomain>'
USER_ROLE='EKSAUserRole'
GLOBAL_ROLE='EKSAGlobalRole'
ADMIN_ROLE='EKSACloudAdminRole'

FOLDER_VM='/YourDatacenter/vm/YourVMFolder'
FOLDER_TEMPLATES='/YourDatacenter/vm/Templates'

NETWORK='/YourDatacenter/network/YourNetwork'
DATASTORE='/YourDatacenter/datastore/YourDatastore'
RESOURCE_POOL='/YourDatacenter/host/Cluster-01/Resources/YourResourcePool'

govc role.create "$GLOBAL_ROLE" $(curl https://raw.githubusercontent.com/aws/eks-anywhere/main/pkg/config/static/globalPrivs.json | jq .[] | tr '\n' ' ' | tr -d '"')

govc role.create "$USER_ROLE" $(curl https://raw.githubusercontent.com/aws/eks-anywhere/main/pkg/config/static/eksUserPrivs.json | jq .[] | tr '\n' ' ' | tr -d '"')

govc role.create "$ADMIN_ROLE" $(curl https://raw.githubusercontent.com/aws/eks-anywhere/main/pkg/config/static/adminPrivs.json | jq .[] | tr '\n' ' ' | tr -d '"')

govc permissions.set -group=false -principal "$EKSA_USER"  -role "$GLOBAL_ROLE" /

govc permissions.set -group=false -principal "$EKSA_USER"  -role "$ADMIN_ROLE" "$FOLDER_VM"

govc permissions.set -group=false -principal "$EKSA_USER"  -role "$ADMIN_ROLE" "$FOLDER_TEMPLATES"

govc permissions.set -group=false -principal "$EKSA_USER"  -role "$USER_ROLE" "$NETWORK"

govc permissions.set -group=false -principal "$EKSA_USER"  -role "$USER_ROLE" "$DATASTORE"

govc permissions.set -group=false -principal "$EKSA_USER"  -role "$USER_ROLE" "$RESOURCE_POOL"

NOTE: if you do not want to create a resource pool, you can instead specify the cluster directly as /MyDatacenter/host/Cluster-03 in user.yaml, where Cluster-03 is your cluster name. In your cluster spec, you will need to specify /MyDatacenter/host/Cluster-03/Resources for the resourcePool field.

Note that there is one more manual step: setting global permissions, as described in the section Manually set Global Permissions role in Global Permissions UI below.

Configure via UI

Add a vCenter User

Ask your vSphere administrator to add a vCenter user that will be used for the provisioning of the EKS Anywhere cluster in VMware vSphere.

  1. Log in with the vSphere Client to the vCenter Server.
  2. Specify the user name and password for a member of the vCenter Single Sign-On Administrators group.
  3. Navigate to the vCenter Single Sign-On user configuration UI.
    • From the Home menu, select Administration.
    • Under Single Sign On, click Users and Groups.
  4. If vsphere.local is not the currently selected domain, select it from the drop-down menu. You cannot add users to other domains.
  5. On the Users tab, click Add.
  6. Enter a user name and password for the new user. The user name can be at most 300 characters long and cannot be changed after the user is created. The password must meet the password policy requirements for the system.
  7. Click Add.

For more details, see vSphere Add vCenter Single Sign-On Users documentation.

Create and define user roles

When you add a user for creating clusters, that user initially has no privileges to perform management operations. So you have to add this user to groups with the required permissions, or assign a role or roles with the required permission to this user.

Three roles are needed to be able to create the EKS Anywhere cluster:

  1. Create a global custom role: For example, you could name this EKS Anywhere Global. Define it for the user on the vCenter domain level and its children objects. Create this role with the following privileges:

    > Content Library
    * Add library item
    * Check in a template
    * Check out a template
    * Create local library
    * Update files
    > vSphere Tagging
    * Assign or Unassign vSphere Tag
    * Assign or Unassign vSphere Tag on Object
    * Create vSphere Tag
    * Create vSphere Tag Category
    * Delete vSphere Tag
    * Delete vSphere Tag Category
    * Edit vSphere Tag
    * Edit vSphere Tag Category
    * Modify UsedBy Field For Category
    * Modify UsedBy Field For Tag
    > Sessions
    * Validate session
    
  2. Create a user custom role: The second role is also a custom role that you could call, for example, EKSAUserRole. Define this role with the following objects and children objects.

    • The resource pool level and its children objects. This is the resource pool that the EKS Anywhere VMs will be part of.
    • The storage object level and its children objects. This is the storage that will be used to store the cluster VMs.
    • The network VLAN object level and its children objects. This is the network that will host the cluster VMs.
    • The VM and Template folder level and its children objects.

    Create this role with the following privileges:

    > Content Library
    * Add library item
    * Check in a template
    * Check out a template
    * Create local library
    > Datastore
    * Allocate space
    * Browse datastore
    * Low level file operations
    > Folder
    * Create folder
    > vSphere Tagging
    * Assign or Unassign vSphere Tag
    * Assign or Unassign vSphere Tag on Object
    * Create vSphere Tag
    * Create vSphere Tag Category
    * Delete vSphere Tag
    * Delete vSphere Tag Category
    * Edit vSphere Tag
    * Edit vSphere Tag Category
    * Modify UsedBy Field For Category
    * Modify UsedBy Field For Tag
    > Network
    * Assign network
    > Resource
    * Assign virtual machine to resource pool
    > Scheduled task
    * Create tasks
    * Modify task
    * Remove task
    * Run task
    > Profile-driven storage
    * Profile-driven storage view
    > Storage views
    * View
    > vApp
    * Import
    > Virtual machine
    * Change Configuration
      - Add existing disk
      - Add new disk
      - Add or remove device
      - Advanced configuration
      - Change CPU count
      - Change Memory
      - Change Settings
      - Configure Raw device
      - Extend virtual disk
      - Modify device settings
      - Remove disk
    * Edit Inventory
      - Create from existing
      - Create new
      - Remove
    * Interaction
      - Power off
      - Power on
    * Provisioning
      - Clone template
      - Clone virtual machine
      - Create template from virtual machine
      - Customize guest
      - Deploy template
      - Mark as template
      - Read customization specifications
    * Snapshot management
      - Create snapshot
      - Remove snapshot
      - Revert to snapshot
    
  3. Create a default Administrator role: The third role is the default system role Administrator, which you assign to the user on the folder level and its children objects (VMs and OVA templates) that the vSphere administrator created for you.

    To create a role and define privileges, see the Create a vCenter Server Custom Role and Defined Privileges pages.

Manually set Global Permissions role in Global Permissions UI

vSphere does not currently support a public API for setting global permissions. Because of this, you will need to manually assign the Global Role you created to your user or group in the Global Permissions UI.

Deploy an OVA Template

If the user creating the cluster has permission and network access to create and tag a template, you can skip these steps because EKS Anywhere will automatically download the OVA and create the template if it can. If the user does not have the permissions or network access to create and tag the template, follow this guide. The OVA contains the operating system (Ubuntu, Bottlerocket, or RHEL) for a specific EKS Distro Kubernetes release and EKS Anywhere version. The following example uses Ubuntu as the operating system, but a similar workflow would work for Bottlerocket or RHEL.

Steps to deploy the OVA

  1. Go to the artifacts page and download or build the OVA template with the newest EKS Distro Kubernetes release to your computer.
  2. Log in to the vCenter Server.
  3. Right-click the folder you created above and select Deploy OVF Template. The Deploy OVF Template wizard opens.
  4. On the Select an OVF template page, select the Local file option, specify the location of the OVA template you downloaded to your computer, and click Next.
  5. On the Select a name and folder page, enter a unique name for the virtual machine or leave the default generated name, if you do not have other templates with the same name within your vCenter Server virtual machine folder. The default deployment location for the virtual machine is the inventory object where you started the wizard, which is the folder you created above. Click Next.
  6. On the Select a compute resource page, select the resource pool where to run the deployed VM template, and click Next.
  7. On the Review details page, verify the OVF or OVA template details and click Next.
  8. On the Select storage page, select a datastore to store the deployed OVF or OVA template and click Next.
  9. On the Select networks page, select a source network and map it to a destination network. Click Next.
  10. On the Ready to complete page, review the page and click Finish. For details, see Deploy an OVF or OVA Template.

To build your own Ubuntu OVA template, check the linked Building your own Ubuntu OVA section.

To use the deployed OVA template to create the VMs for the EKS Anywhere cluster, you have to tag it with specific values for the os and eksdRelease keys. The value of the os key is the operating system of the deployed OVA template, which is ubuntu in our scenario. The value of the eksdRelease key holds the Kubernetes and EKS-D release used in the deployed OVA template. Check the Customize OVAs page for more details.

Steps to tag the deployed OVA template:

  1. Go to the artifacts page and take notes of the tags and values associated with the OVA template you deployed in the previous step.
  2. In the vSphere Client, select Menu > Tags & Custom Attributes.
  3. Select the Tags tab and click Tags.
  4. Click New.
  5. In the Create Tag dialog box, copy the os tag name associated with your OVA that you noted, which in our case is os:ubuntu, and paste it as the name for the first tag required.
  6. Specify the tag category os if it exists, or create it if it does not.
  7. Click Create.
  8. Repeat steps 2-4.
  9. In the Create Tag dialog box, copy the eksdRelease tag name associated with your OVA that you noted, which in our case is eksdRelease:kubernetes-1-21-eks-8, and paste it as the name for the second tag required.
  10. Specify the tag category eksdRelease if it exists, or create it if it does not.
  11. Click Create.
  12. Navigate to the VM and Template tab.
  13. Select the folder that was created.
  14. Select deployed template and click Actions.
  15. From the drop-down menu, select Tags and Custom Attributes > Assign Tag.
  16. Select the tags we created from the list and confirm the operation.
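
The tagging steps above can also be sketched with govc instead of the vSphere Client. This is an illustration, not the documented procedure: the function name and the GOVC override variable are hypothetical, the template path in the usage comment is a placeholder, and the tag values are the ones used in this example (os:ubuntu and eksdRelease:kubernetes-1-21-eks-8).

```shell
# Create the os and eksdRelease categories and tags, then attach both tags
# to a template. Set GOVC=echo to preview the commands without running them.
tag_template() {
  template_path="$1"   # inventory path of the deployed template
  os_name="$2"         # for example: ubuntu
  eksd_release="$3"    # for example: kubernetes-1-21-eks-8
  govc_bin="${GOVC:-govc}"

  # Creating a category that already exists fails; report it and continue.
  "$govc_bin" tags.category.create -t VirtualMachine os \
    || echo "category os may already exist" >&2
  "$govc_bin" tags.category.create -t VirtualMachine eksdRelease \
    || echo "category eksdRelease may already exist" >&2

  "$govc_bin" tags.create -c os "os:${os_name}"
  "$govc_bin" tags.create -c eksdRelease "eksdRelease:${eksd_release}"

  "$govc_bin" tags.attach "os:${os_name}" "$template_path"
  "$govc_bin" tags.attach "eksdRelease:${eksd_release}" "$template_path"
}

# Usage (placeholder path):
#   GOVC=echo tag_template /MyDatacenter/vm/Templates/my-template ubuntu kubernetes-1-21-eks-8
```

Running it once with GOVC=echo is a cheap way to check the tag names against the artifacts page before touching vCenter.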

5.5.3 - Customize OVAs: Ubuntu

Customizing Imported Ubuntu OVAs

You may need to make specific configuration changes on the imported OVA template before using it to create or update EKS Anywhere clusters.

Set up SSH Access for Imported OVA

An SSH user and key need to be configured to allow SSH login to the VM template.

Clone template to VM

Create an environment variable to hold the name of modified VM/template

export VM=<vm-name>

Clone the imported OVA template to create VM

govc vm.clone -on=false -vm=<full-path-to-imported-template> -folder=<full-path-to-folder-that-will-contain-the-VM> -ds=<datastore> $VM

Configure VM with cloud-init and the VMX GuestInfo datasource

Create a metadata.yaml file

instance-id: cloud-vm
local-hostname: cloud-vm
network:
  version: 2
  ethernets:
    nics:
      match:
        name: ens*
      dhcp4: yes

Create a userdata.yaml file

#cloud-config

users:
  - default
  - name: <username>
    primary_group: <username>
    sudo: ALL=(ALL) NOPASSWD:ALL
    groups: sudo, wheel
    ssh_import_id: None
    lock_passwd: true
    ssh_authorized_keys:
    - <user's ssh public key>

Export environment variable containing the cloud-init metadata and userdata

export METADATA=$(gzip -c9 <metadata.yaml | { base64 -w0 2>/dev/null || base64; }) \
       USERDATA=$(gzip -c9 <userdata.yaml | { base64 -w0 2>/dev/null || base64; })
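
Because the values are gzip-compressed and base64-encoded, a quick way to confirm they were built correctly is to decode them again. The sketch below is an addition for illustration; the function name is hypothetical and it assumes a base64 implementation that accepts --decode.

```shell
# Decode a gzip+base64 guestinfo value back to its original text.
decode_guestinfo() {
  printf '%s' "$1" | base64 --decode | gzip -dc
}

# Round trip with a sample value, encoded the same way as METADATA above.
# For real use, pass "$METADATA" or "$USERDATA" instead.
sample=$(printf 'instance-id: cloud-vm\n' | gzip -c9 | { base64 -w0 2>/dev/null || base64; })
decode_guestinfo "$sample"
```

If the decoded text does not match your metadata.yaml or userdata.yaml, rebuild the variables before assigning them to the VM.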

Assign metadata and userdata to VM’s guestinfo

govc vm.change -vm "${VM}" \
  -e guestinfo.metadata="${METADATA}" \
  -e guestinfo.metadata.encoding="gzip+base64" \
  -e guestinfo.userdata="${USERDATA}" \
  -e guestinfo.userdata.encoding="gzip+base64"

Power the VM on

govc vm.power -on "$VM"

Customize the VM

Once the VM is powered on and has obtained an IP address, SSH into the VM using the private key that corresponds to the public key specified in userdata.yaml:

ssh -i <private-key-file> <username>@<VM-IP>

At this point, you can make the desired configuration changes on the VM. The following sections describe some of the things you may want to do:

Add a Certificate Authority

Copy your CA certificate under /usr/local/share/ca-certificates and run sudo update-ca-certificates which will place the certificate under the /etc/ssl/certs directory.
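
As a sketch of that copy step: update-ca-certificates only picks up files ending in .crt, so a certificate saved as .pem needs to be renamed on the way in. The function name and the file name in the usage comment are hypothetical, and the destination directory is a parameter only so the copy can be tried somewhere harmless first.

```shell
# Copy a PEM-encoded CA certificate into the local CA directory, forcing a
# .crt extension so that update-ca-certificates will include it.
install_ca_cert() {
  src="$1"
  dest_dir="${2:-/usr/local/share/ca-certificates}"
  name=$(basename "$src")
  cp "$src" "${dest_dir}/${name%.*}.crt"
}

# Usage on the VM (placeholder file name), run as root:
#   install_ca_cert ./my-ca.pem
#   update-ca-certificates
```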

Add Authentication Credentials for a Private Registry

If /etc/containerd/config.toml is not present initially, the default configuration can be generated by running the containerd config default > /etc/containerd/config.toml command. To configure a credential for a specific registry, create/modify the /etc/containerd/config.toml as follows:

# explicitly use v2 config format
version = 2

# The registry host has to be a domain name or IP. Port number is also
# needed if the default HTTPS or HTTP port is not used.
[plugins."io.containerd.grpc.v1.cri".registry.configs."registry1-host:port".auth]
  username = ""
  password = ""
  auth = ""
  identitytoken = ""
 # The registry host has to be a domain name or IP. Port number is also
 # needed if the default HTTPS or HTTP port is not used.
[plugins."io.containerd.grpc.v1.cri".registry.configs."registry2-host:port".auth]
  username = ""
  password = ""
  auth = ""
  identitytoken = ""

Restart containerd service with the sudo systemctl restart containerd command.

Convert VM to a Template

After you have customized the VM, you need to convert it to a template.

Clean up the machine and power off the VM

This step is needed because of a known issue in Ubuntu that results in cloned VMs getting the same DHCP IP address.

sudo su
echo -n > /etc/machine-id
rm /var/lib/dbus/machine-id
ln -s /etc/machine-id /var/lib/dbus/machine-id
cloud-init clean -l --machine-id

Delete the hostname from file

/etc/hostname

Delete the networking config file

rm -rf /etc/netplan/50-cloud-init.yaml

Edit the cloud init config to turn preserve_hostname to false

vi /etc/cloud/cloud.cfg
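
The three file edits above (hostname, netplan file, preserve_hostname) can be sketched as one non-interactive function. This is an assumption-laden illustration: the function name is hypothetical, "delete the hostname" is interpreted as emptying the file, and the in-place sed edit assumes GNU sed as shipped with Ubuntu. The paths are parameters so the function can be tried on copies first.

```shell
# Apply the hostname, netplan, and cloud.cfg cleanup steps in one call.
clean_template_files() {
  hostname_file="${1:-/etc/hostname}"
  netplan_file="${2:-/etc/netplan/50-cloud-init.yaml}"
  cloud_cfg="${3:-/etc/cloud/cloud.cfg}"

  # Empty the hostname file so clones do not inherit the hostname.
  : > "$hostname_file"

  # Delete the generated networking config.
  rm -f "$netplan_file"

  # Set preserve_hostname to false. If the key is absent, the file is
  # left unchanged.
  sed -i 's/^preserve_hostname:.*/preserve_hostname: false/' "$cloud_cfg"
}

# Usage on the VM, run as root with the default paths:
#   clean_template_files
```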

Power the VM down

govc vm.power -off "$VM"

Take a snapshot of the VM

It is recommended to take a snapshot of the VM as it reduces the provisioning time for the machines and makes cluster creation faster.

If you do snapshot the VM, you will not be able to customize the disk size of your cluster VMs. If you prefer not to take a snapshot, skip this step.

govc snapshot.create -vm "$VM" root

Convert VM to template

govc vm.markastemplate "$VM"

Tag the template appropriately as described here

Use this customized template to create/upgrade EKS Anywhere clusters

5.5.4 - Import OVAs

Importing EKS Anywhere OVAs to vSphere

If you want to specify an OVA template, you will need to import OVA files into vSphere before you can use them in your EKS Anywhere cluster. This guide was written using VMware Cloud on AWS, but the VMware OVA import guide can be found here.

EKS Anywhere supports the following operating system families:

  • Bottlerocket (default)
  • Ubuntu
  • RHEL

A list of OVAs for this release can be found on the artifacts page.

Using vCenter Web User Interface

  1. Right click on your Datacenter and select Deploy OVF Template.

  2. Select an OVF template using URL or selecting a local OVF file and click on Next. If you are not able to select an OVF template using URL, download the file and use Local file option.

    Note: If you are using Bottlerocket OVAs, please select local file option. Import ova wizard

  3. Select a folder where you want to deploy your OVF package (most of our OVF templates are under SDDC-Datacenter directory) and click on Next. You cannot have an OVF template with the same name in one directory. For workload VM templates, leave the Kubernetes version in the template name for reference. A workload VM template will support at least one prior Kubernetes major versions. Import ova wizard

  4. Select any compute resource (for example, cluster-1 or 10.2.34.5) to run the deployed VM and click Next.

  5. Review the details and click Next.

  6. Accept the agreement and click Next.

  7. Select the appropriate storage (e.g. “WorkloadDatastore”) and click Next.

  8. Select destination network (e.g. “sddc-cgw-network-1”) and click Next.

  9. Finish.

  10. Snapshot the VM. Right-click the imported VM and select Snapshots -> Take Snapshot… (Taking a snapshot is highly recommended because it reduces the time needed to provision machines and speeds up cluster creation. If you prefer not to take a snapshot, skip to step 13.)

  11. Name your snapshot (e.g. “root”) and click Create.

  12. Snapshots for the imported VM should now show up under the Snapshots tab for the VM.

  13. Right-click the imported VM and select Template, then Convert to Template.

Steps to deploy a template using GOVC (CLI)

To deploy a template using govc, first ensure that you have GOVC installed. You need to set and export three environment variables to run govc: GOVC_USERNAME, GOVC_PASSWORD and GOVC_URL.
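
As an illustration of the environment setup, the sketch below exports the three variables and defines a small pre-flight check. The values shown are placeholders, and the helper name check_govc_env is hypothetical; substitute your own vCenter address and credentials.

```shell
# Placeholder values (assumptions): replace with your vCenter address and credentials.
export GOVC_URL="vcenter.example.com"
export GOVC_USERNAME="administrator@vsphere.local"
export GOVC_PASSWORD="change-me"

# Hypothetical helper: report any of the three variables that is unset or empty.
check_govc_env() {
  missing=0
  [ -n "${GOVC_URL:-}" ] || { echo "missing: GOVC_URL"; missing=1; }
  [ -n "${GOVC_USERNAME:-}" ] || { echo "missing: GOVC_USERNAME"; missing=1; }
  [ -n "${GOVC_PASSWORD:-}" ] || { echo "missing: GOVC_PASSWORD"; missing=1; }
  [ "$missing" -eq 0 ] && echo "govc environment is set"
  return "$missing"
}
```

Calling check_govc_env before the govc commands below gives an early, readable failure instead of a login error from vCenter.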

  1. Import the template to a content library in vCenter using URL or selecting a local OVA file

    Using URL:

    govc library.import -k -pull <library name> <URL for the OVA file>
    

    Using a file from the local machine:

    govc library.import <library name> <path to OVA file on local machine>
    
  2. Deploy the template

    govc library.deploy -pool <resource pool> -folder <folder location to deploy template> /<library name>/<template name> <name of new VM>
    

    2a. If using a Bottlerocket template for a Kubernetes version newer than 1.21, resize disk 1 to 22G

    govc vm.disk.change -vm <template name> -disk.label "Hard disk 1" -size 22G
    

    2b. If using a Bottlerocket template for Kubernetes version 1.21, resize disk 2 to 20G

    govc vm.disk.change -vm <template name> -disk.label "Hard disk 2" -size 20G
    
  3. Take a snapshot of the VM. (Taking a snapshot is highly recommended because it reduces the time needed to provision machines and speeds up cluster creation. If you prefer not to take a snapshot, skip this step.)

    govc snapshot.create -vm <name of new VM> root
    
  4. Mark the new VM as a template

    govc vm.markastemplate <name of new VM>
    

Important Additional Steps to Tag the OVA

Using vCenter UI

Tag to indicate OS family

  1. Select the template that was newly created in the steps above and navigate to Summary -> Tags.
  2. Click Assign -> Add Tag to create a new tag and attach it.
  3. Name the tag os:ubuntu or os:bottlerocket.

Tag to indicate eksd release

  1. Select the template that was newly created in the steps above and navigate to Summary -> Tags.
  2. Click Assign -> Add Tag to create a new tag and attach it.
  3. Name the tag eksdRelease:{eksd release for the selected ova}, for example eksdRelease:kubernetes-1-25-eks-5 for the 1.25 ova. You can find the rest of the eksd releases in the previous section. If this is the first time you are adding an eksdRelease tag, you need to create the category first: click “Create New Category” and name it eksdRelease.

Using govc

Tag to indicate OS family

  1. Create tag category

    govc tags.category.create -t VirtualMachine os

  2. Create tags os:ubuntu and os:bottlerocket

    govc tags.create -c os os:bottlerocket
    govc tags.create -c os os:ubuntu

  3. Attach the tag that matches the template's OS family to the template (run only one of these commands)

    govc tags.attach os:bottlerocket <Template Path>
    govc tags.attach os:ubuntu <Template Path>

  4. Verify tag is attached to the template

    govc tags.ls <Template Path>

Tag to indicate eksd release

  1. Create tag category

    govc tags.category.create -t VirtualMachine eksdRelease

  2. Create the proper eksd release tag, depending on your template. You can find the eksd releases in the previous section. For example, eksdRelease:kubernetes-1-25-eks-5 for the 1.25 template.

    govc tags.create -c eksdRelease eksdRelease:kubernetes-1-25-eks-5

  3. Attach newly created tag to the template

    govc tags.attach eksdRelease:kubernetes-1-25-eks-5 <Template Path>

  4. Verify tag is attached to the template

    govc tags.ls <Template Path>
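
When tagging several templates, it can help to derive the eksdRelease tag name rather than typing it. The sketch below generalizes from the single example above (eksdRelease:kubernetes-1-25-eks-5); the assumption that other releases follow the same kubernetes-MAJOR-MINOR-eks-N pattern should be checked against the eksd release list, and the helper name eksd_release_tag is hypothetical.

```shell
# Hypothetical helper: build an eksdRelease tag name.
#   $1: Kubernetes version of the template, such as 1.25
#   $2: EKS Distro release number, such as 5
eksd_release_tag() {
  kube_version=$(printf '%s' "$1" | tr '.' '-')
  printf 'eksdRelease:kubernetes-%s-eks-%s\n' "$kube_version" "$2"
}
```

For example, eksd_release_tag 1.25 5 prints eksdRelease:kubernetes-1-25-eks-5, which can then be passed to govc tags.create and govc tags.attach as shown above.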

After you are done you can use the template for your workload cluster.

5.5.5 - Custom DHCP Configuration

Create a custom DHCP configuration for your vSphere deployment

If your vSphere deployment is not configured with DHCP, you may want to run your own DHCP server. It may be necessary to turn off DHCP snooping on your switch to get DHCP working across VM servers. If you are running your administration machine in vSphere, it would most likely be easiest to run the DHCP server on that machine. This example is for Ubuntu.

Install

Install DHCP server

sudo apt-get install isc-dhcp-server

Configure /etc/dhcp/dhcpd.conf

Update the IP address range, subnet, mask, and so on to suit your configuration, similar to this:

default-lease-time 600;
max-lease-time 7200;
 
ddns-update-style none;
 
authoritative;
 
subnet 10.8.105.0 netmask 255.255.255.0 {
  range 10.8.105.9 10.8.105.41;
  option subnet-mask 255.255.255.0;
  option routers 10.8.105.1;
  option domain-name-servers 147.149.1.69;
}
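
Before restarting the service, it may be worth checking the file for syntax errors. The sketch below is an optional extra, not part of the original procedure: it uses dhcpd's -t (test configuration) and -cf (configuration file) options, assumes dhcpd is on the PATH after the package install above, and the helper name check_dhcpd_config is hypothetical.

```shell
# Hypothetical helper: ask dhcpd to parse the configuration without starting the server.
#   $1: path to the configuration file (defaults to /etc/dhcp/dhcpd.conf)
check_dhcpd_config() {
  sudo dhcpd -t -cf "${1:-/etc/dhcp/dhcpd.conf}"
}
```

A non-zero exit status from check_dhcpd_config indicates that dhcpd rejected the configuration.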

Configure /etc/default/isc-dhcp-server

Add the main NIC device interface to this file, such as eth0 (this example uses ens160).

INTERFACESv4="ens160"

Restart DHCP

service isc-dhcp-server restart

Verify your configuration

This example assumes the ens160 interface:

tcpdump -ni ens160 port 67 -vvvv
 
tcpdump: listening on ens160, link-type EN10MB (Ethernet), capture size 262144 bytes
09:13:54.297704 IP (tos 0xc0, ttl 64, id 40258, offset 0, flags [DF], proto UDP (17), length 327)
    10.8.105.12.68 > 10.8.105.5.67: [udp sum ok] BOOTP/DHCP, Request from 00:50:56:90:56:cf, length 299, xid 0xf7a5aac5, secs 50310, Flags [none] (0x0000)
          Client-IP 10.8.105.12
          Client-Ethernet-Address 00:50:56:90:56:cf
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Request
            Client-ID Option 61, length 19: hardware-type 255, 2d:1a:a1:33:00:02:00:00:ab:11:f2:c8:ef:ba:aa:5a:2f:33
            Parameter-Request Option 55, length 11:
              Subnet-Mask, Default-Gateway, Hostname, Domain-Name
              Domain-Name-Server, MTU, Static-Route, Classless-Static-Route
              Option 119, NTP, Option 120
            MSZ Option 57, length 2: 576
            Hostname Option 12, length 15: "prod-etcd-m8ctd"
            END Option 255, length 0
09:13:54.299762 IP (tos 0x0, ttl 64, id 56218, offset 0, flags [DF], proto UDP (17), length 328)
    10.8.105.5.67 > 10.8.105.12.68: [bad udp cksum 0xe766 -> 0x502f!] BOOTP/DHCP, Reply, length 300, xid 0xf7a5aac5, secs 50310, Flags [none] (0x0000)
          Client-IP 10.8.105.12
          Your-IP 10.8.105.12
          Server-IP 10.8.105.5
          Client-Ethernet-Address 00:50:56:90:56:cf
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: ACK
            Server-ID Option 54, length 4: 10.8.105.5
            Lease-Time Option 51, length 4: 600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            Default-Gateway Option 3, length 4: 10.8.105.1
            Domain-Name-Server Option 6, length 4: 147.149.1.69
            END Option 255, length 0
            PAD Option 0, length 0, occurs 26

5.5.6 -

  • vCenter endpoint (must be accessible to EKS Anywhere clusters)
  • public.ecr.aws
  • anywhere-assets.eks.amazonaws.com (to download the EKS Anywhere binaries, manifests and OVAs)
  • distro.eks.amazonaws.com (to download EKS Distro binaries and manifests)
  • d2glxqk2uabbnd.cloudfront.net (for EKS Anywhere and EKS Distro ECR container images)
  • api.ecr.us-west-2.amazonaws.com (for EKS Anywhere package authentication matching your region)
  • d5l0dvt14r5h8.cloudfront.net (for EKS Anywhere package ECR container images)
  • api.github.com (only if GitOps is enabled)
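
One way to confirm that these endpoints are reachable from the Administrative machine is a quick connection test. The sketch below is an assumption-based check, not an official validation tool: it assumes each endpoint accepts HTTPS connections on port 443, it omits the site-specific vCenter endpoint, it uses the us-west-2 ECR endpoint from the list (adjust for your region), and the names EKSA_ENDPOINTS and check_endpoints are hypothetical.

```shell
# Endpoints copied from the list above; the vCenter endpoint is site-specific and omitted.
EKSA_ENDPOINTS="public.ecr.aws anywhere-assets.eks.amazonaws.com distro.eks.amazonaws.com d2glxqk2uabbnd.cloudfront.net api.ecr.us-west-2.amazonaws.com d5l0dvt14r5h8.cloudfront.net api.github.com"

# Hypothetical helper: try an HTTPS connection to each endpoint and report the result.
# Only connectivity is tested; any HTTP status code counts as reachable.
check_endpoints() {
  failed=0
  for host in $EKSA_ENDPOINTS; do
    if curl -sS --connect-timeout 5 -o /dev/null "https://$host" 2>/dev/null; then
      echo "reachable: $host"
    else
      echo "unreachable: $host"
      failed=1
    fi
  done
  return "$failed"
}
```

Running check_endpoints prints one line per endpoint and returns a non-zero status if any connection fails. api.github.com only matters if GitOps is enabled.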

5.6 - Security best practices

Using security best practices with your EKS Anywhere deployments

If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our vulnerability reporting page . Please do not create a public GitHub issue for security problems.

This guide provides advice about best practices for EKS Anywhere specific security concerns. For a more complete treatment of Kubernetes security in general, refer to the official Kubernetes documentation on Securing a Cluster and the Amazon EKS Best Practices Guide for Security.

The Shared Responsibility Model and EKS-A

AWS Cloud Services follow the Shared Responsibility Model, where AWS is responsible for security “of” the cloud, while the customer is responsible for security “in” the cloud. However, EKS Anywhere is an open-source tool and the distribution of responsibility differs from that of a managed cloud service like EKS.

AWS Responsibilities

AWS is responsible for building and delivering a secure tool. This tool will provision an initially secure Kubernetes cluster.

AWS is responsible for vetting and securely sourcing the services and tools packaged with EKS Anywhere and the cluster it creates (such as CoreDNS, Cilium, Flux, CAPI, and govc).

The EKS Anywhere build and delivery infrastructure, or supply chain, is secured to the standard of any AWS service and AWS takes responsibility for the secure and reliable delivery of a quality product which provisions a secure and stable Kubernetes cluster. When the eksctl anywhere plugin is executed, EKS Anywhere components are automatically downloaded from AWS. eksctl will then perform checksum verification on the components to ensure their authenticity.

AWS is responsible for the secure development and testing of the EKS Anywhere controller and associated custom resource definitions.

AWS is responsible for the secure development and testing of the EKS Anywhere CLI, and ensuring it handles sensitive data and cluster resources securely.

End user responsibilities

The end user is responsible for the entire EKS Anywhere cluster after it has been provisioned. AWS provides a mechanism to upgrade the cluster in-place, but it is the responsibility of the end user to perform that upgrade using the provided tools. End users are responsible for operating their clusters in accordance with Kubernetes security best practices, and for the ongoing security of the cluster after it has been provisioned. This includes but is not limited to:

  • creation or modification of RBAC roles and bindings
  • creation or modification of namespaces
  • modification of the default container network interface plugin
  • configuration of network ingress and load balancing
  • use and configuration of container storage interfaces
  • the inclusion of add-ons and other services

End users are also responsible for:

  • The hardware and software which make up the infrastructure layer (such as vSphere, ESXi, physical servers, and physical network infrastructure).

  • The ongoing maintenance of the cluster nodes, including the underlying guest operating systems. Additionally, while EKS Anywhere provides a streamlined process for upgrading a cluster to a new Kubernetes version, it is the responsibility of the user to perform the upgrade as necessary.

  • Any applications which run “on” the cluster, including their secure operation, least privilege, and use of well-known and vetted container images.

EKS Anywhere Security Best Practices

This section captures EKS Anywhere specific security best practices. Please read this section carefully and follow any guidance to ensure the ongoing security and reliability of your EKS Anywhere cluster.

Critical Namespaces

EKS Anywhere creates and uses resources in several critical namespaces. All of the EKS Anywhere managed namespaces should be treated as sensitive and access should be limited to only the most trusted users and processes. Allowing additional access or modifying the existing RBAC resources could potentially allow a subject to access the namespace and the resources that it contains. This could lead to the exposure of secrets or the failure of your cluster due to modification of critical resources. Here are rules you should follow when dealing with critical namespaces:

  • Avoid creating Roles in these namespaces or providing users access to them with ClusterRoles. For more information about creating limited roles for day-to-day administration and development, please see the official introduction to Role Based Access Control (RBAC).

  • Do not modify existing Roles in these namespaces, bind existing roles to additional subjects, or create new Roles in the namespace.

  • Do not modify existing ClusterRoles or bind them to additional subjects.

  • Avoid using the cluster-admin role, as it grants permissions over all namespaces.

  • No subjects except for the most trusted administrators should be permitted to perform ANY action in the critical namespaces.

The critical namespaces include:

  • eksa-system
  • capv-system
  • flux-system
  • capi-system
  • capi-webhook-system
  • capi-kubeadm-control-plane-system
  • capi-kubeadm-bootstrap-system
  • cert-manager
  • kube-system (as with any Kubernetes cluster, this namespace is critical to the functioning of your cluster and should be treated with the highest level of sensitivity.)
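
To review who currently has access to these namespaces, the RoleBindings in each one can be listed with kubectl. The sketch below is one possible audit aid under stated assumptions: kubectl is configured against the cluster with sufficient privileges, the namespace list is copied from the list above (some namespaces, such as capv-system or flux-system, may not exist in every deployment), and the names CRITICAL_NAMESPACES and list_critical_rolebindings are hypothetical.

```shell
# Namespace names copied from the list above.
CRITICAL_NAMESPACES="eksa-system capv-system flux-system capi-system capi-webhook-system capi-kubeadm-control-plane-system capi-kubeadm-bootstrap-system cert-manager kube-system"

# Hypothetical helper: print the RoleBindings of every critical namespace.
list_critical_rolebindings() {
  for ns in $CRITICAL_NAMESPACES; do
    echo "== $ns =="
    kubectl get rolebindings --namespace "$ns" -o wide
  done
}
```

The wide output includes the bound role and its subjects, which makes unexpected subjects easier to spot. ClusterRoleBindings are not covered by this sketch and need a separate review.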

Secrets

EKS Anywhere stores sensitive information, like the vSphere credentials and GitHub Personal Access Token, in the cluster as native Kubernetes secrets. These secret objects are namespaced, for example in the eksa-system and flux-system namespaces, and limiting access to the sensitive namespaces will ensure that these secrets are not exposed. Additionally, limit access to the underlying nodes, because access to a node could allow access to the secret content.

EKS Anywhere does not currently support encryption-at-rest for Kubernetes secrets. EKS Anywhere support for Key Management Services (KMS) is planned.

The EKS Anywhere kubeconfig file

eksctl anywhere create cluster creates an EKS Anywhere-based Kubernetes cluster and outputs a kubeconfig file with administrative privileges to the $PWD/$CLUSTER_NAME directory.

By default, this kubeconfig file uses certificate-based authentication and contains the user certificate data for the administrative user.

The kubeconfig file grants administrative privileges over your cluster to the bearer and the certificate key should be treated as you would any other private key or administrative password.

The EKS Anywhere-generated kubeconfig file should only be used for interacting with the cluster via eksctl anywhere commands, such as upgrade, and for the most privileged administrative tasks. For more information about creating limited roles for day-to-day administration and development, please see the official introduction to Role Based Access Control (RBAC) .
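
Because the kubeconfig should be handled like a private key, one simple precaution is to restrict its file permissions to the owning user. The sketch below is a suggestion rather than a documented EKS Anywhere step; the helper name restrict_kubeconfig is hypothetical, and the path to the generated file is passed in rather than assumed.

```shell
# Hypothetical helper: make a kubeconfig file readable and writable by its owner only.
#   $1: path to the kubeconfig file
restrict_kubeconfig() {
  if [ ! -f "$1" ]; then
    echo "kubeconfig not found: $1" >&2
    return 1
  fi
  chmod 600 "$1"
}
```

File permissions only protect against other local users; the workstation itself still needs to be trusted.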

GitOps

GitOps enabled EKS Anywhere clusters maintain a copy of their cluster configuration in the user provided Git repository. This configuration acts as the source of truth for the cluster. Changes made to this configuration will be reflected in the cluster configuration.

AWS recommends that you gate any changes to this repository with mandatory pull request reviews. Carefully review pull requests for changes which could impact the availability of the cluster (such as scaling nodes to 0 and deleting the cluster object) or contain secrets.

GitHub Personal Access Token

Treat the GitHub PAT used with EKS Anywhere as you would any highly privileged secret, as it could potentially be used to make changes to your cluster by modifying the contents of the cluster configuration file through the GitHub.com API.

  • Never commit the PAT to a Git repository
  • Never share the PAT via untrusted channels
  • Never grant non-administrative subjects access to the flux-system namespace where the PAT is stored as a native Kubernetes secret.

Executing EKS Anywhere

Ensure that you execute eksctl anywhere create cluster on a trusted workstation in order to protect the values of sensitive environment variables and the EKS Anywhere generated kubeconfig file.

SSH Access to Cluster Nodes and ETCD Nodes

EKS Anywhere provides the option to configure an SSH authorized key for access to underlying nodes in a cluster, via vsphereMachineConfig.Users.sshAuthorizedKeys. This grants the holder of the associated private key the ability to connect to the cluster nodes via SSH as the user capv with sudo permissions. The associated private key should be treated as extremely sensitive, as sudo access to the cluster and ETCD nodes can permit access to secret object data and potentially confer arbitrary control over the cluster.

VMware OVAs

Only download OVAs for cluster nodes from official sources, and do not allow untrusted users or processes to modify the templates used by EKS Anywhere for provisioning nodes.

Keeping Bottlerocket up to date

EKS Anywhere provides the latest operating system patches with every release. It is recommended that you keep your clusters up to date with the latest EKS Anywhere release to ensure you get the latest security updates. Bottlerocket is an EKS Anywhere supported operating system that can be kept up to date without requiring a cluster update. The Bottlerocket Update Operator is a Kubernetes update operator that coordinates Bottlerocket updates on hosts in the cluster. Please follow the instructions here to install the Bottlerocket Update Operator.

Baremetal Clusters

EKS Anywhere Baremetal clusters run directly on physical servers in a datacenter. Make sure that the physical infrastructure, including the network, is secure before running EKS Anywhere clusters.

Please follow industry best practices for securing your network and datacenter, including but not limited to the following:

  • Only allow trusted devices on the network
  • Secure the network using a firewall
  • Never source hardware from an untrusted vendor
  • Inspect and verify the metal servers you are using for the clusters are the ones you intended to use
  • If possible, use a separate L2 network for EKS Anywhere baremetal clusters
  • Periodically conduct thorough audits of access, users, logs and other exploitable avenues

Benchmark tests for cluster hardening

EKS Anywhere creates clusters with server hardening configurations out of the box, via the use of security flags and opinionated default templates. You can verify the security posture of your EKS Anywhere cluster by using a tool called kube-bench, which checks whether Kubernetes is deployed securely.

kube-bench runs checks documented in the CIS Benchmark for Kubernetes, such as pod specification file permissions, disabling insecure arguments, and so on.

Refer to the EKS Anywhere CIS Self-Assessment Guide for more information on how to evaluate the security configurations of your EKS Anywhere cluster.

5.6.1 - CIS Self-Assessment Guide

CIS Benchmark Self-Assessment Guide for EKS Anywhere clusters

The CIS Benchmark self-assessment guide serves to help EKS Anywhere users evaluate the level of security of the hardened cluster configuration against Kubernetes benchmark controls from the Center for Internet Security (CIS). This guide will walk through the various controls and provide updated example commands to audit compliance in EKS Anywhere clusters.

You can verify the security posture of your EKS Anywhere cluster by using a tool called kube-bench. The ideal way to run the benchmark tests on your EKS Anywhere cluster is to apply the kube-bench Job YAMLs to the cluster. This runs the kube-bench tests in a Pod on the cluster, and the logs of the Pod provide the test results.

Kube-bench currently does not support unstacked etcd topology (which is the default for EKS Anywhere), so the following checks are skipped in the default kube-bench Job YAML. If you created your EKS Anywhere cluster with stacked etcd configuration, you can apply the stacked etcd Job YAML instead.

Check number | Check description
1.1.7 | Ensure that the etcd pod specification file permissions are set to 644 or more restrictive
1.1.8 | Ensure that the etcd pod specification file ownership is set to root:root
1.1.11 | Ensure that the etcd data directory permissions are set to 700 or more restrictive
1.1.12 | Ensure that the etcd data directory ownership is set to etcd:etcd

The following tests are also skipped, because they are not applicable or enforce settings that might make the cluster unstable.

Check number | Check description | Reason for skipping

Control plane node configuration
1.2.6 | Ensure that the --kubelet-certificate-authority argument is set as appropriate | When generating serving certificates, functionality could break in conjunction with hostname overrides which are required for certain cloud providers
1.2.16 | Ensure that the admission control plugin PodSecurityPolicy is set | Enabling Pod Security Policy can cause applications to unexpectedly fail
1.2.32 | Ensure that the --encryption-provider-config argument is set as appropriate | Enabling encryption changes how data can be recovered as data is encrypted
1.2.33 | Ensure that encryption providers are appropriately configured | Enabling encryption changes how data can be recovered as data is encrypted

Worker node configuration
4.2.6 | Ensure that the --protect-kernel-defaults argument is set to true | System level configurations are required before provisioning the cluster in order for this argument to be set to true
4.2.10 | Ensure that the --tls-cert-file and --tls-private-key-file arguments are set as appropriate | When generating serving certificates, functionality could break in conjunction with hostname overrides which are required for certain cloud providers

5.7 - Packages

List of EKS Anywhere curated packages

Curated package list

Name | Description | Versions | GitHub
ADOT | ADOT Collector is an AWS distribution of the OpenTelemetry Collector, which provides a vendor-agnostic solution to receive, process and export telemetry data. | v0.25.0 | https://github.com/aws-observability/aws-otel-collector
Cert-manager | Cert-manager is a certificate manager for Kubernetes clusters. | v1.9.1 | https://github.com/cert-manager/cert-manager
Cluster Autoscaler | Cluster Autoscaler is a component that automatically adjusts the size of a Kubernetes Cluster so that all pods have a place to run and there are no unneeded nodes. | v9.21.0 | https://github.com/kubernetes/autoscaler
Emissary Ingress | Emissary Ingress is an open source Ingress supporting API Gateway + Layer 7 load balancer built on Envoy Proxy. | v3.3.0 | https://github.com/emissary-ingress/emissary/
Harbor | Harbor is an open source trusted cloud native registry project that stores, signs, and scans content. | v2.7.1, v2.5.1 | https://github.com/goharbor/harbor, https://github.com/goharbor/harbor-helm
MetalLB | MetalLB is a virtual IP provider for services of type LoadBalancer supporting ARP and BGP. | v0.13.7 | https://github.com/metallb/metallb/
Metrics Server | Metrics Server is a scalable, efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines. | v3.8.2 | https://github.com/kubernetes-sigs/metrics-server
Prometheus | Prometheus is an open-source systems monitoring and alerting toolkit that collects and stores metrics as time series data. | v2.41.0 | https://github.com/prometheus/prometheus

5.7.1 - Packages configuration

Full EKS Anywhere configuration reference for curated packages.

This is a generic template with detailed descriptions below for reference. To generate your own package configuration, follow the instructions in the Package Management section and modify the result using the descriptions below.

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: PackageBundleController
metadata:
  name: eksa-packages-bundle-controller
  namespace: eksa-packages
spec:
  activeBundle: v1-21-83
  defaultImageRegistry: 783794618700.dkr.ecr.us-west-2.amazonaws.com
  defaultRegistry: public.ecr.aws/eks-anywhere
  privateRegistry: ""
  upgradeCheckInterval: 24h0m0s

---
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: PackageBundle
metadata:
  name: package-bundle
  namespace: eksa-packages
spec:
  packages:
    - name: hello-eks-anywhere
      source:
        repository: hello-eks-anywhere
        versions:
          - digest: sha256:c31242a2f94a58017409df163debc01430de65ded6bdfc5496c29d6a6cbc0d94
            images:
              - digest: sha256:26e3f2f9aa546fee833218ece3fe7561971fd905cef40f685fd1b5b09c6fb71d
                repository: hello-eks-anywhere
            name: 0.1.1-083e68edbbc62ca0228a5669e89e4d3da99ff73b
            schema: H4sIAJc5EW...

---
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-hello-eks-anywhere
  namespace: eksa-packages
spec:
  config: |
        title: "My Hello"
  packageName: hello-eks-anywhere
  targetNamespace: eksa-packages

API Reference

Packages:

packages.eks.amazonaws.com/v1alpha1

Resource Types:

PackageBundleController

PackageBundleController is the Schema for the packagebundlecontroller API.

Name | Type | Description | Required
apiVersion | string | packages.eks.amazonaws.com/v1alpha1 | true
kind | string | PackageBundleController | true
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true
spec | object | PackageBundleControllerSpec defines the desired state of PackageBundleController. | false
status | object | PackageBundleControllerStatus defines the observed state of PackageBundleController. | false

PackageBundleController.spec

PackageBundleControllerSpec defines the desired state of PackageBundleController.

Name | Type | Description | Required
activeBundle | string | ActiveBundle is name of the bundle from which packages should be sourced. | false
bundleRepository | string | Repository portion of an OCI address to the bundle. Default: eks-anywhere-packages-bundles | false
createNamespace | boolean | Allow target namespace creation by the controller. Default: false | false
defaultImageRegistry | string | DefaultImageRegistry for pulling images. Default: 783794618700.dkr.ecr.us-west-2.amazonaws.com | false
defaultRegistry | string | DefaultRegistry for pulling helm charts and the bundle. Default: public.ecr.aws/eks-anywhere | false
logLevel | integer | LogLevel controls the verbosity of logging in the controller. Format: int32 | false
privateRegistry | string | PrivateRegistry is the registry being used for all images, charts and bundles | false
upgradeCheckInterval | string | UpgradeCheckInterval is the time between upgrade checks. The format is that of time's ParseDuration. Default: 24h | false
upgradeCheckShortInterval | string | UpgradeCheckShortInterval time between upgrade checks if there is a problem. The format is that of time's ParseDuration. Default: 1h | false

PackageBundleController.status

PackageBundleControllerStatus defines the observed state of PackageBundleController.

Name | Type | Description | Required
detail | string | Detail of the state. | false
spec | object | Spec previous settings | false
state | enum | State of the bundle controller. Enum: ignored, active, disconnected, upgrade available | false

PackageBundleController.status.spec

Spec previous settings

Name | Type | Description | Required
activeBundle | string | ActiveBundle is name of the bundle from which packages should be sourced. | false
bundleRepository | string | Repository portion of an OCI address to the bundle. Default: eks-anywhere-packages-bundles | false
createNamespace | boolean | Allow target namespace creation by the controller. Default: false | false
defaultImageRegistry | string | DefaultImageRegistry for pulling images. Default: 783794618700.dkr.ecr.us-west-2.amazonaws.com | false
defaultRegistry | string | DefaultRegistry for pulling helm charts and the bundle. Default: public.ecr.aws/eks-anywhere | false
logLevel | integer | LogLevel controls the verbosity of logging in the controller. Format: int32 | false
privateRegistry | string | PrivateRegistry is the registry being used for all images, charts and bundles | false
upgradeCheckInterval | string | UpgradeCheckInterval is the time between upgrade checks. The format is that of time's ParseDuration. Default: 24h | false
upgradeCheckShortInterval | string | UpgradeCheckShortInterval time between upgrade checks if there is a problem. The format is that of time's ParseDuration. Default: 1h | false

PackageBundle

PackageBundle is the Schema for the packagebundle API.

Name | Type | Description | Required
apiVersion | string | packages.eks.amazonaws.com/v1alpha1 | true
kind | string | PackageBundle | true
metadata | object | Refer to the Kubernetes API documentation for the fields of the `metadata` field. | true
spec | object | PackageBundleSpec defines the desired state of PackageBundle. | false
status | object | PackageBundleStatus defines the observed state of PackageBundle. | false

PackageBundle.spec

PackageBundleSpec defines the desired state of PackageBundle.

Name | Type | Description | Required
packages | []object | Packages supported by this bundle. | true
minControllerVersion | string | Minimum required packages controller version | false

PackageBundle.spec.packages[index]

BundlePackage specifies a package within a bundle.

Name | Type | Description | Required
source | object | Source location for the package (probably a helm chart). | true
name | string | Name of the package. | false
workloadonly | boolean | WorkloadOnly specifies if the package should be installed only on the workload cluster | false

PackageBundle.spec.packages[index].source

Source location for the package (probably a helm chart).

Name | Type | Description | Required
repository | string | Repository within the Registry where the package is found. | true
versions | []object | Versions of the package supported by this bundle. | true
registry | string | Registry in which the package is found. | false

PackageBundle.spec.packages[index].source.versions[index]

SourceVersion describes a version of a package within a repository.

Name | Type | Description | Required
digest | string | Digest is a checksum value identifying the version of the package and its contents. | true
name | string | Name is a human-friendly description of the version, e.g. "v1.0". | true
dependencies | []string | Dependencies to be installed before the package | false
images | []object | Images is a list of images used by this version of the package. | false
schema | string | Schema is a base64 encoded, gzipped json schema used to validate package configurations. | false
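
Since the schema field is described as a base64 encoded, gzipped JSON schema, it can be turned back into readable JSON with standard tools. The sketch below assumes GNU coreutils base64 and gzip are available and that the complete schema string is supplied on standard input; the helper name decode_package_schema is hypothetical.

```shell
# Hypothetical helper: decode a schema value read from standard input.
# The value is base64 decoded first, then decompressed with gunzip.
decode_package_schema() {
  base64 -d | gunzip -c
}
```

For example, with the full schema string stored in a shell variable named SCHEMA (an assumed name), printf '%s' "$SCHEMA" | decode_package_schema prints the JSON schema used to validate package configurations.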

PackageBundle.spec.packages[index].source.versions[index].images[index]

VersionImages is an image used by a version of a package.

Name | Type | Description | Required
digest | string | Digest is a checksum value identifying the version of the package and its contents. | true
repository | string | Repository within the Registry where the package is found. | true

PackageBundle.status

PackageBundleStatus defines the observed state of PackageBundle.

Name | Type | Description | Required
state | enum | PackageBundleStateEnum defines the observed state of PackageBundle. Enum: available, ignored, invalid, controller upgrade required | true
spec | object | PackageBundleSpec defines the desired state of PackageBundle. | false

PackageBundle.status.spec

PackageBundleSpec defines the desired state of PackageBundle.

Name | Type | Description | Required
packages | []object | Packages supported by this bundle. | true
minControllerVersion | string | Minimum required packages controller version | false

PackageBundle.status.spec.packages[index]

BundlePackage specifies a package within a bundle.

Name | Type | Description | Required
source | object | Source location for the package (probably a helm chart). | true
name | string | Name of the package. | false
workloadonly | boolean | WorkloadOnly specifies if the package should be installed only on the workload cluster | false

PackageBundle.status.spec.packages[index].source

↩ Parent

Source location for the package (probably a helm chart).

Name Type Description Required
repository string Repository within the Registry where the package is found.
true
versions []object Versions of the package supported by this bundle.
true
registry string Registry in which the package is found.
false

PackageBundle.status.spec.packages[index].source.versions[index]

↩ Parent

SourceVersion describes a version of a package within a repository.

Name Type Description Required
digest string Digest is a checksum value identifying the version of the package and its contents.
true
name string Name is a human-friendly description of the version, e.g. "v1.0".
true
dependencies []string Dependencies to be installed before the package
false
images []object Images is a list of images used by this version of the package.
false
schema string Schema is a base64 encoded, gzipped json schema used to validate package configurations.
false

PackageBundle.status.spec.packages[index].source.versions[index].images[index]

↩ Parent

VersionImages is an image used by a version of a package.

Name Type Description Required
digest string Digest is a checksum value identifying the version of the package and its contents.
true
repository string Repository within the Registry where the package is found.
true

Package

↩ Parent

Package is the Schema for the package API.

Name Type Description Required
apiVersion string packages.eks.amazonaws.com/v1alpha1 true
kind string Package true
metadata object Refer to the Kubernetes API documentation for the fields of the `metadata` field. true
spec object PackageSpec defines the desired state of a package.
false
status object PackageStatus defines the observed state of Package.
false

Package.spec

↩ Parent

PackageSpec defines the desired state of a package.

Name Type Description Required
packageName string PackageName is the name of the package as specified in the bundle.
true
config string Config for the package.
false
packageVersion string PackageVersion is a human-friendly version name or sha256 checksum for the package, as specified in the bundle.
false
targetNamespace string TargetNamespace defines where package resources will be deployed.
false
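
As a rough illustration of how the Package.spec fields fit together, the sketch below sets all four of them in one object. The metadata name, the target namespace, and the packageVersion placeholder are assumptions for illustration only; packageVersion has to match a version name or checksum listed in the bundle you are using.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-cert-manager
  namespace: eksa-packages-<cluster-name>
spec:
  # Required: the package name as specified in the bundle.
  packageName: cert-manager
  # Optional: placeholder only; replace with a version name or sha256
  # checksum that appears in your bundle.
  packageVersion: <version-from-bundle>
  # Optional: namespace where the package resources are deployed
  # (illustrative value).
  targetNamespace: cert-manager
  # Optional: package-specific configuration, passed as a string.
  config: |
    global:
      logLevel: 4
```

packageName is the only required field; the other three can be omitted.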

Package.status

↩ Parent

PackageStatus defines the observed state of Package.

Name Type Description Required
currentVersion string Version currently installed.
true
source object Source associated with the installation.
true
detail string Detail of the state.
false
spec object Spec holds the previous settings.
false
state enum State of the installation.

Enum: initializing, installing, installing dependencies, installed, updating, uninstalling, unknown
false
targetVersion string Version to be installed.
false
upgradesAvailable []object UpgradesAvailable indicates upgraded versions in the bundle.
false

Package.status.source

↩ Parent

Source associated with the installation.

Name Type Description Required
digest string Digest is a checksum value identifying the version of the package and its contents.
true
registry string Registry in which the package is found.
true
repository string Repository within the Registry where the package is found.
true
version string Versions of the package supported.
true

Package.status.spec

↩ Parent

Spec holds the previous settings.

Name Type Description Required
packageName string PackageName is the name of the package as specified in the bundle.
true
config string Config for the package.
false
packageVersion string PackageVersion is a human-friendly version name or sha256 checksum for the package, as specified in the bundle.
false
targetNamespace string TargetNamespace defines where package resources will be deployed.
false

Package.status.upgradesAvailable[index]

↩ Parent

PackageAvailableUpgrade details the package’s available upgrade versions.

Name Type Description Required
tag string Tag is a specific version number or sha256 checksum for the package upgrade.
true
version string Version is a human-friendly version name for the package upgrade.
true

5.7.2 - Configuration Best Practice

Best Practice

Modify any package configuration option listed under Reference/Packages through a package YAML file (with kind: Package), applied with the command eksctl anywhere apply package -f packageFileName. Modifying objects outside of package YAML files may lead to unpredictable behavior.

For automatic namespace (targetNamespace) creation, see the createNamespace field in PackageBundleController.spec.

5.7.3 - ADOT Configuration

OpenTelemetry Collector provides a vendor-agnostic solution to receive, process and export telemetry data. It removes the need to run, operate, and maintain multiple agents/collectors. ADOT Collector is an AWS-supported distribution of the OpenTelemetry Collector.

Best Practice

Modify any package configuration option listed under Reference/Packages through a package YAML file (with kind: Package), applied with the command eksctl anywhere apply package -f packageFileName. Modifying objects outside of package YAML files may lead to unpredictable behavior.

For automatic namespace (targetNamespace) creation, see the createNamespace field in PackageBundleController.spec.

Configuration options for ADOT

5.7.3.1 - v0.21.1

Configuring ADOT in EKS Anywhere package spec

Example

A sample configuration is included below for reference. For in-depth examples and use cases, refer to the ADOT workshop.

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-adot
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: adot
  targetNamespace: observability
  config: | 
    mode: daemonset

Configurable parameters and default values under spec.config

Parameter Description Default
General
hostNetwork Indicates if the pod should run in the host networking namespace. false
image.pullPolicy Specifies image pull policy: IfNotPresent, Always, Never. "IfNotPresent"
mode Specifies Collector deployment options: daemonset, deployment, or statefulset. "daemonset"
ports.[*].containerPort Specifies containerPort used. See footnote 1
ports.[*].enabled Indicates if a port is enabled. See footnote 1
ports.[*].hostPort Specifies hostPort used. See footnote 1
ports.[*].protocol Specifies protocol used. See footnote 1
ports.[*].servicePort Specifies servicePort used. See footnote 1
resources.limits.cpu Specifies CPU resource limits for containers. 1
resources.limits.memory Specifies memory resource limits for containers. "2Gi"
Config
config.config Specifies Collector receiver, processor, exporter, and extension configurations. Refer to aws-otel-collector for full details. Note that the EKS Anywhere ADOT package version matches the exact aws-otel-collector version. See footnote 2
config.config.receivers Specifies how data gets into the Collector. Receivers can be either push or pull based, and support one or more data sources. See footnote 2
config.config.processors Specifies how processors are run on data between being received and being exported. Processors are optional, though some are recommended. See footnote 2
config.config.exporters Specifies how data gets sent to backends/destinations. Exporters can be either push or pull based, and support one or more data sources. See footnote 2
config.config.extensions Specifies tasks that do not involve processing telemetry data. Examples of extensions include health monitoring, service discovery, and data forwarding. Extensions are optional. See footnote 2
config.config.service Specifies what components are enabled in the Collector based on the configuration found in the receivers, processors, exporters, and extensions sections. If a component is configured, but not defined within the service section, then it is not enabled. See footnote 2
Deployment mode only
replicaCount Specifies replicaCount for pods. 1
service.type Specifies service types: ClusterIP, NodePort, LoadBalancer, ExternalName. "ClusterIP"
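
The "Deployment mode only" parameters in the table can be combined as in the following sketch. It assumes the dotted parameter names map to nested YAML keys, as they do in the default ports specification; the replica count of 2 is an illustrative value, not a recommendation.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-adot
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: adot
  targetNamespace: observability
  config: |
    # Run the Collector as a Deployment instead of the default DaemonSet.
    mode: deployment
    # Deployment mode only: illustrative replica count (the default is 1).
    replicaCount: 2
    # Deployment mode only: the default service type, written out explicitly.
    service:
      type: ClusterIP
```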

  1. The default ports configuration enables otlp and otlp-http. See the specification below for details.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: |
        ports:
          otlp:
            enabled: true
            containerPort: 4317
            servicePort: 4317
            hostPort: 4317
            protocol: TCP
          otlp-http:
            enabled: true
            containerPort: 4318
            servicePort: 4318
            hostPort: 4318
            protocol: TCP    
    
  2. The default config.config deploys an ADOT Collector with the metrics pipeline, which includes the otlp and prometheus receivers and the logging exporter. See the specification below for details.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: |
        config:
          receivers:
            otlp:
              protocols:
                grpc:
                  endpoint: 0.0.0.0:4317
                http:
                  endpoint: 0.0.0.0:4318
            prometheus:
              config:
                scrape_configs:
                  - job_name: opentelemetry-collector
                    scrape_interval: 10s
                    static_configs:
                      - targets:
                          - ${MY_POD_IP}:8888
          processors:
            batch: {}
            memory_limiter: null
          exporters:
            logging:
              loglevel: info
          extensions:
            health_check: {}
            memory_ballast: {}
          service:
            telemetry:
              metrics:
                address: 0.0.0.0:8888
            extensions:
              - health_check
              - memory_ballast
            pipelines:
              metrics:
                exporters:
                  - logging
                processors:
                  - memory_limiter
                  - batch
                receivers:
                  - otlp
                  - prometheus    
    

5.7.3.2 - v0.23.0

Configuring ADOT in EKS Anywhere package spec

Example

A sample configuration is included below for reference. For in-depth examples and use cases, refer to the ADOT workshop.

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-adot
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: adot
  targetNamespace: observability
  config: | 
    mode: daemonset

Configurable parameters and default values under spec.config

Parameter Description Default
General
hostNetwork Indicates if the pod should run in the host networking namespace. false
image.pullPolicy Specifies image pull policy: IfNotPresent, Always, Never. "IfNotPresent"
mode Specifies Collector deployment options: daemonset, deployment, or statefulset. "daemonset"
ports.[*].containerPort Specifies containerPort used. See footnote 1
ports.[*].enabled Indicates if a port is enabled. See footnote 1
ports.[*].hostPort Specifies hostPort used. See footnote 1
ports.[*].protocol Specifies protocol used. See footnote 1
ports.[*].servicePort Specifies servicePort used. See footnote 1
resources.limits.cpu Specifies CPU resource limits for containers. 1
resources.limits.memory Specifies memory resource limits for containers. "2Gi"
Config
config.config Specifies Collector receiver, processor, exporter, and extension configurations. Refer to aws-otel-collector for full details. Note that the EKS Anywhere ADOT package version matches the exact aws-otel-collector version. See footnote 2
config.config.receivers Specifies how data gets into the Collector. Receivers can be either push or pull based, and support one or more data sources. See footnote 2
config.config.processors Specifies how processors are run on data between being received and being exported. Processors are optional, though some are recommended. See footnote 2
config.config.exporters Specifies how data gets sent to backends/destinations. Exporters can be either push or pull based, and support one or more data sources. See footnote 2
config.config.extensions Specifies tasks that do not involve processing telemetry data. Examples of extensions include health monitoring, service discovery, and data forwarding. Extensions are optional. See footnote 2
config.config.service Specifies what components are enabled in the Collector based on the configuration found in the receivers, processors, exporters, and extensions sections. If a component is configured, but not defined within the service section, then it is not enabled. See footnote 2
Deployment mode only
replicaCount Specifies replicaCount for pods. 1
service.type Specifies service types: ClusterIP, NodePort, LoadBalancer, ExternalName. "ClusterIP"

  1. The default ports configuration enables otlp and otlp-http. See the specification below for details.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: |
        ports:
          otlp:
            enabled: true
            containerPort: 4317
            servicePort: 4317
            hostPort: 4317
            protocol: TCP
          otlp-http:
            enabled: true
            containerPort: 4318
            servicePort: 4318
            hostPort: 4318
            protocol: TCP    
    
  2. The default config.config deploys an ADOT Collector with the metrics pipeline, which includes the otlp and prometheus receivers and the logging exporter. See the specification below for details.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: |
        config:
          receivers:
            otlp:
              protocols:
                grpc:
                  endpoint: 0.0.0.0:4317
                http:
                  endpoint: 0.0.0.0:4318
            prometheus:
              config:
                scrape_configs:
                  - job_name: opentelemetry-collector
                    scrape_interval: 10s
                    static_configs:
                      - targets:
                          - ${MY_POD_IP}:8888
          processors:
            batch: {}
            memory_limiter: null
          exporters:
            logging:
              loglevel: info
          extensions:
            health_check: {}
            memory_ballast: {}
          service:
            telemetry:
              metrics:
                address: 0.0.0.0:8888
            extensions:
              - health_check
              - memory_ballast
            pipelines:
              metrics:
                exporters:
                  - logging
                processors:
                  - memory_limiter
                  - batch
                receivers:
                  - otlp
                  - prometheus    
    

5.7.3.3 - v0.25.0

Configuring ADOT in EKS Anywhere package spec

Example

A sample configuration is included below for reference. For in-depth examples and use cases, refer to the ADOT workshop.

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-adot
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: adot
  targetNamespace: observability
  config: | 
    mode: daemonset

Configurable parameters and default values under spec.config

Parameter Description Default
General
hostNetwork Indicates if the pod should run in the host networking namespace. false
image.pullPolicy Specifies image pull policy: IfNotPresent, Always, Never. "IfNotPresent"
mode Specifies Collector deployment options: daemonset, deployment, or statefulset. "daemonset"
ports.[*].containerPort Specifies containerPort used. See footnote 1
ports.[*].enabled Indicates if a port is enabled. See footnote 1
ports.[*].hostPort Specifies hostPort used. See footnote 1
ports.[*].protocol Specifies protocol used. See footnote 1
ports.[*].servicePort Specifies servicePort used. See footnote 1
resources.limits.cpu Specifies CPU resource limits for containers. 1
resources.limits.memory Specifies memory resource limits for containers. "2Gi"
Config
config.config Specifies Collector receiver, processor, exporter, and extension configurations. Refer to aws-otel-collector for full details. Note that the EKS Anywhere ADOT package version matches the exact aws-otel-collector version. See footnote 2
config.config.receivers Specifies how data gets into the Collector. Receivers can be either push or pull based, and support one or more data sources. See footnote 2
config.config.processors Specifies how processors are run on data between being received and being exported. Processors are optional, though some are recommended. See footnote 2
config.config.exporters Specifies how data gets sent to backends/destinations. Exporters can be either push or pull based, and support one or more data sources. See footnote 2
config.config.extensions Specifies tasks that do not involve processing telemetry data. Examples of extensions include health monitoring, service discovery, and data forwarding. Extensions are optional. See footnote 2
config.config.service Specifies what components are enabled in the Collector based on the configuration found in the receivers, processors, exporters, and extensions sections. If a component is configured, but not defined within the service section, then it is not enabled. See footnote 2
Deployment mode only
replicaCount Specifies replicaCount for pods. 1
service.type Specifies service types: ClusterIP, NodePort, LoadBalancer, ExternalName. "ClusterIP"
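
The general parameters can be overridden in the same way. The sketch below raises the container resource limits and assumes that the dotted names resources.limits.cpu and resources.limits.memory map to nested YAML keys; the limit values are illustrative only.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-adot
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: adot
  targetNamespace: observability
  config: |
    mode: daemonset
    # Keep the pod out of the host networking namespace (the default).
    hostNetwork: false
    resources:
      limits:
        # Illustrative values; the defaults are 1 CPU and "2Gi" of memory.
        cpu: 2
        memory: "4Gi"
```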

  1. The default ports configuration enables otlp and otlp-http. See the specification below for details.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: |
        ports:
          otlp:
            enabled: true
            containerPort: 4317
            servicePort: 4317
            hostPort: 4317
            protocol: TCP
          otlp-http:
            enabled: true
            containerPort: 4318
            servicePort: 4318
            hostPort: 4318
            protocol: TCP    
    
  2. The default config.config deploys an ADOT Collector with the metrics pipeline, which includes the otlp and prometheus receivers and the logging exporter. See the specification below for details.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: |
        config:
          receivers:
            otlp:
              protocols:
                grpc:
                  endpoint: 0.0.0.0:4317
                http:
                  endpoint: 0.0.0.0:4318
            prometheus:
              config:
                scrape_configs:
                  - job_name: opentelemetry-collector
                    scrape_interval: 10s
                    static_configs:
                      - targets:
                          - ${MY_POD_IP}:8888
          processors:
            batch: {}
            memory_limiter: null
          exporters:
            logging:
              loglevel: info
          extensions:
            health_check: {}
            memory_ballast: {}
          service:
            telemetry:
              metrics:
                address: 0.0.0.0:8888
            extensions:
              - health_check
              - memory_ballast
            pipelines:
              metrics:
                exporters:
                  - logging
                processors:
                  - memory_limiter
                  - batch
                receivers:
                  - otlp
                  - prometheus    
    

5.7.4 - Cert-Manager Configuration

The cert-manager package adds certificates and certificate issuers as resource types in Kubernetes clusters, and simplifies the process of obtaining, renewing and using those certificates.

Best Practice

Modify any package configuration option listed under Reference/Packages through a package YAML file (with kind: Package), applied with the command eksctl anywhere apply package -f packageFileName. Modifying objects outside of package YAML files may lead to unpredictable behavior.

For automatic namespace (targetNamespace) creation, see the createNamespace field in PackageBundleController.spec.

Configuration options for Cert-Manager

5.7.4.1 - v1.9.1

Configuring Cert-Manager in EKS Anywhere package spec

Example

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-cert-manager
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: cert-manager
  config: | 
    global:
      logLevel: 4

The following table lists the configurable parameters of the cert-manager package spec and the default values.

Parameter Description Default
General
namespace The namespace to use for installing cert-manager package cert-manager
imagePullPolicy The image pull policy IfNotPresent
global
global.logLevel The log level: integer from 0-6 2
Webhook
webhook.timeoutSeconds The time in seconds to wait for the webhook to connect with the kube-api server 0
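
A minimal sketch of a package spec that sets the Webhook parameter from the table, assuming the dotted name webhook.timeoutSeconds maps to nested YAML keys in the same way global.logLevel does in the example above. The timeout of 10 seconds is an illustrative value.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-cert-manager
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: cert-manager
  config: |
    imagePullPolicy: IfNotPresent
    global:
      # Default log level from the table.
      logLevel: 2
    webhook:
      # Illustrative value; the table lists 0 as the default.
      timeoutSeconds: 10
```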

5.7.5 - Cluster Autoscaler Configuration

Cluster Autoscaler is a component that automatically adjusts the size of a Kubernetes cluster so that all pods have a place to run and there are no unneeded nodes.

Configuration options for Cluster Autoscaler

5.7.5.1 - v9.21.0

Configuring Cluster Autoscaler in EKS Anywhere package spec

Parameter Description Default
General
cloudProvider Cluster Autoscaler cloud provider. This should always be clusterapi (example: cloudProvider: "clusterapi"). "clusterapi"
autoDiscovery.clusterName Name of the Kubernetes cluster this autoscaler package should autoscale (example: autoDiscovery.clusterName: "mgmt-cluster"). false
clusterAPIMode Where Cluster Autoscaler should look for a kubeconfig to communicate with the cluster it will manage. See https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/clusterapi/README.md#connecting-cluster-autoscaler-to-cluster-api-management-and-workload-clusters (example: clusterAPIMode: "incluster-kubeconfig"). "incluster-incluster"
clusterAPICloudConfigPath Path to the kubeconfig for connecting to the Cluster API management cluster; only used if clusterAPIMode is kubeconfig-kubeconfig or incluster-kubeconfig (example: clusterAPICloudConfigPath: "/etc/kubernetes/value"). "/etc/kubernetes/mgmt-kubeconfig"
extraVolumeSecrets Additional volumes to mount from Secrets (example: extraVolumeSecrets: {}). {}
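
The parameters above could be supplied in a package spec along the following lines. This is a sketch under assumptions: the packageName value cluster-autoscaler and the metadata name are not stated in this section, and mgmt-cluster is the placeholder cluster name taken from the table's example.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-cluster-autoscaler
  namespace: eksa-packages-<cluster-name>
spec:
  # Assumed package name; confirm it against the packages in your bundle.
  packageName: cluster-autoscaler
  config: |
    # Should always be clusterapi, per the table above.
    cloudProvider: "clusterapi"
    autoDiscovery:
      # Placeholder: the name of the cluster to autoscale.
      clusterName: "mgmt-cluster"
```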

5.7.6 - Emissary Configuration

Emissary Ingress is an open-source Kubernetes-native API Gateway + Layer 7 load balancer + Kubernetes Ingress built on Envoy Proxy.

Best Practice

Modify any package configuration option listed under Reference/Packages through a package YAML file (with kind: Package), applied with the command eksctl anywhere apply package -f packageFileName. Modifying objects outside of package YAML files may lead to unpredictable behavior.

For automatic namespace (targetNamespace) creation, see the createNamespace field in PackageBundleController.spec.

Configuration options for Emissary

5.7.6.1 - v3.0.0

Configuring Emissary Ingress in EKS Anywhere package spec

Parameter Description Default
General
hostNetwork Whether Emissary will use the host network, useful for on-premises setups (example: hostNetwork: false). false
createDefaultListeners Whether Emissary should be created with default listeners, HTTP on port 8080 and HTTPS on port 8443 (example: createDefaultListeners: false). false
replicaCount Replica count for Emissary to deploy (example: replicaCount: 2). 2
daemonSet Whether to create Emissary as a DaemonSet instead of a Deployment (example: daemonSet: false). false
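
A minimal sketch of an Emissary package spec using the parameters above. The packageName value emissary and the metadata name are assumptions not stated in this section; enabling the default listeners is shown only as an illustration of overriding a default.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-emissary
  namespace: eksa-packages-<cluster-name>
spec:
  # Assumed package name; confirm it against the packages in your bundle.
  packageName: emissary
  config: |
    # Deploy two replicas (the default).
    replicaCount: 2
    # Override the default (false) to create the HTTP 8080 and HTTPS 8443 listeners.
    createDefaultListeners: true
    hostNetwork: false
```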

5.7.6.2 - v3.3.0

Emissary version 3.3.0 has decoupled the CRD portion of the package, and now supports installing multiple instances of the emissary package in the same cluster.

Configuring Emissary Ingress in EKS Anywhere package spec

Parameter Description Default
General
hostNetwork Whether Emissary will use the host network, useful for on-premises setups (example: hostNetwork: false). false
createDefaultListeners Whether Emissary should be created with default listeners, HTTP on port 8080 and HTTPS on port 8443 (example: createDefaultListeners: false). false
replicaCount Replica count for Emissary to deploy (example: replicaCount: 2). 2
daemonSet Whether to create Emissary as a DaemonSet instead of a Deployment (example: daemonSet: false). false

5.7.7 - Harbor configuration

Harbor is an open source trusted cloud native registry project that stores, signs, and scans content. Harbor extends the open source Docker Distribution by adding the functionalities usually required by users such as security, identity and management. Having a registry closer to the build and run environment can improve the image transfer efficiency. Harbor supports replication of images between registries, and also offers advanced security features such as user management, access control and activity auditing.

Best Practice

Modify any package configuration option listed under Reference/Packages through a package YAML file (with kind: Package), applied with the command eksctl anywhere apply package -f packageFileName. Modifying objects outside of package YAML files may lead to unpredictable behavior.

For automatic namespace (targetNamespace) creation, see the createNamespace field in PackageBundleController.spec.

Configuration options for Harbor

5.7.7.1 - v2.5.0

Trivy, Notary and Chartmuseum are not supported at this moment.

Configuring Harbor in EKS Anywhere package spec

The following table lists the configurable parameters of the Harbor package spec and the default values.

Parameter Description Default
General
externalURL The external URL for Harbor core service https://127.0.0.1:30003
imagePullPolicy The image pull policy IfNotPresent
logLevel The log level: debug, info, warning, error or fatal info
harborAdminPassword The initial password of the Harbor admin account. Change it from the portal after launching Harbor Harbor12345
secretKey The key used for encryption. Must be a string of 16 chars ""
Expose
expose.type How to expose the service: nodePort or loadBalancer, other values will be ignored and the creation of the service will be skipped. nodePort
expose.tls.enabled Enable TLS or not. true
expose.tls.certSource The source of the TLS certificate. Set as auto, secret or none and fill the information in the corresponding section: 1) auto: generate the TLS certificate automatically 2) secret: read the TLS certificate from the specified secret. The TLS certificate can be generated manually or by cert manager 3) none: configure no TLS certificate. secret
expose.tls.auto.commonName The common name used to generate the certificate. It’s necessary when expose.tls.certSource is set to auto
expose.tls.secret.secretName The name of the secret which contains keys named: tls.crt - the certificate; tls.key - the private key harbor-tls-secret
expose.nodePort.name The name of the NodePort service harbor
expose.nodePort.ports.http.port The service port Harbor listens on when serving HTTP 80
expose.nodePort.ports.http.nodePort The node port Harbor listens on when serving HTTP 30002
expose.nodePort.ports.https.port The service port Harbor listens on when serving HTTPS 443
expose.nodePort.ports.https.nodePort The node port Harbor listens on when serving HTTPS 30003
expose.loadBalancer.name The name of the service harbor
expose.loadBalancer.IP The IP address of the loadBalancer. It only works when the loadBalancer supports assigning an IP address ""
expose.loadBalancer.ports.httpPort The service port Harbor listens on when serving HTTP 80
expose.loadBalancer.ports.httpsPort The service port Harbor listens on when serving HTTPS 30002
expose.loadBalancer.annotations The annotations attached to the loadBalancer service {}
expose.loadBalancer.sourceRanges List of IP address ranges to assign to loadBalancerSourceRanges []
Internal TLS
internalTLS.enabled Enable TLS for the components (core, jobservice, portal, and registry) true
Persistence
persistence.resourcePolicy Set it to keep to avoid removing PVCs during a helm delete operation. Leaving it empty will delete PVCs after the chart is deleted. Does not affect PVCs created for internal database and redis components. keep
persistence.persistentVolumeClaim.registry.size The size of the volume 5Gi
persistence.persistentVolumeClaim.registry.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning ""
persistence.persistentVolumeClaim.jobservice.size The size of the volume 1Gi
persistence.persistentVolumeClaim.jobservice.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning ""
persistence.persistentVolumeClaim.database.size The size of the volume. If an external database is used, the setting will be ignored 1Gi
persistence.persistentVolumeClaim.database.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning. If an external database is used, the setting will be ignored ""
persistence.persistentVolumeClaim.redis.size The size of the volume. If an external Redis is used, the setting will be ignored 1Gi
persistence.persistentVolumeClaim.redis.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning. If an external Redis is used, the setting will be ignored ""
Registry
registry.relativeurls If true, the registry returns relative URLs in Location headers. The client is responsible for resolving the correct URL. Needed if harbor is behind a reverse proxy false
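
The following sketch shows how a few of the parameters above might be combined, here switching the TLS certificate source from the default secret to auto. The packageName value harbor, the metadata name, the hostname harbor.eksa.demo, and the secretKey value are assumptions for illustration; secretKey must be a string of 16 characters, and a real deployment would use its own value.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-harbor
  namespace: eksa-packages-<cluster-name>
spec:
  # Assumed package name; confirm it against the packages in your bundle.
  packageName: harbor
  config: |
    # Placeholder 16-character encryption key; replace with your own.
    secretKey: "use-a-secret-key"
    # Hypothetical hostname with the default HTTPS node port.
    externalURL: https://harbor.eksa.demo:30003
    expose:
      tls:
        # Generate the TLS certificate automatically.
        certSource: auto
        auto:
          # Required when certSource is auto.
          commonName: "harbor.eksa.demo"
```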

5.7.7.2 - v2.5.1

Notary and Chartmuseum are not supported at this moment.

Configuring Harbor in EKS Anywhere package spec

The following table lists the configurable parameters of the Harbor package spec and the default values.

Parameter Description Default
General
externalURL The external URL for Harbor core service https://127.0.0.1:30003
imagePullPolicy The image pull policy IfNotPresent
logLevel The log level: debug, info, warning, error or fatal info
harborAdminPassword The initial password of the Harbor admin account. Change it from the portal after launching Harbor Harbor12345
secretKey The key used for encryption. Must be a string of 16 chars ""
Expose
expose.type How to expose the service: nodePort or loadBalancer, other values will be ignored and the creation of the service will be skipped. nodePort
expose.tls.enabled Enable TLS or not. true
expose.tls.certSource The source of the TLS certificate. Set as auto, secret or none and fill the information in the corresponding section: 1) auto: generate the TLS certificate automatically 2) secret: read the TLS certificate from the specified secret. The TLS certificate can be generated manually or by cert manager 3) none: configure no TLS certificate. secret
expose.tls.auto.commonName The common name used to generate the certificate. It’s necessary when expose.tls.certSource is set to auto
expose.tls.secret.secretName The name of the secret which contains keys named: tls.crt - the certificate; tls.key - the private key harbor-tls-secret
expose.nodePort.name The name of the NodePort service harbor
expose.nodePort.ports.http.port The service port Harbor listens on when serving HTTP 80
expose.nodePort.ports.http.nodePort The node port Harbor listens on when serving HTTP 30002
expose.nodePort.ports.https.port The service port Harbor listens on when serving HTTPS 443
expose.nodePort.ports.https.nodePort The node port Harbor listens on when serving HTTPS 30003
expose.loadBalancer.name The name of the service harbor
expose.loadBalancer.IP The IP address of the loadBalancer. It only works when loadBalancer supports assigning an IP address ""
expose.loadBalancer.ports.httpPort The service port Harbor listens on when serving HTTP 80
expose.loadBalancer.ports.httpsPort The service port Harbor listens on when serving HTTPS 30002
expose.loadBalancer.annotations The annotations attached to the loadBalancer service {}
expose.loadBalancer.sourceRanges List of IP address ranges to assign to loadBalancerSourceRanges []
Internal TLS
internalTLS.enabled Enable TLS for the components (core, jobservice, portal, and registry) true
Persistence
persistence.resourcePolicy Set it to keep to avoid removing PVCs during a helm delete operation. Leaving it empty deletes PVCs after the chart is deleted. Does not affect PVCs created for the internal database and redis components. keep
persistence.persistentVolumeClaim.registry.size The size of the volume 5Gi
persistence.persistentVolumeClaim.registry.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning ""
persistence.persistentVolumeClaim.jobservice.size The size of the volume 1Gi
persistence.persistentVolumeClaim.jobservice.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning ""
persistence.persistentVolumeClaim.database.size The size of the volume. If an external database is used, the setting will be ignored 1Gi
persistence.persistentVolumeClaim.database.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning. If an external database is used, the setting will be ignored ""
persistence.persistentVolumeClaim.redis.size The size of the volume. If an external Redis is used, the setting will be ignored 1Gi
persistence.persistentVolumeClaim.redis.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning. If an external Redis is used, the setting will be ignored ""
persistence.persistentVolumeClaim.trivy.size The size of the volume 5Gi
persistence.persistentVolumeClaim.trivy.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning ""
Trivy
trivy.enabled The flag to enable Trivy scanner true
trivy.vulnType Comma-separated list of vulnerability types. Possible values os and library. os,library
trivy.severity Comma-separated list of severities to be checked UNKNOWN,LOW,MEDIUM,HIGH,CRITICAL
trivy.skipUpdate The flag to disable Trivy DB downloads from GitHub false
trivy.offlineScan The flag prevents Trivy from sending API requests to identify dependencies. false
Registry
registry.relativeurls If true, the registry returns relative URLs in Location headers. The client is responsible for resolving the correct URL. Needed if harbor is behind a reverse proxy false
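
As an illustration, the parameters above can be combined to expose Harbor through a LoadBalancer service instead of the default NodePort. This is only a sketch under stated assumptions: the package name my-harbor, the target namespace, the hostname harbor.example.com, the IP address, and the source range are hypothetical, and the TLS secret is assumed to already exist in the target namespace.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-harbor                      # hypothetical name
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: harbor
  targetNamespace: harbor              # assumed namespace
  config: |
    # Must be a string of 16 characters (placeholder value)
    secretKey: "use-a-secret-key"
    # Hypothetical hostname; should match how clients reach Harbor
    externalURL: https://harbor.example.com
    expose:
      type: loadBalancer
      tls:
        certSource: secret
        secret:
          secretName: harbor-tls-secret
      loadBalancer:
        # Only honored if the load balancer supports assigning an IP address
        IP: 10.220.0.93
        ports:
          httpsPort: 443
        sourceRanges:
          - 10.0.0.0/8                 # hypothetical client range
```

Any parameter left out of config keeps the default listed in the table.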

5.7.7.3 - v2.7.1

Notary and Chartmuseum are not supported at this moment.

Configuring Harbor in EKS Anywhere package spec

The following table lists the configurable parameters of the Harbor package spec and the default values.

Parameter Description Default
General
externalURL The external URL for Harbor core service https://127.0.0.1:30003
imagePullPolicy The image pull policy IfNotPresent
logLevel The log level: debug, info, warning, error or fatal info
harborAdminPassword The initial password of the Harbor admin account. Change it from the portal after launching Harbor Harbor12345
secretKey The key used for encryption. Must be a string of 16 chars ""
Expose
expose.type How to expose the service: nodePort or loadBalancer, other values will be ignored and the creation of the service will be skipped. nodePort
expose.tls.enabled Enable TLS or not. true
expose.tls.certSource The source of the TLS certificate. Set as auto, secret or none and fill the information in the corresponding section: 1) auto: generate the TLS certificate automatically 2) secret: read the TLS certificate from the specified secret. The TLS certificate can be generated manually or by cert manager 3) none: configure no TLS certificate. secret
expose.tls.auto.commonName The common name used to generate the certificate. It’s necessary when expose.tls.certSource is set to auto
expose.tls.secret.secretName The name of the secret which contains keys named: tls.crt - the certificate; tls.key - the private key harbor-tls-secret
expose.nodePort.name The name of the NodePort service harbor
expose.nodePort.ports.http.port The service port Harbor listens on when serving HTTP 80
expose.nodePort.ports.http.nodePort The node port Harbor listens on when serving HTTP 30002
expose.nodePort.ports.https.port The service port Harbor listens on when serving HTTPS 443
expose.nodePort.ports.https.nodePort The node port Harbor listens on when serving HTTPS 30003
expose.loadBalancer.name The name of the service harbor
expose.loadBalancer.IP The IP address of the loadBalancer. It only works when loadBalancer supports assigning an IP address ""
expose.loadBalancer.ports.httpPort The service port Harbor listens on when serving HTTP 80
expose.loadBalancer.ports.httpsPort The service port Harbor listens on when serving HTTPS 30002
expose.loadBalancer.annotations The annotations attached to the loadBalancer service {}
expose.loadBalancer.sourceRanges List of IP address ranges to assign to loadBalancerSourceRanges []
Internal TLS
internalTLS.enabled Enable TLS for the components (core, jobservice, portal, and registry) true
Persistence
persistence.resourcePolicy Set it to keep to avoid removing PVCs during a helm delete operation. Leaving it empty deletes PVCs after the chart is deleted. Does not affect PVCs created for the internal database and redis components. keep
persistence.persistentVolumeClaim.registry.size The size of the volume 5Gi
persistence.persistentVolumeClaim.registry.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning ""
persistence.persistentVolumeClaim.jobservice.jobLog.size The size of the volume 1Gi
persistence.persistentVolumeClaim.jobservice.jobLog.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning ""
persistence.persistentVolumeClaim.database.size The size of the volume. If an external database is used, the setting will be ignored 1Gi
persistence.persistentVolumeClaim.database.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning. If an external database is used, the setting will be ignored ""
persistence.persistentVolumeClaim.redis.size The size of the volume. If an external Redis is used, the setting will be ignored 1Gi
persistence.persistentVolumeClaim.redis.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning. If an external Redis is used, the setting will be ignored ""
persistence.persistentVolumeClaim.trivy.size The size of the volume 5Gi
persistence.persistentVolumeClaim.trivy.storageClass Specify the storageClass used to provision the volume, or the default StorageClass will be used (the default). Set it to - to disable dynamic provisioning ""
Trivy
trivy.enabled The flag to enable Trivy scanner true
trivy.vulnType Comma-separated list of vulnerability types. Possible values os and library. os,library
trivy.severity Comma-separated list of severities to be checked UNKNOWN,LOW,MEDIUM,HIGH,CRITICAL
trivy.skipUpdate The flag to disable Trivy DB downloads from GitHub false
trivy.offlineScan The flag prevents Trivy from sending API requests to identify dependencies. false
Registry
registry.relativeurls If true, the registry returns relative URLs in Location headers. The client is responsible for resolving the correct URL. Needed if harbor is behind a reverse proxy false
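
The table above can be read as a nested YAML structure under spec.config. The following sketch shows one possible v2.7.1 configuration that keeps the default NodePort exposure, resizes two volumes, and narrows the Trivy severities. The package name, target namespace, volume sizes, and severity list are assumptions chosen for illustration, and the TLS secret is assumed to exist already.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-harbor                      # hypothetical name
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: harbor
  targetNamespace: harbor              # assumed namespace
  config: |
    # Must be a string of 16 characters (placeholder value)
    secretKey: "use-a-secret-key"
    externalURL: https://127.0.0.1:30003
    expose:
      tls:
        certSource: secret
        secret:
          secretName: harbor-tls-secret
    persistence:
      persistentVolumeClaim:
        registry:
          size: 20Gi                   # illustrative size
        jobservice:
          jobLog:                      # nested under jobLog in this version's table
            size: 2Gi                  # illustrative size
    trivy:
      severity: HIGH,CRITICAL          # report only the two highest severities
```

Note that in this version's table the job service volume settings sit under jobservice.jobLog.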

5.7.8 - MetalLB Configuration

MetalLB is a load-balancer implementation for on-premises Kubernetes clusters, using standard routing protocols.

Best Practice

Any package configuration options listed under Reference/Packages should be modified in package yaml files (with kind: Package) and applied with the command eksctl anywhere apply package -f packageFileName. Modifying objects outside of package yaml files may lead to unpredictable behavior.

For automatic namespace (targetNamespace) creation, see the createNamespace field in PackagebundleController.spec

Configuration options for MetalLB

5.7.8.1 - v0.12.1

FRRouting is currently not supported for MetalLB.

Configuring MetalLB in EKS Anywhere package spec

Example

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: mylb
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: metallb
  targetNamespace: metallb-system
  config: |
    IPAddressPools:
      - name: default
        addresses:
          - 10.220.0.93/32
          - 10.220.0.94/32
          - 10.220.0.95/32
      - name: bgp
        addresses:
          - 10.220.0.97-10.220.0.99
    L2Advertisements:
      - IPAddressPools:
          - default
    BGPAdvertisements:
      - IPAddressPools:
          - bgp 
    BGPPeers:
      - myASN: 123
        peerASN: 55001
        peerAddress: 1.2.3.4
        keepaliveTime: 30s
Parameter Description Default
IPAddressPools[] A list of IPAddressPool. None
L2Advertisements[] A list of L2Advertisement. None
BGPAdvertisements[] A list of BGPAdvertisement. None
BGPPeers[] A list of BGPPeer. None
IPAddressPool A list of IP address ranges over which MetalLB has authority. You can list multiple ranges in a single pool and they will all share the same settings. Each range can be either a CIDR prefix, or an explicit start-end range of IPs.
name Name for the address pool. None
addresses[] A list of strings representing CIDR or IP ranges. None
autoAssign AutoAssign flag used to prevent MetalLB from automatic allocation for a pool. true
L2Advertisement L2Advertisement allows MetalLB to advertise the LoadBalancer IPs provided by the selected pools via L2.
IPAddressPools[] The list of IPAddressPools to advertise via this advertisement, selected by name. None
BGPAdvertisement BGPAdvertisement allows MetalLB to advertise the IPs coming from the selected IPAddressPools via BGP, setting the parameters of the BGP Advertisement.
aggregationLength The aggregation-length advertisement option lets you “roll up” the /32s into a larger prefix. Defaults to 32. Works for IPv4 addresses. 32
aggregationLengthV6 The aggregation-length advertisement option lets you “roll up” the /128s into a larger prefix. Defaults to 128. Works for IPv6 addresses. 128
communities[] The BGP communities to be associated with the announcement. Each item can be a community of the form 1234:1234 or the name of an alias defined in the Community CRD. None
IPAddressPools[] The list of IPAddressPools to advertise via this advertisement, selected by name. None
localPref The BGP LOCAL_PREF attribute, which is used by the BGP best path algorithm. A path with a higher localpref is preferred over one with a lower localpref. None
BGPPeer Peers for the BGP protocol.
bfdProfile The name of the BFD Profile to be used for the BFD session associated to the BGP session. If not set, the BFD session won’t be set up. None
holdTime Requested BGP hold time, per RFC4271. None
keepaliveTime Requested BGP keepalive time, per RFC4271. None
myASN AS number to use for the local end of the session. None
password Authentication password for routers enforcing TCP MD5 authenticated sessions. None
peerASN AS number to expect from the remote end of the session. None
peerAddress Address to dial when establishing the session. None
peerPort Port to dial when establishing the session. 179
routerID BGP router ID to advertise to the peer. None
sourceAddress Source address to use when establishing the session. None

5.7.8.2 - v0.13.5

FRRouting is currently not supported for MetalLB.

Configuring MetalLB in EKS Anywhere package spec

Starting at v0.13.5, keys within each config section start with lowercase. For example:

L2Advertisements:
    - IPAddressPools:
        - default

Becomes:

L2Advertisements:
    - ipAddressPools:
        - default

Top-level section names remain capitalized as they represent CRDs:

config: |
    IPAddressPools:
    ...

Example

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: mylb
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: metallb
  targetNamespace: metallb-system
  config: |
    IPAddressPools:
      - name: default
        addresses:
          - 10.220.0.93/32
          - 10.220.0.94/32
          - 10.220.0.95/32
      - name: bgp
        addresses:
          - 10.220.0.97-10.220.0.99
    L2Advertisements:
      - ipAddressPools:
          - default
    BGPAdvertisements:
      - ipAddressPools:
          - bgp 
        autoAssign: false
    BGPPeers:
      - myASN: 123
        peerASN: 55001
        peerAddress: 1.2.3.4
        keepaliveTime: 30s
Parameter Description Default Required
IPAddressPools[] A list of IP address pools. See IPAddressPool. None False
L2Advertisements[] A list of Layer 2 advertisements. See L2Advertisement. None False
BGPAdvertisements[] A list of BGP advertisements. See BGPAdvertisement. None False
BGPPeers[] A list of BGP peers. See BGPPeer. None False
IPAddressPool A list of IP address ranges over which MetalLB has authority. You can list multiple ranges in a single pool and they will all share the same settings. Each range can be either a CIDR prefix, or an explicit start-end range of IPs.
name Name for the address pool. None True
addresses[] A list of strings representing CIDR or IP ranges. None True
autoAssign AutoAssign flag used to prevent MetalLB from automatic allocation for a pool. true False
L2Advertisement L2Advertisement allows MetalLB to advertise the LoadBalancer IPs provided by the selected pools via L2.
ipAddressPools[] The list of IPAddressPool names to advertise. None True
name Name for the L2Advertisement. None False
BGPAdvertisement BGPAdvertisement allows MetalLB to advertise the IPs coming from the selected ipAddressPools via BGP, setting the parameters of the BGP Advertisement.
aggregationLength The aggregation-length advertisement option lets you “roll up” the /32s into a larger prefix. Defaults to 32. Works for IPv4 addresses. 32 False
aggregationLengthV6 The aggregation-length advertisement option lets you “roll up” the /128s into a larger prefix. Defaults to 128. Works for IPv6 addresses. 128 False
communities[] The BGP communities to be associated with the announcement. Each item can be a community of the form 1234:1234 or the name of an alias defined in the Community CRD. None False
ipAddressPools[] The list of IPAddressPool names to be advertised via BGP. None True
localPref The BGP LOCAL_PREF attribute, which is used by the BGP best path algorithm. A path with a higher localpref is preferred over one with a lower localpref. None False
peers[] List of peer names. Limits the BGPPeers to which the IPs of the selected pools are advertised. When empty, the load balancer IP is announced to all configured BGPPeers. None False
BGPPeer Peers for the BGP protocol.
holdTime Requested BGP hold time, per RFC4271. None False
keepaliveTime Requested BGP keepalive time, per RFC4271. None False
myASN AS number to use for the local end of the session. None True
password Authentication password for routers enforcing TCP MD5 authenticated sessions. None False
peerASN AS number to expect from the remote end of the session. None True
peerAddress Address to dial when establishing the session. None True
peerPort Port to dial when establishing the session. 179 False
routerID BGP router ID to advertise to the peer. None False
sourceAddress Source address to use when establishing the session. None False
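
To show how the optional fields in the table fit together, here is a hedged sketch of a config that reserves a pool from automatic allocation and aggregates its BGP announcement. It follows the table, where autoAssign is listed under IPAddressPool. The pool name, address range, community value, and localPref value are hypothetical, and the peer settings are copied from the example above.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: mylb
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: metallb
  targetNamespace: metallb-system
  config: |
    IPAddressPools:
      - name: reserved                 # hypothetical pool name
        addresses:
          - 10.220.1.0/24              # hypothetical range
        # Prevent MetalLB from allocating from this pool automatically
        autoAssign: false
    BGPAdvertisements:
      - ipAddressPools:
          - reserved
        # Roll the individual /32 routes up into a single /24 prefix
        aggregationLength: 24
        communities:
          - "64512:1234"               # hypothetical community
        localPref: 100                 # hypothetical preference value
    BGPPeers:
      - myASN: 123
        peerASN: 55001
        peerAddress: 1.2.3.4
```

With autoAssign set to false, a pool is typically used only by services that request it explicitly.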

5.7.8.3 - v0.13.7

FRRouting is currently not supported for MetalLB.

Configuring MetalLB in EKS Anywhere package spec

Starting at v0.13.5, keys within each config section start with lowercase. See v0.13.5 for details.

Example

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: mylb
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: metallb
  targetNamespace: metallb-system
  config: |
    IPAddressPools:
      - name: default
        addresses:
          - 10.220.0.93/32
          - 10.220.0.94/32
          - 10.220.0.95/32
      - name: bgp
        addresses:
          - 10.220.0.97-10.220.0.99
    L2Advertisements:
      - ipAddressPools:
          - default
    BGPAdvertisements:
      - ipAddressPools:
          - bgp 
        autoAssign: false
    BGPPeers:
      - myASN: 123
        peerASN: 55001
        peerAddress: 1.2.3.4
        keepaliveTime: 30s
Parameter Description Default Required
IPAddressPools[] A list of IP address pools. See IPAddressPool. None False
L2Advertisements[] A list of Layer 2 advertisements. See L2Advertisement. None False
BGPAdvertisements[] A list of BGP advertisements. See BGPAdvertisement. None False
BGPPeers[] A list of BGP peers. See BGPPeer. None False
IPAddressPool A list of IP address ranges over which MetalLB has authority. You can list multiple ranges in a single pool and they will all share the same settings. Each range can be either a CIDR prefix, or an explicit start-end range of IPs.
name Name for the address pool. None True
addresses[] A list of strings representing CIDR or IP ranges. None True
autoAssign AutoAssign flag used to prevent MetalLB from automatic allocation for a pool. true False
L2Advertisement L2Advertisement allows MetalLB to advertise the LoadBalancer IPs provided by the selected pools via L2.
ipAddressPools[] The list of IPAddressPool names to advertise. None True
name Name for the L2Advertisement. None False
BGPAdvertisement BGPAdvertisement allows MetalLB to advertise the IPs coming from the selected ipAddressPools via BGP, setting the parameters of the BGP Advertisement.
aggregationLength The aggregation-length advertisement option lets you “roll up” the /32s into a larger prefix. Defaults to 32. Works for IPv4 addresses. 32 False
aggregationLengthV6 The aggregation-length advertisement option lets you “roll up” the /128s into a larger prefix. Defaults to 128. Works for IPv6 addresses. 128 False
communities[] The BGP communities to be associated with the announcement. Each item can be a community of the form 1234:1234 or the name of an alias defined in the Community CRD. None False
ipAddressPools[] The list of IPAddressPool names to be advertised via BGP. None True
localPref The BGP LOCAL_PREF attribute, which is used by the BGP best path algorithm. A path with a higher localpref is preferred over one with a lower localpref. None False
peers[] List of peer names. Limits the BGPPeers to which the IPs of the selected pools are advertised. When empty, the load balancer IP is announced to all configured BGPPeers. None False
BGPPeer Peers for the BGP protocol.
holdTime Requested BGP hold time, per RFC4271. None False
keepaliveTime Requested BGP keepalive time, per RFC4271. None False
myASN AS number to use for the local end of the session. None True
password Authentication password for routers enforcing TCP MD5 authenticated sessions. None False
peerASN AS number to expect from the remote end of the session. None True
peerAddress Address to dial when establishing the session. None True
peerPort Port to dial when establishing the session. 179 False
routerID BGP router ID to advertise to the peer. None False
sourceAddress Source address to use when establishing the session. None False
passwordSecret passwordSecret is a reference to the authentication secret for BGP Peer. The secret must be of type ‘kubernetes.io/basic-auth’ and the password stored under the “password” key. Example:
    passwordSecret:
      name: mySecret
      namespace: metallb-system
None False
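
The passwordSecret row can be made concrete with a pair of manifests. The sketch below first defines a Secret of type kubernetes.io/basic-auth with the password under the password key, then references it from a BGP peer. The secret name bgp-peer-auth and the placeholder password are assumptions; the peer settings are copied from the example above.

```yaml
# Secret holding the BGP session password (hypothetical name and value)
apiVersion: v1
kind: Secret
metadata:
  name: bgp-peer-auth
  namespace: metallb-system
type: kubernetes.io/basic-auth
stringData:
  password: <bgp-session-password>
---
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: mylb
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: metallb
  targetNamespace: metallb-system
  config: |
    BGPPeers:
      - myASN: 123
        peerASN: 55001
        peerAddress: 1.2.3.4
        # Reference the secret instead of setting password inline
        passwordSecret:
          name: bgp-peer-auth
          namespace: metallb-system
```

Referencing a secret keeps the password out of the package yaml file.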

5.7.9 - Metrics Server Configuration

Metrics Server is a scalable, efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines.

Configuration options for Metrics Server

5.7.9.1 - v3.8.2

Configuring Metrics Server in EKS Anywhere package spec

Parameter Description Default
General
args Additional args to provide to metrics-server
Example:
args: ["--kubelet-insecure-tls"]
[]
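
For context, a complete package spec that passes an extra argument might look like the sketch below. The package name my-metrics-server and the target namespace are assumptions, and the flag is the one from the table's example.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: my-metrics-server              # hypothetical name
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: metrics-server
  targetNamespace: kube-system         # assumed namespace
  config: |
    # Additional args passed to metrics-server
    args:
      - "--kubelet-insecure-tls"
```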

5.7.10 - Prometheus Configuration

Prometheus is an open-source systems monitoring and alerting toolkit. It collects and stores metrics as time series data.

Best Practice

Any package configuration options listed under Reference/Packages should be modified in package yaml files (with kind: Package) and applied with the command eksctl anywhere apply package -f packageFileName. Modifying objects outside of package yaml files may lead to unpredictable behavior.

For automatic namespace (targetNamespace) creation, see the createNamespace field in PackagebundleController.spec

Configuration options for Prometheus

5.7.10.1 - v2.39.1

Configuring Prometheus in EKS Anywhere package spec

Example

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: generated-prometheus
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: prometheus
  targetNamespace: observability
  config: |
    server:
      replicaCount: 2
      statefulSet:
        enabled: true

Configurable parameters and default values under spec.config

Parameter Description Default
General
rbac.create Specifies if clusterRole / role and clusterRoleBinding / roleBinding will be created for prometheus-server and node-exporter true
sourceRegistry Specifies image source registry for prometheus-server and node-exporter "783794618700.dkr.ecr.us-west-2.amazonaws.com"
Node-Exporter
nodeExporter.enabled Indicates if node-exporter is enabled true
nodeExporter.hostNetwork Indicates if node-exporter shares the host network namespace true
nodeExporter.hostPID Indicates if node-exporter shares the host process ID namespace true
nodeExporter.image.pullPolicy Specifies node-exporter image pull policy: IfNotPresent, Always, Never "IfNotPresent"
nodeExporter.image.repository Specifies node-exporter image repository "prometheus/node-exporter"
nodeExporter.resources Specifies resource requests and limits of the node-exporter container. Refer to the Kubernetes API documentation ResourceRequirements field for more details {}
nodeExporter.service Specifies how to expose node-exporter as a network service See footnote 1
nodeExporter.tolerations Specifies node tolerations for node-exporter scheduling to nodes with taints. Refer to the Kubernetes API documentation toleration field for more details. See footnote 2
serviceAccounts.nodeExporter.annotations Specifies node-exporter service account annotations {}
serviceAccounts.nodeExporter.create Indicates if node-exporter service account will be created true
serviceAccounts.nodeExporter.name Specifies node-exporter service account name ""
Prometheus-Server
server.enabled Indicates if prometheus-server is enabled true
server.global.evaluation_interval Specifies how frequently the prometheus-server rules are evaluated "1m"
server.global.scrape_interval Specifies how frequently prometheus-server will scrape targets "1m"
server.global.scrape_timeout Specifies how long until a prometheus-server scrape request times out "10s"
server.image.pullPolicy Specifies prometheus-server image pull policy: IfNotPresent, Always, Never "IfNotPresent"
server.image.repository Specifies prometheus-server image repository "prometheus/prometheus"
server.name Specifies prometheus-server container name "server"
server.persistentVolume.accessModes Specifies prometheus-server data Persistent Volume access modes "ReadWriteOnce"
server.persistentVolume.enabled Indicates if prometheus-server will create/use a Persistent Volume Claim true
server.persistentVolume.existingClaim Specifies prometheus-server data Persistent Volume existing claim name. It requires server.persistentVolume.enabled: true. If defined, PVC must be created manually before volume will be bound ""
server.persistentVolume.size Specifies prometheus-server data Persistent Volume size "8Gi"
server.remoteRead Specifies prometheus-server remote read configs. Refer to Prometheus docs remote_read for more details []
server.remoteWrite Specifies prometheus-server remote write configs. Refer to Prometheus docs remote_write for more details []
server.replicaCount Specifies the replicaCount for prometheus-server deployment / statefulSet. Note: server.statefulSet.enabled should be set to true if server.replicaCount is greater than 1 1
server.resources Specifies resource requests and limits of the prometheus-server container. Refer to the Kubernetes API documentation ResourceRequirements field for more details {}
server.retention Specifies prometheus-server data retention period "15d"
server.service Specifies how to expose prometheus-server as a network service See footnote 3
server.statefulSet.enabled Indicates if prometheus-server is deployed as a statefulSet. If set to false, prometheus-server will be deployed as a deployment false
serverFiles."prometheus.yml".scrape_configs Specifies a set of targets and parameters for prometheus-server describing how to scrape them. Refer to Prometheus docs scrape_config for more details See footnote 4
serviceAccounts.server.annotations Specifies prometheus-server service account annotations {}
serviceAccounts.server.create Indicates if prometheus-server service account will be created true
serviceAccounts.server.name Specifies prometheus-server service account name ""
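
As a further illustration, several of the server parameters above can be overridden together. The sketch below assumes a longer retention period, a larger volume, and a shorter scrape interval; all three values are hypothetical, and the rest of the spec is copied from the example at the top of this page.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: generated-prometheus
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: prometheus
  targetNamespace: observability
  config: |
    server:
      retention: "30d"                 # default is 15d
      persistentVolume:
        size: "20Gi"                   # default is 8Gi
      global:
        scrape_interval: "30s"         # default is 1m
        scrape_timeout: "10s"          # kept below the scrape interval
```

A longer retention period generally needs a larger volume, so the two values are best sized together.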

  1. The node-exporter service is exposed as a ClusterIP service with port: 9100 (controlled by nodeExporter.service.servicePort below) and targetPort: 9100 (controlled by nodeExporter.service.hostPort below) by default. Note that the annotation prometheus.io/scrape: "true" is mandatory for node-exporter to be discovered by prometheus-server as a scrape target. See the specification below for details.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: |
        nodeExporter:
          service:
            annotations:
              prometheus.io/scrape: "true"
            hostPort: 9100
            servicePort: 9100
            type: ClusterIP    
    
  2. Node-exporter pods have the following tolerations by default, which allow the DaemonSet to be scheduled on control plane nodes.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: |
        nodeExporter:
          tolerations:
            # For Kubernetes versions prior to 1.24
            - key: "node-role.kubernetes.io/master"
              operator: "Exists"
              effect: "NoSchedule"
            # For Kubernetes version 1.24 and later
            - key: "node-role.kubernetes.io/control-plane"
              operator: "Exists"
              effect: "NoSchedule"    
    
  3. The prometheus-server service is exposed as a ClusterIP service with port: 9090 (controlled by server.service.servicePort below) and targetPort: 9090 (not overridable) by default. See the specification below for details.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: |
        server:
          service:
            enabled: true
            servicePort: 9090
            type: ClusterIP    
    
  4. Prometheus-server by default has the following scrape configs.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: | 
        serverFiles:
          prometheus.yml:
            scrape_configs:
              - job_name: prometheus
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: http
                follow_redirects: true
                enable_http2: true
                static_configs:
                - targets:
                  - localhost:9090
              - job_name: kubernetes-apiservers
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: https
                authorization:
                  type: Bearer
                  credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                tls_config:
                  ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                  insecure_skip_verify: false
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
                  separator: ;
                  regex: default;kubernetes;https
                  replacement: $1
                  action: keep
                kubernetes_sd_configs:
                - role: endpoints
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-nodes
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: https
                authorization:
                  type: Bearer
                  credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                tls_config:
                  ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                  insecure_skip_verify: false
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - separator: ;
                  regex: __meta_kubernetes_node_label_(.+)
                  replacement: $1
                  action: labelmap
                - separator: ;
                  regex: (.*)
                  target_label: __address__
                  replacement: kubernetes.default.svc:443
                  action: replace
                - source_labels: [__meta_kubernetes_node_name]
                  separator: ;
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: /api/v1/nodes/$1/proxy/metrics
                  action: replace
                kubernetes_sd_configs:
                - role: node
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-nodes-cadvisor
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: https
                authorization:
                  type: Bearer
                  credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                tls_config:
                  ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                  insecure_skip_verify: false
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - separator: ;
                  regex: __meta_kubernetes_node_label_(.+)
                  replacement: $1
                  action: labelmap
                - separator: ;
                  regex: (.*)
                  target_label: __address__
                  replacement: kubernetes.default.svc:443
                  action: replace
                - source_labels: [__meta_kubernetes_node_name]
                  separator: ;
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor
                  action: replace
                kubernetes_sd_configs:
                - role: node
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-service-endpoints
                honor_labels: true
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: http
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: keep
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape_slow]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: drop
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
                  separator: ;
                  regex: (https?)
                  target_label: __scheme__
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
                  separator: ;
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: $1
                  action: replace
                - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
                  separator: ;
                  regex: (.+?)(?::\d+)?;(\d+)
                  target_label: __address__
                  replacement: $1:$2
                  action: replace
                - separator: ;
                  regex: __meta_kubernetes_service_annotation_prometheus_io_param_(.+)
                  replacement: __param_$1
                  action: labelmap
                - separator: ;
                  regex: __meta_kubernetes_service_label_(.+)
                  replacement: $1
                  action: labelmap
                - source_labels: [__meta_kubernetes_namespace]
                  separator: ;
                  regex: (.*)
                  target_label: namespace
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_service_name]
                  separator: ;
                  regex: (.*)
                  target_label: service
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_node_name]
                  separator: ;
                  regex: (.*)
                  target_label: node
                  replacement: $1
                  action: replace
                kubernetes_sd_configs:
                - role: endpoints
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-service-endpoints-slow
                honor_labels: true
                honor_timestamps: true
                scrape_interval: 5m
                scrape_timeout: 30s
                metrics_path: /metrics
                scheme: http
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape_slow]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: keep
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
                  separator: ;
                  regex: (https?)
                  target_label: __scheme__
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
                  separator: ;
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: $1
                  action: replace
                - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
                  separator: ;
                  regex: (.+?)(?::\d+)?;(\d+)
                  target_label: __address__
                  replacement: $1:$2
                  action: replace
                - separator: ;
                  regex: __meta_kubernetes_service_annotation_prometheus_io_param_(.+)
                  replacement: __param_$1
                  action: labelmap
                - separator: ;
                  regex: __meta_kubernetes_service_label_(.+)
                  replacement: $1
                  action: labelmap
                - source_labels: [__meta_kubernetes_namespace]
                  separator: ;
                  regex: (.*)
                  target_label: namespace
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_service_name]
                  separator: ;
                  regex: (.*)
                  target_label: service
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_node_name]
                  separator: ;
                  regex: (.*)
                  target_label: node
                  replacement: $1
                  action: replace
                kubernetes_sd_configs:
                - role: endpoints
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: prometheus-pushgateway
                honor_labels: true
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: http
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
                  separator: ;
                  regex: pushgateway
                  replacement: $1
                  action: keep
                kubernetes_sd_configs:
                - role: service
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-services
                honor_labels: true
                honor_timestamps: true
                params:
                  module:
                  - http_2xx
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /probe
                scheme: http
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: keep
                - source_labels: [__address__]
                  separator: ;
                  regex: (.*)
                  target_label: __param_target
                  replacement: $1
                  action: replace
                - separator: ;
                  regex: (.*)
                  target_label: __address__
                  replacement: blackbox
                  action: replace
                - source_labels: [__param_target]
                  separator: ;
                  regex: (.*)
                  target_label: instance
                  replacement: $1
                  action: replace
                - separator: ;
                  regex: __meta_kubernetes_service_label_(.+)
                  replacement: $1
                  action: labelmap
                - source_labels: [__meta_kubernetes_namespace]
                  separator: ;
                  regex: (.*)
                  target_label: namespace
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_service_name]
                  separator: ;
                  regex: (.*)
                  target_label: service
                  replacement: $1
                  action: replace
                kubernetes_sd_configs:
                - role: service
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-pods
                honor_labels: true
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: http
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: keep
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape_slow]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: drop
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
                  separator: ;
                  regex: (https?)
                  target_label: __scheme__
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
                  separator: ;
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: $1
                  action: replace
                - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
                  separator: ;
                  regex: (.+?)(?::\d+)?;(\d+)
                  target_label: __address__
                  replacement: $1:$2
                  action: replace
                - separator: ;
                  regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
                  replacement: __param_$1
                  action: labelmap
                - separator: ;
                  regex: __meta_kubernetes_pod_label_(.+)
                  replacement: $1
                  action: labelmap
                - source_labels: [__meta_kubernetes_namespace]
                  separator: ;
                  regex: (.*)
                  target_label: namespace
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_name]
                  separator: ;
                  regex: (.*)
                  target_label: pod
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_phase]
                  separator: ;
                  regex: Pending|Succeeded|Failed|Completed
                  replacement: $1
                  action: drop
                kubernetes_sd_configs:
                - role: pod
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-pods-slow
                honor_labels: true
                honor_timestamps: true
                scrape_interval: 5m
                scrape_timeout: 30s
                metrics_path: /metrics
                scheme: http
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape_slow]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: keep
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
                  separator: ;
                  regex: (https?)
                  target_label: __scheme__
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
                  separator: ;
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: $1
                  action: replace
                - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
                  separator: ;
                  regex: (.+?)(?::\d+)?;(\d+)
                  target_label: __address__
                  replacement: $1:$2
                  action: replace
                - separator: ;
                  regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
                  replacement: __param_$1
                  action: labelmap
                - separator: ;
                  regex: __meta_kubernetes_pod_label_(.+)
                  replacement: $1
                  action: labelmap
                - source_labels: [__meta_kubernetes_namespace]
                  separator: ;
                  regex: (.*)
                  target_label: namespace
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_name]
                  separator: ;
                  regex: (.*)
                  target_label: pod
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_phase]
                  separator: ;
                  regex: Pending|Succeeded|Failed|Completed
                  replacement: $1
                  action: drop
                kubernetes_sd_configs:
                - role: pod
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
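
The kubernetes-service-endpoints job above selects targets through Service annotations. Prometheus replaces unsupported characters in annotation names with underscores, so prometheus.io/scrape surfaces as __meta_kubernetes_service_annotation_prometheus_io_scrape. The manifest below is a minimal sketch of a Service that would be picked up by that job. The name, namespace, selector, and port values are placeholders, not part of the package.

```yaml
# Hypothetical Service: example-app, its selector, and port 8080 are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: example-app
  namespace: default
  annotations:
    prometheus.io/scrape: "true"    # matched by the `keep` rule of kubernetes-service-endpoints
    prometheus.io/scheme: "http"    # copied to __scheme__ (http or https)
    prometheus.io/path: "/metrics"  # copied to __metrics_path__
    prometheus.io/port: "8080"      # replaces the port portion of __address__
spec:
  selector:
    app: example-app
  ports:
    - name: metrics
      port: 8080
      targetPort: 8080
```

To have the same Service scraped by the kubernetes-service-endpoints-slow job (5m interval) instead, annotate it with prometheus.io/scrape-slow: "true". The kubernetes-pods and kubernetes-pods-slow jobs read the same annotation names from pods rather than Services.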
    

5.7.10.2 - v2.41.1

Configuring Prometheus in EKS Anywhere package spec

Example

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: generated-prometheus
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: prometheus
  targetNamespace: observability
  config: |
    server:
      replicaCount: 2
      statefulSet:
        enabled: true

Configurable parameters and default values under spec.config

| Parameter | Description | Default |
|---|---|---|
| General | | |
| rbac.create | Specifies if clusterRole / role and clusterRoleBinding / roleBinding will be created for prometheus-server and node-exporter | true |
| sourceRegistry | Specifies image source registry for prometheus-server and node-exporter | "783794618700.dkr.ecr.us-west-2.amazonaws.com" |
| Node-Exporter | | |
| nodeExporter.enabled | Indicates if node-exporter is enabled | true |
| nodeExporter.hostNetwork | Indicates if node-exporter shares the host network namespace | true |
| nodeExporter.hostPID | Indicates if node-exporter shares the host process ID namespace | true |
| nodeExporter.image.pullPolicy | Specifies node-exporter image pull policy: IfNotPresent, Always, Never | "IfNotPresent" |
| nodeExporter.image.repository | Specifies node-exporter image repository | "prometheus/node-exporter" |
| nodeExporter.resources | Specifies resource requests and limits of the node-exporter container. Refer to the Kubernetes API documentation ResourceRequirements field for more details | {} |
| nodeExporter.service | Specifies how to expose node-exporter as a network service | See footnote 1 |
| nodeExporter.tolerations | Specifies node tolerations for node-exporter scheduling to nodes with taints. Refer to the Kubernetes API documentation toleration field for more details. | See footnote 2 |
| serviceAccounts.nodeExporter.annotations | Specifies node-exporter service account annotations | {} |
| serviceAccounts.nodeExporter.create | Indicates if node-exporter service account will be created | true |
| serviceAccounts.nodeExporter.name | Specifies node-exporter service account name | "" |
| Prometheus-Server | | |
| server.enabled | Indicates if prometheus-server is enabled | true |
| server.global.evaluation_interval | Specifies how frequently the prometheus-server rules are evaluated | "1m" |
| server.global.scrape_interval | Specifies how frequently prometheus-server will scrape targets | "1m" |
| server.global.scrape_timeout | Specifies how long until a prometheus-server scrape request times out | "10s" |
| server.image.pullPolicy | Specifies prometheus-server image pull policy: IfNotPresent, Always, Never | "IfNotPresent" |
| server.image.repository | Specifies prometheus-server image repository | "prometheus/prometheus" |
| server.name | Specifies prometheus-server container name | "server" |
| server.persistentVolume.accessModes | Specifies prometheus-server data Persistent Volume access modes | "ReadWriteOnce" |
| server.persistentVolume.enabled | Indicates if prometheus-server will create/use a Persistent Volume Claim | true |
| server.persistentVolume.existingClaim | Specifies prometheus-server data Persistent Volume existing claim name. It requires server.persistentVolume.enabled: true. If defined, PVC must be created manually before volume will be bound | "" |
| server.persistentVolume.size | Specifies prometheus-server data Persistent Volume size | "8Gi" |
| server.remoteRead | Specifies prometheus-server remote read configs. Refer to Prometheus docs remote_read for more details | [] |
| server.remoteWrite | Specifies prometheus-server remote write configs. Refer to Prometheus docs remote_write for more details | [] |
| server.replicaCount | Specifies the replicaCount for prometheus-server deployment / statefulSet. Note: server.statefulSet.enabled should be set to true if server.replicaCount is greater than 1 | 1 |
| server.resources | Specifies resource requests and limits of the prometheus-server container. Refer to the Kubernetes API documentation ResourceRequirements field for more details | {} |
| server.retention | Specifies prometheus-server data retention period | "15d" |
| server.service | Specifies how to expose prometheus-server as a network service | See footnote 3 |
| server.statefulSet.enabled | Indicates if prometheus-server is deployed as a statefulSet. If set to false, prometheus-server will be deployed as a deployment | false |
| serverFiles."prometheus.yml".scrape_configs | Specifies a set of targets and parameters for prometheus-server describing how to scrape them. Refer to Prometheus docs scrape_config for more details | See footnote 4 |
| serviceAccounts.server.annotations | Specifies prometheus-server service account annotations | {} |
| serviceAccounts.server.create | Indicates if prometheus-server service account will be created | true |
| serviceAccounts.server.name | Specifies prometheus-server service account name | "" |
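
As an illustration of how several of these parameters combine under spec.config, the following sketch overrides the retention period, scrape interval, and Persistent Volume size. The values are arbitrary examples rather than recommendations, and the remote write URL is a placeholder for an endpoint you would supply yourself.

```yaml
apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: generated-prometheus
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: prometheus
  targetNamespace: observability
  config: |
    server:
      retention: "30d"            # default is "15d"
      global:
        scrape_interval: "30s"    # default is "1m"
        scrape_timeout: "10s"     # must not exceed scrape_interval
      persistentVolume:
        enabled: true
        size: "20Gi"              # default is "8Gi"
      remoteWrite:
        # Placeholder endpoint (assumption); replace with your own remote write target.
        - url: https://remote-write.example.com/api/v1/write
```

Parameters omitted from spec.config keep the defaults listed in the table.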

  1. By default, the node-exporter service is exposed as a ClusterIP with port: 9100 (controlled by nodeExporter.service.servicePort below) and targetPort: 9100 (controlled by nodeExporter.service.hostPort below). The annotation prometheus.io/scrape: "true" is required for prometheus-server to discover node-exporter as a scrape target. See the specification below for details.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: |
        nodeExporter:
          service:
            annotations:
              prometheus.io/scrape: "true"
            hostPort: 9100
            servicePort: 9100
            type: ClusterIP    
    
  2. Node-exporter pods have the following tolerations by default, which allow the DaemonSet to be scheduled on control plane nodes.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: |
        nodeExporter:
          tolerations:
            # For Kubernetes versions prior to 1.24
            - key: "node-role.kubernetes.io/master"
              operator: "Exists"
              effect: "NoSchedule"
            # For Kubernetes version 1.24 and later
            - key: "node-role.kubernetes.io/control-plane"
              operator: "Exists"
              effect: "NoSchedule"    
    
  3. By default, the prometheus-server service is exposed as a ClusterIP with port: 9090 (controlled by server.service.servicePort below) and targetPort: 9090 (not overridable). See the specification below for details.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: |
        server:
          service:
            enabled: true
            servicePort: 9090
            type: ClusterIP    
    
  4. By default, prometheus-server has the following scrape configs.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    ...
    spec:
      config: | 
        serverFiles:
          prometheus.yml:
            scrape_configs:
              - job_name: prometheus
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: http
                follow_redirects: true
                enable_http2: true
                static_configs:
                - targets:
                  - localhost:9090
              - job_name: kubernetes-apiservers
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: https
                authorization:
                  type: Bearer
                  credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                tls_config:
                  ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                  insecure_skip_verify: false
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
                  separator: ;
                  regex: default;kubernetes;https
                  replacement: $1
                  action: keep
                kubernetes_sd_configs:
                - role: endpoints
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-nodes
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: https
                authorization:
                  type: Bearer
                  credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                tls_config:
                  ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                  insecure_skip_verify: false
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - separator: ;
                  regex: __meta_kubernetes_node_label_(.+)
                  replacement: $1
                  action: labelmap
                - separator: ;
                  regex: (.*)
                  target_label: __address__
                  replacement: kubernetes.default.svc:443
                  action: replace
                - source_labels: [__meta_kubernetes_node_name]
                  separator: ;
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: /api/v1/nodes/$1/proxy/metrics
                  action: replace
                kubernetes_sd_configs:
                - role: node
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-nodes-cadvisor
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: https
                authorization:
                  type: Bearer
                  credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                tls_config:
                  ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                  insecure_skip_verify: false
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - separator: ;
                  regex: __meta_kubernetes_node_label_(.+)
                  replacement: $1
                  action: labelmap
                - separator: ;
                  regex: (.*)
                  target_label: __address__
                  replacement: kubernetes.default.svc:443
                  action: replace
                - source_labels: [__meta_kubernetes_node_name]
                  separator: ;
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor
                  action: replace
                kubernetes_sd_configs:
                - role: node
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-service-endpoints
                honor_labels: true
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: http
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: keep
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape_slow]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: drop
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
                  separator: ;
                  regex: (https?)
                  target_label: __scheme__
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
                  separator: ;
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: $1
                  action: replace
                - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
                  separator: ;
                  regex: (.+?)(?::\d+)?;(\d+)
                  target_label: __address__
                  replacement: $1:$2
                  action: replace
                - separator: ;
                  regex: __meta_kubernetes_service_annotation_prometheus_io_param_(.+)
                  replacement: __param_$1
                  action: labelmap
                - separator: ;
                  regex: __meta_kubernetes_service_label_(.+)
                  replacement: $1
                  action: labelmap
                - source_labels: [__meta_kubernetes_namespace]
                  separator: ;
                  regex: (.*)
                  target_label: namespace
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_service_name]
                  separator: ;
                  regex: (.*)
                  target_label: service
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_node_name]
                  separator: ;
                  regex: (.*)
                  target_label: node
                  replacement: $1
                  action: replace
                kubernetes_sd_configs:
                - role: endpoints
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-service-endpoints-slow
                honor_labels: true
                honor_timestamps: true
                scrape_interval: 5m
                scrape_timeout: 30s
                metrics_path: /metrics
                scheme: http
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape_slow]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: keep
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
                  separator: ;
                  regex: (https?)
                  target_label: __scheme__
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
                  separator: ;
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: $1
                  action: replace
                - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
                  separator: ;
                  regex: (.+?)(?::\d+)?;(\d+)
                  target_label: __address__
                  replacement: $1:$2
                  action: replace
                - separator: ;
                  regex: __meta_kubernetes_service_annotation_prometheus_io_param_(.+)
                  replacement: __param_$1
                  action: labelmap
                - separator: ;
                  regex: __meta_kubernetes_service_label_(.+)
                  replacement: $1
                  action: labelmap
                - source_labels: [__meta_kubernetes_namespace]
                  separator: ;
                  regex: (.*)
                  target_label: namespace
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_service_name]
                  separator: ;
                  regex: (.*)
                  target_label: service
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_node_name]
                  separator: ;
                  regex: (.*)
                  target_label: node
                  replacement: $1
                  action: replace
                kubernetes_sd_configs:
                - role: endpoints
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: prometheus-pushgateway
                honor_labels: true
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: http
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
                  separator: ;
                  regex: pushgateway
                  replacement: $1
                  action: keep
                kubernetes_sd_configs:
                - role: service
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-services
                honor_labels: true
                honor_timestamps: true
                params:
                  module:
                  - http_2xx
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /probe
                scheme: http
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: keep
                - source_labels: [__address__]
                  separator: ;
                  regex: (.*)
                  target_label: __param_target
                  replacement: $1
                  action: replace
                - separator: ;
                  regex: (.*)
                  target_label: __address__
                  replacement: blackbox
                  action: replace
                - source_labels: [__param_target]
                  separator: ;
                  regex: (.*)
                  target_label: instance
                  replacement: $1
                  action: replace
                - separator: ;
                  regex: __meta_kubernetes_service_label_(.+)
                  replacement: $1
                  action: labelmap
                - source_labels: [__meta_kubernetes_namespace]
                  separator: ;
                  regex: (.*)
                  target_label: namespace
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_service_name]
                  separator: ;
                  regex: (.*)
                  target_label: service
                  replacement: $1
                  action: replace
                kubernetes_sd_configs:
                - role: service
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-pods
                honor_labels: true
                honor_timestamps: true
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /metrics
                scheme: http
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: keep
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape_slow]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: drop
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
                  separator: ;
                  regex: (https?)
                  target_label: __scheme__
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
                  separator: ;
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: $1
                  action: replace
                - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
                  separator: ;
                  regex: (.+?)(?::\d+)?;(\d+)
                  target_label: __address__
                  replacement: $1:$2
                  action: replace
                - separator: ;
                  regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
                  replacement: __param_$1
                  action: labelmap
                - separator: ;
                  regex: __meta_kubernetes_pod_label_(.+)
                  replacement: $1
                  action: labelmap
                - source_labels: [__meta_kubernetes_namespace]
                  separator: ;
                  regex: (.*)
                  target_label: namespace
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_name]
                  separator: ;
                  regex: (.*)
                  target_label: pod
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_phase]
                  separator: ;
                  regex: Pending|Succeeded|Failed|Completed
                  replacement: $1
                  action: drop
                kubernetes_sd_configs:
                - role: pod
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
              - job_name: kubernetes-pods-slow
                honor_labels: true
                honor_timestamps: true
                scrape_interval: 5m
                scrape_timeout: 30s
                metrics_path: /metrics
                scheme: http
                follow_redirects: true
                enable_http2: true
                relabel_configs:
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape_slow]
                  separator: ;
                  regex: "true"
                  replacement: $1
                  action: keep
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
                  separator: ;
                  regex: (https?)
                  target_label: __scheme__
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
                  separator: ;
                  regex: (.+)
                  target_label: __metrics_path__
                  replacement: $1
                  action: replace
                - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
                  separator: ;
                  regex: (.+?)(?::\d+)?;(\d+)
                  target_label: __address__
                  replacement: $1:$2
                  action: replace
                - separator: ;
                  regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
                  replacement: __param_$1
                  action: labelmap
                - separator: ;
                  regex: __meta_kubernetes_pod_label_(.+)
                  replacement: $1
                  action: labelmap
                - source_labels: [__meta_kubernetes_namespace]
                  separator: ;
                  regex: (.*)
                  target_label: namespace
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_name]
                  separator: ;
                  regex: (.*)
                  target_label: pod
                  replacement: $1
                  action: replace
                - source_labels: [__meta_kubernetes_pod_phase]
                  separator: ;
                  regex: Pending|Succeeded|Failed|Completed
                  replacement: $1
                  action: drop
                kubernetes_sd_configs:
                - role: pod
                  kubeconfig_file: ""
                  follow_redirects: true
                  enable_http2: true
    

5.8 - What's New?

Unreleased

v0.14.5

Fixed

  • Fix kubectl get call to point to full API name (#5342)
  • Expand all kubectl calls to fully qualified names (#5347)

v0.14.4

Added

  • --no-timeouts flag in create and upgrade commands to disable timeout for all wait operations
  • Management resources backup procedure with clusterctl

v0.14.3

Added

  • --aws-region flag to copy packages command.

Upgraded

  • CAPAS from v0.1.22 to v0.1.24.

v0.14.2

Added

  • Enabled support for Kubernetes version 1.25

v0.14.1

Added

  • Support for authenticated pulls from registry mirror (#4796)
  • Option to override default nodeStartupTimeout in machine health check (#4800)
  • Validate control plane endpoint with pods and services CIDR blocks (#4816)

Fixed

  • Fixed an issue where registry mirror settings weren’t being applied properly on Bottlerocket nodes for the Tinkerbell provider

v0.14.0

Added

  • Add support for EKS Anywhere on AWS Snow (#1042)
  • Static IP support for Bottlerocket (#4359)
  • Add registry mirror support for curated packages
  • Add copy packages command (#4420)

Fixed

  • Improve management cluster name validation for workload clusters

v0.13.1

Added

  • Multi-region support for all supported curated packages

Fixed

  • Fixed nil pointer in eksctl anywhere upgrade plan command

v0.13.0

Added

  • Workload clusters full lifecycle API support for vSphere and Docker (#1090)
  • Single node cluster support for Bare Metal provider
  • Cilium updated to version v1.11.10
  • CLI high verbosity log output is automatically included in the support bundle after a CLI cluster command error (#1703 implemented by #4289)
  • Allow configuring the machine health check timeout through a new flag --unhealthy-machine-timeout (#3918 implemented by #4123)
  • Ability to configure rolling upgrade for Bare Metal and CloudStack via maxSurge and maxUnavailable parameters
  • New Nutanix Provider
  • Workload clusters support for Bare Metal
  • VM Tagging support for vSphere VMs created in the cluster (#4228)
  • Support for new curated packages:
    • Prometheus v2.39.1
  • Updated curated packages' versions:
    • ADOT v0.23.0 upgraded from v0.21.1
    • Emissary v3.3.0 upgraded from v3.0.0
    • Metallb v0.13.7 upgraded from v0.13.5
  • Support for packages controller to create target namespaces #601
  • (For more EKS Anywhere packages info: v0.13.0)

Fixed

  • Kubernetes version upgrades from 1.23 to 1.24 for Docker clusters (#4266)
  • Added missing docker login when doing authenticated registry pulls

Breaking changes

  • Removed support for Kubernetes 1.20

v0.12.2

Added

  • Add support for Kubernetes 1.24 (CloudStack support to come in future releases) #3491

Fixed

  • Fix authenticated registry mirror validations
  • Fix capc bug causing orphaned VMs in slow environments
  • Bundle activation problem for package controller

v0.12.1

Changed

  • Setting minimum wait time for nodes and machinedeployments (#3868, fixes #3822)

Fixed

  • Fixed worker node count pointer dereference issue (#3852)
  • Fixed eks-anywhere-packages reference in go.mod (#3902)
  • Surface dropped error in CloudStack validations (#3832)

v0.12.0

⚠️ Breaking changes

  • Certificates signed with SHA-1 are no longer supported for Registry Mirror. Users with a registry mirror who provide a custom CA cert will need to rotate the certificate served by the registry mirror endpoint before using the new EKS-A version. This is true for both new clusters (create cluster command) and existing clusters (upgrade cluster command).
  • The --source option was removed from several package commands. Use either --kube-version for registry or --cluster for cluster.

Added

  • Add support for EKS Anywhere with provider CloudStack
  • Add support to upgrade Bare Metal cluster
  • Add support for using Registry Mirror for Bare Metal
  • Red Hat-based node image support for vSphere, CloudStack and Bare Metal EKS Anywhere clusters
  • Allow authenticated image pull using Registry Mirror for Ubuntu on vSphere cluster
  • Add option to disable vSphere CSI driver #3148
  • Add support for skipping load balancer deployment for Bare Metal so users can use their own load balancers #3608
  • Add support to configure aws-iam-authenticator on workload clusters independent of management cluster #2814
  • Add EKS Anywhere Packages support for remote management on workload clusters. (For more EKS Anywhere packages info: v0.12.0)
  • Add new EKS Anywhere Packages
    • AWS Distro for OpenTelemetry (ADOT)
    • Cert Manager
    • Cluster Autoscaler
    • Metrics Server

Fixed

  • Remove special cilium network policy with policyEnforcementMode set to always due to lack of pod network connectivity for vSphere CSI
  • Fixed #3391 #3560 for AWSIamConfig upgrades on EKS Anywhere workload clusters

v0.11.4

Added

  • Add validate session permission for vSphere

Fixed

  • Fix datacenter naming bug for vSphere #3381
  • Fix OS family validation for vSphere
  • Fix controller overwriting secret for vSphere #3404
  • Fix unintended rolling upgrades when upgrading from an older EKS-A version for CloudStack

v0.11.3

Added

  • Add some bundleRef validation
  • Enable kube-rbac-proxy on CloudStack cluster controller’s metrics port

Fixed

  • Fix issue with fetching EKS-D CRDs/manifests with retries
  • Update BundlesRef when building a Spec from file
  • Fix worker node upgrade inconsistency in CloudStack

v0.11.2

Added

  • Add a preflight check to validate vSphere user’s permissions #2744

Changed

  • Make DiskOffering in CloudStackMachineConfig optional

Fixed

  • Fix upgrade failure when flux is enabled #3091 #3093
  • Add token-refresher to default images to fix import/download images commands
  • Improve retry logic for transient issues with kubectl applies and helm pulls #3167
  • Fix issue fetching curated packages images

v0.11.1

Added

  • Add --insecure flag to import/download images commands #2878

v0.11.0

Breaking Changes

  • EKS Anywhere no longer distributes Ubuntu OVAs for use with EKS Anywhere clusters. Building your own Ubuntu-based nodes as described in Building node images is the only supported way to get that functionality.

Added

  • Add support for Kubernetes 1.23 #2159
  • Add support for Support Bundle for validating control plane IP with vSphere provider
  • Add support for aws-iam-authenticator on Bare Metal
  • Curated Packages General Availability
  • Added Emissary Ingress Curated Package

Changed

  • Install and enable GitOps in the existing cluster with upgrade command

v0.10.1

Changed

  • Updated EKS Distro versions to latest release

Fixed

  • Fixed control plane nodes not upgraded for same kube version #2636

v0.10.0

Added

  • Added support for EKS Anywhere on bare metal with provider tinkerbell. EKS Anywhere on bare metal supports the complete provisioning cycle, including power on/off and PXE boot for standing up a cluster with the given hardware data.
  • Support for node CIDR mask config exposed via the cluster spec. #488

Changed

  • Upgraded cilium from 1.9 to 1.10. #1124
  • Changes for EKS Anywhere packages v0.10.0

Fixed

  • Fix issue using self-signed certificates for registry mirror #1857

v0.9.2

Fixed

  • Fix issue by avoiding processing Snow images when URI is empty

v0.9.1

v0.9.0

Added

  • Adding support to EKS Anywhere for a generic git provider as the source of truth for GitOps configuration management. #9
  • Allow users to configure Cloud Provider and CSI Driver with different credentials. #1730
  • Support to install, configure and maintain operational components that are secure and tested by Amazon on EKS Anywhere clusters. #2083
  • A new Workshop section has been added to EKS Anywhere documentation.
  • Added support for curated packages behind a feature flag #1893

Fixed

  • Fix issue specifying proxy configuration for helm template command #2009

v0.8.2

Fixed

  • Fix issue with upgrading cluster from a previous minor version #1819

v0.8.1

Fixed

  • Fix issue with downloading artifacts #1753

v0.8.0

Added

  • SSH keys and Users are now mutable #1208
  • OIDC configuration is now mutable #676
  • Add support for Cilium’s policy enforcement mode #726

Changed

  • Install Cilium networking through Helm instead of static manifest

v0.7.2 - 2022-02-28

Fixed

  • Fix issue with downloading artifacts #1327

v0.7.1 - 2022-02-25

Added

  • Support for taints in worker node group configurations #189
  • Support for taints in control plane configurations #189
  • Support for labels in worker node group configuration #486
  • Allow removal of worker node groups using the eksctl anywhere upgrade command #1054

v0.7.0 - 2022-01-27

Added

  • Support for aws-iam-authenticator as an authentication option in EKS-A clusters #90
  • Support for multiple worker node groups in EKS-A clusters #840
  • Support for IAM Role for Service Account (IRSA) #601
  • New command upgrade plan cluster lists core component changes affected by upgrade cluster #499
  • Support for workload cluster’s control plane and etcd upgrade through GitOps #1007
  • Upgrading a Flux managed cluster previously required manual steps. These steps have now been automated. #759, #1019
  • Cilium CNI will now be upgraded by the upgrade cluster command #326

Changed

  • EKS-A now uses Cluster API (CAPI) v1.0.1 and v1beta1 manifests, upgrading from v0.3.23 and v1alpha3 manifests.
  • Kubernetes components and etcd now use TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 as the configured TLS cipher suite #657, #759
  • Automated git repository structure changes during Flux component upgrade workflow #577

v0.6.0 - 2021-10-29

Added

  • Support to create and manage workload clusters #94
  • Support for upgrading eks-anywhere components #93, Cluster upgrades
    • IMPORTANT: Currently, upgrading existing Flux-managed clusters requires performing a few additional steps. The fix for upgrading existing clusters will be published in the 0.6.1 release to improve the upgrade experience.
  • k8s CIS compliance #193
  • Support bundle improvements #92
  • Ability to upgrade control plane nodes before worker nodes #100
  • Ability to use your own container registry #98
  • Make namespace configurable for anywhere resources #177

Fixed

  • Fix ova auto-import issue for multi-datacenter environments #437
  • OVA import via EKS-A CLI sometimes fails #254
  • Add proxy configuration to etcd nodes for bottlerocket #195

Removed

  • overrideClusterSpecFile field in cluster config

v0.5.0

Added

  • Initial release of EKS-A

5.9 - Frequently Asked Questions

Frequently asked questions about EKS Anywhere

AuthN / AuthZ

How do my applications running on EKS Anywhere authenticate with AWS services using IAM credentials?

You can use the IAM Role for Service Account (IRSA) feature. See the IRSA reference guide for details.

Does EKS Anywhere support OIDC (including Azure AD and AD FS)?

Yes, EKS Anywhere can create clusters that support API server OIDC authentication. This means you can federate authentication through AD FS locally or through Azure AD, along with other IDPs that support the OIDC standard. In order to add OIDC support to your EKS Anywhere clusters, you need to configure your cluster by updating the configuration file before creating the cluster. Please see the OIDC reference for details.

Does EKS Anywhere support LDAP?

EKS Anywhere does not support LDAP out of the box. However, you can look into the Dex LDAP Connector.

Can I use AWS IAM for Kubernetes resource access control on EKS Anywhere?

Yes, you can install the aws-iam-authenticator on your EKS Anywhere cluster to achieve this.

Miscellaneous

How much does EKS Anywhere cost?

EKS Anywhere is free, open source software that you can download, install on your existing hardware, and run in your own data centers. It includes management and CLI tooling for all supported cluster topologies on all supported providers. You are responsible for providing infrastructure where EKS Anywhere runs (e.g. VMware, bare metal), and some providers require third-party hardware and software contracts.

The EKS Anywhere Enterprise Subscription provides access to curated packages and enterprise support. This is an optional—but recommended—cost based on how many clusters and how many years of support you need.

Can I connect my EKS Anywhere cluster to EKS?

Yes, you can install EKS Connector to connect your EKS Anywhere cluster to AWS EKS. EKS Connector is a software agent that you can install on the EKS Anywhere cluster that enables the cluster to communicate back to AWS. Once connected, you can immediately see a read-only view of the EKS Anywhere cluster with workload and cluster configuration information on the EKS console, alongside your EKS clusters.

How does the EKS Connector authenticate with AWS?

During start-up, the EKS Connector generates and stores an RSA key-pair as Kubernetes secrets. It also registers with AWS using the public key and the activation details from the cluster registration configuration file. The EKS Connector needs AWS credentials to receive commands from AWS and to send the response back. Whenever it requires AWS credentials, it uses its private key to sign the request and invokes AWS APIs to request the credentials.

How does the EKS Connector authenticate with my Kubernetes cluster?

The EKS Connector acts as a proxy and forwards the EKS console requests to the Kubernetes API server on your cluster. In the initial release, the connector uses impersonation with its service account secrets to interact with the API server. Therefore, you need to associate the connector’s service account with a ClusterRole, which gives permission to impersonate AWS IAM entities.

How do I enable an AWS user account to view my connected cluster through the EKS console?

For each AWS user or other IAM identity, you should add a cluster role binding to the Kubernetes cluster with the appropriate permission for that IAM identity. Additionally, each of these IAM entities should be associated with the IAM policy to invoke the EKS Connector on the cluster.

Can I use Amazon Controllers for Kubernetes (ACK) on EKS Anywhere?

Yes, you can leverage AWS services from your EKS Anywhere clusters on-premises through Amazon Controllers for Kubernetes (ACK).

Can I deploy EKS Anywhere on other clouds?

EKS Anywhere can be installed on any infrastructure with the required Bare Metal, CloudStack, or VMware vSphere components. See the EKS Anywhere Bare Metal, CloudStack, or vSphere documentation.

How is EKS Anywhere different from ECS Anywhere?

Amazon ECS Anywhere is an option for Amazon Elastic Container Service (ECS) to run containers on your on-premises infrastructure. The ECS Anywhere Control Plane runs in an AWS region and allows you to install the ECS agent on worker nodes that run outside of an AWS region. Workloads that run on ECS Anywhere nodes are scheduled by ECS. You are not responsible for running, managing, or upgrading the ECS Control Plane.

EKS Anywhere runs the Kubernetes Control Plane and worker nodes on your infrastructure. You are responsible for managing the EKS Anywhere Control Plane and worker nodes. There is no requirement to have an AWS account to run EKS Anywhere.

If you’d like to see how EKS Anywhere compares to EKS, please see the information here.

How can I manage EKS Anywhere at scale?

You can perform cluster life cycle and configuration management at scale through GitOps-based tools. EKS Anywhere offers git-driven cluster management through the integrated Flux Controller. See Manage cluster with GitOps documentation for details.

Can I run EKS Anywhere on ESXi?

No. EKS Anywhere is only supported on providers listed on the Create production cluster page. There would need to be a change to the upstream project to support ESXi.

Can I deploy EKS Anywhere on a single node?

Yes. Single node cluster deployment is supported for Bare Metal. See workerNodeGroupConfigurations.

5.10 - Troubleshooting

Troubleshooting reference for your EKS Anywhere Cluster

Read more about troubleshooting in the tasks section.

5.11 - Support

Support for EKS Anywhere

5.11.1 - Support scope

Support scope for EKS Anywhere

Enterprise support for Amazon EKS Anywhere is available to Amazon customers who pay for the Amazon EKS Anywhere Enterprise subscription. If you would like to purchase the Amazon EKS Anywhere Enterprise Subscription, contact an AWS specialist.

EKS Anywhere is an open source project and it is supported by the community. If you have a problem, open an issue and someone will get back to you as soon as possible. If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our vulnerability reporting page. Please do not create a public GitHub issue for security problems.

Operating system support

EKS Anywhere has some level of support for the following operating system nodes:

  • Bottlerocket: Bottlerocket is the only fully-supported operating system for EKS Anywhere nodes. Bottlerocket OVAs and Raw images are distributed by the EKS Anywhere project. See the Artifacts page for details on how to download Bottlerocket images for EKS Anywhere.

  • Ubuntu: EKS Anywhere has been tested with Ubuntu-based nodes. Amazon will assist with troubleshooting and configuration guidance with Ubuntu-based nodes under the EKS Anywhere Enterprise Subscription. To build your own Ubuntu-based EKS Anywhere node image, refer to Building node images. For official Ubuntu support, see the Canonical Support page.

  • Red Hat Enterprise Linux (RHEL): EKS Anywhere has been tested with RHEL-based nodes. As with Ubuntu, Amazon will assist with troubleshooting and configuration guidance with RHEL-based nodes under the EKS Anywhere Enterprise Subscription. To build your own RHEL-based EKS Anywhere node image, refer to Building node images. For official Red Hat support, see the Red Hat Enterprise Linux Subscriptions page.

Curated packages support

Amazon EKS Anywhere Curated Packages are Amazon-curated software packages that extend the core functionalities of Kubernetes on your EKS Anywhere clusters. All curated packages, including the curated OSS packages, are supported under the EKS Anywhere Enterprise Subscription.

5.11.2 - Version support

EKS Anywhere and Kubernetes version support policy

To see supported versions of Kubernetes for each release of EKS Anywhere, and information about that support, refer to the content below.

Kubernetes support

Each EKS Anywhere version generally includes support for multiple Kubernetes versions, with the exception of the initial few releases. Starting from EKS Anywhere version 0.11, the latest version supports at least four recent versions of Kubernetes. The end of support date of a Kubernetes version aligns with Amazon EKS in AWS as documented on the Amazon EKS Kubernetes release calendar.

Common vulnerabilities and exposures (CVE) patches and bug fixes, including those for the supported Kubernetes versions, are back-ported to the latest EKS Anywhere version (version n). The following table shows EKS Anywhere version support for different Kubernetes versions:

Kubernetes version | Supported EKS Anywhere version | First supported | End of support
1.25 | 0.14 | February 16, 2023 | April 2024
1.24 | 0.14, 0.13, 0.12 | November 17, 2022 | January 2024
1.23 | 0.14, 0.13, 0.12, 0.11 | August 18, 2022 | October 2023
1.22 | 0.14, 0.13, 0.12, 0.11, 0.10, 0.9, 0.8 | March 31, 2022 | May 2023
1.21 | 0.14, 0.13, 0.12, 0.11, 0.10, 0.9, 0.8, 0.7, 0.6, 0.5 | September 8, 2021 | March 30, 2023
1.20 | 0.12, 0.11, 0.10, 0.9, 0.8, 0.7, 0.6, 0.5 | September 8, 2021 | November 1, 2022
1.19 | Not supported
1.18 | Not supported
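
For scripted checks, the end-of-support column above can be encoded as a small lookup. The following is a minimal sketch, not part of EKS Anywhere: the function name k8s_end_of_support is hypothetical, and the dates are copied by hand from the table, so they would need to be updated whenever the table changes.

```shell
# Hypothetical helper: print the end-of-support date that the table above
# lists for a given Kubernetes minor version.
k8s_end_of_support() {
  case "$1" in
    1.25) echo "April 2024" ;;
    1.24) echo "January 2024" ;;
    1.23) echo "October 2023" ;;
    1.22) echo "May 2023" ;;
    1.21) echo "March 30, 2023" ;;
    1.20) echo "November 1, 2022" ;;
    1.19|1.18) echo "Not supported" ;;
    # Versions that do not appear in the table are reported as an error.
    *) echo "Unknown version: $1" >&2; return 1 ;;
  esac
}

k8s_end_of_support 1.23
```

A hard-coded case statement keeps the sketch free of external dependencies. It does not replace checking this page regularly.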

The following table notes which EKS Anywhere and related Kubernetes versions are currently supported for CVE patches and bug fixes:

EKS Anywhere version | Kubernetes versions included | EKS Anywhere Release Date | CVE patches and bug fixes back-ported?
0.14 | 1.25, 1.24, 1.23, 1.22, 1.21 | January 19, 2023 | Yes
0.13 | 1.24, 1.23, 1.22, 1.21 | December 15, 2022 | Yes
0.12 | 1.24, 1.23, 1.22, 1.21, 1.20 | October 20, 2022 | No
0.11 | 1.23, 1.22, 1.21, 1.20 | August 18, 2022 | No
0.10 | 1.22, 1.21, 1.20 | June 30, 2022 | No
0.9 | 1.21, 1.20 | May 12, 2022 | No
0.8 | 1.22, 1.21, 1.20 | March 31, 2022 | No
0.7 | 1.21, 1.20 | January 27, 2022 | No
0.6 | 1.21, 1.20 | October 29, 2021 | No
0.5 | 1.21, 1.20 | September 8, 2021 | No

EKS Anywhere version support FAQs

What is the difference between an Amazon EKS Anywhere minor version versus a patch version?

An Amazon EKS Anywhere minor version includes new Amazon EKS Anywhere capabilities, bug fixes, security patches, and a new Kubernetes minor version if there is one. An Amazon EKS Anywhere patch version generally includes only bug fixes, security patches, and a Kubernetes patch version. Amazon EKS Anywhere patch versions are released more frequently than the Amazon EKS Anywhere minor versions so you can receive the latest security and bug fixes sooner.

Where can I find the content of the Amazon EKS Anywhere versions?

You can find the content of the previous Amazon EKS Anywhere minor versions and patch versions on the What’s New page.

Will I get notified when there is a new Amazon EKS Anywhere version release?

You will get notified if you have subscribed as documented on the Release Alerts page.

Can I skip Amazon EKS Anywhere minor versions during cluster upgrade (such as going from v0.9 directly to v0.11)?

No. We perform regular upgrade reliability testing for sequential version upgrades (e.g. going from version 0.9 to 0.10, then from version 0.10 to 0.11), but we do not test non-sequential upgrade paths (e.g. going from version 0.9 directly to 0.11). You should not skip minor versions during a cluster upgrade. However, you can choose to skip patch versions.
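
To plan a sequential upgrade, it can help to list every minor version you need to pass through. The following is a minimal sketch, not an EKS Anywhere command: the function name eksa_upgrade_path is hypothetical, and it assumes version strings of the form 0.9, v0.11, or v0.14.5 (major version 0, as in the releases listed on this page).

```shell
# Hypothetical helper: list the minor versions to upgrade through, in order,
# when moving from one EKS Anywhere version to another.
# Assumes versions look like "0.9", "v0.11", or "v0.14.5" (major version 0).
eksa_upgrade_path() {
  # Strip an optional leading "v" and the "0." major prefix.
  from=${1#v}; to=${2#v}
  from=${from#0.}; to=${to#0.}
  # Drop any patch component, keeping only the minor number.
  from_minor=${from%%.*}
  to_minor=${to%%.*}
  next=$((from_minor + 1))
  while [ "$next" -le "$to_minor" ]; do
    echo "0.$next"
    next=$((next + 1))
  done
}

# Prints 0.10 and then 0.11, one per line.
eksa_upgrade_path v0.9 v0.11
```

When both versions share the same minor number (for example v0.14.3 and v0.14.5), the function prints nothing, which matches the guidance that patch versions can be skipped.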

What kind of fixes are back-ported to the previous versions?

Back-ported fixes include CVE patches and bug fixes for the Amazon EKS Anywhere components and the Kubernetes versions that are supported by the corresponding Amazon EKS Anywhere versions.

What happens on the end of support date for a Kubernetes version?

On the end of support date, you can still create a new cluster with the unsupported Kubernetes version using an old version of the Amazon EKS Anywhere toolkit that was released with this Kubernetes version. Any existing Amazon EKS Anywhere clusters with the unsupported Kubernetes version will continue to function. If you have the Amazon EKS Anywhere Enterprise subscription, AWS Support will continue to provide troubleshooting support and configuration guidance to those clusters as long as their Amazon EKS Anywhere versions are still being supported. However, you will not be able to receive CVE patches or bug fixes for the unsupported Kubernetes version.

Will I get notified when support is ending for a Kubernetes version on Amazon EKS Anywhere?

Not automatically. You should check this page regularly and take note of the end of support date for the Kubernetes version you’re using.

5.12 - Artifacts

Artifacts associated with this release: OVAs and images.

EKS Anywhere supports three different node operating systems:

  • Bottlerocket: For vSphere and Bare Metal providers
  • Ubuntu: For vSphere, Bare Metal, Nutanix, and Snow providers
  • Red Hat Enterprise Linux (RHEL): For vSphere, CloudStack, and Bare Metal providers

Bottlerocket OVAs and images are distributed by the EKS Anywhere project. To build your own Ubuntu-based or RHEL-based EKS Anywhere node, see Building node images.

Bare Metal artifacts

Artifacts for EKS Anywhere Bare Metal clusters are listed below. If you like, you can download these images and serve them locally to speed up cluster creation. See descriptions of the osImageURL and hookImagesURLPath fields for details.

Ubuntu or RHEL OS images for Bare Metal

EKS Anywhere does not distribute Ubuntu or RHEL OS images. However, see Building node images for information on how to build EKS Anywhere images from those Linux distributions.

Bottlerocket OS images for Bare Metal

Bottlerocket vends its Bare Metal variant images using a secure distribution tool called tuftool. Please refer to Download Bottlerocket node images to download a Bottlerocket image. You can also get the download URIs for Bottlerocket Bare Metal images from the bundle release by running the following command:

LATEST_EKSA_RELEASE_VERSION=$(curl -sL https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.latestVersion")
BUNDLE_MANIFEST_URL=$(curl -sL https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.releases[] | select(.version==\"$LATEST_EKSA_RELEASE_VERSION\").bundleManifestUrl")
curl -s $BUNDLE_MANIFEST_URL | yq ".spec.versionsBundles[].eksD.raw.bottlerocket.uri"

HookOS (kernel and initial ramdisk) for Bare Metal

kernel:

LATEST_EKSA_RELEASE_VERSION=$(curl -sL https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.latestVersion")
BUNDLE_MANIFEST_URL=$(curl -sL https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.releases[] | select(.version==\"$LATEST_EKSA_RELEASE_VERSION\").bundleManifestUrl")
curl -s $BUNDLE_MANIFEST_URL | yq ".spec.versionsBundles[0].tinkerbell.tinkerbellStack.hook.vmlinuz.amd.uri"

initial ramdisk:

LATEST_EKSA_RELEASE_VERSION=$(curl -sL https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.latestVersion")
BUNDLE_MANIFEST_URL=$(curl -sL https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.releases[] | select(.version==\"$LATEST_EKSA_RELEASE_VERSION\").bundleManifestUrl")
curl -s $BUNDLE_MANIFEST_URL | yq ".spec.versionsBundles[0].tinkerbell.tinkerbellStack.hook.initramfs.amd.uri"

vSphere artifacts

Bottlerocket OVAs

Bottlerocket vends its VMware variant OVAs using a secure distribution tool called tuftool. Refer to Download Bottlerocket node images to download a Bottlerocket OVA. You can also get the download URIs for Bottlerocket OVAs from the bundle release by running the following commands:

LATEST_EKSA_RELEASE_VERSION=$(curl -sL https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.latestVersion")
BUNDLE_MANIFEST_URL=$(curl -sL https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.releases[] | select(.version==\"$LATEST_EKSA_RELEASE_VERSION\").bundleManifestUrl")
curl -s $BUNDLE_MANIFEST_URL | yq ".spec.versionsBundles[].eksD.ova.bottlerocket.uri"

Bottlerocket Template Tags

OS Family - os:bottlerocket

EKS Distro Release

1.25 - eksdRelease:kubernetes-1-25-eks-7

1.24 - eksdRelease:kubernetes-1-24-eks-11

1.23 - eksdRelease:kubernetes-1-23-eks-16

1.22 - eksdRelease:kubernetes-1-22-eks-21

1.21 - eksdRelease:kubernetes-1-21-eks-26

Ubuntu OVAs

EKS Anywhere no longer distributes Ubuntu OVAs for use with EKS Anywhere clusters. Building your own Ubuntu-based nodes as described in Building node images is the only supported way to get that functionality.

Download Bottlerocket node images

Bottlerocket vends its VMware variant OVAs and Baremetal variant images using a secure distribution tool called tuftool. Follow the instructions below to download Bottlerocket node images.

  1. Install Rust and Cargo
curl https://sh.rustup.rs -sSf | sh
  2. Install tuftool using Cargo
CARGO_NET_GIT_FETCH_WITH_CLI=true cargo install --force tuftool
  3. Download the root role that will be used by tuftool to download the Bottlerocket images
curl -O "https://cache.bottlerocket.aws/root.json"
sha512sum -c <<<"b81af4d8eb86743539fbc4709d33ada7b118d9f929f0c2f6c04e1d41f46241ed80423666d169079d736ab79965b4dd25a5a6db5f01578b397496d49ce11a3aa2  root.json"
  4. Export the desired Kubernetes version. EKS Anywhere currently supports 1.21, 1.22, 1.23, 1.24 and 1.25.
export KUBEVERSION="1.25"
  5. Download Bottlerocket node image

    a. To download VMware variant Bottlerocket OVA

    OVA="bottlerocket-vmware-k8s-${KUBEVERSION}-x86_64-v1.12.0.ova"
    tuftool download ${TMPDIR:-/tmp/bottlerocket-ovas} --target-name "${OVA}" \
       --root ./root.json \
       --metadata-url "https://updates.bottlerocket.aws/2020-07-07/vmware-k8s-${KUBEVERSION}/x86_64/" \
       --targets-url "https://updates.bottlerocket.aws/targets/"
    

    The above command will download a Bottlerocket OVA. Refer to Deploy an OVA Template to proceed with the downloaded OVA.

    b. To download Baremetal variant Bottlerocket image

    IMAGE="bottlerocket-metal-k8s-${KUBEVERSION}-x86_64-v1.12.0.img.lz4"
    tuftool download ${TMPDIR:-/tmp/bottlerocket-metal} --target-name "${IMAGE}" \
       --root ./root.json \
       --metadata-url "https://updates.bottlerocket.aws/2020-07-07/metal-k8s-${KUBEVERSION}/x86_64/" \
       --targets-url "https://updates.bottlerocket.aws/targets/"
    

    The above command will download an lz4-compressed Bottlerocket image. Decompress the image and gzip it with the following commands, then host it on a web server so that an EKS Anywhere Baremetal cluster can use it.

    lz4 --decompress ${TMPDIR:-/tmp/bottlerocket-metal}/${IMAGE} ${TMPDIR:-/tmp/bottlerocket-metal}/bottlerocket.img
    gzip ${TMPDIR:-/tmp/bottlerocket-metal}/bottlerocket.img
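
Before hosting the recompressed image, it can help to confirm that the gzip file is intact and to record a checksum for it. The following is a minimal sketch, assuming a Linux machine with gzip and sha256sum available; verify_image_archive is a hypothetical helper name, not an EKS Anywhere command.

```shell
# verify_image_archive (hypothetical helper): tests the gzip stream for
# corruption, then prints a SHA-256 digest that can be kept next to the
# hosted file. Returns non-zero if the archive fails the integrity test.
verify_image_archive() (
  archive="$1"
  gzip -t "$archive" || { echo "corrupt archive: $archive" >&2; exit 1; }
  sha256sum "$archive"
)

# Usage, with the file produced by the gzip step above:
#   verify_image_archive "${TMPDIR:-/tmp/bottlerocket-metal}/bottlerocket.img.gz"
```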
    

Building node images

The image-builder CLI lets you build your own Ubuntu-based vSphere OVAs, Nutanix qcow2 images, RHEL-based qcow2 images, or Bare Metal gzip images to use in EKS Anywhere clusters. When you run image-builder, it will pull in all components needed to build images to be used as Kubernetes nodes in an EKS Anywhere cluster, including the latest operating system, Kubernetes control plane components, and EKS Distro security updates, bug fixes, and patches. When building an image using this tool, you get to choose:

  • Operating system type (for example, ubuntu, redhat)
  • Provider (vsphere, cloudstack, baremetal, ami, nutanix)
  • Release channel for EKS Distro (generally aligning with Kubernetes releases)
  • vSphere only: configuration file providing information needed to access your vSphere setup
  • CloudStack only: configuration file providing information needed to access your CloudStack setup
  • Snow AMI only: configuration file providing information needed to customize your Snow AMI build parameters
  • Nutanix only: configuration file providing information needed to access Nutanix Prism Central

Because image-builder creates images in the same way that the EKS Anywhere project does for its own testing, images built with that tool are supported.

The table below shows the support matrix for the hypervisor and OS combinations that image-builder supports.

OS vSphere Baremetal CloudStack Nutanix Snow
Ubuntu Yes Yes No Yes Yes
RHEL Yes Yes Yes No No

Prerequisites

To use image-builder, you must meet the following prerequisites:

System requirements

image-builder has been tested on Ubuntu, RHEL and Amazon Linux 2 machines. The following system requirements should be met for the machine on which image-builder is run:

  • AMD 64-bit architecture
  • 50 GB disk space
  • 2 vCPUs
  • 8 GB RAM
  • Baremetal only: Run on a bare metal machine with virtualization enabled
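
For the Baremetal-only requirement, a first-pass check is to look for the hardware virtualization CPU flags on the build machine. This is a minimal sketch for Linux; has_hw_virtualization is a hypothetical helper name. It only confirms that the CPU advertises Intel VT-x (vmx) or AMD-V (svm), so treat a positive result as necessary rather than sufficient.

```shell
# has_hw_virtualization (hypothetical helper): succeeds if the flags line of
# cpuinfo lists vmx (Intel VT-x) or svm (AMD-V). The optional argument lets
# you point the check at a saved copy of cpuinfo instead of /proc/cpuinfo.
has_hw_virtualization() {
  grep -Eq '^flags[[:space:]]*:.*[[:space:]](vmx|svm)([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

# Usage:
#   has_hw_virtualization && echo "virtualization flags present"
#   [ -e /dev/kvm ] && echo "/dev/kvm is available"
```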

Network connectivity requirements

  • public.ecr.aws (to download container images from EKS Anywhere)
  • anywhere-assets.eks.amazonaws.com (to download the EKS Anywhere artifacts such as binaries, manifests and OS images)
  • distro.eks.amazonaws.com (to download EKS Distro binaries and manifests)
  • d2glxqk2uabbnd.cloudfront.net (to pull the EKS Anywhere and EKS Distro ECR container images)
  • api.ecr.us-west-2.amazonaws.com (for EKS Anywhere package authentication matching your region)
  • d5l0dvt14r5h8.cloudfront.net (for EKS Anywhere package ECR container)
  • github.com (to download binaries and tools required for image builds from GitHub releases)
  • releases.hashicorp.com (to download Packer binary for image builds)
  • galaxy.ansible.com (to download Ansible packages from Ansible Galaxy)
  • vSphere only: VMware vCenter endpoint
  • CloudStack only: Apache CloudStack endpoint
  • Nutanix only: Nutanix Prism Central endpoint
  • Red Hat only: dl.fedoraproject.org (to download RPMs and GPG keys for RHEL image builds)
  • Ubuntu only: cdimage.ubuntu.com (to download Ubuntu server ISOs for Ubuntu image builds)

vSphere requirements

image-builder uses the HashiCorp vsphere-iso Packer Builder for building vSphere OVAs.

Permissions

Configure a user with a role containing the following permissions.

The role can be configured programmatically with the govc command below, or configured in the vSphere UI using the table below as reference.

Note that no matter how the role is created, it must be assigned to the user or user group at the Global Permissions level.

Unfortunately there is no API for managing vSphere Global Permissions, so they must be set on the user via the UI under Administration > Access Control > Global Permissions.

To generate a role named EKSAImageBuilder with the required privileges via govc, run the following:

govc role.create "EKSAImageBuilder" $(curl -s https://raw.githubusercontent.com/aws/eks-anywhere/main/pkg/config/static/imageBuilderPrivs.json | jq -r '.[]' | tr '\n' ' ')

If creating a role with these privileges via the UI, refer to the table below.

Category UI Privilege Programmatic Privilege
Content Library Add library item ContentLibrary.AddLibraryItem
Content Library Delete library item ContentLibrary.DeleteLibraryItem
Content Library Download files ContentLibrary.DownloadSession
Content Library Evict library item ContentLibrary.EvictLibraryItem
Content Library Update library item ContentLibrary.UpdateLibraryItem
Datastore Allocate space Datastore.AllocateSpace
Datastore Browse datastore Datastore.Browse
Datastore Low level file operations Datastore.FileManagement
Network Assign network Network.Assign
Resource Assign virtual machine to resource pool Resource.AssignVMToPool
vApp Export vApp.Export
VirtualMachine Configuration > Add new disk VirtualMachine.Config.AddNewDisk
VirtualMachine Configuration > Add or remove device VirtualMachine.Config.AddRemoveDevice
VirtualMachine Configuration > Advanced configuration VirtualMachine.Config.AdvancedConfiguration
VirtualMachine Configuration > Change CPU count VirtualMachine.Config.CPUCount
VirtualMachine Configuration > Change memory VirtualMachine.Config.Memory
VirtualMachine Configuration > Change settings VirtualMachine.Config.Settings
VirtualMachine Configuration > Change Resource VirtualMachine.Config.Resource
VirtualMachine Configuration > Set annotation VirtualMachine.Config.Annotation
VirtualMachine Edit Inventory > Create from existing VirtualMachine.Inventory.CreateFromExisting
VirtualMachine Edit Inventory > Create new VirtualMachine.Inventory.Create
VirtualMachine Edit Inventory > Remove VirtualMachine.Inventory.Delete
VirtualMachine Interaction > Configure CD media VirtualMachine.Interact.SetCDMedia
VirtualMachine Interaction > Configure floppy media VirtualMachine.Interact.SetFloppyMedia
VirtualMachine Interaction > Connect devices VirtualMachine.Interact.DeviceConnection
VirtualMachine Interaction > Inject USB HID scan codes VirtualMachine.Interact.PutUsbScanCodes
VirtualMachine Interaction > Power off VirtualMachine.Interact.PowerOff
VirtualMachine Interaction > Power on VirtualMachine.Interact.PowerOn
VirtualMachine Provisioning > Create template from virtual machine VirtualMachine.Provisioning.CreateTemplateFromVM
VirtualMachine Provisioning > Mark as template VirtualMachine.Provisioning.MarkAsTemplate
VirtualMachine Provisioning > Mark as virtual machine VirtualMachine.Provisioning.MarkAsVM
VirtualMachine State > Create snapshot VirtualMachine.State.CreateSnapshot
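
If the role was assembled by hand in the UI, you may want to confirm that no privilege from the table was missed. The sketch below compares two plain-text lists of privilege IDs, one per line. The helper name missing_privileges and the file names are hypothetical; the usage comments assume that govc role.ls prints one privilege ID per line for the named role.

```shell
# missing_privileges (hypothetical helper): given a file of required
# privilege IDs and a file of privilege IDs held by a role, prints the
# required IDs that the role lacks. Extra privileges on the role are ignored.
missing_privileges() (
  required="$1"
  actual="$2"
  req_sorted=$(mktemp) && act_sorted=$(mktemp) || exit 1
  sort -u "$required" > "$req_sorted"
  sort -u "$actual" > "$act_sorted"
  # comm -23 keeps lines that appear only in the first (required) file.
  comm -23 "$req_sorted" "$act_sorted"
  rm -f "$req_sorted" "$act_sorted"
)

# Usage (requires network access and a configured govc):
#   curl -s https://raw.githubusercontent.com/aws/eks-anywhere/main/pkg/config/static/imageBuilderPrivs.json | jq -r '.[]' > required.txt
#   govc role.ls EKSAImageBuilder > actual.txt
#   missing_privileges required.txt actual.txt
```

An empty result means every required privilege is present on the role.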

CloudStack requirements

Refer to the CloudStack Permissions for CAPC doc for required CloudStack user permissions.

Snow AMI requirements

Packer will require prior authentication with your AWS account to launch EC2 instances for the Snow AMI build. Refer to the Authentication guide for Amazon EBS Packer builder for possible modes of authentication. We recommend that you run image-builder on a pre-existing Ubuntu EC2 instance and use an IAM instance role with the required permissions.

Nutanix permissions

Prism Central Administrator permissions are required to build a Nutanix image using image-builder.

Optional Proxy configuration

You can use a proxy server to route outbound requests to the internet. To configure the image-builder tool to use a proxy server, export these proxy environment variables:

export HTTP_PROXY=<HTTP proxy URL e.g. http://proxy.corp.com:80>
export HTTPS_PROXY=<HTTPS proxy URL e.g. http://proxy.corp.com:443>
export NO_PROXY=<comma-separated list of hosts that bypass the proxy>

Build vSphere OVA node images

These steps use image-builder to create an Ubuntu-based or RHEL-based image for vSphere. Before proceeding, ensure that the above system-level, network-level and vSphere-specific prerequisites have been met.

  1. Create a Linux user for running image-builder.

    sudo adduser image-builder
    

    Follow the prompt to provide a password for the image-builder user.

  2. Add the image-builder user to the sudo group, then switch to that user, providing the password from the previous step when prompted.

    sudo usermod -aG sudo image-builder
    su image-builder
    cd /home/$USER
    
  3. Install packages and prepare environment:

    sudo apt update -y
    sudo apt install jq unzip make ansible python3-pip -y
    sudo snap install yq
    mkdir -p /home/$USER/.ssh
    echo "HostKeyAlgorithms +ssh-rsa" >> /home/$USER/.ssh/config
    echo "PubkeyAcceptedKeyTypes +ssh-rsa" >> /home/$USER/.ssh/config
    
  4. Get image-builder:

    cd /tmp
    LATEST_EKSA_RELEASE_VERSION=$(curl -s https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.latestVersion")
    BUNDLE_MANIFEST_URL=$(curl -s https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.releases[] | select(.version==\"$LATEST_EKSA_RELEASE_VERSION\").bundleManifestUrl")
    IMAGEBUILDER_TARBALL_URI=$(curl -s $BUNDLE_MANIFEST_URL | yq ".spec.versionsBundles[0].eksD.imagebuilder.uri")
    curl -s $IMAGEBUILDER_TARBALL_URI | tar xz ./image-builder
    sudo cp ./image-builder /usr/local/bin
    cd -
    
  5. Get the latest version of govc:

    curl -L -o - "https://github.com/vmware/govmomi/releases/latest/download/govc_$(uname -s)_$(uname -m).tar.gz" | sudo tar -C /usr/local/bin -xvzf - govc
    
  6. Create a content library on vSphere:

    govc library.create "<library name>"
    
  7. Create a vSphere configuration file (for example, vsphere-connection.json):

    {
      "cluster": "<vsphere cluster used for image building>",
      "convert_to_template": "false",
      "create_snapshot": "<creates a snapshot on base OVA after building if set to true>",
      "datacenter": "<vsphere datacenter used for image building>",
      "datastore": "<datastore used to store template/for image building>",
      "folder": "<folder on vsphere to create temporary VM>",
      "insecure_connection": "true",
      "linked_clone": "false",
      "network": "<vsphere network used for image building>",
      "password": "<vcenter password>",
      "resource_pool": "<resource pool used for image building VM>",
      "username": "<vcenter username>",
      "vcenter_server": "<vcenter fqdn>",
      "vsphere_library_name": "<vsphere content library name>"
    }
    

    For RHEL images, add the following fields:

    {
      "iso_url": "<https://endpoint to RHEL ISO endpoint or path to file>",
      "iso_checksum": "<for example: ea5f349d492fed819e5086d351de47261c470fc794f7124805d176d69ddf1fcd>",
      "iso_checksum_type": "<for example: sha256>",
      "rhel_username": "<rhsm username>",
      "rhel_password": "<rhsm password>"
    }
    
  8. Create an Ubuntu or Red Hat image:

    • To create an Ubuntu-based image, run image-builder with the following options:

      • --os: ubuntu
      • --hypervisor: For vSphere use vsphere
      • --release-channel: Supported EKS Distro releases include 1-21, 1-22, 1-23, 1-24 and 1-25.
      • --vsphere-config: vSphere configuration file (vsphere-connection.json in this example)
      image-builder build --os ubuntu --hypervisor vsphere --release-channel 1-25 --vsphere-config vsphere-connection.json
      
    • To create a RHEL-based image, run image-builder with the following options:

      • --os: redhat
      • --hypervisor: For vSphere use vsphere
      • --release-channel: Supported EKS Distro releases include 1-21, 1-22, 1-23, 1-24 and 1-25.
      • --vsphere-config: vSphere configuration file (vsphere-connection.json in this example)
      image-builder build --os redhat --hypervisor vsphere --release-channel 1-25 --vsphere-config vsphere-connection.json
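
The Bottlerocket download steps use a dotted Kubernetes version (for example, 1.25), while --release-channel expects the dashed form (for example, 1-25). If you script both, a small conversion avoids keeping two variables in sync. This is a minimal sketch; to_release_channel is a hypothetical helper name, not an image-builder feature.

```shell
# to_release_channel (hypothetical helper): converts a Kubernetes version
# such as 1.25 or v1.25.6 into the major-minor form used by
# --release-channel, dropping any leading "v" and any patch suffix.
to_release_channel() {
  printf '%s\n' "$1" | sed -E 's/^v?([0-9]+)\.([0-9]+).*$/\1-\2/'
}

# Usage:
#   export KUBEVERSION="1.25"
#   image-builder build --os ubuntu --hypervisor vsphere \
#     --release-channel "$(to_release_channel "$KUBEVERSION")" \
#     --vsphere-config vsphere-connection.json
```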
      

Build Bare Metal node images

These steps use image-builder to create an Ubuntu-based or RHEL-based image for Bare Metal. Before proceeding, ensure that the above system-level, network-level and baremetal-specific prerequisites have been met.

  1. Create a Linux user for running image-builder.

    sudo adduser image-builder
    

    Follow the prompt to provide a password for the image-builder user.

  2. Add the image-builder user to the sudo group, then switch to that user, providing the password from the previous step when prompted.

    sudo usermod -aG sudo image-builder
    su image-builder
    cd /home/$USER
    
  3. Install packages and prepare environment:

    sudo apt update -y
    sudo apt install jq make python3-pip qemu-kvm libvirt-daemon-system libvirt-clients virtinst cpu-checker libguestfs-tools libosinfo-bin unzip ansible -y
    sudo snap install yq
    sudo usermod -a -G kvm $USER
    sudo chmod 666 /dev/kvm
    sudo chown root:kvm /dev/kvm
    mkdir -p /home/$USER/.ssh
    echo "HostKeyAlgorithms +ssh-rsa" >> /home/$USER/.ssh/config
    echo "PubkeyAcceptedKeyTypes +ssh-rsa" >> /home/$USER/.ssh/config
    
  4. Get image-builder:

    cd /tmp
    LATEST_EKSA_RELEASE_VERSION=$(curl -s https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.latestVersion")
    BUNDLE_MANIFEST_URL=$(curl -s https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.releases[] | select(.version==\"$LATEST_EKSA_RELEASE_VERSION\").bundleManifestUrl")
    IMAGEBUILDER_TARBALL_URI=$(curl -s $BUNDLE_MANIFEST_URL | yq ".spec.versionsBundles[0].eksD.imagebuilder.uri")
    curl -s $IMAGEBUILDER_TARBALL_URI | tar xz ./image-builder
    sudo cp ./image-builder /usr/local/bin
    cd -
    
  5. Create an Ubuntu or Red Hat image.

    Ubuntu

    Run image-builder with the following options:

    • --os: ubuntu
    • --hypervisor: baremetal
    • --release-channel: A supported EKS Distro release formatted as "[major]-[minor]"; for example "1-25"
    image-builder build --os ubuntu --hypervisor baremetal --release-channel 1-25
    

    Red Hat Enterprise Linux (RHEL)

    RHEL images require a configuration file (for example, baremetal.json) that identifies the location of the RHEL 8 ISO image and provides Red Hat subscription information. The image-builder command temporarily consumes a Red Hat subscription, which is returned once the image is built.

    {
      "iso_url": "<https://endpoint to RHEL ISO endpoint or path to file>",
      "iso_checksum": "<for example: ea5f349d492fed819e5086d351de47261c470fc794f7124805d176d69ddf1fcd>",
      "iso_checksum_type": "<for example: sha256>",
      "rhel_username": "<rhsm username>",
      "rhel_password": "<rhsm password>",
      "extra_rpms": "<space-separated list of RPM packages; useful for adding required drivers or other packages>"
    }
    

    Run the image-builder with the following options:

    • --os: redhat
    • --hypervisor: baremetal
    • --release-channel: A supported EKS Distro release formatted as "[major]-[minor]"; for example "1-25"
    • --baremetal-config: Bare metal config file
    image-builder build --os redhat --hypervisor baremetal --release-channel 1-25 --baremetal-config baremetal.json
    
  6. To consume the image, serve it from an accessible web server, then set the osImageURL field in the Bare Metal cluster spec to the URL of the image. For example:

    osImageURL: "http://<artifact host address>/my-ubuntu-v1.23.9-eks-a-17-amd64.gz"
    

    See descriptions of osImageURL for further information.

Build CloudStack node images

These steps use image-builder to create a RHEL-based image for CloudStack. Before proceeding, ensure that the above system-level, network-level and CloudStack-specific prerequisites have been met.

  1. Create a Linux user for running image-builder.

    sudo adduser image-builder
    

    Follow the prompt to provide a password for the image-builder user.

  2. Add the image-builder user to the sudo group, then switch to that user, providing the password from the previous step when prompted.

    sudo usermod -aG sudo image-builder
    su image-builder
    cd /home/$USER
    
  3. Install packages and prepare environment:

    sudo apt update -y
    sudo apt install jq make python3-pip qemu-kvm libvirt-daemon-system libvirt-clients virtinst cpu-checker libguestfs-tools libosinfo-bin unzip ansible -y
    sudo snap install yq
    sudo usermod -a -G kvm $USER
    sudo chmod 666 /dev/kvm
    sudo chown root:kvm /dev/kvm
    mkdir -p /home/$USER/.ssh
    echo "HostKeyAlgorithms +ssh-rsa" >> /home/$USER/.ssh/config
    echo "PubkeyAcceptedKeyTypes +ssh-rsa" >> /home/$USER/.ssh/config
    
  4. Get image-builder:

    cd /tmp
    LATEST_EKSA_RELEASE_VERSION=$(curl -s https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.latestVersion")
    BUNDLE_MANIFEST_URL=$(curl -s https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.releases[] | select(.version==\"$LATEST_EKSA_RELEASE_VERSION\").bundleManifestUrl")
    IMAGEBUILDER_TARBALL_URI=$(curl -s $BUNDLE_MANIFEST_URL | yq ".spec.versionsBundles[0].eksD.imagebuilder.uri")
    curl -s $IMAGEBUILDER_TARBALL_URI | tar xz ./image-builder
    sudo cp ./image-builder /usr/local/bin
    cd -
    
  5. Create a CloudStack configuration file (for example, cloudstack.json) to identify the location of a Red Hat Enterprise Linux 8 ISO image and related checksum and Red Hat subscription information:

    {
      "iso_url": "<https://endpoint to RHEL ISO endpoint or path to file>",
      "iso_checksum": "<for example: ea5f349d492fed819e5086d351de47261c470fc794f7124805d176d69ddf1fcd>",
      "iso_checksum_type": "<for example: sha256>",
      "rhel_username": "<rhsm username>",
      "rhel_password": "<rhsm password>"
    }
    

    NOTE: To build the RHEL-based image, image-builder temporarily consumes a Red Hat subscription. That subscription is returned once the image is built.

  6. To create a RHEL-based image, run image-builder with the following options:

    • --os: redhat
    • --hypervisor: For CloudStack use cloudstack
    • --release-channel: Supported EKS Distro releases include 1-21, 1-22, 1-23, 1-24 and 1-25.
    • --cloudstack-config: CloudStack configuration file (cloudstack.json in this example)
    image-builder build --os redhat --hypervisor cloudstack --release-channel 1-25 --cloudstack-config cloudstack.json
    
  7. To consume the resulting RHEL-based image, add it as a template to your CloudStack setup as described in Preparing CloudStack .

Build Snow node images

These steps use image-builder to create an Ubuntu-based Amazon Machine Image (AMI) that is backed by EBS volumes for Snow. Before proceeding, ensure that the above system-level, network-level and AMI-specific prerequisites have been met.

  1. Create a Linux user for running image-builder.

    sudo adduser image-builder
    

    Follow the prompt to provide a password for the image-builder user.

  2. Add the image-builder user to the sudo group, then switch to that user, providing the password from the previous step when prompted.

    sudo usermod -aG sudo image-builder
    su image-builder
    cd /home/$USER
    
  3. Install packages and prepare environment:

    sudo apt update -y
    sudo apt install jq unzip make ansible python3-pip -y
    sudo snap install yq
    mkdir -p /home/$USER/.ssh
    echo "HostKeyAlgorithms +ssh-rsa" >> /home/$USER/.ssh/config
    echo "PubkeyAcceptedKeyTypes +ssh-rsa" >> /home/$USER/.ssh/config
    
  4. Get image-builder:

    cd /tmp
    LATEST_EKSA_RELEASE_VERSION=$(curl -s https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.latestVersion")
    BUNDLE_MANIFEST_URL=$(curl -s https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.releases[] | select(.version==\"$LATEST_EKSA_RELEASE_VERSION\").bundleManifestUrl")
    IMAGEBUILDER_TARBALL_URI=$(curl -s $BUNDLE_MANIFEST_URL | yq ".spec.versionsBundles[0].eksD.imagebuilder.uri")
    curl -s $IMAGEBUILDER_TARBALL_URI | tar xz ./image-builder
    sudo cp ./image-builder /usr/local/bin
    cd -
    
  5. Create an AMI configuration file (for example, ami.json) that contains various AMI parameters.

    {
      "ami_filter_name": "<Regular expression to filter a source AMI (default: ubuntu/images/*ubuntu-focal-20.04-amd64-server-*)>",
      "ami_filter_owners": "<AWS account ID or AWS owner alias such as 'amazon', 'aws-marketplace', etc (default: 679593333241 - the AWS Marketplace AWS account ID)>",
      "ami_regions": "<A list of AWS regions to copy the AMI to>",
      "aws_region": "<The AWS region in which to launch the EC2 instance to create the AMI>",
      "ansible_extra_vars": "<The absolute path to the additional variables to pass to Ansible. These are converted to the `--extra-vars` command-line argument. This path must be prefixed with '@'>",
      "builder_instance_type": "<The EC2 instance type to use while building the AMI (default: t3.small)>",
      "custom_role": "<If set to true, this will run a custom Ansible role before the `sysprep` role to allow for further customization>",
      "custom_role_name_list" : "<Array of strings representing the absolute paths of custom Ansible roles to run. This field is mutually exclusive with custom_role_names>",
      "custom_role_names": "<Space-delimited string of the custom roles to run. This field is mutually exclusive with custom_role_name_list and is provided for compatibility with Ansible's input format>",
      "manifest_output": "<The absolute path to write the build artifacts manifest to. If you wish to export the AMI using this manifest, ensure that you provide a path that is not inside the '/home/$USER/eks-anywhere-build-tooling' path since that will be cleaned up when the build finishes>",
      "root_device_name": "<The device name used by EC2 for the root EBS volume attached to the instance>",
      "subnet_id": "<The ID of the subnet where Packer will launch the EC2 instance. This field is required when using a non-default VPC>",
      "volume_size": "<The size of the root EBS volume in GiB>",
      "volume_type": "<The type of root EBS volume, such as gp2, gp3, io1, etc.>"
    }
    
  6. To create an Ubuntu-based image, run image-builder with the following options:

    • --os: ubuntu
    • --hypervisor: For AMI, use ami
    • --release-channel: Supported EKS Distro releases include 1-21, 1-22, 1-23 and 1-24.
    • --ami-config: AMI configuration file (ami.json in this example)
    image-builder build --os ubuntu --hypervisor ami --release-channel 1-24 --ami-config ami.json
    
  7. After the build, the Ubuntu AMI will be available in your AWS account in the AWS region specified in your AMI configuration file. If you wish to export it as a Raw image, you can achieve this using the AWS CLI.

    ARTIFACT_ID=$(cat <manifest output location> | jq -r '.builds[0].artifact_id')
    AMI_ID=$(echo $ARTIFACT_ID | cut -d: -f2)
    IMAGE_FORMAT=raw
    AMI_EXPORT_BUCKET_NAME=<S3 bucket to export the AMI to>
    AMI_EXPORT_PREFIX=<S3 prefix for the exported AMI object>
    EXPORT_RESPONSE=$(aws ec2 export-image --disk-image-format $IMAGE_FORMAT --s3-export-location S3Bucket=$AMI_EXPORT_BUCKET_NAME,S3Prefix=$AMI_EXPORT_PREFIX --image-id $AMI_ID)
    EXPORT_TASK_ID=$(echo $EXPORT_RESPONSE | jq -r '.ExportImageTaskId')
    

    The exported image will be available at the location s3://$AMI_EXPORT_BUCKET_NAME/$AMI_EXPORT_PREFIX/$EXPORT_TASK_ID.raw.
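
The export runs asynchronously, so the object does not appear in S3 as soon as the export-image call returns. One way to wait for it is to poll the task status with the AWS CLI. This is a minimal sketch; wait_for_export is a hypothetical helper name, and the 30-second default interval is an arbitrary choice.

```shell
# wait_for_export (hypothetical helper): polls an export image task until it
# leaves the "active" state, prints the final status, and succeeds only if
# that status is "completed".
#   $1: export image task ID
#   $2: seconds to sleep between polls (optional, defaults to 30)
wait_for_export() (
  task_id="$1"
  interval="${2:-30}"
  while :; do
    status=$(aws ec2 describe-export-image-tasks \
      --export-image-task-ids "$task_id" \
      --query 'ExportImageTasks[0].Status' --output text) || exit 1
    [ "$status" = "active" ] || break
    sleep "$interval"
  done
  printf '%s\n' "$status"
  [ "$status" = "completed" ]
)

# Usage, with the task ID captured from the export-image response:
#   wait_for_export "$EXPORT_TASK_ID"
```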

Build Nutanix node images

These steps use image-builder to create an Ubuntu-based image for Nutanix AHV and import it into the AOS Image Service. Before proceeding, ensure that the above system-level, network-level and Nutanix-specific prerequisites have been met.

  1. Download an Ubuntu cloud image for the build and upload it to the AOS Image Service using Prism. You will need to specify this image name as the source_image_name in the nutanix-connection.json config file specified below.

  2. Create a Linux user for running image-builder.

    sudo adduser image-builder
    

    Follow the prompt to provide a password for the image-builder user.

  3. Add the image-builder user to the sudo group, then switch to that user, providing the password from the previous step when prompted.

    sudo usermod -aG sudo image-builder
    su image-builder
    cd /home/$USER
    
  4. Install packages and prepare environment:

    sudo apt update -y
    sudo apt install jq unzip make ansible python3-pip -y
    sudo snap install yq
    mkdir -p /home/$USER/.ssh
    echo "HostKeyAlgorithms +ssh-rsa" >> /home/$USER/.ssh/config
    echo "PubkeyAcceptedKeyTypes +ssh-rsa" >> /home/$USER/.ssh/config
    
  5. Get image-builder:

    cd /tmp
    LATEST_EKSA_RELEASE_VERSION=$(curl -s https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.latestVersion")
    BUNDLE_MANIFEST_URL=$(curl -s https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml | yq ".spec.releases[] | select(.version==\"$LATEST_EKSA_RELEASE_VERSION\").bundleManifestUrl")
    IMAGEBUILDER_TARBALL_URI=$(curl -s $BUNDLE_MANIFEST_URL | yq ".spec.versionsBundles[0].eksD.imagebuilder.uri")
    curl -s $IMAGEBUILDER_TARBALL_URI | tar xz ./image-builder
    sudo cp ./image-builder /usr/local/bin
    cd -
    
  6. Create a nutanix-connection.json config file. More details on values can be found in the image-builder documentation. See the example below:

    {
      "nutanix_cluster_name": "Name of PE Cluster",
      "source_image_name": "Name of Source Image",
      "image_name": "Name of Destination Image",
      "nutanix_subnet_name": "Name of Subnet",
      "nutanix_endpoint": "Prism Central IP / FQDN",
      "nutanix_insecure": "false",
      "nutanix_port": "9440",
      "nutanix_username": "PrismCentral_Username",
      "nutanix_password": "PrismCentral_Password"
    }
    
  7. Run image-builder with the following options:

    • --os: ubuntu
    • --hypervisor: For Nutanix use nutanix
    • --release-channel: Supported EKS Distro releases include 1-21, 1-22, 1-23, 1-24 and 1-25.
    • --nutanix-config: Nutanix configuration file (nutanix-connection.json in this example)
    cd /home/$USER
    image-builder build --os ubuntu --hypervisor nutanix --release-channel 1-25 --nutanix-config nutanix-connection.json
    

Images

The various images for EKS Anywhere can be found in the EKS Anywhere ECR repository. The various images for EKS Distro can be found in the EKS Distro ECR repository.

5.13 - Ports and protocols

Ports used with an EKS Anywhere cluster

EKS Anywhere requires that various ports on control plane and worker nodes be open. Some Kubernetes-specific ports need open access only from other Kubernetes nodes, while others are exposed externally. Beyond Kubernetes ports, someone managing an EKS Anywhere cluster must also have external access to ports on the underlying EKS Anywhere provider (such as VMware) and to external tooling (such as Jenkins).

If you are responsible for network firewall rules between nodes on your EKS Anywhere clusters, the following tables describe both Kubernetes and EKS Anywhere-specific ports you should be aware of.

Kubernetes control plane

The following table represents the ports published by the Kubernetes project that must be accessible on any Kubernetes control plane.

Protocol Direction Port Range Purpose Used By
TCP Inbound 6443 Kubernetes API server All
TCP Inbound 10250 Kubelet API Self, Control plane
TCP Inbound 10259 kube-scheduler Self
TCP Inbound 10257 kube-controller-manager Self

Although etcd ports are included in the control plane section, you can also host your own etcd cluster externally or on custom ports.

Protocol Direction Port Range Purpose Used By
TCP Inbound 2379-2380 etcd server client API kube-apiserver, etcd

Use the following to access the SSH service on the control plane and etcd nodes:

Protocol Direction Port Range Purpose Used By
TCP Inbound 22 SSHD server SSH clients

Kubernetes worker nodes

The following table represents the ports published by the Kubernetes project that must be accessible from worker nodes.

Protocol  Direction  Port Range   Purpose            Used By
TCP       Inbound    10250        Kubelet API        Self, Control plane
TCP       Inbound    30000-32767  NodePort Services  All

The API server port is sometimes switched to 443. Alternatively, the default port is kept as is and the API server is put behind a load balancer that listens on 443 and routes requests to the API server on the default port.

Use the following to access the SSH service on the worker nodes:

Protocol  Direction  Port Range  Purpose      Used By
TCP       Inbound    22          SSHD server  SSH clients

Bare Metal provider

On the Admin machine for a Bare Metal provider, the following ports need to be accessible to all the nodes in the cluster, from the same layer 2 network, for initial network booting:

Protocol  Direction  Port Range  Purpose           Used By
UDP       Inbound    67          Boots DHCP        All nodes, for network boot
UDP       Inbound    69          Boots TFTP        All nodes, for network boot
TCP       Inbound    80          Boots HTTP        All nodes, for network boot
TCP       Inbound    42113       Tink-server gRPC  All nodes, talk to Tinkerbell
TCP       Inbound    50061       Hegel HTTP        All nodes, talk to Tinkerbell
TCP       Outbound   623         Rufio IPMI        All nodes, out-of-band power and next boot (optional)
TCP       Outbound   80,443      Rufio Redfish     All nodes, out-of-band power and next boot (optional)

VMware provider

The following table displays ports that need to be accessible from the VMware provider running EKS Anywhere:

Protocol  Direction  Port Range  Purpose                Used By
TCP       Inbound    443         vCenter Server         vCenter API endpoint
TCP       Inbound    6443        Kubernetes API server  Kubernetes API endpoint
TCP       Inbound    2379        Manager                Etcd API endpoint
TCP       Inbound    2380        Manager                Etcd API endpoint

Nutanix provider

The following table displays ports that need to be accessible from the Nutanix provider running EKS Anywhere:

Protocol  Direction  Port Range  Purpose                Used By
TCP       Inbound    9440        Prism Central Server   Prism Central API endpoint
TCP       Inbound    6443        Kubernetes API server  Kubernetes API endpoint
TCP       Inbound    2379        Manager                Etcd API endpoint
TCP       Inbound    2380        Manager                Etcd API endpoint

Snow provider

In addition to the Ports Required to Use AWS Services on an AWS Snowball Edge Device, the following table displays ports that need to be accessible from the Snow provider running EKS Anywhere:

Protocol  Direction  Port Range  Purpose                Used By
TCP       Inbound    9092        Device Controller      EKS Anywhere and CAPAS controller
TCP       Inbound    8242        EC2 HTTPS endpoint     EKS Anywhere and CAPAS controller
TCP       Inbound    6443        Kubernetes API server  Kubernetes API endpoint
TCP       Inbound    2379        Manager                Etcd API endpoint
TCP       Inbound    2380        Manager                Etcd API endpoint

Control plane management tools

A variety of control plane management tools are available to use with EKS Anywhere. One example is Jenkins.

Protocol  Direction  Port Range  Purpose         Used By
TCP       Inbound    8080        Jenkins Server  HTTP Jenkins endpoint
TCP       Inbound    8443        Jenkins Server  HTTPS Jenkins endpoint

5.14 - Release Alerts

SNS Alerts for EKS Anywhere release

EKS Anywhere uses Amazon Simple Notification Service (SNS) to announce the availability of new releases. We recommend keeping your clusters up to date with the latest EKS Anywhere release. To subscribe to the SNS notifications, follow these steps:

  • Sign in to your AWS Account
  • Select us-east-1 region
  • Go to the SNS Console
  • In the left navigation pane, choose “Subscriptions”
  • On the Subscriptions page, choose “Create subscription”
  • On the Create subscription page, in the Details section enter the following information
    • Topic ARN
      arn:aws:sns:us-east-1:153288728732:eks-anywhere-updates
      
    • Protocol - Email
    • Endpoint - Your preferred email address
  • Choose Create Subscription
  • In a few minutes, you will receive an email asking you to confirm the subscription
  • Click the confirmation link in the email
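
The console steps above can also be expressed as a single AWS CLI call. The following is a minimal sketch that only prints the command so it can be reviewed first; the function name subscribe_command and the address you@example.com are illustrative assumptions, while the topic ARN and region are the ones given in the steps.

```shell
#!/bin/sh
# Topic ARN and region come from the steps above.
TOPIC_ARN="arn:aws:sns:us-east-1:153288728732:eks-anywhere-updates"
REGION="us-east-1"

# Prints the AWS CLI command for an email subscription to the release topic.
# $1 is the email address that should receive the notifications.
subscribe_command() {
  echo "aws sns subscribe --region $REGION --topic-arn $TOPIC_ARN --protocol email --notification-endpoint $1"
}

# you@example.com is a placeholder address.
subscribe_command "you@example.com"
```

Running the printed command requires the AWS CLI and credentials for your account. The subscription still has to be confirmed through the link in the email.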

5.15 - eksctl anywhere CLI reference

Details on the options and parameters for eksctl anywhere CLI

The eksctl CLI, with the EKS Anywhere plugin added, lets you create and manage EKS Anywhere clusters. While a cluster is running, most EKS Anywhere administration can be done using kubectl or other native Kubernetes tools.

Use this page as a reference to useful eksctl anywhere command examples for working with EKS Anywhere clusters. Available eksctl anywhere commands include:

  • create cluster To create an EKS Anywhere cluster
  • upgrade To upgrade a workload cluster
  • delete cluster To delete an EKS Anywhere cluster
  • generate [clusterconfig | support-bundle | support-bundle-config | packages | hardware] To generate cluster, support configs, package configs, and tinkerbell hardware files
  • help To get help information
  • version To get the EKS Anywhere version

Options used with multiple commands include:

  • -h or --help To get help for a command or subcommand
  • -v int or --verbosity int To set log level verbosity from 0-9
  • -f filename or --filename filename To identify the filename containing the cluster config
  • --force-cleanup To force deletion of previously created bootstrap cluster
  • -w string or --w-config string To identify the kubeconfig file when needed to create a support bundle or upgrade a cluster

Other available options and arguments are listed with the command examples that follow.

eksctl anywhere generate

With eksctl anywhere generate, you can output sets of cluster resources to create a new cluster or troubleshoot an existing cluster. Here are some examples.

eksctl anywhere generate clusterconfig

Using eksctl anywhere generate clusterconfig, you can generate a cluster configuration for a specific provider (-p provider_name or --provider provider_name). Here are examples:

Generate a configuration file to create an EKS Anywhere cluster for a vsphere provider:

export CLUSTER_NAME=vsphere01
eksctl anywhere generate clusterconfig ${CLUSTER_NAME} -p vsphere > ${CLUSTER_NAME}.yaml

Generate a configuration file to create an EKS Anywhere cluster for a Docker provider:

export CLUSTER_NAME=docker01
eksctl anywhere generate clusterconfig ${CLUSTER_NAME} -p docker > ${CLUSTER_NAME}.yaml

Once you have generated the yaml configuration file, edit that file to add configuration information before you use the file to create your cluster. See local and production cluster creation procedures for details.

eksctl anywhere generate support-bundle-config

If you would like to customize your support bundle, you can generate a support bundle configuration file (support-bundle-config), edit that file to choose the data you want to gather, then gather the selected data into a support bundle (support-bundle).

Generate a support bundle config file (then edit that file to select the log data you want to gather):

export CLUSTER_NAME=vsphere01
eksctl anywhere generate support-bundle-config > ${CLUSTER_NAME}_bundle_config.yaml 

eksctl anywhere generate support-bundle

Once you have a bundle config file, generate a support bundle from an existing EKS Anywhere cluster. Additional options available for this command include:

  • --bundle-config string To identify the bundle config file to use to generate the support bundle
  • --since string To collect pod logs in the latest duration like 5s, 2m, or 3h.
  • --since-time string To collect pod logs after a specific datetime(RFC3339) like 2021-06-28T15:04:05Z

Here is an example:

export CLUSTER_NAME=vsphere01
eksctl anywhere generate support-bundle --bundle-config ${CLUSTER_NAME}_bundle_config.yaml \
   -w ${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig \
   --since 2h -f ${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.yaml

The example just shown:

  • Identifies the EKS Anywhere cluster configuration file with -f (${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.yaml)
  • Collects pod logs for the past two hours (2h)
  • Identifies the bundle config file to use (${CLUSTER_NAME}_bundle_config.yaml)
  • Identifies the .kubeconfig file to use for a workload cluster

To change the command to generate a support bundle that gathers pod logs starting from a specific date (September 8, 2021) and time (1:27 PM):

export CLUSTER_NAME=vsphere01
eksctl anywhere generate support-bundle --bundle-config ${CLUSTER_NAME}_bundle_config.yaml \
   -w ${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig \
   --since-time 2021-09-08T13:27:00Z -f ${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.yaml
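
Because --since-time expects an RFC3339 datetime, the month and day must each be two digits (09-08, not 09-8). The following is a minimal sketch of a shape check that can be run before calling the command. The function name is_rfc3339_utc is an illustrative assumption, and the check deliberately covers only the UTC "Z" form shown in the help text, not numeric offsets or fractional seconds.

```shell
#!/bin/sh
# Succeeds only for UTC timestamps shaped like 2021-06-28T15:04:05Z.
# This checks the shape, not whether the date exists on the calendar.
is_rfc3339_utc() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$'
}

for ts in 2021-09-08T13:27:00Z 2021-09-8T13:27:00Z; do
  if is_rfc3339_utc "$ts"; then
    echo "$ts looks valid"
  else
    echo "$ts is malformed"
  fi
done
```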

eksctl anywhere create cluster

Create an EKS Anywhere cluster from a cluster configuration file you generated (and modified) earlier. This example sets verbosity to most verbose (-v 9):

export CLUSTER_NAME=vsphere01
eksctl anywhere create cluster -v 9 -f ${CLUSTER_NAME}.yaml

See local and production cluster creation procedures for details.

eksctl anywhere upgrade cluster

Upgrade an existing EKS Anywhere cluster. This example uses maximum verbosity and forces a cleanup of the previously created bootstrap cluster:

export CLUSTER_NAME=vsphere01
eksctl anywhere upgrade cluster -f ${CLUSTER_NAME}.yaml --force-cleanup -v 9 \
   -w ${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig

For more information on this and other ways to upgrade a cluster, see Upgrade cluster.

eksctl anywhere delete cluster

Delete an existing EKS Anywhere cluster. This example deletes all VMs and forces the deletion of the previously created bootstrap cluster:

export CLUSTER_NAME=vsphere01
eksctl anywhere delete cluster -f ${CLUSTER_NAME}.yaml \
   --force-cleanup \
   -w ${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig

For more information on deleting a cluster, see Delete cluster.

eksctl anywhere version

View the version of eksctl anywhere:

eksctl anywhere version
v0.5.0

eksctl anywhere help

Use eksctl anywhere help or the -h option to see general options or options specific to a particular set of commands.

View general help information using help:

eksctl anywhere help

Use eksctl anywhere to build your own self-managing cluster on your hardware with the best of Amazon EKS

Usage:
  eksctl anywhere [command]

Available Commands:
  create      Create resources
  delete      Delete resources
  generate    Generate resources
  help        Help about any command
  upgrade     Upgrade resources
  version     Get the eksctl version

Flags:
  -h, --help            help for eksctl
  -v, --verbosity int   Set the log level verbosity

Use "eksctl [command] --help" for more information about a command.
...

Display help options for generating a support bundle:

eksctl anywhere generate support-bundle -h

This command is used to create a support bundle to troubleshoot a cluster

Usage:
  eksctl anywhere generate support-bundle -f my-cluster.yaml [flags]

Flags:
      --bundle-config string   Bundle Config file to use when generating support bundle
  -f, --filename string        Filename that contains EKS-A cluster configuration
  -h, --help                   help for support-bundle
      --since string           Collect pod logs in the latest duration like 5s, 2m, or 3h.
      --since-time string      Collect pod logs after a specific datetime(RFC3339) like 2021-06-28T15:04:05Z
  -w, --w-config string        Kubeconfig file to use when creating support bundle for a workload cluster

Global Flags:
  -v, --verbosity int   Set the log level verbosity

Display options for creating a cluster:

eksctl anywhere create cluster -h
This command is used to create workload clusters

Usage:
  eksctl anywhere create cluster [flags]

Flags:
  -f, --filename string   Filename that contains EKS-A cluster configuration
      --force-cleanup     Force deletion of previously created bootstrap cluster
  -h, --help              help for cluster

Global Flags:
  -v, --verbosity int   Set the log level verbosity

5.15.1 - anywhere

anywhere

Amazon EKS Anywhere

Synopsis

Use eksctl anywhere to build your own self-managing cluster on your hardware with the best of Amazon EKS

Options

  -h, --help            help for anywhere
  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.2 - anywhere apply

anywhere apply

Apply resources

Synopsis

Use eksctl anywhere apply to apply resources

Options

  -h, --help   help for apply

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.3 - anywhere apply package(s)

anywhere apply package(s)

Apply curated packages

Synopsis

Apply Curated Packages Custom Resources to the cluster

anywhere apply package(s) [flags]

Options

  -f, --filename string     Filename that contains curated packages custom resources to apply
  -h, --help                help for package(s)
      --kubeconfig string   Path to an optional kubeconfig file to use.

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.4 - anywhere check-images

anywhere check-images

Check images used by EKS Anywhere do exist in the target registry

Synopsis

This command is used to check images used by EKS-Anywhere for cluster provisioning do exist in the target registry

anywhere check-images [flags]

Options

  -f, --filename string   Filename that contains EKS-A cluster configuration
  -h, --help              help for check-images

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.5 - anywhere copy

anywhere copy

Copy resources

Synopsis

Copy EKS Anywhere resources and artifacts

Options

  -h, --help   help for copy

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.6 - anywhere copy packages

anywhere copy packages

Copy curated package images and charts from a source to a destination

Synopsis

Copy all the EKS Anywhere curated package images and helm charts from a source to a destination.

anywhere copy packages [flags]

Options

      --aws-region string   Region to copy images from
  -b, --bundle string       EKS-A bundle file to read artifact dependencies from
      --dry-run             Dry run copy to print images that would be copied
      --dst-cert string     TLS certificate for destination registry
  -h, --help                help for packages
      --insecure            Skip TLS verification while copying images and charts
      --src-cert string     TLS certificate for source registry

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.7 - anywhere create

anywhere create

Create resources

Synopsis

Use eksctl anywhere create to create resources, such as clusters

Options

  -h, --help   help for create

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.8 - anywhere create cluster

anywhere create cluster

Create workload cluster

Synopsis

This command is used to create workload clusters

anywhere create cluster -f <cluster-config-file> [flags]

Options

      --bundles-override string          Override default Bundles manifest (not recommended)
  -f, --filename string                  Filename that contains EKS-A cluster configuration
      --force-cleanup                    Force deletion of previously created bootstrap cluster
  -z, --hardware-csv string              Path to a CSV file containing hardware data.
  -h, --help                             help for cluster
      --install-packages string          Location of curated packages configuration files to install to the cluster
      --kubeconfig string                Management cluster kubeconfig file
      --no-timeouts                      Disable timeout for all wait operations
      --skip-ip-check                    Skip check for whether cluster control plane ip is in use
      --tinkerbell-bootstrap-ip string   Override the local tinkerbell IP in the bootstrap cluster

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity
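
As an illustration of combining -f with --hardware-csv, the following minimal sketch checks that both input files exist before printing the create command. It assumes the hardware CSV is used with the Bare Metal (Tinkerbell) provider; the function name bare_metal_create_command and the file names cluster.yaml and hardware.csv are placeholders, not names defined by EKS Anywhere.

```shell
#!/bin/sh
# Prints the create command, or fails if an input file is missing.
# $1 = cluster config file, $2 = hardware CSV file (placeholder names).
bare_metal_create_command() {
  for f in "$1" "$2"; do
    if [ ! -f "$f" ]; then
      echo "missing file: $f" >&2
      return 1
    fi
  done
  echo "eksctl anywhere create cluster -f $1 --hardware-csv $2"
}

# Print the command if both files are present, otherwise say so.
bare_metal_create_command cluster.yaml hardware.csv || echo "nothing to run yet"
```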

SEE ALSO

5.15.9 - anywhere create package(s)

anywhere create package(s)

Create curated packages

Synopsis

Create Curated Packages Custom Resources to the cluster

anywhere create package(s) [flags]

Options

  -f, --filename string     Filename that contains curated packages custom resources to create
  -h, --help                help for package(s)
      --kubeconfig string   Path to an optional kubeconfig file to use.

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.10 - anywhere delete

anywhere delete

Delete resources

Synopsis

Use eksctl anywhere delete to delete clusters

Options

  -h, --help   help for delete

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.11 - anywhere delete cluster

anywhere delete cluster

Workload cluster

Synopsis

This command is used to delete workload clusters created by eksctl anywhere

anywhere delete cluster (<cluster-name>|-f <config-file>) [flags]

Options

      --bundles-override string   Override default Bundles manifest (not recommended)
  -f, --filename string           Filename that contains EKS-A cluster configuration, required if <cluster-name> is not provided
      --force-cleanup             Force deletion of previously created bootstrap cluster
  -h, --help                      help for cluster
      --kubeconfig string         kubeconfig file pointing to a management cluster
  -w, --w-config string           Kubeconfig file to use when deleting a workload cluster

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.12 - anywhere delete package(s)

anywhere delete package(s)

Delete package(s)

Synopsis

This command is used to delete the curated packages custom resources installed in the cluster

anywhere delete package(s) [flags]

Options

      --cluster string      Cluster for package deletion.
  -h, --help                help for package(s)
      --kubeconfig string   Path to an optional kubeconfig file to use.

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.13 - anywhere describe

anywhere describe

Describe resources

Synopsis

Use eksctl anywhere describe to show details of a specific resource or group of resources

Options

  -h, --help   help for describe

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.14 - anywhere describe package(s)

anywhere describe package(s)

Describe curated packages in the cluster

anywhere describe package(s) [flags]

Options

      --cluster string      Cluster to describe packages.
  -h, --help                help for package(s)
      --kubeconfig string   Path to an optional kubeconfig file to use.

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.15 - anywhere download

anywhere download

Download resources

Synopsis

Use eksctl anywhere download to download artifacts (manifests, bundles) used by EKS Anywhere

Options

  -h, --help   help for download

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.16 - anywhere download artifacts

anywhere download artifacts

Download EKS Anywhere artifacts/manifests to a tarball on disk

Synopsis

This command is used to download the S3 artifacts from an EKS Anywhere bundle manifest and package them into a tarball

anywhere download artifacts [flags]

Options

      --bundles-override string   Override default Bundles manifest (not recommended)
  -d, --download-dir string       Directory to download the artifacts to (default "eks-anywhere-downloads")
      --dry-run                   Print the manifest URIs without downloading them
  -f, --filename string           [Deprecated] Filename that contains EKS-A cluster configuration
  -h, --help                      help for artifacts
  -r, --retain-dir                Do not delete the download folder after creating a tarball

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.17 - anywhere download images

anywhere download images

Download all eks-a images to disk

Synopsis

Creates a tarball containing all necessary images to create an eks-a cluster for any of the supported Kubernetes versions.

anywhere download images [flags]

Options

      --bundles-override string   Override default Bundles manifest (not recommended)
  -h, --help                      help for images
      --include-packages          this flag no longer works, use copy packages instead (DEPRECATED: use copy packages command)
      --insecure                  Flag to indicate skipping TLS verification while downloading helm charts
  -o, --output string             Output tarball containing all downloaded images

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.18 - anywhere exp

anywhere exp

experimental commands

Synopsis

Use eksctl anywhere experimental commands

Options

  -h, --help   help for exp

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.19 - anywhere exp validate

anywhere exp validate

Validate resource or action

Synopsis

Use eksctl anywhere validate to validate a resource or action

Options

  -h, --help   help for validate

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.20 - anywhere exp validate create

anywhere exp validate create

Validate create resources

Synopsis

Use eksctl anywhere validate create to validate the create action on resources, such as cluster

Options

  -h, --help   help for create

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.21 - anywhere exp validate create cluster

anywhere exp validate create cluster

Validate create cluster

Synopsis

Use eksctl anywhere validate create cluster to validate the create cluster action

anywhere exp validate create cluster -f <cluster-config-file> [flags]

Options

  -f, --filename string                  Filename that contains EKS-A cluster configuration
  -z, --hardware-csv string              Path to a CSV file containing hardware data.
  -h, --help                             help for cluster
      --tinkerbell-bootstrap-ip string   Override the local tinkerbell IP in the bootstrap cluster

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.22 - anywhere exp vsphere

anywhere exp vsphere

Utility vsphere operations

Synopsis

Use eksctl anywhere vsphere to perform utility operations on vsphere

Options

  -h, --help   help for vsphere

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.23 - anywhere exp vsphere setup

anywhere exp vsphere setup

Setup vSphere objects

Synopsis

Use eksctl anywhere vsphere setup to configure vSphere objects

Options

  -h, --help   help for setup

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.24 - anywhere exp vsphere setup user

anywhere exp vsphere setup user

Setup vSphere user

Synopsis

Use eksctl anywhere vsphere setup user to configure EKS Anywhere vSphere user

anywhere exp vsphere setup user -f <config-file> [flags]

Options

  -f, --filename string   Filename containing vsphere setup configuration
      --force             Force flag. When set, setup user will proceed even if the group and role objects already exist. Mutually exclusive with --password flag, as it expects the user to already exist. default: false
  -h, --help              help for user
  -p, --password string   Password for creating new user

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity
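
Since --force and --password are mutually exclusive, a wrapper can reject the combination before the CLI is called. The following is a minimal sketch that only prints the resulting command; the function name setup_user_command, its "force"/"password" mode argument, and the file name user.yaml are illustrative assumptions.

```shell
#!/bin/sh
# Prints the setup command, refusing to combine --force with --password.
# $1 = setup config file, $2 = "force" or "password", $3 = password (password mode only).
setup_user_command() {
  case "$2" in
    force)
      if [ -n "${3:-}" ]; then
        echo "--force cannot be combined with --password" >&2
        return 1
      fi
      echo "eksctl anywhere exp vsphere setup user -f $1 --force"
      ;;
    password)
      if [ -z "${3:-}" ]; then
        echo "password mode needs a password" >&2
        return 1
      fi
      echo "eksctl anywhere exp vsphere setup user -f $1 --password $3"
      ;;
    *)
      echo "mode must be 'force' or 'password'" >&2
      return 1
      ;;
  esac
}

# user.yaml is a placeholder setup configuration file.
setup_user_command user.yaml force
```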

SEE ALSO

5.15.25 - anywhere generate

anywhere generate

Generate resources

Synopsis

Use eksctl anywhere generate to generate resources, such as clusterconfig yaml

Options

  -h, --help   help for generate

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.26 - anywhere generate clusterconfig

anywhere generate clusterconfig

Generate cluster config

Synopsis

This command is used to generate a cluster config yaml for the create cluster command

anywhere generate clusterconfig <cluster-name> (max 80 chars) [flags]

Options

  -h, --help              help for clusterconfig
  -p, --provider string   Provider to use (vsphere or tinkerbell or docker)

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity
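
The two constraints in this help text (a cluster name of at most 80 characters and a fixed set of provider names) can be checked before the command is run. The following is a minimal sketch; the function name valid_clusterconfig_args is an illustrative assumption, and the provider list mirrors only the values named in the help text above, so it may not be exhaustive for your version.

```shell
#!/bin/sh
# Succeeds when the name ($1) is 1-80 characters long and the provider ($2)
# is one of the values listed in the help text.
valid_clusterconfig_args() {
  name_len=${#1}
  if [ "$name_len" -lt 1 ] || [ "$name_len" -gt 80 ]; then
    return 1
  fi
  case "$2" in
    vsphere|tinkerbell|docker) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the generate command only when the arguments pass the check.
if valid_clusterconfig_args vsphere01 vsphere; then
  echo "eksctl anywhere generate clusterconfig vsphere01 -p vsphere"
fi
```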

SEE ALSO

5.15.27 - anywhere generate hardware

anywhere generate hardware

Generate hardware files

Synopsis

Generate Kubernetes hardware YAML manifests for each Hardware entry in the source.

anywhere generate hardware [flags]

Options

  -z, --hardware-csv string   Path to a CSV file containing hardware data.
  -h, --help                  help for hardware
  -o, --output string         Path to output hardware YAML.

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.28 - anywhere generate packages

anywhere generate packages

Generate package(s) configuration

Synopsis

Generates Kubernetes configuration files for curated packages

anywhere generate packages [flags] package

Options

      --cluster string        Name of cluster for package generation
  -h, --help                  help for packages
      --kube-version string   Kubernetes Version of the cluster to be used. Format <major>.<minor>
      --kubeconfig string     Path to an optional kubeconfig file to use.
      --registry string       Used to specify an alternative registry for package generation

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.29 - anywhere generate support-bundle

anywhere generate support-bundle

Generate a support bundle

Synopsis

This command is used to create a support bundle to troubleshoot a cluster

anywhere generate support-bundle -f my-cluster.yaml [flags]

Options

      --bundle-config string   Bundle Config file to use when generating support bundle
  -f, --filename string        Filename that contains EKS-A cluster configuration
  -h, --help                   help for support-bundle
      --since string           Collect pod logs in the latest duration like 5s, 2m, or 3h.
      --since-time string      Collect pod logs after a specific datetime(RFC3339) like 2021-06-28T15:04:05Z
  -w, --w-config string        Kubeconfig file to use when creating support bundle for a workload cluster

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.30 - anywhere generate support-bundle-config

anywhere generate support-bundle-config

Generate support bundle config

Synopsis

This command is used to generate a default support bundle config yaml

anywhere generate support-bundle-config [flags]

Options

  -f, --filename string   Filename that contains EKS-A cluster configuration
  -h, --help              help for support-bundle-config

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.31 - anywhere get

anywhere get

Get resources

Synopsis

Use eksctl anywhere get to display one or many resources

Options

  -h, --help   help for get

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.32 - anywhere get package(s)

anywhere get package(s)

Get package(s)

Synopsis

This command is used to display the curated packages installed in the cluster

anywhere get package(s) [flags]

Options

      --cluster string      Cluster to get list of packages.
  -h, --help                help for package(s)
      --kubeconfig string   Path to an optional kubeconfig file.
  -o, --output string       Specifies the output format (valid option: json, yaml)

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.33 - anywhere get packagebundle(s)

anywhere get packagebundle(s)

Get packagebundle(s)

Synopsis

This command is used to display the currently supported packagebundles

anywhere get packagebundle(s) [flags]

Options

  -h, --help                help for packagebundle(s)
      --kubeconfig string   Path to an optional kubeconfig file.
  -o, --output string       Specifies the output format (valid option: json, yaml)

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.34 - anywhere get packagebundlecontroller(s)

anywhere get packagebundlecontroller(s)

Get packagebundlecontroller(s)

Synopsis

This command is used to display the current packagebundlecontrollers

anywhere get packagebundlecontroller(s) [flags]

Options

  -h, --help                help for packagebundlecontroller(s)
      --kubeconfig string   Path to an optional kubeconfig file.
  -o, --output string       Specifies the output format (valid option: json, yaml)

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.35 - anywhere import

anywhere import

Import resources

Synopsis

Use eksctl anywhere import to import resources, such as images and helm charts

Options

  -h, --help   help for import

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.36 - anywhere import images

anywhere import images

Import images and charts to a registry from a tarball

Synopsis

Import all the images and helm charts necessary for EKS Anywhere clusters into a registry. Use this command in conjunction with download images, passing it output tarball as input to this command.

anywhere import images [flags]

Options

  -b, --bundles string     Bundles file to read artifact dependencies from
  -h, --help               help for images
      --include-packages   Flag to indicate inclusion of curated packages in imported images (DEPRECATED: use copy packages command)
  -i, --input string       Input tarball containing all images and charts to import
      --insecure           Flag to indicate skipping TLS verification while pushing helm charts
  -r, --registry string    Registry where to import images and charts

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity
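
To show how import images pairs with download images, the following minimal sketch prints the two commands in order, using only flags listed on these reference pages. The function name image_transfer_commands and the values images.tar, registry.example.com:443, and bundle-release.yaml are placeholders chosen for illustration, not values defined by EKS Anywhere.

```shell
#!/bin/sh
# Prints the two steps of an image transfer: download to a tarball on a
# connected machine, then import that tarball into a private registry.
# $1 = tarball path, $2 = registry address, $3 = bundles file (placeholders).
image_transfer_commands() {
  echo "eksctl anywhere download images -o $1"
  echo "eksctl anywhere import images -i $1 -r $2 --bundles $3"
}

image_transfer_commands images.tar registry.example.com:443 bundle-release.yaml
```

The same tarball path is passed to -o and -i, which is what ties the two commands together.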

SEE ALSO

5.15.37 - anywhere install

anywhere install

Install resources to the cluster

Synopsis

Use eksctl anywhere install to install resources into a cluster

Options

  -h, --help   help for install

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.38 - anywhere install package

anywhere install package

Install package

Synopsis

This command is used to Install a curated package. Use list to discover curated packages

anywhere install package [flags] package

Options

      --cluster string        Target cluster for installation.
  -h, --help                  help for package
      --kube-version string   Kubernetes Version of the cluster to be used. Format <major>.<minor>
      --kubeconfig string     Path to an optional kubeconfig file to use.
  -n, --package-name string   Custom name of the curated package to install
      --registry string       Used to specify an alternative registry for discovery
      --set stringArray       Provide custom configurations for curated packages. Format key:value

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.39 - anywhere install packagecontroller

anywhere install packagecontroller

Install packagecontroller on the cluster

Synopsis

This command is used to Install the packagecontroller on to an existing cluster

anywhere install packagecontroller [flags]

Options

  -f, --filename string     Filename that contains EKS-A cluster configuration
  -h, --help                help for packagecontroller
      --kubeConfig string   Management cluster kubeconfig file

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.40 - anywhere list

anywhere list

List resources

Synopsis

Use eksctl anywhere list to list images and artifacts used by EKS Anywhere

Options

  -h, --help   help for list

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.41 - anywhere list images

anywhere list images

Generate a list of images used by EKS Anywhere

Synopsis

This command is used to generate a list of images used by EKS-Anywhere for cluster provisioning

anywhere list images [flags]

Options

      --bundles-override string   Override default Bundles manifest (not recommended)
  -f, --filename string           Filename that contains EKS-A cluster configuration
  -h, --help                      help for images

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.42 - anywhere list ovas

anywhere list ovas

List the OVAs that are supported by current version of EKS Anywhere

Synopsis

This command is used to list the vSphere OVAs from the EKS Anywhere bundle manifest for the current version of the EKS Anywhere CLI

anywhere list ovas [flags]

Options

      --bundles-override string   Override default Bundles manifest (not recommended)
  -f, --filename string           Filename that contains EKS-A cluster configuration
  -h, --help                      help for ovas

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.43 - anywhere list packages

anywhere list packages

Lists curated packages available to install

anywhere list packages [flags]

Options

      --cluster string        Name of cluster for package list.
  -h, --help                  help for packages
      --kube-version string   Kubernetes version <major>.<minor> of the packages to list, for example: "1.23".
      --kubeconfig string     Path to a kubeconfig file to use when source is a cluster.
      --registry string       Specifies an alternative registry for packages discovery.

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.44 - anywhere upgrade

anywhere upgrade

Upgrade resources

Synopsis

Use eksctl anywhere upgrade to upgrade resources, such as clusters

Options

  -h, --help   help for upgrade

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.45 - anywhere upgrade cluster

anywhere upgrade cluster

Upgrade workload cluster

Synopsis

This command is used to upgrade workload clusters

anywhere upgrade cluster [flags]

Options

      --bundles-override string   Override default Bundles manifest (not recommended)
  -f, --filename string           Filename that contains EKS-A cluster configuration
      --force-cleanup             Force deletion of previously created bootstrap cluster
  -z, --hardware-csv string       Path to a CSV file containing hardware data.
  -h, --help                      help for cluster
      --kubeconfig string         Management cluster kubeconfig file
      --no-timeouts               Disable timeout for all wait operations
  -w, --w-config string           Kubeconfig file to use when upgrading a workload cluster

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.46 - anywhere upgrade packages

anywhere upgrade packages

Upgrade all curated packages to the latest version

anywhere upgrade packages [flags]

Options

      --bundle-version string   Bundle version to use
      --cluster string          Cluster to upgrade.
  -h, --help                    help for packages
      --kubeconfig string       Path to an optional kubeconfig file to use.

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.47 - anywhere upgrade plan

anywhere upgrade plan

Provides information for a resource upgrade

Synopsis

Use eksctl anywhere upgrade plan to get information for a resource upgrade

Options

  -h, --help   help for plan

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

5.15.48 - anywhere upgrade plan cluster

anywhere upgrade plan cluster

Provides new release versions for the next cluster upgrade

Synopsis

Provides a list of target versions for upgrading the core components in the workload cluster

anywhere upgrade plan cluster [flags]

Options

      --bundles-override string   Override default Bundles manifest (not recommended)
  -f, --filename string           Filename that contains EKS-A cluster configuration
  -h, --help                      help for cluster
      --kubeconfig string         Management cluster kubeconfig file
  -o, --output string             Output format: text|json (default "text")

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity
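
One possible sequence, sketched under assumptions, is to review the upgrade plan as JSON and then run the upgrade with the same cluster configuration file. The file names cluster.yaml and mgmt.kubeconfig are placeholders. The run helper is hypothetical and prints each command unless DRY_RUN=0 is set.

```shell
# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Placeholder file names (assumptions); point these at your own files.
CLUSTER_CONFIG="cluster.yaml"
MGMT_KUBECONFIG="mgmt.kubeconfig"

# Show the target versions of the core components before changing anything.
run eksctl anywhere upgrade plan cluster \
  -f "$CLUSTER_CONFIG" --kubeconfig "$MGMT_KUBECONFIG" -o json

# Apply the upgrade described by the same cluster configuration file.
run eksctl anywhere upgrade cluster \
  -f "$CLUSTER_CONFIG" --kubeconfig "$MGMT_KUBECONFIG"
```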

SEE ALSO

5.15.49 - anywhere version

anywhere version

Get the eksctl anywhere version

Synopsis

This command prints the version of eksctl anywhere

anywhere version [flags]

Options

  -h, --help   help for version

Options inherited from parent commands

  -v, --verbosity int   Set the log level verbosity

SEE ALSO

6 - Community

Guidelines for community contribution

We work hard to provide a high-quality Kubernetes installer for EKS, and we greatly value feedback and contributions from our community. Please review the contribution guidelines before submitting any issues or pull requests to ensure we have all the necessary information to respond to your bug report or contribution effectively. If you have a concern with a security vulnerability, please review our reporting a vulnerability policy .

6.1 - Contributing Guidelines

How to best contribute to the project

Thank you for your interest in contributing to our project. Whether it’s a bug report, new feature, correction, or additional documentation, we greatly value feedback and contributions from our community.

Please read through this document before submitting any issues or pull requests to ensure we have all the necessary information to effectively respond to your bug report or contribution.

General Guidelines

Pull Requests

Make sure to keep Pull Requests small and functional to make them easier to review, understand, and look up in commit history. This repository uses “Squash and Commit” to keep our history clean and make it easier to revert changes based on PR.

Adding the appropriate documentation, unit tests and e2e tests as part of a feature is the responsibility of the feature owner, whether it is done in the same Pull Request or not.

Pull Requests should follow the “subject: message” format, where the subject describes what part of the code is being modified.
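
As an informal illustration of the “subject: message” format, the sketch below checks a title for a non-empty subject, a colon followed by a space, and a non-empty message. The function name is_valid_pr_title and the sample titles are hypothetical, and the exact rules are an assumption; the project may apply different or stricter checks.

```shell
# Hypothetical check: succeed only for titles shaped like "subject: message".
is_valid_pr_title() {
  case $1 in
    *": "*) ;;          # must contain a colon followed by a space
    *) return 1 ;;
  esac
  subject=${1%%: *}     # text before the first ": "
  message=${1#*: }      # text after the first ": "
  [ -n "$subject" ] && [ -n "$message" ]
}

# Sample titles (assumptions) showing one accepted and one rejected form.
for title in "docs: fix broken link in upgrade guide" "Fix broken link"; do
  if is_valid_pr_title "$title"; then
    echo "ok:      $title"
  else
    echo "invalid: $title"
  fi
done
```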

Refer to the template for more information on what goes into a PR description.

Design Docs

A contributor proposes a design with a PR on the repository to allow for revisions and discussions. If a design needs to be discussed before formulating a document for it, make use of GitHub Discussions to involve the community on the discussion.

GitHub Discussions

GitHub Discussions are used for feature requests (that don’t have actionable items/issues), questions, and anything else the community would like to share.

Categories:

  • Q/A - Questions
  • Proposals - Feature requests and other suggestions
  • Show and tell - Anything that the community would like to share
  • General - Everything else (possibly announcements as well)

GitHub Issues

GitHub Issues are used to file bugs, work items, and feature requests with actionable items/issues (Please refer to the “Reporting Bugs/Feature Requests” section below for more information).

Labels:

  • “<area>” - area of project that issue is related to (create, upgrade, flux, test, etc.)
  • “priority/p<n>” - priority of task based on following numbers
    • p0: need to do right away
    • p1: don’t have a set time but need to do
    • p2: not currently being tracked (backlog)
  • “status/<status>” - status of the issue (notstarted, implementation, etc.)
  • “kind/<kind>” - type of issue (bug, feature, enhancement, docs, etc.)

Refer to the template for more information on what goes into an issue description.

GitHub Milestones

GitHub Milestones are used to plan work that is currently being tracked.

  • next: changes for next release
  • next+1: won’t make next release but the following
  • techdebt: used to keep track of techdebt items, separate ongoing effort from release action items
  • oncall: used to keep track of issues needing active follow-up
  • backlog: items that don’t have a home in the others

GitHub Projects (or tasks within a GitHub Issue)

GitHub Projects are used to keep track of bigger features that are made up of a collection of issues. Certain features can also have a tracking issue that contains a checklist of tasks that link to other issues.

Reporting Bugs/Feature Requests

We welcome you to use the GitHub issue tracker to report bugs or suggest features that have actionable items/issues (as opposed to introducing a feature request on GitHub Discussions).

When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn’t already reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:

  • A reproducible test case or series of steps
  • The version of the code being used
  • Any modifications you’ve made relevant to the bug
  • Anything unusual about your environment or deployment

Contributing via Pull Requests

Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:

  1. You are working against the latest source on the main branch.
  2. You check existing open, and recently merged, pull requests to make sure someone else hasn’t addressed the problem already.
  3. You open an issue to discuss any significant work - we would hate for your time to be wasted.

To send us a pull request, please:

  1. Fork the repository.
  2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
  3. Ensure local tests pass.
  4. Commit to your fork using clear commit messages.
  5. Send us a pull request, answering any default questions in the pull request interface.
  6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.

GitHub provides additional documentation on forking a repository and creating a pull request.

Finding contributions to work on

Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any ‘help wanted’ and ‘good first issue’ issues is a great place to start.

Code of Conduct

This project has adopted the Amazon Open Source Code of Conduct . For more information see the Code of Conduct FAQ or contact opensource-codeofconduct@amazon.com with any additional questions or comments.

Security issue notifications

If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our vulnerability reporting page . Please do not create a public GitHub issue.

Licensing

See the LICENSE file for our project’s licensing. We will ask you to confirm the licensing of your contribution.

6.2 - Contributing to documentation

Guidelines for contributing to EKS Anywhere documentation

EKS Anywhere documentation uses the Hugo site generator and the Docsy theme. To get started contributing:

Style issues

  • EKS Anywhere: Always refer to EKS Anywhere as EKS Anywhere and NOT EKS-A or EKS-Anywhere.

  • Line breaks: Put each sentence on its own line and don’t do a line break in the middle of a sentence. We are using a modified Semantic Line Breaking in that we are requiring a break at the end of every sentence, but not at commas or other semantic boundaries.

  • Headings: Use sentence case in headings. So do “Cluster specification reference” and not “Cluster Specification Reference”

  • Cross references: To cross reference to another doc in the EKS Anywhere docs set, use relref in the link so that Hugo will test it and fail the build for links not found. Also, use relative paths to point to other content in the docs set. Here is an example of a cross reference (code and results):

      See the [troubleshooting section](/docs/tasks/troubleshoot/) page.
    

    See the troubleshooting section page.

  • Notes, Warnings, etc.: You can use this form for notes:

    {{% alert title="Note" color="primary" %}}

    <put note here, multiple paragraphs are allowed>

    {{% /alert %}}

  • Embedding content: If you want to read in content from a separate file, you can use the following format. Do this if you think the content might be useful in multiple pages:

    {{% content "./newfile.md" %}}

  • General style issues: Unless otherwise instructed, follow the Kubernetes Documentation Style Guide for formatting and presentation guidance.
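
The line-break rule in the list above can be spot-checked with a rough heuristic, sketched here under assumptions: the function flags any line where sentence-ending punctuation is followed by whitespace and a capital letter. The function name find_multi_sentence_lines and the DOC path are hypothetical, and abbreviations such as "e.g. VMware" will produce false positives, so treat the output as a list of lines to review rather than a list of errors.

```shell
# Hypothetical heuristic: print (with line numbers) lines that appear to
# contain more than one sentence.
find_multi_sentence_lines() {
  grep -nE '[.!?][[:space:]]+[A-Z]' "$1"
}

# Placeholder path (assumption); set DOC to the Markdown file you are editing.
DOC="${DOC:-docs/content/en/docs/example.md}"
if [ -f "$DOC" ]; then
  find_multi_sentence_lines "$DOC" || echo "No multi-sentence lines found in $DOC"
else
  echo "File not found: $DOC"
fi
```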

Where to put content

  • Images: Put all images into the EKS Anywhere GitHub site’s docs/static/images directory.
  • Yaml examples: Put full yaml file examples into the EKS Anywhere GitHub site’s docs/static/manifests directory. In kubectl examples, you can point to those files using: https://anywhere.eks.amazonaws.com/manifests/whatever.yaml
  • Generic instructions for creating a cluster should go into the getting started section in either:
  • Instructions that are specific to an EKS Anywhere provider should go into the appropriate provider section. Provider-specific sections are in the Reference sections for CloudStack , Bare Metal , and vSphere .
  • Workshop content should contain organized links to existing documentation pages. The workshop content should not duplicate existing documentation pages or contain guides that are not part of the main documentation.

Contributing docs for third-party solutions

To contribute documentation describing how to use third-party software products or projects with EKS Anywhere, follow these guidelines.

Docs for third-party software in EKS Anywhere

Documentation PRs for EKS Anywhere that describe third-party software that is included in EKS Anywhere are acceptable, provided they meet the quality standards described in the tips below. This includes:

  • Software bundled with EKS Anywhere (for example, Cilium docs )
  • Supported platforms on which EKS Anywhere runs (for example, VMware vSphere )
  • Curated software that is packaged by the EKS Anywhere project to run EKS Anywhere. This includes documentation for Harbor local registry, Ingress controller, and Prometheus, Grafana, and Fluentd monitoring and logging.

Docs for third-party software NOT in EKS Anywhere

Documentation for software that is not part of EKS Anywhere software can still be added to EKS Anywhere docs by meeting one of the following criteria:

  • Partners: Documentation PRs for software from vendors listed on the EKS Anywhere Partner page can be considered to add to the EKS Anywhere docs. Links point to partners from the Compare EKS Anywhere to EKS page and other content can be added to EKS Anywhere documentation for features from those partners. Contact the AWS container partner team if you are interested in becoming a partner: aws-container-partners@amazon.com
  • Cluster integrations: Separate, less stringent criteria can be met for a third-party vendor to be listed on the Add cluster integrations page.

Tips for contributing third-party docs

The Kubernetes docs project itself describes a similar approach to docs covering third-party software in the How Docs Handle Third Party and Dual Sourced Content blog. In line with these general guidelines, we recommend that even acceptable third-party docs contributions to EKS Anywhere:

  • Not be dual-sourced: The project does not allow content that is already published somewhere else. You can provide links to that content, if it is relevant. Heavily rewriting such content to be EKS Anywhere-specific might be acceptable.
  • Not be marketing oriented. The content shouldn’t sell third-party products or make vague claims of quality.
  • Not outside the scope of EKS Anywhere: Just because some projects or products of a partner are appropriate for EKS Anywhere docs, it doesn’t mean that any project or product by that partner can be documented in EKS Anywhere.
  • Stick to the facts: So, for example, docs about third-party software could say: “To set up load balancer ABC, do XYZ” or “Make these modifications to improve speed and efficiency.” It should not make blanket statements like: “ABC load balancer is the best one in the industry.”
  • EKS features: Features that relate to EKS which runs in AWS or requires an AWS account should link to the official documentation as much as possible.

6.3 - Code of Conduct

Details on the project code of conduct

This project has adopted the Amazon Open Source Code of Conduct . For more information, see the Code of Conduct FAQ or contact opensource-codeofconduct@amazon.com with any additional questions or comments.

6.4 - Project governance

Roles and responsibilities of the project

This document lays out the guidelines under which the EKS Anywhere project will be governed. The goal is to make sure that the roles and responsibilities are well-defined and clarify how decisions are made.

Roles

In the context of EKS Anywhere, we consider the following roles:

  • Users … everyone using EKS Anywhere, typically willing to provide feedback on EKS Anywhere by proposing features and/or filing issues.
  • Contributors … everyone contributing code, documentation, examples, testing infra, and participating in feature proposals as well as design discussions.
  • Maintainers … are responsible for engaging with and assisting contributors to iterate on the contributions until it reaches acceptable quality. Maintainers can decide whether the contributions can be accepted into the project or rejected.

Communication

The primary mechanism for communication will be via the #eks channel on the Kubernetes Slack community. All features and bug fixes will be tracked as issues in GitHub. All decisions will be documented in GitHub issues.

In the future, we may consider using a public mailing list, which can be better archived.

Release Management

The release process will be governed by AWS and will coincide with the release of EKS.

Roadmap Planning

Maintainers will share roadmap and release versions as milestones in GitHub.

7 - Welcome to EKS Anywhere Workshop!

Steps through setting up and using EKS Anywhere

The intent of this workshop is to educate users about EKS Anywhere and its different use cases. As part of this workshop, we also cover how to provision and manage EKS Anywhere clusters, run workloads, and leverage observability tools like Prometheus and Grafana to monitor the EKS Anywhere cluster. We recommend this workshop for Cloud Architects, SREs, DevOps engineers, and other IT Professionals.

7.1 - Introduction

The following topics are covered as part of this chapter:

  • EKS Anywhere service overview
  • Benefits & service considerations
  • Frequently asked questions (FAQs)

7.1.1 - Overview

What is the purpose of this workshop?

The purpose of this workshop is to provide a more prescriptive walkthrough of building, deploying, and operating an EKS Anywhere cluster. It uses existing content from the documentation, in a more condensed format for those wishing to get started.

EKS Anywhere Overview

Amazon EKS Anywhere is a new deployment option for Amazon EKS that allows customers to create and operate Kubernetes clusters on customer-managed infrastructure, supported by AWS. Customers can now run Amazon EKS Anywhere on their own on-premises infrastructure using Bare Metal, CloudStack, or VMware vSphere.

Amazon EKS Anywhere helps simplify the creation and operation of on-premises Kubernetes clusters with default component configurations while providing tools for automating cluster management. It builds on the strengths of Amazon EKS Distro: the same Kubernetes distribution that powers Amazon EKS on AWS. AWS supports all Amazon EKS Anywhere components including the integrated 3rd-party software, so that customers can reduce their support costs and avoid maintenance of redundant open-source and third-party tools. In addition, Amazon EKS Anywhere gives customers on-premises Kubernetes operational tooling that’s consistent with Amazon EKS. You can leverage the EKS console to view all of your Kubernetes clusters (including EKS Anywhere clusters) running anywhere, through the EKS Connector (public preview).


7.1.2 - Benefits & Use cases

Here are some key customer benefits of using Amazon EKS Anywhere:

  • Simplify on-premises Kubernetes management - Amazon EKS Anywhere helps simplify the creation and operation of on-premises Kubernetes clusters with default component configurations while providing tools for automating cluster management.
  • One stop support - AWS supports all Amazon EKS Anywhere components including the integrated 3rd-party software, so that customers can reduce their support costs and avoid maintenance of redundant open-source and third-party tools.
  • Consistent and reliable - Amazon EKS Anywhere gives you on-premises Kubernetes operational tooling that’s consistent with Amazon EKS. It builds on the strengths of Amazon EKS Distro and provides open-source software that’s up-to-date and patched, so you can have a Kubernetes environment on-premises that is more reliable than self-managed Kubernetes offerings.

Use-cases supported by EKS Anywhere

EKS Anywhere is suitable for the following use-cases:

  • Hybrid cloud consistency - You may have lots of Kubernetes workloads on Amazon EKS but also need to operate Kubernetes clusters on-premises. Amazon EKS Anywhere offers strong operational consistency with Amazon EKS so you can standardize your Kubernetes operations based on a unified toolset.
  • Disconnected environment - You may need to secure your applications in disconnected environment or run applications in areas without internet connectivity. Amazon EKS Anywhere allows you to deploy and operate highly-available clusters with the same Kubernetes distribution that powers Amazon EKS on AWS.
  • Application modernization - Amazon EKS Anywhere empowers you to modernize your on-premises applications, removing the heavy lifting of keeping up with upstream Kubernetes and security patches, so you can focus on your core business value.
  • Data sovereignty - You may want to keep your large data sets on-premises due to legal requirements concerning the location of the data. Amazon EKS Anywhere brings the trusted Amazon EKS Kubernetes distribution and tools to where your data needs to be.

7.1.3 - Customer FAQ

AuthN / AuthZ

How do my applications running on EKS Anywhere authenticate with AWS services using IAM credentials?

You can leverage the IAM Role for Service Account (IRSA) feature. See the IRSA reference guide for details.

Does EKS Anywhere support OIDC (including Azure AD and AD FS)?

Yes, EKS Anywhere can create clusters that support API server OIDC authentication. This means you can federate authentication through AD FS locally or through Azure AD, along with other IDPs that support the OIDC standard. In order to add OIDC support to your EKS Anywhere clusters, you need to configure your cluster by updating the configuration file before creating the cluster. Please see the OIDC reference for details.

Does EKS Anywhere support LDAP?

EKS Anywhere does not support LDAP out of the box. However, you can look into the Dex LDAP Connector.

Can I use AWS IAM for Kubernetes resource access control on EKS Anywhere?

Yes, you can install the aws-iam-authenticator on your EKS Anywhere cluster to achieve this.

Miscellaneous

How much does EKS Anywhere cost?

EKS Anywhere is free, open source software that you can download, install on your existing hardware, and run in your own data centers. It includes management and CLI tooling for all supported cluster topologies on all supported providers. You are responsible for providing infrastructure where EKS Anywhere runs (e.g. VMware, bare metal), and some providers require third party hardware and software contracts.

The EKS Anywhere Enterprise Subscription provides access to curated packages and enterprise support. This is an optional, but recommended, cost based on how many clusters and how many years of support you need.

Can I connect my EKS Anywhere cluster to EKS?

Yes, you can install EKS Connector to connect your EKS Anywhere cluster to AWS EKS. EKS Connector is a software agent that you can install on the EKS Anywhere cluster that enables the cluster to communicate back to AWS. Once connected, you can immediately see a read-only view of the EKS Anywhere cluster with workload and cluster configuration information on the EKS console, alongside your EKS clusters.

How does the EKS Connector authenticate with AWS?

During start-up, the EKS Connector generates and stores an RSA key-pair as Kubernetes secrets. It also registers with AWS using the public key and the activation details from the cluster registration configuration file. The EKS Connector needs AWS credentials to receive commands from AWS and to send the response back. Whenever it requires AWS credentials, it uses its private key to sign the request and invokes AWS APIs to request the credentials.

How does the EKS Connector authenticate with my Kubernetes cluster?

The EKS Connector acts as a proxy and forwards the EKS console requests to the Kubernetes API server on your cluster. In the initial release, the connector uses impersonation with its service account secrets to interact with the API server. Therefore, you need to associate the connector’s service account with a ClusterRole, which gives permission to impersonate AWS IAM entities.

How do I enable an AWS user account to view my connected cluster through the EKS console?

For each AWS user or other IAM identity, you should add a cluster role binding to the Kubernetes cluster with the appropriate permission for that IAM identity. Additionally, each of these IAM entities should be associated with the IAM policy to invoke the EKS Connector on the cluster.

Can I use Amazon Controllers for Kubernetes (ACK) on EKS Anywhere?

Yes, you can leverage AWS services from your EKS Anywhere clusters on-premises through Amazon Controllers for Kubernetes (ACK).

Can I deploy EKS Anywhere on other clouds?

EKS Anywhere can be installed on any infrastructure with the required Bare Metal, CloudStack, or VMware vSphere components. See the EKS Anywhere Bare Metal, CloudStack, or vSphere documentation.

How is EKS Anywhere different from ECS Anywhere?

Amazon ECS Anywhere is an option for Amazon Elastic Container Service (ECS) to run containers on your on-premises infrastructure. The ECS Anywhere Control Plane runs in an AWS region and allows you to install the ECS agent on worker nodes that run outside of an AWS region. Workloads that run on ECS Anywhere nodes are scheduled by ECS. You are not responsible for running, managing, or upgrading the ECS Control Plane.

EKS Anywhere runs the Kubernetes Control Plane and worker nodes on your infrastructure. You are responsible for managing the EKS Anywhere Control Plane and worker nodes. There is no requirement to have an AWS account to run EKS Anywhere.

If you’d like to see how EKS Anywhere compares to EKS please see the information here.

How can I manage EKS Anywhere at scale?

You can perform cluster life cycle and configuration management at scale through GitOps-based tools. EKS Anywhere offers git-driven cluster management through the integrated Flux Controller. See the Manage cluster with GitOps documentation for details.

Can I run EKS Anywhere on ESXi?

No. EKS Anywhere is only supported on providers listed on the Create production cluster page. There would need to be a change to the upstream project to support ESXi.

Can I deploy EKS Anywhere on a single node?

Yes. Single node cluster deployment is supported for Bare Metal. See workerNodeGroupConfigurations.

7.2 - Provisioning

This chapter walks through the following:

  • Overview of provisioning
  • Prerequisites for creating an EKS Anywhere cluster
  • Provisioning a new EKS Anywhere cluster
  • Verifying the cluster installation

7.2.1 - Overview

EKS Anywhere creates a Kubernetes cluster on premises to a chosen provider. Supported providers include Bare Metal (via Tinkerbell), CloudStack, and vSphere. To manage that cluster, you can run cluster create and delete commands from an Ubuntu or Mac Administrative machine.

Creating a cluster involves downloading EKS Anywhere tools to an Administrative machine, then running the eksctl anywhere create cluster command to deploy that cluster to the provider. A temporary bootstrap cluster runs on the Administrative machine to direct the target cluster creation. For a detailed description, see Cluster creation workflow.

Here’s a diagram that explains the process visually.

EKS Anywhere create cluster overview


Next steps:

7.2.2 - Admin machine setup

EKS Anywhere will create and manage Kubernetes clusters on multiple providers. Currently we support creating development clusters locally using Docker and production clusters from providers listed on the Create production cluster page.

Creating an EKS Anywhere cluster begins with setting up an Administrative machine where you will run Docker and add some binaries. From there, you create the cluster for your chosen provider. See Create cluster workflow for an overview of the cluster creation process.

To create an EKS Anywhere cluster you will need eksctl and the eksctl-anywhere plugin. This will let you create a cluster in multiple providers for local development or production workloads.

NOTE: For the Snow provider, the Snow devices come with a pre-configured Admin AMI, which can be used to create an Admin instance with all the necessary binaries, dependencies, and artifacts to create an EKS Anywhere cluster. Skip the steps below and see Create Snow production cluster to get started with EKS Anywhere on Snow.

Administrative machine prerequisites

  • Docker 20.x.x
  • Mac OS 10.15 / Ubuntu 20.04.2 LTS (See Note on newer Ubuntu versions)
  • 4 CPU cores
  • 16GB memory
  • 30GB free disk space
  • Administrative machine must be on the same Layer 2 network as the cluster machines (Bare Metal provider only).
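
The hardware minimums above can be checked with a short script. This is a minimal sketch under assumptions, not an official preflight tool: it reads the CPU count with getconf, total memory from /proc/meminfo on Linux or sysctl on macOS, and free disk space with df for the directory in CHECK_DIR (a hypothetical variable that defaults to the current directory). It does not check the Docker version, the operating system release, or Layer 2 connectivity.

```shell
# Minimums taken from the prerequisites list above.
MIN_CPUS=4
MIN_MEM_GB=16
MIN_DISK_GB=30
CHECK_DIR="${CHECK_DIR:-.}"   # hypothetical: directory whose free space is checked

# usage: check LABEL ACTUAL MINIMUM
check() {
  if [ "$2" -ge "$3" ] 2>/dev/null; then
    echo "ok:   $1 = $2 (minimum $3)"
  else
    echo "FAIL: $1 = $2 (minimum $3)"
    return 1
  fi
}

# Number of online CPU cores (works on Linux and macOS).
cpus=$(getconf _NPROCESSORS_ONLN)

# Total memory in KiB: /proc/meminfo on Linux, sysctl on macOS.
if [ -r /proc/meminfo ]; then
  mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
else
  mem_kb=$(( $(sysctl -n hw.memsize) / 1024 ))
fi
# Round up, because the kernel reports slightly less than the installed RAM.
mem_gb=$(( (mem_kb + 1048575) / 1048576 ))

# Free disk space in GiB (column 4 of POSIX df output is the available KiB).
disk_gb=$(df -Pk "$CHECK_DIR" | awk 'NR == 2 { print int($4 / 1048576) }')

status=0
check "CPU cores" "$cpus" "$MIN_CPUS" || status=1
check "Memory (GB)" "$mem_gb" "$MIN_MEM_GB" || status=1
check "Free disk (GB)" "$disk_gb" "$MIN_DISK_GB" || status=1

if [ "$status" -eq 0 ]; then
  echo "Hardware minimums met."
else
  echo "One or more hardware minimums are not met."
fi
```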

If you are using Ubuntu, use the Docker CE installation instructions to install Docker and not the Snap installation, as described here.

If you are using Ubuntu 21.10 or 22.04, you will need to switch from cgroups v2 to cgroups v1. For details, see Troubleshooting Guide.
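
To see which cgroup version a Linux machine is currently using, one common approach is to inspect the filesystem type mounted at /sys/fs/cgroup. The sketch below assumes GNU stat (as shipped with Ubuntu), where cgroup2fs indicates cgroups v2 and tmpfs indicates cgroups v1. The function name cgroup_version is hypothetical.

```shell
# Map the filesystem type of /sys/fs/cgroup to a cgroup version.
cgroup_version() {
  case $1 in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

# GNU stat: -f reports on the filesystem, -c %T prints its type name.
fs_type=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo "unavailable")
echo "cgroup version: $(cgroup_version "$fs_type")"
```

If this reports v2 on Ubuntu 21.10 or 22.04, follow the Troubleshooting Guide to switch to cgroups v1.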

If you are using Docker Desktop, you need to know that:

  • For EKS Anywhere Bare Metal, Docker Desktop is not supported
  • For EKS Anywhere vSphere, if you are using Docker Desktop 4.4.2 or newer on Mac OS, "deprecatedCgroupv1": true must be set in ~/Library/Group\ Containers/group.com.docker/settings.json.
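
A quick way to confirm the Docker Desktop setting on Mac OS is to search the settings file for the key. This is a minimal sketch that assumes the settings file is plain JSON at the path given above; the function name has_cgroupv1_setting is hypothetical, and the pattern match is a simple text search rather than a full JSON parse.

```shell
# Succeed if the given settings file sets "deprecatedCgroupv1" to true.
has_cgroupv1_setting() {
  grep -Eq '"deprecatedCgroupv1"[[:space:]]*:[[:space:]]*true' "$1"
}

# Path from the note above (the directory name contains a space).
SETTINGS="$HOME/Library/Group Containers/group.com.docker/settings.json"

if [ -f "$SETTINGS" ]; then
  if has_cgroupv1_setting "$SETTINGS"; then
    echo "deprecatedCgroupv1 is enabled"
  else
    echo "deprecatedCgroupv1 is not enabled in $SETTINGS"
  fi
else
  echo "Docker Desktop settings file not found at $SETTINGS"
fi
```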

Install EKS Anywhere CLI tools

Via Homebrew (macOS and Linux)

You can install eksctl and eksctl-anywhere with Homebrew. This package will also install kubectl and the aws-iam-authenticator, which will be helpful to test EKS Anywhere clusters.

brew install aws/tap/eks-anywhere

Manually (macOS and Linux)

Install the latest release of eksctl. The EKS Anywhere plugin requires eksctl version 0.66.0 or newer.

curl "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" \
    --silent --location \
    | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin/
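
To confirm that the installed eksctl meets the 0.66.0 minimum, you could compare version strings as sketched below. The `version_ge` helper is a hypothetical name, the sketch assumes a `sort` that supports `-V`, and it assumes `eksctl version` prints a plain version string (an optional leading "v" is tolerated).

```shell
# version_ge A B
# Succeeds when version A is greater than or equal to version B.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v eksctl >/dev/null 2>&1; then
  installed="$(eksctl version)"
  installed="${installed#v}"   # assumption: strip an optional leading "v"
  if version_ge "$installed" "0.66.0"; then
    echo "eksctl $installed is new enough for the EKS Anywhere plugin"
  else
    echo "eksctl $installed is older than 0.66.0; install a newer release"
  fi
fi
```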

Install the eksctl-anywhere plugin.

RELEASE_VERSION=$(curl https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml --silent --location | yq ".spec.latestVersion")
EKS_ANYWHERE_TARBALL_URL=$(curl https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml --silent --location | yq ".spec.releases[] | select(.version==\"$RELEASE_VERSION\").eksABinary.$(uname -s | tr A-Z a-z).uri")
curl $EKS_ANYWHERE_TARBALL_URL \
    --silent --location \
    | tar xz ./eksctl-anywhere
sudo mv ./eksctl-anywhere /usr/local/bin/

Install the kubectl Kubernetes command line tool. This can be done by following the instructions here.

Or you can install the latest kubectl directly with the following.

export OS="$(uname -s | tr A-Z a-z)" ARCH=$(test "$(uname -m)" = 'x86_64' && echo 'amd64' || echo 'arm64')
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/${OS}/${ARCH}/kubectl"
sudo mv ./kubectl /usr/local/bin
sudo chmod +x /usr/local/bin/kubectl

Upgrade eksctl-anywhere

If you installed eksctl-anywhere via homebrew you can upgrade the binary with

brew update
brew upgrade aws/tap/eks-anywhere

If you installed eksctl-anywhere manually you should follow the installation steps to download the latest release.

You can verify your installed version with

eksctl anywhere version

Prepare for airgapped deployments (optional)

When creating an EKS Anywhere cluster, there may be times where you need to do so in an airgapped environment. In this type of environment, cluster nodes are connected to the Admin Machine, but not to the internet. In order to download images and artifacts, however, the Admin machine needs to be temporarily connected to the internet.

An airgapped environment is especially important if you require the most secure networks. EKS Anywhere supports airgapped installation for creating clusters using a registry mirror. For airgapped installation to work, the Admin machine must have:

  • Temporary access to the internet to download images and artifacts
  • Ample space (80 GB or more) to store artifacts locally

To create a cluster in an airgapped environment, perform the following:

  1. Download the artifacts and images that will be used by the cluster nodes to the Admin machine using the following command:

    eksctl anywhere download artifacts
    

    A compressed file eks-anywhere-downloads.tar.gz will be downloaded.

  2. To decompress this file, use the following command:

    tar -xvf eks-anywhere-downloads.tar.gz
    

    This will create an eks-anywhere-downloads folder that we’ll be using later.

  3. In order for the next command to run smoothly, ensure that Docker has been pre-installed and is running. Then run the following:

    eksctl anywhere download images -o images.tar
    

    For the remaining steps, the Admin machine no longer needs to be connected to the internet or the bastion host.

  4. Next, you will need to set up a local registry mirror to host the downloaded EKS Anywhere images. In order to set one up, refer to Registry Mirror configuration.

  5. Now that you’ve configured your local registry mirror, you will need to import images to the local registry mirror using the following command (be sure to replace <registryUrl> with the URL of the local registry mirror you created in step 4):

    eksctl anywhere import images -i images.tar -r <registryUrl> \
       --bundles ./eks-anywhere-downloads/bundle-release.yaml
    

You are now ready to deploy a cluster by following instructions to Create local cluster or Create production cluster.

See text below for specific provider instructions.

For Bare Metal (Tinkerbell)

You will need to have hookOS and its OS artifacts downloaded and served locally from an HTTP file server. You will also need to modify the hookImagesURLPath and the osImageURL in the cluster configuration files. Ensure that the structure of the files is set up as described in hookImagesURLPath.

For vSphere

If you are using the vSphere provider, be sure that the requirements in the Prerequisite checklist have been met.

Deploy a cluster

Once you have the tools installed you can deploy a local cluster or production cluster in the next steps.

7.2.3 - Local cluster setup

EKS Anywhere docker provider deployments

EKS Anywhere supports a Docker provider for development and testing use cases only. This allows you to try EKS Anywhere on your local system before deploying to a supported provider to create either:

  • A single, standalone cluster or
  • Multiple management/workload clusters on the same provider, as described in Cluster topologies. The management/workload topology is recommended for production clusters and can be tried out here using both eksctl and GitOps tools.

Create a standalone cluster

Prerequisite Checklist

To install the EKS Anywhere binaries and see system requirements, please follow the installation guide.

Steps

  1. Generate a cluster config

    CLUSTER_NAME=mgmt
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider docker > $CLUSTER_NAME.yaml
    

    The command above creates a file named $CLUSTER_NAME.yaml (mgmt.yaml in this example) with the contents below in the path where it is executed. The configuration specification is divided into two sections:

    • Cluster
    • DockerDatacenterConfig
    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
       name: mgmt
    spec:
       clusterNetwork:
          cniConfig:
             cilium: {}
          pods:
             cidrBlocks:
                - 192.168.0.0/16
          services:
             cidrBlocks:
                - 10.96.0.0/12
       controlPlaneConfiguration:
          count: 1
       datacenterRef:
          kind: DockerDatacenterConfig
          name: mgmt
       externalEtcdConfiguration:
          count: 1
       kubernetesVersion: "1.25"
       managementCluster:
          name: mgmt
       workerNodeGroupConfigurations:
          - count: 1
            name: md-0
    ---
    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: DockerDatacenterConfig
    metadata:
       name: mgmt
    spec: {}
    
    
    • Apart from the base configuration, you can add additional optional configuration to enable supported features:
  2. Configure Curated Packages

    The Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. To request a free trial, talk to your Amazon representative or connect with one here. Cluster creation will succeed if authentication is not set up, but some warnings may be generated. Detailed package configurations can be found here.

    If you are going to use packages, set up authentication. These credentials should have limited capabilities:

    export EKSA_AWS_ACCESS_KEY_ID="your_access_id"
    export EKSA_AWS_SECRET_ACCESS_KEY="your_secret_key"
    export EKSA_AWS_REGION="us-west-2"
    

    NOTE: The Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. Due to this there might be some warnings in the CLI if proper authentication is not set up.

  3. Create Cluster:

    For a regular cluster create (with internet access), type the following:

    # To install curated packages at cluster creation, add: --install-packages packages.yaml
    eksctl anywhere create cluster \
       -f $CLUSTER_NAME.yaml
    

    For an airgapped cluster create, follow the Preparation for airgapped deployments instructions, then type the following:

    # To install curated packages at cluster creation, add: --install-packages packages.yaml
    eksctl anywhere create cluster \
       -f $CLUSTER_NAME.yaml \
       --bundles-override ./eks-anywhere-downloads/bundle-release.yaml
    

    Example command output

    Performing setup and validations
    ✅ validation succeeded {"validation": "docker Provider setup is valid"}
    Creating new bootstrap cluster
    Installing cluster-api providers on bootstrap cluster
    Provider specific setup
    Creating new workload cluster
    Installing networking on workload cluster
    Installing cluster-api providers on workload cluster
    Moving cluster management from bootstrap to workload cluster
    Installing EKS-A custom components (CRD and controller) on workload cluster
    Creating EKS-A CRDs instances on workload cluster
    Installing GitOps Toolkit on workload cluster
    GitOps field not specified, bootstrap flux skipped
    Deleting bootstrap cluster
    🎉 Cluster created!
    ----------------------------------------------------------------------------------
    The Amazon EKS Anywhere Curated Packages are only available to customers with the
    Amazon EKS Anywhere Enterprise Subscription
    ----------------------------------------------------------------------------------
    Installing curated packages controller on management cluster
    secret/aws-secret created
    job.batch/eksa-auth-refresher created
    

    NOTE: to install curated packages during cluster creation, use --install-packages packages.yaml flag

  4. Use the cluster

    Once the cluster is created you can use it with the generated KUBECONFIG file in your local directory

    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    kubectl get ns
    

    Example command output

    NAME                                STATUS   AGE
    capd-system                         Active   21m
    capi-kubeadm-bootstrap-system       Active   21m
    capi-kubeadm-control-plane-system   Active   21m
    capi-system                         Active   21m
    capi-webhook-system                 Active   21m
    cert-manager                        Active   22m
    default                             Active   23m
    eksa-packages                       Active   23m
    eksa-system                         Active   20m
    kube-node-lease                     Active   23m
    kube-public                         Active   23m
    kube-system                         Active   23m
    

    You can now use the cluster like you would any Kubernetes cluster. Deploy the test application with:

    kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
    

    Verify the test application in the deploy test application section.

Create management/workload clusters

To try the recommended EKS Anywhere topology, you can create a management cluster and one or more workload clusters on the same Docker provider.

Prerequisite Checklist

To install the EKS Anywhere binaries and see system requirements, please follow the installation guide.

Create a management cluster

  1. Generate a management cluster config (named mgmt for this example):

    CLUSTER_NAME=mgmt
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider docker > eksa-mgmt-cluster.yaml
    
  2. Modify the management cluster config (eksa-mgmt-cluster.yaml). You can use the same configuration described earlier, or modify it to use GitOps, as shown below:

    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: Cluster
    metadata:
      name: mgmt
      namespace: default
    spec:
      bundlesRef:
        apiVersion: anywhere.eks.amazonaws.com/v1alpha1
        name: bundles-1
        namespace: eksa-system
      clusterNetwork:
        cniConfig:
          cilium: {}
        pods:
          cidrBlocks:
          - 192.168.0.0/16
        services:
          cidrBlocks:
          - 10.96.0.0/12
      controlPlaneConfiguration:
        count: 1
      datacenterRef:
        kind: DockerDatacenterConfig
        name: mgmt
      externalEtcdConfiguration:
        count: 1
      gitOpsRef:
        kind: FluxConfig
        name: mgmt
      kubernetesVersion: "1.25"
      managementCluster:
        name: mgmt
      workerNodeGroupConfigurations:
      - count: 1
        name: md-1
    

    ---
    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: DockerDatacenterConfig
    metadata:
      name: mgmt
      namespace: default
    spec: {}
    ---
    apiVersion: anywhere.eks.amazonaws.com/v1alpha1
    kind: FluxConfig
    metadata:
      name: mgmt
      namespace: default
    spec:
      branch: main
      clusterConfigPath: clusters/mgmt
      github:
        owner: <your github account, such as example for https://github.com/example>
        personal: true
        repository: <your github repo, such as test for https://github.com/example/test>
      systemNamespace: flux-system


  3. Configure Curated Packages

    The Amazon EKS Anywhere Curated Packages are only available to customers with the Amazon EKS Anywhere Enterprise Subscription. To request a free trial, talk to your Amazon representative or connect with one here. Cluster creation will succeed if authentication is not set up, but some warnings may be generated. Detailed package configurations can be found here.

    If you are going to use packages, set up authentication. These credentials should have limited capabilities:

    export EKSA_AWS_ACCESS_KEY_ID="your_access_id"
    export EKSA_AWS_SECRET_ACCESS_KEY="your_secret_key"
    
  4. Create cluster:

    For a regular cluster create (with internet access), type the following:

    # To install curated packages at cluster creation, add: --install-packages packages.yaml
    eksctl anywhere create cluster \
       -f eksa-mgmt-cluster.yaml
    

    For an airgapped cluster create, follow the Preparation for airgapped deployments instructions, then type the following:

    # To install curated packages at cluster creation, add: --install-packages packages.yaml
    eksctl anywhere create cluster \
       -f eksa-mgmt-cluster.yaml \
       --bundles-override ./eks-anywhere-downloads/bundle-release.yaml
    
  5. Once the cluster is created you can use it with the generated KUBECONFIG file in your local directory:

    export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
    
  6. Check the initial cluster’s CRD:

    To ensure you are looking at the initial cluster, list the CRD to see that the name of its management cluster is itself:

    kubectl get clusters mgmt -o yaml
    

    Example command output

    ...
    kubernetesVersion: "1.25"
    managementCluster:
      name: mgmt
    workerNodeGroupConfigurations:
    ...
    

Create separate workload clusters

Follow these steps to have your management cluster create and manage separate workload clusters.

  1. Generate a workload cluster config:

    CLUSTER_NAME=w01
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
       --provider docker > eksa-w01-cluster.yaml
    

    Refer to the initial config described earlier for the required and optional settings.

    NOTE: Ensure workload cluster object names (Cluster, DockerDatacenterConfig, etc.) are distinct from management cluster object names. Be sure to set the managementCluster field to identify the name of the management cluster.

  2. Create a workload cluster in one of the following ways:

    • GitOps: See Manage separate workload clusters with GitOps

    • Terraform: See Manage separate workload clusters with Terraform

    • Kubernetes CLI: The cluster lifecycle feature lets you use kubectl to manage a workload cluster. For example:

      kubectl apply -f eksa-w01-cluster.yaml 
      
    • eksctl CLI: Useful for temporary cluster configurations. To create a workload cluster with eksctl, do one of the following. For a regular cluster create (with internet access), type the following:

      # To install curated packages at cluster creation, add: --install-packages packages.yaml
      eksctl anywhere create cluster \
         -f eksa-w01-cluster.yaml \
         --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
      

      For an airgapped cluster create, follow the Preparation for airgapped deployments instructions, then type the following:

      # To install curated packages at cluster creation, add: --install-packages packages.yaml
      eksctl anywhere create cluster \
         -f eksa-w01-cluster.yaml \
         --bundles-override ./eks-anywhere-downloads/bundle-release.yaml \
         --kubeconfig mgmt/mgmt-eks-a-cluster.kubeconfig
      

      As noted earlier, adding the --kubeconfig option tells eksctl to use the management cluster identified by that kubeconfig file to create a different workload cluster.

  3. To check the workload cluster, get the workload cluster credentials and run a test workload:

    • If your workload cluster was created with eksctl, change your credentials to point to the new workload cluster (for example, w01), then run the test application with:

      export CLUSTER_NAME=w01
      export KUBECONFIG=${PWD}/${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
      kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
      
    • If your workload cluster was created with GitOps or Terraform, you can get credentials and run the test application as follows:

      kubectl get secret -n eksa-system w01-kubeconfig -o jsonpath='{.data.value}' | base64 --decode > w01.kubeconfig
      export KUBECONFIG=w01.kubeconfig
      kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
      

      NOTE: For Docker, you must modify the server field of the kubeconfig file by replacing the IP with 127.0.0.1 and the port with its value. The port’s value can be found by running docker ps and checking the workload cluster’s load balancer.

  4. Add more workload clusters:

    To add more workload clusters, go through the same steps for creating the initial workload, copying the config file to a new name (such as eksa-w02-cluster.yaml), modifying resource names, and running the create cluster command again.
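
The Docker note in step 3 above asks you to point the kubeconfig server field at 127.0.0.1 and the load balancer's published port. One way to do that is sketched below. The `rewrite_server` helper, the port 55000, and the output file name are hypothetical; take the real port from the docker ps output for the workload cluster's load balancer.

```shell
# rewrite_server PORT
# Reads a kubeconfig on stdin and rewrites its "server: https://IP:PORT" line
# so that it points at 127.0.0.1 and the given port.
rewrite_server() {
  sed -E "s#^([[:space:]]*server:[[:space:]]*https://)[^:/]+:[0-9]+#\1127.0.0.1:$1#"
}

# Example usage (55000 is a placeholder for the port shown by docker ps):
#   rewrite_server 55000 < w01.kubeconfig > w01-local.kubeconfig
#   export KUBECONFIG=w01-local.kubeconfig
```

Writing to a new file keeps the original kubeconfig intact in case the port changes after the load balancer container is recreated.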

Next steps:

  • See the Cluster management section for more information on common operational tasks like scaling and deleting the cluster.

  • See the Package management section for more information on post-creation curated packages installation.

To verify that a cluster control plane is up and running, use the kubectl command to show that the control plane pods are all running.

kubectl get po -A -l control-plane=controller-manager
NAMESPACE                           NAME                                                             READY   STATUS    RESTARTS   AGE
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager-57b99f579f-sd85g       2/2     Running   0          47m
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager-79cdf98fb8-ll498   2/2     Running   0          47m
capi-system                         capi-controller-manager-59f4547955-2ks8t                         2/2     Running   0          47m
capi-webhook-system                 capi-controller-manager-bb4dc9878-2j8mg                          2/2     Running   0          47m
capi-webhook-system                 capi-kubeadm-bootstrap-controller-manager-6b4cb6f656-qfppd       2/2     Running   0          47m
capi-webhook-system                 capi-kubeadm-control-plane-controller-manager-bf7878ffc-rgsm8    2/2     Running   0          47m
capi-webhook-system                 capv-controller-manager-5668dbcd5-v5szb                          2/2     Running   0          47m
capv-system                         capv-controller-manager-584886b7bd-f66hs                         2/2     Running   0          47m

You may also check the status of the cluster control plane resource directly. This can be especially useful to verify clusters with multiple control plane nodes after an upgrade.

kubectl get kubeadmcontrolplanes.controlplane.cluster.x-k8s.io
NAME                       INITIALIZED   API SERVER AVAILABLE   VERSION              REPLICAS   READY   UPDATED   UNAVAILABLE
supportbundletestcluster   true          true                   v1.20.7-eks-1-20-6   1          1       1

To verify that the expected number of cluster worker nodes are up and running, use the kubectl command to show that nodes are Ready.

Worker nodes are named using the cluster name followed by the worker node group name (example: my-cluster-md-0).

kubectl get nodes
NAME                                           STATUS   ROLES                  AGE    VERSION
supportbundletestcluster-md-0-55bb5ccd-mrcf9   Ready    <none>                 4m   v1.20.7-eks-1-20-6
supportbundletestcluster-md-0-55bb5ccd-zrh97   Ready    <none>                 4m   v1.20.7-eks-1-20-6
supportbundletestcluster-mdrwf                 Ready    control-plane,master   5m   v1.20.7-eks-1-20-6
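
If you want to check the worker count from a script rather than by eye, the node listing above can be filtered by name prefix and status. This is a minimal sketch; the `count_ready_workers` helper is a hypothetical name, and it relies on the worker naming convention described above (cluster name followed by the worker node group name).

```shell
# count_ready_workers PREFIX
# Reads `kubectl get nodes --no-headers` output on stdin and prints how many
# nodes have a name starting with PREFIX and a STATUS of exactly "Ready".
count_ready_workers() {
  awk -v prefix="$1" 'index($1, prefix) == 1 && $2 == "Ready" { n++ } END { print n + 0 }'
}

# Example usage against a live cluster:
#   kubectl get nodes --no-headers | count_ready_workers "supportbundletestcluster-md-0"
```

Compare the printed number with the count set for that group under workerNodeGroupConfigurations in your cluster config.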

To test a workload in your cluster you can try deploying the hello-eks-anywhere test application.

7.2.4 - Preparation needed for hosting EKS Anywhere on vSphere

Certain resources must be in place with appropriate user permissions to create an EKS Anywhere cluster using the vSphere provider.

Configuring Folder Resources

Create a VM folder:

For each user that needs to create workload clusters, have the vSphere administrator create a VM folder. That folder will host:

  • The VMs of the Control plane and Data plane nodes of each cluster.
  • A nested folder for the management cluster and another one for each workload cluster.
  • Each cluster VM in its own nested folder under this folder.

Follow these steps to create the user’s vSphere folder:

  1. From vCenter, select the Menus/VM and Template tab.
  2. Select either a datacenter or another folder as a parent object for the folder that you want to create.
  3. Right-click the parent object and click New Folder.
  4. Enter a name for the folder and click OK. For more details, see the vSphere Create a Folder documentation.

Configuring vSphere User, Group, and Roles

You need a vSphere user with the right privileges to let you create EKS Anywhere clusters on top of your vSphere cluster.

Configure via EKSA CLI

To configure a new user via CLI, you will need two things:

  • a set of vSphere admin credentials with the ability to create users and groups. If you do not have the rights to create new groups and users, you can invoke govc commands directly as outlined here.
  • a user.yaml file:
apiVersion: "eks-anywhere.amazon.com/v1"
kind: vSphereUser
spec:
  username: "eksa"                # optional, default eksa
  group: "MyExistingGroup"        # optional, default EKSAUsers
  globalRole: "MyGlobalRole"      # optional, default EKSAGlobalRole
  userRole: "MyUserRole"          # optional, default EKSAUserRole
  adminRole: "MyEKSAAdminRole"    # optional, default EKSACloudAdminRole
  datacenter: "MyDatacenter"
  vSphereDomain: "vsphere.local"  # this should be the domain used when you login, e.g. YourUsername@vsphere.local
  connection:
    server: "https://my-vsphere.internal.acme.com"
    insecure: false
  objects:
    networks:
      - !!str "/MyDatacenter/network/My Network"
    datastores:
      - !!str "/MyDatacenter/datastore/MyDatastore2"
    resourcePools:
      - !!str "/MyDatacenter/host/Cluster-03/MyResourcePool" # NOTE: see below if you do not want to use a resource pool
    folders:
      - !!str "/MyDatacenter/vm/OrgDirectory/MyVMs"
    templates:
      - !!str "/MyDatacenter/vm/Templates/MyTemplates"

NOTE: if you do not want to create a resource pool, you can instead specify the cluster directly as /MyDatacenter/host/Cluster-03 in user.yaml, where Cluster-03 is your cluster name. In your cluster spec, you will need to specify /MyDatacenter/host/Cluster-03/Resources for the resourcePool field.

Set the admin credentials as environment variables:

export EKSA_VSPHERE_USERNAME=<ADMIN_VSPHERE_USERNAME>
export EKSA_VSPHERE_PASSWORD=<ADMIN_VSPHERE_PASSWORD>

If the user does not already exist, you can create the user and all the specified group and role objects by running:

eksctl anywhere exp vsphere setup user -f user.yaml --password '<NewUserPassword>'

If the user or any of the group or role objects already exist, use the force flag instead to overwrite Group-Role-Object mappings for the group, roles, and objects specified in the user.yaml config file:

eksctl anywhere exp vsphere setup user -f user.yaml --force

Please note that there is one more manual step to configure global permissions here.

Configure via govc

If you do not have the rights to create a new user, you can still configure the necessary roles and permissions using the govc cli.

#! /bin/bash
# govc calls to configure a user with minimal permissions
set -x
set -e

EKSA_USER='<Username>@<UserDomain>'
USER_ROLE='EKSAUserRole'
GLOBAL_ROLE='EKSAGlobalRole'
ADMIN_ROLE='EKSACloudAdminRole'

FOLDER_VM='/YourDatacenter/vm/YourVMFolder'
FOLDER_TEMPLATES='/YourDatacenter/vm/Templates'

NETWORK='/YourDatacenter/network/YourNetwork'
DATASTORE='/YourDatacenter/datastore/YourDatastore'
RESOURCE_POOL='/YourDatacenter/host/Cluster-01/Resources/YourResourcePool'

govc role.create "$GLOBAL_ROLE" $(curl https://raw.githubusercontent.com/aws/eks-anywhere/main/pkg/config/static/globalPrivs.json | jq .[] | tr '\n' ' ' | tr -d '"')

govc role.create "$USER_ROLE" $(curl https://raw.githubusercontent.com/aws/eks-anywhere/main/pkg/config/static/eksUserPrivs.json | jq .[] | tr '\n' ' ' | tr -d '"')

govc role.create "$ADMIN_ROLE" $(curl https://raw.githubusercontent.com/aws/eks-anywhere/main/pkg/config/static/adminPrivs.json | jq .[] | tr '\n' ' ' | tr -d '"')

govc permissions.set -group=false -principal "$EKSA_USER" -role "$GLOBAL_ROLE" /

govc permissions.set -group=false -principal "$EKSA_USER" -role "$ADMIN_ROLE" "$FOLDER_VM"

govc permissions.set -group=false -principal "$EKSA_USER" -role "$ADMIN_ROLE" "$FOLDER_TEMPLATES"

govc permissions.set -group=false -principal "$EKSA_USER" -role "$USER_ROLE" "$NETWORK"

govc permissions.set -group=false -principal "$EKSA_USER" -role "$USER_ROLE" "$DATASTORE"

govc permissions.set -group=false -principal "$EKSA_USER" -role "$USER_ROLE" "$RESOURCE_POOL"

NOTE: if you do not want to create a resource pool, you can instead specify the cluster directly as /MyDatacenter/host/Cluster-03 in user.yaml, where Cluster-03 is your cluster name. In your cluster spec, you will need to specify /MyDatacenter/host/Cluster-03/Resources for the resourcePool field.

Please note that there is one more manual step to configure global permissions here.

Configure via UI

Add a vCenter User

Ask your vSphere administrator to add a vCenter user that will be used for the provisioning of the EKS Anywhere cluster in VMware vSphere.

  1. Log in with the vSphere Client to the vCenter Server.
  2. Specify the user name and password for a member of the vCenter Single Sign-On Administrators group.
  3. Navigate to the vCenter Single Sign-On user configuration UI.
    • From the Home menu, select Administration.
    • Under Single Sign On, click Users and Groups.
  4. If vsphere.local is not the currently selected domain, select it from the drop-down menu. You cannot add users to other domains.
  5. On the Users tab, click Add.
  6. Enter a user name and password for the new user. The maximum number of characters allowed for the user name is 300, and you cannot change the user name after you create the user. The password must meet the password policy requirements for the system.
  7. Click Add.

For more details, see the vSphere Add vCenter Single Sign-On Users documentation.

Create and define user roles

When you add a user for creating clusters, that user initially has no privileges to perform management operations. So you have to add this user to groups with the required permissions, or assign a role or roles with the required permission to this user.

Three roles are needed to be able to create the EKS Anywhere cluster:

  1. Create a global custom role: For example, you could name this EKS Anywhere Global. Define it for the user on the vCenter domain level and its children objects. Create this role with the following privileges:

    > Content Library
    * Add library item
    * Check in a template
    * Check out a template
    * Create local library
    * Update files
    > vSphere Tagging
    * Assign or Unassign vSphere Tag
    * Assign or Unassign vSphere Tag on Object
    * Create vSphere Tag
    * Create vSphere Tag Category
    * Delete vSphere Tag
    * Delete vSphere Tag Category
    * Edit vSphere Tag
    * Edit vSphere Tag Category
    * Modify UsedBy Field For Category
    * Modify UsedBy Field For Tag
    > Sessions
    * Validate session
    
  2. Create a user custom role: The second role is also a custom role that you could call, for example, EKSAUserRole. Define this role with the following objects and children objects.

    • The resource pool level and its children objects. This is the resource pool that the EKS Anywhere VMs will be part of.
    • The storage object level and its children objects. This is the storage that will be used to store the cluster VMs.
    • The network VLAN object level and its children objects. This is the network that will host the cluster VMs.
    • The VM and Template folder level and its children objects.

    Create this role with the following privileges:

    > Content Library
    * Add library item
    * Check in a template
    * Check out a template
    * Create local library
    > Datastore
    * Allocate space
    * Browse datastore
    * Low level file operations
    > Folder
    * Create folder
    > vSphere Tagging
    * Assign or Unassign vSphere Tag
    * Assign or Unassign vSphere Tag on Object
    * Create vSphere Tag
    * Create vSphere Tag Category
    * Delete vSphere Tag
    * Delete vSphere Tag Category
    * Edit vSphere Tag
    * Edit vSphere Tag Category
    * Modify UsedBy Field For Category
    * Modify UsedBy Field For Tag
    > Network
    * Assign network
    > Resource
    * Assign virtual machine to resource pool
    > Scheduled task
    * Create tasks
    * Modify task
    * Remove task
    * Run task
    > Profile-driven storage
    * Profile-driven storage view
    > Storage views
    * View
    > vApp
    * Import
    > Virtual machine
    * Change Configuration
      - Add existing disk
      - Add new disk
      - Add or remove device
      - Advanced configuration
      - Change CPU count
      - Change Memory
      - Change Settings
      - Configure Raw device
      - Extend virtual disk
      - Modify device settings
      - Remove disk
    * Edit Inventory
      - Create from existing
      - Create new
      - Remove
    * Interaction
      - Power off
      - Power on
    * Provisioning
      - Clone template
      - Clone virtual machine
      - Create template from virtual machine
      - Customize guest
      - Deploy template
      - Mark as template
      - Read customization specifications
    * Snapshot management
      - Create snapshot
      - Remove snapshot
      - Revert to snapshot
    
  3. Create a default Administrator role: The third role is the default system role Administrator, which you assign to the user on the folder level and its children objects (VMs and OVA templates). This is the folder that the vSphere administrator created for you.

    To create a role and define privileges, check the Create a vCenter Server Custom Role and Defined Privileges pages.

Manually set Global Permissions role in Global Permissions UI

vSphere does not currently support a public API for setting global permissions. Because of this, you will need to manually assign the Global Role you created to your user or group in the Global Permissions UI.

Deploy an OVA Template

If the user creating the cluster has permission and network access to create and tag a template, you can skip these steps because EKS Anywhere will automatically download the OVA and create the template if it can. If the user does not have the permissions or network access to create and tag the template, follow this guide. The OVA contains the operating system (Ubuntu, Bottlerocket, or RHEL) for a specific EKS Distro Kubernetes release and EKS Anywhere version. The following example uses Ubuntu as the operating system, but a similar workflow would work for Bottlerocket or RHEL.

Steps to deploy the OVA

  1. Go to the artifacts page and download or build the OVA template with the newest EKS Distro Kubernetes release to your computer.

  2. Log in to the vCenter Server.
  3. Right-click the folder you created above and select Deploy OVF Template. The Deploy OVF Template wizard opens.
  4. On the Select an OVF template page, select the Local file option, specify the location of the OVA template you downloaded to your computer, and click Next.
  5. On the Select a name and folder page, enter a unique name for the virtual machine or leave the default generated name, if you do not have other templates with the same name within your vCenter Server virtual machine folder. The default deployment location for the virtual machine is the inventory object where you started the wizard, which is the folder you created above. Click Next.
  6. On the Select a compute resource page, select the resource pool in which to run the deployed VM template, and click Next.
  7. On the Review details page, verify the OVF or OVA template details and click Next.
  8. On the Select storage page, select a datastore to store the deployed OVF or OVA template and click Next.
  9. On the Select networks page, select a source network and map it to a destination network. Click Next.
  10. On the Ready to complete page, review the page and click Finish. For details, see Deploy an OVF or OVA Template

To build your own Ubuntu OVA template, see the Building your own Ubuntu OVA section.

To use the deployed OVA template to create the VMs for the EKS Anywhere cluster, you have to tag it with specific values for the os and eksdRelease keys. The value of the os key is the operating system of the deployed OVA template, which is ubuntu in our scenario. The value of the eksdRelease key identifies the Kubernetes version and the EKS Distro release used in the deployed OVA template. See the Customize OVAs page for more details.

Steps to tag the deployed OVA template:

  1. Go to the artifacts page and note the tags and values associated with the OVA template you deployed in the previous step.

  2. In the vSphere Client, select Menu > Tags & Custom Attributes.
  3. Select the Tags tab and click Tags.
  4. Click New.
  5. In the Create Tag dialog box, enter the os tag name you noted for your OVA, which in our case is os:ubuntu, as the name of the first required tag.
  6. Specify the tag category os if it exists, or create it if it does not.
  7. Click Create.
  8. Repeat steps 2-4.
  9. In the Create Tag dialog box, enter the eksdRelease tag name you noted for your OVA, which in our case is eksdRelease:kubernetes-1-21-eks-8, as the name of the second required tag.
  10. Specify the tag category eksdRelease if it exists, or create it if it does not.
  11. Click Create.
  12. Navigate to the VM and Template tab.
  13. Select the folder that was created.
  14. Select the deployed template and click Actions.
  15. From the drop-down menu, select Tags and Custom Attributes > Assign Tag.
  16. Select the tags we created from the list and confirm the operation.

To run EKS Anywhere, you will need:

Prepare Administrative machine

Set up an Administrative machine as described in Install EKS Anywhere.

Prepare a VMware vSphere environment

To prepare a VMware vSphere environment to run EKS Anywhere, you need the following:

  • A vSphere 7+ environment running vCenter

  • Capacity to deploy 6-10 VMs

  • DHCP service running in the vSphere environment in the primary VM network for your workload cluster

  • One network in vSphere to use for the cluster. EKS Anywhere clusters need access to vCenter through the network to enable self-managing and storage capabilities.

  • An OVA imported into vSphere and converted into a template for the workload VMs

  • User credentials to create VMs and attach networks, etc

  • One IP address routable from the cluster but excluded from the DHCP offering. This IP address is used as the Control Plane Endpoint IP.

    Below are some suggestions to ensure that this IP address is never handed out by your DHCP server. You may need to contact your network engineer.

    • Pick an IP address reachable from the cluster subnet that is excluded from the DHCP range, OR
    • Alter the DHCP ranges to leave out one or more IP addresses at the top and/or the bottom of the range, OR
    • Create an IP reservation for this IP on your DHCP server. This is usually accomplished by adding a dummy mapping of this IP address to a non-existent MAC address.
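
The first suggestion above can be checked with a little arithmetic. The following is a minimal sketch, assuming IPv4 addresses and a single contiguous DHCP range; the function names `ip_to_int` and `outside_dhcp_range` and the sample addresses are hypothetical and not part of EKS Anywhere.

```shell
# Convert a dotted-quad IPv4 address to an integer so addresses can be compared.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds (exit status 0) when the candidate address lies outside the DHCP range.
# Arguments: candidate address, first DHCP address, last DHCP address.
outside_dhcp_range() {
  ip=$(ip_to_int "$1")
  start=$(ip_to_int "$2")
  end=$(ip_to_int "$3")
  [ "$ip" -lt "$start" ] || [ "$ip" -gt "$end" ]
}

# Hypothetical values: the DHCP server hands out 10.0.0.50 through 10.0.0.200
if outside_dhcp_range 10.0.0.10 10.0.0.50 10.0.0.200; then
  echo "10.0.0.10 is outside the DHCP range"
fi
```

This only confirms that the address is outside the range you supply; it does not verify that the address is unused or reachable from the cluster subnet.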

Each VM will require:

  • 2 vCPUs
  • 8GB RAM
  • 25GB Disk
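
As a rough capacity check, the per-VM figures above can be multiplied by the number of VMs you plan to deploy. The sketch below assumes every VM uses the minimum sizing listed above; the function name `cluster_capacity` is hypothetical and not part of EKS Anywhere.

```shell
# Print the aggregate vCPU, RAM, and disk needed for a given number of VMs,
# assuming each VM uses the minimum sizing listed above (2 vCPUs, 8 GB, 25 GB).
cluster_capacity() {
  vm_count=$1
  echo "vcpus=$(( vm_count * 2 )) ram_gb=$(( vm_count * 8 )) disk_gb=$(( vm_count * 25 ))"
}

# The lower and upper ends of the 6-10 VM range mentioned earlier
cluster_capacity 6
cluster_capacity 10
```

Larger node sizes or additional workload clusters increase these totals proportionally.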

The administrative machine and the target workload environment will need network access to:

  • vCenter endpoint (must be accessible to EKS Anywhere clusters)
  • public.ecr.aws
  • anywhere-assets.eks.amazonaws.com (to download the EKS Anywhere binaries, manifests and OVAs)
  • distro.eks.amazonaws.com (to download EKS Distro binaries and manifests)
  • d2glxqk2uabbnd.cloudfront.net (for EKS Anywhere and EKS Distro ECR container images)
  • api.ecr.us-west-2.amazonaws.com (for EKS Anywhere package authentication matching your region)
  • d5l0dvt14r5h8.cloudfront.net (for EKS Anywhere package ECR container images)
  • api.github.com (only if GitOps is enabled)

vSphere information needed before creating the cluster

You need to get the following information before creating the cluster:

  • Static IP Addresses: You will need one IP address for the management cluster control plane endpoint, and a separate one for the control plane of each workload cluster you add.

    Let’s say you are going to have the management cluster and two workload clusters. For those, you would need three IP addresses, one for each. All of those addresses will be configured the same way in the configuration file you will generate for each cluster.

    A static IP address will be used for each control plane VM in your EKS Anywhere cluster. Choose IP addresses in your network range that do not conflict with other VMs and make sure they are excluded from your DHCP offering.

    An IP address will be the value of the property controlPlaneConfiguration.endpoint.host in the config file of the management cluster. A separate IP address must be assigned for each workload cluster.

    Import ova wizard

  • vSphere Datacenter Name: The vSphere datacenter to deploy the EKS Anywhere cluster on.

    Import ova wizard

  • VM Network Name: The VM network to deploy your EKS Anywhere cluster on.

    Import ova wizard

  • vCenter Server Domain Name: The vCenter server fully qualified domain name or IP address. If the server IP is used, the thumbprint must be set or insecure must be set to true.

    Import ova wizard

  • thumbprint (required if insecure=false): The SHA1 thumbprint of the vCenter server certificate which is only required if you have a self-signed certificate for your vSphere endpoint.

    There are several ways to obtain your vCenter thumbprint. If you have govc installed, you can run the following command in the Administrative machine terminal, and take note of the output:

    govc about.cert -thumbprint -k
    
  • template: The VM template to use for your EKS Anywhere cluster. This template was created when you imported the OVA file into vSphere.

    Import ova wizard

  • datastore: The vSphere datastore to deploy your EKS Anywhere cluster on.

    Import ova wizard

  • folder: The folder parameter in VSphereMachineConfig allows you to organize the VMs of an EKS Anywhere cluster. With this, each cluster can be organized as a folder in vSphere. You will have a separate folder for the management cluster and each cluster you are adding.

    Import ova wizard

  • resourcePool: The vSphere resource pool for the VMs in your EKS Anywhere cluster. If there is a resource pool, its path has the form /<datacenter>/host/<resource-pool-name>/Resources

    Import ova wizard
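
Before pasting the thumbprint noted above into the cluster config, it can help to confirm that it has the shape of a SHA-1 fingerprint: 20 two-digit hexadecimal bytes separated by colons. This is a minimal sketch; the function name `is_sha1_thumbprint` is hypothetical, and the sample value is the illustrative thumbprint used in this guide rather than a real certificate.

```shell
# Succeeds when the argument looks like a SHA-1 fingerprint:
# 20 two-digit hexadecimal bytes separated by colons.
is_sha1_thumbprint() {
  printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){19}[0-9A-Fa-f]{2}$'
}

# Illustrative value only; substitute the thumbprint of your own vCenter server
thumbprint='BF:B5:D4:C5:72:E4:04:40:F7:22:99:05:12:F5:0B:0E:D7:A6:35:36'
if is_sha1_thumbprint "$thumbprint"; then
  echo "thumbprint format looks valid"
else
  echo "unexpected thumbprint format" >&2
fi
```

The check covers the format only; it cannot tell whether the value matches the certificate that your vCenter server actually presents.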

7.2.5 - vSphere cluster

EKS Anywhere supports a vSphere provider for production grade EKS Anywhere deployments. EKS Anywhere allows you to provision and manage Amazon EKS on your own infrastructure.

This document walks you through setting up EKS Anywhere in a way that:

  • Deploys an initial cluster on your vSphere environment. That cluster can be used as a self-managed cluster (to run workloads) or a management cluster (to create and manage other clusters)
  • Deploys zero or more workload clusters from the management cluster

If your initial cluster is a management cluster, it is intended to stay in place so you can use it later to modify, upgrade, and delete workload clusters. Using a management cluster makes it faster to provision and delete workload clusters. Also it lets you keep vSphere credentials for a set of clusters in one place: on the management cluster. The alternative is to simply use your initial cluster to run workloads.

Prerequisite Checklist

EKS Anywhere needs to be run on an administrative machine that has certain machine requirements. An EKS Anywhere deployment will also require the availability of certain resources from your VMware vSphere deployment.

Steps

The following steps are divided into two sections:

  • Create an initial cluster (used as a management or self-managed cluster)
  • Create zero or more workload clusters from the management cluster

Create an initial cluster

Follow these steps to create an EKS Anywhere cluster that can be used either as a management cluster or as a self-managed cluster (for running workloads itself).

All steps listed below should be executed on the admin machine with reachability to the vSphere environment where the EKS Anywhere clusters are created.

  1. Generate an initial cluster config (named mgmt-cluster for this example):

    export MGMT_CLUSTER_NAME=mgmt-cluster
    eksctl anywhere generate clusterconfig $MGMT_CLUSTER_NAME \
       --provider vsphere > $MGMT_CLUSTER_NAME.yaml
    

    The command above creates a config file named mgmt-cluster.yaml in the path where it is executed. Refer to vsphere configuration for information on configuring this cluster config for a vSphere provider.

    The configuration specification is divided into three sections:

    • Cluster
    • VSphereDatacenterConfig
    • VSphereMachineConfig

    Some key considerations and configuration parameters:

    • Create at least two control plane nodes, three worker nodes, and three etcd nodes for a production cluster, to provide high availability and rolling upgrades.

    • osFamily (operating System on virtual machines) parameter in VSphereMachineConfig by default is set to bottlerocket. Permitted values: ubuntu, bottlerocket.

    • The recommended mode of deploying etcd on EKS Anywhere production clusters is unstacked (etcd members have dedicated machines and are not collocated with control plane components). More information here. The generated config file comes with external etcd enabled already. So leave this part as it is.

    • Apart from the base configuration, you can optionally add additional configuration to enable supported EKS Anywhere functionalities.

      At present, you must decide which features to enable before you create the cluster. Enabling them after creation requires deleting and recreating the cluster. However, the next EKS-A release will remove this limitation.

    • To manage cluster resources using GitOps, enable the GitOps configuration on the initial/management cluster. You cannot enable GitOps on workload clusters as long as you have enabled it on the initial/management cluster. If you want to manage the deployment of Kubernetes resources on a workload cluster, you need to bootstrap Flux against that workload cluster manually so that you can deploy Kubernetes resources to it using GitOps.

  2. Modify the initial cluster generated config (mgmt-cluster.yaml) as follows: You will notice that the generated config file comes with the following fields with empty values. All you need is to fill them with the values we gathered in the prerequisites page.

    • Cluster: controlPlaneConfiguration.endpoint.host: ""

      controlPlaneConfiguration:
         count: 3
         endpoint:
            # Fill this value with the IP address you want to use for the management
            # cluster control plane endpoint. You will also need a separate one for the
            # control plane of each workload cluster you add later.
            host: "" 
      
    • VSphereDatacenterConfig:

      datacenter: "" # Fill it with the vSphere Datacenter Name. Example: "Example Datacenter"
      insecure: false
      network: "" # Fill it with VM Network Name. Example: "/Example Datacenter/network/VLAN035"
      server: "" # Fill it with the vCenter Server Domain Name. Example: "sample.exampledomain.com"
      thumbprint: "" # Fill it with the thumbprint of your vCenter server. Example: "BF:B5:D4:C5:72:E4:04:40:F7:22:99:05:12:F5:0B:0E:D7:A6:35:36"
      
    • VSphereMachineConfig sections:

      datastore: "" # Fill in the vSphere datastore name: Example "/Example Datacenter/datastore/LabStorage"
      diskGiB: 25
      # Fill in the folder name that the VMs of the cluster will be organized under.
      # You will have a separate folder for the management cluster and each cluster you are adding.
      folder: "" # Fill in the folder name. Example: /Example Datacenter/vm/EKS Anywhere/mgmt-cluster
      memoryMiB: 8192
      numCPUs: 2
      osFamily: ubuntu # You can set it to bottlerocket or ubuntu
      resourcePool: "" # Fill in the vSphere Resource pool. Example: /Example Datacenter/host/Lab/Resources
      
      • Remove the users property; it will be generated automatically during cluster creation. The username is set to capv if osFamily=ubuntu, and to ec2-user if osFamily=bottlerocket, which is the default option. An SSH key pair is also generated, which you can use later to connect to your cluster VMs.
      • Add the template property if you chose to import the EKS-A VM OVA template, and set it to the VM template you imported. See the vSphere preparation steps.
      template: /Example Datacenter/vm/EKS Anywhere/ubuntu-2004-kube-v1.21.2
      

    Refer to vsphere configuration for more information on the configuration options available for the vSphere provider.

  3. Set Credential Environment Variables

    Before you create the initial/management cluster, you will need to set and export these environment variables for your vSphere user name and password. Make sure you use single quotes around the values so that your shell does not interpret the values.

    # vCenter User Credentials
    export GOVC_URL='[vCenter Server Domain Name]'     # Example: https://sample.exampledomain.com
    export GOVC_USERNAME='[vSphere user name]'         # Example: USER1@exampledomain
    export GOVC_PASSWORD='[vSphere password]'                                     
    export GOVC_INSECURE=true
    export EKSA_VSPHERE_USERNAME='[vSphere user name]' # Example: USER1@exampledomain
    export EKSA_VSPHERE_PASSWORD='[vSphere password]'               
    
  4. Set License Environment Variable

    If you are creating a licensed cluster, set and export the license variable (see License cluster if you are licensing an existing cluster):

    export EKSA_LICENSE='my-license-here'
    
  5. Now you are ready to create a cluster with the basic settings.

    After you have finished adding the required configuration to the mgmt-cluster.yaml file and set your credential environment variables, you are ready to create the cluster. Run the create command with the option -v 9 to get the highest level of verbosity, which helps when troubleshooting any issue that occurs during cluster creation. You may also want to redirect the output to a file so you can review it later.

    eksctl anywhere create cluster -f $MGMT_CLUSTER_NAME.yaml \
    -v 9 > $MGMT_CLUSTER_NAME-$(date "+%Y%m%d%H%M").log 2>&1
    
  6. With the completion of the above steps, the management EKS Anywhere cluster is created on the configured vSphere environment under a sub-folder of the EKS Anywhere folder. You can see the cluster VMs from the vSphere console as below:

    Import ova wizard

  7. Once the cluster is created, a folder named after the cluster is created on the admin machine. It contains the kubeconfig file and the cluster configuration file used to create the cluster, in addition to the generated SSH key pair that you can use to SSH into the VMs of the cluster.

    ls mgmt-cluster/
    

    Output

    eks-a-id_rsa      mgmt-cluster-eks-a-cluster.kubeconfig
    eks-a-id_rsa.pub  mgmt-cluster-eks-a-cluster.yaml
    
  8. Now you can use your cluster with the generated KUBECONFIG file:

    export KUBECONFIG=${PWD}/${MGMT_CLUSTER_NAME}/${MGMT_CLUSTER_NAME}-eks-a-cluster.kubeconfig
    kubectl cluster-info
    

    The cluster endpoint in the output of this command would be the controlPlaneConfiguration.endpoint.host provided in the mgmt-cluster.yaml config file.

  9. Check the cluster nodes:

    To check that the cluster completed, list the machines to see the control plane, etcd, and worker nodes:

    kubectl get machines -A
    

    Example command output

    NAMESPACE   NAME                PROVIDERID        PHASE    VERSION
    eksa-system mgmt-b2xyz          vsphere:/xxxxx    Running  v1.21.2-eks-1-21-5
    eksa-system mgmt-etcd-r9b42     vsphere:/xxxxx    Running  
    eksa-system mgmt-md-8-6xr-rnr   vsphere:/xxxxx    Running  v1.21.2-eks-1-21-5
    ...
    

    The etcd machine doesn’t show the Kubernetes version because it doesn’t run the kubelet service.

  10. Check the initial/management cluster’s CRD:

    To ensure you are looking at the initial/management cluster, list the CRD to see that the name of its management cluster is itself:

    kubectl get clusters $MGMT_CLUSTER_NAME -o yaml
    

    Example command output

    ...
    kubernetesVersion: "1.21"
    managementCluster:
      name: mgmt-cluster
    workerNodeGroupConfigurations:
    ...
    

Create separate workload clusters

Follow these steps if you want to use your initial cluster to create and manage separate workload clusters. All steps listed below should be executed on the same admin machine on which the management cluster was created.

  1. Generate a workload cluster config:

    export WORKLOAD_CLUSTER_NAME='w01-cluster'
    export MGMT_CLUSTER_NAME='mgmt-cluster'
    eksctl anywhere generate clusterconfig $WORKLOAD_CLUSTER_NAME \
       --provider vsphere > $WORKLOAD_CLUSTER_NAME.yaml
    

    The command above creates a file named w01-cluster.yaml with similar contents to the mgmt-cluster.yaml file that was generated for the management cluster in the previous section. It will be generated in the path where it is executed.

    The key considerations and configuration parameters mentioned above for the initial cluster also apply to the workload cluster.

  2. Refer to the initial config described earlier for the required and optional settings. Ensure workload cluster object names (Cluster, VSphereDatacenterConfig, VSphereMachineConfig, etc.) are distinct from management cluster object names. Be sure to set the managementCluster field to identify the name of the management cluster.

  3. Modify the generated workload cluster config parameters the same way you did in the generated configuration file of the management cluster. The only differences are in the following fields:

    • controlPlaneConfiguration.endpoint.host: Use a different IP address for the Cluster field controlPlaneConfiguration.endpoint.host of each workload cluster. This address must also differ from the one used for the management cluster.

    • managementCluster.name: By default, the value of this field in the generated configuration file is the same as the cluster name. Because this workload cluster is to be managed by the management cluster, change the value to the management cluster name.

      managementCluster:
         name: mgmt-cluster # the name of the initial/management cluster
      
    • VSphereMachineConfig.folder: It’s recommended to have a separate folder path for each cluster you add for organization purposes.

      folder: /Example Datacenter/vm/EKS Anywhere/w01-cluster
      

    Other than that all other parameters will be configured the same way.

  4. Create a workload cluster

    To create a new workload cluster from your management cluster run this command, identifying:

    • The workload cluster yaml file
    • The initial cluster’s credentials (this causes the workload cluster to be managed from the management cluster)
    eksctl anywhere create cluster \
        -f $WORKLOAD_CLUSTER_NAME.yaml \
        --kubeconfig $MGMT_CLUSTER_NAME/$MGMT_CLUSTER_NAME-eks-a-cluster.kubeconfig \
        -v 9 > $WORKLOAD_CLUSTER_NAME-$(date "+%Y%m%d%H%M").log 2>&1
    

    As noted earlier, adding the --kubeconfig option tells eksctl to use the management cluster identified by that kubeconfig file to create a different workload cluster.

  5. With the completion of the above steps, the workload EKS Anywhere cluster is created on the configured vSphere environment under a sub-folder of the EKS Anywhere folder. You can see the cluster VMs from the vSphere console as below:

    Import ova wizard

  6. Once the cluster is created, a folder named after the cluster is created on the admin machine. It contains the kubeconfig file and the cluster configuration file used to create the cluster, in addition to the generated SSH key pair that you can use to SSH into the VMs of the cluster.

    ls w01-cluster/
    

    Output

    eks-a-id_rsa      w01-cluster-eks-a-cluster.kubeconfig
    eks-a-id_rsa.pub  w01-cluster-eks-a-cluster.yaml
    
  7. You can list the workload clusters managed by the management cluster.

    export KUBECONFIG=${PWD}/${MGMT_CLUSTER_NAME}/${MGMT_CLUSTER_NAME}-eks-a-cluster.kubeconfig
    kubectl get clusters
    
  8. Check the workload cluster:

    You can now use the workload cluster as you would any Kubernetes cluster. Change your credentials to point to the kubeconfig file of the new workload cluster, then get the cluster info.

    export KUBECONFIG=${PWD}/${WORKLOAD_CLUSTER_NAME}/${WORKLOAD_CLUSTER_NAME}-eks-a-cluster.kubeconfig
    kubectl cluster-info
    

    The cluster endpoint in the output of this command should be the controlPlaneConfiguration.endpoint.host provided in the w01-cluster.yaml config file.

  9. To verify that the expected number of cluster worker nodes are up and running, use the kubectl command to show that nodes are Ready.

    kubectl get nodes
    
  10. Test deploying an application with:

    kubectl apply -f "https://anywhere.eks.amazonaws.com/manifests/hello-eks-a.yaml"
    

    Verify the test application in the deploy test application section.

  11. Add more workload clusters:

    To add more workload clusters, go through the same steps for creating the initial workload cluster, copying the config file to a new name (such as w02-cluster.yaml), modifying resource names, and running the create cluster command again.

See the Cluster management section for more information on common operational tasks like scaling and deleting the cluster.
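
Step 3 of the workload cluster procedure above requires each cluster to use its own control plane endpoint IP. The sketch below shows one way to compare two config files before running the create command. It assumes the first `host:` line in each file is controlPlaneConfiguration.endpoint.host, and the function names `endpoint_host` and `distinct_endpoints` are hypothetical helpers, not eksctl features.

```shell
# Print the first `host:` value in a cluster config file. This is a naive
# line match, not a YAML parser; it strips inline comments and quotes.
endpoint_host() {
  sed -n 's/^[[:space:]]*host:[[:space:]]*//p' "$1" \
    | sed -e 's/[[:space:]]*#.*$//' -e 's/[[:space:]]*$//' -e 's/^"//' -e 's/"$//' \
    | head -n 1
}

# Succeeds when the two config files use different endpoint hosts.
distinct_endpoints() {
  [ "$(endpoint_host "$1")" != "$(endpoint_host "$2")" ]
}

# File names follow the examples above; the check runs only if both files exist.
if [ -f mgmt-cluster.yaml ] && [ -f w01-cluster.yaml ]; then
  if distinct_endpoints mgmt-cluster.yaml w01-cluster.yaml; then
    echo "endpoint hosts differ"
  else
    echo "endpoint hosts are identical; choose a different IP" >&2
  fi
fi
```

Two files whose host values are both still empty are reported as identical, which also flags configs that have not been filled in yet.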

7.3 - Curated Packages Workshop

Workshops for using curated packages.

7.3.1 - Prometheus use cases

7.3.1.1 - Prometheus with Grafana

This tutorial demonstrates how to configure the Prometheus package to scrape metrics from an EKS Anywhere cluster, and visualize them in Grafana.

This tutorial walks through the following procedures:

Install the Prometheus package

The Prometheus package creates two components by default:

  • Prometheus-server, which collects metrics from configured targets, and stores the metrics as time series data;
  • Node-exporter, which exposes a wide variety of hardware- and kernel-related metrics for prometheus-server (or an equivalent metrics collector, e.g. the ADOT collector) to scrape.

The prometheus-server is pre-configured to scrape the following targets at a 1m interval:

  • Kubernetes API servers
  • Kubernetes nodes
  • Kubernetes nodes cadvisor
  • Kubernetes service endpoints
  • Kubernetes services
  • Kubernetes pods
  • Prometheus-server itself

If no config modification is needed, a user can proceed to the Prometheus installation guide.

Prometheus Package Customization

In this section, we cover a few frequently-asked config customizations. After determining the appropriate customization, proceed to the Prometheus installation guide to complete the package installation. Also refer to Prometheus package spec for additional config options.

Change prometheus-server global configs

By default, prometheus-server is configured with evaluation_interval: 1m, scrape_interval: 1m, scrape_timeout: 10s. Those values can be overridden as needed.

The following config allows the user to do such customization:

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: generated-prometheus
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: prometheus
  config: |
    server:
      global:
        evaluation_interval: "30s"
        scrape_interval: "30s"
        scrape_timeout: "15s"    
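
When overriding these values, keep scrape_timeout no larger than scrape_interval, because Prometheus rejects a configuration in which the timeout exceeds the interval. The following sketch checks a pair of values before you apply the package config. The function names `to_seconds` and `timeout_fits_interval` are hypothetical, and only single-unit durations in seconds, minutes, or hours are handled.

```shell
# Convert a single-unit duration (for example 30s, 1m, 2h) to seconds.
# Compound or millisecond durations are not handled by this sketch.
to_seconds() {
  case "$1" in
    *ms) return 1 ;;
    *s) echo "${1%s}" ;;
    *m) echo $(( ${1%m} * 60 )) ;;
    *h) echo $(( ${1%h} * 3600 )) ;;
    *) return 1 ;;
  esac
}

# Succeeds when the scrape timeout does not exceed the scrape interval.
# Arguments: scrape_timeout, scrape_interval.
timeout_fits_interval() {
  timeout=$(to_seconds "$1") || return 1
  interval=$(to_seconds "$2") || return 1
  [ "$timeout" -le "$interval" ]
}

# Values from the example config above
if timeout_fits_interval 15s 30s; then
  echo "scrape_timeout fits within scrape_interval"
fi
```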

Run prometheus-server as statefulSets

By default, prometheus-server is created as a deployment with replicaCount equal to 1. If you need a replicaCount greater than 1, deploy prometheus-server as a statefulSet instead. This allows multiple prometheus-server pods to share the same data storage.

The following config allows the user to do such customization:

apiVersion: packages.eks.amazonaws.com/v1alpha1
kind: Package
metadata:
  name: generated-prometheus
  namespace: eksa-packages-<cluster-name>
spec:
  packageName: prometheus
  config: |
    server:
      replicaCount: 2
      statefulSet:
        enabled: true    

Disable prometheus-server and use node-exporter only

A user may disable the prometheus-server when:

  • they would like to use node-exporter to expose hardware- and kernel-related metrics, while
  • they have deployed another metrics collector in the cluster and configured a remote-write storage solution, which fulfills the prometheus-server functionality (check out the ADOT with Amazon Managed Prometheus and Amazon Managed Grafana workshop to learn how to do so).

The following config allows the user to do such customization:

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
      name: generated-prometheus
      namespace: eksa-packages-<cluster-name>
    spec:
      packageName: prometheus
      config: |
        server:
          enabled: false        

Disable node-exporter and use prometheus-server only

A user may disable the node-exporter when:

  • they would like to deploy multiple prometheus-server packages for a cluster, while
  • deploying at most one node-exporter instance per node.

The following config allows the user to do such customization:

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
      name: generated-prometheus
      namespace: eksa-packages-<cluster-name>
    spec:
      packageName: prometheus
      config: |
        nodeExporter:
          enabled: false        

Prometheus Package Test

To ensure the Prometheus package is installed correctly in the cluster, a user can perform the following tests.

Access prometheus-server web UI

Port forward Prometheus to local host 9090:

export PROM_SERVER_POD_NAME=$(kubectl get pods --namespace <namespace> -l "app=prometheus,component=server" -o jsonpath="{.items[0].metadata.name}")
kubectl port-forward $PROM_SERVER_POD_NAME -n <namespace> 9090

Go to http://localhost:9090 to access the web UI.

Run sample queries

Run sample queries in Prometheus web UI to confirm the targets have been configured properly. For example, a user can run the following query to obtain the CPU utilization rate by node.

100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100 )

The output will be displayed on the Graph tab.

Prometheus Grafana Import Dashboard
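
To read the sample query above: the irate of node_cpu_seconds_total{mode="idle"} is the fraction of each second a CPU spent idle, avg by(instance) averages that fraction across the CPUs of a node, and subtracting the idle percentage from 100 gives utilization. The arithmetic can be sketched as follows; the function name `cpu_utilization` and the sample idle fraction are hypothetical.

```shell
# Convert an idle fraction (0 to 1) into a CPU utilization percentage,
# mirroring the "100 - (idle * 100)" arithmetic of the sample query.
cpu_utilization() {
  LC_ALL=C awk -v idle="$1" 'BEGIN { printf "%.1f\n", 100 - (idle * 100) }'
}

# A node whose CPUs were idle 75% of the time over the sampled window
cpu_utilization 0.75
```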

Install Grafana helm charts

A user can install Grafana in the cluster to visualize the Prometheus metrics. The Grafana helm chart is used as an example below, though other deployment methods are also possible.

  1. Get helm chart repo info

    helm repo add grafana https://grafana.github.io/helm-charts
    helm repo update
    
  2. Install the helm chart

    helm install my-grafana grafana/grafana
    

Set up Grafana dashboards

Access Grafana web UI

  1. Obtain Grafana login password:

    kubectl get secret --namespace default my-grafana -o jsonpath="{.data.admin-password}" | base64 --decode; echo
    
  2. Port forward Grafana to local host 3000:

    export GRAFANA_POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=grafana,app.kubernetes.io/instance=my-grafana" -o jsonpath="{.items[0].metadata.name}")
    kubectl --namespace default port-forward $GRAFANA_POD_NAME 3000
    
  3. Go to http://localhost:3000 to access the web UI. Log in with username admin and the password obtained in step 1 above.

Add Prometheus data source

  1. Click on the Configuration sign on the left navigation bar, select Data sources, then choose Prometheus as the Data source.

    Prometheus Grafana Add Data Source

  2. Configure Prometheus data source with the following details:

    • Name: Prometheus as an example.
    • URL: http://<prometheus-server-end-point-name>.<namespace>:9090. If the package default values are used, this will be http://generated-prometheus-server.observability:9090.
    • Scrape interval: 1m or the value specified by user in the package config.
    • Select Save and test. A notification data source is working should be displayed.

    Prometheus Grafana Config Data Source

Import dashboard templates

  1. Import a dashboard template by hovering over the Dashboard sign on the left navigation bar and clicking Import. Type 315 in the Import via grafana.com textbox and select Import. From the dropdown at the bottom, select Prometheus and select Import.

    Prometheus Grafana Import Dashboard

  2. A Kubernetes cluster monitoring (via Prometheus) dashboard will be displayed.

    Prometheus Grafana View Dashboard Kubernetes

  3. Perform the same procedure for template 1860. A Node Exporter Full dashboard will be displayed.

    Prometheus Grafana View Dashboard Node Exporter

7.3.2 - ADOT use cases

7.3.2.1 - ADOT with AMP and AMG

This tutorial demonstrates how to configure the ADOT package to scrape metrics from an EKS Anywhere cluster, and send them to Amazon Managed Service for Prometheus (AMP) and Amazon Managed Grafana (AMG).

This tutorial walks through the following procedures:

Create an AMP workspace

An AMP workspace is created to receive metrics from the ADOT package, and respond to query requests from AMG. Follow the steps below to complete the setup:

  1. Open the AMP console at https://console.aws.amazon.com/prometheus/.

  2. Choose region us-west-2 from the top right corner.

  3. Click on Create to create a workspace.

  4. Type a workspace alias (adot-amp-test as an example), and click on Create workspace.

    ADOT AMP Create Workspace

  5. Make notes of the URLs displayed for Endpoint - remote write URL and Endpoint - query URL. You’ll need them when you configure your ADOT package to remote write metrics to this workspace and when you query metrics from this workspace. Make sure the workspace’s Status shows Active before proceeding to the next step.

    ADOT AMP Identify URLs

For additional options (e.g. through the CLI) and configurations (e.g. adding a tag) to create an AMP workspace, refer to the AWS AMP create a workspace guide.

Create a cluster with IRSA

To enable ADOT pods that run in EKS Anywhere clusters to authenticate with AWS services, a user needs to set up IAM Roles for Service Accounts (IRSA) at cluster creation. The EKS Anywhere cluster spec for Pod IAM gives step-by-step guidance on how to do so. There are a few things to keep in mind while working through the guide:

  1. While completing the step Create an OIDC provider, a user should:

    • create the S3 bucket in the us-west-2 region, and

    • attach an IAM policy with proper AMP access to the IAM role.

      Below is an example that gives full access to AMP actions and resources. Refer to AMP IAM permissions and policies guide for more customized options.

      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Action": [
                      "aps:*"
                  ],
                  "Effect": "Allow",
                  "Resource": "*"
              }
          ]
      }
      
  2. While completing the step deploy pod identity webhook, a user should:

    • make sure the service account is created in the same namespace as the ADOT package (which is controlled by the field spec.targetNamespace in the package definition file);
    • take note of the service account that gets created in this step, as it will be used in the ADOT package installation;
    • add an annotation eks.amazonaws.com/role-arn: <role-arn> to the created service account.

    By default, the service account is installed in the default namespace with name pod-identity-webhook, and the annotation eks.amazonaws.com/role-arn: <role-arn> is not added automatically.
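
Because the annotation is not added automatically, one way to add it is with `kubectl annotate`. The sketch below only prints the command so it can be reviewed first; `irsa_annotate_cmd` is a hypothetical helper name, and the account ID and role name in the ARN are placeholders rather than values from this guide.

```shell
# Hypothetical helper: print the kubectl command that adds the IRSA annotation.
# Nothing is executed against the cluster; review the output, then run it yourself.
# $1 = service account name, $2 = namespace, $3 = IAM role ARN
irsa_annotate_cmd() {
  service_account="$1"
  namespace="$2"
  role_arn="$3"
  printf 'kubectl annotate serviceaccount %s -n %s eks.amazonaws.com/role-arn=%s --overwrite\n' \
    "$service_account" "$namespace" "$role_arn"
}

# The account ID and role name below are placeholders.
irsa_annotate_cmd pod-identity-webhook default arn:aws:iam::111122223333:role/example-amp-role
```

The `--overwrite` flag is included on the assumption that re-running the command with a corrected ARN should replace an earlier value.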

IRSA Set Up Test

To ensure IRSA is set up properly in the cluster, a user can create an awscli pod for testing.

  1. Apply the following yaml file in the cluster:

    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: Pod
    metadata:
      name: awscli
    spec:
      serviceAccountName: pod-identity-webhook
      containers:
      - image: amazon/aws-cli
        command:
          - sleep
          - "infinity"
        name: awscli
        resources: {}
      dnsPolicy: ClusterFirst
      restartPolicy: Always
    EOF
    
  2. Exec into the pod:

    kubectl exec -it awscli -- /bin/bash
    
  3. Check if the pod can list AMP workspaces:

    aws amp list-workspaces --region=us-west-2
    
  4. If the pod has issues listing AMP workspaces, revisit the IRSA setup guidance before proceeding to the next step.

  5. Exit the pod:

    exit
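
If listing workspaces fails, a quick first check from inside the awscli pod is whether the webhook mutated the pod at all. The sketch below assumes the pod identity webhook injects the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables into pods that use an annotated service account; `check_irsa_env` is a hypothetical helper name.

```shell
# Hypothetical helper: report whether the IRSA-related environment variables are present.
# Returns 0 when both are set, 1 otherwise.
check_irsa_env() {
  missing=0
  for name in AWS_ROLE_ARN AWS_WEB_IDENTITY_TOKEN_FILE; do
    # Indirect lookup of the variable named in $name (POSIX sh has no ${!name}).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    else
      echo "set: $name"
    fi
  done
  return "$missing"
}

check_irsa_env || echo "IRSA variables are incomplete; check the service account annotation"
```

If both variables are present but AWS calls still fail, the IAM role's trust policy and attached permissions are the next things to review.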
    

Install the ADOT package

The ADOT package will be created with three components:

  1. the Prometheus Receiver, which is designed to be a drop-in replacement for a Prometheus Server and is capable of scraping metrics from microservices instrumented with the Prometheus client library;

  2. the Prometheus Remote Write Exporter, which uses the remote write feature to send metrics to AMP for long-term storage;

  3. the Sigv4 Authentication Extension, which enables ADOT pods to authenticate to AWS services.

Follow the steps below to complete the ADOT package installation:

  1. Update the following config file. Review the comments carefully and replace every value wrapped in <>. Note that this configuration aims to mimic the Prometheus community Helm chart. A user can tailor the scrape targets further by modifying the receivers section below. Refer to the ADOT package spec for additional explanations of each section.

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
      name: my-adot
      namespace: eksa-packages
    spec:
      packageName: adot
      targetNamespace: default # this needs to match the namespace of the serviceAccount below
      config: |
        mode: deployment
    
        serviceAccount:
          # Specifies whether a service account should be created
          create: false
          # Annotations to add to the service account
          annotations: {}
          # Specifies the serviceAccount annotated with eks.amazonaws.com/role-arn.
          name: "pod-identity-webhook" # name of the service account created at step Create a cluster with IRSA
    
        config:
          extensions:
            sigv4auth:
              region: "us-west-2"
              service: "aps"
              assume_role:
                sts_region: "us-west-2"
    
          receivers:
            # Scrape configuration for the Prometheus Receiver
            prometheus:
              config:
                global:
                  scrape_interval: 15s
                  scrape_timeout: 10s
                scrape_configs:
                - job_name: kubernetes-apiservers
                  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                  kubernetes_sd_configs:
                  - role: endpoints
                  relabel_configs:
                  - action: keep
                    regex: default;kubernetes;https
                    source_labels:
                    - __meta_kubernetes_namespace
                    - __meta_kubernetes_service_name
                    - __meta_kubernetes_endpoint_port_name
                  scheme: https
                  tls_config:
                    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                    insecure_skip_verify: false
                - job_name: kubernetes-nodes
                  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                  kubernetes_sd_configs:
                  - role: node
                  relabel_configs:
                  - action: labelmap
                    regex: __meta_kubernetes_node_label_(.+)
                  - replacement: kubernetes.default.svc:443
                    target_label: __address__
                  - regex: (.+)
                    replacement: /api/v1/nodes/$$1/proxy/metrics
                    source_labels:
                    - __meta_kubernetes_node_name
                    target_label: __metrics_path__
                  scheme: https
                  tls_config:
                    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                    insecure_skip_verify: false
                - job_name: kubernetes-nodes-cadvisor
                  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                  kubernetes_sd_configs:
                  - role: node
                  relabel_configs:
                  - action: labelmap
                    regex: __meta_kubernetes_node_label_(.+)
                  - replacement: kubernetes.default.svc:443
                    target_label: __address__
                  - regex: (.+)
                    replacement: /api/v1/nodes/$$1/proxy/metrics/cadvisor
                    source_labels:
                    - __meta_kubernetes_node_name
                    target_label: __metrics_path__
                  scheme: https
                  tls_config:
                    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                    insecure_skip_verify: false
                - job_name: kubernetes-service-endpoints
                  kubernetes_sd_configs:
                  - role: endpoints
                  relabel_configs:
                  - action: keep
                    regex: true
                    source_labels:
                    - __meta_kubernetes_service_annotation_prometheus_io_scrape
                  - action: replace
                    regex: (https?)
                    source_labels:
                    - __meta_kubernetes_service_annotation_prometheus_io_scheme
                    target_label: __scheme__
                  - action: replace
                    regex: (.+)
                    source_labels:
                    - __meta_kubernetes_service_annotation_prometheus_io_path
                    target_label: __metrics_path__
                  - action: replace
                    regex: ([^:]+)(?::\d+)?;(\d+)
                    replacement: $$1:$$2
                    source_labels:
                    - __address__
                    - __meta_kubernetes_service_annotation_prometheus_io_port
                    target_label: __address__
                  - action: labelmap
                    regex: __meta_kubernetes_service_annotation_prometheus_io_param_(.+)
                    replacement: __param_$$1
                  - action: labelmap
                    regex: __meta_kubernetes_service_label_(.+)
                  - action: replace
                    source_labels:
                    - __meta_kubernetes_namespace
                    target_label: kubernetes_namespace
                  - action: replace
                    source_labels:
                    - __meta_kubernetes_service_name
                    target_label: kubernetes_name
                  - action: replace
                    source_labels:
                    - __meta_kubernetes_pod_node_name
                    target_label: kubernetes_node
                - job_name: kubernetes-service-endpoints-slow
                  kubernetes_sd_configs:
                  - role: endpoints
                  relabel_configs:
                  - action: keep
                    regex: true
                    source_labels:
                    - __meta_kubernetes_service_annotation_prometheus_io_scrape_slow
                  - action: replace
                    regex: (https?)
                    source_labels:
                    - __meta_kubernetes_service_annotation_prometheus_io_scheme
                    target_label: __scheme__
                  - action: replace
                    regex: (.+)
                    source_labels:
                    - __meta_kubernetes_service_annotation_prometheus_io_path
                    target_label: __metrics_path__
                  - action: replace
                    regex: ([^:]+)(?::\d+)?;(\d+)
                    replacement: $$1:$$2
                    source_labels:
                    - __address__
                    - __meta_kubernetes_service_annotation_prometheus_io_port
                    target_label: __address__
                  - action: labelmap
                    regex: __meta_kubernetes_service_annotation_prometheus_io_param_(.+)
                    replacement: __param_$$1
                  - action: labelmap
                    regex: __meta_kubernetes_service_label_(.+)
                  - action: replace
                    source_labels:
                    - __meta_kubernetes_namespace
                    target_label: kubernetes_namespace
                  - action: replace
                    source_labels:
                    - __meta_kubernetes_service_name
                    target_label: kubernetes_name
                  - action: replace
                    source_labels:
                    - __meta_kubernetes_pod_node_name
                    target_label: kubernetes_node
                  scrape_interval: 5m
                  scrape_timeout: 30s
    
                - job_name: prometheus-pushgateway
                  kubernetes_sd_configs:
                  - role: service
                  relabel_configs:
                  - action: keep
                    regex: pushgateway
                    source_labels:
                    - __meta_kubernetes_service_annotation_prometheus_io_probe
                - job_name: kubernetes-services
                  kubernetes_sd_configs:
                  - role: service
                  metrics_path: /probe
                  params:
                    module:
                    - http_2xx
                  relabel_configs:
                  - action: keep
                    regex: true
                    source_labels:
                    - __meta_kubernetes_service_annotation_prometheus_io_probe
                  - source_labels:
                    - __address__
                    target_label: __param_target
                  - replacement: blackbox
                    target_label: __address__
                  - source_labels:
                    - __param_target
                    target_label: instance
                  - action: labelmap
                    regex: __meta_kubernetes_service_label_(.+)
                  - source_labels:
                    - __meta_kubernetes_namespace
                    target_label: kubernetes_namespace
                  - source_labels:
                    - __meta_kubernetes_service_name
                    target_label: kubernetes_name
                - job_name: kubernetes-pods
                  kubernetes_sd_configs:
                  - role: pod
                  relabel_configs:
                  - action: keep
                    regex: true
                    source_labels:
                    - __meta_kubernetes_pod_annotation_prometheus_io_scrape
                  - action: replace
                    regex: (https?)
                    source_labels:
                    - __meta_kubernetes_pod_annotation_prometheus_io_scheme
                    target_label: __scheme__
                  - action: replace
                    regex: (.+)
                    source_labels:
                    - __meta_kubernetes_pod_annotation_prometheus_io_path
                    target_label: __metrics_path__
                  - action: replace
                    regex: ([^:]+)(?::\d+)?;(\d+)
                    replacement: $$1:$$2
                    source_labels:
                    - __address__
                    - __meta_kubernetes_pod_annotation_prometheus_io_port
                    target_label: __address__
                  - action: labelmap
                    regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
                    replacement: __param_$$1
                  - action: labelmap
                    regex: __meta_kubernetes_pod_label_(.+)
                  - action: replace
                    source_labels:
                    - __meta_kubernetes_namespace
                    target_label: kubernetes_namespace
                  - action: replace
                    source_labels:
                    - __meta_kubernetes_pod_name
                    target_label: kubernetes_pod_name
                  - action: drop
                    regex: Pending|Succeeded|Failed|Completed
                    source_labels:
                    - __meta_kubernetes_pod_phase
                - job_name: kubernetes-pods-slow
                  scrape_interval: 5m
                  scrape_timeout: 30s          
                  kubernetes_sd_configs:
                  - role: pod
                  relabel_configs:
                  - action: keep
                    regex: true
                    source_labels:
                    - __meta_kubernetes_pod_annotation_prometheus_io_scrape_slow
                  - action: replace
                    regex: (https?)
                    source_labels:
                    - __meta_kubernetes_pod_annotation_prometheus_io_scheme
                    target_label: __scheme__
                  - action: replace
                    regex: (.+)
                    source_labels:
                    - __meta_kubernetes_pod_annotation_prometheus_io_path
                    target_label: __metrics_path__
                  - action: replace
                    regex: ([^:]+)(?::\d+)?;(\d+)
                    replacement: $$1:$$2
                    source_labels:
                    - __address__
                    - __meta_kubernetes_pod_annotation_prometheus_io_port
                    target_label: __address__
                  - action: labelmap
                    regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
                    replacement: __param_$$1
                  - action: labelmap
                    regex: __meta_kubernetes_pod_label_(.+)
                  - action: replace
                    source_labels:
                    - __meta_kubernetes_namespace
                    target_label: namespace
                  - action: replace
                    source_labels:
                    - __meta_kubernetes_pod_name
                    target_label: pod
                  - action: drop
                    regex: Pending|Succeeded|Failed|Completed
                    source_labels:
                    - __meta_kubernetes_pod_phase
    
          processors:
            batch/metrics:
              timeout: 60s
    
          exporters:
            logging:
              logLevel: info
            prometheusremotewrite:
              endpoint: "<AMP-WORKSPACE>/api/v1/remote_write" # Replace with your AMP workspace
              auth:
                authenticator: sigv4auth
    
          service:
            extensions:
              - health_check
              - memory_ballast
              - sigv4auth
            pipelines:
              metrics:
                receivers: [prometheus]
                processors: [batch/metrics]
                exporters: [logging, prometheusremotewrite]    
    
    
  2. Bind additional roles to the service account pod-identity-webhook (created at step Create a cluster with IRSA) by applying the following file in the cluster (using kubectl apply -f <file-name>). This is needed because pod-identity-webhook, by design, does not have sufficient permissions to scrape all Kubernetes targets listed in the ADOT config file above. If you modify the Prometheus Receiver, update the file below to add or remove permissions before applying it.

    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: otel-prometheus-role
    rules:
      - apiGroups:
          - ""
        resources:
          - nodes
          - nodes/proxy
          - services
          - endpoints
          - pods
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - extensions
        resources:
          - ingresses
        verbs:
          - get
          - list
          - watch
      - nonResourceURLs:
          - /metrics
        verbs:
          - get
    
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: otel-prometheus-role-binding
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: otel-prometheus-role
    subjects:
      - kind: ServiceAccount
        name: pod-identity-webhook  # replace with name of the service account created at step Create a cluster with IRSA
        namespace: default  # replace with namespace where the service account was created at step Create a cluster with IRSA
    
  3. Use the ADOT package config file defined above to complete the ADOT installation. Refer to the ADOT installation guide for details.
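
Several scrape jobs in the ADOT config rewrite `__address__` with the regex `([^:]+)(?::\d+)?;(\d+)` and the replacement `$$1:$$2`. Prometheus joins the listed source labels with `;` and matches the regex against the whole joined string, while the doubled `$$` is how the collector config escapes a literal `$`. The sketch below approximates that rule with `sed` to show its effect. It is a translation rather than the collector's own code: extended regular expressions have no non-capturing groups or `\d`, so the port capture becomes `\3`, and `rewrite_address` is a hypothetical helper name.

```shell
# Hypothetical helper: mimic the __address__ relabel rule from the ADOT config.
# Input is "<address>;<prometheus.io/port annotation>", the two source labels joined by ";".
# Non-matching input is printed unchanged here; Prometheus would simply skip the rule.
rewrite_address() {
  printf '%s\n' "$1" | sed -E 's/^([^:]+)(:[0-9]+)?;([0-9]+)$/\1:\3/'
}

rewrite_address '10.0.0.5:8080;9102'   # existing port replaced: prints 10.0.0.5:9102
rewrite_address '10.0.0.5;9102'        # no port yet, annotation appended: prints 10.0.0.5:9102
```

Trying a few sample addresses this way can help when tailoring the receiver section, before applying the package config to a cluster.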

ADOT Package Test

To ensure the ADOT package is installed correctly in the cluster, a user can perform the following tests.

Check pod logs

Check the ADOT pod logs using kubectl logs <adot-pod-name> -n <namespace>. The output should be similar to the following.

...
2022-09-30T23:22:59.184Z	info	service/telemetry.go:103	Setting up own telemetry...
2022-09-30T23:22:59.184Z	info	service/telemetry.go:138	Serving Prometheus metrics	{"address": "0.0.0.0:8888", "level": "basic"}
2022-09-30T23:22:59.185Z	info	components/components.go:30	In development component. May change in the future.	{"kind": "exporter", "data_type": "metrics", "name": "logging", "stability": "in development"}
2022-09-30T23:22:59.186Z	info	extensions/extensions.go:42	Starting extensions...
2022-09-30T23:22:59.186Z	info	extensions/extensions.go:45	Extension is starting...	{"kind": "extension", "name": "health_check"}
2022-09-30T23:22:59.186Z	info	healthcheckextension@v0.58.0/healthcheckextension.go:44	Starting health_check extension	{"kind": "extension", "name": "health_check", "config": {"Endpoint":"0.0.0.0:13133","TLSSetting":null,"CORS":null,"Auth":null,"MaxRequestBodySize":0,"IncludeMetadata":false,"Path":"/","CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
2022-09-30T23:22:59.186Z	info	extensions/extensions.go:49	Extension started.	{"kind": "extension", "name": "health_check"}
2022-09-30T23:22:59.186Z	info	extensions/extensions.go:45	Extension is starting...	{"kind": "extension", "name": "memory_ballast"}
2022-09-30T23:22:59.187Z	info	ballastextension/memory_ballast.go:52	Setting memory ballast	{"kind": "extension", "name": "memory_ballast", "MiBs": 0}
2022-09-30T23:22:59.187Z	info	extensions/extensions.go:49	Extension started.	{"kind": "extension", "name": "memory_ballast"}
2022-09-30T23:22:59.187Z	info	extensions/extensions.go:45	Extension is starting...	{"kind": "extension", "name": "sigv4auth"}
2022-09-30T23:22:59.187Z	info	extensions/extensions.go:49	Extension started.	{"kind": "extension", "name": "sigv4auth"}
2022-09-30T23:22:59.187Z	info	pipelines/pipelines.go:74	Starting exporters...
2022-09-30T23:22:59.187Z	info	pipelines/pipelines.go:78	Exporter is starting...	{"kind": "exporter", "data_type": "metrics", "name": "logging"}
2022-09-30T23:22:59.187Z	info	pipelines/pipelines.go:82	Exporter started.	{"kind": "exporter", "data_type": "metrics", "name": "logging"}
2022-09-30T23:22:59.187Z	info	pipelines/pipelines.go:78	Exporter is starting...	{"kind": "exporter", "data_type": "metrics", "name": "prometheusremotewrite"}
2022-09-30T23:22:59.187Z	info	pipelines/pipelines.go:82	Exporter started.	{"kind": "exporter", "data_type": "metrics", "name": "prometheusremotewrite"}
2022-09-30T23:22:59.187Z	info	pipelines/pipelines.go:86	Starting processors...
2022-09-30T23:22:59.187Z	info	pipelines/pipelines.go:90	Processor is starting...	{"kind": "processor", "name": "batch/metrics", "pipeline": "metrics"}
2022-09-30T23:22:59.187Z	info	pipelines/pipelines.go:94	Processor started.	{"kind": "processor", "name": "batch/metrics", "pipeline": "metrics"}
2022-09-30T23:22:59.187Z	info	pipelines/pipelines.go:98	Starting receivers...
2022-09-30T23:22:59.187Z	info	pipelines/pipelines.go:102	Receiver is starting...	{"kind": "receiver", "name": "prometheus", "pipeline": "metrics"}
2022-09-30T23:22:59.187Z	info	kubernetes/kubernetes.go:326	Using pod service account via in-cluster config	{"kind": "receiver", "name": "prometheus", "pipeline": "metrics", "discovery": "kubernetes"}
2022-09-30T23:22:59.188Z	info	kubernetes/kubernetes.go:326	Using pod service account via in-cluster config	{"kind": "receiver", "name": "prometheus", "pipeline": "metrics", "discovery": "kubernetes"}
2022-09-30T23:22:59.188Z	info	kubernetes/kubernetes.go:326	Using pod service account via in-cluster config	{"kind": "receiver", "name": "prometheus", "pipeline": "metrics", "discovery": "kubernetes"}
2022-09-30T23:22:59.188Z	info	kubernetes/kubernetes.go:326	Using pod service account via in-cluster config	{"kind": "receiver", "name": "prometheus", "pipeline": "metrics", "discovery": "kubernetes"}
2022-09-30T23:22:59.189Z	info	pipelines/pipelines.go:106	Receiver started.	{"kind": "receiver", "name": "prometheus", "pipeline": "metrics"}
2022-09-30T23:22:59.189Z	info	healthcheck/handler.go:129	Health Check state change	{"kind": "extension", "name": "health_check", "status": "ready"}
2022-09-30T23:22:59.189Z	info	service/collector.go:215	Starting aws-otel-collector...	{"Version": "v0.21.1", "NumCPU": 2}
2022-09-30T23:22:59.189Z	info	service/collector.go:128	Everything is ready. Begin running and processing data.
...

Check AMP endpoint using awscurl

Use the awscurl commands below to check whether AMP received the metrics data sent by ADOT. awscurl is a curl-like tool that signs requests with AWS Signature Version 4. The query below should return a response whose status is success.

pip install awscurl
awscurl -X POST --region us-west-2 --service aps "<amp-query-endpoint>?query=up"
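
To make the success check explicit in a script, the response body can be tested for the `"status":"success"` field that Prometheus-compatible query APIs return. The following is a minimal sketch under that assumption; `amp_query_ok` is a hypothetical helper and the sample response is made up for illustration.

```shell
# Hypothetical helper: succeed only if a Prometheus-style query response reports success.
# $1 = response body returned by the query endpoint
amp_query_ok() {
  printf '%s' "$1" | grep -q '"status"[[:space:]]*:[[:space:]]*"success"'
}

# Sample response body, made up for illustration; real output will list your own series.
sample='{"status":"success","data":{"resultType":"vector","result":[]}}'
if amp_query_ok "$sample"; then
  echo "query succeeded"
else
  echo "query failed"
fi
```

In practice the argument would be the captured output of the awscurl command above, for example `amp_query_ok "$(awscurl ...)"`. An empty `result` list with a success status means the query worked but no `up` series have arrived yet.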

Create an AMG workspace and connect to the AMP workspace

An AMG workspace is created to query metrics from the AMP workspace and visualize the metrics in user-selected or user-built dashboards.

Follow the steps below to create the AMG workspace:

  1. Enable AWS IAM Identity Center (the successor to AWS Single Sign-On). Refer to IAM Identity Center for details.

  2. Open the Amazon Managed Grafana console at https://console.aws.amazon.com/grafana/.

  3. Choose Create workspace.

  4. In the Workspace details window, for Workspace name, enter a name for the workspace.

    ADOT AMG Workspace Details

  5. In the config settings window, choose Authentication access by AWS IAM Identity Center, and Permission type of Service managed.

    ADOT AMG Workspace Configure Settings

  6. In the IAM permission access setting window, choose Current account access, and Amazon Managed Service for Prometheus as data source.

    ADOT AMG Workspace Permission Settings

  7. Review all settings and click on Create workspace.

    ADOT AMG Workspace Review and Create

  8. Once the workspace shows a Status of Active, you can access it by clicking the Grafana workspace URL. Click on Sign in with AWS IAM Identity Center to finish the authentication.

Follow the steps below to add the AMP workspace to AMG:

  1. Click on the configuration icon on the left navigation bar, select Data sources, then choose Prometheus as the Data source.

    ADOT AMG Add Data Source

  2. Configure Prometheus data source with the following details:

    • Name: AMPDataSource as an example.
    • URL: add the AMP workspace remote write URL without the api/v1/remote_write at the end.
    • SigV4 auth: enable.
    • Under the SigV4 Auth Details section:
      • Authentication Provider: choose Workspace IAM Role;
      • Default Region: choose us-west-2 (where you created the AMP workspace)
    • Select Save and test. A notification stating that the data source is working should be displayed.

    ADOT AMG Config Data Source

  3. Import a dashboard template by clicking on the plus (+) sign on the left navigation bar. In the Import screen, type 3119 in the Import via grafana.com textbox and select Import. From the dropdown at the bottom, select AMPDataSource and select Import.

    ADOT AMG Import Dashboard

  4. A Kubernetes cluster monitoring (via Prometheus) dashboard will be displayed.

    ADOT AMG View Dashboard
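
The data source URL used in the steps above is the remote write URL with the trailing `api/v1/remote_write` removed. As a small sketch, that trimming can be done with shell parameter expansion; `amg_datasource_url` is a hypothetical helper name and `ws-EXAMPLE` is a placeholder workspace ID.

```shell
# Hypothetical helper: derive the Grafana data source URL from the AMP remote write URL
# by stripping the "api/v1/remote_write" suffix (the trailing "/" of the workspace path is kept).
amg_datasource_url() {
  printf '%s\n' "${1%api/v1/remote_write}"
}

# ws-EXAMPLE is a placeholder workspace ID.
amg_datasource_url "https://aps-workspaces.us-west-2.amazonaws.com/workspaces/ws-EXAMPLE/api/v1/remote_write"
# prints https://aps-workspaces.us-west-2.amazonaws.com/workspaces/ws-EXAMPLE/
```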

7.3.3 - Harbor use cases

Proxy a public Amazon Elastic Container Registry (ECR) repository

This use case uses Harbor to proxy and cache images from a public ECR repository. Doing so limits the number of requests made to the public ECR repository, which avoids consuming too much bandwidth or being throttled by the registry server.

  1. Login

    Log in to the Harbor web portal with the default credential as shown below

    admin
    Harbor12345
    

    Harbor web portal

  2. Create a registry proxy

    Navigate to Registries on the left panel, and then click the NEW ENDPOINT button. Choose Docker Registry as the Provider, enter public-ecr as the Name, and enter https://public.ecr.aws/ as the Endpoint URL. Save it by clicking OK.

    Harbor public ecr proxy

  3. Create a proxy project

    Navigate to Projects on the left panel and click on the NEW PROJECT button. Enter proxy-project as the Project Name, check Public access level, and turn on Proxy Cache and choose public-ecr from the pull-down list. Save the configuration by clicking on OK.

    Harbor public proxy project

  4. Pull images

    docker pull harbor.eksa.demo:30003/proxy-project/cloudwatch-agent/cloudwatch-agent:latest
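
The pull command above follows a simple pattern: the upstream registry host is replaced by the Harbor address plus the proxy project name, and the rest of the image path is kept. The sketch below illustrates that mapping; `harbor_proxy_ref` is a hypothetical helper, and it assumes the upstream reference always begins with a registry host.

```shell
# Hypothetical helper: rewrite an upstream image reference so it is pulled through
# a Harbor proxy project.
# $1 = Harbor host[:port], $2 = proxy project name, $3 = upstream image reference
harbor_proxy_ref() {
  harbor_host="$1"
  project="$2"
  upstream_ref="$3"
  # Drop the upstream registry host (everything up to and including the first "/").
  printf '%s/%s/%s\n' "$harbor_host" "$project" "${upstream_ref#*/}"
}

harbor_proxy_ref harbor.eksa.demo:30003 proxy-project public.ecr.aws/cloudwatch-agent/cloudwatch-agent:latest
# prints harbor.eksa.demo:30003/proxy-project/cloudwatch-agent/cloudwatch-agent:latest
```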
    

Proxy a private Amazon Elastic Container Registry (ECR) repository

This use case uses Harbor to proxy and cache images from a private ECR repository. Doing so limits the number of requests made to the private ECR repository, which avoids consuming too much bandwidth or being throttled by the registry server.

  1. Login

    Log in to the Harbor web portal with the default credential as shown below

    admin
    Harbor12345
    

    Harbor web portal

  2. Create a registry proxy

    In order for Harbor to proxy a remote private ECR registry, an IAM credential with the necessary permissions needs to be created. Usually, it follows three steps:

    1. Policy

      This is where you specify all necessary permissions. Please refer to private repository policies, IAM permissions for pushing an image and ECR policy examples to figure out the minimal set of required permissions.

      For simplicity, the built-in policy AdministratorAccess is used here.

      Harbor private ecr policy

    2. User group

      This is an easy way to manage a pool of users who share the same set of permissions by attaching the policy to the group.

      Harbor private ecr user group

    3. User

      Create a user and add it to the user group. In addition, please navigate to Security credentials to generate an access key. An access key consists of two parts: an access key ID and a secret access key. Please save both, as they are used in the next step.

      Harbor private ecr user

    Navigate to Registries on the left panel, and then click the NEW ENDPOINT button. Choose Aws ECR as the Provider, and enter private-ecr as the Name, https://[ACCOUNT NUMBER].dkr.ecr.us-west-2.amazonaws.com/ as the Endpoint URL, use the access key ID part of the generated access key as Access ID, and use the secret access key part of the generated access key as Access Secret. Save it by clicking OK.

    Harbor private ecr proxy

  3. Create a proxy project

    Navigate to Projects on the left panel and click the NEW PROJECT button. Enter proxy-private-project as the Project Name, check Public access level, turn on Proxy Cache, and choose private-ecr from the pull-down list. Save the configuration by clicking OK.

    Harbor private proxy project

  4. Pull images

    Create a repository in the target private ECR registry

    Harbor private ecr repository

    Push an image to the created repository

    docker pull alpine
    docker tag alpine [ACCOUNT NUMBER].dkr.ecr.us-west-2.amazonaws.com/alpine:latest
    docker push [ACCOUNT NUMBER].dkr.ecr.us-west-2.amazonaws.com/alpine:latest
    
    docker pull harbor.eksa.demo:30003/proxy-private-project/alpine:latest
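
Pushing to a private ECR registry requires an authenticated Docker client. The sketch below builds the registry host from an account number and region and shows, as a comment, the usual `aws ecr get-login-password` login pipeline. `ecr_registry` is a hypothetical helper and `111122223333` is a placeholder account number.

```shell
# Hypothetical helper: build the private ECR registry host for an account and region.
# $1 = AWS account number, $2 = AWS region
ecr_registry() {
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# 111122223333 is a placeholder account number.
registry="$(ecr_registry 111122223333 us-west-2)"
echo "$registry"

# Log Docker in to that registry before pushing
# (commented out: it needs AWS credentials and a running Docker daemon):
#   aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin "$registry"
```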
    

Repository replication from Harbor to a private Amazon Elastic Container Registry (ECR) repository

This use case uses Harbor to replicate local images and charts to a private ECR repository in push mode. When a replication rule is set, all resources that match the defined filter patterns are replicated to the destination registry when the triggering condition is met.

  1. Login

    Log in to the Harbor web portal with the default credential as shown below

    admin
    Harbor12345
    

    Harbor web portal

  2. Create a nonproxy project

    Harbor nonproxy project

  3. Create a registry proxy

    In order for Harbor to proxy a remote private ECR registry, an IAM credential with the necessary permissions needs to be created. Usually, it follows three steps:

    1. Policy

      This is where you specify all necessary permissions. Please refer to private repository policies, IAM permissions for pushing an image and ECR policy examples to figure out the minimal set of required permissions.

      For simplicity, the built-in policy AdministratorAccess is used here.

      Harbor private ecr policy

    2. User group

      This is an easy way to manage a pool of users who share the same set of permissions by attaching the policy to the group.

      Harbor private ecr user group

    3. User

      Create a user and add it to the user group. In addition, please navigate to Security credentials to generate an access key. An access key consists of two parts: an access key ID and a secret access key. Please save both, as they are used in the next step.

      Harbor private ecr user

    Navigate to Registries on the left panel, and then click on the NEW ENDPOINT button. Choose Aws ECR as the Provider, and enter private-ecr as the Name, https://[ACCOUNT NUMBER].dkr.ecr.us-west-2.amazonaws.com/ as the Endpoint URL, use the access key ID part of the generated access key as Access ID, and use the secret access key part of the generated access key as Access Secret. Save it by clicking on OK.

    Harbor private ecr proxy

  4. Create a replication rule

    Harbor replication rule

  5. Prepare an image

    docker pull alpine
    docker tag alpine:latest harbor.eksa.demo:30003/nonproxy-project/alpine:latest
    
  6. Authenticate with Harbor using the default credentials shown below

    admin
    Harbor12345
    
    docker logout
    docker login harbor.eksa.demo:30003
    
  7. Push images

    Create a repository in the target private ECR registry

    Harbor private ecr repository

    docker push harbor.eksa.demo:30003/nonproxy-project/alpine:latest
    

    The image should appear in the target ECR repository shortly.

    Harbor replication result
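
The steps above depend on two strings being spelled exactly right: the ECR endpoint URL entered for the registry endpoint, and the image reference pushed to the Harbor project. The following is a minimal sketch that builds both from their parts and prints, without running, the matching docker commands. The function names and the sample account number 123456789012 are hypothetical; the registry host, project, and image names are the ones used in this walkthrough.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the sample account number are hypothetical.
# The registry host, project, and image come from the steps above.

# Build the ECR endpoint URL in the form used for the Harbor registry endpoint.
ecr_endpoint() {
  local account="$1" region="$2"
  printf 'https://%s.dkr.ecr.%s.amazonaws.com/\n' "$account" "$region"
}

# Build the full image reference inside a Harbor project (tag defaults to latest).
harbor_ref() {
  local registry="$1" project="$2" image="$3" tag="${4:-latest}"
  printf '%s/%s/%s:%s\n' "$registry" "$project" "$image" "$tag"
}

# Print (do not run) the tag and push commands for one image.
print_push_commands() {
  local ref
  ref="$(harbor_ref "$1" "$2" "$3" "${4:-latest}")"
  printf 'docker tag %s:%s %s\n' "$3" "${4:-latest}" "$ref"
  printf 'docker push %s\n' "$ref"
}

ecr_endpoint 123456789012 us-west-2
print_push_commands harbor.eksa.demo:30003 nonproxy-project alpine
```

Printing the commands first makes it easy to compare them against the values entered in the Harbor web portal before anything is pushed.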

Set up trivy image scanner in an air-gapped environment

This use case manually imports the vulnerability database into Harbor trivy when Harbor is running in an air-gapped environment. All of the following commands assume Harbor is running in the default namespace.

  1. Configure trivy

    TLS example with auto certificate generation

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
       name: my-harbor
       namespace: eksa-packages
    spec:
       packageName: harbor
       config: |-
         secretKey: "use-a-secret-key"
         externalURL: https://harbor.eksa.demo:30003
         expose:
           tls:
             certSource: auto
             auto:
               commonName: "harbor.eksa.demo"
         trivy:
           skipUpdate: true
           offlineScan: true
    

    Non-TLS example

    apiVersion: packages.eks.amazonaws.com/v1alpha1
    kind: Package
    metadata:
       name: my-harbor
       namespace: eksa-packages
    spec:
       packageName: harbor
       config: |-
         secretKey: "use-a-secret-key"
         externalURL: http://harbor.eksa.demo:30002
         expose:
           tls:
             enabled: false
         trivy:
           skipUpdate: true
           offlineScan: true     
    

    If Harbor is already running without the above trivy configuration, run the following command to update both skipUpdate and offlineScan:

    kubectl edit statefulsets/harbor-helm-trivy
    
  2. Download the vulnerability database to your local host

    Install oras by following the oras installation instructions, then pull the database.

    oras pull ghcr.io/aquasecurity/trivy-db:2 -a
    
  3. Upload database to trivy pod from your local host

    kubectl cp db.tar.gz harbor-helm-trivy-0:/home/scanner/.cache/trivy -c trivy
    
  4. Set up database on Harbor trivy pod

    kubectl exec -it harbor-helm-trivy-0 -c trivy -- bash
    cd /home/scanner/.cache/trivy
    mkdir db
    mv db.tar.gz db
    cd db
    tar zxvf db.tar.gz
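
The manual commands in step 4 can be collected into one function that is safe to rerun. This is a minimal sketch, not part of Harbor or trivy: unpack_trivy_db is a hypothetical helper name, and it assumes db.tar.gz has already been copied into the cache directory as in step 3.

```shell
#!/usr/bin/env bash
# Sketch only: unpack_trivy_db is a hypothetical helper, not a Harbor or trivy command.

# Unpack <cache_dir>/db.tar.gz into <cache_dir>/db, mirroring the manual steps above.
unpack_trivy_db() {
  local cache_dir="$1"
  local archive="$cache_dir/db.tar.gz"

  # Fail early if the archive was never copied into the cache directory.
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi

  mkdir -p "$cache_dir/db" || return 1   # -p: no error if db already exists
  mv "$archive" "$cache_dir/db/" || return 1

  # Extract in a subshell so the caller's working directory is unchanged.
  (cd "$cache_dir/db" && tar zxf db.tar.gz)
}
```

Inside the trivy container, after the kubectl cp step, the call would be unpack_trivy_db /home/scanner/.cache/trivy.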