All posts by Sylvia Qi

Deploy and Manage Gitlab Runners on Amazon EC2

Post Syndicated from Sylvia Qi original https://aws.amazon.com/blogs/devops/deploy-and-manage-gitlab-runners-on-amazon-ec2/

Gitlab CI is a tool utilized by many enterprises to automate their Continuous integration, continuous delivery and deployment (CI/CD) process. A Gitlab CI/CD pipeline consists of two major components: A .gitlab-ci.yml file describing a pipeline’s jobs, and a Gitlab Runner, an application that executes the pipeline jobs.

Setting up the Gitlab Runner is a time-consuming process. It involves provisioning the necessary infrastructure, installing the necessary software to run pipeline workloads, and configuring the runner. For enterprises running hundreds of pipelines across multiple environments, it is essential to automate the Gitlab Runner deployment process so as to be deployed quickly in a repeatable, consistent manner.

This post will guide you through utilizing Infrastructure-as-Code (IaC) to automate Gitlab Runner deployment and administrative tasks on Amazon EC2. With IaC, you can quickly and consistently deploy the entire Gitlab Runner architecture by running a script. You can track and manage changes efficiently. And, you can enforce guardrails and best practices via code. The solution presented here also offers autoscaling so that you save costs by terminating resources when not in use. You will learn:

  • How to deploy Gitlab Runner quickly and consistently across multiple AWS accounts.
  • How to enforce guardrails and best practices on the Gitlab Runner through IaC.
  • How to autoscale Gitlab Runner based on workloads to ensure best performance and save costs.

This post comes from a DevOps engineer perspective, and assumes that the engineer is familiar with the practices and tools of IaC and CI/CD.

Overview of the solution

The following diagram displays the solution architecture. We use AWS CloudFormation to describe the infrastructure that is hosting the Gitlab Runner. The main steps are as follows:

  1. The user runs a deploy script in order to deploy the CloudFormation template. The template is parameterized, and the parameters are defined in a properties file. The properties file specifies the infrastructure configuration, as well as the environment in which to deploy the template.
  2. The deploy script calls CloudFormation CreateStack API to create a Gitlab Runner stack in the specified environment.
  3. During stack creation, an EC2 autoscaling group is created with the desired number of EC2 instances. Each instance is launched via a launch template, which is created with values from the properties file. An IAM role is created and attached to the EC2 instance. The role contains permissions required for the Gitlab Runner to execute pipeline jobs. A lifecycle hook is attached to the autoscaling group on instance termination events. This ensures graceful instance termination.
  4. During instance launch, CloudFormation uses a cfn-init helper script to install and configure the Gitlab Runner:
    1. cfn-init installs the Gitlab Runner software on the EC2 instance.
    2. cfn-init configures the Gitlab Runner as a docker executor using a pre-defined docker image in the Gitlab Container Registry. The docker executor implementation lets the Gitlab Runner run each build in a separate and isolated container. The docker image contains the software required to run the pipeline workloads, thereby eliminating the need to install these packages during each build.
    3. cfn-init registers the Gitlab Runner to Gitlab projects specified in the properties file, so that these projects can utilize the Gitlab Runner to run pipelines.
  1. The user may repeat the same steps to deploy Gitlab Runner into another environment.

Architecture diagram previously explained in post.

Walkthrough

This walkthrough will demonstrate how to deploy the Gitlab Runner, and how easy it is to conduct Gitlab Runner administrative tasks via this architecture. We will walk through the following tasks:

  • Build a docker executor image for the Gitlab Runner.
  • Deploy the Gitlab Runner stack.
  • Update the Gitlab Runner.
  • Terminate the Gitlab Runner.
  • Add/Remove Gitlab projects from the Gitlab Runner.
  • Autoscale the Gitlab Runner based on workloads.

The code in this post is available at https://github.com/aws-samples/amazon-ec2-gitlab-runner.git

Prerequisites

For this walkthrough, you need the following:

  • A Gitlab account (all tiers including Gitlab Free self-managed, Gitlab Free SaaS, and higher tiers). This demo uses gitlab.com free tire.
  • A Gitlab Container Registry.
  • Git client to clone the source code provided.
  • An AWS account with local credentials properly configured (typically under ~/.aws/credentials).
  • The latest version of the AWS CLI. For more information, see Installing, updating, and uninstalling the AWS CLI.
  • Docker is installed and running on the localhost/laptop.
  • Nodejs and npm installed on the localhost/laptop.
  • A VPC with 2 private subnets and that is connected to the internet via NAT gateway allowing outbound traffic.
  • The following IAM service-linked role created in the AWS account: AWSServiceRoleForAutoScaling
  • An Amazon S3 bucket for storing Lambda deployment packages.
  • Familiarity with Git, Gitlab CI/CD, Docker, EC2, CloudFormation and Amazon CloudWatch.

Build a docker executor image for the Gitlab Runner

The Gitlab Runner in this solution is implemented as docker executor. The Docker executor connects to Docker Engine and runs each build in a separate and isolated container via a predefined docker image. The first step in deploying the Gitlab Runner is building a docker executor image. We provided a simple Dockerfile in order to build this image. You may customize the Dockerfile to install your own requirements.

To build a docker image using the sample Dockerfile:

  1. Create a directory where we will store our demo code. From your terminal run:
mkdir demo-repos && cd demo-repos
  1. Clone the source code repository found in the following location:
git clone https://github.com/aws-samples/amazon-ec2-gitlab-runner.git
  1. Create a new project on your Gitlab server. Name the project any name you like.
  2. Clone your newly created repo to your laptop. Ignore the warning about cloning an empty repository.
git clone <your-repo-url>
  1. Copy the demo repo files into your newly created repo on your laptop, and push it to your Gitlab repository. You may customize the Dockerfile before pushing it to Gitlab.
cp -r amazon-ec2-gitlab-runner/* <your-repo-dir>
cd <your-repo-dir>
git add .
git commit -m “Initial commit”
git push
  1. On the Gitlab console, go to your repository’s Package & Registries -> Container Registry. Follow the instructions provided on the Container Registry page in order to build and push a docker image to your repository’s container registry.

Deploy the Gitlab Runner stack

Once the docker executor image has been pushed to the Gitlab Container Registry, we can deploy the Gitlab Runner. The Gitlab Runner infrastructure is described in the Cloudformation template gitlab-runner.yaml. Its configuration is stored in a properties file called sample-runner.properties. A launch template is created with the values in the properties file. Then it is used to launch instances. This architecture lets you deploy Gitlab Runner to as many environments as you like by utilizing the configurations provided in the appropriate properties files.

During the provisioning process, utilize a cfn-init helper script to run a series of commands to install and configure the Gitlab Runner.

          commands:
            01InstallDocker:
              command: sudo yum -y install docker
            02StartDocker:
              command: sudo service docker start
            03DownloadGitlabRunner:
              command: sudo wget -O /usr/bin/gitlab-runner https://gitlab-runner-downloads.s3.amazonaws.com/latest/binaries/gitlab-runner-linux-amd64
            04ChmodGitlabRunner:
              command: sudo chmod a+x /usr/bin/gitlab-runner
            05AddUser:
              command: sudo useradd --comment 'GitLab Runner' --create-home gitlab-runner --shell /bin/bash
            06InstallGitlabRunner:
              command: sudo gitlab-runner install --user=gitlab-runner --working-directory=/home/gitlab-runner
            07SetRegion:
              command: !Sub 'aws configure set default.region ${AWS::Region}'
            08ConfigureDockerExecutor:
              command: !Sub 
                - |
                  for GitlabGroupToken in `aws ssm get-parameters --names /${AWS::StackName}/ci-tokens --query 'Parameters[0].Value' | sed -e "s/\"//g" | sed "s/,/ /g"`;do
                      sudo gitlab-runner register \
                      --non-interactive \
                      --url "${GitlabServerURL}" \
                      --registration-token $GitlabGroupToken \
                      --executor "docker" \
                      --docker-image "${DockerImagePath}" \
                      --description "Gitlab Runner with Docker Executor" \
                      --locked="${isLOCKED}" --access-level "${ACCESS}" \
                      --docker-volumes "/var/run/docker.sock:/var/run/docker.sock" \
                      --tag-list "${RunnerEnvironment}-${RunnerVersion}-docker"
                  done
                - isLOCKED: !FindInMap [GitlabRunnerRegisterOptionsMap, !Ref RunnerEnvironment, isLOCKED]
                  ACCESS: !FindInMap [GitlabRunnerRegisterOptionsMap, !Ref RunnerEnvironment, ACCESS]                              
            09StartGitlabRunner:
              command: sudo gitlab-runner start

The helper script ensures that the Gitlab Runner setup is consistent and repeatable for each deployment. If a configuration change is required, users simply update the configuration steps and redeploy the stack. Furthermore, all changes are tracked in Git, which allows for versioning of the Gitlab Runner.

To deploy the Gitlab Runner stack:

  1. Obtain the runner registration tokens of the Gitlab projects that you want registered to the Gitlab Runner. Obtain the token by selecting the project’s Settings > CI/CD and expand the Runners section.
  2. Update the sample-runner.properties file parameters according to your own environment. Refer to the gitlab-runner.yaml file for a description of these parameters. Rename the file if you like. You may also create an additional properties file for deploying into other environments.
  3. Run the deploy script to deploy the runner:
cd <your-repo-dir>
./deploy-runner.sh <properties-file> <region> <aws-profile> <stack-name> 

<properties-file> is the name of the properties file.

<region> is the region where you want to deploy the stack.

<aws-profile> is the name of the CLI profile you set up in the prerequisites section.

<stack-name> is the name you chose for the CloudFormation stack.

For example:

./deploy-runner.sh sample-runner.properties us-east-1 dev amazon-ec2-gitlab-runner-demo

After the stack is deployed successfully, you will see the Gitlab Runner autoscaling group created in the EC2 console:

After the stack is deployed successfully, you will see the Gitlab Runner autoscaling group created in the EC2 console.

Under your Gitlab project Settings > CICD > Runners > Available specific runners, you will see the fully configured Gitlab Runner. The green circle indicates that the Gitlab Runner is ready for use.

Now go to your Gitlab project Settings  CICD  Runners  Available specific runners, you will see the fully configured Gitlab Runner. The green circle indicates that the Gitlab Runner is ready for use.

Updating the Gitlab Runner

There are times when you would want to update the Gitlab Runner. For example, updating the instance VolumeSize in order to resolve a disk space issue, or updating the AMI ID when a new AMI becomes available.

Utilizing the properties file and launch template makes it easy to update the Gitlab Runner. Simply update the Gitlab Runner configuration parameters in the properties file. Then, run the deploy script to udpate the Gitlab Runner stack. To ensure that the changes take effect immediately (e.g., existing instances are replaced by new instances with the new configuration), we utilize an AutoscalingRollingUpdate update policy to automatically update the instances in the autoscaling group.

    UpdatePolicy:
      AutoScalingRollingUpdate:
        MinInstancesInService: !Ref MinInstancesInService
        MaxBatchSize: !Ref MaxBatchSize
        PauseTime: "PT5M"
        WaitOnResourceSignals: true
        SuspendProcesses:
          - HealthCheck
          - ReplaceUnhealthy
          - AZRebalance
          - AlarmNotification
          - ScheduledActions

The policy tells CloudFormation that when changes are detected in the launch template, update the instances in batch size of MaxBatchSize, while keeping a number of instances (specified in MinInstanceInService) in service during the update.

Below is an example of updating the Gitlab Runner instance type.

To update the instance type of the runner instance:

  1. Update the “InstanceType” parameter in the properties file.

InstanceType=t2.medium

  1. Run the deploy-runner.sh script to update the CloudFormation stack:
cd <your-repo-dir>
./deploy-runner.sh <properties-file> <region> <aws-profile> <stack-name> 

In the CloudFormation console, you will see that the launch template is updated first, then a rolling update is initiated. The instance type update requires a replacement of the original instance, so a temporary instance was launched and put in service. Then, the temporary instance was terminated when the new instance was launched successfully.

In the CloudFormation console, you will see that the launch template is updated first, then a rolling update is initiated. The instance type update requires a replacement of the original instance, so a temporary instance was launched and put in service. Then, the temporary instance was terminated when the new instance was launched successfully.

After the update is complete, you will see that on the Gitlab project’s console, the old Gitlab Runner, ez_5x8Rv, is replaced by the new Gitlab Runner, N1_UQ7yc.

After the update is complete, you will see that on the Gitlab project’s console, the old Gitlab Runner, ez_5x8Rv, is replaced by the new Gitlab Runner, N1_UQ7yc.

Terminate the Gitlab Runner

There are times when an autoscaling group instance must be terminated. For example, during an autoscaling scale-in event, or when the instance is being replaced by a new instance during a stack update, as seen previously. When terminating an instance, you must ensure that the Gitlab Runner finishes executing any running jobs before the instance is terminated, otherwise your environment could be left in an inconsistent state. Also, we want to ensure that the terminated Gitlab Runner is removed from the Gitlab project. We utilize an autoscaling lifecycle hook to achieve these goals.

The lifecycle hook works like this: A CloudWatch event rule actively listens for the EC2 Instance-terminate events. When one is detected, the event rule triggers a Lambda function. The Lambda function calls SSM Run Command to run a series of commands on the EC2 instances, via a SSM Document. The commands include stopping the Gitlab Runner gracefully when all running jobs are finished, de-registering the runner from Gitlab projects, and signaling the autoscaling group to terminate the instance.

The lifecycle hook works like this: A CloudWatch event rule actively listens for the EC2 Instance-terminate events. When one is detected, the event rule triggers a Lambda function. The Lambda function calls SSM Run Command to run a series of commands on the EC2 instances, via a SSM Document. The commands include stopping the Gitlab Runner gracefully when all running jobs are finished, de-registering the runner from Gitlab projects, and signaling the autoscaling group to terminate the instance.

There are also times when you want to terminate an instance manually. For example, when an instance is suspected to not be functioning properly. To terminate an instance from the Gitlab Runner autoscaling group, use the following command:

aws autoscaling terminate-instance-in-auto-scaling-group \
    --instance-id="${InstanceId}" \
    --no-should-decrement-desired-capacity \
    --region="${region}" \
    --profile="${profile}"

The above command terminates the instance. The lifecycle hook ensures that the cleanup steps are conducted properly, and the autoscaling group launches another new instance to replace the old one.

Note that if you terminate the instance by using the “ec2 terminate-instance” command, then the autoscaling lifecycle hook actions will not be triggered.

Add/Remove Gitlab projects from the Gitlab Runner

As new projects are added to your enterprise, you may want to register them to the Gitlab Runner, so that those projects can utilize the Gitlab Runner to run pipelines. On the other hand, you would want to remove the Gitlab Runner from a project if it no longer wants to utilize the Gitlab Runner, or if it qualifies to utilize the Gitlab Runner. For example, if a project is no longer allowed to deploy to an environment configured by the Gitlab Runner. Our architecture offers a simple way to add and remove projects from the Gitlab Runner. To add new projects to the Gitlab Runner, update the RunnerRegistrationTokens parameter in the properties file, and then rerun the deploy script to update the Gitlab Runner stack.

To add new projects to the Gitlab Runner:

  1. Update the RunnerRegistrationTokens parameter in the properties file. For example:
RunnerRegistrationTokens=ps8RjBSruy1sdRdP2nZX,XbtZNv4yxysbYhqvjEkC
  1. Update the Gitlab Runner stack. This updates the SSM parameter which stores the tokens.
cd <your-repo-dir>
./deploy-runner.sh <properties-file> <region> <aws-profile> <stack-name> 
  1. Relaunch the instances in the Gitlab Runner autoscaling group. The new instances will use the new RunnerRegistrationTokens value. Run the following command to relaunch the instances:
./cycle-runner.sh <runner-autoscaling-group-name> <region> <optional-aws-profile>

To remove projects from the Gitlab Runner, follow the steps described above, with just one difference. Instead of adding new tokens to the RunnerRegistrationTokens parameter, remove the token(s) of the project that you want to dissociate from the runner.

Autoscale the runner based on custom performance metrics

Each Gitlab Runner can be configured to handle a fixed number of concurrent jobs. Once this capacity is reached for every runner, any new jobs will be in a Queued/Waiting status until the current jobs complete, which would be a poor experience for our team. Setting the number of concurrent jobs too high on our runners would also result in a poor experience, because all jobs leverage the same CPU, memory, and storage in order to conduct the builds.

In this solution, we utilize a scheduled Lambda function that runs every minute in order to inspect the number of jobs running on every runner, leveraging the Prometheus Metrics endpoint that the runners expose. If we approach the concurrent build limit of the group, then we increase the Autoscaling Group size so that it can take on more work. As the number of concurrent jobs decreases, then the scheduled Lambda function will scale the Autoscaling Group back in an effort to minimize cost. The Scaling-Up operation will ignore the Autoscaling Group’s cooldown period, which will help ensure that our team is not waiting on a new instance, whereas the Scale-Down operation will obey the group’s cooldown period.

Here is the logical sequence diagram for the work:

Sequence diagram

For operational monitoring, the Lambda function also publishes custom CloudWatch Metrics for the count of active jobs, along with the target and actual capacities of the Autoscaling group. We can utilize this information to validate that the system is working properly and determine if we need to modify any of our autoscaling parameters.

For operational monitoring, the Lambda function also publishes custom CloudWatch Metrics for the count of active jobs, along with the target and actual capacities of the Autoscaling group. We can utilize this information to validate that the system is working properly and determine if we need to modify any of our autoscaling parameters.

Congratulations! You have completed the walkthrough. Take some time to review the resources you have deployed, and practice the various runner administrative tasks that we have covered in this post.

Troubleshooting

Problem: I deployed the CloudFormation template, but no runner is listed in my repository.

Possible Cause: Errors have been encountered during cfn-init, causing runner registration to fail. Connect to your runner EC2 instance, and check /var/log/cfn-*.log files.

Cleaning up

To avoid incurring future charges, delete every resource provisioned in this demo by deleting the CloudFormation stack created in the “Deploy the Gitlab Runner stack” section.

Conclusion

This article demonstrated how to utilize IaC to efficiently conduct various administrative tasks associated with a Gitlab Runner. We deployed Gitlab Runner consistently and quickly across multiple accounts. We utilized IaC to enforce guardrails and best practices, such as tracking Gitlab Runner configuration changes, terminating the Gitlab Runner gracefully, and autoscaling the Gitlab Runner to ensure best performance and minimum cost. We walked through the deploying, updating, autoscaling, and terminating of the Gitlab Runner. We also saw how easy it was to clean up the entire Gitlab Runner architecture by simply deleting a CloudFormation stack.

About the authors

Sylvia Qi

Sylvia is a Senior DevOps Architect focusing on architecting and automating DevOps processes, helping customers through their DevOps transformation journey. In her spare time, she enjoys biking, swimming, yoga, and photography.

Sebastian Carreras

Sebastian is a Senior Cloud Application Architect with AWS Professional Services. He leverages his breadth of experience to deliver bespoke solutions to satisfy the visions of his customer. In his free time, he really enjoys doing laundry. Really.

Authenticate AWS Client VPN users with AWS Single Sign-On

Post Syndicated from Sylvia Qi original https://aws.amazon.com/blogs/security/authenticate-aws-client-vpn-users-with-aws-single-sign-on/

AWS Client VPN is a managed client-based VPN service that enables users to use an OpenVPN-based client to securely access their resources in Amazon Web Services (AWS) and in their on-premises network from any location. In this blog post, we show you how you can integrate Client VPN with your existing AWS Single Sign-On via a custom SAML 2.0 application to authenticate and authorize your Client VPN connections and traffic.

Maintaining a separate set of credentials to authenticate users and authorize access for each resource is not only tedious, it’s not scalable. A common way to solve this challenge is to use a central identity store such as AWS SSO, which functions as your identity provider (IdP). You can then use Security Assertion Markup Language 2.0 (SAML 2.0) to integrate AWS SSO with each of your resources or applications, also known as service providers (SPs). The IdP authenticates users and passes their identity and security information to the SP via SAML. With SAML, you can enable a single sign-on experience for your users across many SAML-enabled applications and services. Users authenticate with the IdP once using a single set of credentials, and then have access to multiple applications and services without additional sign-ins.

Client VPN supports identity federation with SAML 2.0 for Client VPN endpoints. Deploying custom SAML applications can present some challenges, specifically around the mapping of attributes between what the SP expects to receive and what the IdP can provide. We’ve taken the guesswork out of the process and show you the exact mappings needed for the Client VPN to AWS SSO integration. The integration lets you use AWS SSO groups to not only grant access to create a Client VPN connection, but also to allow access to specific network ranges based upon group membership. We walk you through setting up all of the components required to implement the authentication workflow described in Figure 1. This consists of creating the custom SAML applications and tying them into AWS Identity and Access Management (IAM), creating and configuring the Client VPN endpoint, creating a Client VPN connection with an AWS SSO user, and testing your connectivity.
 

Figure 1: Authentication workflow

Figure 1: Authentication workflow

The steps illustrated in Figure 1 are:

  1. The user opens the AWS-provided VPN client on their device and initiates a connection to the Client VPN endpoint.
  2. The Client VPN endpoint sends an IdP URL and authentication request back to the client, based on the information that was provided in the IAM SAML provider.
  3. The AWS provided VPN client opens a new browser window on the user’s device. The browser makes a request to the IdP and displays a sign-in page. This is the same sign-in experience as the AWS SSO user portal, as the IdP URL points to a custom SAML application created within AWS SSO.
  4. The user enters their credentials on the sign-in page, and the IdP sends a signed SAML assertion back to the client in the form of an HTTP POST to the AWS provided VPN client.
  5. The SAML assertion is passed from the AWS provided VPN client to the Client VPN endpoint.
  6. The endpoint validates the assertion and either allows or denies access to the user.

Prerequisites

Here are the requirements to complete the VPN and SSO setup:

  • AWS SSO is configured to use the internal AWS SSO identity store. Refer to the AWS Single Sign-On Getting Started guide for help configuring AWS SSO. AWS SSO can exist in a different AWS account than the account where you deploy Client VPN endpoints. The steps outlined in this blog post are specific to the internal AWS SSO identity store, however they could be adapted to support other identity stores that support SAML 2.0.
  • Two AWS SSO users and two AWS SSO groups for testing. Each user should be a member of only one of the SSO groups. The purpose of this configuration is to demonstrate how access can be allowed or denied based upon group membership.
  • An Amazon Virtual Private Cloud (Amazon VPC) with an Amazon Elastic Compute Cloud (Amazon EC2) instance for connectivity testing.
  • An x.509 certificate imported into AWS Certificate Manager (ACM). You can generate a self-signed certificate for this walkthrough, however you should review the prerequisites for importing certificates into ACM. This certificate will be used for encrypted communication between the client VPN software and the client VPN endpoint.
  • Administrative access to your AWS environment, or at least sufficient access to create AWS SSO applications, ACM certificates, EC2 Instances, and Client VPN endpoints.
  • A client device running Windows or macOS with the latest version of Client VPN software installed. You can download it from the AWS Client VPN download.

Solution walkthrough

For this solution, you’ll complete the following steps:

  1. Establish trust with your IdP
  2. Create and configure Client VPN SAML applications in AWS SSO.
  3. Integrate the Client VPN SAML applications with IAM.
  4. Create and configure the Client VPN endpoint.
  5. Test the solution.
  6. Cleanup the test environment.

Establish trust with your IdP

In this walkthrough, Client VPN is the SAML SP and AWS SSO is the SAML IdP. One of the key steps to deploying this solution is to establish trust between the SP and IdP. This one-time configuration is done by creating custom SAML applications within AWS SSO and exporting application-specific metadata information from the applications. This metadata is then uploaded—in the form of IAM IdPs—into your AWS account where the Client VPN endpoint is created. IAM IdPs let you manage your user identities in a centralized identity store, such as AWS SSO, and grant those user identities permissions to AWS resources within your account. For organizations with multiple AWS accounts, the use of IAM IdPs resolves the management, scalability, and security issues associated with creating IAM users directly within each account.

Create and configure the Client VPN SAML applications in AWS SSO

Create two custom SAML 2.0 applications in AWS SSO. One will be the IdP for the Client VPN software, the other will be a self-service portal that allows users to download their Client VPN software and client configuration file.

To create the VPN client SAML application:

  1. In the AWS SSO console, select Applications from the left pane and select Add a new application.
  2. Select Add a custom SAML 2.0 application to use as the IdP for the Client VPN software.
     
    Figure 2: Add a SAML application

    Figure 2: Add a SAML application

  3. In the Details section, set Display name to VPN Client.
  4. In the Application Metadata section, select If you don’t have a metadata file, you can manually type your metadata values and enter the following values:
    • Application ACS URL: http://127.0.0.1:35001
    • Application SAML audience: urn:amazon:webservices:clientvpn
  5. Accept the default values for all other fields.
  6. Choose Save Changes.
  7. Select the Attribute mappings tab and configure the mappings as shown in the table and Figure 3 below.

    Note: For production environments, you should grant access to these applications via an AWS SSO group instead of individual users as shown in this walkthrough.

    User attribute in the application Maps to this string value or user attribute in AWS SSO Format
    Subject ${user:email} emailAddress
    Name ${user:email} unspecified
    FirstName ${user:givenName} unspecified
    LastName ${user:familyName} unspecified
    memberOf ${user:groups} unspecified
    Figure 3: VPN client attribute mappings

    Figure 3: VPN client attribute mappings

  8. On the Assign users tab, add your two test user accounts.
  9. On the application configuration page, choose the download link for AWS SSO SAML metadata. Save the file to use in a later step.

To create the VPN client self-service SAML application

  1. In the AWS SSO console, select Applications from the left pane and select Add a new application.
  2. Select Add a custom SAML 2.0 application to use as the application that will serve as the IdP for the Client VPN software.
     
    Figure 4: Add a SAML application

    Figure 4: Add a SAML application

  3. In the Details section, set Display name to VPN Client Self Service.
  4. In the Application Metadata section, select If you don’t have a metadata file, you can manually type your metadata values and enter the following values:
    • Application ACS URL: https://self-service.clientvpn.amazonaws.com/api/auth/sso/saml
    • Application SAML audience: urn:amazon:webservices:clientvpn
  5. Accept the default values for all other fields.
  6. Choose Save Changes.
  7. Choose the Attribute mappings tab and configure the mappings as shown in the following table and in Figure 5.

    Note: For production environments you should grant access to these applications via an AWS SSO group instead of individual users as shown in this walkthrough. For the purposes of this walkthrough, you grant individual users access to the SAML applications but grant network access via group membership. This is done to allow easier demonstration of the ability to grant or deny network specific access via groups when testing the solution.

    User attribute in the application Maps to this string value or user attribute in AWS SSO Format
    Subject ${user:email} emailAddress
    Name ${user:email} unspecified
    FirstName ${user:givenName} unspecified
    LastName ${user:familyName} unspecified
    memberOf ${user:groups} unspecified
    Figure 5: VPN Client self-service attribute mappings

    Figure 5: VPN Client self-service attribute mappings

  8. On the Assign users tab, add your two test user accounts.
  9. On the application’s Configuration page, choose the download link for AWS SSO SAML metadata. Save the file to use in a later step.

Integrate the Client VPN SAML applications with IAM

Client VPN requires a unique IdP definition in IAM. You must set up the IdP in the same AWS account where the Client VPN endpoint will be created.

To create the IAM IdP:

  1. In the IAM console, select Identity providers and Add provider. Name the provider aws-client-vpn and upload the metadata document that you downloaded from the VPN Client SAML application.
  2. Add a second provider, name the provider aws-client-vpn-self-service and upload the metadata document that you downloaded from the VPN Client Self Service SAML application.

Create and configure the Client VPN endpoint

All Client VPN sessions end at the Client VPN endpoint. You configure the Client VPN endpoint to manage and control all Client VPN sessions. In the following steps, you create a Client VPN endpoint and configure it to use the newly added IAM IdPs. You then associate the endpoint with a VPC and configure authorization rules to allow traffic into the VPC, then set up the Client VPN self-service portal.

To create the Client VPN endpoint

  1. Open the AWS VPC console and select Client VPN Endpoints and then select Create Client VPN endpoint.
  2. Enter a Name Tag and Description for the endpoint.
  3. Enter 172.16.0.0/22 for the Client IPv4 CIDR. This is the IP range that will be allocated to your VPN clients. It shouldn’t overlap the CIDR of your AWS VPCs or of the network that your client device is connected to and must be at least a /22 bitmask. You can adjust this value as needed for your specific network requirements. The Client IPv4 CIDR value can only be set during endpoint creation.

    Note: For production environments you should review the Client VPN documentation for scaling considerations before you create the endpoint.

  4. In the Server certificate ARN drop down menu, select the ACM certificate that you created for your VPN clients.
  5. Set the Authentication Options to Use user-based authentication with Federated authentication. Select the aws-client-vpn IAM IdP for the SAML provider ARN, and select the aws-client-vpn-self-service IAM IdP as the Self-service SAML provider ARN.
     
    Figure 6: Authentication settings

    Figure 6: Authentication settings

  6. For this walkthrough, set Connection Logging to No. Connection logging is a feature of Client VPN that enables you to capture connection logs for your Client VPN endpoint. Those logs are published to an Amazon CloudWatch Logs log group in your account. For production environments or for troubleshooting purposes, you can enable connection logging while or after you create the endpoint.
  7. Select the VPC ID to associate with the endpoint. This should be the VPC with an EC2 instance deployed that can be used to test connectivity. You can select an existing security group, or create a new one for the VPN endpoint. The only requirement for this walkthrough is that it has outbound rules that allow access to your test EC2 instance. For additional flexibility, you can create and apply multiple security groups that use different rulesets to the endpoint to provide fine-grained control of which resources can be accessed within the VPC.
  8. Select Enable self-service portal and—if desired—select Enable split-tunnel. Split tunneling is designed to ensure that only client traffic destined for the IP ranges configured on the Client VPN endpoint is routed to your VPC. By default, all traffic, including internet bound traffic, is routed through your VPC.
  9. Choose Create Client VPN endpoint.

To configure the Client VPN endpoint

  1. On the Client VPN endpoint Associations tab, select Associate. Select the same VPC that you chose when you set up the endpoint and select a subnet to associate. This creates an elastic network interface (ENI) in the selected subnet that will be the ingress point from VPN clients into your AWS VPC. For production environments, you should select at least two subnets based upon your redundancy requirements.
  2. Authorizing VPN ingress traffic from your users can be done either globally for all users or via group membership. When granting access via an AWS SSO group, you must use the group ID of the AWS SSO group, not the friendly name of the group. After selecting a group in the AWS SSO management console, you can find group ID in the Details section. You can also obtain the group ID by using AWS Command Line Interface (AWS CLI) to issue the following command, replacing the <AWSRegion>, <Identity Store ID>, and <AWS SSO Group Display Name> variables with your information. This command should be issued within the same AWS account where AWS SSO is configured. The identity store ID can be found in the AWS SSO console under Settings.
    aws identitystore list-groups --region <AWSRegion> --identity-store-id <Identity Store ID> --filter AttributePath=DisplayName,AttributeValue=<AWS SSO Group Display Name>
    

  3. Create an ingress authorization rule by selecting Authorize Ingress on the Authorization tab. Configure the destination network to enable as 0.0.0.0/0, set Grant access to: Allow access to users in a specific access group and enter the access group ID that you discovered in the previous step. This should be the group that contains one of your test user accounts. For production environments, you should follow the principle of least privilege and narrow the destination network range to only what is required. Ingress authorization rules can be used to restrict network access to specific network ranges based upon IdP group membership. You can use a client connection handler to enforce additional security policies on Client VPN connections. Refer to the Client VPN documentation for additional details.
    Figure 7: Add authorization rule

    Figure 7: Add authorization rule

  4. From the Client VPN Endpoint Summary tab, copy the Self-service portal URL to use in the next step.

To set up the Client VPN self-service portal

  1. Open the Client VPN self-service SAML application in the AWS SSO management console to edit the configuration.
  2. In the Application start URL textbox, paste the Client VPN endpoint self-service portal URL that you copied in the previous section. This ties the Client VPN self-service SAML application to the self-service portal URL for the specific Client VPN endpoint that you created, allowing users to download their AWS VPN Client configuration file.
     
    Figure 8: Client VPN self-service portal

    Figure 8: Client VPN self-service portal

Test the solution

During the testing phase, you download the VPN client configuration file and configure the VPN client application. You then create a Client VPN connection and validate that you have access to your target VPC. You also test the Client VPN connection with multiple user accounts in order to confirm that the ingress authorization rules are functioning as expected.

To test the Client VPN solution:

  1. Open an internet browser and sign in to your AWS SSO user portal as a user who has access to the VPN Client SAML applications and is a member of the AWS SSO group defined in the VPN endpoint ingress authorization rule. You should see two new SAML applications. Select the VPN client self-service application.
  2. In the VPN Client Self Service portal, you can download the AWS VPN Client software if you haven’t already done so. Select Download client configuration and save the file on your local device. Close the browser window that you used to sign in to the AWS SSO user portal.
  3. Open the AWS VPN Client application and configure a new profile, selecting the client configuration file that you downloaded in the previous step. Once your client profile has been created, select Connect.
     
    Figure 9: VPN Client ready to connect

    Figure 9: VPN Client ready to connect

  4. A new browser window should open automatically to an AWS SSO sign-in page. Enter the credentials of your test user who is a member of the AWS SSO group defined in your ingress authorization rule.
  5. Upon a successful connection through the VPN client, you can make a management connection (RDP, SSH, HTTP, or other) to one of the EC2 instances within your VPC. Connect to the private IPv4 address of your EC2 instance (rfc1918)—you should not attempt to connect to your EC2 instance through an EIP. You might need to adjust the security group rules on your EC2 instance to allow traffic from the subnets that you selected when you created the VPN endpoint associations.
  6. Once you have a successful connection to your test EC2 instance and you know that your Client VPN connectivity is working, you should also validate that access is denied for users who aren’t a member of the group specified in your ingress authorization rule.
    1. Disconnect from your Client VPN connection and close all browser windows.
    2. Depending upon your internet browser and its configuration, you might need to delete any cookies associated with your AWS SSO user portal in order to sign in as a different AWS SSO user.
    3. Initiate a new Client VPN connection and sign in as the test user account that is not a member of the AWS SSO group specified in the ingress authorization rule.
    4. You should be able to successfully establish the Client VPN connection, but not to access your test EC2 instance. This validates that the ingress authorization rule isn’t allowing Client VPN traffic from users who aren’t a member of the AWS SSO group to enter your VPC.

Troubleshooting

If you have any issues completing the walkthrough and testing, here are some things that you can check:

  • In the AWS VPC management console, review the Connections tab to verify that you see a connection from your test user account and that it’s active.
  • Confirm that your test user account is in the group that was defined in your ingress authorization rule.
  • Confirm that the access group ID specified in the ingress authorization rule is for the AWS SSO group that your test user is a member of.
  • Confirm that the AWS SSO group still exists and hasn’t been deleted. You might encounter an error message similar to the one shown in Figure 10 if you attempt a Client VPN connection but the AWS SSO group no longer exists.
     
    Figure 10: Error message

    Figure 10: Error message

  • If you receive a credential error when attempting to sign in to the AWS SSO browser window that’s launched by the VPN Client application, you might have an issue with the ACM certificate that you’re using. There can be authentication related issues if the root CA certificates aren’t correct or if any part of the certificate chain is missing.
  • Validate your EC2 instance security group rules and VPC route table configuration. From a routing perspective, your test EC2 instance must be accessible from the subnet that you selected when you created the Client VPN endpoint association.
  • If you want to see the SAML assertion that’s being sent to the AWS VPN client application. Sign in to the AWS SSO user portal, and hold down the Shift key while selecting the VPN client SAML application. A new browser tab will open with the SAML assertion visible. The SAML assertion contains the access group IDs of all groups that your test user is a member of. You can use this information to validate that the correct group memberships and group IDs are defined in your ingress authorization rules.
  • Make sure that TCP port 35001 is available on your client device. It shouldn’t be used by any other process or blocked by a firewall. Port 35001 only needs to be open on your localhost interface. The SAML assertion is sent to localhost on port 35001 as an HTTP POST from the browser window opened by the AWS VPN client application after a successful sign-in.

Clean up the test environment

To avoid charges for the use of AWS EC2, Client VPN, SSO, or ACM services, remove any components that were created as part of this walkthrough. Components that can be deleted if applicable are:

  1. The Client VPN endpoint. You must first remove all associations that were created for the endpoint.
  2. The EC2 instance and VPC.
  3. The test IdPs from IAM.
  4. The VPN client custom SAML applications from AWS SSO.
  5. AWS SSO users and groups.
  6. The ACM certificate.

Conclusion

In this blog post, we’ve shown how you can integrate Client VPN and AWS SSO to provide a familiar and seamless VPN connection experience to your users. By adding the Client VPN self-service portal, you can reduce the effort needed to deploy the solution by allowing users to perform their own VPN client application installation and configuration. We demonstrated the creation of IdPs using AWS SSO custom applications and then showed you how to configure a Client VPN endpoint to use SAML-based federated authentication and associate it with the IdPs. Client VPN users can then use their centralized credentials to connect to the Client VPN endpoint and access specific network ranges based upon their group membership or further refined through a client connection handler.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Drew Marumoto

Drew is a DevOps Consultant with Aws Professional Service. A long time system administrator with a passion for automation and orchestration, he enjoys solving difficult problems for customers and helping them achieve their business goals.

Author

Sylvia Qi

Sylvia is a DevOps Consultant focusing on architecting and automating DevOps processes, helping customers through their DevOps transformation journey, and achieving their goals. In her spare time, she enjoyes biking, swimming, painting, and photograhy.