All posts by Mikhail Vasilyev

Choosing a Well-Architected CI/CD approach: Open Source on AWS

Post Syndicated from Mikhail Vasilyev original https://aws.amazon.com/blogs/devops/choosing-a-well-architected-ci-cd-approach-open-source-on-aws/

Introduction

When building a CI/CD platform, it is important to make an informed decision regarding every underlying tool. This post explores evaluating the criteria for selecting each tool focusing on a balance between meeting functional and non-functional requirements, and maximizing value.

Your first decision: source code management.

Source code is potentially your most valuable asset, and so we start by choosing a source code management tool. These tools normally have high non-functional requirements in order to protect your assets and to ensure they are available to the organization when needed. The requirements usually include demand for high durability, high availability (HA), consistently high throughput, and strong security with role-based access controls.

At the same time, source code management tools normally have many specific functional requirements as well. For example, the ability to provide collaborative code review in the UI, flexible and tunable merge policies including both automated and manual gates (code checks), and out-of-box UI-level integrations with numerous other tools. These kinds of integrations can include enabling monitoring, CI, chats, and agile project management.

Many teams also treat source code management tools as their portal into other CI/CD tools. They make them shareable between teams, and might prefer to stay within one single context and user interface throughout the entire DevOps cycle. Many source code management tools are actually a stack of services that support multiple steps of your CI/CD workflows from within a single UI. This makes them an excellent starting point for building your CI/CD platforms.

The first decision your need to make is whether to go with an open source solution for managing code or with AWS-managed solutions, such as AWS CodeCommit. Open source solutions include (but are not limited to) the following: Gerrit, Gitlab, Gogs, and Phabricator.

You decision will be influenced by the amount of benefit your team can gain from the flexibility provided through open source, and how well your team can support deploying and managing these solutions. You will also need to consider the infrastructure and management overhead cost.

Engineering teams that have the capacity to develop their own plugins for their CI/CD platforms, or whom even contribute directly to open source projects, will often prefer open source solutions for the flexibility they provide. This will be especially true if they are fluent in designing and supporting their own cloud infrastructure. If the team gets more value by trading the flexibility of open source for not having to worry about managing infrastructure (especially if High Availability, Scalability, Durability, and Security are more critical) an AWS-managed solution would be a better choice.

Source Code Management Solution

When the choice is made in favor of an open-source code management solution (such as Gitlab), the next decision will be how to architect the deployment. Will the team deploy to a single instance, or design for high availability, durability, and scalability? Teams that want to design Gitlab for HA can use the following guide to proceed: Installing GitLab on Amazon Web Services (AWS)

By adopting AWS services (such as Amazon RDS, Amazon ElastiCache for Redis, and Autoscaling Groups), you can lower the management burden of supporting the underlying infrastructure in this self-managed HA scenario.

High level overview of self-managed HA Gitlab deployment

Your second decision: Continuous Integration engine

Selecting your CI engine, you might be able to benefit from additional features of previously selected solutions. Gitlab provides both source control services, as well as built-in CI tools, called Gitlab CI. Gitlab Runners are responsible for running CI jobs, and the actual jobs are described as YML files stored in Gitlab’s git repository along with product code. For security and performance reasons, GitLab Runners should be on resources separate from your GitLab instance.

You could manage those resources or you could use one of the AWS services that can support deploying and managing Runners. The use of an on-demand service removes the expense of implementing and managing a capability that is undifferentiated heavy lifting for you. This provides cost optimization and enables operational excellence. You pay for what you use and the service team manages the underlying service.

Continuous Integration engine Solution

In an architecture example (below), Gitlab Runners are deployed in containers running on Amazon EKS. The team has less infrastructure to manage, can start focusing on development faster by not having to implement the capability, and can provision resources in an optimal way for their on-demand needs.

To further optimize costs, you can use EC2 Spot Instances for your EKS nodes. CI jobs are normally compute intensive and limited in run time. The runner jobs can easily be restarted on a different resource with little impact. This makes them tolerant of failure and the use of EC2 Spot instances very appealing. Amazon EKS and Spot Instances are supported out-of-box in Gitlab. As a result there is no integration to develop, only configuration is required.

To support infrastructure as code best practices, Runners are deployed with Helm and are stored and versioned as Helm charts. All of the infrastructure as code information used to implement the CI/CD platform itself is stored in templates such as Terraform.

High level overview of Infrastructure as Code on Gitlab and Gitlab CI

High level overview of Infrastructure as Code on Gitlab and Gitlab CI

Your third decision: Container Registry

You will be unable to deploy Runners if the container images are not available. As a result, the primary non-functional requirements for your production container registry are likely to include high availability, durability, transparent scalability, and security. At the same time, your functional requirements for a container registry might be lower. It might be sufficient to have a simple UI, and simple APIs supporting basic flows. Customers looking for a managed solution can use Amazon ECR, which is OCI compliant and supports Helm Charts.

Container Registry Solution

For this set of requirements, the flexibility and feature velocity of open source tools does not provide an advantage. Self-supporting high availability and strengthened security could be costly in implementation time and long-term management. Based on [Blog post 1 Diagram 1], an AWS-managed solution provides cost advantages and has no management overhead. In this case, an AWS-managed solution is a better choice for your container registry than an open-source solution hosted on AWS. In this example, Amazon ECR is selected. Customers who prefer to go with open-source container registries might consider solutions like Harbor.

High level overview of Gitlab CI with Amazon ECR

High level overview of Gitlab CI with Amazon ECR

Additional Considerations

Now that the main services for the CI/CD platform are selected, we will take a high level look at additional important considerations. You need to make sure you have observability into both infrastructure and applications, that backup tools and policies are in place, and that security needs are addressed.

There are many mechanisms to strengthen security including the use of security groups. Use IAM for granular permission control. Robust policies can limit the exposure of your resources and control the flow of traffic. Implement policies to prevent your assets leaving your CI environment inappropriately. To protect sensitive data, such as worker secrets, encrypt these assets while in transit and at rest. Select a key management solution to reduce your operational burden and to support these activities such as AWS Key Management Service (AWS KMS). To deliver secure and compliant application changes rapidly while running operations consistently with automation, implement DevSecOps.

Amazon S3 is durable, secure, and highly available by design making it the preferred choice to store EBS-level backups by many customers. Amazon S3 satisfies the non-functional requirements for a backup store. It also supports versioning and tiered storage classes, making it a cost-effective as well.

Your observability requirements may emphasize versatility and flexibility for application-level monitoring. Using Amazon CloudWatch to monitor your infrastructure and then extending your capabilities through an open-source solutions such as Prometheus may be advantageous. You can get many of the benefits of both open-source Prometheus and AWS services with Amazon Managed Service for Prometheus (AMP). For interactive visualization of metrics, many customers choose solutions such as open-source Grafana, available as an AWS service Amazon Managed Service for Grafana (AMG).

CI/CD Platform with Gitlab and AWS

CI/CD Platform with Gitlab and AWS

Conclusion

We have covered how making informed decisions can maximize value and synergy between open-source solutions on AWS, such as Gitlab, and AWS-managed services, such as Amazon EKS and Amazon ECR. You can find the right balance of open-source tools and AWS services that will meet your functional and non-functional requirements, and help maximizing the value you get from those resources.

Pete Goldberg, Director of Partnerships at GitLab: “When aligning your development process to AWS Well Architected Framework, GitLab allows customers to build and automate processes to achieve Operational Excellence. As a single tool designed to facilitate collaboration across the organization, GitLab simplifies the process to follow the Fully Separated Operating Model where Engineering and Operations come together via automated processes that remove the historical barriers between the groups. This gives organizations the ability to efficiently and rapidly deploy new features and applications that drive the business while providing the risk mitigation and compliance they require. By allowing operations teams to define infrastructure as code in the same tool that the engineering teams are storing application code, and allowing your automation bring those together for your CI/CD workflows companies can move faster while having compliance and controls built-in, providing the entire organization greater transparency. With GitLab’s integrations with different AWS compute options (EC2, Lambda, Fargate, ECS or EKS), customers can choose the best type of compute for the job without sacrificing the controls required to maintain Operational Excellence.”

 

Author bio

Mikhail is a Solutions Architect for RUS-CIS. Mikhail supports customers on their cloud journeys with Well-architected best practices and adoption of DevOps techniques on AWS. Mikhail is a fan of ChatOps, Open Source on AWS and Operational Excellence design principles.

Choosing a CI/CD approach: Open Source on AWS, an Iponweb story

Post Syndicated from Mikhail Vasilyev original https://aws.amazon.com/blogs/devops/choosing-a-ci-cd-approach-open-source-on-aws-an-iponweb-story/

Iponweb is a global leader in building programmatic and real-time advertising technology and infrastructure for some of the world’s biggest digital media buyers and sellers. The company develops client-facing products and internal development tools that must be platform agnostic to support spanning across multiple cloud services.

In this post, we explore how Iponweb applied key considerations when choosing a continuous integration, continuous deployment (CI/CD), what they determined to be the right CI/CD approach for them, and review some considerations that may apply to your own business needs. And in the next post, we will dive even deeper into these key considerations.

How did Iponweb decide what they needed?

The first and most important question in designing a Well-Architected approach is: “How do you determine your priorities?” AWS Well-Architected defines the first two best practices to do that as: ”evaluate external customer needs” (Iponweb’s clients) and “evaluate internal customer needs” (Iponweb’s team).

Iponweb started with these two considerations while selecting the strategic toolset. After evaluating their customers’ requirements, the next step was to look at the needs of the Iponweb team. Their priorities included the products and features required, the cost, and the ability to build multi-cloud solutions.

Iponweb is dedicated to operating securely with the reliability and performance to support their customers. Solutions had to satisfy their fundamental requirements in these areas to be considered in their evaluation.

Feature set

Iponweb evaluated available options for the CI tool chain and found that, for their needs, GitLab was the clear winner, differentiated by delivering the greatest number of required features at the best price while being platform agnostic.

AWS had the complete set of tools, services, and best practices to support Iponweb’s goal to establish an open-source, self-hosted CI environment using GitLab. Upon completing their thorough evaluation process, Iponweb selected AWS to implement its CI environment.

Cost

Iponweb understood the investment they would be making within their team to leverage and support all the desired features of GitLab. Iponweb evaluated the expertise of its internal teams and factored in ease of integration with supporting services.

They adopted several AWS services that satisfied their undifferentiated needs, which allowed them to remove the operational burden and cost of maintaining their own implementations of various capabilities and features.

Furthermore, the availability of Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances provided the opportunity to further manage costs for their CI resource needs and usage patterns.

Security

Iponweb leveraged their existing security control implementations and integration with AWS to support adopting additional AWS services. AWS was responsible for the security of the cloud, including the underlying AWS services. Iponweb was able to focus on secure and effective configurations of those services and secure and effective configuration of their GitLab implementation. This ensured the security of their open-source, self-hosted CI environment.

When setting priorities for the design of a Well-Architected approach, it’s imperative to “manage benefits and risks,” which emphasizes making informed decisions when adopting open source or any tools. Iponweb achieved their best value solution by applying Well-Architected practices in Operational Excellence, Cost Optimization, and Security pillars by leveraging AWS products and services.

Overview of solution

Continuous integration consists of three key processes, each of which AWS supports:

  • Code stage – Iponweb built the centralized Git repository on the GitLab platform on EC2 servers, providing the UI and API to store and manage the code.
  • Test and build stage – They used GitLab as the application layer to manage build and test flows through GitLab Runners (compute workers for CI jobs). This layer is implanted via GitLab in containers, and is deployed and managed by Amazon Elastic Kubernetes Service (Amazon EKS).
  • Publish stageAmazon Elastic Container Registry (Amazon ECR) stores the infrastructure containers for the runners and product containers.

The following diagram illustrates this architecture:

At the core of Iponweb’s CI platform architecture is the open-source GitLab Community Edition.

Implementing the solution

CI jobs are either run regularly or triggered by events such as merge requests. The jobs are described as code in YAML files and are stored and versioned along with the product code itself. Runner versions are published into Amazon ECR and launched as Docker containers in Amazon EKS.

Runner code is stored as Helm charts that help Iponweb package up and manage their large-scale Kubernetes deployments. In addition, Amazon EKS has support for Helm and many other plugins for Kubernetes.

Iponweb developers innovate at a very fast pace, and customize Iponweb’s client solutions in rapid iterations. To address uncertain container registry requirements, Iponweb decided to use Amazon ECR. As a managed service, Amazon ECR eliminates concerns about scaling capacity and management. Integration of GitLab with Amazon EKS and Amazon ECR is provided out of the box through a UI and predefined scripts, with no additional overhead to develop and deploy code or plugins.

Iponweb was able to implement the Well-Architected design principle: “stop continuously estimating its capacity needs.” Enabling them to focus on more strategic development activities. They performed a thorough analysis of each component, looking at the total cost of ownership, including operations and management. In doing so, they implemented the best practice from the Cost Optimization pillar: “How do you evaluate cost when you select services?”

In the Cost Optimization pillar, a key question is “How do you use pricing models to reduce costs?” Iponweb deployed runners in Amazon EKS for precise, granular, and on-demand compute scaling for each CI job. These tasks have short-term capacity needs, so Iponweb benefited from configuring Amazon EKS on Spot Instances, achieving factor price reduction. The EC2 Spot pricing model is most appropriate for their CI resource needs and usage patterns.

To protect their data at rest, Iponweb followed a best practice from the Security pillar: “Implement secure key management.” They used AWS Key Management Service (AWS KMS) to manage secrets for the runners.

To protect the code and artifacts, and to ensure these valuable assets don’t leave the CI environment inappropriately, Iponweb followed best practices in Infrastructure Protection from the Security pillar question, “How do you protect your networks?” Iponweb scrupulously defined the network protection requirements, limiting their exposure by controlling traffic at all layers, and implementing security groups to prevent inappropriate access into and out of their VPC.

Michael Benuhis, CTO at Iponweb, says:

“Iponweb was able to get the best of open-source software and public cloud services by building the continuous integration platform on Amazon Web Services. Open-source tools provided Iponweb platform agnosticism for serving our diverse customer base, while managed Amazon EKS on EC2 Spot Instances eliminated the operational burden of managing our own Kubernetes infrastructure, and with greater cost efficiency.”

Conclusion

Iponweb has satisfied their current needs and aren’t looking for improvement in the short term. They will stay on the free version of GitLab, satisfied for the moment with what they have achieved. They have custom automations in place to synchronize with GitLab and integrate with their existing tools. They like the features provided by the paid version of GitLab, but there isn’t a business case to support an informed decision to upgrade at this time.

They have achieved their goal of using Amazon EKS and Spot under GitLab CI/CD integrated with their existing systems and satisfying their needs.