How Pushly Media used AWS to pivot and quickly spin up a StartUp

Post Syndicated from Eddie Moser original https://aws.amazon.com/blogs/devops/how-pushly-media-used-aws-to-pivot-and-quickly-spin-up-a-startup/

This is a guest post from Pushly. In their own words, “Pushly provides a scalable, easy-to-use platform designed to deliver targeted and timely content via web push notifications across all modern desktop browsers and Android devices.”

Introduction

As a software engineer at Pushly, I’m part of a team of developers responsible for building our SaaS platform.

Our customers are content publishers spanning the news, ecommerce, and food industries, with the primary goal of increasing page views and paid subscriptions, ultimately resulting in increased revenue.

Pushly’s platform is designed to integrate seamlessly into a publisher’s workflow and enables advanced features such as customizable opt-in flow management, behavioral targeting, and real-time reporting and campaign delivery analytics.

As developers, we face various challenges to make all this work seamlessly. That’s why we turned to Amazon Web Services (AWS). In this post, I explain why and how we use AWS to enable the Pushly user experience.

At Pushly, my primary focus areas are developer and platform user experience. On the developer side, I’m responsible for building and maintaining easy-to-use APIs and a web SDK. On the UX side, I’m responsible for building a user-friendly and stable platform interface.

The CI/CD process

We’re a cloud native company and have gone all in with AWS.

AWS CodePipeline lets us automate the software release process and release new features to our users faster. Rapid delivery is key here, and CodePipeline lets us automate our build, test, and release process so we can quickly and easily test each code change and fail fast if needed. CodePipeline is vital to ensuring the quality of our code by running each change through a staging and release process.

One of our use cases is continuous reiteration deployment. We foster an environment where developers can fully function in their own mindset while adhering to our company’s standards and the architecture within AWS.

We deploy code multiple times per day and rely on AWS services to run through all checks and make sure everything is packaged uniformly. We want to fully test in a staging environment before moving to a customer-facing production environment.

The development and staging environments

Our development environment allows developers to securely pull down applications as needed and access the required services in a development AWS account. After an application is tested and is ready for staging, the application is deployed to our staging environment—a smaller reproduction of our production environment—so we can test how the changes work together. This flow allows us to see how the changes run within the entire Pushly ecosystem in a secure environment without pushing to production.

When testing is complete, a pull request is created for stakeholder review and to merge the changes to production branches. We use AWS CodeBuild, CodePipeline, and a suite of in-house tools to ensure that the application has been thoroughly tested to our standards before being deployed to our production AWS account.

Here is a high-level diagram of the environment described above:

Diagram showing at a high level the Pushly environment.

Ease of development

Ease of development was—and is—key. AWS provides the tools that allow us to quickly iterate and adapt to ever-changing customer needs. The infrastructure as code (IaC) approach of AWS CloudFormation allows us to quickly and simply define our infrastructure in an easily reproducible manner and rapidly create and modify environments at scale. This has given us the confidence to take on new challenges without concern over infrastructure builds impacting the final product or causing delays in development.

The Pushly team

Although Pushly’s developers all have the skill set to work on both front-end and back-end projects, primary responsibilities are split between front-end and back-end developers. Developers who primarily focus on front-end projects concentrate on public-facing projects and internal management systems. The back-end team focuses on the underlying architecture, delivery systems, and the ecosystem as a whole. Together, we create and maintain a product that allows you to segment and target your audiences, which ensures relevant delivery of your content via web push notifications.

Early on, we ran all services entirely on AWS Lambda. This allowed us to develop new features quickly in an elastic, cost-efficient way. As our applications have matured, we’ve identified some services that would benefit from an always-on environment and moved them to AWS Elastic Beanstalk. The capability to quickly iterate and move from service to service is a credit to AWS, because it allows us to customize and tailor our services across multiple AWS offerings.

Elastic Beanstalk has been the fastest and simplest way for us to deploy this suite of services on AWS; its blue/green deployments allow us to keep downtime to a minimum during deployments. We can easily configure deployment environments with capacity provisioning, load balancing, autoscaling, and application health monitoring.

The business side

We had several business drivers behind choosing AWS: we wanted to make it easier to meet customer demands and continually scale as much as needed without worrying about the impact on development or on our customers.

Using AWS services allowed us to build our platform from inception to our initial beta offering in fewer than 2 months! AWS made it happen with tools for infrastructure deployment on top of the software deployment. Specifically, IaC allowed us to tailor our infrastructure to our specific needs and be confident that it’s always going to work.

On the infrastructure side, we knew that we wanted to have a staging environment that truly mirrored the production environment, rather than managing two entirely disparate systems. We could provide different sets of mappings based on accounts and use the templates across multiple environments. This functionality allows us to use the exact same code we use in our current production environment and easily spin up additional environments in 2 hours.

The need for speed

It took a very short time to get our project up and running, which included rewriting different pieces of the infrastructure in some places and completely starting from scratch in others.

One of the new services that we adopted is AWS CodeArtifact. It lets us have fully customized private artifact stores in the cloud. We can keep our in-house libraries within our current AWS accounts instead of relying on third-party services.

CodeBuild lets us compile source code, run test suites, and produce software packages that are ready to deploy while only having to pay for the runtime we use. With CodeBuild, you don’t need to provision, manage, and scale your own build servers, which saves us time.

The new tools that AWS is releasing are going to even further streamline our processes. We’re interested in the impact that CodeArtifact will have on our ability to share libraries in Pushly and with other business units.

Cost savings is key

What are we saving by choosing AWS? A lot. AWS lets us scale while keeping costs at a minimum. This was, and continues to be, a major determining factor when choosing a cloud provider.

By using Lambda and designing applications with horizontal scale in mind, we have scaled from processing millions of requests per day to hundreds of millions, with very little change to the underlying infrastructure. Due to the nature of our offering, our traffic patterns are unpredictable. Lambda allows us to process these requests elastically and avoid over-provisioning. As a result, we can increase our throughput tenfold at any time, pay for the few minutes of extra compute generated by a sudden burst of traffic, and scale back down in seconds.

In addition to helping us process these requests, AWS has been instrumental in helping us manage an ever-growing data warehouse of clickstream data. With Amazon Kinesis Data Firehose, we automatically convert all incoming events to Parquet and store them in Amazon Simple Storage Service (Amazon S3), which we can query directly using Amazon Athena within minutes of being received. This has once again allowed us to scale our near-real-time data reporting to a degree that would have otherwise required a significant investment of time and resources.

As we look ahead, one thing we’re interested in is Lambda custom stacks, part of AWS’s Lambda-backed custom resources. Amazon supports many languages, so we can run almost every language we need. If we want to switch to a language that AWS doesn’t support by default, they still provide a way for us to customize a solution. All we have to focus on is the code we’re writing!

The importance of speed for us and our customers is one of our highest priorities. Think of a news publisher in the middle of a briefing who wants to get the story out before any of the competition and is relying on Pushly—our confidence in our ability to deliver on this need comes from AWS services enabling our code to perform to its fullest potential.

Another way AWS has met our needs was in the ease of using Amazon ElastiCache, a fully managed in-memory data store and cache service. Although we try to be as horizontal thinking as possible, some services just can’t scale with the immediate elasticity we need to handle a sudden burst of requests. We avoid duplicate lookups for the same resources with ElastiCache. ElastiCache allows us to process requests quicker and protects our infrastructure from being overwhelmed.

In addition to caching, ElastiCache is a great tool for job locking. By locking messages by their ID as soon as they are received, we can use the near-unlimited throughput of Amazon Simple Queue Service (Amazon SQS) in a massively parallel environment without worrying that messages are processed more than once.

The heart of our offering is in the segmentation of subscribers. We allow building complex queries in our dashboard that calculate reach in real time and are available to use immediately after creation. These queries are often never-before-seen and may contain custom properties provided by our clients, operate on complex data types, and include geospatial conditions. No matter the size of the audience, we see consistent sub-second query times when calculating reach. We can provide this to our clients using Amazon Elasticsearch Service (Amazon ES) as the backbone to our subscriber store.

Summary

AWS has countless positives, but one key theme that we continue to see is overall ease of use, which enables us to rapidly iterate. That’s why we rely on so many different AWS services—Amazon API Gateway with Lambda integration, Elastic Beanstalk, Amazon Relational Database Service (Amazon RDS), ElastiCache, and many more.

We feel very secure about our future working with AWS and our continued ability to improve, integrate, and provide a quality service. The AWS team has been extremely supportive. If we run into something that we need to adjust outside of the standard parameters, or that requires help from the AWS specialists, we can reach out and get feedback from subject matter experts quickly. The all-around capabilities of AWS and its teams have helped Pushly get where we are, and we’ll continue to rely on them for the foreseeable future.

AWS CodeArtifact and your package management flow – Best Practices for Integration

Post Syndicated from John Standish original https://aws.amazon.com/blogs/devops/integrating-aws-codeartifact-package-mgmt-flow/

You often use artifact repositories to store and share software or deployment packages. Centralized artifacts enable teams to operate independently and share versioned software artifacts across your organization. Sharing versioned artifacts across organizations increases code reuse and reduces delivery time. Having a central artifact store enables tighter artifact governance and improves security visibility. This post uses some of these patterns to show you how to integrate AWS CodeArtifact in an effective, cost-controlled, and efficient manner.

AWS CodeArtifact Diagram

AWS CodeArtifact Service Usage

AWS CodeArtifact concepts

AWS CodeArtifact uses the following elements:

  • Asset – An individual file stored in AWS CodeArtifact that is associated with a package version, such as an npm .tgz file or Maven POM and JAR files.
  • Package – A package is a bundle of software and the metadata that is required to resolve dependencies and install the software. AWS CodeArtifact supports npm, PyPI, and Maven package formats.
  • Repository – A CodeArtifact repository contains a set of package versions, each of which maps to a set of assets. Repositories are polyglot: a single repository can contain packages of any supported type. Each repository exposes endpoints for fetching and publishing packages using tools like the npm CLI, the Maven CLI (mvn), and pip.
  • Domain – Repositories are aggregated into a higher-level entity known as a domain. The domain allows organizational policy to be applied across multiple repositories. A domain deduplicates storage of the repositories’ packages.

Creating a domain based on organizational ownership

When you create a domain in CodeArtifact, it’s important to organize the domain by ownership within the organization. For example, a company can be a domain and its products can be repositories. Domains allow you to apply organizational policies across multiple repositories. Generally, we recommend creating one domain per company. In some cases it may also be beneficial to have a sandbox domain where prototype repositories reside. In a sandbox domain, teams are at liberty to create their own repositories and experiment as needed without affecting product deliverable assets. Keep the trade-offs in mind: a sandbox domain duplicates packages and increases costs, because package deduplication is handled at the domain level, and it isolates repositories, because you cannot copy packages between domains. Organizing packages by domain ownership increases the cache hits on a package within the domain and reduces cost for each subsequent package fetch request.

Whenever a package is fetched from a repository, the asset is cached in your CodeArtifact domain to minimize the cost of subsequent downstream requests. A given asset only needs to be stored once in a domain, even if it’s available in two—or two thousand—repositories. That means you only pay for storage once. Copying a package version with the CopyPackageVersions API is only possible between repositories within the same CodeArtifact domain.

You can create a domain for your organization by calling create-domain in the AWS Command Line Interface (AWS CLI), AWS SDK, or on the CodeArtifact console. See the following code:

aws codeartifact create-domain --domain "my-org"

After you create the domain, it is listed in the Domains section on the CodeArtifact console.

AWS CodeArtifact domains per governing organization

Organizing packages by domain ownership

Using a shared repository

A shared repository is applicable when a team feels that a component is useful to the rest of the organization. It isn’t the place for a component that is experimental, a personal project, or not meant for wide distribution within the organization. Examples of shared components are packages from open source public repositories (npm, PyPI, and Maven) and authentication, logging, or helper libraries. Shared libraries aren’t related to product libraries; for instance, a service contract library shouldn’t live in a shared repository. The shared repository should be marked read-only to all users except for the publishing IAM role. At Amazon, we have found that many teams want to consume common packages as part of their application build, and don’t need to publish any package themselves. Those teams don’t need their own repository and pull packages from the shared repository. Overall, approximately 80% of packages are downloaded from the shared repository, and 20% from team or project specific repositories.

You can create a shared repository by calling the create-repository command and setting a resource policy that makes the repository read-only.

Here is how you create a repository with the AWS CLI using the create-repository command. See the following code:

aws codeartifact create-repository --domain "my-org" \
--domain-owner "account-id" --repository "my-shared-repo-name" \
--description "My new repository"

Next you make the repository read-only by setting a resource policy. See the following code:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "codeartifact:DescribePackageVersion",
                "codeartifact:DescribeRepository",
                "codeartifact:GetPackageVersionReadme",
                "codeartifact:GetRepositoryEndpoint",
                "codeartifact:ListPackages",
                "codeartifact:ListPackageVersions",
                "codeartifact:ListPackageVersionAssets",
                "codeartifact:ListPackageVersionDependencies",
                "codeartifact:ReadFromRepository"
            ],
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::444455556666:root"
            },
            "Resource": "*"
        }
    ]
}

Attach the resource policy to the repository by calling the put-repository-permissions-policy command. See the following code:

aws codeartifact put-repository-permissions-policy --domain "my-org" \
--domain-owner "account-id" --repository "my-shared-repo-name" \
--policy-document file:///PATH/TO/policy.json

When you have created the repository, you will see it listed in the Repositories section on the CodeArtifact console.

A list of shared repositories in AWS CodeArtifact

Shared repositories in AWS CodeArtifact

External repository connections

CodeArtifact enables you to set external repository connections and replicate them within CodeArtifact. An external connection reduces the downstream dependency on the remote external repository. When you request a package from the CodeArtifact repository that’s not already present in the repository, the package can be fetched from the external connection. This makes it possible to consume open-source dependencies used by your application. Using an external connection also reduces interruptions to your development process caused by external package dependencies. For example, if a package is removed from a public repository, you still have a copy of the package stored in CodeArtifact.

You should have a one-to-one mapping with external repositories rather than multiple CodeArtifact repositories pointing to the same public repository. Each asset that CodeArtifact imports into your repository from a public repository is billed as a single request, and each connection must reconcile and fetch the package before the response is returned. With a one-to-one mapping, you increase cache hits, reduce the time to download an application dependency from CodeArtifact, and reduce the number of external package resolution requests.

Associate an external repository connection with your repository by using the associate-external-connection command. See the following code:

aws codeartifact associate-external-connection \
--domain "my-org" --domain-owner "account-id" \
--repository "my-external-repo" --external-connection public:npmjs

Once you have associated an external connection with your repository, you’ll see the external connection visible in the Repositories section detail. In this example we’ve connected the repository to the external npmjs repository.

External connection to npmjs with AWS CodeArtifact repositories

External connection to npmjs for an AWS CodeArtifact repository
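One way to keep the one-to-one mapping described above is to attach the external connection to a single repository and have other repositories list that repository as an upstream. The following is a minimal sketch under that assumption, not the only possible layout. The helper name add_upstream is hypothetical, and the domain, owner, and repository values are the placeholders used elsewhere in this post.

```shell
#!/usr/bin/env bash
# Sketch: point a repository at the one repository that holds the external
# connection, instead of giving every repository its own connection.
# add_upstream is a hypothetical helper name.
#   $1 = domain, $2 = domain owner account ID,
#   $3 = repository to update, $4 = repository with the external connection
# Note: --upstreams sets the repository's upstream list to the value given.
add_upstream() {
  aws codeartifact update-repository \
    --domain "$1" --domain-owner "$2" \
    --repository "$3" \
    --upstreams "repositoryName=$4"
}
```

For example, calling add_upstream "my-org" "account-id" "my-team-repo" "my-external-repo" would let requests to my-team-repo fall through to my-external-repo, which in turn fetches from npmjs.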

Team and product repositories

When working in distributed teams, you often align repositories to the product or service ownership. Teams working on their own repository can update it as needed. An example would be creating a private package that your team only uses internally.

See the following code:

aws codeartifact create-repository --domain my-org \
--domain-owner account-id --repository my-team-repo \
--description "My new team repository"

As teams develop against the package, they need to publish their changes to the repository. You would typically do this as part of your development pipeline. See the following code for an example:

# Log in to CodeArtifact
aws codeartifact login --tool npm \
--domain "my-org" --domain-owner "account-id" \
--repository "my-team-repo"

# Run build commands here
...

# Set $VERSION from your build system
npm version $VERSION

# Publish to CodeArtifact
npm publish

After you test the feature and find that it is useful across your organization, you can copy the package into your shared repository. See the following code:

# Promoting to a shared repo
aws codeartifact copy-package-versions --domain "my-org" \
--domain-owner "account-id" --source-repository "my-team-repo" \
--destination-repository "my-shared-repo-name" \
--package my-package --format npm \
--versions '["6.0.2"]'

Once you’ve created your shared repository you will see the repositories updated as shown here.

Team and product repositories in AWS CodeArtifact

Team and product repositories

Sharing repositories across accounts

Often teams or workloads have separate accounts within an organization. This is a recommended practice because it clearly defines operational boundaries and domain of ownership and establishes security boundaries. If your organization uses a multi-account strategy, you can share repositories across accounts using CodeArtifact resource policy. Teams can develop in their own account and publish to a CodeArtifact repository controlled in a shared account.
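As an illustration of the resource policy mentioned above, the following sketch prints a repository policy that lets another account read from and publish to a repository. The helper name cross_account_policy is hypothetical and the account ID is a placeholder. The action list is one plausible minimum, so adjust it to your own requirements.

```shell
#!/usr/bin/env bash
# Sketch: print a repository resource policy that allows a second account to
# read from and publish to the repository.
# cross_account_policy is a hypothetical helper name.
#   $1 = account ID of the account that should get access (placeholder)
cross_account_policy() {
  cat <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::$1:root"
            },
            "Action": [
                "codeartifact:ReadFromRepository",
                "codeartifact:PublishPackageVersion",
                "codeartifact:PutPackageMetadata"
            ],
            "Resource": "*"
        }
    ]
}
EOF
}
```

You could redirect the output to a file, for example cross_account_policy 444455556666 > policy.json, and attach it with the put-repository-permissions-policy command shown earlier. Principals in the other account will most likely also need permission to obtain an authorization token for the domain (codeartifact:GetAuthorizationToken and sts:GetServiceBearerToken), which this sketch does not cover.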

Here you see a list of repositories, which includes both a shared and team repository.

Cross account sharing of AWS CodeArtifact repositories

Using Amazon CloudWatch Events when a package is pushed

When a package is pushed into a repository, the change can affect software dependencies, teams, or process dependencies. When an artifact is pushed to CodeArtifact, an Amazon CloudWatch Events event is emitted, which you can use to trigger additional functionality. You can react to these events by subscribing to a CodeArtifact event in Amazon EventBridge. Some examples of reactions to a change are checking dependencies, deploying dependent services, notifying teams or services of a change, or building the dependencies.

You can also use EventBridge to start a pipeline in AWS CodePipeline, notify an Amazon Simple Notification Service (Amazon SNS) topic, and have that call AWS Chatbot. For more information, see CodeArtifact event format and example. If you are looking to integrate AWS Chatbot into your delivery flow, see Receive AWS Developer Tools Notifications over Slack using AWS Chatbot.
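To make the subscription step concrete, the following sketch builds an EventBridge event pattern for package version changes in a single repository and creates a rule from it. The helper names package_event_pattern and create_package_rule, and the rule name you pass in, are hypothetical. Targets such as a pipeline or an SNS topic would be attached separately, for example with aws events put-targets.

```shell
#!/usr/bin/env bash
# Sketch: print an event pattern that matches CodeArtifact package version
# changes in one repository. Helper names are hypothetical.
#   $1 = domain name, $2 = repository name
package_event_pattern() {
  cat <<EOF
{
    "source": ["aws.codeartifact"],
    "detail-type": ["CodeArtifact Package Version State Change"],
    "detail": {
        "domainName": ["$1"],
        "repositoryName": ["$2"]
    }
}
EOF
}

# Create (or update) an EventBridge rule that uses the pattern above.
#   $1 = rule name, $2 = domain name, $3 = repository name
create_package_rule() {
  aws events put-rule --name "$1" \
    --event-pattern "$(package_event_pattern "$2" "$3")"
}
```

For example, create_package_rule "on-package-publish" "my-org" "my-team-repo" would create a rule scoped to the team repository used in this post.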

Deploying code in a hybrid environment

You can enable seamless software deployment into AWS and on-premises environments by integrating CodeArtifact with software build and deployment services. You can use CodeArtifact with your existing development pipeline tooling such as npm, pip, and Maven. With native support for these package managers, you can access CodeArtifact wherever you operate today.

First, log in to CodeArtifact, build your code, and finally publish using npm publish with the following code:

# Log in to CodeArtifact 
aws codeartifact login --tool npm \
--domain "my-org" --domain-owner "account-id" \
--repository "my-team-repo"

# Run build commands here 
... 

# Set $VERSION from your build system 
npm version $VERSION 

# Publish to CodeArtifact 
npm publish
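The same login step applies to Python tooling. The following sketch configures pip (for installing) and twine (for publishing) against a repository. The helper name login_python_tools is hypothetical, and the arguments are the same placeholders used in the npm example.

```shell
#!/usr/bin/env bash
# Sketch: log in both pip and twine to the same CodeArtifact repository.
# login_python_tools is a hypothetical helper name.
#   $1 = domain, $2 = domain owner account ID, $3 = repository
login_python_tools() {
  for tool in pip twine; do
    aws codeartifact login --tool "$tool" \
      --domain "$1" --domain-owner "$2" \
      --repository "$3" || return 1
  done
}
```

After calling login_python_tools "my-org" "account-id" "my-team-repo", pip install resolves packages through the repository, and twine can upload built distributions to it.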

Cleaning Up

When you’re ready to clean up the repositories and domains you’ve created, you’ll need to remove them in a specific order. Please be aware that deleting a repository is a destructive action which will remove any stored packages. To delete a domain and delete a repository created from the previous sections in this blog, you will be using the delete-domain and delete-repository commands.

You will need to remove the domain and repository in the following order:

  1. Remove any repositories in a domain
  2. Remove the domain

To delete the repository and domain, see the following code:

# Delete the repository
aws codeartifact delete-repository --domain "my-org" --domain-owner "account-id" --repository "my-team-repo"

# Delete the domain
aws codeartifact delete-domain --domain "my-org" --domain-owner "account-id"
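If a domain holds several repositories, the ordering above can be scripted. The following sketch lists the repositories in a domain, deletes each one, and then deletes the domain. The helper name delete_domain_and_repos is hypothetical. Because this is destructive, treat it as an illustration and review the repository list before running anything like it.

```shell
#!/usr/bin/env bash
# Sketch: delete every repository in a domain, then the domain itself.
# delete_domain_and_repos is a hypothetical helper name.
#   $1 = domain name, $2 = domain owner account ID
delete_domain_and_repos() {
  # Collect repository names as whitespace-separated text.
  repos="$(aws codeartifact list-repositories-in-domain \
    --domain "$1" --domain-owner "$2" \
    --query 'repositories[].name' --output text)" || return 1

  # Step 1: remove the repositories (this removes their stored packages).
  for repo in $repos; do
    aws codeartifact delete-repository \
      --domain "$1" --domain-owner "$2" --repository "$repo" || return 1
  done

  # Step 2: remove the now-empty domain.
  aws codeartifact delete-domain --domain "$1" --domain-owner "$2"
}
```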

Conclusion

This post covered how to integrate CodeArtifact into your delivery flow and use CodeArtifact effectively. A shared repository approach aids in creating reusable components across your organization. Using team repositories and promoting to a consumable repository allows your teams to iterate independently. For more information, see Getting started with CodeArtifact.

About the Authors

John Standish

John Standish is a Solutions Architect at AWS and spent over 13 years as a Microsoft .NET developer. Outside of work, he enjoys playing video games, cooking, and watching hockey.

Yogesh Chaturvedi

Yogesh Chaturvedi is a Solutions Architect at AWS and has over 20 years of software development and architecture experience.