Tag Archives: open source

Public Preview – AWS Distro for OpenTelemetry

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/public-preview-aws-distro-open-telemetry/

It took me a while to figure out what observability was all about. A year or two ago, I asked around and my colleagues told me that I needed to follow Charity Majors and to read her blog (done, and done). Just this week, Charity tweeted:

Kislay’s tweet led to his blog post, Observing is not Debugging, which I found very helpful. As Charity noted, Kislay tells us that Observability is a study of the system in motion.

Today’s large-scale distributed applications and systems are effectively always in motion. Whether serving web requests, processing streams of data or handling events, something is always happening. At world-scale, looking at individual requests or events is not always feasible. Instead, it is necessary to take a statistical approach and to watch how well a system is working, instead of simply waiting for a total failure.

New AWS Distro for OpenTelemetry
Today we are launching a preview of AWS Distro for OpenTelemetry. We are part of the Cloud Native Computing Foundation (CNCF)’s OpenTelemetry community, working to define an open standard for the collection of distributed traces and metrics. AWS Distro for OpenTelemetry is a secure and supported distribution of the APIs, libraries, agents, and collectors defined in the OpenTelemetry Specification.

One of the coolest features of the toolkit is auto instrumentation. Starting with Java and in the works for other languages and environments (.NET and JavaScript are next), the auto-instrumentation agent identifies the frameworks and languages used by your application and automatically instruments them to collect and forward metrics and traces.
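
As a rough illustration of what attaching the Java auto-instrumentation agent can look like (the agent jar name, collector endpoint, and port below are assumptions rather than details from this post):

$ export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
$ java -javaagent:./aws-opentelemetry-agent.jar -jar my-service.jar

The agent attaches to the JVM at startup, so metrics and traces are collected and forwarded to the local collector without changing the application’s code.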

Here’s how all of the pieces fit together:

The AWS Observability Collector runs within your environment. It can be launched as a sidecar or daemonset for EKS, a sidecar for ECS, or an agent on EC2. You configure the metrics and traces that you want to collect, and also which AWS services to forward them to. You can set up a central account for monitoring complex multi-account applications, and you can also control the sampling rate (what percentage of the raw data is forwarded and ultimately stored).

Partners in Action
You can make use of AWS and partner tools and applications to observe, analyze, and act on what you see. We’re working with Cisco AppDynamics, Datadog, New Relic, Splunk, and other partners and will have more information to share during the preview.

Things to Know
The preview of the AWS Distro for OpenTelemetry is available now and you can start using it today. In addition to the .NET and JavaScript support that I mentioned earlier, we plan to support Python, Ruby, Go, C++, Erlang, and Rust as well.

This is an open source project and we welcome your pull requests! We will be tracking the upstream repository and plan to release a fresh version of the toolkit quarterly.

Jeff;

PS – Be sure to sign up for our upcoming webinar, Observability at AWS and AWS Distro for OpenTelemetry Deep Dive.

Testing cloud apps with GitHub Actions and cloud-native open source tools

Post Syndicated from Sarah Khalife original https://github.blog/2020-10-09-devops-cloud-testing/

See this post in action during GitHub Demo Days on October 16.

What makes a project successful? For developers building cloud-native applications, successful projects thrive on transparent, consistent, and rigorous collaboration. That collaboration is one of the reasons that many open source projects, like Docker containers and Kubernetes, grow to become standards for how we build, deliver, and operate software. Our Open Source Guides and Introduction to innersourcing are great first steps to setting up and encouraging these best practices in your own projects.

However, a common challenge that application developers face is manually testing against inconsistent environments. Accurately testing Kubernetes applications can differ from one developer’s environment to another, and implementing a rigorous and consistent environment for end-to-end testing isn’t easy. It can also be very time consuming to spin up and down Kubernetes clusters. The inconsistencies between environments and the time required to spin up new Kubernetes clusters can negatively impact the speed and quality of cloud-native applications.

Building a transparent CI process

On GitHub, integration and testing becomes a little easier by combining GitHub Actions with open source tools. You can treat Actions as the native continuous integration and continuous delivery (CI/CD) tool for your project, and customize your Actions workflow to include automation and validation as next steps.

Since Actions can be triggered based on nearly any GitHub event, it’s also possible to build in accountability for updating tests and fixing bugs. For example, when a developer creates a pull request, Actions status checks can automatically block the merge if the test fails.

Here are a few more examples:

Branch protection rules in the repository help enforce certain workflows, such as requiring more than one pull request review or requiring certain status checks to pass before allowing a pull request to merge.

GitHub Actions are natively configured to act as status checks when they’re set up to trigger `on: [pull_request]`.

Continuous integration (CI) is extremely valuable as it allows you to run tests before each pull request is merged into production code. In turn, this reduces the number of bugs that are pushed into production and increases confidence that newly introduced changes will not break existing functionality.

But transparency remains key: Requiring CI status checks on protected branches provides a clearly-defined, transparent way to let code reviewers know if the commits meet the conditions set for the repository—right in the pull request view.

Using community-powered workflows

Now that we’ve thought through the simple CI policies, automated workflows are next. Think of an Actions workflow as a set of “plug and play” open source, automated steps contributed by the community. You can use them as they are, or customize and make them your own. Once you’ve found the right one, open source Actions can be plugged into your workflow with the `- uses: repo/action-name` field.

You might ask, “So how do I find available Actions that suit my needs?”

The GitHub Marketplace!

As you’re building automation and CI pipelines, take advantage of Marketplace to find pre-built Actions provided by the community. Examples of pre-built Actions span from a Docker publish and the kubectl CLI installation to container scans and cloud deployments. When it comes to cloud-native Actions, the list keeps growing as container-based development continues to expand.

Testing with kind

Testing is a critical part of any CI/CD pipeline, but running tests in Kubernetes can absorb the extra time that automation saves. Enter kind. kind stands for “Kubernetes in Docker.” It’s an open source project from the Kubernetes special interest groups (SIGs) community, and a tool for running local Kubernetes clusters using Docker container “nodes.” Creating a kind cluster is a simple way to run Kubernetes cluster and application testing—without having to spin up a complete Kubernetes environment.
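
For illustration, here is a minimal command-line sketch of the kind of per-pull-request test run described above (the cluster name, manifest path, and label selector are placeholders; in a real setup these steps would run inside an Actions workflow):

$ kind create cluster --name pr-test
$ kubectl apply -f ./manifests/
$ kubectl wait --for=condition=ready pod -l app=my-app --timeout=120s
# ...run end-to-end tests against the cluster here...
$ kind delete cluster --name pr-test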

As the number of Kubernetes users pushing critical applications to production grows, so does the need for a repeatable, reliable, and rigorous testing process. This can be accomplished by combining the creation of a homogenous Kubernetes testing environment with kind, the community-powered Marketplace, and the native and transparent Actions CI process.

Bringing it all together with kind and Actions

Come see kind and Actions at work during our next GitHub Demo Day live stream on October 16, 2020 at 11am PT. I’ll walk you through how to easily set up automated and consistent tests per pull request, including how to use kind with Actions to automatically run end-to-end tests across a common Kubernetes environment.

New – Redis 6 Compatibility for Amazon ElastiCache

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-redis-6-compatibility-for-amazon-elasticache/

Since the launch of Redis 5.0 compatibility for Amazon ElastiCache, there have been lots of improvements to Amazon ElastiCache for Redis, including support for upstream releases such as 5.0.6.

Earlier this year, we announced Global Datastore for Redis that lets you replicate a cluster in one region to clusters in up to two other regions. Recently we improved your ability to monitor your Redis fleet by enabling 18 additional engine and node-level CloudWatch metrics. Also, we added support for resource-level permission policies, allowing you to assign AWS Identity and Access Management (IAM) principal permissions to specific ElastiCache resources.

Today, I am happy to announce Redis 6 compatibility for Amazon ElastiCache for Redis. This release brings several new and important features to Amazon ElastiCache for Redis:

  • Managed Role-Based Access Control – Amazon ElastiCache for Redis 6 now provides you with the ability to create and manage users and user groups that can be used to set up Role-Based Access Control (RBAC) for Redis commands. You can now simplify your architecture while maintaining security boundaries by having several applications use the same Redis cluster without being able to access each other’s data. You can also take advantage of granular access control and authorization to create administration and read-only user groups. Amazon ElastiCache enhances the new Access Control Lists (ACL) introduced in open source Redis 6 to provide a managed RBAC experience, making it easy to set up access control across several Amazon ElastiCache for Redis clusters.
  • Client-Side Caching – Amazon ElastiCache for Redis 6 comes with server-side enhancements to deliver efficient client-side caching to further improve your application performance. Redis clusters now support client-side caching by tracking client requests and sending invalidation messages for data stored on the client. In addition, you can also take advantage of a broadcast mode that allows clients to subscribe to a set of notifications from Redis clusters (see the redis-cli sketch after this list).
  • Significant Operational Improvements – This release also includes several enhancements that improve application availability and reliability. Specifically, Amazon ElastiCache has improved replication under low memory conditions, especially for workloads with medium/large sized keys, by reducing latency and the time it takes to perform snapshots. Open source Redis enhancements include improvements to the expiry algorithm for faster eviction of expired keys and various bug fixes.
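
As a hedged sketch of the underlying open source Redis 6 commands (the cluster endpoint is a placeholder, and CLIENT TRACKING requires the RESP3 protocol, hence the -3 flag):

$ redis-cli -3 --tls -h <your-cluster-endpoint> -p 6379
> CLIENT TRACKING ON
OK
> GET user:42
(nil)
> CLIENT TRACKING ON BCAST PREFIX user:
OK

With tracking on, the server remembers which keys this connection has read and sends invalidation messages when they change; the BCAST form instead broadcasts invalidations for every key matching the given prefix.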

Note that open source Redis 6 also announced support for encryption-in-transit, a capability that is already available in Amazon ElastiCache for Redis 4.0.10 onwards. This release of Amazon ElastiCache for Redis 6 does not impact Amazon ElastiCache for Redis’ existing support for encryption-in-transit.

In order to apply RBAC to a new or existing Redis 6 cluster, we first need to ensure you have a user and user group created. We’ll review the process to do this below.

Using Role-Based Access Control – How it works
As an alternative to authenticating users with the Redis AUTH command, Amazon ElastiCache for Redis 6 offers Role-Based Access Control (RBAC). With RBAC, you create users and assign them specific permissions via an Access String.

To create, modify, and delete users and user groups, navigate to the User Management and User Group Management sections in the ElastiCache console.

ElastiCache automatically configures a default user with the user ID and user name “default”; you can then add this user, or newly created users, to new user groups in User Group Management.

If you want to replace the default user with one that has your own password and access settings, create a new user with the user name set to “default” and then swap it with the original default user. We recommend using your own strong password for the default user.

The following example shows how to swap the original default user with another default user that has a modified access string, using the AWS CLI.

$ aws elasticache create-user \
 --user-id "new-default-user" \
 --user-name "default" \
 --engine "REDIS" \
 --passwords "a-str0ng-pa))word" \
 --access-string "off +get ~keys*"

Next, create a user group and add the original default user to it.

$ aws elasticache create-user-group \
  --user-group-id "new-default-group" \
  --engine "REDIS" \
  --user-ids "default"

Swap the new default user with the original default user.

$ aws elasticache modify-user-group \
    --user-group-id "new-default-group" \
    --user-ids-to-add "new-default-user" \
    --user-ids-to-remove "default"

You can also modify a user’s password or change its access permissions using the modify-user command, or remove a specific user using the delete-user command. The user will be removed from any user groups to which it belongs.

Similarly, you can modify a user group by adding new users and/or removing current users using the modify-user-group command, or delete a user group using the delete-user-group command. Note that only the user group itself is deleted; the users belonging to the group are not.
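
For example (the user and group IDs reuse the names from the snippets above, and the access string is illustrative):

$ aws elasticache modify-user \
    --user-id "new-default-user" \
    --access-string "on ~app:* +@read"

$ aws elasticache delete-user --user-id "new-default-user"

$ aws elasticache delete-user-group --user-group-id "new-default-group"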

Once you have created a user group and added users, you can assign the user group to a replication group, or migrate between Redis AUTH and RBAC. For more information, see the documentation.
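
Here is a hedged sketch of attaching a user group to an existing replication group with the CLI (the replication group ID is a placeholder, and --user-group-ids-to-add is my assumption of the relevant option):

$ aws elasticache modify-replication-group \
    --replication-group-id "my-redis6-cluster" \
    --user-group-ids-to-add "new-default-group"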

Redis 6 cluster for ElastiCache – Getting Started
As usual, you can use the ElastiCache Console, CLI, APIs, or a CloudFormation template to create a new Redis 6 cluster. I’ll use the console: choose Redis from the navigation pane and click Create with the following settings:

Select the “Encryption in-transit” checkbox so that the “Access Control” options become visible. For Access Control, you can select either a User Group Access Control List (the RBAC feature) or the Redis AUTH default user. If you select RBAC, you can choose one of the available user groups.

My cluster is up and running within minutes. You can also use the in-place upgrade feature on an existing cluster: select the cluster, click Actions and then Modify, and change the Engine Version from the 5.0.6-compatible engine to 6.x.

Now Available
Amazon ElastiCache for Redis 6 is now available in all AWS regions. For a list of ElastiCache for Redis supported versions, refer to the documentation. Please send us feedback in the AWS forum for Amazon ElastiCache, through AWS support, or via your account team.

Channy;

Amazon SageMaker Continues to Lead the Way in Machine Learning and Announces up to 18% Lower Prices on GPU Instances

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-leads-way-in-machine-learning/

Since 2006, Amazon Web Services (AWS) has been helping millions of customers build and manage their IT workloads. From startups to large enterprises to public sector, organizations of all sizes use our cloud computing services to reach unprecedented levels of security, resiliency, and scalability. Every day, they’re able to experiment, innovate, and deploy to production in less time and at lower cost than ever before. Thus, business opportunities can be explored, seized, and turned into industrial-grade products and services.

As Machine Learning (ML) became a growing priority for our customers, they asked us to build an ML service infused with the same agility and robustness. The result was Amazon SageMaker, a fully managed service launched at AWS re:Invent 2017 that provides every developer and data scientist with the ability to build, train, and deploy ML models quickly.

Today, Amazon SageMaker is helping tens of thousands of customers in all industry segments build, train, and deploy high-quality models in production: financial services (Euler Hermes, Intuit, Slice Labs, Nerdwallet, Root Insurance, Coinbase, NuData Security, Siemens Financial Services), healthcare (GE Healthcare, Cerner, Roche, Celgene, Zocdoc), news and media (Dow Jones, Thomson Reuters, ProQuest, SmartNews, Frame.io, Sportograf), sports (Formula 1, Bundesliga, Olympique de Marseille, NFL, Guinness Six Nations Rugby), retail (Zalando, Zappos, Fabulyst), automotive (Atlas Van Lines, Edmunds, Regit), dating (Tinder), hospitality (Hotels.com, iFood), industry and manufacturing (Veolia, Formosa Plastics), gaming (Voodoo), customer relationship management (Zendesk, Freshworks), energy (Kinect Energy Group, Advanced Microgrid Systems), real estate (Realtor.com), satellite imagery (Digital Globe), human resources (ADP), and many more.

When we asked our customers why they decided to standardize their ML workloads on Amazon SageMaker, the most common answer was: “SageMaker removes the undifferentiated heavy lifting from each step of the ML process.” Zooming in, we identified five areas where SageMaker helps them most.

#1 – Build Secure and Reliable ML Models, Faster
As many ML models are used to serve real-time predictions to business applications and end users, making sure that they stay available and fast is of paramount importance. This is why Amazon SageMaker endpoints have built-in support for load balancing across multiple AWS Availability Zones, as well as built-in Auto Scaling to dynamically adjust the number of provisioned instances according to incoming traffic.

For even more robustness and scalability, Amazon SageMaker relies on production-grade open source model servers such as TensorFlow Serving, the Multi-Model Server, and TorchServe. A collaboration between AWS and Facebook, TorchServe is available as part of the PyTorch project, and makes it easy to deploy trained models at scale without having to write custom code.

In addition to resilient infrastructure and scalable model serving, you can also rely on Amazon SageMaker Model Monitor to catch prediction quality issues that could happen on your endpoints. By saving incoming requests as well as outgoing predictions, and by comparing them to a baseline built from a training set, you can quickly identify and fix problems like missing features or data drift.

Says Aude Giard, Chief Digital Officer at Veolia Water Technologies: “In 8 short weeks, we worked with AWS to develop a prototype that anticipates when to clean or change water filtering membranes in our desalination plants. Using Amazon SageMaker, we built a ML model that learns from previous patterns and predicts the future evolution of fouling indicators. By standardizing our ML workloads on AWS, we were able to reduce costs and prevent downtime while improving the quality of the water produced. These results couldn’t have been realized without the technical experience, trust, and dedication of both teams to achieve one goal: an uninterrupted clean and safe water supply.” You can learn more in this video.

#2 – Build ML Models Your Way
When it comes to building models, Amazon SageMaker gives you plenty of options. You can visit AWS Marketplace, pick an algorithm or a model shared by one of our partners, and deploy it on SageMaker in just a few clicks. Alternatively, you can train a model using one of the built-in algorithms, or your own code written for a popular open source ML framework (TensorFlow, PyTorch, and Apache MXNet), or your own custom code packaged in a Docker container.

You could also rely on Amazon SageMaker AutoPilot, a game-changing AutoML capability. Whether you have little or no ML experience, or you’re a seasoned practitioner who needs to explore hundreds of datasets, SageMaker AutoPilot takes care of everything for you with a single API call. It automatically analyzes your dataset, figures out the type of problem you’re trying to solve, builds several data processing and training pipelines, trains them, and optimizes them for maximum accuracy. In addition, the data processing and training source code is available in auto-generated notebooks that you can review, and run yourself for further experimentation. SageMaker Autopilot also now creates machine learning models up to 40% faster with up to 200% higher accuracy, even with small and imbalanced datasets.
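
As a rough sketch of that single API call using the AWS CLI (the job name, JSON config files, S3 paths, and IAM role are placeholders):

$ aws sagemaker create-auto-ml-job \
    --auto-ml-job-name my-autopilot-job \
    --input-data-config file://automl-input.json \
    --output-data-config S3OutputPath=s3://my-bucket/autopilot-output/ \
    --role-arn arn:aws:iam::123456789012:role/MySageMakerRole

$ aws sagemaker describe-auto-ml-job --auto-ml-job-name my-autopilot-job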

Another popular feature is Automatic Model Tuning. No more manual exploration, no more costly grid search jobs that run for days: using ML optimization, SageMaker quickly converges to high-performance models, saving you time and money, and letting you deploy the best model to production quicker.

“NerdWallet relies on data science and ML to connect customers with personalized financial products,” says Ryan Kirkman, Senior Engineering Manager. “We chose to standardize our ML workloads on AWS because it allowed us to quickly modernize our data science engineering practices, removing roadblocks and speeding time-to-delivery. With Amazon SageMaker, our data scientists can spend more time on strategic pursuits and focus more energy where our competitive advantage is—our insights into the problems we’re solving for our users.” You can learn more in this case study.

Says Tejas Bhandarkar, Senior Director of Product, Freshworks Platform: “We chose to standardize our ML workloads on AWS because we could easily build, train, and deploy machine learning models optimized for our customers’ use cases. Thanks to Amazon SageMaker, we have built more than 30,000 models for 11,000 customers while reducing training time for these models from 24 hours to under 33 minutes. With SageMaker Model Monitor, we can keep track of data drifts and retrain models to ensure accuracy. Powered by Amazon SageMaker, Freddy AI Skills is constantly evolving with smart actions, deep-data insights, and intent-driven conversations.”

#3 – Reduce Costs
Building and managing your own ML infrastructure can be costly, and Amazon SageMaker is a great alternative. In fact, we found out that the total cost of ownership (TCO) of Amazon SageMaker over a 3-year horizon is over 54% lower compared to other options, and developers can be up to 10 times more productive. This comes from the fact that Amazon SageMaker manages all the training and prediction infrastructure that ML typically requires, allowing teams to focus exclusively on studying and solving the ML problem at hand.

Furthermore, Amazon SageMaker includes many features that help training jobs run as fast and as cost-effectively as possible: optimized versions of the most popular machine learning libraries, a wide range of CPU and GPU instances with up to 100GB networking, and of course Managed Spot Training which lets you save up to 90% on your training jobs. Last but not least, Amazon SageMaker Debugger automatically identifies complex issues developing in ML training jobs. Unproductive jobs are terminated early, and you can use model information captured during training to pinpoint the root cause.
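
For illustration, here is roughly what enabling Managed Spot Training looks like when submitting a training job with the AWS CLI (the job name, config files, S3 paths, and IAM role are placeholders; MaxWaitTimeInSeconds bounds how long SageMaker waits for Spot capacity):

$ aws sagemaker create-training-job \
    --training-job-name my-spot-training-job \
    --algorithm-specification file://algorithm.json \
    --role-arn arn:aws:iam::123456789012:role/MySageMakerRole \
    --input-data-config file://inputs.json \
    --output-data-config S3OutputPath=s3://my-bucket/output/ \
    --resource-config InstanceType=ml.p3.2xlarge,InstanceCount=1,VolumeSizeInGB=50 \
    --stopping-condition MaxRuntimeInSeconds=3600,MaxWaitTimeInSeconds=7200 \
    --enable-managed-spot-training \
    --checkpoint-config S3Uri=s3://my-bucket/checkpoints/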

Amazon SageMaker also helps you slash your prediction costs. Thanks to Multi-Model Endpoints, you can deploy several models on a single prediction endpoint, avoiding the extra work and cost associated with running many low-traffic endpoints. For models that require some hardware acceleration without the need for a full-fledged GPU, Amazon Elastic Inference lets you save up to 90% on your prediction costs. At the other end of the spectrum, large-scale prediction workloads can rely on AWS Inferentia, a custom chip designed by AWS, for up to 30% higher throughput and up to 45% lower cost per inference compared to GPU instances.

Lyft, one of the largest transportation networks in the United States and Canada, launched its Level 5 autonomous vehicle division in 2017 to develop a self-driving system to help millions of riders. Lyft Level 5 aggregates over 10 terabytes of data each day to train ML models for their fleet of autonomous vehicles. Managing ML workloads on their own was becoming time-consuming and expensive. Says Alex Bain, Lead for ML Systems at Lyft Level 5: “Using Amazon SageMaker distributed training, we reduced our model training time from days to a couple of hours. By running our ML workloads on AWS, we streamlined our development cycles and reduced costs, ultimately accelerating our mission to deliver self-driving capabilities to our customers.”

#4 – Build Secure and Compliant ML Systems
Security is always priority #1 at AWS. It’s particularly important to customers operating in regulated industries such as financial services or healthcare, as they must implement their solutions with the highest level of security and compliance. For this purpose, Amazon SageMaker implements many security features, making it compliant with the following global standards: SOC 1/2/3, PCI, ISO, FedRAMP, DoD CC SRG, IRAP, MTCS, C5, K-ISMS, ENS High, OSPAR, and HITRUST CSF. It’s also HIPAA BAA eligible.

Says Ashok Srivastava, Chief Data Officer, Intuit: “With Amazon SageMaker, we can accelerate our Artificial Intelligence initiatives at scale by building and deploying our algorithms on the platform. We will create novel large-scale machine learning and AI algorithms and deploy them on this platform to solve complex problems that can power prosperity for our customers.”

#5 – Annotate Data and Keep Humans in the Loop
As ML practitioners know, turning data into a dataset requires a lot of time and effort. To help you reduce both, Amazon SageMaker Ground Truth is a fully managed data labeling service that makes it easy to annotate and build highly accurate training datasets at any scale (text, image, video, and 3D point cloud datasets).

Says Magnus Soderberg, Director, Pathology Research, AstraZeneca: “AstraZeneca has been experimenting with machine learning across all stages of research and development, and most recently in pathology to speed up the review of tissue samples. The machine learning models first learn from a large, representative data set. Labeling the data is another time-consuming step, especially in this case, where it can take many thousands of tissue sample images to train an accurate model. AstraZeneca uses Amazon SageMaker Ground Truth, a machine learning-powered, human-in-the-loop data labeling and annotation service to automate some of the most tedious portions of this work, resulting in reduction of time spent cataloging samples by at least 50%.”

Amazon SageMaker is Evaluated
The hundreds of new features added to Amazon SageMaker since launch are testimony to our relentless innovation on behalf of customers. In fact, the service was highlighted in February 2020 as the overall leader in Gartner’s Cloud AI Developer Services Magic Quadrant. Gartner subscribers can click here to learn more about why we have an overall score of 84/100 in their “Solution Scorecard for Amazon SageMaker, July 2020”, the highest rating among our peer group. According to Gartner, we met 87% of required criteria, 73% of preferred, and 85% of optional.

Announcing a Price Reduction on GPU Instances

To thank our customers for their trust and to show our continued commitment to make Amazon SageMaker the best and most cost-effective ML service, I’m extremely happy to announce a significant price reduction on all ml.p2 and ml.p3 GPU instances. It will apply starting October 1st for all SageMaker components and across the following regions: US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Ireland), EU (Frankfurt), EU (London), Canada (Central), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Seoul), Asia Pacific (Tokyo), Asia Pacific (Mumbai), and AWS GovCloud (US-Gov-West).

Instance Name        Price Reduction
ml.p2.xlarge         -11%
ml.p2.8xlarge        -14%
ml.p2.16xlarge       -18%
ml.p3.2xlarge        -11%
ml.p3.8xlarge        -14%
ml.p3.16xlarge       -18%
ml.p3dn.24xlarge     -18%

Getting Started with Amazon SageMaker
As you can see, there are a lot of exciting features in Amazon SageMaker, and I encourage you to try them out! Amazon SageMaker is available worldwide, so chances are you can easily get to work on your own datasets. The service is part of the AWS Free Tier, letting new users work with it for free for hundreds of hours during the first two months.

If you’d like to kick the tires, this tutorial will get you started in minutes. You’ll learn how to use SageMaker Studio to build, train, and deploy a classification model based on the XGBoost algorithm.

Last but not least, I just published a book named “Learn Amazon SageMaker”, a 500-page detailed tour of all SageMaker features, illustrated by more than 60 original Jupyter notebooks. It should help you get up to speed in no time.

As always, we’re looking forward to your feedback. Please share it with your usual AWS support contacts, or on the AWS Forum for SageMaker.

– Julien

Migrating cdnjs to serverless with Workers KV

Post Syndicated from Tyler Caslin original https://blog.cloudflare.com/migrating-cdnjs-to-serverless-with-workers-kv/

Cloudflare powers cdnjs, an open-source project that accelerates websites by delivering popular JavaScript libraries and resources via Cloudflare’s network. Since our major update in December, we have focused on remodelling cdnjs for scalability and resilience. Today, we are excited to announce how Cloudflare delivers cdnjs—a migration to a serverless infrastructure using Cloudflare Workers and its distributed key-value store Workers KV!

What is cdnjs and why do I care?

For those unfamiliar, cdnjs is an acronym describing a Content Delivery Network (CDN) for JavaScript (JS). A CDN simply refers to a geographically distributed network of servers that provide Internet content, whether it is memes, cat videos, or HTML pages. In our case, the CDN refers to Cloudflare’s ever expanding network of over 200 globally distributed data centers.

And here’s why this is relevant to you: it makes page load times lightning-fast. Virtually every website you visit needs to fetch JS libraries in order to load, including this one. Let’s say you visit a Sydney-based website that contains a local file from jQuery, a popular library found in 76.2% of websites. If you are located in New York, you may notice a delay, as it can easily exceed 300ms to fetch the file—not to mention the time it takes for the round trips involved with the TLS handshake. However, if the website references jQuery using cdnjs.cloudflare.com, you can retrieve the file from the closest Cloudflare data center in Buffalo, reducing the latency to a blazing 20ms.

While cdnjs operates behind the scenes, it is used by over 11% of websites, making the Internet a much faster and more reliable place. In July, cdnjs served almost 190 billion requests—an enormous 3.46PB of data.

Where are the files stored?

While cdnjs speeds up the Internet, it certainly isn’t magic!

Historically, a number of load-balanced machines at one of Cloudflare’s core data centers would periodically pull cdnjs files from a backing store, acting as the origin for cdnjs.cloudflare.com. When a new file is requested, it is cached by Cloudflare, allowing it to be fetched quickly from any of our data centers.

The backing store is a catalogue of JS, CSS, and other web libraries in the form of an open-source GitHub repository. What this means is that anyone—including you—can contribute to it, subject to review and other processes.

However, until recently, these existing operations were very labor intensive and fragile.

This blog post will explain why we changed the infrastructure behind cdnjs to make it faster, more reliable, and easier to maintain. First, we will discuss how the community used to contribute to cdnjs, outlining the pains and concerns of the old system. Then, we will explore the benefits of migrating to Workers KV. After, we will dive into the new architecture, as well as upgrades to the website and cdnjs API. Finally, we will review the history of cdnjs, and where it is headed in the future.

If you think you know how to make a PR, think again

For the non-technical reader, a pull request (PR) is a request to merge changes you’ve made to a repository. Traditionally, if you wanted to include your JavaScript library in cdnjs, you would first create a PR on GitHub to cdnjs/cdnjs with a JSON file describing your package and additional files for any version you wished to include. Once your PR was approved by our old bot, manually reviewed, and then merged by a maintainer, your package would be integrated with cdnjs.

Sounds easy, right? You can just fork the repo, clone it, and copy paste a few files, no?

Exactly. Contributing was easy if you had several hours to burn, a case-sensitive file system, and a couple hundred gigabytes of free disk space to git clone the 300GB repo. If you were short on time—no problem, you could always use your advanced knowledge of git sparse-checkout to get the job done. Don’t know git? Just add one file at a time manually through GitHub’s UI.

I think you get the point. I know I certainly did when I naively spent 10 hours cloning the repo, only to discover that macOS is case-insensitive by default.

However, updating cdnjs was not only difficult for the contributors, but also the maintainers. Historically, the community was able to contribute version files directly, which could potentially be malicious. This created lots of work for maintainers, requiring them to inspect each file manually, diffing files against the official library source and running malware checks.

So how did packages update once they were in cdnjs? In the JSON file describing each package, there was an optional auto-update definition telling the bot where to look for new versions of the library. If present, when your package released a new version from npm or GitHub, the bot would download it, pushing the files to cdnjs/cdnjs and computed Subresource Integrity (SRI) hashes to cdnjs/SRIs. If the auto-update property was missing, it would be your responsibility to make manual PRs to update cdnjs with any future versions.

A wake-up call for cdnjs

In April, during maintenance at one of our core data centers, a technician accidentally disconnected the cables supplying all external connections to our other data centers, causing the data center to go offline for approximately four hours. This incident served as the first wake-up call for cdnjs, especially since the affected data center housed the primary cdnjs origin web servers. In this case, we did have a backup running on an external provider, but what really saved us was Cloudflare’s global cache, which minimized the impact of the outage as only uncached assets failed to load.

We started to think about how we can improve both the reliability and performance of how we serve cdnjs. We went straight to Cloudflare Workers, our own platform for developing on the edge. One powerful tool built into Workers is Workers KV—a low-latency, globally distributed key-value store optimized for high-read applications.

We put two and two together, realizing that instead of pulling the cdnjs/cdnjs repository and serving files from disk, we could cut the physical machines out entirely, distributing the data around the world and serving files straight from the edge. That way, cdnjs would be able to recover from any origin data center failure, while also increasing its scalability.

Workers KV to the rescue

At first glance, the decision to use Workers KV was a no-brainer. Since files in cdnjs never change but require frequent reads, Workers KV was a perfect fit.

However, as we planned our migration, we became concerned that with over 7 million assets in cdnjs, there would undoubtedly exist files that exceed Workers KV’s 10MiB value limit. After investigating, we discovered that several hundred cdnjs files were oversized, the majority being JavaScript Source Maps.

Then the idea hit us. We could store compressed versions of cdnjs files in Workers KV, not only solving our oversized file issue, but also optimizing how we serve files.

If you pay the Internet bill, you’ll know that bandwidth is expensive! For this reason, all modern browsers will try to fetch compressed web content whenever it is available. Similarly, within Cloudflare we often experiment with on-the-fly compression to reduce our bandwidth, always serving compressed content to the eyeball when it is accepted. As a result, we decided to compress all cdnjs files ahead of time, writing them to Workers KV with both optimal Brotli and gzip forms. That way, we could increase the compression level compared to on-the-fly compression as we no longer have the latency requirements.
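
As a simple sketch of the kind of ahead-of-time compression involved (the file names are illustrative, and the bot’s actual tooling is not shown in this post):

$ brotli --quality=11 --output=jquery.min.js.br jquery.min.js
$ gzip -9 --stdout jquery.min.js > jquery.min.js.gz

Maximum Brotli quality (11) is generally too slow for on-the-fly compression, but it is a fine trade-off when files are compressed once ahead of time and served many times.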

This means we now serve cdnjs files faster and smaller!

A complete makeover for cdnjs

Today, if you want to include your JavaScript library in cdnjs, you first create a PR on GitHub to our new repository cdnjs/packages. The repo is easily cloneable at 50MB and consists of thousands of JSON files, each describing a cdnjs package and how it is auto-updated from npm or git. Once your file is validated by our automated CI—powered by a new bot—and merged by a maintainer, your package will be automatically enrolled in our auto-update service.

In the new system, security and maintainability are prioritized. For starters, cdnjs version files are created by our bot, minimizing the possibility of human error when merging a new version. While the JSON files in cdnjs/packages are added by error-prone humans, they are inspected by our bot before being approved by a maintainer. Each file is automatically validated against a JSON schema, as well as checked for popularity on npm or GitHub.

When the bot discovers a new release, it pushes Brotli and gzip-compressed versions of the files to a files namespace in Workers KV. With each entry, the bot writes some metadata in Workers KV for the ETag and Last-Modified HTTP headers. Similar to before, the bot also computes Subresource Integrity (SRI) hashes of the uncompressed files, but now pushes them instead to a SRIs namespace in Workers KV.
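
To make the idea concrete, here is one way such an entry could be written using the Workers KV REST API (the account ID, namespace ID, API token, and key name are all placeholders; the bot’s actual tooling is not shown in this post):

$ curl -X PUT \
    "https://api.cloudflare.com/client/v4/accounts/<account-id>/storage/kv/namespaces/<files-namespace-id>/values/jquery%2F3.5.1%2Fjquery.min.js.br" \
    -H "Authorization: Bearer <api-token>" \
    --data-binary @jquery.min.js.br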

Then, when a new file is requested from cdnjs.cloudflare.com, a Cloudflare Worker will inspect the client’s Accept-Encoding header, fetching either the Brotli or gzip-compressed version with its ETag and Last-Modified metadata from Workers KV. As the compressed file travels back through Cloudflare, it is cached for future requests and uncompressed on-the-fly if needed.

At the moment, there are still a handful of files exceeding Workers KV’s size limit. Consequently, if the Cloudflare Worker fails to retrieve a file from Workers KV, it is fetched from the origin backed by the original git repo. In the coming months, we plan on gradually removing this infrastructure.

Scaling the website and API

Besides the core cdnjs infrastructure, many of its other components received upgrades as well!

On the cdnjs project’s homepage, you will be greeted by a slick new beta website built by Matt. Constructed with Vue and Nuxt, the beta website is powered entirely by the cdnjs API. As a result, it is always up-to-date with the latest package information and requires low resource usage to serve the site—which runs completely on the client-side after the first page load—helping us scale with cdnjs’s never-ending growth.

In fact, the cdnjs API also strengthened its scalability, benefitting from a serverless architecture close to the one we have seen with cdnjs and Workers KV.

Before migrating to Workers KV, the cdnjs API relied on a regularly scheduled process that involved generating about 300MB of metadata. The cdnjs API’s backend would then fetch this enormous “package.min.js” file into memory and use it to operate the API. If you are curious, the file is still being hosted here, but be warned—it may lag your browser! Similarly, file SRIs were pushed to cdnjs/SRIs, which was cloned by the API locally to serve SRI responses.

After all cdnjs files (within the permitted size limit) were moved to Workers KV, these legacy processes became unsustainable, requiring millions of reads and an unreasonable amount of time. Therefore, we decided to upload all of the metadata into Workers KV. We split the metadata into four namespaces—one for package-level metadata, one for version-specific metadata, one containing aggregated metadata, and one for file SRIs.

Similar to cdnjs’s serverless design, a Cloudflare Worker sits on top of metadata.speedcdnjs.com, serving data from Workers KV using several public endpoints. Currently, the cdnjs API is fully integrated with these endpoints, which provide an elegant solution as cdnjs continues to scale.

Transparency and the future of cdnjs

Since its birth in January 2011, cdnjs has always been deeply rooted in transparency, deriving its strength from the community. Even when cdnjs exploded in size and its founders Ryan Kirkman and Thomas Davis teamed up with us in June 2011, the project remained entirely open-source on GitHub.

As the years passed, it became harder for the founders to stay active, heavily depending on the community for support. With a nearly nonexistent budget and little access to the repository, core cdnjs maintainers were challenged every day to keep the project alive.

Last year, this led us to contact the founders, who were happy to have our assistance with the project. With Cloudflare’s increased role, cdnjs is as stable as ever, with active members from both Cloudflare and the community.

However, as we remove our reliance on the legacy system and store files in Workers KV, there are concerns that cdnjs will become proprietary. Don’t worry, we are working hard to ensure that cdnjs remains as transparent and open-source as possible. To help the community audit updates to Workers KV, there is a new repository, cdnjs/logs, which is used by the bot to log all Workers KV-related events. Furthermore, anyone can validate the integrity of cdnjs files by fetching SRIs from the cdnjs API.

Conclusion

Overall, this past year has been a turbulent time for cdnjs, but all of its shortcomings have acted as red flags to help us build a better system. Most recently, we have mitigated the risks of depending on physical machines at a single location, migrating cdnjs to a serverless infrastructure where its files are stored in Workers KV.

Today, cdnjs is in good hands, and is not going away anytime soon. Shout out especially to the maintainers Sven and Matt for creating tons of momentum with the project, working on everything from scaling cdnjs to editing this post.

Moving forward, we are committed to making cdnjs as transparent as possible. As we continue to improve cdnjs, we will release more blog posts to keep the community up to date. If you are interested, please subscribe to our blog. After all, it is the community that makes cdnjs possible! A special thanks to our active GitHub contributors and members of the cdnjs Community Forum for sticking with us!

Introducing the Rally + GitHub integration

Post Syndicated from Jared Murrell original https://github.blog/2020-08-18-introducing-the-rally-github-integration/

GitHub’s Professional Services Engineering team has decided to open source another project: Rally + GitHub. You may have seen our most recent open source project, Super Linter. Well, the team has done it again, this time to help users ensure that Rally stays up to date with the latest development in GitHub! 🎉

Rally + GitHub

This project integrates GitHub Enterprise Server (and cloud, if you host it yourself) with Broadcom’s Rally project management.

Every time a pull request is created or updated, Rally + GitHub will check for the existence of a Rally User Story or Defect in the title, body, or commit messages, and then validate that it exists and is in the correct state within Rally.

Animation showing a pull request being created

Why was it created?

GitHub Enterprise Server had a legacy Services integration with Rally. The deprecation of legacy Services for GitHub was announced in 2018, and the release of GitHub Enterprise Server 2.20 officially removed this functionality. As a result, many GitHub Enterprise users will be left without the ability to integrate the two platforms when upgrading to recent releases of GitHub Enterprise Server.

While Broadcom created a new integration for github.com, this functionality does not extend to GitHub Enterprise Server environments.

Get Started

We encourage you to check out this project and set it up with your existing Rally instance. A good place to start getting set up is the Get Started guide in the project’s README.md.

We invite you to join us in developing this project! Come engage with us by opening up an issue, even if it’s just to share your experience with the project.

Animation showing Rally and GitHub integration

CodeGen: Semantic’s improved language support system

Post Syndicated from Ayman Nadeem original https://github.blog/2020-08-04-codegen-semantics-improved-language-support-system/

The Semantic Code team shipped a massive improvement to the language support system that powers code navigation. Code navigation features only scratch the surface of possibilities that start to open up when we combine Semantic‘s program analysis potential with GitHub’s scale.

GitHub is home to over 50 million developers worldwide. Our team’s mission is to analyze code on our platform and surface insights that empower users to feel more productive. Ideally, this analysis should work for all users, regardless of which programming language they use. Until now, however, we’ve only been able to support a handful of languages due to the high cost of adding and maintaining them. Our new language support system, CodeGen, cuts that cost dramatically by automating a significant portion of the pipeline, in addition to making it more resilient. The result is that it is now easier than ever to add new programming languages that get parsed by our library.

Language Support is mission-critical

Before Semantic can compute diffs or perform abstract interpretation, we need to transform code into a representation that is amenable to our analysis goals. For this reason, a significant portion of our pipeline deals with processing source code from files on GitHub.com into an appropriate representation—we call this our “language support system”.

Diagram showing semantic architecture

Adding languages has been difficult

Zooming into the part of Semantic that achieves language support, we see that it involved several development phases, including two parsing steps that required writing and maintaining two separate grammars per language.

Diagram showing language support pipeline

Reading the diagram from left to right, our historic language support pipeline:

  1. Parsed source code into ASTs. A grammar is hand-written for a given language using tree-sitter, an incremental GLR parsing library for programming tools.
  2. Read tree-sitter ASTs into Semantic. Connecting Semantic to tree-sitter‘s C library requires providing an interface to the C source. We achieve this through our haskell-tree-sitter library, which has Haskell bindings to tree-sitter.
  3. Parsed ASTs into a generalized representation of syntax. For these ASTs to be consumable by our Haskell project, we had to translate the tree-sitter parse trees into an appropriate representation. This required:
    • À la carte syntax types: generalization across programming languages. Many constructs, such as If statements, occur in several languages. Instead of having different representations of If statements for each language, could we reduce duplication by creating a generalized representation of syntax that could be shared across languages, such as a datatype modeling the semantics of conditional logic? This was the reasoning behind creating our hand-written, generalized à la carte syntaxes based on Wouter Swierstra’s Data types à la carte approach, allowing us to represent those shared semantics across languages. For example, this file captures a subset of à la carte datatypes that model expression syntaxes across languages.
    • Assignment: turning tree-sitter’s representation into a Haskell datatype representation. We had to translate tree-sitter AST nodes to be represented by the new generalized à la carte syntax. To do this, a second grammar was written in Haskell to assign the nodes of the tree-sitter ASTs onto a generalized representation of syntax modeled by the à la carte datatypes. As an example, here is the Assignment grammar written for Ruby.
  4. Performed Evaluation. Next, we captured what it meant to interpret the syntax datatypes. To do so, we wrote a polymorphic type class called Evaluatable, which defined the necessary interface for a term to be evaluated. Evaluatable instances were added for each of the à la carte syntaxes.
  5. Handled Effects. In addition to evaluation semantics, describing the control flow of a given program also necessitates modeling effects. This helps ensure we can represent things like the file system, state, non-determinism, and other effectful computations.
  6. Validated via tests. Tests for diffing, tagging, graphing, and evaluating source code written in that language were added along the process.

Challenges posed by the system

The process described had several obstacles. Not only was it very technically involved, but it had additional limitations.

  1. The system was brittle. Each language’s Assignment code was tightly coupled to the language’s tree-sitter grammar. This meant it could break at runtime if we changed the structure of the grammar, without any compile-time error. To prevent such errors required tracking ongoing changes in tree-sitter, which was also tedious, manual, and error-prone. Each time a grammar changed, assignment changes had to be made to accommodate new tree-structures, such as nodes that changed names or shifted positions. Because improvements to the underlying grammars required changes to Assignment—which were costly in terms of time and risky in terms of the possibility of introducing bugs—, our system had inadvertently become incentivized against iterative improvement.
  2. There were no named child nodes. tree-sitter‘s syntax nodes didn’t provide us with named child nodes. Instead, child nodes were structured as ordered-lists, without any name indicating the role of each child. This didn’t match Semantic’s internal representation of syntax nodes, where each type of node has a specific set of named children. This meant more Assignment work was necessary to compensate for the discrepancy. One concern, for example, was about how we represented comments, which could be any arbitrary node attached to any part of the AST. But if we had named child nodes, this would allow us to associate comments relative to their parent nodes (like if a comment appeared in an if statement, it could be the first child for that if statement node). This would also apply to any other nodes that could appear anywhere within the tree, such as Ruby heredocs.
  3. Evaluation and à la carte sum types were sub-optimal. Taking a step back to examine language support also gave us an opportunity to rethink our à la carte datatypes and the evaluation machinery. À la carte syntax types were motivated by a desire to better share tooling in evaluating common fragments of languages. However, the introduction of these syntax types (and the design of the Evaluatable typeclass) did not make our analysis sensitive to minor linguistic differences, or even to relate different pieces of syntax together. We could overcome this by adding language-specific syntax datatypes to be used with Assignment, along with their accompanying Evaluatable instances—but this would defeat the purpose of a generalized representation. This is because à la carte syntax was essentially untyped; it enforced only a minimal structure on the tree. As a result, any given subterm could be any element of the syntax, and not some limited subset. This meant that many Evaluatable instances had to deal with error conditions that in practice can’t occur. To make this idea more concrete, consider examples showcasing a before and after syntax type transformation:
    -- former system: à la carte syntax
    
    data Op a = Op { ann :: a, left :: Expression a, right :: Expression a }
    -- new system: auto-generated, precisely typed syntax
    
    data Op a = Op { ann :: a, left :: Err (Expression a), right :: Err (Expression a) }

    The shape of a syntax type in our à la carte paradigm has polymorphic children, compared with the monomorphic children of our new “precisely-typed” syntax, which offers better guarantees of what we could expect.

  4. Infeasible time and effort was required. A two-step parsing process required writing two separate language-specific grammars by hand. This was time-consuming, engineering-intensive, error-prone, and tedious. The Assignment grammar used parser combinators in Haskell mirroring the tree-sitter grammar specification, which felt like a lot of duplicated work. For a long time, this work’s mechanical nature begged the question of whether we could automate parts of it. While we’ve open-sourced Semantic, leveraging community support for adding languages has been difficult because, until recently, it was backed by such a grueling process.

Designing a new system

To address these challenges, we introduced a few changes:

  1. Add named child nodes. To address the issue of not having named child nodes, we modified the tree-sitter library by adding a new function called field to the grammar API, and then updated every language grammar accordingly. When parsing, you can retrieve a node’s children based on their field name. Here is an example of what a Python if_statement looks like in the old and new tree-sitter grammar APIs:
    Screenshot of a diff highlighting an example of what a Python if_statement looks like in the old and new tree-sitter grammar APIs
  2. Generate a Node Interface File. Once a grammar has this way of associating child references, the parser generation code also produces a node-types.json file that indicates what kinds of children references you can expect for each node type. This JSON file provides static information about nodes’ fields based on the grammar. Using this JSON, applications like ours can use meta-programming to generate specific datatypes for each kind of node. Here is an example of the JSON generated from the grammar definition of an if statement. This file provided a schema for a language’s ASTs and introduced additional improvements, such as the way we specify highlighting.
  3. Auto-generate language-specific syntax datatypes. Using the structure provided by the node-types.json file, we can auto-generate syntax types instead of writing them by hand. First, we deserialize the JSON file to capture the structure we want to represent in the desired shape of datatypes. Specifically, we have four distinct shapes that the nodes in our node-types JSON file take on: sums, products, named leaves, and anonymous leaves. We then use Template Haskell to generate syntax datatypes for each of the language constructs represented by the Node Interface File. This means that our hand-written à la carte syntax types get replaced with auto-generated language-specific types, saving all of the developer time historically spent writing them. Here is an example of an auto-generated datatype representing a Python if statement derived from the JSON snippet provided above, which is structurally a product type.
  4. Build ASTs generically. Once we have an exhaustive set of language-specific datatypes, we need to have a mechanism that can map appropriate auto-generated datatypes onto the ASTs representing the source code being parsed. Historically, this was accomplished by manually writing an Assignment grammar. To obviate the need for a second grammar, we have created an API that uses Haskell’s generic metaprogramming framework to unmarshal tree-sitter’s parse trees automatically. We iterate over tree-sitter‘s parse trees using its tree cursor API and produce Haskell ASTs, where each node is represented by a Template Haskell generated datatype described by the previous step. This allows us to parse a particular set of nodes according to their structure, and return an AST with meta-data (such as range and span information). Here is an example of the AST generated if the Python source code is simply 1
    Screenshot of CodeGen language support pipeline

The final result is a set of language-specific, strongly-typed, TH-generated datatypes represented as the sum of syntax possible at a given point in the grammar. Strongly-typed trees give us the ability to indicate only the subset of the syntax that can occur at a given position. For example, a function’s name would be strongly typed as an identifier; a switch statement would contain case statements; and so on. This provides better guarantees about where syntax can occur, and strong compile-time guarantees about both correctness and completeness.

The new system bypasses a significant part of the engineering effort historically required; it cuts code from our pipeline in addition to addressing the technical limitations described above. The diagram below provides a visual “diff” of the old and new systems.

Diagram showing language support pipeline

A big testament to our approach’s success is that we were able to remove our à la carte syntaxes completely. In addition, we were able to ship two new languages, Java and CodeQL, with precise ASTs generated by the new system.

Contributions welcome!

To learn more about how you can help, check out our documentation here.

Highlights from Git 2.28

Post Syndicated from Taylor Blau original https://github.blog/2020-07-27-highlights-from-git-2-28/

The open source Git project just released Git 2.28 with features and bug fixes from over 58 contributors, 13 of them new. We last caught up with you on the latest in Git back when 2.26 was released. Here’s a look at some of the most interesting features and changes introduced since then.

Introducing init.defaultBranch

When you initialize a new Git repository from scratch with git init, Git has always created an initial first branch with the name master. In Git 2.28, a new configuration option, init.defaultBranch is being introduced to replace the hard-coded term. (For more background on this change, this statement from the Software Freedom Conservancy is an excellent place to look).

Starting in Git 2.28, git init will instead look to the value of init.defaultBranch when creating the first branch in a new repository. If that value is unset, init.defaultBranch defaults to master. Here, it’s important to note that:

  1. This configuration variable can be set by the user, and overriding the default value is as easy as:
    $ git config --global init.defaultBranch main
    
  2. This configuration variable only affects new repositories, and does not cause branches in existing projects to be renamed. git clone will also continue to respect the HEAD of the repository you’re cloning from, so you won’t see a change in branch names until a maintainer initiates one.

This change supports the many communities, both on GitHub and in the wider Git community, who are considering renaming the default branch name of their repository from master.

To learn more about the complementary changes GitHub is making, see github/renaming. GitLab and Bitbucket are also making similar changes.

[source]

Changed-path Bloom filters

In Git 2.27, the commit-graph file format was extended to store changed-path Bloom filters. What does all of that mean? In a sense, this new information helps Git find points in history that touched a given path much more quickly (for example, git log -- <path>, or git blame). Git 2.28 takes advantage of these optimizations to deliver a handful of sizeable performance improvements.

Before we get into all of that, it’s worth a quick refresher on commit graphs, whether you’re new to the concept or already familiar with it. (If you are familiar, and want to take a deeper dive, check out this blog post explaining all of the juicy technical details).
In the very simplest terms, the commit-graph file stores information about commits. In essence, the commit-graph acts like a cache for commonly-accessed information about commits: who their parent(s) are, what their root tree is, and things like that. It also stores computed information, too, like a commit’s generation number, and changed-path Bloom filters (more on that in just a moment).

Why store all of this information? To understand the answer to this, it is helpful to have a cursory understanding of how Git stores objects. Git stores objects in one of two ways: either as a loose object (in which case the object’s contents are stored in a single file unique to that object on disk), or as a packed object (in which case the object is assembled from a compressed format in a *.pack file). No matter which way a commit is stored, we still have to parse and decompress it before its fields like “root tree” and “parents” can be accessed.

With a commit-graph file, all of that information is immediate: for a given commit C, Git knows exactly where to look in a commit-graph file for all of those fields that we store, and can read them off immediately, no decompression or piecing together required. This can shave some time off your usual Git operations by itself, but where the commit-graph really shines is in the computed data it stores.

Generation numbers are a sort of reachability index that can help Git answer questions about things like reachability and topological ordering very quickly. Since generation numbers aren’t new in this release (and trying to explain them quickly would lose a lot of the benefit of a more careful exposition), I’ll refer you instead to this blog post by freshly-minted Hubber Derrick Stolee on the matter.

What’s new in 2.28?

OK, if you’ve made it this far, you’ve got a pretty good handle on what commit graphs are, and what they’re useful for. Now, let’s get to the juicy details. In Git 2.27, the commit-graph file learned how to store changed-path Bloom filters. What are changed-path Bloom filters, you ask? A Bloom filter is a probabilistic set; that is, it’s a set of items, but querying that set for the presence of some item x returns either “x is definitely not in this set” or “x might be in this set”, but never “x is definitely in this set”. The commit-graph stores one of these Bloom filters for commits that reside in the commit-graph, and it populates that Bloom filter with a list of paths changed between that commit and its first parent.

These Bloom filters are a huge boon for performance in lots of Git commands. The general pattern is something like: if you have a Git command that computes diffs (which can sometimes be proportionally expensive), then having Bloom filters allows Git to compute far fewer diffs by skipping the computation for certain commits when their Bloom filters return “definitely not” for paths of interest.

Take git log -- /path/to/file, for example. Prior to Git 2.27, git log would have to compute a diff over every revision in its walk before determining whether or not to show it (i.e., whether or not that diff has any entries for /path/to/file). In Git 2.27 and newer, Git can skip computing many of those diffs altogether by consulting each commit C’s changed-path Bloom filter and querying it for /path/to/file. Again: if querying returns “definitely not”, then Git knows that computing that diff is strictly uninteresting.

Because computing diffs between commits can be expensive (at least, relative to the complexity of the algorithm for which they are being generated), reducing the number of diffs computed overall can greatly improve performance.
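
As a rough illustration of that query pattern (and only that; Git’s real changed-path filters live inside the commit-graph file and use their own hashing scheme and sizing), here is a toy Bloom filter in Python:

import hashlib

class ToyBloomFilter:
    # A tiny illustrative Bloom filter: answers "definitely not present" or "maybe present".
    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # an int used as a bit set

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all(self.bits & (1 << pos) for pos in self._positions(item))

# One filter per commit, populated with the paths changed against its first parent.
changed = ToyBloomFilter()
for path in ["Makefile", "revision.c"]:
    changed.add(path)

print(changed.might_contain("revision.c"))                 # True: "maybe present", so compute the real diff
print(changed.might_contain("Documentation/git-log.txt"))  # almost certainly False: skip the diff

When the answer is False, the walk can skip the diff for that commit entirely; when it is True, Git still computes the diff to rule out a false positive.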

To try this for yourself, you can run the command:

$ git commit-graph write --reachable --changed-paths

This generates a commit-graph file with changed-path Bloom filters enabled.[1] You should be able to see performance improvements in commands like git log -- <path>, git log -L, git blame, and anything else that computes first-parent diffs against a given pathspec.
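
If you want to put a rough number on the difference yourself, one way (sketched here in Python; the repository path and pathspec are placeholders) is to time the same pathspec-limited walk before and after writing the commit-graph:

import subprocess
import time

def time_git_log(path, repo="."):
    # Wall-clock timing of a pathspec-limited history walk; output is discarded.
    start = time.perf_counter()
    subprocess.run(
        ["git", "-C", repo, "log", "--oneline", "--", path],
        stdout=subprocess.DEVNULL,
        check=True,
    )
    return time.perf_counter() - start

# Run once before and once after `git commit-graph write --reachable --changed-paths`;
# the improvement depends heavily on the repository's size and history shape.
print(f"git log -- Makefile took {time_git_log('Makefile'):.3f}s")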

[source, source, source]

Tidbits

Now that we’ve talked about a few of the headlining changes from the past couple of releases, let’s look at a few more new features 🔎

  • Have you ever been looking for the parts of history that changed some path? Maybe you just want to know about the commits that have modified some file, and that can be found easily enough by running git log -- <path>. Sometimes, you might be interested not only in which commits touched <path>, but also in which merge commits brought those commits into the main line of development. Have you ever found those merges difficult to find? You’re not alone. In most cases, Git will skip showing you those kinds of merges with git log -- <path>, since those commits don’t modify <path> by themselves. Now you can bring those merges back into view with Git’s new --show-pulls flag to revision walking commands, like git log and git rev-list. For a particularly informative view, try:
    $ git log --oneline --graph --show-pulls -- <path>
    

    [source]

  • When you run git pull in a repository while you’re tracking a remote branch, one of four things can happen: there might be no changes, changes on the server, changes on the client, or changes on both. As long as there aren’t changes in both directions, resolving the difference is straightforward: when there are no changes at all, there’s nothing to do. When the server is strictly ahead of the client, the client fast-forwards to the state on the server. But when there are changes both on the client and on the server, what happens? That depends on whether or not you have the pull.rebase configuration set. If you do, your branch is rebased on top of where you’re pulling from; otherwise, a merge is performed. These merges can clutter your history and be tricky to back out of without starting your pull over from scratch. Git 2.28 now warns you of this case (specifically, when pull.rebase is unset, and you didn’t explicitly specify --[no-]rebase as an argument to git pull).

    [source]

  • Git now includes a GitHub Actions workflow which you can use to run Git’s own integration tests on a variety of platforms and compilers. There’s no extra effort required on your part: if you have a fork of git/git on GitHub, each push will be run through the array of tests necessary to validate your change. But wait: doesn’t Git use a mailing list for development? Yes, it does, but now you can use GitGitGadget on the git/git repository. This means that you can open a pull request, and have GitGitGadget send it to the mailing list on your behalf. So, if you’re more comfortable contributing to Git like that instead of composing emails manually, you can now contribute to Git from start to finish using GitHub.

    [source]

  • On the other hand, if you don’t mind sending an email or two, it’s now much easier to interact with the Git mailing list when you encounter a bug by running git bugreport. Running this new command will open your $EDITOR with a pre-populated form of questions that will be useful in debugging your issue. It also includes some helpful information about your system, like your CPU architecture, what version of Git you’re running, and so on. When you’re done, you can send that file as the body of an email to the Git mailing list, and rest assured that you’ve opened a helpful bug report.

    [source]

  • We’ve talked a number of times about Git’s clean and smudge filters and the corresponding process filter (which simulates multiple clean and smudge filters in a single process). Up until recently, the protocol for these filters has been relatively straightforward: Git supplies one end of the content, and the filter produces the other. In Git 2.27, more information is supplied over the protocol, like metadata about the branch being checked out in the case of git checkout, or the remote that was contacted in the case of a git fetch. This new information could be used by tools like Git LFS, for example, in order to figure out which remote to contact for extra data.

    [source]

  • Last but not least, git status learned some new tricks, too. You might recall from a recent blog post that we talked about how sparse checkouts can shrink the size of your monorepo. Now, git status can remind you of when you are in a sparse checkout by telling you what percentage of files you have checked out. For fans of git-prompt.sh, the prompt will now display SPARSE if you are in a sparse checkout, too.

    [source]

The rest of the iceberg

That’s just a sample of changes from the latest couple of releases. For more, check out the release notes for 2.27 and 2.28, or any previous version in the Git repository.

[1]: Note that since Bloom filters are not persisted automatically (that is, you have to pass --changed-paths explicitly on each subsequent write), it is a good idea to disable configuration that automatically generates commit-graphs, like fetch.writeCommitGraph and gc.writeCommitGraph.

Introducing GitHub’s OpenAPI Description

Post Syndicated from Marc-Andre Giroux original https://github.blog/2020-07-27-introducing-githubs-openapi-description/

The GitHub REST API has been through three major revisions since it was first released, only a month after the site was launched. We often receive feedback that our REST API is an inspiration to many for design, and that it’s an industry reference for what an API should look like. Today, we’re excited to announce an improvement to how developers can interact with the API. GitHub has open sourced an OpenAPI description of the REST API.

OpenAPI

The OpenAPI specification is a programming language agnostic standard that lets providers describe the interface of their HTTP APIs. This allows both humans and machines to discover the capabilities of an API without needing to first read documentation or understand the implementation. OpenAPI is a widely adopted industry standard and GitHub is proud to be part of the community and help push the standard forward.

Try it Out

The GitHub OpenAPI description contains more than 600 operations exposed in our API. For visual exploration of the API, you can load the description as a Postman Collection. Programmatically, the description can be used to generate mock servers, test suites, and bindings for languages not supported by Octokit.

The description is provided under two formats. The bundled version is preferred for most use cases as it makes use of OpenAPI components for reuse and readability. For tooling that has poor support for inline references to components, we also provide a fully dereferenced version.
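
As a small example of programmatic use, the sketch below loads a local copy of the description and counts operations per HTTP method; the file name is a placeholder for whichever variant you downloaded from the github/rest-api-description repository:

import json
from collections import Counter

HTTP_METHODS = {"get", "put", "post", "patch", "delete", "head", "options"}

# Placeholder file name; point this at the bundled or dereferenced description you downloaded.
with open("api.github.com.json") as f:
    description = json.load(f)

methods = Counter()
for path, path_item in description["paths"].items():
    for key in path_item:
        if key in HTTP_METHODS:
            methods[key] += 1

print(f"OpenAPI {description['openapi']}: {sum(methods.values())} operations")
print(dict(methods))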

Active Development

The description is currently in beta. Describing a 12-year-old API is no easy task. We’ve built this description using a mix of existing JSON schemas, documented examples, contract testing, and love. We expect to make the description even more complete and accurate as we go forward and as OpenAPI becomes central to our developer experience — internally and externally.

Quarterly releases of the description are available for GitHub Enterprise Server and GitHub Private Instances, with versions like v2.21. More frequent updates to the description will be available for GitHub.com.

How Can I Contribute?

We’re always looking to make our OpenAPI description more complete and accurate as well as making it easier to consume. If you’d like to help contribute to the description, check out our contributing guide. If something is not working for you, please file an Issue on the repository.

Building a complete OpenAPI description for the GitHub API was no easy task and could not have been possible without a great team. Thanks to Gregor Martynus for his initial work on describing the API, the Docs Engineering team for their amazing work around OpenAPI and documentation, Will Roden for his help validating the description with octo-go, as well as the folks at Redoc.ly who helped along the way.

Learn more about our REST API OpenAPI Description

*  The OpenAPI Initiative logo is a trademark of The Linux Foundation

AWS Solutions Constructs – A Library of Architecture Patterns for the AWS CDK

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-solutions-constructs-a-library-of-architecture-patterns-for-the-aws-cdk/

Cloud applications are built using multiple components, such as virtual servers, containers, serverless functions, storage buckets, and databases. Being able to provision and configure these resources in a safe, repeatable way is incredibly important to automate your processes and let you focus on the unique parts of your implementation.

With the AWS Cloud Development Kit, you can leverage the expressive power of your favorite programming languages to model your applications. You can use high-level components called constructs, preconfigured with “sensible defaults” that you can customize, to quickly build a new application. The CDK provisions your resources using AWS CloudFormation to get all the benefits of managing your infrastructure as code. One of the reasons I like the CDK, is that you can compose and share your own custom components as higher-level constructs.

As you can imagine, there are recurring patterns that can be useful to more than one customer. For this reason, today we are launching the AWS Solutions Constructs, an open source extension library for the CDK that provides well-architected patterns to help you build your unique solutions. CDK constructs mostly cover single services. AWS Solutions Constructs provide multi-service patterns that combine two or more CDK resources, and implement best practices such as logging and encryption.

Using AWS Solutions Constructs
To see the power of a pattern-based approach, let’s take a look at how that works when building a new application. As an example, I want to build an HTTP API to store data in an Amazon DynamoDB table. To keep the content of the table small, I can use DynamoDB Time to Live (TTL) to expire items after a few days. After the TTL expires, data is deleted from the table and sent, via DynamoDB Streams, to an AWS Lambda function to archive the expired data on Amazon Simple Storage Service (S3).

To build this application, I can use a few components:

  • An Amazon API Gateway endpoint for the API.
  • A DynamoDB table to store data.
  • A Lambda function to process the API requests, and store data in the DynamoDB table.
  • DynamoDB Streams to capture data changes.
  • A Lambda function processing data changes to archive the expired data.

Can I make it simpler? Looking at the available patterns in the AWS Solutions Constructs, I find two that can help me build my app:

  • aws-apigateway-lambda, a Construct that implements an API Gateway REST API connected to a Lambda function. As an example of the “sensible defaults” used by AWS Solutions Constructs, this pattern enables CloudWatch logging for the API Gateway.
  • aws-dynamodb-stream-lambda, a Construct implementing a DynamoDB table streaming data changes to a Lambda function with the least privileged permissions.

To build the final architecture, I simply connect those two Constructs together:

I am using TypeScript to define the CDK stack, and Node.js for the Lambda functions. Let’s start with the CDK stack:

 

import * as cdk from '@aws-cdk/core';
import * as lambda from '@aws-cdk/aws-lambda';
import * as apigw from '@aws-cdk/aws-apigateway';
import * as dynamodb from '@aws-cdk/aws-dynamodb';
import { ApiGatewayToLambda } from '@aws-solutions-constructs/aws-apigateway-lambda';
import { DynamoDBStreamToLambda } from '@aws-solutions-constructs/aws-dynamodb-stream-lambda';

export class DemoConstructsStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const apiGatewayToLambda = new ApiGatewayToLambda(this, 'ApiGatewayToLambda', {
      deployLambda: true,
      lambdaFunctionProps: {
        code: lambda.Code.fromAsset('lambda'),
        runtime: lambda.Runtime.NODEJS_12_X,
        handler: 'restApi.handler'
      },
      apiGatewayProps: {
        defaultMethodOptions: {
          authorizationType: apigw.AuthorizationType.NONE
        }
      }
    });

    const dynamoDBStreamToLambda = new DynamoDBStreamToLambda(this, 'DynamoDBStreamToLambda', {
      deployLambda: true,
      lambdaFunctionProps: {
        code: lambda.Code.fromAsset('lambda'),
        runtime: lambda.Runtime.NODEJS_12_X,
        handler: 'processStream.handler'
      },
      dynamoTableProps: {
        tableName: 'my-table',
        partitionKey: { name: 'id', type: dynamodb.AttributeType.STRING },
        timeToLiveAttribute: 'ttl'
      }
    });

    const apiFunction = apiGatewayToLambda.lambdaFunction;
    const dynamoTable = dynamoDBStreamToLambda.dynamoTable;

    dynamoTable.grantReadWriteData(apiFunction);
    apiFunction.addEnvironment('TABLE_NAME', dynamoTable.tableName);
  }
}

At the beginning of the stack, I import the standard CDK constructs for the Lambda function, the API Gateway endpoint, and the DynamoDB table. Then, I add the two patterns from the AWS Solutions Constructs, ApiGatewayToLambda and DynamoDBStreamToLambda.

After declaring the two ApiGatewayToLambda and DynamoDBStreamToLambda constructs, I store the Lambda function, created by the ApiGatewayToLambda constructs, and the DynamoDB table, created by DynamoDBStreamToLambda, in two variables.

At the end of the stack, I “connect” the two patterns together by granting permissions to the Lambda function to read/write in the DynamoDB table, and add the name of the DynamoDB table to the environment of the Lambda function, so that it can be used in the function code to store data in the table.

The code of the two Lambda functions is in the lambda folder of the CDK application. I am using the Node.js 12 runtime.

The restApi.js function implements the API and writes data to the DynamoDB table. The URL path is used as the partition key, and all the query string parameters in the URL are stored as attributes. The TTL for the item is computed by adding a time window of 7 days to the current time.

const { DynamoDB } = require("aws-sdk");

const docClient = new DynamoDB.DocumentClient();

const TABLE_NAME = process.env.TABLE_NAME;
const TTL_WINDOW = 7 * 24 * 60 * 60; // 7 days expressed in seconds

exports.handler = async function (event) {

  const item = event.queryStringParameters;
  item.id = event.pathParameters.proxy;

  const now = new Date(); 
  item.ttl = Math.round(now.getTime() / 1000) + TTL_WINDOW;

  let statusCode = 204;

  try {
    // DocumentClient.put(...).promise() rejects on error rather than returning
    // an error field, so failures are handled in the catch block below.
    await docClient.put({
      TableName: TABLE_NAME,
      Item: item
    }).promise();
  } catch (err) {
    console.error('request: ', JSON.stringify(event, undefined, 2));
    console.error('error: ', err);
    statusCode = 500;
  }

  return {
    statusCode: statusCode
  };
};

The processStream.js function processes change records from the DynamoDB Stream, looking for the items deleted by TTL. The archive functionality is not implemented in this sample code.

exports.handler = async function (event) {
  event.Records.forEach((record) => {
    console.log('Stream record: ', JSON.stringify(record, null, 2));
    // Only TTL deletions carry a userIdentity; regular inserts and updates do not.
    if (record.userIdentity &&
      record.userIdentity.type == "Service" &&
      record.userIdentity.principalId == "dynamodb.amazonaws.com") {

      // Record deleted by DynamoDB Time to Live (TTL)

      // I can archive the record to S3, for example using Kinesis Data Firehose.
    }
  });
};

Let’s see if this works! First, I need to install all dependencies. To simplify dependencies, each release of AWS Solutions Constructs is linked to the corresponding version of the CDK. In this case, I am using version 1.46.0 for both the CDK and the AWS Solutions Constructs patterns. The first three commands install plain CDK constructs. The last two commands install the AWS Solutions Constructs patterns I am using for this application.

npm install @aws-cdk/[email protected]
npm install @aws-cdk/[email protected]
npm install @aws-cdk/[email protected]
npm install @aws-solutions-constructs/aws-apigateway-lambda@1.46.0
npm install @aws-solutions-constructs/aws-dynamodb-stream-lambda@1.46.0

Now, I build the application and use the CDK to deploy the application.

npm run build
cdk deploy

Towards the end of the output of the cdk deploy command, a green check mark tells me that the deployment of the stack is complete. Just below, in the Outputs, I find the endpoint of the API Gateway.

 ✅  DemoConstructsStack

Outputs:
DemoConstructsStack.ApiGatewayToLambdaLambdaRestApiEndpoint9800D4B5 = https://1a2c3c4d.execute-api.eu-west-1.amazonaws.com/prod/

I can now use curl to test the API:

curl "https://1a2c3c4d.execute-api.eu-west-1.amazonaws.com/prod/danilop?name=Danilo&amp;company=AWS"

Let’s have a look at the DynamoDB table:

The item is stored, and the TTL is set. After a week, the item will be deleted and sent via DynamoDB Streams to the processStream.js function.

After I complete my testing, I use the CDK again to quickly delete all resources created for this application:

cdk destroy

Available Now
The AWS Solutions Constructs are available now for TypeScript and Python. The AWS Solutions Builders team is working to make these constructs available for Java and C# with the CDK as well, so stay tuned. There is no cost for using the AWS Solutions Constructs, or the CDK; you only pay for the resources created when deploying the stack.

In this first release, 25 patterns are included, covering lots of different use cases. Which new patterns and features should we focus on next? Give us your feedback in the open source project repository!

Danilo

Introducing GitHub Super Linter: one linter to rule them all

Post Syndicated from Lucas Gravley original https://github.blog/2020-06-18-introducing-github-super-linter-one-linter-to-rule-them-all/

Setting up a new repository with all the right linters for the different types of code can be time consuming and tedious. So many tools and configurations to choose from and often more than one linter is needed to cover all the languages used.

The GitHub Super Linter was built out of necessity by the GitHub Services DevOps Engineering team to maintain consistency in our documentation and code while making communication and collaboration across the company a more productive experience. Now we are open sourcing that so everyone can use and improve it!

The Super Linter solves many of these requirements through automation. Some included features:

  • Prevent broken code from being uploaded to master branches
  • Help establish coding best practices across multiple languages
  • Build guidelines for code layout and format
  • Automate the process to help streamline code reviews
  • With these basic criteria, we should be shipping better, cleaner, and more stable code internally and to our customers and partners

What is it?

The Super Linter is a source code repository that is packaged into a Docker container and called by GitHub Actions. This allows for any repository on GitHub.com to call the Super Linter and start utilizing its benefits.

The Super Linter will currently support a lot of languages and more coming in the future. For details on languages, check out the README.md.

How it works

When you’ve set your repository to start running this action, any time you open a pull request, it will start linting the code base and return the results via the Status API. It will let you know if any of your code changes passed successfully, or if any errors were detected, where they are, and what they are. This then allows the developer to go back to their branch, fix any issues, and create a new push to the open pull request. At that point, the Super Linter will run again and validate the updated code and repeat the process. You can configure your branch protection rules to make sure all code must pass before being able to merge, as an additional measure.

There’s a ton of customization with flags and templates that can help you customize the Super Linter to your individual repository. Just follow the detailed directions at the Super Linter repository and the Super Linter wiki.

This tool can also be helpful for any repository where multiple types of code and/or documentation all live together (monorepo).

GitHub Super Linter in action

Default rules

Standardizing a rule set across the Super Linter has been an interesting challenge as each developer is unique in how they code. This is why we allow users to use any rules for the linter as they see fit for their repository. But, if no ruleset is defined, we must default to a certain standard.

The rule set for Ruby and Rails are pulled from the Ruby gem: rubocop-github and follow the same rules and versioning we use on GitHub.com.

For other languages, we choose the configuration that is the default when installing the linter, such as coffeelint or yamllint. For others, we try to find a happy middle ground that lays simple groundwork and helps establish some best practices, like Markdownlint or pylint.

The beauty of this is that, out of the box, you start establishing the framework, and if your team decides at any point that additional customization is needed, you have full ability to do so.

Just navigate to the Super Linter and copy templates from the TEMPLATES folder to your local repository.

Join in the fun

We encourage you to set up this action and start the process of cleaning up your codebase and building your team’s standards and best practices.

How can I contribute?

We’re always looking to update best practices, add additional languages, and make the tool easier for consumption. If you’d like to help contribute to this action, check out our contributing guide.

Learn more about our Super Linter

An open source camera stack for Raspberry Pi using libcamera

Post Syndicated from David Plowman original https://www.raspberrypi.org/blog/an-open-source-camera-stack-for-raspberry-pi-using-libcamera/

Since we released the first Raspberry Pi camera module back in 2013, users have been clamouring for better access to the internals of the camera system, and even to be able to attach camera sensors of their own to the Raspberry Pi board. Today we’re releasing our first version of a new open source camera stack which makes these wishes a reality.

(Note: in what follows, you may wish to refer to the glossary at the end of this post.)

We’ve had the building blocks for connecting other sensors and providing lower-level access to the image processing for a while, but Linux has been missing a convenient way for applications to take advantage of this. In late 2018 a group of Linux developers started a project called libcamera to address that. We’ve been working with them since then, and we’re pleased now to announce a camera stack that operates within this new framework.

Here’s how our work fits into the libcamera project.

We’ve supplied a Pipeline Handler that glues together our drivers and control algorithms, and presents them to libcamera with the API it expects.

Here’s a little more on what this has entailed.

V4L2 drivers

V4L2 (Video for Linux 2) is the Linux kernel driver framework for devices that manipulate images and video. It provides a standardised mechanism for passing video buffers to, and/or receiving them from, different hardware devices. Whilst it has proved somewhat awkward as a means of driving entire complex camera systems, it can nonetheless provide the basis of the hardware drivers that libcamera needs to use.

Consequently, we’ve upgraded both the version 1 (Omnivision OV5647) and version 2 (Sony IMX219) camera drivers so that they feature a variety of modes and resolutions, operating in the standard V4L2 manner. Support for the new Raspberry Pi High Quality Camera (using the Sony IMX477) will be following shortly. The Broadcom Unicam driver – also V4L2‑based – has been enhanced too, signalling the start of each camera frame to the camera stack.

Finally, dumping raw camera frames (in Bayer format) into memory is of limited value, so the V4L2 Broadcom ISP driver provides all the controls needed to turn raw images into beautiful pictures!

Configuration and control algorithms

Of course, being able to configure Broadcom’s ISP doesn’t help you to know what parameters to supply. For this reason, Raspberry Pi has developed from scratch its own suite of ISP control algorithms (sometimes referred to generically as 3A Algorithms), and these are made available to our users as well. Some of the most well known control algorithms include:

  • AEC/AGC (Auto Exposure Control/Auto Gain Control): this monitors image statistics in order to drive the camera exposure to an appropriate level.
  • AWB (Auto White Balance): this corrects for the ambient light that is illuminating a scene, and makes objects that appear grey to our eyes come out actually grey in the final image.

But there are many others too, such as ALSC (Auto Lens Shading Correction, which corrects vignetting and colour variation across an image), and control for noise, sharpness, contrast, and all other aspects of image processing. Here’s how they work together.

The control algorithms all receive statistics information from the ISP, and cooperate in filling in metadata for each image passing through the pipeline. At the end, the metadata is used to update control parameters in both the image sensor and the ISP.
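
To make the idea of a control algorithm a little more concrete, here is a deliberately simplistic gray-world white balance sketch in Python. It is not the algorithm Raspberry Pi ships; it only illustrates the statistics-to-gains idea described above:

import numpy as np

def gray_world_awb(rgb):
    # Gather per-channel statistics, then compute gains that equalise the channel averages.
    means = rgb.reshape(-1, 3).mean(axis=0)
    gains = means.mean() / means
    return np.clip(rgb * gains, 0.0, 1.0), gains

# Synthetic image with a warm colour cast standing in for real sensor data.
image = np.random.rand(480, 640, 3) * np.array([1.0, 0.8, 0.6])
balanced, gains = gray_world_awb(image)
print("per-channel gains:", gains)

In the real pipeline the statistics come from the ISP rather than from the full image, and the resulting gains are written into the per-frame metadata before being applied.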

Previously these functions were proprietary and closed source, and ran on the Broadcom GPU. Now, the GPU just shovels pixels through the ISP hardware block and notifies us when it’s done; practically all the configuration is computed and supplied from open source Raspberry Pi code on the ARM processor. A shim layer still exists on the GPU, and turns Raspberry Pi’s own image processing configuration into the proprietary functions of the Broadcom SoC.

To help you configure Raspberry Pi’s control algorithms correctly for a new camera, we include a Camera Tuning Tool. Or if you’d rather do your own thing, it’s easy to modify the supplied algorithms, or indeed to replace them entirely with your own.

Why libcamera?

Whilst ISP vendors are in some cases contributing open source V4L2 drivers, the reality is that all ISPs are very different. Advertising these differences through kernel APIs is fine – but it creates an almighty headache for anyone trying to write a portable camera application. Fortunately, this is exactly the problem that libcamera solves.

We provide all the pieces for Raspberry Pi-based libcamera systems to work simply “out of the box”. libcamera remains a work in progress, but we look forward to continuing to help this effort, and to contributing an open and accessible development platform that is available to everyone.

Summing it all up

So far as we know, there are no similar camera systems where large parts, including at least the control (3A) algorithms and possibly driver code, are not closed and proprietary. Indeed, for anyone wishing to customise a camera system – perhaps with their own choice of sensor – or to develop their own algorithms, there would seem to be very few options – unless perhaps you happen to be an extremely large corporation.

In this respect, the new Raspberry Pi Open Source Camera System is providing something distinctly novel. For some users and applications, we expect its accessible and non-secretive nature may even prove quite game-changing.

What about existing camera applications?

The new open source camera system does not replace any existing camera functionality, and for the foreseeable future the two will continue to co-exist. In due course we expect to provide additional libcamera-based versions of raspistill, raspivid and PiCamera – so stay tuned!

Where next?

If you want to learn more about the libcamera project, please visit https://libcamera.org.

To try libcamera for yourself with a Raspberry Pi, please follow the instructions in our online documentation, where you’ll also find the full Raspberry Pi Camera Algorithm and Tuning Guide.

If you’d like to know more, and can’t find an answer in our documentation, please go to the Camera Board forum. We’ll be sure to keep our eyes open there to pick up any of your questions.

Acknowledgements

Thanks to Naushir Patuck and Dave Stevenson for doing all the really tricky bits (lots of V4L2-wrangling).

Thanks also to the libcamera team (Laurent Pinchart, Kieran Bingham, Jacopo Mondi and Niklas Söderlund) for all their help in making this project possible.

 

Glossary

3A, 3A Algorithms: refers to AEC/AGC (Auto Exposure Control/Auto Gain Control), AWB (Auto White Balance) and AF (Auto Focus) algorithms, but may implicitly cover other ISP control algorithms. Note that Raspberry Pi does not implement AF (Auto Focus), as none of our supported camera modules requires it
AEC: Auto Exposure Control
AF: Auto Focus
AGC: Auto Gain Control
ALSC: Auto Lens Shading Correction, which corrects vignetting and colour variations across an image. These are normally caused by the type of lens being used and can vary in different lighting conditions
AWB: Auto White Balance
Bayer: an image format where each pixel has only one colour component (one of R, G or B), creating a sort of “colour mosaic”. All the missing colour values must subsequently be interpolated. This is a raw image format meaning that no noise, sharpness, gamma, or any other processing has yet been applied to the image
CSI-2: Camera Serial Interface (version) 2. This is the interface format between a camera sensor and Raspberry Pi
GPU: Graphics Processing Unit. But in this case it refers specifically to the multimedia coprocessor on the Broadcom SoC. This multimedia processor is proprietary and closed source, and cannot directly be programmed by Raspberry Pi users
ISP: Image Signal Processor. A hardware block that turns raw (Bayer) camera images into full colour images (either RGB or YUV)
Raw: see Bayer
SoC: System on Chip. The Broadcom processor at the heart of all Raspberry Pis
Unicam: the CSI-2 receiver on the Broadcom SoC on the Raspberry Pi. Unicam receives pixels being streamed out by the image sensor
V4L2: Video for Linux 2. The Linux kernel driver framework for devices that process video images. This includes image sensors, CSI-2 receivers, and ISPs

The post An open source camera stack for Raspberry Pi using libcamera appeared first on Raspberry Pi.

Announcing TorchServe, An Open Source Model Server for PyTorch

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/announcing-torchserve-an-open-source-model-server-for-pytorch/

PyTorch is one of the most popular open source libraries for deep learning. Developers and researchers particularly enjoy the flexibility it gives them in building and training models. Yet, this is only half the story, and deploying and managing models in production is often the most difficult part of the machine learning process: building bespoke prediction APIs, scaling them, securing them, etc.

One way to simplify the model deployment process is to use a model server, i.e. an off-the-shelf web application specially designed to serve machine learning predictions in production. Model servers make it easy to load one or several models, automatically creating a prediction API backed by a scalable web server. They’re also able to run preprocessing and postprocessing code on prediction requests. Last but not least, model servers also provide production-critical features like logging, monitoring, and security. Popular model servers include TensorFlow Serving and the Multi Model Server.

Today, I’m extremely happy to announce TorchServe, a PyTorch model serving library that makes it easy to deploy trained PyTorch models at scale without having to write custom code.

Introducing TorchServe
TorchServe is a collaboration between AWS and Facebook, and it’s available as part of the PyTorch open source project. If you’re interested in how the project was initiated, you can read the initial RFC on Github.

With TorchServe, PyTorch users can now bring their models to production quicker, without having to write custom code: on top of providing a low latency prediction API, TorchServe embeds default handlers for the most common applications such as object detection and text classification. In addition, TorchServe includes multi-model serving, model versioning for A/B testing, monitoring metrics, and RESTful endpoints for application integration. As you would expect, TorchServe supports any machine learning environment, including Amazon SageMaker, container services, and Amazon Elastic Compute Cloud (EC2).

Several customers are already enjoying the benefits of TorchServe.

Toyota Research Institute Advanced Development, Inc. (TRI-AD) is developing software for automated driving at Toyota Motor Corporation. Says Yusuke Yachide, Lead of ML Tools at TRI-AD: “we continuously optimize and improve our computer vision models, which are critical to TRI-AD’s mission of achieving safe mobility for all with autonomous driving. Our models are trained with PyTorch on AWS, but until now PyTorch lacked a model serving framework. As a result, we spent significant engineering effort in creating and maintaining software for deploying PyTorch models to our fleet of vehicles and cloud servers. With TorchServe, we now have a performant and lightweight model server that is officially supported and maintained by AWS and the PyTorch community”.

Matroid is a maker of computer vision software that detects objects and events in video footage. Says Reza Zadeh, Founder and CEO at Matroid Inc.: “we develop a rapidly growing number of machine learning models using PyTorch on AWS and on-premise environments. The models are deployed using a custom model server that requires converting the models to a different format, which is time-consuming and burdensome. TorchServe allows us to simplify model deployment using a single servable file that also serves as the single source of truth, and is easy to share and manage”.

Now, I’d like to show you how to install TorchServe, and load a pretrained model on Amazon Elastic Compute Cloud (EC2). You can try other environments by following the documentation.

Installing TorchServe
First, I fire up a CPU-based Amazon Elastic Compute Cloud (EC2) instance running the Deep Learning AMI (Ubuntu edition). This AMI comes preinstalled with several dependencies that I’ll need, which will speed up setup. Of course you could use any AMI instead.

TorchServe is implemented in Java, and I need the latest OpenJDK to run it.

sudo apt install openjdk-11-jdk

Next, I create and activate a new Conda environment for TorchServe. This will keep my Python packages nice and tidy (virtualenv works too, of course).

conda create -n torchserve

source activate torchserve

Next, I install dependencies for TorchServe.

pip install sentencepiece       # not available as a Conda package

conda install psutil pytorch torchvision torchtext -c pytorch

If you’re using a GPU instance, you’ll need an extra package.

conda install cudatoolkit=10.1

Now that dependencies are installed, I can clone the TorchServe repository, and install TorchServe.

git clone https://github.com/pytorch/serve.git

cd serve

pip install .

cd model-archiver

pip install .

Setup is complete, let’s deploy a model!

Deploying a Model
For the sake of this demo, I’ll simply download a pretrained model from the PyTorch model zoo. In real life, you would probably use your own model.

wget https://download.pytorch.org/models/densenet161-8d451a50.pth

Next, I need to package the model into a model archive. A model archive is a ZIP file storing all model artefacts, i.e. the model itself (densenet161-8d451a50.pth), a Python script to load the state dictionary (matching tensors to layers), and any extra file you may need. Here, I include a file named index_to_name.json, which maps class identifiers to class names. This will be used by the built-in image_classifier handler, which is in charge of the prediction logic. Other built-in handlers are available (object_detector, text_classifier, image_segmenter), and you can implement your own.

torch-model-archiver --model-name densenet161 --version 1.0 \
--model-file examples/image_classifier/densenet_161/model.py \
--serialized-file densenet161-8d451a50.pth \
--extra-files examples/image_classifier/index_to_name.json \
--handler image_classifier

Next, I create a directory to store model archives, and I move the one I just created there.

mkdir model_store

mv densenet161.mar model_store/

Now, I can start TorchServe, pointing it at the model store and at the model I want to load. Of course, I could load several models if needed.

torchserve --start --model-store model_store --models densenet161=densenet161.mar

Still on the same machine, I grab an image and easily send it to TorchServe for local serving using an HTTP POST request. Note the format of the URL, which includes the name of the model I want to use.

curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg

curl -X POST http://127.0.0.1:8080/predictions/densenet161 -T kitten.jpg

The result appears immediately. Note that class names are visible, thanks to the built-in handler.

[
{"tiger_cat": 0.4693356156349182},
{"tabby": 0.46338796615600586},
{"Egyptian_cat": 0.06456131488084793},
{"lynx": 0.0012828155886381865},
{"plastic_bag": 0.00023323005007114261}
]

I then stop TorchServe with the ‘stop’ command.

torchserve --stop

As you can see, it’s easy to get started with TorchServe using the default configuration. Now let me show you how to set it up for remote serving.

Configuring TorchServe for Remote Serving
Let’s create a configuration file for TorchServe, named config.properties (the default name). This file defines which model to load, and sets up remote serving. Here, I’m binding the server to all public IP addresses, but you can restrict it to a specific address if you want to. As this is running on an EC2 instance, I need to make sure that ports 8080 and 8081 are open in the Security Group.

model_store=model_store
load_models=densenet161.mar
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081

Now I can start TorchServe in the same directory, without having to pass any command line arguments.

torchserve --start

Moving back to my local machine, I can now invoke TorchServe remotely, and get the same result.

curl -X POST http://ec2-54-85-61-250.compute-1.amazonaws.com:8080/predictions/densenet161 -T kitten.jpg

You probably noticed that I used HTTP. I’m guessing a lot of you will require HTTPS in production, so let me show you how to set it up.

Configuring TorchServe for HTTPS
TorchServe can use either the Java keystore or a certificate. I’ll go with the latter.

First, I create a certificate and a private key with openssl.

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout mykey.key -out mycert.pem

Then, I update the configuration file to define the location of the certificate and key, and I bind TorchServe to its default secure ports (don’t forget to update the Security Group).

model_store=model_store
load_models=densenet161.mar
inference_address=https://0.0.0.0:8443
management_address=https://0.0.0.0:8444
private_key_file=mykey.key
certificate_file=mycert.pem

I restart TorchServe, and I can now invoke it with HTTPS. As I use a self-signed certificate, I need to pass the ‘--insecure’ flag to curl.

curl --insecure -X POST https://ec2-54-85-61-250.compute-1.amazonaws.com:8443/predictions/densenet161 -T kitten.jpg

There’s a lot more to TorchServe configuration, and I encourage you to read its documentation!

Getting Started
TorchServe is available now at https://github.com/pytorch/serve.

Give it a try, and please send us feedback on Github.

– Julien

 

 

 

AWS Step Functions support in Visual Studio Code

Post Syndicated from Rob Sutter original https://aws.amazon.com/blogs/compute/aws-step-functions-support-in-visual-studio-code/

The AWS Toolkit for Visual Studio Code has been installed over 115,000 times since launching in July 2019. We are excited to announce toolkit support for AWS Step Functions, enabling you to define, visualize, and create your Step Functions workflows without leaving VS Code.

Version 1.8 of the toolkit provides two new commands in the Command Palette to help you define and visualize your workflows. The toolkit also provides code snippets for seven different Amazon States Language (ASL) state types and additional service integrations to speed up workflow development. Automatic linting detects errors in your state machine as you type, and provides tooltips to help you correct the errors. Finally, the toolkit allows you to create or update Step Functions workflows in your AWS account without leaving VS Code.

Defining a new state machine

To define a new Step Functions state machine, first open the VS Code Command Palette by choosing Command Palette from the View menu. Enter Step Functions to filter the available options and choose AWS: Create a new Step Functions state machine.

Screen capture of the Command Palette in Visual Studio Code with the text ">AWS Step Functions" entered

Creating a new Step Functions state machine in VS Code

A dialog box appears with several options to help you get started quickly. Select Hello world to create a basic example using a series of Pass states.

A screen capture of the Visual Studio Code Command Palette "Select a starter template" dialog with "Hello world" selected

Selecting the “Hello world” starter template

VS Code creates a new Amazon States Language file containing a workflow with examples of the Pass, Choice, Fail, Wait, and Parallel states.

A screen capture of a Visual Studio Code window with a "Hello World" example state machine

The “Hello World” example state machine

Pass states allow you to define your workflow before building the implementation of your logic with Task states. This lets you work with business process owners to ensure you have the workflow right before you start writing code. For more information on the other state types, see State Types in the ASL documentation.

Save your new workflow by choosing Save from the File menu. VS Code automatically applies the .asl.json extension.
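
If you prefer to see the shape of that file in code, here is a minimal sketch (in Python, and deliberately simpler than the toolkit’s actual Hello world template) that builds an ASL definition from two Pass states and writes it out with the .asl.json extension:

import json

# A minimal two-state workflow; the toolkit's own template also demonstrates
# Choice, Fail, Wait, and Parallel states.
hello_world = {
    "Comment": "A minimal Hello World built from Pass states",
    "StartAt": "Hello",
    "States": {
        "Hello": {"Type": "Pass", "Result": "Hello", "Next": "World"},
        "World": {"Type": "Pass", "Result": "World", "End": True},
    },
}

with open("hello-world.asl.json", "w") as f:
    json.dump(hello_world, f, indent=2)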

Visualizing state machines

In addition to helping define workflows, the toolkit also enables you to visualize your workflows without leaving VS Code.

To visualize your new workflow, open the Command Palette and enter Preview state machine to filter the available options. Choose AWS: Preview state machine graph.

A screen capture of the Visual Studio Code Command Palette with the text ">Preview state machine" entered and the option "AWS: Preview state machine graph" highlighted

Previewing the state machine graph in VS Code

The toolkit renders a visualization of your workflow in a new tab to the right of your workflow definition. The visualization updates automatically as the workflow definition changes.

A screen capture of a Visual Studio Code window with two side-by-side tabs, one with a state machine definition and one with a preview graph for the same state machine

A state machine preview graph

Modifying your state machine definition

The toolkit provides code snippets for 12 different ASL states and service integrations. To insert a code snippet, place your cursor within the States object in your workflow and press Ctrl+Space to show the list of available states.

A screen capture of a Visual Studio Code window with a code snippet insertion dialog showing twelve Amazon States Langauge states

Code snippets are available for twelve ASL states

In this example, insert a newline after the definition of the Pass state, press Ctrl+Space, and choose Map State to insert a code snippet with the required structure for an ASL Map State.

Debugging state machines

The toolkit also includes features to help you debug your Step Functions state machines. Visualization is one feature, as it allows the builder and the product owner to confirm that they have a shared understanding of the relevant process.

Automatic linting is another feature that helps you debug your workflows. For example, when you insert the Map state into your workflow, a number of errors are detected, underlined in red in the editor window, and highlighted in red in the Minimap. The visualization tab also displays an error to inform you that the workflow definition has errors.

A screen capture of a Visual Studio Code window with a tooltip dialog indicating an "Unreachable state" error

A tooltip indicating an “Unreachable state” error

Hovering over an error opens a tooltip with information about the error. In this case, the toolkit is informing you that MapState is unreachable. Correct this error by changing the value of Next in the Pass state above from Hello World Example to MapState. The red underline automatically disappears, indicating the error has been resolved.

To finish reconciling the errors in your workflow, cut all of the following states from Hello World Example? through Hello World and paste into MapState, replacing the existing values of MapState.Iterator.States. The workflow preview updates automatically, indicating that the errors have been resolved. The MapState is indicated by the three dashed lines surrounding most of the workflow.

A Visual Studio Code window displaying two tabs, an updated state machine definition and the automatically-updated preview of the same state machine

Automatically updating the state machine preview after changes

Creating and updating state machines in your AWS account

The toolkit enables you to publish your state machine directly to your AWS account without leaving VS Code. Before publishing a state machine to your account, ensure that you establish credentials for your AWS account for the toolkit.

Creating a state machine in your AWS account

To publish a new state machine to your AWS account, bring up the VS Code Command Palette as before. Enter Publish to filter the available options and choose AWS: Publish state machine to Step Functions.

Screen capture of the Visual Studio Command Palette with the command "AWS: Publish state machine to Step Functions" highlighted

Publishing a state machine to AWS Step Functions

Choose Quick Create from the dialog box to create a new state machine in your AWS account.

Screen Capture from a Visual Studio Code flow to publish a state machine to AWS Step Functions with "Quick Create" highlighted

Publishing a state machine to AWS Step Functions

Select an existing execution role for your state machine to assume. This role must already exist in your AWS account.

For more information on creating execution roles for state machines, please visit Creating IAM Roles for AWS Step Functions.

Screen capture from Visual Studio Code showing a selection execution role dialog with "HelloWorld_IAM_Role" selected

Selecting an IAM execution role for a state machine

Provide a name for the new state machine in your AWS account, for example, Hello-World. The name must be from one to 80 characters, and can use alphanumeric characters, dashes, or underscores.

Screen capture from a Visual Studio Code flow entering "Hello-World" as a state machine name

Naming your state machine

Press the Enter or Return key to confirm the name of your state machine. The Output console opens, and the toolkit displays the result of creating your state machine. The toolkit provides the full Amazon Resource Name (ARN) of your new state machine on completion.

Screen capture from Visual Studio Code showing the successful creation of a new state machine in the Output window

Output of creating a new state machine

You can check creation for yourself by visiting the Step Functions page in the AWS Management Console. Choose the newly-created state machine and the Definition tab. The console displays the definition of your state machine along with a preview graph.

Screen capture of the AWS Management Console showing the newly-created state machine

Viewing the new state machine in the AWS Management Console

Updating a state machine in your AWS account

It is common to change workflow definitions as you refine your application. To update your state machine in your AWS account, choose Quick Update instead of Quick Create. Select your existing workflow.

A screen capture of a Visual Studio Code dialog box with a single state machine displayed and highlighted

Selecting an existing state machine to update

The toolkit displays “Successfully updated state machine” and the ARN of your state machine in the Output window on completion.

Summary

In this post, you learn how to use the AWS Toolkit for VS Code to create and update Step Functions state machines in your local development environment. You discover how sample templates, code snippets, and automatic linting can accelerate your development workflows. Finally, you see how to create and update Step Functions workflows in your AWS account without leaving VS Code.

Install the latest release of the toolkit and start building your workflows in VS Code today.

 

Improving Transparency of AWS Elastic Beanstalk

Post Syndicated from Rob Sutter original https://aws.amazon.com/blogs/compute/improving-transparency-of-aws-elastic-beanstalk/

This post is courtesy of David LaBissoniere, Software Development Manager, AWS Elastic Beanstalk.

Today I want to discuss two recent announcements from the AWS Elastic Beanstalk team which improve transparency into our planning and development. We launched a new public roadmap, and we shifted to developing the Elastic Beanstalk command line interface (EB CLI) on GitHub as a community-involved open source project.

Public Roadmap

In January, we launched an experimental public roadmap on GitHub, joining other teams like AWS container services, AWS CloudFormation, and AWS App Mesh. The roadmap allows us to be more transparent about our priorities, and enables you to directly influence them. You can propose a feature by opening a GitHub issue, or comment on existing issues. 2020 is shaping up to be a significant year for us, and as we continue to invest in the service, we want customer input to help direct our focus.

The roadmap itself is built as a GitHub project board and contains five columns:

Just Shipped — Launched and available for production use.
Public Beta — Available in a preview form but not yet recommended for production usage.
Coming Soon — Launching soon, generally within the next one to three months.
We’re Working On It — In progress, but further out.
Researching — We’re interested in this feature but are still thinking about the best way to implement it.

Screen capture of the AWS Elastic Beanstalk project board on GitHub

Please feel free to create a GitHub issue for a feature you want us to support, or give a thumbs-up to existing issues. We’d also love to hear from you in the issue comments about how you’d like to use a particular feature or how you think it should work. While the roadmap doesn’t include every single item we are working on, it does include many of the regular incremental launches customers rely on, for example, new platform runtime updates like PHP 7.3 or .NET Core 3.1. We’re starting out with a subset of our planned and in-flight work, and expect to gradually expand our use of the roadmap over the course of the year.

EB CLI on GitHub

A popular way to use Elastic Beanstalk is our command line interface, the EB CLI. As of January 16, it is hosted on GitHub as an Apache 2.0-licensed open source project. We plan to do nearly all of our CLI development openly on GitHub and welcome pull requests from the community. Many customers rely on the EB CLI as part of their development and deployment workflows. We hope to improve transparency into this critical tool by open-sourcing it, and we also hope you join us in improving it.

We’re thrilled to start off the year with these two announcements. Watch the roadmap for more announcements in this space!

SVT-AV1: an open-source AV1 encoder and decoder

Post Syndicated from Netflix Technology Blog original https://netflixtechblog.com/svt-av1-an-open-source-av1-encoder-and-decoder-ad295d9b5ca2

SVT-AV1: open-source AV1 encoder and decoder

by Andrey Norkin, Joel Sole, Mariana Afonso, Kyle Swanson, Agata Opalach, Anush Moorthy, Anne Aaron

SVT-AV1 is an open-source AV1 codec implementation hosted on GitHub https://github.com/OpenVisualCloud/SVT-AV1/ under a BSD + patent license. As mentioned in our earlier blog post, Intel and Netflix have been collaborating on the SVT-AV1 encoder and decoder framework since August 2018. The teams have been working closely on SVT-AV1 development, discussing architectural decisions, implementing new tools, and improving compression efficiency. Since open-sourcing the project, other partner companies and the open-source community have contributed to SVT-AV1. In this tech blog, we will report the current status of the SVT-AV1 project, as well as the characteristics and performance of the encoder and decoder.

SVT-AV1 codebase status

The SVT-AV1 repository includes both an AV1 encoder and decoder, which share a significant amount of the code. The SVT-AV1 decoder is fully functional and compliant with the AV1 specification for all three profiles (Main, High, and Professional).

The SVT-AV1 encoder supports all AV1 tools which contribute to compression efficiency. Compared to the most recent master version of libaom (AV1 reference software), SVT-AV1 is similar in compression efficiency and at the same time achieves significantly lower encoding latency on multi-core platforms when using its inherent parallelization capabilities.

SVT-AV1 is written in C and can be compiled on major platforms, such as Windows, Linux, and macOS. In addition to the pure C function implementations, which allow for more flexible experimentation, the codec features extensive assembly and intrinsic optimizations for the x86 platform. See the next section for an outline of the main SVT-AV1 features that allow high performance at competitive compression efficiency. SVT-AV1 also includes extensive documentation on the encoder design, intended to facilitate the onboarding process for new developers.

Architectural features

One of Intel’s goals for SVT-AV1 development was to create an AV1 encoder that could offer performance and scalability. SVT-AV1 uses parallelization at several stages of the encoding process, which allows it to adapt to the number of available cores, including the newest servers with significant core count. This makes it possible for SVT-AV1 to decrease encoding time while still maintaining compression efficiency.

The SVT-AV1 encoder uses multi-dimensional (process-, picture/tile-, and segment-based) parallelism, multi-stage partitioning decisions, block-based multi-stage and multi-class mode decisions, and RD-optimized classification to achieve attractive trade-offs between compression and performance. Another feature of the SVT architecture is open-loop hierarchical motion estimation, which makes it possible to decouple the first stage of motion estimation from the rest of the encoding process.

Compression efficiency and performance

Encoder performance

SVT-AV1 reaches compression efficiency similar to that of libaom at the slowest speed settings. During codec development, we have been tracking the compression and encoding results at the https://videocodectracker.dev/ site. The plot below shows the improvements in the compression efficiency of SVT-AV1 compared to the libaom encoder over time. Note that libaom compression has also been improving over time, and the plot below represents SVT-AV1 catching up with a moving target. In the plot, the Y-axis shows the additional bitrate, in percent, needed to achieve quality similar to the libaom encoder according to three metrics. The plot shows the results of the 2-pass encoding mode in both codecs. SVT-AV1 uses 4-thread mode, whereas libaom operates in single-thread mode. The SVT-AV1 results for the 1-pass fixed-QP encoding mode, commonly used in research, are even more competitive, as detailed below.

Reducing BD-rate between SVT-AV1 and libaom in 2-pass encoding mode

The comparison results of SVT-AV1 against libaom on the objective-1-fast test set are presented in the table below. For estimating encoding times, we used an Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz machine with 52 physical cores and 96 GB of RAM, with 60 jobs running in parallel. Both codecs use a bi-directional hierarchical prediction structure of 16 pictures. The results are presented for 1-pass mode with fixed frame-level QP offsets. A single-threaded compression mode is used. Below, we compute the BD-rates for the various quality metrics: PSNR on all three color planes, VMAF, and MS-SSIM. A negative BD-rate indicates that the SVT-AV1 encodes produce the same quality with the indicated relative reduction in bitrate. As seen below, SVT-AV1 demonstrates a 16.5% decrease in encoding time compared to libaom while being slightly more efficient in compression. Note that the encoding time ratio may vary depending on the instruction sets supported by the platform. The results have been obtained on the SVT-AV1 cs2 branch (a development branch that is currently being merged into master, git hash 3a19f29) against the libaom master branch (git hash fe72512). The QP values used to calculate the BD-rates are: 20, 32, 43, 55, 63.

BD-rates of SVT-AV1 vs libaom in 1-pass encoding mode with fixed QP offsets. Negative numbers indicate the reduction in bitrate needed to reach the same quality level. The overall encoding time difference is the change in total CPU time for all sequences and QPs of SVT-AV1 compared to that of libaom.

*The overall encoding CPU time difference is calculated as the change in total CPU time for all sequences and QPs of the test compared to that of the anchor. It is not equal to the average of per-sequence values. For each sequence, the encoding CPU time difference is calculated as the change in total CPU time for all QPs for that sequence.
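
The BD-rate figures in the table come from the standard Bjontegaard calculation over a handful of (bitrate, quality) points per sequence. As an illustration only, and not the exact script used for these results, a minimal Python sketch of the common cubic-fit variant is shown below.

import numpy as np

def bd_rate(rate_anchor, metric_anchor, rate_test, metric_test):
    """Average bitrate difference (%) of the test encoder vs the anchor at equal quality."""
    # Fit cubic polynomials to (quality, log-bitrate) points for each encoder
    p_anchor = np.polyfit(metric_anchor, np.log(rate_anchor), 3)
    p_test = np.polyfit(metric_test, np.log(rate_test), 3)

    # Integrate both fits over the overlapping quality range
    lo = max(min(metric_anchor), min(metric_test))
    hi = min(max(metric_anchor), max(metric_test))
    int_anchor = np.polyint(p_anchor)
    int_test = np.polyint(p_test)
    avg_anchor = (np.polyval(int_anchor, hi) - np.polyval(int_anchor, lo)) / (hi - lo)
    avg_test = (np.polyval(int_test, hi) - np.polyval(int_test, lo)) / (hi - lo)

    # Convert the average log-bitrate difference back to a percentage
    return (np.exp(avg_test - avg_anchor) - 1) * 100

# Example with made-up numbers: a negative result means the test encoder needs less bitrate
# print(bd_rate([1000, 1500, 2200, 3300, 5000], [38.0, 39.5, 41.0, 42.5, 44.0],
#               [950, 1400, 2100, 3100, 4800], [38.0, 39.5, 41.0, 42.5, 44.0]))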

Since all sequences in the objective-1-fast test set have 60 frames, both codecs use one key frame. The following command line parameters have been used to compare the codecs.

libaom parameters:

--passes=1 --lag-in-frames=25 --auto-alt-ref=1 --min-gf-interval=16 --max-gf-interval=16 --gf-min-pyr-height=4 --gf-max-pyr-height=4 --kf-min-dist=65 --kf-max-dist=65 --end-usage=q --use-fixed-qp-offsets=1 --deltaq-mode=0 --enable-tpl-model=0 --cpu-used=0

SVT-AV1 parameters:

--preset 1 --scm 2 --keyint 63 --lookahead 0 --lp 1

The results above demonstrate the excellent objective performance of SVT-AV1. In addition, SVT-AV1 includes implementations of some subjective quality tools, which can be used if the codec is configured for subjective quality.

Decoder performance

On the objective-1-fast test set, the SVT-AV1 decoder is slightly faster than the libaom decoder in 1-thread mode, with larger improvements in 4-thread mode. We observe even larger speed gains over the libaom decoder when decoding bitstreams with multiple tiles using the 4-thread mode. The testing has been performed on Windows, Linux, and macOS platforms. We believe the performance is satisfactory for a research decoder, where the trade-offs favor easier experimentation over the further optimizations necessary for a production decoder.

Testing framework

To help ensure codec conformance, especially for new code contributions, the code has been comprehensively covered with unit tests and end-to-end tests. The unit tests are built on the Google Test framework. The unit and end-to-end tests are triggered automatically for each pull request to the repository, which is supported by GitHub Actions. The tests support sharding, and they run in parallel to speed up the turnaround time on pull requests.

Unit and e2e tests have passed for this pull request

What’s next?

Over the last several months, SVT-AV1 has matured to become a complete encoder/decoder package providing competitive compression efficiency and performance trade-offs. The project is bolstered with extensive unit test coverage and documentation.

Our hope is that the SVT-AV1 codebase helps further adoption of AV1 and encourages more research and development on top of the current AV1 tools. We believe that the demonstrated advantages of SVT-AV1 make it a good platform for experimentation and research. We invite colleagues from industry and academia to check out the project on GitHub, reach out to the codebase maintainers with questions and comments, or join one of the SVT-AV1 Open Dev meetings. We welcome more contributors to the project.


SVT-AV1: an open-source AV1 encoder and decoder was originally published in Netflix TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Automating MySQL schema migrations with GitHub Actions and more

Post Syndicated from Shlomi Noach original https://github.blog/2020-02-14-automating-mysql-schema-migrations-with-github-actions-and-more/

In the past year, GitHub engineers shipped GitHub Packages, Actions, Sponsors, Mobile, security advisories and updates, notifications, code navigation, and more. Needless to say, the development pace at GitHub is accelerated.

With MySQL serving our backends, updating code requires changes to the underlying database schema. New features may require new tables, columns, changes to existing columns or indexes, dropping unused tables, and so on. On average, we have two schema migrations running daily on our production servers. Some days we have a half dozen migrations to run. We’ll cover how this amounted to a significant toil on the database infrastructure team, and how we searched for a solution to automate the manual parts of the process.

What’s in a migration?

At first glance, migrating appears to be no more difficult than adding a CREATE, ALTER or DROP TABLE statement. At a closer look, the process is far more complex, and involves multiple owners, platforms, environments, and transitions between those pieces. Here’s the flow as we experience it at GitHub:

1. Starting the process

It begins with a developer who identifies the need for a schema change. Maybe they need a new table, or a new column in an existing table. The developer has a local testing environment where they can experiment however they like, until they’re satisfied and wish to apply changes to production.

2. Feedback and review

The developer doesn’t just apply their changes online. First, they seek review and discussion with their peers. Depending on the change, they may ask for a review from a group of schema reviewers (at GitHub, this is a volunteer group experienced with database design). Then, they seek the agreement of the database infrastructure team, who owns the production databases. The database infrastructure team reviews the changes, looking for performance concerns, among other potential issues. Assuming all reviews are favorable, it’s on the database infrastructure engineer to deploy the change to production.

3. Taking the change to production

At this point, we need to determine where the change is taking place since we have multiple clusters. Some of them are sharded, so we have to ask: Where do the affected tables exist in our clusters or schemas? Next, we need to know what to run. The developer presented the schema they want to see in production, but how do we transition the existing production schema into the one requested? What’s the formal CREATE, ALTER or DROP statement? Following what to run, we need to know how we should run the migration. Do we run the query directly? Or is it a blocking operation and we need an online schema change tool? And finally, we need to know when to execute the migration. Perhaps now is not a good time if there’s already a migration running on the cluster.

4. Migration

At long last, we’re ready to run the migration. Some of our larger tables may take hours and even days to migrate, especially since the site needs to be up and running. We want to track status. And we want to see what impact the migration may have on production, or, preferably, to ensure it does not have an impact.

5. Completing the process

Even as the migration completes there are further steps to take. There’s some cleanup process, and we want to unblock the next migration, if any currently exists. The database infrastructure team wishes to advertise to the developer that the changes have taken place, and the developer will have their own followup to address.

Throughout that flow, there’s a lot of potential for friction:

  • Does the database infrastructure team review the developer’s request in a timely fashion?
  • Is the review process productive?
  • Do we need to wait for something before running the migration?
  • Is the database infrastructure engineer actually available to run the migration, or perhaps they’re busy with other tasks?

The database infrastructure engineer needs to either create or review the migration statement, double-check their logic, ensure they can begin the migration, follow up, unblock other migrations as needed, advertise progress to the developer, and so on.

With our volume of daily migrations, this flow sometimes consumed hours of work of a database infrastructure engineer per day, and—in the best-case scenario—at least several hours of work per week. They would frequently multitask between two or three migrations and keep mental notes for next steps. Developers would ping us to ask what the status was, and their work was sometimes blocked until the migration was complete.

A brief history of schema migration automation at GitHub

GitHub was originally created as a Ruby on Rails (RoR) app. Like other frameworks, and in particular, those using Active Record, RoR has a built-in mechanism to generate database schema from code, as well as programmatically express migrations. RoR tooling can analyze code changes and create and run the SQL statements to change the database schema.

We use the GitHub flow to manage our own development: when suggesting a change, we create a branch, commit, push, and open a pull request. We use the declarative approach to schema definition: our RoR GitHub repository contains the full schema definition, such as the CREATE TABLE statements that generate the complete schema. This way, we know exactly what schema is associated with each commit or branch. Counter that with the programmatic approach, where your commits contain migration statements, and where to deduce a schema you need to start at some baseline and run through all statements sequentially.

The database infrastructure and the application teams collaborated to create a set of chatops tooling. We ran a chatops command to list pull requests with schema changes, and then another command to generate the CREATE/ALTER/DROP statement for a given pull request. For this, we used RoR’s rake command. Our wrapper scripts then added meta information, like which cluster is involved, and generated a script used to run the migration.

The generated statements and script were mostly fine, with occasional SQL syntax errors. We’d review the output and fix it manually as needed.

A few years ago we developed gh-ost, an online table migration solution, which added even more visibility and control through our chatops. We’d check progress, change runtime configuration, and cut-over the migration through chat. While simple, these were still manual steps.

The heart of GitHub’s app remains with the same RoR, but we’ve expanded far beyond it. We created more repositories; some also use RoR, while others are written in other programming languages such as Go. However, we didn’t use the Object Relational Mapping practice with the new repositories.

The more GitHub expanded, the more toil the database infrastructure team had. We’d review pull requests, compare schemas, generate migration statements manually, and verify them on a local machine. Other than the git log, no formal tracking for schema migrations existed. We’d check in chat, issues, and pull requests to see what was done and what wasn’t. We’d keep track of ongoing migrations in our heads, context switch between the migrations throughout the day, and often get interrupted by notifications. And we did this while taking each migration through the next step, keeping mental notes, and communicating the progress to our peers.

With these steps in mind, we wanted a solution to automate the process. We came up with various ideas, and in 2019 GitHub Actions was released. This was our solution: multiple loosely coupled components, each owning a specific aspect of the flow, all orchestrated by a controller service. The next section covers the breakdown of our solution.

Code

Our basic premise is that schema design should be treated as code. We want the schema to be versioned, and we want to know what schema is associated with what version of our code.

To illustrate, GitHub provides not only github.com, but also GitHub Enterprise, an on-premise solution. On github.com we run continuous deployments. With GitHub Enterprise, we make periodic releases, and our customers can upgrade in-house. This means we need to be able to reproduce any schema changes we make to github.com on a customer’s Enterprise server.

Therefore we must keep our schema design coupled with the code in the same git repository. For a developer to design a schema change, they need to follow our normal development flow: create a branch, commit, push, and open a pull request. The pull request is where code is reviewed and discussion takes place for any changes. It’s where continuous integration and testing run. Our solution revolves around the pull request, and this is standardized across all our repositories.

The change

Once a pull request is opened, we need to be able to identify what changes we’d like to make. Typically, when we review code changes, we look at the diff. And it might be tempting to expect that git diff can help us formalize the schema change. Unfortunately, this is not the case, and git diff is poor at identifying these changes. For example, consider this simplified table definition:

CREATE TABLE some_table (
  id int(10) unsigned NOT NULL AUTO_INCREMENT,
  hostname varchar(128) NOT NULL,
  PRIMARY KEY (id),
  KEY (hostname)
);

Suppose we decide to add a new column and drop the index on hostname. The new schema becomes:

CREATE TABLE some_table (
  id int(10) unsigned NOT NULL AUTO_INCREMENT,
  hostname varchar(128) NOT NULL,
  time_created TIMESTAMP NOT NULL,
  PRIMARY KEY (id)
);

Running git diff on the two schemas yields the following:

@@ -1,6 +1,6 @@
 CREATE TABLE some_table (
   id int(10) unsigned NOT NULL AUTO_INCREMENT,
   hostname varchar(128) NOT NULL,
-  PRIMARY KEY (id),
-  KEY (hostname)
+  time_created TIMESTAMP NOT NULL,
+  PRIMARY KEY (id)
 );

The pull request’s “Files changed” tab shows the same:

This is a sample Pull Request where we change a table's schema. git diff does a poor job of analyzing the schema change.

See how the PRIMARY KEY line goes into the diff because of the trailing comma. This diff does not capture the schema change well, and while RoR provides tooling for that, we still had to review the output carefully. Fortunately, there’s a good MySQL-oriented tool for the task.

skeema

skeema is an open source schema management utility developed by Evan Elias. It expects the declarative approach, and looks for a schema definition on your file system (hopefully as part of your repository). The file system layout should include a directory per schema/database, a file per table, and then some special configuration files telling skeema the identities of, and the credentials for, MySQL servers in various environments. Skeema is able to run useful tasks, such as:

  • skeema diff: generate SQL statements that convert the existing database schema into the schema defined in the file system. This includes as many CREATE, ALTER and DROP TABLE statements as needed.
  • skeema push: actually apply changes to the database server so that the schema matches the one on the file system
  • skeema pull: rewrite the filesystem schema based on the existing schema in the MySQL server.

skeema can do much more, including the ability to invoke online schema change tools—but that’s outside this post’s scope.

Git users will feel comfortable with skeema. Indeed, skeema works very well with git-versioned schemas. For us, the most valuable asset is its diff output: a well formed, reliable set of statements to show the SQL transition from one schema to another. For example, skeema diff output for the above schema change is:

USE `test`;
ALTER TABLE `some_table` ADD COLUMN `time_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, DROP KEY `hostname`;

Note that the above is not only correct, but also formal. It reproduces correctly whether our code uses lower/upper case, includes/omits default value, etc.

We wanted to use skeema to tell us what statements we needed to run to get from our existing state into the state defined in the pull request. Assuming the master branch reflects our current production schema, this now becomes a matter of diffing the schemas between master and the pull request’s branch.

Skeema wasn’t without its challenges, and we had to figure out where to place skeema from a design perspective. Do the developers own it? Does every repository own it? Is there a central service to own it? Each presented its own problems, from false ownership to excessive responsibilities and access.

GitHub Actions

Enter GitHub Actions. With Actions, you’re able to run code as a response to events taking place in your repository. A new pull request, review, comment, issue, and quite a few others, are such events. The code (the action) is arbitrary, and GitHub spawns a container on its own infrastructure, where your code will run. What makes this extra interesting is that the container can get access to your repository. GitHub Actions implicitly receives an API token to interact with the repository.

The container comes with popular software packages pre-installed, such as a MySQL server.

Perhaps the most classic use of Actions is CI/CD. When a pull_request event occurs (a new pull request and any subsequent commit), run some code to build, test, lint, or validate the change. We took this approach to run skeema as part of a pull_request action flow, called skeema-diff.

Here’s a simplified breakdown of the action:

  1. Fetch skeema binary
  2. Checkout master branch
  3. Run skeema push to populate the container’s MySQL server with the schema as defined by the master branch
  4. Checkout pull request’s branch
  5. Run skeema diff to generate the statements that take the schema from the one in MySQL (remember, this is the master schema) to the one in the pull request’s branch
  6. Add the diff as a comment in the pull request
  7. Add a special label to indicate this pull request has a schema change

The GitHub Action, running skeema, generates schema diff output, which is added as a comment to the Pull Request. The comment presents the correct ALTER statement implied by the code change. This comment is both human and machine readable.

The code is more complex than what we’ve shown. We actually use base and head instead of master and branch, and there’s some logic to formalize, edit and validate the diff, to handle commits that further change the schema, among other processes.
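
To make the sequence concrete, here is a simplified Python sketch of those steps, not our production Action: it assumes the skeema and git binaries are available in the runner, that the container's MySQL server is listening locally, and that a skeema environment named local is configured in the repository's .skeema file.

import subprocess

def run(*cmd: str) -> str:
    """Run a command, fail loudly on error, and return its stdout."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

def schema_diff(base: str, head: str) -> str:
    # Populate the runner's MySQL server with the base branch's schema
    run("git", "checkout", base)
    run("skeema", "push", "local")

    # Diff the head branch's schema files against that live schema
    run("git", "checkout", head)
    # skeema diff exits non-zero when differences exist, so don't raise on that
    result = subprocess.run(["skeema", "diff", "local"],
                            capture_output=True, text=True)
    return result.stdout

if __name__ == "__main__":
    diff = schema_diff("master", "my-feature-branch")  # hypothetical branch name
    print(diff or "No schema changes detected")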

By now, we have a partial flow, which works entirely on GitHub’s platform:

  • Schema change as code
  • Review process, based on GitHub’s pull request flow
  • Automated schema change analysis, based on skeema running in a GitHub Action
  • A visible output, presented as a pull request comment

Up to this point, everything is constrained to the repository. The repository itself doesn’t have information about where the schema gets deployed in production. This information is something that’s outside the repository’s scope, and it’s owned by the database infrastructure team rather than the repository’s developers. Neither the repository nor any action running on that repository has access to production, nor should they, as that would be a breach of domains.

Before we describe how the schema gets to production, let’s jump ahead and discuss the schema migration itself.

Schema migrations and gh-ost

Even the simplest schema migration isn’t simple. We are concerned with three types of table migrations:

  • CREATE TABLE is the simplest and the safest. We created something that didn’t exist before, and its creation time is instantaneous. Note that if the target cluster is sharded, this must be applied on all shards. If the cluster is sharded with vitess, then the vitess vtgate service automatically handles this for us.
  • DROP TABLE is a simple statement that comes with a great risk. What if it’s still in use and some code breaks as a result of the table going away? Note that we don’t actually drop tables as part of schema migrations. Any DROP TABLE statement is converted into a RENAME TABLE. Instead of DROP TABLE repositories (whoops!), our automation runs RENAME TABLE repositories TO _repositories_DROP_20200101123456. If our application fails because of this, we have an instant revert command: RENAME back to the original. Renamed tables are kept around for a few days prior to being garbage collected and dropped by our automation.
  • ALTER TABLE is the most complex case, mainly because it takes time to alter a table. We don’t actually ALTER tables in-place. We use gh-ost to emulate an ALTER TABLE, and the end result is the same even though the process is completely different. It doesn’t lock our apps, throttles as much as needed, and it’s controllable as well as auditable. We’ve run gh-ost in production for over three and a half years. It has little to no impact on production, and we generally don’t care that it’s running. But some of our larger tables may still take hours or even days to migrate. We also only run one ALTER (or, gh-ost) at a time on a cluster. Concurrent migrations are possible but compete over resources, leading to overall longer runtimes than sequential execution. This means that an ALTER migration requires scheduling. We need to be able to tell if a migration is already running on a cluster, as well as prioritize and queue migrations that apply to the same cluster. We also need to be able to tell the status over the duration of hours or days, and this needs to be communicated to the developer, the owner of the change. And, if the cluster is sharded, we need to run the migration per shard.

In order to run a migration, we must first determine the strategy for that migration (is it a direct query, gh-ost, or manual?). We need to be able to tell where it can run, how to go about the process if the cluster is sharded, as well as when to schedule it. While migrations can wait in queue while others are running, we want to be able to prioritize migrations, in case the queue is large.
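
As a small illustration of the DROP-to-RENAME conversion described above, the statement rewriting can be sketched in a few lines of Python. The naming scheme here mirrors the example earlier in this section and is not necessarily the exact format our automation uses.

from datetime import datetime, timezone

def drop_to_rename(table: str) -> str:
    """Rewrite a requested DROP TABLE as a reversible, timestamped RENAME."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"RENAME TABLE `{table}` TO `_{table}_DROP_{stamp}`"

# drop_to_rename("repositories") produces something like:
# RENAME TABLE `repositories` TO `_repositories_DROP_20200101123456`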

skeefree

We created skeefree as the glue, which means it’s an orchestrating service that’s aware of our repositories, can communicate with our pull requests, knows about production (or, can get information about production) and which invokes the migrations. We run skeefree as a stateless kubernetes service, backed by a MySQL database that holds the state. Note that skeefree’s own schema is managed by skeefree.

skeefree uses GitHub’s API to interact with pull requests, GitHub’s internal inventory and discovery services to locate clusters in production, and gh-ost to run migrations. Skeefree is best described by following a schema migration flow:

  1. A developer wishes to change the schema, so they open a pull request.
  2. skeema-diff Action springs to life and seeks a schema change. If a schema change isn’t found in the pull request, nothing happens. If there is a schema change, the Action computes the change via skeema, adds a well-formed comment to the pull request indicating the change, and adds a migration:skeema:diff label to the pull request. This is done via the GitHub API.
  3. A developer looks into the change, and seeks review from a team member. At this time they may communicate to team members without actually going to production. Finally, they add the label migration:for:review.
  4. skeefree is aware of the developer’s repository and uses the GitHub API to periodically look for open pull requests, which are labeled by both migration:skeema:diff and migration:for:review, and have been approved by at least one developer.
  5. Once detected, skeefree investigates the pull request, and reads the schema change comment, generated by the Action. It maps the schema/repository to the schema/production cluster, and uses our inventory and discovery services to know if the cluster is sharded. Then, it finds the location and name of the cluster.
  6. skeefree then adds this to its backend database, and advertises its analysis on the pull request with another comment. This comment generally means “here’s what I will do if you approve”, and skeefree proceeds to seek review from an authority.
  7. For most repositories, the authority is the database-infrastructure team. On our original RoR repository, we also seek review from a cross-functional team, known as the db-schema-reviewers, who are familiar with the general application and database design throughout the years and who have more context to offer. skeefree automatically knows which teams should be notified on which repositories.
  8. The relevant teams review and hopefully approve. skeefree detects the approval and chooses the proper strategy: a direct query for CREATE and DROP (converted to RENAME), or gh-ost for ALTER. It then queues the migration(s).
  9. skeefree’s scheduler periodically checks what next can be executed. Remember we only run a single ALTER migration on a given cluster at a time, but we also have a limited number of runner hosts. If there’s a free runner host and the cluster is not running any migration, skeefree then proceeds to kick off a migration. Skeefree advertises this fact as a pull request comment to notify the developer that the migration started.
  10. Once the migration is complete, skeefree announces it in a pull request comment. The same applies should the migration fail.
  11. The pull request may also have more than one migration. Perhaps the cluster is sharded, or there may be multiple tables changed in the pull request. Once all migrations are successfully completed, skeefree advertises this in a pull request comment. The developer is notified that all migrations are done, and they’re encouraged to proceed with their standard deploy/merge flow.

As skeefree runs the migrations, it adds comments on the pull request page to indicate its progress. When all migrations are complete, skeefree comments as much, again on the pull request page.

Analysis of the flow

There are a few nuances here that make for a good experience for everyone involved:

  • The database infrastructure team doesn’t know about the pull request until the developer explicitly adds the migration:for:review label. It’s like a draft pull request or a pull request that’s a work in progress, only this flag applies specifically to the schema migration flow. This allows the developer to use their preferred flow, and communicate with their team without interrupting the database infrastructure team or getting premature reviews.
  • The skeema analysis is contained within the repository, which means that no external service is required. The developer can check the diff result themselves.
  • The Action is the only part of the flow that looks at the code. Neither skeefree nor gh-ost look at the actual code, and they don’t need git access.
  • The database infrastructure team only needs to take a single step, which is to review the pull request.
  • The developers own the creation of pull requests, getting peer reviews, and finally, deploying and merging. These are the exact operations that should be under their ownership. Moreover, they get visibility into the state of their migration. By looking at the pull request page or their GitHub notifications, they can tell whether the pull request has been reviewed, queued, started, completed, or failed. They don’t need to ask. Even better, we have chatops that give visibility into the overall state of migration queue, a running migration’s progress, and more. These chatops are available for all to invoke.
  • The database infrastructure team owns the process of mapping the repository schema to production. This is done via chatops, but can also be completed via configuration. The team is able to cancel a pull request, retry a failed migration, and more.
  • gh-ost is generally trusted, and we have control over a running migration. This means that we can force it to throttle, set a different throttle threshold, make it use fewer resources, or terminate it, if needed. We also have a throttling mechanism throughout our stack, so that long-running processes like migrations yield to higher priority operations, even if that extends their own runtime, so they don’t generate too much load on our database servers.
  • We use our own preferred pull request flow, Actions (skeefree was an early adopter of Actions), the GitHub API, and our existing datacenter and database infrastructure, all of which are well understood internally.

Public availability

skeefree and the skeema-diff Action were authored internally at GitHub to solve a specific problem. skeefree uses our internal inventory and discovery services, it works with our chatops and uses some internal libraries.

Our experience in releasing open source software is that no one’s use case is exactly the same as ours. Our perception of an automated migrations flow may be very different from another organization’s perception. We still want to share more than just our words, so we’ve open sourced the code.

It’s a bit of a peculiar OSS release:

  • it’s missing some libraries; it will not build.
  • It expects some of our internal services to exist, which more than likely won’t be on your platform.
  • It expects chatops, and you may not be using chatops.
  • The code also needs to be rewritten for adaptation to your environment,

Note that the code is available, but not open for issues and pull requests. We hope the community finds it useful.

Get the code

The post Automating MySQL schema migrations with GitHub Actions and more appeared first on The GitHub Blog.

Building an AWS IoT Core device using AWS Serverless and an ESP32

Post Syndicated from Moheeb Zara original https://aws.amazon.com/blogs/compute/building-an-aws-iot-core-device-using-aws-serverless-and-an-esp32/

Using a simple Arduino sketch, an AWS Serverless Application Repository application, and a microcontroller, you can build a basic serverless workflow for communicating with an AWS IoT Core device.

A microcontroller is a programmable chip and acts as the brain of an electronic device. It has input and output pins for reading and writing on digital or analog components. Those components could be sensors, relays, actuators, or various other devices. It can be used to build remote sensors, home automation products, robots, and much more. The ESP32 is a powerful low-cost microcontroller with Wi-Fi and Bluetooth built in, and it is used in this walkthrough.

The Arduino IDE, a lightweight development environment for hardware, now includes support for the ESP32. There is a large collection of community and officially supported libraries, from addressable LED strips to spectral light analysis.

The following walkthrough demonstrates connecting an ESP32 to AWS IoT Core to allow it to publish and subscribe to topics. This means that the device can send any arbitrary information, such as sensor values, into AWS IoT Core while also being able to receive commands.

Solution overview

This post walks through deploying an application from the AWS Serverless Application Repository. This allows an AWS IoT device to be messaged using a REST endpoint powered by Amazon API Gateway and AWS Lambda. The AWS SAR application also configures an AWS IoT rule that forwards any messages published by the device to a Lambda function that updates an Amazon DynamoDB table, demonstrating basic bidirectional communication.

The last section explores how to build an IoT project with real-world application. By connecting a thermal printer module and modifying a few lines of code in the example firmware, the ESP32 device becomes an AWS IoT–connected printer.

All of this can be accomplished within the AWS Free Tier, which is all that is necessary for the following instructions.

An example of an AWS IoT project using an ESP32, AWS IoT Core, and an Arduino thermal printer.

Required steps

To complete the walkthrough, follow these steps:

  • Create an AWS IoT device.
  • Install and configure the Arduino IDE.
  • Configure and flash an ESP32 IoT device.
  • Deploy the lambda-iot-rule AWS SAR application.
  • Monitor and test.
  • Create an IoT thermal printer.

Creating an AWS IoT device

To communicate with the ESP32 device, it must connect to AWS IoT Core with device credentials. You must also specify the topics it has permissions to publish and subscribe on.

  1. In the AWS IoT console, choose Register a new thing, Create a single thing.
  2. Name the new thing. Use this exact name later when configuring the ESP32 IoT device. Leave the remaining fields set to their defaults. Choose Next.
  3.  Choose Create certificate. Only the thing cert, private key, and Amazon Root CA 1 downloads are necessary for the ESP32 to connect. Download and save them somewhere secure, as they are used when programming the ESP32 device.
  4. Choose Activate, Attach a policy.
  5. Skip adding a policy, and choose Register Thing.
  6. In the AWS IoT console side menu, choose Secure, Policies, Create a policy.
  7. Name the policy Esp32Policy. Choose the Advanced tab.
  8. Paste in the following policy template.
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "iot:Connect",
          "Resource": "arn:aws:iot:REGION:ACCOUNT_ID:client/THINGNAME"
        },
        {
          "Effect": "Allow",
          "Action": "iot:Subscribe",
          "Resource": "arn:aws:iot:REGION:ACCOUNT_ID:topicfilter/esp32/sub"
        },
    	{
          "Effect": "Allow",
          "Action": "iot:Receive",
          "Resource": "arn:aws:iot:REGION:ACCOUNT_ID:topic/esp32/sub"
        },
        {
          "Effect": "Allow",
          "Action": "iot:Publish",
          "Resource": "arn:aws:iot:REGION:ACCOUNT_ID:topic/esp32/pub"
        }
      ]
    }
  9. Replace REGION with the matching AWS Region you’re currently operating in. This can be found on the top right corner of the AWS console window.
  10.  Replace ACCOUNT_ID with your own, which can be found in Account Settings.
  11. Replace THINGNAME with the name of your device.
  12. Choose Create.
  13. In the AWS IoT console, choose Secure, Certificates. Select the certificate created for your device and choose Actions, Attach policy.
  14. Choose Esp32Policy, Attach.

Your AWS IoT device is now configured to have permission to connect to AWS IoT Core. It can also publish to the topic esp32/pub and subscribe to the topic esp32/sub. For more information on securing devices, see AWS IoT Policies.
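
If you prefer to script the policy creation and attachment rather than click through the console, the equivalent calls can be made with boto3, the AWS SDK for Python. This is a sketch under stated assumptions: the Region, account ID, thing name, and certificate ARN below are placeholders for the values from the steps above.

import json
import boto3

REGION = "us-east-1"          # placeholder; use your Region
ACCOUNT_ID = "123456789012"   # placeholder; use your account ID
THING_NAME = "MyESP32"        # placeholder; the exact name of your thing
CERT_ARN = "arn:aws:iot:us-east-1:123456789012:cert/EXAMPLE"  # from step 3

# Same policy template as above, with the placeholders filled in
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "iot:Connect",
         "Resource": f"arn:aws:iot:{REGION}:{ACCOUNT_ID}:client/{THING_NAME}"},
        {"Effect": "Allow", "Action": "iot:Subscribe",
         "Resource": f"arn:aws:iot:{REGION}:{ACCOUNT_ID}:topicfilter/esp32/sub"},
        {"Effect": "Allow", "Action": "iot:Receive",
         "Resource": f"arn:aws:iot:{REGION}:{ACCOUNT_ID}:topic/esp32/sub"},
        {"Effect": "Allow", "Action": "iot:Publish",
         "Resource": f"arn:aws:iot:{REGION}:{ACCOUNT_ID}:topic/esp32/pub"},
    ],
}

iot = boto3.client("iot", region_name=REGION)
iot.create_policy(policyName="Esp32Policy", policyDocument=json.dumps(policy))
iot.attach_policy(policyName="Esp32Policy", target=CERT_ARN)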

Installing and configuring the Arduino IDE

The Arduino IDE is an open-source development environment for programming microcontrollers. It supports a continuously growing number of platforms including most ESP32-based modules. It must be installed along with the ESP32 board definitions, MQTT library, and ArduinoJson library.

  1. Download the Arduino installer for the desired operating system.
  2. Start Arduino and open the Preferences window.
  3. For Additional Board Manager URLs, add
    https://dl.espressif.com/dl/package_esp32_index.json.
  4. Choose Tools, Board, Boards Manager.
  5. Search esp32 and install the latest version.
  6. Choose Sketch, Include Library, Manage Libraries.
  7. Search MQTT, and install the latest version by Joel Gaehwiler.
  8. Repeat the library installation process for ArduinoJson.

The Arduino IDE is now installed and configured with all the board definitions and libraries needed for this walkthrough.

Configuring and flashing an ESP32 IoT device

A collection of various ESP32 development boards.

For this section, you need an ESP32 device. To check if your board is compatible with the Arduino IDE, see the boards.txt file. The following code connects to AWS IoT Core securely using MQTT, a publish and subscribe messaging protocol.

This project has been tested on the following devices:

  1. Install the required serial drivers for your device. Some boards use different USB/FTDI chips for interfacing. Here are the most commonly used with links to drivers.
  2. Open the Arduino IDE and choose File, New to create a new sketch.
  3. Add a new tab and name it secrets.h.
  4. Paste the following into the secrets file.
    #include <pgmspace.h>
    
    #define SECRET
    #define THINGNAME ""
    
    const char WIFI_SSID[] = "";
    const char WIFI_PASSWORD[] = "";
    const char AWS_IOT_ENDPOINT[] = "xxxxx.amazonaws.com";
    
    // Amazon Root CA 1
    static const char AWS_CERT_CA[] PROGMEM = R"EOF(
    -----BEGIN CERTIFICATE-----
    -----END CERTIFICATE-----
    )EOF";
    
    // Device Certificate
    static const char AWS_CERT_CRT[] PROGMEM = R"KEY(
    -----BEGIN CERTIFICATE-----
    -----END CERTIFICATE-----
    )KEY";
    
    // Device Private Key
    static const char AWS_CERT_PRIVATE[] PROGMEM = R"KEY(
    -----BEGIN RSA PRIVATE KEY-----
    -----END RSA PRIVATE KEY-----
    )KEY";
  5. Enter the name of your AWS IoT thing, as it is in the console, in the field THINGNAME.
  6. To connect to Wi-Fi, add the SSID and PASSWORD of the desired network. Note: The network name should not include spaces or special characters.
  7. The AWS_IOT_ENDPOINT can be found from the Settings page in the AWS IoT console.
  8. Copy the Amazon Root CA 1, Device Certificate, and Device Private Key to their respective locations in the secrets.h file.
  9. Choose the tab for the main sketch file, and paste the following.
    #include "secrets.h"
    #include <WiFiClientSecure.h>
    #include <MQTTClient.h>
    #include <ArduinoJson.h>
    #include "WiFi.h"
    
    // The MQTT topics that this device should publish/subscribe
    #define AWS_IOT_PUBLISH_TOPIC   "esp32/pub"
    #define AWS_IOT_SUBSCRIBE_TOPIC "esp32/sub"
    
    WiFiClientSecure net = WiFiClientSecure();
    MQTTClient client = MQTTClient(256);
    
    void connectAWS()
    {
      WiFi.mode(WIFI_STA);
      WiFi.begin(WIFI_SSID, WIFI_PASSWORD);
    
      Serial.println("Connecting to Wi-Fi");
    
      while (WiFi.status() != WL_CONNECTED){
        delay(500);
        Serial.print(".");
      }
    
      // Configure WiFiClientSecure to use the AWS IoT device credentials
      net.setCACert(AWS_CERT_CA);
      net.setCertificate(AWS_CERT_CRT);
      net.setPrivateKey(AWS_CERT_PRIVATE);
    
      // Connect to the MQTT broker on the AWS endpoint we defined earlier
      client.begin(AWS_IOT_ENDPOINT, 8883, net);
    
      // Create a message handler
      client.onMessage(messageHandler);
    
      Serial.print("Connecting to AWS IOT");
    
      while (!client.connect(THINGNAME)) {
        Serial.print(".");
        delay(100);
      }
    
      if(!client.connected()){
        Serial.println("AWS IoT Timeout!");
        return;
      }
    
      // Subscribe to a topic
      client.subscribe(AWS_IOT_SUBSCRIBE_TOPIC);
    
      Serial.println("AWS IoT Connected!");
    }
    
    void publishMessage()
    {
      StaticJsonDocument<200> doc;
      doc["time"] = millis();
      doc["sensor_a0"] = analogRead(0);
      char jsonBuffer[512];
      serializeJson(doc, jsonBuffer); // print to client
    
      client.publish(AWS_IOT_PUBLISH_TOPIC, jsonBuffer);
    }
    
    void messageHandler(String &topic, String &payload) {
      Serial.println("incoming: " + topic + " - " + payload);
    
    //  StaticJsonDocument<200> doc;
    //  deserializeJson(doc, payload);
    //  const char* message = doc["message"];
    }
    
    void setup() {
      Serial.begin(9600);
      connectAWS();
    }
    
    void loop() {
      publishMessage();
      client.loop();
      delay(1000);
    }
  10. Choose File, Save, and give your project a name.

Flashing the ESP32

  1. Plug the ESP32 board into a USB port on the computer running the Arduino IDE.
  2. Choose Tools, Board, and then select the matching type of ESP32 module. In this case, a Sparkfun ESP32 Thing was used.
  3. Choose Tools, Port, and then select the matching port for your device.
  4. Choose Upload. Arduino reads Done uploading when the upload is successful.
  5. Choose the magnifying lens icon to open the Serial Monitor. Set the baud rate to 9600.

Keep the Serial Monitor open. When connected to Wi-Fi and then AWS IoT Core, any messages received on the topic esp32/sub are logged to this console. The device is also now publishing to the topic esp32/pub.

The topics are set at the top of the sketch. When changing or adding topics, remember to add permissions in the device policy.

// The MQTT topics that this device should publish/subscribe
#define AWS_IOT_PUBLISH_TOPIC   "esp32/pub"
#define AWS_IOT_SUBSCRIBE_TOPIC "esp32/sub"

Within this sketch, the relevant functions are publishMessage() and messageHandler().

The publishMessage() function creates a JSON object with the current time in milliseconds and the analog value of pin A0 on the device. It then publishes this JSON object to the topic esp32/pub.

void publishMessage()
{
  StaticJsonDocument<200> doc;
  doc["time"] = millis();
  doc["sensor_a0"] = analogRead(0);
  char jsonBuffer[512];
  serializeJson(doc, jsonBuffer); // print to client

  client.publish(AWS_IOT_PUBLISH_TOPIC, jsonBuffer);
}

The messageHandler() function prints out the topic and payload of any message from a subscribed topic. To see all the ways to parse JSON messages in Arduino, see the deserializeJson() example.

void messageHandler(String &topic, String &payload) {
  Serial.println("incoming: " + topic + " - " + payload);

//  StaticJsonDocument<200> doc;
//  deserializeJson(doc, payload);
//  const char* message = doc["message"];
}

Additional topic subscriptions can be added within the connectAWS() function by adding another line similar to the following.

// Subscribe to a topic
  client.subscribe(AWS_IOT_SUBSCRIBE_TOPIC);

  Serial.println("AWS IoT Connected!");

Deploying the lambda-iot-rule AWS SAR application

Now that an ESP32 device has been connected to AWS IoT, the following steps walk through deploying an AWS Serverless Application Repository application. This is a base for building serverless integration with a physical device.

  1. On the lambda-iot-rule AWS Serverless Application Repository application page, make sure that the Region is the same as the AWS IoT device.
  2. Choose Deploy.
  3. Under Application settings, for PublishTopic, enter esp32/sub. This is the topic to which the ESP32 device is subscribed. It receives messages published to this topic. Likewise, set SubscribeTopic to esp32/pub, the topic on which the device publishes.
  4. Choose Deploy.
  5. When creation of the application is complete, choose Test app to navigate to the application page. Keep this page open for the next section.

Monitoring and testing

At this stage, two Lambda functions, a DynamoDB table, and an AWS IoT rule have been deployed. The IoT rule forwards messages on topic esp32/pub to TopicSubscriber, a Lambda function, which inserts the messages on to the DynamoDB table.

  1. On the application page, under Resources, choose MyTable. This is the DynamoDB table that the TopicSubscriber Lambda function updates.
  2. Choose Items. If the ESP32 device is still active and connected, messages that it has published appear here. A programmatic way to check the table is sketched after this list.
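
The programmatic check mentioned above is a short boto3 scan of the table. The table name below is a placeholder: MyTable is the logical ID, while CloudFormation generates the physical name shown on the application's Resources page.

import boto3

# Placeholder: replace with the physical table name from the Resources page
TABLE_NAME = "serverlessrepo-lambda-iot-rule-MyTable-EXAMPLE"

table = boto3.resource("dynamodb").Table(TABLE_NAME)
items = table.scan().get("Items", [])

for item in items:
    print(item)  # each entry is a message the ESP32 published on esp32/pub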

The TopicPublisher Lambda function is invoked by the API Gateway endpoint and publishes to the AWS IoT topic esp32/sub.

  1. On the application page, find the Application endpoint.
  2. To test that the TopicPublisher function is working, enter the following into a terminal or command-line utility, replacing ENDPOINT with the URL from above.

curl -d '{"text":"Hello world!"}' -H "Content-Type: application/json" -X POST https://ENDPOINT/publish

Upon success, the request returns a copy of the message.

Back in the Serial Monitor, the message published to the topic esp32/sub prints out.
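
You can also push a test message to the device without going through API Gateway by publishing to the AWS IoT data plane directly. This is an optional, hedged sketch outside the walkthrough's REST flow; it assumes your AWS credentials allow iot:Publish on the topic.

import json
import boto3

# Publish directly to the topic the ESP32 subscribes to, bypassing API Gateway
iot_data = boto3.client("iot-data")
iot_data.publish(
    topic="esp32/sub",
    qos=1,
    payload=json.dumps({"message": "Hello from boto3!"}),
)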

Creating an IoT thermal printer

With the completion of the previous steps, the ESP32 device currently logs incoming messages to the serial console.

The following steps demonstrate how the code can be modified to use incoming messages to interact with a peripheral component. This is done by wiring a thermal printer to the ESP32 in order to physically print messages. The REST endpoint from the previous section can be used as a webhook in third-party applications to interact with this device.

A wiring diagram depicting an ESP32 connected to a thermal printer.

  1. Follow the product instructions for powering, wiring, and installing the correct Arduino library.
  2. Ensure that the thermal printer is working by holding the power button on the printer while connecting the power. A sample receipt prints. On that receipt, the default baud rate is specified as either 9600 or 19200.
  3. In the Arduino code from earlier, include the following lines at the top of the main sketch file. The second line defines what interface the thermal printer is connected to. &Serial2 is used to set the third hardware serial interface on the ESP32. For this example, the pins on the Sparkfun ESP32 Thing, GPIO16/GPIO17, are used for RX/TX respectively.
    #include "Adafruit_Thermal.h"
    
    Adafruit_Thermal printer(&Serial2);
  4. Replace the setup() function with the following to initialize the printer on device bootup. Change the baud rate of Serial2.begin() to match what is specified in the test print. The default is 19200.
    void setup() {
      Serial.begin(9600);
    
      // Start the thermal printer
      Serial2.begin(19200);
      printer.begin();
      printer.setSize('S');
    
      connectAWS();
    }
    
  5. Replace the messageHandler() function with the following. On any incoming message, it parses the JSON and prints the message on the thermal printer.
    void messageHandler(String &topic, String &payload) {
      Serial.println("incoming: " + topic + " - " + payload);
    
      // deserialize json
      StaticJsonDocument<200> doc;
      deserializeJson(doc, payload);
      String message = doc["message"];
    
      // Print the message on the thermal printer
      printer.println(message);
      printer.feed(2);
    }
  6. Choose Upload.
  7. After the firmware has successfully uploaded, open the Serial Monitor to confirm that the board has connected to AWS IoT.
  8. Enter the following into a command-line utility, replacing ENDPOINT, as in the previous section.
    curl -d '{"message": "Hello World!"}' -H "Content-Type: application/json" -X POST https://ENDPOINT/publish

If successful, the device prints out the message “Hello World!” from the attached thermal printer. This is a fully serverless IoT printer that can be triggered remotely from a webhook. As an example, this can be used with GitHub Webhooks to print a physical readout of events.

Conclusion

Using a simple Arduino sketch, an AWS Serverless Application Repository application, and a microcontroller, this post demonstrated how to build a basic serverless workflow for communicating with a physical device. It also showed how to expand that into an IoT thermal printer with real-world applications.

With the use of AWS serverless, advanced compute and extensibility can be added to an IoT device, from machine learning to translation services and beyond. By using the Arduino programming environment, the vast collection of open-source libraries, projects, and code examples open up a world of possibilities. The next step is to explore what can be done with an Arduino and the capabilities of AWS serverless. The sample Arduino code for this project and more can be found at this GitHub repository.

An Update on CDNJS

Post Syndicated from Zack Bloom original https://blog.cloudflare.com/an-update-on-cdnjs/

An Update on CDNJS

When you loaded this blog, a file was delivered to your browser called jquery-3.2.1.min.js. jQuery is a library which makes it easier to build websites, and was at one point included on as many as 74.1% of all websites. A full eighteen million sites include jQuery and other libraries using one of the most popular tools on Earth: CDNJS. Beginning about a month ago, Cloudflare began to take a more active role in the operation of CDNJS. This post is here to tell you more about CDNJS’ history and explain why we are helping to manage CDNJS.

What CDNJS Does

Virtually every site is composed of not just the code written by its developers, but also dozens or hundreds of libraries. These libraries make it possible for websites to extend what a web browser can do on its own. For example, libraries can allow a site to include powerful data visualizations, respond to user input, or even get more performant.

These libraries created wondrous and magical new capabilities for web browsers, but they can also cause the size of a site to explode. Particularly a decade ago, connections were not always fast enough to permit the use of many libraries while maintaining performance. But if so many websites are all including the same libraries, why was it necessary for each of them to load their own copy?

If we all load jQuery from the same place the browser can do a much better job of not actually needing to download it for every site. When the user visits the first jQuery-powered site it will have to be downloaded, but it will already be cached on the user’s computer for any subsequent jQuery-powered site they might visit.

The first visit might take time to load, but any future visit to any website pointing to the same common URL would already be served from the browser’s cache:

<!-- Loaded only on my site, will need to be downloaded by every user -->
<script src="./jquery.js"></script>

<!-- Loaded from a common location across many sites -->
<script src="https://cdnjs.cloudflare.com/jquery.js"></script>

Beyond the performance advantage, including files this way also made it very easy for users to experiment and create. When using a web browser as a creation tool users often didn’t have elaborate build systems (this was also before npm), so being able to include a simple script tag was a boon. It’s worth noting that it’s not clear a massive performance advantage was ever actually provided by this scheme. It is becoming even less of an advantage now that browser vendors are beginning to use separate caches for each website you visit, but with millions of sites using CDNJS there’s no doubt it is a critical part of the web.

A CDN for all of us

My first Pull Request into the CDNJS project was in 2013. Back then, if you created a JavaScript project, it wasn’t possible to have it included in the jQuery CDN or the ones provided by large companies like Google and Microsoft. They were only for big, important projects. Of course, even the biggest project starts small. The community needed a CDN which would agree to host nearly all JavaScript projects, even the ones which weren’t world-changing (yet). In 2011, that project was launched by Ryan Kirkman and Thomas Davis as CDNJS.

The project was quickly wildly successful, far beyond their expectations. Their CDN bill began to skyrocket (it would now be over a million dollars a year on AWS). Facing the prospect of having to shut down the service, the CDNJS team approached Cloudflare to see if we could help. We agreed to support their efforts and created cdnjs.cloudflare.com, which serves the CDNJS project free of charge.

CDNJS has been astonishingly successful. The project is currently installed on over eighteen million websites (10% of the Internet!), offers files totaling over 1.5 billion lines of code, and serves over 173 billion requests a month. CDNJS only gets more popular as sites get larger, with 34% of the top 10k websites using the service. Each month we serve almost three petabytes of JavaScript, CSS, and other resources which power the web via cdnjs.cloudflare.com.

Spikes can happen when a very large or popular site installs CDNJS, or when a misbehaving web crawler discovers a CDNJS link.

The future value of CDNJS is now in doubt, as web browsers are beginning to use a separate cache for every website you visit. It is currently used on such a wide swath of the web, however, that it is unlikely to disappear any time soon.

How CDNJS Works

CDNJS starts with a GitHub repo. That project contains every file served by CDNJS, at every version which it has ever offered. That’s 182 GB without the commit history, over five million files, and over 1.5 billion lines of code.

Given that it stores and delivers versioned code files, in many ways it was the Internet’s first JavaScript package manager. Unlike other package managers and even other CDNs everything CDNJS serves is publicly versioned. All 67,724 commits! This means you as a user can verify that you are being served files which haven’t been tampered with.
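For example, a user could spot-check a file by comparing the bytes served from cdnjs.cloudflare.com against the copy committed to the public repo. The Python sketch below does that with a SHA-256 digest; the exact CDN and raw.githubusercontent.com URL layouts used here are assumptions for illustration.

import hashlib
import requests

LIB, VERSION, FILENAME = "jquery", "3.2.1", "jquery.min.js"

# Assumed path layouts: the CDN serves /ajax/libs/<lib>/<version>/<file>, and
# the GitHub repo stores the same tree under ajax/libs/.
cdn_url = f"https://cdnjs.cloudflare.com/ajax/libs/{LIB}/{VERSION}/{FILENAME}"
repo_url = f"https://raw.githubusercontent.com/cdnjs/cdnjs/master/ajax/libs/{LIB}/{VERSION}/{FILENAME}"

def sha256_of(url):
    # Download the file and hash its raw bytes.
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return hashlib.sha256(response.content).hexdigest()

cdn_digest, repo_digest = sha256_of(cdn_url), sha256_of(repo_url)
print("match" if cdn_digest == repo_digest else "MISMATCH", cdn_digest)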

To make changes to CDNJS a commit has to be made. For new projects being added to CDNJS, or when projects change significantly, these commits are made by humans, and get reviewed by other humans. When projects just release new versions there is a bot made by Peter and maintained by Sven which sucks up changes from npm and automatically creates commits.
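At its core that bot does something like the following: ask the npm registry which versions of a library exist, compare them with what the repo already contains, and commit whatever is missing. This is a heavily simplified sketch, not the real bot; the directory layout and the jquery example are assumptions, and the download-and-commit step is only described in a comment.

import os
import requests

def missing_versions(package, repo_root):
    # The npm registry lists every published version of a package at this URL.
    meta = requests.get(f"https://registry.npmjs.org/{package}", timeout=30).json()
    published = set(meta.get("versions", {}))
    # Assumed layout: the repo keeps one directory per version under ajax/libs/.
    local_dir = os.path.join(repo_root, "ajax", "libs", package)
    present = set(os.listdir(local_dir)) if os.path.isdir(local_dir) else set()
    return sorted(published - present)

# The real bot would download each missing version's dist files into the repo
# and create a commit; this sketch only reports what it would do.
for version in missing_versions("jquery", "/path/to/cdnjs"):
    print(f"would add and commit jquery {version}")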

Within Cloudflare’s infrastructure there is a set of machines which are responsible for pulling the latest version of the repo periodically. Those machines then become the origin for cdnjs.cloudflare.com, with Cloudflare’s Global Load Balancer automatically handling failures. Cloudflare’s cache automatically stores copies of many of the projects, making it possible for us to deliver them quickly from all 195 of our data centers.


The Internet on a Shoestring Budget

The CDNJS project has always been administered independently of Cloudflare. In addition to the founders, the project has been maintained by exceptionally hard-working caretakers like Peter and Matt Cowley. Maintaining a single repo of nearly every frontend project on Earth is no small task, and it has required a substantial amount of both manual work and bot development.

Unfortunately, approximately thirty days ago, one of those bots stopped working, preventing updated projects from appearing in CDNJS. The bot’s open-source maintainer was not able to invest the time necessary to keep it running. After several weeks we were asked by the community and the CDNJS founders to take over maintenance of the CDNJS repo itself. This means the Cloudflare engineering team is taking responsibility for keeping the contents of github.com/cdnjs/cdnjs up to date, in addition to ensuring it is correctly served on cdnjs.cloudflare.com.

We agreed to do this because we were, frankly, scared. Like so many open-source projects, CDNJS was a critical part of our world, but it wasn’t getting the attention it needed to survive. The Internet relies on CDNJS as much as on any other single project; losing it, or allowing it to be commandeered, would be catastrophic to millions of websites and their visitors. If it began to fail, some sites would adapt and update, but others would be broken forever.

CDNJS has always been, and remains, a project for and by the community. We are invested in making all decisions in a transparent and inclusive manner. If you are interested in contributing to CDNJS, or in the topics we’re currently discussing, please visit the CDNJS GitHub Issues page.


A Plan for the Future

One example of an area where we could use your help is in charting a path towards a CDNJS which requires less manual moderation. Nothing can replace the intelligence and creativity of a human (yet), but for a task like managing which resources go into a CDN, manual review is error-prone and time-consuming. At present a human has to review every new project to be included, and often has to take additional steps to include new versions of a project.
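To make that concrete, here is a sketch of what an automated inclusion check might look like: approve a request only if the package clears a popularity bar on npm and declares a license, and fall back to a human otherwise. The 800-downloads-per-month threshold and the choice of signals are hypothetical, not CDNJS policy.

import requests

MIN_MONTHLY_DOWNLOADS = 800  # hypothetical cutoff, not an agreed policy

def auto_review(package):
    # npm publishes per-package download counts at this endpoint.
    downloads = requests.get(
        f"https://api.npmjs.org/downloads/point/last-month/{package}", timeout=30
    ).json().get("downloads", 0)
    # The registry metadata usually carries the declared license.
    meta = requests.get(f"https://registry.npmjs.org/{package}", timeout=30).json()
    license_id = meta.get("license")
    approved = downloads >= MIN_MONTHLY_DOWNLOADS and bool(license_id)
    return approved, downloads, license_id

approved, downloads, license_id = auto_review("lodash")
print("auto-approve" if approved else "needs human review", downloads, license_id)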

As a part of our analysis of the project, we examined a snapshot of the PRs made against CDNJS over several months that were still open.

The vast majority of these PRs were changes which ultimately passed the automated review but nevertheless couldn’t be merged without manual review.

There is consensus that we should move to a model which does not require human involvement in most cases. We would love your input and collaboration on the best way to get there. If this is something you are passionate about, please contribute here.

Our plan is to support the CDNJS project in whichever ways it requires for as long as the Internet relies upon it. We invite you to use CDNJS in your next project with the full assurance that it is backed by the same network and team who protect and accelerate over twenty million of your favorite websites across the Internet. We are also planning more posts diving further into the CDNJS data; subscribe to this blog if you would like to be notified when they are released.

Behind the scenes: GitHub security alerts

Post Syndicated from Justin Hutchings original https://github.blog/2019-12-11-behind-the-scenes-github-vulnerability-alerts/

If you have code on GitHub, chances are that you’ve had a security vulnerability alert at some point. Since the feature launched, GitHub has sent more than 62 million security alerts for vulnerable dependencies.

How does it work?

Vulnerability alerts rely on two pieces of data: an inventory of all the software that your code depends on, and a curated list of known vulnerabilities in open-source code. 

Any time you push a change to a dependency manifest file, GitHub has a job that parses those manifest files, and stores your dependency on those packages in the dependency graph. If you’re dependent on something that hasn’t been seen before, a background task runs to get more information about the package from the package registries themselves and adds it. We use the information from the package registries to establish the canonical repository that the package came from, and to help populate metadata like readmes, known versions, and the published licenses. 
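As a simplified illustration of that parsing step, the sketch below reads a package.json manifest and turns its direct dependencies into records a dependency graph could store. The record shape is invented for this example, and the real job handles many more manifest formats.

import json

def parse_package_json(path):
    # Read the manifest and pull out each declared dependency with its range.
    with open(path) as manifest_file:
        manifest = json.load(manifest_file)
    records = []
    for section in ("dependencies", "devDependencies"):
        for name, version_range in manifest.get(section, {}).items():
            records.append({
                "ecosystem": "npm",
                "package": name,
                "requirement": version_range,  # e.g. "^4.17.11"
                "scope": "runtime" if section == "dependencies" else "development",
            })
    return records

for record in parse_package_json("package.json"):
    print(record)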

On GitHub Enterprise Server, this process works identically, except we don’t get any information from the public package registries in order to protect the privacy of the server and its code. 

The dependency graph supports manifests for JavaScript (npm, Yarn), .NET (NuGet), Java (Maven), PHP (Composer), Python (PyPI), and Ruby (RubyGems). This data powers our vulnerability alerts, but also dependency insights, the “Used by” badge, and the community contributors experience.

Beyond the dependency graph, we aggregate data from a number of sources and curate it to bring you actionable security alerts. GitHub brings in security vulnerability data from a number of sources, including the National Vulnerability Database (a service of the United States National Institute of Standards and Technology), maintainer security advisories from open-source maintainers, community data sources, and our partner WhiteSource.

Once we learn about a vulnerability, it passes through an advanced machine learning model that’s trained to recognize vulnerabilities which impact developers. This model rejects anything that isn’t related to an open-source toolchain. If the model accepts the vulnerability, a bot creates a pull request in a private GitHub repository for our team of curation experts to manually review.

GitHub curates vulnerabilities because CVE entries (Common Vulnerabilities and Exposures) are often ambiguous about which open-source projects are impacted. This can be particularly challenging when multiple libraries with similar names exist, or when they’re part of a larger toolkit. Depending on the kind of vulnerability, our curation team may follow up with outside security researchers or maintainers about the impact assessment. This follow-up helps to confirm that an alert is warranted and to identify the exact packages that are impacted.

Once the curation team completes the mappings, we merge the pull request and it starts a background job that notifies users about any affected repositories. Depending on the vulnerability, this can cause a lot of alerts. In a recent incident, more than two million repositories were alerted about a vulnerable version of lodash, a popular JavaScript utility library.
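Conceptually, that notification job is a version-range match: for each advisory, check the version of the package a repository depends on against the advisory's vulnerable range. The sketch below shows the idea with Python's packaging library; the advisory and repository records are invented for illustration, and the real pipeline works at a very different scale.

from packaging.specifiers import SpecifierSet
from packaging.version import Version

# Invented example data: one advisory and two repositories with pinned versions.
advisory = {"package": "lodash", "vulnerable_range": "<4.17.12"}
repositories = [
    {"repo": "octocat/widgets", "dependencies": {"lodash": "4.17.4"}},
    {"repo": "octocat/api", "dependencies": {"lodash": "4.17.15"}},
]

vulnerable = SpecifierSet(advisory["vulnerable_range"])
for repository in repositories:
    pinned = repository["dependencies"].get(advisory["package"])
    # Alert only when the pinned version falls inside the vulnerable range.
    if pinned and Version(pinned) in vulnerable:
        print(f"alert {repository['repo']}: {advisory['package']} {pinned} matches {advisory['vulnerable_range']}")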

GitHub Enterprise Server customers get a slightly different experience. If an admin has enabled security vulnerability alerts through GitHub Connect, the server will download the latest curated list of vulnerabilities from GitHub.com over the private GitHub Connect channel on its next scheduled sync (about once per hour). If a new vulnerability exists, the server determines the impacted users and repositories before generating alerts directly. 

Security vulnerabilities are a matter of public good. High-profile breaches impact the trustworthiness of the entire tech industry, so we publish a curated set of vulnerabilities on our GraphQL APIs for community projects and enterprise tools to use in custom workflows as necessary. Users can also browse the known vulnerabilities from public sources on the GitHub Advisory Database.
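As an example of consuming that data, a small script can pull recent advisories over GraphQL. The query below uses the securityAdvisories field; the field names are written from memory and should be checked against the published schema, and a personal access token is assumed in GITHUB_TOKEN.

import os
import requests

QUERY = """
{
  securityAdvisories(first: 5, orderBy: {field: PUBLISHED_AT, direction: DESC}) {
    nodes { ghsaId summary severity publishedAt }
  }
}
"""

# Assumes a personal access token in the GITHUB_TOKEN environment variable.
response = requests.post(
    "https://api.github.com/graphql",
    json={"query": QUERY},
    headers={"Authorization": f"bearer {os.environ['GITHUB_TOKEN']}"},
    timeout=30,
)
response.raise_for_status()
for advisory in response.json()["data"]["securityAdvisories"]["nodes"]:
    print(advisory["publishedAt"], advisory["severity"], advisory["ghsaId"], advisory["summary"])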

Engineers behind the feature

Despite advanced technology, security alerting is a human process driven by dedicated GitHubbers. Meet Rob (@rschultheis), one of the core members of our security team, and learn about his experiences at GitHub through a friendly Q&A:

Humphrey Dogart (German Shepherd) and Rob Schultheis (Software Engineer on the GitHub Security team)

How long have you been with GitHub? 

Two years

How did you get into software security? 

I’ve worked with open source software for most of my 20-year career in tech, and honestly for much of that time I didn’t pay much attention to security. When I started at GitHub I was given the opportunity to work on the first iteration of security alerts. It quickly became clear that having a high-quality, open dataset was going to be a critical factor in the success of the feature. I dove into the task of curating that advisory dataset and found a whole side of the industry that was open for exploration, and I’ve stayed with it ever since!

What are the trickiest parts of vulnerability curation? 

The hardest problem is probably confirming that our advisory data correctly identifies which version(s) of a package are vulnerable to a given advisory, and which version(s) first address it.

What was the most difficult security vulnerability you’ve had to publish? 

One memorable vulnerability was CVE-2015-9284. This one was tough in several ways: it was part of a popular library, it was still unpatched when it became fully public, and it was published four years after the initial disclosure to maintainers. Even worse, all attempts to fix it had stalled.

We ended up publishing it, and the community quickly responded and finally got the security issue patched.

What’s your favorite feel-good moment working in security? 

Seeing tweets and other feedback thanking us is always wonderful. We do read them! And that goes the same for those critical of the feature or the way certain advisories were disclosed or published. Please keep them coming—they’re really valuable to us as we keep evolving our security offerings.

Since you work at home, can you introduce us to your furry officemate? 

I live with a seven-month-old shepherd named Humphrey Dogart. His primary responsibilities are making sure I don’t spend all day on the computer, and he does a great job of that. I think we make a great team!


Learn more about GitHub security alerts

The post Behind the scenes: GitHub security alerts appeared first on The GitHub Blog.