Tag Archives: python

Create a serverless event-driven workflow to ingest and process Microsoft data with AWS Glue and Amazon EventBridge

Post Syndicated from Venkata Sistla original https://aws.amazon.com/blogs/big-data/create-a-serverless-event-driven-workflow-to-ingest-and-process-microsoft-data-with-aws-glue-and-amazon-eventbridge/

Microsoft SharePoint is a document management system for storing files, organizing documents, and sharing and editing documents in collaboration with others. Your organization may want to ingest SharePoint data into your data lake, combine the SharePoint data with other data that’s available in the data lake, and use it for reporting and analytics purposes. AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. AWS Glue provides all the capabilities needed for data integration so that you can start analyzing your data and putting it to use in minutes instead of months.

Organizations often manage their data on SharePoint in the form of files and lists, and you can use this data for easier discovery, better auditing, and compliance. SharePoint as a data source is not a typical relational database and the data is mostly semi structured, which is why it’s often difficult to join the SharePoint data with other relational data sources. This post shows how to ingest and process SharePoint lists and files with AWS Glue and Amazon EventBridge, which enables you to join other data that is available in your data lake. We use SharePoint REST APIs with a standard open data protocol (OData) syntax. OData advocates a standard way of implementing REST APIs that allows for SQL-like querying capabilities. OData helps you focus on your business logic while building RESTful APIs without having to worry about the various approaches to define request and response headers, query options, and so on.

AWS Glue event-driven workflows

Unlike a traditional relational database, SharePoint data may or may not change frequently, and it’s difficult to predict the frequency at which your SharePoint server generates new data, which makes it difficult to plan and schedule data processing pipelines efficiently. Running data processing frequently can be expensive, whereas scheduling pipelines to run infrequently can deliver cold data. Similarly, triggering pipelines from an external process can increase complexity, cost, and job startup time.

AWS Glue supports event-driven workflows, a capability that lets developers start AWS Glue workflows based on events delivered by EventBridge. The main reason to choose EventBridge in this architecture is because it allows you to process events, update the target tables, and make information available to consume in near-real time. Because frequency of data change in SharePoint is unpredictable, using EventBridge to capture events as they arrive enables you to run the data processing pipeline only when new data is available.

To get started, you simply create a new AWS Glue trigger of type EVENT and place it as the first trigger in your workflow. You can optionally specify a batching condition. Without event batching, the AWS Glue workflow is triggered every time an EventBridge rule matches, which may result in multiple concurrent workflows running. AWS Glue protects you by setting default limits that restrict the number of concurrent runs of a workflow. You can increase the required limits by opening a support case. Event batching allows you to configure the number of events to buffer or the maximum elapsed time before firing the particular trigger. When the batching condition is met, a workflow run is started. For example, you can trigger your workflow when 100 files are uploaded in Amazon Simple Storage Service (Amazon S3) or 5 minutes after the first upload. We recommend configuring event batching to avoid too many concurrent workflows, and optimize resource usage and cost.

To illustrate this solution better, consider the following use case for a wine manufacturing and distribution company that operates across multiple countries. They currently host all their transactional system’s data on a data lake in Amazon S3. They also use SharePoint lists to capture feedback and comments on wine quality and composition from their suppliers and other stakeholders. The supply chain team wants to join their transactional data with the wine quality comments in SharePoint data to improve their wine quality and manage their production issues better. They want to capture those comments from the SharePoint server within an hour and publish that data to a wine quality dashboard in Amazon QuickSight. With an event-driven approach to ingest and process their SharePoint data, the supply chain team can consume the data in less than an hour.

Overview of solution

In this post, we walk through a solution to set up an AWS Glue job to ingest SharePoint lists and files into an S3 bucket and an AWS Glue workflow that listens to S3 PutObject data events captured by AWS CloudTrail. This workflow is configured with an event-based trigger to run when an AWS Glue ingest job adds new files into the S3 bucket. The following diagram illustrates the architecture.

To make it simple to deploy, we captured the entire solution in an AWS CloudFormation template that enables you to automatically ingest SharePoint data into Amazon S3. SharePoint uses ClientID and TenantID credentials for authentication and uses Oauth2 for authorization.

The template helps you perform the following steps:

  1. Create an AWS Glue Python shell job to make the REST API call to the SharePoint server and ingest files or lists into Amazon S3.
  2. Create an AWS Glue workflow with a starting trigger of EVENT type.
  3. Configure CloudTrail to log data events, such as PutObject API calls to CloudTrail.
  4. Create a rule in EventBridge to forward the PutObject API events to AWS Glue when they’re emitted by CloudTrail.
  5. Add an AWS Glue event-driven workflow as a target to the EventBridge rule. The workflow gets triggered when the SharePoint ingest AWS Glue job adds new files to the S3 bucket.

Prerequisites

For this walkthrough, you should have the following prerequisites:

Configure SharePoint server authentication details

Before launching the CloudFormation stack, you need to set up your SharePoint server authentication details, namely, TenantID, Tenant, ClientID, ClientSecret, and the SharePoint URL in AWS Systems Manager Parameter Store of the account you’re deploying in. This makes sure that no authentication details are stored in the code and they’re fetched in real time from Parameter Store when the solution is running.

To create your AWS Systems Manager parameters, complete the following steps:

  1. On the Systems Manager console, under Application Management in the navigation pane, choose Parameter Store.
    systems manager
  2. Choose Create Parameter.
  3. For Name, enter the parameter name /DATALAKE/GlueIngest/SharePoint/tenant.
  4. Leave the type as string.
  5. Enter your SharePoint tenant detail into the value field.
  6. Choose Create parameter.
  7. Repeat these steps to create the following parameters:
    1. /DataLake/GlueIngest/SharePoint/tenant
    2. /DataLake/GlueIngest/SharePoint/tenant_id
    3. /DataLake/GlueIngest/SharePoint/client_id/list
    4. /DataLake/GlueIngest/SharePoint/client_secret/list
    5. /DataLake/GlueIngest/SharePoint/client_id/file
    6. /DataLake/GlueIngest/SharePoint/client_secret/file
    7. /DataLake/GlueIngest/SharePoint/url/list
    8. /DataLake/GlueIngest/SharePoint/url/file

Deploy the solution with AWS CloudFormation

For a quick start of this solution, you can deploy the provided CloudFormation stack. This creates all the required resources in your account.

The CloudFormation template generates the following resources:

  • S3 bucket – Stores data, CloudTrail logs, job scripts, and any temporary files generated during the AWS Glue extract, transform, and load (ETL) job run.
  • CloudTrail trail with S3 data events enabled – Enables EventBridge to receive PutObject API call data in a specific bucket.
  • AWS Glue Job – A Python shell job that fetches the data from the SharePoint server.
  • AWS Glue workflow – A data processing pipeline that is comprised of a crawler, jobs, and triggers. This workflow converts uploaded data files into Apache Parquet format.
  • AWS Glue database – The AWS Glue Data Catalog database that holds the tables created in this walkthrough.
  • AWS Glue table – The Data Catalog table representing the Parquet files being converted by the workflow.
  • AWS Lambda function – The AWS Lambda function is used as an AWS CloudFormation custom resource to copy job scripts from an AWS Glue-managed GitHub repository and an AWS Big Data blog S3 bucket to your S3 bucket.
  • IAM roles and policies – We use the following AWS Identity and Access Management (IAM) roles:
    • LambdaExecutionRole – Runs the Lambda function that has permission to upload the job scripts to the S3 bucket.
    • GlueServiceRole – Runs the AWS Glue job that has permission to download the script, read data from the source, and write data to the destination after conversion.
    • EventBridgeGlueExecutionRole – Has permissions to invoke the NotifyEvent API for an AWS Glue workflow.
    • IngestGlueRole – Runs the AWS Glue job that has permission to ingest data into the S3 bucket.

To launch the CloudFormation stack, complete the following steps:

  1. Sign in to the AWS CloudFormation console.
  2. Choose Launch Stack:
  3. Choose Next.
  4. For pS3BucketName, enter the unique name of your new S3 bucket.
  5. Leave pWorkflowName and pDatabaseName as the default.

cloud formation 1

  1. For pDatasetName, enter the SharePoint list name or file name you want to ingest.
  2. Choose Next.

cloud formation 2

  1. On the next page, choose Next.
  2. Review the details on the final page and select I acknowledge that AWS CloudFormation might create IAM resources.
  3. Choose Create.

It takes a few minutes for the stack creation to complete; you can follow the progress on the Events tab.

You can run the ingest AWS Glue job either on a schedule or on demand. As the job successfully finishes and ingests data into the raw prefix of the S3 bucket, the AWS Glue workflow runs and transforms the ingested raw CSV files into Parquet files and loads them into the transformed prefix.

Review the EventBridge rule

The CloudFormation template created an EventBridge rule to forward S3 PutObject API events to AWS Glue. Let’s review the configuration of the EventBridge rule:

  1. On the EventBridge console, under Events, choose Rules.
  2. Choose the rule s3_file_upload_trigger_rule-<CloudFormation-stack-name>.
  3. Review the information in the Event pattern section.

event bridge

The event pattern shows that this rule is triggered when an S3 object is uploaded to s3://<bucket_name>/data/SharePoint/tablename_raw/. CloudTrail captures the PutObject API calls made and relays them as events to EventBridge.

  1. In the Targets section, you can verify that this EventBridge rule is configured with an AWS Glue workflow as a target.

event bridge target section

Run the ingest AWS Glue job and verify the AWS Glue workflow is triggered successfully

To test the workflow, we run the ingest-glue-job-SharePoint-file job using the following steps:

  1. On the AWS Glue console, select the ingest-glue-job-SharePoint-file job.

glue job

  1. On the Action menu, choose Run job.

glue job action menu

  1. Choose the History tab and wait until the job succeeds.

glue job history tab

You can now see the CSV files in the raw prefix of your S3 bucket.

csv file s3 location

Now the workflow should be triggered.

  1. On the AWS Glue console, validate that your workflow is in the RUNNING state.

glue workflow running status

  1. Choose the workflow to view the run details.
  2. On the History tab of the workflow, choose the current or most recent workflow run.
  3. Choose View run details.

glue workflow visual

When the workflow run status changes to Completed, let’s check the converted files in your S3 bucket.

  1. Switch to the Amazon S3 console, and navigate to your bucket.

You can see the Parquet files under s3://<bucket_name>/data/SharePoint/tablename_transformed/.

parquet file s3 location

Congratulations! Your workflow ran successfully based on S3 events triggered by uploading files to your bucket. You can verify everything works as expected by running a query against the generated table using Amazon Athena.

Sample wine dataset

Let’s analyze a sample red wine dataset. The following screenshot shows a SharePoint list that contains various readings that relate to the characteristics of the wine and an associated wine category. This is populated by various wine tasters from multiple countries.

redwine dataset

The following screenshot shows a supplier dataset from the data lake with wine categories ordered per supplier.

supplier dataset

We process the red wine dataset using this solution and use Athena to query the red wine data and supplier data where wine quality is greater than or equal to 7.

athena query and results

We can visualize the processed dataset using QuickSight.

Clean up

To avoid incurring unnecessary charges, you can use the AWS CloudFormation console to delete the stack that you deployed. This removes all the resources you created when deploying the solution.

Conclusion

Event-driven architectures provide access to near-real-time information and help you make business decisions on fresh data. In this post, we demonstrated how to ingest and process SharePoint data using AWS serverless services like AWS Glue and EventBridge. We saw how to configure a rule in EventBridge to forward events to AWS Glue. You can use this pattern for your analytical use cases, such as joining SharePoint data with other data in your lake to generate insights, or auditing SharePoint data and compliance requirements.


About the Author

Venkata Sistla is a Big Data & Analytics Consultant on the AWS Professional Services team. He specializes in building data processing capabilities and helping customers remove constraints that prevent them from leveraging their data to develop business insights.

Introducing Code Club World: a new way for young people to learn to code at home

Post Syndicated from Laura Kirsop original https://www.raspberrypi.org/blog/code-club-world-free-online-platform-young-people-children-learn-to-code-at-home/

Today we are introducing you to Code Club World — a free online platform where young people aged 9 to 13 can learn to make stuff with code.

Images from Code Club World, a free online platform for children who want to learn to code

In Code Club World, young people can:

  • Start out by creating their personal robot avatar
  • Make music, design a t-shirt, and teach their robot avatar to dance!
  • Learn to code on islands with structured activities
  • Discover block-based and text-based coding in Scratch and Python
  • Earn badges for their progress 
  • Share their coding creations with family, friends, and the Code Club World community

Learning to code at home with Code Club World: meaningful, fun, flexible

When we spoke to parents and children about learning at home during the pandemic, it became clear to us that they were looking for educational tools that the children can enjoy and master independently, and that are as fun and social as the computer games and other apps the children love.

A girl has fun learning to code at home, sitting with a laptop on a sofa, with a dog sleeping next to her and her father writing code too.
Code Club World is educational, and as fun as the games and apps young people love.

What’s more, a free tool for learning to code at home is particularly important for young people who are unable to attend coding clubs in person. We believe every child should have access to a high-quality coding and digital making education. And with this in mind, we set out to create Code Club World, an online environment as rich and engaging as a face-to-face extracurricular learning experience, where all young people can learn to code.

The Code Club World activities are mapped to our research-informed Digital Making Framework — a coding and digital making curriculum for non-formal settings. That means when children are in the Code Club World environment, they are learning to code and use digital making to independently create their ideas and address challenges that matter to them.

Islands in the Code Club World online platform for children who want to learn to code for free.
Welcome to Code Club World — so many islands to explore!

By providing a structured pathway through the coding activities, a reward system of badges to engage and motivate learners, and a broad range of projects covering different topics, Code Club World supports learners at every stage, while making the activities meaningful, fun, and flexible.

A girl has fun learning to code at home on a tablet sitting on a sofa.
Code Club World’s home island works as well on mobile phones and tablets as on computers.

We’ve also designed Code Club World to be mobile-friendly, so if a young person uses a phone or tablet to visit the platform, they can still code cool things they will be proud of.

Created with the community

Since we started developing Code Club World, we have been working with a community of more than 1000 parents, educators, and children who are giving us valuable input to shape the direction of the platform. We’ve had some fantastic feedback from them:

“I’ve not coded before, but found this really fun! … I LOVED making the dance. It was so much fun and made me laugh!”

Learner, aged 11

“I love the concept of having islands to explore in making the journey through learning coding, it is fabulous and eye-catching.”

Parent

The platform is still in beta status — this means we’d love you to share it with young people in your family, school, or community so they can give their feedback and help make Code Club World even better.

Together, we will ensure every child has an equal opportunity to learn to code and make things that change their world.

The post Introducing Code Club World: a new way for young people to learn to code at home appeared first on Raspberry Pi.

Open-Sourcing a Monitoring GUI for Metaflow

Post Syndicated from Netflix Technology Blog original https://netflixtechblog.com/open-sourcing-a-monitoring-gui-for-metaflow-75ff465f0d60

Open-Sourcing a Monitoring GUI for Metaflow, Netflix’s ML Platform

tl;dr Today, we are open-sourcing a long-awaited GUI for Metaflow. The Metaflow GUI allows data scientists to monitor their workflows in real-time, track experiments, and see detailed logs and results for every executed task. The GUI can be extended with plugins, allowing the community to build integrations to other systems, custom visualizations, and embed upcoming features of Metaflow directly into its views.

Metaflow is a full-stack framework for data science that we started developing at Netflix over four years ago and which we open-sourced in 2019. It allows data scientists to define ML workflows, test them locally, scale-out to the cloud, and deploy to production in idiomatic Python code. Since open-sourcing, the Metaflow community has been growing quickly: it is now the 7th most starred active project on Netflix’s GitHub account with nearly 4800 stars. Outside Netflix, Metaflow is used to power machine learning in production by hundreds of companies across industries from bioinformatics to real estate.

Since its inception, Metaflow has been a command-line-centric tool. It makes it easy for data scientists to express even complex machine learning applications in idiomatic Python, test them locally, or scale them out in the cloud — all using their favorite IDEs and terminals. Following our culture of freedom and responsibility, Metaflow grants data scientists the freedom to choose the right modeling approach, handle data and features flexibly, and construct workflows easily while ensuring that the resulting project executes responsibly and robustly on the production infrastructure.

As the number and criticality of projects running on Metaflow increased — some of which are very central to our business — our ML platform team started receiving an increasing number of support requests. Frequently, the questions were of the nature “can you help me understand why my flow takes so long to execute” or “how can I find the logs for a model that failed last night.” Technically, Metaflow provides a Python API that allows the user to inspect all details e.g., in a notebook, but writing code in a notebook to answer basic questions like this felt overkill and unnecessarily tedious. After observing the situation for months, we started forming an understanding of the kind of a new user interface that could address the growing needs of our users.

Requirements for a Metaflow GUI

Metaflow is a human-centered system by design. We consider our Python API and the CLI to be integral parts of the overall user interface and user experience, which singularly focuses on making it easier to build production-ready ML projects from scratch. In our approach, Python code provides a highly expressive and productive user interface for expressing complex business logic, such as ML models and workflows. At the same time, the CLI allows users to execute specific commands quickly and even automate common actions. When it comes to complex, real-life development work like this, it would be hard to achieve the same level of productivity on a graphical user interface.

However, textual UIs are quite lacking when it comes to discoverability and getting a holistic understanding of the system’s state. The questions we were hearing reflected this gap: we were lacking a user interface that would allow the users, quite simply, to figure out quickly what is happening in their Metaflow projects.

Netflix has a long history of developing innovative tools for observability, so when we began to specify requirements for the new GUI, we were able to leverage experiences from the previous GUIs built for other use cases, as well as real-life user stories from Metaflow users. We wanted to scope the GUI tightly, focusing on a specific gap in the Metaflow experience:

  1. The GUI should allow the users to see what flows and tasks are executing and what is happening inside them. Notably, we didn’t want to replace any of the functionality in the Metaflow APIs or CLI with the GUI — just to complement them. This meant that the GUI would be read-only: all actions like writing code and starting executions should happen on the users’ IDE and terminal as before. We also had no need to build a model-monitoring GUI yet, which is a wholly separate problem domain.
  2. The GUI would be targeted at professional data scientists. Instead of a fancy GUI for demos and presentations, we wanted a serious productivity tool with carefully thought-out user workflows that would fit seamlessly into our toolchain of data science. This requires attention to small details: for instance, users should be able to copy a link to any view in the GUI and share it e.g., on Slack, for easy collaboration and support (or to integrate with the Metaflow Slack bot). And, there should be natural affordances for navigating between the CLI, the GUI, and notebooks.
  3. The GUI should be scalable and snappy: it should handle our existing repository consisting of millions of runs, some of which contain tens of thousands of tasks without hiccups. Based on our experiences with other GUIs operating at Netflix-scale, this is not a trivial requirement: scalability needs to be baked into the design from the very beginning. Sluggish GUIs are hard to debug and fix afterwards, and they can have a significantly negative impact on productivity.
  4. The GUI should integrate well with other GUIs. A modern ML stack consists of many independent systems like data warehouses, compute layers, model serving systems, and, in particular, notebooks. It should be possible to find runs and tasks of interest in the Metaflow GUI and use a task-specific view to jump to other GUIs for further information. Our landscape of tools is constantly evolving, so we didn’t want to hardcode these links and views in the GUI itself. Instead, following the integration-friendly ethos of Metaflow, we want to embed relevant information in the GUI as plugins.
  5. Finally, we wanted to minimize the operational overhead of the GUI. In particular, under no circumstances should the GUI impact Metaflow executions. The GUI backend should be a simple service, optionally sitting alongside the existing Metaflow metadata service, providing a read-only, real-time view to the stored state. The frontend side should be easily extensible and maintainable, suggesting that we wanted a modern React app.

Monitoring GUI for Metaflow

As our ML Platform team had limited frontend resources, we reached out to Codemate to help with the implementation. As it often happens in software engineering projects, the project took longer than expected to finish, mostly because the problem of tracking and visualizing thousands of concurrent objects in real-time in a highly distributed environment is a surprisingly non-trivial problem (duh!). After countless iterations, we are finally very happy with the outcome, which we have now used in production for a few months.

When you open the GUI, you see an overview of all flows and runs, both current and historical, which you can group and filter in various ways:

Runs Grouped by flows

We can use this view for experiment tracking: Metaflow records every execution automatically, so data scientists can track all their work using this view. Naturally, the view can be grouped by user. They can also tag their runs and filter the view by tags, allowing them to focus on particular subsets of experiments.

After you click a specific run, you see all its tasks on a timeline:

Timeline view for a run

The timeline view is extremely useful in understanding performance bottlenecks, distribution of task runtimes, and finding failed tasks. At the top, you can see global attributes of the run, such as its status, start time, parameters etc. You can click a specific task to see more details:

Task view

This task view shows logs produced by a task, its results, and optionally links to other systems that are relevant to the task. For instance, if the task had deployed a model to a model serving platform, the view could include a link to a UI used for monitoring microservices.

As specified in our requirements, the GUI should work well with Metaflow CLI. To facilitate this, the top bar includes a navigation component where the user can copy-paste any pathspec, i.e., a path to any object in the Metaflow universe, which are prominently shown in the CLI output. This way, the user can easily move from the CLI to the GUI to observe runs and tasks in detail.

While the CLI is great, it is challenging to visualize flows. Each flow can be represented as a Directed Acyclic Graph (DAG), and so the GUI provides a much better way to visualize a flow. The DAG view presents all the steps of a flow and how they are related. Each step may have developer comments. They are colored to indicate the current state. Split steps are grouped by shaded boxes, while steps that participated in a foreach are grouped by a double shade box. Clicking on a step will take you to the Task view.

DAG View

Users at different organizations will likely have some special use cases that are not directly supported. The Metaflow GUI is extensible through its plugin API. For example, Netflix has its container orchestration platform called Titus. Users can configure tasks to utilize Titus to scale up or out. When failures happen, users will need to access their Titus containers for more information, and within the task view, a simple plugin provides a link for further troubleshooting.

Example task-level plugin

Try it at home!

We know that our user stories and requirements for a Metaflow GUI are not unique to Netflix. A number of companies in the Metaflow community have requested GUI for Metaflow in the past. To support the thriving community and invite 3rd party contributions to the GUI, we are open-sourcing our Monitoring GUI for Metaflow today!

You can find detailed instructions for how to deploy the GUI here. If you want to see the GUI in action before deploying it, Outerbounds, a new startup founded by our ex-colleagues, has deployed a public demo instance of the GUI. Outerbounds also hosts an active Slack community of Metaflow users where you can find support for GUI-related issues and share feedback and ideas for improvement.

With the new GUI, data scientists don’t have to fly blind anymore. Instead of reaching out to a platform team for support, they can easily see the state of their workflows on their own. We hope that Metaflow users outside Netflix will find the GUI equally beneficial, and companies will find creative ways to improve the GUI with new plugins.

For more context on the development process and motivation for the GUI, you can watch this recording of the GUI launch meetup.


Open-Sourcing a Monitoring GUI for Metaflow was originally published in Netflix TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Code a Spectrum-style Crazy Golf game | Wireframe #54

Post Syndicated from Ryan Lambie original https://www.raspberrypi.org/blog/code-a-spectrum-style-crazy-golf-game-wireframe-54/

Putt the ball around irrational obstacles in our retro take on golf. Mark Vanstone has the code

First released by Mr. Micro in 1983 – then under the banner of Sinclair Research – Krazy Golf was, confusingly, also called Crazy Golf. The loading screen featured the Krazy spelling, but on the cover, it was plain old Crazy Golf.

Designed for the ZX Spectrum, the game provided nine holes and a variety of obstacles to putt the ball around. Crazy Golf was released at a time when dozens of other games were hitting the Spectrum market, and although it was released under the Sinclair name and reviewed in magazines such as Crash, it didn’t make much impact. The game itself employed a fairly rudimentary control system, whereby the player selects the angle of the shot at the top left of the screen, sets the range via a bar along the top, and then presses the RETURN key to take the shot.

The game was called Crazy Golf on the cover, but weirdly, the loading screen spelled the name as Krazy Golf. The early games industry was strange.

If you’ve been following our Source Code articles each month, you will have seen the pinball game where a ball bounces off various surfaces. In that example, we used a few shortcuts to approximate the bounce angles. Here, we’re only going to have horizontal and vertical walls, so we can use some fairly straightforward maths to calculate more precisely the new angle as the ball bounces off a surface. In the original game, the ball was limited to only 16 angles, and the ball moved at the same speed regardless of the strength of the shot. We’re going to improve on this a bit so that there’s more flexibility around the shot angle; we’ll also get the ball to start moving fast and then reduce its speed until it stops.

Horizontal or vertical obstruction?

To make this work, we need to have a way of defining whether an obstruction is horizontal or vertical, as the calculation is different for each. We’ll have a background graphic showing the course and obstacles, but we’ll also need another map to check our collisions. We need to make a collision map that just has the obstacles on it, so we need a white background; mark all the horizontal surfaces red and all the vertical surfaces blue.

As we move the ball around the screen (in much the same way as our pinball game) we check to see if it has collided with a surface by sampling the colours of the pixels from the collision map. If the pixel’s blue, we know that the ball has hit a vertical wall; if it’s red, the wall’s horizontal. We then calculate the new angle for the ball. If we mark the hole as black, then we can also test for collision with that – if the ball’s in the hole, the game ends.

The pointer’s angle is rotated using degrees, but we’ll use radians for our ball direction as it will simplify our movement and bounce calculations.

Get the code

We have our ball bouncing mechanism, so now we need our user interaction system. We’ll use the left and right arrow keys to rotate our pointer, which designates the direction of the next shot. We also need a range-setting gizmo, which will be shown as a bar at the top of the screen. We can make that grow and shrink with the up and down arrows.

Then when we press the RETURN key, we transfer the pointer angle and the range to the ball and watch it go. We ought to count each shot so that we can display a tally to the player once they’ve putted the ball into the hole. From this point, it’s a simple task to create another eight holes – and then you’ll have a full crazy golf game!

Here’s Mark’s code for a simple golf game. To get it running on your system, you’ll need to install Pygame Zero. And for the full code, head to our Github.

Get your copy of Wireframe issue 55

You can read more features like this one in Wireframe issue 54, available directly from Raspberry Pi Press — we deliver worldwide.

And if you’d like a handy digital version of the magazine, you can also download issue 54 for free in PDF format.

The post Code a Spectrum-style Crazy Golf game | Wireframe #54 appeared first on Raspberry Pi.

AWS Config RDK: Deploying the Custom Rules using the Terraform

Post Syndicated from Madhu Sarma original https://aws.amazon.com/blogs/devops/aws-config-rdk-deploying-the-custom-rules-using-the-terraform/

To help customers using the Terraform for multi-cloud infrastructure deployment, we have introduced a new feature in the AWS Config Rule Development Kit (RDK) that allows you to export custom AWS Config rules to Terraform files so that you can deploy the RDK rules with Terraform.

This blog post is a complement to the previous post – How to develop custom AWS Config rules using the Rule Development Kit. Here I will show you how to prototype, develop, and deploy custom AWS Config rules. The steps for prototyping and developing the custom AWS Config rules remain identical, while a variation exists in the deployment step, which I’ll walk you through in detail. I would encourage you to review the previous blog post, so that you can follow along here.

In this post, you will learn how to export the custom AWS Config rule to Terraform files and deploy to AWS using the Terraform.

Background

RDK doesn’t support the Terraform for rules deployment, which is impacting customers using the Terraform (“Infrastructure As Code”) to provision AWS infrastructure. Therefore, we have provided one more option to deploy the rules by using the Terraform.

Getting Started

The first step is making sure that you installed the latest RDK version. After you have defined an AWS Config rule and prototyped using the AWS Config RDK as described in the previous blog post, follow the steps below to deploy the various AWS Config components across the compliance and satellite accounts.

Prerequisites

Validate that you downloaded the RDK that supports “export”, using the command “rdk export -h”, and you should see the below output. If the installed RDK doesn’t support the export feature, then update it by using the command  “pip install rdk”

(venv) 8c85902e4110:7RDK test$ rdk export -h 
 
usage: rdk export [-h] [-s RULESETS] [--all] [--lambda-layers LAMBDA_LAYERS]  
                  [--lambda-subnets LAMBDA_SUBNETS]  
                  [--lambda-security-groups LAMBDA_SECURITY_GROUPS]  
                  [--lambda-role-arn LAMBDA_ROLE_ARN]  
                  [--rdklib-layer-arn RDKLIB_LAYER_ARN] -v {0.11,0.12} -f  
                  {terraform}  
                  [<rulename> [<rulename> ...]]  
  
Used to export the Config Rule to terraform file.  
  
positional arguments:  
  <rulename>            Rule name(s) to export to a file.  
  
optional arguments:  
  -h, --help            show this help message and exit  
  -s RULESETS, --rulesets RULESETS  
                        comma-delimited list of RuleSet names  
  --all, -a             All rules in the working directory will be deployed.  
  --lambda-layers LAMBDA_LAYERS  
                        [optional] Comma-separated list of Lambda Layer ARNs  
                        to deploy with your Lambda function(s).  
  --lambda-subnets LAMBDA_SUBNETS  
                        [optional] Comma-separated list of Subnets to deploy  
                        your Lambda function(s).  
  --lambda-security-groups LAMBDA_SECURITY_GROUPS  
                        [optional] Comma-separated list of Security Groups to  
                        deploy with your Lambda function(s).  
  --lambda-role-arn LAMBDA_ROLE_ARN  
                        [optional] Assign existing iam role to lambda  
                        functions. If omitted, new lambda role will be  
                        created.  
  --rdklib-layer-arn RDKLIB_LAYER_ARN  
                        [optional] Lambda Layer ARN that contains the desired  
                        rdklib. Note that Lambda Layers are region-specific.  
  -v {0.11,0.12}, --version {0.11,0.12}  
                        Terraform version  
  -f {terraform}, --format {terraform}  
                        Export Format  

Create your rule

Create your rule by using the command below which creates the MY_FIRST_RULE rule.

7RDK test$ rdk create MY_FIRST_RULE  --runtime python3.6 --resource-types AWS::EC2::SecurityGroup  
Running create!  
Local Rule files created.  

This creates the three files below. Edit the “MY_FIRST_RULE.py” as per your business requirement, as described in the “Edit” section of this blog.

7RDK test$ cd MY_FIRST_RULE/ 
(venv) 8c85902e4110:MY_FIRST_RULE test$ls 
MY_FIRST_RULE.py        MY_FIRST_RULE_test.py   parameters.json

Export your rule to Terraform

Use the command below to export your rule to the Terraform files, which supports the two versions of Terraform (0.11 and 0.12). Use the “-v” argument to specify the version.

test$ cd ..  
7RDK test$ rdk export MY_FIRST_RULE -f terraform -v 0.12  
Running export  
Found Custom Rule.  
Zipping MY_FIRST_RULE  
Zipping complete.  
terraform version: 0.12  
Export completed.This will generate three .tf files.  
7RDK test$

This creates the four files.

  • << rule-name >>_rule.tf :
    • This script uploads the rule to the Amazon S3 bucket, deploys the lambda, and creates the AWS config rule and the required IAM roles/policies.
  • << rule-name >>_variables.tf:  Terraform variable definitions.
  • << rule-name >>.tfvars.json: Terraform variable values.
  • << rule-name >>.zip: Compiled rule code.
7RDK test$ cd MY_FIRST_RULE/  
(venv) 8c85902e4110:MY_FIRST_RULE test$ ls -1  
MY_FIRST_RULE.py  
MY_FIRST_RULE.zip  
MY_FIRST_RULE_test.py  
my_first_rule.tfvars.json  
my_first_rule_rule.tf  
my_first_rule_variables.tf  
parameters.json  

Deploy your rule using the Terraform

Initialize the Terraform by using “terraform init” to download the AWS provider Plug-In.

MY_FIRST_RULE test$ terraform init  
  
Initializing the backend...  
  
Initializing provider plugins...  
- Checking for available provider plugins...  
- Downloading plugin for provider "aws" (hashicorp/aws) 2.70.0...  
  
The following providers do not have any version constraints in configuration,  
so the latest version was installed.  
  
To prevent automatic upgrades to new major versions that may contain breaking  
changes, it is recommended to add version = "..." constraints to the  
corresponding provider blocks in configuration, with the constraint strings  
suggested below.  
  
* provider.aws: version = "~> 2.70"  
  
Terraform has been successfully initialized!  

To deploy the config rules, your role should have the permissions and should mention the role ARN in my_rule.tfvars.json

To apply the Terraform, it requires two arguments:

  • var-file: Terraform script variable file name, created while exporting the rule using RDK.
  • source_bucket: Your Amazon S3 bucket name, to upload the config rule lambda code.

Make sure that AWS provider is configured for your Terraform environment as mentioned in the docs.

MY_FIRST_RULE test$ terraform apply -var-file=my_first_rule.tfvars.json --var source_bucket=config-bucket-xxxxx  
  
aws_iam_policy.awsconfig_policy[0]: Creating...  
aws_iam_role.awsconfig[0]: Creating...  
aws_s3_bucket_object.rule_code: Creating...  
aws_iam_role.awsconfig[0]: Creation complete after 3s [id=my_first_rule-awsconfig-role]  
aws_iam_role_policy_attachment.readonly-role-policy-attach[0]: Creating...  
aws_iam_policy.awsconfig_policy[0]: Creation complete after 4s [id=arn:aws:iam::xxxxxxxxxxxx:policy/my_first_rule-awsconfig-policy]  
aws_iam_role_policy_attachment.awsconfig_policy_attach[0]: Creating...  
aws_s3_bucket_object.rule_code: Creation complete after 5s [id=MY_FIRST_RULE.zip]  
aws_lambda_function.rdk_rule: Creating...  
aws_iam_role_policy_attachment.readonly-role-policy-attach[0]: Creation complete after 2s [id=my_first_rule-awsconfig-role-20200726023315892200000001]  
aws_iam_role_policy_attachment.awsconfig_policy_attach[0]: Creation complete after 3s [id=my_first_rule-awsconfig-role-20200726023317242000000002]  
aws_lambda_function.rdk_rule: Still creating... [10s elapsed]  
aws_lambda_function.rdk_rule: Creation complete after 18s [id=RDK-Rule-Function-MY_FIRST_RULE]  
aws_lambda_permission.lambda_invoke: Creating...  
aws_config_config_rule.event_triggered[0]: Creating...  
aws_lambda_permission.lambda_invoke: Creation complete after 2s [id=AllowExecutionFromConfig]  
aws_config_config_rule.event_triggered[0]: Creation complete after 4s [id=MY_FIRST_RULE]  
  
Apply complete! Resources: 8 added, 0 changed, 0 destroyed.  

Login to your AWS console to validate the deployed config rule.

Clean up

Enter the following command to remove all the resources.

  1. MY_FIRST_RULE test$ terraform destroy

Conclusion

With this new feature, you can export the AWS config rules developed by RDK to the Terraform,  and integrate these files into your Terraform CI/CD pipeline to provision the config rules in AWS without using the RDK.

Code your own pinball game | Wireframe #53

Post Syndicated from Ryan Lambie original https://www.raspberrypi.org/blog/code-your-own-pinball-game-wireframe-53/

Get flappers flapping and balls bouncing off bumpers. Mark Vanstone has the code in the new issue of Wireframe magazine, available now.

There are so many pinball video games that it’s become a genre in its own right. For the few of you who haven’t encountered pinball for some reason, it originated as an analogue arcade machine where a metal ball would be fired onto a sloping play area and bounce between obstacles. The player operates a pair of flippers by pressing buttons on each side of the machine, which will in turn ping the ball back up the play area to hit obstacles and earn points. The game ends when the ball falls through the exit at the bottom of the play area.

NES Pinball
One of the earliest pinball video games – it’s the imaginatively-named Pinball on the NES.

Recreating pinball machines for video games

Video game developers soon started trying to recreate pinball, first with fairly rudimentary graphics and physics, but with increasingly greater realism over time – if you look at Nintendo’s Pinball from 1984, then, say, Devil’s Crush on the Sega Mega Drive in 1990, and then 1992’s Pinball Dreams on PC, you can see how radically the genre evolved in just a few years. In this month’s Source Code, we’re going to put together a very simple rendition of pinball in Pygame Zero. We’re not going to use any complicated maths or physics systems, just a little algebra and trigonometry.

Let’s start with our background. We need an image which has barriers around the outside for the ball to bounce off, and a gap at the bottom for the ball to fall through. We also want some obstacles in the play area and an entrance at the side for the ball to enter when it’s first fired. In this case, we’re going to use our background as a collision map, too, so we need to design it so that all the areas that the ball can move in are black.

Pinball in Python
Here it is: your own pinball game in less than 100 lines of code.

Next, we need some flippers. These are defined as Actors with a pivot anchor position set near the larger end, and are positioned near the bottom of the play area. We detect left and right key presses and rotate the angle of the flippers by 20 degrees within a range of -30 to +30 degrees. If no key is pressed, then the flipper drops back down. With these elements in place, we have our play area and an ability for the player to defend the exit.

All we need now is a ball to go bouncing around the obstacles we’ve made. Defining the ball as an Actor, we can add a direction and a speed parameter to it. With these values set, the ball can be moved using a bit of trigonometry. Our new x-coordinate will move by the sin of the ball direction multiplied by the speed, and the new y-coordinate will move by the cos of the ball direction multiplied by speed. We need to detect collisions with objects and obstacles, so we sample four pixels around the ball to see if it’s hit anything solid. If it has, we need to make the ball bounce.

Get the code

Here’s Mark’s pinball code. To get it working on your system, you’ll need to install Pygame Zero. And to download the full code and assets, head here.

If you wanted more realistic physics, you’d calculate the reflection angle from the surface which has been hit, but in this case, we’re going to use a shortcut which will produce a rough approximation. We work out what direction the ball is travelling in and then rotate either left or right by a quarter of a turn until the ball no longer collides with a wall. We could finesse this calculation further to create a more accurate effect, but we’ll keep it simple for this sample. Finally, we need to add some gravity. As the play area is tilted downwards, we need to increase the ball speed as it travels down and decrease it as it travels up.

All of this should give you the bare bones of a pinball game. There’s lots more you could add to increase the realism, but we’ll leave you to discover the joys of normal vectors and dot products…

Get your copy of Wireframe issue 53

You can read more features like this one in Wireframe issue 53, available directly from Raspberry Pi Press — we deliver worldwide.

And if you’d like a handy digital version of the magazine, you can also download issue 53 for free in PDF format.

The post Code your own pinball game | Wireframe #53 appeared first on Raspberry Pi.

Recreate Gradius’ rock-spewing volcanoes | Wireframe #52

Post Syndicated from Ryan Lambie original https://www.raspberrypi.org/blog/recreate-gradius-volcanoes-wireframe-52/

Code an homage to Konami’s classic shoot-’em-up, Gradius. Mark Vanstone has the code in the new edition of Wireframe magazine, available now.

Released by Konami in 1985, Gradius – also known as Nemesis outside Japan – brought a new breed of power-up system to arcades. One of the keys to its success was the way the player could customise their Vic Viper fighter craft by gathering capsules, which could then be ‘spent’ on weapons, speed-ups, and shields from a bar at the bottom of the screen.

Gradius screenshot
The Gradius volcanoes spew rocks at the player just before the end-of-level boss ship arrives.

Flying rocks

A seminal side-scrolling shooter, Gradius was particularly striking thanks to the variety of its levels: a wide range of hazards were thrown at the player, including waves of aliens, natural phenomena, and boss ships with engine cores that had to be destroyed in order to progress. One of the first stage’s biggest obstacles was a pair of volcanoes that spewed deadly rocks into the air: the rocks could be shot for extra points or just avoided to get through to the next section. In this month’s Source Code, we’re going to have a look at how to recreate the volcano-style flying rock obstacle from the game.

Our sample uses Pygame Zero and the randint function from the random module to provide the variations of trajectory that we need our rocks to have. We’ll need an actor created for our spaceship and a list to hold our rock Actors. We can also make a bullet Actor so we can make the ship fire lasers and shoot the rocks. We build up the scene in layers in our draw() function with a star-speckled background, then our rocks, followed by the foreground of volcanoes, and finally the spaceship and bullets.

Dodge and shoot the rocks in our homage to the classic Gradius.

Get the ship moving

In the update() function, we need to handle moving the ship around with the cursor keys. We can use a limit() function to make sure it doesn’t go off the screen, and the SPACE bar to trigger the bullet to be fired. After that, we need to update our rocks. At the start of the game our list of rocks will be empty, so we’ll get a random number generated, and if the number is 1, we make a new rock and add it to the list. If we have more than 100 rocks in our list, some of them will have moved off the screen, so we may as well reuse them instead of making more new rocks. During each update cycle, we’ll need to run through our list of rocks and update their position. When we make a rock, we give it a speed and direction, then when it’s updated, we move the rock upwards by its speed and then reduce the speed by 0.2. This will make it fly into the air, slow down, and then fall to the ground. 

Collision detection

From this code, we can make rocks appear just behind both of the volcanoes, and they’ll fly in a random direction upwards at a random speed. We can increase or decrease the number of rocks flying about by changing the random numbers that spawn them. We should be able to fly in and out of the rocks, but we could add some collision detection to check whether the rocks hit the ship – we may also want to destroy the ship if it’s hit by a rock. In our sample, we have an alternative, ‘shielded’ state to indicate that a collision has occurred. We can also check for collisions with the bullets: if a collision’s detected, we can make the rock and the bullet disappear by moving them off-screen, at which point they’re ready to be reused.

That’s about it for this month’s sample, but there are many more elements from the original game that you could add yourself: extra weapons, more enemies, or even an area boss.

Here’s Mark’s volcanic code. To get it working on your system, you’ll need to install Pygame Zero. And to download the full code and assets, head here.

Get your copy of Wireframe issue 52

You can read more features like this one in Wireframe issue 52, available directly from Raspberry Pi Press — we deliver worldwide.

Wireframe issue 52's cover

And if you’d like a handy digital version of the magazine, you can also download issue 52 for free in PDF format.

The post Recreate Gradius’ rock-spewing volcanoes | Wireframe #52 appeared first on Raspberry Pi.

Hosting Hugging Face models on AWS Lambda for serverless inference

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/hosting-hugging-face-models-on-aws-lambda/

This post written by Eddie Pick, AWS Senior Solutions Architect – Startups and Scott Perry, AWS Senior Specialist Solutions Architect – AI/ML

Hugging Face Transformers is a popular open-source project that provides pre-trained, natural language processing (NLP) models for a wide variety of use cases. Customers with minimal machine learning experience can use pre-trained models to enhance their applications quickly using NLP. This includes tasks such as text classification, language translation, summarization, and question answering – to name a few.

First introduced in 2017, the Transformer is a modern neural network architecture that has quickly become the most popular type of machine learning model applied to NLP tasks. It outperforms previous techniques based on convolutional neural networks (CNNs) or recurrent neural networks (RNNs). The Transformer also offers significant improvements in computational efficiency. Notably, Transformers are more conducive to parallel computation. This means that Transformer-based models can be trained more quickly, and on larger datasets than their predecessors.

The computational efficiency of Transformers provides the opportunity to experiment and improve on the original architecture. Over the past few years, the industry has seen the introduction of larger and more powerful Transformer models. For example, BERT was first published in 2018 and was able to get better benchmark scores on 11 natural language processing tasks using between 110M-340M neural network parameters. In 2019, the T5 model using 11B parameters achieved better results on benchmarks such as summarization, question answering, and text classification. More recently, the GPT-3 model was introduced in 2020 with 175B parameters and in 2021 the Switch Transformers are scaling to over 1T parameters.

One consequence of this trend toward larger and more powerful models is an increased barrier to entry. As the number of model parameters increases, as does the computational infrastructure that is necessary to train such a model. This is where the open-source Hugging Face Transformers project helps.

Hugging Face Transformers provides over 30 pretrained Transformer-based models available via a straightforward Python package. Additionally, there are over 10,000 community-developed models available for download from Hugging Face. This allows users to use modern Transformer models within their applications without requiring model training from scratch.

The Hugging Face Transformers project directly addresses challenges associated with training modern Transformer-based models. Many customers want a zero administration ML inference solution that allows Hugging Face Transformers models to be hosted in AWS easily. This post introduces a low touch, cost effective, and scalable mechanism for hosting Hugging Face models for real-time inference using AWS Lambda.

Overview

Our solution consists of an AWS Cloud Development Kit (AWS CDK) script that automatically provisions container image-based Lambda functions that perform ML inference using pre-trained Hugging Face models. This solution also includes Amazon Elastic File System (EFS) storage that is attached to the Lambda functions to cache the pre-trained models and reduce inference latency.Solution architecture

In this architectural diagram:

  1. Serverless inference is achieved by using Lambda functions that are based on container image
  2. The container image is stored in an Amazon Elastic Container Registry (ECR) repository within your account
  3. Pre-trained models are automatically downloaded from Hugging Face the first time the function is invoked
  4. Pre-trained models are cached within Amazon Elastic File System storage in order to improve inference latency

The solution includes Python scripts for two common NLP use cases:

  • Sentiment analysis: Identifying if a sentence indicates positive or negative sentiment. It uses a fine-tuned model on sst2, which is a GLUE task.
  • Summarization: Summarizing a body of text into a shorter, representative text. It uses a Bart model that was fine-tuned on the CNN / Daily Mail dataset.

For simplicity, both of these use cases are implemented using Hugging Face pipelines.

Prerequisites

The following is required to run this example:

Deploying the example application

  1. Clone the project to your development environment:
    git clone https://github.com/aws-samples/zero-administration-inference-with-aws-lambda-for-hugging-face.git
  2. Install the required dependencies:
    pip install -r requirements.txt
  3. Bootstrap the CDK. This command provisions the initial resources needed by the CDK to perform deployments:
    cdk bootstrap
  4. This command deploys the CDK application to its environment. During the deployment, the toolkit outputs progress indications:
    $ cdk deploy

Testing the application

After deployment, navigate to the AWS Management Console to find and test the Lambda functions. There is one for sentiment analysis and one for summarization.

To test:

  1. Enter “Lambda” in the search bar of the AWS Management Console:Console Search
  2. Filter the functions by entering “ServerlessHuggingFace”:Filtering functions
  3. Select the ServerlessHuggingFaceStack-sentimentXXXXX function:Select function
  4. In the Test event, enter the following snippet and then choose Test:Test function
{
   "text": "I'm so happy I could cry!"
}

The first invocation takes approximately one minute to complete. The initial Lambda function environment must be allocated and the pre-trained model must be downloaded from Hugging Face. Subsequent invocations are faster, as the Lambda function is already prepared and the pre-trained model is cached in EFS.Function test results

The JSON response shows the result of the sentiment analysis:

{
  "statusCode": 200,
  "body": {
    "label": "POSITIVE",
    "score": 0.9997532367706299
  }
}

Understanding the code structure

The code is organized using the following structure:

├── inference
│ ├── Dockerfile
│ ├── sentiment.py
│ └── summarization.py
├── app.py
└── ...

The inference directory contains:

  • The Dockerfile used to build a custom image to be able to run PyTorch Hugging Face inference using Lambda functions
  • The Python scripts that perform the actual ML inference

The sentiment.py script shows how to use a Hugging Face Transformers model:

import json
from transformers import pipeline

nlp = pipeline("sentiment-analysis")

def handler(event, context):
    response = {
        "statusCode": 200,
        "body": nlp(event['text'])[0]
    }
    return response

For each Python script in the inference directory, the CDK generates a Lambda function backed by a container image and a Python inference script.

CDK script

The CDK script is named app.py in the solution’s repository. The beginning of the script creates a virtual private cloud (VPC).

vpc = ec2.Vpc(self, 'Vpc', max_azs=2)

Next, it creates the EFS file system and an access point in EFS for the cached models:

        fs = efs.FileSystem(self, 'FileSystem',
                            vpc=vpc,
                            removal_policy=cdk.RemovalPolicy.DESTROY)
        access_point = fs.add_access_point('MLAccessPoint',
                                           create_acl=efs.Acl(
                                               owner_gid='1001', owner_uid='1001', permissions='750'),
                                           path="/export/models",
                                           posix_user=efs.PosixUser(gid="1001", uid="1001"))>

It iterates through the Python files in the inference directory:

docker_folder = os.path.dirname(os.path.realpath(__file__)) + "/inference"
pathlist = Path(docker_folder).rglob('*.py')
for path in pathlist:

And then creates the Lambda function that serves the inference requests:

            base = os.path.basename(path)
            filename = os.path.splitext(base)[0]
            # Lambda Function from docker image
            function = lambda_.DockerImageFunction(
                self, filename,
                code=lambda_.DockerImageCode.from_image_asset(docker_folder,
                                                              cmd=[
                                                                  filename+".handler"]
                                                              ),
                memory_size=8096,
                timeout=cdk.Duration.seconds(600),
                vpc=vpc,
                filesystem=lambda_.FileSystem.from_efs_access_point(
                    access_point, '/mnt/hf_models_cache'),
                environment={
                    "TRANSFORMERS_CACHE": "/mnt/hf_models_cache"},
            )

Adding a translator

Optionally, you can add more models by adding Python scripts in the inference directory. For example, add the following code in a file called translate-en2fr.py:

import json
from transformers 
import pipeline

en_fr_translator = pipeline('translation_en_to_fr')

def handler(event, context):
    response = {
        "statusCode": 200,
        "body": en_fr_translator(event['text'])[0]
    }
    return response

Then run:

$ cdk synth
$ cdk deploy

This creates a new endpoint to perform English to French translation.

Cleaning up

After you are finished experimenting with this project, run “cdk destroy” to remove all of the associated infrastructure.

Conclusion

This post shows how to perform ML inference for pre-trained Hugging Face models by using Lambda functions. To avoid repeatedly downloading the pre-trained models, this solution uses an EFS-based approach to model caching. This helps to achieve low-latency, near real-time inference. The solution is provided as infrastructure as code using Python and the AWS CDK.

We hope this blog post allows you to prototype quickly and include modern NLP techniques in your own products.

Introducing new self-paced courses to improve Java and Python code quality with Amazon CodeGuru

Post Syndicated from Rafael Ramos original https://aws.amazon.com/blogs/devops/new-self-paced-courses-to-improve-java-and-python-code-quality-with-amazon-codeguru/

Amazon CodeGuru icon

During the software development lifecycle, organizations have adopted peer code reviews as a common practice to keep improving code quality and prevent bugs from reaching applications in production. Developers traditionally perform those code reviews manually, which causes bottlenecks and blocks releases while waiting for the peer review. Besides impacting the teams’ agility, it’s a challenge to maintain a high bar for code reviews during the development workflow. This is especially challenging for less experienced developers, who have more difficulties identifying defects, such as thread concurrency and resource leaks.

With Amazon CodeGuru Reviewer, developers have an automated code review tool that catches critical issues, security vulnerabilities, and hard-to-find bugs during application development. CodeGuru Reviewer is powered by pre-trained machine learning (ML) models and uses millions of code reviews on thousands of open-source and Amazon repositories. It also provides recommendations on how to fix issues to improve code quality and reduces the time it takes to fix bugs before they reach customer-facing applications. Java and Python developers can simply add Amazon CodeGuru to their existing development pipeline and save time and reduce the cost and burden of bad code.

If you’re new to writing code or an experienced developer looking to automate code reviews, we’re excited to announce two new courses on CodeGuru Reviewer. These courses, developed by the AWS Training and Certification team, consist of guided walkthroughs, gaming elements, knowledge checks, and a final course assessment.

About the course

During these courses, you learn how to use CodeGuru Reviewer to automatically scan your code base, identify hard-to-find bugs and vulnerabilities, and get recommendations for fixing the bugs and security issues. The course covers CodeGuru Reviewer’s main features, provides a peek into how CodeGuru finds code anomalies, describes how its ML models were built, and explains how to understand and apply its prescriptive guidance and recommendations. Besides helping on improving the code quality, those recommendations are useful for new developers to learn coding best practices, such as refactor duplicated code, correct implementation of concurrency constructs, and how to avoid resource leaks.

The CodeGuru courses are designed to be completed within a 2-week time frame. The courses comprise 60 minutes of videos, which include 15 main lectures. Four of the lectures are specific to Java, and four focus on Python. The courses also include exercises and assessments at the end of each week, to provide you with in-depth, hands-on practice in a lab environment.

Week 1

During the first week, you learn the basics of CodeGuru Reviewer, including how you can benefit from ML and automated reasoning to perform static code analysis and identify critical defects from coding best practices. You also learn what kind of actionable recommendations CodeGuru Reviewer provides, such as refactoring, resource leak, potential race conditions, deadlocks, and security analysis. In addition, the course covers how to integrate this tool on your development workflow, such as your CI/CD pipeline.

Topics include:

  • What is Amazon CodeGuru?
  • How CodeGuru Reviewer is trained to provide intelligent recommendations
  • CodeGuru Reviewer recommendation categories
  • How to integrate CodeGuru Reviewer into your workflow

Week 2

Throughout the second week, you have the chance to explore CodeGuru Reviewer in more depth. With Java and Python code snippets, you have a more hands-on experience and dive into each recommendation category. You use these examples to learn how CodeGuru Reviewer looks for duplicated lines of code to suggest refactoring opportunities, how it detects code maintainability issues, and how it prevents resource leaks and concurrency bugs.

Topics include (for both Java and Python):

  • Common coding best practices
  • Resource leak prevention
  • Security analysis

Get started

Developed at the source, this new digital course empowers you to learn about CodeGuru from the experts at AWS whenever, wherever you want. Advance your skills and knowledge to build your future in the AWS Cloud. Enroll today:

Rafael Ramos

Rafael Ramos

Rafael is a Solutions Architect at AWS, where he helps ISVs on their journey to the cloud. He spent over 13 years working as a software developer, and is passionate about DevOps and serverless. Outside of work, he enjoys playing tabletop RPG, cooking and running marathons.

Integrate GitHub monorepo with AWS CodePipeline to run project-specific CI/CD pipelines

Post Syndicated from Vivek Kumar original https://aws.amazon.com/blogs/devops/integrate-github-monorepo-with-aws-codepipeline-to-run-project-specific-ci-cd-pipelines/

AWS CodePipeline is a continuous delivery service that enables you to model, visualize, and automate the steps required to release your software. With CodePipeline, you model the full release process for building your code, deploying to pre-production environments, testing your application, and releasing it to production. CodePipeline then builds, tests, and deploys your application according to the defined workflow either in manual mode or automatically every time a code change occurs. A lot of organizations use GitHub as their source code repository. Some organizations choose to embed multiple applications or services in a single GitHub repository separated by folders. This method of organizing your source code in a repository is called a monorepo.

This post demonstrates how to customize GitHub events that invoke a monorepo service-specific pipeline by reading the GitHub event payload using AWS Lambda.

 

Solution overview

With the default setup in CodePipeline, a release pipeline is invoked whenever a change in the source code repository is detected. When using GitHub as the source for a pipeline, CodePipeline uses a webhook to detect changes in a remote branch and starts the pipeline. When using a monorepo style project with GitHub, it doesn’t matter which folder in the repository you change the code, CodePipeline gets an event at the repository level. If you have a continuous integration and continuous deployment (CI/CD) pipeline for each of the applications and services in a repository, all pipelines detect the change in any of the folders every time. The following diagram illustrates this scenario.

 

GitHub monorepo folder structure

 

This post demonstrates how to customize GitHub events that invoke a monorepo service-specific pipeline by reading the GitHub event payload using Lambda. This solution has the following benefits:

  • Add customizations to start pipelines based on external factors – You can use custom code to evaluate whether a pipeline should be triggered. This allows for further customization beyond polling a source repository or relying on a push event. For example, you can create custom logic to automatically reschedule deployments on holidays to the next available workday.
  • Have multiple pipelines with a single source – You can trigger selected pipelines when multiple pipelines are listening to a single GitHub repository. This lets you group small and highly related but independently shipped artifacts such as small microservices without creating thousands of GitHub repos.
  • Avoid reacting to unimportant files – You can avoid triggering a pipeline when changing files that don’t affect the application functionality (such as documentation, readme, PDF, and .gitignore files).

In this post, we’re not debating the advantages or disadvantages of a monorepo versus a single repo, or when to create monorepos or single repos for each application or project.

 

Sample architecture

This post focuses on controlling running pipelines in CodePipeline. CodePipeline can have multiple stages like test, approval, and deploy. Our sample architecture considers a simple pipeline with two stages: source and build.

 

Github monorepo - CodePipeline Sample Architecture

This solution is made up of following parts:

  • An Amazon API Gateway endpoint (3) is backed by a Lambda function (5) to receive and authenticate GitHub webhook push events (2)
  • The same function evaluates incoming GitHub push events and starts the pipeline on a match
  • An Amazon Simple Storage Service (Amazon S3) bucket (4) stores the CodePipeline-specific configuration files
  • The pipeline contains a build stage with AWS CodeBuild

 

Normally, after you create a CI/CD pipeline, it automatically triggers a pipeline to release the latest version of your source code. From then on, every time you make a change in your source code, the pipeline is triggered. You can also manually run the last revision through a pipeline by choosing Release change on the CodePipeline console. This architecture uses the manual mode to run the pipeline. GitHub push events and branch changes are evaluated by the Lambda function to avoid commits that change unimportant files from starting the pipeline.

 

Creating an API Gateway endpoint

We need a single API Gateway endpoint backed by a Lambda function with the responsibility of authenticating and validating incoming requests from GitHub. You can authenticate requests using HMAC security or GitHub Apps. API Gateway only needs one POST method to consume GitHub push events, as shown in the following screenshot.

 

Creating an API Gateway endpoint

 

Creating the Lambda function

This Lambda function is responsible for authenticating and evaluating the GitHub events. As part of the evaluation process, the function can parse through the GitHub events payload, determine which files are changed, added, or deleted, and perform the appropriate action:

  • Start a single pipeline, depending on which folder is changed in GitHub
  • Start multiple pipelines
  • Ignore the changes if non-relevant files are changed

You can store the project configuration details in Amazon S3. Lambda can read this configuration to decide what needs to be done when a particular folder is matched from a GitHub event. The following code is an example configuration:

{

    "GitHubRepo": "SampleRepo",

    "GitHubBranch": "main",

    "ChangeMatchExpressions": "ProjectA/.*",

    "IgnoreFiles": "*.pdf;*.md",

    "CodePipelineName": "ProjectA - CodePipeline"

}

For more complex use cases, you can store the configuration file in Amazon DynamoDB.

The following is the sample Lambda function code in Python 3.7 using Boto3:

def lambda_handler(event, context):

    import json
    modifiedFiles = event["commits"][0]["modified"]
    #full path
    for filePath in modifiedFiles:
        # Extract folder name
        folderName = (filePath[:filePath.find("/")])
        break

    #start the pipeline
    if len(folderName)>0:
        # Codepipeline name is foldername-job. 
        # We can read the configuration from S3 as well. 
        returnCode = start_code_pipeline(folderName + '-job')

    return {
        'statusCode': 200,
        'body': json.dumps('Modified project in repo:' + folderName)
    }
    

def start_code_pipeline(pipelineName):
    client = codepipeline_client()
    response = client.start_pipeline_execution(name=pipelineName)
    return True

cpclient = None
def codepipeline_client():
    import boto3
    global cpclient
    if not cpclient:
        cpclient = boto3.client('codepipeline')
    return cpclient
   

Creating a GitHub webhook

GitHub provides webhooks to allow external services to be notified on certain events. For this use case, we create a webhook for a push event. This generates a POST request to the URL (API Gateway URL) specified for any files committed and pushed to the repository. The following screenshot shows our webhook configuration.

Creating a GitHub webhook2

Conclusion

In our sample architecture, two pipelines monitor the same GitHub source code repository. A Lambda function decides which pipeline to run based on the GitHub events. The same function can have logic to ignore unimportant files, for example any readme or PDF files.

Using API Gateway, Lambda, and Amazon S3 in combination serves as a general example to introduce custom logic to invoke pipelines. You can expand this solution for increasingly complex processing logic.

 

About the Author

Vivek Kumar

Vivek is a Solutions Architect at AWS based out of New York. He works with customers providing technical assistance and architectural guidance on various AWS services. He brings more than 23 years of experience in software engineering and architecture roles for various large-scale enterprises.

 

 

Gaurav-Sharma

Gaurav is a Solutions Architect at AWS. He works with digital native business customers providing architectural guidance on AWS services.

 

 

 

Nitin-Aggarwal

Nitin is a Solutions Architect at AWS. He works with digital native business customers providing architectural guidance on AWS services.

 

 

 

 

Swing into action with an homage to Pitfall! | Wireframe #48

Post Syndicated from Ryan Lambie original https://www.raspberrypi.org/blog/swing-into-action-with-an-homage-to-pitfall-wireframe-48/

Grab onto ropes and swing across chasms in our Python rendition of an Atari 2600 classic. Mark Vanstone has the code

Whether it was because of the design brilliance of the game itself or because Raiders of the Lost Ark had just hit the box office, Pitfall Harry became a popular character on the Atari 2600 in 1982.

His hazardous attempts to collect treasure struck a chord with eighties gamers, and saw Pitfall!, released by Activision, sell over four million copies. A sequel, Pitfall II: The Lost Caverns quickly followed the next year, and the game was ported to several other systems, even making its way to smartphones and tablets in the 21st century.

Pitfall

Designed by David Crane, Pitfall! was released for the Atari 2600 and published by Activision in 1982

The game itself is a quest to find 32 items of treasure within a 20-minute time limit. There are a variety of hazards for Pitfall Harry to navigate around and over, including rolling logs, animals, and holes in the ground. Some of these holes can be jumped over, but some are too wide and have a convenient rope swinging from a tree to aid our explorer in getting to the other side of the screen. Harry must jump towards the rope as it moves towards him and then hang on as it swings him over the pit, releasing his grip at the other end to land safely back on firm ground.

For this code sample, we’ll concentrate on the rope swinging (and catching) mechanic. Using Pygame Zero, we can get our basic display set up quickly. In this case, we can split the background into three layers: the background, including the back of the pathway and the tree trunks, the treetops, and the front of the pathway. With these layers we can have a rope swinging with its pivot point behind the leaves of the trees, and, if Harry gets a jump wrong, it will look like he falls down the hole in the ground. The order in which we draw these to the screen is background, rope, tree-tops, Harry, and finally the front of the pathway.

Now, let’s get our rope swinging. We can create an Actor and anchor it to the centre and top of its bounding box. If we rotate it by changing the angle property of the Actor, then it will rotate at the top of the Actor rather than the mid-point. We can make the rope swing between -45 degrees and 45 degrees by increments of 1, but if we do this, we get a rather robotic sort of movement. To fix this, we add an ‘easing’ value which we can calculate using a square root to make the rope slow down as it reaches the extremes of the swing.

Our homage to the classic Pitfall! Atari game. Can you add some rolling logs and other hazards?

Our Harry character will need to be able to run backwards and forwards, so we’ll need a few frames of animation. There are several ways of coding this, but for now, we can take the x coordinate and work out which frame to display as the x value changes. If we have four frames of running animation, then we would use the %4 operator and value on the x coordinate to give us animation frames of 0, 1, 2, and 3. We use these frames for running to the right, and if he’s running to the left, we just mirror the images. We can check to see if Harry is on the ground or over the pit, and if he needs to be falling downward, we add to his y coordinate. If he’s jumping (by pressing the SPACE bar), we reduce his y coordinate.

We now need to check if Harry has reached the rope, so after a collision, we check to see if he’s connected with it, and if he has, we mark him as attached and then move him with the end of the rope until the player presses the SPACE bar and he can jump off at the other side. If he’s swung far enough, he should land safely and not fall down the pit. If he falls, then the player can have another go by pressing the SPACE bar to reset Harry back to the start.

That should get Pitfall Harry over one particular obstacle, but the original game had several other challenges to tackle – we’ll leave you to add those for yourselves.

Pitfall Python code

Here’s Mark’s code for a Pitfall!-style platformer. To get it working on your system, you’ll need to  install Pygame Zero.  And to download the full code and assets, head here.

Get your copy of Wireframe issue 48

You can read more features like this one in Wireframe issue 48, available directly from Raspberry Pi Press — we deliver worldwide.
Wireframe issue 48
And if you’d like a handy digital version of the magazine, you can also download issue 48 for free in PDF format.
A banner with the words "Be a Pi Day donor today"

The post Swing into action with an homage to Pitfall! | Wireframe #48 appeared first on Raspberry Pi.

Creating serendipity with Python

Post Syndicated from John Graham-Cumming original https://blog.cloudflare.com/creating-serendipity-with-python/

Creating serendipity with Python

We’ve been experimenting with breaking up employees into random groups (of size 4) and setting up video hangouts between them. We’re doing this to replace the serendipitous meetings that sometimes occur around coffee machines, in lunch lines or while waiting for the printer. And also, we just want people to get to know each other.

Which lead to me writing some code. The core of which is divide n elements into groups of at least size g minimizing the size of each group. So, suppose an office has 15 employees in it then it would be divided into three groups of sizes 5, 5, 5; if an office had 16 employees it would be 4, 4, 4, 4; if it had 17 employees it would be 4, 4, 4, 5 and so on.

I initially wrote the following code (in Python):

    groups = [g] * (n//g)

    for e in range(0, n % g):
        groups[e % len(groups)] += 1

The first line creates n//g (// is integer division) entries of size g (for example, if g == 4 and n == 17 then groups == [4, 4, 4, 4]). The for loop deals with the ‘left over’ parts that don’t divide exactly into groups of size g. If g == 4 and n == 17 then there will be one left over element to add to one of the existing [4, 4, 4, 4] groups resulting in [5, 4, 4, 4].

The e % len(groups) is needed because it’s possible that there are more elements left over after dividing into equal sized groups than there are entries in groups. For example, if g == 4 and n == 11 then groups is initially set to [4, 4] with three left over elements that have to be distributed into just two entries in groups.

So, that code works and here’s the output for various sizes of n (and g == 4):

    4 [4]
    5 [5]
    6 [6]
    7 [7]
    8 [4, 4]
    9 [5, 4]
    10 [5, 5]
    11 [6, 5]
    12 [4, 4, 4]
    13 [5, 4, 4]
    14 [5, 5, 4]
    15 [5, 5, 5]
    16 [4, 4, 4, 4]
    17 [5, 4, 4, 4]

But the code irritated me because I felt there must be a simple formula to work out how many elements should be in each group. After noodling on this problem I decided to do something that’s often helpful… make the problem simple and naive, or, at least, the solution simple and naive, and so I wrote this code:

    groups = [0] * (n//g)

    for i in range(n):
        groups[i % len(groups)] += 1

This is a really simple implementation. I don’t like it because it loops n times but it helps visualize something. Imagine that g == 4 and n == 17. This loop ‘fills up’ each entry in groups like this (each square is an entry in groups and numbers in the squares are values of i for which that entry was incremented by the loop).

Creating serendipity with Python

So groups ends up being [5, 4, 4, 4].  What this helps see is that the number of times groups[i] is incremented depends on the number of times the for loop ‘loops around’ on the ith element. And that’s something that’s easy to calculate without looping.

So this means that the code is now simply:

    groups = [1+max(0,n-(i+1))//(n//g) for i in range(n//g)]

And to me that is more satisfying. n//g is the size of groups which makes the loop update each entry in groups once. Each entry is set to 1 + max(0, n-(i+1))//(n//g). You can think of this as follows:

1. The 1 is the first element to be dropped into each entry in groups.

2. max(0, n-(i+1)) is the number of elements left over once you’ve placed 1 in each of the elements of groups up to position i. It’s divided by n//g to work out how many times the process of sharing out elements (see the naive loop above) will loop around.

If #2 there isn’t clear, consider the image above and imagine we are computing groups[0] (n == 17 and g == 4). We place 1 in groups[0] leaving 16 elements to share out. If you naively shared them out you’d loop around four times and thus need to add 16/4 elements to groups[0]making it 5.

Move on to groups[1] and place a 1 in it. Now there are 15 elements to share out, that’s 15/4 (which is 3 in integer division) and so you place 4 in groups[1]. And so on…

And that solution pleases me most. It succinctly creates groups in one shot. Of course, I might have over thought this… and others might think the other solutions are clearer or more maintainable.

Coding on Raspberry Pi remotely with Visual Studio Code

Post Syndicated from Ashley Whittaker original https://www.raspberrypi.org/blog/coding-on-raspberry-pi-remotely-with-visual-studio-code/

Jim Bennett from Microsoft, who showed you all how to get Visual Studio Code up and running on Raspberry Pi last week, is back to explain how to use VS Code for remote development on a headless Raspberry Pi.

Like a lot of Raspberry Pi users, I like to run my Raspberry Pi as a ‘headless’ device to control various electronics – such as a busy light to let my family know I’m in meetings, or my IoT powered ugly sweater.

The upside of headless is that my Raspberry Pi can be anywhere, not tied to a monitor, keyboard and mouse. The downside is programming and debugging it – do you plug your Raspberry Pi into a monitor and run the full Raspberry Pi OS desktop, or do you use Raspberry Pi OS Lite and try to program and debug over SSH using the command line? Or is there a better way?

Remote development with VS Code to the rescue

There is a better way – using Visual Studio Code remote development! Visual Studio Code, or VS Code, is a free, open source, developer’s text editor with a whole swathe of extensions to support you coding in multiple languages, and provide tools to support your development. I practically live day to day in VS Code: whether I’m writing blog posts, documentation or Python code, or programming microcontrollers, it’s my work ‘home’. You can run VS Code on Windows, macOS, and of course on a Raspberry Pi.

One of the extensions that helps here is the Remote SSH extension, part of a pack of remote development extensions. This extension allows you to connect to a remote device over SSH, and run VS Code as if you were running on that remote device. You see the remote file system, the VS Code terminal runs on the remote device, and you access the remote device’s hardware. When you are debugging, the debug session runs on the remote device, but VS Code runs on the host machine.

Photograph of Raspberry Pi 4
Raspberry Pi 4

For example – I can run VS Code on my MacBook Pro, and connect remotely to a Raspberry Pi 4 that is running headless. I can access the Raspberry Pi file system, run commands on a terminal connected to it, access whatever hardware my Raspberry Pi has, and debug on it.

Remote SSH needs a Raspberry Pi 3 or 4. It is not supported on older Raspberry Pis, or on Raspberry Pi Zero.

Set up remote development on Raspberry Pi

For remote development, your Raspberry Pi needs to be connected to your network either by ethernet or WiFi, and have SSH enabled. The Raspberry Pi documentation has a great article on setting up a headless Raspberry Pi if you don’t already know how to do this.

You also need to know either the IP address of the Raspberry Pi, or its hostname. If you don’t know how to do this, it is also covered in the Raspberry Pi documentation.

Connect to the Raspberry Pi from VS Code

Once the Raspberry Pi is set up, you can connect from VS Code on your Mac or PC.

First make sure you have VS Code installed. If not, you can install it from the VS Code downloads page.

From inside VS Code, you will need to install the Remote SSH extension. Select the Extensions tab from the sidebar menu, then search for Remote development. Select the Remote Development extension, and select the Install button.

Next you can connect to your Raspberry Pi. Launch the VS Code command palette using Ctrl+Shift+P on Linux or Windows, or Cmd+Shift+P on macOS. Search for and select Remote SSH: Connect current window to host (there’s also a connect to host option that will create a new window).

Enter the SSH connection details, using user@host. For the user, enter the Raspberry Pi username (the default is pi). For the host, enter the IP address of the Raspberry Pi, or the hostname. The hostname needs to end with .local, so if you are using the default hostname of raspberrypi, enter raspberrypi.local.

The .local syntax is supported on macOS and the latest versions of Windows or Linux. If it doesn’t work for you then you can install additional software locally to add support. On Linux, install Avahi using the command sudo apt-get install avahi-daemon. On Windows, install either Bonjour Print Services for Windows, or iTunes for Windows.

For example, to connect to my Raspberry Pi 400 with a hostname of pi-400 using the default pi user, I enter [email protected].

The first time you connect, it will validate the fingerprint to ensure you are connecting to the correct host. Select Continue from this dialog.

Enter your Raspberry Pi’s password when promoted. The default is raspberry, but you should have changed this (really, you should!).

VS Code will then install the relevant tools on the Raspberry Pi and configure the remote SSH connection.

Code!

You will now be all set up and ready to code on your Raspberry Pi. Start by opening a folder or cloning a git repository and away you go coding, debugging and deploying your applications.

In the remote session, not all extensions you have installed locally will be available remotely. Any extensions that change the behavior of VS Code as an application, such as themes or tools for managing cloud resources, will be available.

Things like language packs and other programming tools are not installed in the remote session, so you’ll need to re-install them. When you install these extensions, you’ll see the Install button has changed to Install in SSH:< hostname > to show it’s being installed remotely.

VS Code may seem daunting at first – it’s a powerful tool with a huge range of extensions. The good news is Microsoft has you covered with lots of hands-on, self-guided learning guides on how to use it with different languages and development tools, from using Git version control, to developing web applications. There’s even a guide to learning Python basics with Wonder Woman!

Jim with his arms folded wearing a dark t shirt
Jim Bennett

You remember Jim – his blog Expecting Someone Geekier is well good. You can find him on Twitter @jimbobbennett and on github.

The post Coding on Raspberry Pi remotely with Visual Studio Code appeared first on Raspberry Pi.

Visual Studio Code comes to Raspberry Pi

Post Syndicated from Ashley Whittaker original https://www.raspberrypi.org/blog/visual-studio-code-comes-to-raspberry-pi/

Microsoft’s Visual Studio Code is an excellent C development environment, and now it’s an easy install on Raspberry Pi. Here’s Jim Bennett from Microsoft to show you all how to get VS Code up and running on our tiny computer. Take it away, Jim…

There are a few products in the tech sphere that get me really excited. One of them is Raspberry Pi (obviously), and the other is Visual Studio Code or VS Code. I always hoped that the two would come together one day — and now, to my great pleasure, they have!

VS Code is a free, open source developer text editor originally released for Windows, macOS and x64 Linux. Out of the box it supports generic text editing and git source code control, as well as full web development with JavaScript, TypeScript and Node.js, with debugging, intellisense and all the goodness you’d expect from a full-featured IDE. What makes it super powerful is extensions — bringing a huge range of programming languages, developer tools and other capabilities.

For example my VS Code setup includes a Python extension so I can code and debug in Python, a set of Microsoft Azure extensions so I can manage my cloud services, PlatformIO to allow me to program micro-controllers like Arduino boards coupled with a C++ extension to support coding in C and C++, and even some Docker support. Not a bad setup for a completely free developer tool.

Jim’s Raspberry Pi 400 running VS Code

I’ve been hoping for years VS Code would come to Raspberry Pi, and finally it’s here. As well as supporting Debian Linux on x64, there are now builds for ARM and ARM64 – both of which can run on Raspberry Pi OS (the ARM build on Raspberry Pi OS, the ARM64 on the beta of the 64-bit Raspberry Pi OS). And yes — I am writing this right now on a Raspberry Pi 400 running VS Code!

Why am I so excited about this?

Well, there are a couple of reasons.

Firstly, it brings an exceptional developer tool to Raspberry Pi. There are already some great editors, but nothing of the calibre of VS Code. I can take my $35 computer, plug it into a keyboard and mouse, connect a monitor and a TV and code in a wide range of languages from the same place.

I see kids learning Python at school using one tool, then learning web development in an after-school coding club with a different tool. They can now do both in the same application, reducing the cognitive load – they only have to learn one tool, one debugger, one setup. Combine this with the new Raspberry Pi 400 and you have an all-in-one solution to learning to code, reminiscent of my ZX Spectrum of decades ago, but so much more powerful.

The second reason is to me the most important — it allows kids to share the same development environment as their grown-ups. Imagine the joy of a 10-year-old coding Python using VS Code on their Raspberry Pi plugged into the family TV, then seeing their Mum working from home coding Python in exactly the same tool on her work laptop as part of her job as an AI engineer or data scientist. It also makes it easier when Mum has to inevitably help with unblocking the issues that always come up with learners.

As a young child it was mind-blowing when my Dad brought home a work PC so he could write reports and I could use it to write up my school work – I was using what Dad used at work, making me feel important. I see this with my seven-year-old daughter, seeing her excitement that I use Microsoft Teams for work, the same as she uses for her virtual schooling (she’s even offered to teach me how to use it if I get stuck). To be able to bring that unadulterated joy of using ‘grown-up tools’ to our young learners is priceless.

Installing VS Code

The great news is VS Code is now available as part of the Raspberry Pi OS apt packages. Launch the Raspberry Pi Terminal and run the following commands:

sudo apt update 
sudo apt install code -y

This will download and install VS Code. If you’ve got your hands on a Pico, then you may not even need to do this – VS Code is installed as part of the Pico setup from the Getting Started guide.

After installing VS Code, you can run it from the Programming folder in the Raspberry Pi menu.

Getting started with VS Code

VS Code may seem daunting at first – it’s a powerful tool with a huge range of extensions. The good news is Microsoft has you covered with lots of hands-on, self-guided learning guides on how to use it with different languages and development tools, from using Git version control, to developing web applications — there’s even a guide to learning Python basics with Wonder Woman.

Go grab it and happy coding!

Jim with his arms folded wearing a dark t shirt
There he is – that’s the real life Jim!

Brilliant Jim Bennett shares loads of Raspberry Pi builds and tutorials over on Expecting Someone Geekier and tweets @jimbobbennett. He also works in Developer Relations at Microsoft. You can learn pretty much everything there is to know about him on github.

The post Visual Studio Code comes to Raspberry Pi appeared first on Raspberry Pi.

Code a Light Cycle arcade minigame | Wireframe #47

Post Syndicated from Ryan Lambie original https://www.raspberrypi.org/blog/code-a-light-cycle-arcade-minigame-wireframe-47/

Speed around an arena, avoiding walls and deadly trails in this Light Cycle minigame. Mark Vanstone has the code.

Battle against AI enemies in the original arcade classic.

At the beginning of the 1980s, Disney made plans for an entirely new kind of animated movie that used cutting-edge computer graphics. The resulting film was 1982’s TRON, and it inevitably sparked one of the earliest tie-in arcade machines.

The game featured several minigames, including one based on the Light Cycle section of the movie, where players speed around an arena on high-tech motorbikes, which leave a deadly trail of light in their wake. If competitors hit any walls or cross the path of any trails, then it’s game over.

Players progress through the twelve levels which were all named after programming languages. In the Light Cycle game, the players compete against AI players who drive yellow Light Cycles around the arena. As the levels progress, more AI Players are added.

The TRON game, distributed by Bally Midway, was well-received in arcades, and even won Electronic Games Magazine’s (presumably) coveted Coin-operated Game of the Year gong.

Although the arcade game wasn’t ported to home computers at the time, several similar games – and outright clones – emerged, such as the unsubtly named Light Cycle for the BBC Micro, Oric, and ZX Spectrum.

The Light Cycle minigame is essentially a variation on Snake, with the player leaving a trail behind them as they move around the screen. There are various ways to code this with Pygame Zero.

In this sample, we’ll focus on the movement of the player Light Cycle and creating the trails that are left behind as it moves around the screen. We could use line drawing functions for the trail behind the bike, or go for a system like Snake, where blocks are added to the trail as the player moves.

In this example, though, we’re going to use a two-dimensional list as a matrix of positions on the screen. This means that wherever the player moves on the screen, we can set the position as visited or check to see if it’s been visited before and, if so, trigger an end-game event.

Our homage to the TRON Light Cycle classic arcade game.

For the main draw() function, we first blit our background image which is the cross-hatched arena, then we iterate through our two-dimensional list of screen positions (each 10 pixels square) displaying a square anywhere the Cycle has been. The Cycle is then drawn and we can add a display of the score.

The update() function contains code to move the Cycle and check for collisions. We use a list of directions in degrees to control the angle the player is pointing, and another list of x and y increments for each direction. Each update we add x and y coordinates to the Cycle actor to move it in the direction that it’s pointing multiplied by our speed variable.

We have an on_key_down() function defined to handle changing the direction of the Cycle actor with the arrow keys. We need to wait a while before checking for collisions on the current position, as the Cycle won’t have moved away for several updates, so each screen position in the matrix is actually a counter of how many updates it’s been there for.

We can then test to see if 15 updates have happened before testing the square for collisions, which gives our Cycle enough time to clear the area. If we do detect a collision, then we can start the game-end sequence.

We set the gamestate variable to 1, which then means the update() function uses that variable as a counter to run through the frames of animation for the Cycle’s explosion. Once it reaches the end of the sequence, the game stops.

We have a key press defined (the SPACE bar) in the on_key_down() function to call our init() function, which will not only set up variables when the game starts but sets things back to their starting state.

Here’s Mark’s code for a TRON-style Light Cycle minigame. To get it working on your system, you’ll need to install Pygame Zero. And to download the full code and assets, head here.

So that’s the fundamentals of the player Light Cycle movement and collision checking. To make it more like the original arcade game, why not try experimenting with the code and adding a few computer-controlled rivals?

Get your copy of Wireframe issue 47

You can read more features like this one in Wireframe issue 47, available directly from Raspberry Pi Press — we deliver worldwide.

And if you’d like a handy digital version of the magazine, you can also download issue 47 for free in PDF format.

The post Code a Light Cycle arcade minigame | Wireframe #47 appeared first on Raspberry Pi.

Code your own Pipe Mania puzzler | Wireframe #46

Post Syndicated from Ryan Lambie original https://www.raspberrypi.org/blog/code-your-own-pipe-mania-puzzler-wireframe-46/

Create a network of pipes before the water starts to flow in our re-creation of a classic puzzler. Jordi Santonja shows you how.

A screen grab of the game in motion
Pipe Mania’s design is so effective, it’s appeared in various guises elsewhere – even as a minigame in BioShock.

Pipe Mania, also called Pipe Dream in the US, is a puzzle game developed by The Assembly Line in 1989 for Amiga, Atari ST, and PC, and later ported to other platforms, including arcades. The player must place randomly generated sections of pipe onto a grid. When a counter reaches zero, water starts to flow and must reach the longest possible distance through the connected pipes.

Let’s look at how to recreate Pipe Dream in Python and Pygame Zero. The variable start is decremented at each frame. It begins with a value of 60*30, so it reaches zero after 30 seconds if our monitor runs at 60 frames per second. In that time, the player can place tiles on the grid to build a path. Every time the user clicks on the grid, the last tile from nextTiles is placed on the play area and a new random tile appears at the top of the next tiles. randint(2,8) computes a random value between 2 and 8.

Our Pipe Mania homage. Build a pipeline before the water escapes, and see if you can beat your own score.

grid and nextTiles are lists of tile values, from 0 to 8, and are copied to the screen in the draw function with the screen.blit operation. grid is a two-dimensional list, with sizes gridWidth=10 and gridHeight=7. Every pipe piece is placed in grid with a mouse click. This is managed with the Pygame functions on_mouse_move and on_mouse_down, where the variable pos contains the mouse position in the window. panelPosition defines the position of the top-left corner of the grid in the window. To get the grid cell, panelPosition is subtracted from pos, and the result is divided by tileSize with the integer division //. tileMouse stores the resulting cell element, but it is set to (-1,-1) when the mouse lies outside the grid.

The images folder contains the PNGs with the tile images, two for every tile: the graphical image and the path image. The tiles list contains the name of every tile, and adding to it _block or _path obtains the name of the file. The values stored in nextTiles and grid are the indexes of the elements in tiles.

wfmag46code
Here’s Jordi’s code for a Pipemania-style puzzler. To get it working on your system, you’ll need to install Pygame Zero. And to download the full code and assets, head here.

The image waterPath isn’t shown to the user, but it stores the paths that the water is going to follow. The first point of the water path is located in the starting tile, and it’s stored in currentPoint. update calls the function CheckNextPointDeleteCurrent, when the water starts flowing. That function finds the next point in the water path, erases it, and adds a new point to the waterFlow list. waterFlow is shown to the user in the draw function.

pointsToCheck contains a list of relative positions, offsets, that define a step of two pixels from currentPoint in every direction to find the next point. Why two pixels? To be able to define the ‘cross’ tile, where two lines cross each other. In a ‘cross’ tile the water flow must follow a straight line, and this is how the only points found are the next points in the same direction. When no next point is found, the game ends and the score is shown: the number of points in the water path, playState is set to 0, and no more updates are done.

Get your copy of Wireframe issue 46

You can read more features like this one in Wireframe issue 46, available directly from Raspberry Pi Press — we deliver worldwide.

wfcover

And if you’d like a handy digital version of the magazine, you can also download issue 46 for free in PDF format.

The post Code your own Pipe Mania puzzler | Wireframe #46 appeared first on Raspberry Pi.

New for Amazon CodeGuru – Python Support, Security Detectors, and Memory Profiling

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-amazon-codeguru-python-support-security-detectors-and-memory-profiling/

Amazon CodeGuru is a developer tool that helps you improve your code quality and has two main components:

  • CodeGuru Reviewer uses program analysis and machine learning to detect potential defects that are difficult to find in your code and offers suggestions for improvement.
  • CodeGuru Profiler collects runtime performance data from your live applications, and provides visualizations and recommendations to help you fine-tune your application performance.

Today, I am happy to announce three new features:

  • Python Support for CodeGuru Reviewer and Profiler (Preview) – You can now use CodeGuru to improve applications written in Python. Before this release, CodeGuru Reviewer could analyze Java code, and CodeGuru Profiler supported applications running on a Java virtual machine (JVM).
  • Security Detectors for CodeGuru Reviewer – A new set of detectors for CodeGuru Reviewer to identify security vulnerabilities and check for security best practices in your Java code.
  • Memory Profiling for CodeGuru Profiler – A new visualization of memory retention per object type over time. This makes it easier to find memory leaks and optimize how your application is using memory.

Let’s see these functionalities in more detail.

Python Support for CodeGuru Reviewer and Profiler (Preview)
Python Support for CodeGuru Reviewer is available in Preview and offers recommendations on how to improve the Python code of your applications in multiple categories such as concurrency, data structures and control flow, scientific/math operations, error handling, using the standard library, and of course AWS best practices.

You can now also use CodeGuru Profiler to collect runtime performance data from your Python applications and get visualizations to help you identify how code is running on the CPU and where time is consumed. In this way, you can detect the most expensive lines of code of your application. Focusing your tuning activities on those parts helps you reduce infrastructure cost and improve application performance.

Let’s see the CodeGuru Reviewer in action with some Python code. When I joined AWS eight years ago, one of the first projects I created was a Filesystem in Userspace (FUSE) interface to Amazon Simple Storage Service (S3) called yas3fs (Yet Another S3-backed File System). It was inspired by the more popular s3fs-fuse project but rewritten from scratch to implement a distributed cache synchronized by Amazon Simple Notification Service (SNS) notifications (now, thanks to the many contributors, it’s using S3 event notifications). It was also a good excuse for me to learn more about Python programming and S3. It’s a personal project that at the time made available as open source. Today, if you need a shared file system, you can use Amazon Elastic File System (EFS).

In the CodeGuru console, I associate the yas3fs repository. You can associate repositories from GitHub, including GitHub Enterprise Cloud and GitHub Enterprise Server, Bitbucket, or AWS CodeCommit.

After that, I can get a code review from CodeGuru in two ways:

  • Automatically, when I create a pull request. This is a great way to use it as you and your team are working on a code base.
  • Manually, creating a repository analysis to get a code review for all the code in one branch. This is useful to start using GodeGuru with an existing code base.

Since I just associated the whole repository, I go for a full analysis and write down the branch name to review (apologies, I was still using master at the time, now I use main for new projects).

After a few minutes, the code review is completed, and there are 14 recommendations. Not bad, but I can definitely improve the code. Here’s a few of the recommendations I get. I was using exceptions and global variables too much at the time.

Security Detectors for CodeGuru Reviewer
The new CodeGuru Reviewer Security Detector uses automated reasoning to analyze all code paths and find potential security issues deep in your Java code, even ones that span multiple methods and files and that may involve multiple sequences of operations. To build this detector, we used learning and best practices from Amazon’s 20+ years of experience.

The Security Detector is also identifying security vulnerabilities in the top 10 Open Web Application Security Project (OWASP) categories, such as weak hash encryption.

If the security detector discovers an issue, it offers a suggested remediation along with an explanation. In this way, it’s much easier to follow security best practices for AWS APIs, such as those for AWS Key Management Service (KMS) and Amazon Elastic Compute Cloud (EC2), and for common Java cryptography and TLS/SSL libraries.

With help from the security detector, security engineers can focus on architectural and application-specific security best-practices, and code reviewers can focus their attention on other improvements.

Memory Profiling for CodeGuru Profiler
For applications running on a JVM, CodeGuru Profiler can now show the Heap Summary, a consolidated view of memory retention during a time frame, tracking both overall sizes and number of objects per object type (such as String, int, char[], and custom types). These metrics are presented in a timeline graph, so that you can easily spot trends and peaks of memory utilization per object type.

Here are a couple of scenarios where this can help:

Memory Leaks – A constantly growing memory utilization curve for one or more object types may indicate a leak (intended here as unnecessary retention of memory objects by the application), possibly leading to out-of-memory errors and application crashes.

Memory Optimizations – Having a breakdown of memory utilization per object type is a step beyond traditional memory utilization monitoring, based solely on JVM-level metrics like total heap usage. By knowing that an unexpectedly high amount of memory has been associated with a specific object type, you can focus your analysis and optimization efforts on the parts of your application that are responsible for allocating and referencing objects of that type.

For example, here is a graph showing how memory is used by a Java application over an interval of time. Apart from the total capacity available and the used space, I can see how memory is being used by some specific object types, such as byte[], java.lang.UUID, and the entries of a java.util.LinkedHashMap. The continuous growth over time of the memory retained by these object types is suspicious. There is probably a memory leak I have to investigate.

In the table just below, I have a longer list of object types allocating memory on the heap. The first three are selected and for that reason are shown in the graph above. Here, I can inspect other object types and select them to see their memory usage over time. It looks like the three I already selected are the ones with more risk of being affected by a memory leak.

Available Now
These new features are available today in all regions where Amazon CodeGuru is offered. For more information, please see the AWS Regional Services table.

There are no pricing changes for Python support, security detectors, and memory profiling. You pay for what you use without upfront fees or commitments.

Learn more about Amazon CodeGuru and start using these new features today to improve the code quality of your applications.  

Danilo

Recreate Tiger-Heli’s bomb mechanic | Wireframe #45

Post Syndicated from Ryan Lambie original https://www.raspberrypi.org/blog/recreate-tiger-helis-bomb-mechanic-wireframe-45/

Code an explosive homage to Toaplan’s classic blaster. Mark Vanstone has the details

Tiger-Heli was developed by Toaplan and published in Japan by Taito and by Romstar in North America.

Released in 1985, Tiger-Heli was one of the earliest games from Japanese developer Toaplan: a top-down shoot-’em-up that pitted a lone helicopter against relentless waves of enemy tanks and military installations. Toaplan would go on to refine and evolve the genre through the eighties and nineties with such titles as Truxton and Fire Shark, so Tiger-Heli served as a kind of blueprint for the studio’s legendary blasters.

Tiger-Heli featured a powerful secondary weapon, too: as well as a regular shot, the game’s attack helicopter could also drop a deadly bomb capable of destroying everything within its blast radius. The mechanic was one that first appeared as far back as Atari’s Defender in 1981, but Toaplan quickly made it its own, with variations on the bomb becoming one of the signatures in the studio’s later games.

For our Tiger-Heli-style Pygame Zero code, we’ll concentrate on the unique bomb aspect, but first, we need to get the basic scrolling background and helicopter on the screen. In a game like this, we’d normally make the background out of tiles that can be used to create a varied but continuous scrolling image. For this example, though, we’ll keep things simple and have one long image that we scroll down the screen and then display a copy above it. When the first image goes off the screen, we just reset the co-ordinates to display it above the second image copy. In this way, we can have an infinitely scrolling background.

Our Tiger-Heli homage in Python. Fly over the military targets, firing missiles and dropping bombs.

 

The helicopter can be set up as an Actor with just two frames for the movement of the rotors. This should look like it’s hovering above the ground, so we blit a shadow bitmap to the bottom right of the helicopter. We can set up keyboard events to move the Actor left, right, up, and down, making sure we don’t allow it to go off the screen.

Now we can go ahead and set up the bombs. We can predefine a list of bomb Actors but only display them while the bombs are active. We’ll trigger a bomb drop with the SPACE bar and set all the bombs to the co-ordinates of the helicopter. Then, frame by frame, we move each bomb outwards in different directions so that they spread out in a pattern. You could try adjusting the number of bombs or their pattern to see what effects can be achieved. When the bombs get to frame 30, we start changing the image so that we get a flashing, expanding circle for each bomb.

Here’s Mark’s code for a Tiger-Heli-style shooter. To get it working on your system, you’ll need to install Pygame Zero. And to download the full code and assets, head here.

It’s all very well having bombs to fire, but we could really do with something to drop them on, so let’s make some tank Actors waiting on the ground for us to destroy. We can move them with the scrolling background so that they look like they’re static on the ground. Then if one of our bombs has a collision detected with one of the tanks, we can set an animation going by cycling through a set of explosion frames, ending with the tank disappearing.

We can also add in some sound effects as the bombs are dropped, and explosion sounds if the tanks are hit. And with that, there you have it: the beginnings of a Tiger-Heli-style blaster.

Get your copy of Wireframe issue 45

You can read more features like this one in Wireframe issue 45, available directly from Raspberry Pi Press — we deliver worldwide.

And if you’d like a handy digital version of the magazine, you can also download issue 45 for free in PDF format.

Baldur’s Gate III: our cover star for Wireframe #45.

Make sure to follow Wireframe on Twitter and Facebook for updates and exclusive offers and giveaways. Subscribe on the Wireframe website to save up to 72% compared to newsstand pricing!

The post Recreate Tiger-Heli’s bomb mechanic | Wireframe #45 appeared first on Raspberry Pi.

Raising code quality for Python applications using Amazon CodeGuru

Post Syndicated from Ran Fu original https://aws.amazon.com/blogs/devops/raising-code-quality-for-python-applications-using-amazon-codeguru/

We are pleased to announce the launch of Python support for Amazon CodeGuru, a service for automated code reviews and application performance recommendations. CodeGuru is powered by program analysis and machine learning, and trained on best practices and hard-learned lessons across millions of code reviews and thousands of applications profiled on open-source projects and internally at Amazon.

Amazon CodeGuru has two services:

  • Amazon CodeGuru Reviewer – Helps you improve source code quality by detecting hard-to-find defects during application development and recommending how to remediate them.
  • Amazon CodeGuru Profiler – Helps you find the most expensive lines of code, helps reduce your infrastructure cost, and fine-tunes your application performance.

The launch of Python support extends CodeGuru beyond its original Java support. Python is a widely used language for various use cases, including web app development and DevOps. Python’s growth in data analysis and machine learning areas is driven by its rich frameworks and libraries. In this post, we discuss how to use CodeGuru Reviewer and Profiler to improve your code quality for Python applications.

CodeGuru Reviewer for Python

CodeGuru Reviewer now allows you to analyze your Python code through pull requests and full repository analysis. For more information, see Automating code reviews and application profiling with Amazon CodeGuru. We analyzed large code corpuses and Python documentation to source hard-to-find coding issues and trained our detectors to provide best practice recommendations. We expect such recommendations to benefit beginners as well as expert Python programmers.

CodeGuru Reviewer generates recommendations in the following categories:

  • AWS SDK APIs best practices
  • Data structures and control flow, including exception handling
  • Resource leaks
  • Secure coding practices to protect from potential shell injections

In the following sections, we provide real-world examples of bugs that can be detected in each of the categories:

AWS SDK API best practices

AWS has hundreds of services and thousands of APIs. Developers can now benefit from CodeGuru Reviewer recommendations related to AWS APIs. AWS recommendations in CodeGuru Reviewer cover a wide range of scenarios such as detecting outdated or deprecated APIs, warning about API misuse, authentication and exception scenarios, and efficient API alternatives.

Consider the pagination trait, implemented by over 1,000 APIs from more than 150 AWS services. The trait is commonly used when the response object is too large to return in a single response. To get the complete set of results, iterated calls to the API are required, until the last page is reached. If developers were not aware of this, they would write the code as the following (this example is patterned after actual code):

def sync_ddb_table(source_ddb, destination_ddb):
    response = source_ddb.scan(TableName=“table1”)
    for item in response['Items']:
        ...
        destination_ddb.put_item(TableName=“table2”, Item=item)
    …   

Here the scan API is used to read items from one Amazon DynamoDB table and the put_item API to save them to another DynamoDB table. The scan API implements the Pagination trait. However, the developer missed iterating on the results beyond the first scan, leading to only partial copying of data.

The following screenshot shows what CodeGuru Reviewer recommends:

The following screenshot shows CodeGuru Reviewer recommends on the need for pagination

The developer fixed the code based on this recommendation and added complete handling of paginated results by checking the LastEvaluatedKey value in the response object of the paginated API scan as follows:

def sync_ddb_table(source_ddb, destination_ddb):
    response = source_ddb.scan(TableName==“table1”)
    for item in response['Items']:
        ...
        destination_ddb.put_item(TableName=“table2”, Item=item)
    # Keeps scanning util LastEvaluatedKey is null
    while "LastEvaluatedKey" in response:
        response = source_ddb.scan(
            TableName="table1",
            ExclusiveStartKey=response["LastEvaluatedKey"]
        )
        for item in response['Items']:
            destination_ddb.put_item(TableName=“table2”, Item=item)
    …   

CodeGuru Reviewer recommendation is rich and offers multiple options for implementing Paginated scan. We can also initialize the ExclusiveStartKey value to None and iteratively update it based on the LastEvaluatedKey value obtained from the scan response object in a loop. This fix below conforms to the usage mentioned in the official documentation.

def sync_ddb_table(source_ddb, destination_ddb):
    table = source_ddb.Table(“table1”)
    scan_kwargs = {
                  …
    }
    done = False
    start_key = None
    while not done:
        if start_key:
            scan_kwargs['ExclusiveStartKey'] = start_key
        response = table.scan(**scan_kwargs)
        for item in response['Items']:
            destination_ddb.put_item(TableName=“table2”, Item=item)
        start_key = response.get('LastEvaluatedKey', None)
        done = start_key is None

Data structures and control flow

Python’s coding style is different from other languages. For code that does not conform to Python idioms, CodeGuru Reviewer provides a variety of suggestions for efficient and correct handling of data structures and control flow in the Python 3 standard library:

  • Using DefaultDict for compact handling of missing dictionary keys over using the setDefault() API or dealing with KeyError exception
  • Using a subprocess module over outdated APIs for subprocess handling
  • Detecting improper exception handling such as catching and passing generic exceptions that can hide latent issues.
  • Detecting simultaneous iteration and modification to loops that might lead to unexpected bugs because the iterator expression is only evaluated one time and does not account for subsequent index changes.

The following code is a specific example that can confuse novice developers.

def list_sns(region, creds, sns_topics=[]):
    sns = boto_session('sns', creds, region)
    response = sns.list_topics()
    for topic_arn in response["Topics"]:
        sns_topics.append(topic_arn["TopicArn"])
    return sns_topics
  
def process():
    ...
    for region, creds in jobs["auth_config"]:
        arns = list_sns(region, creds)
        ... 

The process() method iterates over different AWS Regions and collects Regional ARNs by calling the list_sns() method. The developer might expect that each call to list_sns() with a Region parameter returns only the corresponding Regional ARNs. However, the preceding code actually leaks the ARNs from prior calls to subsequent Regions. This happens due to an idiosyncrasy of Python relating to the use of mutable objects as default argument values. Python default value are created exactly one time, and if that object is mutated, subsequent references to the object refer to the mutated value instead of re-initialization.

The following screenshot shows what CodeGuru Reviewer recommends:

The following screenshot shows CodeGuru Reviewer recommends about initializing a value for mutable objects

The developer accepted the recommendation and issued the below fix.

def list_sns(region, creds, sns_topics=None):
    sns = boto_session('sns', creds, region)
    response = sns.list_topics()
    if sns_topics is None: 
        sns_topics = [] 
    for topic_arn in response["Topics"]:
        sns_topics.append(topic_arn["TopicArn"])
    return sns_topics

Resource leaks

A Pythonic practice for resource handling is using Context Managers. Our analysis shows that resource leaks are rampant in Python code where a developer may open external files or windows and forget to close them eventually. A resource leak can slow down or crash your system. Even if a resource is closed, using Context Managers is Pythonic. For example, CodeGuru Reviewer detects resource leaks in the following code:

def read_lines(file):
    lines = []
    f = open(file, ‘r’)
    for line in f:
        lines.append(line.strip(‘\n’).strip(‘\r\n’))
    return lines

The following screenshot shows that CodeGuru Reviewer recommends that the developer either use the ContextLib with statement or use a try-finally block to explicitly close a resource.

The following screenshot shows CodeGuru Reviewer recommend about fixing the potential resource leak

The developer accepted the recommendation and fixed the code as shown below.

def read_lines(file):
    lines = []
    with open(file, ‘r’) as f: 
        for line in f:
            lines.append(line.strip(‘\n’).strip(‘\r\n’))
    return lines

Secure coding practices

Python is often used for scripting. An integral part of such scripts is the use of subprocesses. As of this writing, CodeGuru Reviewer makes a limited, but important set of recommendations to make sure that your use of eval functions or subprocesses is secure from potential shell injections. It issues a warning if it detects that the command used in eval or subprocess scenarios might be influenced by external factors. For example, see the following code:

def execute(cmd):
    try:
        retcode = subprocess.call(cmd, shell=True)
        ...
    except OSError as e:
        ...

The following screenshot shows the CodeGuru Reviewer recommendation:

The following screenshot shows CodeGuru Reviewer recommends about potential shell injection vulnerability

The developer accepted this recommendation and made the following fix.

def execute(cmd):
    try:
        retcode = subprocess.call(shlex.quote(cmd), shell=True)
        ...
    except OSError as e:
        ...

As shown in the preceding recommendations, not only are the code issues detected, but a detailed recommendation is also provided on how to fix the issues, along with a link to the Python official documentation. You can provide feedback on recommendations in the CodeGuru Reviewer console or by commenting on the code in a pull request. This feedback helps improve the performance of Reviewer so that the recommendations you see get better over time.

Now let’s take a look at CodeGuru Profiler.

CodeGuru Profiler for Python

Amazon CodeGuru Profiler analyzes your application’s performance characteristics and provides interactive visualizations to show you where your application spends its time. These visualizations a. k. a. flame graphs are a powerful tool to help you troubleshoot which code methods have high latency or are over utilizing your CPU.

Thanks to the new Python agent, you can now use CodeGuru Profiler on your Python applications to investigate performance issues.

The following list summarizes the supported versions as of this writing.

  • AWS Lambda functions: Python3.8, Python3.7, Python3.6
  • Other environments: Python3.9, Python3.8, Python3.7, Python3.6

Onboarding your Python application

For this post, let’s assume you have a Python application running on Amazon Elastic Compute Cloud (Amazon EC2) hosts that you want to profile. To onboard your Python application, complete the following steps:

1. Create a new profiling group in CodeGuru Profiler console called ProfilingGroupForMyApplication. Give access to your Amazon EC2 execution role to submit to this profiling group. See the documentation for details about how to create a Profiling Group.

2. Install the codeguru_profiler_agent module:

pip3 install codeguru_profiler_agent

3. Start the profiler in your application.

An easy way to profile your application is to start your script through the codeguru_profiler_agent module. If you have an app.py script, use the following code:

python -m codeguru_profiler_agent -p ProfilingGroupForMyApplication app.py

Alternatively, you can start the agent manually inside the code. This must be done only one time, preferably in your startup code:

from codeguru_profiler_agent import Profiler

if __name__ == "__main__":
     Profiler(profiling_group_name='ProfilingGroupForMyApplication')
     start_application()    # your code in there....

Onboarding your Python Lambda function

Onboarding for an AWS Lambda function is quite similar.

  1. Create a profiling group called ProfilingGroupForMyLambdaFunction, this time we select “Lambda” for the compute platform. Give access to your Lambda function role to submit to this profiling group. See the documentation for details about how to create a Profiling Group.
  2. Include the codeguru_profiler_agent module in your Lambda function code.
  3. Add the with_lambda_profiler decorator to your handler function:
from codeguru_profiler_agent import with_lambda_profiler

@with_lambda_profiler(profiling_group_name='ProfilingGroupForMyLambdaFunction')
def handler_function(event, context):
      # Your code here

Alternatively, you can profile an existing Lambda function without updating the source code by adding a layer and changing the configuration. For more information, see Profiling your applications that run on AWS Lambda.

Profiling a Lambda function helps you see what is slowing down your code so you can reduce the duration, which reduces the cost and improves latency. You need to have continuous traffic on your function in order to produce a usable profile.

Viewing your profile

After running your profile for some time, you can view it on the CodeGuru console.

Screenshot of Flame graph visualization by CodeGuru Profiler

Each frame in the flame graph shows how much that function contributes to latency. In this example, an outbound call that crosses the network is taking most of the duration in the Lambda function, caching its result would improve the latency.

For more information, see Investigating performance issues with Amazon CodeGuru Profiler.

Supportability for CodeGuru Profiler is documented here.

If you don’t have an application to try CodeGuru Profiler on, you can use the demo application in the following GitHub repo.

Conclusion

This post introduced how to leverage CodeGuru Reviewer to identify hard-to-find code defects in various issue categories and how to onboard your Python applications or Lambda function in CodeGuru Profiler for CPU profiling. Combining both services can help you improve code quality for Python applications. CodeGuru is now available for you to try. For more pricing information, please see Amazon CodeGuru pricing.

 

About the Authors

Neela Sawant is a Senior Applied Scientist in the Amazon CodeGuru team. Her background is building AI-powered solutions to customer problems in a variety of domains such as software, multimedia, and retail. When she isn’t working, you’ll find her exploring the world anew with her toddler and hacking away at AI for social good.

 

 

Pierre Marieu is a Software Development Engineer in the Amazon CodeGuru Profiler team in London. He loves building tools that help the day-to-day life of other software engineers. Previously, he worked at Amadeus IT, building software for the travel industry.

 

 

 

Ran Fu is a Senior Product Manager in the Amazon CodeGuru team. He has a deep customer empathy, and love exploring who are the customers, what are their needs, and why those needs matter. Besides work, you may find him snowboarding in Keystone or Vail, Colorado.