Tag Archives: generative AI

Preview – Connect Foundation Models to Your Company Data Sources with Agents for Amazon Bedrock

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/preview-connect-foundation-models-to-your-company-data-sources-with-agents-for-amazon-bedrock/

In July, we announced the preview of agents for Amazon Bedrock, a new capability for developers to create generative AI applications that complete tasks. Today, I’m happy to introduce a new capability to securely connect foundation models (FMs) to your company data sources using agents.

With a knowledge base, you can use agents to give FMs in Bedrock access to additional data that helps the model generate more relevant, context-specific, and accurate responses without continuously retraining the FM. Based on user input, agents identify the appropriate knowledge base, retrieve the relevant information, and add the information to the input prompt, giving the model more context information to generate a completion.

Knowledge Base for Amazon Bedrock

Agents for Amazon Bedrock use a concept known as retrieval augmented generation (RAG) to achieve this. To create a knowledge base, specify the Amazon Simple Storage Service (Amazon S3) location of your data, select an embedding model, and provide the details of your vector database. Bedrock converts your data into embeddings and stores your embeddings in the vector database. Then, you can add the knowledge base to agents to enable RAG workflows.

For the vector database, you can choose between vector engine for Amazon OpenSearch Serverless, Pinecone, and Redis Enterprise Cloud. I’ll share more details on how to set up your vector database later in this post.

Primer on Retrieval Augmented Generation, Embeddings, and Vector Databases
RAG isn’t a specific set of technologies but a concept for providing FMs access to data they didn’t see during training. Using RAG, you can augment FMs with additional information, including company-specific data, without continuously retraining your model.

Continuously retraining your model is not only compute-intensive and expensive, but as soon as you’ve retrained the model, your company might have already generated new data, and your model has stale information. RAG addresses this issue by providing your model access to additional external data at runtime. Relevant data is then added to the prompt to help improve both the relevance and the accuracy of completions.

This data can come from a number of data sources, such as document stores or databases. A common implementation for document search is converting your documents, or chunks of the documents, into vector embeddings using an embedding model and then storing the vector embeddings in a vector database, as shown in the following figure.

Knowledge Base for Amazon Bedrock

The vector embedding includes the numeric representations of text data within your documents. Each embedding aims to capture the semantic or contextual meaning of the data. Each vector embedding is put into a vector database, often with additional metadata such as a reference to the original content the embedding was created from. The vector database then indexes the vectors, which can be done using a variety of approaches. This indexing enables quick retrieval of relevant data.

Compared to traditional keyword search, vector search can find relevant results without requiring an exact keyword match. For example, if you search for “What is the cost of product X?” and your documents say “The price of product X is […]”, then keyword search might not work because “price” and “cost” are two different words. With vector search, it will return the accurate result because “price” and “cost” are semantically similar; they have the same meaning. Vector similarity is calculated using distance metrics such as Euclidean distance, cosine similarity, or dot product similarity.

The vector database is then used within the prompt workflow to efficiently retrieve external information based on an input query, as shown in the figure below.

Knowledge Base for Amazon Bedrock

The workflow starts with a user input prompt. Using the same embedding model, you create a vector embedding representation of the input prompt. This embedding is then used to query the database for similar vector embeddings to return the most relevant text as the query result.

The query result is then added to the prompt, and the augmented prompt is passed to the FM. The model uses the additional context in the prompt to generate the completion, as shown in the following figure.

Knowledge Stores for Amazon Bedrock

Similar to the fully managed agents experience I described in the blog post on agents for Amazon Bedrock, the knowledge base for Amazon Bedrock manages the data ingestion workflow, and agents manage the RAG workflow for you.

Get Started with Knowledge Bases for Amazon Bedrock
You can add a knowledge base by specifying a data source, such as Amazon S3, select an embedding model, such as Amazon Titan Embeddings to convert the data into vector embeddings, and a destination vector database to store the vector data. Bedrock takes care of creating, storing, managing, and updating your embeddings in the vector database.

If you add knowledge bases to an agent, the agent will identify the appropriate knowledge base based on user input, retrieve the relevant information, and add the information to the input prompt, providing the model with more context information to generate a response, as shown in the figure below. All information retrieved from knowledge bases comes with source attribution to improve transparency and minimize hallucinations.

Knowledge Base for Amazon Bedrock

Let me walk you through those steps in more detail.

Create a Knowledge Base for Amazon Bedrock
Let’s assume you’re a developer at a tax consulting company and want to provide users with a generative AI application—a TaxBot—that can answer US tax filing questions. You first create a knowledge base that holds the relevant tax documents. Then, you configure an agent in Bedrock with access to this knowledge base and integrate the agent into your TaxBot application.

To get started, open the Bedrock console, select Knowledge base in the left navigation pane, then choose Create knowledge base.

Knowledge Base for Amazon Bedrock

Step 1 – Provide knowledge base details. Enter a name for the knowledge base and a description (optional). You also must select an AWS Identity and Access Management (IAM) runtime role with a trust policy for Amazon Bedrock, permissions to access the S3 bucket you want the knowledge base to use, and read/write permissions to your vector database. You can also assign tags as needed.

Knowledge Base for Amazon Bedrock

Step 2 – Set up data source. Enter a data source name and specify the Amazon S3 location for your data. Supported data formats include .txt, .md, .html, .doc and .docx, .csv, .xls and .xlsx, and .pdf files. You can also provide an AWS Key Management Service (AWS KMS) key to allow Bedrock to decrypt and encrypt your data and another AWS KMS key for transient data storage while Bedrock is converting your data into embeddings.

Choose the embedding model, such as Amazon Titan Embeddings – Text, and your vector database. For the vector database, as mentioned earlier, you can choose between vector engine for Amazon OpenSearch Serverless, Pinecone, or Redis Enterprise Cloud.

Knowledge Base for Amazon Bedrock

Important note on the vector database: Amazon Bedrock is not creating a vector database on your behalf. You must create a new, empty vector database from the list of supported options and provide the vector database index name as well as index field and metadata field mappings. This vector database will need to be for exclusive use with Amazon Bedrock.

Let me show you what the setup looks like for vector engine for Amazon OpenSearch Serverless. Assuming you’ve set up an OpenSearch Serverless collection as described in the Developer Guide and this AWS Big Data Blog post, provide the ARN of the OpenSearch Serverless collection, specify the vector index name, and the vector field and metadata field mapping.

Knowledge Base for Amazon Bedrock

The configuration for Pinecone and Redis Enterprise Cloud is similar. Check out this Pinecone blog post and this Redis Inc. blog post for more details on how to set up and prepare their vector database for Bedrock.

Step 3 – Review and create. Review your knowledge base configuration and choose Create knowledge base.

Knowledge Base for Amazon Bedrock

Back in the knowledge base details page, choose Sync for the newly created data source, and whenever you add new data to the data source, to start the ingestion workflow of converting your Amazon S3 data into vector embeddings and upserting the embeddings into the vector database. Depending on the amount of data, this whole workflow can take some time.

Knowledge Base for Amazon Bedrock

Next, I’ll show you how to add the knowledge base to an agent configuration.

Add a Knowledge Base to Agents for Amazon Bedrock
You can add a knowledge base when creating or updating an agent for Amazon Bedrock. Create an agent as described in this AWS News Blog post on agents for Amazon Bedrock.

For my tax bot example, I’ve created an agent called “TaxBot,” selected a foundation model, and provided these instructions for the agent in step 2: “You are a helpful and friendly agent that answers US tax filing questions for users.” In step 4, you can now select a previously created knowledge base and provide instructions for the agent describing when to use this knowledge base.

Knowledge Base for Amazon Bedrock

These instructions are very important as they help the agent decide whether or not a particular knowledge base should be used for retrieval. The agent will identify the appropriate knowledge base based on user input and available knowledge base instructions.

For my tax bot example, I added the knowledge base “TaxBot-Knowledge-Base” together with these instructions: “Use this knowledge base to answer tax filing questions.”

Once you’ve finished the agent configuration, you can test your agent and how it’s using the added knowledge base. Note how the agent provides a source attribution for information pulled from knowledge bases.

Knowledge Base for Amazon Bedrock

Generative AI with large language modelsLearn the Fundamentals of Generative AI
Generative AI with large language models (LLMs) is an on-demand, three-week course for data scientists and engineers who want to learn how to build generative AI applications with LLMs, including RAG. It’s the perfect foundation to start building with Amazon Bedrock. Enroll for generative AI with LLMs today.

Sign up to Learn More about Amazon Bedrock (Preview)
Amazon Bedrock is currently available in preview. Reach out through your usual AWS support contacts if you’d like access to knowledge bases for Amazon Bedrock as part of the preview. We’re regularly providing access to new customers. To learn more, visit the Amazon Bedrock Features page and sign up to learn more about Amazon Bedrock.

— Antje

AWS Weekly Roundup – Amazon MWAA, EMR Studio, Generative AI, and More – August 14, 2023

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-mwaa-emr-studio-generative-ai-and-more-august-14-2023/

While I enjoyed a few days off in California to get a dose of vitamin sea, a lot has happened in the AWS universe. Let’s take a look together!

Last Week’s Launches
Here are some launches that got my attention:

Amazon MWAA now supports Apache Airflow version 2.6Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed orchestration service for Apache Airflow that you can use to set up and operate end-to-end data pipelines in the cloud. Apache Airflow version 2.6 introduces important security updates and bug fixes that enhance the security and reliability of your workflows. If you’re currently running Apache Airflow version 2.x, you can now seamlessly upgrade to version 2.6.3. Check out this AWS Big Data Blog post to learn more.

Amazon EMR Studio adds support for AWS Lake Formation fine-grained access controlAmazon EMR Studio is a web-based integrated development environment (IDE) for fully managed Jupyter notebooks that run on Amazon EMR clusters. When you connect to EMR clusters from EMR Studio workspaces, you can now choose the AWS Identity and Access Management (IAM) role that you want to connect with. Apache Spark interactive notebooks will access only the data and resources permitted by policies attached to this runtime IAM role. When data is accessed from data lakes managed with AWS Lake Formation, you can enforce table and column-level access using policies attached to this runtime role. For more details, have a look at the Amazon EMR documentation.

AWS Security Hub launches 12 new security controls AWS Security Hub is a cloud security posture management (CSPM) service that performs security best practice checks, aggregates alerts, and enables automated remediation. With the newly released controls, Security Hub now supports three additional AWS services: Amazon Athena, Amazon DocumentDB (with MongoDB compatibility), and Amazon Neptune. Security Hub has also added an additional control against Amazon Relational Database Service (Amazon RDS). AWS Security Hub now offers 276 controls. You can find more information in the AWS Security Hub documentation.

Additional AWS services available in the AWS Israel (Tel Aviv) Region – The AWS Israel (Tel Aviv) Region opened on August 1, 2023. This past week, AWS Service Catalog, Amazon SageMaker, Amazon EFS, and Amazon Kinesis Data Analytics were added to the list of available services in the Israel (Tel Aviv) Region. Check the AWS Regional Services List for the most up-to-date availability information.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Here are some additional blog posts and news items that you might find interesting:

AWS recognized as a Leader in 2023 Gartner Magic Quadrant for Contact Center as a Service with Amazon Connect – AWS was named a Leader for the first time since Amazon Connect, our flexible, AI-powered cloud contact center, was launched in 2017. Read the full story here. 

Generate creative advertising using generative AI –  This AWS Machine Learning Blog post shows how to generate captivating and innovative advertisements at scale using generative AI. It discusses the technique of inpainting and how to seamlessly create image backgrounds, visually stunning and engaging content, and reducing unwanted image artifacts.

AWS open-source news and updates – My colleague Ricardo writes this weekly open-source newsletter in which he highlights new open-source projects, tools, and demos from the AWS Community.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

Build On AWS - Generative AIBuild On Generative AI – Your favorite weekly Twitch show about all things generative AI is back for season 2 today! Every Monday, 9:00 US PT, my colleagues Emily and Darko look at new technical and scientific patterns on AWS, inviting guest speakers to demo their work and show us how they built something new to improve the state of generative AI.

In today’s episode, Emily and Darko discussed the latest models LlaMa-2 and Falcon, and explored them in retrieval-augmented generation design patterns. You can watch the video here. Check out show notes and the full list of episodes on community.aws.

AWS NLP Conference 2023 – Join this in-person event on September 13–14 in London to hear about the latest trends, ground-breaking research, and innovative applications that leverage natural language processing (NLP) capabilities on AWS. This year, the conference will primarily focus on large language models (LLMs), as they form the backbone of many generative AI applications and use cases. Register here.

AWS Global Summits – The 2023 AWS Summits season is almost coming to an end with the last two in-person events in Mexico City (August 30) and Johannesburg (September 26).

AWS Community Days – Join a community-led conference run by AWS user group leaders in your region: West Africa (August 19), Taiwan (August 26), Aotearoa (September 6), Lebanon (September 9), and Munich (September 14).

AWS re:Invent 2023AWS re:Invent (November 27 – December 1) – Join us to hear the latest from AWS, learn from experts, and connect with the global cloud community. Registration is now open.

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Antje

P.S. We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Take this quick survey to share insights on your experience with the AWS Blog. Note that this survey is hosted by an external company, so the link doesn’t lead to our website. AWS handles your information as described in the AWS Privacy Notice.

AWS Week in Review – Agents for Amazon Bedrock, Amazon SageMaker Canvas New Capabilities, and More – July 31, 2023

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-week-in-review-agents-for-amazon-bedrock-amazon-sagemaker-canvas-new-capabilities-and-more-july-31-2023/

This July, AWS communities in ASEAN wrote a new history. First, the AWS User Group Malaysia recently held the first AWS Community Day in Malaysia.

Another significant milestone has been achieved by the AWS User Group Philippines. They just celebrated their tenth anniversary by running 2 days of AWS Community Day Philippines. Here are a few photos from the event, including Jeff Barr sharing his experiences attending AWS User Group meetup, in Manila, Philippines 10 years ago.

Big congratulations to AWS Community Heroes, AWS Community Builders, AWS User Group leaders and all volunteers who organized and delivered AWS Community Days! Also, thank you to everyone who attended and help support our AWS communities.

Last Week’s Launches
We had interesting launches last week, including from AWS Summit, New York. Here are some of my personal highlights:

(Preview) Agents for Amazon Bedrock – You can now create managed agents for Amazon Bedrock to handle tasks using API calls to company systems, understand user requests, break down complex tasks into steps, hold conversations to gather more information, and take actions to fulfill requests.

(Coming Soon) New LLM Capabilities in Amazon QuickSight Q – We are expanding the innovation in QuickSight Q by introducing new LLM capabilities through Amazon Bedrock. These Generative BI capabilities will allow organizations to easily explore data, uncover insights, and facilitate sharing of insights.

AWS Glue Studio support for Amazon CodeWhisperer – You can now write specific tasks in natural language (English) as comments in the Glue Studio notebook, and Amazon CodeWhisperer provides code recommendations for you.

(Preview) Vector Engine for Amazon OpenSearch Serverless – This capability empowers you to create modern ML-augmented search experiences and generative AI applications without the need to handle the complexities of managing the underlying vector database infrastructure.

Last week, Amazon SageMaker Canvas also released a set of new capabilities:

AWS Open-Source Updates
As always, my colleague Ricardo has curated the latest updates for open-source news at AWS. Here are some of the highlights.

cdk-aws-observability-accelerator is a set of opinionated modules to help you set up observability for your AWS environments with AWS native services and AWS-managed observability services such as Amazon Managed Service for Prometheus, Amazon Managed Grafana, AWS Distro for OpenTelemetry (ADOT) and Amazon CloudWatch.

iac-devtools-cli-for-cdk is a command line interface tool that automates many of the tedious tasks of building, adding to, documenting, and extending AWS CDK applications.

Upcoming AWS Events
There are upcoming events that you can join to learn. Let’s start with AWS events:

And let’s learn from our fellow builders and join AWS Community Days:

Open for Registration for AWS re:Invent
We want to be sure you know that AWS re:Invent registration is now open!

This learning conference hosted by AWS for the global cloud computing community will be held from November 27 to December 1, 2023, in Las Vegas.

Pro-tip: You can use information on the Justify Your Trip page to prove the value of your trip to AWS re:Invent trip.

Give Us Your Feedback
We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Please take this quick survey to share insights on your experience with the AWS Blog. Note that this survey is hosted by an external company, so the link does not lead to our website. AWS handles your information as described in the AWS Privacy Notice.

That’s all for this week. Check back next Monday for another Week in Review.

Happy building!


This post is part of our Week in Review series. Check back each week for a quick round-up of interesting news and announcements from AWS!

P.S. We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Please take this quick survey to share insights on your experience with the AWS Blog. Note that this survey is hosted by an external company, so the link does not lead to our website. AWS handles your information as described in the AWS Privacy Notice.

New – Amazon EC2 P5 Instances Powered by NVIDIA H100 Tensor Core GPUs for Accelerating Generative AI and HPC Applications

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-ec2-p5-instances-powered-by-nvidia-h100-tensor-core-gpus-for-accelerating-generative-ai-and-hpc-applications/

In March 2023, AWS and NVIDIA announced a multipart collaboration focused on building the most scalable, on-demand artificial intelligence (AI) infrastructure optimized for training increasingly complex large language models (LLMs) and developing generative AI applications.

We preannounced Amazon Elastic Compute Cloud (Amazon EC2) P5 instances powered by NVIDIA H100 Tensor Core GPUs and AWS’s latest networking and scalability that will deliver up to 20 exaflops of compute performance for building and training the largest machine learning (ML) models. This announcement is the product of more than a decade of collaboration between AWS and NVIDIA, delivering the visual computing, AI, and high performance computing (HPC) clusters across the Cluster GPU (cg1) instances (2010), G2 (2013), P2 (2016), P3 (2017), G3 (2017), P3dn (2018), G4 (2019), P4 (2020), G5 (2021), and P4de instances (2022).

Most notably, ML model sizes are now reaching trillions of parameters. But this complexity has increased customers’ time to train, where the latest LLMs are now trained over the course of multiple months. HPC customers also exhibit similar trends. With the fidelity of HPC customer data collection increasing and data sets reaching exabyte scale, customers are looking for ways to enable faster time to solution across increasingly complex applications.

Introducing EC2 P5 Instances
Today, we are announcing the general availability of Amazon EC2 P5 instances, the next-generation GPU instances to address those customer needs for high performance and scalability in AI/ML and HPC workloads. P5 instances are powered by the latest NVIDIA H100 Tensor Core GPUs and will provide a reduction of up to 6 times in training time (from days to hours) compared to previous generation GPU-based instances. This performance increase will enable customers to see up to 40 percent lower training costs.

P5 instances provide 8 x NVIDIA H100 Tensor Core GPUs with 640 GB of high bandwidth GPU memory, 3rd Gen AMD EPYC processors, 2 TB of system memory, and 30 TB of local NVMe storage. P5 instances also provide 3200 Gbps of aggregate network bandwidth with support for GPUDirect RDMA, enabling lower latency and efficient scale-out performance by bypassing the CPU on internode communication.

Here are the specs for these instances:

vCPUs Memory
Network Bandwidth
EBS Bandwidth
Local Storage
P5.48xlarge 192 2048 8 3200 80 8 x 3.84

Here’s a quick infographic that shows you how the P5 instances and NVIDIA H100 Tensor Core GPUs compare to previous instances and processors:

P5 instances are ideal for training and running inference for increasingly complex LLMs and computer vision models behind the most demanding and compute-intensive generative AI applications, including question answering, code generation, video and image generation, speech recognition, and more. P5 will provide up to 6 times lower time to train compared with previous generation GPU-based instances across those applications. Customers who can use lower precision FP8 data types in their workloads, common in many language models that use a transformer model backbone, will see further benefit at up to 6 times performance increase through support for the NVIDIA transformer engine.

HPC customers using P5 instances can deploy demanding applications at greater scale in pharmaceutical discovery, seismic analysis, weather forecasting, and financial modeling. Customers using dynamic programming (DP) algorithms for applications like genome sequencing or accelerated data analytics will also see further benefit from P5 through support for a new DPX instruction set.

This enables customers to explore problem spaces that previously seemed unreachable, iterate on their solutions at a faster clip, and get to market more quickly.

You can see the detail of instance specifications along with comparisons of instance types between p4d.24xlarge and new p5.48xlarge below:

Feature p4d.24xlarge p5.48xlarge Comparision
Number & Type of Accelerators 8 x NVIDIA A100 8 x NVIDIA H100
FP8 TFLOPS per Server 16,000 640% vs.A100 FP16
FP16 TFLOPS per Server 2,496 8,000
GPU Memory 40 GB 80 GB 200%
GPU Memory Bandwidth 12.8 TB/s 26.8 TB/s 200%
CPU Family Intel Cascade Lake AMD Milan
vCPUs 96  192 200%
Total System Memory 1152 GB 2048 GB 200%
Networking Throughput 400 Gbps 3200 Gbps 800%
EBS Throughput 19 Gbps 80 Gbps 400%
Local Instance Storage 8 TBs NVMe 30 TBs NVMe 375%
GPU to GPU Interconnect 600 GB/s 900 GB/s 150%

Second-generation Amazon EC2 UltraClusters and Elastic Fabric Adaptor
P5 instances provide market-leading scale-out capability for multi-node distributed training and tightly coupled HPC workloads. They offer up to 3,200 Gbps of networking using the second-generation Elastic Fabric Adaptor (EFA) technology, 8 times compared with P4d instances.

To address customer needs for large-scale and low latency, P5 instances are deployed in the second-generation EC2 UltraClusters, which now provide customers with lower latency across up to 20,000+ NVIDIA H100 Tensor Core GPUs. Providing the largest scale of ML infrastructure in the cloud, P5 instances in EC2 UltraClusters deliver up to 20 exaflops of aggregate compute capability.

EC2 UltraClusters use Amazon FSx for Lustre, fully managed shared storage built on the most popular high-performance parallel file system. With FSx for Lustre, you can quickly process massive datasets on demand and at scale and deliver sub-millisecond latencies. The low-latency and high-throughput characteristics of FSx for Lustre are optimized for deep learning, generative AI, and HPC workloads on EC2 UltraClusters.

FSx for Lustre keeps the GPUs and ML accelerators in EC2 UltraClusters fed with data, accelerating the most demanding workloads. These workloads include LLM training, generative AI inferencing, and HPC workloads, such as genomics and financial risk modeling.

Getting Started with EC2 P5 Instances
To get started, you can use P5 instances in the US East (N. Virginia) and US West (Oregon) Region.

When launching P5 instances, you will choose AWS Deep Learning AMIs (DLAMIs) to support P5 instances. DLAMI provides ML practitioners and researchers with the infrastructure and tools to quickly build scalable, secure distributed ML applications in preconfigured environments.

You will be able to run containerized applications on P5 instances with AWS Deep Learning Containers using libraries for Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service  (Amazon EKS).  For a more managed experience, you can also use P5 instances via Amazon SageMaker, which helps developers and data scientists easily scale to tens, hundreds, or thousands of GPUs to train a model quickly at any scale without worrying about setting up clusters and data pipelines. HPC customers can leverage AWS Batch and ParallelCluster with P5 to help orchestrate jobs and clusters efficiently.

Existing P4 customers will need to update their AMIs to use P5 instances. Specifically, you will need to update your AMIs to include the latest NVIDIA driver with support for NVIDIA H100 Tensor Core GPUs. They will also need to install the latest CUDA version (CUDA 12), CuDNN version, framework versions (e.g., PyTorch, Tensorflow), and EFA driver with updated topology files. To make this process easy for you, we will provide new DLAMIs and Deep Learning Containers that come prepackaged with all the needed software and frameworks to use P5 instances out of the box.

Now Available
Amazon EC2 P5 instances are available today in AWS Regions: US East (N. Virginia) and US West (Oregon). For more information, see the Amazon EC2 pricing page. To learn more, visit our P5 instance page and explore AWS re:Post for EC2 or through your usual AWS Support contacts.

You can choose a broad range of AWS services that have generative AI built in, all running on the most cost-effective cloud infrastructure for generative AI. To learn more, visit Generative AI on AWS to innovate faster and reinvent your applications.


Preview – Enable Foundation Models to Complete Tasks With Agents for Amazon Bedrock

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/preview-enable-foundation-models-to-complete-tasks-with-agents-for-amazon-bedrock/

This April, Swami Sivasubramanian, Vice President of Data and Machine Learning at AWS, announced Amazon Bedrock and Amazon Titan models as part of new tools for building with generative AI on AWS. Amazon Bedrock, currently available in preview, is a fully managed service that makes foundation models (FMs) from Amazon and leading AI startups—such as AI21 Labs, Anthropic, Cohere, and Stability AI—available through an API.

Today, I’m excited to announce the preview of agents for Amazon Bedrock, a new capability for developers to create fully managed agents in a few clicks. Agents for Amazon Bedrock accelerate the delivery of generative AI applications that can manage and perform tasks by making API calls to your company systems. Agents extend FMs to understand user requests, break down complex tasks into multiple steps, carry on a conversation to collect additional information, and take actions to fulfill the request.

Agents for Amazon Bedrock

Using agents for Amazon Bedrock, you can automate tasks for your internal or external customers, such as managing retail orders or processing insurance claims. For example, an agent-powered generative AI e-commerce application can not only respond to the question, “Do you have this jacket in blue?” with a simple answer but can also help you with the task of updating your order or managing an exchange.

For this to work, you first need to give the agent access to external data sources and connect it to existing APIs of other applications. This allows the FM that powers the agent to interact with the broader world and extend its utility beyond just language processing tasks. Second, the FM needs to figure out what actions to take, what information to use, and in which sequence to perform these actions. This is possible thanks to an exciting emerging behavior of FMs—their ability to reason. You can show FMs how to handle such interactions and how to reason through tasks by building prompts that include definitions and instructions. The process of designing prompts to guide the model towards desired outputs is known as prompt engineering.

Introducing Agents for Amazon Bedrock
Agents for Amazon Bedrock automate the prompt engineering and orchestration of user-requested tasks. Once configured, an agent automatically builds the prompt and securely augments it with your company-specific information to provide responses back to the user in natural language. The agent is able to figure out the actions required to automatically process user-requested tasks. It breaks the task into multiple steps, orchestrates a sequence of API calls and data lookups, and maintains memory to complete the action for the user.

With fully managed agents, you don’t have to worry about provisioning or managing infrastructure. You’ll have seamless support for monitoring, encryption, user permissions, and API invocation management without writing custom code. As a developer, you can use the Bedrock console or SDK to upload the API schema. The agent then orchestrates the tasks with the help of FMs and performs API calls using AWS Lambda functions.

Primer on Advanced Reasoning and ReAct
You can help FMs to reason and figure out how to solve user-requested tasks with a reasoning technique called ReAct (synergizing reasoning and acting). Using ReAct, you can structure prompts to show an FM how to reason through a task and decide on actions that help find a solution. The structured prompts include a sequence of question-thought-action-observation examples.

The question is the user-requested task or problem to solve. The thought is a reasoning step that helps demonstrate to the FM how to tackle the problem and identify an action to take. The action is an API that the model can invoke from an allowed set of APIs. The observation is the result of carrying out the action. The actions that the FM is able to choose from are defined by a set of instructions that are prepended to the example prompt text. Here is an illustration of how you would build up a ReAct prompt:

Building up a ReAct prompt

The good news is that Bedrock performs the heavy lifting for you! Behind the scenes, agents for Amazon Bedrock build the prompts based on the information and actions you provide.

Now, let me show you how to get started with agents for Amazon Bedrock.

Create an Agent for Amazon Bedrock
Let’s assume you’re a developer at an insurance company and want to provide a generative AI application that helps the insurance agency owners automate repetitive tasks. You create an agent in Bedrock and integrate it into your application.

To get started with the agent, open the Bedrock console, select Agents in the left navigation panel, then choose Create Agent.

Agents for Amazon Bedrock

This starts the agent creation workflow.

  1. Provide agent details including agent name, description (optional), whether the agent is allowed to request additional user inputs, and the AWS Identity and Access Management (IAM) service role that gives your agent access to other required services, such as Amazon Simple Storage Service (Amazon S3) and AWS Lambda.Agents for Amazon Bedrock
  2. Select a foundation model from Bedrock that fits your use case. Here, you provide an instruction to your agent in natural language. The instruction tells the agent what task it’s supposed to perform and the persona it’s supposed to assume. For example, “You are an agent designed to help with processing insurance claims and managing pending paperwork.”Agents for Amazon Bedrock
  3. Add action groups. An action is a task that the agent can perform automatically by making API calls to your company systems. A set of actions is defined in an action group. Here, you provide an API schema that defines the APIs for all the actions in the group. You also must provide a Lambda function that represents the business logic for each API. For example, let’s define an action group called ClaimManagementActionGroup that manages insurance claims by pulling a list of open claims, identifying outstanding paperwork for each claim, and sending reminders to policy holders. Make sure to capture this information in the action group description. Agents for Amazon BedrockThe business logic for my action group is captured in the Lambda function InsuranceClaimsLambda. This AWS Lambda function implements methods for the following API calls: open-claims, identify-missing-documents, and send-reminders.Here’s a short extract from my OrderManagementLambda:
    import json
    import time
    def open_claims():
    def identify_missing_documents(parameters):
    def send_reminders():
    def lambda_handler(event, context):
        responses = []
        for prediction in event['actionGroups']:
            response_code = ...
            action = prediction['actionGroup']
            api_path = prediction['apiPath']
            if api_path == '/claims':
                body = open_claims() 
            elif api_path == '/claims/{claimId}/identify-missing-documents':
    			parameters = prediction['parameters']
                body = identify_missing_documents(parameters)
            elif api_path == '/send-reminders':
                body =  send_reminders()
                body = {"{}::{} is not a valid api, try another one.".format(action, api_path)}
            response_body = {
                'application/json': {
                    'body': str(body)
            action_response = {
                'actionGroup': prediction['actionGroup'],
                'apiPath': prediction['apiPath'],
                'httpMethod': prediction['httpMethod'],
                'httpStatusCode': response_code,
                'responseBody': response_body
        api_response = {'response': responses}
        return api_response

    Note that you also must provide an API schema in the OpenAPI schema JSON format. Here’s what my API schema file insurance_claim_schema.json looks like:

    {"openapi": "3.0.0",
        "info": {
            "title": "Insurance Claims Automation API",
            "version": "1.0.0",
            "description": "APIs for managing insurance claims by pulling a list of open claims, identifying outstanding paperwork for each claim, and sending reminders to policy holders."
        "paths": {
            "/claims": {
                "get": {
                    "summary": "Get a list of all open claims",
                    "description": "Get the list of all open insurance claims. Return all the open claimIds.",
                    "operationId": "getAllOpenClaims",
                    "responses": {
                        "200": {
                            "description": "Gets the list of all open insurance claims for policy holders",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "array",
                                        "items": {
                                            "type": "object",
                                            "properties": {
                                                "claimId": {
                                                    "type": "string",
                                                    "description": "Unique ID of the claim."
                                                "policyHolderId": {
                                                    "type": "string",
                                                    "description": "Unique ID of the policy holder who has filed the claim."
                                                "claimStatus": {
                                                    "type": "string",
                                                    "description": "The status of the claim. Claim can be in Open or Closed state"
            "/claims/{claimId}/identify-missing-documents": {
                "get": {
                    "summary": "Identify missing documents for a specific claim",
                    "description": "Get the list of pending documents that need to be uploaded by policy holder before the claim can be processed. The API takes in only one claim id and returns the list of documents that are pending to be uploaded by policy holder for that claim. This API should be called for each claim id",
                    "operationId": "identifyMissingDocuments",
                    "parameters": [{
                        "name": "claimId",
                        "in": "path",
                        "description": "Unique ID of the open insurance claim",
                        "required": true,
                        "schema": {
                            "type": "string"
                    "responses": {
                        "200": {
                            "description": "List of documents that are pending to be uploaded by policy holder for insurance claim",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "object",
                                        "properties": {
                                            "pendingDocuments": {
                                                "type": "string",
                                                "description": "The list of pending documents for the claim."
            "/send-reminders": {
                "post": {
                    "summary": "API to send reminder to the customer about pending documents for open claim",
                    "description": "Send reminder to the customer about pending documents for open claim. The API takes in only one claim id and its pending documents at a time, sends the reminder and returns the tracking details for the reminder. This API should be called for each claim id you want to send reminders for.",
                    "operationId": "sendReminders",
                    "requestBody": {
                        "required": true,
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "claimId": {
                                            "type": "string",
                                            "description": "Unique ID of open claims to send reminders for."
                                        "pendingDocuments": {
                                            "type": "string",
                                            "description": "The list of pending documents for the claim."
                                    "required": [
                    "responses": {
                        "200": {
                            "description": "Reminders sent successfully",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "object",
                                        "properties": {
                                            "sendReminderTrackingId": {
                                                "type": "string",
                                                "description": "Unique Id to track the status of the send reminder Call"
                                            "sendReminderStatus": {
                                                "type": "string",
                                                "description": "Status of send reminder notifications"
                        "400": {
                            "description": "Bad request. One or more required fields are missing or invalid."

    When a user asks your agent to complete a task, Bedrock will use the FM you configured for the agent to identify the sequence of actions and invoke the corresponding Lambda functions in the right order to solve the user-requested task.

  4. In the final step, review your agent configuration and choose Create Agent.Agents for Amazon Bedrock
  5. Congratulations, you’ve just created your first agent in Amazon Bedrock!Agents for Amazon Bedrock

Deploy an Agent for Amazon Bedrock
To deploy an agent in your application, you must create an alias. Bedrock then automatically creates a version for that alias.

  1. In the Bedrock console, select your agent, then select Deploy, and choose Create to create an alias.Agents for Amazon Bedrock
  2. Provide an alias name and description and choose whether to create a new version or use an existing version of your agent to associate with this alias.
    Agents for Amazon Bedrock
  3. This saves a snapshot of the agent code and configuration and associates an alias with this snapshot or version. You can use the alias to integrate the agent into your applications.
    Agents for Amazon Bedrock

Now, let’s test the insurance agent! You can do this right in the Bedrock console.

Let’s ask the agent to “Send reminder to all policy holders with open claims and pending paper work.” You can see how the FM-powered agent is able to understand the user request, break down the task into steps (collect the open insurance claims, lookup the claim IDs, send reminders), and perform the corresponding actions.

Agents for Amazon Bedrock

Agents for Amazon Bedrock can help you increase productivity, improve your customer service experience, or automate DevOps tasks. I’m excited to see what use cases you will implement!

Generative AI with large language modelsLearn the Fundamentals of Generative AI
If you’re interested in the fundamentals of generative AI and how to work with FMs, including advanced prompting techniques and agents, check out this this new hands-on course that I developed with AWS colleagues and industry experts in collaboration with DeepLearning.AI:

Generative AI with large language models (LLMs) is an on-demand, three-week course for data scientists and engineers who want to learn how to build generative AI applications with LLMs. It’s the perfect foundation to start building with Amazon Bedrock. Enroll for generative AI with LLMs today.

Sign up to Learn More about Amazon Bedrock (Preview)
Amazon Bedrock is currently available in preview. Reach out to us if you’d like access to agents for Amazon Bedrock as part of the preview. We’re regularly providing access to new customers. Visit the Amazon Bedrock Features page and sign up to learn more about Amazon Bedrock.

— Antje

P.S. We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Please take this quick survey to share insights on your experience with the AWS Blog. Note that this survey is hosted by an external company, so the link does not lead to our website. AWS handles your information as described in the AWS Privacy Notice.

How to build a GPT-3 App with Nextjs, React, and GitHub Copilot

Post Syndicated from Kedasha Kerr original https://github.blog/2023-07-25-how-to-build-a-gpt-3-app-with-nextjs-react-and-github-copilot/

At the beginning of the year, I started working out with a trainer who wanted me to start tracking my food, but I’ve always been super against tracking my meals because it just doesn’t work for me. Instead of tracking my meals however, I decided to build an application that automagically tells me the nutritional information of any recipe. But to do that I needed some pretty complex natural language parsing capabilities so I figured this would be a great opportunity for me to play around with OpenAI and get to use GitHub Copilot a little bit more to help me build the app quickly.

GitHub Copilot is a great example of a product that takes advantage of Large Language Models (LLM) to solve problems for people and improve their productivity. In this blog,I’ll take you through how I created my own application that finds the nutritional information for any recipe using OpenAI’s GPT-3.5-turbo model, GitHub Copilot, Next.js, React, and Material UI.

Let’s dig right into the tutorial.

1. Create a repository and install dependencies

To get started, let’s create a new repository from the GitHub Codespaces Next.js template to get up and running quickly. Go to this repository, and make a copy. To do so, click on the green “Use this template” button then select “Create a new repository” and name your repository whatever you like. I called mine “mealmetrics-copilot.”

Now, clone the repository to your local machine, and open up the repository in your preferred code editor. I’m using VS Code.

Open up your terminal and cd into the project so we can install a few needed dependencies. In your terminal run the following command:

npm i express openai dotenv @material-ui/core @material-ui/icons

Then, install the following as a dev dependency:

npm i --save-dev nodemon

Once everything installs successfully, we’re ready to start building the server, but first, let’s grab our api key from OpenAI.

2. Getting your OpenAI API key

Go to OpenAI’s developer login page and create a new account or sign in if you already have one. Once you’ve logged in, click your name in the upper right hand corner and select “view API keys.” Click the “Create a new secret key” button. From there you can name your apikey, click the green button to generate the key and then copy the apikey and save it in a secure location (such as a password manager).

Save your apikey in a .env file in vscode at the root of the project and add .env to the gitignore file.

3. Install GitHub Copilot extension

We’ll be using GitHub Copilot as our assistant to build this application. If you’re not familiar with GitHub Copilot, read this blog post to learn more.

In your code editor of choice, go to your extensions panel and search for GitHub Copilot—I’m using VSCode and this is what that looks like.

Click the install button then click the login button to authenticate your access. Once that’s done, you’ll be ready to get started and follow along!

4. Building the server

Now that we have our apikey, have installed dependencies, and have GitHub Copilot in our code editor, let’s dig into building the application!

The first thing we’re going to do is build a simple server with Express.js (if you prefer to use Fastify, NestJS, Koa or something else, feel free to use them!).

In the pages folder, create a folder called api and then a file called server.js. This is where we’ll add our prompting information for GitHub Copilot. Let’s add our first prompt as a comment in the server.js file that says the following:

Create a server with the following specifications:

1. import express and dotenv node modules
3. create the server with express and name it app
4. use port 8080 as default port
5. enable body parser to accept json data
6. state which port the server is listening to and log it to the console

Hit the enter key and GitHub Copilot will start generating suggestions to build the server. To accept the suggestions, hit the tab key. Take a look at this video to see what accepting suggestions look like.

We can update the package.json file to include the script devserver: "nodemon pages/api/server.js then run the command in our terminal using npm run devserver. You’ll see that the server has started!

After we build the simple server, let’s move on to making this a bit more complex by building the controller for our app.

Let’s create a new file in the api folder called generateInfo.js and add the following comment in the file:

Create a controller with the following specifications:

1. import the Configuration class and the OpenAIApi class from the openai npm module
2. create a new configuration object that includes the api key and uses the Configuration class from the openai module
3. create a new instance of the OpenAIApi class and pass in the configuration object
4. create an async function called generateInfo that accepts a request and response object as parameters
5. use try to make a request to the OpenAI completetion api and return the response
6. use catch to catch any errors and return the error include a message to the user
7. export the generateInfo function as a module

As you’ll notice, I’m being very explicit in my instructions to GitHub Copilot. One thing to always remember when working with LLMs is that the magic is in the prompt—the clearer you are in your instructions, the better the results you’ll get.

Hit enter on your keyboard and then hit tab to accept the recommendations that GitHub Copilot provides. In the image below, you’ll notice that Copilot’s suggestions are gray.

Accept the suggestion by hitting tab on your keyboard and let’s do some cleaning up.

Remember, GitHub Copilot is our assistant, so we still need to ensure that the suggestions it provides meet our requirements.

Since we’re building a gpt-3 application, we’ll be using the completion API from OpenAI and the gpt-3.5-turbo model to generate nutritional information for us.

If you look at the suggestion above, we were provided with the davinci engine and parameters that are not needed for this project—we also need the messages parameter to send requests with the 3.5-turbo model.

We also want to add the recipe prompt, create a function called recipe that represents the recipe a user inputs and also update the completion object sent to OpenAI. The keys we’ll be using are max_tokens, prompt, model, temperature and n. Learn more about these parameters by reading OpenAI’s api docs.

Let’s also update the error message to include 401 just in case our api key is invalid. So, let’s make these updates.

1. Add recipe prompt

Create a new folder called data at the root of your project, then create a file called prompt.json. This will contain a part of the recipe prompt that we’ll send to OpenAI. Add the following script to the prompt.json file:

"recipePrompt": "I want you to act as a Nutrition Facts Generator. I will provide you with a recipe and your role is to generate nutrition facts for that recipe. You should use your knowledge of nutrition science, nutrition facts labels and other relevant information to generate nutritional information for the recipe. Add each nutrition fact to a new line. I want you to only reply with the nutrition fact. Do not provide any other information. My first request is: "

Then import the recipePrompt to the generateInfo.js file and update the main function to grab the recipe submitted by the user.

// add the prompt to the top of the file
const { recipePrompt } = require('../../data/recipe.json');

// update this function to include the recipe before the try
const generateInfo = async(req, res) => {
const { recipe } = req.body

2. Update the completion function and response

Now, let’s update the try to look a bit more like what we want.

model: "gpt-3.5-turbo",
messages: [{ role: "user", content: `${recipePrompt}${recipe}` }],
max_tokens: 200,
temperature: 0,
n: 1,

And let’s also update the response that we receive.

const response = completion.data.choices[0].message.content;

return res.status(200).json({
success: true,
data: response,

3. Update the error message

Finally, let’s update the catch to have more explicit error messages.

catch (error) {
if (error.response.status === 401) {
return res.status(401).json({
error: "Please provide a valid API key.",
return res.status(500).json({
"An error occurred while generating recipe information. Please try again later.",

Once we’ve updated everything, our controller function should look like this:

const { Configuration, OpenAIApi } = require("openai");
const { recipePrompt } = require("../../data/recipe.json");

const config = new Configuration({
apiKey: process.env.OPENAI_API_KEY,

const openai = new OpenAIApi(config);

const generateInfo = async (req, res) => {
const { recipe } = req.body;

try {
const completion = await openai.createChatCompletion({
model: "gpt-3.5-turbo",
messages: [{ role: "user", content: `${recipePrompt}${recipe}` }],
max_tokens: 200,
temperature: 0,
n: 1,
const response = completion.data.choices[0].message.content;

return res.status(200).json({
success: true,
data: response,
} catch (error) {
if (error.response.status === 401) {
return res.status(401).json({
error: "Please provide a valid API key.",
return res.status(500).json({
"An error occurred while generating recipe information. Please try again later.",

module.exports = { generateInfo };

Now, let’s create the router and test this out in Postman. Create a new file called router.js and start typing to allow GitHub Copilot to assist you while typing.

As you can see, GitHub Copilot offered suggestions while I was typing and I hit the tab button to accept the suggestions. Add the newly created router to the server.js file and test the route in postman. Add the following to your server.js file.

app.use('/openai', require('./router'));

Now, let’s test the route in postman with a POST request—make sure your server is still running!

Go to the URL below and add any recipe to the body of the request:

URL: http://localhost:8080/openai/generateinfo

"recipe": "1 cup of all purpose flour, sifted 1 1/2 teaspoon baking powder 1/4 teaspoon salt 2 Tablespoon granulated sugar 1/2 Tablespoon unsalted butter, room temperature Approximately 1/3 cup water"

You should receive a successful response that looks like this:

"success": true,
"data": "\n\nCalories: 112 \nTotal Fat: 2.3g \nSaturated Fat: 1.3g \nTrans Fat: 0g \nCholesterol: 5.4mg \nSodium: 175.8mg \nTotal Carbohydrates: 21.6g \nDietary Fiber: 0.8g \nSugars: 6.5g \nProtein: 2.6g"

Here is the data that we submitted to OpenAI that generated that successful response:

data: '{"model":"text-davinci-003","prompt":"I want you to act as a Nutrition Facts Generator. I will provide you with a recipe and your role is to generate nutrition facts for that recipe. You should use your knowledge of nutrition science, nutrition facts labels and other relevant information to generate nutritional information for the recipe. Add each nutrition fact to a new line. I want you to only reply with the nutrition fact. Do not provide any other information. My first request is: 1 cup of all purpose flour, sifted 1 1/2 teaspoon baking powder 1/4 teaspoon salt 2 Tablespoon granulated sugar 1/2 Tablespoon unsalted butter, room temperature Approximately 1/3 cup water","max_tokens":200,"temperature":0.5,"n":1}',

As you can see, both the prompt script and the recipe entered in the request body was sent to OpenAI, which generated the nutritional data for this Jamaican fried dumpling recipe.

Now, let’s create the frontend of the application to display the info on the web.

5. Building the frontend app

We’ll be using React for the frontend. Delete all the code in the index.js file that currently exists in the project. Then, in a comment, instruct GitHub Copilot to build a simple text area.

Create a text area with the following specifications:
1. a H1 with the text "Find Nutrition Facts for any recipe"
2. a text area for users to upload recipe
3. a button for users to submit the entered recipe
4. a section at the bottom to display nutrition facts
5. Get the data from this link: http://localhost:8080/openai/generateinfo
6. Name the component RecipeInfo

GitHub Copilot quickly generated the code for us.

Let’s hit tab to accept the suggestion and run npm run dev in your terminal to see the app on the web. When you go to localhost:3000, you’ll see the following displayed on the web.

Admittedly, it’s not the most beautiful thing, but we were able to spin this up very quickly.

In less than one minute we completed a functional frontend mvp of our application with GitHub Copilot.

Let’s enter the recipe we have into the text box and see the response that we get back.

And we have our first error—which is not surprising since we didn’t validate the code that was provided. Let’s look into the console and see if we have any additional details.

Seems it’s a CORS issue—classic. Let’s ask GitHub Copilot how to resolve this.

Add the following questions as a comment anywhere in your file:

q: how do I resolve the CORS error?
q: how do I add Access-Control-Allow-Origin to the header?

The question and response should look something like this:

We can also ask GitHub Copilot Chat how to resolve CORS errors and it gives us a seamless response.

Let’s install the cors middleware and add it to the server.js file.

const cors = require("cors");

// Allow cross-origin requests (CORS)

Then, let’s update our router.js file.

router.options("/generateInfo", (req, res) => {
res.setHeader("Access-Control-Allow-Origin", "*");
res.setHeader("Access-Control-Allow-Headers", "*");
res.setHeader("Access-Control-Allow-Methods", "*");

Now, let’s try fetching nutritional data again and see what happens.

And we have another error. Progress!

This time there’s no need to debug with GitHub Copilot since we’re being told that the data being returned is an object with the keys success and data. Let’s change the name of the response function to recipeInfo and update the nutrition state to receive recipeInfo.data.

const recipeInfo = await response.json();

Let’s try sending the recipe again and hope for a successful response.


We just created a GPT-3 app in record time with GitHub Copilot, React, Next.js, and OpenAI. Now that we have the data that we need, let’s make the application more beautiful with Material UI.

6. Styling the app with Material UI

In this section, we’ll be using a GitHub Copilot X feature in technical preview for individuals and in public beta for organizationsCopilot Chat–to improve the appearance of our application. You must have GitHub Copilot access to be on the Copilot Chat’s waitlist which is currently open. Sign up today if you haven’t yet!

Let’s ask GitHub Copilot chat how we can implement material-ui into the application:

Let’s go ahead and implement the suggestions and see what happens, and also ask GitHub Copilot chat to implement a header for us.

After we implement the header, new text area and center the content, this is what the app is looking like.

Ok, we’re getting somewhere.

Let’s make a few more updates with the assistance of GitHub Copilot Chat. I’ve included the prompt/questions I asked:

  • Make the text area larger and implement Material UI
update the component to use material ui with the content centered and the buttoned positioned below the text area. use Grid from material ui and any other components needed.
  • Add the paper component from Material UI to elevate the look and feel of the app
add the Paper component from material ui to the text area highlighted
  • Add a second button that clears the text area + facts after a recipe is submitted
add a button to the app to clear the text in the textarea
  • Add a loader while waiting for the data to load
add a loader to the highlighted code that checks if the data is loading. If the data is loading, then display the text "Nutrition Facts" and loader, if there is an error, display the error message otherwise, display nothing
  • Add a theme with custom primary and secondary colors
how do I create a custom theme with material ui and where do I create the custom theme?
  • Prevent the text area from going over the paper component from Material UI
in the highlihghted code how do I prevent the text area line from going over the paper component?
  • Add a new component—footer
Create a footer component with the following specifications:
1. The footer must be fixed at the bottom of the page
2. Use the Paper component from Material UI
3. Use the Typography component from Material UI
3. The text must say "Made with ❤ by LadyKerr & GitHub Copilot" and "Powered by OpenAI"
4. The text "GitHub Copilot" must be a link to https://copilot.github.com/ that opens in a new tab with alt text " GitHub Copilot"
5. The text "OpenAI" must be a link to https://openai.com/ that opens in a new tab with alt text "openai api"

Ask GitHub Copilot chat to make the changes above and, once it’s done, your application should be looking and functioning like this:

Now, let’s create a new component that splits each nutrition fact onto a new line. Create a new file called NutritionFacts.js in the components folder, and write the following as a comment so GitHub Copilot can get to work.

Create a component with the following specifications:
1. the component must split the received string data at /n/n or /n and return a Typography component for each string
2. the component must set a unique key for each Typography component
3. the component must return a div with the Typography components
4. the component must return null if the data is not a string
5. Name the component NutritionFacts
6. Use the Paper Component from Material UI
7. Add text above the data that says "Here are the nutrition facts for your recipe:"

From this prompt, GitHub Copilot generated the following code.

import React from "react";
import { Typography, Paper } from "@material-ui/core";

const NutritionFacts = ({ data }) => {
if (typeof data !== "string") {
return null;
const nutritionFacts = data.split(/\n\n|\n/);
return (

Here are the nutrition facts for your recipe:

{nutritionFacts.map((fact, index) => (

export default NutritionFacts;

Brilliant! Let’s import this component into our main index.js file and test the application to see if each fact was split on to a new line as expected.

And it did. Our app is functioning as expected.

Now, let’s move the code for the header into a new file in the components folder called Header.js. Once everything is updated, the final application looks like this and returns the nutritional data for any recipe.

So, there we have it!

We just built an application using GitHub Copilot, OpenAI, React, Next.js, and GitHub Copilot Chat. The next step would be to deploy the application on GitHub Pages and deploy your server on a service like Azure.

You can see the full code here—feel free to clone or fork the project and make it your own. This was a fun little project to build and I hope you learned something new and feel inspired to create your own GPT-3 app!

Learn more about prompting GitHub Copilot by reading How to use GitHub Copilot: Prompts, tips and use cases and A Developer’s Guide to Prompt Engineering and LLMs.

Until next time, happy coding!

AWS Week in Review – AWS Glue Crawlers Now Supports Apache Iceberg, Amazon RDS Updates, and More – July 10, 2023

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/aws-week-in-review-aws-glue-crawlers-now-supports-apache-iceberg-amazon-rds-updates-and-more-july-10-2023/

The US celebrated Independence Day last week on July 4 with fireworks and barbecues across the country. But fireworks weren’t the only thing that launched last week. Let’s have a look!

Last Week’s Launches
Here are some launches that got my attention:

AWS GlueAWS Glue Crawlers now supports Apache Iceberg tables. Apache Iceberg is an open-source table format for data stored in data lakes. You can now automatically register Apache Iceberg tables into AWS Glue Data Catalog by running the Glue Crawler. You can then query Glue Catalog Iceberg tables across various analytics engines and apply AWS Lake Formation fine-grained permissions when querying from Amazon Athena. Check out the AWS Glue Crawler documentation to learn more.

Amazon Relational Database Service (Amazon RDS) for PostgreSQL – PostgreSQL 16 Beta 2 is now available in the Amazon RDS Database Preview Environment. The PostgreSQL community released PostgreSQL 16 Beta 2 on June 29, 2023, which enables logical replication from standbys and includes numerous performance improvements. You can deploy PostgreSQL 16 Beta 2 in the preview environment and start evaluating the pre-release of PostgreSQL 16 on Amazon RDS for PostgreSQL.

In addition, Amazon RDS for PostgreSQL Multi-AZ Deployments with two readable standbys now supports logical replication. With logical replication, you can stream data changes from Amazon RDS for PostgreSQL to other databases for use cases such as data consolidation for analytical applications, change data capture (CDC), replicating select tables rather than the entire database, or for replicating data between different major versions of PostgreSQL. Check out the Amazon RDS User Guide for more details.

Amazon CloudWatch – Amazon CloudWatch now supports Service Quotas in cross-account observability. With this, you can track and visualize resource utilization and limits across various AWS services from multiple AWS accounts within a region using a central monitoring account. You no longer have to track the quotas by logging in to individual accounts, instead from a central monitoring account, you can create dashboards and alarms for the AWS service quota usage across all your source accounts from a central monitoring account. Setup CloudWatch cross-account observability to get started.

Amazon SageMaker – You can now associate a SageMaker Model Card with a specific model version in SageMaker Model Registry. This lets you establish a single source of truth for your registered model versions, with comprehensive, centralized, and standardized documentation across all stages of the model’s journey on SageMaker, facilitating discoverability and promoting governance, compliance, and accountability throughout the model lifecycle. Learn more about SageMaker Model Cards in the developer guide.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Here are some additional blog posts and news items that you might find interesting:

Building generative AI applications for your startup – In this AWS Startups Blog post, Hrushikesh explains various approaches to build generative AI applications, and reviews their key component. Read the full post for the details.

Components of the generative AI landscape

Components of the generative AI landscape.

How Alexa learned to speak with an Irish accent – If you’re curious how Amazon researchers used voice conversation to generate Irish-accented training data in Alexa’s own voice, check out this Amazon Science Blog post. 

AWS open-source news and updates – My colleague Ricardo writes this weekly open-source newsletter in which he highlights new open-source projects, tools, and demos from the AWS Community.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

AWS Global Summits – Check your calendars and sign up for the AWS Summit close to where you live or work: Hong Kong (July 20), New York City (July 26), Taiwan (August 2-3), São Paulo (August 3), and Mexico City (August 30).

AWS Community Days – Join a community-led conference run by AWS user group leaders in your region: Malaysia (July 22), Philippines (July 29-30), Colombia (August 12), and West Africa (August 19).

AWS re:Invent 2023AWS re:Invent (November 27 – December 1) – Join us to hear the latest from AWS, learn from experts, and connect with the global cloud community. Registration is now open.

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Week in Review!

— Antje

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Building Generative AI into Marketing Strategies: A Primer

Post Syndicated from nnatri original https://aws.amazon.com/blogs/messaging-and-targeting/building-generative-ai-into-marketing-strategies-a-primer/


Artificial Intelligence has undoubtedly shaped many industries and is poised to be one of the most transformative technologies in the 21st century. Among these is the field of marketing where the application of generative AI promises to transform the landscape. This blog post explores how generative AI can revolutionize marketing strategies, offering innovative solutions and opportunities.

According to Harvard Business Review, marketing’s core activities, such as understanding customer needs, matching them to products and services, and persuading people to buy, can be dramatically enhanced by AI. A 2018 McKinsey analysis of more than 400 advanced use cases showed that marketing was the domain where AI would contribute the greatest value. The ability to leverage AI can not only help automate and streamline processes but also deliver personalized, engaging content to customers. It enhances the ability of marketers to target the right audience, predict consumer behavior, and provide personalized customer experiences. AI allows marketers to process and interpret massive amounts of data, converting it into actionable insights and strategies, thereby redefining the way businesses interact with customers.

Generating content is just one part of the equation. AI-generated content, no matter how good, is useless if it does not arrive at the intended audience at the right point of time. Integrating the generated content into an automated marketing pipeline that not only understands the customer profile but also delivers a personalized experience at the right point of interaction is also crucial to getting the intended action from the customer.

Amazon Web Services (AWS) provides a robust platform for implementing generative AI in marketing strategies. AWS offers a range of AI and machine learning services that can be leveraged for various marketing use cases, from content creation to customer segmentation and personalized recommendations. Two services that are instrumental to delivering customer contents and can be easily integrated with other generative AI services are Amazon Pinpoint and Amazon Simple Email Service. By integrating generative AI with Amazon Pinpoint and Amazon SES, marketers can automate the creation of personalized messages for their customers, enhancing the effectiveness of their campaigns. This combination allows for a seamless blend of AI-powered content generation and targeted, data-driven customer engagement.

As we delve deeper into this blog post, we’ll explore the mechanics of generative AI, its benefits and how AWS services can facilitate its integration into marketing communications.

What is Generative AI?

Generative AI is a subset of artificial intelligence that leverages machine learning techniques to generate new data instances that resemble your training data. It works by learning the underlying patterns and structures of the input data, and then uses this understanding to generate new, similar data. This is achieved through the use of models like Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Transformer models.

What do Generative AI buzzwords mean?

In the world of AI, buzzwords are abundant. Terms like “deep learning”, “neural networks”, “machine learning”, “generative AI”, and “large language models” are often used interchangeably, but they each have distinct meanings. Understanding these terms is crucial for appreciating the capabilities and limitations of different AI technologies.

Machine Learning (ML) is a subset of AI that involves the development of algorithms that allow computers to learn from and make decisions or predictions based on data. These algorithms can be ‘trained’ on a dataset and then used to predict or classify new data. Machine learning models can be broadly categorized into supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.

Deep Learning is a subset of machine learning that uses neural networks with many layers (hence “deep”) to model and understand complex patterns. These layers of neurons process different features, and their outputs are combined to produce a final result. Deep learning models can handle large amounts of data and are particularly good at processing images, speech, and text.

Generative AI refers specifically to AI models that can generate new data that mimic the data they were trained on. This is achieved through the use of models like Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). Generative AI can create anything from written content to visual designs, and even music, making it a versatile tool in the hands of marketers.

Large Language Models (LLMs) are a type of generative AI that are trained on a large corpus of text data and can generate human-like text. They predict the probability of a word given the previous words used in the text. They are particularly useful in applications like text completion, translation, summarization, and more. While they are a type of generative AI, they are specifically designed for handling text data.

Simply put, you can understand that Large Language Model is a subset of Generative AI, which is then a subset of Machine Learning and they ultimately falls under the umbrella term of Artificial Intelligence.

What are the problems with generative AI and marketing?

While generative AI holds immense potential for transforming marketing strategies, it’s important to be aware of its limitations and potential pitfalls, especially when it comes to content generation and customer engagement. Here are some common challenges that marketers should be aware of:

Bias in Generative AI Generative AI models learn from the data they are trained on. If the training data is biased, the AI model will likely reproduce these biases in its output. For example, if a model is trained primarily on data from one demographic, it may not accurately represent other demographics, leading to marketing campaigns that are ineffective or offensive. Imagine if you are trying to generate an image for a campaign targeting females, a generative AI model might not generate images of females in jobs like doctors, lawyers or judges, leading your campaign to suffer from bias and uninclusiveness.

Insensitivity to Cultural Nuances Generative AI models may not fully understand cultural nuances or sensitive topics, which can lead to content that is insensitive or even harmful. For instance, a generative AI model used to create social media posts for a global brand may inadvertently generate content that is seen as disrespectful or offensive by certain cultures or communities.

Potential for Inappropriate or Offensive Content Generative AI models can sometimes generate content that is inappropriate or offensive. This is often because the models do not fully understand the context in which certain words or phrases should be used. It’s important to have safeguards in place to review and approve content before it’s published. A common problem with LLMs is hallucination: whereby the model speaks false knowledge as if it is accurate. A marketing team might mistakenly publish a auto-generated promotional content that contains a 20% discount on an item when no such promotions were approved. This could have disastrous effect if safeguards are not in place and erodes customers’ trust.

Intellectual Property and Legal Concerns Generative AI models can create new content, such as images, music, videos, and text, which raises questions of ownership and potential copyright infringement. Being a relatively new field, legal discussions are still ongoing to discuss legal implications of using Generative AI, e.g. who should own generated AI content, and copyright infringement.

Not a Replacement for Human Creativity Finally, while generative AI can automate certain aspects of marketing campaigns, it cannot replace the creativity or emotional connections that marketers use in crafting compelling campaigns. The most successful marketing campaigns touch the hearts of the customers, and while Generative AI is very capable of replicating human content, it still lacks in mimicking that “human touch”.

In conclusion, while generative AI offers exciting possibilities for marketing, it’s important to approach its use with a clear understanding of its limitations and potential pitfalls. By doing so, marketers can leverage the benefits of generative AI while mitigating risks.

How can I use generative AI in marketing communications?

Amazon Web Services (AWS) provides a comprehensive suite of services that facilitate the use of generative AI in marketing. These services are designed to handle a variety of tasks, from data processing and storage to machine learning and analytics, making it easier for marketers to implement and benefit from generative AI technologies.

Overview of Relevant AWS Services

AWS offers several services that are particularly relevant for generative AI in marketing:

  • Amazon Bedrock: This service makes FMs accessible via an API. Bedrock offers the ability to access a range of powerful FMs for text and images, including Amazon’s Titan FMs. With Bedrock’s serverless experience, customers can easily find the right model for what they’re trying to get done, get started quickly, privately customize FMs with their own data, and easily integrate and deploy them into their applications using the AWS tools and capabilities they are familiar with.
  • Amazon Titan Models: These are two new large language models (LLMs) that AWS is announcing. The first is a generative LLM for tasks such as summarization, text generation, classification, open-ended Q&A, and information extraction. The second is an embeddings LLM that translates text inputs into numerical representations (known as embeddings) that contain the semantic meaning of the text. In response to the pitfalls mentioned above around Generative AI hallucinations and inaccurate information, AWS is actively working on improving accuracy and ensuring its Titan models produce high-quality responses, said Bratin Saha, an AWS vice president.
  • Amazon SageMaker: This fully managed service enables data scientists and developers to build, train, and deploy machine learning models quickly. SageMaker includes modules that can be used for generative AI, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs).
  • Amazon Pinpoint: This flexible and scalable outbound and inbound marketing communications service enables businesses to engage with customers across multiple messaging channels. Amazon Pinpoint is designed to scale with your business, allowing you to send messages to a large number of users in a short amount of time. It integrates with AWS’s generative AI services to enable personalized, AI-driven marketing campaigns.
  • Amazon Simple Email Service (SES): This cost-effective, flexible, and scalable email service enables marketers to send transactional emails, marketing messages, and other types of high-quality content to their customers. SES integrates with other AWS services, making it easy to send emails from applications being hosted on services such as Amazon EC2. SES also works seamlessly with Amazon Pinpoint, allowing for the creation of customer engagement communications that drive user activity and engagement.

How to build Generative AI into marketing communications

Dynamic Audience Targeting and Segmentation: Generative AI can help marketers to dynamically target and segment their audience. It can analyze customer data and behavior to identify patterns and trends, which can then be used to create more targeted marketing campaigns. Using Amazon Sagemaker or the soon-to-be-available Amazon Bedrock and Amazon Titan Models, Generative AI can suggest labels for customers based on unstructured data. According to McKinsey, generative AI can analyze data and identify consumer behavior patterns to help marketers create appealing content that resonates with their audience.

Personalized Marketing: Generative AI can be used to automate the creation of marketing content. This includes generating text for blogs, social media posts, and emails, as well as creating images and videos. This can save marketers a significant amount of time and effort, allowing them to focus on other aspects of their marketing strategy. Where it really shines is the ability to productionize marketing content creation, reducing the needs for marketers to create multiple copies for different customer segments. Previously, marketers would need to generate many different copies for each granularity of customers (e.g. attriting customers who are between the age of 25-34 and loves food). Generative AI can automate this process, providing the opportunities to dynamically create these contents programmatically and automatically send out to the most relevant segments via Amazon Pinpoint or Amazon SES.

Marketing Automation: Generative AI can automate various aspects of marketing, such as email marketing, social media marketing, and search engine marketing. This includes automating the creation and distribution of marketing content, as well as analyzing the performance of marketing campaigns. Amazon Pinpoint currently automates customer communications using journeys which is a customized, multi-step engagement experience. Generative AI could create a Pinpoint journey based on customer engagement data, engagement parameters and a prompt. This enables GenAI to not only personalize the content but create a personalized omnichannel experience that can extend throughout a period of time. It then becomes possible that journeys are created dynamically by generative AI and A/B tested on the fly to achieve an optimal pre-defined Key Performance Indicator (KPI).

A Sample Generative AI Use Case in Marketing Communications

AWS services are designed to work together, making it easy to implement generative AI in your marketing strategies. For instance, you can use Amazon SageMaker to build and train your generative AI models which assist with automating marketing content creation, and Amazon Pinpoint or Amazon SES to deliver the content to your customers.

Companies using AWS can theoretically supplement their existing workloads with generative AI capabilities without the needs for migration. The following reference architecture outlines a sample use case and showcases how Generative AI can be integrated into your customer journeys built on the AWS cloud. An e-commerce company can potentially receive many complaints emails a day. Companies spend a lot of money to acquire customers, it’s therefore important to think about how to turn that negative experience into a positive one.


When an email is received via Amazon SES (1), its content can be passed through to generative AI models using GANs to help with sentiment analysis (2). An article published by Amazon Science utilizes GANs for sentiment analysis for cases where a lack of data is a problem. Alternatively, one can also use Amazon Comprehend at this step and run A/B tests between the two models. The limitations with Amazon Comprehend would be the limited customizations you can perform to the model to fit your business needs.

Once the email’s sentiment is determined, the sentiment event is logged into Pinpoint (3), which then triggers an automatic winback journey (4).

Generative AI (e.g. HuggingFace’s Bloom Text Generation Models) can again be used here to dynamically create the content without needing to wait for the marketer’s input (5). Whereas marketers would need to generate many different copies for each granularity of customers (e.g. attriting customers who are between the age of 25-34 and loves food), generative AI provides the opportunities to dynamically create these contents on the fly given the above inputs.

Once the campaign content has been generated, the model pumps the template backs into Amazon Pinpoint (6), which then sends the personalized copy to the customer (7).

Result: Another customer is saved from attrition!


The landscape of generative AI is vast and ever-evolving, offering a plethora of opportunities for marketers to enhance their strategies and deliver more personalized, engaging content. AWS plays a pivotal role in this landscape, providing a comprehensive suite of services that facilitate the implementation of generative AI in marketing. From building and training AI models with Amazon SageMaker to delivering personalized messages with Amazon Pinpoint and Amazon SES, AWS provides the tools and infrastructure needed to harness the power of generative AI.

The potential of generative AI in relation to the marketer is immense. It offers the ability to automate content creation, personalize customer interactions, and derive valuable insights from data, among other benefits. However, it’s important to remember that while generative AI can automate certain aspects of marketing, it is not a replacement for human creativity and intuition. Instead, it should be viewed as a tool that can augment human capabilities and free up time for marketers to focus on strategy and creative direction.

Get started with Generative AI in marketing communications

As we conclude this exploration of generative AI and its applications in marketing, we encourage you to:

  • Brainstorm potential Generative AI use cases for your business. Consider how you can leverage generative AI to enhance your marketing strategies. This could involve automating content creation, personalizing customer interactions, or deriving insights from data.
  • Start leveraging generative AI in your marketing strategies with AWS today. AWS provides a comprehensive suite of services that make it easy to implement generative AI in your marketing strategies. By integrating these services into your workflows, you can enhance personalization, improve customer engagement, and drive better results from your campaigns.
  • Watch out for the next part in the series of integrating Generative AI into Amazon Pinpoint and SES. We will delve deeper into how you can leverage Amazon Pinpoint and SES together with generative AI to enhance your marketing campaigns. Stay tuned!

The journey into the world of generative AI is just beginning. As technology continues to evolve, so too will the opportunities for marketers to leverage AI to enhance their strategies and deliver more personalized, engaging content. We look forward to exploring this exciting frontier with you.

About the Author

Tristan (Tri) Nguyen

Tristan (Tri) Nguyen

Tristan (Tri) Nguyen is an Amazon Pinpoint and Amazon Simple Email Service Specialist Solutions Architect at AWS. At work, he specializes in technical implementation of communications services in enterprise systems and architecture/solutions design. In his spare time, he enjoys chess, rock climbing, hiking and triathlon.

Generative AI with Large Language Models — New Hands-on Course by DeepLearning.AI and AWS

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/generative-ai-with-large-language-models-new-hands-on-course-by-deeplearning-ai-and-aws/

Generative AI has taken the world by storm, and we’re starting to see the next wave of widespread adoption of AI with the potential for every customer experience and application to be reinvented with generative AI. Generative AI lets you to create new content and ideas including conversations, stories, images, videos, and music. Generative AI is powered by very large machine learning models that are pre-trained on vast amounts of data, commonly referred to as foundation models (FMs).

A subset of FMs called large language models (LLMs) are trained on trillions of words across many natural-language tasks. These LLMs can understand, learn, and generate text that’s nearly indistinguishable from text produced by humans. And not only that, LLMs can also engage in interactive conversations, answer questions, summarize dialogs and documents, and provide recommendations. They can power applications across many tasks and industries including creative writing for marketing, summarizing documents for legal, market research for financial, simulating clinical trials for healthcare, and code writing for software development.

Companies are moving rapidly to integrate generative AI into their products and services. This increases the demand for data scientists and engineers who understand generative AI and how to apply LLMs to solve business use cases.

This is why I’m excited to announce that DeepLearning.AI and AWS are jointly launching a new hands-on course Generative AI with large language models on Coursera’s education platform that prepares data scientists and engineers to become experts in selecting, training, fine-tuning, and deploying LLMs for real-world applications.

DeepLearning.AI was founded in 2017 by machine learning and education pioneer Andrew Ng with the mission to grow and connect the global AI community by delivering world-class AI education.

Generative AI with large language models

DeepLearning.AI teamed up with generative AI specialists from AWS including Chris Fregly, Shelbee Eigenbrode, Mike Chambers, and me to develop and deliver this course for data scientists and engineers who want to learn how to build generative AI applications with LLMs. We developed the content for this course under the guidance of Andrew Ng and with input from various industry experts and applied scientists at Amazon, AWS, and Hugging Face.

Course Highlights
This is the first comprehensive Coursera course focused on LLMs that details the typical generative AI project lifecycle, including scoping the problem, choosing an LLM, adapting the LLM to your domain, optimizing the model for deployment, and integrating into business applications. The course not only focuses on the practical aspects of generative AI but also highlights the science behind LLMs and why they’re effective.

The on-demand course is broken down into three weeks of content with approximately 16 hours of videos, quizzes, labs, and extra readings. The hands-on labs hosted by AWS Partner Vocareum let you apply the techniques directly in an AWS environment provided with the course and includes all resources needed to work with the LLMs and explore their effectiveness.

In just three weeks, the course prepares you to use generative AI for business and real-world applications. Let’s have a quick look at each week’s content.

Week 1 – Generative AI use cases, project lifecycle, and model pre-training
In week 1, you will examine the transformer architecture that powers many LLMs, see how these models are trained, and consider the compute resources required to develop them. You will also explore how to guide model output at inference time using prompt engineering and by specifying generative configuration settings.

In the first hands-on lab, you’ll construct and compare different prompts for a given generative task. In this case, you’ll summarize conversations between multiple people. For example, imagine summarizing support conversations between you and your customers. You’ll explore prompt engineering techniques, try different generative configuration parameters, and experiment with various sampling strategies to gain intuition on how to improve the generated model responses.

Week 2 – Fine-tuning, parameter-efficient fine-tuning (PEFT), and model evaluation
In week 2, you will explore options for adapting pre-trained models to specific tasks and datasets through a process called fine-tuning. A variant of fine-tuning, called parameter efficient fine-tuning (PEFT), lets you fine-tune very large models using much smaller resources—often a single GPU. You will also learn about the metrics used to evaluate and compare the performance of LLMs.

In the second lab, you’ll get hands-on with parameter-efficient fine-tuning (PEFT) and compare the results to prompt engineering from the first lab. This side-by-side comparison will help you gain intuition into the qualitative and quantitative impact of different techniques for adapting an LLM to your domain specific datasets and use cases.

Week 3 – Fine-tuning with reinforcement learning from human feedback (RLHF), retrieval-augmented generation (RAG), and LangChain
In week 3, you will make the LLM responses more humanlike and align them with human preferences using a technique called reinforcement learning from human feedback (RLHF). RLHF is key to improving the model’s honesty, harmlessness, and helpfulness. You will also explore techniques such as retrieval-augmented generation (RAG) and libraries such as LangChain that allow the LLM to integrate with custom data sources and APIs to improve the model’s response further.

In the final lab, you’ll get hands-on with RLHF. You’ll fine-tune the LLM using a reward model and a reinforcement-learning algorithm called proximal policy optimization (PPO) to increase the harmlessness of your model responses. Finally, you will evaluate the model’s harmlessness before and after the RLHF process to gain intuition into the impact of RLHF on aligning an LLM with human values and preferences.

Enroll Today
Generative AI with large language models is an on-demand, three-week course for data scientists and engineers who want to learn how to build generative AI applications with LLMs.

Enroll for generative AI with large language models today.

— Antje

AWS Week in Review – AWS Wickr, Amazon Redshift, Generative AI, and More – May 29, 2023

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-week-in-review-aws-wickr-amazon-redshift-generative-ai-and-more-may-29-2023/

This edition of Week in Review marks the end of the month of May. In addition, we just finished all of the in-person AWS Summits in Asia-Pacific and Japan starting from AWS Summit Sydney and AWS Summit Tokyo in April to AWS Summit ASEAN, AWS Summit Seoul, and AWS Summit Mumbai in May.

Thank you to everyone who attended our AWS Summits in APJ, especially the AWS Heroes, AWS Community Builders, and AWS User Group leaders, for your collaboration in supporting activities at AWS Summit events.

Last Week’s Launches
Here are some launches that caught my attention last week:

AWS Wickr is now HIPAA eligible — AWS Wickr is an end-to-end encrypted enterprise messaging and collaboration tool that enables one-to-one and group messaging, voice and video calling, file sharing, screen sharing, and location sharing, without increasing organizational risk. With this announcement, you can now use AWS Wickr for workloads that are within the scope of HIPAA. Visit AWS Wickr to get started.

Amazon Redshift announces support for auto-commit statements in stored procedure — If you’re using stored procedures in Amazon Redshift, you now have enhanced transaction controls that enable you to automatically commit the statements inside the procedure. This new NONATOMIC mode can be used to handle exceptions inside a stored procedure. You can also use the new PL/pgSQL statement RAISE to programmatically raise the exception, which helps prevent disruptions in applications due to an error inside a stored procedure. For more information on using this feature, refer to Managing transactions.

AWS Chatbot supports access to Amazon CloudWatch dashboards and logs insights in chat channels — With this launch, you now can receive Amazon CloudWatch alarm notifications for an incident directly in your chat channel, analyze the diagnostic data from the dashboards, and remediate directly from the chat channel without switching context. Visit the AWS Chatbot page to learn more.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

AWS Open Source Updates
As always, my colleague Ricardo has curated the latest updates for open source news at AWS. Here are some of the highlights:

OpenEMR on AWS Fargate — OpenEMR is a popular Electronic Health and Medical Practice management solution. If you’re looking to deploy OpenEMR on AWS, then this repo will help you to get your OpenEMR up and running on AWS Fargate using Amazon ECS.

Cloud-Radar — If you’re working with AWS Cloudformation and looking for performing unit tests, then you might want to try Cloud-Radar. You can also perform functional testing with Cloud-Radar as this tool also acts a wrapper around Taskcat.

Amazon and Generative AI
Using generative AI to improve extreme multilabel classification — In their research on extreme multilabel classification (XMC), Amazon scientists explored a generative approach, in which a model generates a sequence of labels for input sequences of words. The generative models with clustering consistently outperformed them. This demonstrates the effectiveness of incorporating hierarchical clustering in improving XMC performance.

Upcoming AWS Events
Don’t miss upcoming AWS-led events happening soon:

Also, let’s learn from our fellow builders and give them support by attending AWS Community Days:

That’s all for this week. Check back next Monday for another Week in Review!

Happy building
— Donnie

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Inside GitHub: Working with the LLMs behind GitHub Copilot

Post Syndicated from Sara Verdi original https://github.blog/2023-05-17-inside-github-working-with-the-llms-behind-github-copilot/

The first time that engineers at GitHub worked with one of OpenAI’s large language models (LLM), they were equal parts excited and astonished. Alireza Goudarzi, a senior researcher of machine learning at GitHub recounts, “As a theoretical AI researcher, my job has been to take apart deep learning models to make sense of them and how they learn, but this was the first time that a model truly astonished me.” Though the emergent behavior of the model was somewhat surprising, it was obviously powerful. Powerful enough, in fact, to lead to the creation of GitHub Copilot.

Due to the growing interest in LLMs and generative AI models, we decided to speak to the researchers and engineers at GitHub who helped build the early versions of GitHub Copilot and talk through what it was like to work with different LLMs from OpenAI, and how model improvements have helped evolve GitHub Copilot to where it is today—and beyond.

A brief history of GitHub Copilot

In June 2020, OpenAI released GPT-3, an LLM that sparked intrigue in developer communities and beyond. Over at GitHub, this got the wheels turning for a project our engineers had only talked about before: code generation.

“Every six months or so, someone would ask in our meetings, ‘Should we think about general purpose code generation,’ but the answer was always ‘No, it’s too difficult, the current models just can’t do it,’” says Albert Ziegler, a principal machine learning engineer and member of the GitHub Next research and development team.

But GPT-3 changed all that—suddenly the model was good enough to begin considering how a code generation tool might work.

“OpenAI gave us the API to play around with,” Ziegler says. “We assessed it by giving it coding-like tasks and evaluated it in two different forms.”

For the first form of evaluation, the GitHub Next team crowdsourced self-contained problems to help test the model. “The reason we don’t do this anymore is because the models just got too good,” Ziegler laughs.

In the beginning, the model could solve about half of the problems it was posed with, but soon enough, it was solving upwards of 90 percent of the problems.

This original testing method sparked the first ideas for how to harness the power of this model, and they began to conceptualize an AI-powered chatbot for developers to ask coding questions and receive immediate, runnable code snippets. “We built a prototype, but it turned out there was a better modality for this technology available,” Ziegler says. “We thought, ‘Let’s try to put this in the IDE.’”

“The moment we did that and saw how well it worked, the whole static question-and-answer modality was forgotten,” he says. “This new approach was interactive and it was useful in almost every situation.”

And with that, the development of GitHub Copilot began.

Exploring model improvements

To keep this project moving forward, GitHub returned to OpenAI to make sure that they could stay on track with the latest models. “The first model that OpenAI gave us was a Python-only model,” Ziegler remembers. “Next we were delivered a JavaScript model and a multilingual model, and it turned out that the Javascript model had particular problems that the multilingual model did not. It actually came as a surprise to us that the multilingual model could perform so well. But each time, the models were just getting better and better, which was really exciting for GitHub Copilot’s progress.”

In 2021, OpenAI released the multilingual Codex model, which was built in partnership with GitHub. This model was an offshoot of GPT-3, so its original capability was generating natural language in response to text prompts. But what set the Codex model apart was that it was trained on billions of lines of public code—so that, in addition to natural language outputs, it also produced code suggestions.

This model was open for use via an API that businesses could build on, and while this breakthrough was huge for GitHub Copilot, the team needed to work on internal model improvements to ensure that it was as accurate as possible for end users.

As the GitHub Copilot product was prepared for launch as a technical preview, the team split off into further functional teams, and the Model Improvements team became responsible for monitoring and improving GitHub Copilot’s quality through communicating with the underlying LLM. This team also set out to work on improving completion for users. Completion refers to when users accept and keep GitHub Copilot suggestions in their code, and there are several different levers that the Model Improvements team works on to increase completion, including prompt crafting and fine tuning.

An example of completion in action with GitHub Copilot
An example of completion in action with GitHub Copilot.

Prompt crafting

When working with LLMs, you have to be very specific and intentional with your inputs to receive your desired output, and prompt crafting explores the art behind communicating these requests to get the optimal completion from the model.

“In very simple terms, the LLM is, at its core, just a document completion model. For training it was given partial documents and it learned how to complete them one token at a time. Therefore, the art of prompt crafting is really all about creating a ‘pseudo-document’ that will lead the model to a completion that benefits the customer,” John Berryman, a senior researcher of machine learning on the Model Improvements team explains. Since LLMs are trained on partial document completion, then if the partial document is code, then this completion capability lends itself well to code completion, which is, in its base form, exactly what GitHub Copilot does.

To better understand how the model could be applied to code completion, the team would provide the model with a file and evaluate the code completions it returned.

“Sometimes the results are ok, sometimes they are quite good, and sometimes the results seem almost magical,” Berryman says. “The secret is that we don’t just have to provide the model with the original file that the GitHub Copilot user is currently editing; instead we look for additional pieces of context inside the IDE that can hint the model towards better completions.”

He continues, “There have been several changes that helped get GitHub Copilot where it is today, but one of my favorite tricks was when we pulled similar texts in from the user’s neighboring editor tabs. That was a huge lift in our acceptance rate and characters retained.”

Generative AI and LLMs are incredibly fascinating, but Berryman still seems to be most excited about the benefit that the users are seeing from the research and engineering efforts.

“The idea here is to make sure that we make developers more productive, but the way we do that is where things start to get interesting: we can make the user more productive by incorporating the way they think about code into the algorithm itself,” Berryman says. “Where the developer might flip back and forth between tabs to reference code, we just can do that for them, and the completion is exactly what it would be if the user had taken all of the time to look that information up.”


Fine-tuning is a technique used in AI to adapt and improve a pre-trained model for a specific task or domain. The process involves taking a pre-trained model that has been trained on a large dataset and training it on a smaller, more specific dataset that is relevant to a particular use case. This enables the model to learn and adapt to the nuances of the new data, thus improving its performance on the specific task.

These larger, more sophisticated LLMs can sometimes produce outputs that aren’t necessarily helpful because it’s hard to statistically define what constitutes a “good” response. It’s also incredibly difficult to train a model like Codex that contains upwards of 170 billion parameters.

“Basically, we’re training the underlying Codex model on a user’s specific codebase to provide more focused, customized completions,” Goudarzi adds.

“Our greatest challenge right now is to consider why the user rejects or accepts a suggestion,” Goudarzi adds. “We have to consider what context, or information, that we served to the model caused the model to output something that was either helpful or not helpful. There’s no way for us to really troubleshoot in the typical engineering way, but what we can do is figure out how to ask the right questions to get the output we desire.”

Read more about how GitHub Copilot is getting better at understanding your code to provide a more customized coding experience here.

GitHub Copilot—then and now

As the models from OpenAI got stronger—and as we identified more areas to build on top of those LLMs in house—GitHub Copilot has improved and gained new capabilities with chat functionality, voice-assisted development, and more via GitHub Copilot X on the horizon.

Johan Rosenkilde, a staff researcher on the GitHub Next team remembers, “When we received the latest model drops from OpenAI in the past, the improvements were good, but they couldn’t really be felt by the end user. When the third iteration of Codex dropped, you could feel it, especially when you were working with programming languages that are not one of the top five languages,” Rosenkilde says.

He continues, “I happened to be working on a programming competition with some friends on the weekend that model version was released, and we were programming with F#. In the first 24 hours, we evidently had the old model for GitHub Copilot, but then BOOM! Magic happened,” he laughs. “There was an incredibly noticeable difference.”

In the beginning, GitHub Copilot also had the tendency to suggest lines of code in a completely different programming language, which created a poor developer experience (for somewhat obvious reasons).

“You could be working in a C# project, then all of the sudden at the top of a new file, it would suggest Python code,” Rosenkilde explains. So, the team added a headline to the prompt which listed the language you were working in. “Now this had no impact when you were deep down in the file because Copilot could understand which language you were in. But at the top of the file, there could be some ambiguity, and those early models just defaulted to the top popular languages.”

About a month following that improvement, the team discovered that it was much more powerful to put the path of the file at the top of the document.

A diagram of the file path improvement
A diagram of the file path improvement.

“The end of the file name would give away the language in most cases, and in fact the file name could provide crucial, additional information,” Rosenkilde says. “For example, the file might be named ‘connectiondatabase.py.’ Well that file is most likely about databases or connections, so you might want to import an SQL library, and that file was written in Python. So, that not only solved the language problem, but it also improved the quality and user experience by a surprising margin because GitHub Copilot could now suggest boilerplate code.”

After a few more months of work, and several iterations, the team was able to create a component that lifted code from other files, which is a capability that had been talked about since the genesis of GitHub Copilot. Rosenkilde recalls, “this never really amounted to anything more than conversations or a draft pull request because it was so abstract. But then, Albert Ziegler built this component that looked at other files you have open in the IDE at that moment in time and scanned through those files for similar text to what’s in your current cursor. This was a huge boost in code acceptance because suddenly, GitHub Copilot knew about other files.”

What’s next for GitHub Copilot

After working with generative AI models and LLMs over the past three years, we’ve seen their transformative value up close. As the industry continues to find new uses for generative AI, we’re working to continue building new developer experiences. And in March 2023, GitHub announced the future of Copilot, GitHub Copilot X, our vision for an AI-powered developer experience. GitHub Copilot X aims to bring AI beyond the IDE to more components of the overall platform, such as docs and pull requests. LLMs are changing the ways that we interact with technology and how we work, and ideas like GitHub Copilot X are just an example of what these models, along with some dedicated training techniques, are capable of.

How GitHub Copilot is getting better at understanding your code

Post Syndicated from Johan Rosenkilde original https://github.blog/2023-05-17-how-github-copilot-is-getting-better-at-understanding-your-code/

To make working with GitHub Copilot feel like a meeting of the minds between developers and the pair programmer, GitHub’s machine learning experts have been busy researching, developing, and testing new capabilities—and many are focused on improving the AI pair programmer’s contextual understanding. That’s because good communication is key to pair programming, and inferring context is critical to making good communication happen.

To pull back the curtain, we asked GitHub’s researchers and engineers about the work they’re doing to help GitHub Copilot improve its contextual understanding. Here’s what we discovered.

From OpenAI’s Codex model to GitHub Copilot

When OpenAI released GPT-3 in June 2020, GitHub knew developers would benefit from a product that leveraged the model specifically for coding. So, we gave input to OpenAI as it built Codex, a descendant of GPT-3 and the LLM that would power GitHub Copilot. The pair programmer launched as a technical preview in June 2021 and became generally available in June 2022 as the world’s first at-scale generative AI coding tool.

To ensure that the model has the best information to make the best predictions with speed, GitHub’s machine learning (ML) researchers have done a lot of work called prompt engineering (which we’ll explain in more detail below) so that the model provides contextually relevant responses with low latency.

Though GitHub’s always experimenting with new models as they come out, Codex was the first really powerful generative AI model that was available, said David Slater, a ML engineer at GitHub. “The hands-on experience we gained from iterating on model and prompt improvements was invaluable.”

All that experimentation resulted in a pair programmer that, ultimately, frees up a developer’s time to focus on more fulfilling work. The tool is often a huge help even for starting new projects or files from scratch because it scaffolds a starting point that developers can adapt and tweak as desired, said Alice Li, a ML researcher at GitHub.

I still find myself impressed and even surprised by what GitHub Copilot can do, even after having worked on it for some time now.

– Alice Li, ML researcher at GitHub

Why context matters

Developers use details from pull requests, a folder in a project, open issues, and more to contextualize their code. When it comes to a generative AI coding tool, we need to teach that tool what information to use to do the same.

Transformer LLMs are good at connecting the dots and big-picture thinking. Generative AI coding tools are made possible by large language models (LLMs). These models are sets of algorithms trained on large amounts of code and human language. Today’s state-of-the-art LLMs are transformers, which makes them adept at making connections between text in a user’s input and the output that the model has already generated. This is why today’s generative AI tools are providing responses that are more contextually relevant than previous AI models.

But they need to be told what information is relevant to your code. Right now, transformers that are fast enough to power GitHub Copilot can process about 6,000 characters at a time. While that’s been enough to advance and accelerate tasks like code completion and code change summarization, the limited amount of characters means that not all of a developer’s code can be used as context.

So, our challenge is to figure out not only what data to feed the model, but also how to best order and enter it to get the best suggestions for the developer.

Learn more about LLMs, generative AI coding tools, and how they’re changing the way developers work.

How GitHub Copilot understands your code

It all comes down to prompts, which are compilations of IDE code and relevant context that’s fed to the model. Prompts are generated by algorithms in the background, at any point in your coding. That’s why GitHub Copilot will generate coding suggestions whether you’re currently writing or just finished a comment, or in the middle of some gnarly code.

  • Here’s how a prompt is created: a series of algorithms first select relevant code snippets or comments from your current file and other sources (which we’ll dive into below). These snippets and comments are then prioritized, filtered, and assembled into the final prompt.

GitHub Copilot’s contextual understanding has continuously matured over time. The first version was only able to consider the file you were working on in your IDE to be contextually relevant. But we knew context went beyond that. Now, just a year later, we’re experimenting with algorithms that will consider your entire codebase to generate customized suggestions.

Let’s look at how we got here:

  • Prompt engineering is the delicate art of creating a prompt so that the model makes the most useful prediction for the user. The prompt tells LLMs, including GitHub Copilot, what data, and in what order, to process in order to contextualize your code. Most of this work takes place in what’s called a prompt library, which is where our in-house ML experts work with algorithms to extract and prioritize a variety of sources of information about the developer’s context, creating the prompt that’ll be processed by the GitHub Copilot model.

  • Neighboring tabs is what we call the technique that allows GitHub Copilot to process all of the files open in a developer’s IDE instead of just the single one the developer is working on. By opening all files relevant to their project, developers automatically invoke GitHub Copilot to comb through all of the data and find matching pieces of code between their open files and the code around their cursor—and add those matches to the prompt.

When developing neighboring tabs, the GitHub Next team and in-house ML researchers did A/B tests to figure out the best parameters for identifying matches between code in your IDE and code in your open tabs. They found that setting a very low bar for when to include a match actually made for the best coding suggestions.

By including every little bit of context, neighboring tabs helped to relatively increase user acceptance of GitHub Copilot’s suggestions by 5%**.

Even if there was no perfect match—or even a very good one—picking the best match we found and including that as context for the model was better than including nothing at all.

– Albert Ziegler, principal ML engineer at GitHub
  • The Fill-In-the-Middle (FIM) paradigm widened the context aperture even more. Prior to FIM, only the code before your cursor would be put into the prompt—ignoring the code after your cursor. (At GitHub, we refer to code before the cursor as the prefix and after the cursor as the suffix.) With FIM, we can tell the model which part of the prompt is the prefix, and which part is the suffix.

Even if you’re creating something from scratch and have a skeleton of a file, we know that coding isn’t linear or sequential. So, while you bounce around your file, FIM helps GitHub Copilot offer better coding suggestions for the part in your file where your cursor is located, or the code that’s supposed to come between the prefix and suffix.

Based on A/B testing, FIM gave a 10% relative boost in performance, meaning developers accepted 10% more of the completions that were shown to them. And thanks to optimal use of caching, neighboring tabs and FIM work in the background without any added latency.

System diagram focused on model quality efforts. The diagram starts on the left with inputs from open tabs, data from editor, and vector database, which feed into a prompt library. (We are continuously working on improvements to provide better context from available sources in the prompt.) This then goes into the prompt, which is fed through a contextual filter model and a GPT model. (We are continuously working on new and improved model engines optimized for GitHub Copilot.) This model provides completions to fill in the middle of the prompt prefix and suffix. From the models, n completions are generated, and less than or equal to n completions are shown.

Improving semantic understanding

Today, we’re experimenting with vector databases that could create a customized coding experience for developers working in private repositories or with proprietary code. Generative AI coding tools use something called embeddings to retrieve information from a vector database.

  • What’s a vector database? It’s a database that indexes high-dimensional vectors.

  • What’s a high-dimensional vector? They’re mathematical representations of objects, and because these vectors can model objects in a number of dimensions, they can capture complexities of that object. When used properly to represent pieces of code, they may represent both the semantics and even intention of the code—not just the syntax.

  • What’s an embedding? In the context of coding and LLMs, an embedding is the representation of a piece of code as a high-dimensional vector. Because of the “knowledge” the LLM has of both programming and natural language, it’s able to capture both the syntax and semantics of the code in the vector.

Here’s how they’d all work together:

  • Algorithms would create embeddings for all snippets in the repository (potentially billions of them), and keep them stored in the vector database.
  • Then, as you’re coding, algorithms would embed the snippets in your IDE.
  • Algorithms would then make approximate matches—also, in real time—between the embeddings that are created for your IDE snippets and the embeddings already stored in the vector database. The vector database is what allows algorithms to quickly search for approximate matches (not just exact ones) on the vectors it stores, even if it’s storing billions of embedded code snippets.

Developers are familiar with retrieving data with hashcodes, which typically look for exact character by character matches, explained Alireza Goudarzi, senior ML researcher at GitHub. “But embeddings—because they arise from LLMs that were trained on a vast amount of data—develop a sense of semantic closeness between code snippets and natural language prompts.”

Read the three sentences below and identify which two are the most semantically similar.

  • Sentence A: The king moved and captured the pawn.
  • Sentence B: The king was crowned in Westminster Abbey.
  • Sentence C: Both white rooks were still in the game.

The answer is sentences A and C because both are about chess. While sentences A and B are syntactically, or structurally similar because both have a king as the subject, they’re semantically different because “king” is used in different contexts.

Here’s how each of those statements could translate to Python. Note the syntactic similarity between snippets A and B despite their semantic difference, and the semantic similarity between snippets A and C despite their syntactic difference.

Snippet A:

if king.location() == pawn.location():
    board.captures_piece(king, pawn)

Snippet B:

if king.location() == "Westminster Abbey":

Snippet C:

if len([ r for r in board.pieces("white") if r.type == "rook" ]) == 2:
    return True

As mentioned above, we’re still experimenting with retrieval algorithms. We’re designing the feature with enterprise customers in mind, specifically those who are looking for a customized coding experience with private repositories and would explicitly opt in to use the feature.

Take this with you

Last year, we conducted quantitative research on GitHub Copilot and found that developers code up to 55% faster while using the pair programmer. This means developers feel more productive, complete repetitive tasks more quickly, and can focus more on satisfying work. But our work won’t stop there.

The GitHub product and R&D teams, including GitHub Next, have been collaborating with Microsoft Azure AI-Platform to continue bringing improvements to GitHub Copilot’s contextual understanding. So much of the work that helps GitHub Copilot contextualize your code happens behind the scenes. While you write and edit your code, GitHub Copilot is responding to your writing and edits in real time by generating prompts–or, in other words, prioritizing and sending relevant information to the model based on your actions in your IDE—to keep giving you the best coding suggestions.

Learn more

How I used GitHub Copilot to build a browser extension

Post Syndicated from Rizel Scarlett original https://github.blog/2023-05-12-how-i-used-github-copilot-to-build-a-browser-extension/

For the first time ever, I built a browser extension and did it with the help of GitHub Copilot. Here’s how.

I’ve built a rock, paper, scissors game with GitHub Copilot but never a browser extension. As a developer advocate at GitHub, I decided to put GitHub Copilot to the test, including its upcoming chat feature, and see if it could help me write an extension for Google Chrome to clear my cache.

I’m going to be honest: it wasn’t as straightforward as I expected it to be. I had a lot of questions throughout the build and had to learn new information.

But at the end of the day, I gained experience with learning an entirely new skill with a generative AI coding tool and pair programming with GitHub Copilot—and with other developers on Twitch 👥.

I wanted to create steps that anyone—even those without developer experience—could easily replicate when building this extension, or any other extension. But I also wanted to share my new takeaways after a night of pair programming with GitHub Copilot and human developers.

So, below you’ll find two sections:

Let’s jump in.

How to build a Chrome extension with GitHub Copilot

To get started, you’ll need to have GitHub Copilot installed and open in your IDE. I also have access to an early preview of GitHub Copilot chat, which is what I used when I had a question. If you don’t have GitHub Copilot chat, sign up for the waitlist, and pair GitHub Copilot with ChatGPT for now.

1. 🧑🏾‍💻 Using the chat window, I asked GitHub Copilot, “How do I create a Chrome extension? What should the file structure look like?”

💻 GitHub Copilot gave me general steps for creating an extension—from designing the folder structure to running the project locally in Chrome.

Screenshot of the char window where the user asked GitHub Copilot "How do I build a browser extension? What should the file structure look like?" GitHub Copilot provided some instructions in response."

Then, it shared an example of a Chrome extension file structure.

Copilot response showing an example structure for a simple Chrome extension.

To save you some time, here’s a chart that briefly defines the purpose of these files:

manifest.json 🧾 Metadata about your extension, like the name and version, and permissions. Manifest as a proper noun is the name of the Google Chrome API. The latest is V3.
popup.js 🖼 When users click on your extension icon in their Chrome toolbar, a pop-up window will appear. This file is what determines the behavior of that pop-up and contains code for handling user interactions with the pop-up window.
popup.html and style.css 🎨 These files make up the visual of your pop-up window. popup.html is the interface, including layout, structure, and content. style.css determines the way the HTML file should be displayed in the browser, including font, text color, background, etc.

2. Create the manifest.json 🧾

🧑🏾‍💻 Inside a folder in my IDE, I created a file called manifest.json. In manifest.json,

I described my desired file:

Manifest for Chrome extension that clears browser cache.
manifest_version: 3
Permissions for the extension are: storage, tabs, browsingData

I pressed enter and invoked suggestions from GitHub Copilot by typing a curly brace.

💻 Inside the curly brace, GitHub Copilot suggested the manifest. I deleted the lines describing my desired manifest.json, and the final file looked like this:

   "name": "Clear Cache",
   "version": "1.0",
   "manifest_version": 3,
   "description": "Clears browser cache",
   "permissions": [
   "action": {
       "default_popup": "popup.html"
   "background": {
       "service_worker": "background.js"

3. Create a service worker, which is a file called background.js 🔧

This wasn’t a file that was recommended from my chat with GitHub Copilot. I learned that it was a necessary file from a developer who tuned into my livestream 👥. The background.js is what gives your extension the ability to run in the background, perform tasks, and respond to user events outside of the extension’s pop-up window (like network requests and data storage).

🧑🏾‍💻 In my background.js file, I wrote a comment describing my desired service worker:

Service Worker for Google Chrome Extension 
Handles when extension is installed
Handles when message is received

Then, I wrote a comment for the first function, which prompted a suggestion from GitHub Copilot, and then wrote another comment to describe the second function.

💻 The final file looked like this:

Service Worker for Google Chrome Extension
Handles when extension is installed
Handles when message is received

// console.log when extension is installed
chrome.runtime.onInstalled.addListener(function() {
   console.log("Extension installed");

// send response when message is received and console.log when message is received
chrome.runtime.onMessage.addListener(function(request, sender, sendResponse) {
   console.log("Message received");
   sendResponse("Message received");

4. Create the popup.html file 🖼

🧑🏾‍💻 In the popup.html file, I wrote a comment describing how I wanted my pop-up window to appear. This window is the one users will see when they click on the extension icon.

   HTML for Chrome extension that clears browser cache.
   Connect to javascript file called popup.js and CSS file called style.css
   Will render the following buttons with id's:
   - "All History"
   - "Past Month"
   - "Past Week"
   - "Past Day"
   - "Past Hour"
   - "Past Minute"

   Will render an empty paragraph with id "lastCleared"

💻 GitHub Copilot suggested the following:

<!DOCTYPE html>
       <meta charset="utf-8">
       <title>Clear Cache</title>
       <link rel="stylesheet" href="style.css">
       <h1>Clear Cache</h1>
       <button id="allHistory">All History</button>
       <button id="pastMonth">Past Month</button>
       <button id="pastWeek">Past Week</button>
       <button id="pastDay">Past Day</button>
       <button id="pastHour">Past Hour</button>
       <button id="pastMinute">Past Minute</button>
       <p id="lastCleared"></p>
       <script src="popup.js"></script>

5. Test the browser extension

I decided to test the modifications before incorporating additional styles or interactivity. I prefer making incremental changes and conducting manual tests in between because it simplifies the debugging process.

🧑🏾‍💻 Navigate to chrome://extensions/ in your Chrome browser. Turn on developer mode. Press “Load unpacked.” Upload the folder that holds your Chrome extension. Once loaded, you should be able to test your extension. Here’s what mine looked like:

Screenshot of the Chrome extension offering options to "Clear Cache."

6. Create a popup.js file to add interactivity

🧑🏾‍💻 Inside this file, write a comment that will act as pseudocode for you but as a prompt for GitHub Copilot.

Pseudocode isn’t code, but a method of conceptualizing and communicating ideas to plan and design programs before writing the actual code. Programmers can concentrate on algorithmic logic without worrying about syntactic details of a particular programming language—and communicate ideas to other developers regardless of coding experience.

Here’s the pseudocode I used:

This program is a Chrome Extension that clears browser cache.
Handle on button click:
- button with id "allHistory" that clears all cache history
- button with id "pastMonth" that clears cache history from the past month
- button with id "pastWeek" that clears cache history from the past week
- button with id "pastDay" that clears cache history from the past day
- button with id "pastHour" that clears cache history from the past hour
- button with id "pastMinute" that clears cache history from the past minute

Create function that
- converts dates and times into human-readable format
- adds "Successfully cleared cache" with date and time in a paragraph with id "lastCleared"

Then, write the series of comments below, but write them one at a time to allow GitHub Copilot to generate a code suggestion after each one. The final popup.js looked like this:

This program is a Chrome Extension that clears browser cache.
Handle on button click:
- button with id "allHistory" that clears all cache history
- button with id "pastMonth" that clears cache history from the past month
- button with id "pastWeek" that clears cache history from the past week
- button with id "pastDay" that clears cache history from the past day
- button with id "pastHour" that clears cache history from the past hour
- button with id "pastMinute" that clears cache history from the past minute

Create function that
- converts dates and times into human-readable format
- adds "Successfully cleared cache" with date and time in a paragraph with id "lastCleared"

// convert date and time into human-readable format
function convertDate(date) {
   let date = new Date(date);
   var options = { weekday: 'long', year: 'numeric', month: 'long', day: 'numeric' };
   return date.toLocaleDateString("en-US", options);

// add successfully cleared cache into paragraph with id "lastCleared"
function addCleared() {
   var p = document.getElementById("lastCleared");
   let date = new Date();
   p.innerHTML = "Successfully cleared cache " + convertDate(date);

// clear all cache history
document.getElementById("allHistory").addEventListener("click", function() {
   chrome.browsingData.removeCache({ "since": 0 }, function() {

// clear cache history from the past month
document.getElementById("pastMonth").addEventListener("click", function() {
   let date = new Date();
   date.setMonth(date.getMonth() - 1);
   chrome.browsingData.removeCache({ "since": date.getTime() }, function() {

// clear cache history from the past week
document.getElementById("pastWeek").addEventListener("click", function() {
   let date = new Date();
   date.setDate(date.getDate() - 7);
   chrome.browsingData.removeCache({ "since": date.getTime() }, function() {

// clear cache history from the past day
document.getElementById("pastDay").addEventListener("click", function() {
   let date = new Date();
   date.setDate(date.getDate() - 1);
   chrome.browsingData.removeCache({ "since": date.getTime() }, function() {

// clear cache history from the past hour
document.getElementById("pastHour").addEventListener("click", function() {
  let date = new Date();
   date.setHours(date.getHours() - 1);
   chrome.browsingData.removeCache({ "since": date.getTime() }, function() {

// clear cache history from the past minute
document.getElementById("pastMinute").addEventListener("click", function() {
  let date = new Date();
   date.setMinutes(date.getMinutes() - 1);
   chrome.browsingData.removeCache({ "since": date.getTime() }, function() {

🧑🏾‍💻 GitHub Copilot actually generated the var keyword, which is outdated. So I changed that keyword to let.

7. Create the last file in your folder: style.css

🧑🏾‍💻 Write a comment that describes the style you want for your extension. Then, type “body” and continue tabbing until GitHub Copilot suggests all the styles.

My final style.css looked like this:

/* Style the Chrome extension's popup to be wider and taller
Use accessible friendly colors and fonts
Make h1 elements legible
Highlight when buttons are hovered over
Highlight when buttons are clicked
Align buttons in a column and center them but space them out evenly
Make paragraph bold and legible

body {
   background-color: #f1f1f1;
   font-family: Arial, Helvetica, sans-serif;
   font-size: 16px;
   color: #333;
   width: 400px;
   height: 400px;

h1 {
   font-size: 24px;
   color: #333;
   text-align: center;

button {
   background-color: #4CAF50;
   color: white;
   padding: 15px 32px;
   text-align: center;
   text-decoration: none;
   display: inline-block;
   font-size: 16px;
   margin: 4px 2px;
   cursor: pointer;
   border-radius: 8px;

button:hover {
   background-color: #45a049;

button:active {
   background-color: #3e8e41;

p {
   font-weight: bold;
   font-size: 18px;
   color: #333;
For detailed, step-by-step instructions, check out my Chrome extension with GitHub Copilot repo.

Three important lessons about learning and pair programming in the age of AI

  1. Generative AI reduces the fear of making mistakes. It can be daunting to learn a new language or framework, or start a new project. The fear of not knowing where to start—or making a mistake that could take hours to debug—can be a significant barrier to getting started. I’ve been a developer for over three years, but streaming while coding makes me nervous. I sometimes focus more on people watching me code and forget the actual logic. When I conversed with GitHub Copilot, I gained reassurance that I was going in the right direction and that helped me to stay motivated and confident during the stream.

  2. Generative AI makes it easier to learn about new subjects, but it doesn’t replace the work of learning. GitHub Copilot didn’t magically write an entire Chrome extension for me. I had to experiment with different prompts, and ask questions to GitHub Copilot, ChatGPT, Google, and developers on my livestream. To put it in perspective, it took me about 1.5 hours to do steps 1 to 5 while streaming.

    But if I hadn’t used GitHub Copilot, I would’ve had to write all this code by scratch or look it up in piecemeal searches. With the AI-generated code suggestions, I was able to jump right into review and troubleshooting, so a lot of my time and energy was focused on understanding how the code worked. I still had to put in the effort to learn an entirely new skill, but I was analyzing and evaluating code more often than I was trying to learn and then remember it.

  3. Generative AI coding tools made it easier for me to collaborate with other developers. Developers who tuned into the livestream could understand my thought process because I had to tell GitHub Copilot what I wanted it to do. By clearly communicating my intentions with the AI pair programmer, I ended up communicating them more clearly with developers on my livestream, too. That made it easy for people tuning in to become my virtual pair programmers during my livestream.

Overall, working with GitHub Copilot made my thought process and workflow more transparent. Like I said earlier, it was actually a developer on my livestream who recommended a service worker file after noticing that GitHub Copilot didn’t include it in its suggested file structure. Once I confirmed in a chat conversation with GitHub Copilot and a Google search that I needed a service worker, I used GitHub Copilot to help me write one.

Take this with you

GitHub Copilot made me more confident with learning something new and collaborating with other developers. As I said before, live coding can be nerve-wracking. (I even get nervous even when I’m just pair programming with a coworker!) But GitHub Copilot’s real-time code suggestions and corrections created a safety net, allowing me to code more confidently—and quickly— in front of a live audience. Also, because I had to clearly communicate my intentions with the AI pair programmer, I was also communicating clearly with the developers who tuned into my livestream. This made it easy to virtually collaborate with them.

The real-time interaction with GitHub Copilot and the other developers helped with catching errors, learning coding suggestions, and reinforcing my own understanding and knowledge. The result was a high-quality codebase for a browser extension.

This project is a testament to the collaborative power of human-AI interaction. The experience underscored how GitHub Copilot can be a powerful tool in boosting confidence, facilitating learning, and fostering collaboration among developers.

More resources

How companies are boosting productivity with generative AI

Post Syndicated from Chris Reddington original https://github.blog/2023-05-09-how-companies-are-boosting-productivity-with-generative-ai/

Is your company using generative AI yet?

While it’s still in its infancy, generative AI coding tools are already changing the way developers and companies build software. Generative AI can boost developer and business productivity by automating tasks, improving communication and collaboration, and providing insights that can inform better decision-making.

In this post, we’ll explore the full story of how companies are adopting generative AI to ship software faster, including:

Want to explore the world of generative AI for developers? 🌎

Check out our generative AI guide to learn what it is, how it works, and what it means for developers everywhere.

Get the guide >

What is generative AI?

Generative AI refers to a class of artificial intelligence (AI) systems designed to create new content similar to what humans produce. These systems are trained on large datasets of content that include text, images, audio, music, or code.

Generative AI is an extension of traditional machine learning, which trains models to predict or classify data based on existing patterns. But instead of simply predicting the outcome, generative AI models are designed to identify underlying patterns and structures of the data, and then use that knowledge to quickly generate new content. However, the main difference between the two is one of magnitude and the size of the prediction or generation. Machine learning typically predicts the next word. Generative AI can generate the next paragraph.

AI-generated image from Shutterstack of a developer using a generative AI tool to code faster.
AI-generated image from Shutterstack of a developer using a generative AI tool to code faster.

Generative AI tools have attracted particular interest in the business world. From marketing to software development, organizational leaders are increasingly curious about the benefits of the new generative AI applications and products.

“I do think that all companies will adopt generative AI tools in the near future, at least indirectly,” said Albert Ziegler, principal machine learning engineer at GitHub. “The bakery around the corner might have a logo that the designer made using a generative transformer. The neighbor selling knitted socks might have asked Bing where to buy a certain kind of wool. My taxi driver might do their taxes with a certain Excel plugin. This adoption will only increase over time.”

What are some business uses of generative AI tools? 💡

  • Software development: generative AI tools can assist engineers with building, editing, and testing code.
  • Content creation: writers can use generative AI tools to help personalize product descriptions and write ad copy.
  • Design creation: from generating layouts to assisting with graphics, generative AI design tools can help designers create entirely new designs.
  • Video creation: generative AI tools can help videographers with building, editing, or enhancing videos and images.
  • Language translation: translators can use generative AI tools to create communications in different languages.
  • Personalization: generative AI tools can assist businesses with personalizing products and services to meet the needs of individual customers.
  • Operations: from supply chain management to pricing, generative AI tools can help operations professionals drive efficiency.

How generative AI coding tools are changing the developer experience

Generative AI has big implications for developers, as the tools can enable them to code and ship software faster.

How is generative AI affecting software development?⚡

Check out our guide to learn what generative AI coding tools are, what developers are using them for, and how they’re impacting the future of development.

Get the guide >

Similar to how spell check and other automation tools can help writers build content more efficiently, generative AI coding tools can help developers produce cleaner work—and the models powering these tools are getting better by the month. Tools such as GitHub Copilot, for instance, can be used in many parts of the software development lifecycle, including in IDEs, code reviews, and testing.

The science backs this up. In 2022, we conducted research into how our generative AI tool, GitHub Copilot, helps developers. Here’s what we found:

Source: Research: quantifying GitHub Copilot’s impact on developer productivity and happiness

GitHub Copilot is only continuing to improve. When the tool was first launched for individuals in June 2022, more than 27% of developers’ code was generated by GitHub Copilot, on average. Today, that number is 46% across all programming languages—and in Java, that jumps to 61%.

How can generative AI tools help you build software? 🚀

These tools can help:

  • Write boilerplate code for various programming languages and frameworks.
  • Find information in documentation to understand what the code does.
  • Identify security vulnerabilities and implement fixes.
  • Streamline code reviews before merging new or edited code.

Explore GitHub’s vision for embedding generative AI into every aspect of the developer workflow.

Using generative AI responsibly 🙏

Like all technologies, responsibility and ethics are important with generative AI.

In February 2023, a group of 10 companies including OpenAI, Adobe, the BBC, and others agreed upon a new set of recommendations on how to use generative AI content in a responsible way.

The recommendations were put together by the Partnership on AI (PAI), an AI research nonprofit, in consultation with more than 50 organizations. The guidelines call for creators and distributors of generative AI to be transparent about what the technology can and can’t do and disclose when users might be interacting with this type of content (by using watermarks, disclaimers, or traceable elements in an AI model’s training data).

Is generative AI accurate? 🔑

Businesses should be aware that while generative AI tools can speed up the creation of content, they should not be solely relied upon as a source of truth. A recent study suggests that people can identify whether AI-generated content is real or fake only 50% of the time. Here at GitHub, we named our generative AI tool “GitHub Copilot” to signify just this—the tool can help, but at the end of the day, it’s just a copilot. The developer needs to take responsibility for ensuring that the finished code is accurate and complete.

How companies are using generative AI

Even as generative AI models and tools continue to rapidly advance, businesses are already exploring how to incorporate these into their day-to-day operations.

This is particularly true for software development teams.

“Going forward, tech companies that don’t adopt generative AI tools will have a significant productivity disadvantage,” Ziegler said. “Given how much faster this technology can help developers build, organizations that don’t adopt these tools or create their own will have a harder time in the marketplace.”

3 primary generative AI business models for organizations 📈

Enterprises all over the world are using generative AI tools to transform how work gets done. Three of the business models organizations use include:

  • Model as a Service (MaaS): Companies access generative AI models through the cloud and use them to create new content. OpenAI employs this model, which licenses its GPT-3 AI model, the platform behind ChatGPT. This option offers low-risk, low-cost access to generative AI, with limited upfront investment and high flexibility.
  • Built-in apps: Companies build new—or existing—apps on top of generative AI models to create new experiences. GitHub Copilot uses this model, which relies on Codex to analyze the context of the code to provide intelligent suggestions on how to complete it. This option offers high customization and specialized solutions with scalability.
  • Vertical integration: Vertical integration leverages existing systems to enhance the offerings. For instance, companies may use generative AI models to analyze large amounts of data and make predictions about prices or improve the accuracy of their services.

Duolingo, one of the largest language-learning apps in the world, is one company that recently adopted generative AI capabilities. They chose GitHub’s generative AI tool, GitHub Copilot, to help their developers write and ship code faster, while improving test coverage. Duolingo’s CTO Severin Hacker said GitHub Copilot delivered immediate benefits to the team, enabling them to code quickly and deliver their best work.

”[The tool] stops you from getting distracted when you’re doing deep work that requires a lot of your brain power,” Hacker noted. “You spend less time on routine work and more time on the hard stuff. With GitHub Copilot, our developers stay in the flow state and keep momentum instead of clawing through code libraries or documentation.”

After adopting GitHub Copilot and the GitHub platform, Duolingo saw a:

  • 25% increase in developer speed for those who are new to working with a specific repository
  • 10% increase in developer speed for those who are familiar with the respective codebase
  • 67% decrease in median code review turnaround time

“I don’t know of anything available today that’s remotely close to what we can get with GitHub Copilot,” Hacker said.

Looking forward

Generative AI is changing the world of software development. And it’s just getting started. The technology is quickly improving and more use cases are being identified across the software development lifecycle. With the announcement of GitHub Copilot X, our vision for the future of AI-powered software development, we’re committed to installing AI capabilities into every step of the developer workflow. There’s no better time to get started with generative AI at your company.

AWS Week in Review – AWS Notifications, Serverless event, and More – May 8, 2023

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/aws-week-in-review-aws-notifications-serverless-event-and-more-may-8-2023/

At the end of this week, I’m flying to Seattle to take part in the AWS Serverless Innovation Day. Along with many customers and colleagues from AWS, we are going to be live on May 17 at a virtual free event. During the AWS Serverless Innovation Day we will share best practices related to building event-driven applications and using serverless functions and containers. Get a calendar reminder and check the full agenda at the event site.

Serverless innovation day

Last Week’s Launches
Here are some launches that got my attention during the previous week.

New Local Zones in Auckland – AWS Local Zones allow you to deliver applications that require single-digit millisecond latency or local data processing. Starting last week, AWS Local Zones is available in Auckland, New Zealand.

All AWS Local Zones

AWS Notifications Channy wrote an article explaining how you can view and configure notifications for your AWS account. In addition to the AWS Management Console notifications, the AWS Console Mobile Application now allows you to create and receive actionable push notifications when a resource requires your attention.

AWS SimSpace Weaver Last reInvent, we launched AWS SimSpace Weaver, a fully managed compute service that helps you deploy large spatial simulations in the cloud. Starting last week, AWS SimSpace Weaver allows you to save the state of the simulations at a specific point in time.

AWS Security Hub Added four new integration partners to help customers with their cloud security posture monitoring, and now it provides detailed tracking of finding changes with the finding history feature. This last feature provides an immutable trail of changes to get more visibility into the changes made to your findings.

AWS Compute Optimizer – AWS Compute Optimizer supports inferred workload type filtering on Amazon EC2 instance recommendations and automatically detects the applications that might run on your AWS resources. Now AWS Compute Optimizer supports filtering your rightsizing recommendation by tags and identifies and filters Microsoft SQL Server workloads as an inferred workload type.

AWS AppSyncNow AWS AppSync GraphQL APIs support Private API. With Private APIs, you can now create GraphQL APIs that can only be accessed from your Amazon Virtual Private Cloud (Amazon VPC).

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other updates and news that you may have missed:

  • Responsible AI in the Generative EraAmazon Science published a very interesting blog post this week about the special challenges raised by building a responsible generative AI and the different things builders of applications can do in order to solve these challenges.
  • Patterns for Building an API to Upload Files to Amazon S3 – Amazon S3 is one of the most used services by our customers, and applications often require a way for users to upload files. In this article, Thomas Moore shows different ways to do this in a secure way.
  • The Official AWS Podcast – Listen each week for updates on the latest AWS news and deep dives into exciting use cases. There are also official AWS podcasts in your local languages. Check out the ones in FrenchGermanItalian, and Spanish.
  • AWS Open-Source News and Updates – This is a newsletter curated by my colleague Ricardo to bring you the latest open-source projects, posts, events, and more.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

  • AWS Serverless Innovation DayJoin us on May 17 for a virtual and free event about AWS Serverless. We will have talks and fireside chats with customers related to AWS Lambda, Amazon ECS with Fargate, AWS Step Functions, and Amazon EventBridge.
  • AWS re:Inforce 2023You can now register for AWS re:Inforce, happening in Anaheim, California, on June 13–14.
  • AWS Global Summits – There are many summits going on right now around the world: Stockholm (May 11), Hong Kong (May 23), India (May 25), Amsterdam (June 1), London (June 7), Washington, DC (June 7–8), Toronto (June 14), Madrid (June 15), and Milano (June 22).
  • AWS Community Day – Join a community-led conference run by AWS user group leaders in your region: Warsaw (June 1), Chicago (June 15), Manila (June 29–30), and Munich (September 14).
  • AWS User Group Peru Conference – The local AWS User Group announced a one-day cloud event in Spanish and English in Lima on September 23. Seb, Jeff, and I will be attending the event from the AWS News blog team. Register today!

That’s all for this week. Check back next Monday for another Week in Review!

— Marcia

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Web Summit Rio 2023: Building an app in 18 minutes with GitHub Copilot X

Post Syndicated from Thomas Dohmke original https://github.blog/2023-05-05-web-summit-rio-2023-building-an-app-in-18-minutes-with-github-copilot-x/

Missed GitHub CEO Thomas Dohmke’s Web Summit Rio 2023 talk? Read his remarks and get caught up on everything GitHub Copilot X.

Hello, Rio! I’m Thomas, and I’m a developer. I’ve been looking forward to this for some time, about a year. When I first planned this talk, I wanted to talk about the future of artificial intelligence and applications. But then I thought, why not show you the full power live on stage. Build something with code.

I’ve been a developer my entire adult life. I’ve been coding since 1989. But as CEO, I haven’t been able to light up my contribution graph in a while. Like all of you, like every developer, my energy, my creativity every day is finite. And all the distractions of life—from the personal to the professional—from morning to night, gradually zaps my creative energy.

When I wake up, I’m at my most creative and sometimes I have a big, light bulb idea. But as soon as the coffee is poured, the distractions start. Slack messages and emails flood in. I have to review a document before a customer presentation. And then I hear my kids yell as their football flew over the fence. And, of course, they want me to go over to our neighbors’ and get it.

Then…I sit back down, I am in wall-to-wall meetings. And by the time the sun comes down, that idea I had in the morning—it’s gone! It’s midnight and I can’t even keep my eyes open. To be honest, in all this, I usually don’t even want to get started because the perception of having to do all the mundane tasks, all the boilerplate work that comes with coding, stops me in my tracks. And I know this is true for so many developers.

The point is: boilerplate sucks! All busywork in life, but especially repetitive work is a barrier between us and what we want to achieve. Every day, building endless boilerplate prevents developers from creating a new idea that will change the world. But with AI, these barriers are about to be shattered.

You’ve likely seen the memes about the 10x developer. It’s a common internet joke—people have claimed the 10x developer many times. With GitHub Copilot and now Copilot X, it’s time for us to redefine this! It isn’t that developers need to strive to be 10x, it’s that every developer deserves to be made 10 times more productive. With AI at every step, we will truly create without captivity. With AI at every step, we will realize the 10x developer.

Imagine: 10 days of work, done in one day. 10 hours of work, done in one hour. 10 minutes of work, done with a single prompt command. This will allow us to amplify our truest self-expression. It will help a new generation of developers learn and build as fast as their minds. And because of this, we’ll emerge into a new spring of digital creativity, where every light bulb idea we have when we open our eyes for the day can be fully realized no matter what life throws at us.

During my presentation, I built a snake game in 15 minutes.

This morning I woke up, and I wanted to build something. I wanted to build my own snake game, originally created back in 1976. It’s a classic. So, 15 minutes are left on my timer. And I’m going to build this snake game. Let’s bring up Copilot X and get going!

See how you can 10x your developer superpowers. Discover GitHub Copilot X.

AWS Week in Review: New Service for Generative AI and Amazon EC2 Trn1n, Inf2, and CodeWhisperer now GA – April 17, 2023

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/aws-week-in-review-new-service-for-generative-ai-and-amazon-ec2-trn1n-inf2-and-codewhisperer-now-ga-april-17-2023/

I could almost title this blog post the “AWS AI/ML Week in Review.” This past week, we announced several new innovations and tools for building with generative AI on AWS. Let’s dive right into it.

Last Week’s Launches
Here are some launches that got my attention during the previous week:

Announcing Amazon Bedrock and Amazon Titan models Amazon Bedrock is a new service to accelerate your development of generative AI applications using foundation models through an API without managing infrastructure. You can choose from a wide range of foundation models built by leading AI startups and Amazon. The new Amazon Titan foundation models are pre-trained on large datasets, making them powerful, general-purpose models. You can use them as-is or privately to customize them with your own data for a particular task without annotating large volumes of data. Amazon Bedrock is currently in limited preview. Sign up here to learn more.

Building with Generative AI on AWS

Amazon EC2 Trn1n and Inf2 instances are now generally availableTrn1n instances, powered by AWS Trainium accelerators, double the network bandwidth (compared to Trn1 instances) to 1,600 Gbps of Elastic Fabric Adapter (EFAv2). The increased bandwidth delivers even higher performance for training network-intensive generative AI models such as large language models (LLMs) and mixture of experts (MoE). Inf2 instances, powered by AWS Inferentia2 accelerators, deliver high performance at the lowest cost in Amazon EC2 for generative AI models, including LLMs and vision transformers. They are the first inference-optimized instances in Amazon EC2 to support scale-out distributed inference with ultra-high-speed connectivity between accelerators. Compared to Inf1 instances, Inf2 instances deliver up to 4x higher throughput and up to 10x lower latency. Check out my blog posts on Trn1 instances and Inf2 instances for more details.

Amazon CodeWhisperer, free for individual use, is now generally availableAmazon CodeWhisperer is an AI coding companion that generates real-time single-line or full-function code suggestions in your IDE to help you build applications faster. With GA, we introduce two tiers: CodeWhisperer Individual and CodeWhisperer Professional. CodeWhisperer Individual is free to use for generating code. You can sign up with an AWS Builder ID based on your email address. The Individual Tier provides code recommendations, reference tracking, and security scans. CodeWhisperer Professional—priced at $19 per user, per month—offers additional enterprise administration capabilities. Steve’s blog post has all the details.

Amazon GameLift adds support for Unreal Engine 5Amazon GameLift is a fully managed solution that allows you to manage and scale dedicated game servers for session-based multiplayer games. The latest version of the Amazon GameLift Server SDK 5.0 lets you integrate your Unreal 5-based game servers with the Amazon GameLift service. In addition, the latest Amazon GameLift Server SDK with Unreal 5 plugin is built to work with Amazon GameLift Anywhere so that you can test and iterate Unreal game builds faster and manage game sessions across any server hosting infrastructure. Check out the release notes to learn more.

Amazon Rekognition launches Face Liveness to deter fraud in facial verification – Face Liveness verifies that only real users, not bad actors using spoofs, can access your services. Amazon Rekognition Face Liveness analyzes a short selfie video to detect spoofs presented to the camera, such as printed photos, digital photos, digital videos, or 3D masks, as well as spoofs that bypass the camera, such as pre-recorded or deepfake videos. This AWS Machine Learning Blog post walks you through the details and shows how you can add Face Liveness to your web and mobile applications.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Here are some additional news items and blog posts that you may find interesting:

Updates to the AWS Well-Architected Framework – The most recent content updates and improvements focus on providing expanded guidance across the AWS service portfolio to help you make more informed decisions when developing implementation plans. Services that were added or expanded in coverage include AWS Elastic Disaster Recovery, AWS Trusted Advisor, AWS Resilience Hub, AWS Config, AWS Security Hub, Amazon GuardDuty, AWS Organizations, AWS Control Tower, AWS Compute Optimizer, AWS Budgets, Amazon CodeWhisperer, and Amazon CodeGuru. This AWS Architecture Blog post has all the details.

Amazon releases largest dataset for training “pick and place” robots – In an effort to improve the performance of robots that pick, sort, and pack products in warehouses, Amazon has publicly released the largest dataset of images captured in an industrial product-sorting setting. Where the largest previous dataset of industrial images featured on the order of 100 objects, the Amazon dataset, called ARMBench, features more than 190,000 objects. Check out this Amazon Science Blog post to learn more.

AWS open-source news and updates – My colleague Ricardo writes this weekly open-source newsletter in which he highlights new open-source projects, tools, and demos from the AWS Community. Read edition #153 here.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

Build On AWS - Generative AI#BuildOn Generative AI – Join our weekly live Build On Generative AI Twitch show. Every Monday morning, 9:00 US PT, my colleagues Emily and Darko take a look at aspects of generative AI. They host developers, scientists, startup founders, and AI leaders and discuss how to build generative AI applications on AWS.

In today’s episode, Emily walks us through the latest AWS generative AI announcements. You can watch the video here.

Dot Net Developer Day.NET Developer Day.NET Enterprise Developer Day EMEA 2023 (April 25) is a free, one-day virtual event providing enterprise developers with the most relevant information to swiftly and efficiently migrate and modernize their .NET applications and workloads on AWS.

AWS Developer Innovation DayAWS Developer Innovation DayAWS Developer Innovation Day (April 26) is a new, free, one-day virtual event designed to help developers and teams be productive and collaborate from discovery to delivery, to running software and building applications. Get a first look at exciting product updates, technical deep dives, and keynotes.

AWS Global Summits – Check your calendars and sign up for the AWS Summit close to where you live or work: Tokyo (April 20–21), Singapore (May 4), Stockholm (May 11), Hong Kong (May 23), Tel Aviv (May 31), Amsterdam (June 1), London (June 7), Washington, DC (June 7–8), Toronto (June 14), Madrid (June 15), and Milano (June 22).

You can browse all upcoming AWS-led in-person and virtual events and developer-focused events such as Community Days.

That’s all for this week. Check back next Monday for another Week in Review!

— Antje

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

How generative AI is changing the way developers work

Post Syndicated from Damian Brady original https://github.blog/2023-04-14-how-generative-ai-is-changing-the-way-developers-work/

During a time when computers were solely used for computation, the engineer, Douglas Engelbart, gave the “mother of all demos,” where he reframed the computer as a collaboration tool capable of solving humanity’s most complex problems. At the start of his demo, he asked audience members how much value they would derive from a computer that could instantly respond to their actions.

You can ask the same question of generative AI models. If you had a highly responsive generative AI coding tool to brainstorm new ideas, break big ideas into smaller tasks, and suggest new solutions to problems, how much more creative and productive could you be?

This isn’t a hypothetical question. AI-assisted engineering workflows are quickly emerging with new generative AI coding tools that offer code suggestions and entire functions in response to natural language prompts and existing code. These tools, and what they can help developers accomplish, are changing fast. That makes it important for every developer to understand what’s happening now—and the implications for how software is and will be built.

In this article, we’ll give a rundown of what generative AI in software development looks like today by exploring:

The unique value generative AI brings to the developer workflow

AI and automation have been a part of the developer workflow for some time now. From machine learning-powered security checks to CI/CD pipelines, developers already use a variety of automation and AI tools, like CodeQL on GitHub, for example.

While there’s overlap between all of these categories, here’s what makes generative AI distinct from automation and other AI coding tools:

Automation: 🛤
You know what needs to be done, and you know of a reliable way to get there every time.
Rules-based logic: 🔎
You know the end goal, but there’s more than one way to achieve it.
Machine learning: 🧠
You know the end goal, but the amount of ways to achieve it scales exponentially.
Generative AI: 🌐
You have big coding dreams, and want the freedom to bring them to life.
You want to make sure that any new code pushed to your repository follows formatting specifications before it’s merged to the main branch. Instead of manually validating the code, you use a CI/CD tool like GitHub Actions to trigger an automated workflow on the event of your choosing (like a commit or pull request). You know some patterns of SQL injections, but it’s time consuming to manually scan for them in your code. A tool like Code QL uses a system of rules to sort through your code and find those patterns, so you don’t have to do it by hand. You want to stay on top of security vulnerabilities, but the list of SQL injections continues to grow. A coding tool that uses a machine learning (ML) model, like Code QL, is trained to not only detect known injections, but also patterns similar to those injections in data it hasn’t seen before. This can help you increase recognition of confirmed vulnerabilities and predict new ones. Generative AI coding tools leverage ML to generate novel answers and predict coding sequences. A tool like GitHub Copilot can reduce the amount of times you switch out of your IDE to look up boilerplate code or help you brainstorm coding solutions. Shifting your role from rote writing to strategic decision making, generative AI can help you reflect on your code at a higher, more abstract level—so you can focus more on what you want to build and spend less time worrying about how.

How are generative AI coding tools designed and built?

Building a generative AI coding tool requires training AI models on large amounts of code across programming languages via deep learning. (Deep learning is a way to train computers to process data like we do—by recognizing patterns, making connections, and drawing inferences with limited guidance.)

To emulate the way humans learn patterns, these AI models use vast networks of nodes, which process and weigh input data, and are designed to function like neurons. Once trained on large amounts of data and able to produce useful code, they’re built into tools and applications. The models can then be plugged into coding editors and IDEs where they respond to natural language prompts or code to suggest new code, functions, and phrases.

Before we talk about how generative AI coding tools are made, let’s define what they are first. It starts with LLMs, or large language models, which are sets of algorithms trained on large amounts of code and human language. Like we mentioned above, they can predict coding sequences and generate novel content using existing code or natural language prompts.

Today’s state-of-the-art LLMs are transformers. That means they use something called an attention mechanism to make flexible connections between different tokens in a user’s input and the output that the model has already generated. This allows them to provide responses that are more contextually relevant than previous AI models because they’re good at connecting the dots and big-picture thinking.

Here’s an example of how a transformer works. Let’s say you encounter the word log in your code. The transformer node at that place would use the attention mechanism to contextually predict what kind of log would come next in the sequence.

Let’s say, in the example below, you input the statement from math import log. A generative AI model would then infer you mean a logarithmic function.

And if you add the prompt from logging import log, it would infer that you’re using a logging function.

Though sometimes a log is just a log.

LLMs can be built using frameworks besides transformers. But LLMs using frameworks, like a recurrent neural network or long short-term memory, struggle with processing long sentences and paragraphs. They also typically require training on labeled data (making training a labor-intensive process). This limits the complexity and relevance of their outputs, and the data they can learn from.

Transformer LLMs, on the other hand, can train themselves on unlabeled data. Once they’re given basic learning objectives, LLMs take a part of the new input data and use it to practice their learning goals. Once they’ve achieved these goals on that portion of the input, they apply what they’ve learned to understand the rest of the input. This self-supervised learning process is what allows transformer LLMs to analyze massive amounts of unlabeled data—and the larger the dataset an LLM is trained on, the more they scale by processing that data.

Why should developers care about transformers and LLMs?

LLMs like OpenAI’s GPT-3, GPT-4, and Codex models are trained on an enormous amount of natural language data and publicly available source code. This is part of the reason why tools like ChatGPT and GitHub Copilot, which are built on these models, can produce contextually accurate outputs.

Here’s how GitHub Copilot produces coding suggestions:

  • All of the code you’ve written so far, or the code that comes before the cursor in an IDE, is fed to a series of algorithms that decide what parts of the code will be processed by GitHub Copilot.
  • Since it’s powered by a transformer-based LLM, GitHub Copilot will apply the patterns it’s abstracted from training data and apply those patterns to your input code.
  • The result: contextually relevant, original coding suggestions. GitHub Copilot will even filter out known security vulnerabilities, vulnerable code patterns, and code that matches other projects.

Keep in mind: creating new content such as text, code, and images is at the heart of generative AI. LLMs are adept at abstracting patterns from their training data, applying those patterns to existing language, and then producing language or a line of code that follows those patterns. Given the sheer scale of LLMs, they might generate a language or code sequence that doesn’t even exist yet. Just as you would review a colleague’s code, you should assess and validate AI-generated code, too.

Why context matters for AI coding tools

Developing good prompt crafting techniques is important because input code passes through something called a context window, which is present in all transformer-based LLMs. The context window represents the capacity of data an LLM can process. Though it can’t process an infinite amount of data, it can grow larger. Right now, the Codex model has a context window that allows it to process a couple of hundred lines of code, which has already advanced and accelerated coding tasks like code completion and code change summarization.

Developers use details from pull requests, a folder in a project, open issues—and the list goes on—to contextualize their code. So, when it comes to a coding tool with a limited context window, the challenge is to figure out what data, in addition to code, will lead to the best suggestions.

The order of the data also impacts a model’s contextual understanding. Recently, GitHub made updates to its pair programmer so that it considers not only the code immediately before the cursor, but also some of the code after the cursor. The paradigm—which is called Fill-In-the-Middle (FIM)—leaves a gap in the middle of the code for GitHub Copilot to fill, providing the tool with more context about the developer’s intended code and how it should align with the rest of the program. This helps produce higher quality code suggestions without any added latency.

Visuals can also contextualize code. Multimodal LLMs (MMLLMs) scale transformer LLMs so they process images and videos, as well as text. OpenAI recently released its new GPT-4 model—and Microsoft revealed its own MMLLM called Kosmos-1. These models are designed to respond to natural language and images, like alternating text and images, image-caption pairs, and text data.

GitHub’s senior developer advocate Christina Warren shares the latest on GPT-4 and the creative potential it holds for developers:

Our R&D team at GitHub Next has been working to move AI past the editor with GitHub Copilot X. With this new vision for the future of AI-powered software development, we’re not only adopting OpenAI’s new GPT-4 model, but also introducing chat and voice, and bringing GitHub Copilot to pull requests, the command line, and docs. See how we’re investigating the future of AI-powered software development >

How developers are using generative AI coding tools

The field of generative AI is filled with experiments and explorations to uncover the technology’s full capabilities—and how they can enable effective developer workflows. Generative AI tools are already changing how developers write code and build software, from improving productivity to helping developers focus on bigger problems.

While generative AI applications in software development are still being actively defined, today, developers are using generative AI coding tools to:

  • Get a head start on complex code translation tasks. A study presented at the 2021 International Conference on Intelligent User Interfaces found that generative AI provided developers with a skeletal framework to translate legacy source code into Python. Even if the suggestions weren’t always correct, developers found it easier to assess and fix those mistakes than manually translate the source code from scratch. They also noted that this process of reviewing and correcting was similar to what they already do when working with code produced by their colleagues.

With GitHub Copilot Labs, developers can use the companion VS Code extension (that’s separate from but dependent on the GitHub Copilot extension) to translate code into different programming languages. Watch how GitHub Developer Advocate, Michelle Mannering, uses GitHub Copilot Labs to translate her Python code into Ruby in just a few steps.

Our own research supports these findings, too. As we mentioned earlier, we found that developers who used GitHub Copilot coded up to 55% faster than those who didn’t. But productivity gains went beyond speed with 74% of developers reporting that they felt less frustrated when coding and were able to focus on more satisfying work.

  • Tackle new problems and get creative. The PACMPL study also found that developers used GitHub Copilot to find creative solutions when they were unsure of how to move forward. These developers searched for next possible steps and relied on the generative AI coding tool to assist with unfamiliar syntax, look up the right API, or discover the correct algorithm.

I was one of the developers who wrote GitHub Copilot, but prior to that work, I had never written a single line of TypeScript. That wasn’t a problem because I used the first prototype of GitHub Copilot to learn the language and, eventually, help ship the world’s first at-scale generative AI coding tool.

– Albert Ziegler, Principal Machine Learning Engineer // GitHub
  • Find answers without leaving their IDEs. Some participants in the PACMPL study also treated GitHub Copilot’s multi-suggestion pane like StackOverflow. Since they were able to describe their goals in natural language, participants could directly prompt GitHub Copilot to generate ideas for implementing their goals, and press Ctrl/Cmd + Enter to see a list of 10 suggestions. Even though this kind of exploration didn’t lead to deep knowledge, it helped one developer to effectively use an unfamiliar API.

A 2023 study published by GitHub in the Association for Computing Machinery’s Queue magazine also found that generative AI coding tools save developers the effort of searching for answers online. This provides them with more straightful forward answers, reduces context switching, and conserves mental energy.

Part of GitHub’s new vision for the future of AI-powered software development is a ChatGPT-like experience directly in your editor. Watch how Martin Woodward, GitHub’s Vice President of Developer Relations, uses GitHub Copilot Chat to find and fix bugs in his code.

  • Build better test coverage. Some generative AI coding tools excel in pattern recognition and completion. Developers are using these tools to build unit and functional tests—and even security tests—via natural language prompts. Some tools also offer security vulnerability filtering, so a developer will be alerted if they unknowingly introduce a vulnerability in their code.

Want to see some examples in action? Check out how Rizel Scarlett, a developer advocate at GitHub, uses GitHub Copilot to develop tests for her codebase:

  • Discover tricks and solutions they didn’t know they needed. Scarlett also wrote about eight unexpected ways developers can use GitHub Copilot—from prompting it to create a dictionary of two-letter ISO country codes and their contributing country name, to helping developers exit Vim, an editor with a sometimes finicky closing process. Want to learn more? Check out the full guide >

The bottom line

Generative AI provides humans with a new mode of interaction—and it doesn’t just alleviate the tedious parts of software development. It also inspires developers to be more creative, feel empowered to tackle big problems, and model large, complex solutions in ways they couldn’t before. From increasing productivity and offering alternative solutions, to helping you build new skills—like learning a new language or framework, or even writing clear comments and documentation—there are so many reasons to be excited about the next wave of software development. This is only the beginning.

Additional resources

Generative AI-enabled compliance for software development

Post Syndicated from Mark Paulsen original https://github.blog/2023-04-11-generative-ai-enabled-compliance-for-software-development/

In our recent blog post announcing GitHub Copilot X, we mentioned that generative AI represents the future of software development. This amazing technology will enable developers to stay in the flow while helping enterprises meet their business goals.

But as we have also mentioned in our blog series on compliance, generative AI may soon act as an enabler for developer-focused compliance programs that will drive optimization and keep your development, compliance and audit teams productive and happy.

Today, we’ll explore the potential for generative AI to help enable teams to optimize and automate some of the foundational compliance components of separation of duties that many enterprises still often manage and review manually.

Generative AI has been dominating the news lately—but what exactly is it? Here’s what you need to know, and what it means for developers.

Separation of duties

The concept of “separation of duties,” long used in the accounting world as a check and balance approach, is also adopted in other scenarios, including technology architecture and workflows. While helpful to address compliance, it can lead to additional manual steps that can slow down delivery and innovation.

Fortunately, the PCI-DSS requirements guide provides a more DevOps, cloud native, and AI-enabled approach to separation of duties by focusing on functions and accounts, as opposed to people:

“The purpose of this requirement is to separate the development and test functions from the production functions. For example, a developer can use an administrator-level account with elevated privileges in the development environment and have a separate account with user-level access to the production environment.”

There are many parts of a software delivery workflow that need to have separation of duties in place—but one of the core components that is key for any compliance program is the code review. Having a separate set of objective eyes reviewing your code, whether it’s human or AI-powered, helps to ensure risks, tech debt, and security vulnerabilities are found and mitigated as early as possible.

Code reviews also help enable the concept of separation of duties since it prohibits a single person or a single function, account, or process from moving code to the next part of your delivery workflow. Additionally, code reviews help enable separation of duties for Infrastructure as Code (IaC) workflows, Policy-as-Code configurations, and even Kubernetes declarative deployments.

As we mentioned in our previous blog, GitHub makes code review easy, since pull requests are part of the existing workflow that millions of developers use daily. Having a foundational piece of compliance built-in to the platform that developers know and love keeps them in the flow, while keeping compliance and audit teams happy as well.

Generative AI and pull requests

Wouldn’t it be cool if one-day generative AI could be leveraged to enable more developer-friendly compliance programs which have traditionally been very labor and time intensive? Imagine if generative AI could help enable DevOps and cloud native approaches for separation of duties by automating tedious tasks and allowing humans to focus on key value-added tasks.

Bringing this back to compliance and separation of duties, wouldn’t it be great if a generative AI helper was available to provide an objective set of eyes on your pull requests? This is what the GitHub Next team has been working towards with GitHub Copilot for Pull Requests.

  • Suggestions for your pull request descriptions. AI-powered tags are embedded into a pull request description and automatically filled out by GitHub Copilot based on the code which the developers changed. Going one step further, the GitHub Next team is also looking at the creation of descriptive sentences and paragraphs as developers create pull requests.
  • Code reviews with AI. Taking pull requests and code reviews one step further, the GitHub Next team is looking at AI to help review the code and provide suggestions for changes. This will help enable human interactions and optimize existing processes. The AI would automate the creation of the descriptions, based on the code changes, as well as suggestions for improvements. The code reviewer will have everything they need to quickly review the change and decide to either move forward or send the change back.

When these capabilities are production ready, development teams and compliance programs will appreciate these features for a few reasons. First, the pull request and code review process would be driven by a conversation based on a neutral and independent description. Second, the description will be based on the actual code that was changed. Third, both development and compliance workflows will be optimized and allow humans to focus on value-added work.

While these capabilities are still a work in progress, there are features available now that may help enable compliance, audit, and security teams with GitHub Copilot for Business. The ability for developers to complete tasks faster and stay in the flow are truly amazing. But the ability for GitHub Copilot to provide AI-based security vulnerability filtering nowis a great place for compliance and audit teams within enterprises to get started on their journey to embracing generative AI into their day-to-day practices.

Next steps

Generative AI will enable developers and enterprises to achieve success by reducing manual tasks and enabling developers to focus their creativity on business value–all while staying in the flow.

I hope this blog will help drive positive discussions regarding this topic and has provided a forward looking view into what will be possible in the future. The future ability of generative AI to help enable teams by automating tedious tasks will help humans focus on more value-added work and could eventually be an important part of a robust and risk-based compliance posture.


Explore GitHub Copilot X >

What developers need to know about generative AI

Post Syndicated from Damian Brady original https://github.blog/2023-04-07-what-developers-need-to-know-about-generative-ai/

By now, you’ve heard of generative artificial intelligence (AI) tools like ChatGPT, DALL-E, and GitHub Copilot, among others. They’re gaining widespread interest thanks to the fact that they allow anyone to create content from email subject lines to code functions to artwork in a matter of moments.

This potential to revolutionize content creation across various industries makes it important to understand what generative AI is, how it’s being used, and who it’s being used by. In this article, we’ll explore what generative AI is, how it works, some real-world applications, and how it’s already changing the way people (and developers) work.

What is generative AI used for?

You may have heard the buzz around new generative AI tools like ChatGPT or the new Bing, but there’s a lot more to generative AI than any one single framework, project, or application.

Traditional AI systems are trained on large amounts of data to identify patterns, and they’re capable of performing specific tasks that can help people and organizations. But generative AI goes one step further by using complex systems and models to generate new, or novel, outputs in the form of an image, text, or audio based on natural language prompts.

Generative AI models and applications can, for example, be used for:

  • Text generation. Text generation, as a field, with AI tools has been in development since the 1970s—but more recently, AI researchers have been able to train generative adversarial networks (GANs) to produce text that models human-like speech. A prime example is OpenAI’s application ChatGPT, which has been trained on thousands of texts, books, articles, and code repositories, and can respond with full answers to natural language prompts and questions.
An example of text generation in ChatGPT
An example of text generation in ChatGPT
  • Image generation. Generative AI models can be used to create new images with natural language prompts, which is one of the most popular techniques with current tools and applications. The goal with text-to-image generation is to create an image that accurately represents the content of a given prompt. For example, when we give the text prompt, “impressionist style oil painting of a Shiba Inu dog giving a tarot card reading,” to the popular AI image generator DALL-E 2 we get something that looks like this (and yes, it’s a gem):
An AI-generated image from DALL-E 2 of a Shiba Inu dog giving a tarot card reading
An AI-generated image from DALL-E 2 of a Shiba Inu dog giving a tarot card reading

An example of a video created with a text prompt using diffusion models from [Imagen Video](https://imagen.research.google/).

  • Programming code generation. Rather than scouring the internet or developer community groups for help with code examples, generative AI models can be used to help generate new programming code with natural language prompts, complete partially written code with suggestions, or even translate code from one programming language to another. This is how, at a simple level, GitHub Copilot works: it uses OpenAI’sCodex model to offer code suggestions right from a developer’s editor. However, as you would with any software development tool, we encourage you to review generated code before merging into production.

  • Data generation. Creating new data—which is called synthetic data—and augmenting existing data sets is another common use case for generative AI. This involves generating new samples from an existing dataset to increase the dataset’s size and improve machine learning models trained on it, all while providing a layer of privacy since real user data is not being utilized to power models. Synthetic data generation provides a way to create useful, meaningful data for more than just ML training though—a number of self-driving car companies like Cruise and Waymo utilize AI-generated synthetic data for training perception systems to prepare vehicles for real-world situations while in operation.

  • Language translation. Natural-language understanding (NLU) models combined with generative AI have become increasingly popular to provide language translations on-the-fly. These types of tools help companies break language barriers and increase their scope of accessibility for customer bases by being able to provide things like support or documentation in their native language. Through complex, deep learning algorithms, generative AI is able to understand the context of a source text and linguistically construct those sentences in another language. This practice can also apply to coding languages, for example, translating a desired function from Python to Java.

The bottom line: Even though generative AI is a relatively new technology, it’s already being used in consumer and business applications. The use cases, as well as the quantity of applications created with it, will continue evolving to meet more distinct and specific needs.

How does generative AI work?

Generative AI models work by using neural networks to identify patterns from large sets of data, then generate new and original data or content.

But what are neural networks? In simple terms, they use interconnected nodes that are inspired by neurons in the human brain. These networks are the foundation of machine learning and deep learning models, which use a complex structure of algorithms to process large amounts of data such as text, code, or images. Training these neural networks involves adjusting the weights or parameters of the connections between neurons to minimize the difference between predicted and desired outputs, which allows the network to learn from mistakes and make more accurate predictions based on the data.

Algorithms are a key component of machine learning and generative AI models. But beyond helping machines learn from data, algorithms are also used to optimize accuracy of outputs and make decisions, or recommendations, based on input data.

While algorithms help automate these processes, building a generative AI model is incredibly complex due to the massive amounts of data and compute resources they require. People and organizations need large datasets to train these models, and generating high-quality data can be time-consuming and expensive.

To restate the obvious, these models are complicated. Need proof? Here are some common generative AI models and how they work:

  • Large language models (LLM): LLMs are a type of machine learning model that process and generate natural language text. One of the most significant advancements in the development of large language models has been the availability of vast amounts of text data, such as books, websites, and social media posts. This data can be used to train models that are capable of predicting and generating natural language responses in a variety of contexts. As a result, large language models have multiple practical applications, such as virtual assistants, chatbots, or text generators, like ChatGPT.

  • Generative adversarial networks (GAN): GANs are one of the most used models for generative AI, and they employ two different neural networks. GANs consist of two different types of neural networks: a generator and a discriminator. The generator network generates new data, such as images or audio, from a random noise signal while the discriminator is trained to distinguish between real data from the training set and the data produced by the generator.

During training, the generator tries to create data that can trick the discriminator network into thinking it’s real. This “adversarial” process will continue until the generator can produce data that is totally indistinguishable from real data in the training set. This process helps both networks improve at their respective tasks, which ultimately results in more realistic and higher-quality generated data.

A diagram illustrating how a generative adversarial network works. Image [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/deed.en) האדם-החושב on wikipedia
A diagram illustrating how a generative adversarial network works. Image [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/deed.en) האדם-החושב on wikipedia
  • Transformer-based models: A transformer-based model’s neural networks operate by learning context and meaning through tracking relationships of sequential data, which means these models are really good at natural language processing tasks like machine translation, language modeling, and answering questions. These models have been used in popular language models, such as GPT-4 (which stands for Generative Pre-trained Transformer 4), and have also been adapted for other such tasks that require the modeling of sequential data such as image recognition.
  • Variational autoencoder models (VAEs): These models are similar to GANs in that they work with two different neural networks: encoders and decoders. VAEs can take a large amount of data and compress it into a smaller representation, which can be used to create new data that is similar to the original data. VAEs are often used in image, video, and audio generation—and here’s a fun fact: you can train a VAE on datasets like CelebA, which contains over 200,000 images of celebrities, to create completely new portraits of people that don’t exist.
 The smile vector, a concept vector discovered by [Tom White](https://aiartists.org/tom-white) using VAEs trained on the CelebA dataset.
The smile vector, a concept vector discovered by Tom White using VAEs trained on the CelebA dataset.

The real-world applications of generative AI

The impact of generative AI is quickly becoming apparent—but it’s still in its early days. Despite this, we’re already seeing a proliferation of applications, products, and open source projects that are using generative AI models to achieve specific outcomes for people and organizations (and yes, developers, too).

Though generative AI is constantly evolving, it already has some solid real world applications. Here’s just a few of them:


New and seasoned developers alike can utilize generative AI to improve their coding processes. Generative AI coding tools can help automate some of the more repetitive tasks, like testing, as well as complete code or even generate brand new code. GitHub has its own AI-powered pair programmer, GitHub Copilot, which uses generative AI to provide developers with code suggestions. And GitHub also has announced GitHub Copilot X, which brings generative AI to more of the developer experience across the editor, pull requests, documentation, CLI, and more.


Generative AI has the potential to greatly impact and improve accessibility for folks with disabilities through a variety of modalities, such as speech-to-text transcription, text-to-speech audio generation, or assistive technologies. One of the most exciting facets of our GitHub Copilot tool is its voice-activated capabilities that allow developers with difficulties using a keyboard to code with their voice. By leveraging the power of generative AI, these types of tools are paving the way for a more inclusive and accessible future in technology.


Generative AI can take gaming to the next level (get it? 😉) by generating new characters, storylines, design components, and more. Case in point: The developer behind the game, This Girl Does Not Exist, has said that every component of the game—from the storyline to the art and even the music—was generated entirely by AI. This use of generative AI can enable gaming studios to create new and exciting content for their users, all without increasing the developer workload, which frees them up to work on other aspects of the game, such as story development.

Web design

Designers can utilize generative AI tools to automate the design process and save significant time and resources, which allows for a more streamlined and efficient workflow. Additionally, incorporating these tools into the development process can lead to the creation of highly customized designs and logos, enhancing the overall user experience and engagement with the website or application. Generative AI tools can also be used to do some of the more tedious work, such as creating design layouts that are optimized and adaptable across devices. For example, designers can use tools like designs.ai to quickly generate logos, banners, or mockups for their websites.

Microsoft and other industry players are increasingly utilizing generative AI models in search to create more personalized experiences. This includes query expansion, which generates relevant keywords to reduce the number of searches. So, rather than the search engine returning a list of links, generative AI can help these new and improved models return search results in the form of natural language responses. Bing now includes AI-powered features in partnership with OpenAI that provide answers to complex questions and allow users to ask follow-up questions in a chatbox for more refined responses.


Interest has emerged around the potential applications of generative AI in the healthcare field to improve disease detection and diagnosis, advance medical research, and accelerate progress in the pharmaceutical space. Potentially, generative AI could be used to analyze large amounts of data to simulate chemical structures and predict new compounds will be the most effective for new drug discoveries. NVIDIA Clara is one example of a generative AI model specifically designed for medical imaging and healthcare research. (Plus, Gartner suggests more than 30 percent of new pharmaceutical drugs and materials will be discovered via generative AI models by 2025.)

Fun fact: Did you know that ChatGPT recently passed the US Medical Licensing exam without any intervention from clinicians?

Marketing and advertising

In marketing, content is king—and generative AI is making it easier than ever to quickly create large amounts of it. A number of companies, agencies, and creators are already turning to generative AI tools to create images for social posts or write captions, product descriptions, blog posts, email subject lines, and more. Generative AI can also help companies personalize ad experiences by creating custom, engaging content for individuals at speed. Writers, marketers, and creators can leverage tools like Jasper to generate copy, Surfer SEO to optimize organic search, or albert.ai to personalize digital advertising content.

Art and design

As we’ve seen above, the power of AI can be harnessed to create some incredible portraits in a matter of moments (re: the future-telling Shiba 🐕). Artists and designers alike are using these AI tools as a source of inspiration. For example, architects can quickly create 3D models of objects or environments and artists can breathe new life into their portraits by using AI to apply different styles, like adding a Cubist style to their original image. Need proof? Designers are already starting to use AI image generators, such as Midjourney and Microsoft Designer, to create high-quality images by simply typing out Discord commands.


In a recent discussion about tech trends and how they’ll affect the finance sector, Michael Schrage, a research fellow at the MIT Sloan School Initiative on the Digital Economy, said, “I think, increasingly, we’re going to be seeing generative AI used for financial forecasts and scenario generation.” This is a likely path forward—generative AI can be used to analyze large amounts of data to detect fraud, manage risk, and inform decision making. And that has obvious applications in the financial services industry.


Manufacturers are starting to turn to generative AI solutions to help with product design, quality control, and predictive maintenance. Generative AI can be used to analyze historical data to improve machine failure predictions and help manufacturers with maintenance planning. According to research conducted by Capgemini, more than half of European manufacturers are implementing some AI solutions (although so far, these aren’t generative AI solutions). This is largely because the sheer amount of manufacturing data is easier for machines to analyze at speed than humans.

AI as a partner: Generative AI models and tools are narrow in focus, and work best at generating content, code, and images. In research at GitHub, we’ve found that GitHub Copilot helps developers code up to 55% faster, underscoring how generative AI models and tools can improve overall productivity and boost efficiency. Metrics like these show how generative AI tools are already changing how people and teams work—but they also underscore how these tools act as complement to human efforts.

Take this with you

Whether it’s creating visual assets for an ad campaign or augmenting medical images to help diagnose diseases, generative AI is helping us solve complex problems at speed. And the emergence of generative AI-based programming tools has revolutionized the way developers approach writing code.

We know that developers want to design and write software quickly, and tools like GitHub Copilot are enabling them to access large datasets to write more efficient code and boost productivity. In fact, 96% of developers surveyed reported spending less time on repetitive tasks using GitHub Copilot, which in turn allowed 74% of them to focus on more rewarding work.

While these models aren’t perfect yet, they’re getting better by the day—and that’s creating an exciting immediate future for developers and generative AI.