
The journey of building a comprehensive attribution platform

Post Syndicated from Grab Tech original https://engineering.grab.com/attribution-platform

The Grab superapp offers a comprehensive array of services from ride-hailing and food delivery to financial services. This creates multifaceted user journeys, traversing homepages, product pages, checkouts, and interactions with diverse content, including advertisements and promo codes.

Background: Why ads and attribution matter in our superapp

Ads are crucial for Grab in driving user engagement and supporting our ecosystem by seamlessly connecting users with our services. In the ever-evolving world of advertising, the ability to gauge the impact of marketing investments takes on pivotal significance. Advertisers dedicate substantial resources to promote their businesses, necessitating a clear understanding of the return on AdSpend (ROAS) for each campaign. In this context, attribution plays a central role, serving as the guiding compass for advertisers and marketers, elucidating the effectiveness of touchpoints within campaigns.

For instance, a merchant-partner seeks to enhance its reach by advertising on the Grab food delivery homepage. With the assistance of our attribution system, the merchant-partner can now precisely gauge the impact of their homepage ads on Grab. This involves tracking user engagement and monitoring the resulting orders that stem from these interactions. This level of granularity not only highlights the value of attribution but also demonstrates its capability in providing detailed insights into the effectiveness of advertising campaigns and enabling merchant-partners to optimise their campaigns with more precision.

In this blog, we delve into the technical intricacies, software architecture, challenges, and solutions involved in crafting a state-of-the-art engineering solution for the attribution platform.

Genesis: Pre-project landscape

When our journey began in 2020, Grab’s marketing efforts had limited attribution capabilities and data analytics was predominantly reliant on ad hoc queries conducted by business and data analysts. Before the introduction of a standardised approach, we had to manage discrepant results and a time-consuming manual process of data preparation, cleansing, and storage across teams. When issues arose in the analytical pipeline, resolution took longer than it should have and the same problems kept recurring. We needed a comprehensive engineering solution that would address the identified gaps and significantly enhance metrics related to ROI, attribution accuracy, and data-handling efficiency.

Inception: The pure ads attribution engine (Kappa architecture)

We chose Kappa architecture due to its imperative role in achieving near real-time attribution, especially in support of our new pricing model, cost per order (CPO). With this solution, we aimed to drastically reduce data latency from 2-3 days to just a few minutes. Traditional ETL (Extract, Transform, and Load) based batch processing methods were evaluated but quickly found to be inadequate for our purposes, mainly due to their latency.

In the advertising industry, rapid decision-making is critical. Traditional batch processing solutions would introduce significant latency, hampering our ability to make real-time, data-driven decisions. With its architecture’s inherent capability for real-time stream processing, Kappa emerged as the logical choice. Additionally, Kappa offers the agility required to empower our ad-serving team for real-time decision support, and better ad ranking and selection, enabling dynamic and effective targeting decisions without delay.

The first step on this journey was to create a pure, near real-time stream processing Ads Attribution Engine. This engine was based on the Kappa architecture to provide advertisers with quick insights into their ROAS through real-time attribution, enabling them to optimise their campaigns efficiently.

High-level workflow of the Ads Attribution Engine

In this solution, we used the following tools in our tech stack:

  • Kafka for event streams
  • Amazon DynamoDB (DDB) for event storage
  • Amazon S3 as the data lake
  • An in-house stream processing framework similar to Keystone
  • Redis for caching events
  • ScyllaDB for storing ad metadata
  • Amazon relational database service (RDS) for analytics

Architecture of the near real-time stream processing Ads Attribution Engine
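
To make the engine’s core logic more concrete, the sketch below shows last-touch attribution in Go. It is only an illustration, not Grab’s implementation: the event shapes, the 24-hour attribution window, and the in-memory map standing in for the Redis cache and the stream processing framework are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// AdClick and Order are simplified, hypothetical event shapes; the real
// engine consumes richer events from Kafka and caches interactions in Redis.
type AdClick struct {
	UserID     string
	CampaignID string
	ClickedAt  time.Time
}

type Order struct {
	UserID    string
	OrderID   string
	PlacedAt  time.Time
	AmountSGD float64
}

// Attributor keeps the latest ad click per user (last-touch) and matches
// incoming orders against it within a fixed attribution window.
type Attributor struct {
	window     time.Duration
	lastClicks map[string]AdClick // userID -> most recent click
}

func NewAttributor(window time.Duration) *Attributor {
	return &Attributor{window: window, lastClicks: make(map[string]AdClick)}
}

// OnAdClick records the most recent click for a user.
func (a *Attributor) OnAdClick(c AdClick) {
	if prev, ok := a.lastClicks[c.UserID]; !ok || c.ClickedAt.After(prev.ClickedAt) {
		a.lastClicks[c.UserID] = c
	}
}

// OnOrder attributes an order to a campaign if the user's latest click
// happened within the attribution window before the order was placed.
func (a *Attributor) OnOrder(o Order) (campaignID string, attributed bool) {
	c, ok := a.lastClicks[o.UserID]
	if !ok {
		return "", false
	}
	if o.PlacedAt.After(c.ClickedAt) && o.PlacedAt.Sub(c.ClickedAt) <= a.window {
		return c.CampaignID, true
	}
	return "", false
}

func main() {
	attr := NewAttributor(24 * time.Hour) // hypothetical attribution window
	now := time.Now()

	attr.OnAdClick(AdClick{UserID: "u1", CampaignID: "cmp-42", ClickedAt: now.Add(-2 * time.Hour)})
	if cmp, ok := attr.OnOrder(Order{UserID: "u1", OrderID: "o1", PlacedAt: now, AmountSGD: 18.50}); ok {
		fmt.Printf("order o1 attributed to campaign %s\n", cmp)
	}
}
```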

Evolution: Merging marketing levers – Ads and promos

We began to envision a world where we could merge various marketing levers into a unified Attribution Engine, starting with ads and promos. This evolved vision also aimed to prevent order double counting (when a user interacts with both ads and promos in the same checkout), which would provide a more holistic attribution solution.
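
As an illustration of how double counting can be prevented, the hedged sketch below resolves a single winning touchpoint per checkout when both an ad and a promo were involved. The last-touch priority rule used here is an assumption made for the example, not necessarily Grab’s attribution model.

```go
package main

import (
	"fmt"
	"time"
)

// Touchpoint is a simplified, hypothetical representation of a marketing
// interaction (an ad click or a promo redemption) tied to a checkout.
type Touchpoint struct {
	Lever      string // "ads" or "promo"
	CampaignID string
	OccurredAt time.Time
}

// resolveWinner returns the single touchpoint credited for a checkout,
// preventing the same order from being counted once for ads and once for
// promos. Here the most recent touchpoint wins (a last-touch assumption).
func resolveWinner(touchpoints []Touchpoint) (Touchpoint, bool) {
	if len(touchpoints) == 0 {
		return Touchpoint{}, false
	}
	winner := touchpoints[0]
	for _, t := range touchpoints[1:] {
		if t.OccurredAt.After(winner.OccurredAt) {
			winner = t
		}
	}
	return winner, true
}

func main() {
	now := time.Now()
	tps := []Touchpoint{
		{Lever: "ads", CampaignID: "cmp-42", OccurredAt: now.Add(-30 * time.Minute)},
		{Lever: "promo", CampaignID: "promo-7", OccurredAt: now.Add(-5 * time.Minute)},
	}
	if w, ok := resolveWinner(tps); ok {
		fmt.Printf("checkout attributed to %s campaign %s\n", w.Lever, w.CampaignID)
	}
}
```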

With the unified Attribution Engine, we would also enable more sophisticated personalisation through machine learning models and drive higher conversions.

The unified Attribution Engine workflow, which included Promo touch points

The unified attribution engine used mostly the same tech stack, except for analytics where Druid was used instead of RDS.

Architecture of the unified Attribution Engine

Introspection: Identifying shortcomings and the path to improvement

While the unified attribution engine was a step in the right direction, it wasn’t without its challenges: real-time data processing costs, scalability for longer attribution windows, latency and lag issues, out-of-order events leading to misattribution, and the complexity of implementing multi-touch attribution models. To truly empower advertisers and enhance the attribution process, we knew we needed to evolve further.

Rebirth: The birth of a full-fledged attribution platform (Lambda architecture)

This journey eventually led us to build a full-fledged attribution platform using Lambda architecture, which blended both batch and real-time stream processing methods. With this change, our platform could rapidly and accurately process data and attribute the impact of ads and promos on user behaviour.

Why Lambda architecture?

This choice was a strategic one – real-time processing is vital for tracking events as they occur, but it offers only a current snapshot of user behaviour. This means we would not be able to analyse historical data, which is a crucial aspect of accurate attribution and exploring multiple attribution models. Historical data allows us to identify trends, patterns, and correlations not evident in real-time data alone.

High level workflow for the full-fledged attribution platform with Lambda architecture

In this system’s tech stack, the key components are:

  • Coban, an in-house stream processing framework used for real-time data processing
  • Spark-based ETL jobs for batch processing
  • Amazon S3 as the data warehouse
  • An offline layer that is capable of providing historical context, handling large data volumes, performing complex analytics, and so on.

Key benefits of the offline layer

  • Provides historical context: The offline layer enriches the attribution process by providing a historical perspective on user interactions, essential for precise attribution analysis spanning extended time periods.
  • Handles enormous data volumes: This layer efficiently manages and processes extensive data generated by advertising campaigns, ensuring that attribution seamlessly accommodates large-scale data sets.
  • Performs complex analytics: Enabling more intricate computations and data analysis than real-time processing alone, the offline layer is instrumental in fine-tuning attribution models and enhancing their accuracy.
  • Ensures reliability in the face of challenges: By providing fault tolerance and resilience against system failures, the offline layer ensures the continuous and dependable operation of the attribution system, even during unexpected events.
  • Optimises data storage and serving: Using Amazon S3 as the storage layer for raw data, the offline layer keeps storage efficient while serving processed results through interactive reporting APIs.

Architecture of our comprehensive offline attribution platform

Challenges with Lambda and mitigation

Lambda architecture allows us to have the accuracy and robustness of batch processing along with real-time stream processing. However, we noticed some drawbacks, mainly the complexity of maintaining both batch and stream processing in parallel:

  • Operating two parallel systems for batch and stream processing can lead to increased complexity in production environments.
  • Lambda architecture requires two sets of business logic – one for the batch layer and another for the stream layer.
  • Synchronisation across both layers can make system alterations more challenging.
  • This dual implementation could also lead to inconsistencies and introduce potential bugs into the system.

To mitigate these complications, we’re establishing an optimisation strategy for our current system. By distinctly separating the responsibilities of our real-time pipelines from those of our offline jobs, we intend to harness the full potential of each approach, while simultaneously curbing the added complexity.

Hence, we are redefining the way we utilise Lambda architecture, striking an efficient balance between real-time responsiveness and sturdy accuracy with the proposal below.

Vanguard: Enhancements in the future

In the coming months, we will be implementing the optimisation strategy and improving our attribution platform solution. This strategy can be broken down into the following sections.

Real-time pipeline handling time-sensitive data: Real-time pipelines can process and deliver time-sensitive metrics like CPO-related data in near real-time, allowing for budget capping and immediate adjustments to marketing spend. This can provide us with actionable insights that can help with areas like real-time bidding, real-time marketing, or dynamic pricing. By limiting the volume of data through the real-time path, we can ensure it’s more manageable and focused on immediate actionable data.

Batch jobs handling all other reporting data: Batch processing is best suited for computations that are not time-bound and where completeness is more important. By dedicating more time to the processing phase, batch processing can handle larger volumes and more complex computations, providing more comprehensive and accurate reporting.

This approach will simplify our Lambda architecture, as the batch and real-time pipelines will have clear separation of duties. It may also reduce the chance of discrepancies between the real-time and batch-processing datasets and lower the operational load of our real-time system.
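
To illustrate this separation of duties, here is a small, hypothetical routing function: events carrying time-sensitive, CPO-related metrics take the real-time path, while everything that only feeds reporting is left to the batch layer. The event types and the routing rule are made up for the example and do not reflect the actual pipeline configuration.

```go
package main

import "fmt"

// Event is a hypothetical, simplified attribution event.
type Event struct {
	Type       string // e.g. "cpo_spend", "impression", "click"
	CampaignID string
}

// route decides which layer should process an event: the real-time pipeline
// for time-sensitive, budget-related data, or the batch layer for everything
// that feeds reporting.
func route(e Event) string {
	switch e.Type {
	case "cpo_spend", "budget_update": // time-sensitive: budget capping, spend adjustments
		return "realtime"
	default: // completeness matters more than latency
		return "batch"
	}
}

func main() {
	for _, e := range []Event{
		{Type: "cpo_spend", CampaignID: "cmp-42"},
		{Type: "impression", CampaignID: "cmp-42"},
	} {
		fmt.Printf("%s -> %s\n", e.Type, route(e))
	}
}
```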

Conclusion: A holistic attribution picture

Through our journey of building a comprehensive attribution platform, we can now deliver a holistic and dependable view of user behaviour and empower merchant-partners to use insights from advertisements and promotions. This journey has been a long one, but we were able to improve our attribution solution in several ways:

  • Attribution latency: Successfully reduced attribution latency from 2-3 days to just a few minutes, ensuring that advertisers can access real-time insights and feedback.
  • Data accuracy: Through improved data collection and processing, we achieved data discrepancies of less than 1%, enhancing the accuracy and reliability of attribution data.
  • Conversion rate: Advertisers witnessed a significant increase in conversion rates, a direct result of our real-time attribution capabilities.
  • Cost efficiency: Embracing the Lambda architecture led to a ~25% reduction in real-time data processing costs, allowing for more efficient campaign optimisations.
  • Operational resilience: Building an offline layer provided fault tolerance and resilience against system failures, ensuring that our attribution system continued to operate seamlessly, even during unexpected events.

Join us

Grab is the leading superapp platform in Southeast Asia, providing everyday services that matter to consumers. More than just a ride-hailing and food delivery app, Grab offers a wide range of on-demand services in the region, including mobility, food, package and grocery delivery services, mobile payments, and financial services across 428 cities in eight countries.

Powered by technology and driven by heart, our mission is to drive Southeast Asia forward by creating economic empowerment for everyone. If this mission speaks to you, join our team today!

Managing dynamic marketplace content at scale: Grab’s approach to content moderation

Post Syndicated from Grab Tech original https://engineering.grab.com/dynamic-marketplace

In the fast-paced world of on-demand delivery, maintaining safe marketplaces is a complex undertaking. Grab, a leading superapp in Southeast Asia, operates GrabFood and GrabMart, two popular marketplaces that connect consumers with a wide range of food and daily necessities. With more than 100k item listings updated daily by our merchants across eight different countries, Grab is rising to the challenge of ensuring that its marketplaces remain compliant with its own policies, government regulations, as well as platform policies.

This article provides an overview of how Grab employs a combination of automated and manual content moderation to manage its dynamic marketplace content efficiently, while also collaborating with Google to ensure marketplace safety. Stay tuned for future articles that will delve deeper into the technology and solutions used for content moderation.

Dynamic Marketplace Landscape

Marketplaces like GrabFood and GrabMart are at the forefront of connecting merchants and consumers. These marketplaces provide an avenue for merchants to showcase their offerings, enabling consumers to conveniently access a plethora of on-demand options. However, in an environment characterised by rapid changes as well as evolving regulatory frameworks, maintaining the integrity of these marketplaces becomes a formidable task.

Scale and Flexibility: A Dual Challenge

The cornerstone of Grab’s success lies in its ability to adapt to the unique regulations and requirements of each country it operates in. This necessitates a nuanced and multifaceted approach to content moderation. To achieve both scale and flexibility, Grab employs a proactive strategy that combines and leverages automated and manual moderation processes.

Automated Moderation

Automated moderation plays a pivotal role in efficiently managing the high volume of listings that undergo daily updates. Grab utilises advanced algorithms and machine learning technologies, built in-house, to scan listings every day for potential violations of its own policies, government regulations, and platform policies. This automation not only speeds up the process of putting eligible listings on the Grab platform, but also ensures consistent adherence to predefined guidelines. However, automated moderation is not without its limitations, as contextual understanding and subjective judgment often require human intervention.

Manual Moderation

Recognising the nuanced nature of content moderation, Grab employs a team of human moderators who possess the cultural awareness and contextual understanding necessary to assess complex cases. These moderators review listings flagged by algorithms and machine learning technologies that require human judgment, ensuring that content aligns with Grab’s policies, local regulations as well as platform policies. Manual moderation adds a layer of human insight that automated systems may lack, contributing to a more accurate and contextually sensitive approach.

In its commitment to ensuring marketplace safety, Grab has also established a strong collaboration with Google. Grab works hand in hand with Google to collectively ensure adherence to Play Store policies and guidelines.

Grab

  • Programme Management: Poonam Gambhire, Shuyang Sun
  • Product: Chris Collard
  • Engineering: Shuya Ding, Kirubakaran Duraisamy, Xu Chen

Google

  • Play Policy: Siddhartha Paul Tiwari
  • Business Development: Mika Igarashi


10 unexpected ways to use GitHub Copilot

Post Syndicated from Kedasha Kerr original https://github.blog/2024-01-22-10-unexpected-ways-to-use-github-copilot/

Writing code is more than just writing code. There are commit messages to write, CLI commands to execute, and obscure syntax to try to remember. While you’ve probably used GitHub Copilot to support your coding, did you know it can also support your other workloads?

GitHub Copilot is widely known for its ability to help developers write code in their IDE. Today, I want to show you how the AI assistant’s abilities can extend beyond just code generation. In this post, we’ll explore 10 use cases where GitHub Copilot can help reduce friction during your developer workflow. This includes pull requests, working from the command line, debugging CI/CD workflows, and much more!

Let’s get into it.

1. Run terminal commands from GitHub Copilot Chat

If you ever forget how to run a particular command when you’re working in VS Code, GitHub Copilot Chat is here to help! With the new @terminal agent in VS Code, you can ask GitHub Copilot how to run a particular command. Once it generates a response, you can then click the “Insert into Terminal” button to run the suggested command.

Let me show you what I mean:

The @terminal agent in VS Code also has context about the integrated shell terminal, so it can help you even further.

2. Write pull request summaries (Copilot Enterprise feature only)

We’ve all been there where we made a sizable pull request with tons of files and hundreds of changes. But, sometimes, it can be hard to remember every little detail that we’ve implemented or changed.

Yet it’s an important part of collaborating with other engineers/developers on my team. After all, if I don’t give them a summary of my proposed changes, I’m not giving them the full context they need to provide an effective review. Thankfully, GitHub Copilot is now integrated into pull requests! This means, with the assistance of AI, you can generate a detailed pull request summary of the changes you made in your files.

Let’s look at how you can generate these summaries:

Now, isn’t that grand! All you have to do is go in and edit what was generated and you have a great, detailed explanation of all the changes you’ve made—with links to changed files!

Note: You will need a Copilot Enterprise plan (which requires GitHub Enterprise Cloud) to use PR summaries. Learn more about this enterprise feature by reading our documentation.

3. Generate commit messages

I came across this one recently while making changes in VS Code. GitHub Copilot can help you generate commit messages right in your IDE. If you click on the source control button, you’ll notice a sparkle in the message input box.

Click on those sparkles and voilà, commit messages are generated on your behalf:

I thought this was a pretty nifty feature of GitHub Copilot in VS Code and Visual Studio.

4. Get help in the terminal with GitHub Copilot in the CLI

Another way to get help with terminal commands is to use GitHub Copilot in the CLI. This is an extension to the GitHub CLI that helps you with general shell commands, Git commands, and gh CLI commands.

GitHub Copilot in the CLI is a game-changer that is super useful for reminding you of commands, teaching you new commands or explaining random commands you come across online.

Learn how to get started with GitHub Copilot in the CLI by reading this post!

5. Talk to your repositories on GitHub.com (Copilot Enterprise feature only)

If you’ve ever gone to a new repository and have no idea what’s happening even though the README is there, you can now use GitHub Copilot Chat to explain the repository to you, right on GitHub.com. Just click on the Copilot icon in the top right corner of the repository and ask whatever you want to know about that repository.

On GitHub.com you can ask Copilot general software related questions, questions about the context of your project, questions about a specific file, or specified lines of code within a file.

Note: You will need a Copilot Enterprise plan (which requires GitHub Enterprise Cloud) to use GitHub Copilot Chat in repositories on GitHub.com. Learn more about this enterprise feature by reading our documentation.

6. Fix code inline

Did you know that in addition to asking for suggestions with comments, you can get help with your code inline? Just highlight the code you want to fix, right click, and select “Fix using Copilot.” Copilot will then provide you with a suggested fix for your code.

This is great to have for those small little fixes we sometimes need right in our current files.

7. Bulk close 1000+ GitHub Issues

My team and I had a use case where we needed to close over 1,600 invalid GitHub Issues submitted to one of our repositories. I created a custom GitHub Action that automatically closed all 1,600+ issues and implemented the solution with GitHub Copilot.

GitHub Copilot Chat helped me to create the GitHub Action, and also helped me implement the closeIssue() function very quickly by leveraging Octokit to grab all the issues that needed to be closed.

Example of a closeissues.js script generated by GitHub Copilot

You can read all about how I bulk closed 1000+ GitHub issues in this blog post, but just know that with GitHub Copilot Chat, we went from having 1,600+ open issues, to a measly 64 in a matter of minutes.

8. Generate documentation for your code

We all love documenting our code, but just in case some of us need a little help writing documentation, GitHub Copilot is here to help!

Regardless of your language, you can quickly generate documentation following language-specific formats: Docstring for Python, JSDoc for JavaScript, or Javadoc for Java.

9. Get help with error messages in your terminal

Error messages can often be confusing. With GitHub Copilot in your IDE, you can now get help with error messages right in the terminal. Just highlight the error message, right click, and select “Explain with Copilot.” GitHub Copilot will then provide you with a description of the error and a suggested fix.

You can also bring error messages from your browser console into Copilot Chat so it can explain those messages to you as well with the /explain slash command.

10. Debug your GitHub Actions workflow

Whenever I have a speaking engagement, I like to create my slides using Slidev, an open source presentation slide builder for developers. I enjoy using it because I can create my slides in Markdown and still make them look splashy! Take a look at this one for example!

Anyway, there was a point in time where I had an issue with deploying my slides to GitHub Pages and I just couldn’t figure out what the issue was. So, of course, I turned to my trusty assistant—GitHub Copilot Chat that helped me debug my way through deploying my slides.

Conversation between GitHub Copilot Chat and developer to debug a GitHub Actions workflow

Read more about how I debugged my deployment workflow with GitHub Copilot Chat here.

GitHub Copilot goes beyond code completion

As you see above, GitHub Copilot extends far beyond your editor and code completion. It is truly evolving to be one of the best tools you can have in your developer toolkit. I’m still learning and discovering new ways to integrate GitHub Copilot into my daily workflow and I hope you give some of the above a chance!

Be sure to sign up for GitHub Copilot if you haven’t tried it out yet, and stay up to date with all that’s happening by subscribing to our developer newsletter for more tips, technical guides, and best practices! You can also drop me a note on X if you have any questions, @itsthatladydev.

Until next time, happy coding!

The post 10 unexpected ways to use GitHub Copilot appeared first on The GitHub Blog.

An elegant platform

Post Syndicated from Grab Tech original https://engineering.grab.com/an-elegant-platform

Coban is Grab’s real-time data streaming platform team. As a platform team, we thrive on providing our internal users from all verticals with self-served data-streaming resources, such as Kafka topics, Flink and Change Data Capture (CDC) pipelines, various kinds of Kafka-Connect connectors, as well as Apache Zeppelin notebooks, so that they can effortlessly leverage real-time data to build intelligent applications and services.

In this article, we present our journey from pure Infrastructure-as-Code (IaC) towards a more sophisticated control plane that has revolutionised the way data streaming resources are self-served at Grab. This change has also led to improved scalability, stability, security, and user adoption of our data streaming platform.

Problem statement

In the early days of public cloud, it was a common practice to create virtual resources by clicking through the web console of a cloud provider, which is sometimes referred to as ClickOps.

ClickOps has many downsides, such as:

  • Inability to review, track, and audit changes to the infrastructure.
  • Inability to massively scale the infrastructure operations.
  • Inconsistencies between environments, e.g. staging and production.
  • Inability to quickly recover from a disaster by re-creating the infrastructure at a different location.

That said, ClickOps has one tremendous advantage; it makes creating resources using a graphical User Interface (UI) fairly easy for anyone like Infrastructure Engineers, Software Engineers, Data Engineers etc. This also leads to a high iteration speed towards innovation in general.

IaC resolved many of the limitations of ClickOps, such as:

  • Changes are committed to a Version Control System (VCS) like Git: They can be reviewed by peers before being merged. The full history of all changes is available for investigating issues and for audit.
  • The infrastructure operations scale better: Code for similar pieces of infrastructure can be modularised. Changes can be rolled out automatically by Continuous Integration (CI) pipelines in the VCS system, when a change is merged to the main branch.
  • The same code can be used to deploy the staging and production environments consistently.
  • The infrastructure can be re-created anytime from its source code, in case of a disaster.

However, IaC unwittingly posed a new entry barrier too, requiring the learning of new tools like Ansible, Puppet, Chef, Terraform, etc.

Some organisations set up dedicated Site Reliability Engineer (SRE) teams to centrally manage, operate, and support those tools and the infrastructure as a whole, but that soon created the potential of new bottlenecks in the path to innovation.

On the other hand, others let engineering teams manage their own infrastructure, and Grab adopted that same approach. We use Terraform to manage infrastructure, and all teams are expected to have select engineers who have received Terraform training and have a clear understanding of it.

In this context, Coban’s platform initially started as a handful of Git repositories where users had to submit their Merge Requests (MR) of Terraform code to create their data streaming resources. Once reviewed by a Coban engineer, those Terraform changes would be applied by a CI pipeline running Atlantis.

While this was a meaningful first step towards self-service and platformisation of Coban’s offering within Grab, it had several significant downsides:

  • Stability: Due to the lack of control on the Terraform changes, the CI pipeline was prone to human errors and frequent failures. For example, users would initiate a new Terraform project by duplicating an existing one, but then would forget to change the location of the remote Terraform state, leading to the in-place replacement of an existing resource.
  • Scalability: The Coban team needed to review all MRs and provide ad hoc support whenever the pipeline failed.
  • Security: In the absence of Identity and Access Management (IAM), MRs could potentially contain changes pertaining to other teams’ resources, or even changes to Coban’s core infrastructure, with code review as the only guardrail.
  • Limited user growth: We could only acquire users who were well-versed in Terraform.

It soon became clear that we needed to build a layer of abstraction between our users and the Terraform code, to increase the level of control and lower the entry barrier to our platform, while still retaining all of the benefits of IaC under the hood.

Solution

We designed and built an in-house three-tier control plane made of:

  • Coban UI, a front-end web interface, providing our users with a seamless ClickOps experience.
  • Heimdall, the Go back-end of the web interface, transforming ClickOps into IaC.
  • Khone, the storage and provisioner layer, a Git repository storing Terraform code and metadata of all resources as well as the CI pipelines to plan and apply the changes.

In the next sections, we will deep dive in those three components.

Fig. 1 Simplified architecture of a request flowing from the user to the Coban infrastructure, via the three components of the control plane: the Coban UI, Heimdall, and Khone.

Although we designed the user journey to start from the Coban UI, our users can still opt to communicate with Heimdall and with Khone directly, e.g. for batch changes, or just because many engineers love Git and we want to encourage broad adoption. To make sure that data is eventually consistent across the three systems, we made Khone the only persistent storage layer. Heimdall regularly fetches data from Khone, caches it, and presents it to the Coban UI upon each query.

We also continued using Terraform for all resources, instead of mixing various declarative infrastructure approaches (e.g. Kubernetes Custom Resource Definition, Helm charts), for the sake of consistency of the logic in Khone’s CI pipelines.

Coban UI

The Coban UI is a React Single Page Application (React SPA) designed by our partner team Chroma, a dedicated team of front-end engineers who thrive on building legendary UIs and reusable components for platform teams at Grab.

It serves as a comprehensive self-service portal, enabling users to effortlessly create data streaming resources by filling out web forms with just a few clicks.

Fig. 2 Screen capture of a new Kafka topic creation in the Coban UI.

In addition to facilitating resource creation and configuration, the Coban UI is seamlessly integrated with multiple monitoring systems. This integration allows for real-time monitoring of critical metrics and health status for Coban infrastructure components, including Kafka clusters, Kafka topic bytes in/out rates, and more. Under the hood, all this information is exposed by Heimdall APIs.

Fig. 3 Screen capture of the metrics of a Kafka cluster in the Coban UI.

In terms of infrastructure, the Coban UI is hosted in AWS S3 website hosting. All dynamic content is generated by querying the APIs of the back-end: Heimdall.

Heimdall

Heimdall is the Go back-end of the Coban UI. It serves a collection of APIs for:

  • Managing the data streaming resources of the Coban platform with Create, Read, Update and Delete (CRUD) operations, treating the Coban UI as a first-class citizen.
  • Exposing the metadata of all Coban resources, so that they can be used by other platforms or searched in the Coban UI.

All operations are authenticated and authorised. Read more about Heimdall’s access control in Migrating from Role to Attribute-based Access Control.

In the next sections, we are going to dive deeper into these two features.

Managing the data streaming resources

First and foremost, Heimdall enables our users to self-manage their data streaming resources. It primarily relies on Khone as its storage and provisioner layer for actual resource management via Git CI pipelines. Therefore, we designed Heimdall’s resource management workflow to leverage the underlying Git flow.

Fig. 4 Diagram flow of a request in Heimdall.

Fig. 4 shows the diagram flow of a typical request in Heimdall to create, update, or delete a resource.

  1. An authenticated user initiates a request, either by navigating in the Coban UI or by calling the Heimdall API directly. At this stage, the request state is Initiated on Heimdall.
  2. Heimdall validates the request against multiple validation rules. For example, if an ongoing change request exists for the same resource, the request fails. If all tests succeed, the request state moves to Ongoing.
  3. Heimdall then creates an MR in Khone, which contains the Terraform files describing the desired state of the resource, as well as an in-house metadata file describing the key attributes of both resource and requester.
  4. After the MR has been created successfully, Heimdall notifies the requester via Slack and shares the MR URL.
  5. After that, Heimdall starts polling the status of the MR in a loop.
  6. For changes pertaining to production resources, an approver who is code owner in the repository of the resource has to approve the MR. Typically, the approver is an immediate teammate of the requester. Indeed, as a platform team, we empower our users to manage their own resources in a self-service fashion. Ultimately, the requester would merge the MR to trigger the CI pipeline applying the actual Terraform changes. Note that for staging resources, this entire step 6 is automatically performed by Heimdall.
  7. Depending on the MR status and the status of its CI pipeline in Khone, the final state of the request can be:
    • Failed if the CI pipeline has failed in Khone.
    • Completed if the CI pipeline has succeeded in Khone.
    • Cancelled if the MR was closed in Khone.

Heimdall exposes APIs to let users track the status of their requests. In the Coban UI, a page queries those APIs to elegantly display the requests.

Fig. 5 Screen capture of the Coban UI showing all requests.
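
For illustration, the request lifecycle described above can be modelled as a small state machine. The following Go sketch is a simplification: the state names follow the article, but the types, the polling interval, and the Khone client interface are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// RequestState mirrors the states mentioned in the article.
type RequestState string

const (
	Initiated RequestState = "Initiated"
	Ongoing   RequestState = "Ongoing"
	Completed RequestState = "Completed"
	Failed    RequestState = "Failed"
	Cancelled RequestState = "Cancelled"
)

// MRStatus is a simplified view of a merge request and its CI pipeline in Khone.
type MRStatus struct {
	Closed         bool
	PipelineDone   bool
	PipelinePassed bool
}

// khoneClient is a stand-in for the real integration with the Khone repository.
type khoneClient interface {
	CreateMR(resource string) (mrURL string, err error)
	GetMRStatus(mrURL string) (MRStatus, error)
}

// processRequest validates a request, opens an MR in Khone, and polls it
// until the request reaches a terminal state.
func processRequest(kc khoneClient, resource string, validate func(string) error) RequestState {
	// State: Initiated.
	if err := validate(resource); err != nil {
		return Failed
	}
	// State: Ongoing. Heimdall opens an MR in Khone containing the Terraform
	// and metadata files, then notifies the requester on Slack with the MR URL.
	mrURL, err := kc.CreateMR(resource)
	if err != nil {
		return Failed
	}
	// Poll the MR until it reaches a terminal state.
	for {
		st, err := kc.GetMRStatus(mrURL)
		if err != nil {
			return Failed
		}
		switch {
		case st.Closed && !st.PipelineDone:
			return Cancelled
		case st.PipelineDone && st.PipelinePassed:
			return Completed
		case st.PipelineDone && !st.PipelinePassed:
			return Failed
		}
		time.Sleep(30 * time.Second) // hypothetical polling interval
	}
}

// fakeKhone lets the sketch run end to end without a real Khone backend.
type fakeKhone struct{}

func (fakeKhone) CreateMR(string) (string, error) { return "https://gitlab.example.com/mr/1", nil }
func (fakeKhone) GetMRStatus(string) (MRStatus, error) {
	return MRStatus{PipelineDone: true, PipelinePassed: true}, nil
}

func main() {
	final := processRequest(fakeKhone{}, "kafka-topic/my-topic", func(string) error { return nil })
	fmt.Println("final request state:", final)
}
```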

Exposing the metadata

Apart from managing the data streaming resources, Heimdall also centralises and exposes the metadata pertaining to those resources so other Grab systems can fetch and use it. They can make various queries, for example, listing the producers and consumers of a given Kafka topic, or determining if a database (DB) is the data source for any CDC pipeline.

To make this happen, Heimdall not only retains the metadata of all of the resources that it creates, but also regularly ingests additional information from a variety of upstream systems and platforms, to enrich and make this metadata comprehensive.

Fig. 6 Diagram showing some of Heimdall’s upstreams (on the left) and downstreams (on the right) for metadata collection, enrichment, and serving. The arrows show the data flow. The network connection (client -> server) is actually the other way around.

On the left side of Fig. 6, we illustrate Heimdall’s ingestion mechanism with several examples (step 1):

  • The metadata of all Coban resources is ingested from Khone. This means the metadata of the resources that were created directly in Khone is also available in Heimdall.
  • The list of Kafka producers is retrieved from our monitoring platform, where most of them emit metrics.
  • The list of Kafka consumers is retrieved directly from the respective Kafka clusters, by listing the consumer groups and respective Client IDs of each partition.
  • The metadata of all DBs that are used as a data source for CDC pipelines is fetched from Grab’s internal DB management platform.
  • The Kafka stream schemas are retrieved from the Coban schema repository.
  • The configuration of each Kafka stream is retrieved from Grab’s Universal Configuration Management platform.

With all of this ingested data, Heimdall can provide comprehensive and accurate information about all data streaming resources to any other Grab platforms via a set of dedicated APIs.

The right side of Fig. 6 shows some examples (step 2) of Heimdall’s serving mechanism:

  • As a downstream of Heimdall, the Coban UI enables our direct users to conveniently browse their data streaming resources and access their attributes.
  • The entire resource inventory is ingested into the broader Grab inventory platform, based on backstage.io.
  • The Kafka streams are ingested into Grab’s internal data discovery platform, based on DataHub, where users can discover and trace the lineage of any piece of data.
  • The CDC connectors pertaining to DBs are ingested by Grab internal DB management platform, so that they are made visible in that platform when users are browsing their DBs.

Note that the downstream platforms that ingest data from Heimdall each expose a particular view of the Coban inventory that serves their purpose, but the Coban platform remains the only source of truth for any data streaming resource at Grab.

Lastly, Heimdall leverages an internal MySQL DB to support quick data query and exploration. The corresponding API is called by the Coban UI to let our users conveniently search globally among all resources’ attributes.

Fig. 7 Screen capture of the global search feature in the Coban UI.

Khone

Khone is the persistent storage layer of our platform, as well as the executor for actual resource creation, changes, and deletion. Under the hood, it is actually a GitLab repository of Terraform code in typical GitOps fashion, with CI pipelines to plan and apply the Terraform changes automatically. In addition, it also stores a metadata file for each resource.

Compared to letting the platform create the infrastructure directly and keep track of the desired state in its own way, relying on a standard IaC tool like Terraform for the actual changes to the infrastructure presents two major advantages:

  • The Terraform code can directly be used for disaster recovery. In case of a disaster, any entitled Cobaner with a local copy of the main branch of the Khone repository is able to recreate all our platform resources directly from their machine. There is no need to rebuild the entire platform’s control plane, thus reducing our Recovery Time Objective (RTO).
  • Minimal effort required to follow the API changes of our infrastructure ecosystem (AWS, Kubernetes, Kafka, etc.). When such a change happens, all we need to do is to update the corresponding Terraform provider.

If you’d like to read more about Khone, check out Securing GitOps pipelines. In this section, we will only focus on Khone’s features that are relevant from the platform perspective.

Lightweight Terraform

In Khone, each resource is stored as a Terraform definition. There are two major differences from a normal Terraform project:

  • No Terraform environment definition: the required Terraform providers and the location of the remote Terraform state file are automatically generated by the CI pipeline via a simple wrapper.
  • Only vetted Khone Terraform modules can be used. This is controlled and enforced by the CI pipeline via code inspection. There is one such Terraform module for each kind of supported resource of our platform (e.g. Kafka topic, Flink pipeline, Kafka Connect mirror source connector etc.). Furthermore, those in-house Terraform modules are designed to automatically derive their key variables (e.g. resource name, cluster name, environment) from the relative path of the parent Terraform project in the Khone repository.

Those characteristics are designed to limit the risk and blast radius of human errors. They also make sure that all resources created in Khone are supported by our platform, so that they can also be discovered and managed in Heimdall and the Coban UI. Lastly, by generating the Terraform environment on the fly, we can destroy resources simply by deleting the directory of the project in the code base – this would not be possible otherwise.
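
To illustrate how key variables can be derived from a project’s location in the repository, here is a hedged sketch of such a wrapper in Go. The directory layout, variable names, and backend configuration below are assumptions made for the example, not Khone’s actual conventions.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// resourceVars holds the variables a Khone-style wrapper could derive purely
// from where a Terraform project sits in the repository.
type resourceVars struct {
	Environment  string
	Cluster      string
	ResourceKind string
	ResourceName string
}

// deriveVars assumes a hypothetical layout such as
// "production/kafka-cluster-a/kafka-topic/my-topic" and maps each path
// segment to a variable.
func deriveVars(projectPath string) (resourceVars, error) {
	parts := strings.Split(filepath.ToSlash(filepath.Clean(projectPath)), "/")
	if len(parts) != 4 {
		return resourceVars{}, fmt.Errorf("unexpected path layout: %q", projectPath)
	}
	return resourceVars{
		Environment:  parts[0],
		Cluster:      parts[1],
		ResourceKind: parts[2],
		ResourceName: parts[3],
	}, nil
}

// backendConfig renders a remote-state configuration so that each project
// gets its own state path, removing a common source of human error.
func backendConfig(v resourceVars) string {
	key := fmt.Sprintf("%s/%s/%s/%s/terraform.tfstate",
		v.Environment, v.Cluster, v.ResourceKind, v.ResourceName)
	return fmt.Sprintf(`terraform {
  backend "s3" {
    bucket = "khone-terraform-states" # hypothetical bucket name
    key    = %q
  }
}`, key)
}

func main() {
	v, err := deriveVars("production/kafka-cluster-a/kafka-topic/my-topic")
	if err != nil {
		panic(err)
	}
	fmt.Println(backendConfig(v))
}
```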

Resource metadata

All resource metadata is stored in a YAML file that is present in the Terraform directory of each resource in the Khone repository. This is mainly used for ownership and cost attribution.

With this metadata, we can:

  • Better communicate with our users whenever their resources are impacted by an incident or an upcoming maintenance operation.
  • Help teams understand the costs of their usage of our platform, a significant step towards cost efficiency.

There are two different ways resource metadata can be created:

  • Automatically through Heimdall: The YAML metadata file is automatically generated by Heimdall.
  • Through Khone by a human user: The user needs to prepare the YAML metadata file and include it in the MR. This file is then verified by the CI pipeline.

Outcome

The initial version of the three-tier Coban platform, as described in this article, was internally released in March 2022, supporting only Kafka topic management at the time. Since then, we have added support for Flink pipelines, four kinds of Kafka Connect connectors, CDC pipelines, and more recently, Apache Zeppelin notebooks. At the time of writing, the Coban platform manages about 5000 data streaming resources, all described as IaC under the hood.

Our platform also exposes enriched metadata that includes the full data lineage from Kafka producers to Kafka consumers, as well as ownership information, and cost attribution.

With that, our monthly active users have almost quadrupled, truly moving the needle towards democratising the usage of real-time data within all Grab verticals.

In spite of that user growth, the end-to-end workflow success rate for self-served resource creation, change or deletion, remained well above 90% in the first half of 2023, while the Heimdall API uptime was above 99.95%.

Challenges faced

A common challenge for platform teams resides in the misalignment between the Service Level Objective (SLO) of the platform, and the various environments (e.g. staging, production) of the managed resources and upstream/downstream systems and platforms.

Indeed, the platform aims to guarantee the same level of service, regardless of whether it is used to create resources in the staging or the production environment. From the platform team’s perspective, the platform as a whole is considered production-grade, as soon as it serves actual users.

A naive approach to address this challenge is to let the production version of the platform manage all resources regardless of their respective environments. However, doing so does not permit a hermetic segregation of the staging and production environments across the organisation, which is a good security practice, and often a requirement for compliance. For example, the production version of the platform would have to connect to upstream systems in the staging environment, e.g. staging Kafka clusters to collect their consumer groups, in the case of Heimdall. Conversely, the staging version of certain downstreams would have to connect to the production version of Heimdall, to fetch the metadata of relevant staging resources.

The alternative approach, generally adopted across Grab, is to instantiate all platforms in each environment (staging and production), while still considering both instances as production-grade and guaranteeing tight SLOs in both environments.

Fig. 8 Architecture of the Coban platform, broken down by environment.

In Fig. 8, both instances of Heimdall have equivalent SLOs. The caveat is that all upstream systems and platforms must also guarantee a strict SLO in both environments. This obviously comes with a cost, for example, tighter maintenance windows for the operations pertaining to the Kafka clusters in the staging environment.

A strong “platform” culture is required for platform teams to fully understand that their instance residing in the staging environment is not their own staging environment and should not be used for testing new features.

What’s next?

Currently, users creating, updating, or deleting production resources in the Coban UI (or directly by calling Heimdall API) receive the URL of the generated GitLab MR in a Slack message. From there, they must get the MR approved by a code owner, typically another team member, and finally merge the MR, for the requested change to be actually implemented by the CI pipeline.

Although this was a fairly easy way to implement a maker/checker process that was immediately compliant with our regulatory requirements for any changes in production, the user experience is not optimal. In the near future, we plan to bring the approval mechanism into Heimdall and the Coban UI, while still providing our more advanced users with the option to directly create, approve, and merge MRs in GitLab. In the longer run, we would also like to enhance the Coban UI with the output of the Khone CI jobs that include the Terraform plan and apply results.

There is another aspect of the platform that we want to improve. As Heimdall regularly polls the upstream platforms to collect their metadata, this introduces a latency between a change in one of those platforms and its reflection in the Coban platform, which can hinder the user experience. To refresh resource metadata in Heimdall in near real time, we plan to leverage an existing Grab-wide event stream, where most of the configuration and code changes at Grab are produced as events. Heimdall will soon be able to consume those events and update the metadata of the affected resources immediately, without waiting for the next periodic refresh.


Road localisation in GrabMaps

Post Syndicated from Grab Tech original https://engineering.grab.com/road-localisation-grabmaps

Introduction

In 2022, Grab achieved self-sufficiency in its Geo services. As part of this transition, one crucial step was moving towards using an internally-developed map tailored specifically to the market in which Grab operates. Now that we have full control over the map layer, we can add more data to it or improve it according to the needs of the services running on top. One key aspect that this transition unlocked for us was the possibility of creating hyperlocal data at map level.

For instance, by determining the country to which a road belongs, we can now automatically infer the official language of that country and display the street name in that language. In another example, knowing the country for a specific road, we can automatically infer the driving side (left-handed or right-handed) leading to an improved navigation experience. Furthermore, this capability also enables us to efficiently handle various scenarios. For example, if we know that a road is part of a gated community, an area where our driver partners face restricted access, we can prevent the transit through that area.

These are just some examples of the possibilities from having full control over the map layer. By having an internal map, we can align our maps with specific markets and provide better experiences for our driver-partners and customers.

Background

For all these to be possible, we first needed to localise the roads inside the map. Our goal was to include hyperlocal data into the map, which refers to data that is specific to a certain area, such as a country, city, or even a smaller part of the city like a gated community. At the same time, we aimed to deliver our map with a high cadence, thus, we needed to find the right way to process this large amount of data while continuing to create maps in a cost-effective manner.

Solution

In the following sections of this article, we will use an extract from the Southeast Asia map to provide visual representations of the concepts discussed.

In Figure 1, Image 1 shows a visualisation of the road network, the roads belonging to this area. The coloured lines in Image 2 represent the borders identifying the countries in the same area. Overlapping the information from Image 1 and Image 2, we can extrapolate and say that the entire surface included in a certain border could have the same set of common properties as shown in Image 3. In Image 4, we then proceed with adding localised roads for each area.

Figure 1 – Map of Southeast Asia

For this to be possible, we have to find a way to localise each road and identify its associated country. Once this localisation process is complete, we can replicate all this information specific to a given border onto each individual road. This information includes details such as the country name, driving side, and official language. We can go even further and infer more information, and add hyperlocal data. For example, in Vietnam, we can automatically prevent motorcycle access on the motorways.

Assigning each road on the map to a specific area, such as a country, service area, or subdivision, presents a complex task. So, how can we efficiently accomplish this?

Implementation

The most straightforward approach would be to test the inclusion of each road into each area boundary, but that is easier said than done. With close to 30 million road segments in the Southeast Asia map and over 10 thousand areas, determining inclusion or intersection between every polyline and polygon is computationally expensive.

Our solution to this challenge involves replacing the expensive yet precise operation with a decent approximation. We introduce a proxy entity, the geohash, and we use it to approximate the areas and also to localise the roads.

We replace the geometrical inclusion with a series of simpler and less expensive operations. First, we conduct an inexpensive precomputation where we identify all the geohashes that belong to a certain area or fall within a defined border. We then identify the geohashes to which the roads belong. Finally, we use these precomputed values to assign roads to their respective areas. This process is also computationally inexpensive.

Given the large area we process, we leverage big data techniques to distribute the execution across multiple nodes and thus speed up the operation. We want to deliver the map daily and this is one of the many operations that are part of the map-making process.

What is a geohash?

To further understand our implementation we will first explain the geohash concept. A geohash is a unique identifier of a specific region on the Earth. The basic idea is that the Earth is divided into regions of user-defined size and each region is assigned a unique id, which is known as its geohash. For a given location on earth, the geohash algorithm converts its latitude and longitude into a string.

Geohashes use a Base-32 alphabet encoding system comprising characters ranging from 0 to 9 and A to Z, excluding “A”, “I”, “L” and “O”. Imagine dividing the world into a grid with 32 cells. The first character in a geohash identifies one of these 32 cells. Each of these cells is then further subdivided into 32 smaller cells. This subdivision process continues, refining down to specific areas of the world. Adding characters to the geohash subdivides a cell, effectively zooming in to a more detailed area.

The precision factor of the geohash determines the size of the cell. For instance, a precision factor of one creates a cell 5,000 km high and 5,000 km wide. A precision factor of six creates a cell 0.61 km high and 1.22 km wide. Furthermore, a precision factor of nine creates a cell 4.77 m high and 4.77 m wide. It is important to note that cells are not always square and can have varying dimensions.

In Figure 2, we have exemplified a precision 6 geohash whose code is wsdt33.

Figure 2 – An example of geohash code wsdt33
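
For readers who want to see the mechanics, below is a minimal geohash encoder in Go that follows the algorithm described above: bits alternate between longitude and latitude, and every five bits become one Base-32 character. It is a generic illustration rather than the library used in our pipeline, and the sample coordinates are arbitrary.

```go
package main

import (
	"fmt"
	"strings"
)

// The geohash Base-32 alphabet: digits 0-9 and lowercase letters, excluding a, i, l and o.
const base32 = "0123456789bcdefghjkmnpqrstuvwxyz"

// encodeGeohash converts a latitude/longitude pair into a geohash of the given
// precision (number of characters). Bits alternate between longitude and
// latitude, starting with longitude, and every 5 bits map to one character.
func encodeGeohash(lat, lon float64, precision int) string {
	latRange := [2]float64{-90, 90}
	lonRange := [2]float64{-180, 180}

	var sb strings.Builder
	bits, idx := 0, 0
	evenBit := true // true: refine longitude, false: refine latitude

	for sb.Len() < precision {
		if evenBit {
			mid := (lonRange[0] + lonRange[1]) / 2
			if lon >= mid {
				idx = idx*2 + 1
				lonRange[0] = mid
			} else {
				idx = idx * 2
				lonRange[1] = mid
			}
		} else {
			mid := (latRange[0] + latRange[1]) / 2
			if lat >= mid {
				idx = idx*2 + 1
				latRange[0] = mid
			} else {
				idx = idx * 2
				latRange[1] = mid
			}
		}
		evenBit = !evenBit
		bits++
		if bits == 5 {
			sb.WriteByte(base32[idx])
			bits, idx = 0, 0
		}
	}
	return sb.String()
}

func main() {
	// Arbitrary sample coordinates (central Singapore), precision 6.
	fmt.Println(encodeGeohash(1.3521, 103.8198, 6))
}
```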

Using less expensive operations

Calculating the inclusion of the roads inside a certain border is an expensive operation. However, quantifying the exact expense is challenging as it depends on several factors. One factor is the complexity of the border. Borders are usually irregular and very detailed, as they need to correctly reflect the actual border. The complexity of the road geometry is another factor that plays an important role as roads are not always straight lines.

Figure 3 – Roads to localise

Since this operation is expensive both in terms of cloud cost and time to run, we needed to identify a cheaper and faster way that would yield similar results. Knowing that the complexity of the border lines is the cause of the problem, we tried a simpler alternative: a rectangle. Calculating the inclusion of a polyline inside a rectangle is a much cheaper operation.

Figure 4 – Roads inside a rectangle

So we transformed this large, one-step operation, where we test each road segment for inclusion in a border, into a series of smaller operations where we perform the following steps:

  1. Identify all the geohashes that are part of a certain area or belong to a certain border. In this process we include additional areas to make sure that we cover the entire surface inside the border.
  2. For each road segment, we identify the list of geohashes that it belongs to. A road, depending on its length or depending on its shape, might belong to multiple geohashes.

In Figure 5, we identify that the road belongs to two geohashes and that the two geohashes are part of the border we use.

Figure 5 – Geohashes as proxy

Now, all we need to do is join the two data sets together. This kind of operation is a great candidate for a big data approach, as it allows us to run it in parallel and speed up the processing time.
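
Conceptually, the join looks like the small Go sketch below, which localises each road by looking up the areas of its geohashes. In production this runs as a distributed join over the full road network, and the identifiers used here are made up.

```go
package main

import "fmt"

func main() {
	// Precomputed once: which area (e.g. country) each geohash belongs to.
	geohashToArea := map[string]string{
		"w21zd1": "SG",
		"w21zd2": "SG",
		"w0zz9x": "ID",
	}

	// Computed per map build: the geohashes each road segment touches.
	// A long or curved road can span several geohashes.
	roadToGeohashes := map[string][]string{
		"road-001": {"w21zd1"},
		"road-002": {"w21zd2", "w0zz9x"}, // crosses two geohashes
	}

	// The "join": localise each road by looking up its geohashes.
	for road, hashes := range roadToGeohashes {
		areas := map[string]bool{}
		for _, gh := range hashes {
			if area, ok := geohashToArea[gh]; ok {
				areas[area] = true
			}
		}
		for area := range areas {
			fmt.Printf("%s -> %s\n", road, area)
		}
	}
}
```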

Precision tradeoff

We mentioned earlier that we traded the precise operation for a decent approximation. Let’s now delve into the real tradeoff of adopting this approach.

The first thing that stands out with this approach is that we traded precision for cost. We are able to reduce the cost as this approach uses fewer hardware resources and less computation time. However, precision suffers, particularly for roads located near borders, as they might be wrongly classified.

Going back to the initial example, let’s take the case of the external road, on the left side of the area. As you can see in Figure 6, it is clear that the road does not belong to our border. But when we apply the geohash approach it gets included into the middle geohash.

Figure 6 – Wrong road localisation

Given that just a small part of the geohash falls inside the border, the entire geohash will be classified as belonging to that area, and, as a consequence, the road that belongs to that geohash will be wrongly localised and we’ll end up adding the wrong localisation information to that road. This is clearly a consequence of the precision tradeoff. So, how can we solve this?

Geohash precision

One option is to increase the geohash precision. By using smaller and smaller geohashes, we can better reflect the actual area. As we go deeper and we further split the geohash, we can accurately follow the border. However, a high geohash precision also equates to a computationally intensive operation bringing us back to our initial situation. Therefore, it is crucial to find the right balance between the geohash size and the complexity of operations.

Figure 7 – Geohash precision

Geohash coverage percentage

To find a balance between precision and data loss, we looked into calculating the geohash coverage percentage. For example, in Figure 8, the blue geohash is entirely within the border. Here we can say that it has a 100% geohash coverage.

Figure 8 – Geohash inside the border

However, take for example the geohash in Figure 9. It touches the border and has only around 80% of its surface inside the area. Given that most of its surface is within the border, we still can say that it belongs to the area.

Figure 9 – Geohash partially inside the border

Let’s look at another example. In Figure 10, only a small part of the geohash is within the border. We can say that the geohash coverage percentage here is around 5%. For these cases, it becomes difficult for us to determine whether the geohash does belong to the area. What would be a good tradeoff in this case?

Figure 10 – Geohash barely inside the border
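One way to reason about this tradeoff is to compute the coverage percentage directly. As a rough sketch of how that could be done, the snippet below uses shapely to intersect one geohash cell (represented by its bounding box) with the border polygon; the coordinates and the 50% threshold are illustrative assumptions, not values from our pipeline.

from shapely.geometry import Polygon, box

border = Polygon([(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (1.0, 4.0), (0.0, 2.5)])
geohash_cell = box(3.5, 2.0, 4.5, 3.0)  # bounding box of one geohash

coverage = geohash_cell.intersection(border).area / geohash_cell.area

# The threshold decides whether the cell is treated as part of the area;
# choosing it is exactly the precision/data-loss tradeoff discussed here.
COVERAGE_THRESHOLD = 0.5
belongs_to_area = coverage >= COVERAGE_THRESHOLD
print(f"coverage = {coverage:.0%}, belongs to area: {belongs_to_area}")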

Border shape

To go one step further, we can consider a mixed solution where we use the border shape, but only for the geohashes touching the border. This is still a computationally intensive operation, but the number of roads located in these geohashes is much smaller, so it is still a gain.

For the geohashes with full coverage inside the area, we'll use the geohash for localisation, which is the simpler operation. For the geohashes near the border, we'll use a different approach: to increase precision around the borders, we cut the geohash following the border's shape. Instead of a rectangle, we end up with a more complex shape, which is still simpler than the initial border shape.

Figure 11 – Geohash following a border’s shape
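A small sketch of this mixed approach is shown below, again with shapely and illustrative coordinates: cells fully inside the border keep their rectangle, while border cells are clipped to the border's shape before roads are tested against them.

from shapely.geometry import Polygon, box, LineString

border = Polygon([(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (1.0, 4.0), (0.0, 2.5)])

def localisation_shape(geohash_cell, border):
    """Return the shape used to test roads falling in this geohash cell."""
    if border.contains(geohash_cell):
        return geohash_cell                   # cheap case: the whole cell is inside
    return geohash_cell.intersection(border)  # border cell: clip to the border

cell = box(3.5, 2.0, 4.5, 3.0)
shape = localisation_shape(cell, border)

road = LineString([(3.6, 2.1), (3.9, 2.4)])   # a road segment in that cell
print("road belongs to the area:", shape.intersects(road))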

Result

We began with a simple approach and enhanced it to improve precision, which also increased the complexity of the operation. We then asked: what are the actual gains? Was it worthwhile to go through this whole process? In this section, we put this to the test.

We first created a benchmark by taking a small sample of the data and ran the localisation process on a laptop. The sample comprised approximately 2% of the borders and 0.0014% of the roads. We ran the localisation process using two approaches.

  • With the first approach, we calculated the intersection between all the roads and borders. The entire operation took around 38 minutes.
  • For the second approach, we optimised the operation using geohashes. In this approach, the runtime was only 78 seconds (1.3 minutes).

However, it is important to note that this is not an apples-to-apples comparison. The operation we measured was the localisation of the roads, but we did not include the border-filling operation, where we fill the borders with geohashes. This is because that operation does not need to be run every time; it can be run once and reused multiple times.

Though not often required, it is still crucial to understand and consider the operation of precomputing areas and filling borders with geohashes. The precomputation process depends on several factors:

  • Number and shape of the borders – The more borders and the more complex the borders are, the longer the operation will take.
  • Geohash precision – How accurate do we need our localisation to be? The more accurate it needs to be, the longer it will take.
  • Hardware availability

Going back to our hypothesis: although this precomputation might be expensive, it is rarely run, as the borders don't change often, and it can be triggered only when needed. However, the regular computation, where we find the area to which each road belongs, runs often, as the roads change constantly. In our system, we run this localisation for every map processing run.

We can also further optimise this process by applying the opposite approach: geohashes that are fully covered by a border can be merged into larger geohashes, thus simplifying the computation inside the border. In the end, we have a solution that is fully optimised for our needs, with the best cost-to-performance ratio.

Figure 12 – Optimised geohashes

Conclusion

Although geohashes seem to be the right solution for this kind of problem, we also need to monitor their content. One consideration is the road density inside a geohash. For example, a geohash inside a city centre usually contains a lot of roads, while one in the countryside may contain far fewer. We need to consider this aspect to keep the computation balanced and take full advantage of the big data approach. In our case, we achieve this balance by considering the number of road kilometres within a geohash.

Figure 13 – Unbalanced data

Additionally, the resources we choose also matter. To optimise time and cost, we need to find the right balance between running time and resource cost. As shown in Figure 14, based on sample data we ran, we sometimes get the best result when using smaller machines.

Figure 14 – Cost vs runtime

The achievements and insights showcased in this article owe much to the contributions of Mihai Chintoanu. His expertise and collaborative efforts have profoundly enriched the content and findings presented herein.

Join us

Grab is the leading superapp platform in Southeast Asia, providing everyday services that matter to consumers. More than just a ride-hailing and food delivery app, Grab offers a wide range of on-demand services in the region, including mobility, food, package and grocery delivery services, mobile payments, and financial services across 428 cities in eight countries.

Powered by technology and driven by heart, our mission is to drive Southeast Asia forward by creating economic empowerment for everyone. If this mission speaks to you, join our team today!

The architecture of today’s LLM applications

Post Syndicated from Nicole Choi original https://github.blog/2023-10-30-the-architecture-of-todays-llm-applications/


We want to empower you to experiment with LLMs, build your own applications, and discover untapped problem spaces. That’s why we sat down with GitHub’s Alireza Goudarzi, a senior machine learning researcher, and Albert Ziegler, a principal machine learning engineer, to discuss the emerging architecture of today’s LLMs.

In this post, we’ll cover five major steps to building your own LLM app, the emerging architecture of today’s LLM apps, and problem areas that you can start exploring today.

Five steps to building an LLM app

Building software with LLMs, or any machine learning (ML) model, is fundamentally different from building software without them. For one, rather than compiling source code into binary to run a series of commands, developers need to navigate datasets, embeddings, and parameter weights to generate consistent and accurate outputs. After all, LLM outputs are probabilistic and don’t produce the same predictable outcomes.

Diagram that lists the five steps to building a large language model application.

Let’s break down, at a high level, the steps to build an LLM app today. 👇

1. Focus on a single problem, first. The key? Find a problem that’s the right size: one that’s focused enough so you can quickly iterate and make progress, but also big enough so that the right solution will wow users.

For instance, rather than trying to address all developer problems with AI, the GitHub Copilot team initially focused on one part of the software development lifecycle: coding functions in the IDE.

2. Choose the right LLM. You’re saving costs by building an LLM app with a pre-trained model, but how do you pick the right one? Here are some factors to consider:

  • Licensing. If you hope to eventually sell your LLM app, you’ll need to use a model that has an API licensed for commercial use. To get you started on your search, here’s a community-sourced list of open LLMs that are licensed for commercial use.
  • Model size. The size of LLMs can range from 7 to 175 billion parameters—and some, like Ada, are even as small as 350 million parameters. Most LLMs (at the time of writing this post) range in size from 7-13 billion parameters.

Conventional wisdom tells us that if a model has more parameters (variables that can be adjusted to improve a model’s output), the better the model is at learning new information and providing predictions. However, the improved performance of smaller models is challenging that belief. Smaller models are also usually faster and cheaper, so improvements to the quality of their predictions make them a viable contender compared to big-name models that might be out of scope for many apps.

  • Model performance. Before you customize your LLM using techniques like fine-tuning and in-context learning (which we’ll cover below), evaluate how well and fast—and how consistently—the model generates your desired output. To measure model performance, you can use offline evaluations.

3. Customize the LLM. When you train an LLM, you’re building the scaffolding and neural networks to enable deep learning. When you customize a pre-trained LLM, you’re adapting the LLM to specific tasks, such as generating text around a specific topic or in a particular style. The section below will focus on techniques for the latter. To customize a pre-trained LLM to your specific needs, you can try in-context learning, reinforcement learning from human feedback (RLHF), or fine-tuning.

  • In-context learning, sometimes referred to as prompt engineering by end users, is when you provide the model with specific instructions or examples at the time of inference (that is, the time you’re querying the model) and ask it to infer what you need and generate a contextually relevant output.

In-context learning can be done in a variety of ways, like providing examples, rephrasing your queries, and adding a sentence that states your goal at a high level.

  • RLHF adds a reward model on top of the pre-trained LLM. The reward model is trained to predict whether a user will accept or reject the output from the pre-trained LLM. The learnings from the reward model are passed to the pre-trained LLM, which adjusts its outputs based on user acceptance rates.

The benefit to RLHF is that it doesn’t require supervised learning and, consequently, expands the criteria for what’s an acceptable output. With enough human feedback, the LLM can learn that if there’s an 80% probability that a user will accept an output, then it’s fine to generate. Want to try it out? Check out these resources, including codebases, for RLHF.

  • Fine-tuning is when the model’s generated output is evaluated against an intended or known output. For example, you know that the sentiment behind a statement like this is negative: “The soup is too salty.” To evaluate the LLM, you’d feed this sentence to the model and query it to label the sentiment as positive or negative. If the model labels it as positive, then you’d adjust the model’s parameters and try prompting it again to see if it can classify the sentiment as negative.

Fine-tuning can result in a highly customized LLM that excels at a specific task, but it uses supervised learning, which requires time-intensive labeling. In other words, each input sample requires an output that’s labeled with exactly the correct answer. That way, the actual output can be measured against the labeled one and adjustments can be made to the model’s parameters. The advantage of RLHF, as mentioned above, is that you don’t need an exact label.

4. Set up the app’s architecture. The different components you’ll need to set up your LLM app can be roughly grouped into three categories:

  • User input tools, which require a UI, an LLM, and an app hosting platform.
  • Input enrichment and prompt construction tools. This includes your data source, embedding model, a vector database, prompt construction and optimization tools, and a data filter.
  • Efficient and responsible AI tooling, which includes an LLM cache, LLM content classifier or filter, and a telemetry service to evaluate the output of your LLM app.

5. Conduct online evaluations of your app. These evaluations are considered “online” because they assess the LLM’s performance during user interaction. For example, online evaluations for GitHub Copilot are measured through acceptance rate (how often a developer accepts a completion shown to them), as well as the retention rate (how often and to what extent a developer edits an accepted completion).


The emerging architecture of LLM apps

Let’s get started on architecture. We’re going to revisit our friend Dave, whose Wi-Fi went out on the day of his World Cup watch party. Fortunately, Dave was able to get his Wi-Fi running in time for the game, thanks to an LLM-powered assistant.

We’ll use this example and the diagram below to walk through a user flow with an LLM app, and break down the kinds of tools you’d need to build it. 👇

Flow chart showing the components of a large language model application and how they all work together.

User input tools

When Dave’s Wi-Fi crashes, he calls his internet service provider (ISP) and is directed to an LLM-powered assistant. The assistant asks Dave to explain his emergency, and Dave responds, “My TV was connected to my Wi-Fi, but I bumped the counter, and the Wi-Fi box fell off! Now, we can’t watch the game.”

In order for Dave to interact with the LLM, we need four tools:

  • LLM API and host: Is the LLM app running on a local machine or in the cloud? In an ISP’s case, it’s probably hosted in the cloud to handle the volume of calls like Dave’s. Vercel and early projects like jina-ai/rungpt aim to provide a cloud-native solution to deploy and scale LLM apps.

But if you want to build an LLM app to tinker, hosting the model on your machine might be more cost effective so that you’re not paying to spin up your cloud environment every time you want to experiment. You can find conversations on GitHub Discussions about hardware requirements for models like LLaMA, two of which can be found here and here.

  • The UI: Dave’s keypad is essentially the UI, but in order for Dave to use his keypad to switch from the menu of options to the emergency line, the UI needs to include a router tool.
  • Speech-to-text translation tool: Dave’s verbal query then needs to be fed through a speech-to-text translation tool that works in the background.

Input enrichment and prompt construction tools

Let’s go back to Dave. The LLM can analyze the sequence of words in Dave’s transcript, classify it as an IT complaint, and provide a contextually relevant response. (The LLM’s able to do this because it’s been trained on the internet’s entire corpus, which includes IT support documentation.)

Input enrichment tools aim to contextualize and package the user’s query in a way that will generate the most useful response from the LLM.

  • A vector database is where you can store embeddings, or index high-dimensional vectors. It also increases the probability that the LLM’s response is helpful by providing additional information to further contextualize your user’s query.

Let’s say the LLM assistant has access to the company’s complaints search engine, and those complaints and solutions are stored as embeddings in a vector database. Now, the LLM assistant uses information not only from the internet’s IT support documentation, but also from documentation specific to customer problems with the ISP.

  • But in order to retrieve information from the vector database that’s relevant to a user’s query, we need an embedding model to translate the query into an embedding. Because the embeddings in the vector database, as well as Dave’s query, are translated into high-dimensional vectors, the vectors will capture both the semantics and intention of the natural language, not just its syntax.

Here’s a list of open source text embedding models. OpenAI and Hugging Face also provide embedding models.
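As a rough sketch of this retrieval step (not the assistant's actual implementation), the snippet below embeds a few stored complaints and Dave's query with an open source embedding model and ranks them by cosine similarity. A numpy array stands in for a real vector database, and the model name is just one common open source choice.

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

complaints = [
    "Router power light is off after the unit was knocked over.",
    "Wi-Fi keeps dropping during video calls.",
    "Cannot connect a smart TV to the 5GHz network.",
]
# Normalized embeddings let us use a plain dot product as cosine similarity.
index = model.encode(complaints, normalize_embeddings=True)

query = "I bumped the counter, the Wi-Fi box fell off and broke."
query_vec = model.encode([query], normalize_embeddings=True)[0]

scores = index @ query_vec
for i in np.argsort(scores)[::-1][:2]:            # two closest complaints
    print(f"{scores[i]:.2f}  {complaints[i]}")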

Dave’s contextualized query would then read like this:

// Pay attention to the colors and blinking pattern described in the
// following relevant information.
// <context retrieved from the complaints search engine: common internet
// connectivity problems and their solutions>

// The following is an IT complaint from Dave Anderson, answered by an IT
// support expert. Answers to Dave's questions should serve as an example
// of the excellent support provided by the ISP to its customers.

*Dave: Oh it's awful! This is the big game day. My TV was connected to my
Wi-Fi, but I bumped the counter and the Wi-Fi box fell off and broke! Now we
can't watch the game.

Not only does this series of prompts contextualize Dave’s issue as an IT complaint, it also pulls in context from the company’s complaints search engine. That context includes common internet connectivity issues and solutions.

MongoDB released a public preview of Atlas Vector Search, which indexes high-dimensional vectors within MongoDB. Qdrant, Pinecone, and Milvus also provide free or open source vector databases.

  • A data filter will ensure that the LLM isn’t processing unauthorized data, like personally identifiable information. Preliminary projects like amoffat/HeimdaLLM are working to ensure LLMs access only authorized data.
  • A prompt optimization tool will then help to package the end user’s query with all this context. In other words, the tool will help to prioritize which context embeddings are most relevant, and in which order those embeddings should be organized in order for the LLM to produce the most contextually relevant response. This step is what ML researchers call prompt engineering, where a series of algorithms create a prompt. (A note that this is different from the prompt engineering that end users do, which is also known as in-context learning).

Prompt optimization tools like langchain-ai/langchain help you to compile prompts for your end users. Otherwise, you’ll need to DIY a series of algorithms that retrieve embeddings from the vector database, grab snippets of the relevant context, and order them. If you go this latter route, you could use GitHub Copilot Chat or ChatGPT to assist you.
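If you do go the DIY route, the core of it can be quite small. The sketch below is only an illustration with made-up snippets: it takes already-ranked context snippets, trims them to a character budget, and packages them with the end user's query.

def build_prompt(query: str, ranked_snippets: list[str], max_chars: int = 2000) -> str:
    """Package the most relevant context snippets with the user's query."""
    context_lines, used = [], 0
    for snippet in ranked_snippets:               # most relevant first
        if used + len(snippet) > max_chars:
            break                                 # stay within the prompt budget
        context_lines.append(f"// Relevant information: {snippet}")
        used += len(snippet)
    return "\n".join(context_lines + [
        "// The following is an IT complaint from a customer.",
        f"*Customer: {query}",
        "*Support agent:",
    ])

print(build_prompt(
    "My Wi-Fi box fell off the counter and broke!",
    ["Check the power light before anything else.",
     "A solid red light usually means the unit needs replacing."],
))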

Learn how the GitHub Copilot team uses the Jaccard similarity to decide which pieces of context are most relevant to a user’s query >

Efficient and responsible AI tooling

To ensure that Dave doesn’t become even more frustrated by waiting for the LLM assistant to generate a response, the LLM can quickly retrieve an output from a cache. And in the case that Dave does have an outburst, we can use a content classifier to make sure the LLM app doesn’t respond in kind. The telemetry service will also evaluate Dave’s interaction with the UI so that you, the developer, can improve the user experience based on Dave’s behavior.

  • An LLM cache stores outputs. This means instead of generating new responses to the same query (because Dave isn’t the first person whose internet has gone down), the LLM can retrieve outputs from the cache that have been used for similar queries. Caching outputs can reduce latency, computational costs, and variability in suggestions.

You can experiment with a tool like zilliztech/GPTcache to cache your app’s responses.
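To illustrate the idea, here is a toy cache that only serves exact repeats; tools like GPTCache go further and also match semantically similar queries via embeddings. The call_llm function is a hypothetical stand-in for your model call.

import hashlib

_cache: dict[str, str] = {}

def call_llm(prompt: str) -> str:
    return f"Generated answer for: {prompt}"      # placeholder for a real API call

def cached_completion(prompt: str) -> str:
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:                         # only pay for generation once
        _cache[key] = call_llm(prompt)
    return _cache[key]

print(cached_completion("My Wi-Fi box fell off and broke."))
print(cached_completion("My Wi-Fi box fell off and broke."))  # served from the cache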

  • A content classifier or filter can prevent your automated assistant from responding with harmful or offensive suggestions (in the case that your end users take their frustration out on your LLM app).

Tools like derwiki/llm-prompt-injection-filtering and laiyer-ai/llm-guard are in their early stages but working toward preventing this problem.

  • A telemetry service will allow you to evaluate how well your app is working with actual users. A service that responsibly and transparently monitors user activity (like how often they accept or change a suggestion) can share useful data to help improve your app and make it more useful.

OpenTelemetry, for example, is an open source framework that gives developers a standardized way to collect, process, and export telemetry data across development, testing, staging, and production environments.
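As a small sketch of what that could look like with the OpenTelemetry Python SDK, the snippet below records a span around a suggestion and attaches acceptance attributes; the attribute names are illustrative, not a standard schema.

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Export spans to the console for demonstration; a production setup would
# point an exporter at a telemetry backend instead.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("llm-assistant")

with tracer.start_as_current_span("llm_suggestion") as span:
    span.set_attribute("suggestion.accepted", True)   # e.g. user accepted the reply
    span.set_attribute("suggestion.edited", False)    # e.g. user changed nothing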

Learn how GitHub uses OpenTelemetry to measure Git performance >

Woohoo! 🥳 Your LLM assistant has effectively answered Dave’s many queries. His router is up and working, and he’s ready for his World Cup watch party. Mission accomplished!


Real-world impact of LLMs

Looking for inspiration or a problem space to start exploring? Here’s a list of ongoing projects where LLM apps and models are making real-world impact.

  • NASA and IBM recently open sourced the largest geospatial AI model to increase access to NASA earth science data. The hope is to accelerate discovery and understanding of climate effects.
  • Read how the Johns Hopkins Applied Physics Laboratory is designing a conversational AI agent that provides, in plain English, medical guidance to untrained soldiers in the field based on established care procedures.
  • Companies like Duolingo and Mercado Libre are using GitHub Copilot to help more people learn another language (for free) and democratize ecommerce in Latin America, respectively.


Demystifying LLMs: How they can do things they weren’t trained to do

Post Syndicated from Jeimy Ruiz original https://github.blog/2023-10-27-demystifying-llms-how-they-can-do-things-they-werent-trained-to-do/

Large language models (LLMs) are revolutionizing the way we interact with software by combining deep learning techniques with powerful computational resources.

While this technology is exciting, many are also concerned about how LLMs can generate false, outdated, or problematic information, and how they sometimes even hallucinate (generating information that doesn’t exist) so convincingly. Thankfully, we can immediately put one rumor to rest. According to Alireza Goudarzi, senior researcher of machine learning (ML) for GitHub Copilot: “LLMs are not trained to reason. They’re not trying to understand science, literature, code, or anything else. They’re simply trained to predict the next token in the text.”

Let’s dive into how LLMs come to do the unexpected, and why. This blog post will provide comprehensive insights into LLMs, including their training methods and ethical considerations. Our goal is to help you gain a better understanding of LLM capabilities and how they’ve learned to master language, seemingly, without reasoning.

What are large language models?

LLMs are AI systems that are trained on massive amounts of text data, allowing them to generate human-like responses and understand natural language in a way that traditional ML models can’t.

“These models use advanced techniques from the field of deep learning, which involves training deep neural networks with many layers to learn complex patterns and relationships,” explains John Berryman, a senior researcher of ML on the GitHub Copilot team.

What sets LLMs apart is their proficiency at generalizing and understanding context. They’re not limited to pre-defined rules or patterns, but instead learn from large amounts of data to develop their own understanding of language. This allows them to generate coherent and contextually appropriate responses to a wide range of prompts and queries.

And while LLMs can be incredibly powerful and flexible tools because of this, the ML methods used to train them, and the quality—or limitations—of their training data, can also lead to occasional lapses in generating accurate, useful, and trustworthy information.

Deep learning

The advent of modern ML practices, such as deep learning, has been a game-changer when it comes to unlocking the potential of LLMs. Unlike the earliest language models that relied on predefined rules and patterns, deep learning allows these models to create natural language outputs in a more human-like way.

“The entire discipline of deep learning and neural networks—which underlies all of this—is ‘how simple can we make the rule and get as close to the behavior of a human brain as possible?’” says Goudarzi.

By using neural networks with many layers, deep learning enables LLMs to analyze and learn complex patterns and relationships in language data. This means that these models can generate coherent and contextually appropriate responses, even in the face of complex sentence structures, idiomatic expressions, and subtle nuances in language.

While the initial pre-training equips LLMs with a broad language understanding, fine-tuning is where they become versatile and adaptable. “When developers want these models to perform specific tasks, they provide task descriptions and examples (few-shot learning) or task descriptions alone (zero-shot learning). The model then fine-tunes its pre-trained weights based on this information,” says Goudarzi. This process helps it adapt to the specific task while retaining the knowledge it gained from its extensive pre-training.
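To make the distinction concrete, here is a tiny illustration (not a GitHub Copilot prompt): the same sentiment task written as a zero-shot prompt, with only a task description, and as a few-shot prompt, with examples added.

# Zero-shot: task description only.
ZERO_SHOT = """Classify the sentiment of the review as positive or negative.
Review: "The soup is too salty."
Sentiment:"""

# Few-shot: task description plus worked examples.
FEW_SHOT = """Classify the sentiment of each review as positive or negative.
Review: "Delivery was fast and the food was great."  Sentiment: positive
Review: "The app crashed twice before I could order."  Sentiment: negative
Review: "The soup is too salty."  Sentiment:"""

print(ZERO_SHOT)
print(FEW_SHOT)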

But even with deep learning’s multiple layers and attention mechanisms enabling LLMs to generate human-like text, it can also lead to overgeneralization, where the model produces responses that may not be contextually accurate or up to date.

Why LLMs aren’t always right

There are several factors that shed light on why tools built on LLMs may be inaccurate at times, even while sounding quite convincing.

Limited knowledge and outdated information

LLMs often lack an understanding of the external world or real-time context. They rely solely on the text they’ve been trained on, and they don’t possess an inherent awareness of the world’s current state. “Typically this whole training process takes a long time, and it’s not uncommon for the training data to be two years out of date for any given LLM,” says Albert Ziegler, principal researcher and member of the GitHub Next research and development team.

This limitation means they may generate inaccurate information based on outdated assumptions, since they can’t verify facts or events in real time. If there have been developments or changes in a particular field or topic after they were trained, LLMs may not be aware of them and may provide outdated information. This is why it’s still important to fact-check any responses you receive from an LLM, regardless of how fact-based they may seem.

Lack of context

One of the primary reasons LLMs sometimes provide incorrect information is the lack of context. These models rely heavily on the information given in the input text, and if the input is ambiguous or lacks detail, the model may make assumptions that can lead to inaccurate responses.

Training data biases and limitations

LLMs are exposed to massive unlabelled data sets of text during pre-training that are diverse and representative of the language the model should understand. Common sources of data include books, articles, websites—even social media posts!

Because of this, they may inadvertently produce responses that reflect these biases or incorrect information present in their training data. This is especially concerning when it comes to sensitive or controversial topics.

“Their biases tend to be worse. And that holds true for machine learning in general, not just for LLMs. What machine learning does is identify patterns, and things like stereotypes can turn into extremely convenient shorthands. They might be patterns that really exist, or in the case of LLMs, patterns that are based on human prejudices that are talked about or implicitly used,” says Ziegler.

If a model is trained on a dataset that contains biased or discriminatory language, it may generate responses that are also biased or discriminatory. This can have real-world implications, such as reinforcing harmful stereotypes or discriminatory practices.

Overconfidence

LLMs don’t have the ability to assess the correctness of the information they generate. Because of the way they’re trained, they often provide responses with a high degree of confidence, prioritizing text that appears sensible and flows smoothly, even when the information is incorrect!

Hallucinations

LLMs can sometimes “hallucinate” information because of the way they generate text (via patterns and associations). When they’re faced with incomplete or ambiguous queries, they try to complete them by drawing on those patterns, sometimes generating information that isn’t accurate or factual. Ultimately, hallucinations are not supported by evidence or real-world data.

For example, imagine that you ask ChatGPT about a historical issue in the 20th century. Instead, it describes a meeting between two famous historical figures who never actually met!

In the context of GitHub Copilot, Ziegler explains that “the typical hallucinations we encounter are when GitHub Copilot starts talking about code that’s not even there. Our mitigation is to make it give enough context to every piece of code it talks about that we can check and verify that it actually exists.”

But the GitHub Copilot team is already thinking about how to use hallucinations to their advantage in a “top-down” approach to coding. Imagine that you’re tackling a backlog issue, and you’re looking for GitHub Copilot to give you suggestions. As Johan Rosenkilde, principal researcher for GitHub Next, explains, “ideally, you’d want it to come up with a sub-division of your complex problem delegated to nicely delineated helper functions, and come up with good names for those helpers. And after suggesting code that calls the (still non-existent) helpers, you’d want it to suggest the implementation of them too!”

This approach to hallucination would be like getting the blueprint and the building blocks to solve your coding challenges.

Ethical use and responsible advocacy of LLMs

It’s important to be aware of the ethical considerations that come with using LLMs. That being said, while LLMs have the potential to generate false information, they’re not intentionally fabricating or deceiving. Instead, inaccuracies arise from the model’s attempts to generate coherent and contextually relevant text based on the patterns and information it has learned from its training data.

The GitHub Copilot team has developed a few tools to help detect harmful content. Goudarzi says “First, we have a duplicate detection filter, which helps us detect matches between generated code and all open source code that we have access to, filtering such suggestions out. Another tool we use is called Responsible AI (RAI), and it’s a classifier that can filter out abusive words. Finally, we also separately filter out known unsafe patterns.”

Understanding the deep learning processes behind LLMs can help users grasp their limitations—as well as their positive impact. To navigate these effectively, it’s crucial to verify information from reliable sources, provide clear and specific input, and exercise critical thinking when interpreting LLM-generated responses.

As Berryman reminds us, “the engines themselves are amoral. Users can do whatever they want with them and that can run the gamut of moral to immoral, for sure. But by being conscious of these issues and actively working towards ethical practices, we can ensure that LLMs are used in a responsible and beneficial manner.”

Developers, researchers, and scientists continuously work to improve the accuracy and reliability of these models, making them increasingly valuable tools for the future. All of us can advocate for the responsible and ethical use of LLMs. That includes promoting transparency and accountability in the development and deployment of these models, as well as taking steps to mitigate biases and stereotypes in our own corners of the internet.


Building hyperlocal GrabMaps

Post Syndicated from Grab Tech original https://engineering.grab.com/building-hyperlocal-grabmaps

Introduction

Southeast Asia (SEA) is a dynamic market, very different from other parts of the world. When travelling on the road, you may experience fast-changing road restrictions, new roads appearing overnight, and high traffic congestion. To address these challenges, GrabMaps has adapted to the SEA market by leveraging big data solutions. One of the solutions is the integration of hyperlocal data in GrabMaps.

Hyperlocal information is oriented around very small geographical communities and obtained from the local knowledge that our map team gathers. The map team is spread across SEA, enabling us to define clear specifications (e.g. legal speed limits), and validate that our solutions are viable.

Figure 1 – Map showing detections from images and probe data, and hyperlocal data.

Hyperlocal inputs make our mapping data even more robust, adding to the details collected from our image and probe detection pipelines. Figure 1 shows how data from our detection pipeline is overlaid with hyperlocal data, and then mapped across the SEA region. If you are curious and would like to check out the data yourself, you can download it here.

Processing hyperlocal data

Now let’s go through the process of detecting hyperlocal data.

Download data

GrabMaps is based on OpenStreetMap (OSM). The first step in the process is to download the .pbf file for Asia from geofabrik.de. This .pbf file contains all the data that is available on OSM, such as details of places, trees, and roads. Take a park, for example: the .pbf file would contain data on the park’s name, wheelchair accessibility, and much more.

For this article, we will focus on hyperlocal data related to the road network. For each road, you can obtain data such as the type of road (residential or motorway), direction of traffic (one-way or more), and road name.

Convert data

To take advantage of big data computing, the next step in the process is to convert the .pbf file into Parquet format using a Parquetizer. This will convert the binary data in the .pbf file into a table format. Each road in SEA is now displayed as a row in a table as shown in Figure 2.

Figure 2 – Road data in Parquet format.
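The conversion itself is done with a Parquetizer tool. Purely as an illustration of the shape of this step, a minimal Python alternative using pyosmium and pandas might look like the sketch below; the file names are placeholders and only a few road attributes are extracted.

import osmium
import pandas as pd

class RoadHandler(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.rows = []

    def way(self, w):
        highway = w.tags.get("highway")
        if highway:                               # keep only road features
            self.rows.append({
                "osm_id": w.id,
                "road_type": highway,             # e.g. residential, motorway
                "name": w.tags.get("name"),
                "oneway": w.tags.get("oneway"),
            })

handler = RoadHandler()
handler.apply_file("asia-latest.osm.pbf")         # the downloaded extract
pd.DataFrame(handler.rows).to_parquet("roads.parquet")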

Identify hyperlocal data

After the data is prepared, GrabMaps then identifies and inputs all of our hyperlocal data, and delivers a consolidated view to our downstream services. Our hyperlocal data is obtained from various sources, either by looking at geometry, or other attributes in OSM such as the direction of travel and speed limit. We also apply customised rules defined by our local map team, all in a fully automated manner. This enhances the map together with data obtained from our rides and deliveries GPS pings and from KartaView, Grab’s product for imagery collection.

Figure 3 – Architecture diagram showing how hyperlocal data is integrated into GrabMaps.

Benefit of our hyperlocal GrabMaps

GrabNav, a turn-by-turn navigation tool available on the Grab driver app, is one of our products that benefits from having hyperlocal data. Here are some hyperlocal data that are made available through our approach:

  • Localisation of roads: The country, state/county, or city the road is in
  • Language spoken, driving side, and speed limit
  • Region-specific default speed regulations
  • Consistent name usage using language inference
  • Complex attributes like intersection links

To further explain the benefits of this hyperlocal feature, we will use intersection links as an example. In the next section, we will explain how intersection links data is used and how it impacts our driver-partners and passengers.

An intersection link is where two or more roads meet. Figures 4 and 5 illustrate what an intersection link looks like in a GrabMaps mock and in OSM.

Figure 4 – Mock of an intersection link.
Figure 5 – Intersection link illustration from a real road network in OSM.

Locating intersection links in a road network involves some computation. We combine big data processing (which we do using Spark) with graphs: we use the geohash as the unit of processing and, for each geohash, create a bi-directional graph.

From such resulting graphs, we can determine intersection links if:

  • Road segments are parallel
  • The roads have the same name
  • The roads are one-way roads
  • The angles and shapes of the roads fall within the intervals we require

Each intersection link we identify is tagged in the map as intersection_links. Our downstream service teams can then identify them by searching for the tag.
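As a highly simplified, illustrative sketch of these checks (the real pipeline runs its per-geohash graph logic in Spark, and the bearing tolerance here is an assumption), the snippet below flags pairs of same-named, one-way, roughly parallel segments within one geohash as candidates.

from dataclasses import dataclass
from itertools import combinations

@dataclass
class RoadSegment:
    road_id: str
    name: str
    oneway: bool
    bearing_deg: float        # direction of travel, 0-360

def are_roughly_parallel(a: RoadSegment, b: RoadSegment, tolerance_deg: float = 15.0) -> bool:
    diff = abs(a.bearing_deg - b.bearing_deg) % 180.0
    return min(diff, 180.0 - diff) <= tolerance_deg

def intersection_link_candidates(segments_in_geohash):
    """Pairs of segments in one geohash that satisfy the listed criteria."""
    candidates = []
    for a, b in combinations(segments_in_geohash, 2):
        if (a.name == b.name and a.oneway and b.oneway
                and are_roughly_parallel(a, b)):
            candidates.append((a.road_id, b.road_id))
    return candidates

segments = [
    RoadSegment("r1", "Jalan Sudirman", True, 92.0),
    RoadSegment("r2", "Jalan Sudirman", True, 271.0),
]
print(intersection_link_candidates(segments))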

Impact

The impact we create with our intersection link can be explained through the following example.

Figure 6 – Longer route, without GrabMaps intersection link feature. The arrow indicates where the route should have suggested a U-turn.
Figure 7 – Shorter route using GrabMaps by taking a closer link between two main roads.

Figure 6 and Figure 7 show two different routes for the same origin and destination. However, you can see that Figure 7 has a shorter route, made possible by taking an intersection link early on in the route. The highlighted road segment in Figure 7 is an intersection link, tagged by the process we described earlier. The route is now much shorter, making GrabNav more efficient in its route suggestions.

There are numerous factors that can impact a driver-partner’s trip, and intersection links are just one example. There are many more features that GrabMaps offers across Grab’s services that allow us to “outserve” our partners.

Conclusion

GrabMaps and GrabNav deliver enriched experiences to our driver-partners. By integrating certain hyperlocal data features, we are also able to provide more accurate pricing for both our driver-partners and passengers. In our mission towards sustainable growth, this is an area that we will keep on improving by leveraging scalable tech solutions.

Join us

Grab is the leading superapp platform in Southeast Asia, providing everyday services that matter to consumers. More than just a ride-hailing and food delivery app, Grab offers a wide range of on-demand services in the region, including mobility, food, package and grocery delivery services, mobile payments, and financial services across 428 cities in eight countries.

Powered by technology and driven by heart, our mission is to drive Southeast Asia forward by creating economic empowerment for everyone. If this mission speaks to you, join our team today!

10 things you didn’t know you could do with GitHub Projects

Post Syndicated from Kedasha Kerr original https://github.blog/2023-08-28-10-things-you-didnt-know-you-could-do-with-github-projects/

GitHub Projects has been adopted by program managers, OSS maintainers, enterprises, and individual developers alike for its user-friendly design and efficiency. We all know that managing issues and pull requests in our repositories can be challenging.

To help you optimize your usage of GitHub Projects to plan and track your work from start to finish, I’ll be sharing 10 things you can do with GitHub Projects to make it easier to keep track of your issues and pull requests.

1. Manage your projects with the CLI

If you prefer to work from your terminals, we’ve made it more convenient for you to manage and automate your project workflows with the GitHub CLI project command. This essentially allows you to work more collaboratively with your team to keep your projects updated with your existing toolkit.

For example, if I wanted to add a draft issue to my project “Learning Ruby,” I would first ensure that I have the CLI installed and that I’m authenticated with the project scope. Once authenticated, I need to find the number of the project I want to manage with the CLI. You can find the project number by looking at the project URL. For example, in https://github.com/orgs/That-Lady-Dev/projects/4, the project number is “4.” Now that we have the project number, we can use it to add a draft issue to the project! The command will look like this:

gh project item-create 4 --owner That-Lady-Dev --title "Test Adding Draft" --body "I added this draft issue with GitHub CLI"

When we run this, a new draft issue is added to the project:

updating the project from the terminal; seeing the new item added to the board live

You can do a lot more with the GitHub CLI and GitHub projects. Check out our documentation to see all the possibilities of interacting with your projects from the terminal.

2. Export your projects to TSV

If you ever need your project data, you can export your project view to a file, which can then be imported into Figjam, Google Sheets, Excel, or any other platform that supports TSV files.

Go to any view of your project and click the arrow next to the view name, then select Export view data. This will give you a TSV file that you can use.

export project view data as a TSV file

Though TSV offers much better formatting than a CSV file, you can ask GitHub Copilot Chat how to convert a TSV file to a CSV file, copy the code, run it, and get your new CSV document, if CSV is your jam.

GitHub Copilot Chat converts TSV to CSV with Python code

Here’s a quick gist of how I converted a TSV to a CSV with GitHub Copilot Chat!

3. Create reusable project templates

If you often find yourself recreating projects with similar content and structure, you can set a project as a template so you and others can use it as a base when creating new projects.

To set your project as a template, navigate to the project “Settings” page, and under the “Templates” section toggle on Make template.

toggle templates on from the settings page, showing a green button in the UI

This will turn the project into a template that can be used with the green Use this template button at the top of your project, or when creating a new project. Building a library of templates that can be reused across your organization can help you and your teams share best practices and inspiration when getting started with a project!

4. Make a copy of a project

In addition to making your project a template that can be reused, you can also make a one-time copy of an existing project that will contain the fields, views, any configured workflows, insights, and draft items from the original project!

To copy a project, navigate to the project you want to copy, click the three dots to open the menu, and select Make a copy. This will open up a dialog where you can set the Owner, name the project, and click whether you want draft issues copied over or not. Once that’s all set, your new project is ready to be used!

making a copy of the project and updating data

You can also do this with the CLI. The command will look like this:

gh project copy 1 --source-owner That-Lady-Dev --target-owner Demos-and-Donuts --title "copied project"

5. Automate your project with workflows

If you want an issue to be automatically added to a project or if you want to set the status of an issue to “completed” when it is closed, you can do this automatically with built-in project workflows!

Go to the menu and click “Workflows.” This will show you a list of default workflows you can enable on your projects. To automatically add an issue to your project from a repository, you can enable the “Auto-add to project” workflow. To automatically set the status of a closed issue to “complete,” you can enable the “item closed” workflow.

turning on built-in project workflows from the settings page

Explore more built-in workflows by reading our documentation where you can also learn how to automate your projects with GitHub Actions.

6. Add colors and description to custom fields

Custom fields help you organize and categorize items in your projects, with flexible field types including text, number, date, single select, and iteration. If you want to add a splash of color to your project or more details about a specific field, you can add colors and descriptions to your single select fields!

To add a color and a description to a new single select field, navigate to the project settings, and add a new field. From there, you can add options to the field where you can select colors and add a description so everyone on your team knows what those options in the field mean and how they can be used.

updating project settings with new fields and descriptions from the settings page

You can also update field descriptions and colors directly from the project view by selecting Edit details from the group or column menus.

updating colors and description fields from the main project view

7. Add Issues from any organization

If you’re an open source maintainer, or a developer with multiple clients, you may be working across multiple organizations at a time. This means you have multiple issues to keep track of and need a way to combine these issues in one cohesive manner.

This is where GitHub Projects come in! You can collate issues from any organization onto a single project.

For example, I’m a part of the That-Lady-Dev and the Demos-and-Donuts organizations. I have the issues I want to track on my project board from That-Lady-Dev, but I also want to add the issues I have from the other organization to the same board. I can do this in one of two ways—I can either copy the issue link from the Demos-and-Donuts organization and paste it into the project, or I can search for the Demos-and-Donuts organization and repository from the project using # and select the issues I want to add.

This is a lot to take in—take a look at the gif below.

pasting an issue url from another org onto the project and searching for an issue from another org to add to the project

You can also add an issue or pull request to a project with the CLI. The command will look like this:

gh project item-add 4 --owner That-Lady-Dev --url https://github.com/Demos-and-Donuts/video-to-gif-converter/issues/1

8. Edit multiple items at once

Rather than spending time manually updating individual items, you can edit multiple items in one go with our bulk editing feature on GitHub Projects.

Let’s say you wanted to assign multiple issues to yourself. On the table layout, assign one issue and, with the cell highlighted, copy the contents of the cell. Then select all the remaining items you want assigned and paste the copied contents. You’ve just assigned yourself to multiple issues at once, and this can be undone at the click of a button or with keyboard commands as well.

This is demonstrated in the gif below.

bulk editing fields by assigning LadyKerr to thirteen fields at the same time

You can also drag and drop multiple items on a project board to different columns.

dragging and dropping four board items to another column at the same time

9. Reorder fields

With a growing list of fields in your project, you’ll want to make sure your fields are organized and that you see the most important ones up top. To change the order in which they appear on the side panel and on the issues page, you can rearrange the fields from the project settings by dragging and dropping them in the “Custom fields” list.

putting status field at the top on settings page and showing on the project view that it is now the first field on the issue

10. See what you want to see with slice by

If you find yourself with multiple views and filters to see how items are spread among various teams, labels, or assignees, you can configure a slice field to break down and quickly toggle through your items. You can choose a Slice by field that will pull the field values into a panel on the left of your view, and clicking each value will adjust the items in the project view on the right. See the gif below for how this works.

slicing the project by content type, labels and assignees to demonstrate slice by feature

Try out slicing by different fields to unlock a new way to organize your items!

Bonus tip: Deep linking

Let’s say you want to send a specific issue from your project to a teammate. You can use the Copy link to project button to send them a direct link to that particular issue in the project without having them sift through to find the issue you mentioned. See what I mean in this gif.

using the copy project link to deep link items

Wrap-up

And there you have it—10 things you didn’t know you could do with GitHub Projects. The team is continuing to work on more amazing features to make tracking your issues with pull requests as seamless and painless as possible. GitHub Projects is a powerful, flexible, and efficient way to keep track of your items while staying on top of your work.

Do let me know if you have any questions about GitHub Projects; I’m happy to jump in and assist.


Introducing code referencing for GitHub Copilot

Post Syndicated from Ryan J. Salva original https://github.blog/2023-08-03-introducing-code-referencing-for-github-copilot/

Make more informed decisions about the code you use. In the rare case where a GitHub Copilot suggestion matches public code, this update will show a list of repositories where that code appears and their licenses. Sign up for the private beta today.

Over the course of the last year, GitHub Copilot, the world’s first at-scale AI pair programmer trained on billions of lines of public code, has attracted more than 1 million developers and helped over 27,000 organizations build faster and more productively. During that time, many developers told us they want to see when GitHub Copilot’s suggestions match public code.

Today, we’re announcing a private beta of GitHub Copilot with code referencing that includes an updated filter which detects and shows context of code suggestions matching public code on GitHub. When the filter is enabled, GitHub Copilot checks code suggestions with surrounding code of about 150 characters and compares it against an index of all the public code on GitHub.com. Matches—along with information about every repository in which they appear—are displayed right in the editor. Developers can now choose whether to block suggestions containing matching code, or allow those suggestions with information about matches.

Why? Some want to learn from others’ work, others may want to take a dependency rather than introduce new app logic, and still others want to give or receive credit for similar work. Whatever the reason, it’s nice to know when similar code is out there.

Let’s see how it works.

How GitHub Copilot code referencing works

With billions of files to index and a latency budget of only 10-20ms, it’s a miracle of engineering that this is even possible. Still, if there’s a match, a notification appears in the editor showing: (1) the matching code, (2) the repositories where that code appears, and (3) the license governing each repository.

Why code referencing matters

In our journey to create a code referencing tool, we discovered a few interesting things:

First, our previous research suggests that matches occur in less than one percent of GitHub Copilot suggestions. But that one percent isn’t evenly distributed across all use cases. In the context of an existing application with surrounding code, we almost never see a match. But in an empty or nearly empty file, we see matches far more often.

Suggestions are heavily biased toward the prompt so GitHub Copilot can provide suggestions tailor-made for your current task. That means, in an existing app with lots of context, you’ll get a suggestion customized for your code. But in an empty, or nearly empty file, there’s little to no context. So, you’re more likely to get a suggestion that matches public code.

We’ve also found that when suggestions match public code, those matches frequently appear in dozens, if not hundreds of repositories. In some ways, this isn’t surprising because the models that power GitHub Copilot are akin to giant probability machines. A code fragment that appears in many repositories is more likely to be a “pattern” detected by the model—similar to the patterns we see elsewhere in public code.

For example, research on Java projects finds that up to 11% of repositories may contain code that resembles solutions posted to Stack Overflow, and the vast majority of those snippets appear without attribution. Another study on Python found that many matches are too generic to trace to an original usage. A smaller-scale study found that Stack Overflow answers contain code from Android applications.

Finally, the repositories with matching code are often governed by multiple, sometimes conflicting licenses, which makes attributing a match to its source more challenging. By consulting a list of references, developers can now:

  • Determine whether, what, and who to attribute rather than having matches simply blocked from the outset.
  • Learn from other developers by studying their approaches to similar problems.
  • Take a dependency on an open source library to avoid new business logic.
  • Evaluate the context of code before accepting a matching suggestion.
  • Discover new projects and contribute upstream.

Building a better developer experience and community

This is just the beginning. As GitHub continues to innovate, we will always strive to help developers stay in the flow, build creatively, and maintain a transparent connection to the community. We’re excited for you to try GitHub Copilot with code referencing.

Happy coding.

A developer’s guide to prompt engineering and LLMs

Post Syndicated from Albert Ziegler original https://github.blog/2023-07-17-prompt-engineering-guide-generative-ai-llms/


In a blog post authored back in 2011, Marc Andreessen warned that, “Software is eating the world.” Over a decade later, we are witnessing the emergence of a new type of technology that’s consuming the world with even greater voracity: generative artificial intelligence (AI). This innovative AI includes a unique class of large language models (LLM), derived from a decade of groundbreaking research, that are capable of out-performing humans at certain tasks. And you don’t have to have a PhD in machine learning to build with LLMs—developers are already building software with LLMs with basic HTTP requests and natural language prompts.

In this article, we’ll tell the story of GitHub’s work with LLMs to help other developers learn how to best make use of this technology. This post consists of two main sections: the first will describe at a high level how LLMs function and how to build LLM-based applications. The second will dig into an important example of an LLM-based application: GitHub Copilot code completions.

Others have done an impressive job of cataloging our work from the outside. Now, we’re excited to share some of the thought processes that have led to the ongoing success of GitHub Copilot.

Let’s jump in.

Everything you need to know about prompt engineering in 1600 tokens or less

You know when you’re tapping out a text message on your phone, and in the middle of the screen just above the keypad, there’s a button you can click to accept a suggested next word? That’s pretty much what an LLM is doing—but at scale.

A GIF showing autocomplete functionality in iOS.
An example of iMessage’s text prediction feature.

Instead of text on your phone, an LLM works to predict the next best group of letters, which are called “tokens.” And in the same way that you can keep tapping that middle button to complete your text message, the LLM completes a document by predicting the next word. It will continue to do that over and over, and it will only stop once it has reached a maximum threshold of tokens or once it has encountered a special token that signals “Stop! This is the end of the document.”
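In pseudocode, that loop looks something like the toy sketch below, where predict_next_token is a stand-in for the model and the vocabulary is made up.

import random

STOP_TOKEN = "<|end|>"
VOCAB = ["the", "game", "is", "on", "tonight", STOP_TOKEN]

def predict_next_token(tokens: list[str]) -> str:
    """Placeholder for an LLM: return one 'most likely' next token."""
    return random.choice(VOCAB)

def complete(prompt_tokens: list[str], max_tokens: int = 20) -> list[str]:
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):          # maximum threshold of tokens
        nxt = predict_next_token(tokens)
        if nxt == STOP_TOKEN:            # "Stop! This is the end of the document."
            break
        tokens.append(nxt)
    return tokens

print(" ".join(complete(["Dave", "said:"])))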

There’s an important difference, though. The language model in your phone is pretty simple—it’s basically saying, “Based only upon the last two words entered, what is the most likely next word?” In contrast, an LLM produces an output that’s more akin to being “based upon the full content of every document ever known to exist in the public domain, what is the most likely next token in your document?” By training such a large, well-architected model on an enormous dataset, an LLM can almost appear to have common sense such as understanding that a glass ball sitting on a table might roll off and shatter.

A screenshot of ChatGPT answering a question about the danger of setting a round glass ball on a small table.
Example of an LLM’s awareness or “common sense” due to its training.

But be warned: LLMs will also sometimes confidently produce information that isn’t real or true, which are typically called “hallucinations” or “fabulations.” LLMs can also appear to learn how to do things they weren’t initially trained to do. Historically, natural language models have been created for one-off tasks, like classifying the sentiment of a tweet, extracting the business entities from an email, or identifying similar documents, but now you can ask AI tools like ChatGPT to perform a task that it was never trained to do.

A screenshot of ChatGPT answering a prompt to create a chicken-based limerick.
John conversing with ChatGPT about serious things.

Building applications using LLMs

A document completion engine is a far cry from the amazing proliferation of LLM applications that are springing up every day, running the gamut from conversational search, writing assistants, and automated IT support to code completion tools like GitHub Copilot. But how is it possible that all of these tools can come from what is effectively a document completion tool? The secret is that any application using an LLM is actually mapping between two domains: the user domain and the document domain.

A graphic showing how LLMs work and the processes behind them to determine context before giving an answer.
Diagram of the user flow when communicating with an LLM, in this case, Dave’s user flow.

On the left is the user. His name is Dave, and he has a problem. It’s the day of his big World Cup watch party, and the Wi-Fi is out. If they don’t get it fixed soon, he’ll be the butt of his friends’ jokes for years. Dave calls his internet provider and gets an automated assistant. Ugh! But imagine that we are implementing the automated assistant as an LLM application. Can we help him?

The key here is to figure out how to convert from user domain into document domain. For one thing, we will need to transcribe the user’s speech into text. As soon as the automated support agent says “Please state the nature of your cable-related emergency,” Dave blurts out:

Oh it’s awful! It’s the World Cup finals. My TV was connected to my Wi-Fi, but I bumped the counter and the Wi-Fi box fell off and broke! Now, we can’t watch the game.

At this point, we have text, but it’s not of much use. Maybe you would imagine that this was part of a story and continue it, “I guess, I’ll call up my brother and see if we can watch the game with him.” An LLM with no context will similarly create the continuation of Dave’s story. So, let’s give the LLM some context and establish what type of document this is:

### ISP IT Support Transcript:

The following is a recorded conversation between an ISP customer, Dave Anderson, and Julia Jones, IT support expert. This transcript serves as an example of the excellent support provided by Comcrash to its customers.

*Dave: Oh it's awful! This is the big game day. My TV was connected to my Wi-Fi, but I bumped the counter and the Wi-Fi box fell off and broke! Now we can't watch the game.
*Julia:

Now, if you found this pseudo document on the ground, how would you complete it? Based on the extra context, you would see that Julia is an IT support expert, and apparently a really good one. You would expect the next words to be sage advice to help Dave with his problem. It doesn’t matter that Julia doesn’t exist, and this wasn’t a recorded conversation—what matters is that these extra words offer more context for what a completion might look like. An LLM does the same exact thing. After reading this partial document, it will do its best to complete Julia’s dialogue in a helpful manner.

But there’s more we can do to make the best document for the LLM. The LLM doesn’t know a whole lot about cable TV troubleshooting. (Well, it has read every manual and IT document ever published online, but stay with me here). Let’s assume that its knowledge is lacking in this particular domain. One thing we can do is search for extra content that might help Dave and place it into the document. Let’s assume that we have a complaints search engine that allows us to find documentation that has been helpful in similar situations in the past. Now, all we have to do is weave this information into our pseudo document in a natural place.

Continuing from above:

*Julia: (rifles around in her briefcase and pulls out the perfect documentation for Dave's request)
Common internet connectivity problems ...
<...here we insert 1 page of text that comes from search results against our customer support history database...>
(After reading the document, Julia makes the following recommendation)
*Julia:

Now, given this full body of text, the LLM is conditioned to make use of the implanted documentation, and in the context of “a helpful IT expert,” the model will generate a response. This reply takes into account the documentation as well as Dave’s specific request.

The last step is to move from the document domain into the user’s problem domain. For this example, that means just converting text to voice. And since this is effectively a chat application, we would go back and forth several times between the user and the document domain, making the transcript longer each time.
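
Pulling the example together, here is a rough sketch of how an application might assemble such a pseudo document before calling the model. The searchComplaints, completeDocument, and textToSpeech helpers are illustrative assumptions, not a real support system:

// Hypothetical helpers: a real system would call a search index, an LLM,
// and a text-to-speech service here.
function searchComplaints(_query: string): string {
  return "Common internet connectivity problems ... (about a page of support docs)";
}
function completeDocument(_prompt: string): string {
  return "Let's get that Wi-Fi box plugged back in and restarted."; // stand-in for the LLM call
}
function textToSpeech(reply: string): void {
  console.log(`(spoken) ${reply}`);
}

// Map from the user domain into the document domain, complete, then map back.
function handleSupportCall(userUtterance: string): void {
  const relevantDocs = searchComplaints(userUtterance);

  const prompt = [
    "### ISP IT Support Transcript:",
    "",
    "The following is a recorded conversation between an ISP customer, Dave Anderson,",
    "and Julia Jones, IT support expert.",
    "",
    `*Dave: ${userUtterance}`,
    "*Julia: (rifles around in her briefcase and pulls out the perfect documentation)",
    relevantDocs,
    "(After reading the document, Julia makes the following recommendation)",
    "*Julia:",
  ].join("\n");

  const reply = completeDocument(prompt); // the LLM completes Julia's dialogue
  textToSpeech(reply);                    // back to the user domain: text to voice
}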

This, at the core of the example, is prompt engineering. In the example, we crafted a prompt with enough context for the AI to produce the best possible output, which in this case was providing Dave with helpful information to get his Wi-Fi up and running again. In the next section, we’ll take a look at how we at GitHub have refined our prompt engineering techniques for GitHub Copilot.

The art and science of prompt engineering

Converting between the user domain and document domain is the realm of prompt engineering—and since we’ve been working on GitHub Copilot for over two years, we’ve started to identify some patterns in the process.

These patterns have helped us formalize a pipeline, and we think it is an applicable template to help others better approach prompt engineering for their own applications. Now, we’ll demonstrate how this pipeline works by examining it in the context of GitHub Copilot, our AI pair programmer.

The prompt engineering pipeline for GitHub Copilot

From the very beginning, GitHub Copilot’s LLMs have been built on AI models from OpenAI that have continued to get better and better. But what hasn’t changed is the answer to the central question of prompt engineering: what kind of document is the model trying to complete?

The OpenAI models we use have been trained to complete code files on GitHub. Ignoring some filtering and stratification steps that don’t really change the prompt engineering game, this distribution is pretty much that of individual file contents according to the most recent commit to main at data collection time.

The document completion problem the LLM solves is about code, and GitHub Copilot’s task is all about completing code. But the two are very different.

Here are some examples:

  • Most files committed to main are finished. For one, they usually compile. Most of the time the user is typing, the code does not compile because of incompletions that will be fixed before a commit is pushed.
  • The user might even write their code in hierarchical order, method signatures first, then bodies rather than line by line or in a mixed style.
  • Writing code means jumping around. In particular, people’s edits often require them to jump up in the document and make a change there, for example, adding a parameter to a function. Strictly speaking, if Codex suggests using a function that has not been imported yet, no matter how much sense it might make, that’s a mistake. But as a GitHub Copilot suggestion, it would be useful.

The issue is that merely predicting the most likely continuation based on the text in front of the cursor to make a GitHub Copilot suggestion would be a wasted opportunity. That’s because it ignores an incredible wealth of context. We can use that context to guide the suggestion, like metadata, the code below the cursor, the content of imports, the rest of the repository, or issues, and create a strong prompt for the AI assistant.

Software development is a deeply interconnected, multimodal challenge, and the more of that complexity we can tame and present to the model, the better your completions are going to be.

Step 1: Gathering context

GitHub Copilot lives in the context of an IDE such as Visual Studio Code (VS Code), and it can use whatever it can get the IDE to tell it—only if the IDE is quick about it though. In an interactive environment like GitHub Copilot, every millisecond matters. GitHub Copilot promises to take care of the common coding tasks, and if it wants to do that, it needs to display its solution to the developer before they have started to write more code in their IDE. Our rough heuristics say that for every additional 10 milliseconds we take to come up with a suggestion, the chance it’ll arrive in time decreases by one percent.

So, what can we say quickly? Well, here’s an example. Consider this suggestion to a simple piece of Python:

A developer prompting GitHub Copilot to write a simple function in Python to compute Fibonacci numbers.

Wrong! Turns out the user actually wanted to write Ruby, like this:

A developer using GitHub Copilot to write a simple function to compute Fibonacci numbers in Ruby.

The two languages have similar enough syntax so that only a couple of lines can be ambiguous, especially when it’s toward the beginning of the file where much of what we encounter are boilerplate comments. But modern IDEs such as VS Code typically know what language the user is writing in. That makes language mix ups especially annoying to the user because they break the implicit expectation that “the computer should know” (after all, most IDEs highlight language syntax).

So, let’s put the language metadata into our pile of context we might want to include. In fact, let’s add the whole filename too. If it’s available, it usually implies the language through its extension, and additionally sets the tone for what to expect in that file—small, easy pieces of information that won’t turn the tide but are helpful to include.

On the other end of the spectrum, there’s the rest of the repository. Say you’ve got a file that defines an abstract class DataReader. And you have another that defines a subclass CsvReader. And you’re now writing a new file defining another subclass SqlReader. Chances are that to write the new file, you’ll want to check out both existing files as well because they communicate useful background into what you need to implement and how to do it. Typically, developers keep such files open in different tabs and switch to remind themselves of definitions, examples, similar patterns, or tests.

If the content of those two files is useful to you, chances are it would be useful to the AI as well. So, let’s add it as context! After all, the IDE knows what other files from the repository are open as tabs in the same window. The repository might have hundreds or even thousands of files, but only some will be open, and that is a strong hint that they might be useful to what they’re doing right now. Of course, “some” can mean a lot of things, so we don’t consider any more than the 20 most recent tabs.

Step 2: Snippeting

Irrelevant information in an LLM’s context decreases its accuracy. Additionally, source code tends to be long, so even a single file is not guaranteed to fit completely into an LLM’s context window (a problem that occurs roughly a fifth of the time). So, unless the user is very frugal about their tab usage, we simply cannot include all the tabs.

It’s important to be selective about what code to include from other files, so we cut files into (hopefully) natural, overlapping snippets that are no longer than 60 lines. Of course, we don’t want to actually include all overlapping snippets—that’s why we score them and take only the best. In this case, the “score” is meant to reflect relevance. To determine a snippet’s score, we use the Jaccard similarity, a stat that can be used to gauge the similarity or diversity of sample sets. (It’s also super fast to compute, which is great for reducing latency.)
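
As a rough illustration of that scoring step (not GitHub Copilot's actual implementation), Jaccard similarity over sets of identifier-like tokens can be computed and used to rank snippets like this:

// Tokenize a chunk of code into a set of identifier-like tokens.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/[^a-z0-9_]+/).filter(Boolean));
}

// Jaccard similarity: size of the intersection divided by size of the union.
function jaccard(a: Set<string>, b: Set<string>): number {
  let intersection = 0;
  for (const token of a) if (b.has(token)) intersection++;
  const union = a.size + b.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

// Score candidate snippets against the code around the cursor and keep the best k.
function topSnippets(aroundCursor: string, snippets: string[], k: number): string[] {
  const reference = tokenize(aroundCursor);
  return snippets
    .map((snippet) => ({ snippet, score: jaccard(reference, tokenize(snippet)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.snippet);
}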

Step 3: Dressing them up

Now we have some context we’d like to pass on to the model. But how? Codex and other models don’t offer an API where you can add other files, or where you can specify the document’s language and filename for that matter. They complete one single document. As mentioned above, you’ll need to inject your context into that document in a natural way.

The path and name might be easiest. Many files start with a preamble that gives some metadata, like author, project name, or filename. So, we’ll pretend this is happening here as well, and add a line at the very top that reads something like # filepath: foo/bar.py or // filepath: foo.bar.js, depending on comment syntax in the file’s language.

Sometimes the path isn’t known, like with new files that haven’t yet been saved. Even then, we could try to at least specify the language, provided the IDE is aware of it. For many languages, we have the opportunity to include shebang lines like #!/usr/bin/python or #!/usr/bin/node. That’s a neat trick that works pretty well at warding against mistaken language identity. But it’s also a bit dangerous since files with shebang lines are a biased subpopulation of all code. So, let’s do it for short files where the danger of mistaken language identity is high, and avoid it for larger or named files.
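
A simplified sketch of that heuristic might look like the following; the comment syntaxes and the length threshold here are illustrative assumptions rather than the real rules:

// Hypothetical shebang lines per language.
const SHEBANGS: Record<string, string> = {
  python: "#!/usr/bin/python",
  javascript: "#!/usr/bin/node",
};

function metadataHeader(
  language: string,
  filepath: string | undefined,
  lineCount: number
): string | undefined {
  // Prefer a filepath comment when the path is known; it implies the language too.
  if (filepath) {
    const marker = language === "python" ? "#" : "//";
    return `${marker} filepath: ${filepath}`;
  }
  // Shebangs bias toward an unusual subpopulation of files, so only use them for
  // short, unnamed files where mistaken language identity is most likely.
  if (lineCount < 20 && SHEBANGS[language]) {
    return SHEBANGS[language];
  }
  return undefined;
}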

If comments work as a delivery system for tiny nuggets of information, like path or language, we can also make them work as delivery systems for the chunky deep dives that are 60 lines of related code.

Comments are versatile, and commented-out code exists all over GitHub. Let’s look at some of the most common examples:

  1. Old code that doesn’t apply anymore
  2. Deleted features
  3. Earlier versions of current code
  4. Example code specifically left there for documentation purposes
  5. Code lifted from other parts of the codebase

Let’s take our inspiration from the last group of examples. Familiarity with groups (1) – (3) makes things a bit easier on the model, but our snippets aim to emulate groups (4) and (5):

# compare this snippet from utils/concatenate.py:
# def crazy_concat(a, b):
#     return str(a) + str(b)[::-1]

Note that including the file name and path of the snippet source can be useful. And combined with the current file’s path, this might guide completions referencing imports.
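
As a small sketch of that idea, a snippet from another file can be wrapped in comments along with its source path; the helper below is illustrative, with the comment marker passed in per language:

// Turn a snippet from another open file into commented-out lines,
// headed by its source path, in the style shown above.
function snippetAsComment(
  snippet: string,
  sourcePath: string,
  commentMarker: string = "#"
): string {
  const header = `${commentMarker} compare this snippet from ${sourcePath}:`;
  const body = snippet
    .split("\n")
    .map((line) => `${commentMarker} ${line}`)
    .join("\n");
  return `${header}\n${body}`;
}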

Step 4: Prioritization

So far, we have grabbed many pieces of context from many sources: the text directly above the cursor, text below the cursor, text in other files, and metadata like language and file path.

In the vast majority of cases (around 95%), we have to make the tough choice of what we can or cannot include.

We make that choice by thinking of the items we might include as “wishes.” Each time we uncover a piece of context, like a commented out snippet from an open tab, we make a wish. Wishes come with some priority attached, for example, the shebang lines have rather low priorities. Snippets with a low similarity score are barely higher. In contrast, the lines directly above the cursor have maximum priority. Wishes also come with a desired position in the document. The shebang line needs to be the very first item, while the text directly above the cursor comes last—it should directly precede the LLM’s completion.

The fastest way of selecting which wishes to fill and which ones to discard is by sorting that wishlist by priority. Then, we can keep deleting the lowest priority wishes until what remains fits in the context window. We then sort again by the intended order in the document and paste everything together.
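
A simplified sketch of that selection step, with character counts standing in for tokens and made-up priorities, could look like this:

interface Wish {
  text: string;     // the content we would like to include
  priority: number; // higher means more important to keep
  position: number; // lower means earlier in the assembled document
}

function assemblePrompt(wishes: Wish[], budget: number): string {
  // Sort by priority (lowest first) and drop wishes until what remains fits.
  const remaining = [...wishes].sort((a, b) => a.priority - b.priority);
  const size = (ws: Wish[]) => ws.reduce((n, w) => n + w.text.length, 0);
  while (remaining.length > 0 && size(remaining) > budget) {
    remaining.shift(); // discard the lowest-priority wish
  }
  // Then sort by the intended order in the document and paste everything together.
  return remaining
    .sort((a, b) => a.position - b.position)
    .map((w) => w.text)
    .join("\n");
}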

Step 5: The AI does its thing

Now that we’ve assembled an informative prompt, it’s time for the AI to come up with a useful completion. We have always faced a very delicate tradeoff here—GitHub Copilot needs to use a highly capable model because quality makes all the difference between a useful suggestion and a distraction. But at the same time, it needs to be a model capable of speed, because latency makes all the difference between a useful suggestion and not being able to provide a suggestion at all.

So, which AI should we choose to “do its thing” on the completion task: the fastest or the most accurate one? It’s hard to know in advance, so OpenAI developed a fleet of models in collaboration with GitHub. We put two different models in front of developers but found that people got the most mileage (in terms of accepted and retained completions) out of the much faster model. Since then, further optimizations have increased model speed significantly, so that the current version of GitHub Copilot is backed by an even more capable model.

Step 6: Now, over to you!

The generative AI produces a string, and if it’s not stopped, it keeps on producing and will keep going until it predicts the end of the file. That would waste time and compute resources, so you need to set up “stop” criteria.

The most common stop criterion is actually looking for the first line break. In many situations, it seems likely that a software developer wants the current line to be finished, but not more. But some of the most magical contributions by GitHub Copilot are when it suggests multiple lines of code all at once.

Multi-line completions feel natural when they’re about a single semantic unit, such as the body of a function, an if-branch, or a class. GitHub Copilot looks for cases where such a block is being started, either because the developer has just written the start, such as the header, if guard, or class declaration, or is currently writing the start. If the block body appears to be empty, it will attempt to make a suggestion for it, and only stop when the block appears to be done.
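
A rough sketch of those two stopping modes is shown below. Real block detection is language-aware; the brace counting here is only an illustration, assuming the developer has just typed the block opener:

function truncateCompletion(completion: string, blockJustOpened: boolean): string {
  if (!blockJustOpened) {
    // Single-line mode: keep only the text up to the first line break.
    return completion.split("\n")[0];
  }
  // Multi-line mode: keep generating until the block that was just opened is closed.
  let depth = 1;
  for (let i = 0; i < completion.length; i++) {
    if (completion[i] === "{") depth++;
    if (completion[i] === "}") depth--;
    if (depth === 0) return completion.slice(0, i + 1);
  }
  return completion; // the block never closed; fall back to the full completion
}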

This is the point when the suggestion gets surfaced to the coder. And the rest, as they say, is ~~history~~ 10x development.

If you’re interested in learning more about prompt engineering in general and how you can refine your own techniques, check out our guide on getting started with GitHub Copilot.

Crafting a better, faster code view

Post Syndicated from Joshua Brown original https://github.blog/2023-06-21-crafting-a-better-faster-code-view/

Reading code is not as simple as reading the text of a file end-to-end. It is a non-linear, sometimes chaotic process of jumping between files to follow a trail, building a mental picture of how code relates to its surrounding context. GitHub’s mission is to be the home for all developers, and reading code is one of the core experiences we offer. Every day, millions of users use GitHub to view and interact with code. So, about a year ago we set out to create a new code view that supports the entire code reading experience with features like a file tree, symbol navigation, code search integration, sticky lines, and code section folding. The new code view is powerful, intelligent, and interactive, but it is not an attempt to turn the repository browsing experience into an IDE.

While building the new code view, our team had a few guiding principles on which we refused to compromise:

  • It must add these powerful new features to transform how users read code on GitHub.
  • It must be intuitive and easy to use for all of GitHub’s millions of users.
  • It must be fast.

Initial efforts

The first step was to build out the features we wanted in a natural, straightforward way, taking the advice that “premature optimization is the root of all evil.”1 After all, if simple code satisfactorily solves our problems, then we should stop there. We knew we wanted to build a highly interactive and stateful code viewing experience, so we decided to use React to enable us to iterate more quickly on the user interface. Our initial implementation for the code blob was dead-simple: our syntax highlighting service converted the raw file contents to a list of HTML strings corresponding to the lines of the file, and each of these lines was added to the document.

There was one key problem: our performance scaled badly with the number of lines in the file. In particular, our LCP and TTI times measurably increased at around 500 lines, and this increase became noticeable at around 2,000 lines. Around those same thresholds, interactions like highlighting a line or collapsing a code section became similarly sluggish. We take these performance metrics seriously for a number of reasons. Most importantly, they are user-centric—that is, they are meant to measure aspects of the quality of a user’s experience on the page. On top of that, they are also part of how search engines like Google determine where to rank pages in their search results; fast pages get shown first, and the code view is one of the many ways GitHub’s users can show their work to the world.

As we dug in, we discovered that there were a few things at play:

  • When there are many DOM nodes on the page, style calculations and paints take longer.
  • When there are many DOM nodes on the page, DOM queries take longer, and the results can have a significant memory footprint.
  • When there are many React nodes on the page, renders and DOM reconciliation both take longer.

It’s worth noting that none of these are problems with React specifically; any page with a very large DOM would experience the first two problems, and any solution where a large DOM is created and managed by JavaScript would experience the third.

We mitigated these problems considerably by ensuring that we were not running these expensive operations more than necessary. Typical React optimization techniques like memoization and debouncing user input, as well as some less common solutions like pulling in an observer pattern went a long way toward ensuring that React state updates, and therefore DOM updates, only occurred as needed.

Mitigating the problem, however, is not solving the problem. Even with all of these optimizations in place, the initial render of the page remained a fundamentally expensive operation for large files. In the repository that builds GitHub.com, for example, we have a CODEOWNERS file that is about 18,000 lines long, and pushes the 2MB size limit for displaying files in the UI. With no optimizations besides the ones described above, React’s first pass at building the DOM for this page takes nearly 27 seconds.2 Considering more than half of users will abandon a page if it loads for more than three seconds, there was obviously lots of work left to do.

A promising but incomplete solution

Enter virtualization. Virtualization is a performance optimization technique that examines the scroll state of the page to determine what content to include in the DOM. For example, if we are viewing a 10,000 line file but only about 75 lines fit on the screen at a time, we can save lots of time by only rendering the lines that fit in the viewport. As the user scrolls, we add any lines that need to appear, and remove any lines that can disappear, as illustrated by this demo.3
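
A minimal sketch of that calculation (not the code view's actual implementation) takes the scroll position and line height and returns the range of lines to mount, plus a little overscan for smooth scrolling:

interface VisibleRange {
  firstLine: number;
  lastLine: number;
}

function visibleLines(
  scrollTop: number,
  viewportHeight: number,
  lineHeight: number,
  totalLines: number,
  overscan = 10 // render a few extra lines above and below the viewport
): VisibleRange {
  const firstLine = Math.max(0, Math.floor(scrollTop / lineHeight) - overscan);
  const lastLine = Math.min(
    totalLines - 1,
    Math.ceil((scrollTop + viewportHeight) / lineHeight) + overscan
  );
  return { firstLine, lastLine };
}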

This satisfies the most basic requirements of the page with flying colors. It loads on average more quickly than the existing experience, and the experience of scrolling through the file is nearly indistinguishable from the non-virtualized case. Remember that 27 second initial render? Virtualizing the file content gets that time down to under a second, and that number does not increase substantially even if we artificially remove our file size limit and pull in hundreds of megabytes of text.

Unfortunately, virtualization is not a cure-all. While our initial implementation added features to the page at the expense of performance, naïvely virtualizing the code lines delivers a fast experience at the expense of vital functionality. The biggest problem was that without the entire text of the file on the page at once, the browser’s built-in find-in-file only surfaced results that are visible in the viewport. Breaking users’ ability to find text on the page breaks our hard requirement that the page remain intuitive and easy to use. Before we could ship any of this to real users, we had to ensure that this use case would be covered.

The immediate solution was to implement our own version of find-in-file by implementing a custom handler for the Ctrl+F shortcut (⌘+F on Mac). We added a new piece of UI in the sidebar to show results as part of our integration with symbol navigation and code search.

Screenshot of the "find" sidebar, showing a search bar with the term "isUn" and a list of five lines of code from the current file that contain that string, the second of which is highlighted as selected.

There is precedent for overriding this native browser feature to allow users to find text in virtualized code lines. Monaco, the text editor behind VS Code, does exactly this to solve the same problem, as do many other online code editors, including Repl.it and CodePen. Some other editors like the official Ruby playground ignore the problem altogether and accept that Ctrl+F will be partially broken within their virtualized editor.

At the time, we felt confident leaning on this precedent. These examples are applications that run in a browser window, and as users, we expect applications to implement their own controls. Writing our own way to find text on the page was a step toward making GitHub’s Code View less of a web page and more of a web application.

When we released the new code view experience as a private beta at GitHub Universe, we received clear feedback that our users think of GitHub as a page, not as an app. We tried to rework the experience to be as similar as possible to the native implementation, both in terms of user experience and performance. But ultimately, there are plenty of good reasons not to override this kind of native browser behavior.

  • Users of assistive technologies often use Ctrl+F to locate elements on a page, so restricting the scope to the contents of the file broke these workflows.
  • Users rely heavily on specific muscle memory for common actions, and we followed a deep rabbit hole to get the custom control to support all of the shortcuts used by various browsers.
  • Finally, the native browser implementation is simply faster.

Despite plenty of precedent for an overridden find experience, this user feedback drove us to dig deeper into how we could lean on the browser for something it already does well.

Virtualization has an important role to play in our final product, but it is only one piece of the puzzle.

How the pieces fit together

Our complete solution for the code view features two pieces:

  1. A textarea that contains the entire text of the raw file. The contents are accessible, keyboard-navigable, copyable, and findable, yet invisible.
  2. A virtualized, syntax-highlighted overlay. The contents are visible, yet hidden from both mouse events and the browser’s find.

Together, these pieces deliver a code view that supports the complete code reading experience with many new features. Despite the added complexity, this new experience is faster to render than the static HTML page that has displayed code on GitHub for more than a decade.
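
To illustrate the layering, here is a sketch of the structure: a read-only, visually transparent textarea with a virtualized overlay drawn on top of it. The component name, props, and styles are assumptions for illustration, not the production component:

import React from "react";

interface CodeViewProps {
  rawText: string;            // the entire raw file, for find/copy/keyboard navigation
  visibleLinesHtml: string[]; // syntax-highlighted HTML for only the visible lines
  firstVisibleLine: number;
  lineHeight: number;
}

export function CodeView({ rawText, visibleLinesHtml, firstVisibleLine, lineHeight }: CodeViewProps) {
  return (
    <div style={{ position: "relative" }}>
      {/* Layer 1: the real text. Invisible, but findable, copyable, and navigable. */}
      <textarea
        readOnly
        value={rawText}
        style={{ position: "absolute", inset: 0, color: "transparent", background: "transparent" }}
      />
      {/* Layer 2: the virtualized overlay. It ignores the mouse; hiding it from the
          browser's find relies on the data-code-text trick described below.
          Treating it as aria-hidden is an assumption here. */}
      <div style={{ pointerEvents: "none" }} aria-hidden="true">
        {visibleLinesHtml.map((html, i) => (
          <div
            key={firstVisibleLine + i}
            style={{ position: "absolute", top: (firstVisibleLine + i) * lineHeight, left: 0 }}
            dangerouslySetInnerHTML={{ __html: html }}
          />
        ))}
      </div>
    </div>
  );
}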

A textarea and a read-only cursor

The first half of this solution came to us from an unexpected angle.

Beyond adding functionality to the code view, we wanted to improve the code reading experience for users of assistive technologies like screen readers. The previous code view was minimally accessible; a code document was displayed as a table, which created a very surprising experience for screen reader users. A code document is not a table, but likewise it is not a paragraph of text. To support a familiar interface for interacting with the code on the page, we added an invisible textarea underneath the virtualized, syntax-highlighted code lines so that users can move through the code with the keyboard in a familiar way. And for the browser, rendering a textarea is much simpler than using JavaScript to insert syntax-highlighted HTML. Browsers can render megabytes of text in a textarea with ease.

Since this textarea contains the entire text of the raw file, it is not just an accessibility feature, but an opportunity to remove our custom implementation of Ctrl+F in favor of native browser implementations.

Hiding text from Ctrl+F

With the addition of the textarea, we now have two copies of every line that is visible in the viewport: one in the textarea, and another in the virtualized, syntax-highlighted overlay. In this state, searching for text yields duplicated results, which is more confusing than a slow or unfamiliar experience.

The question, then, is how to expose only one copy of the text to the browser’s native Ctrl+F. That brings us to the next key part of our solution: how we hid the syntax-highlighted overlay from find.

For a code snippet like this line of Python:

print("Hello!")

the old code view created a bit of HTML that looks like this:

<span class="pl-en">print</span>(<span class="pl-s">"Hello!"</span>)

But the text nodes containing print, (, "Hello!", and ) are all findable. It took two iterations to arrive at a format that looks identical but is consistently hidden from Ctrl+F on all major browsers. And as it turns out, this is not a question that is very easy to research!

The first approach we tried relied on the fact that :before pseudoelements are not part of the DOM, and therefore do not appear in find results. With a bit of a change to our HTML format that moves all text into a data- attribute, we can use CSS to inject the code text into the page without any findable text nodes.

HTML

<span class="pl-en" data-code-text="print"></span>
<span data-code-text="("></span>
<span class="pl-s" data-code-text=""Hello!""></span>
<span data-code-text=")"></span>

CSS

[data-code-text]:before {
   content: attr(data-code-text);
}

But that’s not the end of the story, because the major browsers do not agree on whether text in :before pseudoelements should be findable; Firefox in particular has a powerful Ctrl+F implementation that is not fooled by our first trick.

Our second attempt relied on a fact on which all browsers seem to agree: that text in adjacent pseudoelements is not treated as a contiguous block of text.4 So, even though Firefox would find print in the first example, it would not find print(. The solution, then, is to break up the text character-by-character:

<span class="pl-en">
   <span data-code-text="p"></span>
   <span data-code-text="r"></span>
   <span data-code-text="i"></span>
   <span data-code-text="n"></span>
   <span data-code-text="t"></span>
</span>
<span data-code-text="("></span>
<span class="pl-s">
   <span data-code-text="&quot;"></span>
   <span data-code-text="H"></span>
   <span data-code-text="e"></span>
   <span data-code-text="l"></span>
   <span data-code-text="l"></span>
   <span data-code-text="o"></span>
   <span data-code-text="!"></span>
   <span data-code-text="&quot;"></span>
</span>
<span data-code-text=")"></span>

At first glance, this might seem to complicate the DOM so much that it might outweigh the performance gains for which we worked so hard. But since these lines are virtualized, we create this overlay for at most a few hundred lines at a time.

Syntax highlighting in a compact format

The path we took to build a faster code view with more features was, like the path one might follow when reading code in a new repository, highly non-linear. Performance optimizations led us to fix behaviors which were not quite right, and those behavior fixes led us to need further performance optimizations. Knowing how we wanted the HTML for the syntax-highlighted overlay to look, we had a few options for how to make it happen. After a number of experiments, we completed our puzzle with a performance optimization that ended this cycle without causing any behavior changes.

Our syntax-highlighting service previously gave us a list of HTML strings, one for each line of code:

[
   "<span class=\"pl-en\">print</span>(<span class=\"pl-s\">"Hello!"</span>)"
]

In order to display code in a different format, we introduced a new format that simply gives the locations and css classes of highlighted segments:

[
   [
       {"start": 0, "end": 5, "cssClass": "pl-en"},
       {"start": 6, "end": 14, "cssClass": "pl-s"}
   ]
]

From here, we can easily generate whatever HTML we want. And that brings us to our final optimization: within our syntax-highlighted overlay, we save React the trouble of managing the code lines by generating the HTML strings ourselves. This can deliver a surprisingly large performance boost in certain cases, like scrolling all the way through the 18,000-line CODEOWNERS file mentioned earlier. With React managing the entire DOM, we hit the “end” key to move all the way to the end of the file, and it takes the browser 870 milliseconds to finish handling the “keyup” event, followed by 3,700 milliseconds of JavaScript blocking the main thread. When we generate the code lines as HTML strings, handling the “keyup” event takes only 80 milliseconds, followed by about 700 milliseconds of blocking JavaScript.5
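
As a sketch of that final step (with escaping simplified, and not the exact strings shipped to production), the compact segment format can be expanded into the per-character overlay markup like this:

interface HighlightSegment {
  start: number;
  end: number;
  cssClass: string;
}

// Minimal escaping for text placed inside a data- attribute.
function escapeAttr(ch: string): string {
  return ch.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

// Expand one line of code plus its highlight segments into per-character spans.
function lineToOverlayHtml(line: string, segments: HighlightSegment[]): string {
  const charSpans = (text: string) =>
    [...text].map((ch) => `<span data-code-text="${escapeAttr(ch)}"></span>`).join("");

  let html = "";
  let cursor = 0;
  for (const segment of segments) {
    html += charSpans(line.slice(cursor, segment.start)); // unhighlighted run
    html += `<span class="${segment.cssClass}">${charSpans(line.slice(segment.start, segment.end))}</span>`;
    cursor = segment.end;
  }
  html += charSpans(line.slice(cursor)); // trailing unhighlighted run
  return html;
}

// Example: lineToOverlayHtml('print("Hello!")',
//   [{ start: 0, end: 5, cssClass: "pl-en" }, { start: 6, end: 14, cssClass: "pl-s" }]);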

In summary

GitHub’s mission is to be the home for all developers. Developers spend a substantial amount of their time reading code, and reading code is hard! We spent the past year building a new code view that supports the entire code reading experience because we are passionate about bringing great tools to the developers of the world to make their lives a bit easier.

After a lot of difficult work, we have created a code view that introduces tons of new features for understanding code in context, and those features can be used by anyone. And we did it all while also making the page faster!

We’re proud of what we built, and we would love for everyone to try it out and send us feedback!

Notes


  1. This quote, popularized by and often attributed to Donald Knuth, was first said by Sir Tony Hoare, the developer of the quicksort algorithm. 
  2. All performance metrics generated for this post use a development build of React in order to better compare apples to apples. 
  3. Check out the source code for this virtualization demo here! 
  4. The fact that browsers do not treat adjacent :before elements as part of the same block of text also introduces another complication: it resets the tab stop location for each node, which means that tabs are not rendered with the correct width! We need the syntax-highlighted overlay to align exactly with the text content underneath because any discrepancy creates a highly confusing user experience. Luckily, since the overlay is neither findable nor copyable, we can modify it however we like. The tab width problem is solved neatly by converting tabs to the appropriate number of spaces in the overlay. 
  5. Although code on GitHub is often nested deeply, the syntax information for a line of code can still be described linearly much of the time—we have a keyword followed by some plain text and then a string literal, etc. But sometimes it is not so simple—we might have a Markdown document with a code section. That code section might be an HTML document with a script tag. That script tag might contain JavaScript. That JavaScript might contain doc comments on a function. Those doc comments might contain @param tags which are rendered as keywords. We can handle this kind of arbitrarily nested syntax tree with a recursive React component. But that means the shape of our tree of React nodes, and therefore the amount of time it takes to perform DOM reconciliation, is determined by the code our users have chosen to write. On top of that, React adds DOM nodes one-at-a-time, and our overlay uses one DOM node per character of code. These are the main reasons that sidestepping React for this part of the page gives us such a dramatic performance boost. 

How to use GitHub Copilot: Prompts, tips, and use cases

Post Syndicated from Rizel Scarlett original https://github.blog/2023-06-20-how-to-write-better-prompts-for-github-copilot/

Generative AI coding tools are transforming the way developers approach their daily coding tasks. From documenting our codebases to generating unit tests, these tools are helping to accelerate our workflows. However, just like with any emerging technology, there's always a learning curve. As a result, developers, both beginners and experienced ones, sometimes feel frustrated when AI-powered coding assistants don't generate the output they want. (Sound familiar?)

For example, when asking GitHub Copilot to draw an ice cream cone 🍦 using p5.js, a JavaScript library for creative coding, we kept receiving irrelevant suggestions, or sometimes no suggestions at all. But when we learned more about the way that GitHub Copilot processes information, we realized that we had to adjust the way we communicated with it.

Here's an example of GitHub Copilot generating an irrelevant solution:

When we wrote this prompt to GitHub Copilot,

When we adjusted our prompt, we were able to generate more accurate results:

When we wrote this prompt to GitHub Copilot,

We're both developers and AI enthusiasts. I, Rizel, have used GitHub Copilot to build a browser extension; a rock, paper, scissors game; and to send a tweet. And I, Michelle, launched an AI company in 2021. We're both Developer Advocates at GitHub and we love sharing our top tips for working with GitHub Copilot.

In this GitHub Copilot guide, we'll cover:

  • What exactly a prompt is and what prompt engineering is, too (hint: it depends on whether you're talking to a developer or a machine learning researcher)
  • Three best practices and three additional tips for prompt crafting with GitHub Copilot
  • An example where you can try prompting GitHub Copilot to help you build a browser extension

Progress over perfection

Even with our experience using AI, we recognize that everyone is in a trial and error phase with generative AI technology. We also know the challenge of providing generalized prompt-crafting tips because models vary, as do the individual problems that developers are working on. This isn't an end-all, be-all guide. Instead, we're sharing what we've learned about prompt crafting to accelerate collective learning during this new age of software development.

What's a prompt and what is prompt engineering?

It depends on who you talk to.

In the context of generative AI coding tools, a prompt can mean different things, depending on whether you're asking machine learning (ML) researchers who are building and fine-tuning these tools, or developers who are using them in their IDEs.

For this guide, we'll define the terms from the point of view of a developer who's using a generative AI coding tool in the IDE. But to give you the full picture, we also added the ML researcher definitions below in our chart.

|  | Prompts | Prompt engineering | Context |
| --- | --- | --- | --- |
| Developer | Code blocks, individual lines of code, or natural language comments that a developer writes to generate a specific suggestion from GitHub Copilot | Providing instructions or comments in the IDE to generate specific code suggestions | Details that are provided by a developer to specify the desired output from a generative AI coding tool |
| ML researcher | Compilation of IDE code and relevant context (IDE comments, code in open files, etc.) that is continuously generated by algorithms and sent to the model of a generative AI coding tool | Creation of algorithms that will generate prompts (compilations of IDE code and context) for a large language model | Details (like data from your open files and code you've written before and after the cursor) that the algorithms send to a large language model (LLM) as additional information about the code |

3 best practices for prompt crafting with GitHub Copilot

1. Set the stage with a high-level goal 🖼

This is most helpful if you have a blank file or an empty codebase. In other words, if GitHub Copilot has no context of what you want to build or accomplish, setting the stage for the AI pair programmer can be really useful. It helps to prime GitHub Copilot with a big-picture description of what you want it to generate, before you jump in with the details.

When prompting GitHub Copilot, think of the process as having a conversation with someone: How should I break down the problem so we can solve it together? How would I approach pair programming with this person?

For example, when building a markdown editor in Next.js, we could write a comment like this:


/*
Create a basic markdown editor in Next.js with the following features:
- Use react hooks
- Create state for markdown with default text "type markdown here"
- A text area where users can write markdown
- Show a live preview of the markdown as the user types
- Support for basic markdown syntax like headers, bold, italics
- Use React markdown npm package
- The markdown text and resulting HTML should be saved in the component's state and updated in real time
*/

This will prompt GitHub Copilot to generate the following code and produce a very simple, unstyled but functional markdown editor in less than 30 seconds. We can use the remaining time to style the component:

We used this prompt to build a markdown editor in Next.js using GitHub Copilot:
- Use react hooks
- Create state for markdown with default text

Note: This level of detail helps to create a more desired output, but the results may still be non-deterministic. For example, in the comment, we prompted GitHub Copilot to create default text that says "type markdown here," but instead it generated "markdown preview" as the default words.

2. Make your ask simple and specific. Aim to receive a short output from GitHub Copilot. 🗨

Once you communicate your main goal to GitHub Copilot, articulate the logic and steps it needs to follow to reach that goal. GitHub Copilot better understands your goal when you break things down. (Imagine you're writing a recipe. You'd break the cooking process down into discrete steps, not write a paragraph describing the dish you want to make.)

Let GitHub Copilot generate the code after each step, rather than asking it to generate a bunch of code all at once.

Here's an example of us providing GitHub Copilot with step-by-step instructions for reversing a function:

We prompted GitHub Copilot to reverse a sentence by writing six prompts one at a time. This allowed GitHub Copilot to generate a suggestion for one prompt before moving onto the text. It also gave us the chance to tweak the suggested code before moving onto the next step. The six prompts we used were: First, let's make the first letter of the sentence lower case if it's not an 'I.' Next, let's split the sentence into an array of words. Then, let's take out the punctuation marks from the sentence. Now, let's remove the punctuation marks from the sentence. Let's reverse the sentence and join it back together. Finally, let's make the first letter of the sentence capital and add the punctuation marks.

3. Give GitHub Copilot a few examples. ✍

Learning from examples isn't just useful for humans, but also for your AI pair programmer. For example, we wanted to extract the names from the data array below and store them in a new array:


const data = [
  [
    { name: 'John', age: 25 },
    { name: 'Jane', age: 30 }
  ],
  [
    { name: 'Bob', age: 40 }
  ]
];

When you don't show GitHub Copilot an example …


// Map through an array of arrays of objects to transform data
const data = [
  [
    { name: 'John', age: 25 },
    { name: 'Jane', age: 30 }
  ],
  [
    { name: 'Bob', age: 40 }
  ]
];

This generated an incorrect usage of map:


const mappedData = data.map(x => x.name);

console.log(mappedData);

// Results: [undefined, undefined]

In contrast, when we provided an example …


// Map through an array of arrays of objects
// Example: Extract names from the data array
// Desired result: ['John', 'Jane', 'Bob']
const data = [
  [{ name: 'John', age: 25 }, { name: 'Jane', age: 30 }],
  [{ name: 'Bob', age: 40 }]
];

We received our desired output.


const mappedData = data.flatMap(sublist => sublist.map(person => person.name));

console.log(mappedData);
// Results: ['John', 'Jane', 'Bob']

Read more about common approaches to AI training, such as zero-shot, one-shot, and few-shot learning.

Three additional tips for prompt crafting with GitHub Copilot

Here are three additional tips to help guide your conversation with GitHub Copilot.

1. Experiment with your prompts.

Just like conversation is more of an art than a science, so is prompt crafting. So, if you don't receive what you want on the first try, recraft your prompt by following the best practices above.

For example, the prompt below is vague. It doesn't provide any context or boundaries for GitHub Copilot to generate relevant suggestions.


# Write some code for grades.py

We iterated on the prompt to be more specific, but we still didn't get the exact result we were looking for. This is a good reminder that adding specificity to your prompt is harder than it sounds. It's difficult to know, from the start, which details you should include about your goal to generate the most useful suggestions from GitHub Copilot. That's why we encourage experimentation.
The version of the prompt below is more specific than the one above, but it doesn't clearly define the input and output requirements.


# Implement a function in grades.py to calculate the average grade

We experimented with the prompt once more by setting boundaries and outlining what we wanted the function to do. We also rephrased the comment so the function was clearer (giving GitHub Copilot a clear intention to verify against).

This time, we got the results we were looking for.


# Implement the function calculate_average_grade in grades.py that takes a list of grades as input and returns the average grade as a floating point number

2. Keep a few tabs open.

We don't have an exact number of tabs that you should keep open to help GitHub Copilot contextualize your code, but from our experience, we've found that one or two is helpful.

GitHub Copilot uses a technique called neighboring tabs that allows the AI pair programmer to contextualize your code by processing all of the files open in your IDE instead of just the single file you're working on. However, it's not guaranteed that GitHub Copilot will deem all open files as necessary context for your code.

3. Use good coding practices.

That includes providing descriptive variable names and functions, and following consistent coding styles and patterns. We've found that working with GitHub Copilot encourages us to follow the good coding practices we've learned throughout our careers.

For example, here we used a descriptive function name and followed the codebase's patterns of using snake case.


def authenticate_user(username, password):

As a result, GitHub Copilot generated a relevant code suggestion:


def authenticate_user(username, password):
    # Code for authenticating the user
    if is_valid_user(username, password):
        generate_session_token(username)
        return True
    else:
        return False

Compare this to the example below, where we introduced an inconsistent coding style and poorly named our function.


def rndpwd(l):

Instead of suggesting code, GitHub Copilot generated a comment that said, "Code goes here."


def rndpwd(l):
    # Code goes here

Stay smart

The LLMs behind generative AI coding tools are designed to find and extrapolate patterns from their training data, apply those patterns to existing language, and then produce code that follows those patterns. Given the sheer scale of these models, they might generate a code sequence that doesn't even exist yet. Just as you would review a colleague's code, you should always assess, analyze, and validate AI-generated code.

A practice example 👩🏽‍💻

Try prompting GitHub Copilot to create a browser extension.

To get started, you'll need to have GitHub Copilot installed and open in your IDE. We also have access to an early preview of GitHub Copilot chat, which is what we've been using when we have questions about our code. If you don't have GitHub Copilot chat, sign up for the waitlist. Until then, you can pair GitHub Copilot with ChatGPT.

More generative AI prompt crafting guides

  • A beginner's guide to prompt engineering with GitHub Copilot
  • Prompt engineering for AI
  • How GitHub Copilot is getting better at understanding your code


    3. Utilice buenas prácticas de codificación.

    Eso incluye proporcionar nombres y funciones de variables descriptivas, y seguir estilos y patrones de codificación consistentes. Hemos descubierto que trabajar con GitHub Copilot nos anima a seguir las buenas prácticas de codificación que hemos aprendido a lo largo de nuestras carreras.

    Por ejemplo, aquí usamos un nombre de función descriptiva y seguimos los patrones de la base de código para aprovechar el caso de la serpiente.

    
    def authenticate_user(nombre de usuario, contraseña):
    

    Como resultado, GitHub Copilot generó una sugerencia de código relevante:

    
    def authenticate_user(nombre de usuario, contraseña):
        # Código para autenticar al usuario
        Si is_valid_user(nombre de usuario, contraseña):
            generate_session_token(nombre de usuario)
            return True
        más:
            return Falso
    

    Compare esto con el siguiente ejemplo, donde introdujimos un estilo de codificación inconsistente y mal nombramos nuestra función.

    
    def rndpwd(l):
    

    En lugar de sugerir código, GitHub Copilot generó un comentario que decía: “El código va aquí”.

    
    def rndpwd(l):
        # El código va aquí
    

    Mantente inteligente

    Los LLM detrás de las herramientas generativas de codificación de IA están diseñados para encontrar y extrapolar patrones de sus datos de entrenamiento, aplicar esos patrones al lenguaje existente y luego producir código que siga esos patrones. Dada la gran escala de estos modelos, podrían generar una secuencia de código que ni siquiera existe todavía. Al igual que revisaría el código de un colega, siempre debe evaluar, analizar y validar el código generado por IA.

    Un ejemplo 👩🏽 💻 de práctica

    Intenta pedirle a GitHub Copilot que cree una extensión del navegador.

    Para comenzar, deberás tener GitHub Copilot instalado y abierto en tu IDE. También tenemos acceso a una vista previa temprana del chat de GitHub Copilot, que es lo que hemos estado usando cuando tenemos preguntas sobre nuestro código. Si no tienes el chat de GitHub Copilot, regístrate en la lista de espera. Hasta entonces, puede emparejar GitHub Copilot con ChatGPT.

    Guías de elaboración de avisos de IA más generativas

  • Una guía para principiantes sobre ingeniería rápida con GitHub Copilot
  • Ingeniería rápida para IA
  • Cómo GitHub Copilot está mejorando en la comprensión de tu código


    Generative AI coding tools are transforming the way developers approach daily coding tasks. From documenting our codebases to generating unit tests, these tools are helping to accelerate our workflows. However, just like with any emerging tech, there’s always a learning curve. As a result, developers—beginners and experienced alike—sometimes feel frustrated when AI-powered coding assistants don’t generate the output they want. (Feel familiar?)

    For example, when asking GitHub Copilot to draw an ice cream cone 🍦 using p5.js, a JavaScript library for creative coding, we kept receiving irrelevant suggestions—or sometimes no suggestions at all. But when we learned more about the way that GitHub Copilot processes information, we realized that we had to adjust the way we communicated with it.

    Here’s an example of GitHub Copilot generating an irrelevant solution in response to our original prompt.

    When we adjusted our prompt, we were able to generate more accurate results.

    We’re both developers and AI enthusiasts ourselves. I, Rizel, have used GitHub Copilot to build a browser extension and a rock, paper, scissors game, and to send a Tweet. And I, Michelle, launched an AI company in 2016. We’re both developer advocates at GitHub and love to share our top tips for working with GitHub Copilot.

    In this guide for GitHub Copilot, we’ll cover three best practices for prompt crafting and three additional tips to help guide your conversation with the AI pair programmer.

    Progress over perfection

    Even with our experience using AI, we recognize that everyone is in a trial and error phase with generative AI technology. We also know the challenge of providing generalized prompt-crafting tips because models vary, as do the individual problems that developers are working on. This isn’t an end-all, be-all guide. Instead, we’re sharing what we’ve learned about prompt crafting to accelerate collective learning during this new age of software development.

    What’s a prompt and what is prompt engineering?

    It depends on who you talk to.

    In the context of generative AI coding tools, a prompt can mean different things, depending on whether you’re asking machine learning (ML) researchers who are building and fine-tuning these tools, or developers who are using them in their IDEs.

    For this guide, we’ll define the terms from the point of view of a developer who’s using a generative AI coding tool in the IDE. But to give you the full picture, we also added the ML researcher definitions below in our chart.

    Prompts
    • Developer: Code blocks, individual lines of code, or natural language comments a developer writes to generate a specific suggestion from GitHub Copilot
    • ML researcher: Compilation of IDE code and relevant context (IDE comments, code in open files, etc.) that is continuously generated by algorithms and sent to the model of a generative AI coding tool

    Prompt engineering
    • Developer: Providing instructions or comments in the IDE to generate specific coding suggestions
    • ML researcher: Creating algorithms that will generate prompts (compilations of IDE code and context) for a large language model

    Context
    • Developer: Details that are provided by a developer to specify the desired output from a generative AI coding tool
    • ML researcher: Details (like data from your open files and code you’ve written before and after the cursor) that algorithms send to a large language model (LLM) as additional information about the code

    3 best practices for prompt crafting with GitHub Copilot

    1. Set the stage with a high-level goal. 🖼

    This is most helpful if you have a blank file or empty codebase. In other words, if GitHub Copilot has zero context of what you want to build or accomplish, setting the stage for the AI pair programmer can be really useful. It helps to prime GitHub Copilot with a big picture description of what you want it to generate—before you jump in with the details.

    When prompting GitHub Copilot, think of the process as having a conversation with someone: How should I break down the problem so we can solve it together? How would I approach pair programming with this person?

    For example, when building a markdown editor in Next.js, we could write a comment like this:

    /*
    Create a basic markdown editor in Next.js with the following features:
    - Use react hooks
    - Create state for markdown with default text "type markdown here"
    - A text area where users can write markdown 
    - Show a live preview of the markdown text as I type
    - Support for basic markdown syntax like headers, bold, italics 
    - Use React markdown npm package 
    - The markdown text and resulting HTML should be saved in the component's state and updated in real time 
    */
    

    This will prompt GitHub Copilot to generate the following code and produce a very simple, unstyled but functional markdown editor in less than 30 seconds. We can use the remaining time to style the component:

    We used this prompt to build a markdown editor in Next.js using GitHub Copilot.

    Note: This level of detail helps you steer GitHub Copilot toward the output you want, but the results may still be non-deterministic. For example, in the comment, we prompted GitHub Copilot to create default text that says “type markdown here” but instead it generated “markdown preview” as the default words.
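    For reference, here’s a minimal sketch of the kind of component a prompt like this tends to produce. This is our own illustrative version, not GitHub Copilot’s exact output, and it assumes the react and react-markdown packages are installed:

    import { useState } from 'react';
    import ReactMarkdown from 'react-markdown';

    export default function MarkdownEditor() {
      // State for the markdown source, with default text
      const [markdown, setMarkdown] = useState('type markdown here');

      return (
        <div>
          {/* Text area where users write markdown */}
          <textarea
            value={markdown}
            onChange={(e) => setMarkdown(e.target.value)}
          />
          {/* Live preview rendered from the same state, updated on every keystroke */}
          <ReactMarkdown>{markdown}</ReactMarkdown>
        </div>
      );
    }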

    2. Make your ask simple and specific. Aim to receive a short output from GitHub Copilot. 🗨

    Once you communicate your main goal to the AI pair programmer, articulate the logic and steps it needs to follow for achieving that goal. GitHub Copilot better understands your goal when you break things down. (Imagine you’re writing a recipe. You’d break the cooking process down into discrete steps–not write a paragraph describing the dish you want to make.)

    Let GitHub Copilot generate the code after each step, rather than asking it to generate a bunch of code all at once.

    Here’s an example of us providing GitHub Copilot with step-by-step instructions for reversing a function:

    We prompted GitHub Copilot to reverse a sentence by writing six prompts one at a time. This allowed GitHub Copilot to generate a suggestion for one prompt before moving on to the next. It also gave us the chance to tweak the suggested code before moving on to the next step. The six prompts we used were: First, let's make the first letter of the sentence lower case if it's not an 'I.' Next, let's split the sentence into an array of words. Then, let's take out the punctuation marks from the sentence. Now, let's remove the punctuation marks from the sentence. Let's reverse the sentence and join it back together. Finally, let's make the first letter of the sentence capital and add the punctuation marks.
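    To make the step-by-step approach concrete, here’s a rough sketch of where prompts like these can lead. The function name, regular expressions, and implementation details are our own; GitHub Copilot’s actual suggestions will vary:

    // Reverse a sentence, one commented step at a time
    function reverseSentence(sentence) {
      // First, let's make the first letter of the sentence lower case if it's not an 'I'
      let result = sentence[0] === 'I' ? sentence : sentence[0].toLowerCase() + sentence.slice(1);

      // Next, let's take out the punctuation marks from the sentence
      const punctuation = /[.!?]$/.test(result) ? result.slice(-1) : '';
      result = result.replace(/[.!?]$/, '');

      // Then, let's split the sentence into an array of words
      const words = result.split(' ');

      // Let's reverse the sentence and join it back together
      result = words.reverse().join(' ');

      // Finally, let's make the first letter of the sentence capital and add the punctuation marks
      return result[0].toUpperCase() + result.slice(1) + punctuation;
    }

    console.log(reverseSentence("The quick brown fox jumps over the lazy dog."));
    // "Dog lazy the over jumps fox brown quick the."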

    3. Give GitHub Copilot an example or two. ✍

    Learning from examples is not only useful for humans, but also for your AI pair programmer. For instance, we wanted to extract the names from the array of data below and store it in a new array:

    const data = [
      [
        { name: 'John', age: 25 },
        { name: 'Jane', age: 30 }
      ],
      [
        { name: 'Bob', age: 40 }
      ]
    ];
    

    When we didn’t show GitHub Copilot an example …

    // Map through an array of arrays of objects to transform data
    const data = [
      [
        { name: 'John', age: 25 },
        { name: 'Jane', age: 30 }
      ],
      [
        { name: 'Bob', age: 40 }
      ]
    ];
    
    

    It generated an incorrect usage of map:

    const mappedData = data.map(x => x.name);
    
    console.log(mappedData);
    
    // Results: [undefined, undefined]
    

    By contrast, when we did provide an example …

    // Map through an array of arrays of objects
    // Example: Extract names from the data array
    // Desired outcome: ['John', 'Jane', 'Bob']
    const data = [
      [{ name: 'John', age: 25 }, { name: 'Jane', age: 30 }],
      [{ name: 'Bob', age: 40 }]
    ];
    
    
    

    We received our desired outcome.

    const mappedData = data.flatMap(sublist => sublist.map(person => person.name));
    
    console.log(mappedData);
    // Results: ['John', 'Jane', 'Bob']
    

    Read more about common approaches to AI training, such as zero-shot, one-shot, and few-shot learning.

    Three additional tips for prompt crafting with GitHub Copilot

    Here are three additional tips to help guide your conversation with GitHub Copilot.

    1. Experiment with your prompts.

    Just as conversation is more of an art than a science, so is prompt crafting. So, if you don’t receive what you want on the first try, recraft your prompt by following the best practices above.

    For example, the prompt below is vague. It doesn’t provide any context or boundaries for GitHub Copilot to generate relevant suggestions.

    # Write some code for grades.py  
    

    We iterated on the prompt to be more specific, but we still didn’t get the exact result we were looking for. This is a good reminder that adding specificity to your prompt is harder than it sounds. It’s difficult to know, from the start, which details you should include about your goal to generate the most useful suggestions from GitHub Copilot. That’s why we encourage experimentation.

    The version of the prompt below is more specific than the one above, but it doesn’t clearly define the input and output requirements.

    # Implement a function in grades.py to calculate the average grade
    

    We experimented with the prompt once more by setting boundaries and outlining what we wanted the function to do. We also rephrased the comment so the function was more clear (giving GitHub Copilot a clear intention to verify against).

    This time, we got the results we were looking for.

    # Implement the function calculate_average_grade in grades.py that takes a list of grades as input and returns the average grade as a floating-point number
    

    2. Keep a couple of relevant tabs open.

    We don’t have an exact number of tabs that you should keep open to help GitHub Copilot contextualize your code, but from our experience, we’ve found that one or two is helpful.

    GitHub Copilot uses a technique called neighboring tabs that allows the AI pair programmer to contextualize your code by processing all of the files open in your IDE instead of just the single file you’re working on. However, it’s not guaranteed that GitHub Copilot will deem all open files as necessary context for your code.

    3. Use good coding practices.

    That includes providing descriptive variable names and functions, and following consistent coding styles and patterns. We’ve found that working with GitHub Copilot encourages us to follow good coding practices we’ve learned throughout our careers.

    For example, here we used a descriptive function name and followed the codebase’s patterns of leveraging snake case.

    def authenticate_user(username, password):
    

    As a result, GitHub Copilot generated a relevant code suggestion:

    def authenticate_user(username, password):
        # Code for authenticating the user
        if is_valid_user(username, password):
            generate_session_token(username)
            return True
        else:
            return False
    

    Compare this to the example below, where we introduced an inconsistent coding style and poorly named our function.

    def rndpwd(l):
    

    Instead of suggesting code, GitHub Copilot generated a comment that said, “Code goes here.”

    def rndpwd(l):
        # Code goes here
    

    Stay smart

    The LLMs behind generative AI coding tools are designed to find and extrapolate patterns from their training data, apply those patterns to existing language, and then produce code that follows those patterns. Given the sheer scale of these models, they might generate a code sequence that doesn’t even exist yet. Just as you would review a colleague’s code, you should always assess, analyze, and validate AI-generated code.

    A practice example 👩🏽‍💻

    Try your hand at prompting GitHub Copilot to build a browser extension.

    To get started, you’ll need to have GitHub Copilot installed and open in your IDE. We also have access to an early preview of GitHub Copilot chat, which is what we’ve been using when we have questions about our code. If you don’t have GitHub Copilot chat, sign up for the waitlist. Until then you can pair GitHub Copilot with ChatGPT.

    More generative AI prompt crafting guides

    • A beginner’s guide to prompt engineering with GitHub Copilot
    • Prompt engineering for AI
    • How GitHub Copilot is getting better at understanding your code

    GitHub celebrates developers with disabilities on Global Accessibility Awareness Day

    Post Syndicated from Ed Summers original https://github.blog/2023-05-18-github-celebrates-developers-with-disabilities-on-global-accessibility-awareness-day/

    At GitHub, our favorite people are developers. We love to make them happy and productive, and today, on Global Accessibility Awareness Day, we want to celebrate their achievements by sharing some great stories about a few developers with disabilities alongside news of recent accessibility improvements at GitHub that help them do the best work of their lives.

    Amplifying the voices of disabled developers

    People with disabilities frequently encounter biases that prevent their full and equal participation in all areas of life, including education and employment. That’s why GitHub and The ReadME Project are thrilled to provide a platform for disabled developers to showcase their contributions and counteract bias.

    Paul Chiou, a developer who’s paralyzed from the neck down, is breaking new ground in the field of accessibility automation, while pursuing his Ph.D. Paul uses a computer with custom hardware and software he designed and built, and this lived experience gives him a unique insight into the needs of other people with disabilities. The barriers he encounters push him to innovate, both in his daily life and in his academic endeavors. Learn more about Paul and his creative solutions in this featured article and video profile.

    Becky Tyler found her way to coding via gaming, but she games completely with her eyes, just like everything else she does on a computer, from painting to livestreaming to writing code. Her desire to play Minecraft led her down the path of open source software and collaboration, and now she’s studying computer science at the University of Dundee. Learn more about Becky in this featured article and video profile.

    Dr. Annalu Waller leads the Augmentative and Alternative Communication Research Group at the University of Dundee. She’s also Becky’s professor. Becky calls her a “taskmaster,” but the profile of Annalu’s life shows how her lived experience informed her high expectations for her students—especially those with disabilities—and gave her a unique ability to absorb innovations and use them to benefit people with disabilities.

    Anton Mirhorodchenko has difficulty speaking and typing with his hands, and speaks English as a second language. Anton has explored ways to use ChatGPT and GitHub Copilot to not only help him communicate and express his ideas, but also develop software from initial architecture all the way to code creation. Through creative collaboration with his AI teammates, Anton has become a force to be reckoned with, and he recently shared his insights in this guide on how to harness the power of generative AI for software development.

    Removing barriers that block disabled developers

    Success requires skills. That’s why equal access to education is a fundamental human right. The GitHub Global Campus team agrees. They are working to systematically find and remove barriers that might block future developers with disabilities.

    npm is the default package manager for JavaScript and the largest software registry in the world. To empower every developer to contribute to and benefit from this amazing resource, the npm team recently completed an accessibility bug bash and removed hundreds of potential barriers. Way to go, npm team!

    The GitHub.com team has also been hard at work on accessibility and they recently shipped several improvements:

    Great accessibility starts with design, requiring an in-depth understanding of the needs of users with disabilities and their assistive technologies. The GitHub Design organization has been leaning into accessibility for years, and this blog post explores how it has built a culture of accessibility and shifted accessibility left in the GitHub development process.

    When I think about the future of technology, I think about GitHub Copilot—an AI pair programmer that boosts developers’ productivity and breaks down barriers to software development. The GitHub Copilot team recently shipped accessibility improvements for keyboard-only and screen reader users.

    GitHub Next, the team behind GitHub Copilot, also recently introduced GitHub Copilot Voice, an experiment currently in technical preview. GitHub Copilot Voice empowers developers to code completely hands-free using only their voice. That’s a huge win for developers who have difficulty typing with their hands. Sign up for the technical preview if you can benefit from this innovation.

    Giving back to our community

    As we work to empower all developers to build on GitHub, we regularly contribute back to the broader accessibility community that has been so generous to us. For example, all accessibility improvements in Primer are available for direct use by the community.

    Our accessibility team includes multiple Hubbers with disabilities—including myself. GitHub continually improves the accessibility and inclusivity of the processes we use to communicate and collaborate. One recent example is the process we use for retrospectives. At the end of our most recent retrospective, I observed that, as a person with blindness, it was the most accessible and inclusive retrospective I have ever attended. That observation prompted the team to share the process we use for inclusive retrospectives so other teams can benefit from our learnings.

    More broadly, Hubbers regularly give back to the causes we care about. During a recent social giving event, I invited Hubbers to support the Seeing Eye because that organization has made such a profound impact in my life as a person with blindness. Our goal was to raise $5,000 so we could name and support a Seeing Eye puppy that will eventually provide independence and self-confidence to a person with blindness. I was overwhelmed by the generosity of my coworkers when they donated more than $15,000! So, we now get to name three puppies and I’m delighted to introduce you to the first one. Meet Octo!

    A German Shepherd named Octo sits in green grass wearing a green scarf that says “The Seeing Eye Puppy Raising Program.” She is sitting tall in a backyard with a black fence and a red shed behind her.
    Photo courtesy of The Seeing Eye

    Looking ahead

    GitHub CEO, Thomas Dohmke, frequently says, “GitHub thrives on developer happiness.” I would add that the GitHub accessibility program thrives on the happiness of developers with disabilities. Our success is measured by their contributions. Our job is to remove barriers from their path and celebrate their accomplishments. We’re delighted with our progress thus far, but we are just getting warmed up. Stay tuned for more great things to come! In the meantime, learn more about the GitHub accessibility program at accessibility.github.com.

    Inside GitHub: Working with the LLMs behind GitHub Copilot

    Post Syndicated from Sara Verdi original https://github.blog/2023-05-17-inside-github-working-with-the-llms-behind-github-copilot/

    The first time that engineers at GitHub worked with one of OpenAI’s large language models (LLM), they were equal parts excited and astonished. Alireza Goudarzi, a senior researcher of machine learning at GitHub recounts, “As a theoretical AI researcher, my job has been to take apart deep learning models to make sense of them and how they learn, but this was the first time that a model truly astonished me.” Though the emergent behavior of the model was somewhat surprising, it was obviously powerful. Powerful enough, in fact, to lead to the creation of GitHub Copilot.

    Due to the growing interest in LLMs and generative AI models, we decided to speak to the researchers and engineers at GitHub who helped build the early versions of GitHub Copilot and talk through what it was like to work with different LLMs from OpenAI, and how model improvements have helped evolve GitHub Copilot to where it is today—and beyond.

    A brief history of GitHub Copilot

    In June 2020, OpenAI released GPT-3, an LLM that sparked intrigue in developer communities and beyond. Over at GitHub, this got the wheels turning for a project our engineers had only talked about before: code generation.

    “Every six months or so, someone would ask in our meetings, ‘Should we think about general purpose code generation,’ but the answer was always ‘No, it’s too difficult, the current models just can’t do it,’” says Albert Ziegler, a principal machine learning engineer and member of the GitHub Next research and development team.

    But GPT-3 changed all that—suddenly the model was good enough to begin considering how a code generation tool might work.

    “OpenAI gave us the API to play around with,” Ziegler says. “We assessed it by giving it coding-like tasks and evaluated it in two different forms.”

    For the first form of evaluation, the GitHub Next team crowdsourced self-contained problems to help test the model. “The reason we don’t do this anymore is because the models just got too good,” Ziegler laughs.

    In the beginning, the model could solve about half of the problems it was posed with, but soon enough, it was solving upwards of 90 percent of the problems.

    This original testing method sparked the first ideas for how to harness the power of this model, and they began to conceptualize an AI-powered chatbot for developers to ask coding questions and receive immediate, runnable code snippets. “We built a prototype, but it turned out there was a better modality for this technology available,” Ziegler says. “We thought, ‘Let’s try to put this in the IDE.’”

    “The moment we did that and saw how well it worked, the whole static question-and-answer modality was forgotten,” he says. “This new approach was interactive and it was useful in almost every situation.”

    And with that, the development of GitHub Copilot began.

    Exploring model improvements

    To keep this project moving forward, GitHub returned to OpenAI to make sure that they could stay on track with the latest models. “The first model that OpenAI gave us was a Python-only model,” Ziegler remembers. “Next we were delivered a JavaScript model and a multilingual model, and it turned out that the JavaScript model had particular problems that the multilingual model did not. It actually came as a surprise to us that the multilingual model could perform so well. But each time, the models were just getting better and better, which was really exciting for GitHub Copilot’s progress.”

    In 2021, OpenAI released the multilingual Codex model, which was built in partnership with GitHub. This model was an offshoot of GPT-3, so its original capability was generating natural language in response to text prompts. But what set the Codex model apart was that it was trained on billions of lines of public code—so that, in addition to natural language outputs, it also produced code suggestions.

    This model was open for use via an API that businesses could build on, and while this breakthrough was huge for GitHub Copilot, the team needed to work on internal model improvements to ensure that it was as accurate as possible for end users.

    As the GitHub Copilot product was prepared for launch as a technical preview, the team split off into further functional teams, and the Model Improvements team became responsible for monitoring and improving GitHub Copilot’s quality through communicating with the underlying LLM. This team also set out to work on improving completion for users. Completion refers to when users accept and keep GitHub Copilot suggestions in their code, and there are several different levers that the Model Improvements team works on to increase completion, including prompt crafting and fine tuning.

    An example of completion in action with GitHub Copilot.

    Prompt crafting

    When working with LLMs, you have to be very specific and intentional with your inputs to receive your desired output, and prompt crafting explores the art behind communicating these requests to get the optimal completion from the model.

    “In very simple terms, the LLM is, at its core, just a document completion model. For training it was given partial documents and it learned how to complete them one token at a time. Therefore, the art of prompt crafting is really all about creating a ‘pseudo-document’ that will lead the model to a completion that benefits the customer,” John Berryman, a senior researcher of machine learning on the Model Improvements team, explains. Because LLMs are trained on partial-document completion, if the partial document is code, this completion capability lends itself well to code completion, which is, in its base form, exactly what GitHub Copilot does.
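    As a purely illustrative sketch of the “pseudo-document” idea (the file name and code below are invented examples, not anything GitHub Copilot actually sends), the prompt is just a partial document that the model is asked to continue:

    // A partial document assembled from a file path, a comment, and the code so far
    const pseudoDocument = [
      '// File: utils/slugify.js',
      '// Convert a post title into a URL-friendly slug',
      'function slugify(title) {',
      '',
    ].join('\n');

    // The model's only job is to keep completing this document, token by token.
    // A plausible continuation:
    //   return title.toLowerCase().trim().replace(/[^a-z0-9]+/g, '-');
    // }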

    To better understand how the model could be applied to code completion, the team would provide the model with a file and evaluate the code completions it returned.

    “Sometimes the results are ok, sometimes they are quite good, and sometimes the results seem almost magical,” Berryman says. “The secret is that we don’t just have to provide the model with the original file that the GitHub Copilot user is currently editing; instead we look for additional pieces of context inside the IDE that can hint the model towards better completions.”

    He continues, “There have been several changes that helped get GitHub Copilot where it is today, but one of my favorite tricks was when we pulled similar texts in from the user’s neighboring editor tabs. That was a huge lift in our acceptance rate and characters retained.”

    Generative AI and LLMs are incredibly fascinating, but Berryman still seems to be most excited about the benefit that the users are seeing from the research and engineering efforts.

    “The idea here is to make sure that we make developers more productive, but the way we do that is where things start to get interesting: we can make the user more productive by incorporating the way they think about code into the algorithm itself,” Berryman says. “Where the developer might flip back and forth between tabs to reference code, we just can do that for them, and the completion is exactly what it would be if the user had taken all of the time to look that information up.”

    Fine-tuning

    Fine-tuning is a technique used in AI to adapt and improve a pre-trained model for a specific task or domain. The process involves taking a pre-trained model that has been trained on a large dataset and training it on a smaller, more specific dataset that is relevant to a particular use case. This enables the model to learn and adapt to the nuances of the new data, thus improving its performance on the specific task.

    These larger, more sophisticated LLMs can sometimes produce outputs that aren’t necessarily helpful because it’s hard to statistically define what constitutes a “good” response. It’s also incredibly difficult to train a model like Codex that contains upwards of 170 billion parameters.

    “Basically, we’re training the underlying Codex model on a user’s specific codebase to provide more focused, customized completions,” Goudarzi adds.

    “Our greatest challenge right now is to consider why the user rejects or accepts a suggestion,” Goudarzi adds. “We have to consider what context, or information, that we served to the model caused the model to output something that was either helpful or not helpful. There’s no way for us to really troubleshoot in the typical engineering way, but what we can do is figure out how to ask the right questions to get the output we desire.”

    Read more about how GitHub Copilot is getting better at understanding your code to provide a more customized coding experience here.

    GitHub Copilot—then and now

    As the models from OpenAI got stronger—and as we identified more areas to build on top of those LLMs in house—GitHub Copilot has improved and gained new capabilities with chat functionality, voice-assisted development, and more via GitHub Copilot X on the horizon.

    Johan Rosenkilde, a staff researcher on the GitHub Next team remembers, “When we received the latest model drops from OpenAI in the past, the improvements were good, but they couldn’t really be felt by the end user. When the third iteration of Codex dropped, you could feel it, especially when you were working with programming languages that are not one of the top five languages,” Rosenkilde says.

    He continues, “I happened to be working on a programming competition with some friends on the weekend that model version was released, and we were programming with F#. In the first 24 hours, we evidently had the old model for GitHub Copilot, but then BOOM! Magic happened,” he laughs. “There was an incredibly noticeable difference.”

    In the beginning, GitHub Copilot also had the tendency to suggest lines of code in a completely different programming language, which created a poor developer experience (for somewhat obvious reasons).

    “You could be working in a C# project, then all of the sudden at the top of a new file, it would suggest Python code,” Rosenkilde explains. So, the team added a headline to the prompt which listed the language you were working in. “Now this had no impact when you were deep down in the file because Copilot could understand which language you were in. But at the top of the file, there could be some ambiguity, and those early models just defaulted to the top popular languages.”

    About a month following that improvement, the team discovered that it was much more powerful to put the path of the file at the top of the document.

    A diagram of the file path improvement
    A diagram of the file path improvement.

    “The end of the file name would give away the language in most cases, and in fact the file name could provide crucial, additional information,” Rosenkilde says. “For example, the file might be named ‘connectiondatabase.py.’ Well that file is most likely about databases or connections, so you might want to import an SQL library, and that file was written in Python. So, that not only solved the language problem, but it also improved the quality and user experience by a surprising margin because GitHub Copilot could now suggest boilerplate code.”
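    As a rough sketch of that idea (the header format below is our own illustration, not GitHub Copilot’s internal prompt syntax), putting the file path at the top of the pseudo-document looks something like this:

    // Prepending the file path hints at both the language and the file's purpose
    const filePath = 'connectiondatabase.py';
    const codeBeforeCursor = 'import sqlite3\n\ndef connect(path):\n    ';

    const prompt = `# Path: ${filePath}\n${codeBeforeCursor}`;
    // The model can now infer this is Python and likely database-related,
    // so boilerplate like opening a SQLite connection becomes an easy suggestion.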

    After a few more months of work, and several iterations, the team was able to create a component that lifted code from other files, which is a capability that had been talked about since the genesis of GitHub Copilot. Rosenkilde recalls, “this never really amounted to anything more than conversations or a draft pull request because it was so abstract. But then, Albert Ziegler built this component that looked at other files you have open in the IDE at that moment in time and scanned through those files for similar text to what’s in your current cursor. This was a huge boost in code acceptance because suddenly, GitHub Copilot knew about other files.”

    What’s next for GitHub Copilot

    After working with generative AI models and LLMs over the past three years, we’ve seen their transformative value up close. As the industry continues to find new uses for generative AI, we’re working to continue building new developer experiences. And in March 2023, GitHub announced the future of Copilot, GitHub Copilot X, our vision for an AI-powered developer experience. GitHub Copilot X aims to bring AI beyond the IDE to more components of the overall platform, such as docs and pull requests. LLMs are changing the ways that we interact with technology and how we work, and ideas like GitHub Copilot X are just an example of what these models, along with some dedicated training techniques, are capable of.

    How GitHub Copilot is getting better at understanding your code

    Post Syndicated from Johan Rosenkilde original https://github.blog/2023-05-17-how-github-copilot-is-getting-better-at-understanding-your-code/

    To make working with GitHub Copilot feel like a meeting of the minds between developers and the pair programmer, GitHub’s machine learning experts have been busy researching, developing, and testing new capabilities—and many are focused on improving the AI pair programmer’s contextual understanding. That’s because good communication is key to pair programming, and inferring context is critical to making good communication happen.

    To pull back the curtain, we asked GitHub’s researchers and engineers about the work they’re doing to help GitHub Copilot improve its contextual understanding. Here’s what we discovered.

    From OpenAI’s Codex model to GitHub Copilot

    When OpenAI released GPT-3 in June 2020, GitHub knew developers would benefit from a product that leveraged the model specifically for coding. So, we gave input to OpenAI as it built Codex, a descendant of GPT-3 and the LLM that would power GitHub Copilot. The pair programmer launched as a technical preview in June 2021 and became generally available in June 2022 as the world’s first at-scale generative AI coding tool.

    To ensure that the model has the best information to make the best predictions with speed, GitHub’s machine learning (ML) researchers have done a lot of work called prompt engineering (which we’ll explain in more detail below) so that the model provides contextually relevant responses with low latency.

    Though GitHub’s always experimenting with new models as they come out, Codex was the first really powerful generative AI model that was available, said David Slater, an ML engineer at GitHub. “The hands-on experience we gained from iterating on model and prompt improvements was invaluable.”

    All that experimentation resulted in a pair programmer that, ultimately, frees up a developer’s time to focus on more fulfilling work. The tool is often a huge help even for starting new projects or files from scratch because it scaffolds a starting point that developers can adapt and tweak as desired, said Alice Li, an ML researcher at GitHub.

    I still find myself impressed and even surprised by what GitHub Copilot can do, even after having worked on it for some time now.

    – Alice Li, ML researcher at GitHub

    Why context matters

    Developers use details from pull requests, a folder in a project, open issues, and more to contextualize their code. When it comes to a generative AI coding tool, we need to teach that tool what information to use to do the same.

    Transformer LLMs are good at connecting the dots and big-picture thinking. Generative AI coding tools are made possible by large language models (LLMs). These models are sets of algorithms trained on large amounts of code and human language. Today’s state-of-the-art LLMs are transformers, which makes them adept at making connections between text in a user’s input and the output that the model has already generated. This is why today’s generative AI tools are providing responses that are more contextually relevant than previous AI models.

    But they need to be told what information is relevant to your code. Right now, transformers that are fast enough to power GitHub Copilot can process about 6,000 characters at a time. While that’s been enough to advance and accelerate tasks like code completion and code change summarization, the limited amount of characters means that not all of a developer’s code can be used as context.

    So, our challenge is to figure out not only what data to feed the model, but also how to best order and enter it to get the best suggestions for the developer.

    Learn more about LLMs, generative AI coding tools, and how they’re changing the way developers work.

    How GitHub Copilot understands your code

    It all comes down to prompts, which are compilations of IDE code and relevant context that’s fed to the model. Prompts are generated by algorithms in the background, at any point in your coding. That’s why GitHub Copilot will generate coding suggestions whether you’re currently writing or just finished a comment, or in the middle of some gnarly code.

    • Here’s how a prompt is created: a series of algorithms first select relevant code snippets or comments from your current file and other sources (which we’ll dive into below). These snippets and comments are then prioritized, filtered, and assembled into the final prompt.
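    Here’s a simplified sketch of that select, prioritize, filter, and assemble flow. The function and field names are illustrative, not GitHub Copilot’s internal API:

    function buildPrompt(currentFile, openTabs, cursorOffset, maxChars = 6000) {
      // 1. Select candidate snippets from the current file and other sources
      const candidates = [
        { text: currentFile.slice(0, cursorOffset), priority: 1 }, // code before the cursor
        ...openTabs.map((tab) => ({ text: tab, priority: 2 })),    // snippets from neighboring tabs
      ];

      // 2. Prioritize, then filter until the prompt fits the model's context window
      candidates.sort((a, b) => a.priority - b.priority);
      const selected = [];
      let used = 0;
      for (const candidate of candidates) {
        if (used + candidate.text.length > maxChars) continue;
        selected.push(candidate.text);
        used += candidate.text.length;
      }

      // 3. Assemble the final prompt
      return selected.join('\n');
    }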

    GitHub Copilot’s contextual understanding has continuously matured over time. The first version was only able to consider the file you were working on in your IDE to be contextually relevant. But we knew context went beyond that. Now, just a year later, we’re experimenting with algorithms that will consider your entire codebase to generate customized suggestions.

    Let’s look at how we got here:

    • Prompt engineering is the delicate art of creating a prompt so that the model makes the most useful prediction for the user. The prompt tells LLMs, including GitHub Copilot, what data, and in what order, to process in order to contextualize your code. Most of this work takes place in what’s called a prompt library, which is where our in-house ML experts work with algorithms to extract and prioritize a variety of sources of information about the developer’s context, creating the prompt that’ll be processed by the GitHub Copilot model.

    • Neighboring tabs is what we call the technique that allows GitHub Copilot to process all of the files open in a developer’s IDE instead of just the single one the developer is working on. By opening all files relevant to their project, developers automatically invoke GitHub Copilot to comb through all of the data and find matching pieces of code between their open files and the code around their cursor—and add those matches to the prompt.

    When developing neighboring tabs, the GitHub Next team and in-house ML researchers did A/B tests to figure out the best parameters for identifying matches between code in your IDE and code in your open tabs. They found that setting a very low bar for when to include a match actually made for the best coding suggestions.

    By including every little bit of context, neighboring tabs helped to increase user acceptance of GitHub Copilot’s suggestions by a relative 5%.

    Even if there was no perfect match—or even a very good one—picking the best match we found and including that as context for the model was better than including nothing at all.

    – Albert Ziegler, principal ML engineer at GitHub
    • The Fill-In-the-Middle (FIM) paradigm widened the context aperture even more. Prior to FIM, only the code before your cursor would be put into the prompt—ignoring the code after your cursor. (At GitHub, we refer to code before the cursor as the prefix and after the cursor as the suffix.) With FIM, we can tell the model which part of the prompt is the prefix, and which part is the suffix.

    Even if you’re creating something from scratch and have a skeleton of a file, we know that coding isn’t linear or sequential. So, while you bounce around your file, FIM helps GitHub Copilot offer better coding suggestions for the part in your file where your cursor is located, or the code that’s supposed to come between the prefix and suffix.

    Based on A/B testing, FIM gave a 10% relative boost in performance, meaning developers accepted 10% more of the completions that were shown to them. And thanks to optimal use of caching, neighboring tabs and FIM work in the background without any added latency.
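    Conceptually, a fill-in-the-middle prompt might be arranged like this. The markers below are our own placeholders; the real tokens the model uses are internal details:

    // Code before the cursor (prefix) and after the cursor (suffix)
    const prefix = 'function averageGrade(grades) {\n  ';
    const suffix = '\n  return total / grades.length;\n}';

    // The model is told which part is the prefix and which is the suffix,
    // and is asked to generate the code that belongs in between.
    const fimPrompt = `<PREFIX>${prefix}<SUFFIX>${suffix}<MIDDLE>`;
    // A plausible middle: "let total = 0;\n  for (const g of grades) total += g;"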

    System diagram focused on model quality efforts: inputs from open tabs, editor data, and a vector database feed a prompt library, which assembles the prompt (we are continuously working to provide better context from available sources). The prompt then passes through a contextual filter model and a GPT model (we are continuously working on new and improved model engines optimized for GitHub Copilot), which fills in the middle between the prompt’s prefix and suffix. From the models, n completions are generated, and at most n completions are shown.

    Improving semantic understanding

    Today, we’re experimenting with vector databases that could create a customized coding experience for developers working in private repositories or with proprietary code. Generative AI coding tools use something called embeddings to retrieve information from a vector database.

    • What’s a vector database? It’s a database that indexes high-dimensional vectors.

    • What are high-dimensional vectors? They’re mathematical representations of objects, and because these vectors can model objects in a number of dimensions, they can capture complexities of that object. When used properly to represent pieces of code, they may represent both the semantics and even intention of the code—not just the syntax.

    • What’s an embedding? In the context of coding and LLMs, an embedding is the representation of a piece of code as a high-dimensional vector. Because of the “knowledge” the LLM has of both programming and natural language, it’s able to capture both the syntax and semantics of the code in the vector.

    Here’s how they’d all work together:

    • Algorithms would create embeddings for all snippets in the repository (potentially billions of them), and keep them stored in the vector database.
    • Then, as you’re coding, algorithms would embed the snippets in your IDE.
    • Algorithms would then make approximate matches—also, in real time—between the embeddings that are created for your IDE snippets and the embeddings already stored in the vector database. The vector database is what allows algorithms to quickly search for approximate matches (not just exact ones) on the vectors it stores, even if it’s storing billions of embedded code snippets.

    Developers are familiar with retrieving data with hashcodes, which typically look for exact character by character matches, explained Alireza Goudarzi, senior ML researcher at GitHub. “But embeddings—because they arise from LLMs that were trained on a vast amount of data—develop a sense of semantic closeness between code snippets and natural language prompts.”
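    As a minimal sketch of what approximate matching over embeddings means (plain cosine similarity over arrays here; a real vector database uses approximate nearest-neighbor indexes to do this at scale over billions of vectors):

    // Cosine similarity between two embedding vectors
    function cosineSimilarity(a, b) {
      let dot = 0;
      let normA = 0;
      let normB = 0;
      for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
      }
      return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Find the stored snippet whose embedding is semantically closest to the query
    function closestSnippet(queryEmbedding, storedSnippets) {
      // storedSnippets: [{ snippet: '...', embedding: [...] }]
      return storedSnippets.reduce((best, candidate) =>
        cosineSimilarity(queryEmbedding, candidate.embedding) >
        cosineSimilarity(queryEmbedding, best.embedding)
          ? candidate
          : best
      );
    }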

    Read the three sentences below and identify which two are the most semantically similar.

    • Sentence A: The king moved and captured the pawn.
    • Sentence B: The king was crowned in Westminster Abbey.
    • Sentence C: Both white rooks were still in the game.

    The answer is sentences A and C because both are about chess. While sentences A and B are syntactically, or structurally similar because both have a king as the subject, they’re semantically different because “king” is used in different contexts.

    Here’s how each of those statements could translate to Python. Note the syntactic similarity between snippets A and B despite their semantic difference, and the semantic similarity between snippets A and C despite their syntactic difference.

    Snippet A:

    if king.location() == pawn.location():
        board.captures_piece(king, pawn)
    

    Snippet B:

    if king.location() == "Westminster Abbey":
        king.crown()
    

    Snippet C:

    if len([ r for r in board.pieces("white") if r.type == "rook" ]) == 2:
        return True
    

    As mentioned above, we’re still experimenting with retrieval algorithms. We’re designing the feature with enterprise customers in mind, specifically those who are looking for a customized coding experience with private repositories and would explicitly opt in to use the feature.

    Take this with you

    Last year, we conducted quantitative research on GitHub Copilot and found that developers code up to 55% faster while using the pair programmer. This means developers feel more productive, complete repetitive tasks more quickly, and can focus more on satisfying work. But our work won’t stop there.

    The GitHub product and R&D teams, including GitHub Next, have been collaborating with Microsoft Azure AI-Platform to continue bringing improvements to GitHub Copilot’s contextual understanding. So much of the work that helps GitHub Copilot contextualize your code happens behind the scenes. While you write and edit your code, GitHub Copilot is responding to your writing and edits in real time by generating prompts–or, in other words, prioritizing and sending relevant information to the model based on your actions in your IDE—to keep giving you the best coding suggestions.



    How I used GitHub Copilot to build a browser extension

    Post Syndicated from Rizel Scarlett original https://github.blog/2023-05-12-how-i-used-github-copilot-to-build-a-browser-extension/

    For the first time ever, I built a browser extension and did it with the help of GitHub Copilot. Here’s how.

    I’ve built a rock, paper, scissors game with GitHub Copilot but never a browser extension. As a developer advocate at GitHub, I decided to put GitHub Copilot to the test, including its upcoming chat feature, and see if it could help me write an extension for Google Chrome to clear my cache.

    I’m going to be honest: it wasn’t as straightforward as I expected it to be. I had a lot of questions throughout the build and had to learn new information.

    But at the end of the day, I gained experience with learning an entirely new skill with a generative AI coding tool and pair programming with GitHub Copilot—and with other developers on Twitch 👥.

    I wanted to create steps that anyone—even those without developer experience—could easily replicate when building this extension, or any other extension. But I also wanted to share my new takeaways after a night of pair programming with GitHub Copilot and human developers.

    So, below you’ll find two sections: the steps I took to build the extension, and my takeaways from pair programming with GitHub Copilot and other developers.

    Let’s jump in.

    How to build a Chrome extension with GitHub Copilot

    To get started, you’ll need to have GitHub Copilot installed and open in your IDE. I also have access to an early preview of GitHub Copilot chat, which is what I used when I had a question. If you don’t have GitHub Copilot chat, sign up for the waitlist, and pair GitHub Copilot with ChatGPT for now.

    1. 🧑🏾‍💻 Using the chat window, I asked GitHub Copilot, “How do I create a Chrome extension? What should the file structure look like?”

    💻 GitHub Copilot gave me general steps for creating an extension—from designing the folder structure to running the project locally in Chrome.

    Screenshot of the chat window where the user asked GitHub Copilot "How do I build a browser extension? What should the file structure look like?" GitHub Copilot provided some instructions in response.

    Then, it shared an example of a Chrome extension file structure.

    Copilot response showing an example structure for a simple Chrome extension.

    To save you some time, here’s a chart that briefly defines the purpose of these files:

    manifest.json 🧾 Metadata about your extension, like the name and version, and permissions. Manifest as a proper noun is the name of the Google Chrome API. The latest is V3.
    popup.js 🖼 When users click on your extension icon in their Chrome toolbar, a pop-up window will appear. This file is what determines the behavior of that pop-up and contains code for handling user interactions with the pop-up window.
    popup.html and style.css 🎨 These files make up the visual of your pop-up window. popup.html is the interface, including layout, structure, and content. style.css determines the way the HTML file should be displayed in the browser, including font, text color, background, etc.

    2. Create the manifest.json 🧾

    🧑🏾‍💻 Inside a folder in my IDE, I created a file called manifest.json. In manifest.json, I described my desired file:

    Manifest for Chrome extension that clears browser cache.
    manifest_version: 3
    Permissions for the extension are: storage, tabs, browsingData
    

    I pressed enter and invoked suggestions from GitHub Copilot by typing a curly brace.

    💻 Inside the curly brace, GitHub Copilot suggested the manifest. I deleted the lines describing my desired manifest.json, and the final file looked like this:

    {
       "name": "Clear Cache",
       "version": "1.0",
       "manifest_version": 3,
       "description": "Clears browser cache",
       "permissions": [
           "storage",
           "tabs",
           "browsingData"
       ],
       "action": {
           "default_popup": "popup.html"
       },
       "background": {
           "service_worker": "background.js"
       }
    }
    

    3. Create a service worker, which is a file called background.js 🔧

    This wasn’t a file that was recommended from my chat with GitHub Copilot. I learned that it was a necessary file from a developer who tuned into my livestream 👥. The background.js is what gives your extension the ability to run in the background, perform tasks, and respond to user events outside of the extension’s pop-up window (like network requests and data storage).

    🧑🏾‍💻 In my background.js file, I wrote a comment describing my desired service worker:

    Service Worker for Google Chrome Extension 
    Handles when extension is installed
    Handles when message is received
    

    Then, I wrote a comment for the first function, which prompted a suggestion from GitHub Copilot, and then wrote another comment to describe the second function.

    💻 The final file looked like this:

    /*
    Service Worker for Google Chrome Extension
    Handles when extension is installed
    Handles when message is received
    */
    
    // console.log when extension is installed
    chrome.runtime.onInstalled.addListener(function() {
       console.log("Extension installed");
    });
    
    // send response when message is received and console.log when message is received
    chrome.runtime.onMessage.addListener(function(request, sender, sendResponse) {
       console.log("Message received");
       sendResponse("Message received");
    });
    
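    As a side note, nothing in this tutorial's code sends a message yet, so the onMessage listener above won't be triggered in the final version. If you wanted to exercise it, a hypothetical snippet like the one below (my own addition, not from the stream) could be dropped into popup.js:

    // hypothetical: send a message to the service worker and log its response
    chrome.runtime.sendMessage("ping", function(response) {
       console.log(response); // logs "Message received"
    });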

    4. Create the popup.html file 🖼

    🧑🏾‍💻 In the popup.html file, I wrote a comment describing how I wanted my pop-up window to appear. This window is the one users will see when they click on the extension icon.

    <!--
       HTML for Chrome extension that clears browser cache.
       Connect to javascript file called popup.js and CSS file called style.css
       Will render the following buttons with id's:
       - "All History"
       - "Past Month"
       - "Past Week"
       - "Past Day"
       - "Past Hour"
       - "Past Minute"
    
    
       Will render an empty paragraph with id "lastCleared"
    -->
    

    💻 GitHub Copilot suggested the following:

    <!DOCTYPE html>
    <html>
       <head>
           <meta charset="utf-8">
           <title>Clear Cache</title>
           <link rel="stylesheet" href="style.css">
       </head>
       <body>
           <h1>Clear Cache</h1>
           <button id="allHistory">All History</button>
           <button id="pastMonth">Past Month</button>
           <button id="pastWeek">Past Week</button>
           <button id="pastDay">Past Day</button>
           <button id="pastHour">Past Hour</button>
           <button id="pastMinute">Past Minute</button>
           <p id="lastCleared"></p>
           <script src="popup.js"></script>
       </body>
    </html>
    

    5. Test the browser extension

    I decided to test the modifications before incorporating additional styles or interactivity. I prefer making incremental changes and conducting manual tests in between because it simplifies the debugging process.

    🧑🏾‍💻 Navigate to chrome://extensions/ in your Chrome browser. Turn on developer mode. Press “Load unpacked.” Upload the folder that holds your Chrome extension. Once loaded, you should be able to test your extension. Here’s what mine looked like:

    Screenshot of the Chrome extension offering options to "Clear Cache."

    6. Create a popup.js file to add interactivity

    🧑🏾‍💻 Inside this file, write a comment that will act as pseudocode for you but as a prompt for GitHub Copilot.

    Pseudocode isn’t code, but a method of conceptualizing and communicating ideas to plan and design programs before writing the actual code. Programmers can concentrate on algorithmic logic without worrying about syntactic details of a particular programming language—and communicate ideas to other developers regardless of coding experience.

    Here’s the pseudocode I used:

    /*
    This program is a Chrome Extension that clears browser cache.
    Handle on button click:
    - button with id "allHistory" that clears all cache history
    - button with id "pastMonth" that clears cache history from the past month
    - button with id "pastWeek" that clears cache history from the past week
    - button with id "pastDay" that clears cache history from the past day
    - button with id "pastHour" that clears cache history from the past hour
    - button with id "pastMinute" that clears cache history from the past minute
    
    
    Create function that
    - converts dates and times into human-readable format
    - adds "Successfully cleared cache" with date and time in a paragraph with id "lastCleared"
    */
    

    Then, write the series of comments below, but write them one at a time to allow GitHub Copilot to generate a code suggestion after each one. The final popup.js looked like this:

    /*
    This program is a Chrome Extension that clears browser cache.
    Handle on button click:
    - button with id "allHistory" that clears all cache history
    - button with id "pastMonth" that clears cache history from the past month
    - button with id "pastWeek" that clears cache history from the past week
    - button with id "pastDay" that clears cache history from the past day
    - button with id "pastHour" that clears cache history from the past hour
    - button with id "pastMinute" that clears cache history from the past minute
    
    Create function that
    - converts dates and times into human-readable format
    - adds "Successfully cleared cache" with date and time in a paragraph with id "lastCleared"
    */
    
    // convert date and time into human-readable format
    function convertDate(date) {
       // use a new variable name so we don't redeclare the "date" parameter
       let parsedDate = new Date(date);
       let options = { weekday: 'long', year: 'numeric', month: 'long', day: 'numeric' };
       return parsedDate.toLocaleDateString("en-US", options);
    }
    
    // add successfully cleared cache into paragraph with id "lastCleared"
    function addCleared() {
       var p = document.getElementById("lastCleared");
       let date = new Date();
       p.innerHTML = "Successfully cleared cache " + convertDate(date);
    }
    
    // clear all cache history
    document.getElementById("allHistory").addEventListener("click", function() {
       chrome.browsingData.removeCache({ "since": 0 }, function() {
           addCleared();
       });
    });
    
    // clear cache history from the past month
    document.getElementById("pastMonth").addEventListener("click", function() {
       let date = new Date();
       date.setMonth(date.getMonth() - 1);
       chrome.browsingData.removeCache({ "since": date.getTime() }, function() {
           addCleared();
       });
    });
    
    // clear cache history from the past week
    document.getElementById("pastWeek").addEventListener("click", function() {
       let date = new Date();
       date.setDate(date.getDate() - 7);
       chrome.browsingData.removeCache({ "since": date.getTime() }, function() {
           addCleared();
       });
    });
    
    // clear cache history from the past day
    document.getElementById("pastDay").addEventListener("click", function() {
       let date = new Date();
       date.setDate(date.getDate() - 1);
       chrome.browsingData.removeCache({ "since": date.getTime() }, function() {
           addCleared();
       });
    });
    
    // clear cache history from the past hour
    document.getElementById("pastHour").addEventListener("click", function() {
      let date = new Date();
       date.setHours(date.getHours() - 1);
       chrome.browsingData.removeCache({ "since": date.getTime() }, function() {
           addCleared();
       });
    });
    
    // clear cache history from the past minute
    document.getElementById("pastMinute").addEventListener("click", function() {
      let date = new Date();
       date.setMinutes(date.getMinutes() - 1);
       chrome.browsingData.removeCache({ "since": date.getTime() }, function() {
           addCleared();
       });
    });
    

    🧑🏾‍💻 GitHub Copilot actually generated the var keyword, which is outdated. So I changed that keyword to let.
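    If you're curious why let is preferred, the short answer is scoping: var is function-scoped, while let is block-scoped. Here's a quick illustration (my own example, not something generated during the stream):

    // var leaks out of the block it was declared in
    for (var i = 0; i < 3; i++) {}
    console.log(i); // 3

    // let stays inside the block
    for (let j = 0; j < 3; j++) {}
    // console.log(j); // would throw: ReferenceError: j is not defined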

    7. Create the last file in your folder: style.css

    🧑🏾‍💻 Write a comment that describes the style you want for your extension. Then, type “body” and continue tabbing until GitHub Copilot suggests all the styles.

    My final style.css looked like this:

    /* Style the Chrome extension's popup to be wider and taller
    Use accessible friendly colors and fonts
    Make h1 elements legible
    Highlight when buttons are hovered over
    Highlight when buttons are clicked
    Align buttons in a column and center them but space them out evenly
    Make paragraph bold and legible
    */
    
    body {
       background-color: #f1f1f1;
       font-family: Arial, Helvetica, sans-serif;
       font-size: 16px;
       color: #333;
       width: 400px;
       height: 400px;
    }
    
    h1 {
       font-size: 24px;
       color: #333;
       text-align: center;
    }
    
    button {
       background-color: #4CAF50;
       color: white;
       padding: 15px 32px;
       text-align: center;
       text-decoration: none;
       display: inline-block;
       font-size: 16px;
       margin: 4px 2px;
       cursor: pointer;
       border-radius: 8px;
    }
    
    button:hover {
       background-color: #45a049;
    }
    
    button:active {
       background-color: #3e8e41;
    }
    
    p {
       font-weight: bold;
       font-size: 18px;
       color: #333;
    }
    
    For detailed, step-by-step instructions, check out my Chrome extension with GitHub Copilot repo.

    Three important lessons about learning and pair programming in the age of AI

    1. Generative AI reduces the fear of making mistakes. It can be daunting to learn a new language or framework, or start a new project. The fear of not knowing where to start—or making a mistake that could take hours to debug—can be a significant barrier to getting started. I’ve been a developer for over three years, but streaming while coding makes me nervous. I sometimes focus more on people watching me code and forget the actual logic. When I conversed with GitHub Copilot, I gained reassurance that I was going in the right direction and that helped me to stay motivated and confident during the stream.

    2. Generative AI makes it easier to learn about new subjects, but it doesn’t replace the work of learning. GitHub Copilot didn’t magically write an entire Chrome extension for me. I had to experiment with different prompts, and ask questions to GitHub Copilot, ChatGPT, Google, and developers on my livestream. To put it in perspective, it took me about 1.5 hours to do steps 1 to 5 while streaming.

      But if I hadn't used GitHub Copilot, I would've had to write all this code from scratch or look it up through piecemeal searches. With the AI-generated code suggestions, I was able to jump right into reviewing and troubleshooting, so a lot of my time and energy was focused on understanding how the code worked. I still had to put in the effort to learn an entirely new skill, but I was analyzing and evaluating code more often than I was trying to learn and then remember it.

    3. Generative AI coding tools made it easier for me to collaborate with other developers. Developers who tuned into the livestream could understand my thought process because I had to tell GitHub Copilot what I wanted it to do. By clearly communicating my intentions with the AI pair programmer, I ended up communicating them more clearly with developers on my livestream, too. That made it easy for people tuning in to become my virtual pair programmers during my livestream.

    Overall, working with GitHub Copilot made my thought process and workflow more transparent. Like I said earlier, it was actually a developer on my livestream who recommended a service worker file after noticing that GitHub Copilot didn’t include it in its suggested file structure. Once I confirmed in a chat conversation with GitHub Copilot and a Google search that I needed a service worker, I used GitHub Copilot to help me write one.

    Take this with you

    GitHub Copilot made me more confident about learning something new and collaborating with other developers. As I said before, live coding can be nerve-wracking. (I get nervous even when I'm just pair programming with a coworker!) But GitHub Copilot's real-time code suggestions and corrections created a safety net, allowing me to code more confidently—and quickly—in front of a live audience. Also, because I had to clearly communicate my intentions to the AI pair programmer, I was also communicating clearly with the developers who tuned into my livestream. This made it easy to collaborate with them virtually.

    The real-time interaction with GitHub Copilot and the other developers helped me catch errors, learn from code suggestions, and reinforce my own understanding and knowledge. The result was a high-quality codebase for a browser extension.

    This project is a testament to the collaborative power of human-AI interaction. The experience underscored how GitHub Copilot can be a powerful tool in boosting confidence, facilitating learning, and fostering collaboration among developers.


    More resources

    Web Summit Rio 2023: Building an app in 18 minutes with GitHub Copilot X

    Post Syndicated from Thomas Dohmke original https://github.blog/2023-05-05-web-summit-rio-2023-building-an-app-in-18-minutes-with-github-copilot-x/

    Missed GitHub CEO Thomas Dohmke’s Web Summit Rio 2023 talk? Read his remarks and get caught up on everything GitHub Copilot X.

    Hello, Rio! I’m Thomas, and I’m a developer. I’ve been looking forward to this for some time, about a year. When I first planned this talk, I wanted to talk about the future of artificial intelligence and applications. But then I thought, why not show you the full power live on stage. Build something with code.

    I’ve been a developer my entire adult life. I’ve been coding since 1989. But as CEO, I haven’t been able to light up my contribution graph in a while. Like all of you, like every developer, my energy, my creativity every day is finite. And all the distractions of life—from the personal to the professional—from morning to night, gradually zaps my creative energy.

    When I wake up, I’m at my most creative and sometimes I have a big, light bulb idea. But as soon as the coffee is poured, the distractions start. Slack messages and emails flood in. I have to review a document before a customer presentation. And then I hear my kids yell as their football flew over the fence. And, of course, they want me to go over to our neighbors’ and get it.

    Then…I sit back down, I am in wall-to-wall meetings. And by the time the sun comes down, that idea I had in the morning—it’s gone! It’s midnight and I can’t even keep my eyes open. To be honest, in all this, I usually don’t even want to get started because the perception of having to do all the mundane tasks, all the boilerplate work that comes with coding, stops me in my tracks. And I know this is true for so many developers.

    The point is: boilerplate sucks! All busywork in life, but especially repetitive work is a barrier between us and what we want to achieve. Every day, building endless boilerplate prevents developers from creating a new idea that will change the world. But with AI, these barriers are about to be shattered.

    You’ve likely seen the memes about the 10x developer. It’s a common internet joke—people have claimed the 10x developer many times. With GitHub Copilot and now Copilot X, it’s time for us to redefine this! It isn’t that developers need to strive to be 10x, it’s that every developer deserves to be made 10 times more productive. With AI at every step, we will truly create without captivity. With AI at every step, we will realize the 10x developer.

    Imagine: 10 days of work, done in one day. 10 hours of work, done in one hour. 10 minutes of work, done with a single prompt command. This will allow us to amplify our truest self-expression. It will help a new generation of developers learn and build as fast as their minds. And because of this, we’ll emerge into a new spring of digital creativity, where every light bulb idea we have when we open our eyes for the day can be fully realized no matter what life throws at us.

    During my presentation, I built a snake game in 15 minutes.

    This morning I woke up, and I wanted to build something. I wanted to build my own snake game, originally created back in 1976. It’s a classic. So, 15 minutes are left on my timer. And I’m going to build this snake game. Let’s bring up Copilot X and get going!

    See how you can 10x your developer superpowers. Discover GitHub Copilot X.

    How the Grafana Alerting team scales their issue management with GitHub Projects

    Post Syndicated from Mario Rodriguez original https://github.blog/2023-03-15-how-the-grafana-alerting-team-scales-their-issue-management-with-github-projects/

    At GitHub you’ve heard us talk about how we are using GitHub Projects and GitHub Actions to plan and track our work and now we’ve asked one of our customers, Grafana Labs, to share how their teams are approaching work in a new way. Whether they are managing open source requests, operational tasks, or escalations, the Grafana Labs Alerting team uses GitHub Projects to manage all these issues efficiently.

    Headshot photograph of Armand Grillet, a white male with short brownish hair and a short beard wearing round tortoise shell glasses. Let’s hear more from Armand Grillet, Senior Engineering Manager at Grafana Labs, including how his teams use tasklists to break work into manageable tasks, use a common set of labels to filter tasks, create multiple views on a single project to meet the needs of different teams and stakeholders, and use automation to enable engineers to stay focused on the code.


    Grafana is a leading open source platform for monitoring and observability, which is why the Grafana Labs GitHub organization is the center of our engineering efforts, with nearly 1,000 repositories, including eight with more than 2,000 stars. In addition to this open source work, Grafana Labs engineers also work on the Grafana Cloud observability platform and its customers' escalations.

    As the manager of the Grafana Alerting and service-level objectives (SLOs) backend teams at Grafana Labs, it was essential for me to have one project board that benefited our multiple stakeholders: team members, other employees, and the open source community. Our GitHub Project has offered us the opportunity to do just that. You can even make a copy of our board and adapt it for your own project needs, using the 'make a copy' functionality.

    One view for each team, assembled around task lists

    Our four teams each have a view in the Grafana Alerting project: "Backend", "Front-end", "UX", and "Docs".

    Grafana Alerting has contributors working across four teams: backend, front-end, UX, and docs. Each of these contributor types has their own view in our project. The field options (thus the columns) are the same for each of these views:

    • Inbox—not reviewed yet
    • Waiting for input—open source issues that need more details
    • Backlog—reviewed, priority depending on the milestone set
    • In progress
    • In review
    • Done

    These columns (custom fields) are intentionally generic. They can work for all teams, no matter whether the issue was written by someone in the community or by someone internally. We use project filters based on labels, such as "area/frontend", to allow issues to be automatically added to the correct views once they are added to our project (see the illustrative filter below).
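    For example, a view for the front-end team could be driven by a label filter along these lines (illustrative only; the exact syntax and labels depend on your project setup):

       label:"area/frontend"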

    Big issues that we work on over a quarter use tasklists to break down the main pieces of work.
    Note: Tasklists are currently in private beta; you can sign your organization up for access on the GitHub Feature Preview portal.

    For bigger issues, we make use of the GitHub tasklist feature to break down the work into tasks. We use labels to filter the tasks to be included in the relevant team’s view. This creates views that provide two different, but useful, kinds of information. For example, for the docs team:

    • Smaller issues with only the label “type/docs” are items the docs team needs to work on.
    • Bigger issues containing task lists with the label “type/docs” along with labels for additional areas (for example, “area/frontend”) are issues with a dependency on the docs team, but the item is not owned by “docs.” For these bigger issues, if the status is “in progress” someone in the docs team should start checking how they can help.

    As a manager of the backend team and project lead for Grafana Alerting, this workflow gives me peace of mind. Even if our docs team members miss some meetings, the Docs view in our project is always accurate, because engineers maintain the status of issues.

    The four team views, combined with our "Epics" view (the first view in our board) that lists our big issues for an entire quarter, allow everyone to see our progress on Grafana Alerting. GitHub users who are part of the Grafana Labs organization can see all issues, whereas GitHub users in our community can only see issues in public repositories. As most of our issues live in the public Grafana repository, this allows us to be transparent by default.

    Whilst we use a private repository for issues relating to our operational work, we use the same labels in all our alerting-related repositories so that we can use project filters easily. Having common labels in many repositories creates an incentive to have the same labels in other repositories, especially new ones, even if they do not relate to alerting. This growing commonality makes searching for issues across multiple repositories easier, which is particularly useful for our product managers.

    Custom fields to create tailored views

    A valuable feature of GitHub Projects is the ability to have different columns (custom fields) per view. This allows us to view not only smaller issues but also larger issues covering an entire quarter. Our team also handles engineering escalations, which move at a faster pace than normal issues.

    As a team, our three custom fields are:

    • Status: the default field for all issues except escalations (as mentioned above).
    • Quarter: used for bigger issues that include tasklists.
    • Escalation: used to capture more granular status of each escalation.

    With these custom fields, we can have custom views, such as our epic view, which gives us a bird's-eye view of our quarterly goals, and our escalations view, which lets us review the state of escalations that need engineering work.

    The escalations view uses the escalation status custom field with special values such as “Waiting for release.”

    Thanks to the adaptability of GitHub Projects, we finally have one ‘source-of-truth’ to reference normal issues, big issues, and escalations.

    Enhancing projects with GitHub Actions

    For escalations, which need to be resolved urgently, we also use GitHub Actions to notify us on Slack, or via Grafana OnCall if it's a high-priority escalation.

    Combining GitHub Projects and GitHub Actions offers endless possibilities. Actions like github/issue-labeler allow us to have repositories shared by different teams with issues labeled automatically depending on keywords used in an issue. These labels are then used by other actions to add the issue to the right view or to send notifications to external systems. Opening the project and seeing that new relevant issues have been added automatically, ready for triage, feels like magic.

    An issue labeled “alerting” and “prio/3” has been created by support in “grafana/support-escalations.” This adds the issue to the right project and notifies the current on-call on Slack via GitHub Actions.

    We often have escalations related to features requested by customers. Being able to link to an escalation in an open source issue allows engineers and product managers to prioritize together on one platform while keeping this information confidential from GitHub users outside our organization.

    These automations have been important in terms of engineers’ motivation and our productivity. For example, the Grafana Alerting team receives around five issues from support regarding escalations per week. During the last quarter of 2022, at any given time, the team had on average only four open escalations due to the automation motivating engineers to prioritize this work. This is a significant reduction from the average of 20 open escalations at any given time prior to our shift to GitHub Projects and GitHub Actions.

    Flow chart showing the origins and routes that an item can take on its way to ending up in a GitHub project.
    Combining GitHub Projects and GitHub Actions allows us to route feedback from various sources through the most relevant repos and then onto a single project.

    GitHub Projects at scale

    Since we started using GitHub Projects, our team has been growing. An important moment for me in our scaling was the creation of a new project for the team working on SLOs. Originally, this team had no escalations and a small number of engineers. We were able to easily make a copy of the ‘Alerting’ project, changing the filters to accommodate the new team’s repositories, too. The important question was whether the workflows we had for a big team would be too much for such a new team. The answer was yes, there were too many views, creating unnecessary overhead, but it was very easy to reduce the number of views to adapt to the new team’s needs. Using the existing project as a starting point for the SLO team enabled us to get a new project up and running fast.

    After a year of using GitHub Projects, we have seen its ability to handle our different types of issues and to adapt to the needs of our various teams at Grafana Labs. We find GitHub Projects to be a flexible planning and tracking solution that enables engineers to stay focussed on code and gives issues more visibility. GitHub Projects keeps the burden for contributors low whilst still providing the views managers need to understand and plan team efforts. GitHub Projects is the only tool the Grafana Alerting team needs to do project planning.

    How to automate your dev environment with dev containers and GitHub Codespaces

    Post Syndicated from Kedasha Kerr original https://github.blog/2023-03-06-how-to-automate-your-dev-environment-with-dev-containers-and-github-codespaces/

    When I started my first role as a software engineer, I remember taking about four days to set up my local development environment. I had so many issues with missing dependencies, incorrect versions, and failed installations. When I finally finished setting up all the tools and software I needed to be a productive member of the team, I cloned one of our repositories to my machine, set up my environment variables, ran npm run dev and received so many errors because I forgot to install the dependencies (and read the README) or switch to the right node version. Ugh! I can’t tell you how many times this happened to me in my first year!

    Back then, I wished I had a way that was streamlined—something I set up only once, that just worked every time I accessed the repository. Although I did learn how to automate my computer setup with a Brewfile, I wish I could just get to coding in a repository without thinking about configuration.

    Gif from the animated show Spongebob Squarepants of a character picking up a computer as if to toss it away, saying "I'll show you automated."

    When I think about how we work on projects in a repository, I realize that many of the processes we need to get started on that project can be automated with the help of dev containers, in this case, by using a devcontainer.json file and Codespaces.

    Let's take a look at how we can automate our dev environment by adding a dev container to this open source project, Tech is Hiring, in GitHub Codespaces.

    For a TL;DR of this post: GitHub Codespaces enables you to start coding faster when coupled with dev containers. See the image below for a summary of how:

    Image of a table entitled "Start coding faster with Codespaces." The left column is labeled "Old Way" and the right column is labeled "New Way." The rest of this blog post will enumerate the items listed in the image.

    Now, let’s get some definitions out of the way.

    What is GitHub Codespaces?

    GitHub Codespaces is a development environment in the cloud. It is hosted by GitHub in an isolated environment (a Docker container) that runs on a virtual machine. If you're not familiar with virtual machines or Docker containers, take a look at these videos: What is a virtual machine? and What are Docker containers?

    Currently, individual developers have 60 hours of free codespaces usage per month, so definitely take advantage of this awesomeness to build from anywhere.

    What are dev containers?

    Dev containers allow us to run our code in a preconfigured, isolated environment (container). It gives us the ability to predefine our dev environment in our repositories and use a consistent, reliable environment across the board without worrying about configuration files—since it’s all set up for us from the beginning with a devcontainer.json file.

    What is the devcontainer.json file?

    The devcontainer.json file is a document that defines how your project will build, display, and run. Simply put, it allows us to specify the behavior we want in our dev environment. For example, if we have a devcontainer.json file that includes installing the ESLint extension in VS Code, once we open up a workspace, ESLint will be automatically installed for us.
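    To make that concrete, here's a minimal sketch of what such a file could look like (an assumed example that reuses the universal image shown later in this post; it isn't the exact file we'll build below):

       {
          // base image for the container
          "image": "mcr.microsoft.com/devcontainers/universal:2",
          // automatically install the ESLint extension in VS Code
          "customizations": {
             "vscode": {
                "extensions": ["dbaeumer.vscode-eslint"]
             }
          }
       }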

    Automating your workflow with dev containers and GitHub Codespaces

    To start using GitHub Codespaces, we don’t need to set up a devcontainer.json file. By default, GitHub Codespaces uses the universal dev container image, which caters to a vast array of languages and tools. This means, whenever you open up a new codespace without a devcontainer.json file your codespace will automagically load so you can code instantly. However, adding a devcontainer.json file gives us the ability to automate a lot of our dev environment workflows to our liking.

    Okay, okay, that was a lot of chatter—let’s now get into what you really came here for!

    Gif of the character Mary Poppins pinching shut the mouth of her talking umbrella handle and telling it, "That will be quite enough of that, thank you." She then opens the umbrella and floats away.

    Using the open source project, Tech is Hiring, let’s walk through how we typically work with a repository using our local dev environment.

    What work typically looks like

    At first glance, we see that this project uses Next.js, Tailwind CSS, Chakra UI, TypeScript, Storybook, Vite, Cypress, Axios, and React as some of its dependencies. We'd need to install all of these dependencies on our local machine to get this project running.

    1. Let’s clone the repository, and cd into the project.

      Gif of the terminal output as the Tech is Hiring repository is cloned.

    2. Then, let’s install dependencies to get the project running locally.

      Gif showing terminal output as the necessary dependencies are installed.

    3. This project uses Storybook, so let's run Storybook and spin up the actual app locally.

      Gif of the user's desktop with a terminal app running commands.

    The process is not so bad, but it took a bit of time. We also need to make sure we're using the correct Node version, check whether we need any environment variables, and resolve any runtime errors. Thankfully, I didn't encounter any errors while working on this, but it still took a bit of time.

    Going faster with GitHub Codespaces and dev containers

    Let’s make the process better by adding a devcontainer.json file to this project and opening it in GitHub Codespaces to see what happens.

    We can either use the VS Code command palette to add a pre-existing dev container or we can write the configuration file ourselves (which we’ll do below).

    1. Let’s first create a .devcontainer folder in the root of the project and a devcontainer.json file in the new folder.

      Creating a `.devcontainer` folder in the root of the project and a `devcontainer.json` file in the new folder.

    2. Now, let's automate installing dependencies, starting the dev server, opening a preview of our app on localhost:3000, and installing VS Code extensions. Once we get everything configured, your json file should look like this:

      {
        // image being used
         "image": "mcr.microsoft.com/devcontainers/universal:2",
        // set minimum cpu
         "hostRequirements": {
             "cpus": 4
         },
         // install dependencies and start app
         "updateContentCommand": "npm install",
         "postAttachCommand": "npm run dev",
         // open _app.tsx once the container is built
         "customizations": {
             "codespaces": {
                 "openFiles": [
                     "src/pages/_app.tsx"
                 ]
             },
             // install some vscode extensions
             "vscode": {
                 "extensions": [
                     "dbaeumer.vscode-eslint",
                     "github.vscode-pull-request-github",
                     "eamodio.gitlens",
                     "christian-kohler.npm-intellisense"
                 ]
             }
         },
         // forward port 3000 from the container so we can preview the app
         "forwardPorts": [3000],
         // give port a label and open a preview of the app
         "portsAttributes": {
            "3000": {
               "label": "Application",
               "onAutoForward": "openPreview"
             }
           }
      }
      

      Sidenote: I’ve broken down the purpose of the properties in this file for you by adding comments before each. To learn more about each property, continue reading at container.dev. I also installed a few extensions that are not needed, but I wanted to show you that you could automate installing extensions, too!

    3. Let’s commit the file and merge it into the main branch, then open up the project in GitHub Codespaces.

      Committing a file, merging it to the main branch, and then opening GitHub Codespaces.

      When we open up the application in GitHub Codespaces, our dependencies will be installed, the server will start, and a preview will automatically open for us. If we needed environment variables to work on this repository, those would have already been configured for us as a repository secret on GitHub. We also didn’t need to install hefty node_modules to our machine.

    I call this a win!

    Comparing both ways

    There’s plenty more that we can do with dev containers and GitHub Codespaces to automate our dev environment. But let’s summarize what we just did in GitHub Codespaces and with the help of dev containers:

    • We clicked the GitHub Codespaces button on the GitHub repository.
    • Everything was set up and installed for us (thanks, json file!).
    • We got to work and started coding.

    Now, isn’t that better?

    Wrapping up

    So, what’s the point of GitHub Codespaces and why should you care as a developer? Well, for one, you can automate most of the startup processes you need to access a repository. You can also do a lot more customizations to your dev environment with dev containers. Take a look at all the options you have—and watch out for my next blog post where I’ll go through the anatomy of a devcontainer.json file.

    Secondly, you can code from anywhere. I hate it when I’m not able to access one of my side projects on a different machine because that one machine is configured perfectly to suit the project. With GitHub Codespaces, you can start coding at the click of a button and from any machine that supports a modern browser.

    I encourage you to get started with GitHub Codespaces today and try adding a devcontainer.json file to one of your projects! I promise you won’t regret it.

    Until next time, happy coding!