
GitHub CLI project command is now generally available!

Post Syndicated from Ariel Deitcher original https://github.blog/2023-07-11-github-cli-project-command-is-now-generally-available/

Effective planning and tracking is essential for developer teams of all shapes and sizes. Last year, we announced the general availability of GitHub Projects, connecting your planning directly to the work your teams are doing in GitHub. Today, we’re making GitHub Projects faster and more powerful. The project command for the gh CLI is now generally available!

In this post, we’ll show how to get started with the new command, share some examples you can try on the command line and in GitHub Actions, and list the steps to upgrade from the archived gh-projects extension. Let’s see how you can conveniently manage and collaborate on GitHub Projects from the command line.

The components of GitHub Projects

Let’s start by familiarizing ourselves with the key components of GitHub Projects. A project is made up of three components—the Project, Project field(s), and Project item(s).

A Project belongs to an owner (which can be either a user or an organization), and is identified by a project number. As an example, the GitHub public roadmap project is number 4247 in the github organization. We’ll use this project in some of our examples later on.

Project fields belong to a Project and have a type such as Status, Assignee, or Number, while field values are set on an item. See understanding fields for more details.

Project items are one of type draft issue, issue, or pull request. An item of type draft issue belongs to a single Project, while items of type issue and pull request can be added to multiple projects.

These three components make up the subcommands of gh project, for example:

  • Project subcommands include: create, copy, list, and view.
  • Project field subcommands include: field-create, field-list, and field-delete.
  • Project item subcommands include: item-add, item-edit, item-archive, and item-list.

For the full list of project commands, check out the manual.
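For a quick taste of the field subcommands, here’s a sketch of listing the fields on the GitHub public roadmap project (the column layout and IDs are approximations; the real field names appear in the view output later in this post):

$ gh project field-list --owner github 4247
NAME    DATA TYPE                   ID
Title   ProjectV2Field              PVTF_xxx
Status  ProjectV2SingleSelectField  PVTSSF_xxx
...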

Permissions check

In order to get started with the new command, you’ll need to ensure you have the right permissions. The project command requires the project auth scope, which isn’t part of the default scopes of the gh auth token.

In your terminal, you can check your current scopes with this command:

$ gh auth status
github.com
✓ Logged in to github.com as mntlty (keyring)
✓ Git operations for github.com configured to use https protocol.
✓ Token: gho_************************************
✓ Token scopes: gist, read:org, repo, workflow

If you don’t see project in the list of token scopes, you can add it by following the interactive prompts from this command:

$ gh auth refresh -s project
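After the refresh completes, gh auth status should now show project among your token scopes (output abridged):

$ gh auth status
...
✓ Token scopes: gist, project, read:org, repo, workflow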

In GitHub Actions, you must choose one of the options from the documentation to make a token with the project scope available.

Running project commands

Now that you have the permissions you need, let’s look at some examples of running project commands using my user and the GitHub public roadmap project, which you can adapt to your team’s use cases.

List the projects owned by the current user (note that no --owner flag is set):

$ gh project list
NUMBER  TITLE                     STATE  ID
1       my first project          open   PVT_kwxxx
2       @mntlty’s second project  open   PVT_kwxxx

Create a project owned by mntlty:

$ gh project create --owner mntlty --title 'my project'

View the GitHub public roadmap project:

$ gh project view --owner github 4247

## Title

GitHub public roadmap

## Description

--

## Visibility

Public

## URL

https://github.com/orgs/github/projects/4247

## Item count

208

## Readme

--

## Field Name (Field Type)

Title (ProjectV2Field)

Assignees (ProjectV2Field)

Status (ProjectV2SingleSelectField)

Labels (ProjectV2Field)

Repository (ProjectV2Field)

Milestone (ProjectV2Field)

Linked pull requests (ProjectV2Field)

Reviewers (ProjectV2Field)

Tracks (ProjectV2Field)

Tracked by (ProjectV2Field)

List the items in the GitHub public roadmap project:

$ gh project item-list --owner github 4247

TYPE   TITLE                                                                    NUMBER  REPOSITORY      ID
Issue  Kotlin security analysis support in CodeQL code scanning (public beta)  207     github/roadmap  PVTI_lADNJr_NE13OAALQgw
Issue  Swift security analysis support in CodeQL code scanning (beta)          206     github/roadmap  PVTI_lADNJr_NE13OAALQhA
Issue  Fine-grained PATs (v2 PATs) - [Public Beta]                             184     github/roadmap  PVTI_lADNJr_NE13OAALQmw
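The item subcommands follow the same pattern. For example, you could add an existing issue to one of your projects by URL (the project number and issue URL below are illustrative):

$ gh project item-add 1 --owner mntlty --url https://github.com/mntlty/my-repo/issues/1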

Copy the GitHub public roadmap project structure to a new project owned by mntlty:

$ gh project copy 4247 --source-owner github --target-owner mntlty --title 'my roadmap'

https://github.com/users/mntlty/projects/1

Note that if you are using a TTY and do not pass the --owner flag or the project number argument to a command that requires them, an interactive prompt will let you select those values.

JSON format

Now, let’s look at how to format the command output in JSON, which displays more information for use in scripting, automation, and piping into other commands. Every project subcommand supports outputting to JSON format by setting the --format=json flag:

$ gh project view --owner github 4247 --format=json
{"number":4247,"url":"<https://github.com/orgs/github/projects/4247","shortDescription":"", "public":true,"closed":false,"title":"GitHub> public roadmap","id":"PVT_kwDNJr_NE10","readme":"","items":{"totalCount":208},"fields":{"totalCount":10},"owner":{"type":"Organization","login":"github"}}%

Combining JSON formatted output with a tool such as jq enables you to unlock even more capabilities. For example, you can create a list of the URLs from all of the Issues on the GitHub public roadmap project that have status “Future”:

$ gh project item-list --owner github 4247 --format=json | jq '.items[] |
select(.status=="Future" and .content.type == "Issue") | .content.url'

"<https://github.com/github/roadmap/issues/188>"
"<https://github.com/github/roadmap/issues/187>"
"<https://github.com/github/roadmap/issues/166>"

GitHub Actions

You can also level up your team’s usage of GitHub Projects with project commands in your GitHub Actions workflows to enhance automation, generate on demand reports, and react to events such as when a project item is modified. For example, you can create a workflow which is triggered by a workflow_dispatch event and will close all projects that are owned by mntlty and which have no items:

on: 
  workflow_dispatch:

jobs:
  close_empty:
    runs-on: ubuntu-latest
    env:
      GH_TOKEN: ${{ secrets.PROJECT_TOKEN }}
    steps:
      - run: |
          gh project list --owner mntlty --format=json \
          | jq '.projects[] | select(.items.totalCount == 0) | .number' \
          | xargs -n1 gh project close --owner mntlty 

The latest version of gh is automatically available in the GitHub Actions environment. For more information on using GitHub Actions, see https://docs.github.com/en/actions.
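Scheduled workflows work just as well. Here’s a minimal sketch of a weekly report that counts the items in a project (it assumes the same PROJECT_TOKEN secret and a hypothetical project number 1):

on:
  schedule:
    - cron: "0 9 * * 1"

jobs:
  report:
    runs-on: ubuntu-latest
    env:
      GH_TOKEN: ${{ secrets.PROJECT_TOKEN }}
    steps:
      - run: gh project item-list --owner mntlty 1 --format=json | jq '.items | length'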

Upgrading from the gh-projects extension

Now that the project command is officially part of the CLI, the gh-projects extension repository has been archived. If you’re currently using the extension, you don’t need to change anything. You can continue installing and using the gh-projects extension; however, it won’t receive any future enhancements. Fortunately, it’s very simple to make the transition from the gh-project extension to the project command:

  • Upgrade to the latest version of gh.
  • Replace flags for --user and --org with --owner in project commands. owner is the login of the project owner, which is either a user or an organization.
  • Replace gh projects with gh project.

To avoid confusion, I also recommend removing the extension by running the following command:

$ gh ext remove gh-projects

Thank you to the community, @mislav, @samcoe, and @vilmibm for providing invaluable feedback and support on gh-projects!

Get started with GitHub CLI project command today

If you’re interested in learning more or giving us feedback, check out these links:

Upgrade to the latest version of the gh CLI to level up your usage of GitHub Projects!

New GitHub CLI extension tools

Post Syndicated from Nate Smith original https://github.blog/2023-01-13-new-github-cli-extension-tools/

Since the GitHub CLI 2.0 release, developers and organizations have customized their CLI experience by developing and installing extensions. Since then, the CLI team has been busy shipping several new features to further enhance the experience for both extension consumers and authors. Additionally, we’ve shipped the go-gh 1.0 release, a Go library giving extension authors access to the same code that powers the GitHub CLI itself. Finally, the CLI team released the cli/gh-extension-precompile action, which automates the compilation and release of Go, Rust, or C++ extensions.

This blog post provides a tour of what’s new, including an in-depth look at writing a CLI extension with Go.

Introducing extension discovery

In the 2.20.0 release of the GitHub CLI, we shipped two new commands, gh extension browse and gh extension search, to make discovery of extensions easier (all extension commands are aliased under gh ext, so the rest of this post will use that shortened version).

gh ext browse

gh ext browse is a new kind of command for the GitHub CLI: a fully interactive Terminal User Interface (TUI). It allows users to explore published extensions interactively right in the terminal.

A screenshot of a terminal showing the user interface of "gh ext browse". A title and  search box are at the top. The left side of the screen contains a list of extensions and the right side a rendering of their readmes.

Once gh ext browse has launched and loaded extension data, you can press the up and down arrows (or k and j) to browse through all of the GitHub CLI extensions available for installation, sorted by star count.

Pressing / focuses the filter box, allowing you to trim the list down to a search term.

A screenshot of "gh ext browse" running in a terminal. At the top, a filter input box is active and a search term has been typed in. The left column shows a filtered list of extensions that match the search term. The right column is rendering an extension's readme.

You can select any extension by highlighting it. The selected extension can be installed by pressing i or uninstalled by pressing r. Pressing w will open the currently highlighted extension’s repository page on GitHub in your web browser.

Our hope is that this is a more enjoyable and easier way to discover new extensions, and we’d love to hear feedback on the approach we took with this command.

In tandem with gh ext browse, we’ve shipped another new command intended for scripting and automation: gh ext search. This is a classic CLI command which, with no arguments, prints out the first 30 extensions available to install, sorted by star count.

A screenshot of the "gh ext search" command after running it in a terminal. The output lists in columnar format a list of extension names and descriptions, as well as a leading checkmark for installed extensions. At the top is a summary of the results.

A green check mark on the left indicates that an extension is installed locally.

Any arguments provided narrow the search results:

A screenshot of the "gh ext search" command run with a search term in a terminal. The output lists only the extensions matching that term.

Results can be further refined and processed with flags (a combined example follows this list):

  • --limit, for fetching more results
  • --owner, for only returning extensions by a certain author
  • --sort, for sorting results, for example, by updated
  • --license, to filter extensions by software license
  • --web, for opening search results in your web browser
  • --json, for returning results as JSON
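For instance, combining a search term with a couple of these flags:

$ gh ext search terminal --limit 50 --sort updated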

This command is intended to be scripted and will produce composable output if piped. For example, you could install all of the extensions I have written with:

gh ext search --owner vilmibm | cut -f2 | while read -r extension; do gh ext install "$extension"; done

A screenshot of a terminal after running the pipeline above. The output shows each matching extension being installed in turn.

For more information about gh ext search and example usage, see gh help ext search.

Writing an extension with go-gh

The CLI team wanted to accelerate extension development by putting some of the GitHub CLI’s own code into an external library called go-gh for use by extension authors. The GitHub CLI itself is powered by go-gh, ensuring that it is held to a high standard of quality. This library is written in Go just like the CLI itself.

To demonstrate how to make use of this library, I’m going to walk through building an extension from the ground up. I’ll be developing a command called ask for quickly searching the threads in GitHub Discussions. The end result of this exercise lives on GitHub if you want to see the full example.

Getting started

First, I’ll run gh ext create to get started. I’ll fill in the prompts to name my command “ask” and request scaffolding for a Go project.

An animated gif showing an interactive session with the "gh ext create" command. The prompts ask for a command name and language, and scaffolding for a Go project is generated.

Before I edit anything, it would be nice to have this repository on GitHub. I’ll cd gh-ask and run gh repo create, selecting Push an existing local repository to GitHub, and follow the subsequent prompts. It’s okay to make this new repository private for now even if you intend to make it public later; private repositories can still be installed locally with gh ext install but will be unavailable to anyone without read access to that repository.

The initial code

Opening main.go in my editor, I’ll see the boilerplate that gh ext create made for us:

package main

import (
        "fmt"

        "github.com/cli/go-gh"
)

func main() {
        fmt.Println("hi world, this is the gh-ask extension!")
        client, err := gh.RESTClient(nil)
        if err != nil {
                fmt.Println(err)
                return
        }
        response := struct {Login string}{}
        err = client.Get("user", &response)
        if err != nil {
                fmt.Println(err)
                return
        }
        fmt.Printf("running as %s\n", response.Login)
}

go-gh has already been imported for us and there is an example of its RESTClient function being used.

Selecting a repository

The goal with this extension is to get a glimpse into threads in a GitHub repository’s discussion area that might be relevant to a particular question. It should work something like this:

$ gh ask actions
…a list of relevant threads from whatever repository you're in locally…

$ gh ask --repo cli/cli ansi
…a list of relevant threads from the cli/cli repository…

First, I’ll make sure that a repository can be selected. I’ll also remove stuff we don’t need right now from the initial boilerplate.

package main

import (
        "flag"
        "fmt"
        "os"

        "github.com/cli/go-gh"
        "github.com/cli/go-gh/pkg/repository"
)

func main() {
    if err := cli(); err != nil {
            fmt.Fprintf(os.Stderr, "gh-ask failed: %s\n", err.Error())
            os.Exit(1)
    }
}

func cli() error {
        repoOverride := flag.String(
                "repo", "", "Specify a repository. If omitted, uses current repository")
        flag.Parse()

        var repo repository.Repository
        var err error

        if *repoOverride == "" {
                repo, err = gh.CurrentRepository()
        } else {
                repo, err = repository.Parse(*repoOverride)
        }
        if err != nil {
                return fmt.Errorf("could not determine what repo to use: %w", err)
        }

        fmt.Printf(
                "Going to search discussions in %s/%s\n", repo.Owner(), repo.Name())

        return nil
}

Running my code, I should see:

$ go run .
Going to search discussions in vilmibm/gh-ask

Adding our repository override flag:

$ go run . --repo cli/cli
Going to search discussions in cli/cli

Accepting an argument

Now that the extension can be told which repository to query, I’ll next handle any arguments passed on the command line. These arguments will be our search term for the Discussions API. This new code replaces the fmt.Printf call.

        // fmt.Printf was here

        // (this also requires adding "errors" and "strings" to the imports)
        if len(flag.Args()) < 1 {
                return errors.New("search term required")
        }
        search := strings.Join(flag.Args(), " ")

        fmt.Printf(
                "Going to search discussions in '%s/%s' for '%s'\n",
                repo.Owner(), repo.Name(), search)

        return nil
}

With this change, the command will respect any arguments I pass.

$ go run .
gh-ask failed: search term required
exit status 1

$ go run . cats
Going to search discussions in 'vilmibm/gh-ask' for 'cats'

$ go run . fluffy cats
Going to search discussions in 'vilmibm/gh-ask' for 'fluffy cats'

Talking to the API

With search term and target repository in hand, I can now ask the GitHub API for some results. I’ll be using the GraphQL API via go-gh’s GQLClient. For now, I’m just printing some basic output. What follows is the new code at the end of the cli function, replacing the earlier call to fmt.Printf.

        // fmt.Printf call was here

        client, err := gh.GQLClient(nil)
        if err != nil {
                return fmt.Errorf("could not create a graphql client: %w", err)
        }

        query := fmt.Sprintf(`{
                repository(owner: "%s", name: "%s") {
                        hasDiscussionsEnabled
                        discussions(first: 100) {
                                edges { node {
                                        title
                                        body
                                        url
                }}}}}`, repo.Owner(), repo.Name())

        type Discussion struct {
                Title string
                URL   string `json:"url"`
                Body  string
        }

        response := struct {
                Repository struct {
                        Discussions struct {
                                Edges []struct {
                                        Node Discussion
                                }
                        }
                        HasDiscussionsEnabled bool
                }
        }{}

        err = client.Do(query, nil, &response)
        if err != nil {
                return fmt.Errorf("failed to talk to the GitHub API: %w", err)
        }

        if !response.Repository.HasDiscussionsEnabled {
                return fmt.Errorf("%s/%s does not have discussions enabled", repo.Owner(), repo.Name())
        }

        matches := []Discussion{}

        for _, edge := range response.Repository.Discussions.Edges {
                if strings.Contains(edge.Node.Body+edge.Node.Title, search) {
                        matches = append(matches, edge.Node)
                }
        }

        if len(matches) == 0 {
                fmt.Fprintln(os.Stderr, "No matching discussion threads found :(")
                return nil
        }

        for _, d := range matches {
                fmt.Printf("%s %s\n", d.Title, d.URL)
        }

        return nil

When I run this, my output looks like:

$ go run . --repo cli/cli actions
gh pr create don't trigger `pullrequest:` actions https://github.com/cli/cli/discussions/6575
GitHub CLI 2.19.0 https://github.com/cli/cli/discussions/6561
What permissions are needed to use OOTB GITHUB_TOKEN with gh pr merge --squash --auto https://github.com/cli/cli/discussions/6379
gh actions feedback https://github.com/cli/cli/discussions/3422
Pushing changes to an inbound pull request https://github.com/cli/cli/discussions/5262
getting workflow id and artifact id to reuse in github actions https://github.com/cli/cli/discussions/5735

This is pretty cool! Matching discussions are printed and we can click their URLs. However, I’d prefer the output to be tabular so it’s a little easier to read.

Formatting output

To make this output easier for humans to read and machines to parse, I’d like to print the title of a discussion in one column and then the URL in another.

I’ve replaced that final for loop with some new code that makes use of go-gh’s term and tableprinter packages.

        if len(matches) == 0 {
                fmt.Fprintln(os.Stderr, "No matching discussion threads found :(")
                return nil
        }

        // old for loop was here

        // (term and tableprinter come from go-gh's pkg/term and pkg/tableprinter)
        isTerminal := term.IsTerminal(os.Stdout)
        tp := tableprinter.New(os.Stdout, isTerminal, 100)

        if isTerminal {
                fmt.Printf(
                        "Searching discussions in '%s/%s' for '%s'\n",
                        repo.Owner(), repo.Name(), search)
        }

        fmt.Println()
        for _, d := range matches {
                tp.AddField(d.Title)
                tp.AddField(d.URL)
                tp.EndRow()
        }

        err = tp.Render()
        if err != nil {
                return fmt.Errorf("could not render data: %w", err)
        }

        return nil

The call to term.IsTerminal(os.Stdout) will return true when a human is sitting at a terminal running this extension. If a user invokes our extension from a script or pipes its output to another program, term.IsTerminal(os.Stdout) will return false. This value then informs the table printer how it should format its output. If the output is a terminal, tableprinter will respect a display width, apply colors if desired, and otherwise assume that a human will be reading what it prints. If the output is not a terminal, values are printed raw and with all color stripped.
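In other words (the trailing comments here are annotations, not part of the commands):

$ go run . --repo cli/cli actions         # stdout is a terminal: padded, truncated columns
$ go run . --repo cli/cli actions | cat   # stdout is a pipe: raw, untruncated values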

Running the extension gives me this result now:

A screenshot of running "go run . --repo cli/cli actions" in a terminal. The output is a two-column table of discussion titles and URLs, sized to the terminal width.

Note how the discussion titles are truncated.

If I pipe this elsewhere, I can use a command like cut to see the discussion titles in full:

A screenshot of running "go run . --repo cli/cli actions | cut -f1" in a terminal. The result is a list of discussion thread titles.

Adding the tableprinter improved both human readability and scriptability of the extension.

Opening browsers

Sometimes, opening a browser can be helpful as not everything can be done in a terminal. go-gh has a function for this, which we’ll make use of in a new flag that mimics the “feeling lucky” button of a certain search engine. Specifying this flag means that we’ll open a browser with the first matching result to our search term.

I’ll add a new flag definition to the top of the cli function:

func cli() error {
        lucky := flag.Bool("lucky", false, "Open the first matching result in a web browser")
        // rest of code below here

And, then add this before I set up the table printer:

        if len(matches) == 0 {
                fmt.Fprintln(os.Stderr, "No matching discussion threads found :(")
                return nil
        }

        // (browser comes from go-gh's pkg/browser)
        if *lucky {
                b := browser.New("", os.Stdout, os.Stderr)
                return b.Browse(matches[0].URL)
        }

        // terminal and table printer code
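With that in place, a run like this should open the first matching discussion directly in the browser:

$ go run . --repo cli/cli --lucky actions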
JSON output

For extensions with more complex outputs, you could go even further in enabling scripting by exposing JSON output and supporting jq expressions. jq is a general purpose tool for interacting with JSON on the command line. go-gh has a library version of jq built directly in, allowing extension authors to offer their users the power of jq without them having to install it themselves.

I’m adding two new flags: --json and --jq. The first is a boolean and the second a string. They are now the first two lines of the cli function:

func cli() error {
        jsonFlag := flag.Bool("json", false, "Output JSON")
        jqFlag := flag.String("jq", "", "Process JSON output with a jq expression")

After setting isTerminal, I’m adding this code block:

 
        // (this requires adding "bytes", "encoding/json", and go-gh's pkg/jq
        // and pkg/jsonpretty to the imports)
        isTerminal := term.IsTerminal(os.Stdout)
        if *jsonFlag {
                output, err := json.Marshal(matches)
                if err != nil {
                        return fmt.Errorf("could not serialize JSON: %w", err)
                }

                if *jqFlag != "" {
                        return jq.Evaluate(bytes.NewBuffer(output), os.Stdout, *jqFlag)
                }

                return jsonpretty.Format(os.Stdout, bytes.NewBuffer(output), " ", isTerminal)
        }

Now, when I run my code with --json, I get nicely printed JSON output:

A screenshot of running "go run . --repo cli/cli --json actions" in a terminal. The output is pretty-printed JSON.

If I specify a jq expression I can process the data. For example, I can limit output to just titles like we did before with cut; this time, I’ll use the jq expression .[]|.Title instead.

A screenshot of running "go run . --repo cli/cli --json --jq '.[]|.Title' actions" in a terminal. The output is a list of discussion thread titles.

Keep going

I’ll stop here with gh ask but this isn’t all go-gh can do. To browse its other features, check out this list of packages in the reference documentation. You can see my full code on GitHub at vilmibm/gh-ask.

Releasing your extension with cli/gh-extension-precompile

Now that I have a feature-filled extension, I’d like to make sure it’s easy to create releases for it so others can install it. At this point it’s uninstallable since I have not precompiled any of the Go code.

Before I worry about making a release, I have to make sure that my extension repository has the gh-extension topic. I can add it by running gh repo edit --add-topic gh-extension. Without this topic added to the repository, the extension won’t show up in commands like gh ext browse or gh ext search.

Since I started this extension by running gh ext create, I already have a GitHub Actions workflow defined for releasing. All that’s left before others can use my extension is pushing a tag to trigger a release. The workflow file contains:

name: release
on:
  push:
    tags:
      - "v*"
permissions:
  contents: write

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: cli/gh-extension-precompile@v1

Before tagging a release, make sure:

  • The repository is public (assuming you want people to install it for themselves; you can keep it private and make releases just for yourself, if you prefer).
  • You’ve pushed all the local work you want to see in the release.

To release:

A screenshot of running "git tag v1.0.0", "git push origin v1.0.0", and "gh run view" in a terminal. The output shows a tag being created, the tag being pushed, and then the successful result of the extension's release job having run.

Note that the workflow ran automatically. It looks for tags of the form vX.Y.Z and kicks off a build of your Go code. Once the release is done, I can check it to see that all my code compiled as expected:

A screenshot of running "gh release view" in a terminal for the extension created in the blog post. The output shows a list of assets. The list contains executable files for a variety of platforms and architectures.

Now, anyone can run gh ext install vilmibm/gh-ask and try out my extension! This is all the work of the gh-extension-precompile action. This action can be used to compile any language, but by default it only knows how to handle Go code.

By default, the action will compile executables for:

  • Linux (amd64, 386, arm, arm64)
  • Windows (amd64, 386, arm64)
  • MacOS (amd64, arm64)
  • FreeBSD (amd64, 386, arm64)
  • Android (amd64, arm64)

To build for a language other than Go, edit .github/workflows/release.yml to add a build_script_override configuration. For example, if my repository had a script at script/build.sh, my release.yml would look like:

name: release
on:
  push:
    tags:
      - "v*"

permissions:
  contents: write

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: cli/gh-extension-precompile@v1
        with:
          build_script_override: "script/build.sh"

The script specified as build_script_override must produce executables in a dist directory at the root of the extension repository with file names ending with: {os}-{arch}{ext}, where the extension is .exe on Windows and blank on other platforms. For example:

  • dist/gh-my-ext_v1.0.0_darwin-amd64
  • dist/gh-my-ext_v1.0.0_windows-386.exe

Executables in this directory will be uploaded as release assets on GitHub. For OS and architecture nomenclature, please refer to this list. We use this nomenclature when looking for executables from the GitHub CLI, so it needs to be respected even for non-Go extensions.
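For illustration, here’s a minimal sketch of what a build_script_override script might look like; the build steps and paths are hypothetical, and the only real contract is the dist/ naming convention:

#!/usr/bin/env bash
# Hypothetical script/build.sh: compile with your non-Go toolchain, then place
# the executables in dist/ using the {os}-{arch}{ext} naming convention.
set -euo pipefail
mkdir -p dist
# ...invoke your compiler here for each platform...
cp build/linux-amd64/gh-my-ext dist/gh-my-ext_v1.0.0_linux-amd64
cp build/windows-386/gh-my-ext.exe dist/gh-my-ext_v1.0.0_windows-386.exe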

Future directions

The CLI team has some improvements on the horizon for the extensions system in the GitHub CLI. We’re planning a more accessible version of the extension browse command that renders a single column style interface suitable for screen readers. We intend to add support for nested extensions (in other words, an extension called as a subcommand of an existing gh command, like gh pr my-extension), making third-party extensions fit more naturally into our command hierarchy. Finally, we’d like to improve the documentation and flexibility of the gh-extension-precompile action.

Are there features you’d like to see? We’d love to hear about it in a discussion or an issue in the cli/cli repository.

Wrap-up

It is our hope that the extensions system in the GitHub CLI inspires you to create features beyond our wildest imagination. Please go forth and make something you’re excited about, even if it’s just to make gh do fun things like run screensavers.

Contributing to open source at GitHub

Post Syndicated from Ariel Deitcher original https://github.blog/2022-09-06-contributing-to-open-source-at-github/

Ariel Deitcher (@mntlty) is a Senior Software Engineer at GitHub, working on Pull Requests and Merge Queue (beta). In this post, he shares the challenges he encountered finding his path to contributing to open source, what it was like contributing to open source at GitHub, and some of the lessons he learned.

Getting started with open source can be overwhelming

As a computer science graduate in 2011 and searching for my first tech job, I read that contributing to an open source project could help. It was a great way to build skills, make industry connections, and gain practical experience with a real-world problem. Perfect, I thought, I’ll just pick an open source project on this new website called GitHub, and, well, actually I wasn’t sure how to do that. Finding that “Goldilocks” project (where the size, language(s), domain, and community felt just right) was a lot harder than I thought, and I didn’t feel self-confident enough to make much progress. Overwhelmed, I decided the timing wasn’t right but resolved to try again someday.

It bugged me that the contribution graph on my GitHub profile remained stubbornly empty, as all the code I had committed lived in private repositories. That changed in 2016, when contributions to private repositories could be shown on my profile, but my contributions to open source still hadn’t materialized. Between my family, work, life, and the explosive growth in projects to choose from, making that first contribution to open source felt more daunting than ever.

The opportunity to contribute to open source at GitHub

Fast forward to 2021. I read Working in Public: The Making and Maintenance of Open Source Software by Nadia Eghbal while interviewing at GitHub. I was especially captivated by the Stadium model of open source projects, where a small number of maintainers and occasional contributors are vastly outnumbered by a project’s users. This aligned with my mental model of open source projects, where a few performers on a digital stage would conjure feats of coding wizardry. I could only imagine how vulnerable working in public could be, and hoped it would feel less intimidating working at GitHub.

I joined the GitHub team building Merge Queue (beta), a feature which helps users coordinate their merges to a protected branch, ensures that changes are up to date, and that all required checks pass before automatically merging a pull request. Early on, I shared my long-held goal of contributing code to an open source project with my manager, and discussed the GitHub CLI, an open source tool written in Go which lets users interact with GitHub from the command line, as a possible candidate.

While building Merge Queue, our team carefully integrated it with GitHub’s many APIs and tools, checking each one for compatibility and correctness. Testing different scenarios of merging a pull request with the GitHub CLI, I saw that once a Merge Queue was required, running the CLI command gh pr merge would fail in most cases. The Merge Queue was correctly preventing direct merges to its protected branch, and so I began scoping out what changes the CLI might need to support Merge Queue.

As I didn’t have write access to the CLI repository, I forked it, started a new Codespace, and spent some time getting familiar with the CLI’s contributing guidelines and code. Wanting to minimize my changes, I targeted a few places in the merge command to modify. When I was ready, I pushed a commit to my fork and opened a pull request to share with the CLI maintainers. I expected that I would provide support but defer to them for the final implementation.

In reviewing my pull request with the CLI maintainers, it quickly became clear that my changes were hard to reason about. The merge command had accumulated sufficient technical debt that adding more complexity to it was risky. The team asked if I could refactor the merge command in an initial pull request and follow up with a subsequent pull request for the Merge Queue changes after the first was merged. What I had thought would be a rough guide of changes for the CLI maintainers was, in fact, the opportunity I had been looking for to contribute to open source at GitHub. I confirmed that my manager was onboard with this increased commitment, and was ready to get started.

Refactoring the merge command and adding Merge Queue support

I set out to refactor the merge command with a focus on simplicity, readability, and returning early over deeply nested conditionals. The existing test coverage gave me a confidence boost as I began stepping through the code, copying each section into a separate file for later reference, and writing comments which I felt captured the intent of the removed section. I then grouped related Git and API operations, consolidated common code into appropriately named functions and variables, trimmed unreachable code paths, created a MergeContext struct to encapsulate state, and leaned into Go’s explicit error returns, all of which gave the code a more linear and consistent structure.

As an example, the mergeRun function, which is the heart of the merge command, went from over 220 lines to just 30:

func mergeRun(opts *MergeOptions) error {
    ctx, err := NewMergeContext(opts)
    if err != nil {
        return err
    }

    // no further action is possible when disabling auto merge
    if opts.AutoMergeDisable {
        return ctx.disableAutoMerge()
    }

    ctx.warnIfDiverged()

    if err := ctx.canMerge(); err != nil {
        return err
    }

    if err := ctx.merge(); err != nil {
        return err
    }

    if err := ctx.deleteLocalBranch(); err != nil {
        return err
    }

    if err := ctx.deleteRemoteBranch(); err != nil {
        return err
    }

    return nil
}

When I was finished, I opened a pull request from my fork to the CLI repository, and was blown away by how supportive the code review process was. After a few rounds of feedback, my code was merged and ready to ship in the next release. I was an open source contributor at GitHub!

Returning to my fork, my original Merge Queue changes were now completely out of date. In fact, much of the code I had on my branch no longer existed on the CLI’s trunk branch. Fortunately, I was now intimately familiar with the merge command and was able to make the Merge Queue changes and tests in a subsequent pull request quickly and with confidence.

Lessons learned

Looking back, I learned that searching for the right open source project on my own, trying to create time outside of work, and context switching from my existing projects were obstacles I could not overcome. Instead, the key for me was to find an open source project that was important to what I was already working on, and that I was accountable for. If this sounds familiar, consider asking your manager if you can devote some time to work on an issue in an open source project that you or your team rely on. It’s much easier to get started with an open source project you know and can align with work you’re already committed to. I recognize how fortunate I was to be in the right place at the right time, and with the right support from my manager, but it wasn’t easy.

Many people I know struggle with impostor syndrome, and working in public made me even more aware of mine. I am learning to accept that even though my commits aren’t perfect, and that I’m afraid of being judged for creating bugs like this regression, which will be discoverable forever, I should still contribute. Despite these challenges, I enjoy picking up new issues in the CLI labeled “help wanted (contributions welcome)” whenever I can, and hope you will too!

GitHub Availability Report: March 2022

Post Syndicated from Jakub Oleksy original https://github.blog/2022-04-06-github-availability-report-march-2022/

In March, we experienced a number of incidents that resulted in significant impact and a degraded state of availability for some core GitHub services. This blog post includes a detailed follow-up on a series of incidents that occurred due to degraded database stability, and a distinct incident impacting the Actions service.

Database Stability

Last month, we experienced a number of recurring incidents that impacted the availability of our services. We want to acknowledge the impact this had on our customers, and take this opportunity during our monthly report to provide additional details as a result of further investigations and share what we have learned.

Background

The underlying theme of these issues was resource contention in our mysql1 cluster, which impacted the performance of a large number of our services and features during periods of peak load.

Each of these incidents resulted in a degraded state of availability for write operations on our primary services (including Git, issues, and pull requests). While some read operations were not impacted, any user who performed a write operation that involved our mysql1 cluster was affected, as the database could not handle the load.

After the other services recovered, GitHub Actions queues were saturated. We enabled the queues gradually so they could catch up in real time, and as a result our status page noted the multi-hour outages. When Actions runs are delayed, CI completion and a host of other functions can also be impacted.

What we learned

These incidents were characterized by a burst in load during peak hours of GitHub traffic. During these bursts, our mysql1 cluster was not able to handle the load generated by traffic on the system and we were forced to fail over and take other mitigations, as mentioned in the previous post.

Some of these incidents stemmed from our efforts to improve visibility on the database, but all of them were related to the low amount of headroom we had on our primary database and thus its susceptibility to a few poorly performing queries.

Optimizing for stability

Because of this, even after we mitigated the initial causes of downtime due to poor query performance, we were still running with low headroom and decided to take a proactive approach to managing load by intentionally slowing down services during peak hours. Furthermore, we took a calculated approach to increase capacity on the database by further optimizing queries.

Rather than risk another site outage, we established lower performance alerting thresholds on the database and proactively throttled webhooks and Actions services (the two largest drivers of automated load on the system) as we approached unsafe margins of error on March 14 14:43 UTC. We understood the potential impact to our customers, but decided it would be safer to proactively limit load on the system rather than risk another outage on multiple services.

In the meantime, we implemented a series of optimizations between March 14 and March 28 that drove queries per second on this database down by over 50% and reduced our transaction volume by 70% at peak load times. Through these performance optimizations, we became more confident in our headroom, but given ongoing investigations, we did not want to chance any unwarranted impacts.

Minimizing impact to our users

After the incidents mentioned above, we took steps to make sure we would be in a position, if necessary, to shut down any services driving high peak load. This meant taking maintenance windows for three services starting on March 24. We proactively paused migrations and team synchronization during peak load due to their potential impact.

We also took maintenance windows for GitHub Actions even though we did not actually throttle any actions and no customers were impacted during these windows. We did this in order to proactively notify customers of possible disruption. While it didn’t end up being the case, we knew we would need to throttle GitHub Actions if we saw any significant database degradation during these time windows. While this may have caused uncertainty for some customers, we wanted to prepare them for any potential impact.

Next steps

Immediate changes

In addition to the improvements mentioned above, we have significantly reduced our database performance alerting thresholds so that we are not “running hot” and will be well positioned to take action before customers are impacted.

We have also accelerated work that was already in progress to continue to shard this particular cluster and apply the learnings from this incident to other clusters that already exist outside of mysql1.

Additional technical and organizational initiatives

Due to the nature of this incident, we have also dedicated a team of engineers to study our internal processes and procedures, observability, and change release processes. While we’re still actively revisiting this incident, we feel confident that we have mitigated the initial issues and that we have the correct alerting and processes in place to make a recurrence unlikely.

We understand that the Actions service is critical to many of our customers. With new and ongoing investments across architecture and processes, we’ll continue to bring focus specifically to Actions reliability, including more graceful degradations when other GitHub services are experiencing issues, as well as faster recovery times.

March 29 10:26 UTC (lasting 57 minutes)

During an operation to move GitHub Actions and checks data to its own dedicated, sharded database cluster, a misconfiguration on the new database cluster caused the application to encounter errors. Once we reverted our changes, we were able to recover. This incident resulted in the failure or delay of some queued jobs for a period of time. Once mitigation was initiated, jobs that were queued during the incident were run successfully after the issue was resolved.

The Actions and checks data resides in a multi-tenant database cluster. As part of our efforts to improve reliability and scale, we have been working on functionally partitioning the Actions data to its own sharded database cluster. The switch over to the new cluster involves gradually switching over reads and then switching over writes. Immediately after switching the write traffic, we noticed Actions SLOs were breached and initiated a revert back to the old database. After we reverted back to the old database, we saw an immediate improvement in availability.

Upon further investigation, we discovered that update and delete queries were processed correctly on the new cluster, but insert queries were failing because of missing permissions on the new cluster. All changes processed on the new cluster were replicated back to the old cluster before the switch back, ensuring data integrity.

We have paused any attempts for migrations until we fully investigate and apply our learnings. Furthermore, due to the risk associated with these operations, we will no longer be attempting them during peak traffic hours, which occur between 12:00 and 21:00 UTC. From a technical perspective, we’re looking to scrutinize and improve our operational workflows for these database operations. Additionally, we are going to be performing an audit of our configurations and topology across our environment, to ensure we have properly covered them in our testing strategy. As part of these efforts, we uncovered a gap where we need to extend our pre-migration checklist with a step to verify permissions more thoroughly.

In summary

Every month we share an update on GitHub’s availability, including a description of any incidents that may have occurred and an update on how we are evolving our engineering systems and practices in response. Our hope is that by increasing our transparency and sharing what we’ve learned, everyone can gain from our experiences. At GitHub, we take the trust you place in us very seriously, and we hope this is a way for you to help hold us accountable for continuously improving our operational excellence, as well as our product functionality.

To learn more about our efforts to make GitHub more resilient every day, check out the GitHub engineering blog.