Tag Archives: апи

That's just how it is here, or the Road Infrastructure Agency (API) at a seminar

Post Syndicated from Боян Юруков original https://yurukov.net/blog/2021/api-seminar/

I had decided not to write about this incident, but because of what happened at the end, I concluded they deserve it.

I took a few days off, and on the next-to-last day the hotel was nearly empty. Then I noticed quite a few cars arriving, among them at least four belonging to the Road Infrastructure Agency (API), as well as several overpriced SUVs. I figured they were about to hold a conference, a seminar, or something of the sort.

And so it turned out. They filled the conference room with some 80 people for two or three hours, after which they headed for the pool. At first I wasn't sure what annoyed me: that they were burning serious public money on a five-star hotel, that everything suddenly filled to 100% capacity, or that they didn't even bother to look like they were at an official event.

I quickly thought better of it and made my peace. The hotel could just as well have filled up with other guests; besides, it was their first day and they might have had a different program for the following ones. Even the fact that they were spending this much on a five-star hotel amid serious scandals over the agency's wastefulness I brushed aside, since it pales next to the billions they have handed out without justification, tender, or even actual construction. After all, it is also a form of bonus for people in public-sector jobs which, unless they are part of some scheme, are undeniably rather thankless. Every office and organization needs some form of team building, especially after the past year, and despite all the reservations I had about them. I waved it all off.

I tried not to pay them any attention, but I couldn't escape their conversations. They clearly didn't much care who could hear them. The topics started with the everyday: how expensive Greece is right now, how much prices have jumped; one complained how few people at the agency had been vaccinated, while nobody wore a mask anywhere in the offices or at the conference.

Then they moved on to gossip: who had what side business, which boss had managed to bring his mistress to the hotel and which had not, because he had set her up with a job in the administration and only the architecture people were here; then which boss wouldn't even look at a car under six figures, and so on. Broadly, they had split into three groups: those playing the big shots (mostly men, but also two women), the young ones actively enjoying the holiday, and the "aunties and uncles" who also enjoyed the pool but were visibly tired of the whole drama.

The big shots looked rather worried the whole time and were constantly on the phone. One insisted on shouting, right next to me, about how they had dared to publish something electronically when it was supposed to stay locked in a folder in his desk. Two others kept giving instructions about what should be signed on their behalf "as if they were there." One woman was insisting that they had to make it look as though she had sat on a project approval committee together with two others, who were downing cocktails at the bar, and that the employee back at the office had to sign the protocol with the signatures of all three, because "we've already agreed on everything, that's how it has to go through." There was also talk about the expected bonuses and how nothing could be stopped, because "it's all already arranged and signed."

All these offhand remarks and occasional slips revealed a well-oiled machine that apparently works even from the sauna and has no particular worries about who hears what. Overall, a mix of ordinary people who just want to do their jobs in the situation they have been put in by other people, people who hand out hundreds of millions with a single signature and don't care, because there is no one to hold them to account.

I tried to keep my distance, because after all I wanted a break from this nonsense. I didn't think any of the remarks or the loud orders were concrete enough to deserve attention, and the overall picture is clear to everyone anyway. The reason I am writing all this, however, was the cherry on top at the very end.

Just as I was checking out, several API people were sitting in the lobby discussing the bills, while a hotel employee anxiously shuttled back and forth. The topic was how the invoice should be drawn up and what should be entered on the receipt, and how. In the short time I waited for my own bill, I heard three variations of the statement that the hotel wants to comply with the law and has to invoice everything as it was actually consumed. For their part, the API civil servants surrounding her pointedly threatened to send the tax authorities in for an inspection and warned there would be "very unpleasant situations" if they didn't change the bill.

They all wore very smug grins, which I couldn't resist photographing. I have the right to do so, because even though some of them were in swimsuits, they were all formally at work and were discussing matters related to spending budget funds. I am hiding their faces, but I will gladly hand the photos over to the responsible institution, if anyone is actually interested. I am also hiding the hotel employee, because I genuinely felt sorry for the pressure and threats she was subjected to by civil servants. I hope she didn't give in, but I doubt it. That's how API and quite a few other institutions operate.

What is clear here is that even if these organizations get new figureheads, the schemes, connections, and problems inside remain. Breaking them up will inevitably lead to a mass exodus of staff and a crisis in the administration in question. In other words, it will get worse before it gets better. No one seemed troubled by the replacement of two directors in two months, but they were clearly worried about what comes next.

We, as taxpayers and voters, are also worried about what comes next. The key point here is not only what culture of work, speech, behavior, and intolerance will be imposed at the highest level of government, but also that there be a prosecution service and courts that pursue both petty document fraud like this and the billions gifted to companies close to GERB, DPS, and whoever else they uncover. For the former, we see some chance. The prosecution will take a lot of work and a lot of fighting.

The post Тука е така или с АПИ на семинар first appeared on Блогът на Юруков.

Announcing Rollbacks and API Access for Pages

Post Syndicated from David Song original https://blog.cloudflare.com/rollbacks-and-api-access-for-pages/

A couple of months ago, we announced the general availability of Cloudflare Pages: the easiest way to host and collaboratively develop websites on Cloudflare’s global network. It’s been amazing to see over 20,000 incredible sites built by users and hear your feedback. Since then, we’ve released user-requested features like URL redirects, web analytics, and Access integration.

We’ve been listening to your feedback and today we announce two new features: rollbacks and the Pages API. Deployment rollbacks allow you to host production-level code on Pages without needing to stress about broken builds resulting in website downtime. The API empowers you to create custom functionality and better integrate Pages with your development workflows. Now, it’s even easier to use Pages for production hosting.

Rollbacks

You can now rollback your production website to a previous working deployment with just a click of a button. This is especially useful when you want to quickly undo a new deployment for troubleshooting. Before, developers would have to push another deployment and then wait for the build to finish updating production. Now, you can restore a working version within a few moments by rolling back to a previous working build.

To rollback to a previous build, just click the “Rollback to this deployment” button on either the deployments list menu or on a specific deployment page.

API Access

The Pages API exposes endpoints for you to easily create automations and to integrate Pages within your development workflow. Refer to the API documentation for a full breakdown of the object types and endpoints. To get started, navigate to the Cloudflare API Tokens page and copy your “Global API Key”. Now, you can authenticate and make requests to the API using your email and auth key in the request headers.

For example, here is an API request to get all projects on an account.

Request (example)

curl -X GET "https://api.cloudflare.com/client/v4/accounts/{account_id}/pages/projects" \
     -H "X-Auth-Email: {email}" \
     -H "X-Auth-Key: {auth_key}"

Response (example)

{
  "success": true,
  "errors": [],
  "messages": [],
  "result": {
    "name": "NextJS Blog",
    "id": "7b162ea7-7367-4d67-bcde-1160995d5",
    "created_on": "2017-01-01T00:00:00Z",
    "subdomain": "helloworld.pages.dev",
    "domains": [
      "customdomain.com",
      "customdomain.org"
    ],
    "source": {
      "type": "github",
      "config": {
        "owner": "cloudflare",
        "repo_name": "ninjakittens",
        "production_branch": "main",
        "pr_comments_enabled": true,
        "deployments_enabled": true
      }
    },
    "build_config": {
      "build_command": "npm run build",
      "destination_dir": "build",
      "root_dir": "/",
      "web_analytics_tag": "cee1c73f6e4743d0b5e6bb1a0bcaabcc",
      "web_analytics_token": "021e1057c18547eca7b79f2516f06o7x"
    },
    "deployment_configs": {
      "preview": {
        "env_vars": {
          "BUILD_VERSION": {
            "value": "3.3"
          }
        }
      },
      "production": {
        "env_vars": {
          "BUILD_VERSION": {
            "value": "3.3"
          }
        }
      }
    },
    "latest_deployment": {
      "id": "f64788e9-fccd-4d4a-a28a-cb84f88f6",
      "short_id": "f64788e9",
      "project_id": "7b162ea7-7367-4d67-bcde-1160995d5",
      "project_name": "ninjakittens",
      "environment": "preview",
      "url": "https://f64788e9.ninjakittens.pages.dev",
      "created_on": "2021-03-09T00:55:03.923456Z",
      "modified_on": "2021-03-09T00:58:59.045655",
      "aliases": [
        "https://branchname.projectname.pages.dev"
      ],
      "latest_stage": {
        "name": "deploy",
        "started_on": "2021-03-09T00:55:03.923456Z",
        "ended_on": "2021-03-09T00:58:59.045655",
        "status": "success"
      },
      "env_vars": {
        "BUILD_VERSION": {
          "value": "3.3"
        },
        "ENV": {
          "value": "STAGING"
        }
      },
      "deployment_trigger": {
        "type": "ad_hoc",
        "metadata": {
          "branch": "main",
          "commit_hash": "ad9ccd918a81025731e10e40267e11273a263421",
          "commit_message": "Update index.html"
        }
      },
      "stages": [
        {
          "name": "queued",
          "started_on": "2021-06-03T15:38:15.608194Z",
          "ended_on": "2021-06-03T15:39:03.134378Z",
          "status": "active"
        },
        {
          "name": "initialize",
          "started_on": null,
          "ended_on": null,
          "status": "idle"
        },
        {
          "name": "clone_repo",
          "started_on": null,
          "ended_on": null,
          "status": "idle"
        },
        {
          "name": "build",
          "started_on": null,
          "ended_on": null,
          "status": "idle"
        },
        {
          "name": "deploy",
          "started_on": null,
          "ended_on": null,
          "status": "idle"
        }
      ],
      "build_config": {
        "build_command": "npm run build",
        "destination_dir": "build",
        "root_dir": "/",
        "web_analytics_tag": "cee1c73f6e4743d0b5e6bb1a0bcaabcc",
        "web_analytics_token": "021e1057c18547eca7b79f2516f06o7x"
      },
      "source": {
        "type": "github",
        "config": {
          "owner": "cloudflare",
          "repo_name": "ninjakittens",
          "production_branch": "main",
          "pr_comments_enabled": true,
          "deployments_enabled": true
        }
      }
    },
    "canonical_deployment": {
      "id": "f64788e9-fccd-4d4a-a28a-cb84f88f6",
      "short_id": "f64788e9",
      "project_id": "7b162ea7-7367-4d67-bcde-1160995d5",
      "project_name": "ninjakittens",
      "environment": "preview",
      "url": "https://f64788e9.ninjakittens.pages.dev",
      "created_on": "2021-03-09T00:55:03.923456Z",
      "modified_on": "2021-03-09T00:58:59.045655",
      "aliases": [
        "https://branchname.projectname.pages.dev"
      ],
      "latest_stage": {
        "name": "deploy",
        "started_on": "2021-03-09T00:55:03.923456Z",
        "ended_on": "2021-03-09T00:58:59.045655",
        "status": "success"
      },
      "env_vars": {
        "BUILD_VERSION": {
          "value": "3.3"
        },
        "ENV": {
          "value": "STAGING"
        }
      },
      "deployment_trigger": {
        "type": "ad_hoc",
        "metadata": {
          "branch": "main",
          "commit_hash": "ad9ccd918a81025731e10e40267e11273a263421",
          "commit_message": "Update index.html"
        }
      },
      "stages": [
        {
          "name": "queued",
          "started_on": "2021-06-03T15:38:15.608194Z",
          "ended_on": "2021-06-03T15:39:03.134378Z",
          "status": "active"
        },
        {
          "name": "initialize",
          "started_on": null,
          "ended_on": null,
          "status": "idle"
        },
        {
          "name": "clone_repo",
          "started_on": null,
          "ended_on": null,
          "status": "idle"
        },
        {
          "name": "build",
          "started_on": null,
          "ended_on": null,
          "status": "idle"
        },
        {
          "name": "deploy",
          "started_on": null,
          "ended_on": null,
          "status": "idle"
        }
      ],
      "build_config": {
        "build_command": "npm run build",
        "destination_dir": "build",
        "root_dir": "/",
        "web_analytics_tag": "cee1c73f6e4743d0b5e6bb1a0bcaabcc",
        "web_analytics_token": "021e1057c18547eca7b79f2516f06o7x"
      },
      "source": {
        "type": "github",
        "config": {
          "owner": "cloudflare",
          "repo_name": "ninjakittens",
          "production_branch": "main",
          "pr_comments_enabled": true,
          "deployments_enabled": true
        }
      }
    }
  },
  "result_info": {
    "page": 1,
    "per_page": 100,
    "count": 1,
    "total_count": 1
  }
}

Here’s another quick example using the API to rollback to a previous deployment:

Request (example)

curl -X POST "https://api.cloudflare.com/client/v4/accounts/{account_id}/pages/projects/{project_name}/deployments/{deployment_id}/rollback" \
     -H "X-Auth-Email: {email}" \
     -H "X-Auth-Key: {auth_key"

Response (example)

{
  "success": true,
  "errors": [],
  "messages": [],
  "result": {
    "id": "f64788e9-fccd-4d4a-a28a-cb84f88f6",
    "short_id": "f64788e9",
    "project_id": "7b162ea7-7367-4d67-bcde-1160995d5",
    "project_name": "ninjakittens",
    "environment": "preview",
    "url": "https://f64788e9.ninjakittens.pages.dev",
    "created_on": "2021-03-09T00:55:03.923456Z",
    "modified_on": "2021-03-09T00:58:59.045655",
    "aliases": [
      "https://branchname.projectname.pages.dev"
    ],
    "latest_stage": {
      "name": "deploy",
      "started_on": "2021-03-09T00:55:03.923456Z",
      "ended_on": "2021-03-09T00:58:59.045655",
      "status": "success"
    },
    "env_vars": {
      "BUILD_VERSION": {
        "value": "3.3"
      },
      "ENV": {
        "value": "STAGING"
      }
    },
    "deployment_trigger": {
      "type": "ad_hoc",
      "metadata": {
        "branch": "main",
        "commit_hash": "ad9ccd918a81025731e10e40267e11273a263421",
        "commit_message": "Update index.html"
      }
    },
    "stages": [
      {
        "name": "queued",
        "started_on": "2021-06-03T15:38:15.608194Z",
        "ended_on": "2021-06-03T15:39:03.134378Z",
        "status": "active"
      },
      {
        "name": "initialize",
        "started_on": null,
        "ended_on": null,
        "status": "idle"
      },
      {
        "name": "clone_repo",
        "started_on": null,
        "ended_on": null,
        "status": "idle"
      },
      {
        "name": "build",
        "started_on": null,
        "ended_on": null,
        "status": "idle"
      },
      {
        "name": "deploy",
        "started_on": null,
        "ended_on": null,
        "status": "idle"
      }
    ],
    "build_config": {
      "build_command": "npm run build",
      "destination_dir": "build",
      "root_dir": "/",
      "web_analytics_tag": "cee1c73f6e4743d0b5e6bb1a0bcaabcc",
      "web_analytics_token": "021e1057c18547eca7b79f2516f06o7x"
    },
    "source": {
      "type": "github",
      "config": {
        "owner": "cloudflare",
        "repo_name": "ninjakittens",
        "production_branch": "main",
        "pr_comments_enabled": true,
        "deployments_enabled": true
      }
    }
  }
}

Try out an API request with one of your projects by replacing {account_id}, {project_name}, {deployment_id}, {email}, and {auth_key}. You can find your account_id in the URL address bar by navigating to the Cloudflare Dashboard. (Ex: 41643ed677c7c7gba4x463c4zdb9563c).

Refer to the API documentation for a full breakdown of the object types and endpoints.

Using the Pages API on Workers

The Pages API is even more powerful and simple to use with workers.new. If you haven’t used Workers before, feel free to go through the getting started guide to learn more. You’ll need an existing Pages project to follow along. Here are three examples of automations you can build with the API.

1. Triggering a new build every hour: copy and paste this template into a new Worker, then customize the values such as {account_id}, {project_name}, {auth_key}, and {your_email}.

const endpoint = "https://api.cloudflare.com/client/v4/accounts/{account_id}/pages/projects/{project_name}/deployments";
const email = "{your_email}";
addEventListener("scheduled", (event) => {
  event.waitUntil(handleScheduled(event.scheduledTime));
});
// Trigger a new Pages build by POSTing to the deployments endpoint.
async function handleScheduled(scheduledTime) {
  const init = {
    method: "POST",
    headers: {
      "content-type": "application/json;charset=UTF-8",
      "X-Auth-Email": email,
      // We recommend you store API keys as secrets using the Workers dashboard or using Wrangler as documented here https://developers.cloudflare.com/workers/cli-wrangler/commands#secret
      "X-Auth-Key": API_KEY,
    },
  };
  const response = await fetch(endpoint, init);
  return response;
}

To finish configuring the script, click the back arrow near the top left of the window and click on the settings tab. Then, set an environment variable “API_KEY” with the value of your Cloudflare Global key and click “Encrypt” and then “Save”.

The script just makes a POST request to the deployments endpoint to trigger a new build. Click “Quick edit” to go back to the code editor and finish testing the script. You can test your configuration and make a request by clicking the “Trigger scheduled event” button on the “Schedule” tab, near the tabs labeled “HTTP” and “Preview”. You should see a new queued build on your project in the Pages dashboard. Now, click “Save and Deploy” to publish your work. Finally, go back to the Worker settings page by clicking the back arrow near the top left of the window.

All that’s left to do is set a cron trigger to periodically run this Worker on the “Triggers tab”. Click on “Add Cron Trigger”.

Next, we can input “0 * * * *” to trigger the build every hour.

Finally, click save and your automation using the Pages API will trigger a new build every hour.

2. Deleting old deployments after a week: Pages hosts and serves all project deployments on preview links. Suppose you want to keep your project relatively private and prevent access to old deployments. You can use the API to delete deployments after a week so that they are no longer publicly available. This is easy to do on Workers using Cron Triggers; a sketch of the underlying API calls is shown below.
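Roughly, such a Worker would list a project’s deployments and delete any that are past your cutoff. Here is a hedged sketch of the two API calls involved, reusing the same {account_id}, {project_name}, {email}, {auth_key}, and {deployment_id} placeholders as the earlier examples; the cleanup and date-comparison logic is up to you.

# List deployments; the created_on field tells you how old each one is
curl -X GET "https://api.cloudflare.com/client/v4/accounts/{account_id}/pages/projects/{project_name}/deployments" \
     -H "X-Auth-Email: {email}" \
     -H "X-Auth-Key: {auth_key}"

# Delete a deployment that is older than your cutoff
curl -X DELETE "https://api.cloudflare.com/client/v4/accounts/{account_id}/pages/projects/{project_name}/deployments/{deployment_id}" \
     -H "X-Auth-Email: {email}" \
     -H "X-Auth-Key: {auth_key}"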

3. Sharing project information: Imagine you are working on a development team using Pages to build your websites. You probably want an easy way to share deployment preview links and build status without having to share Cloudflare accounts. Using the API, you can easily share project information, including deployment status and preview links, and serve this content as HTML from a Cloudflare Worker. A sketch of the underlying API call follows.
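For example, a hedged sketch of pulling the latest deployment’s preview URL and build status with curl and jq; the field names mirror the example project response shown earlier in this post, and the placeholders are the same as before.

curl -s -X GET "https://api.cloudflare.com/client/v4/accounts/{account_id}/pages/projects/{project_name}" \
     -H "X-Auth-Email: {email}" \
     -H "X-Auth-Key: {auth_key}" | \
  jq -r '.result.latest_deployment | "\(.url) \(.latest_stage.status)"'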

Find the code snippets for all three examples here.

Conclusion

We will continue making the API more powerful with features such as supporting prebuilt deployments in the future. We are excited to see what you build with the API and hope you enjoy using rollbacks. At Cloudflare, we are committed to building the best developer experience on Pages, and we always appreciate hearing your feedback. Come chat with us and share more feedback on the Workers Discord (We have a dedicated #pages-help channel!).

Building fine-grained authorization using Amazon Cognito, API Gateway, and IAM

Post Syndicated from Artem Lovan original https://aws.amazon.com/blogs/security/building-fine-grained-authorization-using-amazon-cognito-api-gateway-and-iam/

June 5, 2021: We’ve updated Figure 1: User request flow.


Authorizing functionality of an application based on group membership is a best practice. If you’re building APIs with Amazon API Gateway and you need fine-grained access control for your users, you can use Amazon Cognito. Amazon Cognito allows you to use groups to create a collection of users, which is often done to set the permissions for those users. In this post, I show you how to build fine-grained authorization to protect your APIs using Amazon Cognito, API Gateway, and AWS Identity and Access Management (IAM).

As a developer, you’re building a customer-facing application where your users are going to log into your web or mobile application, and as such you will be exposing your APIs through API Gateway with upstream services. The APIs could be deployed on Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), AWS Lambda, or Elastic Load Balancing where each of these options will forward the request to your Amazon Elastic Compute Cloud (Amazon EC2) instances. Additionally, you can use on-premises services that are connected to your Amazon Web Services (AWS) environment over an AWS VPN or AWS Direct Connect. It’s important to have fine-grained controls for each API endpoint and HTTP method. For instance, the user should be allowed to make a GET request to an endpoint, but should not be allowed to make a POST request to the same endpoint. As a best practice, you should assign users to groups and use group membership to allow or deny access to your API services.

Solution overview

In this blog post, you learn how to use an Amazon Cognito user pool as a user directory and let users authenticate and acquire the JSON Web Token (JWT) to pass to API Gateway. The JWT is used to identify what group the user belongs to, and the mapping of a group to an IAM policy determines the access rights that group is granted.

Note: The solution works similarly if Amazon Cognito would be federating users with an external identity provider (IdP)—such as Ping, Active Directory, or Okta—instead of being an IdP itself. To learn more, see Adding User Pool Sign-in Through a Third Party. Additionally, if you want to use groups from an external IdP to grant access, Role-based access control using Amazon Cognito and an external identity provider outlines how to do so.

The following figure shows the basic architecture and information flow for user requests.

Figure 1: User request flow

Let’s go through the request flow to understand what happens at each step, as shown in Figure 1:

  1. A user logs in and acquires an Amazon Cognito JWT ID token, access token, and refresh token. To learn more about each token, see using tokens with user pools.
  2. A RestAPI request is made and a bearer token—in this solution, an access token—is passed in the headers.
  3. API Gateway forwards the request to a Lambda authorizer—also known as a custom authorizer.
  4. The Lambda authorizer verifies the Amazon Cognito JWT using the Amazon Cognito public key. On initial Lambda invocation, the public key is downloaded from Amazon Cognito and cached. Subsequent invocations will use the public key from the cache.
  5. The Lambda authorizer looks up the Amazon Cognito group that the user belongs to in the JWT and does a lookup in Amazon DynamoDB to get the policy that’s mapped to the group.
  6. Lambda returns the policy and—optionally—context to API Gateway. The context is a map containing key-value pairs that you can pass to the upstream service. It can be additional information about the user, the service, or anything that provides additional information to the upstream service.
  7. The API Gateway policy engine evaluates the policy.

    Note: Lambda isn’t responsible for understanding and evaluating the policy. That responsibility falls on the native capabilities of API Gateway.

  8. The request is forwarded to the service.

Note: To further optimize Lambda authorizer, the authorization policy can be cached or disabled, depending on your needs. By enabling cache, you could improve the performance as the authorization policy will be returned from the cache whenever there is a cache key match. To learn more, see Configure a Lambda authorizer using the API Gateway console.

Let’s have a closer look at the following example policy that is stored as part of an item in DynamoDB.

{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Sid":"PetStore-API",
         "Effect":"Allow",
         "Action":"execute-api:Invoke",
         "Resource":[
            "arn:aws:execute-api:*:*:*/*/*/petstore/v1/*",
            "arn:aws:execute-api:*:*:*/*/GET/petstore/v2/status"
         ],
         "Condition":{
            "IpAddress":{
               "aws:SourceIp":[
                  "192.0.2.0/24",
                  "198.51.100.0/24"
               ]
            }
         }
      }
   ]
}

Based on this example policy, the user is allowed to make calls to the petstore API. For version v1, the user can make requests to any verb and any path, which is expressed by an asterisk (*). For v2, the user is only allowed to make a GET request for path /status. To learn more about how the policies work, see Output from an Amazon API Gateway Lambda authorizer.
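To make that concrete, here is a hedged illustration of which calls this policy permits, assuming a valid access token in $TOKEN, a source IP allowed by the policy’s IpAddress condition, and your own API Gateway invoke URL; exactly how the token must be passed depends on how the Lambda authorizer parses the Authorization header.

# Allowed: any verb, any path under petstore/v1
curl -X POST "https://{api-id}.execute-api.{region}.amazonaws.com/{stage}/petstore/v1/pets" \
     -H "Authorization: $TOKEN"

# Allowed: GET, and only the /petstore/v2/status path, for v2
curl -X GET "https://{api-id}.execute-api.{region}.amazonaws.com/{stage}/petstore/v2/status" \
     -H "Authorization: $TOKEN"

# Denied: POST to the v2 status path is not covered by any Resource entry
curl -X POST "https://{api-id}.execute-api.{region}.amazonaws.com/{stage}/petstore/v2/status" \
     -H "Authorization: $TOKEN"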

Getting started

For this solution, you need the following prerequisites:

  • The AWS Command Line Interface (CLI) installed and configured for use.
  • Python 3.6 or later, to package Python code for Lambda

    Note: We recommend that you use a virtual environment or virtualenvwrapper to isolate the solution from the rest of your Python environment.

  • An IAM role or user with enough permissions to create an Amazon Cognito user pool, IAM roles and policies, Lambda functions, an API Gateway instance, and a DynamoDB table.
  • The GitHub repository for the solution. You can download it, or you can use the following Git command to download it from your terminal.

    Note: This sample code should be used to test out the solution and is not intended to be used in a production account.

     $ git clone https://github.com/aws-samples/amazon-cognito-api-gateway.git
     $ cd amazon-cognito-api-gateway
    

    Use the following command to package the Python code for deployment to Lambda.

     $ bash ./helper.sh package-lambda-functions
     …
     Successfully completed packaging files.
    

To implement this reference architecture, you will be utilizing the following services: Amazon Cognito, Amazon API Gateway, AWS Lambda, and Amazon DynamoDB.

Note: This solution was tested in the us-east-1, us-east-2, us-west-2, ap-southeast-1, and ap-southeast-2 Regions. Before selecting a Region, verify that the necessary services—Amazon Cognito, API Gateway, and Lambda—are available in those Regions.

Let’s review each service, and how those will be used, before creating the resources for this solution.

Amazon Cognito user pool

A user pool is a user directory in Amazon Cognito. With a user pool, your users can log in to your web or mobile app through Amazon Cognito. You use the Amazon Cognito user directory directly, as this sample solution creates an Amazon Cognito user. However, your users can also log in through social IdPs, OpenID Connect (OIDC), and SAML IdPs.

Lambda as backing API service

Initially, you create a Lambda function that serves your APIs. API Gateway forwards all requests to the Lambda function to serve up the requests.

An API Gateway instance and integration with Lambda

Next, you create an API Gateway instance and integrate it with the Lambda function you created. This API Gateway instance serves as an entry point for the upstream service. The bash command in the next section creates an Amazon Cognito user pool, a Lambda function, and an API Gateway instance, then configures proxy integration with Lambda and deploys an API Gateway stage.

Deploy the sample solution

From within the directory where you downloaded the sample code from GitHub, run the following command to generate a random Amazon Cognito user password and create the resources described in the previous section.

 $ bash ./helper.sh cf-create-stack-gen-password
 ...
 Successfully created CloudFormation stack.

When the command is complete, it returns a message confirming successful stack creation.

Validate Amazon Cognito user creation

To validate that an Amazon Cognito user has been created successfully, run the following command to open the Amazon Cognito UI in your browser and then log in with your credentials.

Note: When you run this command, it returns the user name and password that you should use to log in.

 $ bash ./helper.sh open-cognito-ui
  Opening Cognito UI. Please use following credentials to login:
  Username: cognitouser
  Password: xxxxxxxx

Alternatively, you can open the CloudFormation stack and get the Amazon Cognito hosted UI URL from the stack outputs. The URL is the value assigned to the CognitoHostedUiUrl variable.

Figure 2: CloudFormation Outputs – CognitoHostedUiUrl

Validate Amazon Cognito JWT upon login

Since we haven’t installed a web application that would respond to the redirect request, Amazon Cognito will redirect to localhost, which might look like an error. The key aspect is that after a successful log in, there is a URL similar to the following in the navigation bar of your browser:

http://localhost/#id_token=eyJraWQiOiJicVhMYWFlaTl4aUhzTnY3W...

Test the API configuration

Before you protect the API with Amazon Cognito so that only authorized users can access it, let’s verify that the configuration is correct and the API is served by API Gateway. The following command makes a curl request to API Gateway to retrieve data from the API service.

 $ bash ./helper.sh curl-api
{"pets":[{"id":1,"name":"Birds"},{"id":2,"name":"Cats"},{"id":3,"name":"Dogs"},{"id":4,"name":"Fish"}]}

The expected result is that the response will be a list of pets. In this case, the setup is correct: API Gateway is serving the API.

Protect the API

To protect your API, the following is required:

  1. DynamoDB to store the policy that will be evaluated by the API Gateway to make an authorization decision.
  2. A Lambda function to verify the user’s access token and look up the policy in DynamoDB.

Let’s review all the services before creating the resources.

Lambda authorizer

A Lambda authorizer is an API Gateway feature that uses a Lambda function to control access to an API. You use a Lambda authorizer to implement a custom authorization scheme that uses a bearer token authentication strategy. When a client makes a request to one of the API operations, API Gateway calls the Lambda authorizer. The Lambda authorizer takes the identity of the caller as input and returns an IAM policy as output. The output is the policy that is stored in DynamoDB and evaluated by API Gateway. If there is no policy mapped to the caller identity, Lambda generates a deny policy and the request is denied.
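For reference, a Lambda authorizer response has roughly the following shape (a trimmed sketch: the policyDocument is whatever was fetched from DynamoDB, and the context keys here are only illustrative):

{
    "principalId": "cognitouser",
    "policyDocument": {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "execute-api:Invoke",
                "Resource": "arn:aws:execute-api:*:*:*/*/*/petstore/v1/*"
            }
        ]
    },
    "context": {
        "group": "pet-veterinarian"
    }
}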

DynamoDB table

DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale. This is ideal for this use case to ensure that the Lambda authorizer can quickly process the bearer token, look up the policy, and return it to API Gateway. To learn more, see Control access for invoking an API.

The final step is to create the DynamoDB table for the Lambda authorizer to look up the policy, which is mapped to an Amazon Cognito group.

Figure 3 illustrates an item in DynamoDB. Key attributes are:

  • Group, which is used to look up the policy.
  • Policy, which is returned to API Gateway to evaluate the policy.

Figure 3: DynamoDB item

Based on this policy, the user that is part of the Amazon Cognito group pet-veterinarian is allowed to make API requests to endpoints https://<domain>/<api-gateway-stage>/petstore/v1/* and https://<domain>/<api-gateway-stage>/petstore/v2/status for GET requests only.
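As a rough sketch of the lookup the authorizer performs, an equivalent AWS CLI query could look like the following. The table name and attribute names here are assumptions for illustration only; check the CloudFormation template in the repository for the real ones.

# Hypothetical table and attribute names
aws dynamodb get-item \
    --table-name iam-policy-table \
    --key '{"Group": {"S": "pet-veterinarian"}}' \
    --query 'Item.Policy.S' \
    --output text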

Update and create resources

Run the following command to update existing resources and create a Lambda authorizer and DynamoDB table.

 $ bash ./helper.sh cf-update-stack
Successfully updated CloudFormation stack.

Test the custom authorizer setup

Begin your testing with the following request, which doesn’t include an access token.

$ bash ./helper.sh curl-api
{"message":"Unauthorized"}

The request is denied with the message Unauthorized. At this point, Amazon API Gateway expects a header named Authorization (case sensitive) in the request. If there’s no Authorization header, the request is denied before it reaches the Lambda authorizer. This is a way to filter out requests that don’t include the required information.

Use the following command for the next test. In this test, you pass the required header, but the token is invalid because it wasn’t issued by Amazon Cognito; it’s a simple JWT-format token stored in ./helper.sh. To learn more about how to decode and validate a JWT, see decode and verify an Amazon Cognito JSON Web Token.

$ bash ./helper.sh curl-api-invalid-token
{"Message":"User is not authorized to access this resource"}

This time the message is different. The Lambda authorizer received the request and identified the token as invalid and responded with the message User is not authorized to access this resource.

To make a successful request to the protected API, your code will need to perform the following steps (a rough command-line sketch follows the list):

  1. Use a user name and password to authenticate against your Amazon Cognito user pool.
  2. Acquire the tokens (id token, access token, and refresh token).
  3. Make an HTTPS (TLS) request to API Gateway and pass the access token in the headers.
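Here is a minimal sketch of those three steps from a shell. It assumes the user pool app client has the USER_PASSWORD_AUTH flow enabled and that the placeholders are replaced with your own values; the sample’s helper.sh performs the equivalent for you, and how the token is parsed out of the Authorization header is up to the Lambda authorizer.

# 1-2. Authenticate against the user pool and extract the access token
ACCESS_TOKEN=$(aws cognito-idp initiate-auth \
    --auth-flow USER_PASSWORD_AUTH \
    --client-id {app_client_id} \
    --auth-parameters USERNAME=cognitouser,PASSWORD={password} \
    --query 'AuthenticationResult.AccessToken' \
    --output text)

# 3. Call API Gateway over HTTPS with the access token in the Authorization header
curl -s "https://{api-id}.execute-api.{region}.amazonaws.com/{stage}/petstore/v1/pets" \
     -H "Authorization: $ACCESS_TOKEN"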

Before the request is forwarded to the API service, API Gateway receives the request and passes it to the Lambda authorizer. The authorizer performs the following steps. If any of the steps fail, the request is denied.

  1. Retrieve the public keys from Amazon Cognito.
  2. Cache the public keys so the Lambda authorizer doesn’t have to make additional calls to Amazon Cognito as long as the Lambda execution environment isn’t shut down.
  3. Use public keys to verify the access token.
  4. Look up the policy in DynamoDB.
  5. Return the policy to API Gateway.

The access token has claims such as Amazon Cognito assigned groups, user name, token use, and others, as shown in the following example (some fields removed).

{
    "sub": "00000000-0000-0000-0000-0000000000000000",
    "cognito:groups": [
        "pet-veterinarian"
    ],
...
    "token_use": "access",
    "scope": "openid email",
    "username": "cognitouser"
}

Finally, let’s programmatically log in to Amazon Cognito UI, acquire a valid access token, and make a request to API Gateway. Run the following command to call the protected API.

$ bash ./helper.sh curl-protected-api
{"pets":[{"id":1,"name":"Birds"},{"id":2,"name":"Cats"},{"id":3,"name":"Dogs"},{"id":4,"name":"Fish"}]}

This time, you receive a response with data from the API service. Let’s examine the steps that the example code performed:

  1. Lambda authorizer validates the access token.
  2. Lambda authorizer looks up the policy in DynamoDB based on the group name that was retrieved from the access token.
  3. Lambda authorizer passes the IAM policy back to API Gateway.
  4. API Gateway evaluates the IAM policy and the final effect is an allow.
  5. API Gateway forwards the request to Lambda.
  6. Lambda returns the response.

Let’s continue to test our policy from Figure 3. In the policy document, arn:aws:execute-api:*:*:*/*/GET/petstore/v2/status is the only endpoint for version V2, which means requests to endpoint /GET/petstore/v2/pets should be denied. Run the following command to test this.

 $ bash ./helper.sh curl-protected-api-not-allowed-endpoint
{"Message":"User is not authorized to access this resource"}

Note: Now that you understand fine-grained access control using an Amazon Cognito user pool, API Gateway, and a Lambda function, and you have finished testing it out, you can run the following command to clean up all the resources associated with this solution:

 $ bash ./helper.sh cf-delete-stack

Advanced IAM policies to further control your API

With IAM, you can create advanced policies to further refine access to your APIs. You can learn more about condition keys that can be used in API Gateway, their use in an IAM policy with conditions, and how policy evaluation logic determines whether to allow or deny a request.

Summary

In this post, you learned how IAM and Amazon Cognito can be used to provide fine-grained access control for your API behind API Gateway. You can use this approach to transparently apply fine-grained control to your API, without having to modify the code in your API, and create advanced policies by using IAM condition keys.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Amazon Cognito forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Artem Lovan

Artem is a Senior Solutions Architect based in New York. He helps customers architect and optimize applications on AWS. He has been involved in IT at many levels, including infrastructure, networking, security, DevOps, and software development.

Summarize devices that are not reachable

Post Syndicated from Aigars Kadiķis original https://blog.zabbix.com/summarize-devices-that-are-not-reachable/13219/

In this lab, we will list all devices which are not reachable by a monitoring tool. This is useful when we want to improve the overall monitoring experience and decrease the size of the queue (metrics which have not arrived at the instance).

Tools required for the job: Access to a database server or a Windows computer with PowerShell

To summarize devices that are not reachable at the moment, we can use a database query. Tested and working on Zabbix 4.0 and 5.0, on both MySQL and PostgreSQL:

SELECT hosts.host,
       interface.ip,
       interface.dns,
       interface.useip,
       CASE interface.type
           WHEN 1 THEN 'ZBX'
           WHEN 2 THEN 'SNMP'
           WHEN 3 THEN 'IPMI'
           WHEN 4 THEN 'JMX'
       END AS "type",
       hosts.error
FROM hosts
JOIN interface ON interface.hostid=hosts.hostid
WHERE hosts.available=2
  AND interface.main=1
  AND hosts.status=0;
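If you prefer to run the query non-interactively from a shell, a minimal sketch could look like this, assuming the query is saved as unreachable-hosts.sql and the default zabbix database and user are in place:

# MySQL
mysql -u zabbix -p zabbix < unreachable-hosts.sql

# PostgreSQL
psql -U zabbix -d zabbix -f unreachable-hosts.sql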

A very similar (but not exactly the same) outcome can be obtained via Windows PowerShell by contacting the Zabbix API. Try this snippet:

$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Content-Type", "application/json")
$url = 'http://192.168.1.101/api_jsonrpc.php'
$user = 'api'
$password = 'zabbix'

# authorization
$key = Invoke-RestMethod $url -Method 'POST' -Headers $headers -Body "
{
    `"jsonrpc`": `"2.0`",
    `"method`": `"user.login`",
    `"params`": {
        `"user`": `"$user`",
        `"password`": `"$password`"
    },
    `"id`": 1
}
" | foreach { $_.result }
echo $key

# filter out unreachable Agent, SNMP, JMX, IPMI hosts
Invoke-RestMethod $url -Method 'POST' -Headers $headers -Body "
{
    `"jsonrpc`": `"2.0`",
    `"method`": `"host.get`",
    `"params`": {
        `"output`": [`"interfaces`",`"host`",`"proxy_hostid`",`"disable_until`",`"lastaccess`",`"errors_from`",`"error`"],
        `"selectInterfaces`": `"extend`",
        `"filter`": {`"available`": `"2`",`"status`":`"0`"}
    },
    `"auth`": `"$key`",
    `"id`": 1
}
" | foreach { $_.result }  | foreach { $_.interfaces } | Out-GridView

# log out
Invoke-RestMethod $url -Method 'POST' -Headers $headers -Body "
{
    `"jsonrpc`": `"2.0`",
    `"method`": `"user.logout`",
    `"params`": [],
    `"id`": 1,
    `"auth`": `"$key`"
}
"

Set a valid credential (URL, username, password) on the top of the code before executing it.

The benefit of PowerShell here is that we can apply some on-the-fly filtering in the Out-GridView window.

We can understand the exact meaning of the field ‘type’ by looking at the previous database query:

       CASE interface.type
           WHEN 1 THEN 'ZBX'
           WHEN 2 THEN 'SNMP'
           WHEN 3 THEN 'IPMI'
           WHEN 4 THEN 'JMX'
       END AS "type",

On Windows PowerShell, it is possible to export the unreachable hosts directly to a CSV file. To do that, in the code above, we need to change:

Out-GridView

to

Export-Csv c:\temp\unavailable.hosts.csv

Alright, this was the knowledge bit today. Let’s keep Zabbixing!

A Byzantine failure in the real world

Post Syndicated from Tom Lianza original https://blog.cloudflare.com/a-byzantine-failure-in-the-real-world/

A Byzantine failure in the real world

An analysis of the Cloudflare API availability incident on 2020-11-02

When we review design documents at Cloudflare, we are always on the lookout for Single Points of Failure (SPOFs). Eliminating these is a necessary step in architecting a system you can be confident in. Ironically, when you’re designing a system with built-in redundancy, you spend most of your time thinking about how well it functions when that redundancy is lost.

On November 2, 2020, Cloudflare had an incident that impacted the availability of the API and dashboard for six hours and 33 minutes. During this incident, the success rate for queries to our API periodically dipped as low as 75%, and the dashboard experience was as much as 80 times slower than normal. While Cloudflare’s edge is massively distributed across the world (and kept working without a hitch), Cloudflare’s control plane (API & dashboard) is made up of a large number of microservices that are redundant across two regions. For most services, the databases backing those microservices are only writable in one region at a time.

Each of Cloudflare’s control plane data centers has multiple racks of servers. Each of those racks has two switches that operate as a pair—both are normally active, but either can handle the load if the other fails. Cloudflare survives rack-level failures by spreading the most critical services across racks. Every piece of hardware has two or more power supplies with different power feeds. Every server that stores critical data uses RAID 10 redundant disks or storage systems that replicate data across at least three machines in different racks, or both. Redundancy at each layer is something we review and require. So—how could things go wrong?

In this post we present a timeline of what happened, and how a difficult failure mode known as a Byzantine fault played a role in a cascading series of events.

2020-11-02 14:43 UTC: Partial Switch Failure

At 14:43, a network switch started misbehaving. Alerts began firing about the switch being unreachable to pings. The device was in a partially operating state: network control plane protocols such as LACP and BGP remained operational, while others, such as vPC, were not. The vPC link is used to synchronize ports across multiple switches, so that they appear as one large, aggregated switch to servers connected to them. At the same time, the data plane (or forwarding plane) was not processing and forwarding all the packets received from connected devices.

This failure scenario is completely invisible to the connected nodes, as each server only sees an issue for some of its traffic due to the load-balancing nature of LACP. Had the switch failed fully, all traffic would have failed over to the peer switch, as the connected links would’ve simply gone down, and the ports would’ve dropped out of the forwarding LACP bundles.

Six minutes later, the switch recovered without human intervention. But this odd failure mode led to further problems that lasted long after the switch had returned to normal operation.

2020-11-02 14:44 UTC: etcd Errors begin

The rack with the misbehaving switch included one server in our etcd cluster. We use etcd heavily in our core data centers whenever we need strongly consistent data storage that’s reliable across multiple nodes.

In the event that the cluster leader fails, etcd uses the RAFT protocol to maintain consistency and establish consensus to promote a new leader. In the RAFT protocol, cluster members are assumed to be either available or unavailable, and to provide accurate information or none at all. This works fine when a machine crashes, but is not always able to handle situations where different members of the cluster have conflicting information.

In this particular situation:

  • Network traffic between node 1 (in the affected rack) and node 3 (the leader) was being sent through the switch in the degraded state,
  • Network traffic between node 1 and node 2 was going through its working peer, and
  • Network traffic between node 2 and node 3 was unaffected.

This caused cluster members to have conflicting views of reality, known in distributed systems theory as a Byzantine fault. As a consequence of this conflicting information, node 1 repeatedly initiated leader elections, voting for itself, while node 2 repeatedly voted for node 3, which it could still connect to. This resulted in ties that did not promote a leader node 1 could reach. RAFT leader elections are disruptive, blocking all writes until they’re resolved, so this made the cluster read-only until the faulty switch recovered and node 1 could once again reach node 3.

A Byzantine failure in the real world

2020-11-02 14:45 UTC: Database system promotes a new primary database

Cloudflare’s control plane services use relational databases hosted across multiple clusters within a data center. Each cluster is configured for high availability. The cluster setup includes a primary database, a synchronous replica, and one or more asynchronous replicas. This setup allows redundancy within a data center. For cross-datacenter redundancy, a similar high availability secondary cluster is set up and replicated in a geographically dispersed data center for disaster recovery. The cluster management system leverages etcd for cluster member discovery and coordination.

When etcd became read-only, two clusters were unable to communicate that they had a healthy primary database. This triggered the automatic promotion of a synchronous database replica to become the new primary. This process happened automatically and without error or data loss.

There was a defect in our cluster management system that requires a rebuild of all database replicas when a new primary database is promoted. So, although the new primary database was available instantly, the replicas would take considerable time to become available, depending on the size of the database. For one of the clusters, service was restored quickly. Synchronous and asynchronous database replicas were rebuilt and started replicating successfully from primary, and the impact was minimal.

For the other cluster, however, performant operation of that database required a replica to be online. Because this database handles authentication for API calls and dashboard activities, it takes a lot of reads, and one replica was heavily utilized to spare the primary the load. When this failover happened and no replicas were available, the primary was overloaded, as it had to take all of the load. This is when the main impact started.

Reduce Load, Leverage Redundancy

At this point we saw that our primary authentication database was overwhelmed and began shedding load from it. We dialed back the rate at which we push SSL certificates to the edge, send emails, and other features, to give it space to handle the additional load. Unfortunately, because of its size, we knew it would take several hours for a replica to be fully rebuilt.

A silver lining here is that every database cluster in our primary data center also has online replicas in our secondary data center. Those replicas are not part of the local failover process, and were online and available throughout the incident. The process of steering read-queries to those replicas was not yet automated, so we manually diverted API traffic that could leverage those read replicas to the secondary data center. This substantially improved our API availability.

The Dashboard

The Cloudflare dashboard, like most web applications, has the notion of a user session. When user sessions are created (each time a user logs in) we perform some database operations and keep data in a Redis cluster for the duration of that user’s session. Unlike our API calls, our user sessions cannot currently be moved across the ocean without disruption. As we took actions to improve the availability of our API calls, we were unfortunately making the user experience on the dashboard worse.

This is an area of the system that is currently designed to be able to fail over across data centers in the event of a disaster, but has not yet been designed to work in both data centers at the same time. After a first period in which users on the dashboard became increasingly frustrated, we failed the authentication calls fully back to our primary data center, and kept working on our primary database to ensure we could provide the best service levels possible in that degraded state.

2020-11-02 21:20 UTC Database Replica Rebuilt

The instant the first database replica rebuilt, it put itself back into service, and performance resumed to normal levels. We re-ramped all of the services that had been turned down, so all asynchronous processing could catch up, and after a period of monitoring marked the end of the incident.

Redundant Points of Failure

The cascade of failures in this incident was interesting because each system, on its face, had redundancy. Moreover, no system fully failed—each entered a degraded state. That combination meant the chain of events that transpired was considerably harder to model and anticipate. It was frustrating yet reassuring that some of the possible failure modes were already being addressed.

A team was already working on fixing the limitation that requires a database replica rebuild upon promotion. Our user sessions system was inflexible in scenarios where we’d like to steer traffic around, and redesigning that was already in progress.

This incident also led us to revisit the configuration parameters we put in place for things that auto-remediate. In previous years, promoting a database replica to primary took far longer than we liked, so getting that process automated and able to trigger on a minute’s notice was a point of pride. At the same time, for at least one of our databases, the cure may be worse than the disease, and in fact we may not want to invoke the promotion process so quickly. Immediately after this incident we adjusted that configuration accordingly.

Byzantine Fault Tolerance (BFT) is a hot research topic. Solutions have been known since 1982, but have had to choose between a variety of engineering tradeoffs, including security, performance, and algorithmic simplicity. Most general-purpose cluster management systems choose to forgo BFT entirely and use protocols based on PAXOS, or simplifications of PAXOS such as RAFT, that perform better and are easier to understand than BFT consensus protocols. In many cases, a simple protocol that is known to be vulnerable to a rare failure mode is safer than a complex protocol that is difficult to implement correctly or debug.

The first uses of BFT consensus were in safety-critical systems such as aircraft and spacecraft controls. These systems typically have hard real time latency constraints that require tightly coupling consensus with application logic in ways that make these implementations unsuitable for general-purpose services like etcd. Contemporary research on BFT consensus is mostly focused on applications that cross trust boundaries, which need to protect against malicious cluster members as well as malfunctioning cluster members. These designs are more suitable for implementing general-purpose services such as etcd, and we look forward to collaborating with researchers and the open source community to make them suitable for production cluster management.

We are very sorry for the difficulty the outage caused, and are continuing to improve as our systems grow. We’ve since fixed the bug in our cluster management system, and are continuing to tune each of the systems involved in this incident to be more resilient to failures of their dependencies.  If you’re interested in helping solve these problems at scale, please visit cloudflare.com/careers.

Close problem automatically via Zabbix API

Post Syndicated from Aigars Kadiķis original https://blog.zabbix.com/close-problem-automatically-via-zabbix-api/12461/

Today we are talking about a use case where it’s impossible to find a proper way to write a recovery expression for the Zabbix trigger. In other words, we know how to identify problems, but there is no good way to detect when the problem is gone.

This mostly relates to a huge environment, for example:

  • Got one log file. There are hundreds of patterns inside. We respect all of them. We need them
  • SNMP trap item (snmptrap.fallback) with different patterns being written inside

In these situations, the trigger is most likely configured to “Event generation mode: Multiple.” This practically means: when a “problematic metric” hits the instance, it will open +1 additional problem.

Goal:
I just need to receive an email about the record, then close the event.

As a workaround (let’s call it a solution here), we can define an action which will:

  1. contact an API endpoint
  2. manually acknowledge the event and close it

The biggest reason why this functionality is possible is that when an event hits the action, the operation actually knows the event ID of the problem. The macro {EVENT.ID} saves the day.

To solve the problem, we need to define the API connection details as global macros:

     {$Z_API_PHP}=http://127.0.0.1/api_jsonrpc.php
    {$Z_API_USER}=api
{$Z_API_PASSWORD}=zabbix

NOTE
‘http://127.0.0.1/api_jsonrpc.php’ assumes the frontend runs on the same server as systemd:zabbix-server. If that is not the case, use the frontend address of the Zabbix GUI plus ‘api_jsonrpc.php’.

We will have 2 actions. The first one will deliver a notification to email:

After 1 minute, a second action will close the event:

This is the full bash snippet we must put inside. No need to change anything; it works with copy and paste:

     url={$Z_API_PHP}
    user={$Z_API_USER}
password={$Z_API_PASSWORD}

# authorization
auth=$(curl -sk -X POST -H "Content-Type: application/json" -d "
{
	\"jsonrpc\": \"2.0\",
	\"method\": \"user.login\",
	\"params\": {
		\"user\": \"$user\",
		\"password\": \"$password\"
	},
	\"id\": 1,
	\"auth\": null
}
" $url | \
grep -E -o "([0-9a-f]{32,32})")

# acknowledge and close event
curl -sk -X POST -H "Content-Type: application/json" -d "
{
	\"jsonrpc\": \"2.0\",
	\"method\": \"event.acknowledge\",
	\"params\": {
		\"eventids\": \"{EVENT.ID}\",
		\"action\": 1,
		\"message\": \"Problem resolved.\"
	},
	\"auth\": \"$auth\",
	\"id\": 1
}" $url

# close api key
curl -sk -X POST -H "Content-Type: application/json" -d "
{
    \"jsonrpc\": \"2.0\",
    \"method\": \"user.logout\",
    \"params\": [],
    \"id\": 1,
    \"auth\": \"$auth\"
}
" $url

Zabbix API scripting via curl and jq

Post Syndicated from Aigars Kadiķis original https://blog.zabbix.com/zabbix-api-scripting-via-curl-and-jq/12434/

In this lab we will use a bash environment and the utilities ‘curl’ and ‘jq’ to perform Zabbix API calls and do some scripting.

‘curl’ is a tool to exchange JSON messages over HTTP/HTTPS.
‘jq’ utility helps to locate and extract specific elements in output.

To follow the lab we need to install ‘jq’:

# On CentOS7/RHEL7:
yum install epel-release && yum install jq

# On CentOS8/RHEL8:
dnf install jq

# On Ubuntu/Debian:
apt install jq

# On any 64-bit Linux platform:
curl -skL "https://github.com/stedolan/jq/releases/download/jq-1.5/jq-linux64" -o /usr/bin/jq && chmod +x /usr/bin/jq

Obtaining an authorization token

In order to operate with API calls we need to:

  • Define an API endpoint. This is a URL: a PHP file designed to accept requests
  • Obtain an authorization token

If you execute API calls from the frontend server itself, the endpoint will most likely be one of:

url=http://127.0.0.1/api_jsonrpc.php
# or:
url=http://127.0.0.1/zabbix/api_jsonrpc.php

The url variable must be set before moving to the next step. Test that you have it configured:

echo $url

Every API call requires an authorization token. To store one token in a variable, use the command:

auth=$(curl -s -X POST -H 'Content-Type: application/json-rpc' \
-d '
{"jsonrpc":"2.0","method":"user.login","params":
{"user":"api","password":"zabbix"},
"id":1,"auth":null}
' $url | \
jq -r .result
)

Note
Notice the user ‘api’ with password ‘zabbix’. This is a dedicated user for API calls.

Check that you have a session key. It should be a 32-character hex string:

echo $auth
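
If the variable turns out to be empty, the credentials or the URL are probably wrong. As a quick sanity check (a small sketch, not part of the original lab), the apiinfo.version method can be called without any authorization token and confirms that the endpoint itself is reachable:

# apiinfo.version needs no auth; a version string in the reply proves the endpoint works
curl -s -X POST -H 'Content-Type: application/json-rpc' \
-d '{"jsonrpc":"2.0","method":"apiinfo.version","params":[],"id":1}' \
$url | jq -r '.result'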

Thought process

1) Visit the documentation page and pick an API method, for example alert.get:

{
"jsonrpc": "2.0",
"method": "alert.get",
"params": {
	"output": "extend",
	"actionids": "3"
},
"auth": "038e1d7b1735c6a5436ee9eae095879e",
"id": 1
}

2) Use your favorite text editor and its built-in Find & Replace functionality to escape all double quotes:

{
\"jsonrpc\": \"2.0\",
\"method\": \"alert.get\",
\"params\": {
	\"output\": \"extend\",
	\"actionids\": \"3\"
},
\"auth\": \"038e1d7b1735c6a5436ee9eae095879e\",
\"id\": 1
}

NOTE
Don’t even think about doing this process manually by hand!

3) Replace session key 038e1d7b1735c6a5436ee9eae095879e with our variable $auth

{
\"jsonrpc\": \"2.0\",
\"method\": \"alert.get\",
\"params\": {
	\"output\": \"extend\",
	\"actionids\": \"3\"
},
\"auth\": \"$auth\",
\"id\": 1
}

4) Now let’s wrap the API command in curl:

curl -s -X POST \
-H 'Content-Type: application/json-rpc' \
-d " \

{
\"jsonrpc\": \"2.0\",
\"method\": \"alert.get\",
\"params\": {
	\"output\": \"extend\",
	\"actionids\": \"3\"
},
\"auth\": \"$auth\",
\"id\": 1
}

" $url

Executing the previous command should already print JSON content in response.
To make the output prettier, we can pipe it to jq .:

curl -s -X POST \
-H 'Content-Type: application/json-rpc' \
-d " \

{
\"jsonrpc\": \"2.0\",
\"method\": \"alert.get\",
\"params\": {
	\"output\": \"extend\",
	\"actionids\": \"3\"
},
\"auth\": \"$auth\",
\"id\": 1
}

" $url | jq .

Wrap everything together in one file

This is a ready-to-use snippet:

#!/bin/bash

# 1. set connection details
url=http://127.0.0.1/api_jsonrpc.php
user=api
password=zabbix

# 2. get authorization token
auth=$(curl -s -X POST \
-H 'Content-Type: application/json-rpc' \
-d " \
{
 \"jsonrpc\": \"2.0\",
 \"method\": \"user.login\",
 \"params\": {
  \"user\": \"$user\",
  \"password\": \"$password\"
 },
 \"id\": 1,
 \"auth\": null
}
" $url | \
jq -r '.result'
)

# 3. show triggers in problem state
curl -s -X POST \
-H 'Content-Type: application/json-rpc' \
-d " \
{
 \"jsonrpc\": \"2.0\",
    \"method\": \"trigger.get\",
    \"params\": {
        \"output\": \"extend\",
        \"selectHosts\": \"extend\",
        \"filter\": {
            \"value\": 1
        },
        \"sortfield\": \"priority\",
        \"sortorder\": \"DESC\"
    },
    \"auth\": \"$auth\",
    \"id\": 1
}
" $url | \
jq -r '.result'

# 4. logout user
curl -s -X POST \
-H 'Content-Type: application/json-rpc' \
-d " \
{
    \"jsonrpc\": \"2.0\",
    \"method\": \"user.logout\",
    \"params\": [],
    \"id\": 1,
    \"auth\": \"$auth\"
}
" $url

Conveniences

We can use https://jsonpathfinder.com/ to identify the path needed to extract an element.

For example, to list all Zabbix proxies we will use an API call:

curl -s -X POST \
-H 'Content-Type: application/json-rpc' \
-d " \
{
    \"jsonrpc\": \"2.0\",
    \"method\": \"proxy.get\",
    \"params\": {
        \"output\": [\"host\"]
    },
    \"auth\": \"$auth\",
    \"id\": 1
} 
" $url

It may print content like:

{"jsonrpc":"2.0","result":[{"host":"broceni","proxyid":"10387"},{"host":"mysql8mon","proxyid":"12066"},{"host":"riga","proxyid":"12585"}],"id":1}

Inside JSONPathFinder, a mouse click in the right panel lets us locate a sample of the element we need to extract:

It suggests the path ‘x.result[1].host’. This means that to extract all elements we can remove the index and use ‘.result[].host’ like this:

curl -s -X POST \
-H 'Content-Type: application/json-rpc' \
-d " \
{
    \"jsonrpc\": \"2.0\",
    \"method\": \"proxy.get\",
    \"params\": {
        \"output\": [\"host\"]
    },
    \"auth\": \"$auth\",
    \"id\": 1
} 
" $url | jq -r '.result[].host'

Now it prints only the proxy titles:

broceni
mysql8mon
riga
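
jq can also return several fields per element. Swapping the filter for the one below (same proxy.get response, using jq’s @tsv output) prints each proxy ID next to its name:

# tab-separated proxyid and host for every proxy in the result
jq -r '.result[] | [.proxyid, .host] | @tsv'

# prints:
# 10387   broceni
# 12066   mysql8mon
# 12585   riga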

That is it for today. Bye.

Zabbix API calls through Postman

Post Syndicated from Aigars Kadiķis original https://blog.zabbix.com/zabbix-api-calls-through-postman/12198/

Zabbix API calls can be made through a graphical user interface (GUI); there is no need to jump into scripting. One application for performing API calls is Postman.

Benefits:

  • Available on Windows, Linux, and macOS
  • Save/synchronize your collection with a Google account
  • Can copy and paste examples from the official documentation page

Let’s go through the basic steps of performing API calls:

1st step – Use the API method user.login with a dedicated username and password to obtain a session token:

{
    "jsonrpc": "2.0",
    "method": "user.login",
    "params": {
        "user": "api",
        "password": "zabbix"
    },
    "id": 1
}

This is how it looks in Postman:

NOTE
We recommend using a dedicated user for API calls, for example a user called “api”. Make sure the user type is set to “Zabbix Super Admin” so that this user can access any type of information.

2nd step – Use API method trigger.get to list all triggers in the problem state:

{
    "jsonrpc": "2.0",
    "method": "trigger.get",
    "params": {
        "output": [
            "triggerid",
            "description",
            "priority"
        ],
        "filter": {
            "value": 1
        },
        "sortfield": "priority",
        "sortorder": "DESC"
    },
    "auth": "<session key>",
    "id": 1
}

Replace “<session key>” inside the API snippet to make it work, then click the “Send” button. All triggers in the problem state will be listed on the right side:

Postman conveniences – Environments

Environments are “a must” if you:

  • Have a separate test, development, and production Zabbix instance
  • Plan to migrate Zabbix to the next version (for example, 4.0 to 5.0), so it’s better to test all API calls beforehand

On the top right corner, there is a button Manage Environments. Let’s click it.

Now Create an environment:

Each environment must define a url and an auth key:

Now we have one definition, prod. We can close the window with [X]:

To work with your new environment, select the newly created profile prod. Substitute the Zabbix API endpoint with {{url}} and use {{auth}} as a dynamic authorization key:

NOTE
Every time we notice an API procedure no longer works, all we need to do is enter the Manage Environments section and set a new session token.

Topic in video format:
https://youtu.be/B14tsDUasG8?t=2513

Automated Origin CA for Kubernetes

Post Syndicated from Terin Stock original https://blog.cloudflare.com/automated-origin-ca-for-kubernetes/

Automated Origin CA for Kubernetes

In 2016, we launched the Cloudflare Origin CA, a certificate authority optimized for making it easy to secure the connection between Cloudflare and an origin server. Running our own CA has allowed us to support fast issuance and renewal, simple and effective revocation, and wildcard certificates for our users.

Out of the box, managing TLS certificates and keys within Kubernetes can be challenging and error prone. The secret resources have to be constructed correctly, as components expect secrets with specific fields. Some forms of domain verification require manually rotating secrets to pass. Once you’re successful, don’t forget to renew before the certificate expires!

cert-manager is a project to fill this operational gap, providing Kubernetes resources that manage the lifecycle of a certificate. Today we’re releasing origin-ca-issuer, an extension to cert-manager integrating with Cloudflare Origin CA to easily create and renew certificates for your account’s domains.

Origin CA Integration

Creating an Issuer

After installing cert-manager and origin-ca-issuer, you can create an OriginIssuer resource. This resource creates a binding between cert-manager and the Cloudflare API for an account. Different issuers may be connected to different Cloudflare accounts in the same Kubernetes cluster.

apiVersion: cert-manager.k8s.cloudflare.com/v1
kind: OriginIssuer
metadata:
  name: prod-issuer
  namespace: default
spec:
  signatureType: OriginECC
  auth:
    serviceKeyRef:
      name: service-key
      key: key

This creates a new OriginIssuer named “prod-issuer” that issues certificates using ECDSA signatures, and the secret “service-key” in the same namespace is used to authenticate to the Cloudflare API.
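
The referenced secret has to exist before the issuer can become ready. A minimal sketch of creating it with kubectl (the value below is a placeholder; use your account’s Origin CA service key from the Cloudflare dashboard):

# create the "service-key" secret with a single entry named "key",
# matching the serviceKeyRef in the OriginIssuer above
kubectl create secret generic service-key \
  --namespace default \
  --from-literal=key=v1.0-REPLACE-WITH-YOUR-ORIGIN-CA-SERVICE-KEY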

Signing an Origin CA Certificate

After creating an OriginIssuer, we can now create a Certificate with cert-manager. This defines the domains, including wildcards, that the certificate should be issued for, how long the certificate should be valid, and when cert-manager should renew the certificate.

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-com
  namespace: default
spec:
  # The secret name where cert-manager
  # should store the signed certificate.
  secretName: example-com-tls
  dnsNames:
    - example.com
  # Duration of the certificate.
  duration: 168h
  # Renew a day before the certificate expiration.
  renewBefore: 24h
  # Reference the Origin CA Issuer you created above,
  # which must be in the same namespace.
  issuerRef:
    group: cert-manager.k8s.cloudflare.com
    kind: OriginIssuer
    name: prod-issuer

Once created, cert-manager begins managing the lifecycle of this certificate, including creating the key material, crafting a certificate signature request (CSR), and constructing a certificate request that will be processed by the origin-ca-issuer.

When signed by the Cloudflare API, the certificate will be made available, along with the private key, in the Kubernetes secret specified within the secretName field. You’ll be able to use this certificate on servers proxied behind Cloudflare.
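
To confirm what ended up in that secret, here is a quick sketch using kubectl and openssl (adjust the namespace and secret name if yours differ; on macOS use base64 -D):

# print the subject and validity window of the issued certificate
kubectl get secret example-com-tls --namespace default \
  -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -subject -dates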

Extra: Ingress Support

If you’re using an Ingress controller, you can use cert-manager’s Ingress support to automatically manage Certificate resources based on your Ingress resource.

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/issuer: prod-issuer
    cert-manager.io/issuer-kind: OriginIssuer
    cert-manager.io/issuer-group: cert-manager.k8s.cloudflare.com
  name: example
  namespace: default
spec:
  rules:
    - host: example.com
      http:
        paths:
          - backend:
              serviceName: examplesvc
              servicePort: 80
            path: /
  tls:
    # specifying a host in the TLS section will tell cert-manager 
    # what DNS SANs should be on the created certificate.
    - hosts:
        - example.com
      # cert-manager will create this secret
      secretName: example-tls

Building an External cert-manager Issuer

An external cert-manager issuer is a specialized Kubernetes controller. There’s no direct communication between cert-manager and external issuers at all; this means that you can use any existing tools and best practices for developing controllers to develop an external issuer.

We’ve decided to use the excellent controller-runtime project to build origin-ca-issuer, running two reconciliation controllers.

Automated Origin CA for Kubernetes

OriginIssuer Controller

The OriginIssuer controller watches for creation and modification of OriginIssuer custom resources. The controller creates a Cloudflare API client using the referenced details and credentials. This API client instance will later be used to sign certificates through the API. The controller will periodically retry to create an API client; once it is successful, it updates the OriginIssuer’s status to ready.

CertificateRequest Controller

The CertificateRequest controller watches for the creation and modification of cert-manager’s CertificateRequest resources. These resources are created automatically by cert-manager as needed during a certificate’s lifecycle.

The controller looks for CertificateRequests that reference a known OriginIssuer (this reference is copied by cert-manager from the originating Certificate resource) and ignores all resources that do not match. The controller then verifies that the OriginIssuer is in the ready state before transforming the certificate request into an API request using the previously created client.

On a successful response, the signed certificate is added to the certificate request, which cert-manager then uses to create or update the secret resource. On an unsuccessful request, the controller will periodically retry.

Learn More

Up-to-date documentation and complete installation instructions can be found in our GitHub repository. Feedback and contributions are greatly appreciated. If you’re interested in Kubernetes at Cloudflare, including building controllers like these, we’re hiring.

Add Watermarks to your Cloudflare Stream Video Uploads

Post Syndicated from Rachel Chen original https://blog.cloudflare.com/add-watermarks-to-your-cloudflare-stream-video-uploads/

Add Watermarks to your Cloudflare Stream Video Uploads

Since the launch of Cloudflare Stream, our customers have been asking for a programmatic way to add watermarks to their videos. We built the Watermarks API to support a wide range of use cases: from customers who simply want to tell Stream “can you put this watermark image to the top right of my video?” to customers with more detailed asks such as “can you put this watermark image in a way it doesn’t take up more than 10% of the original video and with 20% opacity?” All that and more is now available at no additional cost through the Watermarks API.

What is Cloudflare Stream?

Cloudflare Stream provides out-of-the-box video infrastructure so developers can bring their app ideas to market faster. While building a video streaming app, developers must ask themselves questions like

  • Where do we store the videos affordably?
  • How do we encode the videos to support users with varying Internet speeds?
  • How do we maintain our video pipeline in the long term?

Cloudflare Stream is a single product that handles video encoding, storage, delivery and presentation (with the Stream Player.) Stream lets developers launch their ideas faster while having the confidence the video infrastructure will scale with their app’s growth.

How the Watermark API works

The Watermark API lets you add a watermark to a video at the time of uploading. It consists of two new features to the Stream API:

  • A new /stream/watermarks endpoint that lets you create watermark profiles and returns a uid, a unique identifier for each watermark profile
  • Support for a watermark object containing the uid of the watermark profile that can be passed at the time of upload

Step 1: Creating a Watermark Profile

A watermark profile describes the nature of the watermark, including the image to use as a watermark and properties such as its positioning, padding and scale.

Add Watermarks to your Cloudflare Stream Video Uploads

In this example, we are going to create a watermark profile that places the Cloudflare logo to the lower left of the video:

curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/stream/watermarks \
  --header 'content-type: application/json' \
  --header "x-auth-email: $CLOUDFLARE_EMAIL" \
  --header "x-auth-key: $CLOUDFLARE_KEY" \
  --data '{
  "url": "https://storage.googleapis.com/zaid-test/Watermarks%20Demo/cf-icon.png",
  "name": "Cloudflare Icon",
  "opacity": 0.5,
  "padding": 0.05,
  "scale": 0.1,
  "position": "lowerLeft"
}'

The response contains information about the watermark profile, including a uid that we will use in the next step

{
  "result": {
    "uid": "a85d289c2e3f82701103620d16cd2408",
    "size": 9165,
    "height": 504,
    "width": 600,
    "created": "2020-09-03T20:43:56.337486Z",
    "downloadedFrom": "REDACTED_VIDEO_URL",
    "name": "Cloudflare Icon",
    "opacity": 0.5,
    "padding": 0.05,
    "scale": 0.1,
    "position": "lowerLeft"
  },
  "success": true,
  "errors": [],
  "messages": []
}
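
If you lose track of the uid later, watermark profiles can be retrieved again. A hedged sketch (listing via GET follows the usual Cloudflare API conventions and is an assumption here, not something shown in the original post):

curl --request GET \
  --url https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/stream/watermarks \
  --header "x-auth-email: $CLOUDFLARE_EMAIL" \
  --header "x-auth-key: $CLOUDFLARE_KEY"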

Step 2: Apply the Watermark

We’ve created the watermark and are ready to use it. Below is a screengrab from the Built For This commercial. It contains no watermark:

Add Watermarks to your Cloudflare Stream Video Uploads

We are going to upload the commercial and request Stream to add the logo from the previous step as a watermark:

curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/stream/copy \
  --header 'content-type: application/json' \
  --header "x-auth-email: $EMAIL" \
  --header "x-auth-key: $AUTH_KEY" \
  --data '{
  "url": "https://storage.googleapis.com/zaid-test/Watermarks%20Demo/The%20Internet%20was%20BuiltForThis.mp4",
  "watermark": {
    "uid": "a85d289c2e3f82701103620d16cd2408"
  }
}'

Step 3: Your video, now with a watermark!

You’re done! You can watch the video with a watermark:



What’s next

Read the detailed Watermark API docs covering different use cases.

In future iterations, we plan to add support for animated watermarks. Additionally, we want to add Watermark support to the Stream Dashboard so you have a UI to manage and add watermarks.

Introducing GitHub’s OpenAPI Description

Post Syndicated from Marc-Andre Giroux original https://github.blog/2020-07-27-introducing-githubs-openapi-description/

The GitHub REST API has been through three major revisions since it was first released, only a month after the site was launched. We often receive feedback that our REST API is an inspiration to many for design, and that it’s an industry reference for what an API should look like. Today, we’re excited to announce an improvement to how developers can interact with the API. GitHub has open sourced an OpenAPI description of the REST API.

OpenAPI

The OpenAPI specification is a programming language agnostic standard that lets providers describe the interface of their HTTP APIs. This allows both humans and machines to discover the capabilities of an API without needing to first read documentation or understand the implementation. OpenAPI is a widely adopted industry standard and GitHub is proud to be part of the community and help push the standard forward.

Try it Out

The GitHub OpenAPI description contains more than 600 operations exposed in our API. For visual exploration of the API, you can load the description as a Postman Collection. Programmatically, the description can be used to generate mock servers, test suites, and bindings for languages not supported by Octokit.

The description is provided under two formats. The bundled version is preferred for most use cases as it makes use of OpenAPI components for reuse and readability. For tooling that has poor support for inline references to components, we also provide a fully dereferenced version.
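
As a quick taste of working with the description, here is a small sketch using curl and jq; the raw file path within the github/rest-api-description repository is an assumption and may differ:

# download the bundled description and print a couple of top-level facts
curl -s https://raw.githubusercontent.com/github/rest-api-description/main/descriptions/api.github.com/api.github.com.json -o openapi.json
jq -r '.info.title' openapi.json    # title of the API described
jq '.paths | length' openapi.json   # number of documented paths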

Active Development

The description is currently in beta. Describing a 12-year-old API is no easy task. We’ve built this description using a mix of existing JSON schemas, documented examples, contract testing, and love. We expect to make the description even more complete and accurate as we go forward and as OpenAPI becomes central to our developer experience — internally and externally.

Quarterly releases of the description are available for GitHub Enterprise Server and GitHub Private Instances, with versions like v2.21. More frequent updates to the description will be available for GitHub.com.

How Can I Contribute?

We’re always looking to make our OpenAPI description more complete and accurate as well as making it easier to consume. If you’d like to help contribute to the description, check out our contributing guide. If something is not working for you, please file an Issue on the repository.

Building a complete OpenAPI description for the GitHub API was no easy task and could not have been possible without a great team. Thanks to Gregor Martynus for his initial work on describing the API, the Docs Engineering team for their amazing work around OpenAPI and documentation, Will Roden for his help validating the description with octo-go, as well as the folks at Redoc.ly who helped along the way.

Learn more about our REST API OpenAPI Description

*  The OpenAPI Initiative logo is a trademark of The Linux Foundation

Backblaze B2 and the S3 Compatible API on Cloudflare

Post Syndicated from Tim Obezuk original https://blog.cloudflare.com/backblaze-b2-and-the-s3-compatible-api-on-cloudflare/

Backblaze B2 and the S3 Compatible API on Cloudflare

In May 2020, Backblaze, a founding Bandwidth Alliance partner, announced S3 compatible APIs for their B2 Cloud Storage service. As a refresher, the Bandwidth Alliance is a group of forward-thinking cloud and networking companies that are committed to discounting or waiving data transfer fees for shared customers. Backblaze has been a proud partner since 2018. We are excited to see Backblaze introduce a new level of compatibility in their Cloud Storage service.

History of the S3 API

First let’s dive into the history of the S3 API and why it’s important for Cloudflare users.

Prior to 2006, before the mass migration to the Cloud, if you wanted to store content for your company you needed to build your own expensive and highly available storage platform that was large enough to store all your existing content with enough growth headroom for your business. AWS launched to help eliminate this model by renting their physical computing and storage infrastructure.

Amazon Simple Storage Service (S3) led the market by offering a scalable and resilient tool for storing unlimited amounts of data without building it yourself. It could be integrated into any application but there was one catch: you couldn’t use any existing standard such as WebDAV, FTP or SMB: your application needed to interface with Amazon’s bespoke S3 API.

Fast forward to 2020 and the storage provider landscape has become highly competitive with many providers capable of providing petabyte (and exabyte) scale content storage at extremely low cost-per-gigabyte. However, Amazon S3 has remained a dominant player despite heavy competition and not being the most cost-effective player.

The broad adoption of the S3 API by developers in their codebases and internal systems has turned the S3 API into what WebDAV once promised to be: the de facto standard HTTP file storage API.

Engineering costs of changing storage providers

With many code bases and legacy applications being entrenched in the S3 API, the process to switch to a more cost-effective storage provider is not so easy. Companies need to consider the cost of engineer time programming a new storage API while also physically moving their data.

This engineering overhead has led many storage providers to natively support the S3 API, leveling the playing field and allowing companies to focus on picking the most cost-effective provider.

First-mile bandwidth costs and the Bandwidth Alliance

Cloudflare caches content in Points of Presence located in more than 200 cities around the world. This cached content is then handed to your Internet service provider (ISP) over low cost and often free Internet exchange connections in the same facility using mutual fibre optic cables. This cost saving is fairly well understood as the benefit of content delivery networks and has become highly commoditized.

What is less well understood is the first-mile cost of moving data from a storage provider to the content delivery network. Typically storage providers expect traffic to route via the Internet and will charge the consumer per-gigabyte of data transmitted. This is not the case for Cloudflare as we also share facilities and mutual fibre optic cables with many storage providers.

Backblaze B2 and the S3 Compatible API on Cloudflare

These shared interconnects created an opportunity to waive the cost of first-mile bandwidth between Cloudflare and many providers and is what prompted us to create the Bandwidth Alliance.

Media and entertainment companies serving user-generated content have a continuous supply of new content being moved over the first-mile from the storage provider to the content delivery network. The first-mile bandwidth cost adds up, and using a Bandwidth Alliance partner such as Backblaze can entirely eliminate it.

Using the S3 API in Cloudflare Workers

The Solutions Engineering team at Cloudflare is tasked with providing strategic technical guidance for our enterprise customers.

It’s not uncommon for developers to connect Cloudflare’s global network directly to their storage provider and serve content such as live and on-demand video without an intermediate web server.

For security purposes, engineers typically use Cloudflare Workers to sign each uncached request using the S3 API. Cloudflare Workers allows anyone to deploy code to our global network of more than 200 Points of Presence in seconds and is built on top of Service Workers.

We’ve tested Backblaze B2’s S3 Compatible API in Cloudflare Workers using the same code tested for Amazon S3 buckets and it works perfectly by changing the target endpoint.

Creating a S3 Compatible Worker script

Here’s how it is done using Cloudflare Worker’s CLI tool Wrangler:

Generate a new project in Wrangler using a template intended for use with Amazon S3:

wrangler generate <projectname> https://github.com/obezuk/worker-signed-s3-template

This template uses aws4fetch, a fast, lightweight S3-compatible request-signing library that is commonly used in Service Worker environments like Cloudflare Workers.

The template creates an index.js file with a standard request signing implementation:

import { AwsClient } from 'aws4fetch'

const aws = new AwsClient({
    "accessKeyId": AWS_ACCESS_KEY_ID,
    "secretAccessKey": AWS_SECRET_ACCESS_KEY,
    "region": AWS_DEFAULT_REGION
});

addEventListener('fetch', function(event) {
    event.respondWith(handleRequest(event.request))
});

async function handleRequest(request) {
    var url = new URL(request.url);
    url.hostname = AWS_S3_BUCKET;
    var signedRequest = await aws.sign(url);
    return await fetch(signedRequest, { "cf": { "cacheEverything": true } });
}

Environment Variables

Modify your wrangler.toml file to use your Backblaze B2 API Key ID and Secret:

[env.dev]
vars = { AWS_ACCESS_KEY_ID = "<BACKBLAZE B2 keyId>", 
AWS_SECRET_ACCESS_KEY = "<BACKBLAZE B2 secret>", 
AWS_DEFAULT_REGION = "", 
AWS_S3_BUCKET = "<BACKBLAZE B2 bucketName>.<BACKBLAZE B2 S3 Endpoint>"}

The AWS_S3_BUCKET environment variable is the combination of your bucket name, a period, and the S3 endpoint. For a Backblaze B2 bucket named example-bucket and S3 endpoint s3.us-west-002.backblazeb2.com, use example-bucket.s3.us-west-002.backblazeb2.com

The AWS_DEFAULT_REGION environment variable is derived from your S3 endpoint. I use us-west-002.

We recommend using Secret Environment variables to store your AWS_SECRET_ACCESS_KEY content when using this script in production.
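
For example, a sketch of moving the key out of wrangler.toml and into a Wrangler secret (the --env flag matches the [env.dev] section above):

# remove AWS_SECRET_ACCESS_KEY from the vars line above, then store it encrypted;
# wrangler prompts for the value interactively
wrangler secret put AWS_SECRET_ACCESS_KEY --env dev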

Preview your Cloudflare Worker

Next, run wrangler preview --env dev to open a preview of your Worker script. My Backblaze B2 bucket contained a static website with adaptive streaming video content.

Backblaze B2 and the S3 Compatible API on Cloudflare
Note: We permit caching of third party video content only for enterprise domains. Free/Pro/Biz users wanting to serve video content via Cloudflare may use Stream which delivers an end-to-end video delivery service.

Backblaze B2’s support for the S3 API is an exciting update that makes their storage platform highly compatible with existing code bases and legacy systems. And, as a special offer to Cloudflare blog readers, Backblaze will pay the migration costs for transferring your data from S3 to Backblaze B2 (click here for more detail). With the cost of migration covered and compatibility for your existing workflows, it is now easier than ever to switch to a Bandwidth Alliance partner and save on first-mile costs. By doing so, you can slash your cloud bills, gain flexibility, and make no compromises to your performance.

To learn more, join us on May 14th for a webinar focused on getting you ultra fast worldwide content delivery.

Ready for changes with Hexagonal Architecture

Post Syndicated from Netflix Technology Blog original https://netflixtechblog.com/ready-for-changes-with-hexagonal-architecture-b315ec967749

by Damir Svrtan and Sergii Makagon

As the production of Netflix Originals grows each year, so does our need to build apps that enable efficiency throughout the entire creative process. Our wider Studio Engineering Organization has built more than 30 apps that help content progress from pitch (aka screenplay) to playback: ranging from script content acquisition, deal negotiations and vendor management to scheduling, streamlining production workflows, and so on.

Highly integrated from the start

About a year ago, our Studio Workflows team started working on a new app that crosses multiple domains of the business. We had an interesting challenge on our hands: we needed to build the core of our app from scratch, but we also needed data that existed in many different systems.

Some of the data points we needed, such as data about movies, production dates, employees, and shooting locations, were distributed across many services implementing various protocols: gRPC, JSON API, GraphQL and more. Existing data was crucial to the behavior and business logic of our application. We needed to be highly integrated from the start.

Swappable data sources

One of the early applications for bringing visibility into our productions was built as a monolith. The monolith allowed for rapid development and quick changes while the knowledge of the space was non-existent. At one point, more than 30 developers were working on it, and it had well over 300 database tables.

Over time, applications evolved from broad service offerings towards being highly specialized. This resulted in a decision to decompose the monolith into specific services. The decision was not driven by performance issues, but by the need to set boundaries around all of these different domains and enable dedicated teams to develop domain-specific services independently.

Large amounts of the data we needed for the new app were still provided by the monolith, but we knew that the monolith would be broken up at some point. We were not sure about the timing of the breakup, but we knew that it was inevitable, and we needed to be prepared.

Thus, we could leverage some of the data from the monolith at first as it was still the source of truth, but be prepared to swap those data sources to new microservices as soon as they came online.

Leveraging Hexagonal Architecture

We needed to support the ability to swap data sources without impacting business logic, so we knew we needed to keep them decoupled. We decided to build our app based on principles behind Hexagonal Architecture and Uncle Bob’s Clean Architecture.

The idea of Hexagonal Architecture is to put inputs and outputs at the edges of our design. Business logic should not depend on whether we expose a REST or a GraphQL API, and it should not depend on where we get data from — a database, a microservice API exposed via gRPC or REST, or just a simple CSV file.

The pattern allows us to isolate the core logic of our application from outside concerns. Having our core logic isolated means we can easily change data source details without a significant impact or major code rewrites to the codebase.

One of the main advantages we also saw in having an app with clear boundaries is our testing strategy — the majority of our tests can verify our business logic without relying on protocols that can easily change.

Defining the core concepts

Leveraged from the Hexagonal Architecture, the three main concepts that define our business logic are Entities, Repositories, and Interactors.

  • Entities are the domain objects (e.g., a Movie or a Shooting Location) — they have no knowledge of where they’re stored (unlike Active Record in Ruby on Rails or the Java Persistence API).
  • Repositories are the interfaces to getting entities as well as creating and changing them. They keep a list of methods that are used to communicate with data sources and return a single entity or a list of entities. (e.g. UserRepository)
  • Interactors are classes that orchestrate and perform domain actions — think of Service Objects or Use Case Objects. They implement complex business rules and validation logic specific to a domain action (e.g., onboarding a production)

With these three main types of objects, we are able to define business logic without any knowledge or care where the data is kept and how business logic is triggered. Outside of the business logic are the Data Sources and the Transport Layer:

  • Data Sources are adapters to different storage implementations.
    A data source might be an adapter to a SQL database (an Active Record class in Rails or JPA in Java), an elastic search adapter, REST API, or even an adapter to something simple such as a CSV file or a Hash. A data source implements methods defined on the repository and stores the implementation of fetching and pushing the data.
  • Transport Layer can trigger an interactor to perform business logic. We treat it as an input for our system. The most common transport layer for microservices is the HTTP API Layer and a set of controllers that handle requests. By having business logic extracted into interactors, we are not coupled to a particular transport layer or controller implementation. Interactors can be triggered not only by a controller, but also by an event, a cron job, or from the command line.
The dependency graph in Hexagonal Architecture goes inward.

With a traditional layered architecture, we would have all of our dependencies point in one direction, each layer above depending on the layer below. The transport layer would depend on the interactors, the interactors would depend on the persistence layer.

In Hexagonal Architecture all dependencies point inward — our core business logic does not know anything about the transport layer or the data sources. Still, the transport layer knows how to use interactors, and the data sources know how to conform to the repository interface.

With this, we are prepared for the inevitable changes to other Studio systems, and whenever that needs to happen, the task of swapping data sources is easy to accomplish.

Swapping data sources

The need to swap data sources came earlier than we expected — we suddenly hit a read constraint with the monolith and needed to switch a certain read for one entity to a newer microservice exposed over a GraphQL aggregation layer. Both the microservice and the monolith were kept in sync and had the same data, reading from one service or the other produced the same results.

We managed to transfer reads from a JSON API to a GraphQL data source within 2 hours.

The main reason we were able to pull it off so fast was due to the Hexagonal architecture. We didn’t let any persistence specifics leak into our business logic. We created a GraphQL data source that implemented the repository interface. A simple one-line change was all we needed to start reading from a different data source.

With a proper abstraction it was easy to change data sources

At that point, we knew that Hexagonal Architecture worked for us.

The great part about a one-line change is that it mitigates risks to the release. It is very easy to roll back in case a downstream microservice fails on initial deployment. This also enables us to decouple deployment and activation, as we can decide which data source to use through configuration.

Hiding data source details

One of the great advantages of this architecture is that we are able to encapsulate data source implementation details. We ran into a case where we needed an API call that did not yet exist — a service had an API to fetch a single resource but did not have bulk fetch implemented. After talking with the team providing the API, we realized this endpoint would take some time to deliver. So we decided to move forward with another solution to solve the problem while this endpoint was being built.

We defined a repository method that would grab multiple resources given multiple record identifiers — and the initial implementation of that method on the data source sent multiple concurrent calls to the downstream service. We knew this was a temporary solution and that the second take at the data source implementation was to use the bulk API once implemented.

Our business logic doesn’t need to be aware of specific data source limitations.

A design like this enabled us to move forward with meeting the business needs without accruing much technical debt or the need to change any business logic afterward.

Testing strategy

When we started experimenting with Hexagonal Architecture, we knew we needed to come up with a testing strategy. We knew that a prerequisite to great development velocity was to have a test suite that is reliable and super fast. We didn’t think of it as a nice to have, but a must-have.

We decided to test our app at three different layers:

  • We test our interactors, where the core of our business logic lives but is independent of any type of persistence or transportation. We leverage dependency injection and mock any kind of repository interaction. This is where our business logic is tested in detail, and these are the tests we strive to have most of.
  • We test our data sources to determine if they integrate correctly with other services, whether they conform to the repository interface, and check how they behave upon errors. We try to minimize the amount of these tests.
  • We have integration specs that go through the whole stack, from our Transport / API layer, through the interactors, repositories, data sources, and hit downstream services. These specs test whether we “wired” everything correctly. If a data source is an external API, we hit that endpoint and record the responses (and store them in git), allowing our test suite to run fast on every subsequent invocation. We don’t do extensive test coverage on this layer — usually just one success scenario and one failure scenario per domain action.

We don’t test our repositories as they are simple interfaces that data sources implement, and we rarely test our entities as they are plain objects with attributes defined. We test entities if they have additional methods (without touching the persistence layer).

We have room for improvement, such as not pinging any of the services we rely on but relying 100% on contract testing. With a test suite written in the above manner, we manage to run around 3000 specs in 100 seconds on a single process.

It’s lovely to work with a test suite that can easily be run on any machine, and our development team can work on their daily features without disruption.

Delaying decisions

We are in a great position when it comes to swapping data sources to different microservices. One of the key benefits is that we can delay some of the decisions about whether and how we want to store data internal to our application. Based on the feature’s use case, we even have the flexibility to determine the type of data store — whether it be Relational or Documents.

Uncle Bob put it well:

The purpose of a good architecture is to delay decisions. Why? Because when we delay a decision, we have more information when it comes time to make it.

At the beginning of a project, we have the least amount of information about the system we are building. We should not lock ourselves into an architecture with uninformed decisions leading to a project paradox.

The decisions we made make sense for our needs now and have enabled us to move fast. The best part of Hexagonal Architecture is that it keeps our application flexible for future requirements to come.


Ready for changes with Hexagonal Architecture was originally published in Netflix TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Introducing Secrets and Environment Variables to Cloudflare Workers

Post Syndicated from John Donmoyer original https://blog.cloudflare.com/workers-secrets-environment/

Introducing Secrets and Environment Variables to Cloudflare Workers

The Workers team here at Cloudflare has been hard at work shipping a bunch of new features in the last year and we’ve seen some amazing things built with the tools we’ve provided. However, as my uncle once said, with great serverless platform growth comes great responsibility.

One of the ways we can help is by ensuring that deploying and maintaining your Workers scripts is a low-risk endeavor. Rotating a set of API keys shouldn’t require risking downtime through code edits and redeployments, and in some cases it may not make sense for the developer writing the script to know the actual API key value at all. To help tackle this problem, we’re releasing Secrets and Environment Variables to the Wrangler CLI and Workers Dashboard.

Supporting secrets

As we started to design support for secrets in Workers, we had a sense that this was already a big concern for a lot of our users, but we wanted to learn about all of the use cases to ensure we were building the right thing. We headed to the community forums, Twitter, and the inbox of Louis Grace, business development representative extraordinaire, for some anecdotes about Secrets usage. We also sent out a survey to our existing users to learn about use cases and pain points.

We learned that even though there was already a way to store secrets without exposing them via Workers KV, the solution was not very intuitive, nor did it meet all the needs of our users. Many users didn’t even know we had an interim solution in place. Recognizing that we were not the first platform to encounter this problem, we surveyed the existing landscape of Platform as a Service offerings to get a better sense for what our users would expect of us.

Deciding on a solution

One of the first things we found was that not all environment variables are created equal. While the simplest use case for having a defined environment variable may be storing a piece of text that can be updated no matter where it is referenced in a script, sometimes those variables may have higher stakes associated with them. If you’re storing an API key that controls access to an important system, you may not want to allow anyone with dashboard access to see it, maybe not even the developers themselves.

With this in mind, we had to ensure the feature covered two different use cases: one for storing variables in plain text where you could see the variable being referenced and make edits to it and another where the variable would be encrypted as soon as you save it, never to be seen again. This way, we were able to serve both needs of our users, side by side, without one compromising for the other.

Testing our prototypes

Once we had a fairly good idea of what we wanted to build, we built some prototypes and rough implementations in staging environments so we would be able to perform some usability testing. We wrangled up some developers and observed them as they performed a series of tasks where they were asked to add some secrets and plain-text environment variables, reference them in one of their Workers, and bind their Worker to a Worker KV namespace.

Along the way we also asked questions to understand the developer’s professional background, familiarity with the product, and the use cases they’ve had for using Workers in the past, along with any pain points they experienced.

While we were testing the new dashboard interface we also began testing the usability of the Wrangler CLI. We had Wrangler users perform the same tasks as the Workers dashboard users to help us find out whether users expected different things from their command-line tooling.

Findings and fixes

Through our testing we were able to make a number of changes before the final release. Some of the smaller changes included things like adjusting the behavior of form fields to ensure users knew which variable would be associated with each value. We also made larger changes like electing to separate the KV namespace bindings from the other environment variables as a way to emphasize that KV namespace bindings are not the keys and values themselves but a reference to a namespace where those keys are stored.

Cina, one of our engineers, put together a proposal to align some of our terminology with the terms that our developers were naturally using to describe their workflow. In Wrangler users were accustomed to referencing their KV namespaces by adding a KV namespace binding so when they came to the Workers dashboard interface and saw a field called “KV Variables” they were often confused, thinking they were adding keys and values to the namespace itself instead of establishing a variable that could be used to reference the namespace. As a fix, we decided to call it a “KV namespace binding” throughout the experience.

Try it out

Environment variables are available now with the Wrangler CLI and in the Workers Dashboard so go ahead and give them a shot today!
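
From the command line, the two flavors look like this (a minimal sketch with hypothetical variable names):

# plain-text variable, visible and editable in wrangler.toml
vars = { MY_GREETING = "hello" }

# encrypted secret, entered interactively and never displayed again
wrangler secret put MY_API_KEY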

Introducing Secrets and Environment  Variables to Cloudflare Workers
Adding a secret with Wrangler
Introducing Secrets and Environment  Variables to Cloudflare Workers
Managing environment variables and KV bindings in the Workers Dashboard

As we continue to build out the Workers platform we’d love to hear from you. Let us know if you’re interested in participating in user research or if you just have something to say.

How we used our new GraphQL Analytics API to build Firewall Analytics

Post Syndicated from Nick Downie original https://blog.cloudflare.com/how-we-used-our-new-graphql-api-to-build-firewall-analytics/

How we used our new GraphQL Analytics API to build Firewall Analytics

Firewall Analytics is the first product in the Cloudflare dashboard to utilize the new GraphQL Analytics API. All Cloudflare dashboard products are built using the same public APIs that we provide to our customers, allowing us to understand the challenges they face when interfacing with our APIs. This parity helps us build and shape our products, most recently the new GraphQL Analytics API that we’re thrilled to release today.

By defining the data we want, along with the response format, our GraphQL Analytics API has enabled us to prototype new functionality and iterate quickly from our beta user feedback. It is helping us deliver more insightful analytics tools within the Cloudflare dashboard to our customers.

Our user research and testing for Firewall Analytics surfaced common use cases in our customers’ workflow:

  • Identifying spikes in firewall activity over time
  • Understanding the common attributes of threats
  • Drilling down into granular details of an individual event to identify potential false positives

We can address all of these use cases using our new GraphQL Analytics API.

GraphQL Basics

Before we look into how to address each of these use cases, let’s take a look at the format of a GraphQL query and how our schema is structured.

A GraphQL query is comprised of a structured set of fields, for which the server provides corresponding values in its response. The schema defines which fields are available and their type. You can find more information about the GraphQL query syntax and format in the official GraphQL documentation.

To get started with GraphQL queries, we recommend downloading a client such as GraphiQL to explore our schema and run some queries. You can find documentation on getting started with this in our developer docs.

At the top level of the schema is the viewer field. This represents the top level node of the user running the query. Within this, we can query the zones field to find zones the current user has access to, providing a filter argument, with a zoneTag of the identifier of the zone we’d like narrow down to.

{
  viewer {
    zones(filter: { zoneTag: "YOUR_ZONE_ID" }) {
      # Here is where we'll query our firewall events
    }
  }
}

Now that we have a query that finds our zone, we can start querying the firewall events which have occurred in that zone, to help solve some of the use cases we’ve identified.

Visualising spikes in firewall activity

It’s important for customers to be able to visualise and understand anomalies and spikes in their firewall activity, as these could indicate an attack or be the result of a misconfiguration.

Plotting events in a timeseries chart, by their respective action, provides users with a visual overview of the trend of their firewall events.

Within the zones field in the query we created earlier, we can further query our firewall event aggregates using the firewallEventsAdaptiveGroups field and providing arguments for:

  • A limit for the count of groups
  • A filter for the date range we’re looking for (combined with any user-entered filters)
  • A list of fields to orderBy (in this case, just the datetimeHour field that we’re grouping by).

By adding the dimensions field, we’re querying for groups of firewall events, aggregated by the fields nested within dimensions. In this case, our query includes the action and datetimeHour fields, meaning the response will be groups of firewall events which share the same action, and fall within the same hour. We also add a count field, to get a numeric count of how many events fall within each group.

query FirewallEventsByTime($zoneTag: string, $filter: FirewallEventsAdaptiveGroupsFilter_InputObject) {
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      firewallEventsAdaptiveGroups(
        limit: 576
        filter: $filter
        orderBy: [datetimeHour_DESC]
      ) {
        count
        dimensions {
          action
          datetimeHour
        }
      }
    }
  }
}

Note – Each of our group queries requires a limit to be set. A firewall event can have one of 8 possible actions, and we are querying over a 72 hour period. At most, we’ll end up with 576 groups, so we can set that as the limit for our query.

This query would return a response in the following format:

{
  "viewer": {
    "zones": [
      {
        "firewallEventsAdaptiveGroups": [
          {
            "count": 5,
            "dimensions": {
              "action": "jschallenge",
              "datetimeHour": "2019-09-12T18:00:00Z"
            }
          }
          ...
        ]
      }
    ]
  }
}

We can then take these groups and plot each as a point on a time series chart. Mapping over the firewallEventsAdaptiveGroups array, we can use the group’s count property on the y-axis for our chart, then use the nested fields within the dimensions object, using action as unique series and the datetimeHour as the time stamp on the x-axis.

How we used our new GraphQL Analytics API to build Firewall Analytics

Top Ns

After identifying a spike in activity, our next step is to highlight events with commonality in their attributes. For example, if a certain IP address or individual user agent is causing many firewall events, this could be a sign of an individual attacker, or could be surfacing a false positive.

Similarly to before, we can query aggregate groups of firewall events using the firewallEventsAdaptiveGroups field. However, in this case, instead of supplying action and datetimeHour to the group’s dimensions, we can add individual fields that we want to find common groups of.

By ordering by descending count, we’ll retrieve groups with the highest commonality first, limiting to the top 5 of each. We can add a single field nested within dimensions to group by it. For example, adding clientIP will give five groups with the IP addresses causing the most events.

We can also add a firewallEventsAdaptiveGroups field with no nested dimensions. This will create a single group which allows us to find the total count of events matching our filter.

query FirewallEventsTopNs($zoneTag: string, $filter: FirewallEventsAdaptiveGroupsFilter_InputObject) {
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      topIPs: firewallEventsAdaptiveGroups(
        limit: 5
        filter: $filter
        orderBy: [count_DESC]
      ) {
        count
        dimensions {
          clientIP
        }
      }
      topUserAgents: firewallEventsAdaptiveGroups(
        limit: 5
        filter: $filter
        orderBy: [count_DESC]
      ) {
        count
        dimensions {
          userAgent
        }
      }
      total: firewallEventsAdaptiveGroups(
        limit: 1
        filter: $filter
      ) {
        count
      }
    }
  }
}

Note – we can add the firewallEventsAdaptiveGroups field multiple times within a single query, each aliased differently. This allows us to fetch multiple different groupings by different fields, or with no groupings at all. In this case, getting a list of top IP addresses, top user agents, and the total events.

How we used our new GraphQL Analytics API to build Firewall Analytics

We can then reference each of these aliases in the UI, mapping over their respective groups to render each row with its count and a bar representing its share of the total events.

Are these firewall events false positives?

After users have identified spikes, anomalies and common attributes, we wanted to surface more information as to whether these have been caused by malicious traffic, or are false positives.

To do this, we wanted to provide additional context on the events themselves, rather than just counts. We can do this by querying the firewallEventsAdaptive field for these events.

Our GraphQL schema uses the same filter format for both the aggregate firewallEventsAdaptiveGroups field and the raw firewallEventsAdaptive field. This allows us to use the same filters to fetch the individual events which sum to the counts and aggregates in the visualisations above.

query FirewallEventsList($zoneTag: string, $filter: FirewallEventsAdaptiveFilter_InputObject) {
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      firewallEventsAdaptive(
        filter: $filter
        limit: 10
        orderBy: [datetime_DESC]
      ) {
        action
        clientAsn
        clientCountryName
        clientIP
        clientRequestPath
        clientRequestQuery
        datetime
        rayName
        source
        userAgent
      }
    }
  }
}

How we used our new GraphQL Analytics API to build Firewall Analytics

Once we have our individual events, we can render all of the individual fields we’ve requested, providing users the additional context they need to determine whether an event is a false positive or not.

That’s how we used our new GraphQL Analytics API to build Firewall Analytics, helping solve some of our customers’ most common security workflow use cases. We’re excited to see what you build with it, and the problems you can help tackle.

You can find out how to get started querying our GraphQL Analytics API using GraphiQL in our developer documentation, or learn more about writing GraphQL queries on the official GraphQL Foundation documentation.

Introducing the GraphQL Analytics API: exactly the data you need, all in one place

Post Syndicated from Filipp Nisenzoun original https://blog.cloudflare.com/introducing-the-graphql-analytics-api-exactly-the-data-you-need-all-in-one-place/

Introducing the GraphQL Analytics API: exactly the data you need, all in one place

Today we’re excited to announce a powerful and flexible new way to explore your Cloudflare metrics and logs, with an API conforming to the industry-standard GraphQL specification. With our new GraphQL Analytics API, all of your performance, security, and reliability data is available from one endpoint, and you can select exactly what you need, whether it’s one metric for one domain or multiple metrics aggregated for all of your domains. You can ask questions like “How many cached bytes have been returned for these three domains?” Or, “How many requests have all the domains under my account received?” Or even, “What effect did changing my firewall rule an hour ago have on the responses my users were seeing?”

The GraphQL standard also has strong community resources, from extensive documentation to front-end clients, making it easy to start creating simple queries and progress to building your own sophisticated analytics dashboards.

From many APIs…

Providing insights has always been a core part of Cloudflare’s offering. After all, by using Cloudflare, you’re relying on us for key parts of your infrastructure, and so we need to make sure you have the data to manage, monitor, and troubleshoot your website, app, or service. Over time, we developed a few key data APIs, including ones providing information regarding your domain’s traffic, DNS queries, and firewall events. This multi-API approach was acceptable while we had only a few products, but we started to run into some challenges as we added more products and analytics. We couldn’t expect users to adopt a new analytics API every time they started using a new product. In fact, some of the customers and partners that were relying on many of our products were already becoming confused by the various APIs.

Following the multi-API approach was also affecting how quickly we could develop new analytics within the Cloudflare dashboard, which is used by more people for data exploration than our APIs. Each time we built a new product, our product engineering teams had to implement a corresponding analytics API, which our user interface engineering team then had to learn to use. This process could take up to several months for each new set of analytics dashboards.

…to one

Our new GraphQL Analytics API solves these problems by providing access to all Cloudflare analytics. It offers a standard, flexible syntax for describing exactly the data you need and provides predictable, matching responses. This approach makes it an ideal tool for:

  1. Data exploration. You can think of it as a way to query your own virtual data warehouse, full of metrics and logs regarding the performance, security, and reliability of your Internet property.
  2. Building amazing dashboards, which allow for flexible filtering, sorting, and drilling down or rolling up. Creating these kinds of dashboards would normally require paying thousands of dollars for a specialized analytics tool. You get them as part of our product and can customize them for yourself using the API.

In a companion post that was also published today, my colleague Nick discusses using the GraphQL Analytics API to build dashboards. So, in this post, I’ll focus on examples of how you can use the API to explore your data. To make the queries, I’ll be using GraphiQL, a popular open-source querying tool that takes advantage of GraphQL’s capabilities.

Introspection: what data is available?

The first thing you may be wondering: if the GraphQL Analytics API offers access to so much data, how do I figure out what exactly is available, and how I can ask for it? GraphQL makes this easy by offering “introspection,” meaning you can query the API itself to see the available data sets, the fields and their types, and the operations you can perform. GraphiQL uses this functionality to provide a “Documentation Explorer,” query auto-completion, and syntax validation. For example, here is how I can see all the data sets available for a zone (domain):

Image: the Documentation Explorer in GraphiQL, listing the data sets available for a zone.

If I’m writing a query, and I’m interested in data on firewall events, auto-complete will help me quickly find relevant data sets and fields:

Image: GraphiQL auto-completing the relevant firewall data sets and fields as the query is typed.

Querying: examples of questions you can ask

Let’s say you’ve made a major product announcement and expect a surge in requests to your blog, your application, and several other zones (domains) under your account. You can check if this surge materializes by asking for the requests aggregated under your account, in the 30 minutes after your announcement post, broken down by the minute:

{
  viewer {
    accounts(filter: { accountTag: $accountTag }) {
      httpRequests1mGroups(
        limit: 30
        filter: { datetime_geq: "2019-09-16T20:00:00Z", datetime_lt: "2019-09-16T20:30:00Z" }
        orderBy: [datetimeMinute_ASC]
      ) {
        dimensions {
          datetimeMinute
        }
        sum {
          requests
        }
      }
    }
  }
}

Here is the first part of the response, showing requests for your account, by the minute:

Image: the JSON response, with a request count for each minute.
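
Outside of GraphiQL, the same query can be sent as a plain HTTP POST. A minimal sketch in TypeScript, where apiToken and accountTag are placeholders and the endpoint and header names should be confirmed in the developer documentation:

const query = `
  query AccountRequestsByMinute($accountTag: string) {
    viewer {
      accounts(filter: { accountTag: $accountTag }) {
        httpRequests1mGroups(
          limit: 30
          filter: { datetime_geq: "2019-09-16T20:00:00Z", datetime_lt: "2019-09-16T20:30:00Z" }
          orderBy: [datetimeMinute_ASC]
        ) {
          dimensions { datetimeMinute }
          sum { requests }
        }
      }
    }
  }
`;

const response = await fetch("https://api.cloudflare.com/client/v4/graphql", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${apiToken}`, // assumed: an API token with Analytics read permission
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ query, variables: { accountTag } }),
});

const { data, errors } = await response.json();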

Now, let’s say you want to compare the traffic coming to your blog versus your marketing site over the last hour. You can do this in one query, asking for the number of requests to each zone:

{
  viewer {
    zones(filter: { zoneTag_in: [$zoneTag1, $zoneTag2] }) {
      httpRequests1hGroups(
        limit: 2
        filter: { datetime_geq: "2019-09-16T20:00:00Z", datetime_lt: "2019-09-16T21:00:00Z" }
      ) {
        sum {
          requests
        }
      }
    }
  }
}

Here is the response:

Image: the JSON response, with a request total for each of the two zones.

Finally, let’s say you’re seeing an increase in error responses. Could this be correlated to an attack? You can look at error codes and firewall events over the last 15 minutes, for example:

{
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      httpRequests1mGroups(
        limit: 100
        filter: { datetime_geq: "2019-09-16T21:00:00Z", datetime_lt: "2019-09-16T21:15:00Z" }
      ) {
        sum {
          responseStatusMap {
            edgeResponseStatus
            requests
          }
        }
      }
      firewallEventsAdaptiveGroups(
        limit: 100
        filter: { datetime_geq: "2019-09-16T21:00:00Z", datetime_lt: "2019-09-16T21:15:00Z" }
      ) {
        dimensions {
          action
        }
        count
      }
    }
  }
}

Notice that, in this query, we’re looking at multiple datasets at once, using a common zone identifier to “join” them. Here are the results:

Image: the results, showing HTTP response status counts alongside firewall event counts by action.

By examining both data sets in parallel, we can see a correlation: 31 requests were “dropped” or blocked by the Firewall, which is exactly the same as the number of “403” responses. So, the 403 responses were a result of Firewall actions.

Try it today

To learn more about the GraphQL Analytics API and start exploring your Cloudflare data, follow the “Getting started” guide in our developer documentation, which also has details regarding the current data sets and time periods available. We’ll be adding more data sets over time, so take advantage of the introspection feature to see the latest available.

Finally, to make way for the new API, the Zone Analytics API is now deprecated and will be sunset on May 31, 2020. The data that Zone Analytics provides is available from the GraphQL Analytics API. If you’re currently using the API directly, please follow our migration guide to change your API calls. If you get your analytics using the Cloudflare dashboard or our Datadog integration, you don’t need to take any action.

One more thing….

In the API examples above, if you find it helpful to get analytics aggregated for all the domains under your account, we have something else you may like: a brand new Analytics dashboard (in beta) that provides this same information. If your account has many zones, the dashboard is helpful for knowing summary information on metrics such as requests, bandwidth, cache rate, and error rate. Give it a try and let us know what you think using the feedback link above the new dashboard.

What’s new with Workers KV?

Post Syndicated from Steve Klabnik original https://blog.cloudflare.com/whats-new-with-workers-kv/


The Storage team here at Cloudflare shipped Workers KV, our global, low-latency, key-value store, earlier this year. As people have started using it, we’ve gotten some feature requests, and have shipped some new features in response! In this post, we’ll talk about some of these use cases and how these new features enable them.

New KV APIs

We’ve shipped some new APIs, both via api.cloudflare.com, as well as inside of a Worker. The first one provides the ability to upload and delete more than one key/value pair at once. Given that Workers KV is great for read-heavy, write-light workloads, a common pattern when getting started with KV is to write a bunch of data via the API, and then read that data from within a Worker. You can now do these bulk uploads without needing a separate API call for every key/value pair. This feature is available via api.cloudflare.com, but is not yet available from within a Worker.

For example, say we’re using KV to redirect legacy URLs to their new homes. We have a list of URLs to redirect, and where they should redirect to. We can turn this list into JSON that looks like this:

[
  {
    "key": "/old/post/1",
    "value": "/new-post-slug-1"
  },
  {
    "key": "/old/post/2",
    "value": "/new-post-slug-2"
  }
]

And then POST this JSON to the new bulk endpoint, /storage/kv/namespaces/:namespace_id/bulk. This will add both key/value pairs to our namespace.

Likewise, if we wanted to drop support for these redirects, we could issue a DELETE that has this body:

[
    "/old/post/1",
    "/old/post/2"
]

to /storage/kv/namespaces/:namespace_id/bulk, and we’d delete both key/value pairs in a single call to the API.
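
As a sketch of those two calls from a script, where accountId, namespaceId, and apiToken are placeholders; the description above says POST, while more recent docs use PUT for bulk writes, so check the KV API reference for the exact method and path:

const base =
  `https://api.cloudflare.com/client/v4/accounts/${accountId}` +
  `/storage/kv/namespaces/${namespaceId}`;
const headers = {
  Authorization: `Bearer ${apiToken}`,
  "Content-Type": "application/json",
};

// Bulk upload: one request for the whole array of key/value pairs
await fetch(`${base}/bulk`, {
  method: "PUT", // verify the method against the current KV API reference
  headers,
  body: JSON.stringify([
    { key: "/old/post/1", value: "/new-post-slug-1" },
    { key: "/old/post/2", value: "/new-post-slug-2" },
  ]),
});

// Bulk delete: an array of keys sent to the same endpoint
await fetch(`${base}/bulk`, {
  method: "DELETE",
  headers,
  body: JSON.stringify(["/old/post/1", "/old/post/2"]),
});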

The bulk upload API has one more trick up its sleeve: not all data is a string. For example, you may have an image as a value, which is just a bag of bytes. If you need to write some binary data, you’ll have to base64-encode the value’s contents so that it’s valid JSON. You’ll also need to set one more key:

[
  {
    "key": "profile-picture",
    "value": "aGVsbG8gd29ybGQ=",
    "base64": true
  }
]

Workers KV will decode the value from base64, and then store the resulting bytes.
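
For example, a small Node sketch of preparing a binary value for the bulk payload (the file name is illustrative):

import { readFileSync } from "fs";

const bytes = readFileSync("profile-picture.png"); // any binary contents
const entry = {
  key: "profile-picture",
  value: bytes.toString("base64"), // base64-encode the raw bytes so the JSON stays valid
  base64: true,                    // tells Workers KV to decode before storing
};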

Beyond bulk upload and delete, we’ve also given you the ability to list all of the keys you’ve stored in any of your namespaces, from both the API and within a Worker. For example, if you wrote a blog powered by Workers + Workers KV, you might have each blog post stored as a key/value pair in a namespace called “contents”. Most blogs have some sort of “index” page that lists all of the posts that you can read. To create this page, we need to get a listing of all of the keys, since each key corresponds to a given post. We could do this from within a Worker by calling list() on our namespace binding:

const value = await contents.list()

But what we get back isn’t only a list of keys. The object looks like this:

{
  keys: [
    { name: "Title 1" },
    { name: "Title 2" }
  ],
  list_complete: false,
  cursor: "6Ck1la0VxJ0djhidm1MdX2FyD"
}

We’ll talk about this “cursor” stuff in a second, but if we wanted to get the list of titles, we’d have to iterate over the keys property, and pull out the names:

const keyNames = value.keys.map(e => e.name)

keyNames would be an array of strings:

["Title 1", "Title 2", "Title 3", "Title 4", "Title 5"]

We could then use these titles to build our index page.

So what’s up with the list_complete and cursor properties? Well, imagine that we’ve been a very prolific blogger, and we’ve now written thousands of posts. The list API is paginated, meaning that it will only return the first thousand keys. To see if there are more pages available, you can check the list_complete property. If it is false, you can use the cursor to fetch another page of results. The value of cursor is an opaque token that you pass to another call to list:

const value = await NAMESPACE.list()
const cursor = value.cursor
const next_value = await NAMESPACE.list({"cursor": cursor})

This will give us another page of results, and we can repeat this process until list_complete is true.
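
Putting that together, a minimal sketch of walking every page from inside a Worker (the binding name is whatever you configured):

// Collect every key name, following the cursor until the listing is complete
async function listAllKeys(namespace: KVNamespace): Promise<string[]> {
  const names: string[] = [];
  let cursor: string | undefined;

  while (true) {
    const page = await namespace.list({ cursor });
    names.push(...page.keys.map(key => key.name));
    if (page.list_complete) break; // no more pages
    cursor = page.cursor;          // opaque token for the next page
  }

  return names;
}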

Listing keys has one more trick up its sleeve: you can also return only keys that have a certain prefix. Imagine we want to have a list of posts, but only the posts that were made in October of 2019. While Workers KV is only a key/value store, we can use the prefix functionality to do interesting things by filtering the list. In our original implementation, we had stored the titles of keys only:

  • Title 1
  • Title 2

We could change this to include the date in YYYY-MM-DD format, with a colon separating the two:

  • 2019-09-01:Title 1
  • 2019-10-15:Title 2

We can now ask for a list of all posts made in 2019:

const value = await NAMESPACE.list({"prefix": "2019"})

Or a list of all posts made in October of 2019:

const value = await NAMESPACE.list({"prefix": "2019-10"})

These calls will only return keys with the given prefix, which in our case, corresponds to a date. This technique can let you group keys together in interesting ways. We’re looking forward to seeing what you all do with this new functionality!
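
As a final sketch, turning the October listing back into titles for an index page (binding name and key format as above):

// List only the keys for October 2019 and strip the "YYYY-MM-DD:" prefix to get titles
const october = await NAMESPACE.list({ prefix: "2019-10" });
const titles = october.keys.map(key => key.name.split(":")[1]);
// titles could then be rendered as the index page, e.g. ["Title 2"]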

Relaxing limits

For various reasons, there are a few hard limits with what you can do with Workers KV. We’ve decided to raise some of these limits, which expands what you can do.

The first is the limit of the number of namespaces any account could have. This was previously set at 20, but some of you have made a lot of namespaces! We’ve decided to relax this limit to 100 instead. This means you can create five times the number of namespaces you previously could.

Additionally, we had a two megabyte maximum size for values. We’ve increased the limit for values to ten megabytes. With the release of Workers Sites, folks are keeping things like images inside of Workers KV, and two megabytes felt a bit cramped. While Workers KV is not a great fit for truly large values, ten megabytes gives you the ability to store larger images easily. As an example, a 4k monitor has a native resolution of 4096 x 2160 pixels. If we had an image at this resolution as a lossless PNG, for example, it would be just over five megabytes in size.

KV browser

Finally, you may have noticed that there’s now a KV browser in the dashboard! Needing to type out a cURL command just to see what’s in your namespace was a real pain, and so we’ve given you the ability to check out the contents of your namespaces right on the web. When you look at a namespace, you’ll also see a table of keys and values:

What’s new with Workers KV?

The browser has grown with a bunch of useful features since it initially shipped. You can not only see your keys and values, but also add new ones, edit existing ones, upload files, and even download them.

As we ship new features in Workers KV, we’ll be expanding the browser to include them too.

Wrangler integration

The Workers Developer Experience team has also been shipping some features related to Workers KV. Specifically, you can fully interact with your namespaces and the key/value pairs inside of them.

For example, my personal website is running on Workers Sites. I have a Wrangler project named “website” to manage it. If I wanted to add another namespace, I could do this:

$ wrangler kv:namespace create new_namespace
Creating namespace with title "website-new_namespace"
Success: WorkersKvNamespace {
    id: "<id>",
    title: "website-new_namespace",
}

Add the following to your wrangler.toml:

kv-namespaces = [
    { binding = "new_namespace", id = "<id>" }
]

I’ve redacted the namespace IDs here, but Wrangler let me know that the creation was successful, and provided me with the configuration I need to put in my wrangler.toml. Once I’ve done that, I can add new key/value pairs:

$ wrangler kv:key put "hello" "world" --binding new_namespace
Success

And read it back out again:

$ wrangler kv:key get "hello" --binding new_namespace
world

If you’d like to learn more about the design of these features, “How we design features for Wrangler, the Cloudflare Workers CLI” discusses them in depth.

More to come

The Storage team is working hard at improving Workers KV, and we’ll keep shipping new stuff every so often. Our updates will be more regular in the future. If there’s something you’d particularly like to see, please reach out!

Inside the Web Browser’s Performance API

Post Syndicated from Young Park original https://blog.cloudflare.com/browser-performance-api/


Building a beautiful, feature-rich website is easier than ever before. Not long ago, you’d have to fire up a text editor and hand-craft a lot of HTML, CSS, and JavaScript. Today, you can use WYSIWYG tools and third-party libraries that make development much simpler. The flip side of this is that it can be hard to see everything that’s going into your website — and the performance can suffer.

The good news is that modern web browsers expose lots of performance data that can help you understand how your web page performs. With the launch of Browser Insights today, we can analyze the performance from the perspective of the web browser and what the end user actually experiences. In this post, we’ll dive into how we think about performance and utilize the timing APIs in the web browser.

How web browsers measure performance

In the old days, the only way for a developer to profile performance was to intercept requests and measure the time from the beginning of the page load until the end of the load event.

Today, we can use Web APIs that are supported by modern browsers. This is part of the web standard called the Performance API. The Performance API consists of a collection of individual APIs that include:

  • Navigation Timing (for timing information related to the page and navigation)
  • Resource Timing (for timing data regarding the loading of an application’s resources)
  • Paint Timing (which provides timing information about paint operations during page construction)

In this blog post, we will primarily focus on the Navigation Timing API.

Inside the Performance API

To see what’s collected with the Performance API, you can open the Developer Tools in the Chrome browser and type ‘performance’ in the Console tab (or type performance.timing to get direct access to the PerformanceTiming attribute).

Try expanding the Performance object by clicking the arrow before the label, and then expand ‘timing’. This is the PerformanceTiming object, which includes all the timings related to the current page load as UNIX epoch timestamps (in milliseconds). The timing attributes are not shown in load order, so for a better understanding, let’s look at the illustration provided by the W3C.

Image from https://www.w3.org/TR/navigation-timing/

As the diagram shows, each element (represented as a box), read left to right, corresponds to a stage in the navigation flow of the page load. Each stage has an attribute marking its start and end (and some have several), so we can measure the elapsed time of each one. For example, to get the Request time, you can subtract the relevant timestamps in the console, as in the snippet below, which in this case comes out to 60 milliseconds.

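A minimal example (both values are epoch milliseconds, so the difference is a duration in ms):

const t = performance.timing;
// elapsed time between sending the request and receiving the first byte of the response
const requestTime = t.responseStart - t.requestStart; // e.g. 60 (ms)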

How Cloudflare uses performance data

Once your website is proxied through Cloudflare and Browser Insights is enabled, we write and inject a JavaScript beacon into the web page. Our beacon collects metrics from the Performance API and sends them to our edge, where they can be used to understand where your website is slowing down or having network problems. The reported data is shown in the Cloudflare dashboard on the Speed page, as averages of each timing metric:

Image: the Speed page, showing each timing metric as a stacked line chart.

The metrics we surface are:

  • DNS (domainLookupEnd – domainLookupStart): How long the DNS query takes. This can appear as zero if the connection is reused or the content was served from the local cache (memory or disk).
  • TCP (connectEnd – connectStart): How long it takes to establish a TCP connection to the server. Over HTTPS, this includes TLS negotiation time.
  • Request (responseStart – requestStart): The time elapsed between making an HTTP request and receiving the first byte of the response.
  • Response (responseEnd – responseStart): The time elapsed between the first and last byte of the response. You can think of this as the resource download time.
  • Processing (domComplete – domLoading): How long it took to render the page. If this number is large, you can optimize your document architecture or resource sizes, or configure settings under the Speed page such as Auto Minify. This phase can be drilled down further with domInteractive, domContentLoadedEventStart, domContentLoadedEventEnd, and domComplete. We plan to provide more detailed analytics on this later on.
  • Load Event (loadEventEnd – loadEventStart): When the browser finishes loading its document and resources, it triggers a `load` event. This duration may be helpful if you have additional functions or logic attached to the load event.
  • Total Time: The sum of the timing metrics shown on the graph (see the sketch below).
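
A sketch of deriving the same metrics from PerformanceTiming in the browser; this mirrors the arithmetic above rather than Cloudflare’s actual beacon code:

const t = performance.timing;

const metrics = {
  dns: t.domainLookupEnd - t.domainLookupStart,
  tcp: t.connectEnd - t.connectStart,             // includes TLS negotiation over HTTPS
  request: t.responseStart - t.requestStart,
  response: t.responseEnd - t.responseStart,
  processing: t.domComplete - t.domLoading,
  loadEvent: t.loadEventEnd - t.loadEventStart,
};

// Total Time as described above: the sum of the individual timings
const totalTime = Object.values(metrics).reduce((sum, value) => sum + value, 0);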

If you see any spikes or an unusual shape in the stacked line chart, you can start investigating each element to see what is causing the problem.

For more about how to use Browser Insights, see our announcement blog post.

What’s next

In this blog post, we’ve focused on the Navigation Timing API, because it’s at the heart of our first version of Browser Insights. In the near future, we plan to incorporate metrics from some of the other APIs. For example, we can break down some of the longer timings by looking at individual resource loads, and pointing out what’s taking longer. In addition to that, we plan to track JavaScript errors, provide a way to measure A/B performance, set up monitoring/alerting based on the metrics, and so on. So stay tuned!

How Castle is Building Codeless Customer Account Protection

Post Syndicated from Guest Author original https://blog.cloudflare.com/castle-building-codeless-customer-account-protection/


This is a guest post by Johanna Larsson, of Castle, who designed and built the Castle Cloudflare app and the supporting infrastructure.

Strong security should be easy.

Asking your consumers again and again to take responsibility for their security through robust passwords and other security measures doesn’t work. The responsibility of security needs to shift from end users to the companies who serve them.

Castle is leading the way for companies to better protect their online accounts, with millions of consumers being protected every day. Uniquely, Castle extends threat prevention and protection both pre- and post-login, ensuring you can keep friction low but security high. With realtime responses and automated workflows for account recovery, overwhelmed security teams are given a hand. However, when you’re that busy, sometimes deploying new solutions takes more time than you have. Reducing time to deployment was a priority, so Castle turned to Cloudflare Workers.

User security and friction

When security is no longer optional and threats are not black or white, security teams are left with trying to determine how to allow end-user access and transaction completions when there are hints of risk, or when not all of the information is available. Keeping friction low is important to customer experience. Castle helps organizations be more dynamic and proactive by making continuous security decisions based on realtime risk and trust.

Some of the challenges with traditional solutions are that they are often focused only on protecting the app, or only on the point of access, protecting against bot access for example. Tools specifically designed for securing user accounts, however, are fundamentally focused on protecting the accounts of end users, whether they are being targeted by humans or bots. Being able to understand end-user behaviors and devices both pre and post login is therefore critical to truly protecting each user. The key to protecting users is being able to distinguish between normal and anomalous activity on an individual account and device basis. You also need a playbook to respond to anomalies and attacks with dedicated flows that allow your end users to interact directly and provide feedback around security events.

By understanding the end user and their good behaviors, devices, and transactions, it is possible to automatically respond to account threats in real-time based on risk level and policy. This approach not only reduces end-user friction but enables security teams to feel more confident that they won’t ever be blocking a legitimate login or transaction.

Castle processes tens of millions of events every day through its APIs, including contextual information like headers, IP, and device types. The more information that can be associated with a request the better. This allows us to better recognize abnormalities and protect the end user. Collection of this information is done in two ways. One is done on the web application’s backend side through our SDKs and the other is done on the client side using our mobile SDK or browser script. Our experience shows that any integration of a security service based on user behavior and anomaly detection can involve many different parties across an organization, and it affects multiple layers of the tech stack. On top of the security related roles, it’s not unusual to also have to coordinate between backend, devops, and frontend teams. The information related to an end user session is often spread widely over a code base.

The cost of security

One of the biggest challenges in implementing a user-facing security and risk management solution is the variety of people and teams it needs attention from, each with competing priorities. Security teams are often understaffed and overwhelmed making it difficult to take on new projects. At the same time, it consumes time from product and engineering personnel on the application side, who are responsible for UX flows and performing continuous authentication post-login.

We’ve been experimenting with approaches where we can extract that complexity from your application code base, while also reducing the effort of integrating. At Castle, we believe that strong security should be easy.


With Cloudflare we found a service that enables us to create a more friendly, simple, and in the end, safe integration process by placing the security layer directly between the end user and your application. Security-related logic shouldn’t pollute your app, but should reside in a separate service, or shield, that covers your app. When the two environments are kept separate, this reduces the time and cost of implementing complex systems making integration and maintenance less stressful and much easier.

Our integration with Cloudflare aims to solve this implementation challenge, delivering end-to-end account protection for your users, both pre and post login, with the click of a button.

The codeless integration

In our quest for a purely codeless integration, key features are required. When every customer application is different, this means every integration is different. We want to solve this problem for you once and for all. To do this, we needed to move the security work away from the implementation details so that we could instead focus on describing the key interactions with the end user, like logins or bank transactions. We also wanted to empower key decision makers to recognize and handle crucial interactions in their systems. Creating a single solution that could be customized to fit each specific use case was a priority.

Building on top of Cloudflare’s platform, we made use of three unique and powerful products: Workers, Apps for Workers, and Workers KV.

Thanks to Workers we have full access to the interactions between the end user and your application. With their impressive performance, we can confidently run inline of website requests without creating noticeable latency. We will never slow down your site. And in order to achieve the flexibility required to match your specific use case, we created an internal configuration format that fully describes the interactions of devices and servers across HTTP, including web and mobile app traffic. It is in this Worker where we’ve implemented an advanced routing engine to match and collect information about requests and responses to events, directly from the edge. It also fully handles injecting the Castle browser script — one less thing to worry about.

All of this logic is kept separate from your application code, and through the Cloudflare App Store we are able to distribute this Worker, giving you control over when and where it is enabled, as well as what configurations are used. There’s no need to copy/paste code or manage your own Workers.

In order to achieve the required speed while running in distributed edge locations, we needed a high performing low latency datastore, and we found one in the Cloudflare Workers KV Store. Cloudflare Apps are not able to access the KV Store directly, but we’ve solved this by exposing it through a separate Worker that the Castle App connects to. Because traffic between Workers never leaves the Cloudflare network, this is both secure and fast enough to match your requirements. The KV Store allows us to maintain end user sessions across the world, and also gives us a place to store and update the configurations and sessions that drive the Castle App.

In combining these products we have a complete and codeless integration that is fully configurable and that won’t slow you down.

How does it work?

The data flow is straightforward. After installing the Castle App, Cloudflare will route your traffic through the Castle App, which uses the Castle Data Store and our API to intelligently protect your end users. The impact to traffic latency is minimal because most work is done in the background, not blocking the requests. Let’s dig deeper into each technical feature:

Script injection

One of the tools we use to verify user identity is a browser script: Castle.js. It is responsible for gathering device information and UI interaction behavior, and although it is not required for our service to function, it helps improve our verdicts. This means it’s important that it is properly added to every page in your web application. The Castle App, running between the end user and your application, is able to unobtrusively add the script to each page as it is served. In order for the script to also track page interactions it needs to be able to connect them to your users, which is done through a call to our script and also works out of the box with the Cloudflare interaction. This removes 100% of the integration work from your frontend teams.
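
Castle’s actual injection code isn’t shown here, but as an illustration of the general technique, a Worker can append a script tag to HTML responses with HTMLRewriter (the script URL is a placeholder):

// Append the browser script to the <head> of HTML responses; pass everything else through untouched
function addBrowserScript(response: Response): Response {
  const contentType = response.headers.get("Content-Type") ?? "";
  if (!contentType.includes("text/html")) return response;

  return new HTMLRewriter()
    .on("head", {
      element(head) {
        head.append('<script src="https://example.com/castle.js"></script>', { html: true });
      },
    })
    .transform(response);
}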

Collect contextual information

The second half of the information that forms the basis of our security analysis is the information related to the request itself, such as IP and headers, as well as timestamps. Gathering this information may seem straightforward, but our experience shows some recurring problems in traditional integrations. IP-addresses are easily lost behind reverse proxies, as they need to be maintained as separate headers, like `X-Forwarded-For`, and the internal format of headers differs from platform to platform. Headers in general might get cut off based on whitelisting. The Castle App sees the original request as it comes in, with no outside influence or platform differences, enabling it to reliably create the context of the request. This saves your infrastructure and backend engineers from huge efforts debugging edge cases.

Advanced routing engine

Finally, in order to reliably recognize important events, like login attempts, we’ve built a fully configurable routing engine. This is fast enough to run inline of your web application, and supports near real-time configuration updates. It is powerful enough to translate requests to actual events in your system, like logins, purchases, profile updates or transactions. Using information from the request, it is then able to send this information to Castle, where you are able to analyze, verify and take action on suspicious activity. What’s even better is that at any point in the future, if you want Castle to protect a new critical user event – such as a withdrawal or transfer – all it takes is adding a record to the configuration file. You never have to touch application code in order to expand your Castle integration across sensitive events.

We’ve put together an example TypeScript snippet that naively implements the flow and features we’ve discussed. The details are glossed over so that we can focus on the functionality.

addEventListener("fetch", event => event.respondWith(handleEvent(event)));

const handleEvent = async (event: CloudflareEvent) => {
  // You configure the application with your Castle API key
  const { apiKey } = INSTALL_OPTIONS;
  const { request } = event;

  // Configuration is fetched from the KV Store
  const configuration = await getConfiguration(apiKey);

  // The session is also retrieved from the KV Store
  const session = await getUserSession(request);

  // Pass the request through and get the response
  let response = await fetch(request);

  // Using the configuration we can recognize events by running
  // the request+response and configuration through our matching engine
  const securityEvent = getMatchingEvent(request, response, configuration);

  if (securityEvent) {
    // With direct access to the raw request, we can confidently build the context
    // including a device ID generated by the browser script, IP, and headers
    const requestContext = getRequestContext(request);

    // Collecting the relevant information, the data is passed to the Castle API
    event.waitUntil(sendToCastle(securityEvent, session, requestContext));
  }

  // Because we have access to the response HTML page we can safely inject the browser
  // script. If the response is not an HTML page it is passed through untouched.
  response = injectScript(response, session);

  return response;
};

We hope we have inspired you and demonstrated how Workers can provide speed and flexibility when implementing end to end account protection for your end users with Castle. If you are curious about our service, learn more here.

Top Resources for API Architects and Developers

Post Syndicated from George Mao original https://aws.amazon.com/blogs/architecture/top-resources-for-api-architects-and-developers/

We hope you’ve enjoyed reading our series on API architecture and development. We wrote about best practices for REST APIs with Amazon API Gateway and GraphQL APIs with AWS AppSync. This post will cover the top resources that all API developers should be aware of.

Tech Talks, Webinars, and Twitch Live Stream

The technical staff at AWS have produced a variety of digital media that cover new service launches, best practices, and customer questions. Be sure to review these videos for tips and tricks on building APIs:

  • Happy Little APIs: This is a multi-part series produced by our awesome Developer Advocate, Eric Johnson. He leads a series of talks that demonstrate how to build a real-world API.
  • API Gateway’s WebSocket webinar: API Gateway now supports real-time APIs with WebSockets. This webinar covers how to use this feature and why you should let API Gateway manage your real-time APIs.
  • Best practices for building enterprise-grade APIs: API Gateway reduces the time it takes to build and deploy REST APIs, but there are strategies that can make development, security, and management easier.
  • An Intro to AWS AppSync and GraphQL: AppSync helps you build sophisticated data applications with real-time and offline capabilities.

Gain Experience With Hands-On Workshops and Examples

One of the easiest ways to get started with Serverless REST API development is to use the Serverless Application Model (SAM). SAM lets you run APIs and Lambda functions locally on your machine for easy development and testing.

For example, you can configure API Gateway as an Event source for Lambda with just a few lines of code:

Type: Api
Properties:
  Path: /photos
  Method: post

There are many great examples on our GitHub page to help you get started with Authorization (IAM and Cognito), Request, Response, various policies, and CORS configurations for API Gateway.

If you’re working with GraphQL, you should review the Amplify Framework. This is an official AWS project that helps you quickly build Web Applications with built in AuthN and backend APIs using REST or GraphQL. With just a few lines of code, you can have Amplify add all required configurations for your GraphQL API. You have two options to integrate your application with an AppSync API:

  1. Directly using the Amplify GraphQL Client (see the sketch below)
  2. Using the AWS AppSync SDK
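
For instance, with the Amplify GraphQL Client (option 1), a query might look roughly like this; the query name and fields are illustrative, and the exact API surface depends on your Amplify version:

import { API, graphqlOperation } from "aws-amplify";

// A hypothetical query against an AppSync API that Amplify has configured
const listPhotos = /* GraphQL */ `
  query ListPhotos {
    listPhotos {
      items {
        id
        title
      }
    }
  }
`;

async function fetchPhotos() {
  const result: any = await API.graphql(graphqlOperation(listPhotos));
  return result.data.listPhotos.items;
}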

An excellent walk through of the Amplify toolkit is available here, including an example showing how to create a single page web app using ReactJS powered by an AppSync GraphQL API.

Finally, if you are interested in a full hands on experience, take a look at:

  • The Amazon API Gateway WildRydes workshop. This workshop teaches you how to build a functional single page web app with a REST backend, powered by API Gateway.
  • The AWS AppSync GraphQL Photo Workshop. This workshop teaches you how to use Amplify to quickly build a Photo sharing web app, powered by AppSync.

Useful Documentation

The official AWS documentation is the source of truth for architects and developers. Get started with the API Gateway developer guide. API Gateway currently has two APIs (V1 and V2) for managing the service. Here is where you can view the SDK and CLI reference.

Get started with the AppSync developer guide, and review the AppSync management API.

Summary

As an API architect, your job is not only to design and implement the best API for your use case, but also to figure out which type of API is most cost-effective for your product. For example, an application with high request volume ("chatty") may benefit from a GraphQL implementation instead of REST.

API Gateway currently charges $3.50 / million requests and provides a free tier of 1 Million requests per month. There is tiered pricing that will reduce your costs as request volume rises. AppSync currently charges $4.00 / million for Query and Mutation requests.

While AppSync pricing per request is slightly higher, keep in mind that GraphQL APIs typically result in significantly fewer requests overall. For example, if a page that needs five REST calls can be served by a single GraphQL query, 10 million page views cost roughly $175 with REST (50 million requests at $3.50 per million) but about $40 with AppSync (10 million requests at $4.00 per million).

Finally, we encourage you to join us in the coming weeks — we will be starting a series of posts covering messaging best practices.

About the Author

George MaoGeorge Mao is a Specialist Solutions Architect at Amazon Web Services, focused on the Serverless platform. George is responsible for helping customers design and operate Serverless applications using services like Lambda, API Gateway, Cognito, and DynamoDB. He is a regular speaker at AWS Summits, re:Invent, and various tech events. George is a software engineer and enjoys contributing to open source projects, delivering technical presentations at technology events, and working with customers to design their applications in the Cloud. George holds a Bachelor of Computer Science and Masters of IT from Virginia Tech.