
Cyber hygiene and MAS Notice 655

Post Syndicated from Darran Boyd original https://aws.amazon.com/blogs/security/cyber-hygiene-and-mas-notice-655/

In this post, I will provide guidance and resources that will help you align to the expectations of the Monetary Authority of Singapore (MAS) Notice 655 – Notice on Cyber Hygiene.

The Monetary Authority of Singapore (MAS) issued Notice 655 – Notice on Cyber Hygiene on 6 Aug 2019. This notice is applicable to all banks in Singapore and takes effect from 6 Aug 2020. The notice sets out the requirements on cyber hygiene for banks across the following six categories: administrative accounts, security patches, security standards, network perimeter defense, malware protection, and multi-factor authentication.

Whilst Notice 655 is specific to all banks in Singapore, the AWS security guidance I provide in this post is based on consistent best practices. As always, it’s important to note that security and compliance is a shared responsibility between AWS and you as our customer. AWS is responsible for the security of the cloud, but you are responsible for your security in the cloud.

To aid in your alignment to Notice 655, AWS has developed a MAS Notice 655 – Cyber Hygiene – Workbook, which is available in AWS Artifact. The workbook covers each of the six categories of cyber hygiene in Notice 655 and maps them to AWS guidance and controls.

The downloadable workbook contains two embedded formats:

  • Microsoft Excel – coverage includes AWS responsibility control statements, and Well-Architected Framework best practices.
  • Dynamic HTML – same as Microsoft Excel, with the added feature that the Well-Architected Framework best practices are mapped to AWS Config managed rules and Amazon GuardDuty findings, where available or applicable.

Administrative accounts

“4.1. A relevant entity must ensure that every administrative account in respect of any operating system, database, application, security appliance or network device, is secured to prevent any unauthorised access to or use of such account.”

For administrative accounts, it is important to follow best practices for the privileged accounts, keeping in mind both human and programmatic access.

The most privileged user account in an AWS account is the root user. When you first create an AWS account (unless you create it with AWS Organizations), this is the initial user account created. The root user is associated with the provided email address and password used to create the account. The root user account has access to every resource in the account—including the ability to close it. To align to the principle of least privilege, the root user account should not be used for everyday tasks. Instead, AWS Identity and Access Management (IAM) roles should be created and scoped to particular roles and functions within your organization. Furthermore, AWS strongly recommends that you integrate with a centralized identity provider, or a directory service, to authenticate all users in a centralized place. This reduces the requirement for multiple credentials and reduces management complexity.

There are some additional key steps that you should take to further protect your root user account. A quick way to verify some of these settings from the AWS CLI is shown after the following list.

  • Ensure that you have a very long and complex password, and if necessary change the root user password to meet this recommendation.
  • Put the root user password in a secure location, and consider a physical or virtual password vault with a strong multi-party access protocol.
  • Delete any access keys, and remove any programmatic access keys from the root user account.
  • Enable multi-factor authentication (MFA), and consider a hardware-based token that is stored in a physical vault or safe with a strong multi-party access protocol. Consider using a separate secure vault store for the password and the MFA token, with separate permissions for access.
  • A simple but hugely important step is to ensure your account information is correct, including the assigned root email address, so that AWS Support can contact you.
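
As a minimal check, assuming credentials with permission to call the IAM GetAccountSummary API, you can confirm from the AWS CLI that the root user has MFA enabled and holds no access keys:

aws iam get-account-summary
# In the returned SummaryMap, verify:
#   "AccountMFAEnabled": 1         (root user MFA is enabled)
#   "AccountAccessKeysPresent": 0  (no root user access keys exist)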

Do keep in mind that there are a few AWS tasks that require the root user.

You should use IAM roles for programmatic or system-to-system access to AWS resources that are integrated with IAM. For example, you should use roles for applications that run on Amazon Elastic Compute Cloud (Amazon EC2) instances. Ensure the principle of least privilege is being applied for the IAM policies that are attached to the roles.
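
For example, a role intended for applications running on Amazon EC2 uses a trust policy like the following sketch, which allows the EC2 service to assume the role; the permissions policies you then attach should grant only what the application needs:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": { "Service": "ec2.amazonaws.com" },
            "Action": "sts:AssumeRole"
        }
    ]
}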

For cases where your application needs credentials that are not from AWS IAM, such as database credentials, you should not hard-code these credentials in the application source code, or store them unencrypted. It is recommended that you use a secrets management solution. For example, AWS Secrets Manager helps you protect the secrets needed to access your applications, services, and IT resources. Secrets Manager enables you to easily rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. Users and applications can retrieve secrets with a call to Secrets Manager APIs, which eliminates the need to hardcode sensitive information in plain text.
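
As a sketch, assuming a secret named prod/app/db-credentials already exists in your account, an application or operator can retrieve the current value with a single API call:

aws secretsmanager get-secret-value --secret-id prod/app/db-credentials --query SecretString --output text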

For more information, see 4.1 Administrative Accounts in the MAS Notice 655 workbook on AWS Artifact.

Security Patches

“4.2 (a) A relevant entity must ensure that security patches are applied to address vulnerabilities to every system, and apply such security patches within a timeframe that is commensurate with the risks posed by each vulnerability.”

Consider the various categories of security patches you need to manage, based on the AWS Shared Responsibility Model and the AWS services you are using.

Here are some common examples; this is not an exhaustive list:

Operating system

When using services from AWS where you have control over the operating system, it is your responsibility to perform patching on these services. For example, if you use Amazon EC2 with Linux, applying security patches for Linux is your responsibility. To help you with this, AWS publishes security patches for Amazon Linux at the Amazon Linux Security Center.

Amazon Inspector allows you to run scheduled vulnerability assessments on your Amazon EC2 instances, and provides a report of findings against rules packages that include operating system configuration benchmarks for common vulnerabilities and exposures (CVEs) and Center for Internet Security (CIS) guidelines. To see if Amazon Inspector is available in your AWS Region, see the list of supported Regions for Amazon Inspector.

For managing patching activity at scale, consider AWS Systems Manager Patch Manager to automate the process of patching managed instances with both security-related patches and other types of updates.
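
For example, assuming your instances are managed by Systems Manager and grouped with the Patch Group tag (the tag value here is hypothetical), you can trigger a compliance scan against the applicable patch baseline:

aws ssm send-command \
    --document-name "AWS-RunPatchBaseline" \
    --targets "Key=tag:Patch Group,Values=production" \
    --parameters 'Operation=Scan'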

Container orchestration and containers

If you are running and managing your own container orchestration capability, it is your responsibility to perform patching for both the master and worker nodes. If you are using Amazon Elastic Kubernetes Service (Amazon EKS), then AWS manages the patching of the control plane, and publishes EKS-optimized Amazon Machine Images (AMIs) that include the necessary worker node binaries (Docker and Kubelet). This AMI is updated regularly and includes the most up to date version of these components. You can update your EKS managed nodes to the latest versions of the EKS-optimized AMIs with a single command in the EKS console, API, or CLI. If you are building your own custom AMIs to use for EKS worker nodes, AWS also publishes Packer scripts that document the AWS build steps, to allow you to identify the binaries included in each version of the AMI.
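
For example, assuming hypothetical cluster and node group names, the following single CLI command starts a rolling update of the managed worker nodes to the latest EKS-optimized AMI for the cluster's Kubernetes version:

aws eks update-nodegroup-version \
    --cluster-name my-cluster \
    --nodegroup-name my-managed-nodes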

AWS Fargate provides the option of serverless compute for containers, so you can avoid the operational overhead of scaling, patching, securing, and managing servers.

For the container images, you do need to ensure these are scanned for vulnerabilities and patched. The Amazon Elastic Container Registry (Amazon ECR) service offers the ability to scan container images for common vulnerabilities and exposures (CVEs).
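
As a sketch, assuming a repository named my-repo, you can enable scan-on-push and later review the findings for a pushed image tag:

# Scan every image automatically at push time
aws ecr put-image-scanning-configuration \
    --repository-name my-repo \
    --image-scanning-configuration scanOnPush=true

# Review CVE findings for a specific image tag
aws ecr describe-image-scan-findings \
    --repository-name my-repo \
    --image-id imageTag=latest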

Database engines

If you are running and managing your own databases on top of an AWS service such as Amazon EC2, it is your responsibility to perform patching of the database engine.

If you are using Amazon Relational Database Service (Amazon RDS), then AWS will automatically perform the patching of the database engine. This is done within the configurable Amazon RDS maintenance window, and is your opportunity to control when DB instance modifications, database engine version upgrades, and software patching occurs.
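
For example, assuming a DB instance identifier of mydbinstance, you can set the weekly maintenance window (in UTC) during which Amazon RDS applies pending patches and modifications:

aws rds modify-db-instance \
    --db-instance-identifier mydbinstance \
    --preferred-maintenance-window Sun:23:00-Mon:01:30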

In cases where you are using fully-managed AWS database services such as Amazon DynamoDB, AWS takes care of the underlying patching requirements.

Application

For application code and dependencies that you run on AWS services, you own and manage the patching. This applies to applications that your organization has built itself, as well as applications from third-party software vendors. You should make sure that you have a mechanism for ensuring that vulnerabilities in the application code you run are regularly identified and patched.

For more information, see 4.2 Security Patches in the MAS Notice 655 workbook on AWS Artifact.

Security Standards

“4.3. (b)… a relevant entity must ensure that every system conforms to the set of security standards.”

After you have defined your organizational security standards, the next step is to verify your conformance to these standards. In my consultation with customers, I advise that it is a best practice to enforce these security standards as early in the development lifecycle as possible. For example, you may have a standard requiring that data of a specific data classification must be encrypted at rest with an AWS Key Management Service (AWS KMS) customer-managed customer master key (CMK). The way this is typically achieved is by defining your infrastructure as code (IaC), for example using AWS CloudFormation. As your projects move through the various stages of development in your pipeline, you can automatically and programmatically check your IaC templates against the codified security standards that you have defined. AWS has a number of tools that assist you with defining your rules and evaluating your IaC templates.

In the case of AWS CloudFormation, you may want to consider the tools AWS CloudFormation Guard, cfn-lint, or cfn_nag. Enforcing your security standards as early in the development lifecycle as possible has some key benefits. It instills a culture and practice of creating deployments that are aligned to your standards from the outset, and allows developers to move fast by using the tools and workflows that work best for their team, while providing feedback early enough that they have time to resolve any issues and meet security standards during the development process.
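
As a minimal sketch, assuming a template file named template.yaml, you can run two of these checks locally or as a stage in your pipeline:

# Validate the template against the CloudFormation specification and best practices
cfn-lint template.yaml

# Scan the template for patterns that may indicate insecure infrastructure
cfn_nag_scan --input-path template.yaml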

It’s important to complement this IaC pipeline approach with additional controls to ensure that security standards remain in place after your infrastructure is deployed. You should make sure to look at both preventative controls and detective controls.

For preventative controls, the focus is on IAM permissions. You can use these fine-grained permissions, enforced at the level of the IAM principal (such as a user or role), to control what actions can or cannot be taken on AWS resources. You can make use of AWS Organizations service control policies (SCPs) to enforce permission guardrails globally across the entire organization, across an organizational unit, or across individual AWS accounts. Some example SCPs that may align to your security standards include the following:

  • Prevent any virtual private cloud (VPC) that doesn’t already have internet access from getting it.
  • Prevent users from disabling Amazon GuardDuty or modifying its configuration.

Additionally, you can use the SCPs described in the AWS Control Tower Guardrail Reference, which you can implement with or without using AWS Control Tower.
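
As an illustrative sketch of the GuardDuty example above (the exact action list is an assumption you should tailor to your own standards), an SCP like the following denies the API actions that could disable or weaken GuardDuty:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyDisablingGuardDuty",
            "Effect": "Deny",
            "Action": [
                "guardduty:DeleteDetector",
                "guardduty:UpdateDetector",
                "guardduty:DeleteMembers",
                "guardduty:DisassociateFromMasterAccount",
                "guardduty:StopMonitoringMembers"
            ],
            "Resource": "*"
        }
    ]
}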

For detective controls, after your infrastructure is deployed, you should make use of the AWS Security Hub and/or AWS Config rules to help you meet your compliance needs. You should ensure that the findings from these services are integrated with your technology operations processes to take action, or you can use automated remediation.

For more information, see 4.3 Security Standards in the MAS Notice 655 workbook on AWS Artifact.

Network Perimeter Defense

“4.4. A relevant entity must implement controls at its network perimeter to restrict all unauthorised network traffic.”

Having a layered security strategy is a best practice, and this applies equally to your network. AWS provides a number of complementary network configuration options that you can implement to add network protection to your resources. You should consider using all of the options I describe here for your AWS workload. You can implement multiple strategies together where possible, to provide network defense in depth.

For network layer protection, you can use security groups for your VPC. Security groups act as a virtual firewall for members of the group to control inbound and outbound traffic. For each security group, you add rules that control the inbound traffic to instances, and a separate set of rules that control the outbound traffic. You can attach security groups to EC2 instances and other AWS services that use elastic network interfaces, including RDS instances, VPC endpoints, AWS Lambda functions, and Amazon SageMaker notebooks.
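
For example, assuming a hypothetical security group ID, the following rule allows inbound HTTPS only from a private address range; anything not explicitly allowed is denied:

aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 443 \
    --cidr 10.0.0.0/16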

You can also use network access control lists (ACLs) as an optional layer of security for your VPC that acts as a firewall for controlling traffic in and out of one or more subnets, and supports allow rules and deny rules. Network ACLs are a good option for controlling traffic at the subnet level.

For application-layer protection against common web exploits, you can use AWS WAF. You use AWS WAF on the Application Load Balancer that fronts your web servers or origin servers that are running on Amazon EC2, on Amazon API Gateway for your APIs, or you can use AWS WAF together with Amazon CloudFront. This allows you to strengthen security at the edge, filtering more of the unwanted traffic out before it reaches your critical content, data, code, and infrastructure.

For distributed denial of service (DDoS) protection, you can use AWS Shield, which is a managed service to provide protection against DDoS attacks for applications running on AWS. AWS Shield is available in two tiers: AWS Shield Standard and AWS Shield Advanced. All AWS customers benefit from the automatic protections of AWS Shield Standard, at no additional charge. AWS Shield Standard defends against most common, frequently occurring network and transport layer DDoS attacks that target your web site or applications. When you use AWS Shield Standard with Amazon CloudFront and Amazon Route 53, you receive comprehensive availability protection against all known infrastructure (Layer 3 and 4) attacks. AWS Shield Advanced provides advanced attack mitigation, visibility and attack notification, DDoS cost protection, and specialist support.

AWS Firewall Manager allows you to centrally configure and manage firewall rules across your accounts and applications in AWS Organizations. With AWS Firewall Manager, you can centrally deploy and manage security groups, AWS WAF rules, and AWS Shield Advanced protections.

There are many AWS Partner Network (APN) solutions that provide additional capabilities and protection that work alongside the AWS solutions discussed, in categories such as intrusion detection systems (IDS). For more information, find an APN Partner.

For more information, see 4.4 Network Perimeter Defence in the MAS Notice 655 workbook on AWS Artifact.

Malware protection

“4.5. A relevant entity must ensure that one or more malware protection measures are implemented on every system, to mitigate the risk of malware infection, where such malware protection measures are available and can be implemented.”

Malware protection requires a multi-faceted approach, including all of the following:

  • Training your employees in security awareness
  • Finding and patching vulnerabilities within your AWS workloads
  • Segmenting your networks
  • Limiting access to your critical systems and data
  • Having a comprehensive backup and restore strategy
  • Detection of malware
  • Creating incident response plans

In the previous sections of this post, I covered security patching, network segmentation, and limiting access. Now I’ll review the remaining elements.

Employee security awareness is crucial, because it is generally accepted that the primary vector by which malware is installed within your organization is phishing (or spear phishing), where an employee is misled into installing malware, or opens an attachment that uses a software vulnerability to install it.

For backup and restore, a comprehensive and tested strategy is crucial, especially when the motivation of the malware is deletion, modification, or mass encryption (ransomware). You can review the AWS backup and restore solutions and leverage the various high-durability storage classes provided by Amazon Simple Storage Service (Amazon S3).

For malware protection, as with other security domains, it is important to have complementary detective controls along with the preventative ones, so that you have systems for early detection of malware, or of activity indicative of malware presence. Across your AWS account, understanding what typical activity looks like gives you a baseline of network and user activity that you can continuously monitor for anomalies.

Amazon GuardDuty is a threat detection service that continuously monitors and compares activity within your AWS environment for malicious or unauthorized behavior to help you protect your AWS accounts and workloads. Amazon GuardDuty uses machine learning, anomaly detection, and integrated threat intelligence to identify and prioritize potential threats.

Continuing on the topic of malware detection, you should consider other approaches as well, including endpoint detection and response (EDR) solutions. AWS has a number of partners that specialize in this space. For more information, find an APN Partner.

Finally, you should make sure that you have a security incident response plan to help you respond to an incident, communicate during an incident, and recover from it. AWS recommends that you create these response plans in the form of playbooks. A good way to create a playbook is to start off simple and iterate to improve your plan. Before you need to respond to an actual event, you should consider the tasks that you can do ahead of time to improve your recovery timeframes. Some of the issues to consider include pre-provisioning access to your responders, and pre-deploying the tools that the responders or forensic teams will need. Importantly, do not wait for an actual incident to test your response and recovery plans. You should run game days to practice, improve and iterate.

For more information, see 4.5 Malware protection in the MAS Notice 655 workbook on AWS Artifact.

Multi-factor authentication

“4.6. … a relevant entity must ensure that multi-factor authentication is implemented for the following:
(a)all administrative accounts in respect of any operating system, database, application, security appliance or network device that is a critical system; and
(b)all accounts on any system used by the relevant entity to access customer information through the internet.“

When using multi-factor authentication (MFA), it’s important for you to think of the various layers at which you need to implement it.

For access to the AWS API, AWS Management Console, and AWS resources that use AWS Identity and Access Management (IAM), you can configure MFA with a number of different form factors, and apply it to users within your AWS accounts. As I mentioned in the Administrative accounts section, AWS recommends that you apply MFA to the root account user. Where possible, you should not use IAM users, but instead use identity federation with IAM roles. By using identity federation with IAM roles, you can apply and enforce MFA at the level of your identity provider, for example Active Directory Federation Services (AD FS) or AWS Single Sign-On. For highly privileged actions, you may want to configure MFA-protected API access to only allow the action if MFA authentication has been performed.
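
As a sketch of this approach, the following IAM policy statement denies all actions when a request is not authenticated with MFA; in practice you would scope it to the highly privileged actions alongside the identity’s allow policies:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyAllWithoutMFA",
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "BoolIfExists": { "aws:MultiFactorAuthPresent": "false" }
            }
        }
    ]
}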

With regards to third-party applications, which includes software as a service (SaaS), you should consider integration with AWS services or third-party services to provide MFA protection. For example, AWS Single Sign-On (SSO) includes built-in integrations to many business applications, including Salesforce, Office 365, and others.

For your own in-house applications, you may want to consider solutions such as Amazon Cognito. Amazon Cognito goes beyond standard MFA (which uses SMS or time-based one-time passwords), and includes the option of adaptive authentication when using the advanced security features. With this feature enabled, when Amazon Cognito detects unusual sign-in activity, such as attempts from new locations and unknown devices, it can challenge the user with additional verification checks.

For more information, see 4.6 Multi-Factor authentication in the MAS Notice 655 workbook on AWS Artifact.

Conclusion

AWS products and services have security features designed to help you improve the security of your workloads, and meet your compliance requirements. Many AWS services are reviewed by independent third-party auditors, and these audit reports are available on AWS Artifact. You can use AWS services, tools, and guidance to address your side of the shared responsibility model to align with the requirements stated in Notice 655 – Notice on Cyber Hygiene.

Review the MAS Notice 655 – Cyber Hygiene – Workbook on AWS Artifact to understand both the AWS control environment (the AWS side of the shared responsibility model) and the guidance AWS provides to help you with your side of the shared responsibility model. You will find AWS guidance in the AWS Well-Architected Framework best practices, and where available or applicable to detective controls, in AWS Config rules and Amazon GuardDuty findings.



Darran Boyd

Darran is a Principal Security Solutions Architect at AWS, responsible for helping remove security blockers for our customers and accelerating their journey to the AWS Cloud. Darran’s focus and passion is to deliver strategic security initiatives that unlock and enable our customers at scale across the financial services industry and beyond.

Building well-architected serverless applications: Controlling serverless API access – part 3

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-controlling-serverless-api-access-part-3/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the nine serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the Introduction post for a table of contents and explanation of the example application.

Security question SEC1: How do you control access to your serverless API?

This post continues part 2 of this security question. Previously, I cover Amazon Cognito user and identity pools, JSON web tokens (JWT), API keys, and usage plans.

Best practice: Scope access based on identity’s metadata

Authenticated users should be separated into logical groups, roles, or tiers. Separation can also be based on custom authentication token attributes included within Security Assertion Markup Language (SAML) or JSON Web Tokens (JWT). Consider using the user’s identity metadata to enable fine-grained access control to resources and actions.

Scoping access based on authentication metadata allows you to provide limited and fine-grained capabilities and access to consumers based on their roles and intent.

Review levels of access, identity metadata, and separate consumers into logical groups/tiers

With JWT or SAML, ensure you have the right level of information available within the token claims to help you develop authorization logic. Use custom private claims along with a unique namespace for non-public information. Private claims let you share custom information specifically with your application client, and a unique namespace avoids name collisions with other claims. For more information, see the AWS Partner Network blog post “Understanding JWT Public, Private and Reserved Claims”.
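
For example, a decoded JWT payload might carry a hypothetical namespaced private claim alongside the registered claims, as in this sketch:

{
    "sub": "1234567890",
    "iss": "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_example",
    "exp": 1600000000,
    "https://example.com/claims/tier": "editor"
}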

With Amazon Cognito, you can use custom attributes or the Pre Token Generation Lambda Trigger feature. This AWS Lambda trigger allows you to customize a JWT token claim before the token is generated.

To illustrate using Amazon Cognito groups, I use the example from this blog post. The example uses Amplify CLI to create a web application for managing group membership. API Gateway handles authentication using an Amazon Cognito user pool as part of an administrator API. Two Amazon Cognito user pool groups are created using amplify auth update, one for admin, and one for editors.

  1. I navigate to the deployed web application and create two users, an administrator called someadminuser and an editor user called awesomeeditor.

    Show Amazon Cognito user creation

  2. I navigate to the Amazon Cognito user pool console, choose Users and groups under General settings, and can see that both users are created.

    View Amazon Cognito users created

  3. I choose the Groups tab and see that there are two user pool groups set up as part of amplify auth update.
  4. I add the someadminuser to the admin group.

    View Amazon Cognito user added to group and IAM role

  5. There is an AWS Identity and Access Management (IAM) role associated with the administrator group. This IAM role has an associated identity policy that grants permission to access an S3 bucket for some future application functionality.

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:DeleteObject"
                ],
                "Resource": [
                    "arn:aws:s3:::mystoragebucket194021-dev/*"
                ],
                "Effect": "Allow"
            }
        ]
    }

  6. I log on to the web application using both the someadminuser and awesomeeditor accounts and compare the two JWT accessToken values Amazon Cognito has generated.

The someadminuser has a cognito:groups claim within the token showing membership of the user pool group admin.

View JWT with group membership

This token with its group claim can be used in a number of ways to authorize access.

Within this example frontend application, the token is used against an API Gateway resource using an Amazon Cognito authorizer.

An Amazon Cognito authorizer is an alternative to using IAM or Lambda authorizers to control access to your API Gateway method. The client first signs in to the user pool, and receives a token. The client then calls the API method with the token, which is typically in the request’s Authorization header. The API call only succeeds if a valid token is supplied. Without the correct token, the client isn’t authorized to make the call.

In this example, the Amazon Cognito authorizer authorizes access at the API method. Next, the event payload passed to the Lambda function contains the token. The function reads the token information. If the group membership claim includes admin, it adds the awesomeeditor user to the Amazon Cognito user pool group editors.

  1. To see how this is configured, I navigate to the API Gateway console and select the AdminQueries API.
  2. I view the /{proxy+}/ANY resource.
  3. I see that the Integration Request is set to LAMBDA_PROXY, which calls the AdminQueries Lambda function.

    View API Gateway Lambda proxy path

  4. I view the Method Request.

    View API Gateway Method Request using Amazon Cognito authorization

  5. Authorization is set to an Amazon Cognito user pool authorizer with an OAuth scope of aws.cognito.signin.user.admin. This scope grants access to Amazon Cognito user pool API operations that require access tokens, such as AdminAddUserToGroup.
  6. I navigate to the Authorizers menu item, and can see the configured Amazon Cognito authorizer.
  7. In the Amazon Cognito user pool details, the Token Source is set to Authorization. This is the name of the header sent to the Amazon Cognito user pool for authorization.

    View Amazon Cognito authorizer settings

  8. I navigate to the AWS Lambda console, select the AdminQueries function which amplify add auth added, and choose the Permissions tab. I select the Execution role and view its Permissions policies.
  9. I see that the function execution role allows write permission to the Amazon Cognito user pool resource. This allows the function to amend the user pool group membership.

    View Lambda execution role permissions including Amazon Cognito write

  10. I navigate back to the AWS Lambda console, and view the configuration for the AdminQueries function. There is an environment variable set for GROUP=admin.

    Lambda function environment variables

The Lambda function code checks if the authorizer.claims token includes the GROUP environment variable value of admin. If not, the function returns err.statusCode = 403 and an error message. Here is the relevant section of code within the function.

// Only perform tasks if the user is in a specific group
const allowedGroup = process.env.GROUP;
…..
  // Fail if group enforcement is being used
  if (req.apiGateway.event.requestContext.authorizer.claims['cognito:groups']) {
    const groups = req.apiGateway.event.requestContext.authorizer.claims['cognito:groups'].split(',');
    if (!(allowedGroup && groups.indexOf(allowedGroup) > -1)) {
      const err = new Error(`User does not have permissions to perform administrative tasks`);
      err.statusCode = 403;
      next(err);
    }
  } else {
    const err = new Error(`User does not have permissions to perform administrative tasks`);
    err.statusCode = 403;
    next(err);
  }

This example shows using a JWT to perform authorization within a Lambda function.

If the authorization is successful, the function continues and adds the awesomeeditor user to the editors group.

To show this flow in action:

  1. I log on to the web application using the awesomeeditor account, which is not a member of the admin group. I choose the Add to Group button.

    Sign in as editor

  2. Using the browser developer tools I see that the API request has failed, returning the 403 error code from the Lambda function.

    Shows 403 access denied

  3. I log on to the web application using the someadminuser account and choose the Add to Group button.

    Sign in as admin

  4. Using the browser developer tools I see that the API request is now successful as the user is a member of the admin group.

    API successful call as admin

  5. I navigate back to the Amazon Cognito user pool console, and view Users and groups. The awesomeeditor user is now a member of the editors group.

    User now member of editors group

The Lambda function has added the awesomeeditor account to the editors group.

Implement authorization logic based on authentication metadata

Another way to separate users for authorization is using Amazon Cognito to define a resource server with custom scopes.

A resource server is a server for access-protected resources. It handles authenticated requests from an app that has an access token. This API can be hosted in Amazon API Gateway or outside of AWS. A scope is a level of access that an app can request to a resource. For example, if you have a resource server for airline flight details, it might define two scopes. One scope for all customers with read access to view the flight details, and one for airline employees with write access to add new flights. When the app makes an API call to request access and passes an access token, the token has one or more embedded scopes.

JWT with scope
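
For example, the relevant claims of a decoded access token for the hypothetical flight-details resource server might look like the following sketch, where flights/read is a custom scope:

{
    "sub": "1234567890",
    "token_use": "access",
    "scope": "flights/read"
}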

This allows you to provide different access levels to API resources for different application clients based on the custom scopes. It is another mechanism for separating users during authentication.

For authorizing based on token claims, use an API Gateway Lambda authorizer.

For more information, see “Using Amazon Cognito User Pool Scopes with Amazon API Gateway”.

With AWS AppSync, use GraphQL resolvers. AWS Amplify can also generate fine-grained authorization logic via GraphQL transformers (directives). You can annotate your GraphQL schema at the level of a specific data type, data field, or GraphQL operation to control access. These annotations can include JWT groups or custom claims. For more information, see “GraphQL API Security with AWS AppSync and Amplify”, and the AWS AppSync documentation for Authorization Use Cases and fine-grained access control.

Improvement plan summary:

  1. Review levels of access, identity metadata, and separate consumers into logical groups/tiers.
  2. Implement authorization logic based on authentication metadata.

Conclusion

Controlling serverless application API access using authentication and authorization mechanisms can help protect against unauthorized access and prevent unnecessary use of resources. In part 1, I cover the different mechanisms for authorization available for API Gateway and AWS AppSync. I explain the different approaches for public or private endpoints and show how to use IAM to control access to internal or private API consumers.

In part 2, I cover using Amplify CLI to add a GraphQL API with an Amazon Cognito user pool handling authentication. I explain how to view JSON Web Token (JWT) claims, and how to use Amazon Cognito identity pools to grant temporary access to AWS services. I also show how to use API keys and API Gateway usage plans for rate limiting and throttling requests.

In this post, I cover separating authenticated users into logical groups. I first show how to use Amazon Cognito user pool groups to separate users with an Amazon Cognito authorizer to control access to an API Gateway method. I also show how JWTs can be passed to a Lambda function to perform authorization within a function. I then explain how to also separate users using custom scopes by defining an Amazon Cognito resource server.

In an upcoming post, I will cover the second security question from the Well-Architected Serverless Lens about managing serverless security boundaries.

Building well-architected serverless applications: Controlling serverless API access – part 2

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-controlling-serverless-api-access-part-2/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the nine serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the Introduction post for a table of contents and explanation of the example application.

Security question SEC1: How do you control access to your serverless API?

This post continues part 1 of this security question. Previously, I cover the different mechanisms for authentication and authorization available for Amazon API Gateway and AWS AppSync. I explain the different approaches for public or private endpoints and show how to use AWS Identity and Access Management (IAM) to control access to internal or private API consumers.

Required practice: Use appropriate endpoint type and mechanisms to secure access to your API

I continue to show how to implement security mechanisms appropriate for your API endpoint.

Using AWS Amplify CLI to add a GraphQL API

After adding authentication in part 1, I use the AWS Amplify CLI to add a GraphQL AWS AppSync API with the following command:

amplify add api

When prompted, I specify an Amazon Cognito user pool for authorization.

Amplify add Amazon Cognito user pool for authorization

To deploy the AWS AppSync API configuration to the AWS Cloud, I enter:

amplify push

Once the deployment is complete, I view the GraphQL API from within the AWS AppSync console and navigate to Settings. I see the AWS AppSync API uses the authorization configuration added during the part 1 amplify add auth. This uses the Amazon Cognito user pool to store the user sign-up information.

View AWS AppSync authorization settings with Amazon Cognito

For a more detailed walkthrough using Amplify CLI to add an AWS AppSync API for the serverless airline, see the build video.

Viewing JWT tokens

When I create a new account from the serverless airline web frontend, Amazon Cognito creates a user within the user pool. It handles the 3-stage sign-up process for new users. This includes account creation, confirmation, and user sign-in.

Serverless airline Amazon Cognito based sign-in process

Once the account is created, I browse to the Amazon Cognito console and choose Manage User Pools. I navigate to Users and groups under General settings and view my user account.

View User Account

When I sign in to the serverless airline web app, I authenticate with Amazon Cognito, and the client receives user pool tokens. The client then calls the AWS AppSync API, which authorizes access using the tokens, connects to data sources, and resolves the queries.

Amazon Cognito tokens used by AWS AppSync

During the sign-in process, I can use the browser developer tools to view the three JWT tokens Amazon Cognito generates and returns to the client. These are the accessToken, idToken, and refreshToken.

View tokens with browser developer tools

I copy the idToken value and use the decoder at https://jwt.io/ to view the contents.

JSON web token decoded

The decoded token contains claims about my identity. Claims are pieces of information asserted about my identity. In this example, these include my Amazon Cognito username, email address, and other sign-up fields specified in the user pool. The client can use this identity information inside the application.

The ID token expires one hour after I authenticate. The client uses the Amazon Cognito issued refreshToken to retrieve new ID and access tokens. By default, the refresh token expires after 30 days, but can be set to any value between 1 and 3650 days. When using the mobile SDKs for iOS and Android, retrieving new ID and access tokens is done automatically with a valid refresh token.

For more information, see “Using Tokens with User Pools”.

Accessing AWS services

An Amazon Cognito user pool is a managed user directory to provide access for a user to an application. Amazon Cognito has a feature called identity pools (federated identities), which allow you to create unique identities for your users. These can be from user pools, or other external identity providers.

These unique identities are used to get temporary AWS credentials to directly access other AWS services, or external services via API Gateway. The Amplify client libraries automatically expire, rotate, and refresh the temporary credentials.

Identity pools have identities that are either authenticated or unauthenticated. Unauthenticated identities typically belong to guest users. Authenticated identities belong to users who have been authenticated by a login provider, such as a user pool, and have received a token. The Amazon Cognito issued user pool tokens are exchanged for AWS access credentials from an identity pool.

JWT tokens from an Amazon Cognito user pool are exchanged for AWS credentials from an Amazon Cognito identity pool

API keys

For public content and unauthenticated access, both Amazon API Gateway and AWS AppSync provide API keys that can be used to track usage. API keys should not be used as a primary authorization method for production applications. Instead, use these for rate limiting and throttling. Unauthenticated APIs require stricter throttling than authenticated APIs.

API Gateway usage plans specify who can access API stages and methods, and also how much and how fast they can access them. API keys are then associated with the usage plans to identify API clients and meter access for each key. Throttling and quota limits are enforced on individual keys.

Throttling limits determine how many requests per second are allowed for a usage plan. This is useful to prevent a client from overwhelming a downstream resource. There are two API Gateway values to control this, the throttle rate and throttle burst, which use the token bucket algorithm. The algorithm is based on an analogy of filling and emptying a bucket of tokens representing the number of available requests that can be processed. The bucket in the algorithm has a fixed size based on the throttle burst and is filled at the token rate. Each API request removes a token from the bucket. The throttle rate then determines how many requests are allowed per second. The throttle burst determines how many concurrent requests are allowed and is shared across all APIs per Region in an account.

Token bucket algorithm
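
To make this concrete with the numbers used in the walkthrough below: a throttle rate of one request per second with a burst of two means the bucket holds at most two tokens. Two simultaneous requests drain it immediately, a third request arriving in the same second is rejected with an HTTP 429 Too Many Requests error, and the bucket then refills at one token per second.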

Quota limits allow you to set a maximum number of requests for an API key within a fixed time period. When billing for usage, this also allows you to enforce a limit when a client pays by monthly volume.

API keys are passed using the x-api-key header. API Gateway rejects requests without them.

For example, within the serverless airline, the loyalty service uses an AWS Lambda function to fetch loyalty points and next tier progress via an API Gateway REST API /loyalty/{customerId}/get resource.

I can use this API to simulate the effect of usage plans with API keys. The console steps follow, and a CLI sketch of the same configuration appears after them.

  1. I navigate to the airline-loyalty API /loyalty/{customerId}/get resource in the API Gateway console.
  2. I change the API Key Required value to true.

    Setting API Key Required on API Gateway method

  3. I choose Deploy API from the Actions menu.
  4. I create a usage plan in the Usage Plans section of the API Gateway console.
  5. I choose Create and enter a name for the usage plan.
  6. I select Enable throttling and set the rate to one request per second and the burst to two requests. These are artificially low numbers to simulate the effect.
  7. I select Enable quota and set the limit to 10 requests per day.

    Create API Gateway usage plan

  8. I click Next.
  9. I associate an API Stage by choosing Add API Stage, and selecting the airline Loyalty API and Prod Stage.

    Associate usage plan to API Gateway stage

  10. I click Next, and choose Create API Key and add to Usage Plan.

    Create API key and add to usage plan

  11. I name the API Key and ensure it is set to Auto Generate.

    Name API Key

  12. I choose Save then Done to associate the API key with the usage plan.

    API key associated with usage plan
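
As a sketch of the CLI equivalent, assuming a hypothetical API ID of a1b2c3d4e5 with a deployed Prod stage, the following commands create the usage plan and an API key, and associate the two:

aws apigateway create-usage-plan \
    --name "loyalty-basic" \
    --throttle rateLimit=1,burstLimit=2 \
    --quota limit=10,period=DAY \
    --api-stages apiId=a1b2c3d4e5,stage=Prod

aws apigateway create-api-key --name "loyalty-key" --enabled

# Use the IDs returned by the two previous commands
aws apigateway create-usage-plan-key \
    --usage-plan-id <usage-plan-id> \
    --key-id <api-key-id> \
    --key-type API_KEY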

I test the API authentication, in addition to the throttles and limits, using Postman.

I issue a GET request against the API Gateway URL using a customerId from the airline’s Airline-LoyaltyData Amazon DynamoDB table. I don’t specify any authorization or API key.

Postman unauthenticated GET request

I receive a Missing Authentication Token reply, which I expect as the API uses IAM authentication and I haven’t authenticated.

I then configure authentication details within the Authorization tab, using an AWS Signature. I enter my AWS user account’s AccessKey and SecretKey, which has an associated IAM identity policy to access the API.

Postman authenticated GET request without access key

I receive a Forbidden reply. I have successfully authenticated, but the API Gateway method rejects the request as it requires an API key, which I have not provided.

I retrieve and copy my previously created API key from the API Gateway console API Keys section, and display it by choosing Show.

Retrieve API key

I then configure an x-api-key header in the Postman Headers section and paste the API key value.

Having authenticated and specifying the required API key, I receive a response from the API with the loyalty points value.

Postman successful authenticated GET request with access key

I then call the API with a number of quick successive requests.

When I exceed the throttle rate limit of one request per second, and the throttle burst limit of two requests, I receive:

{"message": "Too Many Requests"}

When I then exceed the quota of 10 requests per day, I receive:

{"message": "Limit Exceeded"}

I view the API key usage within the API Gateway console Usage Plan section.

I select the usage plan, choose the API Keys section, then choose Usage. I see how many requests I have made.

View API key usage

If necessary, I can also grant a temporary rate extension for this key.

For more information on using API Keys for unauthenticated access for AWS AppSync, see the documentation.

API Gateway also has support for AWS WAF (Web Application Firewall), which helps protect web applications and APIs from attacks. It is another mechanism to apply rate-based rules to prevent public API consumers from exceeding a configurable request threshold. AWS WAF rules are evaluated before other access control features, such as resource policies, IAM policies, Lambda authorizers, and Amazon Cognito authorizers. For more information, see “Using AWS WAF with Amazon API Gateway”.

AWS AppSync APIs have built-in DDoS protection to protect all GraphQL API endpoints from attacks.

Improvement plan summary:

  1. Determine your API consumer and choose an API endpoint type.
  2. Implement security mechanisms appropriate to your API endpoint.

Conclusion

Controlling serverless application API access using authentication and authorization mechanisms can help protect against unauthorized access and prevent unnecessary use of resources.

In this post, I cover using Amplify CLI to add a GraphQL API with an Amazon Cognito user pool handling authentication. I explain how to view JSON Web Token (JWT) claims, and how to use identity pools to grant temporary access to AWS services. I also show how to use API keys and API Gateway usage plans for rate limiting and throttling requests.

This well-architected question continues in part 3, where I look at segregating authenticated users into logical groups. I will first show how to use Amazon Cognito user pool groups to separate users with an Amazon Cognito authorizer to control access to an API Gateway method. I will also show how to pass JWTs to a Lambda function to perform authorization within a function. I will then explain how to also segregate users using custom scopes by defining an Amazon Cognito resource server.

Building well-architected serverless applications: Controlling serverless API access – part 1

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-controlling-serverless-api-access-part-1/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the nine serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the Introduction post for a table of contents and explanation of the example application.

Security question SEC1: How do you control access to your serverless API?

Use authentication and authorization mechanisms to prevent unauthorized access, and enforce quota for public resources. By controlling access to your API, you can help protect against unauthorized access and prevent unnecessary use of resources.

AWS has a number of services to provide API endpoints including Amazon API Gateway and AWS AppSync.

Use Amazon API Gateway for RESTful and WebSocket APIs. Here is an example serverless web application architecture using API Gateway.

Example serverless application architecture using API Gateway

Use AWS AppSync for managed GraphQL APIs.

AWS AppSync overview diagram

The serverless airline example in this series uses AWS AppSync to provide the frontend, user-facing public API. The application also uses API Gateway to provide backend, internal, private REST APIs for the loyalty and payment services.

Good practice: Use an authentication and an authorization mechanism

Authentication and authorization are mechanisms for controlling and managing access to a resource. In this well-architected question, that is a serverless API. Authentication is verifying who a client or user is. Authorization is deciding whether they have the permission to access a resource. By enforcing authorization, you can prevent unauthorized access to your workload from non-authenticated users.

Integrate with an identity provider that can validate your API consumer’s identity. An identity provider is a system that provides user authentication as a service. The identity provider may use the XML-based Security Assertion Markup Language (SAML) or JSON Web Tokens (JWT) for authentication. It may also federate with other identity management systems. JWT is an open standard that defines a way to securely transmit information between parties as a JSON object. JWT is used with frameworks such as OAuth 2.0 for authorization and OpenID Connect (OIDC), which builds on OAuth 2.0 and adds authentication.

Only authorize access to consumers that have successfully authenticated. Use an identity provider rather than API keys as a primary authorization method. API keys are more suited to rate limiting and throttling.

Evaluate authorization mechanisms

Use AWS Identity and Access Management (IAM) for authorizing access to internal or private API consumers, or other AWS managed services such as AWS Lambda.

For public, user facing web applications, API Gateway accepts JWT authorizers for authenticating consumers. You can use either Amazon Cognito or OpenID Connect (OIDC).

App client authenticates and gets tokens

For custom authorization needs, you can use Lambda authorizers.

A Lambda authorizer (previously called a custom authorizer) is an AWS Lambda function which API Gateway calls for an authorization check when a client makes a request to an API method. This means you do not have to write custom authorization logic in a function behind an API. The Lambda authorizer function can validate a bearer token such as JWT, OAuth, or SAML, or request parameters and grant access. Lambda authorizers can be used when using an identity provider other than Amazon Cognito or AWS IAM, or when you require additional authorization customization.
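
When a Lambda authorizer allows a request, it returns a principal identifier together with an IAM policy document that API Gateway evaluates. A minimal sketch of a successful authorizer response follows; the principal ID and resource ARN are placeholders:

{
    "principalId": "user|a1b2c3d4",
    "policyDocument": {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": "execute-api:Invoke",
                "Effect": "Allow",
                "Resource": "arn:aws:execute-api:us-east-1:123456789012:api-id/stage/GET/resource-path"
            }
        ]
    }
}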

Lambda authorizers

For more information, see the AWS Hero blog post, “The Complete Guide to Custom Authorizers with AWS Lambda and API Gateway”.

The AWS documentation also has a useful section on “Understanding Lambda Authorizers Auth Workflow with Amazon API Gateway”.

Enforce authorization for non-public resources within your API

Within API Gateway, you can enable native authorization for users authenticated using Amazon Cognito or AWS IAM. For authorizing users authenticated by other identity providers, use Lambda authorizers.

For example, within the serverless airline, the loyalty service uses a Lambda function to fetch loyalty points and next tier progress. AWS AppSync acts as the client using an HTTP resolver, via an API Gateway REST API /loyalty/{customerId}/get resource, to invoke the function.

To ensure only AWS AppSync is authorized to invoke the API, IAM authorization is set within the API Gateway method request.

Viewing API Gateway IAM authorization

The serverless airline uses the AWS Serverless Application Model (AWS SAM) to deploy the backend infrastructure as code. This makes it easier to know which IAM role has access to the API. One of the benefits of using infrastructure as code is visibility into all deployed application resources, including IAM roles.

The loyalty service AWS SAM template contains the AppsyncLoyaltyRestApiIamRole.

  AppsyncLoyaltyRestApiIamRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: appsync.amazonaws.com
            Action: sts:AssumeRole
      Path: /
      Policies:
        - PolicyName: LoyaltyApiInvoke
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - execute-api:Invoke
                # arn:aws:execute-api:region:account-id:api-id/stage/METHOD_HTTP_VERB/Resource-path
                Resource: !Sub arn:aws:execute-api:${AWS::Region}:${AWS::AccountId}:${LoyaltyApi}/*/*/*

The IAM role specifies that appsync.amazonaws.com can perform an execute-api:Invoke on the specific API Gateway resource arn:aws:execute-api:${AWS::Region}:${AWS::AccountId}:${LoyaltyApi}/*/*/*.

Within AWS AppSync, you can enable native authorization for users authenticating using Amazon Cognito or AWS IAM. You can also use any external identity provider compliant with OpenID Connect (OIDC).

Improvement plan summary:

  1. Evaluate authorization mechanisms.
  2. Enforce authorization for non-public resources within your API.

Required practice: Use appropriate endpoint type and mechanisms to secure access to your API

APIs may have public or private endpoints. Consider public endpoints to serve consumers where they may not be part of your network perimeter. Consider private endpoints to serve consumers within your network perimeter where you may not want to expose the API publicly. Public and private endpoints may have different levels of security.

Determine your API consumer and choose an API endpoint type

For providing public content, use Amazon API Gateway or AWS AppSync public endpoints.

For providing content with restricted access, use Amazon API Gateway with authorization to specific resources, methods, and actions you want to restrict. For example, the serverless airline application uses AWS IAM to restrict access to the private loyalty API so only AWS AppSync can call it.

With AWS AppSync providing a GraphQL API, restrict access to specific data types, data fields, queries, mutations, or subscriptions.

You can create API Gateway private REST APIs that you can access only from your Amazon Virtual Private Cloud (VPC) by using an interface VPC endpoint.

API Gateway private endpoints

For more information, see “Choose an endpoint type to set up for an API Gateway API”.

Implement security mechanisms appropriate to your API endpoint

With Amazon API Gateway and AWS AppSync, for both public and private endpoints, there are a number of mechanisms for access control.

For providing content with restricted access, API Gateway REST APIs support native authorization using AWS IAM, Amazon Cognito user pools, and Lambda authorizers. Amazon Cognito user pools provide a managed user directory for authentication. For more detailed information, see the AWS Hero blog post, “Picking the correct authorization mechanism in Amazon API Gateway”.

You can also use resource policies to restrict content to a specific VPC, VPC endpoint, a data center, or a specific AWS Account.

API Gateway resource policies are different from IAM identity policies. IAM identity policies are attached to IAM users, groups, or roles. These policies define what that identity can do on which resources. For example, in the serverless airline, the IAM role AppsyncLoyaltyRestApiIamRole specifies that appsync.amazonaws.com can perform an execute-api:Invoke on the specific API Gateway resource arn:aws:execute-api:${AWS::Region}:${AWS::AccountId}:${LoyaltyApi}/*/*/*.

Resource policies are attached to resources such as an Amazon S3 bucket, or an API Gateway resource or method. The policies define what identities can access the resource.

IAM access is determined by a combination of identity policies and resource policies.

For more information on the differences, see “Identity-Based Policies and Resource-Based Policies”. To see which services support resource-based policies, see “AWS Services That Work with IAM”.
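
To make this concrete, here is a sketch using the AWS SDK for Python (Boto3) that creates a private REST API restricted to a single interface VPC endpoint with a resource policy. The API name and VPC endpoint ID are placeholders for illustration, not values from the serverless airline.

import json

import boto3

apigateway = boto3.client("apigateway")

# Placeholder interface VPC endpoint ID
VPC_ENDPOINT_ID = "vpce-0123456789abcdef0"

# Resource policy that only allows invocations arriving through the VPC endpoint
resource_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "execute-api:Invoke",
            "Resource": "execute-api:/*",
            "Condition": {
                "StringEquals": {"aws:SourceVpce": VPC_ENDPOINT_ID}
            },
        }
    ],
}

response = apigateway.create_rest_api(
    name="example-private-api",
    endpointConfiguration={"types": ["PRIVATE"], "vpcEndpointIds": [VPC_ENDPOINT_ID]},
    policy=json.dumps(resource_policy),
)
print(response["id"])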

API Gateway HTTP APIs support JWT authorizers as a part of OpenID Connect (OIDC) and OAuth 2.0 frameworks.
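
For illustration, the following Boto3 sketch adds a JWT authorizer to an HTTP API that validates tokens issued by an Amazon Cognito user pool. The API ID, audience, and issuer URL are placeholders.

import boto3

apigatewayv2 = boto3.client("apigatewayv2")

# Placeholder HTTP API ID and Amazon Cognito user pool details
response = apigatewayv2.create_authorizer(
    ApiId="a1b2c3d4e5",
    AuthorizerType="JWT",
    IdentitySource=["$request.header.Authorization"],
    JwtConfiguration={
        "Audience": ["example-app-client-id"],
        "Issuer": "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_EXAMPLE",
    },
    Name="jwt-authorizer",
)
print(response["AuthorizerId"])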

API Gateway WebSocket APIs support AWS IAM and Lambda authorizers.

With AWS AppSync public endpoints, you can enable authorization with the following (a configuration sketch follows this list):

  • AWS IAM
  • Amazon Cognito user pools for email and password functionality
  • Social providers (Facebook, Google+, and Login with Amazon)
  • Enterprise federation with SAML
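
As a sketch of configuring one of these modes programmatically, the following Boto3 call switches an existing AWS AppSync API to Amazon Cognito user pool authorization. All identifiers are placeholders.

import boto3

appsync = boto3.client("appsync")

# Placeholder API and user pool identifiers
appsync.update_graphql_api(
    apiId="exampleappsyncapiid",
    name="example-graphql-api",
    authenticationType="AMAZON_COGNITO_USER_POOLS",
    userPoolConfig={
        "userPoolId": "us-east-1_EXAMPLE",
        "awsRegion": "us-east-1",
        "defaultAction": "ALLOW",  # Allow queries unless denied by a resolver
    },
)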

Within the serverless airline, AWS Amplify Console hosts the public user-facing site. Amplify Console provides a git-based workflow for building, deploying, and hosting serverless web applications. Amplify Console manages the hosting of the frontend assets for single-page app (SPA) frameworks in addition to static websites, along with an optional serverless backend. Frontend assets are stored in Amazon S3, and the Amazon CloudFront global edge network distributes the web app globally.

The AWS Amplify CLI toolchain allows you to add backend resources using AWS CloudFormation.

Using Amplify CLI to add authentication

For the serverless airline, I use the Amplify CLI to add authentication using Amazon Cognito with the following command:

amplify add auth

When prompted, I specify the authentication parameters I require.

Amplify add auth

Amplify CLI creates a local CloudFormation template. Use the following command to deploy the updated authentication configuration to the cloud:

amplify push

Once the deployment is complete, I view the deployed authentication nested stack resources from within the CloudFormation Console. I see the Amazon Cognito user pool.

View Amplify authentication CloudFormation nested stack resources

For a more detailed walkthrough using Amplify CLI to add authentication for the serverless airline, see the build video.

For more information on Amplify CLI and authentication, see “Authentication with Amplify”.

Conclusion

To help protect against unauthorized access and prevent unnecessary use of serverless API resources, control access using authentication and authorization mechanisms.

In this post, I cover the different mechanisms for authorization available for API Gateway and AWS AppSync. I explain the different approaches for public or private endpoints and show how to use IAM to control access to internal or private API consumers. I walk through how to use the Amplify CLI to create an Amazon Cognito user pool.

This well-architected question will be continued in a future post, where I use the Amplify CLI to add a GraphQL API. I will explain how to view JSON Web Token (JWT) claims, and how to use Cognito identity pools to grant temporary access to AWS services. I will also show how to use API keys and API Gateway usage plans for rate limiting and throttling requests.

Building well-architected serverless applications: Approaching application lifecycle management – part 3

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-approaching-application-lifecycle-management-part-3/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the nine serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the Introduction post for a table of contents and explanation of the example application.

Question OPS2: How do you approach application lifecycle management?

This post continues part 2 of this Operational Excellence question where I look at deploying to multiple stages using temporary environments, and rollout deployments. In part 1, I cover using infrastructure as code with version control to deploy applications in a repeatable manner.

Good practice: Use configuration management

Use environment variables and configuration management systems to make and track configuration changes. These systems reduce errors caused by manual processes, reduce the level of effort to deploy changes, and help isolate configuration from business logic.

Environment variables are suited for infrequently changing configuration options, such as logging levels and database connection strings. Configuration management systems are for dynamic configuration that might change frequently or contain sensitive data such as secrets.

Environment variables

The serverless airline example used in this series uses AWS Amplify Console environment variables to store application-wide settings.

For example, the Stripe payment keys for all branches, and names for individual branches, are visible within the Amplify Console in the Environment variables section.

AWS Amplify environment variables

AWS Lambda environment variables are set up as part of the function configuration stored using the AWS Serverless Application Model (AWS SAM).

For example, the airline booking ReserveBooking AWS SAM template sets global environment variables including the LOG_LEVEL with the following code.

Globals:
    Function:
        Environment:
            Variables:
                LOG_LEVEL: INFO

This is visible in the AWS Lambda console within the function configuration.

AWS Lambda environment variables in console

See the AWS Documentation for more information on using AWS Lambda environment variables and also how to store sensitive data. Amazon API Gateway can also pass stage-specific metadata to Lambda functions.
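
To show how function code consumes these settings, here is a minimal Python sketch that reads the LOG_LEVEL environment variable at initialization. The handler body is hypothetical.

import logging
import os

# LOG_LEVEL is set in the AWS SAM Globals section shown above; default to INFO
logger = logging.getLogger()
logger.setLevel(os.environ.get("LOG_LEVEL", "INFO"))

def lambda_handler(event, context):
    logger.debug("Received event: %s", event)  # Only emitted when LOG_LEVEL=DEBUG
    logger.info("Processing request")
    return {"statusCode": 200}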

Dynamic configuration

Dynamic configuration is also stored in configuration management systems to specify external values and is unique to each environment. This configuration may include values such as an Amazon Simple Notification Service (Amazon SNS) topic, Lambda function name, or external API credentials. AWS Systems Manager Parameter Store, AWS Secrets Manager, and AWS AppConfig have native integrations with AWS CloudFormation to store dynamic configuration. For more information, see the examples for referencing dynamic configuration from within AWS CloudFormation.

For the serverless airline application, dynamic configuration is stored in AWS Systems Manager Parameter Store. During CloudFormation stack deployment, a number of parameters are stored in Systems Manager. For example, in the booking service AWS SAM template, the booking SNS topic ARN is stored.

BookingTopicParameter:
    Type: "AWS::SSM::Parameter"
    Properties:
        Name: !Sub /${Stage}/service/booking/messaging/bookingTopic
        Description: Booking SNS Topic ARN
        Type: String
        Value: !Ref BookingTopic

View the stored SNS topic value by navigating to the Parameter Store console, and search for BookingTopic.

Finding Systems Manager Parameter Store values

Select the Parameter name and see the Amazon SNS ARN.

Viewing SNS topic value
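
You can also retrieve the value programmatically. The following Boto3 sketch reads the booking topic ARN from Parameter Store; the stage value is an assumption matching the Name pattern in the template above. In the airline application itself, the value is resolved at deployment time instead, as shown next.

import boto3

ssm = boto3.client("ssm")

# Stage matches the deployment branch, for example "develop"
stage = "develop"

response = ssm.get_parameter(Name=f"/{stage}/service/booking/messaging/bookingTopic")
booking_topic_arn = response["Parameter"]["Value"]
print(booking_topic_arn)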

The loyalty service then references this value within another stack.

When the Amplify Console Makefile deploys the loyalty service, it retrieves this value for the booking service from Parameter Store, and references it as a parameter-override. The deployment is also parametrized with the $${AWS_BRANCH} environment variable if there are multiple environments within the same AWS account and Region.

sam deploy \
	--parameter-overrides \
	BookingSNSTopic=/$${AWS_BRANCH}/service/booking/messaging/bookingTopic

Environment variables and configuration management systems help with managing application configuration.

Improvement plan summary

  1. Use environment variables for configuration options that change infrequently, such as logging levels and database connection strings.
  2. Use a configuration management system for dynamic configuration that might change frequently or contain sensitive data such as secrets.

Best practice: Use CI/CD including automated testing across separate accounts

Continuous integration/delivery/deployment is one of the cornerstones of cloud application development and a vital part of a DevOps initiative.

Explanation of CI/CD stages

Building CI/CD pipelines increases software delivery quality and shortens the feedback time for detecting and resolving errors. In part 2, I cover how to deploy multiple stages in isolated environments and accounts, which helps with creating separate testing CI/CD pipelines. As the serverless airline example uses AWS Amplify Console, this comes with a built-in CI/CD pipeline.

Automate the build, deployment, testing, and rollback of the workload using KPIs and operational alerts. This eases troubleshooting, enables faster remediation and feedback time, and enables automatic and manual rollback/roll-forward should an alert trigger.

I cover metrics, KPIs, and operational alerts in this series in the Application Health part 1, and part 2 posts. I cover rollout deployments with traffic shifting based on metrics in this question’s part 2.

CI/CD pipelines should include integration and end-to-end tests. I cover local unit testing for Lambda and API Gateway in part 2.

Add an optional testing stage to Amplify Console to catch regressions before pushing code to production. Use the test step to run any test commands at build time using any testing framework of your choice. Amplify Console has deeper integration with the Cypress test suite that allows you to generate a UI report for your tests. Here is an example to set up end-to-end tests with Cypress.

Cypress testing example

There are a number of AWS and third-party solutions to host code and create CI/CD pipelines for serverless applications.

AWS Code Suite

For more information on how to use the AWS Code* services together, see the detailed Quick Start deployment guide Serverless CI/CD for the Enterprise on AWS.

All these AWS services have a number of integrations with third-party products so you can integrate your serverless applications with your existing tools. For example, CodeBuild can build from GitHub and Atlassian Bitbucket repositories. CodeDeploy integrates with a number of developer tools and configuration management systems. CodePipeline has a number of pre-built integrations to use existing tools for your serverless applications. For more information specifically on using CircleCI for serverless applications, see Simplifying Serverless CI/CD with CircleCI and the AWS Serverless Application Model.

Improvement plan summary

  1. Use a continuous integration/continuous deployment (CI/CD) pipeline solution that deploys multiple stages in isolated environments/accounts.
  2. Automate testing including but not limited to unit, integration, and end-to-end tests.
  3. Favor rollout deployments over all-at-once deployments for more resilience, and gradually learn what metrics best determine your workload’s health to appropriately alert on.
  4. Use a deployment system that supports traffic shifting as part of your pipeline, and rollback/roll-forward traffic to previous versions if an alert is triggered.

Good practice: Review function runtime deprecation policy

Lambda functions created using AWS-provided runtimes follow official long-term support deprecation policies. Third-party runtime deprecation policies may differ from official long-term support. Review your runtime deprecation policy and have a mechanism to report on runtimes that, if deprecated, may prevent your workload from operating as intended.

Review the AWS Lambda runtime support policy page to understand the deprecation schedule for your runtime.

AWS Health provides ongoing visibility into the state of your AWS resources, services, and accounts. Use the AWS Personal Health Dashboard for a personalized view and automate custom notifications to communication channels other than your AWS Account email.

Use AWS Config to report on AWS Lambda function runtimes that might be near their deprecation. Run compliance and operational checks with AWS Config for Lambda functions.

If you are unable to migrate to newer runtimes within the deprecation schedule, use AWS Lambda custom runtimes as an interim solution.
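
As a sketch of such a reporting mechanism, the following Python script lists all functions in a Region and flags any whose runtime appears on a watch list you maintain. The watch list entries are illustrative; check the runtime support policy page for the current schedule.

import boto3

# Illustrative watch list of runtimes approaching deprecation
DEPRECATING_RUNTIMES = {"nodejs10.x", "python2.7", "dotnetcore2.1"}

lambda_client = boto3.client("lambda")
paginator = lambda_client.get_paginator("list_functions")

for page in paginator.paginate():
    for function in page["Functions"]:
        runtime = function.get("Runtime", "")  # Absent for container image functions
        if runtime in DEPRECATING_RUNTIMES:
            print(f"{function['FunctionName']} uses {runtime}")

Run a script like this on a schedule, for example from a Lambda function triggered by a scheduled CloudWatch Events rule, and send the output to a notification channel.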

Improvement plan summary

  1. Identify and report runtimes that might deprecate and their support policy.

Conclusion

Introducing application lifecycle management improves the development, deployment, and management of serverless applications. In part 1, I cover using infrastructure as code with version control to deploy applications in a repeatable manner. This reduces errors caused by manual processes and gives you more confidence your application works as expected. In part 2, I cover prototyping new features using temporary environments, and rollout deployments to gradually shift traffic to new application code.

In this post I cover configuration management, CI/CD for serverless applications, and managing function runtime deprecation.

In an upcoming post, I will cover the first Security question from the Well-Architected Serverless Lens – Controlling access to serverless APIs.

Building well-architected serverless applications: Approaching application lifecycle management – part 2

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-approaching-application-lifecycle-management-part-2/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the nine serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the Introduction post for a table of contents and explanation of the example application.

Question OPS2: How do you approach application lifecycle management?

This post continues part 1 of this Operational Excellence question. Previously, I covered using infrastructure as code with version control to deploy applications in a repeatable manner.

Good practice: Prototype new features using temporary environments

Storing application configuration as infrastructure as code allows deployment of multiple, repeatable, isolated versions of an application.

Create multiple temporary environments for new features you may need to prototype, and tear them down as you complete them. Temporary environments enable fine-grained feature isolation and higher fidelity development when interacting with managed services. This allows you to gain confidence your workload integrates and operates as intended.

These environments can also be in separate accounts, which helps isolate limits, access to data, and resiliency. For best practices on multi-account deployments, see the AWS Partner Network blog post: Best Practices Guide for Multi-Account AWS Deployments.

There are a number of ways to deploy separate environments for an application. To make the deployment simpler, it is good practice to separate dynamic configuration from your infrastructure logic.

For an application managed via the AWS Serverless Application Model (AWS SAM), use an AWS SAM CLI parameter to specify a new stack-name, which deploys a new copy of the application as a separate stack.

For example, there is an existing AWS SAM application with a stack-name of app-test. To deploy a new copy, specify a new stack-name of app-newtest with the following command line:

sam deploy --stack-name app-newtest

This deploys a whole new copy of the application in the same account as a separate stack.

For the serverless airline example used in this series, deploy a whole new copy of the application following the deployment instructions, either into the same AWS account, or a completely different account. This is useful when each developer in a team has a sandbox environment. In this example, you only need to configure payment provider credentials as environment variables and seed the database with possible flights, as these are currently manual post-installation tasks.

However, maintaining an entirely separate codebase copy of an application becomes difficult to manage and reconcile over time.

As the airline application code is stored in a fork in a GitHub account, use git branches for separate environments. In typical development teams, developers may deploy a main branch to production, have a dev branch as staging, and create feature branches when working on new functionality. This allows safe prototyping in sandbox environments without affecting the main codebase, using git as a mechanism to merge code and resolve conflicts. Changes are automatically pushed to production once they are merged into the main (or production) branch.

Git branching flow

As the airline example is using AWS Amplify Console, there are a few different options to create a new environment linked to a feature branch.

You can create a whole new Amplify Console app deployment, either in a separate Region, or in a separate AWS account, which then connects to a feature branch by following the deployment instructions. Create a new branch called new-feature in GitHub and in the Amplify Console, select Connect App, and navigate to the repository and the new-feature branch. Configure the payment provider credentials as environment variables.

Deploy new application pointing to feature branch

You can also connect the existing Amplify Console deployment to a git branch, deploying the new-feature branch into the same AWS account and Region.

Amplify Environments

In the Amplify Console, navigate to the existing app, select Connect Branch, and choose the new-feature branch. Create a new Backend environment to deploy the full stack. If the feature branch is only frontend code changes, you can choose to use the same backend components.

Connect Amplify Console to feature branch

Amplify Console then deploys a new stack in addition to the develop branch, based on the code in the new-feature branch.

New feature branch deploying within existing deployment.

You do not need to add the payment provider environment variables as these are stored per application, per Region, for all branches.

Amplify environment variables for All Branches.

Using git and branching with Amplify Console, you have automatic deployments when any changes are pushed to the GitHub repository. If there are any issues with a particular deployment, you can revert the changes in git which will kick off a redeploy to a known good version. Once you are happy with the feature, you can merge the changes into the production branch which will again kick off another deployment.

As it is simple to set up multiple test environments, make sure to practice good application hygiene, as well as cost management, by identifying and deleting any temporary environments that are no longer required. It may be helpful to include the stack owner’s contact details via CloudFormation tags. Use Amazon CloudWatch scheduled tasks to notify and tag temporary environments for deletion, and provide a mechanism to delay their deletion if needed.
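
Here is a minimal sketch of such a scheduled check, assuming temporary stacks carry an environment=temporary tag and an owner tag. Both tag names are conventions for this example, not part of the airline application.

from datetime import datetime, timedelta, timezone

import boto3

MAX_AGE = timedelta(days=14)  # Illustrative retention period

cloudformation = boto3.client("cloudformation")
paginator = cloudformation.get_paginator("describe_stacks")
now = datetime.now(timezone.utc)

for page in paginator.paginate():
    for stack in page["Stacks"]:
        tags = {tag["Key"]: tag["Value"] for tag in stack.get("Tags", [])}
        # Only consider stacks explicitly tagged as temporary environments
        if tags.get("environment") == "temporary":
            if now - stack["CreationTime"] > MAX_AGE:
                owner = tags.get("owner", "unknown")
                print(f"Stack {stack['StackName']} (owner: {owner}) "
                      f"is older than {MAX_AGE.days} days")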

Prototyping locally

With AWS SAM or a third-party framework, you can run API Gateway, and invoke Lambda function code locally for faster development iteration. Local debugging and testing can help for quick confirmation that function code is working, and is also useful for some unit tests. Local testing cannot duplicate the full functionality of the cloud. It is suited to testing services with custom logic, such as Lambda, rather than trying to duplicate all cloud managed services such as Amazon SNS, or Amazon S3 locally. Don’t try to bring the cloud to the test, rather bring the testing to the cloud.

Here is an example of executing a function locally.

I use AWS SAM CLI to invoke the Airline-GetLoyalty Lambda function locally to test some functionality. AWS SAM CLI uses Docker to simulate the Lambda runtime. As the function only reads from DynamoDB, I use stubbed data, or can set up DynamoDB Local.

1. I pass a JSON event to the function to simulate the event from API Gateway, as well as passing in environment variables as JSON. Create sample events using sam local generate-event.

2. I run sam build GetFunc to build the function dependencies, in this case NodeJS.

$ sam build GetFunc
Building resource 'GetFunc'
Running NodejsNpmBuilder:NpmPack
Running NodejsNpmBuilder:CopyNpmrc
Running NodejsNpmBuilder:CopySource
Running NodejsNpmBuilder:NpmInstall
Running NodejsNpmBuilder:CleanUpNpmrc

Build Succeeded

3. I run sam local invoke passing in the event payload and environment variables. This spins up a Docker container, executes the function, and returns the result.

$ sam local invoke --event src/get/event.json --env-vars local-env-vars.json GetFunc
Invoking index.handler (nodejs10.x)

Fetching lambci/lambda:nodejs10.x Docker container image......
Mounting /home/ec2-user/environment/samples/aws-serverless-airline-booking/src/backend/loyalty/.aws-sam/build/GetFunc as /var/task:ro,delegated inside runtime container
START RequestId: 7be7e9a5-9f2f-1520-fbd1-a013485105d3 Version: $LATEST
END RequestId: 7be7e9a5-9f2f-1520-fbd1-a013485105d3
REPORT RequestId: 7be7e9a5-9f2f-1520-fbd1-a013485105d3 Init Duration: 249.89 ms Duration: 76.40 ms Billed Duration: 100 ms Memory Size: 512 MB Max Memory Used: 54 MB

{"statusCode": 200,"body": "{\"points\":0,\"level\":\"bronze\",\"remainingPoints\":50000}"}

For more information on using AWS SAM to run API Gateway and invoke Lambda functions locally, see the AWS Documentation. For third-party framework solutions, see Invoking AWS Lambda functions locally with Serverless framework and Develop locally against cloud services with Stackery.

Improvement plan summary:

  1. Use a serverless framework to deploy temporary environments named after a feature.
  2. Implement a process to identify temporary environments that may not have been deleted over an extended period of time.
  3. Prototype application code locally and test integrations directly with managed services.

Good practice: Use a rollout deployment mechanism

Use a rollout deployment for production workloads as opposed to all-at-once mechanisms. Rollout deployments reduce the risk of a failed deployment by gradually deploying application changes to a limited set of customers. All-at-once deployments deploy the entire application update in one step, and are best suited for non-production systems.

AWS Lambda versions and aliases

For production Lambda functions, it is best to deploy a new function version for every deployment. Versions can represent the stable version or reflect particular features. Create Lambda aliases, which are pointers to particular function versions. Invoke Lambda functions using the aliases, with a specific alias for the stable production version. If an alias is not specified, the latest application code deployment is invoked, which may not reflect a stable version or a desired feature. Use a new-feature alias version for testing without affecting users of the stable production version.

AWS Lambda function versions and aliases

See the AWS Documentation to manage Lambda function versions and aliases using the AWS Management Console or Lambda API.

Alias routing

Use the Lambda alias routing configuration to introduce traffic shifting, sending a small percentage of traffic to a second function alias or version as part of a rolling deployment. This is commonly called a canary release.

For example, configure a Lambda alias named stable to point to function version 2. A new function version 3 is deployed with alias new-feature. Use the new-feature alias to test the new deployment without impacting production traffic to the stable version.

During production rollout, use alias routing. For example, 90% of invocations route to the stable version while 10% route to alias new-feature pointing to version 3. If the 10% is successful, deployment can continue until all traffic is migrated to version 3, and the stable alias is then pointed to version 3.
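
Here is a sketch of that 90/10 split using the AWS SDK for Python. The alias and version numbers mirror the example above; the function name is illustrative.

import boto3

lambda_client = boto3.client("lambda")

# The stable alias continues to point at version 2, while 10% of
# invocations are routed to the new version 3
lambda_client.update_alias(
    FunctionName="example-function",
    Name="stable",
    FunctionVersion="2",
    RoutingConfig={"AdditionalVersionWeights": {"3": 0.10}},
)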

AWS Lambda alias routing

AWS SAM supports gradual Lambda deployments with a feature called Safe Lambda deployments using AWS CodeDeploy. This creates new versions of a Lambda function, and automatically creates aliases pointing to the new version. Customer traffic gradually shifts to the new version or rolls back automatically if any specified CloudWatch alarms trigger. AWS SAM supports canary, linear, and all-at-once deployment preference types.

Pre-traffic and post-traffic Lambda functions can also verify if the newly deployed code is working as expected.

In the airline example, create a safe deployment for the ReserveBooking Lambda function by adding the example AWS SAM template code specified in the instructions. This migrates 10 percent of traffic every 10 minutes with CloudWatch alarms to check for any function errors. You could also alarm on latency or any custom metric.

During the Amplify Console build phase, the safe deployment is initiated. Navigate to the CodeDeploy console and see the deployment in progress.

AWS CodeDeploy deployment in progress

Selecting the deployment, you can see the Traffic shifting progress and the Deployment details.

AWS CodeDeploy traffic shifting in progress.

Within Deployment details, select the DeploymentGroup, and view the CloudWatch Alarms CodeDeploy is using to test the rollout.

Amazon CloudWatch Alarms AWS CodeDeploy is using to test the rollout

Within Deployment details, select the Application, select the Revisions tab, and select the latest Revision location and view the CurrentVersion and TargetVersion for this deployment.

View deployment versions

View Deployment status and see the traffic has now shifted to the new version. The Amplify Console build also continues.

Traffic shifting complete

View the Lambda function versions and aliases in the Lambda console, selecting Qualifiers.

Viewing Lambda function version and aliases

Amazon API Gateway also supports canary release deployments at the API layer.
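
As a brief sketch, a canary can be configured when creating a deployment for a stage; the REST API ID here is a placeholder.

import boto3

apigateway = boto3.client("apigateway")

# Send 10% of the stage's traffic to the new deployment as a canary
apigateway.create_deployment(
    restApiId="a1b2c3d4e5",
    stageName="prod",
    canarySettings={"percentTraffic": 10.0},
)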

A rollout deployment provides traffic shifting, A/B testing, and the ability to roll back to any version at any point in time. AWS SAM makes it simple to add safe deployments to serverless applications.

Improvement plan summary

  1. For production systems, use a linear deployment strategy to gradually roll out changes to customers.
  2. For high volume production systems, use a canary deployment strategy when you want to limit changes to a fixed percentage of customers for an extended period of time.

Conclusion

Introducing application lifecycle management improves the development, deployment, and management of serverless applications. In this post I cover a number of methods to prototype new features using temporary environments. I show how to use rollout deployments to gradually shift traffic to new application code.

This well-architected question will continue in an upcoming post where I look at configuration management, CI/CD for serverless applications, and managing function runtime deprecation.

Building well-architected serverless applications: Approaching application lifecycle management – part 1

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-approaching-application-lifecycle-management-part-1/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the nine serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the Introduction post for a table of contents and explanation of the example application.

Question OPS2: How do you approach application lifecycle management?

Adopt lifecycle management approaches that improve the flow of changes to production with higher fidelity, fast feedback on quality, and quick bug fixing. These practices help you rapidly identify, remediate, and limit changes that impact customer experience. By having an approach to application lifecycle management, you can reduce errors caused by manual processes and increase the levels of control to gain confidence your workload operates as intended.

Required practice: Use infrastructure as code and stages isolated in separate environments

Infrastructure as code is a process of provisioning and managing cloud resources by storing application configuration in a template file. Using infrastructure as code helps to deploy applications in a repeatable manner, reducing errors caused by manual processes such as creating resources in the AWS Management Console.

Storing code in a version control system enables tracking and auditing of changes and releases over time. This is used to roll back changes safely to a known working state if there is an issue with an application deployment.

Infrastructure as code

For AWS Cloud development, the built-in choice for infrastructure as code is AWS CloudFormation. The template file, written in JSON or YAML, contains a description of the resources an application needs. CloudFormation automates the deployment and ongoing updates of the resources by creating CloudFormation stacks.

CloudFormation code example creating infrastructure

There are a number of higher-level tools and frameworks that abstract and then generate CloudFormation. A serverless specific framework helps model the infrastructure necessary for serverless workloads, providing either declarative or imperative mechanisms to define event sources for functions. It wires permissions between resources automatically, adds resource configuration, code packaging, and any infrastructure necessary for a serverless application to run.

The AWS Serverless Application Model (AWS SAM) is an AWS open-source framework optimized for serverless applications. The AWS Cloud Development Kit allows you to provision cloud resources using familiar programming languages such as TypeScript, JavaScript, Python, Java, and C#/.Net. There are also third-party solutions for creating serverless cloud resources such as the Serverless Framework.

The AWS Amplify Console provides a git-based workflow for building, deploying, and hosting serverless applications including both the frontend and backend. The AWS Amplify CLI toolchain enables you to add backend resources using CloudFormation.

For a large number of resources, consider breaking common functionality such as monitoring, alarms, or dashboards into separate infrastructure as code templates. With CloudFormation, use nested stacks to help deploy them as part of your serverless application stack. When using AWS SAM, import these nested stacks as nested applications from the AWS Serverless Application Repository.

AWS CloudFormation nested stacks

Here is an example AWS SAM template using nested stacks. There are two AWS::Serverless::Application nested resources, api.template.yaml and database.template.yaml. For more information on nested stacks, see the AWS Partner Network blog post: CloudFormation Nested Stacks Primer.

Version control

The serverless airline example application used in this series uses Amplify Console to provide part of the backend resources, including authentication using Amazon Cognito, and a GraphQL API using AWS AppSync.

The airline application code is stored in GitHub as a version control system. Fork, or copy, the application to your GitHub account. Configure Amplify Console to connect to the GitHub fork.

When pushing code changes to a fork, Amplify Console automatically deploys these backend resources along with the rest of the application. It hosts the application at the Production branch URL, and you can also configure a custom domain name if needed.

AWS Amplify Console App details

The Amplify Console configuration to create the API and Authentication backend resources is found in the backend-config.json file. The resources are provisioned during the Amplify Console build phase.

To view the deployed resources, within the Amplify Console, navigate to the awsserverlessairline application. Select Backend environments and then select an environment, in this example sampledev.

Select the API and Authentication tabs to view the created backend resources.

AWS Amplify Console deployed backend resources

Using multiple tools

Applications can use multiple tools and frameworks even within a single project to manage the infrastructure as code. Within the airline application, AWS SAM is also used to provision the rest of the serverless infrastructure using nested stacks. During the Amplify Console build process, the Makefile contains the AWS SAM build instructions for each application service.

For example, the AWS SAM build instructions to deploy the booking service are as follows:

deploy.booking: ##=> Deploy booking service using SAM
	$(info [*] Packaging and deploying Booking service...)
	cd src/backend/booking && \
		sam build && \
		sam package \
			--s3-bucket $${DEPLOYMENT_BUCKET_NAME} \
			--output-template-file packaged.yaml && \
		sam deploy \
			--template-file packaged.yaml \
			--stack-name $${STACK_NAME}-booking-$${AWS_BRANCH} \
			--capabilities CAPABILITY_IAM \
			--parameter-overrides \
	BookingTable=/$${AWS_BRANCH}/service/amplify/storage/table/booking \
	FlightTable=/$${AWS_BRANCH}/service/amplify/storage/table/flight \
	CollectPaymentFunction=/$${AWS_BRANCH}/service/payment/function/collect \
	RefundPaymentFunction=/$${AWS_BRANCH}/service/payment/function/refund \
	AppsyncApiId=/$${AWS_BRANCH}/service/amplify/api/id \
	Stage=$${AWS_BRANCH}

Each service has its own AWS SAM template.yml file. The files contain the resources for each of the booking, catalog, log-processing, loyalty, and payment services. This means that the services can be managed independently within the application as separate stacks. In larger applications, these services may be managed by separate teams, or be in separate repositories, environments or AWS accounts. It may make sense to split out some common functionality such as alarms, or dashboards into separate infrastructure as code templates.

AWS SAM can also use IAM roles to assume temporary credentials and deploy a serverless application to separate AWS accounts.

For more information on managing serverless code, see Best practices for organizing larger serverless applications.

View the deployed resources in the AWS CloudFormation Console. Select Stacks from the left-side navigation bar, and select the View nested toggle.

Viewing CloudFormation nested stacks

The serverless airline application is a more complex example application comprising multiple services composed of multiple CloudFormation stacks. Some stacks are managed via Amplify Console and others via AWS SAM. Using infrastructure as code is not only for large and complex applications. As a best practice, we suggest using SAM or another framework for even simple, small serverless applications with a single stack. For a getting started tutorial, see the example Deploying a Hello World Application.

Improvement plan summary:

  1. Use a serverless framework to help you execute functions locally, build and package application code. Separate packaging from deployment, deploy to isolated stages in separate environments, and support secrets via configuration management systems.
  2. For a large number of resources, consider breaking common functionalities such as alarms into separate infrastructure as code templates.

Conclusion

Introducing application lifecycle management improves the development, deployment, and management of serverless applications. In this post I cover using infrastructure as code with version control to deploy applications in a repeatable manner. This reduces errors caused by manual processes and gives you more confidence your application works as expected.

This well-architected question will continue in an upcoming post where I look further at deploying to multiple stages using temporary environments, and rollout deployments.

Building well-architected serverless applications: Understanding application health – part 2

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-understanding-application-health-part-2/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the nine serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the Introduction post for a table of contents and explanation of the example application.

Question OPS1: How do you evaluate your serverless application’s health?

This post continues part 1 of this Operational Excellence question. Previously, I covered using Amazon CloudWatch out-of-the-box standard metrics and alerts, and structured and centralized logging with the Embedded Metrics Format.

Best practice: Use application, business, and operations metrics

Identifying key performance indicators (KPIs), including business, customer, and operations outcomes, in addition to application metrics, helps to show a higher-level view of how the application and business are performing.

Business KPIs measure your application performance against business goals. For example, if fewer flight reservations are flowing through the system, the business would like to know.

Customer experience KPIs highlight the overall effectiveness of how customers use the application over time. Examples are perceived latency, time to find a booking, make a payment, etc.

Operational metrics help to see how operationally stable the application is over time. Examples are continuous integration/delivery/deployment feedback time, mean-time-between-failure/recovery, number of on-call pages and time to resolution, etc.

Custom Metrics

Embedded Metrics Format can also emit custom metrics to help you understand how your workload’s health impacts the business.

The airline booking service uses AWS Step Functions, AWS Lambda, Amazon SNS, and Amazon DynamoDB.

In the confirm booking module function handler, I add a new namespace and dimension to associate this set of logs with this application and service.

metrics.set_namespace("ServerlessAirlineEMF")
metrics.put_dimensions({"service":"confirm_booking"})

Within the try block, I emit a metric for a successful booking:

metrics.put_metric("BookingSuccessful", 1, "Count")

Within the except block, I emit a metric for a failed booking:

metrics.put_metric("BookingFailed", 1, "Count")

Once I make a booking, within the CloudWatch console, I navigate to Logs | Log groups and select the Airline-ConfirmBooking-develop group. I select a log stream and find the custom metric as part of the structured log entry.

Custom metric structured log entry

I can also create a custom metric graph. Within the CloudWatch console, I navigate to Metrics. I see the ServerlessAirlineEMF Custom Namespace is available.

Custom metric namespace

I select the Namespace and the available metric.

Available metric

I select a Line graph, add a name, and see the successful booking now plotted on the graph.

Plotted CloudWatch metric

I can also visualize and analyze this metric using CloudWatch Insights.

Once a booking is made, within the CloudWatch console, I navigate to Logs | Insights. I select the /aws/lambda/Airline-ConfirmBooking-develop log group. I choose Run query which shows a list of discovered fields on the right of the console.

I can search for discovered booking related fields.

I then enter the following in the query pane to search the logs and plot the sum of BookingReference and choose Run Query:

fields @timestamp, @message
| stats sum(@BookingReference)

CloudWatch Insights query

There are a number of other component metrics that are important to measure. Track interactions between upstream and downstream components such as message queue length, integration latency, and throttling.

Improvement plan summary:

  1. Identify user journeys and metrics from each customer transaction.
  2. Create custom metrics asynchronously instead of synchronously for improved performance, cost, and reliability outcomes.
  3. Emit business metrics from within your workload to measure application performance against business goals.
  4. Create and analyze component metrics to measure interactions with upstream and downstream components.
  5. Create and analyze operational metrics to assess the health of your continuous delivery pipeline and operational processes.

Good practice: Use distributed tracing and code is instrumented with additional context

Logging provides information on the individual point in time events the application generates. Tracing provides a wider continuous view of an application. Tracing helps to follow a user journey or transaction through the application.

AWS X-Ray is an example of a distributed tracing solution; there are a number of third-party options as well. X-Ray collects data about the requests that your application serves and also builds a service graph, which shows all the components of an application. This provides visibility to understand how upstream/downstream services may affect workload health.

For the most comprehensive view, enable X-Ray across as many services as possible and include X-Ray tracing instrumentation in code. This is the list of AWS Services integrated with X-Ray.

X-Ray receives data from services as segments. Segments for a common request are then grouped into traces. Segments can be further broken down into more granular subsegments. Custom data key-value pairs are added to segments and subsegments with annotations and metadata. Traces can also be grouped which helps with filter expressions.

AWS Lambda instruments incoming requests for all supported languages. Lambda application code can be further instrumented to emit information about its status, correlation identifiers, and business outcomes to determine transaction flows across your workload.

X-Ray tracing for a Lambda function is enabled in the Lambda Console. Select a Lambda function. I select the Airline-ReserveBooking-develop function. In the Configuration pane, I select to enable X-Ray Active tracing.

X-Ray tracing enabled

X-Ray can also be enabled via CloudFormation with the following code:

TracingConfig:
  Mode: Active

Lambda IAM permissions to write to X-Ray are added automatically when active tracing is enabled via the console. When using CloudFormation, allow the actions xray:PutTraceSegments and xray:PutTelemetryRecords in the function’s execution role.

It is important to understand which invocations X-Ray traces. X-Ray applies a sampling algorithm. If an upstream service, such as API Gateway with X-Ray tracing enabled, has already sampled a request, the Lambda function request is also sampled. Without an upstream request, X-Ray traces data for the first Lambda invocation each second, and then 5% of additional invocations.

For the airline application, X-Ray tracing is initiated within the shared library with the code:

from aws_xray_sdk.core import models, patch_all, xray_recorder

Segments, subsegments, annotations, and metadata are added to functions with the following example code:

segment = xray_recorder.begin_segment('segment_name')
# Start a subsegment
subsegment = xray_recorder.begin_subsegment('subsegment_name')
# Add metadata and annotations
segment.put_metadata('key', dict, 'namespace')
subsegment.put_annotation('key', 'value')
# Close the subsegment and segment
xray_recorder.end_subsegment()
xray_recorder.end_segment()

For example, within the collect payment module, an annotation is added for a successful payment with:

tracer.put_annotation("PaymentStatus", "SUCCESS")

CloudWatch ServiceLens

Once a booking is made and payment is successful, the tracing is available in the X-Ray console.

I explore how Amazon CloudWatch ServiceLens connects metrics, logs, and X-Ray traces. Within the CloudWatch console, I navigate to ServiceLens | Service Map.

I can visualize all application resources and dependencies where X-Ray is enabled. I can trace performance or availability issues. If there was an issue connecting to SNS for example, this would be shown.

I select the Airline-CollectPayment-develop node and can view the out-of-the-box standard Lambda metrics.

I can select View Logs to jump to the CloudWatch Logs Insights console.

CloudWatch Insights Service map

I select View dashboard to see the function metrics, node map, and function details.

CloudWatch Insights Service Map dashboard

I select View traces and can filter by the custom annotation PaymentStatus. I select SUCCESS and choose Add to filter. I then select a trace.

CloudWatch Insights Filtered traces

I see the full trace details, which show the full application transaction of a payment collection.

Segments timeline

Selecting the Lambda handler subsegment – ## lambda_handler, I can view the trace Annotations and Metadata, which include the business transaction details such as Customer and PaymentStatus.

X-Ray annotations

Trace groups are another feature of X-Ray and ServiceLens. Trace groups use filter expressions, such as Annotation.PaymentStatus = "FAILED", to view traces that match the particular group. Service graphs can also be viewed, and CloudWatch alarms created based on the group.

CloudWatch ServiceLens provides powerful capabilities to understand application performance bottlenecks and issues, helping determine how users are impacted.

Improvement plan summary:

  1. Identify common business context and system data that are commonly present across multiple transactions.
  2. Instrument SDKs and requests to upstream/downstream services to understand the flow of a transaction across system components.

Recent announcements

There have been a number of recent announcements for X-Ray and CloudWatch to improve how to evaluate serverless application health.

Conclusion

Evaluating application health helps you identify which services should be optimized to improve your customer’s experience. In part 1, I cover out-of-the-box standard metrics and alerts, as well as structured and centralized logging. In this post, I explore custom metrics and distributed tracing and show how to use ServiceLens to view logs, metrics, and traces together.

In an upcoming post, I will cover the next Operational Excellence question from the Well-Architected Serverless Lens – Approaching application lifecycle management.

Building well-architected serverless applications: Understanding application health – part 1

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-understanding-application-health-part-1/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the nine serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the Introduction post for a table of contents and explanation of the example application.

Question OPS1: How do you evaluate your serverless application’s health?

Evaluating your metrics, distributed tracing, and logging gives you insight into business and operational events, and helps you understand which services should be optimized to improve your customer’s experience. By understanding the health of your serverless application, you will know whether it is functioning as expected, and can proactively react to any signals that indicate it is becoming unhealthy.

Required practice: Understand, analyze, and alert on metrics provided out of the box

It is important to understand metrics for every AWS service used in your application so you can decide how to measure its behavior. AWS services provide a number of out-of-the-box standard metrics to help monitor the operational health of your application.

As these metrics are generated automatically, they are a simple way to start monitoring your application and can also be augmented with custom metrics.

The first stage is to identify which services the application uses. The airline booking component uses AWS Step Functions, AWS Lambda, Amazon SNS, and Amazon DynamoDB.

When I make a booking, as shown in the Introduction post, AWS services emit metrics to Amazon CloudWatch. These are processed asynchronously without impacting the application’s performance.

There are two default CloudWatch dashboards to visualize key metrics quickly: per service and cross service.

Per service

To view the per service metrics dashboard, I open the CloudWatch console.

I select a service where Overview is shown, such as Lambda. Now I can view the metrics for all Lambda functions in the account.

Cross service

To see an overview of key metrics across all AWS services, open the CloudWatch console and choose View cross service dashboard.

I see a list of all services with one or two key metrics displayed. This provides a good overview of all services your application uses.

Alerting

The next stage is to identify the key metrics for comparison and set up alerts for under- and over-performing services. Here are some recommended metrics to alarm on for a number of AWS services.

Alerts can be configured manually or via infrastructure as code tools such as the AWS Serverless Application Model, AWS CloudFormation, or third-party tools.

To configure a manual alert for Lambda function errors using CloudWatch Alarms:

  1. I open the CloudWatch console, select Alarms, and select Create alarm.
  2. I choose Select metric and, from AWS Namespaces, select Lambda, then Across All Functions. I select Errors and choose Select metric.
  3. I change the Statistic to Sum and the Period to 1 minute.
  4. Under Conditions, I select a Static threshold Greater than 1 and select Next.

Alarms can also be created using anomaly detection rather than static values if there is a discernible pattern or trend. Anomaly detection looks at past metric data and uses machine learning to create a model of expected values. Alerts can then be configured to trigger if metrics fall outside this band of “normal” values. I use a Static threshold for this alarm.

  5. For the notification, I set the alarm to notify an existing SNS topic with my email address, then choose Next.
  6. I enter a descriptive alarm name such as serverlessairline-lambda-prod-errors > 1, select Next, and choose Create alarm.

I have now manually set up an alarm.

Use CloudWatch composite alarms to combine multiple alarms to reduce noise and focus on critical issues. For example, a single alarm could trigger if there are both Lambda function errors as well as high Lambda concurrent executions.
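
As a sketch of that example, the following code creates a composite alarm from two existing alarms. Both child alarm names are hypothetical.

import boto3

cloudwatch = boto3.client("cloudwatch")

# Assumes two existing alarms with these placeholder names
cloudwatch.put_composite_alarm(
    AlarmName="serverlessairline-lambda-critical",
    AlarmRule=(
        "ALARM(serverlessairline-lambda-prod-errors) "
        "AND ALARM(serverlessairline-lambda-prod-concurrency)"
    ),
    AlarmDescription="Trigger only when errors and concurrency alarm together",
    ActionsEnabled=True,
)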

It is simpler and more scalable to include alerting within infrastructure as code. Here is an example of alerting programmatically using CloudFormation.
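
Alternatively, if you script alarm creation with the AWS SDK rather than a template, a sketch equivalent to the manual alarm above might look like the following. The SNS topic ARN is a placeholder.

import boto3

cloudwatch = boto3.client("cloudwatch")

# Mirrors the manual alarm: Lambda errors across all functions > 1 per minute
cloudwatch.put_metric_alarm(
    AlarmName="serverlessairline-lambda-prod-errors > 1",
    Namespace="AWS/Lambda",
    MetricName="Errors",
    Statistic="Sum",
    Period=60,
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],  # Placeholder topic
)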

I view the out of the box standard metrics and in this example, manually create an alarm for Lambda function errors.

Improvement plan summary:

  1. Understand what metrics and dimensions each managed service used provides.
  2. Configure alerts on relevant metrics for when services are unhealthy.

Good practice: Use structured and centralized logging

Central logging provides a single place to search and analyze logs. Structured logging means selecting a consistent log format and content structure to simplify querying across multiple components.

To identify a business transaction across components, such as a particular flight booking, log operational information from upstream and downstream services. Add information such as customer_id along with business outcomes such as order=accepted or order=confirmed. Make sure you are not logging any sensitive or personally identifiable data in any logs.

Use JSON as your logging output format. Log multiple fields in a single object or dictionary rather than many one-line messages for simpler searching.

Here is an example of a structured logging format.
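
For instance, a booking component might log a single JSON object per business event. The field names in this Python sketch are illustrative.

import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def log_booking_event(customer_id, booking_id, outcome):
    # Emit one JSON object per event rather than many one-line messages
    entry = {
        "service": "booking",
        "customer_id": customer_id,
        "booking_id": booking_id,
        "order": outcome,
    }
    logger.info(json.dumps(entry))

log_booking_event("customer-1234", "booking-5678", "confirmed")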

The airline booking component, which is written in Python, currently uses a shared library with a separate log processing stack.

Embedded Metrics Format is a simpler mechanism to replace the shared library and use structured logging. CloudWatch Embedded Metrics adds environmental metadata, such as the Lambda function version, and also automatically extracts custom metrics so you can visualize and alarm on them. There are open-source client libraries available for Node.js and Python.

I then add embedded metrics to the individual confirm booking module with the following steps:

  1. I install the aws-embedded-metrics library using the instructions.
  2. In the function init code, I import the module and create a metric_scope with the following code

from aws_embedded_metrics import metric_scope
@metric_scope

  3. In the function handler, I log the generated bookingReference with the following code.

metrics.set_property("BookingReference", ret["bookingReference"])

In this example I also log the entire incoming event details.

metrics.set_property("event", event)

It is best practice to only log what is required to avoid unnecessary costs. Ensure the event does not contain any sensitive or personally identifiable data, as it is available to anyone who has access to the logs.

To avoid the duplicate logging in this example airline application which adds cost, I remove the existing shared library logger.*() lines.

When I make a booking, the CloudWatch log message is in structured JSON format. It contains the properties I set, event and BookingReference, as well as function metadata.

I can then search for all log activity related to a specific booking across multiple functions with booking_id. I can track customer activity across multiple bookings using customer_id.

Logging is often created as a shared library resource which all functions reference. Another option is using Lambda Layers, which lets functions import additional code such as external libraries. Multiple functions can share this code.

Improvement plan summary:

  1. Log request identifiers from downstream services, component name, component runtime information, unique correlation identifiers, and information that helps identify a business transaction.
  2. Use JSON as the logging output. Prefer logging entire objects/dictionaries rather than many one line messages. Mask or remove sensitive data when logging.
  3. Keep debug logging to a minimum, as it can incur cost and increase the noise-to-signal ratio.

Conclusion

Evaluating serverless application health helps you understand which services should be optimized to improve your customer’s experience. I covered out-of-the-box metrics and alerts, as well as structured and centralized logging.

This well-architected question will be continued in an upcoming post where I look at custom metrics and distributed tracing.

Building well-architected serverless applications: Introduction

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-introduction/

Customers building production applications in the cloud are looking to build and operate their applications following best practices. Best practices are useful throughout the software development lifecycle and help to answer the question “am I well-architected?”

Serverless technologies provide a solid foundation for building well-architected applications with a goal of minimizing the impact of issues that can happen. What does it mean to be “well architected”?

In 2015, AWS released the AWS Well-Architected Framework.  It’s a structured way to compare applications against AWS architectural best practices with advice on how to improve. This formalized framework publicly documents the approach our solutions architects use when conducting architectural reviews. The Well-Architected Framework can be used by customers and partners to evaluate applications. It’s currently based on five pillars:

  • Operational Excellence
  • Security
  • Reliability
  • Performance Efficiency
  • Cost Optimization

In 2017, AWS extended the framework’s general advice to be more application domain specific with the concept of a “lens”. A lens adds additional questions for specific technology areas, which focus on what is different from the generic advice. Today, there are three technology domain lenses:

  • Serverless Applications
  • High Performance Computing (HPC)
  • Internet of Things (IoT)

In 2018, AWS added the AWS Well-Architected Tool within the AWS Management Console. The tool allows you to define a Workload and answer a number of questions based on each of the five pillars. Each question has context to explain what the question means and why it is important, and then provides a number of best practices.

The tool is used to assess risks and find opportunities for improvement in workloads. It can also provide a broader view across multiple applications for a central team. Each question has a number of best practices to follow, categorized as high or medium risk. This can help you decide where to focus.
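The tool can also be driven programmatically. A minimal sketch using the boto3 wellarchitected client, with placeholder values (and assuming API access to the tool is available to you), might be:

import boto3

wellarchitected = boto3.client("wellarchitected")

# Define a workload and attach lenses for review.
workload = wellarchitected.create_workload(
    WorkloadName="serverlessairline",
    Description="Airline booking example application",
    Environment="PRODUCTION",
    AwsRegions=["us-east-1"],
    Lenses=["wellarchitected", "serverless"],
)

print(workload["WorkloadId"])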

Serverless Lens

In February, we added functionality to apply lenses to the Well-Architected Tool. The first one is the Serverless Lens. This includes nine questions, each with additional best practice recommendations to follow.

Applying best practices to production applications may take more time and effort. The Well-Architected Tool and Serverless Lens are available to help, and are not intended to be only a one-time check.

We suggest, at a minimum, reading the whitepapers and becoming familiar with the questions before the design phase. It’s a good idea to check throughout the application lifecycle: halfway (or sooner), close to launch, and post-launch for each iteration. Although it is never too late to add best practices to your applications, starting early reduces remediation effort and makes best practices part of the full application lifecycle journey.

The following posts are a multi-part series addressing each of the questions within the Serverless Lens of the Well-Architected Tool.

Series Episodes: Building well-architected serverless applications

  1. Introduction
  2. Operational Excellence: Understanding serverless application health
    1. Out of the box metrics and alerts; structured and centralized logging
    2. Custom metrics and distributed tracing (upcoming)
  3. Operational Excellence – Approaching application lifecycle management (upcoming)
  4. Security – Controlling access to serverless APIs (upcoming)
  5. Security – Managing serverless security boundaries (upcoming)
  6. Security – Implementing application security (upcoming)
  7. Reliability – Regulating inbound request rates (upcoming)
  8. Reliability – Building resiliency into serverless applications (upcoming)
  9. Performance Efficiency – Optimizing your Serverless Application Performance (upcoming)
  10. Cost Optimization – Optimizing your Serverless Application Costs (upcoming)

Example application

For this series, I am looking at an example application, AWS Serverless Airline Booking. This is a complete web application for a fictional airline booking service built using serverless technologies that provide Flight Search, Payment, Booking, and Loyalty Point features.

I show how to use the Well-Architected Tool with the Serverless Lens to help apply serverless best practices to the application.

This web application is the theme of Season 2 of Build on Serverless, which is a video series on Twitch.tv, and was presented at AWS re:Invent 2019.

You can view this application directly on GitHub or deploy into an AWS account using the Getting Started instructions. This is not required to follow this Building Well-Architected Serverless Applications series but allows you to explore the code.

The application architecture consists of a frontend web application, which interacts with a number of backend microservices to provide the airline booking functionality.

There are four backend services that make up the application:

  • Catalog: Provides flight search.
  • Booking: Provides new bookings and booking listings. Business workflow implemented in Python.
  • Payment: Provides payment authorization, collection, and refund workflows using Stripe. Business workflow implemented in Python.
  • Loyalty: Provides loyalty points for customers, including tiers. Implemented in TypeScript.

Once the application is deployed, I can create a booking by searching for flights.

Once I select an available flight, I enter payment details using a test Stripe credit card.

The application authorizes the payment, confirms, and stores the booking and then updates the loyalty points.

Next steps

Building serverless applications using best practices can give you more confidence in the architecture and operations of your workloads.

Using the Well-Architected Tool and Serverless Lens can give you more visibility into your applications. They can help pinpoint and rank areas to improve. Well-Architected works best when integrated into your application lifecycle processes.

Continue learning in the next post, which dives into the first Well-Architected Serverless Lens question: Operational Excellence – Understanding Serverless Application Health – part 1.

Five Talent Collaborates with Customers Using the AWS Well-Architected Tool

Post Syndicated from Scott Sprinkel original https://aws.amazon.com/blogs/architecture/five-talent-collaborates-with-customers-using-the-aws-well-architected-tool/

Since its launch at re:Invent 2018, the AWS Well-Architected Tool (AWS W-A Tool) has provided a consistent process for documenting and measuring architecture workloads using the best practices from the AWS Well-Architected Framework. However, sharing workload reports for a collaborative work experience was time-consuming.

Well-Architected Tool

The new workload sharing feature solves these issues by offering a simple way to share workloads with other AWS accounts and AWS Identity and Access Management (IAM) users. Companies can leverage workload sharing to securely and efficiently collaborate and provide feedback about architecture implementation and design without sharing confidential account details through emails and PDFs. Multiple people across multiple organizations can now review a workload simultaneously and provide feedback and input.
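As an illustration, a workload can be shared with another AWS account in a single API call. A minimal boto3 sketch, with placeholder IDs, might be:

import boto3

wellarchitected = boto3.client("wellarchitected")

# Share a workload read-only with another AWS account.
wellarchitected.create_workload_share(
    WorkloadId="abcdef0123456789abcdef0123456789",
    SharedWith="111122223333",
    PermissionLevel="READONLY",
)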

Five Talent, a partner on the AWS Partner Network (APN), uses the workload sharing feature for a more collaborative Well-Architected review experience. With workload sharing, the company provides its clients with improved efficiency, transparency, and security.

“The new sharing feature has increased the efficiency across client and partner teams, which decreases the average time to remediate the high risks. By sharing the reviews in the AWS Console, we can protect sensitive customer data while staying informed in real time.” – Ryan Comingdeer, Chief Talent Officer, Five Talent.

Previously, Five Talent defined workloads in its AWS account and asked its clients to submit their workload information in a custom-built webform. Five Talent then generated and sent PDF reports via email. This was problematic for several reasons: Five Talent and its clients couldn’t control who had access to the report PDFs, it had no way to expedite high risk issue (HRI) remediation, and its recommendations could easily get lost in email correspondence. The workload sharing feature solves these problems and builds customer confidence through the ability for multiple people to work on reviews collaboratively.

Five Talent added extra security by customizing its workload sharing access controls based on its customer needs. The company is using the notes sections in the workload for secure and accurate communication. It shares links to documentation that help clients take the initiative on HRIs, enabling quicker remediation and more transparent review cycles. Five Talent also highlights milestones in the AWS W-A Tool, enabling customers to prioritize HRIs without sorting through lengthy PDFs and email threads, which ultimately expedites the review and revision process.

The workload sharing feature has helped Five Talent drive down HRIs without requiring direct access to the AWS account where the workload is defined. This transparency and ability to work simultaneously helps keep all teams accountable while reinforcing the principles of the Well-Architected Framework.

Sign in to the Console and check out the new Shares tab in the AWS Well-Architected Tool, or visit the workload shares documentation to learn more.

AWS Security Profiles: Greg McConnel, Senior Manager, Security Specialists Team

Post Syndicated from Becca Crockett original https://aws.amazon.com/blogs/security/aws-security-profiles-greg-mcconnel-senior-manager-security-specialists-team/

Greg McConnel, Senior Manager
In the weeks leading up to re:Invent 2019, we’ll share conversations we’ve had with people at AWS who will be presenting at the event so you can learn more about them and some of the interesting work that they’re doing.


How long have you been at AWS, and what do you do in your current role?

I’ve been with AWS nearly seven years. I was previously a Solutions Architect on the Security Specialists Team for the Americas, and I recently took on an interim management position for the team. Once we find a permanent team lead, I’ll become the team’s West Coast manager.

How do you explain your job?

I’m on the Security Specialist team, which is a customer-facing role where we get to talk to customers about the security, identity, and compliance of AWS. We help enable customers to securely and compliantly adopt the AWS Cloud. Some of what we do is one-to-one with customers, but we also engage in a lot of one-to-many activities. My team talks to customers about their security concerns, how best to secure their workloads on AWS, and how to use our services alongside partner solutions to improve their security posture. A big part of my role is building content and delivering that to customers. This includes things like white papers, blog posts, hands-on workshops, Well-Architected reviews, and presentations.

What are you currently working on that you’re most excited about?

In the Security Specialist team, the one-to-one engagements we have with customers are primarily short-term and centered around particular questions or topics. Although this can benefit customers a great deal, it’s a reactive model. In order to move to a more proactive stance, we plan to start working with individual customers on a longer-term basis to help them address particular security challenges and opportunities. This will remove any barriers to working directly with a Security Specialist and allow us to dive deep into the security concerns of our customers.

What’s the most challenging part of your job?

Keeping up with the pace of innovation. Just about everyone who works with AWS experiences this, whether you’re a customer or working in the AWS field. AWS is launching so many amazing things all the time that it can be challenging to keep up. The specialist SA role is attractive because the area we cover is somewhat limited. It’s a carved-out space within the larger, continuously expanding body of AWS knowledge. So it’s a little easier, and it allows me to dive deeper into particular security topics. But even within this carved-out area of security, it takes some dedication to stay current and informed for my customers. I find things like Jeff Barr’s AWS News Blog and the AWS Security Blog really valuable. Also, I’m a big fan of hands-on learning, so just playing around with new services, especially by following along with examples in new blog posts, can be an interesting way to keep up.

What’s your favorite part of your job?

Creating content, particularly workshops. For me, hands-on learning is not only an effective way to learn, it’s simply more fun. The process of creating workshops with a team can be rewarding and educational. It’s a creative, interactive process. Getting to see others try out your workshop and provide feedback is so satisfying, especially when you can tell that people now understand new concepts because of the work you did. Watching other people deliver a workshop that you built is also rewarding.

In your opinion, what’s the biggest challenge facing cloud security right now?

AWS allows you to move very fast. One of the primary advantages of cloud computing is the speed and agility it provides. When you’re moving fast though, security can be overlooked a bit. It’s so easy to start building on AWS that customers might not invest the necessary cycles on security, or they might think that doing so will slow them down. The reality, though, is that AWS provides the tools to allow you to stay agile while maintaining—and in many cases improving—your security. Automation and continuous monitoring are the keys here. AWS makes it possible to automate many basic security tasks, like patching. With the right tooling, you also gain the visibility needed to accurately identify and monitor critical assets and data. We provide great solutions for logging and monitoring that are highly integrated with our other services. So it’s quite possible now to move fast and stay secure, and it’s important that you do both.

What security best practices do you recommend to customers?

There are certain services we recommend that customers look into enabling because they’re an easy win when it comes to security. These aren’t requirements because there are many great partner solutions that customers can also use. But these are solutions that I’d encourage customers to at least consider:

  • Turn on Amazon GuardDuty, which is a threat detection service that continuously monitors for malicious and unauthorized activity. The benefits it provides versus the effort involved to enable it make it a pretty easy choice. You literally click a button to turn on GuardDuty, and the service starts analyzing tens of billions of events across a number of AWS data sources. A minimal sketch of enabling it programmatically follows this list.
  • Work with your account team to get a Well-Architected review of your key workloads, especially with a focus on the Security pillar. You can even run well-architected reviews yourself with the AWS Well-Architected Tool. A well-architected review can help you build secure, high-performing, resilient, and efficient infrastructure for your application.
  • Use IAM access advisor to help ensure that all the credentials in your AWS accounts are not overly broad by examining your service last accessed information.
  • Use AWS Organizations for centralized administration of your credentials management and governance, and enable all features mode to take advantage of everything the service offers.
  • Practice the principle of least privilege.
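Here is that sketch. Enabling GuardDuty programmatically is a single call; the region and settings are whatever your session defaults to:

import boto3

guardduty = boto3.client("guardduty")

# One call enables GuardDuty threat detection in the current region.
detector_id = guardduty.create_detector(Enable=True)["DetectorId"]
print(f"GuardDuty detector enabled: {detector_id}")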

What does cloud security mean to you, personally?

This is the first time in a while that I’ve worked for a company where I actually like the product. I’m constantly surprised by the new services and features we release and what our customers are building on AWS. My background is in security, especially identity, so that’s the area I am most passionate about within AWS. I want people to use AWS because they can build amazing things at a pace that was just not possible before the cloud—but I want people to do all of this securely. Being able to help customers use our services and do so securely keeps me motivated.

You have a passion for building great workshops. What are you and Jesse Fuchs doing to help make workshops at re:Invent better?

Jesse and I have been working on a central site for AWS security workshops. We post a curated list of workshops, and everything on the site has gone through an internal bar raiser review. This is a formal process for reviewing the workshops and other content to make sure they meet certain standards, that they’re up to date, and that they follow a certain format so customers will find them familiar no matter which workshop they go through.

We’re now implementing a version of this bar raiser review for every workshop at re:Invent this year. In the past, there were people who helped review workshops, but there was no formal, independent bar raising process. Now, every re:Invent workshop will go through it. Eventually, we want to extend this review to other AWS events.

Over time, you’ll see a lot more workshops added to the site. All of these workshops can either be delivered by AWS to the customer, or by customers to their own internal teams. It’s all Open Source too. Finally, we’ll keep the workshops up to date as new features are added.

The workshop you and Jesse Fuchs will host at re:Invent promises that attendees will build an end-to-end functional app with a secure identity provider. How can you build such an app in such a relatively short time?

I know it sounds like a tall order, but the workshop is designed to be completed in the allotted time. And since the content will be up on GitHub, you can also work on it afterwards on your own. It’s a level 400 workshop, so we’re expecting people to come into the session with a certain level of familiarity with some of these services. The session is two and half hours, and a big part of this workshop is the one-on-one interaction with the facilitators. So if attendees feel like this is a lot to take in, they should keep in mind that they’ll get one-on-one interaction with experts in these fields. Our facilitators will sit down with attendees and address their questions as they go through the hands-on work. A lot of the learning actually comes out of that facilitation. The session starts with an initial presentation and detailed instructions for doing the workshop. Then the majority of the time will be spent hands-on with that added one-on-one interaction. I don’t think we stress enough the value that customers get from the facilitation that occurs in workshops and how much learning occurs during that process.

What else should we know about your workshop?

One of the things I concentrate on is identity, and that’s not just Identity and Access Management. It’s identity across the board. If you saw Quint Van Deman’s session last year at re:Invent, he used an analogy about identity being a cake with three layers. There’s the platform layer on the bottom, which includes access to the console. There’s the infrastructure layer, which is identity for services like EC2 and RDS, for example. And then there’s the application identity layer on the top. That’s sometimes the layer that customers are most interested in because it’s related to the applications they’re building. Our workshop is primarily about that top layer. It’s one of the more difficult topics to cover, but again, an interesting one for customers.

What is your advice to first-time attendees at re:Invent?

Take advantage of your time at re:Invent and plan your week because there’s really no fluff. It’s not about marketing, it’s about learning. re:Invent is an educational event first and foremost. Also, take advantage of meeting people because it’s also a great networking event. Having conversations with people from companies that are doing similar things to what you’re doing on AWS can be very enlightening.

You like obscure restaurants and even more obscure movies. Can you share some favorites?

From a movie standpoint, I like thrillers and sci-fi, especially unusual sci-fi movies that are somewhat out of the mainstream. I do like blockbusters, but I definitely like to find great independent movies. I recently saw a movie called Coherence that was really interesting. It’s about what would happen if a group of friends having a dinner party could move between parallel or alternate universes. A few others I’d recommend to people who share my taste for the obscure, and that are maybe even a little thought-provoking, are Perfect Sense and The Quiet Earth.

From a restaurant standpoint, I am always looking for new Asian food restaurants, especially Vietnamese and Thai food. There are a lot of great ones hidden here in Los Angeles.
 
Want more AWS Security news? Follow us on Twitter.

The AWS Security team is hiring! Want to find out more? Check out our career page.

Greg McConnel

In his nearly 7 years at AWS, Greg has been in a number of different roles, which gives him a unique viewpoint on this company. He started out as a TAM in Enterprise Support, moved to a Gaming Specialist SA role, and then found a fitting home in AWS Security as a Security Specialist SA focusing on identity. Greg will continue to contribute to the team in his new role as a manager. Greg currently lives in LA where he loves to kayak, go on road trips, and in general, enjoy the sunny climate.

Ten Things Serverless Architects Should Know

Post Syndicated from Justin Pirtle original https://aws.amazon.com/blogs/architecture/ten-things-serverless-architects-should-know/

Building on the first three parts of the AWS Lambda scaling and best practices series, where you learned how to design serverless apps for massive scale, AWS Lambda’s different invocation models, and best practices for developing with AWS Lambda, we now invite you to take your serverless knowledge to the next level by reviewing the following 10 topics that deepen your serverless skills.

1: API and Microservices Design

With the move to microservices-based architectures, decomposing monolithic applications and decoupling dependencies is more important than ever. Learn more about how to design and deploy your microservices with Amazon API Gateway:

Get hands-on experience building out a serverless API with API Gateway, AWS Lambda, and Amazon DynamoDB powering a serverless web application by completing the self-paced Wild Rydes web application workshop.

Figure 1: WildRydes serverless web application workshop
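To give a flavor of the pattern, a minimal Lambda proxy-integration handler behind API Gateway might be sketched like this; the route and payload are assumptions, not part of the workshop itself:

import json

def lambda_handler(event, context):
    # API Gateway proxy integration delivers the HTTP request as `event`
    # and expects statusCode/headers/body in the response.
    ride_id = (event.get("pathParameters") or {}).get("rideId", "unknown")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"rideId": ride_id, "status": "requested"}),
    }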

2: Event-driven Architectures and Asynchronous Messaging Patterns

When building event-driven architectures, whether you’re looking for simple queueing and message buffering or a more intricate event-based choreography pattern, it’s valuable to learn about the mechanisms to enable asynchronous messaging and integration. These are enabled primarily through the use of queues or streams as a message buffer and topics for pub/sub messaging. Understand when to use each and the unique advantages and features of all three:
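As a small illustration of the queue and topic mechanisms, publishing and buffering with boto3 might look like this; the topic ARN and queue URL are placeholders:

import boto3

sns = boto3.client("sns")
sqs = boto3.client("sqs")

# Pub/sub: fan a message out to every subscriber of a topic.
sns.publish(
    TopicArn="arn:aws:sns:us-east-1:123456789012:order-events",
    Message='{"order": "accepted"}',
)

# Queueing: a consumer drains a buffered queue at its own pace.
messages = sqs.receive_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/order-queue",
    MaxNumberOfMessages=10,
    WaitTimeSeconds=20,  # long polling reduces empty responses
)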

Get hands-on experience building a real-time data processing application using Amazon Kinesis Data Streams and AWS Lambda by completing the self-paced Wild Rydes data processing workshop.

3: Workflow Orchestration in a Distributed, Microservices Environment

In distributed microservices architectures, you must design coordinated transactions differently than traditional database-based ACID transactions, which are typically implemented using a monolithic relational database. Instead, you must implement coordinated sequenced invocations across services, along with rollback and retry mechanisms. For workloads where significant orchestration logic is required and you want to use more of an orchestrator pattern than the event choreography pattern mentioned above, AWS Step Functions enables building complex workflows and distributed transactions through integration with a variety of AWS services, including AWS Lambda. Learn about the options you have to build your business workflows and keep orchestration logic out of your AWS Lambda code:
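As a sketch of keeping retry logic in the workflow rather than in function code, a minimal state machine with a retry policy could be created like this; the names and ARNs are placeholders:

import json

import boto3

stepfunctions = boto3.client("stepfunctions")

# Retries live in the workflow definition, not in the Lambda code.
definition = {
    "StartAt": "AuthorizePayment",
    "States": {
        "AuthorizePayment": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:AuthorizePayment",
            "Retry": [{"ErrorEquals": ["States.TaskFailed"], "MaxAttempts": 3}],
            "End": True,
        }
    },
}

stepfunctions.create_state_machine(
    name="payment-workflow",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsExecutionRole",
)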

Get hands-on experience building an image processing workflow, using computer vision AI services with Amazon Rekognition and AWS Step Functions to orchestrate all logic and steps, with the self-paced Serverless image processing workflow workshop.

Figure 2: Several AWS Lambda functions managed by an AWS Step Functions state machine

4: Lambda Computing Environment and Programming Model

Though AWS Lambda is a service that is quick to get started, there is value in learning more about the AWS Lambda computing environment and how to take advantage of deeper performance and cost optimization strategies with the AWS Lambda runtime. Take your understanding and skills of AWS Lambda to the next level:

5: Serverless Deployment Automation and CI/CD Patterns

When dealing with a large number of microservices or smaller components—such as AWS Lambda functions all working together as part of a broader application—it’s critical to integrate automation and code management into your application early on to efficiently create, deploy, and version your serverless architectures. AWS offers several first-party deployment tools and frameworks for serverless architectures, including the AWS Serverless Application Model (SAM), the AWS Cloud Development Kit (CDK), AWS Amplify, and AWS Chalice. Additionally, there are several third-party deployment tools and frameworks available, such as the Serverless Framework, Claudia.js, Sparta, or Zappa. You can also build your own homegrown framework. The important thing is to ensure your automation strategy works for your use case and team, and supports your planned data source integrations and development workflow. Learn more about the available options:

Learn how to build a full CI/CD pipeline and other DevOps deployment automation with the following workshops:

6: Serverless Identity Management, Authentication, and Authorization

Modern application developers need to plan for and integrate identity management into their applications while implementing robust authentication and authorization functionality. With Amazon Cognito, you can deploy serverless identity management and secure sign-up and sign-in directly into your applications. Beyond authentication, Amazon API Gateway also allows developers to granularly manage authorization logic at the gateway layer and authorize requests directly, using several types of native authorization.
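One of those native options is a Lambda authorizer. A minimal token-authorizer sketch, where the validation logic is a placeholder, returns an IAM policy deciding whether the request may proceed:

def lambda_handler(event, context):
    # A real authorizer would validate a JWT or other token here.
    token_is_valid = event.get("authorizationToken") == "allow-me"  # placeholder check
    return {
        "principalId": "example-user",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": "Allow" if token_is_valid else "Deny",
                "Resource": event["methodArn"],
            }],
        },
    }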

Learn more about the options and benefits of each:

Get hands-on experience working with Amazon Cognito, AWS Identity and Access Management (IAM), and Amazon API Gateway with the Serverless Identity Management, Authentication, and Authorization Workshop.

Figure 3: Serverless Identity Management, Authentication, and Authorization Workshop

7: End-to-End Security Techniques

Beyond identity and authentication/authorization, there are many other areas to secure in a serverless application. These include:

  • Input and request validation
  • Dependency and vulnerability management
  • Secure secrets storage and retrieval
  • IAM execution roles and invocation policies
  • Data encryption at-rest/in-transit
  • Metering and throttling access
  • Regulatory compliance concerns

Thankfully, there are AWS offerings and integrations for each of these areas. Learn more about the options and benefits of each:
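For example, secure secrets storage and retrieval maps to AWS Secrets Manager. A minimal boto3 sketch, with a placeholder secret name, might be:

import boto3

secrets = boto3.client("secretsmanager")

# Fetch a secret at runtime instead of baking it into code or config.
response = secrets.get_secret_value(SecretId="prod/payment/stripe-api-key")
api_key = response["SecretString"]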

Get hands-on experience adding end-to-end security with the techniques mentioned above into a serverless application with the Serverless Security Workshop.

8: Application Observability with Comprehensive Logging, Metrics, and Tracing

Before taking your application to production, it’s critical that you ensure your application is fully observable, both at a microservice or component level and overall, through comprehensive logging, metrics at various granularities, and tracing to understand distributed system performance and end user experiences end-to-end. With many different components making up modern architectures, having centralized visibility into all of your key logs, metrics, and end-to-end traces will make it much easier to monitor and understand your end users’ experiences. Learn more about the options for observability of your AWS serverless application:

9: Ensuring Your Application is Well-Architected

Adding onto the considerations mentioned above, we suggest evaluating your applications more holistically against the AWS Well-Architected Framework. This framework includes five key pillars: security, reliability, performance efficiency, cost optimization, and operational excellence. Additionally, there is a serverless-specific lens for the Well-Architected Framework, which looks more specifically at key serverless scenarios/use cases such as RESTful microservices, Alexa skills, mobile backends, stream processing, and web applications, and how they can implement best practices to be Well-Architected. More information:

10: Continuing your Learning as Serverless Computing Continues to Evolve

As we’ve discussed, there are many opportunities to dive deeper into serverless architectures in a variety of areas. Though the resources shared above should be helpful in familiarizing yourself with key concepts and techniques, there’s nothing better than continued learning from others over time as new advancements come out and patterns evolve.

Finally, we encourage you to check back often as we’ll be continuing further blog post series on serverless architectures, with the next series focusing on API design patterns and best practices.

About the author

Justin Pirtle is a specialist Solutions Architect at Amazon Web Services, focused on the Serverless platform. He’s responsible for helping customers design, deploy, and scale serverless applications using services such as AWS Lambda, Amazon API Gateway, Amazon Cognito, and Amazon DynamoDB. He is a regular speaker at AWS conferences, including re:Invent, as well as other AWS events. Justin holds a bachelor’s degree in Management Information Systems from the University of Texas at Austin and a master’s degree in Software Engineering from Seattle University.

Singapore financial services: new resources for customer side of the shared responsibility model

Post Syndicated from Darran Boyd original https://aws.amazon.com/blogs/security/singapore-financial-services-new-resources-for-customer-side-of-shared-responsibility-model/

Based on customer feedback, we’ve updated our AWS User Guide to Financial Services Regulations and Guidelines in Singapore whitepaper, as well as our AWS Monetary Authority of Singapore Technology Risk Management Guidelines (MAS TRM Guidelines) Workbook, which is available for download via AWS Artifact. Both resources now include considerations and best practices for the customer portion of the AWS Shared Responsibility Model.

The whitepaper provides considerations for financial institutions as they assess their responsibilities when using AWS services with regard to the MAS Outsourcing Guidelines, MAS TRM Guidelines, and Association of Banks in Singapore (ABS) Cloud Computing Implementation Guide.

The MAS TRM Workbook provides best practices for the customer portion of the AWS Shared Responsibility Model—that is, guidance on how you can manage security in the AWS Cloud. The guidance and best practices are sourced from the AWS Well-Architected Framework.

The Well-Architected Framework helps you understand the pros and cons of decisions you make while building systems on AWS. By using the Framework, you will learn architectural best practices for designing and operating reliable, secure, efficient, and cost-effective systems in the cloud. It provides a way for you to consistently measure your architectures against best practices and identify areas for improvement. The process for reviewing an architecture is a constructive conversation about architectural decisions, and is not an audit mechanism. We believe that having well-architected systems greatly increases the likelihood of business success. For more information, see the AWS Well-Architected homepage.

The compliance controls provided by the workbook also continue to address the AWS side of the Shared Responsibility Model (security of the AWS Cloud).

View the updated whitepaper here, or download the updated AWS MAS TRM Guidelines Workbook via AWS Artifact.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Darran Boyd

Darran is a Principal Security Solutions Architect at AWS, responsible for helping remove security blockers for our customers and accelerating their journey to the AWS Cloud. Darran’s focus and passion is to deliver strategic security initiatives that unlock and enable our customers at scale across the financial services industry and beyond, from CxO to <code>.

Well-Architected: “To thrive, to evolve, to delight”

Post Syndicated from Philip Fitzsimons original https://aws.amazon.com/blogs/architecture/well-architected-to-thrive-to-evolve-to-delight/

Today we published a subtle but significant update to the AWS Well-Architected Framework. We looked to see where AWS Solutions Architects and customers diverged in how they judged something to be Well-Architected, and we restructured our questions and answers to help close that gap.

There are many definitions of IT architecture, but, at its core, a good architecture is like a well-designed building: It creates a space that we love to be in. In technology, good architecture creates a space that allows our code to thrive, to evolve, to delight. Bad architecture inhibits the ability of our code to meet our expectations, and it exposes us to risk, wasted effort, extra costs, and bad outcomes. How do you know if your architectures are good?

In 2012, we created the AWS Well-Architected Framework as a way to answer that question. AWS Solutions Architects review thousands of workloads every year using the framework. We learn from these reviews: new ideas, bad ideas, and constant innovation. We curate those learnings, from our customers and our teams, to create a framework that remains current but also tested and pragmatic.

We use Kaizen (改善) to help us continually improve the framework. We bring data—your anecdotes and challenges—and use it to understand where we can improve the framework. We experiment, ask five whys, draw pictures of fish, and whiteboard, whiteboard, whiteboard. We draft guidelines for writing best practices, we shape our best practices to fit, we iterate. We have it reviewed by our principal community, taken for a spin by AWS Solutions Architects, and then we publish an update to the framework. We joyfully spin round a Deming cycle.

With this update, every question has been refined, and some have been split into two to ensure they focus on a single topic. The framework has also been updated to reflect new services and features, and is available free as a whitepaper PDF and as a Kindle book. We believe that if you follow the well-architected way, your architectures will create a space where your code and functionality will delight your customers. You can find free training and all of the whitepapers on the AWS Well-Architected homepage.

Philip Fitzsimons is the leader of the AWS Well-Architected Team

Serverless Architectures with AWS Lambda: Overview and Best Practices

Post Syndicated from Andrew Baird original https://aws.amazon.com/blogs/architecture/serverless-architectures-with-aws-lambda-overview-and-best-practices/

For some organizations, the idea of “going serverless” can be daunting. But with an understanding of best practices, and the right tools, many serverless applications can be fully functional with only a few lines of code and little else.

Examples of fully-serverless-application use cases include:

  • Web or mobile backends – Create fully serverless mobile applications or websites by creating user-facing content in a native mobile application or static web content in an S3 bucket. Then have your front-end content integrate with Amazon API Gateway as a backend service API. Lambda functions will then execute the business logic you’ve written for each of the API Gateway methods in your backend API.
  • Chatbots and virtual assistants – Build new serverless ways to interact with your customers, like customer support assistants and bots ready to engage customers on your company-run social media pages. The Amazon Alexa Skills Kit (ASK) and Amazon Lex have the ability to apply natural-language understanding to user-voice and freeform-text input so that a Lambda function you write can intelligently respond and engage with them.
  • Internet of Things (IoT) backends – AWS IoT has direct-integration for device messages to be routed to and processed by Lambda functions. That means you can implement serverless backends for highly secure, scalable IoT applications for uses like connected consumer appliances and intelligent manufacturing facilities.

Using AWS Lambda as the logic layer of a serverless application can enable faster development speed and greater experimentation and innovation than in a traditional, server-based environment.

We recently published the “Serverless Architectures with AWS Lambda: Overview and Best Practices” whitepaper to provide the guidance and best practices you need to write better Lambda functions and build better serverless architectures.

Once you’ve finished reading the whitepaper, here are a couple of additional resources I recommend as your next step:

  1. If you would like to better understand some of the architecture pattern possibilities for serverless applications: Thirty Serverless Architectures in 30 Minutes (re:Invent 2017 video)
  2. If you’re ready to get hands-on and build a sample serverless application: AWS Serverless Workshops (GitHub Repository)
  3. If you’ve already built a serverless application and you’d like to ensure your application has been Well Architected: The Serverless Application Lens: AWS Well Architected Framework (Whitepaper)

About the Author

 

Andrew Baird is a Sr. Solutions Architect for AWS. Prior to becoming a Solutions Architect, Andrew was a developer, including time as an SDE with Amazon.com. He has worked on large-scale distributed systems, public-facing APIs, and operations automation.

On Architecture and the State of the Art

Post Syndicated from Philip Fitzsimons original https://aws.amazon.com/blogs/architecture/on-architecture-and-the-state-of-the-art/

On the AWS Solutions Architecture team, we know we’re following in the footsteps of other technical experts who pulled together the best practices of their eras. Around 22 BC, the Roman architect Vitruvius Pollio wrote On Architecture (published as The Ten Books on Architecture), which became a seminal work on architectural theory. Vitruvius captured the best practices of his contemporaries and those who went before him (especially the Greek architects).

Closer to our time, in 1910, another technical expert, Henry Harrison Suplee, wrote Gas Turbine: progress in the design and construction of turbines operated by gases of combustion, from which we believe the phrase “state of the art” originates:

It has therefore been thought desirable to gather under one cover the most important papers which have appeared upon the subject of the gas turbine in England, France, Germany, and Switzerland, together with some account of the work in America, and to add to this such information upon actual experimental machines as can be secured.

In the present state of the art this is all that can be done, but it is believed that this will aid materially in the conduct of subsequent work, and place in the hands of the gas-power engineer a collection of material not generally accessible or available in convenient form.  

Source

Both authors wrote books that captured the current knowledge on design principles and best practices (in architecture and engineering) to improve awareness and adoption. Like these authors, we at AWS believe that capturing and sharing best practices leads to better outcomes. This follows a pattern we established internally, in our Principal Engineering community. In 2012 we started an initiative called “Well-Architected” to help share the best practices for architecting in the cloud with our customers.

Every year AWS Solutions Architects dedicate hundreds of thousands of hours to helping customers build architectures that are cloud native. Through customer feedback and real-world experience, we see which strategies, patterns, and approaches work for you.

“After our well-architected review and subsequent migration to the cloud, we saw the tremendous cost-savings potential of Amazon Web Services. By using the industry-standard service, we can invest the majority of our time and energy into enhancing our solutions. Thanks to (consulting partner) 1Strategy’s deep, technical AWS expertise and flexibility during our migration, we were able to leverage the strengths of AWS quickly.”

Paul Cooley, Chief Technology Officer for Imprev

This year we have again refreshed the AWS Well-Architected Framework, with a particular focus on Operational Excellence. Last year we announced the addition of Operational Excellence as a new pillar to the AWS Well-Architected Framework. Having carried out thousands of reviews since then, we reexamined the pillar and are pleased to announce some significant changes. First, the pillar dives more deeply into people and process, because this is the area where we see the most opportunities for teams to improve. Second, we’ve pivoted heavily to focusing on whether your team and your workload are ready for runtime operations. Key to this is ensuring that, in the early phases of design, you think about how your architecture will be operated. Reflecting on this, we realized that Operational Excellence should be the first pillar to support the “Architect for run-time operations” approach.

We’ve also added detail on how Amazon approaches technology architecture, covering topics such as our Principal Engineering Community and two-way doors and mechanisms. We refreshed the other pillars to reflect the evolution of AWS, and the best practices we are seeing in the field. We have also added detail on the review process, in the surprisingly named “The Review Process” section.

As part of refreshing the pillars, we have also released a new Operational Excellence Pillar whitepaper, and have updated the whitepapers for all of the other pillars of the Framework. For example, we have significantly updated the Reliability Pillar whitepaper to provide guidance on application design for high availability. New sections cover techniques for high availability that reach beyond infrastructure considerations, as well as considerations across the application lifecycle. This updated whitepaper also provides examples that show how to achieve availability goals in single and multi-region architectures.

You can find free training and all of the “state of the art” whitepapers on the AWS Well-Architected homepage.

Philip Fitzsimons, Leader, AWS Well-Architected Team