How to migrate symmetric exportable keys from AWS CloudHSM Classic to AWS CloudHSM

Post Syndicated from Mohamed AboElKheir original https://aws.amazon.com/blogs/security/migrate-symmetric-exportable-keys-aws-cloudhsm-classic-aws-cloudhsm/

In August 2017, we announced the “new” AWS CloudHSM service, which offers significant improvements over AWS CloudHSM Classic (for clarity in this post, I will refer to the services as New CloudHSM and CloudHSM Classic). These advantages in security, scalability, usability, and economy include FIPS 140-2 Level 3 certification, fully managed high availability and backup, a management console, and lower costs.

Now, we turn another page. The Luna 5 HSMs used for CloudHSM Classic are reaching end of life, and the CloudHSM Classic service will be decommissioned as a result, so CloudHSM Classic users must migrate their cryptographic key material to New CloudHSM.

In this post, I’ll show you how to use the RSA OAEP (optimal asymmetric encryption padding) wrapping mechanism, which was introduced in the CloudHSM client version 2.0.0, to move key material from CloudHSM Classic to New CloudHSM without exposing the plain text of the key material outside the HSM boundaries. You’ll use an RSA public key to wrap the key material (export it in encrypted form) on CloudHSM Classic, then use the corresponding RSA Private Key to unwrap it on New CloudHSM.
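To make the mechanism concrete before diving into the HSM tooling, here is a minimal, standalone Python sketch of RSA OAEP key wrapping using the open source cryptography package, entirely outside any HSM. It only illustrates the wrap/unwrap concept; in the actual migration the RSA private key is generated on New CloudHSM and never leaves it, and the OAEP hash shown here (SHA-256) is just for illustration.

# Illustration only: RSA OAEP wrap/unwrap of a symmetric key, outside any HSM.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

aes_key = os.urandom(32)                        # a 256-bit symmetric key to migrate
wrapped = public_key.encrypt(aes_key, oaep)     # wrap with the public key (export side)
unwrapped = private_key.decrypt(wrapped, oaep)  # unwrap with the private key (import side)
assert unwrapped == aes_key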

NOTE: This solution only works for symmetric exportable keys. Asymmetric keys on CloudHSM Classic can’t be exported. To replace non-exportable and asymmetric keys, you must generate new keys on New CloudHSM, then use the old keys to decrypt and the new keys to re-encrypt your data.

Solution overview

My solution shows you how to use the CKDemo utility on CloudHSM Classic, and key_mgmt_util on New CloudHSM, to: generate an RSA wrapping key pair; use it to wrap keys on CloudHSM Classic; and then unwrap the keys on New CloudHSM. These are all done via the RSA OAEP mechanism.
The following diagram provides a summary of the steps involved in the solution:

Figure 1: Solution overview

  1. Generate the RSA wrapping key pair on New CloudHSM.
  2. Export the RSA Public Key to the New CloudHSM client instance.
  3. Move the RSA public key to the CloudHSM Classic client instance.
  4. Import the RSA public key to CloudHSM Classic.
  5. Wrap the key using the imported RSA public key.
  6. Move the wrapped key to the New CloudHSM client instance.
  7. Unwrap the key on New CloudHSM with the RSA Private Key.

NOTE: You can perform the same procedure using supported libraries, such as JCE (Java Cryptography Extension) and PKCS#11. For example, you can use the wrap_with_imported_rsa_key sample to import an RSA public key into CloudHSM Classic, use that key to wrap your CloudHSM Classic keys, and then use the rsa_wrapping sample (specifically the rsa_oaep_unwrap_key function) to unwrap the keys into New CloudHSM using the RSA OAEP mechanism.

Prerequisites

  1. An active New CloudHSM cluster with at least one active hardware security module (HSM). Follow the Getting Started Guide to create and initialize a New CloudHSM cluster.
  2. An Amazon Elastic Compute Cloud (Amazon EC2) instance with the New CloudHSM client installed and configured to connect to the New CloudHSM cluster. You can refer to the Getting Started Guide to configure and connect the client instance.
  3. New CloudHSM CU (crypto user) credentials.
  4. An EC2 instance with the CloudHSM Classic client installed and configured to connect to the CloudHSM Classic partition or the high-availability (HA) partition group that contains the keys you want to migrate. You can refer to this guide to install and configure a CloudHSM Classic Client.
  5. The password of the CloudHSM Classic partition or HA partition group that contains the keys you want to migrate.
  6. The handle of the symmetric key on CloudHSM Classic you want to migrate.

Step 1: Generate the RSA wrapping key pair on New CloudHSM

1.1. On the New CloudHSM client instance, run the key_mgmt_util command line tool, and log in as the CU, as described in Getting Started with key_mgmt_util.


Command:  loginHSM -u CU -s <CU user> -p <CU password>
    
	Cfm3LoginHSM returned: 0x00 : HSM Return: SUCCESS

	Cluster Error Status
	Node id 0 and err state 0x00000000 : HSM Return: SUCCESS

1.2. Run the following genRSAKeyPair command to generate an RSA key pair with the label classic_wrap. Take note of the private and public key handles, as they’ll be used in the coming steps.


Command:  genRSAKeyPair -m 2048 -e 65537 -l classic_wrap

	Cfm3GenerateKeyPair returned: 0x00 : HSM Return: SUCCESS

	Cfm3GenerateKeyPair:    public key handle: 407    private key handle: 408

	Cluster Error Status
	Node id 0 and err state 0x00000000 : HSM Return: SUCCESS

Step 2: Export the RSA public key to the New CloudHSM client instance

2.1. Run the following exportPubKey command to export the RSA public key to the New CloudHSM client instance using the public key handle you received in step 1.2 (407, in my example). This will export the public key to a file named wrapping_public.pem.


Command:  exportPubKey -k <public key handle> -out wrapping_public.pem

PEM formatted public key is written to wrapping_public.pem

	Cfm3ExportPubKey returned: 0x00 : HSM Return: SUCCESS

Step 3: Move the RSA public key to the CloudHSM Classic client instance

Move the RSA Public Key to the CloudHSM Classic client instance using scp (or any other tool you prefer).

Step 4: Import the RSA public key to CloudHSM Classic

4.1. On the CloudHSM Classic instance, use the cmu command as shown below to import the RSA public key with the label classic_wrap. You’ll need the partition or HA partition group password for this command, plus the slot number of the partition or HA partition group (you can get the slot number of your partition or HA partition group using the vtl listSlots command).


# cmu import -inputFile=wrapping_public.pem -label classic_wrap
Select token
 [1] Token Label: partition1
 [2] Token Label: partition2
 [3] Token Label: partition3
 Enter choice: <slot number>
Please enter password for token in slot 1 : <password>

4.2. Run the command below to get the handle of the imported key, shown in the output (149, in my example).


# cmu list -label classic_wrap
Select token
 [1] Token Label: partition1
 [2] Token Label: partition2
 [3] Token Label: partition3
 Enter choice: <slot number>
Please enter password for token in slot 1 : <password>
handle=149	label=classic_wrap

4.3. Run the CKDemo utility.


# ckdemo

4.4. Open a session to the partition or HA partition group slot.


Enter your choice : 1

Slots available:
	slot#1 - LunaNet Slot
	slot#2 - LunaNet Slot
	...
Select a slot: <slot number>

SO[0], normal user[1], or audit user[2]? 1

Status: Doing great, no errors (CKR_OK)

4.5. Log in using the partition or HA partition group pin.


Enter your choice : 3
Security Officer[0]
Crypto-Officer  [1]
Crypto-User     [2]:
Audit-User      [3]: 1
Enter PIN          : <password>

Status: Doing great, no errors (CKR_OK)

4.6. Change the CKA_WRAP attribute of the imported RSA public key so that it can be used for wrapping. Use the imported public key handle you received in step 4.2 above (149, in my example).


Enter your choice : 25

Which object do you want to modify (-1 to list available objects) : <imported public key handle>

Edit template for set attribute operation.

(1) Add Attribute   (2) Remove Attribute   (0) Accept Template :1

 0 - CKA_CLASS                  1 - CKA_TOKEN
 2 - CKA_PRIVATE                3 - CKA_LABEL
 4 - CKA_APPLICATION            5 - CKA_VALUE
 6 - CKA_XXX                    7 - CKA_CERTIFICATE_TYPE
 8 - CKA_ISSUER                 9 - CKA_SERIAL_NUMBER
10 - CKA_KEY_TYPE              11 - CKA_SUBJECT
12 - CKA_ID                    13 - CKA_SENSITIVE
14 - CKA_ENCRYPT               15 - CKA_DECRYPT
16 - CKA_WRAP                  17 - CKA_UNWRAP
18 - CKA_SIGN                  19 - CKA_SIGN_RECOVER
20 - CKA_VERIFY                21 - CKA_VERIFY_RECOVER
22 - CKA_DERIVE                23 - CKA_START_DATE
24 - CKA_END_DATE              25 - CKA_MODULUS
26 - CKA_MODULUS_BITS          27 - CKA_PUBLIC_EXPONENT
28 - CKA_PRIVATE_EXPONENT      29 - CKA_PRIME_1
30 - CKA_PRIME_2               31 - CKA_EXPONENT_1
32 - CKA_EXPONENT_2            33 - CKA_COEFFICIENT
34 - CKA_PRIME                 35 - CKA_SUBPRIME
36 - CKA_BASE                  37 - CKA_VALUE_BITS
38 - CKA_VALUE_LEN             39 - CKA_LOCAL
40 - CKA_MODIFIABLE            41 - CKA_ECDSA_PARAMS
42 - CKA_EC_POINT              43 - CKA_EXTRACTABLE
44 - CKA_ALWAYS_SENSITIVE      45 - CKA_NEVER_EXTRACTABLE
46 - CKA_CCM_PRIVATE           47 - CKA_FINGERPRINT_SHA1
48 - CKA_OUID                  49 - CKA_X9_31_GENERATED
50 - CKA_PRIME_BITS            51 - CKA_SUBPRIME_BITS
52 - CKA_USAGE_COUNT           53 - CKA_USAGE_LIMIT
54 - CKA_EKM_UID               55 - CKA_GENERIC_1
56 - CKA_GENERIC_2             57 - CKA_GENERIC_3
58 - CKA_FINGERPRINT_SHA256
Select which one: 16
Enter boolean value: 1

CKA_WRAP=01

(1) Add Attribute   (2) Remove Attribute   (0) Accept Template :0

Status: Doing great, no errors (CKR_OK)

Step 5: Wrap the key using the imported RSA public key

5.1. Check whether the symmetric key you want to migrate is exportable. To do this, run the command below with the handle of the key you want to migrate, and confirm that the value of the CKA_EXTRACTABLE attribute in the output is equal to 1. Otherwise, the key can’t be exported.


Enter your choice : 27

Enter handle of object to display (-1 to list available objects): <handle of the key to be migrated>
Object handle=120
CKA_CLASS=00000004
CKA_TOKEN=01
CKA_PRIVATE=01
CKA_LABEL=Generated AES Key
CKA_KEY_TYPE=0000001f
CKA_ID=
CKA_SENSITIVE=01
CKA_ENCRYPT=01
CKA_DECRYPT=01
CKA_WRAP=01
CKA_UNWRAP=01
CKA_SIGN=01
CKA_VERIFY=01
CKA_DERIVE=01
CKA_START_DATE=
CKA_END_DATE=
CKA_VALUE_LEN=00000020
CKA_LOCAL=01
CKA_MODIFIABLE=01
CKA_EXTRACTABLE=01
CKA_ALWAYS_SENSITIVE=01
CKA_NEVER_EXTRACTABLE=00
CKA_CCM_PRIVATE=00
CKA_FINGERPRINT_SHA1=f8babf341748ba5810be21acc95c6d4d9fac75aa
CKA_OUID=29010002f90900005e850700
CKA_EKM_UID=
CKA_GENERIC_1=
CKA_GENERIC_2=
CKA_GENERIC_3=
CKA_FINGERPRINT_SHA256=7a8efcbff27703e281617be3c3d484dc58df6a78f6b144207c1a54ad32a98c00

Status: Doing great, no errors (CKR_OK)

5.2. Wrap the key using the imported RSA public key. This creates a file called wrapped.key that contains the wrapped key. Make sure to use the imported public key handle you received in step 4.2 above (149, in my example), and the handle of the key you want to migrate.


Enter your choice : 60
[1]DES-ECB        [2]DES-CBC        [3]DES3-ECB       [4]DES3-CBC
                                    [7]CAST3-ECB      [8]CAST3-CBC
[9]RSA            [10]TRANSLA       [11]DES3-CBC-PAD  [12]DES3-CBC-PAD-IPSEC
[13]SEED-ECB      [14]SEED-CBC      [15]SEED-CBC-PAD  [16]DES-CBC-PAD
[17]CAST3-CBC-PAD [18]CAST5-CBC-PAD [19]AES-ECB       [20]AES-CBC
[21]AES-CBC-PAD   [22]AES-CBC-PAD-IPSEC [23]ARIA-ECB  [24]ARIA-CBC
[25]ARIA-CBC-PAD
[26]RSA_OAEP    [27]SET_OAEP
Select mechanism for wrapping: 26

Enter filename of OAEP Source Data [0 for none]: 0

Enter handle of wrapping key (-1 to list available objects) : <imported public key handle>

Enter handle of key to wrap (-1 to list available objects) : <handle of the key to be migrated>
Wrapped key was saved in file wrapped.key

Status: Doing great, no errors (CKR_OK)

Step 6: Move the wrapped key to the New CloudHSM client instance

Move the wrapped key to the New CloudHSM client instance using scp (or any other tool you prefer).

Step 7: Unwrap the key on New CloudHSM with the RSA Private Key

7.1. On the New CloudHSM client instance, run the key_mgmt_util command line tool and log in as the CU.


Command:  loginHSM -u CU -s <CU user> -p <CU password>

	Cfm3LoginHSM returned: 0x00 : HSM Return: SUCCESS

	Cluster Error Status
	Node id 0 and err state 0x00000000 : HSM Return: SUCCESS

7.2. Run the following unWrapKey command to unwrap the key using the RSA private key handle you received in step 1.2 (408, in my example). The output of the command shows the handle of the newly unwrapped key (410, in my example).


Command:  unWrapKey -f wrapped.key -w <private key handle> -m 8 -noheader -l unwrapped_aes -kc 4 -kt 31

	Cfm3CreateUnwrapTemplate2 returned: 0x00 : HSM Return: SUCCESS

	Cfm2UnWrapWithTemplate3 returned: 0x00 : HSM Return: SUCCESS

	Key Unwrapped.  Key Handle: 410

	Cluster Error Status
	Node id 0 and err state 0x00000000 : HSM Return: SUCCESS

Conclusion

Using RSA OAEP for key migration ensures that your key material doesn’t leave the HSM boundary in plain text, as it’s encrypted using an RSA public key before being exported from CloudHSM Classic, and it can only be decrypted by New CloudHSM through the RSA private key that is generated and kept on New CloudHSM.

My post provides an example of how to use the ckdemo and key_mgmt_util utilities for the migration, but the same procedure can also be performed using the supported software libraries, such as the Java JCE library or the PKCS#11 library, to migrate larger volumes of keys in an automated manner.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author photo

Mohamed AboElKheir

Mohamed AboElKheir is an Application Security Engineer who works with different teams to ensure AWS services, applications, and websites are designed and implemented to the highest security standards. He is a subject matter expert for CloudHSM and is always enthusiastic about assisting CloudHSM customers with advanced issues and use cases. Mohamed is passionate about InfoSec, specifically cryptography, penetration testing (he’s OSCP certified), application security, and cloud security (he’s AWS Security Specialty certified).

Visualizing Sensor Data in Amazon QuickSight

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/visualizing-sensor-data-in-amazon-quicksight/

This post is courtesy of Moheeb Zara, Developer Advocate, AWS Serverless

The Internet of Things (IoT) is a term used wherever physical devices are networked in some meaningful connected way. Often, this takes the form of sensor data collection and analysis. As the number of devices and size of data scales, it can become costly and difficult to keep up with demand.

Using AWS Serverless Application Model (AWS SAM), you can reduce the cost and time to market of an IoT solution. This guide demonstrates how to collect and visualize data from a low-cost, Wi-Fi connected IoT device using a variety of AWS services. Much of this can be accomplished within the AWS Free Usage Tier, which covers the resources used in the following instructions.

Services used

The following services are used in this example:

  • AWS IoT Core
  • AWS Lambda
  • Amazon Kinesis Data Firehose
  • Amazon S3
  • Amazon QuickSight

What’s covered in this post?

This post covers:

  • Connecting an Arduino MKR 1010 Wi-Fi device to AWS IoT Core.
  • Forwarding messages from an AWS IoT Core topic stream to a Lambda function.
  • Using a Kinesis Data Firehose delivery stream to store data in S3.
  • Analyzing and visualizing data stored in S3 using Amazon QuickSight.

Connect the device to AWS IoT Core using MQTT

The Arduino MKR 1010 is a low-cost, Wi-Fi enabled IoT device, shown in the following image.

An Arduino MKR 1010 Wi-Fi microcontroller

Its analog and digital input and output pins can be used to read sensors or to write to actuators. Arduino provides a detailed guide on how to securely connect this device to AWS IoT Core. The following steps build upon it to push arbitrary sensor data to a topic stream and ultimately visualize that data using Amazon QuickSight.

  1. Start by following this comprehensive guide to using an Arduino MKR 1010 with AWS IoT Core. Upon completion, your device is connected to AWS IoT Core using MQTT (Message Queuing Telemetry Transport), a protocol for publishing and subscribing to messages using topics.
  2. In the Arduino IDE, choose File, Sketch, Include Library, and Manage Libraries.
  3. In the window that opens, search for ArduinoJson and select the library by Benoit Blanchon. Choose install.

4. Add #include <ArduinoJson.h> to the top of your sketch from the Arduino guide.

5. Modify the publishMessage() function with this code. It publishes a JSON message with two keys: time (ms) and the current value read from the first analog pin.

void publishMessage() {  
  Serial.println("Publishing message");

  // send message, the Print interface can be used to set the message contents
  mqttClient.beginMessage("arduino/outgoing");
  
  // create json message to send
  StaticJsonDocument<200> doc;
  doc["time"] = millis();
  doc["sensor_a0"] = analogRead(0);
  serializeJson(doc, mqttClient); // print to client
  
  mqttClient.endMessage();
}

6. Save and upload the sketch to your board.

Create a Kinesis Firehose delivery stream

Amazon Kinesis Data Firehose is a service that reliably loads streaming data into data stores, data lakes, and analytics tools. Amazon QuickSight requires a data store to create visualizations of the sensor data. This simple Kinesis Data Firehose delivery stream continuously uploads data to an S3 storage bucket. The next sections cover how to add records to this stream using a Lambda function.

  1. In the Kinesis Data Firehose console, create a new delivery stream, called SensorDataStream.
  2. Leave the default source as a Direct PUT or other sources and choose Next.
  3. On the next screen, leave all the default values and choose Next.
  4. Select Amazon S3 as the destination and create a new bucket with a unique name. This is where records are continuously uploaded so that they can be used by Amazon QuickSight.
  5. On the next screen, choose Create New IAM Role, Allow. This gives the Firehose delivery stream permission to upload to S3.
  6. Review and then choose Create Delivery Stream.

It can take some time to fully create the stream. In the meantime, continue on to the next section.

Invoking Lambda using AWS IoT Core rules

Using AWS IoT Core rules, you can forward messages from devices to a Lambda function, which can perform actions such as uploading to an Amazon DynamoDB table or an S3 bucket, or running data against various Amazon Machine Learning services. In this case, the function transforms and adds a message to the Kinesis Data Firehose delivery stream, which then adds that data to S3.

AWS IoT Core rules use the MQTT topic stream to trigger interactions with other AWS services. An AWS IoT Core rule is created by using an SQL statement, a topic filter, and a rule action. The Arduino example publishes messages every five seconds on the topic arduino/outgoing. The following instructions show how to consume those messages with a Lambda function.

Create a Lambda function

Before creating an AWS IoT Core rule, you need a Lambda function to consume forwarded messages.

  1. In the AWS Lambda console, choose Create function.
  2. Name the function ArduinoConsumeMessage.
  3. For Runtime, choose Author From Scratch, Node.js 10.x. For Execution role, choose Create a new role with basic Lambda permissions. Choose Create.
  4. On the Execution role card, choose View the ArduinoConsumeMessage-role-xxxx on the IAM console.
  5. Choose Attach Policies. Then, search for and select AmazonKinesisFirehoseFullAccess.
  6. Choose Attach Policy. This applies the necessary permissions to add records to the Firehose delivery stream.
  7. In the Lambda console, in the Designer card, select the function name.
  8. Paste the following in the code editor, replacing SensorDataStream with the name of your own Firehose delivery stream. Choose Save.
const AWS = require('aws-sdk')

const firehose = new AWS.Firehose()
const StreamName = "SensorDataStream"

exports.handler = async (event) => {
    
    console.log('Received IoT event:', JSON.stringify(event, null, 2))
    
    // Transform the incoming IoT message into the record format stored in S3
    let payload = {
        time: new Date(event.time),
        sensor_value: event.sensor_a0
    }
    
    // Put the record on the Kinesis Data Firehose delivery stream
    let params = {
            DeliveryStreamName: StreamName,
            Record: { 
                Data: JSON.stringify(payload)
            }
        }
        
    return await firehose.putRecord(params).promise()

}

Create an AWS IoT Core rule

To create an AWS IoT Core rule, follow these steps.

  1. In the AWS IoT console, choose Act.
  2. Choose Create.
  3. For Rule query statement, copy and paste SELECT * FROM 'arduino/outgoing'. This subscribes to the outgoing message topic used in the Arduino example.
  4. Choose Add action, Send a message to a Lambda function, Configure action.
  5. Select the function created in the last set of instructions.
  6. Choose Create rule.
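
If you'd rather script the rule than use the console, a minimal boto3 sketch is shown below. The rule name and Lambda function ARN are placeholders; substitute the ARN of the ArduinoConsumeMessage function you created above.

# A sketch of creating the same IoT Core rule with boto3 instead of the console.
# The rule name and Lambda function ARN are placeholders.
import boto3

iot = boto3.client('iot')

iot.create_topic_rule(
    ruleName='ArduinoSensorToLambda',
    topicRulePayload={
        'sql': "SELECT * FROM 'arduino/outgoing'",
        'actions': [{
            'lambda': {
                'functionArn': 'arn:aws:lambda:us-east-1:123456789012:function:ArduinoConsumeMessage'
            }
        }]
    }
)

Note that when the rule is created through the API rather than the console, you also need to grant AWS IoT permission to invoke the Lambda function (for example, with the Lambda add_permission API); the console action configuration does this for you.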

At this stage, any message published to the arduino/outgoing topic forwards to the ArduinoConsumeMessage Lambda function, which transforms and puts the payload on the Kinesis Data Firehose stream and also logs the message to Amazon CloudWatch. If you’ve connected an Arduino device to AWS IoT Core, it publishes to that topic every five seconds.

The following steps show how to test functionality using the AWS IoT console.

  1. In the AWS IoT console, choose Test.
  2. For Publish, enter the topic arduino/outgoing.
  3. Enter the following test payload:
    {
      "time": 1567023375013,
      "sensor_a0": 456
    }
  4. Choose Publish to topic.
  5. Navigate back to your Lambda function.
  6. Choose Monitoring, View logs in CloudWatch.
  7. Select a log item to view the message contents, as shown in the following screenshot.
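
You can also publish the same test payload programmatically instead of through the console. The following is a minimal boto3 sketch; the topic matches the rule created earlier.

# A sketch of publishing the test payload with boto3 rather than the AWS IoT console.
import json
import boto3

iot_data = boto3.client('iot-data')

iot_data.publish(
    topic='arduino/outgoing',
    qos=0,
    payload=json.dumps({'time': 1567023375013, 'sensor_a0': 456})
)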

Visualizing data with Amazon QuickSight

To visualize data with Amazon QuickSight, follow these steps.

  1. In the Amazon QuickSight console, sign up.
  2. Choose Manage Data, New Data Set. Select S3 as the data source.
  3. A manifest file is necessary for Amazon QuickSight to be able to fetch data from your S3 bucket. Copy the following into a file named manifest.json. Replace YOUR-BUCKET-NAME with the name of the bucket created for the Firehose delivery stream.
    {
       "fileLocations":[
          {
             "URIPrefixes":[
                "s3://YOUR-BUCKET-NAME/"
             ]
          }
       ],
       "globalUploadSettings":{
          "format":"JSON"
       }
    }
  4. Upload the manifest.json file.
  5. Choose Connect, then Visualize. You may have to give Amazon QuickSight explicit permissions to your S3 bucket.
  6. Finally, design the Amazon QuickSight visualizations in the drag and drop editor. Drag the two available fields into the center card to generate a Sum of Sensor_value by Time visual.

Conclusion

This post demonstrated visualizing data from a securely connected remote IoT device. This was achieved by connecting an Arduino to AWS IoT Core using MQTT, forwarding messages from the topic stream to Lambda using IoT Core rules, putting records on an Amazon Kinesis Data Firehose delivery stream, and using Amazon QuickSight to visualize the data stored within an S3 bucket.

With these building blocks, it is possible to implement highly scalable and customizable IoT data collection, analysis, and visualization. With the use of other AWS services, you can build a full end-to-end platform for an IoT product that can reliably handle volume. To further explore how hardware and AWS Serverless can work together, visit the Amazon Web Services page on Hackster.

One to Many: Evolving VPC Design

Post Syndicated from Androski Spicer original https://aws.amazon.com/blogs/architecture/one-to-many-evolving-vpc-design/

Since its inception, the Amazon Virtual Private Cloud (VPC) has acted as the embodiment of security and privacy for customers who are looking to run their applications in a controlled, private, secure, and isolated environment.

This logically isolated space has evolved, and in its evolution has increased the avenues that customers can take to create and manage multi-tenant environments with multiple integration points for access to resources on-premises.

This blog is a two-part series that begins with a look at the Amazon VPC as a single unit of networking in the AWS Cloud but eventually takes you to a world in which simplified architectures for establishing a global network of VPCs are possible.

From One VPC: Single Unit of Networking

To be successful with the AWS Virtual Private Cloud you first have to define success for today and what success might look like as your organization’s adoption of the AWS cloud increases and matures. In essence, your VPCs should be designed to satisfy the needs of your applications today and must be scalable to accommodate future needs.

Classless Inter-Domain Routing (CIDR) notation is used to denote the size of your VPC. AWS allows you to specify a CIDR block between /16 and /28. The largest, /16, provides you with 65,536 IP addresses, and the smallest allowed CIDR block, /28, provides you with 16 IP addresses. Note that the first four IP addresses and the last IP address in each subnet CIDR block are not available for you to use and cannot be assigned to an instance.
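
As a quick illustration of that arithmetic, the short Python sketch below uses the standard library ipaddress module to show total and usable addresses for a few CIDR sizes, treating each CIDR block as a single subnet so the five reserved addresses apply once.

# Usable addresses per CIDR block, assuming the block is used as a single subnet.
# AWS reserves the first four addresses and the last address in every subnet.
import ipaddress

for cidr in ('10.0.0.0/16', '10.0.0.0/24', '10.0.0.0/28'):
    net = ipaddress.ip_network(cidr)
    print(cidr, net.num_addresses, 'total,', net.num_addresses - 5, 'usable')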

AWS VPC supports both IPv4 and IPv6. It is required that you specify an IPv4 CIDR range when creating a VPC. Specifying an IPv6 range is optional.

Customers can specify ANY IPv4 address space for their VPC. This includes but is not limited to RFC 1918 addresses.

After creating your VPC, you divide it into subnets. In an AWS VPC, subnets are not isolation boundaries around your application. Rather, they are containers for routing policies.

Isolation is achieved by attaching an AWS Security Group (SG) to the EC2 instances that host your application. SGs are stateful firewalls, meaning that connections are tracked to ensure return traffic is allowed. They control inbound and outbound access to the elastic network interfaces that are attached to an EC2 instance. These should be tightly configured, only allowing access as needed.
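
As an example of keeping that configuration tight, the boto3 sketch below adds a single inbound rule that allows only HTTPS from an internal range. The security group ID and CIDR are placeholders.

# A sketch of a tightly scoped ingress rule: HTTPS only, from a specific internal range.
# The security group ID and CIDR below are placeholders.
import boto3

ec2 = boto3.client('ec2')

ec2.authorize_security_group_ingress(
    GroupId='sg-0123456789abcdef0',
    IpPermissions=[{
        'IpProtocol': 'tcp',
        'FromPort': 443,
        'ToPort': 443,
        'IpRanges': [{'CidrIp': '10.0.0.0/16', 'Description': 'VPC-internal HTTPS only'}]
    }]
)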

It is our best practice to create subnets in categories. There are two main categories: public subnets and private subnets. At a minimum, they should be designed as outlined in the diagrams below for IPv4 and IPv6 subnet design.

Recommended IPv4 subnet design pattern

Recommended IPv6 subnet design pattern

Subnet types are distinguished by whether applications and users on the internet can directly initiate access to infrastructure within a subnet.

Public Subnets

Public subnets are attached to a route table that has a default route to the Internet via an Internet gateway.

Resources in a public subnet can have a public IP or Elastic IP (EIP) that is NATed to the Elastic Network Interface (ENI) of the virtual machines or containers that host your application(s). This is a one-to-one NAT performed by the Internet gateway.

Illustration of public subnet access path to the Internet through the Internet Gateway (IGW)

Private Subnets

A private subnet contains infrastructure that isn’t directly accessible from the Internet. Unlike the public subnet, this infrastructure only has private IPs.

Infrastructure in a private subnet gains access to resources or users on the Internet through some form of NAT (network address translation) infrastructure.

AWS natively provides NAT capability through the use of the NAT Gateway service. Customers can also create NAT instances that they manage or leverage third-party NAT appliances from the AWS Marketplace.

In most scenarios, it is recommended to use the AWS NAT Gateway as it is highly available (in a single Availability Zone) and is provided as a managed service by AWS. It supports 5 Gbps of bandwidth per NAT gateway and automatically scales up to 45 Gbps.

An AWS NAT gateway’s high availability is confined to a single Availability Zone. For high availability across AZs, it is recommended to have a minimum of two NAT gateways (in different AZs). This allows you to switch to an available NAT gateway in the event that one should become unavailable.

This approach allows you to zone your Internet traffic, reducing cross Availability Zone connections to the Internet. More details on NAT gateway are available here.

Illustration of an environment with a single NAT Gateway (NAT-GW)

Illustration of high availability with multiple NAT Gateways (NAT-GW), each attached to its own route table

Illustration of the failure of one NAT Gateway and the fail over to an available NAT Gateway by the manual changing of the default route next hop in private subnet A route table
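
The manual failover shown above amounts to repointing the affected private subnet's default route at a NAT gateway in a healthy Availability Zone. A minimal boto3 sketch follows; the route table and NAT gateway IDs are placeholders.

# A sketch of the manual failover step: repoint the default route of a private
# subnet's route table at a surviving NAT gateway. The IDs below are placeholders.
import boto3

ec2 = boto3.client('ec2')

ec2.replace_route(
    RouteTableId='rtb-0123456789abcdef0',
    DestinationCidrBlock='0.0.0.0/0',
    NatGatewayId='nat-0fedcba9876543210'
)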

AWS-allocated IPv6 addresses are Global Unicast Addresses by default. That said, you can privatize these subnets by using an Egress-Only Internet Gateway (E-IGW) instead of a regular Internet gateway. E-IGWs are purpose-built to prevent users and applications on the Internet from initiating access to infrastructure in your IPv6 subnet(s).

Illustration of internet access for hybrid IPv6 subnets through an Egress-Only Internet Gateway (E-IGW)

Applications hosted on instances living within a private subnet can have different access needs. Some require access to the Internet, while others require access to databases, applications, and users that are on-premises. For this type of access, AWS provides two avenues: the Virtual Gateway and the Transit Gateway. The Virtual Gateway can only support a single VPC at a time, while the Transit Gateway is built to simplify the interconnectivity of tens to hundreds of VPCs and to aggregate their connectivity to resources on-premises. Given that we are looking at the VPC as a single unit of networking, all diagrams below contain illustrations of the Virtual Gateway, which acts as a WAN concentrator for your VPC.

Illustration of private subnets connecting to data center via a Virtual Gateway (VGW)

 

Illustration of private subnets connecting to Data Center via a VGW

 

Illustration of private subnets connecting to Data Center using AWS Direct Connect as primary and IPsec as backup

The above diagram illustrates a WAN connection between a VGW attached to a VPC and a customer’s data center.

AWS provides two options for establishing a private connectivity between your VPC and on-premises network: AWS Direct Connect and AWS Site-to-Site VPN.

AWS Site-to-Site VPN leverages IPSec, with each connection providing two redundant IPSec tunnels. AWS supports both static routing and dynamic routing (through the use of BGP).

BGP is recommended, as it allows dynamic route advertisement, high availability through failure detection, and failover between tunnels, in addition to decreased management complexity.

VPC Endpoints: Gateway & Interface Endpoints

Applications running inside your subnet(s) may need to connect to AWS public services (like Amazon S3, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Amazon API Gateway, etc.) or to applications in another VPC that lives in another account. For example, you may have a database in another account that you would like to expose to applications that live in a completely different account and subnet.

For these scenarios you have the option to leverage an Amazon VPC Endpoint.

There are two types of VPC Endpoints: Gateway Endpoints and Interface Endpoints.

Gateway Endpoints only support Amazon S3 and Amazon DynamoDB. Upon creation, a gateway is added to your specified route table(s) and acts as the destination for all requests to the service it is created for.

Interface Endpoints differ significantly and can only be created for services that are powered by AWS PrivateLink.

Upon creation, AWS creates an interface endpoint consisting of one or more Elastic Network Interfaces (ENIs). Each AZ can support one interface endpoint ENI. This acts as a point of entry for all traffic destined to a specific PrivateLink service.

When an interface endpoint is created, associated DNS entries are created that point to the endpoint and each ENI that the endpoint contains. To access the PrivateLink service you must send your request to one of these hostnames.

As illustrated below, ensure the Private DNS feature is enabled for AWS public and Marketplace services:

Since interface endpoints leverage ENIs, customers can use cloud techniques they are already familiar with. The interface endpoint can be configured with a restrictive security group. These endpoints can also be easily accessed from both inside and outside the VPC. Access from outside a VPC can be accomplished through Direct Connect and VPN.

Illustration of a solution that leverages an interface and gateway endpoint

Customers can also create AWS Endpoint services for their applications or services running on-premises. This allows access to these services via an interface endpoint which can be extended to other VPCs (even if the VPCs themselves do not have Direct Connect configured).

VPC Sharing

At re:Invent 2018, AWS launched the feature VPC sharing, which helps customers control VPC sprawl by decoupling the boundary of an AWS account from the underlying VPC network that supports its infrastructure.

VPC sharing uses AWS Resource Access Manager (RAM) to share subnets across accounts within the same AWS organization.

VPC sharing is defined as the ability for multiple AWS accounts to create their application resources (such as Amazon EC2 instances, Amazon RDS databases, Amazon Redshift clusters, and AWS Lambda functions) into shared, centrally managed Amazon Virtual Private Clouds (VPCs).

VPC sharing allows customers to centralize the management of the network, its IP space, and the access paths to resources external to the VPC. This centralization and reuse of VPC components (such as NAT gateways and Direct Connect connections) reduces the cost to manage and maintain the environment.

Great, but there are times when a customer needs to build networks with multiple VPCs in and across AWS regions. How should this be done and what are the best practices?

This will be answered in part two of this blog.

Architecting multiple microservices behind a single domain with Amazon API Gateway

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/architecting-multiple-microservices-behind-a-single-domain-with-amazon-api-gateway/

This post is courtesy of Roberto Iturralde, Solutions Architect.

Today’s modern architectures are increasingly microservices-based, with separate engineering teams working independently on services with their own feature requirements and deployment pipelines. The benefits of this approach include increased agility and release velocity.

Microservice architectures also come with some challenges, particularly when they make up parts of a public service or API. These include enforcing engineering and security standards and collating application logs and metrics for a cross-service operational view.

It’s also important to have the microservices feel like a cohesive product to external customers, for authentication and metering in particular:

  • The engineering teams want autonomy.
  • The security team wants a cross-service view and to make it easy for the teams to adhere to the organization’s guidelines.
  • Customers want to feel like they’re using a unified product.

The AWS toolbox

AWS offers many services that you can weave together to meet these needs.

Amazon API Gateway is a fully managed service for deploying and managing a unified front door to your applications. It has features for routing your domain’s traffic to different backing microservices, enforcing consistent authentication and authorization with fine-grained permissions across them, and implementing consistent API throttling and usage metering. The microservice that backs a given API can live in another AWS account. You don’t have to expose it to the internet.

Amazon Cognito is a user management service with rich support for authentication and authorization of users. You can manage those users within Amazon Cognito or from other federated IdPs. Amazon Cognito can vend JSON Web Tokens and integrates natively with API Gateway to support OAuth scopes for fine-grained API access.

Amazon CloudWatch is a monitoring and management service that collects and visualizes data across AWS services. CloudWatch dashboards are customizable home pages that can contain graphs showing metrics and alarms. You can customize these to represent a specific microservice, a collection of microservices that comprise a product, or any other meaningful view with fine-grained access control to the dashboard.

AWS X-Ray is an analysis and debugging tool designed for distributed applications. It has tools to help gain insight into the performance of your microservices, and the APIs that front them, to measure and debug any potential customer impact.

AWS Service Catalog allows the central management and self-service creation of AWS resources that meet your organization’s guidelines and best practices. You can require separate permissions for managing catalog entries from deploying catalog entries, allowing a central team to define and publish templates for resources across the company.

Architectural options

There are many options for how you can combine these AWS services to meet your requirements. Your decisions may also depend on your expertise with AWS. The following features are common to all the designs below:

  • Amazon Route 53 has registered custom domains and hosts their DNS. You could also use an external registrar and DNS service.
  • AWS Certificate Manager (ACM) manages Transport Layer Security (TLS) certificates for the custom domains that route traffic to API Gateway APIs in a given account.
  • Amazon Cognito manages the users who access the APIs in API Gateway.
  • Service Catalog holds catalog products for API Gateway APIs that adhere to the organizational guidelines and best practices, such as security configuration and default API throttling. Microservice teams have permission to create an API pointed to their service and configure specific parameters, with approvals required for production environments. For more information, see Standardizing infrastructure delivery in distributed environments using AWS Service Catalog.

The following shows common design patterns and their high-level benefits and challenges.

Single AWS account

Microservices, their fronting API Gateway APIs, and supporting services are in the same AWS account. This account also includes core AWS services such as the following:

  • Route 53 for domain name registration and DNS
  • ACM for managing server certificates for your domain
  • Amazon Cognito for user management
  • Service Catalog for the catalog of best-practice product templates to use across the organization

Single AWS account example

Use this approach if you do not yet have a multi-account strategy or if you use AWS native tools for observability. With a single AWS account, the microservices can share the same networking topology, and so more easily communicate with each other when needed. With all the API Gateway APIs in the same AWS account, you can configure API throttling, metering, authentication, and authorization features for a unified experience for customers. You can also route traffic to a given API using subdomains or base path mapping in API Gateway.
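
For instance, base path mappings let several APIs share one custom domain, with each microservice's API served under its own path. The boto3 sketch below shows the idea; the domain name, REST API ID, and stage are placeholders.

# A sketch of mapping a microservice's API under a path of a shared custom domain.
# The domain name, REST API ID, and stage are placeholders.
import boto3

apigw = boto3.client('apigateway')

apigw.create_base_path_mapping(
    domainName='api.example.com',   # custom domain managed in this account
    basePath='orders',              # requests to api.example.com/orders go to this API
    restApiId='a1b2c3d4e5',
    stage='prod'
)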

A single AWS account can manage TLS certificates for AWS domains in one place. This feature is available to all API Gateway APIs. Having the microservices and their API Gateway APIs in the same AWS account gives more complete X-Ray service maps, given that X-Ray currently can’t analyze traces across AWS accounts. Similarly, you have a complete view of the metrics all AWS services publish to CloudWatch. This feature allows you to create CloudWatch dashboards that span the API Gateway APIs and their backing microservices.

There is an increased blast radius with this architecture, because the microservices share the same account. The microservices can impact each other through shared AWS service limits or mistakes by team members on other microservice teams. Most AWS services support tagging for cost allocation and granular access control, but there are some features of AWS services that do not. Because of this, it’s more difficult to separate the costs of each microservice completely.

Separate AWS accounts

When using separate AWS accounts, each API Gateway API lives in the same AWS account as its backing microservice. Separate AWS accounts hold the Service Catalog portfolio, domain registration (using Route 53), and aggregated logs from the microservices. The organization account, security account, and other core accounts are discussed further in the AWS Landing Zone Solution.

Separate AWS accounts

Use this architecture if you have a mature multi-account strategy and existing tooling for cross-account observability. In this approach, an AWS account encapsulates a microservice completely, for cost isolation and reduced blast radius. With the API Gateway API in the same account as the backing microservice, you have a complete view of the microservice in CloudWatch and X-Ray.

You can only meter API usage by microservice because API Gateway usage plans can’t track activity across accounts. Implement a process to ensure each customer’s API Gateway API key is the same across accounts for a smooth customer experience.

API Gateway base path mappings are local to an AWS account, so you must use subdomains to separate the microservices that comprise a product under a single domain. You still have a complete view of each microservice in the CloudWatch dashboards and X-Ray console of its own AWS account, but creating a view across microservices requires aggregation in a central AWS account or an external tool.

Central API account

Using a central API account is similar to the separate account architecture, except the API Gateway APIs are in a central account.

Central API account

This architecture is the best approach for most users. It offers a balance of the benefits of microservice separation with the unification of particular services for a better end-user experience. Each microservice has an AWS account, which isolates it from the other services and reduces the risk of AWS service limit contention or accidents due to sharing the account with other engineering teams.

Because each microservice lives in a separate account, that account’s bill captures all the costs for that microservice. You can track the API costs, which are in the shared API account, using tags on API Gateway resources.

While the microservices are isolated in separate AWS accounts, the API Gateway throttling, metering, authentication, and authorization features are centralized for a consistent experience for customers. You can use subdomains or API Gateway base path mappings to route traffic to different API Gateway APIs. Also, the TLS certificates for your domains are centrally managed and available to all API Gateway APIs.

You can now split CloudWatch metrics, X-Ray traces, and application logs across accounts for a given microservice and its fronting API Gateway API. Unify these in a central AWS account or a third-party tool.

Conclusion

The breadth of the AWS Cloud presents many architectural options to customers. When designing your systems, it’s essential to understand the benefits and challenges of design decisions before implementing a solution.

This post walked you through three common architectural patterns for allowing independent microservice teams to operate behind a unified domain presented to your customers. The best approach for your organization depends on your priorities, experience, and familiarity with AWS.

How to use AWS Secrets Manager to securely store and rotate SSH key pairs

Post Syndicated from Maitreya Ranganath original https://aws.amazon.com/blogs/security/how-to-use-aws-secrets-manager-securely-store-rotate-ssh-key-pairs/

AWS Secrets Manager provides full lifecycle management for secrets within your environment. In this post, Maitreya and I will show you how to use Secrets Manager to store, deliver, and rotate SSH keypairs used for communication within compute clusters. Rotation of these keypairs is a security best practice, and sometimes a regulatory requirement. Traditionally, these keypairs have been associated with a number of tough challenges: for example, synchronizing key rotation across all compute nodes, enabling detailed logging and auditing, and managing which users are allowed to modify secrets.

Rotating the keypair on all of a compute cluster’s nodes must be done in a tightly coordinated fashion, and failures generally result in availability risks. Moreover, the keypairs themselves are highly sensitive security credentials that must be carefully controlled with fine-grained access controls, detailed monitoring, and audit logging. These are precisely the types of tough challenges that AWS Secrets Manager solves for you.

In this post, we’ll show you how to secure, rotate, and use SSH keypairs for inter-cluster communication. You’ll use an AWS CloudFormation template to launch a cluster and configure Secrets Manager. Then we’ll show you how to use Secrets Manager to deliver the keypair to the cluster and use it for management operations, such as securely copying a file between nodes. Finally, we’ll use Secrets Manager to seamlessly rotate the keypair used by the cluster without any changes or outages. In this post, we’ve highlighted compute clusters, but you can use Secrets Manager to apply this solution directly to any SSH based use-case.

Solution overview

The following architecture diagram presents an overview of the solution:
 

Figure 1: Solution architecture

The sample architecture created by CloudFormation includes one master node, three worker nodes, AWS Secrets Manager—which utilizes a rotation AWS Lambda function—and AWS Systems Manager. Setting up the cluster is out of scope for this post; in our walkthrough, we’ll focus on the keypair rotation architecture.

Secrets Manager uses staging labels to identify different versions of a secret during rotation. A staging label is a text string. For example, by default, AWSCURRENT is attached to the current version of the secret, while AWSPENDING will be attached to new versions of the secret before they have been verified and deployed to corresponding resources.

As shown in the diagram:

  1. A secret is created in AWS Secrets Manager. The secret holds the SSH keypair that the master node will use to connect to the other nodes in the cluster. Upon keypair rotation, Secrets Manager will invoke a Lambda function (labeled 1.a in the diagram). The Lambda function will perform four steps:
    • 1.b: createSecret – create a new SSH keypair and store the private key as a new version of the secret.
    • 1.c: setSecret – label the newly created secret version with the label AWSPENDING and copy the public key to the worker nodes with AWS Systems Manager Run Command.

    The Lambda function will also perform two steps not shown in the diagram:

    • testSecret – verify that the new SSH keypair has been successfully deployed by invoking a test SSH connection.
    • finishSecret – set the staging label AWSCURRENT to the new secret version and remove the old keys from the worker nodes. This will also set the staging label AWSPREVIOUS to the old secret, allowing your administrator to have the ‘last known password’ if something goes wrong.

    An overview of the rotation Lambda function is available in the AWS Secrets Manager user guide. You have full control over the rotation function so that you can customize it to your needs. Note that no key is installed on the master node. Instead, the function will retrieve the private key from Secrets Manager only when it needs to securely communicate with the worker nodes. That private key is not saved on the master node’s filesystem but rather in volatile memory (per best practice, the private key variable is overwritten after successful authentication and deleted before the script exits); details about keeping secret data in volatile memory will follow later in this post.

  2. When the master node needs to communicate with any worker node, it will use an AWS SDK (Python Boto3) to read the SSH private key from Secrets Manager (2.a) and use the private key to establish an SSH tunnel with the worker nodes (2.b). The master node is authorized to read the private key from Secrets Manager because an AWS Identity and Access Management (IAM) role with a policy that allows it to access the secret is attached to the master node. The corresponding public key was deployed to each of the worker nodes during the rotation process in step one above.
  3. The secrets in Secrets Manager are encrypted with AWS Key Management Service (KMS), and every version of the secret is encrypted with a unique data encryption key. The SSH key pair in the cluster will periodically rotate based on a configurable rotation interval, which you’ll configure from the Secrets Manager console later in this post. Each rotation repeats the process described in steps 1-2, resulting in a new version of the secret. Each new version will be encrypted using a new KMS data key, which provides an extra layer of security.
  4. The AWS Systems Manager Run Command will use the Amazon Elastic Compute Cloud (EC2) tag RotateSSHKeys with a value of True to identify the cluster’s worker node instances. Note that if you rely on tags as a security control, you must have clear governance and control over which users are able to change the tags and tag values on your EC2 instances.

Solution cost

Today, this solution will cost $0.48 an hour for the four t2.micro EC2 instances that comprise the sample cluster. Secrets Manager has a 30-day trial period, after which one secret will cost $0.40 per month and $0.05 per 10,000 API calls. There is no additional charge for AWS Systems Manager.

Deploying the sample solution

In this section, you’ll deploy a test stack that demonstrates the entire solution. After deployment, you’ll log in to the master node and securely copy a file to one of the worker nodes. Finally, you’ll use Secrets Manager to rotate and deploy a new SSH keypair. The CloudFormation templates and secret rotation code are available in the AWS GitHub repository.

Set up the sample deployment by selecting the AWS CloudFormation Launch Stack button below; by default, the stack will be deployed in the us-east-1 (N. Virginia) Region.
 
Select this image to open a link that starts building the CloudFormation stack

The template creates an Amazon Virtual Private Cloud (VPC), private and public subnets, EC2 instances (master node and mock cluster), and the IAM role and policies used for the EC2 instances.

  1. Select your EC2 SSH key pair and input your IP range as stack parameters. In the YourIPRange field, enter the CIDR of your machine or network only, as this ensures only hosts from your network can access the master server. You may leave all other parameters as default. This CloudFormation template launches four t2.micro instances in a new VPC. One instance will be tagged as MasterServer and the rest will be tagged WorkerServer1-3.

    Note: The SSH keypair referenced here will be used to connect from your local computer to the master node. It is distinct from the SSH keypair used by the master node to connect to the worker nodes.

     

    Figure 2: Enter the CIDR of your machine or network

    Important: For simplicity, the master node you’ll create in this walkthrough will be in a public subnet, making it accessible from the CIDR you provided in Step 2. However, this is not the most secure approach possible. Follow the guidance in the Amazon EC2 VPC documentation to securely configure your cluster in a private subnet following the “defense in depth” principle.

  2. Monitor the status of the stack. When the status is CREATE_COMPLETE, the deployment is ready. Select the Outputs tab to find information about the newly created resources, and write down the master node’s public DNS and a worker node IP address. You’ll need both later in this post.
  3. Select the Launch Stack button to launch the AWS CloudFormation template that will deploy the Lambda function used by Secrets Manager. Accept the default values for the parameters. This template is designed for reusability; it can be applied to any SSH rotation use-case.
     
    Select this image to open a link that starts building the CloudFormation stack

Next, create and configure a new secret from the Secrets Manager console to store the cluster communication SSH keypair.

Configuring a secret in AWS Secrets Manager

The CloudFormation template did not deploy a secret, so follow these steps to create a secret from the console and configure its rotation function. To create a new secret:

  1. Open the AWS Secrets Manager console and select Store New Secret.
  2. Select Other type of secrets, then select the Plaintext tab.
  3. As shown in Figure 3, enter {} to create an empty JSON value with no properties. This value will be initially populated with a keypair by the rotation Lambda function.
     
    Figure 3: Create an empty JSON value with no properties

  4. Keep the default encryption key and select Next. We’re keeping the default encryption key for the sake of simplicity in this example, but security best practices suggest using a Customer Master Key (CMK) that you’ve created.
  5. In Step 2: Name and description, name the secret /dev/ssh. The path of a secret can be used in the secret’s IAM policy to restrict users and roles to a secret or hierarchy of secrets. For example, the IAM policy could include /dev/* or /prod/* to control access to secrets in development or production, respectively.
  6. Add a description, then select Next.
     
    Figure 4: Add a description

  7. In Step 3: Configure rotation, choose Enable automatic rotation and enable a rotation interval of your choice, which you can configure using the rotation interval dropdown list.
  8. Select the Choose an AWS Lambda function drop-down and choose RotateSSH. This is the Lambda function that was deployed by the CloudFormation template.
  9. Select Next, then review your configuration and select Store. When the new secret’s configuration is stored, the rotation Lambda function is immediately invoked, populating the value of the secret.
     
    Figure 5: Configure the rotation
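
If you prefer to script this configuration rather than use the console, the boto3 sketch below creates the same secret and enables rotation. The rotation Lambda ARN is a placeholder for the RotateSSH function deployed by the CloudFormation template, and the 30-day interval is just an example.

# A sketch of creating the /dev/ssh secret and enabling rotation with boto3.
# The rotation Lambda ARN is a placeholder; the interval is an example value.
import boto3

sm = boto3.client('secretsmanager')

sm.create_secret(
    Name='/dev/ssh',
    Description='SSH keypair for cluster communication',
    SecretString='{}'   # empty JSON; the rotation function populates the keypair
)

sm.rotate_secret(
    SecretId='/dev/ssh',
    RotationLambdaARN='arn:aws:lambda:us-east-1:123456789012:function:RotateSSH',
    RotationRules={'AutomaticallyAfterDays': 30}
)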

Testing the sample solution

With the secret configuration completed and the instances up and running, you’re now going to securely copy a file from the master node to one of the worker nodes, using the SSH key stored in Secrets Manager to test the solution.

  1. Log in to the master node via SSH, using the EC2 key that you specified in the CloudFormation template.
  2. Once connected, securely copy a file from the master node to the worker node using SCP (secure copy protocol) by entering the command below. Replace <private-ip-of-worker> with the worker node IP address you copied down earlier from the stack outputs:
    
                python copy_file.py ec2-user <private-ip-of-worker>
            

Figure 6 shows ssh login to master node, and the copy_file.py command to worker node.
 

Figure 6: The ssh login to master node, and the copy_file.py command

During execution, the python script will use the Secrets Manager get_secret_value API to retrieve the secret, which includes the private key. It will then use this key to establish a secure SSH connection with the worker nodes, without saving the private key on any master node storage.

You can review the copy_file.py on the master node or on GitHub. In the get_private_key() function, you can read the secret value, which includes the private key:


    get_secret_value_response = client.get_secret_value(
        SecretId=secret_name)

The copy_file function creates a secured SSH tunnel to copy a file, using the private key held in memory via Paramiko, a Python implementation of SSHv2.


    private_key_str = io.StringIO()
    # Write private key to a memory file
    private_key_str.write(private_key)
    # Rewind so Paramiko reads the key from the start of the buffer
    private_key_str.seek(0)
    
    # Create key object
    key = paramiko.RSAKey.from_private_key(private_key_str)
    
    # Open a channel and authenticate 
    trans = paramiko.Transport(ip, 22) 
    trans.start_client()
    trans.auth_publickey(user, key)
    del key        

To demonstrate the rotation of the SSH keypair, you’ll now manually invoke the rotation function:

  1. Return to the Secrets Manager console, select your /dev/ssh secret, and choose Retrieve Secret Value to see the key pair.
  2. Select Rotate secret immediately. In the pop-up window, confirm your choice by selecting Rotate.
     
    Figure 7: Set the "Secret value" and "Rotation configuration"

  3. Choose Rotate again to complete the rotation.
     
    Figure 8: Select "Rotate"

  4. Select the Close button to refresh the view, and then choose Retrieve Secret Value again.
  5. Once the rotation has completed, you can inspect the new key pair via the Secrets Manager console. Go back to the terminal and run the same Python script to copy a file using SCP. Replace <private-ip-of-worker> with your own worker node IP address:
    
                    python copy_file.py ec2-user <private-ip-of-worker>
            

The file has now been transferred successfully using a new key pair, with no updates required.

Auditing and monitoring

You can monitor and audit all APIs used to create and rotate your keys in Secrets Manager via AWS CloudTrail. To view CloudTrail events, follow these steps:

  1. Open the CloudTrail console and select Event history.
  2. From the Filter dropdown field, select Event source, enter secret in the filter field, then select secretsmanager.amazonaws.com from the dropdown menu.
  3. From here, you can review Secrets Manager events, such as GetSecretValue, PutSecretValue, UpdateSecretVersionStage (which modifies the staging labels attached to a version of a secret), and RotationSucceeded, in the CloudTrail event history. These event logs help you audit secret configuration, rotation, and access. You can also query the same events from the AWS CLI, as shown in the example after this list.
     
    Figure 9: The “Event history” window
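If you prefer the command line, you can retrieve the same events with the AWS CLI. The following is a minimal sketch; the Region and result limit are illustrative, and it assumes CloudTrail management events are enabled (they are by default):

    aws cloudtrail lookup-events \
      --lookup-attributes AttributeKey=EventSource,AttributeValue=secretsmanager.amazonaws.com \
      --max-results 20 \
      --region us-east-1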

Additionally, Secrets Manager can work with CloudWatch Events to trigger alerts when administrator-specified operations occur in an organization (for example, to notify you of a secret deletion attempt).
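As a sketch of that pattern, the following CloudWatch Events (now Amazon EventBridge) event pattern matches DeleteSecret calls recorded by CloudTrail; you could attach it to a rule whose target is an SNS topic to receive a notification. The pattern is an illustrative assumption and is not part of the CloudFormation template used in this post:

    {
      "source": ["aws.secretsmanager"],
      "detail-type": ["AWS API Call via CloudTrail"],
      "detail": {
        "eventSource": ["secretsmanager.amazonaws.com"],
        "eventName": ["DeleteSecret"]
      }
    }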

Cleaning up the CloudFormation Stack

To delete the entire CloudFormation stack:

  1. Select the stack named RotateSSH from the CloudFormation console.
  2. Select Actions, and then Delete Stack. This will delete all AWS resources created by the stack.
  3. Repeat the steps above to delete the stack named MasterWorkers.
  4. From the AWS Secrets Manager console, delete the secret /dev/ssh. Read more about Deleting and Restoring a Secret in the AWS Secrets Manager User Guide.

Conclusion

In this post, we demonstrated how you can use AWS Secrets Manager to store, rotate, and deliver SSH key pairs to secure communication within a compute cluster. Keys are securely encrypted and stored in AWS Secrets Manager, which also rotates the keys and installs the public keys on all nodes for you. With this method, you don't have to manually deploy SSH keys on the various EC2 instances or rotate them by hand. APIs associated with secret management and rotation are logged in CloudTrail for auditing and monitoring. The key rotation solution is serverless: it requires no servers to maintain and can scale rapidly.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the AWS Secrets Manager forum.

Want more AWS Security news? Follow us on Twitter.

Author

Assaf Namer

Assaf is a Senior Solutions Architect. He likes coding and hackathons, and enjoys helping customers build reliable and secure cloud solutions. Outside of work, Assaf enjoys spinning and tennis.

Author

Maitreya Ranganath

Maitreya is a Solutions Architect with the Enterprise team. He has a focus on Security and Compliance and enjoys helping customers architect secure, scalable, and cost-effective solutions on AWS.

Building a Serverless FHIR Interface on AWS

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/building-a-serverless-fhir-interface-on-aws/

This post is courtesy of Mithun Mallick, Senior Solutions Architect (Messaging), and Navneet Srivastava, Senior Solutions Architect.

Technology is revolutionizing the healthcare industry but it can be a challenge for healthcare providers to take full advantage because of software systems that don’t easily communicate with each other. A single patient visit involves multiple systems such as practice management, electronic health records, and billing. When these systems can’t operate together, it’s harder to leverage them to improve patient care.

To help make it easier to exchange data between these systems, Health Level Seven International (HL7) developed the Fast Healthcare Interoperability Resources (FHIR), an interoperability standard for the electronic exchange of healthcare information. In this post, I will show you the AWS services you use to build a serverless FHIR interface on the cloud.

In FHIR, resources are your basic building blocks. A resource is an exchangeable piece of content that has a common way to define and represent it, a set of common metadata, and a human readable part. Each resource type has the same set of operations, called interactions, that you use to manage the resources in a granular fashion. For more information, see the FHIR overview.

FHIR Serverless Architecture

My FHIR architecture features a server with its own data repository and a simple consumer application that displays Patient and Observation data. To make it easier to build, my server only supports the JSON content type over HTTPS, and it only supports the Bundle, Patient, and Observation FHIR resource types. In a production environment, your server should support all resource types.

For this architecture, the server supports the following interactions:

  • Posting bundles as collections of Patients and Observations
  • Searching Patients and Observations
  • Updating and reading Patients
  • Creating a CapabilityStatement

You can expand this architecture to support all FHIR resource types, interactions, and data formats.

The following diagram shows how the described services work together to create a serverless FHIR messaging interface.

 

Services work together to create a serverless FHIR messaging interface.

 

Amazon API Gateway

In Amazon API Gateway, you create the REST API that acts as a "front door" for the consumer application to access the data and business logic of this architecture. I used API Gateway to host the API endpoints, and I created the resource definitions and API methods in API Gateway.

For this architecture, the FHIR resources map to the resource definitions in API Gateway. The Bundle FHIR resource type maps to the Bundle API Gateway resource, the Observation FHIR resource type maps to the Observation API Gateway resource, and the Patient FHIR resource type maps to the Patient API Gateway resource.

To keep the API definitions simple, I used the ANY method. The ANY method handles the various URL mappings in the AWS Lambda code, and uses Lambda proxy integration to send requests to the Lambda function.

You can use the ANY method to handle HTTP methods such as the following (a sample template snippet follows the list):

  • POST to represent the interaction to create a Patient resource type
  • GET to read a Patient instance based on a patient ID, or to search based on predefined parameters
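The following is a minimal AWS SAM sketch of that pattern, not the exact template used in this architecture; the function name, handler, code URI, and memory and timeout settings are illustrative assumptions:

    FHIRProxyFunction:
      Type: AWS::Serverless::Function
      Properties:
        CodeUri: ./target/fhir-service.jar
        Handler: com.amazonaws.lab.LambdaHandler::handleRequest
        Runtime: java8
        MemorySize: 512
        Timeout: 30
        Events:
          ProxyApi:
            Type: Api
            Properties:
              Path: /{proxy+}
              Method: ANY

With Lambda proxy integration, API Gateway passes the method, path, headers, and body straight through to the function, and the Jersey application performs the fine-grained URL routing described later in this post.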

We chose Amazon DynamoDB because it provides the input data representation and query patterns necessary for a FHIR data repository. For this architecture, each resource type is stored in its own Amazon DynamoDB table. Metadata for resources stored in the repository is also stored in its own table.

We set up global secondary indexes on the patient and observations tables in order to perform searches and retrieve observations for a patient. In this architecture, the patient id is stored as a patient reference id in the observation table. The patientRefid-index allows you to retrieve observations based on the patient id without performing a full scan of the table.
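To illustrate how this index is used, the following Python (boto3) sketch retrieves the observations for a single patient through the patientRefid-index; the table name, key attribute name, and patient ID are assumptions for illustration only:

    import boto3
    from boto3.dynamodb.conditions import Key

    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("observation")  # assumed table name

    # Query the global secondary index instead of scanning the whole table
    response = table.query(
        IndexName="patientRefid-index",
        KeyConditionExpression=Key("patientRefid").eq("patient-1234")  # assumed attribute name and ID
    )
    for observation in response["Items"]:
        print(observation)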

We chose Amazon S3 to store archived FHIR messages because of its low cost and high durability.

Processing FHIR Messages

Each Amazon API Gateway request in this architecture is backed by an AWS Lambda function containing the Jersey RESTful web services framework, the AWS serverless Java container framework, and the HAPI FHIR library.

The AWS serverless Java container framework provides a base implementation for the handleRequest method in the LambdaHandler class. It uses the serverless Java container initialized in the global scope to proxy requests to our Jersey application.

The handler method calls a proxy class and passes the stream classes along with the context.

This source code from the LambdaHandler class shows the handleRequest method:

// Main entry point of the Lambda function; uses the serverless-java-container initialized in the global scope
// to proxy requests to our Jersey application
public void handleRequest(InputStream inputStream, OutputStream outputStream, Context context)
        throws IOException {

    handler.proxyStream(inputStream, outputStream, context);

    // just in case it wasn't closed by the mapper
    outputStream.close();
}

The resource implementations classes are in the com.amazonaws.lab.resources package. This package defines the URL mappings necessary for routing the REST API calls.

The following method from the PatientResource class implements the GET patient interaction based on a patient id. The annotations describe the HTTP method called, as well as the path that is used to make the call. This method is invoked when a request is sent with the URL pattern: Patient/{id}. It retrieves the Patient resource type based on the id sent as part of the URL.

@GET
@Path("/{id}")
public Response gETPatientid(@Context SecurityContext securityContext,
        @ApiParam(value = "", required = true) @PathParam("id") String id,
        @HeaderParam("Accept") String accepted) {
    …
}

Deploying the FHIR Interface

To deploy the resources for this architecture, we used an AWS Serverless Application Model (SAM) template. During deployment, SAM templates are expanded and transformed into AWS CloudFormation syntax. The template launches and configures all the services that make up the architecture.

Building the Consumer Application

For our architecture, we wrote a simple Node.js client application that calls the APIs on the FHIR server to get a list of patients and related observations. You can build more advanced applications for this architecture. For example, you could build a patient-focused application that displays vitals and immunization charts. Or, you could build a backend/mid-tier application that consumes a large number of messages and transforms them for downstream analytics.

This is the code we used to get the token from Amazon Cognito:

token = authcognito.token();

//Setting url to call FHIR server

     var options = {
       url: "https://<FHIR SERVER>",
       host: "<FHIR SERVER>",
       path: "Prod/Patient",
       method: "GET",
       headers: {
         "Content-Type": "application/json",
         "Authorization": token
       }
     };

This is the code we used to call the FHIR server:

request(options, function(err, response, body) {
    if (err) {
        console.log("In error");
        console.log(err);
    } else {
        let patientlist = JSON.parse(body);

        console.log(patientlist);
        res.json(patientlist["entry"]);
    }
});
 

We used AWS CloudTrail and AWS X-Ray for logging and debugging.


Conclusion

In this post, we demonstrated how to build a serverless FHIR architecture. We used Amazon API Gateway and AWS Lambda to ingest and process FHIR resources, and Amazon DynamoDB and Amazon S3 to provide a repository for the resources. Amazon Cognito provides secure access to the API Gateway. We also showed you how to build a simple consumer application that displays patient and observation data. You can modify this architecture for your individual use case.

About the authors

Mithun Mallick is a Sr. Solutions Architect responsible for helping customers in the HCLS industry build secure, scalable, and cost-effective solutions on AWS. He helps develop and implement strategic plans to engage customers and partners in the industry, and works with the community of technically focused HCLS specialists within AWS. He has hands-on experience with messaging standards like X12, HL7, and FHIR. Mithun has an M.B.A. from CSU (Ft. Collins, CO) and a bachelor's in Computer Engineering. He holds several associate and professional certifications for architecting on AWS.

 

 

Navneet Srivastava, a Sr. Solutions Architect, is responsible for helping provider organizations and healthcare companies deploy electronic medical records, devices, and AI/ML-based applications, while educating customers about how to build secure, scalable, and cost-effective AWS solutions. He develops strategic plans to engage customers and partners, and works with a community of technically focused HCLS specialists within AWS. He is skilled in AI, ML, big data, and healthcare-related technologies. Navneet has an M.B.A. from NYIT and a bachelor's in Software Engineering, and holds several associate and professional certifications for architecting on AWS.

Running AWS Infrastructure On Premises with AWS Outposts

Post Syndicated from Matt Garman original https://aws.amazon.com/blogs/compute/running-aws-infrastructure-on-premises-with-aws-outposts/

We announced AWS Outposts at re:Invent last December and since then have seen immense customer interest. Customers have been asking for an AWS option on-premises to run applications with low latency and local data-processing requirements. AWS Outposts is a new service, slated to launch in late 2019, that brings the same infrastructure, APIs, and tools that customers use in AWS to virtually any customer on-premises facility. It is a fully managed service; the physical infrastructure is delivered and installed by AWS, operated and monitored by AWS, and automatically updated and patched as part of being connected to an AWS Region.

You can use AWS Outposts to launch a range of Amazon EC2 instances (C5, M5, R5, I3en and G4, both with and without local storage options) and Amazon EBS volumes locally. In addition to EC2 and EBS, you can also run a wide range of AWS services locally on Outposts. At general availability, AWS services supported locally on Outposts will include Amazon ECS and Amazon EKS clusters for container-based applications, Amazon EMR clusters for data analytics, and Amazon RDS instances for relational database services. Additional services such as Amazon SageMaker and Amazon MSK are coming soon after launch.

An Outpost works as an extension of the AWS Region into your own data center; services running on the Outpost can seamlessly work with any AWS service or resource running in the cloud. For example, you can use private connectivity to your Amazon S3 buckets or Amazon DynamoDB tables in the public region. AWS tools work with Outposts as well: API calls are logged via CloudTrail automatically, and existing CloudFormation templates continue to work. When AWS launches new innovations, they will work with Outposts so customers can always take advantage of the latest technologies.

We are speaking with customers across verticals including manufacturing, healthcare, financial services, media and entertainment, and telecom who are interested in Outposts for their connected environments. One of the most common scenarios is applications that need single-digit millisecond latency to end-users or onsite equipment. Customers may need to run compute-intensive workloads on their manufacturing factory floors with precision and quality. Others have graphics-intensive applications such as image analysis that need low-latency access to end-users or storage-intensive workloads that collect and process hundreds of TBs of data a day. Customers want to integrate their cloud deployments with their on-premises environments and use AWS services for a consistent hybrid experience.

Some customers are also interested in leveraging AWS services in disconnected environments like cruise ships or remote mining locations. For these types of environments, AWS offers Snowball Edge, which is optimized to operate in environments with limited to no connectivity. Compared to Snowball Edge, Outposts are designed to be run exclusively in connected, on-premises environments.

I want to share an example of how an early user of Outposts is using it to control and operate industrial equipment at hundreds of sites around the world. They already run centralized decision-making applications in AWS to identify what work to execute at which site. Predictable low-latency access to local compute resources is essential for their on-premises control systems to manage materials with smoothness and speed. For instance, control systems need to process video streams to sense the product on the conveyor belt, and execute a robotic movement to direct the product to the right location. Their sites also run video monitoring applications where the captured data can exceed available bandwidth so they want to conduct video encoding on-premises.

We worked with our early customer to deploy an Outpost rack at one of their sites. After connecting their Outpost to the nearest local AWS Region, the customer has complete control over their virtual network, including selection of an IP address range, creation of subnets, and configuration of route tables and network gateways, just like with their Amazon VPC today. They can seamlessly extend their regional VPC to the Outpost by creating a subnet and associating it with the Outpost the same way they associate subnets with an Availability Zone today.

To launch instances on their Outposts, they use the exact same API call as they do in the public region, but target their Outpost subnet so the instances launch on the Outpost in their facility. These instances will run in their existing VPC and even be able to communicate with instances running in the public region via private IP addresses. As the infrastructure is the same hardware as what we use in our public AWS Region, the applications running on these instances will perform the same as they do in the public region. To get low-latency access to local compute and storage already running in their facility, the customer can create a local gateway in their VPC that makes it simple for the Outpost to route traffic directly to their local datacenter networks.
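As a sketch of what that looks like, the following CLI call is the same run-instances call you would make in the Region; the AMI, instance type, and subnet ID are placeholders, and the only Outposts-specific detail is that the subnet is associated with the Outpost:

    aws ec2 run-instances \
      --image-id ami-0123456789abcdef0 \
      --instance-type m5.large \
      --subnet-id subnet-0abcd1234outpost00 \
      --count 1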

Using Outposts, the customer plans to standardize tooling across on-premises and the cloud, and automate deployments and configurations across hundreds of sites by using the same APIs, the same IAM permissions, the same EC2 AMIs, the same CloudFormation templates, and the same deployment pipelines everywhere. As Outposts are updated and patched as part of AWS regional operations, they no longer need to upgrade and patch their on-premises infrastructure or take downtime for maintenance.

We are excited about the enthusiasm we’re seeing from customers, and are eager to help them focus on innovating for their end-users without worrying about procuring, deploying, and operating the infrastructure for their applications. With AWS Outposts, we are bringing the same AWS infrastructure, APIs, services, and tools to help customers build applications that can run as reliably and securely at any of their sites as in the cloud. We recently released a new video that shares more details about the Outposts rack. If you haven’t already seen it, go check it out! As always, we look forward to your feedback.

To learn more about AWS Outposts, visit the product detail page.

Additional Resources:

What is an AWS Outposts Rack?

AWS Nitro System

Creating static custom domain endpoints with Amazon MQ to simplify broker modification and scaling

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/creating-static-custom-domain-endpoints-with-amazon-mq/

This post is courtesy of Wallace Printz, Senior Solutions Architect, AWS, and Christian Mueller, Senior Solutions Architect, AWS.

Many cloud-native application architectures take advantage of the point-to-point and publish-subscribe (“pub-sub”) model of message-based communication between application components. This architecture is generally more resilient to failure because of the loose coupling and because message processing failures can be retried. It’s also more efficient because individual application components can independently scale up or down to maintain message-processing SLAs, compared to monolithic application architectures. Synchronous (REST-based) systems are tightly coupled. A problem in a synchronous downstream dependency has an immediate impact on the upstream callers.

Retries from upstream callers can all too easily fan out and amplify problems. Amazon SQS and Amazon SNS are fully managed messaging services, but they are not necessarily the right tool for the job in some cases. For applications that require messaging protocols such as JMS, NMS, AMQP, STOMP, MQTT, and WebSocket, AWS provides Amazon MQ. Amazon MQ is a managed message broker service for Apache ActiveMQ that makes it easy to set up and operate message brokers in the cloud.

Amazon MQ provides two managed broker deployment connection options: public brokers and private brokers. Public brokers receive internet-accessible IP addresses, while private brokers receive only private IP addresses from the corresponding CIDR range in their VPC subnet.

In some cases, for security purposes, you may prefer to place brokers in a private subnet, but still allow access to them through a persistent public endpoint, such as a subdomain of your corporate domain like mq.example.com.

In this post, we explain how to provision private Amazon MQ brokers behind a secure public load balancer endpoint using an example subdomain.

Architecture overview

There are several reasons one might want to deploy this architecture beyond the security aspects.

First, human-readable URLs are easier for people to parse when reviewing operations and troubleshooting, such as deploying updates to mq-dev.example.com before mq-prod.example.com.

Second, maintaining static URLs for your brokers helps reduce the necessity of modifying client code when performing maintenance on the brokers.

Third, this pattern allows you to vertically scale your brokers without changing the client code or even notifying the clients that changes have been made.

Finally, the same architecture described here works for a network of brokers configuration as well, whereby you could horizontally scale your brokers without impacting the client code.

Prerequisites

This blog post assumes some familiarity with AWS networking fundamentals, such as VPCs, subnets, load balancers, and Amazon Route 53.

When you are finished, the architecture should be set up as shown in the following diagram. For ease of visualization, we demonstrate with a pair of brokers using the active-standby option.

Solution Overview

Amazon MQ solution overview

The client to broker traffic flow is as follows.

  • First, the client service tries to connect with a failover URL to the domain endpoint set up in Route 53. If a client loses the connection, using the failover URL allows the client to automatically try to reconnect to the broker.
  • The client looks up the domain name from Route 53, and Route 53 returns the IP address of the Network Load Balancer.
  • The client creates a secure socket layer (SSL) connection to the Network Load Balancer with an SSL certificate provided from AWS Certificate Manager (ACM). The Network Load Balancer selects from the healthy brokers in its target group and creates a separate SSL connection between the Network Load Balancer and the broker. This provides secure, end-to-end SSL encrypted messaging between client and brokers.

In this diagram, the healthy broker connection is shown in the solid line. The standby broker, which does not reply to connection requests and is therefore marked as unhealthy in the target group, is shown in the dashed line.

Solution walkthrough

To build this architecture, build the network segmentation first, then the Amazon MQ brokers, and finally the network routing.

Setup

First, you need the following resources:

  • A VPC
  • One private subnet per Availability Zone
  • One public subnet for your bastion host (if desired)

This demonstration VPC uses the 20.0.0.0/16 CIDR range.

Additionally, you must create a custom security group for your brokers. Set up this security group to allow traffic from your Network Load Balancer and, if using a network of brokers, among the brokers as well.

This VPC is not being used for any other workloads. This demonstration allows all incoming traffic originating within the VPC, including the Network Load Balancer, through to the brokers on the following ports:

  • OpenWire communication port of 61617
  • Apache ActiveMQ console port of 8162

If you are using a different protocol, adjust the port numbers accordingly.

Create an Amazon MQ security group
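If you prefer to script this step, the following AWS CLI sketch creates an equivalent security group and opens the two ports to the VPC CIDR range; the group name and VPC ID are assumptions:

    SG_ID=$(aws ec2 create-security-group \
      --group-name amazon-mq-brokers \
      --description "OpenWire and ActiveMQ console access from within the VPC" \
      --vpc-id vpc-0123456789abcdef0 \
      --query GroupId --output text)

    aws ec2 authorize-security-group-ingress \
      --group-id "$SG_ID" --protocol tcp --port 61617 --cidr 20.0.0.0/16

    aws ec2 authorize-security-group-ingress \
      --group-id "$SG_ID" --protocol tcp --port 8162 --cidr 20.0.0.0/16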

Building the Amazon MQ brokers

Now that you have the network segmentation set up, build the Amazon MQ brokers. As mentioned previously, this demonstration uses the active-standby pair of private brokers option.

Configure the broker settings by selecting a broker name, instance type, ActiveMQ console user, and password first.

Configure Amazon MQ broker settings

In the Additional Settings area, place the brokers in your previously selected VPC and the associated private subnets.

Configure Amazon MQ additional settings

Finally, select the existing Security Group previously discussed, and make sure that the Public Accessibility option is set to No.

Set Amazon MQ security group settings

That’s it for the brokers. When it is done provisioning, the Amazon MQ dashboard should look like the one shown in the following screenshot. Note the IP addresses of the brokers and the ActiveMQ web console URLs for later.

Amazon MQ dashboard

Configuring a Load Balancer Target Group

The next step in the build process is to configure the load balancer’s target group. This demonstration uses the private IP addresses of the brokers as targets for the Network Load Balancer.

Create and name a target group, select the IP option under Target type, and make sure to select TLS under Protocol and 61617 under Port, as well as the VPC in which your brokers reside. It is important to configure the health check settings so traffic is only routed to active brokers by selecting the TCP protocol and overriding the health check port to 8162, the Apache ActiveMQ console port.

Do not use the OpenWire port as the target group health check port. Because the Network Load Balancer may not be able to recognize the host as healthy on that port, it is better to use the ActiveMQ web console port.

Next, add the brokers’ IP addresses as targets. You can find the broker IP addresses in the Amazon MQ console page after they complete provisioning. Make sure to add both the active and the standby broker to the target group so that when reboots occur, the Network Load Balancer routes traffic to whichever broker is active.

You may be pursuing a more dynamic environment for scaling brokers up and down to handle the demands of a variable message load. In that case, as you scale to add more brokers, make sure that you also add them to the target group.

AWS Lambda would be a great way to programmatically handle adding or removing the broker’s IP addresses to this target group automatically.
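A scripted equivalent of this target group configuration might look like the following sketch; the names, VPC ID, broker IP addresses, and target group ARN are placeholders:

    aws elbv2 create-target-group \
      --name amazon-mq-brokers-tg \
      --protocol TLS --port 61617 \
      --target-type ip \
      --vpc-id vpc-0123456789abcdef0 \
      --health-check-protocol TCP \
      --health-check-port 8162

    aws elbv2 register-targets \
      --target-group-arn <target-group-arn> \
      --targets Id=<active-broker-ip> Id=<standby-broker-ip>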

Creating a Network Load Balancer

Next, create a Network Load Balancer. This demo uses an internet-facing load balancer with TLS listeners on port 61617, and routes traffic to brokers’ VPC and private subnets.

Configure a network load balancer

Clients must securely connect to the Network Load Balancer, so this demo uses an ACM certificate for the subdomain registered in Route 53, such as mq.example.com. For simplicity, ACM certificate provisioning is not shown. For more information, see Request a Public Certificate.

Make sure that the ACM certificate is provisioned in the same Region as your Network Load Balancer, or the certificate is not displayed in the selection menu.

Next, select the target group that you just created, and select TLS for the connection between the Network Load Balancer and the brokers. Similarly, select the health checks on TCP port 8162.

If all went well, you see the list of brokers’ IP addresses listed as targets. From here, review your settings and confirm you’d like to deploy the Network Load Balancer.

Configuring Route 53

The last step in this build is to configure Route 53 to serve traffic at the subdomain of your choice to your Network Load Balancer.

Go to your Route 53 Hosted Zone, and create a new subdomain record set, such as mq.example.com, that matches the ACM certificate that you previously created. In the Type field, select A – IPv4 address, then select Yes for Alias. This allows you to select the Network Load Balancer as the alias target. Select the Network Load Balancer that you just created from the Alias Target menu and save the record set.
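You can also create the same alias record from the CLI. The following sketch uses placeholders for the hosted zone ID, the Network Load Balancer DNS name, and the load balancer's canonical hosted zone ID (which is different from your domain's hosted zone ID):

    aws route53 change-resource-record-sets \
      --hosted-zone-id <your-hosted-zone-id> \
      --change-batch '{
        "Changes": [{
          "Action": "CREATE",
          "ResourceRecordSet": {
            "Name": "mq.example.com",
            "Type": "A",
            "AliasTarget": {
              "HostedZoneId": "<nlb-canonical-hosted-zone-id>",
              "DNSName": "<nlb-dns-name>",
              "EvaluateTargetHealth": false
            }
          }
        }]
      }'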

Testing broker connectivity

And that’s it!

There's an important advantage to this architecture. When you create Amazon MQ active-standby brokers, the Amazon MQ service provides two endpoints. Only one broker host is active at a time; when configuration changes or other reboot events occur, the standby broker becomes active and the active broker goes to standby. The typical connection string, when there is an option to connect to multiple brokers, is similar to the following:

"failover:(ssl://b-ce452fbe-2581-4003-8ce2-4185b1377b43-1.mq.us-west-2.amazonaws.com:61617,ssl://b-ce452fbe-2581-4003-8ce2-4185b1377b43-2.mq.us-west-2.amazonaws.com:61617)"

In this architecture, you use only a single connection URL, but you still want to use the failover protocol to force re-connection if the connection is dropped for any reason.

For ease of use, this solution relies on the Amazon MQ workshop client application code from re:Invent 2018. To test this solution, set the connection URL to the following:

"failover:(ssl://mq.example.com:61617

Run the producer and consumer clients in separate terminal windows.

The messages are sent and received successfully across the internet, while the brokers are hidden behind the Network Load Balancer.

Logging into the broker’s ActiveMQ console

But what if we want to log in to the broker’s ActiveMQ web console?

There are three options. Due to the security group rules, only traffic originating from inside the VPC is allowed to the brokers.

  • Use a VPN connection from the corporate network to the VPC. Most customers likely use this option, but for rapid testing, there is a simple and cost-effective method.
  • Connect to the brokers’ web console through a Route 53 subdomain, which requires creating a separate port 8162 Listener on the existing Network Load Balancer and creating a separate TLS target group on port 8162 for the brokers.
  • Use a bastion host to proxy traffic to the web console.

To use a bastion host, create a small Linux EC2 instance in your public subnet, and make sure that:

  • The EC2 instance has a public IP address.
  • You have access to the SSH key pair.
  • It is placed in a security group that allows SSH port 22 traffic from your location.

For simplicity, this step is not shown, but this demonstration uses a t3.micro Amazon Linux 2 host with all default options as the bastion.

Creating a forwarding tunnel

Next, create a forwarding tunnel through an SSH connection to the bastion host. The following example command keeps a persistent SSH connection that forwards port 8162 through the bastion host's public IP address.

For example, the command could be:

ssh -D 8162 -C -q -N -i <my-key-pair-name>.pem ec2-user@<ec2-ip-address>

You can also configure a browser to tunnel traffic through your proxy.

We have chosen to demonstrate in Firefox. Configure the network settings to use a manual proxy on localhost on the Apache ActiveMQ console port of 8162.  This can be done by opening the Firefox Connection Settings.  In the Configure Proxy Access to the Internet section, select Manual proxy configuration, then set the SOCKS Host to localhost and Port to 8162, leaving other fields empty.

Finally, use the Apache ActiveMQ console URL provided in the Amazon MQ web console details page to connect to the broker through the proxy.

ActiveMQ screenshot

Conclusion

Congratulations! You’ve successfully built a highly available Amazon MQ broker pair in a private subnet. You’ve layered your security defense by putting the brokers behind a highly scalable Network Load Balancer, and you’ve configured routing from a single custom subdomain URL to multiple brokers with health check built in.

To learn more about Amazon MQ and scalable broker communication patterns, we highly recommend the following resources:

Keep on building!

Top Resources for API Architects and Developers

Post Syndicated from George Mao original https://aws.amazon.com/blogs/architecture/top-resources-for-api-architects-and-developers/

We hope you’ve enjoyed reading our series on API architecture and development. We wrote about best practices for REST APIs with Amazon API Gateway  and GraphQL APIs with AWS AppSync. This post will cover the top resources that all API developers should be aware of.

Tech Talks, Webinars, and Twitch Live Stream

The technical staff at AWS have produced a variety of digital media that cover new service launches, best practices, and customer questions. Be sure to review these videos for tips and tricks on building APIs:

  • Happy Little APIs: This is a multi-part series produced by our awesome Developer Advocate, Eric Johnson. He leads a series of talks that demonstrate how to build a real-world API.
  • API Gateway's WebSocket webinar: API Gateway now supports real-time APIs with WebSockets. This webinar covers how to use this feature and why you should let API Gateway manage your real-time APIs.
  • Best practices for building enterprise-grade APIs: API Gateway reduces the time it takes to build and deploy REST APIs, but there are strategies that can make development, security, and management easier.
  • An Intro to AWS AppSync and GraphQL: AppSync helps you build sophisticated data applications with real-time and offline capabilities.

Gain Experience With Hands-On Workshops and Examples

One of the easiest ways to get started with Serverless REST API development is to use the Serverless Application Model (SAM). SAM lets you run APIs and Lambda functions locally on your machine for easy development and testing.

For example, you can configure API Gateway as an Event source for Lambda with just a few lines of code:

Type: Api
Properties:
  Path: /photos
  Method: post

There are many great examples on our GitHub page to help you get started with Authorization (IAM and Amazon Cognito), Request, Response, various policies, and CORS configurations for API Gateway.

If you’re working with GraphQL, you should review the Amplify Framework. This is an official AWS project that helps you quickly build Web Applications with built in AuthN and backend APIs using REST or GraphQL. With just a few lines of code, you can have Amplify add all required configurations for your GraphQL API. You have two options to integrate your application with an AppSync API:

  1. Directly using the Amplify GraphQL Client
  2. Using the AWS AppSync SDK

An excellent walk through of the Amplify toolkit is available here, including an example showing how to create a single page web app using ReactJS powered by an AppSync GraphQL API.

Finally, if you are interested in a full hands on experience, take a look at:

  • The Amazon API Gateway WildRydes workshop. This workshop teaches you how to build a functional single page web app with a REST backend, powered by API Gateway.
  • The AWS AppSync GraphQL Photo Workshop. This workshop teaches you how to use Amplify to quickly build a Photo sharing web app, powered by AppSync.

Useful Documentation

The official AWS documentation is the source of truth for architects and developers. Get started with the API Gateway developer guide. API Gateway currently has two APIs (V1 and V2) for managing the service. You can also view the SDK and CLI reference.

Get started with the AppSync developer guide, and review the AppSync management API.

Summary

As an API architect, your job is not only to design and implement the best API for your use case, but also to figure out which type of API is most cost-effective for your product. For example, an application with high request volume ("chatty") may benefit from a GraphQL implementation instead of REST.

API Gateway currently charges $3.50 / million requests and provides a free tier of 1 Million requests per month. There is tiered pricing that will reduce your costs as request volume rises. AppSync currently charges $4.00 / million for Query and Mutation requests.

While AppSync pricing per request is slightly higher, keep in mind that the nature of GraphQL APIs typically results in significantly fewer overall requests.
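As a rough, purely illustrative example (using the list prices above and ignoring the free tier, caching, and data transfer): a REST design that serves 30 million requests per month would cost about 30 x $3.50 = $105 in API Gateway request charges, while a GraphQL design that consolidates those calls into 10 million queries would cost about 10 x $4.00 = $40 in AppSync request charges. The actual ratio depends entirely on how chatty the REST design would have been.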

Finally, we encourage you to join us in the coming weeks — we will be starting a series of posts covering messaging best practices.

About the Author

George MaoGeorge Mao is a Specialist Solutions Architect at Amazon Web Services, focused on the Serverless platform. George is responsible for helping customers design and operate Serverless applications using services like Lambda, API Gateway, Cognito, and DynamoDB. He is a regular speaker at AWS Summits, re:Invent, and various tech events. George is a software engineer and enjoys contributing to open source projects, delivering technical presentations at technology events, and working with customers to design their applications in the Cloud. George holds a Bachelor of Computer Science and Masters of IT from Virginia Tech.

Building and testing polyglot applications using AWS CodeBuild

Post Syndicated from Prakash Palanisamy original https://aws.amazon.com/blogs/devops/building-and-testing-polyglot-applications-using-aws-codebuild/

Prakash Palanisamy, Solutions Architect

Microservices are becoming the new normal, and it’s natural to use multiple different programming languages for different microservices in the same application. This blog post explains how easy it is to build polyglot applications, test them, and package them for deployment using a single AWS CodeBuild project.

CodeBuild adds support for polyglot builds using runtime-versions. With CodeBuild, it is possible to specify multiple runtimes in the buildspec file as part of the install phase. Runtimes can be composed of different major versions of the same programming language, or of different programming languages altogether. For a complete list of supported runtime versions and how they must be specified in the buildspec file, see the build specification reference for CodeBuild.

This post provides an example of a microservices application comprised of three different microservices written using different programming languages. The complete code is available in the GitHub repository aws-codebuild-polyglot-application.

  • A microservice providing a greeting message (microservice-greeting) is written in Python.
  • A microservice obtaining users' details (microservice-name) from an Amazon DynamoDB table is written in Node.js.
  • Another microservice (microservice-webapp), which makes API calls to the name and greeting services and provides a personalized greeting, is written in Java.

All of these microservices are deployed as serverless functions in AWS Lambda and are front-ended by an API interface using Amazon API Gateway. To enable automated packaging and deployment, the example uses the AWS Serverless Application Model (AWS SAM) and the AWS SAM CLI. The complete application is deployed locally using DynamoDB Local and the sam local command. Automated UI testing uses the built-in headless browsers in the standard CodeBuild containers. Because DynamoDB Local and SAM Local run on the Docker runtime, privileged mode must be enabled in the CodeBuild project.

The example buildspec file contains the following code:

version: 0.2

env:
  variables:
    ARTIFACT_BUCKET: "polyglot-codebuild-artifact-eu-west-1"

phases:
  install:
    runtime-versions:	
      nodejs: 10
      java: corretto8
      python: 3.7
      docker: 18
    commands:
      - pip3 install --upgrade aws-sam-cli selenium pylint
  pre_build:
    commands:
      - python --version
      - pylint $CODEBUILD_SRC_DIR/microservices-greeting/greeting_handler.py
      - own_ip=`hostname -i`
      - sed -ie "s/CODEBUILD_IP/$own_ip/g" $CODEBUILD_SRC_DIR/env-webapp.json
  build:
    commands:
      - node --version
      - cd $CODEBUILD_SRC_DIR/microservices-name
      - npm install
      - npm run build
      - java -version
      - cd $CODEBUILD_SRC_DIR/microservices-webapp
      - mvn clean package -Plambda
      - cd $CODEBUILD_SRC_DIR
      - |
        sam package --template-file greeting-sam.yaml \
        --output-template-file greeting-sam-packaged.yaml \
        --s3-bucket $ARTIFACT_BUCKET
      - |
        sam package --template-file name-sam.yaml \
        --output-template-file name-sam-packaged.yaml \
        --s3-bucket $ARTIFACT_BUCKET
      - |
        sam package --template-file webapp-sam.yaml \
        --output-template-file webapp-sam-packaged.yaml \
        --s3-bucket $ARTIFACT_BUCKET
  post_build:
    commands:
      - cd $CODEBUILD_SRC_DIR
      - echo "Starting local DynamoDB instance"
      - docker network create codebuild-local
      - |
        docker run -d -p 8000:8000 --network codebuild-local \
        --name dynamodb amazon/dynamodb-local
      - |
        aws dynamodb create-table --endpoint-url http://127.0.0.1:8000 \
        --table-name NamesTable \
        --attribute-definitions AttributeName=Id,AttributeType=S \
        --key-schema AttributeName=Id,KeyType=HASH \
        --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
      - |
        aws dynamodb --endpoint-url http://127.0.0.1:8000 batch-write-item \
        --cli-input-json file://ddb_names.json
      - echo "Starting APIs locally..."
      - |
        nohup sam local start-api --template greeting-sam.yaml \
        --port 3000 --host 0.0.0.0 &> greeting.log &
      - |
        nohup sam local start-api --template name-sam.yaml \
        --port 3001 --host 0.0.0.0 --env-vars env.json \
        --docker-network codebuild-local &> name.log &
      - |
        nohup sam local start-api --template webapp-sam.yaml \
        --port 3002 --host 0.0.0.0 \
        --env-vars env-webapp.json &> webapp.log &
      - cd $CODEBUILD_SRC_DIR/website
      - nohup python3 -m http.server 8080 &>webserver.log &
      - sleep 20
      - echo "Starting headless UI testing..."
      - python $CODEBUILD_SRC_DIR/tests/testsuite.py
artifacts:
  files:
    - greeting-sam-packaged.yaml
    - name-sam-packaged.yaml
    - webapp-sam-packaged.yaml

In the env section, an environment variable named ARTIFACT_BUCKET is declared and initialized. It contains the name of the S3 bucket that the sam package command uses to upload the deployment artifacts referenced by the generated AWS SAM templates. This variable can be overridden either in the build project or while running the build. For more information about the order of precedence, see Run a Build (Console).

In the install phase, there are two sequences: runtime-versions and commands. As part of the runtime versions, four runtimes are specified: three programming language runtimes (Node.js, Java, and Python), which are required to build the microservices, and the Docker runtime, which is needed to run the application locally with the sam local command. In the commands sequence, the required packages are installed and updated.

In the pre_build phase, use pylint to lint the Python-based microservice, and get the IP address of the CodeBuild container, which is used later when running the application locally.

In the build phase, use the command sequence to build individual microservices based on their programming language. Use the sam package command to create the Lambda deployment package and generate an updated AWS SAM template.

In the post_build phase, start a DynamoDB Local container as a daemon, create a table in that database, and insert the user details into that table. After that, start local instances of the greeting, name, and web app microservices using the sam local command. The repository also contains a simple static website that can make API calls to these microservices and provide a user-friendly response. This website is hosted locally using Python's built-in HTTP server. After the application starts, run the automated UI tests by executing the testsuite.py script. This test script uses Selenium WebDriver; it runs UI tests on headless Firefox and Chrome browsers to validate whether the local deployment of the application works as expected.
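The actual test suite lives in tests/testsuite.py in the repository; the following Python sketch only illustrates the headless pattern it relies on, and the page URL and title check are assumptions:

    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.headless = True  # run Firefox without a display, as in the CodeBuild container

    driver = webdriver.Firefox(options=options)
    try:
        # Load the locally hosted website started in the post_build phase
        driver.get("http://localhost:8080")
        assert "Greeting" in driver.title  # assumed page title, for illustration
    finally:
        driver.quit()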

If all the phases complete successfully, the updated AWS SAM template files are stored as artifacts based on the specification in artifacts section.

The repository also contains an AWS CloudFormation template that creates a pipeline on AWS CodePipeline with AWS CodeCommit as sources. It builds the application using the previously mentioned buildspec in CodeBuild, and deploys the application using AWS CloudFormation.

Conclusion

In this post, you learned how to use the runtime-versions capability in CodeBuild to build a polyglot application, and you learned about automated UI testing using the built-in headless browsers in CodeBuild. Using the polyglot build capabilities of CodeBuild, you can build applications developed in multiple programming languages with a single build project, instead of creating and maintaining multiple build projects. Because all the dependent components are built as part of the same project, this also makes testing easier, including testing the integration between components.

How to add DNS filtering to your NAT instance with Squid

Post Syndicated from Nicolas Malaval original https://aws.amazon.com/blogs/security/how-to-add-dns-filtering-to-your-nat-instance-with-squid/

Note from September 4, 2019: We’ve updated this blog post, initially published on January 26, 2016. Major changes include: support of Amazon Linux 2, no longer having to compile Squid 3.5, and a high availability version of the solution across two availability zones.

Amazon Virtual Private Cloud (Amazon VPC) enables you to launch AWS resources on a virtual private network that you’ve defined. On an Amazon VPC, many people use network address translation (NAT) instances and NAT gateways to enable instances in a private subnet to initiate outbound traffic to the Internet, while preventing the instances from receiving inbound traffic initiated by someone on the Internet.

For security and compliance purposes, you might have to filter the requests initiated by these instances (also known as “egress filtering”). Using iptables rules, you could restrict outbound traffic with your NAT instance based on a predefined destination port or IP address. However, you might need to enforce more complex security policies, such as allowing requests to AWS endpoints only, or blocking fraudulent websites, which you can’t easily achieve by using iptables rules.

In this post, I discuss and give an example of how to use Squid, a leading open-source proxy, to implement a “transparent proxy” that can restrict both HTTP and HTTPS outbound traffic to a given set of Internet domains, while being fully transparent for instances in the private subnet.

The solution architecture

In this section, I present the architecture of the high availability NAT solution and explain how to configure Squid to filter traffic transparently. Later in this post, I’ll provide instructions about how to implement and test the solution.

The following diagram illustrates how the components in this process interact with each other. Squid Instance 1 intercepts HTTP/S requests sent by instances in Private Subnet 1, including the Testing Instance. Squid Instance 1 then initiates a connection with the destination host on behalf of the Testing Instance, which goes through the Internet gateway. This solution spans two Availability Zones, with Squid Instance 2 intercepting requests sent from the other Availability Zone. Note that you may adapt the solution to span additional Availability Zones.
 

Figure 1: The solution spans two Availability Zones

Figure 1: The solution spans two Availability Zones

Intercepting and filtering traffic

In each availability zone, the route table associated to the private subnet sends the outbound traffic to the Squid instance (see Route Tables for a NAT Device). Squid intercepts the requested domain, then applies the following filtering policy:

  • For HTTP requests, Squid retrieves the host header field included in all HTTP/1.1 request messages. This specifies the Internet host being requested.
  • For HTTPS requests, the HTTP traffic is encapsulated in a TLS connection between the instance in the private subnet and the remote host. Squid cannot retrieve the host header field because the header is encrypted. A feature called SslBump would allow Squid to decrypt the traffic, but this would not be transparent for the client because the certificate would be considered invalid in most cases. The feature I use instead, called SslPeekAndSplice, retrieves the Server Name Indication (SNI) from the TLS initiation. The SNI contains the requested Internet host. As a result, Squid can make filtering decisions without decrypting the HTTPS traffic.

Note 1: Some older client-side software stacks do not support SNI. Here are the minimum versions of some important stacks and programming languages that support SNI: Python 2.7.9 and 3.2, Java 7 JSSE, wget 1.14, OpenSSL 0.9.8j, cURL 7.18.1

Note 2: TLS 1.3 introduced an optional extension that allows the client to encrypt the SNI, which may prevent Squid from intercepting the requested domain.

The SslPeekAndSplice feature was introduced in Squid 3.5 and is implemented in the same Squid module as SslBump. To enable this module, Squid requires that you provide a certificate, though it will not be used to decode HTTPS traffic. The solution creates a certificate using OpenSSL.


mkdir /etc/squid/ssl
cd /etc/squid/ssl
openssl genrsa -out squid.key 4096
openssl req -new -key squid.key -out squid.csr -subj "/C=XX/ST=XX/L=squid/O=squid/CN=squid"
openssl x509 -req -days 3650 -in squid.csr -signkey squid.key -out squid.crt
cat squid.key squid.crt >> squid.pem        

The following code shows the Squid configuration file. For HTTPS traffic, note the ssl_bump directives instructing Squid to “peek” (retrieve the SNI) and then “splice” (become a TCP tunnel without decoding) or “terminate” the connection depending on the requested host.


visible_hostname squid
cache deny all

# Log format and rotation
logformat squid %ts.%03tu %6tr %>a %Ss/%03>Hs %<st %rm %ru %ssl::>sni %Sh/%<a %mt
logfile_rotate 10
debug_options rotate=10

# Handling HTTP requests
http_port 3128
http_port 3129 intercept
acl allowed_http_sites dstdomain "/etc/squid/whitelist.txt"
http_access allow allowed_http_sites

# Handling HTTPS requests
https_port 3130 cert=/etc/squid/ssl/squid.pem ssl-bump intercept
acl SSL_port port 443
http_access allow SSL_port
acl allowed_https_sites ssl::server_name "/etc/squid/whitelist.txt"
acl step1 at_step SslBump1
acl step2 at_step SslBump2
acl step3 at_step SslBump3
ssl_bump peek step1 all
ssl_bump peek step2 allowed_https_sites
ssl_bump splice step3 allowed_https_sites
ssl_bump terminate step2 all
http_access deny all       

The text file located at /etc/squid/whitelist.txt contains the list of whitelisted domains, with one domain per line. In this blog post, I’ll show you how to configure Squid to allow requests to *.amazonaws.com, which corresponds to AWS endpoints. Note that you can restrict access to a specific set of AWS services that you’ve defined (see Regions and Endpoints for a detailed list of endpoints), or you can set your own list of domains.
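For example, a whitelist.txt that allows only AWS endpoints could contain the single entry below. Squid's dstdomain and ssl::server_name ACL types treat a leading dot as "this domain and all of its subdomains":

    .amazonaws.com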

Note: An alternate approach is to use VPC endpoints to privately connect your VPC to supported AWS services without requiring access over the Internet (see VPC Endpoints). Some supported AWS services allow you to create a policy that controls the use of the endpoint to access AWS resources (see VPC Endpoint Policies, and VPC Endpoints for a list of supported services).

You may have noticed that Squid listens on port 3129 for HTTP traffic and 3130 for HTTPS. Because Squid cannot directly listen to 80 and 443, you have to redirect the incoming requests from instances in the private subnets to the Squid ports using iptables. You do not have to enable IP forwarding or add any FORWARD rule, as you would do with a standard NAT instance.


sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 3129
sudo iptables -t nat -A PREROUTING -p tcp --dport 443 -j REDIRECT --to-port 3130       

The solution stores the files squid.conf and whitelist.txt in an Amazon Simple Storage Service (S3) bucket and runs the following script every minute on the Squid instances to download and update the Squid configuration from S3. This makes it easy to maintain the Squid configuration from a central location. Note that it first validates the files with squid -k parse and then reload the configuration with squid -k reconfigure if no error was found.


        cp /etc/squid/* /etc/squid/old/
        aws s3 sync s3://<s3-bucket> /etc/squid
        squid -k parse && squid -k reconfigure || (cp /etc/squid/old/* /etc/squid/; exit 1)     
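A simple way to run the script every minute is a cron entry on the Squid instances, along the lines of the following; the script path and log file are assumptions, and the solution's CloudFormation template may wire this up differently:

    * * * * * /usr/local/sbin/update-squid-config.sh >> /var/log/update-squid-config.log 2>&1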

The solution then uses the CloudWatch Agent on the Squid instances to collect and store Squid logs in Amazon CloudWatch Logs. The log group /filtering-nat-instance/cache.log contains the error and debug messages that Squid generates and /filtering-nat-instance/access.log contains the access logs.

An access log record is a space-delimited string that has the following format:

<time> <response_time> <client_ip> <status_code> <size> <method> <url> <sni> <remote_host> <mime>

The following list describes the fields of an access log record.

  • time: Request time, in seconds since epoch
  • response_time: Response time, in milliseconds
  • client_ip: Client source IP address
  • status_code: Squid request status and HTTP response code sent to the client. For example, an HTTP request to an unallowed domain logs TCP_DENIED/403, and an HTTPS request to a whitelisted domain logs TCP_TUNNEL/200
  • size: Total size of the response sent to the client
  • method: Request method, such as GET or POST
  • url: Request URL received from the client. Logged for HTTP requests only
  • sni: Domain name intercepted in the SNI. Logged for HTTPS requests only
  • remote_host: Squid hierarchy status and remote host IP address
  • mime: MIME content type. Logged for HTTP requests only

The following are some examples of access log records:


1563718817.184 14 10.0.0.28 TCP_DENIED/403 3822 GET http://example.com/ - HIER_NONE/- text/html
1563718821.573 7 10.0.0.28 TAG_NONE/200 0 CONNECT 172.217.7.227:443 example.com HIER_NONE/- -
1563718872.923 32 10.0.0.28 TCP_TUNNEL/200 22927 CONNECT 52.216.187.19:443 calculator.s3.amazonaws.com ORIGINAL_DST/52.216.187.19 -

Designing a high availability solution

The Squid instances introduce a single point of failure for the private subnets. If a Squid instance fails, the instances in its associated private subnet cannot send outbound traffic anymore. The following diagram illustrates the architecture that I propose to address this situation within an Availability Zone.
 

Figure 2: The architecture to address if a Squid instance fails within an Availability Zone

Figure 2: The architecture to address if a Squid instance fails within an Availability Zone

Each Squid instance is launched in an Amazon EC2 Auto Scaling group that has a minimum size and a maximum size of one instance. A shell script is run at startup to configure the instances. That includes installing and configuring Squid (see Running Commands on Your Linux Instance at Launch).

The solution uses the CloudWatch Agent and its procstat plugin to collect the CPU usage of the Squid process every 10 seconds. For each Squid instance, the solution creates a CloudWatch alarm that watches this custom metric and goes to an ALARM state when a data point is missing. This can happen, for example, when Squid crashes or the Squid instance fails. Note that for my use case, I consider watching the Squid process a sufficient approach to determining the health status of a Squid instance, although it cannot detect cases where the Squid process is alive but unable to forward traffic. As a workaround, you can use an end-to-end monitoring approach, such as using witness instances in the private subnets to send test requests at regular intervals and collect a custom metric.

When an alarm goes to ALARM state, CloudWatch sends a notification to an Amazon Simple Notification Service (SNS) topic which then triggers an AWS Lambda function. The Lambda function marks the Squid instance as unhealthy in its Auto Scaling group, retrieves the list of healthy Squid instances based on the state of other CloudWatch alarms, and updates the route tables that currently route traffic to the unhealthy Squid instance to instead route traffic to the first available healthy Squid instance. While the Auto Scaling group automatically replaces the unhealthy Squid instance, private instances can send outbound traffic through the Squid instance in the other Availability Zone.
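
The following Python sketch illustrates the core of such a function using boto3. It is not the actual Lambda function deployed by the CloudFormation template: the alarm naming convention, the FALLBACK_SQUID_INSTANCE_ID environment variable, and the route_tables_targeting helper are assumptions made for this illustration.

import json
import os

import boto3

ec2 = boto3.client("ec2")
autoscaling = boto3.client("autoscaling")


def route_tables_targeting(instance_id):
    """Return the IDs of route tables whose routes currently point at the given instance."""
    response = ec2.describe_route_tables(
        Filters=[{"Name": "route.instance-id", "Values": [instance_id]}]
    )
    return [rt["RouteTableId"] for rt in response["RouteTables"]]


def handler(event, context):
    # The CloudWatch alarm notification arrives as a JSON document in the SNS message body.
    alarm = json.loads(event["Records"][0]["Sns"]["Message"])

    # Assumption for this sketch: the alarm name ends with the instance ID it watches,
    # for example "squid-alarm/i-0123456789abcdef0".
    unhealthy_instance_id = alarm["AlarmName"].split("/")[-1]

    # Assumption for this sketch: the healthy fallback instance ID is passed in as an
    # environment variable. The real function derives it from the state of the other alarms.
    healthy_instance_id = os.environ["FALLBACK_SQUID_INSTANCE_ID"]

    # Let the Auto Scaling group terminate and replace the failed instance.
    autoscaling.set_instance_health(
        InstanceId=unhealthy_instance_id,
        HealthStatus="Unhealthy",
        ShouldRespectGracePeriod=False,
    )

    # Repoint the default route of the impacted private subnets to the healthy instance.
    for route_table_id in route_tables_targeting(unhealthy_instance_id):
        ec2.replace_route(
            RouteTableId=route_table_id,
            DestinationCidrBlock="0.0.0.0/0",
            InstanceId=healthy_instance_id,
        )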

When the CloudWatch agent starts collecting the custom metric again on the replacement Squid instance, the alarm reverts to the OK state. Similarly, CloudWatch sends a notification to the SNS topic, which then triggers the Lambda function. The Lambda function completes the lifecycle action (see Amazon EC2 Auto Scaling Lifecycle Hooks) to indicate that the replacement instance is ready to serve traffic, and updates the route table associated with the private subnet in the same Availability Zone to route traffic to the replacement instance.
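
The recovery path can be sketched in the same way. The lifecycle hook name, Auto Scaling group name, and the way the IDs are passed in are assumptions for this illustration; the real function derives them from the notification it receives.

import boto3

autoscaling = boto3.client("autoscaling")
ec2 = boto3.client("ec2")


def complete_replacement(instance_id, route_table_id):
    # Tell the Auto Scaling group that the replacement instance is ready to serve traffic.
    autoscaling.complete_lifecycle_action(
        LifecycleHookName="squid-launch-hook",      # assumed hook name
        AutoScalingGroupName="squid-asg-az1",       # assumed group name
        LifecycleActionResult="CONTINUE",
        InstanceId=instance_id,
    )
    # Point the private subnet's default route back to the replacement instance.
    ec2.replace_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        InstanceId=instance_id,
    )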

Implementing and testing the solution

Now that you understand the architecture behind this solution, you can follow the instructions in this section to implement and test the solution in your AWS account.

Implementing the solution

First, you’ll use AWS CloudFormation to provision the required resources. Select the Launch Stack button below to open the CloudFormation console and create a stack from the template. Then, follow the on-screen instructions.

Select this image to open a link that starts building the CloudFormation stack

CloudFormation will create the following resources:

  • An Amazon Virtual Private Cloud (Amazon VPC) with an internet gateway attached.
  • Two public subnets and two private subnets on the Amazon VPC.
  • Three route tables. The first route table is associated with the public subnets to make them publicly accessible. The other two route tables are associated with the private subnets.
  • An S3 bucket to store the Squid configuration files, and two Lambda-based custom resources to add the files squid.conf and whitelist.txt to this bucket.
  • An IAM role to grant the Squid instances permissions to read from the S3 bucket and use the CloudWatch agent.
  • A security group to allow HTTP and HTTPS traffic from instances in the private subnets.
  • A launch configuration to specify the template of Squid instances. That includes commands to run at startup for automating the initial configuration.
  • Two Auto Scaling groups that use this launch configuration to launch the Squid instances.
  • A Lambda function to redirect the outbound traffic and recover a Squid instance when it fails.
  • Two CloudWatch alarms to watch the custom metric sent by Squid instances and trigger the Lambda function when the health status of Squid instances changes.
  • An EC2 instance in the first private subnet to test the solution, and an IAM role to grant this instance permissions to use the SSM agent. Session Manager, which I introduce in the next paragraph, uses this SSM agent (see Working with SSM Agent).

Testing the solution

After the stack creation has completed (it can take up to 10 minutes), connect to the Testing Instance using Session Manager, a capability of AWS Systems Manager that lets you manage instances through an interactive shell without the need to open an SSH port:

  1. Open the AWS Systems Manager console.
  2. In the navigation pane, choose Session Manager.
  3. Choose Start Session.
  4. For Target instances, choose the option button to the left of Testing Instance.
  5. Choose Start Session.

Note: Session Manager makes calls to several AWS endpoints (see Working with SSM Agent). If you prefer to restrict access to a defined set of AWS services, make sure to whitelist the associated domains.

After the connection is made, you can test the solution with the following commands. Only the last three requests should return a valid response, because Squid allows traffic to *.amazonaws.com only.


curl http://www.amazon.com
curl https://www.amazon.com
curl http://calculator.s3.amazonaws.com/index.html
curl https://calculator.s3.amazonaws.com/index.html
aws ec2 describe-regions --region us-east-1         

To find the requests you just made in the access logs, here’s how to browse the Squid logs in Amazon CloudWatch Logs:

  1. Open the Amazon CloudWatch console.
  2. In the navigation pane, choose Logs.
  3. For Log Groups, choose the log group /filtering-nat-instance/access.log.
  4. Choose Search Log Group to view and search log records.

To test how the solution behaves when a Squid instance fails, you can terminate one of the Squid instances manually in the Amazon EC2 console. Then, watch the CloudWatch alarm change its state in the Amazon CloudWatch console, or watch the solution change the default route of the impacted route table in the Amazon VPC console.

You can now delete the CloudFormation stack to clean up the resources that were just created.

Discussion: Transparent or forward proxy?

The solution that I describe in this blog is fully transparent for instances in the private subnets, which means that instances don’t need to be aware of the proxy and can make requests as if they were behind a standard NAT instance. An alternate solution is to deploy a forward proxy in your Amazon VPC and configure instances in private subnets to use it (see the blog post How to set up an outbound VPC proxy with domain whitelisting and content filtering for an example). In this section, I discuss some of the differences between the two solutions.

Supportability

A major drawback with forward proxies is that the proxy must be explicitly configured on every instance within the private subnets. For example, you can configure the HTTP_PROXY and HTTPS_PROXY environment variables on Linux instances, but some applications or services, like yum, require their own proxy configuration, or don’t support proxy usage. Note also that some AWS services and features, like Amazon EMR or Amazon SageMaker notebook instances, don’t support using a forward proxy at the time of this post. However, with TLS 1.3, a forward proxy is the only option to restrict outbound traffic if the SNI is encrypted.

Scalability

Deploying a forward proxy on AWS usually consists of a load balancer distributing traffic to a set of proxy instances launched in an Auto Scaling group. Proxy instances can be launched or terminated dynamically depending on the demand (also known as "horizontal scaling"). With the transparent proxy solution described in this post, each route table can route traffic to a single instance at a time, and changing the type of the instance is the only way to increase or decrease the capacity (also known as "vertical scaling").

The solution I present in this post does not dynamically adapt the instance type of the Squid instances based on the demand. However, you might consider a mechanism in which the traffic from a private subnet is temporarily redirected through another Availability Zone while the Squid instance is being relaunched by Auto Scaling with a smaller or larger instance type.

Mutualization

Deploying a centralized proxy solution and using it across multiple VPCs is a way of reducing cost and operational complexity.

With a forward proxy, instances in private subnets send IP packets to the proxy load balancer. Therefore, sharing a forward proxy across multiple VPCs only requires connectivity between the “instance VPCs” and a proxy VPC that has VPC Peering or equivalent capabilities.

With a transparent proxy, instances in private subnets send IP packets to the remote host. VPC Peering does not support transitive routing (see Unsupported VPC Peering Configurations) and cannot be used to share a transparent proxy across multiple VPCs. However, you can now use AWS Transit Gateway, which acts as a network transit hub, to share a transparent proxy across multiple VPCs. I give an example in the next section.

Sharing the solution across multiple VPCs using AWS Transit Gateway

In this section, I give an example of how to share a transparent proxy across multiple VPCs using AWS Transit Gateway. The architecture is illustrated in the following diagram. For the sake of simplicity, the diagram does not include Availability Zones.
 

Figure 3: The architecture for a transparent proxy across multiple VPCs using AWS Transit Gateway

Here’s how instances in the private subnet of “VPC App” can make requests via the shared transparent proxy in “VPC Shared:”

  1. When instances in VPC App make HTTP/S requests, the network packets they send have the public IP address of the remote host as the destination address. These packets are forwarded to the transit gateway, based on the route table associated with the private subnet.
  2. The transit gateway receives the packets and forwards them to VPC Shared, based on the default route of the transit gateway route table.
  3. When the packets arrive in VPC Shared, they arrive through the transit gateway attachment, which resides in the transit gateway subnet. They are then forwarded to the Squid instance because the next hop is determined by the route table associated with the transit gateway subnet.
  4. The Squid instance makes requests on behalf of the source instance ("Instances" in the schema). Then, it sends the response to the source instance. The packets that it emits have the IP address of the source instance as the destination address and are forwarded to the transit gateway according to the route table associated with the public subnet.
  5. The transit gateway receives and forwards the response packets to VPC App.
  6. Finally, the response reaches the source instance.

In a high availability deployment, you could have one transit gateway subnet per Availability Zone that sends traffic to the Squid instance that resides in the same Availability Zone, or to the Squid instance in another Availability Zone if the instance in the same Availability Zone fails.

You could also use AWS Transit Gateway to implement a transparent proxy solution that scales horizontally. This allows you to add or remove proxy instances based on the demand, instead of changing the instance type. With this approach, you must deploy a fleet of proxy instances (launched by an Auto Scaling group, for example) and establish a VPN connection between each instance and the transit gateway. The proxy instances need to support ECMP ("Equal Cost Multipath routing"; see Transit Gateways) to spread the outbound traffic equally between instances. I don't describe this alternative architecture further in this blog post.

Conclusion

In this post, I’ve shown how you can use Squid to implement a high availability solution that filters outgoing traffic to the Internet and helps meet your security and compliance needs, while being fully transparent for the back-end instances in your VPC. I’ve also discussed the key differences between transparent proxies and forward proxies. Finally, I gave an example of how to share a transparent proxy solution across multiple VPCs using AWS Transit Gateway.

If you have any questions or suggestions, please leave a comment below or on the Amazon VPC forum.

If you have feedback about this blog post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Nicolas Malaval

Nicolas is a Solution Architect for Amazon Web Services. He lives in Paris and helps our healthcare customers in France adopt cloud technology and innovate with AWS. Before that, he spent three years as a Consultant for AWS Professional Services, working with enterprise customers.

Things to Consider When You Build a GraphQL API with AWS AppSync

Post Syndicated from Steve Johnson original https://aws.amazon.com/blogs/architecture/things-to-consider-when-you-build-a-graphql-api-with-aws-appsync/

Co-authored by George Mao

When building a serverless API layer in AWS (one that provides a custom grammar for your serverless resources), your choices include Amazon API Gateway (REST) and AWS AppSync (GraphQL). We’ve discussed the differences between REST and GraphQL in our first post of this series and explored REST APIs in our second post. This post will dive deeper into GraphQL implementation with AppSync.

Note that the GraphQL specification is focused on grammar and expected behavior, and is light on implementation details. Therefore, each GraphQL implementation approaches these details in a different way. While this blog post speaks to architectural principles, it will also discuss specific features of AppSync for practical advice.

Schema vs. Resolver Complexity

All GraphQL APIs are defined by their schema. The schema contains operation definitions (Queries, Mutations, and Subscriptions), as well as data definitions (Data and Input Types). Since GraphQL tools provide introspection, your Schema also becomes your API documentation. So, as your Schema grows, it’s important that it’s consistent and adheres to best practices (such as the use of Input Types for mutations).

Clients will love using your API if they can do what they want with as little work as possible. A good rule of thumb is to put any necessary complexity in the resolver rather than in the client code. For example, if you know client applications will need “Book” information that includes the cover art and current sales ranking – all from different data sources – you can build a single data type that combines them:

GraphQL API -1

In this case, the complexity associated with assembling that data should be handled in the resolver, rather than forcing the client code to make multiple calls and manipulate the returned data.

Mapping data storage to schema gets more complex when you are accessing legacy data services: internal APIs, external REST services, relational database SQL, and services with custom protocols or libraries. The other end of the spectrum is new development, where some architects argue they can map their entire schema to a single source. In practice, your mapping should consider the following:

  • Should slower data sources have a caching layer?
  • Do your most frequently used operations have low latency?
  • How many layers (services) does a request touch?
  • Can high latency requests be modeled asynchronously?

With AppSync, you have the option to use Pipeline Resolvers, which execute reusable functions within a resolver context. Each function in the pipeline can call one of the native resolver types.

Security

Public APIs (ones with an external endpoint) provide access to secured resources. The first line of defense is authorization – restricting who can call an operation in the GraphQL Schema. AppSync has four methods to authenticate clients:

  • API keys: Since the API key does not reference an identity and is easily compromised, we recommend it be used for development only.
  • IAM: These are standard AWS credentials that are often used for server-side processes. A common example is assigning an execution role to an AWS Lambda function that makes calls to AppSync.
  • OIDC Tokens: These are time-limited and suitable for external clients.
  • Cognito User Pool Tokens: These provide the advantages of OIDC tokens, and also allow you to use the @auth transform for authorization.

Authorization in GraphQL is handled in the resolver logic, which allows field-level access to your data, depending on any criteria you can express in the resolver. This allows flexibility, but also introduces complexity in the form of code.

AppSync Resolvers use a Request/Response template pattern similar to API Gateway. Like API Gateway, the template language is Apache Velocity. In an AppSync resolver, you have the ability to examine the incoming authentication information (such as the IAM username) in the context variable. You can then compare that username against an owner field in the data being retrieved.

AppSync provides declarative security using the @auth and @model transforms. Transforms are annotations you add to your schema that are interpreted by the Amplify Toolchain. Using the @auth transform, you can apply different authentication types to different operations. AppSync will automatically generate resolver logic for DynamoDB, based on the data types in your schema. You can also define field-level permissions based on identity. Get detailed information.

Performance

To get a quantitative view of your API’s performance, you should enable field-level logging on your AppSync API. Doing so will automatically emit information into CloudWatch Logs. Then you can analyze AppSync performance with CloudWatch Logs Insights to identify performance bottlenecks and the root cause of operational problems, such as:

  • Resolvers with the maximum latency
  • The most (or least) frequently invoked resolvers
  • Resolvers with the most errors

Remember, choice of resolver type has an impact on performance. When accessing your data sources, you should prefer a native resolver type such as Amazon DynamoDB or Amazon Elasticsearch Service using VTL templates. The HTTP resolver type is very efficient, but latency depends on the performance of the downstream service. Lambda resolvers provide flexibility, but have the performance characteristics of your application code.

AppSync also has another resolver type, the Local Resolver, which doesn’t interact with a data source. Instead, this resolver invokes a mutation operation that will result in a subscription message being sent. This is useful in use cases where AppSync is used as a message bus, or in cases where the data has been modified by an external source, and notifications must be sent without modifying the data a second time.

GraphQL API -2

GraphQL Subscriptions

One of the reasons customers choose GraphQL is the power of Subscriptions. These are notifications that are sent immediately to clients when data has been changed by a mutation. AppSync subscriptions are implemented using WebSockets, and are directly tied to a mutation in the schema. The AppSync SDKs and AWS Amplify Library allow clients to subscribe to these real-time notifications.

AppSync Subscriptions have many uses outside standard API CRUD operations. They can be used for inter-client communication, such as a mobile or web chat application. Subscription notifications can also be used to provide asynchronous responses to long-running requests. The initial request returns quickly, while the full result can be sent via subscription when it’s complete (local resolvers are useful for this pattern).

Subscriptions will only be received if the client is currently running, connected to the server, and is subscribed to the target mutation. If you have a mobile client, you may want to augment these notifications with mobile or web push notifications.

Summary

The choice of using a GraphQL API brings many advantages, especially for client developers. While basic considerations of Security and Performance are important (as they are in any solution), GraphQL APIs require some thought and planning around Schema and Resolver organization, and managing their complexity.

About the author

Steve Johnson

Steve Johnson is a Specialist Solutions Architect at Amazon Web Services, focused on Mobile Applications. Steve helps customers design Mobile and GraphQL applications using AWS Amplify, Amazon AppSync, Amazon Cognito, and the AWS Serverless Suite of products. He is a speaker at AWS Summits and various tech events. Steve is a software and systems engineer and enjoys tinkering with all things mechanical and cloud related. He holds a Bachelor of Mechanical Engineering and Masters in Software Systems Engineering. Steve lives and works near us-east-1, because the latency is good.

 

Improve Productivity and Reduce Overhead Expenses with Red Hat OpenShift Dedicated on AWS

Post Syndicated from Ryan Niksch original https://aws.amazon.com/blogs/architecture/improve-productivity-and-reduce-overhead-expenses-with-red-hat-openshift-dedicated-on-aws/

Red Hat OpenShift on AWS helps you develop, deploy, and manage container-based applications across on-premises and cloud environments. A recent case study from Cathay Pacific Airways proved that the use of the Red Hat OpenShift application platform can significantly improve developer productivity and reduce operational overhead by automating infrastructure, application deployment, and scaling. In this post, I explore how the architectural implementation and customization options of Red Hat OpenShift dedicated on AWS can cater to a variety of customer needs.

Red Hat OpenShift is a turnkey solution providing a container runtime, Kubernetes orchestration, container image repositories, pipeline, build process, monitoring, logging, role-based access control, granular policy-based control, and abstractions to simplify functions. Deploying a single turnkey solution, instead of building and integrating a collection of independent solutions or services, allows you to invest more time and effort in building meaningful applications for your business.

In the past, Red Hat OpenShift was deployed on Amazon EC2 using an automated provisioning process with an open source solution, like the Red Hat OpenShift on AWS Quick Start. The Red Hat OpenShift Quick Start is an infrastructure as code solution that accelerates customer provisioning of Red Hat OpenShift on AWS. The OpenShift Quick Start adheres to the reference architecture to deploy Red Hat OpenShift on AWS in a resilient, scalable, well-architected manner. This reference architecture models the control plane as a collection of load balanced master nodes for traffic routing, session state, scheduling, and monitoring. It also contains the application nodes where the customer's containerized workloads run. This solution allowed customers to get up and running within three hours; however, it did not reduce management overhead because customers were required to monitor and maintain the infrastructure of the Red Hat OpenShift cluster.

Red Hat and AWS listened to customer feedback and created Red Hat OpenShift dedicated, a fully managed OpenShift implementation running exclusively on AWS. This implementation monitors the layers and functions, scales the layers to cater to consumption needs, and addresses operational concerns.

Customers now have access to a platform that helps manage control planes for business-critical solutions, like their developer and operational platforms.

Red Hat OpenShift Dedicated Infrastructure on AWS

You can purchase Red Hat OpenShift dedicated through the Red Hat account team. Red Hat OpenShift dedicated comes in two varieties: the Standard edition and the Cloud Choice edition (bring your own cloud).


Figure 1: Red Hat OpenShift architecture illustrating master and infrastructure nodes spread over three Availability Zones and placed behind elastic load balancers.

Red Hat OpenShift dedicated adheres to the reference architecture defined by AWS and Red Hat. Master and infrastructure layers are spread across three AWS Availability Zones, providing resilience within the OpenShift solution as well as the underlying infrastructure.

Red Hat OpenShift Dedicated Standard Edition

In the Red Hat OpenShift dedicated standard edition, Red Hat deploys the OpenShift cluster into an AWS account owned and managed by Red Hat. Red Hat provides an aggregated bill for the OpenShift subscription fees, management fees, and AWS billing. This edition is ideal for customers who want everything to be managed for them. The Red Hat site reliability engineering (SRE) team will monitor and manage healing, scaling, and patching of the cluster.

Red Hat OpenShift Cloud Choice Edition

The cloud choice edition allows customers to create their own AWS account, and then have the Red Hat OpenShift dedicated infrastructure provisioned into their existing account. The Red Hat SRE team provisions the Red Hat OpenShift cluster into the customer owned AWS account and manages the solution via IAM roles.

Figure 2: Red Hat OpenShift Cloud Choice IAM role separation

Red Hat provides billing for the Red Hat OpenShift Cloud Choice subscription and management fees, and AWS provides billing for the AWS resources. Keeping the Red Hat OpenShift infrastructure within your AWS account allows better cost controls.

Red Hat OpenShift Cloud Choice provides visibility into the resources running in your account, which is desirable if you have regulatory and auditing concerns. You can inspect, monitor, and audit resources within the AWS account, taking advantage of the rich AWS service set (AWS CloudTrail, AWS Config, Amazon CloudWatch, and AWS Cost Explorer).

You can also take advantage of cost management solutions like AWS Organizations and consolidated billing. Customers with multiple business units using AWS can combine the usage across their accounts to share the volume pricing discounts, resulting in cost savings for projects, departments, and companies.

Red Hat OpenShift Cloud Choice dedicated cannot be deployed into an account currently hosting other applications and resources. In order to maintain separation of control with the managed service, Red Hat OpenShift Cloud Choice dedicated requires an AWS account dedicated to the managed Red Hat OpenShift solution.

You can take advantage of cost reductions of up to 70% using Reserved Instances, which are well suited to the instances that run continuously. This is ideal for the master and infrastructure nodes of the Red Hat OpenShift solutions running in your account. The reference architecture for Red Hat OpenShift on AWS recommends spanning nodes over three Availability Zones, which translates to three master instances. The master and infrastructure nodes scale differently, so there will be three additional instances for the infrastructure nodes. Purchasing Reserved Instances to offset the costs of the master nodes and the infrastructure nodes can free up funds for your next project.

Interactions

DevOps teams using either edition of Red Hat OpenShift dedicated have a rich console experience providing control over networking between application workloads, storage, and monitoring. Granular drill down consoles enable operations teams to focus on what is most critical to their organization.

Each interface is controlled through granular role-based access control. Teams have visibility of high-level cluster overviews, where they can see visualizations of the overall health of the cluster, and they have access to more granular views of hosts, nodes, and containers. Application owners, key stakeholders, and operations teams have access to a customizable dashboard displaying the running state. Teams can drill down to the underlying nodes, and further into the pods and containers, should they wish to explore the status or overall health of the containerized microservices. The cluster-wide event stream provides the same drill down experience to logging events.

The drill down console menu options are illustrated in the screenshots below:

In summary, the partnership of Red Hat and AWS created a fully managed solution which directly answers customer feedback requests for a fully managed application platform running on the availability, scalability, and cost benefits of AWS. The solution allows visibility and control whenever and wherever you need it.

About the author

Ryan Niksch

Ryan Niksch is a Partner Solutions Architect focusing on application platforms, hybrid application solutions, and modernization. Ryan has worn many hats in his life and has a passion for tinkering and a desire to leave everything he touches a little better than when he found it.

Things to Consider When You Build REST APIs with Amazon API Gateway

Post Syndicated from George Mao original https://aws.amazon.com/blogs/architecture/things-to-consider-when-you-build-rest-apis-with-amazon-api-gateway/

A few weeks ago, we kicked off this series with a discussion on REST vs GraphQL APIs. This post will dive deeper into the things an API architect or developer should consider when building REST APIs with Amazon API Gateway.

Request Rate (a.k.a. “TPS”)

Request rate is the first thing you should consider when designing REST APIs. By default, API Gateway allows up to 10,000 requests per second. You should use the built-in Amazon CloudWatch metrics to review how your API is being used. The Count metric in particular can help you review the total number of API requests in a given period.
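
For example, the following boto3 sketch sums the Count metric over the last 24 hours for a hypothetical API named PetStore with a prod stage; the API name, stage, and time window are placeholders.

from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client("cloudwatch")

# Total number of API requests, in 5-minute buckets, over the last 24 hours.
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/ApiGateway",
    MetricName="Count",
    Dimensions=[
        {"Name": "ApiName", "Value": "PetStore"},  # hypothetical API name
        {"Name": "Stage", "Value": "prod"},        # hypothetical stage name
    ],
    StartTime=datetime.utcnow() - timedelta(hours=24),
    EndTime=datetime.utcnow(),
    Period=300,
    Statistics=["Sum"],
)

for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], int(point["Sum"]))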

It’s important to understand the actual request rate that your architecture is capable of supporting. For example, consider this architecture:

REST API 1

This API accepts GET requests to retrieve a user's cart by using a Lambda function to perform SQL queries against a relational database managed in RDS. If you receive a large burst of traffic, both API Gateway and Lambda will scale in response to the traffic. However, relational databases typically have limited memory/CPU capacity and will quickly exhaust the total number of connections.

As an API architect, you should design your APIs to protect your downstream applications. You can start by defining API Keys and requiring your clients to deliver a key with incoming requests. This lets you track each application or client who is consuming your API. This also lets you create Usage Plans and throttle your clients according to the plan you define. For example, if you know your architecture is capable of sustaining 200 requests per second, you should define a Usage Plan that sets a rate of 200 RPS and optionally configure a quota to allow a certain number of requests by day, week, or month.
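
The following boto3 sketch shows what that setup could look like. The API ID, stage name, key name, and limits are placeholders, not values from this post.

import boto3

apigw = boto3.client("apigateway")

# Create a usage plan that throttles clients to 200 requests per second
# and caps them at 100,000 requests per day. IDs and names are placeholders.
plan = apigw.create_usage_plan(
    name="standard-clients",
    apiStages=[{"apiId": "a1b2c3d4e5", "stage": "prod"}],
    throttle={"rateLimit": 200.0, "burstLimit": 400},
    quota={"limit": 100000, "period": "DAY"},
)

# Create an API key for a client application and attach it to the plan.
key = apigw.create_api_key(name="mobile-app", enabled=True)
apigw.create_usage_plan_key(
    usagePlanId=plan["id"],
    keyId=key["id"],
    keyType="API_KEY",
)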

Additionally, API Gateway lets you define throttling settings for the whole stage or per method. If you know that a GET operation is less resource intensive than a POST operation you can override the stage settings and set different throttling settings for each resource.

Integrations and Design patterns

The example above describes a synchronous, tightly coupled architecture where the request must wait for a response from the backend integration (RDS in this case). This results in system scaling characteristics that are the lowest common denominator of all components. Instead, you should look for opportunities to design an asynchronous, loosely coupled architecture. A decoupled architecture separates the data ingestion from the data processing and allows you to scale each system separately. Consider this new architecture:

REST API 2

This architecture enables ingestion of orders directly into a highly scalable and durable data store such as Amazon Simple Queue Service (SQS). Your backend can process these orders at any speed that is suitable for your business requirements and system capacity. Most importantly, the health of the backend processing system does not impact your ability to continue accepting orders.

Security

Security with API Gateway falls into three major buckets, and I’ll outline them below. Remember, you should enable all three options to combine multiple layers of security.

Option 1 (Application Firewall)

You can enable AWS Web Application Firewall (WAF) for your entire API. WAF will inspect all incoming requests and block requests that fail your inspection rules. For example, WAF can inspect requests for SQL Injection, Cross Site Scripting, or whitelisted IP addresses.

Option 2 (Resource Policy)

You can apply a Resource Policy that protects your entire API. This is an IAM policy that is applied to your API and you can use this to white/black list client IP ranges or allow AWS accounts and AWS principals to access your API.

Option 3 (AuthZ)

  1. IAM: This AuthZ option requires clients to sign requests with the AWS v4 signing process. The associated IAM role or user must have permissions to perform the execute-api:Invoke action against the API.
  2. Cognito: This AuthZ option requires clients to login into Cognito and then pass the returned ID or Access JWT token in the Authentication header.
  3. Lambda Auth: This AuthZ option is the most flexible and lets you execute a Lambda function to perform any custom auth strategy needed. A common use case for this is OpenID Connect.

A Couple of Tips

Tip #1: Use Stage variables to avoid hard coding your backend Lambda and HTTP integrations. For example, you probably have multiple stages such as "QA" and "PROD" or "V1" and "V2." You can define the same variable in each stage and specify different values. For example, you might have an API that executes a Lambda function. In each stage, define the same variable called functionArn. You can reference this variable as your Lambda ARN during your integration configuration using this notation: ${stageVariables.functionArn}. API Gateway will inject the corresponding value for the stage dynamically at runtime, allowing you to execute different Lambda functions by stage.
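
Here is a sketch of how you might set that variable per stage with boto3. The REST API ID, stage names, and function ARNs are placeholders.

import boto3

apigw = boto3.client("apigateway")


def set_function_arn(rest_api_id, stage_name, function_arn):
    # Set (or overwrite) the functionArn stage variable on a given stage.
    apigw.update_stage(
        restApiId=rest_api_id,
        stageName=stage_name,
        patchOperations=[
            {"op": "replace", "path": "/variables/functionArn", "value": function_arn}
        ],
    )


# Placeholders: each stage points at its own Lambda function.
set_function_arn("a1b2c3d4e5", "QA", "arn:aws:lambda:us-east-1:123456789012:function:cart-qa")
set_function_arn("a1b2c3d4e5", "PROD", "arn:aws:lambda:us-east-1:123456789012:function:cart-prod")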

Tip #2: Use Path and Query variables to inject dynamic values into your HTTP integrations. For example, your cart API may define a userId Path variable that is used to lookup a user’s cart: /cart/profile/{userId}. You can inject this variable directly into your backend HTTP integration URL settings like this: http://myapi.someds.com/cart/profile/{userId}

Summary

This post covered strategies you should use to ensure your REST API architectures are scalable and easy to maintain.  I hope you’ve enjoyed this post and our next post will cover GraphQL API architectures with AWS AppSync.

About the Author

George MaoGeorge Mao is a Specialist Solutions Architect at Amazon Web Services, focused on the Serverless platform. George is responsible for helping customers design and operate Serverless applications using services like Lambda, API Gateway, Cognito, and DynamoDB. He is a regular speaker at AWS Summits, re:Invent, and various tech events. George is a software engineer and enjoys contributing to open source projects, delivering technical presentations at technology events, and working with customers to design their applications in the Cloud. George holds a Bachelor of Computer Science and Masters of IT from Virginia Tech.

How to Architect APIs for Scale and Security

Post Syndicated from George Mao original https://aws.amazon.com/blogs/architecture/how-to-architect-apis-for-scale-and-security/

We hope you’ve enjoyed reading our posts on best practices for your serverless applications. This series of posts will focus on best practices and concepts you should be familiar with when you architect APIs for your applications. We’ll kick this first post off with a comparison between REST and GraphQL API architectures.

Introduction

Developers have been creating RESTful APIs for a long time, typically using HTTP methods, such as GET, POST, DELETE to perform operations against the API. Amazon API Gateway is designed to make it easy for developers to create APIs at any scale without managing any servers. API Gateway will handle all of the heavy lifting needed including traffic management, security, monitoring, and version/environment management.

GraphQL APIs are relatively new, with a primary design goal of allowing clients to define the structure of the data that they require. AWS AppSync allows you to create flexible APIs that access and combine multiple data sources.

REST APIs

Architecting a REST API is structured around creating combinations of resources and methods. Resources are paths that are present in the request URL, and methods are HTTP actions that you take against the resource. For example, you may define a resource called "cart": http://myapi.somecompany.com/cart. The cart resource can respond to HTTP POSTs for adding items to a shopping cart or HTTP GETs for retrieving the items in your cart. With API Gateway, you would implement the API like this:
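
As a rough boto3 sketch of that setup, the following creates the cart resource with GET and POST methods on an existing REST API. The API ID and function ARN are placeholders, and both methods are backed by a Lambda proxy integration here for brevity, whereas you could also use a direct service integration (for example, DynamoDB for the GET) as described below.

import boto3

apigw = boto3.client("apigateway")

rest_api_id = "a1b2c3d4e5"  # placeholder REST API ID
lambda_arn = "arn:aws:lambda:us-east-1:123456789012:function:cart-handler"  # placeholder

# Find the root resource ("/") and create the /cart resource under it.
root_id = next(
    r["id"] for r in apigw.get_resources(restApiId=rest_api_id)["items"] if r["path"] == "/"
)
cart = apigw.create_resource(restApiId=rest_api_id, parentId=root_id, pathPart="cart")

# Add GET (retrieve the cart) and POST (add items), both backed by a Lambda proxy integration.
for http_method in ("GET", "POST"):
    apigw.put_method(
        restApiId=rest_api_id,
        resourceId=cart["id"],
        httpMethod=http_method,
        authorizationType="NONE",
    )
    apigw.put_integration(
        restApiId=rest_api_id,
        resourceId=cart["id"],
        httpMethod=http_method,
        type="AWS_PROXY",
        integrationHttpMethod="POST",  # Lambda invocations are always POSTs
        uri=(
            "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/"
            + lambda_arn + "/invocations"
        ),
    )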

Behind the scenes, you can integrate with nearly any backend to provide the compute logic, data persistence, or business work flows.  For example, you can configure an AWS Lambda function to perform the addition of an item to a shopping cart (HTTP POST).  You can also use API Gateway to directly interact with AWS services like Amazon DynamoDB.  An example is using API Gateway to retrieve items in a cart from DynamoDB (HTTP GET).

RESTful APIs tend to use Path and Query parameters to inject dynamic values into APIs. For example, if you want to retrieve a specific cart with an id of abcd123, you could design the API to accept a query or path parameter that specifies the cartID:

/cart?cartId=abcd123 or /cart/abcd123

Finally, when you need to add functionality to your API, the typical approach would be to add additional resources.  For example, to add a checkout function, you could add a resource called /cart/checkout.

GraphQL APIs

Architecting GraphQL APIs is not structured around resources and HTTP verbs; instead, you define your data types and configure where the operations will retrieve data through a resolver. An operation is either a query or a mutation. Queries simply retrieve data, while mutations are used when you want to modify data. If we use the same example from above, you could define a Cart data type as follows:

type Cart {
  cartId: ID!
  customerId: String
  name: String
  email: String
  items: [String]
}

Next, you configure the fields in the Cart to map to specific data sources. AppSync is then responsible for executing resolvers to obtain appropriate information. Your client will send a HTTP POST to the AppSync endpoint with the exact shape of the data they require. AppSync is responsible for executing all configured resolvers to obtain the requested data and return a single response to the client.

Figure: Two example getCart queries requesting different sets of fields

With GraphQL, the client can change their query to specify the exact data that is needed. The above example shows two queries that ask for different sets of information. The first getCart query asks for all of the static customer information (customerId, name, email) and a list of items in the cart. The second query asks only for the customer's static information. Based on the incoming query, AppSync will execute the correct resolver(s) to obtain the data. The client submits the payload via an HTTP POST to the same endpoint in both cases. The payload of the POST body is the only thing that changes.

As we saw above, a REST based implementation would require the API to define multiple HTTP resources and methods or path/query parameters to accomplish this.

AppSync also provides other powerful features that are not possible with REST APIs such as real-time data synchronization and multiple methods of authentication at the field and operation level.

Summary

As you can see, these are two different approaches to architecting your API. In our next few posts, we'll cover specific features and architecture details you should be aware of when choosing between API Gateway (REST) and AppSync (GraphQL) APIs. In the meantime, you can read more about working with API Gateway and AppSync.

About the Author

George MaoGeorge Mao is a Specialist Solutions Architect at Amazon Web Services, focused on the Serverless platform. George is responsible for helping customers design and operate Serverless applications using services like Lambda, API Gateway, Cognito, and DynamoDB. He is a regular speaker at AWS Summits, re:Invent, and various tech events. George is a software engineer and enjoys contributing to open source projects, delivering technical presentations at technology events, and working with customers to design their applications in the Cloud. George holds a Bachelor of Computer Science and Masters of IT from Virginia Tech.

Simple Two-way Messaging using the Amazon SQS Temporary Queue Client

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/simple-two-way-messaging-using-the-amazon-sqs-temporary-queue-client/

This post is contributed by Robin Salkeld, Sr. Software Development Engineer

Amazon SQS is a fully managed message queuing service that makes it easy to decouple and scale microservices, distributed systems, and serverless applications. Asynchronous workflows have always been the primary use case for SQS. Using queues ensures one component can keep running smoothly without losing data when another component is unavailable or slow.

We were surprised, then, to discover that many customers use SQS in synchronous workflows. For example, many applications use queues to communicate between frontends and backends when processing a login request from a user.

Why would anyone use SQS for this? The service stores messages for up to 14 days with high durability, but messages in a synchronous workflow often must be processed within a few minutes, or even seconds. Why not just set up an HTTPS endpoint?

The more we talked to customers, the more we understood. Here’s what we learned:

  • Creating a queue is often easier and faster than creating an HTTPS endpoint and the infrastructure necessary to ensure the endpoint’s scalability.
  • Queues are safe by default because they are locked down to the AWS account that created them. In addition, any DDoS attempt on your service is absorbed by SQS instead of loading down your own servers.
  • There is generally no need to create firewall rules for the communication between microservices if they use queues.
  • Although SQS provides durable storage (which isn’t necessary for short-lived messages), it is still a cost-effective solution for this use case. This is especially true when you consider all the messaging broker maintenance that is done for you.

However, setting up efficient two-way communication through one-way queues requires some non-trivial client-side code. In our previous two-part post series on implementing enterprise integration patterns with AWS messaging services, Point-to-point channels and Publish-subscribe channels, we discussed the Request-Response Messaging Pattern. In this pattern, each requester creates a temporary destination to receive each response message.

The simplest approach is to create a new queue for each response, but this is like building a road just so a single car can drive on it before tearing it down. Technically, this can work (and SQS can create and delete queues quickly), but we can definitely make it faster and cheaper.

To better support short-lived, lightweight messaging destinations, we are pleased to present the Amazon SQS Temporary Queue Client. This client makes it easy to create and delete many temporary messaging destinations without inflating your AWS bill.

Virtual queues

The key concept behind the client is the virtual queue. Virtual queues let you multiplex many low-traffic queues onto a single SQS queue. Creating a virtual queue only instantiates a local buffer to hold messages for consumers as they arrive; there is no API call to SQS and no costs associated with creating a virtual queue.

The Temporary Queue Client includes the AmazonSQSVirtualQueuesClient class for creating and managing virtual queues. This class implements the AmazonSQS interface and adds support for attributes related to virtual queues. You can create a virtual queue using this client by calling the CreateQueue API action and including the HostQueueURL queue attribute. This attribute specifies the existing SQS queue on which to host the virtual queue. The queue URL for a virtual queue is in the form <host queue URL>#<virtual queue name>. For example:

https://sqs.us-east-2.amazonaws.com/123456789012/MyQueue#MyVirtualQueueName

When you call the SendMessage or SendMessageBatch API actions on AmazonSQSVirtualQueuesClient with a virtual queue URL, the client first extracts the virtual queue name. It then attaches this name as an additional message attribute to each message, and sends the messages to the host queue. When you call the ReceiveMessage API action on a virtual queue, the calling thread waits for messages to appear in the in-memory buffer for the virtual queue. Meanwhile, a background thread polls the host queue and dispatches messages to these buffers according to the additional message attribute.

This mechanism is similar to how the AmazonSQSBufferedAsyncClient prefetches messages, and the benefits are similar. A single call to SQS can provide messages for up to 10 virtual queues, reducing the API calls that you pay for by up to a factor of ten. Deleting a virtual queue simply removes the client-side resources used to implement them, again without making API calls to SQS.

The diagram below illustrates the end-to-end process for sending messages through virtual queues:

Sending messages through virtual queues

Virtual queues are similar to virtual machines. Just as a virtual machine provides the same experience as a physical machine, a virtual queue divides the resources of a single SQS queue into smaller logical queues. This is ideal for temporary queues, since they frequently only receive a handful of messages in their lifetime. Virtual queues are currently implemented entirely within the Temporary Queue Client, but additional support and optimizations might be added to SQS itself in the future.

In most cases, you don’t have to manage virtual queues yourself. The library also includes the AmazonSQSTemporaryQueuesClient class. This class automatically creates virtual queues when the CreateQueue API action is called and creates host queues on demand for all queues with the same queue attributes. To optimize existing application code that creates and deletes queues, you can use this class as a drop-in replacement implementation of the AmazonSQS interface.

The client also includes the AmazonSQSRequester and AmazonSQSResponder interfaces, which enable two-way communication through SQS queues. The following is an example of an RPC implementation for a web application’s login process.

/**
 * This class handles a user's login request on the client side.
 */
public class LoginClient {

    // The SQS queue to send the requests to.
    private final String requestQueueUrl;

    // The AmazonSQSRequester creates a temporary queue for each response.
    private final AmazonSQSRequester sqsRequester = AmazonSQSRequesterClientBuilder.defaultClient();

    public LoginClient(String requestQueueUrl) {
        this.requestQueueUrl = requestQueueUrl;
    }

    /**
     * Send a login request to the server.
     */
    public String login(String body) throws TimeoutException {
        SendMessageRequest request = new SendMessageRequest()
                .withMessageBody(body)
                .withQueueUrl(requestQueueUrl);

        // This:
        //  - creates a temporary queue
        //  - attaches its URL as an attribute on the message
        //  - sends the message
        //  - receives the response from the temporary queue
        //  - deletes the temporary queue
        //  - returns the response
        //
        // If something goes wrong and the server's response never shows up, this method throws a
        // TimeoutException.
        Message response = sqsRequester.sendMessageAndGetResponse(request, 20, TimeUnit.SECONDS);
        
        return response.getBody();
    }
}

/**
 * This class processes users' login requests on the server side.
 */
public class LoginServer {

    // The SQS queue to poll for login requests.
    // Assume that on construction a thread is created to poll this queue and call
    // handleLoginRequest() below for each message.
    private final String requestQueueUrl;

    // The AmazonSQSResponder sends responses to the correct response destination.
    private final AmazonSQSResponder sqsResponder = AmazonSQSResponderClientBuilder.defaultClient();

    public LoginServer(String requestQueueUrl) {
        this.requestQueueUrl = requestQueueUrl;
    }

    /**
     * Handle a login request sent from the client above.
     */
    public void handleLoginRequest(Message message) {
        // Assume doLogin does the actual work, and returns a serialized result
        String response = doLogin(message.getBody());

        // This extracts the URL of the temporary queue from the message attribute and sends the
        // response to that queue.
        sqsResponder.sendResponseMessage(MessageContent.fromMessage(message), new MessageContent(response));  
    }
}

Cleaning up

As with any other AWS SDK client, your code should call the shutdown() method when the temporary queue client is no longer needed. The AmazonSQSRequester interface also provides a shutdown() method, which shuts down its internal temporary queue client. This ensures that the in-memory resources needed for any virtual queues are reclaimed, and that the host queue that the client automatically created is also deleted automatically.

However, in the world of distributed systems things are a little more complex. Processes can run out of memory and crash, and hosts can reboot suddenly and unexpectedly. There are even cases where you don’t have the opportunity to run custom code on shutdown.

The Temporary Queue Client addresses this issue as well. For each host queue with recent API calls, the client periodically uses the TagQueue API action to attach a fresh tag value that indicates the queue is still being used. The tagging process serves as a heartbeat to keep the queue alive. At a configurable interval (by default, 5 minutes), a background thread uses the ListQueues API action to obtain the URLs of all queues with the configured prefix. Then, it deletes each queue that has not been tagged recently. The mechanism is similar to how the Amazon DynamoDB Lock Client expires stale lock leases.

If you use the AmazonSQSTemporaryQueuesClient directly, you can customize how long queues must be idle before they are deleted by configuring the IdleQueueRetentionPeriodSeconds queue attribute. The client supports setting this attribute on both host queues and virtual queues. For virtual queues, setting this attribute ensures that the in-memory resources do not become a memory leak in the presence of application bugs.

Any API call to a queue marks it as non-idle, including ReceiveMessage calls that don’t return any messages. The only reason to increase the retention period attribute is to give the client more time when it can’t send heartbeats—for example, due to garbage collection pauses or networking issues.

But what if you want to use this client in a fleet of a thousand EC2 instances? Won’t every client spend a lot of time checking every queue for idleness? Doesn’t that imply duplicate work that increases as the fleet is scaled up?

We thought of this too. The Temporary Queue Client creates a shared queue for all clients using the same queue prefix, and uses this queue as a work queue for the distributed task. Instead of every client calling the ListQueues API action every five minutes, a new seed message (which triggers the sweeping process) is sent to this queue every five minutes.

When one of the clients receives this message, it calls the ListQueues API action and sends each queue URL in the result as another kind of message to the same shared work queue. The work of actually checking each queue for idleness is distributed roughly evenly to the active clients, ensuring scalability. There is even a mechanism that works around the fact that the ListQueues API action currently returns no more than 1,000 queue URLs at a time.

Summary

We are excited about how the new Amazon SQS Temporary Queue Client makes more messaging patterns easier and cheaper for you. Download the code from GitHub, have a look at Temporary Queues in the Amazon SQS Developer Guide, try out the client, and let us know what you think!

Best Practices for Developing on AWS Lambda

Post Syndicated from George Mao original https://aws.amazon.com/blogs/architecture/best-practices-for-developing-on-aws-lambda/

In our previous post we discussed the various ways you can invoke AWS Lambda functions. In this post, we’ll provide some tips and best practices you can use when building your AWS Lambda functions.

One of the benefits of using Lambda is that you don't have to worry about server and infrastructure management. This means AWS will handle the heavy lifting needed to execute your Lambda functions. You should take advantage of this architecture with the tips below.

Tip #1: When to VPC-Enable a Lambda Function

Lambda functions always operate from an AWS-owned VPC. By default, your function has full ability to make network requests to any public internet address; this includes access to any of the public AWS APIs. For example, your function can interact with Amazon DynamoDB APIs to PutItem or Query for records. You should only enable your functions for VPC access when you need to interact with a private resource located in a private subnet. An RDS instance is a good example.

RDS instance: When to VPC enable a Lambda function

Once your function is VPC-enabled, all network traffic from your function is subject to the routing rules of your VPC/Subnet. If your function needs to interact with a public resource, you will need a route through a NAT gateway in a public subnet.

Tip #2: Deploy Common Code to a Lambda Layer (i.e. the AWS SDK)

If you intend to reuse code in more than one function, consider creating a Layer and deploying it there. A great candidate would be a logging package that your team is required to standardize on. Another great example is the AWS SDK. AWS will include the AWS SDK for NodeJS and Python functions (and update the SDK periodically). However, you should bundle your own SDK and pin your functions to a version of the SDK you have tested.
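
As an illustrative boto3 sketch, publishing a layer that bundles a pinned SDK version and attaching it to a function could look like the following. The bucket, key, layer name, runtime, and function name are all placeholders.

import boto3

lambda_client = boto3.client("lambda")

# Publish a layer from a zip file that was uploaded to S3 (placeholders below).
layer = lambda_client.publish_layer_version(
    LayerName="pinned-aws-sdk",
    Description="AWS SDK pinned to a tested version",
    Content={"S3Bucket": "my-artifacts-bucket", "S3Key": "layers/aws-sdk-2.x.zip"},
    CompatibleRuntimes=["nodejs12.x"],
)

# Attach the new layer version to a function. Note that this call replaces the
# function's existing layer list, so include any other layers you already use.
lambda_client.update_function_configuration(
    FunctionName="checkout-handler",  # placeholder function name
    Layers=[layer["LayerVersionArn"]],
)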

Tip #3: Watch Your Package Size and Dependencies

Lambda functions require you to package all needed dependencies (or attach a Layer) — the bigger your deployment package, the slower your function will cold-start. Remove all unnecessary items, such as documentation and unused libraries. If you are using Java functions with the AWS SDK, only bundle the module(s) that you actually need to use — not the entire SDK.

Good:

<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>dynamodb</artifactId>
    <version>2.6.0</version>
</dependency>

Bad:

<!-- https://mvnrepository.com/artifact/software.amazon.awssdk/aws-sdk-java -->
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>aws-sdk-java</artifactId>
    <version>2.6.0</version>
</dependency>

Tip #4: Monitor Your Concurrency (and Set Alarms)

Our first post in this series talked about how concurrency can affect your downstream systems. Since Lambda functions can scale extremely quickly, you should have controls in place to notify you when you have a spike in concurrency. A good idea is to deploy a CloudWatch alarm that notifies your team when function metrics such as ConcurrentExecutions or Invocations exceed your threshold. You should also create an AWS Budget so you can monitor costs on a daily basis. Here is a great example of how to set up automated cost controls.
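
A minimal boto3 sketch of such an alarm follows; the function name, threshold, and SNS topic ARN are placeholders.

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when a single function exceeds 500 concurrent executions for one minute.
cloudwatch.put_metric_alarm(
    AlarmName="orders-handler-high-concurrency",
    Namespace="AWS/Lambda",
    MetricName="ConcurrentExecutions",
    Dimensions=[{"Name": "FunctionName", "Value": "orders-handler"}],  # placeholder
    Statistic="Maximum",
    Period=60,
    EvaluationPeriods=1,
    Threshold=500,
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],  # placeholder topic
)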

Tip #5: Over-Provision Memory (in some use cases) but Not Function Timeout

Lambda allocates compute power in proportion to the memory you allocate to your function. This means you can over-provision memory to run your functions faster and potentially reduce your costs. You should benchmark your use case to determine where the breakeven point is between running faster with more memory and running slower with less memory.
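
A rough boto3 sketch of such a benchmark loop is shown below. The function name, payload, and memory sizes are placeholders, and in practice you would invoke the function many times per configuration and read the billed duration from the REPORT line in its CloudWatch logs.

import json

import boto3

lambda_client = boto3.client("lambda")

FUNCTION_NAME = "image-resizer"  # placeholder function name

for memory_mb in (128, 256, 512, 1024):
    # Increase memory (and therefore CPU share) without touching the timeout.
    lambda_client.update_function_configuration(
        FunctionName=FUNCTION_NAME, MemorySize=memory_mb
    )
    lambda_client.get_waiter("function_updated").wait(FunctionName=FUNCTION_NAME)

    # Invoke once per configuration; compare duration and cost across memory sizes.
    response = lambda_client.invoke(
        FunctionName=FUNCTION_NAME,
        Payload=json.dumps({"test": True}).encode("utf-8"),
    )
    print(memory_mb, "MB ->", response["StatusCode"])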

However, we recommend that you do not over-provision your function timeout settings. Always understand your code performance and set a function timeout accordingly. Over-provisioning the function timeout often results in Lambda functions running longer than expected and unexpected costs.

About the Author

George MaoGeorge Mao is a Specialist Solutions Architect at Amazon Web Services, focused on the Serverless platform. George is responsible for helping customers design and operate Serverless applications using services like Lambda, API Gateway, Cognito, and DynamoDB. He is a regular speaker at AWS Summits, re:Invent, and various tech events. George is a software engineer and enjoys contributing to open source projects, delivering technical presentations at technology events, and working with customers to design their applications in the Cloud. George holds a Bachelor of Computer Science and Masters of IT from Virginia Tech.

How to migrate a digital signing workload to AWS CloudHSM

Post Syndicated from Tracy Pierce original https://aws.amazon.com/blogs/security/how-to-migrate-a-digital-signing-workload-to-aws-cloudhsm/

Is your on-premises Hardware Security Module (HSM) at end of life? Does continued maintenance of your on-premises hardware take a lot of time and cost a lot of money? Do you want or need all of your workloads to be performed on AWS? By migrating these workloads to AWS CloudHSM, you receive automated backups, low-cost HSMs, managed maintenance, automatic recovery in the event of a hardware failure, integrated fault tolerance, and high availability. One such workload you might consider migrating is the secret key material used for digital signing operations.

Enterprise certificate authority (CA) or public key infrastructure (PKI) applications use the private portion of an asymmetric key pair, generated and stored in a hardware security module (HSM), to perform signing operations. Examples of such operations include creating digital certificates for web servers or IoT devices, signing files, and negotiating TLS sessions. Migrating this type of workload to AWS can save you time and money. If your HSM is at end of life and you need an alternative, you can migrate the digital signing workload to AWS CloudHSM in just a few steps.

This post will focus on a workload that allows you to create and use a digital certificate to digitally sign an arbitrary file. I’ll show you how to create a new asymmetric key pair and generate the corresponding certificate signing request (CSR) on AWS CloudHSM. This CSR, once signed by the appropriate issuing CA, allows your new key pair and the associated certificate to be trusted in the same way as the key pairs in your original HSM. You could then move traffic related to signing operations or issuing certificates to your AWS CloudHSM cluster.

Background

Before I walk you through the steps of migrating a certificate signing workload into CloudHSM, I’ll provide a little background information so you’ll know how CloudHSM, PKI, and CAs work together. Every certificate is associated with a key pair made up of a private (secret) key and a public key. The private key associated with a certificate needs to be kept confidential, so it typically resides on a hardware security module (HSM). The public portion of the key pair is not confidential, is included in the certificate, and can be shared with anyone who wants to verify a digital signature made with the corresponding private key. In a PKI, a CA is the trusted entity that issues digital certificates on behalf of end-entities. At the top of the trust hierarchy is a root CA, which is implicitly trusted when it is established because it acts as the root of trust for intermediate CAs and end-entity certificates that may be issued underneath it. Intermediate CAs are trusted because their certificates are signed by the root CA. Intermediate CAs in turn sign end-entity certificates, which are used to authenticate identities of various actors across the data transfer process. A common use case for end-entity certificates is for web servers so that connecting clients can verify the server’s identity. Generally, end-entity certificates are valid for 1-3 years, intermediate CA certificates are valid for 5-10 years, and root CAs are valid for 30 years or more.

Beyond solving for the non-repudiation of objects signed by end-entity certificates to ensure the owner of the private key performed the signing operation, there is still the problem of trusting that the owner of the private key is the identity they claim to be. When evaluating trust in this way, there are generally two options; relying on public CAs or private CAs. Public CAs widely distribute the public keys of their root certificates into popular client trust stores (for example, browsers and operating systems). This allows users to verify that the identity of the end-entity has been attested to by a publicly trusted CA. This helps when the signer and the verifier of the digital asset don’t know each other and haven’t shared cryptographic material with each other in advance to perform future validations. Private CAs are those for which there are no widely distributed copies of their associated public keys. The verifier has to retrieve the public key from the private CA and has to explicitly trust the cert without any third-party attestation of the signer’s identity. This is appropriate for cases when signers and verifiers are in the same company or know each other. Examples of when to use a private CA are securing virtual private networks, data or file replication between internal servers, remote backups, file-sharing, email, or other personal accounts.

Regardless of the certificate trust model you need, AWS CloudHSM can be used to create the initial key pair and CSR for both public and private CA requests. Note that AWS offers some alternatives for certificate management that may simplify your workloads without having to use AWS CloudHSM directly. AWS Certificate Manager (ACM) automatically creates key pairs and issues public or private certificates to identify resources within your organization. For use cases that need capabilities not yet supported by ACM, or in unusual situations in which a single-tenant HSM under your control is required for compliance reasons, you can use AWS CloudHSM directly for key generation and signing operations.

Organizations currently using an on-premises HSM for the creation of asymmetric keys used in digital certificates often use a vendor-proprietary mechanism to replicate key material across multiple HSMs for resiliency. However, this method prevents the key material from ever being transferred to an HSM offered by a different vendor. Consider it "vendor lock-in" by design. So the private keys corresponding to the certificates you use for signing and authentication are locked inside that HSM. But if they are locked inside, how do you move to AWS CloudHSM? The answer is that you don't have to rely on these inaccessible keys: you can create a new key pair and use it within AWS CloudHSM to begin issuing end-entity certificates.

Solution overview

I will go over creating a new private key in AWS CloudHSM using the Windows client and using Microsoft certreq to generate a corresponding CSR. You provide this CSR to your private or public CA to receive a signed certificate in return. This certificate and its public key then need to be propagated to wherever your signatures are verified. At the end of this post, I will show you how to verify your digital signatures using Microsoft SignTool. SignTool is provided by Microsoft to allow Windows users to digitally sign files, verify file signatures, and timestamp files.
 

Figure 1: Procedural diagram

As shown in the diagram above, the steps followed in this post are:

  1. Create a new RSA private key using KSP/CNG through the AWS CloudHSM Windows client.
  2. Using Microsoft certreq, create your CSR.
  3. Provide the CSR to your CA for signing.
  4. Use Microsoft SignTool to sign files in your environment.

Note: You may have to register this new certificate with any partners that do not automatically verify the entire certificate chain. This could be 3rd party applications, vendors, or outside entities that utilize your certificates to determine trust.

Prerequisites

In this walkthrough, I assume that you already have an AWS CloudHSM cluster set up and initialized with at least one HSM device, and an Amazon Elastic Compute Cloud (EC2) Windows-based instance with the AWS CloudHSM client, PowerShell, and Windows SDK with Microsoft SignTool installed. You must have a crypto user (CU) on the HSM to perform the steps in this post.

Deploying the solution

Step 1: Create a new private key using KSP/CNG using the AWS CloudHSM Windows client

On your Windows server where the AWS CloudHSM Windows client is installed, use a text editor to create a certificate request file named IISCertRequest.inf. For the purpose of this post, I have filled out an example file below.


[Version]
Signature = "$Windows NT$"
[NewRequest]
Subject = "CN=example.com,C=US,ST=Washington,L=Seattle,O=ExampleOrg,OU=WebServer"
HashAlgorithm = SHA256
KeyAlgorithm = RSA
KeyLength = 2048
ProviderName = "Cavium Key Storage Provider"
KeyUsage = "CERT_DIGITAL_SIGNATURE_KEY_USAGE"
MachineKeySet = True    

Step 2: Using Microsoft certreq, create your CSR

On the same server, open PowerShell and, at the PowerShell prompt, create a CSR from the IISCertRequest.inf file by using the Windows certreq command. Here's an example of the command; replace the placeholder file names in angle brackets with your own.


PS C:\>certreq -new <IISCertRequest.inf> <IISCertRequest.csr>
	SDK Version: 2.03
CertReq: Request Created

If successful, you'll see the "Request Created" message above, as well as the new file <IISCertRequest.csr> on your server. You provide this CSR to your chosen CA for certificate issuance; submit it manually by following the CA's suggested certificate request process.

Step 3: Provide the CSR to your CA for signing

Use the same CA that signed your existing end-entity certificates (whose keys were generated by your original HSM) to sign the new certificates whose keys are generated by AWS CloudHSM. There are many CAs to choose from, such as DigiCert, Trustwave, and GoDaddy. Follow your CA's steps for submitting your CSR to receive your signed certificate in return.

Step 4: Use Microsoft SignTool to sign files in your environment

When you receive your signed certificate back from your chosen CA, save a copy locally on your Windows server. Then, move the certificate file to the Personal Certificate Store in Windows so it can be used by other applications, such as Microsoft SignTool. Here's an example of the command. Be sure to replace the value in angle brackets with your actual certificate name.
PS C:\>certreq -accept <signedCertificate.cer>

Now, the certificate is ready for use, and I’ll show you how to use it to sign a file. First, you have to get the thumbprint of your certificate. To do this, open PowerShell as an Administrator (right-click the app and choose Run as Administrator). Type this command:
PS C:\>Get-ChildItem -path cert:\LocalMachine\My

If successful, you should see an output similar to this. Copy the thumbprint that is returned. You’ll need it when you perform the actual signing operation on a file.


Thumbprint                                Subject
----------                                -------
49DF7HDJT84723FDKCURLSXYRF9830568CXHSUB2  CN=WINDOWS-CA
VJFU57E6DI9DKMCHAKLDFJA8E73739Q04730QU7A  CN=www.example.com, OU=Certif…

To open the SignTool application, navigate to the app’s directory within PowerShell. By default, this is typically:
C:\Program Files (x86)\Windows Kits\<SDK Version>\bin\<version number>\<CPU architecture>

For example, if you had downloaded the Microsoft Windows SDK 10 version, the application would be stored in:

C:\Program Files (x86)\Windows Kits\10\bin\10.0.17763.0\x64

When you've located the directory, sign your file by running the command below. Remember to replace the values in angle brackets with your own values. The test.exe file in this example can be any valid executable file in your directory.
PS C:\>.\signtool.exe sign /v /fd sha256 /sha1 <thumbprint> /sm /as C:\Users\Administrator\Desktop\<test.exe>

You should see a message like this:


Done Adding Additional Store
Successfully signed C:\User\Administrator\Desktop\<test.exe>

Number of files successfully Signed: 1
Number of warnings: 0
Number of errors: 0

One last optional item you can do is verify the signature on the file using the command below. Again, replace the values in angle brackets with your own.
PS C:\>.\signtool.exe verify /v /pa C:\Users\Administrators\Desktop\<test.exe>

You've now successfully migrated your file signing workload to AWS CloudHSM. If your signing certificate was not issued by a publicly trusted CA but instead by a private CA, make sure to deploy a copy of the root CA certificate and any intermediate certificates from the private CA on any systems where you want to verify the integrity of your signed file.

Conclusion

In this post, I walked you through creating a new RSA asymmetric key pair and using it to create a CSR. After supplying the CSR to your chosen CA and receiving a signed certificate in return, I showed you how to use Microsoft SignTool with AWS CloudHSM to sign files in your environment. You can now use AWS CloudHSM to sign code, documents, or other certificates in the same way you did with your original HSMs.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the AWS CloudHSM forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Tracy Pierce

Tracy Pierce is a Senior Consultant, Security Specialty, for Remote Consulting Services. She enjoys the peculiar culture of Amazon and uses that to ensure every day is exciting for her fellow engineers and customers alike. Customer Obsession is her highest priority and she shows this by improving processes, documentation, and building tutorials. She has her AS in Computer Security & Forensics from SCTD, SSCP certification, AWS Developer Associate certification, and AWS Security Specialist certification. Outside of work, she enjoys time with friends, her Great Dane, and three cats. She keeps work interesting by drawing cartoon characters on the walls at request.

Understanding the Different Ways to Invoke Lambda Functions

Post Syndicated from George Mao original https://aws.amazon.com/blogs/architecture/understanding-the-different-ways-to-invoke-lambda-functions/

In our first post, we talked about general design patterns to enable massive scale with serverless applications. In this post, we’ll review the different ways you can invoke Lambda functions and what you should be aware of with each invocation model.

Synchronous Invokes

Synchronous invocations are the most straightforward way to invoke your Lambda functions. In this model, your function executes immediately when you perform the Lambda Invoke API call. You can do this through a variety of options, including the CLI or any of the supported SDKs.

Here is an example of a synchronous invoke using the CLI:

aws lambda invoke --function-name MyLambdaFunction --invocation-type RequestResponse --payload "[JSON string here]"

The --invocation-type flag specifies a value of "RequestResponse". This instructs AWS to execute your Lambda function and wait for the function to complete. When you perform a synchronous invoke, you are responsible for checking the response, determining whether there was an error, and deciding whether to retry the invoke.
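
The SDK equivalent is a single call to the Invoke API. The following Python sketch uses a hypothetical function name and payload, and shows the error check a synchronous caller is responsible for.

import json
import boto3

lambda_client = boto3.client("lambda")

response = lambda_client.invoke(
    FunctionName="MyLambdaFunction",      # hypothetical function name
    InvocationType="RequestResponse",     # synchronous: wait for the result
    Payload=json.dumps({"key": "value"}), # hypothetical payload
)

if "FunctionError" in response:
    # The function raised an error; inspect the payload and decide whether to retry.
    print("Invoke failed:", response["Payload"].read())
else:
    print("Result:", response["Payload"].read())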

Many AWS services can emit events that trigger Lambda functions. Here is a list of services that invoke Lambda functions synchronously:

Asynchronous Invokes

Here is an example of an asynchronous invoke using the CLI:

aws lambda invoke --function-name MyLambdaFunction --invocation-type Event --payload "[JSON string here]"

Notice that the --invocation-type flag specifies "Event." If your function returns an error, AWS automatically retries the invoke twice, for a total of three invocations.
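
For comparison with the synchronous sketch above, here is the same call as an Event invoke in Python; the function name and payload are again hypothetical. Lambda queues the request and returns HTTP 202 immediately, so there is no function result to inspect in the response.

import json
import boto3

lambda_client = boto3.client("lambda")

response = lambda_client.invoke(
    FunctionName="MyLambdaFunction",      # hypothetical function name
    InvocationType="Event",               # asynchronous: queue the request and return immediately
    Payload=json.dumps({"key": "value"}), # hypothetical payload
)

print("Accepted:", response["StatusCode"])  # 202 when the event is queued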

Here is a list of services that invoke Lambda functions asynchronously:

Asynchronous invokes place your invoke request in the Lambda service queue, and the requests are processed as they arrive. You can use AWS X-Ray to review how long your request spent in the service queue by checking the "dwell time" segment.

Poll based Invokes

This invocation model is designed to let you integrate with stream- and queue-based AWS services with no code or server management. Lambda polls the following services on your behalf, retrieves records, and invokes your functions. The following services are supported:

AWS manages the poller on your behalf and performs synchronous invokes of your function with this type of integration. The retry behavior for this model is based on data expiration in the data source. For example, Kinesis Data Streams stores records for 24 hours by default (and up to 168 hours). The specific details of each integration are linked above.
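
If you set up these integrations programmatically, a minimal sketch looks like the following: it creates an event source mapping for a hypothetical SQS queue, and the queue ARN, function name, and batch size are placeholders.

import boto3

lambda_client = boto3.client("lambda")

lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:sqs:us-east-1:123456789012:my-queue",  # hypothetical queue ARN
    FunctionName="MyLambdaFunction",                               # hypothetical function name
    BatchSize=10,        # records delivered per synchronous invoke
    Enabled=True,
)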

Conclusion

In our next post, we’ll provide some tips and best practices for developing Lambda functions. Happy coding!

 

About the Author

George MaoGeorge Mao is a Specialist Solutions Architect at Amazon Web Services, focused on the Serverless platform. George is responsible for helping customers design and operate Serverless applications using services like Lambda, API Gateway, Cognito, and DynamoDB. He is a regular speaker at AWS Summits, re:Invent, and various tech events. George is a software engineer and enjoys contributing to open source projects, delivering technical presentations at technology events, and working with customers to design their applications in the Cloud. George holds a Bachelor of Computer Science and Masters of IT from Virginia Tech.

Configuring user creation workflows with AWS Step Functions and AWS Managed Microsoft AD logs

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/configuring-user-creation-workflows-with-aws-step-functions-and-aws-managed-microsoft-ad-logs/

This post is contributed by Taka Matsumoto, Cloud Support Engineer

AWS Directory Service lets you run Microsoft Active Directory as a managed service. Directory Service for Microsoft Active Directory, also referred to as AWS Managed Microsoft AD, is powered by Microsoft Windows Server 2012 R2. It manages users and makes it easy to integrate with compatible AWS services and other applications. Using the log forwarding feature, you can stay aware of all security events in Amazon CloudWatch Logs. This helps monitor events like the addition of a new user.

When new users are created in your AWS Managed Microsoft AD directory, you might go through the initial setup workflow manually. However, AWS Step Functions can coordinate new user creation activities into serverless workflows that automate the process. With Step Functions, AWS Lambda can also be used to run code for the automation workflows without provisioning or managing servers.

In this post, I show how to create and trigger a new user creation workflow in Step Functions. This workflow creates a WorkSpace in Amazon WorkSpaces and a user in Amazon Connect using AWS Managed Microsoft AD, Step Functions, Lambda, and Amazon CloudWatch Logs.

Overview

The following diagram shows the solution graphically.

Figure: Solution overview

Walkthrough

Using the following procedures, create an automated user creation workflow with AWS Managed Microsoft AD. The solution requires the creation of new resources in CloudWatch, Lambda, and Step Functions, and a new user in Amazon WorkSpaces and Amazon Connect. Here’s the list of steps:

  1. Enable log forwarding.
  2. Create the Lambda functions.
  3. Set up log streaming.
  4. Create a state machine in Step Functions.
  5. Test the solution.

Requirements

To follow along, you need the following resources:

  • AWS Managed Microsoft AD
    • Must be registered with Amazon WorkSpaces
    • Must be registered with Amazon Connect

In this example, you use an Amazon Connect instance with SAML 2.0-based authentication as identity management. For more information, see Configure SAML for Identity Management in Amazon Connect.

Enable log forwarding

Enable log forwarding for your AWS Managed Microsoft AD directory. Use /aws/directoryservice/<directory id> for the CloudWatch log group name. You will use this log group name when setting up log streaming in Step 3.
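
If you prefer to script this step instead of using the console, a minimal sketch with the Python SDK might look like the following. The Region, account ID, and directory ID are placeholders, and the resource policy statement reflects the permissions that allow AWS Directory Service to write to the log group.

import json
import boto3

region = "us-east-1"                      # placeholder Region
account_id = "123456789012"               # placeholder account ID
directory_id = "d-1234567890"             # placeholder directory ID
log_group = f"/aws/directoryservice/{directory_id}"

logs = boto3.client("logs", region_name=region)
ds = boto3.client("ds", region_name=region)

# Create the log group (skip this call if it already exists).
logs.create_log_group(logGroupName=log_group)

# Allow AWS Directory Service to write security events into the log group.
logs.put_resource_policy(
    policyName="DSLogSubscription",
    policyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "ds.amazonaws.com"},
            "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": f"arn:aws:logs:{region}:{account_id}:log-group:{log_group}:*",
        }],
    }),
)

# Turn on log forwarding for the directory.
ds.create_log_subscription(DirectoryId=directory_id, LogGroupName=log_group)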

Create Lambda functions

Create two Lambda functions. The first is invoked by CloudWatch Logs and starts a Step Functions execution. The second performs the user registration tasks in Amazon WorkSpaces and Amazon Connect as part of that Step Functions execution.

Create the first function with the following settings:

  • Name: DS-Log-Stream-Function
  • Runtime: Python 3.7
  • Memory: 128 MB
  • Timeout: 3 seconds
  • Environment variables:
    • Key: stateMachineArn
    • Value: arn:aws:states:<Region>:<AccountId>:stateMachine:NewUserWorkFlow
  • IAM role with the following permissions:
    • AWSLambdaBasicExecutionRole
    • The following permissions policy
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "states:StartExecution",
            "Resource": "*"
        }
    ]
}

Use the following code for the function:
import base64
import boto3
import gzip
import json
import re
import os
def lambda_handler(event, context):
    logEvents = DecodeCWPayload(event)
    print('Event payload:', logEvents)
    returnResultDict = []
    
    # Because there can be more than one message pushed in a single payload, use a for loop to start a workflow for every user
    for logevent in logEvents:
        logMessage = logevent['message']
        upnMessage =  re.search("(<Data Name='UserPrincipalName'>)(.*?)(<\/Data>)",logMessage)
        if upnMessage != None:
            upn = upnMessage.group(2).lower()
            userNameAndDomain = upn.split('@')
            userName = userNameAndDomain[0].lower()
            domainName = userNameAndDomain[1].lower()
            sfnInputDict = {'Username': userName, 'UPN': upn, 'DomainName': domainName}
            sfnResponse = StartSFNExecution(json.dumps(sfnInputDict))
            print('Username:',upn)
            print('Execution ARN:', sfnResponse['executionArn'])
            print('Execution start time:', sfnResponse['startDate'])
            returnResultDict.append({'Username': upn, 'ExecutionArn': sfnResponse['executionArn'], 'Time': str(sfnResponse['startDate'])})

    returnObject = {'Result':returnResultDict}
    return {
        'statusCode': 200,
        'body': json.dumps(returnObject)
    }

# Helper function to decode the CloudWatch Logs payload
def DecodeCWPayload(payload):
    # CloudWatch Log Stream event 
    cloudWatchLog = payload['awslogs']['data']
    # Base 64 decode the log 
    base64DecodedValue = base64.b64decode(cloudWatchLog)
    # Uncompress the gzipped decoded value
    gunzipValue = gzip.decompress(base64DecodedValue)
    dictPayload = json.loads(gunzipValue)
    decodedLogEvents = dictPayload['logEvents']
    return decodedLogEvents

# Step Functions state machine execution function
def StartSFNExecution(sfnInput):
    sfnClient = boto3.client('stepfunctions')
    try:
        response = sfnClient.start_execution(
            stateMachineArn=os.environ['stateMachineArn'],
            input=sfnInput
        )
        return response
    except Exception as e:
        return e

For the other function used to perform a user creation task, use the following settings:

  • Name: SFN-New-User-Flow
  • Runtime: Python 3.7
  • Memory: 128 MB
  • Timeout: 3 seconds
  • Environment variables:
    • Key: nameDelimiter
    • Value: . [period]

This delimiter is used to split the username into a first name and last name, as Amazon Connect instances with SAML-based authentication require both a first name and last name for users. For more information, see CreateUser API and UserIdentity Info.

  • Key: bundleId
  • Value: <WorkSpaces bundle ID>

Run the following AWS CLI command to return Amazon-owned WorkSpaces bundles. Use one of the bundle IDs for the key-value pair.

aws workspaces describe-workspace-bundles --owner AMAZON

  • Key: directoryId
  • Value: <WorkSpaces directory ID>

Run the following AWS CLI command to return Amazon WorkSpaces directories. Use your directory ID for the key-value pair.

aws workspaces describe-workspace-directories

  • Key: instanceId
  • Value: <Amazon Connect instance ID>

Find the Amazon Connect instance ID in the Amazon Connect console and use it for the key-value pair.

  • Key: routingProfile
  • Value: <Amazon Connect routing profile>

Run the following AWS CLI command to list routing profiles with their IDs. For this walkthrough, use the ID for the basic routing profile.

aws connect list-routing-profiles --instance-id <instance id>

  • Key: securityProfile
  • Value: <Amazon Connect security profile>

Run the following AWS CLI command to list security profiles with their IDs. For this walkthrough, use the ID for an agent security profile.

aws connect list-security-profiles --instance-id <instance id>

  • IAM role permissions:
    • AWSLambdaBasicExecutionRole

The following permissions policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "connect:CreateUser",
                "workspaces:CreateWorkspaces"
            ],
            "Resource": "*"
        }
    ]
}

Use the following code for the function:
import json
import os
import boto3

def lambda_handler(event, context):
    userName = event['input']['User']
    nameDelimiter = os.environ['nameDelimiter']
    if nameDelimiter in userName:
        firstName = userName.split(nameDelimiter)[0]
        lastName = userName.split(nameDelimiter)[1]
    else:
        firstName = userName
        lastName = userName
    domainName = event['input']['Domain']
    upn = event['input']['UPN']
    serviceName = event['input']['Service']
    if serviceName == 'WorkSpaces':
        # Setting WorkSpaces variables
        workspacesDirectoryId = os.environ['directoryId']
        workspacesUsername = upn
        workspacesBundleId = os.environ['bundleId']
        createNewWorkSpace = create_new_workspace(
            directoryId=workspacesDirectoryId,
            username=workspacesUsername,
            bundleId=workspacesBundleId
        )
        return createNewWorkSpace
    elif serviceName == 'Connect':
        createConnectUser = create_connect_user(
            connectUsername=upn,
            connectFirstName=firstName,
            connectLastName=lastName,
            securityProfile=os.environ['securityProfile'], 
            routingProfile=os.environ['routingProfile'], 
            instanceId=os.environ['instanceId']
        )
        return createConnectUser
    else:
        print(serviceName, 'is not recognized...')
        print('Available service names are WorkSpaces and Connect')
        unknownServiceException = {
            'statusCode': 500,
            'body': json.dumps(f'Service name, {serviceName}, is not recognized')}
        raise Exception(unknownServiceException)

class FailedWorkSpaceCreationException(Exception):
    pass

class WorkSpaceResourceExists(Exception):
    pass

def create_new_workspace(directoryId, username, bundleId):
    workspacesClient = boto3.client('workspaces')
    response = workspacesClient.create_workspaces(
        Workspaces=[{
                'DirectoryId': directoryId,
                'UserName': username,
                'BundleId': bundleId,
                'WorkspaceProperties': {
                    'RunningMode': 'AUTO_STOP',
                    'RunningModeAutoStopTimeoutInMinutes': 60,
                    'RootVolumeSizeGib': 80,
                    'UserVolumeSizeGib': 100,
                    'ComputeTypeName': 'VALUE'
                    }}]
                    )
    print('create_workspaces response:',response)
    for pendingRequest in response['PendingRequests']:
        if pendingRequest['UserName'] == username:
            workspacesResultObject = {'UserName':username, 'ServiceName':'WorkSpaces', 'Status': 'Success'}
            return {
                'statusCode': 200,
                'body': json.dumps(workspacesResultObject)
                }
    for failedRequest in response['FailedRequests']:
        if failedRequest['WorkspaceRequest']['UserName'] == username:
            errorCode = failedRequest['ErrorCode']
            errorMessage = failedRequest['ErrorMessage']
            errorResponse = {'Error Code': errorCode, 'Error Message': errorMessage}
            if errorCode == "ResourceExists.WorkSpace": 
                raise WorkSpaceResourceExists(str(errorResponse))
            else:
                raise FailedWorkSpaceCreationException(str(errorResponse))
                
def create_connect_user(connectUsername, connectFirstName,connectLastName,securityProfile,routingProfile,instanceId):
    connectClient = boto3.client('connect')
    response = connectClient.create_user(
                    Username=connectUsername,
                    IdentityInfo={
                        'FirstName': connectFirstName,
                        'LastName': connectLastName
                        },
                    PhoneConfig={
                        'PhoneType': 'SOFT_PHONE',
                        'AutoAccept': False,
                        },
                    SecurityProfileIds=[
                        securityProfile,
                        ],
                    RoutingProfileId=routingProfile,
                    InstanceId = instanceId
                    )
    connectSuccessResultObject = {'UserName':connectUsername,'ServiceName':'Connect','FirstName': connectFirstName, 'LastName': connectLastName,'Status': 'Success'}
    return {
        'statusCode': 200,
        'body': json.dumps(connectSuccessResultObject)
        }

Set up log streaming

Create a new CloudWatch Logs subscription filter that sends log data to the Lambda function DS-Log-Stream-Function created in Step 2.

  1. In the CloudWatch console, choose Logs, Log Groups, and select the log group, /aws/directoryservice/<directory id>, for the directory set up in Step 1.
  2. Choose Actions, Stream to AWS Lambda.
  3. Choose Destination, and select the Lambda function DS-Log-Stream-Function.
  4. For Log format, choose Other as the log format and enter “<EventID>4720</EventID>” (include the double quotes).
  5. Choose Start streaming.

If there is an existing subscription filter for the log group, run the following AWS CLI command to create a subscription filter for the Lambda function, DS-Log-Stream-Function.

aws logs put-subscription-filter \
  --log-group-name /aws/directoryservice/<directoryid> \
  --filter-name NewUser \
  --filter-pattern "<EventID>4720</EventID>" \
  --destination-arn arn:aws:lambda:<Region>:<ACCOUNT_NUMBER>:function:DS-Log-Stream-Function

For more information, see Using CloudWatch Logs Subscription Filters.

Create a state machine in Step Functions

The next step is to create a state machine in Step Functions. This state machine runs the Lambda function, SFN-New-User-Flow, to create a user in Amazon WorkSpaces and Amazon Connect.

Define the state machine, using the following settings:

  • Name: NewUserWorkFlow
  • State machine definition: Copy the following state machine definition:
{
    "Comment": "An example state machine for a new user creation workflow",
    "StartAt": "Parallel",
    "States": {
        "Parallel": {
            "Type": "Parallel",
            "End": true,
            "Branches": [
                {
                    "StartAt": "CreateWorkSpace",
                    "States": {
                        "CreateWorkSpace": {
                            "Type": "Task",
                            "Parameters": {
                                "input": {
                                    "User.$": "$.Username",
                                    "UPN.$": "$.UPN",
                                    "Domain.$": "$.DomainName",
                                    "Service": "WorkSpaces"
                                }
                            },
                            "Resource": "arn:aws:lambda:{region}:{account id}:function:SFN-New-User-Flow",
                            "Retry": [
                                {
                                    "ErrorEquals": [
                                        "WorkSpaceResourceExists"
                                    ],
                                    "IntervalSeconds": 1,
                                    "MaxAttempts": 0,
                                    "BackoffRate": 1
                                },
                                {
                                    "ErrorEquals": [
                                        "States.ALL"
                                    ],
                                    "IntervalSeconds": 10,
                                    "MaxAttempts": 2,
                                    "BackoffRate": 2
                                }
                            ],
                            "Catch": [
                                {
                                    "ErrorEquals": [
                                        "WorkSpaceResourceExists"
                                    ],
                                    "ResultPath": "$.workspacesResult",
                                    "Next": "WorkSpacesPassState"
                                },
                                {
                                    "ErrorEquals": [
                                        "States.ALL"
                                    ],
                                    "ResultPath": "$.workspacesResult",
                                    "Next": "WorkSpacesPassState"
                                }
                            ],
                            "End": true
                        },
                        "WorkSpacesPassState": {
                            "Type": "Pass",
                            "Parameters": {
                                "Result.$": "$.workspacesResult"
                            },
                            "End": true
                        }
                    }
                },
                {
                    "StartAt": "CreateConnectUser",
                    "States": {
                        "CreateConnectUser": {
                            "Type": "Task",
                            "Parameters": {
                                "input": {
                                    "User.$": "$.Username",
                                    "UPN.$": "$.UPN",
                                    "Domain.$": "$.DomainName",
                                    "Service": "Connect"
                                }
                            },
                            "Resource": "arn:aws:lambda:{region}:{account id}:function:SFN-New-User-Flow",
                            "Retry": [
                                {
                                    "ErrorEquals": [
                                        "DuplicateResourceException"
                                    ],
                                    "IntervalSeconds": 1,
                                    "MaxAttempts": 0,
                                    "BackoffRate": 1
                                },
                                {
                                    "ErrorEquals": [
                                        "States.ALL"
                                    ],
                                    "IntervalSeconds": 10,
                                    "MaxAttempts": 2,
                                    "BackoffRate": 2
                                }
                            ],
                            "Catch": [
                                {
                                    "ErrorEquals": [
                                        "DuplicateResourceException"
                                    ],
                                    "ResultPath": "$.connectResult",
                                    "Next": "ConnectPassState"
                                },
                                {
                                    "ErrorEquals": [
                                        "States.ALL"
                                    ],
                                    "ResultPath": "$.connectResult",
                                    "Next": "ConnectPassState"
                                }
                            ],
                            "End": true,
                            "ResultPath": "$.connectResult"
                        },
                        "ConnectPassState": {
                            "Type": "Pass",
                            "Parameters": {
                                "Result.$": "$.connectResult"
                            },
                            "End": true
                        }
                    }
                }
            ]
        }
    }
}

After entering the name and state machine definition, choose Next.

Configure the settings by choosing Create an IAM role for me. This creates an IAM role for the state machine to run the Lambda function SFN-New-User-Flow.

Here’s the list of states in the NewUserWorkFlow state machine definition:

  • Start—When the state machine starts, it creates a parallel state to start both the CreateWorkSpace and CreateConnectUser states.
  • CreateWorkSpace—This task state runs the SFN-New-User-Flow Lambda function to create a new WorkSpace for the user. If this is successful, it goes to the End state.
  • WorkSpacesPassState—This pass state returns the result from the CreateWorkSpace state.
  • CreateConnectUser—This task state runs the SFN-New-User-Flow Lambda function to create a user in Amazon Connect. If this is successful, it goes to the End state.
  • ConnectPassState—This pass state returns the result from the CreateConnectUser state.
  • End

The following diagram shows how these states relate to each other.

Step Functions State Machine

Test the solution

It's time to test the solution. Create a user in AWS Managed Microsoft AD with a user logon name (UPN) that includes the name delimiter configured earlier (for example, first.last@example.com), so the workflow can derive a first name and last name.

This starts a new state machine execution in Step Functions. Here's the flow (a quick verification sketch follows the steps):

  1. When there is a user creation event (Event ID: 4720) in the AWS Managed Microsoft AD security log, CloudWatch invokes the Lambda function, DS-Log-Stream-Function, to start a new state machine execution in Step Functions.
  2. To create a new WorkSpace and create a user in the Amazon Connect instance, the state machine execution runs tasks to invoke the other Lambda function, SFN-New-User-Flow.
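
To confirm the workflow ran, you can list recent executions of the state machine; the following Python sketch uses a placeholder state machine ARN.

import boto3

sfn = boto3.client("stepfunctions")

executions = sfn.list_executions(
    stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:NewUserWorkFlow",  # placeholder ARN
    maxResults=5,
)

for execution in executions["executions"]:
    print(execution["name"], execution["status"], execution["startDate"])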

Conclusion

This solution automates the initial user registration workflow. Step Functions provides the flexibility to customize the workflow to meet your needs. This walkthrough included Amazon WorkSpaces and Amazon Connect; both services are used to register the new user. For organizations that create a number of new users on a regular basis, this new user automation workflow can save time when configuring resources for a new user.

The event source of the automation workflow can be any event that triggers the new user workflow, so the event source isn't limited to CloudWatch Logs. Also, the integrated service used for new user registration can be any AWS service that offers an API and works with AWS Managed Microsoft AD. Other programmatically accessible services, within or outside AWS, can also fill that role.

In this post, I showed you how serverless workflows can streamline and coordinate user creation activities. Step Functions provides this functionality, with the help of Lambda, Amazon WorkSpaces, AWS Managed Microsoft AD, and Amazon Connect. Together, these services offer increased power and functionality when managing users, monitoring security, and integrating with compatible AWS services.