Tag Archives: least privilege

Combine Transactional and Analytical Data Using Amazon Aurora and Amazon Redshift

Post Syndicated from Re Alvarez-Parmar original https://aws.amazon.com/blogs/big-data/combine-transactional-and-analytical-data-using-amazon-aurora-and-amazon-redshift/

A few months ago, we published a blog post about capturing data changes in an Amazon Aurora database and sending them to Amazon Athena and Amazon QuickSight for fast analysis and visualization. In this post, I want to demonstrate how easy it can be to take the data in Aurora and combine it with data in Amazon Redshift using Amazon Redshift Spectrum.

With Amazon Redshift, you can build petabyte-scale data warehouses that unify data from a variety of internal and external sources. Because Amazon Redshift is optimized for complex queries (often involving multiple joins) across large tables, it can handle large volumes of retail, inventory, and financial data without breaking a sweat.

In this post, we describe how to combine data in Aurora with data in Amazon Redshift. Here’s an overview of the solution:

  • Use AWS Lambda functions with Amazon Aurora to capture data changes in a table.
  • Save the data in an Amazon S3 bucket.
  • Query data using Amazon Redshift Spectrum.

We use the following services: Amazon Aurora, AWS Lambda, Amazon Kinesis Data Firehose, Amazon S3, Amazon Redshift with Redshift Spectrum, and Amazon QuickSight.

Serverless architecture for capturing and analyzing Aurora data changes

Consider a scenario in which an e-commerce web application uses Amazon Aurora for a transactional database layer. The company has a sales table that captures every single sale, along with a few corresponding data items. This information is stored as immutable data in a table. Business users want to monitor the sales data and then analyze and visualize it.

In this example, you take the data changes in an Aurora database table and save them in Amazon S3. After the data is captured in Amazon S3, you combine it with data in your existing Amazon Redshift cluster for analysis.

By the end of this post, you will understand how to capture data events in an Aurora table and push them out to other AWS services using AWS Lambda.

The following diagram shows the flow of data as it occurs in this tutorial:

The starting point in this architecture is a database insert operation in Amazon Aurora. When the insert statement is executed, a custom trigger calls a Lambda function and forwards the inserted data. Lambda writes the data that it received from Amazon Aurora to a Kinesis data delivery stream. Kinesis Data Firehose writes the data to an Amazon S3 bucket. Once the data is in an Amazon S3 bucket, it is queried in place using Amazon Redshift Spectrum.

Creating an Aurora database

First, create a database by following these steps in the Amazon RDS console:

  1. Sign in to the AWS Management Console, and open the Amazon RDS console.
  2. Choose Launch a DB instance, and choose Next.
  3. For Engine, choose Amazon Aurora.
  4. Choose a DB instance class. This example uses a small instance class because this is not a production database.
  5. In Multi-AZ deployment, choose No.
  6. Configure DB instance identifier, Master username, and Master password.
  7. Launch the DB instance.

After you create the database, use MySQL Workbench to connect to the database using the CNAME from the console. For information about connecting to an Aurora database, see Connecting to an Amazon Aurora DB Cluster.

The following screenshot shows the MySQL Workbench configuration:

Next, create a table in the database by running the following SQL statement:

CREATE TABLE Sales (
InvoiceID int NOT NULL AUTO_INCREMENT,
ItemID int NOT NULL,
Category varchar(255),
Price double(10,2), 
Quantity int not NULL,
OrderDate timestamp,
DestinationState varchar(2),
ShippingType varchar(255),
Referral varchar(255),
PRIMARY KEY (InvoiceID)
)

You can now populate the table with some sample data. To generate sample data in your table, copy and run the following script. Ensure that the placeholder values (AURORA_CNAME, DBUSER, DBPASSWORD, and DB) are replaced with appropriate values.

#!/usr/bin/python
import MySQLdb
import random
import datetime

db = MySQLdb.connect(host="AURORA_CNAME",
                     user="DBUSER",
                     passwd="DBPASSWORD",
                     db="DB")

states = ("AL","AK","AZ","AR","CA","CO","CT","DE","FL","GA","HI","ID","IL","IN",
"IA","KS","KY","LA","ME","MD","MA","MI","MN","MS","MO","MT","NE","NV","NH","NJ",
"NM","NY","NC","ND","OH","OK","OR","PA","RI","SC","SD","TN","TX","UT","VT","VA",
"WA","WV","WI","WY")

shipping_types = ("Free", "3-Day", "2-Day")

product_categories = ("Garden", "Kitchen", "Office", "Household")
referrals = ("Other", "Friend/Colleague", "Repeat Customer", "Online Ad")

for i in range(0,10):
    item_id = random.randint(1,100)
    state = states[random.randint(0,len(states)-1)]
    shipping_type = shipping_types[random.randint(0,len(shipping_types)-1)]
    product_category = product_categories[random.randint(0,len(product_categories)-1)]
    quantity = random.randint(1,4)
    referral = referrals[random.randint(0,len(referrals)-1)]
    price = random.randint(1,100)
    # Cap the day at 28 so that every generated month (including February) is a valid date
    order_date = datetime.date(2016,random.randint(1,12),random.randint(1,28)).isoformat()

    data_order = (item_id, product_category, price, quantity, order_date, state,
    shipping_type, referral)

    add_order = ("INSERT INTO Sales "
                   "(ItemID, Category, Price, Quantity, OrderDate, DestinationState, \
                   ShippingType, Referral) "
                   "VALUES (%s, %s, %s, %s, %s, %s, %s, %s)")

    cursor = db.cursor()
    cursor.execute(add_order, data_order)

    db.commit()

cursor.close()
db.close() 

The following screenshot shows how the table appears with the sample data:

Sending data from Amazon Aurora to Amazon S3

There are two methods available to send data from Amazon Aurora to Amazon S3:

  • Using a Lambda function
  • Using SELECT INTO OUTFILE S3

To demonstrate the ease of setting up integration between multiple AWS services, we use a Lambda function to send data to Amazon S3 using Amazon Kinesis Data Firehose.

Alternatively, you can use a SELECT INTO OUTFILE S3 statement to query data from an Amazon Aurora DB cluster and save it directly in text files that are stored in an Amazon S3 bucket. However, with this method, there is a delay between the time that the database transaction occurs and the time that the data is exported to Amazon S3 because the default file size threshold is 6 GB.
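If you do go the SELECT INTO OUTFILE S3 route, the export is issued like any other SQL statement. The following is a minimal sketch that reuses the MySQLdb connection settings from the earlier script; the bucket name is a placeholder, and it assumes the cluster already has an IAM role associated with it (for example, via the aurora_select_into_s3_role cluster parameter) that allows writing to that bucket.

#!/usr/bin/python
import MySQLdb

db = MySQLdb.connect(host="AURORA_CNAME",
                     user="DBUSER",
                     passwd="DBPASSWORD",
                     db="DB")

# Aurora writes the result set directly to Amazon S3 as delimited text files
export_sql = """
    SELECT * FROM Sales
    INTO OUTFILE S3 's3://MY_EXAMPLE_BUCKET/sales-export'
    FIELDS TERMINATED BY ','
"""

cursor = db.cursor()
cursor.execute(export_sql)
cursor.close()
db.close()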

Creating a Kinesis data delivery stream

The next step is to create a Kinesis data delivery stream, since it’s a dependency of the Lambda function.

To create a delivery stream:

  1. Open the Kinesis Data Firehose console.
  2. Choose Create delivery stream.
  3. For Delivery stream name, type AuroraChangesToS3.
  4. For Source, choose Direct PUT.
  5. For Record transformation, choose Disabled.
  6. For Destination, choose Amazon S3.
  7. In the S3 bucket drop-down list, choose an existing bucket, or create a new one.
  8. Enter a prefix if needed, and choose Next.
  9. For Data compression, choose GZIP.
  10. In IAM role, choose either an existing role that has access to write to Amazon S3, or choose to generate one automatically. Choose Next.
  11. Review all the details on the screen, and choose Create delivery stream when you’re finished.

 

Creating a Lambda function

Now you can create a Lambda function that is called every time there is a change that needs to be tracked in the database table. This Lambda function passes the data to the Kinesis data delivery stream that you created earlier.

To create the Lambda function:

  1. Open the AWS Lambda console.
  2. Ensure that you are in the AWS Region where your Amazon Aurora database is located.
  3. If you have no Lambda functions yet, choose Get started now. Otherwise, choose Create function.
  4. Choose Author from scratch.
  5. Give your function a name, and select Python 3.6 for Runtime.
  6. Choose an existing role or create a new one. The role needs permission to call the firehose:PutRecord action.
  7. Choose Next on the trigger selection screen.
  8. Paste the following code in the code window. Change the stream_name variable to the Kinesis data delivery stream that you created in the previous step.
  9. Choose File -> Save in the code editor and then choose Save.
import boto3
import json

firehose = boto3.client('firehose')
stream_name = 'AuroraChangesToS3'


def Kinesis_publish_message(event, context):

    # Build a CSV line from the fields that the Aurora trigger forwards
    firehose_data = ("%s,%s,%s,%s,%s,%s,%s,%s\n" % (event['ItemID'],
        event['Category'], event['Price'], event['Quantity'],
        event['OrderDate'], event['DestinationState'], event['ShippingType'],
        event['Referral']))

    firehose_data = {'Data': str(firehose_data)}
    print(firehose_data)

    # Send the record to the Kinesis Data Firehose delivery stream
    firehose.put_record(DeliveryStreamName=stream_name,
                        Record=firehose_data)

Note the Amazon Resource Name (ARN) of this Lambda function.

Giving Aurora permissions to invoke a Lambda function

To give Amazon Aurora permissions to invoke a Lambda function, you must attach an IAM role with appropriate permissions to the cluster. For more information, see Invoking a Lambda Function from an Amazon Aurora DB Cluster.

Once you are finished, the Amazon Aurora database has access to invoke a Lambda function.
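For reference, the association can also be scripted. The following boto3 sketch attaches an IAM role to the cluster; the cluster identifier and role ARN are placeholders, and the role must trust rds.amazonaws.com and allow lambda:InvokeFunction on your function. Depending on your Aurora MySQL version, you may also need to reference the role ARN in the cluster parameter group (for example, in the aws_default_lambda_role parameter), as described in the documentation linked above.

import boto3

rds = boto3.client('rds')

# Attach the role that allows Aurora to invoke the Lambda function
rds.add_role_to_db_cluster(
    DBClusterIdentifier='my-aurora-cluster',                          # placeholder
    RoleArn='arn:aws:iam::XXXXXXXXXXXX:role/AuroraInvokeLambdaRole',  # placeholder
)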

Creating a stored procedure and a trigger in Amazon Aurora

Now, go back to MySQL Workbench, and run the following command to create a new stored procedure. When this stored procedure is called, it invokes the Lambda function you created. Change the ARN in the following code to your Lambda function’s ARN.

DROP PROCEDURE IF EXISTS CDC_TO_FIREHOSE;
DELIMITER ;;
CREATE PROCEDURE CDC_TO_FIREHOSE (IN ItemID VARCHAR(255), 
									IN Category varchar(255), 
									IN Price double(10,2),
                                    IN Quantity int(11),
                                    IN OrderDate timestamp,
                                    IN DestinationState varchar(2),
                                    IN ShippingType varchar(255),
                                    IN Referral  varchar(255)) LANGUAGE SQL 
BEGIN
  CALL mysql.lambda_async('arn:aws:lambda:us-east-1:XXXXXXXXXXXXX:function:CDCFromAuroraToKinesis', 
     CONCAT('{ "ItemID" : "', ItemID, 
            '", "Category" : "', Category,
            '", "Price" : "', Price,
            '", "Quantity" : "', Quantity, 
            '", "OrderDate" : "', OrderDate, 
            '", "DestinationState" : "', DestinationState, 
            '", "ShippingType" : "', ShippingType, 
            '", "Referral" : "', Referral, '"}')
     );
END
;;
DELIMITER ;

Create a trigger TR_Sales_CDC on the Sales table. When a new record is inserted, this trigger calls the CDC_TO_FIREHOSE stored procedure.

DROP TRIGGER IF EXISTS TR_Sales_CDC;
 
DELIMITER ;;
CREATE TRIGGER TR_Sales_CDC
  AFTER INSERT ON Sales
  FOR EACH ROW
BEGIN
  SELECT  NEW.ItemID , NEW.Category, New.Price, New.Quantity, New.OrderDate
  , New.DestinationState, New.ShippingType, New.Referral
  INTO @ItemID , @Category, @Price, @Quantity, @OrderDate
  , @DestinationState, @ShippingType, @Referral;
  CALL  CDC_TO_FIREHOSE(@ItemID , @Category, @Price, @Quantity, @OrderDate
  , @DestinationState, @ShippingType, @Referral);
END
;;
DELIMITER ;

If a new row is inserted in the Sales table, the Lambda function that is mentioned in the stored procedure is invoked.

Verify that data is being sent from the Lambda function to Kinesis Data Firehose to Amazon S3 successfully. You might have to insert a few records, depending on the size of your data, before new records appear in Amazon S3. This is due to Kinesis Data Firehose buffering. To learn more about Kinesis Data Firehose buffering, see the “Amazon S3” section in Amazon Kinesis Data Firehose Data Delivery.
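One quick way to confirm that records are landing in Amazon S3 is to list the objects under the delivery stream's prefix, for example with a small boto3 sketch like the following (the bucket name is a placeholder; the prefix should match the one configured on the delivery stream).

import boto3

s3 = boto3.client('s3')

response = s3.list_objects_v2(Bucket='MY_EXAMPLE_BUCKET', Prefix='CDC/')
for obj in response.get('Contents', []):
    # Each object is a GZIP file containing the CSV lines written by the Lambda function
    print(obj['Key'], obj['Size'], obj['LastModified'])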

Every time a new record is inserted in the sales table, a stored procedure is called, and it updates data in Amazon S3.

Querying data in Amazon Redshift

In this section, you use the data you produced from Amazon Aurora and consume it as-is in Amazon Redshift. In order to allow you to process your data as-is, where it is, while taking advantage of the power and flexibility of Amazon Redshift, you use Amazon Redshift Spectrum. You can use Redshift Spectrum to run complex queries on data stored in Amazon S3, with no need for loading or other data prep.

Just create a data source and issue your queries to your Amazon Redshift cluster as usual. Behind the scenes, Redshift Spectrum scales to thousands of instances on a per-query basis, ensuring that you get fast, consistent performance even as your dataset grows to beyond an exabyte! Being able to query data that is stored in Amazon S3 means that you can scale your compute and your storage independently. You have the full power of the Amazon Redshift query model and all the reporting and business intelligence tools at your disposal. Your queries can reference any combination of data stored in Amazon Redshift tables and in Amazon S3.

Redshift Spectrum supports open, common data formats, including CSV/TSV, Apache Parquet, SequenceFile, and RCFile. Files can be compressed using gzip or Snappy, with other data formats and compression methods in the works.

First, create an Amazon Redshift cluster. Follow the steps in Launch a Sample Amazon Redshift Cluster.

Next, create an IAM role that has access to Amazon S3 and Athena. By default, Amazon Redshift Spectrum uses the Amazon Athena data catalog. Your cluster needs authorization to access your external data catalog in AWS Glue or Athena and your data files in Amazon S3.

In the demo setup, I attached AmazonS3FullAccess and AmazonAthenaFullAccess. In a production environment, the IAM roles should follow the standard security of granting least privilege. For more information, see IAM Policies for Amazon Redshift Spectrum.

Attach the newly created role to the Amazon Redshift cluster. For more information, see Associate the IAM Role with Your Cluster.
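If you script this setup, a rough boto3 equivalent of creating the role, attaching the two managed policies used in this demo, and associating the role with the cluster might look like the following (the role name and cluster identifier are placeholders; scope the policies down for production).

import json
import boto3

iam = boto3.client('iam')
redshift = boto3.client('redshift')

# Trust policy that lets Amazon Redshift assume the role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "redshift.amazonaws.com"},
        "Action": "sts:AssumeRole"
    }]
}

role = iam.create_role(
    RoleName='RedshiftSpectrumRole',
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# Broad managed policies, as in the demo setup; use least privilege in production
for policy_arn in ('arn:aws:iam::aws:policy/AmazonS3FullAccess',
                   'arn:aws:iam::aws:policy/AmazonAthenaFullAccess'):
    iam.attach_role_policy(RoleName='RedshiftSpectrumRole', PolicyArn=policy_arn)

# Associate the role with the Amazon Redshift cluster
redshift.modify_cluster_iam_roles(
    ClusterIdentifier='my-redshift-cluster',
    AddIamRoles=[role['Role']['Arn']],
)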

Next, connect to the Amazon Redshift cluster, and create an external schema and database:

create external schema if not exists spectrum_schema
from data catalog 
database 'spectrum_db' 
region 'us-east-1'
IAM_ROLE 'arn:aws:iam::XXXXXXXXXXXX:role/RedshiftSpectrumRole'
create external database if not exists;

Don’t forget to replace the IAM role in the statement.

Then create an external table within the database:

 CREATE EXTERNAL TABLE IF NOT EXISTS spectrum_schema.ecommerce_sales(
  ItemID int,
  Category varchar,
  Price DOUBLE PRECISION,
  Quantity int,
  OrderDate TIMESTAMP,
  DestinationState varchar,
  ShippingType varchar,
  Referral varchar)
ROW FORMAT DELIMITED
      FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION 's3://{BUCKET_NAME}/CDC/'

Query the table, and it should contain data. This is a fact table.

select top 10 * from spectrum_schema.ecommerce_sales

 

Next, create a dimension table. For this example, we create a date/time dimension table. Create the table:

CREATE TABLE date_dimension (
  d_datekey           integer       not null sortkey,
  d_dayofmonth        integer       not null,
  d_monthnum          integer       not null,
  d_dayofweek                varchar(10)   not null,
  d_prettydate        date       not null,
  d_quarter           integer       not null,
  d_half              integer       not null,
  d_year              integer       not null,
  d_season            varchar(10)   not null,
  d_fiscalyear        integer       not null)
diststyle all;

Populate the table with data:

copy date_dimension from 's3://reparmar-lab/2016dates' 
iam_role 'arn:aws:iam::XXXXXXXXXXXX:role/redshiftspectrum'
DELIMITER ','
dateformat 'auto';

The date dimension table should look like the following:

Querying data in local and external tables using Amazon Redshift

Now that you have the fact and dimension table populated with data, you can combine the two and run analysis. For example, if you want to query the total sales amount by season, you can run the following:

select sum(quantity*price) as total_sales, date_dimension.d_season
from spectrum_schema.ecommerce_sales 
join date_dimension on spectrum_schema.ecommerce_sales.orderdate = date_dimension.d_prettydate 
group by date_dimension.d_season

You get the following results:

Similarly, you can replace d_season with d_dayofweek to get sales figures by weekday:

With Amazon Redshift Spectrum, you pay only for the queries you run against the data that you actually scan. We encourage you to use file partitioning, columnar data formats, and data compression to significantly minimize the amount of data scanned in Amazon S3. This is important for data warehousing because it dramatically improves query performance and reduces cost.

Partitioning your data in Amazon S3 by date, time, or any other custom keys enables Amazon Redshift Spectrum to dynamically prune nonrelevant partitions to minimize the amount of data processed. If you store data in a columnar format, such as Parquet, Amazon Redshift Spectrum scans only the columns needed by your query, rather than processing entire rows. Similarly, if you compress your data using one of the supported compression algorithms in Amazon Redshift Spectrum, less data is scanned.
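As an illustration of what partitioning can look like, the following sketch defines a partitioned variant of the external table and registers one partition. It assumes the objects are laid out under per-month prefixes (which is not how the Firehose stream above writes them by default), uses psycopg2 purely as an example client, and uses placeholder connection details and bucket name.

import psycopg2  # any PostgreSQL-compatible driver works; psycopg2 is just one option

conn = psycopg2.connect(host='REDSHIFT_ENDPOINT', port=5439,
                        dbname='dev', user='USER', password='PASSWORD')
conn.autocommit = True  # external table DDL cannot run inside a transaction block
cur = conn.cursor()

# A partitioned variant of the external table (names and bucket are placeholders)
cur.execute("""
    CREATE EXTERNAL TABLE IF NOT EXISTS spectrum_schema.ecommerce_sales_partitioned(
      ItemID int,
      Category varchar,
      Price DOUBLE PRECISION,
      Quantity int,
      OrderDate TIMESTAMP,
      DestinationState varchar,
      ShippingType varchar,
      Referral varchar)
    PARTITIONED BY (order_month varchar)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION 's3://MY_EXAMPLE_BUCKET/CDC/'
""")

# Register one partition per S3 prefix so Redshift Spectrum can prune by order_month
cur.execute("""
    ALTER TABLE spectrum_schema.ecommerce_sales_partitioned
    ADD PARTITION (order_month='2016-01')
    LOCATION 's3://MY_EXAMPLE_BUCKET/CDC/2016-01/'
""")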

Analyzing and visualizing Amazon Redshift data in Amazon QuickSight

Modify the Amazon Redshift security group to allow an Amazon QuickSight connection. For more information, see Authorizing Connections from Amazon QuickSight to Amazon Redshift Clusters.

After modifying the Amazon Redshift security group, go to Amazon QuickSight. Create a new analysis, and choose Amazon Redshift as the data source.

Enter the database connection details, validate the connection, and create the data source.

Choose the schema to be analyzed. In this case, choose spectrum_schema, and then choose the ecommerce_sales table.

Next, we add a custom field for Total Sales = Price*Quantity. In the drop-down list for the ecommerce_sales table, choose Edit analysis data sets.

On the next screen, choose Edit.

In the data prep screen, choose New Field. Add a new calculated field, Total Sales $, which is the product of the Price and Quantity fields. Then choose Create. Save and visualize it.

Next, to visualize total sales figures by month, create a graph with Total Sales on the x-axis and Order Date formatted as month on the y-axis.

After you’ve finished, you can use Amazon QuickSight to add different columns from your Amazon Redshift tables and perform different types of visualizations. You can build operational dashboards that continuously monitor your transactional and analytical data. You can publish these dashboards and share them with others.

Final notes

Amazon QuickSight can also read data in Amazon S3 directly. However, with the method demonstrated in this post, you have the option to manipulate, filter, and combine data from multiple sources or Amazon Redshift tables before visualizing it in Amazon QuickSight.

In this example, we dealt with data being inserted, but triggers can be activated in response to INSERT, UPDATE, or DELETE statements.

Keep the following in mind:

  • Be careful when invoking a Lambda function from triggers on tables that experience high write traffic. This would result in a large number of calls to your Lambda function. Although calls to the lambda_async procedure are asynchronous, triggers are synchronous.
  • A statement that results in a large number of trigger activations does not wait for the call to the AWS Lambda function to complete. But it does wait for the triggers to complete before returning control to the client.
  • Similarly, you must account for Amazon Kinesis Data Firehose limits. By default, Kinesis Data Firehose is limited to a maximum of 5,000 records/second. For more information, see Monitoring Amazon Kinesis Data Firehose.

In certain cases, it may be optimal to use AWS Database Migration Service (AWS DMS) to capture data changes in Aurora and use Amazon S3 as a target. For example, AWS DMS might be a good option if you don’t need to transform data from Amazon Aurora. The method used in this post gives you the flexibility to transform data from Aurora using Lambda before sending it to Amazon S3. Additionally, the architecture has the benefits of being serverless, whereas AWS DMS requires an Amazon EC2 instance for replication.

For design considerations while using Redshift Spectrum, see Using Amazon Redshift Spectrum to Query External Data.

If you have questions or suggestions, please comment below.


Additional Reading

If you found this post useful, be sure to check out Capturing Data Changes in Amazon Aurora Using AWS Lambda and 10 Best Practices for Amazon Redshift Spectrum.


About the Authors

Re Alvarez-Parmar is a solutions architect for Amazon Web Services. He helps enterprises achieve success through technical guidance and thought leadership. In his spare time, he enjoys spending time with his two kids and exploring outdoors.

Use the New Visual Editor to Create and Modify Your AWS IAM Policies

Post Syndicated from Joy Chatterjee original https://aws.amazon.com/blogs/security/use-the-new-visual-editor-to-create-and-modify-your-aws-iam-policies/

Today, AWS Identity and Access Management (IAM) made it easier for you to create and modify your IAM policies by using a point-and-click visual editor in the IAM console. The new visual editor guides you through granting permissions for IAM policies without requiring you to write policies in JSON (although you can still author and edit policies in JSON, if you prefer). This update to the IAM console makes it easier to grant least privilege for the AWS service actions you select by listing all the supported resource types and request conditions you can specify. Policy summaries identify unrecognized services and actions and permissions errors when you import existing policies, and now you can use the visual editor to correct them. In this blog post, I give a brief overview of policy concepts and show you how to create a new policy by using the visual editor.

IAM policy concepts

You use IAM policies to define permissions for your IAM entities (groups, users, and roles). Policies are composed of one or more statements that include the following elements:

  • Effect: Determines if a policy statement allows or explicitly denies access.
  • Action: Defines AWS service actions in a policy (these typically map to individual AWS APIs).
  • Resource: Defines the AWS resources to which actions can apply. The defined resources must be supported by the actions defined in the Action element for permissions to be granted.
  • Condition: Defines when a permission is allowed or denied. The conditions defined in a policy must be supported by the actions defined in the Action element for the permission to be granted.

To grant permissions, you attach policies to groups, users, or roles. Now that I have reviewed the elements of a policy, I will demonstrate how to create an IAM policy with the visual editor.

How to create an IAM policy with the visual editor

Let’s say my human resources (HR) recruiter, Casey, needs to review files located in an Amazon S3 bucket for all the product manager (PM) candidates our HR team has interviewed in 2017. To grant this access, I will create and attach a policy to Casey that grants list and limited read access to all folders that begin with PM_Candidate in the pmrecruiting2017 S3 bucket. To create this new policy, I navigate to the Policies page in the IAM console and choose Create policy. Note that I could also use the visual editor to modify existing policies by choosing Import existing policy; however, for Casey, I will create a new policy.

Image of the "Create policy" button

On the Visual editor tab, I see a section that includes Service, Actions, Resources, and Request Conditions.

Image of the "Visual editor" tab

Select a service

To grant S3 permissions, I choose Select a service, type S3 in the search box, and choose S3 from the list.

Image of choosing "S3"

Select actions

After selecting S3, I can define actions for Casey by using one of four options:

  1. Filter actions in the service by using the search box.
  2. Type actions by choosing Add action next to Manual actions. For example, I can type List* to grant all S3 actions that begin with List*.
  3. Choose access levels from List, Read, Write, Permissions management, and Tagging.
  4. Select individual actions by expanding each access level.

In the following screenshot, I choose options 3 and 4, and choose List and s3:GetObject from the Read access level.

Screenshot of options in the "Select actions" section

We introduced access levels when we launched policy summaries earlier in 2017. Access levels give you a way to categorize actions and help you understand the permissions in a policy. The following list gives you a quick overview of access levels.

  • List – Actions that allow you to see a list of resources. Example actions: s3:ListBucket, s3:ListAllMyBuckets
  • Read – Actions that allow you to read the content in resources. Example actions: s3:GetObject, s3:GetBucketTagging
  • Write – Actions that allow you to create, delete, or modify resources. Example actions: s3:PutObject, s3:DeleteBucket
  • Permissions management – Actions that allow you to grant or modify permissions to resources. Example action: s3:PutBucketPolicy
  • Tagging – Actions that allow you to create, delete, or modify tags. Example actions: s3:PutBucketTagging, s3:DeleteObjectVersionTagging. Note: Some services support authorization based on tags.

Note: By default, all actions you choose will be allowed. To deny actions, choose Switch to deny permissions in the upper right corner of the Actions section.

As shown in the preceding screenshot, if I choose the question mark icon next to GetObject, I can see the description and supported resources and conditions for this action, which can help me scope permissions.

Screenshot of GetObject

The visual editor makes it easy to decide which actions I should select by providing, in an integrated documentation panel, the action description, supported resources or conditions, and any required actions for every AWS service action. Some AWS service actions have required actions, which are other AWS service actions that need to be granted in a policy for an action to run. For example, the AWS Directory Service action, ds:CreateDirectory, requires seven Amazon EC2 actions to be able to create a Directory Service directory.

Choose resources

In the Resources section, I can choose the resources on which actions can be taken. I choose Resources and see two ways that I can define or select resources:

  1. Define specific resources
  2. Select all resources

Specific is the default option, and only the applicable resources are presented based on the service and actions I chose previously. Because I want to grant Casey access to some objects in a specific bucket, I choose Specific and choose Add ARN under bucket.

Screenshot of Resources section

In the pop-up, I type the bucket name, pmrecruiting2017, and choose Add to specify the S3 bucket resource.

Screenshot of specifying the S3 bucket resource

To specify the objects, I choose Add ARN under object and grant Casey access to all objects starting with PM_Candidate in the pmrecruiting2017 bucket. The visual editor helps you build your Amazon Resource Name (ARN) and validates that it is structured correctly. For AWS services that are AWS Region specific, the visual editor prompts for AWS Region and account number.

The visual editor displays all applicable resources in the Resources section based on the actions I choose. For Casey, I defined an S3 bucket and object in the Resources section. In this example, when the visual editor creates the policy, it creates three statements. The first statement includes all actions that require a wildcard (*) for the Resource element because this action does not support resource-level permissions. The second statement includes all S3 actions that support an S3 bucket. The third statement includes all actions that support an S3 object resource. The visual editor generates policy syntax for you based on supported permissions in AWS services.

Specify request conditions

For additional security, I specify a condition to restrict access to the S3 bucket from inside our internal network. To do this, I choose Specify request conditions in the Request Conditions section, and choose the Source IP check box. A condition is composed of a condition key, an operator, and a value. I choose aws:SourceIp for my Key so that I can control from where the S3 files can be accessed. By default, IpAddress is the Operator, and I set the Value to my internal network.

Screenshot of "Request conditions" section

To add other conditions, choose Add condition and choose Save changes after choosing the key, operator, and value.

After specifying my request condition, I am now able to review all the elements of these S3 permissions.

Screenshot of S3 permissions

Next, I can choose to grant permissions for another service by choosing Add new permissions (bottom left of preceding screenshot), or I can review and create this new policy. Because I have granted all the permissions Casey needs, I choose Review policy. I type a name and a description, and I review the policy summary before choosing Create policy. 
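For reference, the policy that the visual editor generates for Casey is roughly equivalent to the following sketch, shown here as it might be created with boto3. The exact statement layout can differ from what the editor produces, and the policy name and IP range are illustrative placeholders.

import json
import boto3

iam = boto3.client('iam')

policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {   # List actions that do not support resource-level permissions
            "Effect": "Allow",
            "Action": "s3:ListAllMyBuckets",
            "Resource": "*",
            "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}}
        },
        {   # Bucket-level list access
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::pmrecruiting2017",
            "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}}
        },
        {   # Object-level read access to the PM candidate folders
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::pmrecruiting2017/PM_Candidate*",
            "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}}
        }
    ]
}

iam.create_policy(
    PolicyName='HRRecruiting2017ReadAccess',  # hypothetical name
    PolicyDocument=json.dumps(policy_document),
)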

Now that I have created the policy, I attach it to Casey by choosing the Attached entities tab of the policy I just created. I choose Attach and choose Casey. I then choose Attach policy. Casey should now be able to access the interview files she needs to review.

Summary

The visual editor makes it easier to create and modify your IAM policies by guiding you through each element of the policy. The visual editor helps you define resources and request conditions so that you can grant least privilege and generate policies. To start using the visual editor, sign in to the IAM console, navigate to the Policies page, and choose Create policy.

If you have comments about this post, submit them in the “Comments” section below. If you have questions about or suggestions for this solution, start a new thread on the IAM forum.

– Joy

Cross-Account Integration with Amazon SNS

Post Syndicated from Christie Gifrin original https://aws.amazon.com/blogs/compute/cross-account-integration-with-amazon-sns/

Contributed by Zak Islam, Senior Manager, Software Development, AWS Messaging

 

Amazon Simple Notification Service (Amazon SNS) is a fully managed AWS service that makes it easy to decouple your application components and fan-out messages. SNS provides topics (similar to topics in message brokers such as RabbitMQ or ActiveMQ) that you can use to create 1:1, 1:N, or N:N producer/consumer design patterns. For more information about how to send messages from SNS to Amazon SQS, AWS Lambda, or HTTP(S) endpoints in the same account, see Sending Amazon SNS Messages to Amazon SQS Queues.

SNS can be used to send messages within a single account or to resources in different accounts to create administrative isolation. This enables administrators to grant only the minimum level of permissions required to process a workload (for example, limiting the scope of your application account to only send messages and to deny deletes). This approach is commonly known as the “principle of least privilege.” If you are interested, read more about AWS’s multi-account security strategy.

This is great from a security perspective, but why would you want to share messages between accounts? It may sound scary, but it’s a common practice to isolate application components (such as producer and consumer) to operate using different AWS accounts to lock down privileges in case credentials are exposed. In this post, I go slightly deeper and explore how to set up your SNS topic so that it can route messages to SQS queues that are owned by a separate AWS account.

Potential use cases

First, look at a common order processing design pattern:

This is a simple architecture. A web server submits an order directly to an SNS topic, which then fans out messages to two SQS queues. One SQS queue is used to track all incoming orders for audits (such as anti-entropy, comparing the data of all replicas and updating each replica to the newest version). The other is used to pass the request to the order processing systems.

Imagine now that a few years have passed, and your downstream processes no longer scale, so you are kicking around the idea of a re-architecture project. To thoroughly test your system, you need a way to replay your production messages in your development system. Sure, you can build a system to replicate and replay orders from your production environment in your development environment. Wouldn’t it be easier to subscribe your development queues to the production SNS topic so you can test your new system in real time? That’s exactly what you can do here.

Here’s another use case. As your business grows, you recognize the need for more metrics from your order processing pipeline. The analytics team at your company has built a metrics aggregation service and ingests data via a central SQS queue. Their architecture is as follows:

Again, it’s a fairly simple architecture. All data is ingested via SQS queues (master_ingest_queue, in this case). You subscribe the master_ingest_queue, running under the analytics team’s AWS account, to the topic that is in the order management team’s account.

Making it work

Now that you’ve seen a few scenarios, let’s dig into the details. There are a couple of ways to link an SQS queue to an SNS topic (subscribe a queue to a topic):

  1. The queue owner can create a subscription to the topic.
  2. The topic owner can subscribe a queue in another account to the topic.

Queue owner subscription

What happens when the queue owner subscribes to a topic? In this case, assume that the topic owner has given permission to the subscriber’s account to call the Subscribe API action using the topic ARN (Amazon Resource Name). For the examples below, also assume the following:

  •  Topic_Owner is the identifier for the account that owns the topic MainTopic
  • Queue_Owner is the identifier for the account that owns the queue subscribed to the main topic

To enable the subscriber to subscribe to the topic, the topic owner must add a statement to the topic policy that allows the Queue_Owner account to call sns:Subscribe on the topic ARN. This can be done via the AWS Management Console, as follows:

{
  "Version":"2012-10-17",
  "Id":"MyTopicSubscribePolicy",
  "Statement":[{
      "Sid":"Allow-other-account-to-subscribe-to-topic",
      "Effect":"Allow",
      "Principal":{
        "AWS":"Queue_Owner"
      },
      "Action":"sns:Subscribe",
      "Resource":"arn:aws:sns:us-east-1:Topic_Owner:MainTopic"
    }
  ]
}

After this has been set up, the subscriber (using account Queue_Owner) can call Subscribe to link the queue to the topic. After the queue has been successfully subscribed, SNS starts to publish notifications. In this case, neither the topic owner nor the subscriber have had to process any kind of confirmation message.
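As a rough boto3 sketch of that flow, run with the Queue_Owner account's credentials, the subscription might be set up as follows. The queue name is hypothetical, and note that the queue's own access policy also has to allow the topic to deliver messages to it.

import json
import boto3

sns = boto3.client('sns')
sqs = boto3.client('sqs')

topic_arn = 'arn:aws:sns:us-east-1:TOPIC_OWNER_ACCOUNT_ID:MainTopic'  # placeholder
queue_url = sqs.get_queue_url(QueueName='MyIngestQueue')['QueueUrl']  # hypothetical queue
queue_arn = sqs.get_queue_attributes(
    QueueUrl=queue_url, AttributeNames=['QueueArn'])['Attributes']['QueueArn']

# Allow the topic to deliver messages to the queue
queue_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "sns.amazonaws.com"},
        "Action": "sqs:SendMessage",
        "Resource": queue_arn,
        "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}}
    }]
}
sqs.set_queue_attributes(QueueUrl=queue_url,
                         Attributes={'Policy': json.dumps(queue_policy)})

# Subscribe the queue to the cross-account topic
sns.subscribe(TopicArn=topic_arn, Protocol='sqs', Endpoint=queue_arn)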

Topic owner subscription

The second way to subscribe an SQS queue to an SNS topic is to have the Topic_Owner account initiate the subscription for the queue from account Queue_Owner. In this case, SNS first sends a confirmation message to the queue. To confirm the subscription, a user who can read messages from the queue must visit the URL specified in the SubscribeURL value in the message. Until the subscription is confirmed, no notifications published to the topic are sent to the queue. To confirm a subscription, you can use the SQS console or the ReceiveMessage API action.

What’s next?

In this post, I covered a few simple use cases, but the principles can be extended to complex systems as well. As you architect new systems and refactor existing ones, think about where you can leverage queues (SQS) and topics (SNS) to build a loosely coupled system that can be quickly and easily extended to meet your business needs.

For step by step instructions, see Sending Amazon SNS messages to an Amazon SQS queue in a different account. You can also visit the following resources to get started working with message queues and topics:

Secure API Access with Amazon Cognito Federated Identities, Amazon Cognito User Pools, and Amazon API Gateway

Post Syndicated from Ed Lima original https://aws.amazon.com/blogs/compute/secure-api-access-with-amazon-cognito-federated-identities-amazon-cognito-user-pools-and-amazon-api-gateway/

Ed Lima, Solutions Architect

 

Our identities are what define us as human beings. Philosophical discussions aside, it also applies to our day-to-day lives. For instance, I need my work badge to get access to my office building or my passport to travel overseas. My identity in this case is attached to my work badge or passport. As part of the system that checks my access, these documents or objects help define whether I have access to get into the office building or travel internationally.

This exact same concept can also be applied to cloud applications and APIs. To provide secure access to your application users, you define who can access the application resources and what kind of access can be granted. Access is based on identity controls that can confirm authentication (AuthN) and authorization (AuthZ), which are different concepts. According to Wikipedia:

 

The process of authorization is distinct from that of authentication. Whereas authentication is the process of verifying that “you are who you say you are,” authorization is the process of verifying that “you are permitted to do what you are trying to do.” This does not mean authorization presupposes authentication; an anonymous agent could be authorized to a limited action set.

Amazon Cognito allows building, securing, and scaling a solution to handle user management and authentication, and to sync across platforms and devices. In this post, I discuss the different ways that you can use Amazon Cognito to authenticate API calls to Amazon API Gateway and secure access to your own API resources.

 

Amazon Cognito Concepts

 

It’s important to understand that Amazon Cognito provides three different services:

  • Amazon Cognito Federated Identities
  • Amazon Cognito User Pools
  • Amazon Cognito Sync

Today, I discuss the use of the first two. One service doesn’t need the other to work; however, they can be configured to work together.

Amazon Cognito Federated Identities

 
To use Amazon Cognito Federated Identities in your application, create an identity pool. An identity pool is a store of user data specific to your account. It can be configured to require an identity provider (IdP) for user authentication, after you enter details such as app IDs or keys related to that specific provider.

After the user is validated, the provider sends an identity token to Amazon Cognito Federated Identities. In turn, Amazon Cognito Federated Identities contacts the AWS Security Token Service (AWS STS) to retrieve temporary AWS credentials based on a configured, authenticated IAM role linked to the identity pool. The role has appropriate IAM policies attached to it and uses these policies to provide access to other AWS services.
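As a minimal sketch of that exchange from a client's point of view, the temporary credentials can be retrieved with two calls. The identity pool ID, provider name, and token below are placeholders; the key used in Logins depends on the IdP.

import boto3

cognito = boto3.client('cognito-identity')

# Exchange the IdP token for a Cognito identity ID
identity = cognito.get_id(
    IdentityPoolId='us-east-1:00000000-0000-0000-0000-000000000000',  # placeholder
    Logins={'accounts.google.com': 'ID_TOKEN_FROM_THE_PROVIDER'},     # placeholder
)

# Retrieve temporary AWS credentials scoped by the identity pool's authenticated role
credentials = cognito.get_credentials_for_identity(
    IdentityId=identity['IdentityId'],
    Logins={'accounts.google.com': 'ID_TOKEN_FROM_THE_PROVIDER'},
)['Credentials']

print(credentials['AccessKeyId'], credentials['SessionToken'])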

Amazon Cognito Federated Identities currently supports the IdPs listed in the following graphic.

 



Continue reading Secure API Access with Amazon Cognito Federated Identities, Amazon Cognito User Pools, and Amazon API Gateway

Introducing an Easier Way to Delegate Permissions to AWS Services: Service-Linked Roles

Post Syndicated from Abhishek Pandey original https://aws.amazon.com/blogs/security/introducing-an-easier-way-to-delegate-permissions-to-aws-services-service-linked-roles/

Some AWS services create and manage AWS resources on your behalf. To do this, these services require you to delegate permissions to them by using AWS Identity and Access Management (IAM) roles. Today, AWS IAM introduces service-linked roles, which give you an easier and more secure way to delegate permissions to AWS services. To start, you can use service-linked roles with Amazon Lex, a service that enables you to build conversational interfaces in any application by using voice and text. Over time, more AWS services will use service-linked roles as a way for you to delegate permissions to them to create and manage AWS resources on your behalf. In this blog post, I walk through the details of service-linked roles and show how to use them.

Creation and management of service-linked roles

Each service-linked role links to an AWS service, which is called the linked service. Service-linked roles provide a secure way to delegate permissions to AWS services because only the linked service can assume a service-linked role. Additionally, AWS automatically defines and sets the permissions of service-linked roles, depending on the actions that the linked service performs on your behalf. This makes it easier for you to manage the permissions you delegate to AWS services. AWS allows only those changes to service-linked roles that do not remove the permissions required by the linked service to manage your resources, preventing you from making any changes that would leave your AWS resources in an inconsistent state. Service-linked roles also help you meet your monitoring and auditing requirements because all actions performed on your behalf by an AWS service using a service-linked role appear in your AWS CloudTrail logs.

When you work with an AWS service that uses service-linked roles, the service automatically creates a service-linked role for you. After that, whenever the service must act on your behalf to manage your resources, it assumes the service-linked role. You can view the details of the service-linked roles in your account by using the IAM console, IAM APIs, or the AWS CLI.
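For example, because service-linked roles all live under the /aws-service-role/ path, a quick way to list them with boto3 is the following sketch.

import boto3

iam = boto3.client('iam')

# Service-linked roles share the /aws-service-role/ path prefix
for role in iam.list_roles(PathPrefix='/aws-service-role/')['Roles']:
    print(role['RoleName'], role['Arn'])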

Service-linked roles follow a specific naming convention that includes a mandatory prefix that is defined by AWS and an optional suffix defined by you. The following examples show how the role names of service-linked roles may appear:

  • AWSServiceRoleForLexBots – prefix: AWSServiceRoleForLexBots; optional suffix: not set
  • AWSServiceRoleForElasticBeanstalk_myRole – prefix: AWSServiceRoleForElasticBeanstalk; optional suffix: myRole

If you are the administrator of your account and you do not want to grant permissions to other users to create roles or delegate permissions to AWS services, you can create service-linked roles for users in your account by using the IAM console, IAM APIs, or the AWS CLI. For more information about how to create service-linked roles through IAM, see the IAM documentation about creating a role to delegate permissions to an AWS service.

To create a service-linked role or to enable an AWS service to create one on your behalf, you must have permission for the iam:CreateServiceLinkedRole action. The following IAM policy grants the permission to create service-linked roles for Amazon Lex.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCreationOfServiceLinkedRoleForLex",
            "Effect": "Allow",
            "Action": ["iam:CreateServiceLinkedRole"],
            "Resource": ["arn:aws:iam::*:role/aws-service-role/lex.amazonaws.com/AWSServiceRoleForLex*"],
            "Condition": {
                "StringLike": {
                    "iam:AWSServiceName": "lex.amazonaws.com"
                }
            }
        }
    ]
}

The preceding policy allows the iam:CreateServiceLinkedRole action when the linked service is Amazon Lex, and the name of the service-linked role starts with AWSServiceRoleForLex. For more information, see Working with Policies.
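With that permission in place, the role can also be created programmatically. The following boto3 sketch creates the Amazon Lex service-linked role; the description is illustrative, and Amazon Lex does not support a custom suffix.

import boto3

iam = boto3.client('iam')

response = iam.create_service_linked_role(
    AWSServiceName='lex.amazonaws.com',
    Description='Service-linked role for Amazon Lex',
)
print(response['Role']['RoleName'])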

If you no longer wish to use a specific AWS service, you can revoke permissions for that service by deleting the service-linked role. You can do this from the linked service, and the service might require you to delete the resources that depend on the service-linked role. This helps ensure that you do not inadvertently delete a role that is required for your AWS resources to function properly. To learn more about how to delete a service-linked role, see the linked service’s documentation.

Permissions of service-linked roles

Just like existing IAM roles, the permissions of service-linked roles come from two policies: a permission policy and a trust policy. The permission policy determines what the role can and cannot do, and the trust policy defines who can assume the role. AWS automatically sets the permission and trust policies of service-linked roles.

For the permission policy, service-linked roles use an AWS managed policy. This means that when a service adds a new feature, AWS automatically updates the managed policy to enable the new functionality without requiring you to change the policy. In most cases, you do not have to update the permission policy of a service-linked role. However, some services may require you to add specific permissions to the role such as access to a specific Amazon S3 bucket. To learn more about how to add permissions if a service requires specific permissions, see the linked service’s documentation.

Only the linked AWS service can assume a service-linked role, which is why you cannot modify the trust policy of a service-linked role. You can allow your users to create service-linked roles for AWS services while not permitting them to escalate their own privileges. For example, imagine Alice is a developer on your team and she wants to delegate permissions to Amazon Lex. When Alice creates a service-linked role for Amazon Lex, AWS automatically attaches the permission and trust policies. The permission policy includes only the permissions that Amazon Lex needs to manage your resources (this best practice is known as least privilege), and the trust policy defines Amazon Lex as the trusted entity. As a result, Alice is able to create a service-linked role to delegate permissions to Amazon Lex. However, she is unable to edit the trust policy to include additional trusted entities. This prevents her from granting unapproved access to other users or escalating her own privileges, while still having the necessary permissions to create service-linked roles for Amazon Lex.

How to create a service-linked role by using the IAM console

The following steps lead you through creating a service-linked role by using the IAM console. However, before you create a service-linked role, make sure you have the right permissions to do so.

To create a service-linked role by using the IAM console:

  1. Navigate to the IAM console and choose Roles in the navigation pane.
    Screenshot of the IAM console
  2. Choose Create new role.
    Screenshot showing the "Create new role" button
  3. On the Select role type page, in the AWS service-linked role section, choose the AWS service for which you want to create the role. For this example, I choose Amazon Lex – Bots.
    Screenshot of choosing "Amazon Lex - Bots"
  4. Notice that the role name prefix is automatically populated. Type the role name suffix for the service-linked role. Some AWS services, such as Amazon Lex, do not support custom suffixes, in which case you should leave the role name suffix box blank.
    Screenshot of the role name suffix box
  5. Include a description of the new role. Notice that IAM automatically suggests a description for this role, which you can edit. For this example, I keep the suggested description.
    Screenshot of the "Role description" box
  6. Choose Create role. After the role is created, you can view it in the IAM console. Service-linked roles are marked with a cube-shaped icon in the console to help you distinguish these roles from other roles in your account.

Conclusion

With service-linked roles, delegating permissions to AWS services is easier because when you work with an AWS service that uses these roles, the service creates the role for you. You do not have to create IAM policies for delegating permissions to AWS services, and IAM does not allow changes to these roles that might interfere with your AWS resources. Delegation of permissions is secure because only the linked service is able to use these roles.

To see which AWS services support service-linked roles, see AWS services that work with IAM. If you have any comments about service-linked roles, submit a comment in the “Comments” section below. If you have questions about working with service-linked roles, please start a new thread on the IAM forum.

– Abhishek

Amazon Elasticsearch Service support for Elasticsearch 5.1

Post Syndicated from Tara Walker original https://aws.amazon.com/blogs/aws/amazon-elasticsearch-service-support-for-es-5-1/

The Amazon Elasticsearch Service is a fully managed service that provides easier deployment, operation, and scale for the Elasticsearch open-source search and analytics engine. We are excited to announce that Amazon Elasticsearch Service now supports Elasticsearch 5.1 and Kibana 5.1.

Elasticsearch 5 comes with a ton of new features and enhancements that customers can now take advantage of in Amazon Elasticsearch Service. Elements of the Elasticsearch 5 release are as follows:

  • Indexing performance: Improved Indexing throughput with updates to lock implementation & async translog fsyncing
  • Ingestion Pipelines: Incoming data can be sent to a pipeline that applies a series of ingestion processors, allowing transformation to the exact data you want to have in your search index. There are twenty processors included, from simple appending to complex regex applications
  • Painless scripting: Amazon Elasticsearch Service supports Painless, a new secure and performant scripting language for Elasticsearch 5. You can use scripting to change the precedence of search results, delete index fields by query, modify search results to return specific fields, and more.
  • New data structures: Lucene 6 data structures, new data types; half_float, text, keyword, and more complete support for dots-in-fieldnames
  • Search and Aggregations: Refactored search API, BM25 relevance calculations, Instant Aggregations, improvements to histogram aggregations & terms aggregations, and rewritten percolator & completion suggester
  • User experience: Strict settings and body & query string parameter validation, index management improvement, default deprecation logging, new shard allocation API, and new indices efficiency pattern for rollover & shrink APIs
  • Java REST client: simple HTTP/REST Java client that works with Java 7 and handles retry on node failure, as well as, round-robin, sniffing, and logging of requests
  • Other improvements: Lazy unicast hosts DNS lookup, automatic parallel tasking of reindex, update-by-query, delete-by-query, and search cancellation by task management API

The compelling new enhancements of Elasticsearch 5 are meant to make the service faster and easier to use while providing better security. Amazon Elasticsearch Service is a managed service designed to aid customers in building, developing and deploying solutions with Elasticsearch by providing the following capabilities:

  • Multiple configurations of instance types
  • Amazon EBS volumes for data storage
  • Cluster stability improvement with dedicated master nodes
  • Zone awareness – Cluster node allocation across two Availability Zones in the region
  • Access Control & Security with AWS Identity and Access Management (IAM)
  • Various geographical locations/regions for resources
  • Amazon Elasticsearch domain snapshots for replication, backup and restore
  • Integration with Amazon CloudWatch for monitoring Amazon Elasticsearch domain metrics
  • Integration with AWS CloudTrail for configuration auditing
  • Integration with other AWS services like Kinesis Data Firehose and DynamoDB for loading real-time streaming data into Amazon Elasticsearch Service

Amazon Elasticsearch Service allows dynamic changes with zero downtime. You can add instances, remove instances, change instance sizes, change storage configuration, and make other changes dynamically.

The best way to highlight some of the aforementioned capabilities is with an example.

During a presentation at the IT/Dev conference, I demonstrated how to build a serverless employee onboarding system using Express.js, AWS Lambda, Amazon DynamoDB, and Amazon S3. In the demo, the information collected was personnel data stored in DynamoDB about an employee going through a fictional onboarding process. Imagine if the collected employee data could be searched, queried, and analyzed as needed by the company’s HR department. We can easily augment the onboarding system to add these capabilities by enabling the employee table to use DynamoDB Streams to trigger Lambda and store the desired employee attributes in Amazon Elasticsearch Service.

The result is the following solution architecture:

We will focus solely on how to dynamically store and index employee data to Amazon Elasticsearch Service each time an employee record is entered and subsequently stored in the database.
To add this enhancement to the existing aforementioned onboarding solution, we will implement the solution as noted by the detailed cloud architecture diagram below:

Let’s look at how to implement the employee load process to the Amazon Elasticsearch Service, which is the first process flow shown in the diagram above.

Amazon Elasticsearch Service: Domain Creation

Let’s now visit the AWS Console to check out Amazon Elasticsearch Service with Elasticsearch 5 in action. As you probably guessed, from the AWS Console home, we select Elasticsearch Service under the Analytics group.

The first step in creating an Elasticsearch solution is to create a domain. You will notice that when creating an Amazon Elasticsearch Service domain, you now have the option to choose the Elasticsearch 5.1 version. Since we are discussing the launch of the support of Elasticsearch 5, we will, of course, choose the 5.1 Elasticsearch engine version when creating our domain in the Amazon Elasticsearch Service.


After clicking Next, we will now set up our Elasticsearch domain by configuring our instance and storage settings. The instance type and the number of instances for your cluster should be determined based upon your application’s availability, network volume, and data needs. A recommended best practice is to choose two or more instances in order to avoid possible data inconsistencies or split brain failure conditions with Elasticsearch. Therefore, I will choose two instances/data nodes for my cluster and set up EBS as my storage device.

To understand how many instances you will need for your specific application, please review the blog post, Get Started with Amazon Elasticsearch Service: How Many Data Instances Do I Need, on the AWS Database blog.

All that is left for me is to set up the access policy and deploy the service. Once I create my service, the domain will be initialized and deployed.
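For reference, the same kind of domain can also be created outside the console. The following boto3 sketch is illustrative only: the domain name, instance type, volume size, and the omitted access policy are all choices you would make for your own environment.

import boto3

es = boto3.client('es')

es.create_elasticsearch_domain(
    DomainName='onboarding-es-demo',                 # illustrative name
    ElasticsearchVersion='5.1',
    ElasticsearchClusterConfig={
        'InstanceType': 'm3.medium.elasticsearch',   # illustrative instance type
        'InstanceCount': 2,                          # two data nodes, as discussed above
    },
    EBSOptions={
        'EBSEnabled': True,
        'VolumeType': 'gp2',
        'VolumeSize': 10,
    },
)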

Now that I have my Elasticsearch service running, I now need a mechanism to populate it with data. I will implement a dynamic data load process of the employee data to Amazon Elasticsearch Service using DynamoDB Streams.

Amazon DynamoDB: Table and Streams

Before I head to the DynamoDB console, I will quickly cover the basics.

Amazon DynamoDB is a scalable, distributed NoSQL database service. DynamoDB Streams provide an ordered, time-based sequence of every CRUD operation to the items in a DynamoDB table. Each stream record has information about the primary attribute modification for an individual item in the table. Streams execute asynchronously and can write stream records in practically real time. Additionally, a stream can be enabled when a table is created or can be enabled and modified on an existing table. You can learn more about DynamoDB Streams in the DynamoDB developer guide.

Now we will head to the DynamoDB console and view the OnboardingEmployeeData table.

This table has a primary partition key, UserID, that is a string data type and a primary sort key, Username, which is also of a string data type. We will use the UserID as the document ID in Elasticsearch. You will also notice that on this table, streams are enabled and the stream view type is New image. A stream that is set to a New image view type will have stream records that display the entire item record after it has been updated. You also have the option to have the stream present records that provide data items before modification, provide only the items’ key attributes, or provide old and new item information.  If you opt to use the AWS CLI to create your DynamoDB table, the key information to capture is the Latest Stream ARN shown underneath the Stream Details section. A DynamoDB stream has a unique ARN identifier that is outside of the ARN of the DynamoDB table. The stream ARN will be needed to create the IAM policy for access permissions between the stream and the Lambda function.
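If you are scripting this setup instead of using the console, the stream can be enabled and its ARN retrieved with boto3 roughly as follows (skip the update_table call if the stream is already enabled).

import boto3

dynamodb = boto3.client('dynamodb')

# Enable a NEW_IMAGE stream on the existing table
dynamodb.update_table(
    TableName='OnboardingEmployeeData',
    StreamSpecification={'StreamEnabled': True, 'StreamViewType': 'NEW_IMAGE'},
)

# The stream ARN is separate from the table ARN and is what the IAM policy references
table = dynamodb.describe_table(TableName='OnboardingEmployeeData')['Table']
print(table['LatestStreamArn'])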

IAM Policy

The first thing that is essential for any service implementation is getting the correct permissions in place. Therefore, I will first go to the IAM console to create a role and a policy for my Lambda function that will provide permissions for DynamoDB and Elasticsearch.

First, I will create a policy based upon an existing managed policy for Lambda execution with DynamoDB Streams.

This takes us to the Review Policy screen, which shows the selected managed policy details. I'll name this policy Onboarding-LambdaDynamoDB-toElasticsearch and then customize it for my solution. The first thing you should notice is that the managed policy allows access to all streams; the best practice is to restrict the policy to the specific DynamoDB stream by adding its Latest Stream ARN. Hence, I alter the policy, add the stream ARN of the OnboardingEmployeeData table as the Resource, and validate the policy. The altered policy looks similar to the example below.
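
A sketch of what the altered policy might look like follows. The actions mirror the managed policy for Lambda execution with DynamoDB Streams, and the stream ARN uses placeholder Region, account ID, and stream label values; substitute the Latest Stream ARN you captured earlier.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:DescribeStream",
        "dynamodb:GetRecords",
        "dynamodb:GetShardIterator",
        "dynamodb:ListStreams"
      ],
      "Resource": "arn:aws:dynamodb:us-east-1:111122223333:table/OnboardingEmployeeData/stream/2017-01-01T00:00:00.000"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}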

The only thing left is to add the Amazon Elasticsearch Service permissions to the policy. The core policy for Amazon Elasticsearch Service access permissions looks like the following example:
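
Here is a minimal sketch of such a core policy, assuming the function only needs to issue HTTP GET, POST, and PUT requests to the domain. The Resource is left as a wildcard for now and is narrowed in the next step.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "es:ESHttpGet",
        "es:ESHttpPost",
        "es:ESHttpPut"
      ],
      "Resource": "*"
    }
  ]
}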

 

I will use this policy and add my specific Elasticsearch domain ARN as the Resource. This ensures that the policy enforces the least privilege security best practice. With the Amazon Elasticsearch Service domain ARN added, I can validate and save the policy.

The best way to create a custom policy is to use the IAM Policy Simulator or to review the examples of AWS service permissions in the service documentation. You can also find examples of policies for a subset of AWS services here. Remember to add only the Elasticsearch permissions that are needed, following the least privilege security best practice; the policy shown above is used only as an example.

We then create the role that our Lambda function will use to gain access, and we attach the aforementioned policy to that role.

AWS Lambda: DynamoDB triggered Lambda function

AWS Lambda is the core of the Amazon Web Services serverless computing offering. With Lambda, you can write and run code in supported languages for almost any type of application or backend service. Lambda runs your code in response to events from AWS services or from HTTP requests, scales dynamically with your workload, and charges you only for the time your code executes.

We will have the DynamoDB stream trigger a Lambda function that creates an index and sends data to Elasticsearch. Another option is to use the Logstash plugin for DynamoDB. However, because several of the Logstash processors are now included in the Elasticsearch 5.1 core, along with the improved performance optimizations, I will opt to use Lambda to process my DynamoDB stream and load data into Amazon Elasticsearch Service.

Now let us head over to the AWS Lambda console and create the Lambda function for loading employee data into Amazon Elasticsearch Service.

Once in the console, I create a new Lambda function by selecting the Blank Function blueprint, which takes me to the Configure triggers page. There, I select DynamoDB as the AWS service that triggers Lambda, and I provide the following trigger-related options (an equivalent SDK call is sketched after the list):

  • Table: OnboardingEmployeeData
  • Batch size: 100 (default)
  • Starting position: Trim Horizon
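
Creating the trigger in the console sets up an event source mapping behind the scenes. For reference, a minimal sketch of the equivalent call with the AWS SDK for JavaScript is shown below; it assumes the ESEmployeeLoad function already exists, and the stream ARN and Region are placeholders.

var AWS = require('aws-sdk');
var lambda = new AWS.Lambda({ region: 'us-east-1' });  // assumed Region placeholder

var params = {
    EventSourceArn: 'arn:aws:dynamodb:us-east-1:111122223333:table/OnboardingEmployeeData/stream/2017-01-01T00:00:00.000',  // placeholder stream ARN
    FunctionName: 'ESEmployeeLoad',
    BatchSize: 100,
    StartingPosition: 'TRIM_HORIZON'
};

lambda.createEventSourceMapping(params, function(err, data) {
    if (err) console.log('Error creating event source mapping: ' + err);
    else console.log('Created event source mapping: ' + data.UUID);
});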

I click the Next button and land on the Configure function screen. I name my function ESEmployeeLoad and write it in Node.js 4.3.

The Lambda function code is as follows:

var AWS = require('aws-sdk');
var path = require('path');

//Object for all the ElasticSearch Domain Info
var esDomain = {
    region: process.env.RegionForES,
    endpoint: process.env.EndpointForES,
    index: process.env.IndexForES,
    doctype: 'onboardingrecords'
};
//AWS Endpoint from created ES Domain Endpoint
var endpoint = new AWS.Endpoint(esDomain.endpoint);
//The AWS credentials are picked up from the environment.
var creds = new AWS.EnvironmentCredentials('AWS');

console.log('Loading function');
exports.handler = (event, context, callback) => {
    //console.log('Received event:', JSON.stringify(event, null, 2));
    console.log(JSON.stringify(esDomain));
    
    event.Records.forEach((record) => {
        console.log(record.eventID);
        console.log(record.eventName);
        console.log('DynamoDB Record: %j', record.dynamodb);
       
        var dbRecord = JSON.stringify(record.dynamodb);
        postToES(dbRecord, context, callback);
    });
};

function postToES(doc, context, lambdaCallback) {
    var req = new AWS.HttpRequest(endpoint);

    req.method = 'POST';
    req.path = path.join('/', esDomain.index, esDomain.doctype);
    req.region = esDomain.region;
    req.headers['presigned-expires'] = false;
    req.headers['Host'] = endpoint.host;
    req.body = doc;

    var signer = new AWS.Signers.V4(req , 'es');  // es: service code
    signer.addAuthorization(creds, new Date());

    var send = new AWS.NodeHttpClient();
    send.handleRequest(req, null, function(httpResp) {
        var respBody = '';
        httpResp.on('data', function (chunk) {
            respBody += chunk;
        });
        httpResp.on('end', function (chunk) {
            console.log('Response: ' + respBody);
            lambdaCallback(null,'Lambda added document ' + doc);
        });
    }, function(err) {
        console.log('Error: ' + err);
        lambdaCallback('Lambda failed with error ' + err);
    });
}

The Lambda function environment variables referenced in the code are RegionForES, EndpointForES, and IndexForES, which hold the AWS Region, the Amazon Elasticsearch Service domain endpoint, and the target index name, respectively.
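
If you prefer to set these variables from code rather than in the console, the following sketch uses the AWS SDK for JavaScript; the Region, endpoint, and index values are assumed placeholders.

var AWS = require('aws-sdk');
var lambda = new AWS.Lambda({ region: 'us-east-1' });  // assumed Region placeholder

lambda.updateFunctionConfiguration({
    FunctionName: 'ESEmployeeLoad',
    Environment: {
        Variables: {
            RegionForES: 'us-east-1',                                           // placeholder Region
            EndpointForES: 'search-mydomain-abc123.us-east-1.es.amazonaws.com', // placeholder domain endpoint
            IndexForES: 'onboarding'                                            // placeholder index name
        }
    }
}, function(err, data) {
    if (err) console.log('Error updating configuration: ' + err);
    else console.log('Environment variables set for ' + data.FunctionName);
});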

I select the Existing role option and choose the ESOnboardingSystem IAM role I created earlier.

After completing the IAM role configuration for the Lambda function, I can review the function details and complete the creation of the ESEmployeeLoad function.

I have completed the process of building my Lambda function to talk to Elasticsearch, and now I test the function by simulating data changes to my database.

Now my function, ESEmployeeLoad, executes whenever the data in my onboarding system's database changes. Additionally, I can review how the Lambda function delivered records to Elasticsearch by reviewing the CloudWatch logs.

Now I can alter my Lambda function to take advantage of the new features, or go directly to Elasticsearch and use the new ingest node capability. An example of this would be to implement an ingest pipeline for my employee record documents, as sketched below.
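
For example, a simple ingest pipeline can be defined with a PUT request to the domain's _ingest/pipeline endpoint (shown here in Kibana Console style). The pipeline name, field, and processor below are illustrative assumptions, not part of the original walkthrough.

PUT _ingest/pipeline/onboarding-pipeline
{
  "description": "Tag incoming employee records with their source",
  "processors": [
    {
      "set": {
        "field": "record_source",
        "value": "onboarding-system"
      }
    }
  ]
}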

I can replicate this function to handle badge updates to the employee record, or apply other preprocessors to the employee data. For instance, if I wanted to search the data based on a field in the Elasticsearch document, I could use the Search API to retrieve matching records from the dataset, as in the example below.
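
As an illustration only, a Search API request against the index this function writes to might look like the following; the index name is the placeholder from the environment variables, and the field and value are assumptions based on the New image stream records.

GET onboarding/onboardingrecords/_search
{
  "query": {
    "match": {
      "NewImage.Username.S": "jdoe"
    }
  }
}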

The possibilities are endless, and you can get as creative as your data needs dictate while maintaining great performance.

Amazon Elasticsearch Service: Kibana 5.1

All Amazon Elasticsearch Service domains using Elasticsearch 5.1 are bundled with Kibana 5.1, the latest version of the open-source visualization tool.

The companion visualization and analytics platform, Kibana, has also been enhanced in the 5.1 release. Kibana is used to view, search, and interact with Elasticsearch data through a myriad of charts, tables, and maps, and it performs advanced analysis of large volumes of data. Key enhancements of this Kibana release are as follows:

  • New visualization design: updated color scheme and better use of screen real estate
  • Timelion: visualization tool with a time-based query DSL
  • Console: formerly known as Sense, now part of the core, providing free-form requests to Elasticsearch
  • Scripted field language: ability to use the new Painless scripting language in the Elasticsearch cluster
  • Tag Cloud Visualization: 5.1 adds a word-based graphical view of data, sized by importance
  • More charts: return of previously removed charts and addition of an advanced view for X-Pack
  • Profiler UI: an enhancement to the Profile API with a tree view
  • Rendering performance improvements: Discover performance fixes and decreased CPU load

Summary

As you can see, this release is expansive, with many enhancements to help customers build Elasticsearch solutions. Amazon Elasticsearch Service now supports 15 new Elasticsearch APIs and 6 new plugins for Elasticsearch 5.1.

You can read more about the supported operations for Elasticsearch in the Amazon Elasticsearch Service Developer Guide, and you can get started by visiting the Amazon Elasticsearch Service website or by signing in to the AWS Management Console.

Tara

 

AWS Organizations – Policy-Based Management for Multiple AWS Accounts

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-organizations-policy-based-management-for-multiple-aws-accounts/

Over the years I have found that many of our customers are managing multiple AWS accounts. This situation can arise for several reasons. Sometimes they adopt AWS incrementally and organically, with individual teams and divisions making the move to cloud computing on a decentralized basis. Other companies grow through mergers and acquisitions and take on responsibility for existing accounts. Still others routinely create multiple accounts in order to meet strict guidelines for compliance or to create a very strong isolation barrier between applications, sometimes going so far as to use distinct accounts for development, testing, and production.

As these accounts proliferate, our customers find that they would like to manage them in a scalable fashion. Instead of dealing with a multitude of per-team, per-division, or per-application accounts, they have asked for a way to define access control policies that can be easily applied to all, some, or individual accounts. In many cases, these customers are also interested in additional billing and cost management, and would like to be able to control how AWS pricing benefits such as volume discounts and Reserved Instances are applied to their accounts.

AWS Organizations Emerges from Preview
To support this increasingly important use case, we are moving AWS Organizations from Preview to General Availability today. You can use Organizations to centrally manage multiple AWS accounts, with the ability to create a hierarchy of Organizational Units (OUs), assign each account to an OU, define policies, and then apply them to the entire hierarchy, to select OUs, or to specific accounts. You can invite existing AWS accounts to join your organization and you can also create new accounts. All of these functions are available from the AWS Management Console, the AWS Command Line Interface (CLI), and through the AWS Organizations API.

Here are some important terms and concepts that will help you to understand Organizations (this assumes that you are the all-powerful, overall administrator of your organization’s AWS accounts, and that you are responsible for the Master account):

An Organization is a consolidated set of AWS accounts that you manage. Newly-created Organizations offer the ability to implement sophisticated, account-level controls such as Service Control Policies. This allows Organization administrators to manage lists of allowed and blocked AWS API functions and resources that place guard rails on individual accounts. For example, you could give your advanced R&D team access to a wide range of AWS services, and then be a bit more cautious with your mainstream development and test accounts. Or, on the production side, you could allow access only to AWS services that are eligible for HIPAA compliance.

Some of our existing customers use a feature of AWS called Consolidated Billing. This allows them to select a Payer Account which rolls up account activity from multiple AWS Accounts into a single invoice and provides a centralized way of tracking costs. With this launch, current Consolidated Billing customers now have an Organization that provides all the capabilities of Consolidated Billing, but by default does not have the new features (like Service Control Policies) we’re making available today. These customers can easily enable the full features of AWS Organizations. This is accomplished by first enabling the use of all AWS Organization features from the Organization’s master account and then having each member account authorize this change to the Organization. Finally, we will continue to support creating new Organizations that support only the Consolidated Billing capabilities. Customers that wish to only use the centralized billing features can continue to do so, without allowing the master account administrators to enforce the advanced policy controls on member accounts in the Organization.

An AWS account is a container for AWS resources.

The Master account is the management hub for the Organization and is also the payer account for all of the AWS accounts in the Organization. The Master account can invite existing accounts to join the Organization, and can also create new accounts.

Member accounts are the non-Master accounts in the Organization.

An Organizational Unit (OU) is a container for a set of AWS accounts. OUs can be arranged into a hierarchy that can be up to five levels deep. The top of the hierarchy of OUs is also known as the Administrative Root.

A Service Control Policy (SCP) is a set of controls that the Organization’s Master account can apply to the Organization, selected OUs, or to selected accounts. When applied to an OU, the SCP applies to the OU and to any other OUs beneath it in the hierarchy. The SCP or SCPs in effect for a member account specify the permissions that are granted to the root user for the account. Within the account, IAM users and roles can be used as usual. However, regardless of how permissive the user or the role might be, the effective set of permissions will never extend beyond what is defined in the SCP. You can use this to exercise fine-grained control over access to AWS services and API functions at the account level.

An Invitation is used to ask an AWS account to join an Organization. It must be accepted within 15 days, and can be extended via email address or account ID. Up to 20 Invitations can be outstanding at any given time. The invitation-based model allows you to start from a Master account and then bring existing accounts into the fold. When an Invitation is accepted, the account joins the Organization and all applicable policies become effective. Once the account has joined the Organization, you can move it to the proper OU.

AWS Organizations is appropriate when you want to create strong isolation boundaries between the AWS accounts that you manage. However, keep in mind that AWS resources (EC2 instances, S3 buckets, and so forth) exist within a particular AWS account and cannot be moved from one account to another. You do have access to many different cross-account AWS features, including VPC peering, AMI sharing, EBS snapshot sharing, RDS snapshot sharing, cross-account email sending, delegated access via IAM roles, cross-account S3 bucket permissions, and cross-account access in the AWS Management Console.

Like consolidated billing, AWS Organizations also provides several benefits when it comes to the use of EC2 and RDS Reserved Instances. For billing purposes, all of the accounts in the Organization are treated as if they are one account and can receive the hourly cost benefit of an RI purchased by any other account in the same Organization (in order for this benefit to be applied as expected, the Availability Zone and other attributes of the RI must match the attributes of the EC2 or RDS instance).

Creating an Organization
Let’s create an Organization from the Console, create some Organizational Units, and then create some accounts. I start by clicking on Create organization:

Then I choose ENABLE ALL FEATURES and click on Create organization:

My Organization is ready in seconds:

I can create a new account by clicking on Add account, and then selecting Create account:

Then I supply the details (the IAM role is created in the new account and grants enough permissions for the account to be customized after creation):

Here’s what the console looks like after I have created Dev, Test, and Prod accounts:

At this point all of the accounts are at the top of the hierarchy:

In order to add some structure, I click on Organize accounts, select Create organizational unit (OU), and enter a name:

I do the same for a second OU:

Then I select the Prod account, click on Move accounts, and choose the Operations OU:

Next, I move the Dev and Test accounts into the Development OU:

At this point I have four accounts (my original one plus the three that I just created) and two OUs. The next step is to create one or more Service Control Policies by clicking on Policies and selecting Create policy. I can use the Policy Generator or I can copy an existing SCP and then customize it. I’ll use the Policy Generator. I give my policy a name and make it an Allow policy:

Then I use the Policy Generator to construct a policy that allows full access to EC2 and S3, and the ability to run (invoke) Lambda functions:
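
A sketch of the kind of SCP the Policy Generator produces for this selection is shown below; treat it as illustrative rather than the exact generated output.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:*",
        "s3:*",
        "lambda:InvokeFunction"
      ],
      "Resource": "*"
    }
  ]
}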

Remember that this policy defines the full set of allowable actions within the account. In order to allow IAM users within the account to use these actions, I would still need to create suitable IAM policies and attach them to the users (all within the member account). I click on Create policy and my policy is ready:

Then I create a second policy for development and testing. This one also allows access to AWS CodeCommit, AWS CodeBuild, AWS CodeDeploy, and AWS CodePipeline:

Let’s recap. I have created my accounts and placed them into OUs. I have created a policy for the OUs. Now I need to enable the use of policies, and attach the policy to the OUs. To enable the use of policies, I click on Organize accounts and select Home (this is not the same as the root because Organizations was designed to support multiple, independent hierarchies), and then click on the checkbox in the Root OU. Then I look to the right, expand the Details section, and click on Enable:

Ok, now I can put all of the pieces together! I click on the Root OU to descend into the hierarchy, and then click on the checkbox in the Operations OU. Then I expand the Control Policies section on the right and click on Attach policy:

Then I locate the OperationsPolicy and click on Attach:

Finally, I remove the FullAWSAccess policy:

I can also attach the DevTestPolicy to the Development OU.

All of the operations that I described above could have been initiated from the AWS Command Line Interface (CLI) or by making calls to functions such as CreateOrganization, CreateAccount, CreateOrganizationalUnit, MoveAccount, CreatePolicy, AttachPolicy, and InviteAccountToOrganization. To see the CLI in action, read Announcing AWS Organizations: Centrally Manage Multiple AWS Accounts.
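
As a rough sketch of the API-driven path using the AWS SDK for JavaScript, the calls below mirror the console steps; the OU name, policy content, and the parent, policy, and target IDs are placeholders, and error handling is kept minimal for brevity.

var AWS = require('aws-sdk');
var organizations = new AWS.Organizations({ region: 'us-east-1' });  // the Organizations endpoint lives in US East (N. Virginia)

// Create an organization with all features (including SCPs) enabled
organizations.createOrganization({ FeatureSet: 'ALL' }, function(err, data) {
    console.log(err || 'Organization ID: ' + data.Organization.Id);
});

// Create an OU under the root (placeholder parent ID)
organizations.createOrganizationalUnit({ ParentId: 'r-examplerootid', Name: 'Operations' }, function(err, data) {
    console.log(err || 'OU ID: ' + data.OrganizationalUnit.Id);
});

// Create a Service Control Policy and attach it to the OU (placeholder IDs and illustrative content)
var scpContent = JSON.stringify({
    Version: '2012-10-17',
    Statement: [{ Effect: 'Allow', Action: ['ec2:*', 's3:*', 'lambda:InvokeFunction'], Resource: '*' }]
});
organizations.createPolicy({
    Content: scpContent,
    Description: 'Allow EC2, S3, and Lambda invocation',
    Name: 'OperationsPolicy',
    Type: 'SERVICE_CONTROL_POLICY'
}, function(err, data) {
    console.log(err || 'Policy ID: ' + data.Policy.PolicySummary.Id);
});

organizations.attachPolicy({ PolicyId: 'p-examplepolicyid', TargetId: 'ou-exampleouid' }, function(err, data) {
    console.log(err || 'Policy attached');
});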

Best Practices for Use of AWS Organizations
Before I wrap up, I would like to share some best practices for the use of AWS Organizations:

Master Account – We recommend that you keep the Master account free of any operational AWS resources (with one exception). In addition to making it easier for you to make high-quality control decisions, this practice will make it easier for you to understand the charges on your AWS bill.

CloudTrail – Use AWS CloudTrail (this is the exception) in the Master Account to centrally track all AWS usage in the Member accounts.

Least Privilege – When setting up policies for your OUs, assign as few privileges as possible.

Organizational Units – Assign policies to OUs rather than to accounts. This will allow you to maintain a better mapping between your organizational structure and the level of AWS access needed.

Testing – Test new and modified policies on a single account before scaling up.

Automation – Use the APIs and an AWS CloudFormation template to ensure that every newly created account is configured to your liking. The template can create IAM users, roles, and policies. It can also set up logging, create and configure VPCs, and so forth.

Learning More
Here are some resources that will help you to get started with AWS Organizations:

Things to Know
AWS Organizations is available today in all AWS regions except China (Beijing) and AWS GovCloud (US), and it is available to you at no charge (to be a bit more precise, the service endpoint is located in US East (Northern Virginia) and the SCPs apply across all relevant regions). All of the accounts must be from the same seller; you cannot mix AWS and AISPL (the local legal Indian entity that acts as a reseller for AWS services in India) accounts in the same Organization.

We have big plans for Organizations, and are currently thinking about adding support for multiple payers, control over allocation of Reserved Instance discounts, multiple hierarchies, and other control policies. As always, your feedback and suggestions are welcome.

Jeff;

SAML Identity Federation: Follow-Up Questions, Materials, Guides, and Templates from an AWS re:Invent 2016 Workshop (SEC306)

Post Syndicated from Quint Van Deman original https://aws.amazon.com/blogs/security/saml-identity-federation-follow-up-questions-materials-guides-and-templates-from-an-aws-reinvent-2016-workshop-sec306/

As part of the re:Source Mini Con for Security Services at AWS re:Invent 2016, we conducted a workshop focused on Security Assertion Markup Language (SAML) identity federation: Choose Your Own SAML Adventure: A Self-Directed Journey to AWS Identity Federation Mastery. As part of this workshop, attendees were able to submit their own federation-focused questions to a panel of AWS experts. In this post, I share the questions and answers from that workshop because this information can benefit any AWS customer interested in identity federation.

I have also made available the full set of workshop materials, lab guides, and AWS CloudFormation templates. I encourage you to use these materials to enrich your exploration of SAML for use with AWS.

Q: SAML assertions are limited to 50,000 characters. We often hit this limit by being in too many groups. What can AWS do to resolve this size-limit problem?

A: Because the SAML assertion is ultimately part of an API call, an upper bound must be in place for the assertion size.

On the AWS side, your AWS solution architect can log a feature request on your behalf to increase the maximum size of the assertion in a future release. The AWS service teams use these feature requests, in conjunction with other avenues of customer feedback, to plan and prioritize the features they deliver. To facilitate this process you need two things: the proposed higher value to which you’d like to see the maximum size raised, and a short written description that would help us understand what this increased limit would enable you to do.

On the AWS customer’s side, we often find that these cases are most relevant to centralized cloud teams that have broad, persistent access across many roles and accounts. This access is often necessary to support troubleshooting or simply as part of an individual’s job function. However, in many cases, exchanging persistent access for just-in-time access enables the same level of access but with better levels of visibility, reduced blast radius, and better adherence to the principle of least privilege. For example, you might implement a fast, efficient, and monitored workflow that allows you to provision a user into the necessary backend directory group for a short duration when needed in lieu of that user maintaining all of those group memberships on a persistent basis. This approach could effectively resolve the limit issue you are facing.

Q: Can we use OpenID Connect (OIDC) for federated authentication and authorization into the AWS Management Console? If so, does it have a similar size limit?

A: Currently, AWS support for OIDC is oriented around providing access to AWS resources from mobile or web applications, not access to the AWS Management Console. This is possible to do, but it requires the construction of a custom identity broker. In this solution, the broker would consume the OIDC identity, use its own logic to authenticate and authorize the user (thus being subject only to any size limits you enforce on the OIDC side), and use the sts:GetFederationToken call to vend the user an AWS Security Token Service (STS) token for either AWS Management Console use or API/CLI use. During this sts:GetFederationToken call, you attach a scoping policy with a limit of 2,048 characters. See Creating a URL that Enables Federated Users to Access the AWS Management Console (Custom Federation Broker) for additional details about custom identity brokers.
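
As a rough sketch of the broker's token-vending step using the AWS SDK for JavaScript (the federated user name and the scoping policy below are illustrative assumptions):

var AWS = require('aws-sdk');
var sts = new AWS.STS();

// Scoping policy attached by the broker; limited to 2,048 characters
var scopingPolicy = JSON.stringify({
    Version: '2012-10-17',
    Statement: [{ Effect: 'Allow', Action: 's3:ListBucket', Resource: '*' }]  // illustrative only
});

sts.getFederationToken({
    Name: 'oidc-federated-user',   // placeholder federated user name
    Policy: scopingPolicy,
    DurationSeconds: 3600
}, function(err, data) {
    if (err) console.log('Error: ' + err);
    else console.log('Temporary credentials expire at ' + data.Credentials.Expiration);
});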

Q: We want to eliminate permanent AWS Identity and Access Management (IAM) access keys, but we cannot do so because of third-party tools. We are contemplating using HashiCorp Vault to vend permanent keys. Vault lets us tie keys to LDAP identities. Have you seen this work elsewhere? Do you think it will work for us?

A: For third-party tools that can run within Amazon EC2, you should use EC2 instance profiles to eliminate long-term credentials and their associated management (distribution, rotation, etc.). For third-party tools that cannot run within EC2, most customers opt to leverage their existing secrets-management tools and processes for the long-lived keys. These tools are often enhanced to make use of AWS APIs such as iam:GetCredentialReport (rotation information) and iam:{Create,Update,Delete}AccessKey (rotation operations). HashiCorp Vault is a popular tool with an available AWS Quick Start Reference Deployment, but any secrets management platform that is able to efficiently fingerprint the authorized resources and is extensible to work with the previously mentioned APIs will fill this need for you nicely.

Q: Currently, we use an Object Graph Navigation Library (OGNL) script in our identity provider (IdP) to build role Amazon Resource Names (ARNs) for the role attribute in the SAML assertion. The script consumes a list of distribution list display names from our identity management platform of which the user is a member. There is a 60-character limit on display names, which leaves no room for IAM pathing (which has a 512-character limit). We are contemplating a change. The proposed solution would make AWS API calls to get role ARNs from the AWS APIs. Have you seen this before? Do you think it will work? Does the AWS SAML integration support full-length role ARNs that would include up to 512 characters for the IAM path?

A: In most cases, we recommend that you use regular expression-based transformations within your IdP to translate a list of group names to a list of role ARNs for inclusion in the SAML assertion. Without pathing, you need to know the 12-digit AWS account number and the role name in order to be able to do so, which is accomplished using our recommended group-naming convention (AWS-Acct#-RoleName). With pathing (because “/” is not a valid character for group names within most directories), you need an additional source from which to pull this third data element. This could be as simple as an extra dimension in the group name (such as AWS-Acct#-Path-RoleName); however, that would multiply the number of groups required to support the solution. Instead, you would most likely derive the path element from a user attribute, dynamic group information, or even an external information store. It should work as long as you can reliably determine all three data elements for the user.
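
As an illustration of the group-name convention described above (the sample group names and the path argument are assumptions), a simple regular expression-based mapping in JavaScript might look like this:

// Map directory group names such as "AWS-123456789012-PowerUser" to role ARNs,
// optionally inserting an IAM path drawn from another attribute or data source.
function groupToRoleArn(groupName, path) {
    var match = groupName.match(/^AWS-(\d{12})-(.+)$/);
    if (!match) {
        return null;  // not an AWS role group
    }
    var accountId = match[1];
    var roleName = match[2];
    var rolePath = path ? path.replace(/^\/|\/$/g, '') + '/' : '';
    return 'arn:aws:iam::' + accountId + ':role/' + rolePath + roleName;
}

console.log(groupToRoleArn('AWS-123456789012-PowerUser'));            // arn:aws:iam::123456789012:role/PowerUser
console.log(groupToRoleArn('AWS-123456789012-Analyst', '/finance/')); // arn:aws:iam::123456789012:role/finance/Analyst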

We do not recommend drawing the path information from AWS APIs, because the logic within the IdP is authoritative for the user’s authorization. In other words, the IdP should know for which full path role ARNs the user is authorized without asking AWS. You might consider using the AWS APIs to validate that the role actually exists, but that should really be an edge case. This is because you should integrate any automation that builds and provisions the roles with the frontend authorization layer. This way, there would never be a case in which the IdP authorizes a user for a role that doesn’t exist. AWS SAML integration supports full-length role ARNs.

Q: Why do AWS STS tokens contain a session token? This makes them incompatible with third-party tools that only support permanent keys. Is there a way to get rid of the session token to make temporary keys that contain only the access_key and the secret_key components?

A: The session token contains information that AWS uses to confirm that the AWS STS token is valid. There is no way to create a token that does not contain this third component. Instead, the preferred AWS mechanisms for distributing AWS credentials to third-party tools are EC2 instance profiles (in EC2), IAM cross-account trust (SaaS), or IAM access keys with secrets management (on-premises). Using IAM access keys, you can rotate the access keys as often as the third-party tool and your secrets management platform allow.

Q: How can we use SAML to authenticate and authorize code instead of having humans do the work? One proposed solution is to use our identity management (IDM) platform to generate X.509 certificates for identities, and then present these certificates to our IdP in order to get valid SAML assertions. This could then be included in an sts:AssumeRoleWithSAML call. Have you seen this working before? Do you think it will work for us?

A: Yes, when you receive a SAML assertion from your IdP by using your desired credential form (user name/password, X.509, etc.), you can use the sts:AssumeRoleWithSAML call to retrieve an AWS STS token. See How to Implement a General Solution for Federated API/CLI Access Using SAML 2.0 for a reference implementation.
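
A minimal sketch of that call with the AWS SDK for JavaScript follows; the role ARN, SAML provider ARN, and the source of the assertion are placeholders, and the assertion must be the base64-encoded value received from your IdP.

var AWS = require('aws-sdk');
var sts = new AWS.STS();

// Placeholder source for the base64-encoded SAML assertion from your IdP
var samlAssertion = process.env.SAML_ASSERTION;

sts.assumeRoleWithSAML({
    RoleArn: 'arn:aws:iam::123456789012:role/ExampleFederatedRole',      // placeholder role ARN
    PrincipalArn: 'arn:aws:iam::123456789012:saml-provider/ExampleIdP',  // placeholder SAML provider ARN
    SAMLAssertion: samlAssertion,
    DurationSeconds: 3600
}, function(err, data) {
    if (err) console.log('Error: ' + err);
    else console.log('Assumed role; access key ID: ' + data.Credentials.AccessKeyId);
});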

Q: As a follow-up to the previous question, how can we get code using multi-factor authentication (MFA)? There is a gauth project that uses NodeJS to generate virtual MFAs. Code could theoretically get MFA codes from the NodeJS gauth server.

A: The answer depends on your choice of IdP and MFA provider. Generically speaking, you need to authenticate (either web-based or code-based) to the IdP using all of your desired factors before the SAML assertion is generated. This assertion can then include details of the authentication mechanism used as an additional attribute in role-assumption conditions within the trust policy in AWS. This lab guide from the workshop provides further details and a how-to guide for MFA-for-SAML.

This blog post clarifies some re:Invent 2016 attendees’ questions about SAML-based federation with AWS. For more information presented in the workshop, see the full set of workshop materials, lab guides, and CloudFormation templates. If you have follow-up questions, start a new thread in the IAM forum.

– Quint