Tag Archives: Conditions

How to use customer managed policies in AWS IAM Identity Center for advanced use cases

Post Syndicated from Ron Cully original https://aws.amazon.com/blogs/security/how-to-use-customer-managed-policies-in-aws-single-sign-on-for-advanced-use-cases/

Are you looking for a simpler way to manage permissions across all your AWS accounts? Perhaps you federate your identity provider (IdP) to each account and divide permissions and authorization between cloud and identity teams, but want a simpler administrative model. Maybe you use AWS IAM Identity Center (successor to AWS Single Sign-On) but are running out of room in your permission set policies, or need a way to keep the role models you have while tailoring the policies in each account to reference their specific resources. Or perhaps you are considering IAM Identity Center as an alternative to per-account federation, but need a way to reuse the customer managed policies that you have already created. Great news! Now you can use customer managed policies (CMPs) and permissions boundaries (PBs) to help with these more advanced situations.

In this blog post, we explain how you can use CMPs and PBs with IAM Identity Center to address these considerations. We describe how IAM Identity Center works, how these types of policies work with IAM Identity Center, and how to best use CMPs and PBs with IAM Identity Center. We also show you how to configure and use CMPs in your IAM Identity Center deployment.

IAM Identity Center background

With IAM Identity Center, you can centrally manage access to multiple AWS accounts and business applications, while providing your workplace users a single sign-on experience with your choice of identity system. Rather than manage identity in each account individually, IAM Identity Center provides one place to connect an existing IdP, Microsoft Active Directory Domain Services (AD DS), or workforce users that you create directly in AWS. Because IAM Identity Center integrates with AWS Organizations, it also provides a central place to define your roles, assign them to your users and groups, and give your users a portal where they can access their assigned accounts.

With IAM Identity Center, you manage access to accounts by creating and assigning permission sets. These are AWS Identity and Access Management (IAM) role templates that define (among other things) which policies to include in a role. If you’re just getting started, you can attach AWS managed policies to the permission set. These policies, created by AWS service teams, enable you to get started without having to learn how to author IAM policies in JSON.

For more advanced cases, where you are unable to express policies sufficiently using AWS managed policies, you can create a custom policy in the permission set. When you assign a permission set to users or groups in a specified account, IAM Identity Center creates a role from the template and then controls single sign-on access to the role. During role creation, IAM Identity Center attaches any specified AWS managed policies, and adds any custom policy to the role as an inline policy. These custom policies must fit within the 10,240-character IAM quota for inline policies.
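To gauge how close a custom policy is to that quota, a rough check along the following lines may help. This is a sketch rather than an IAM tool: the function names and the sample policy are illustrative, and it assumes that white space is excluded from the count, as the IAM quota documentation describes.

```python
import json

INLINE_POLICY_QUOTA = 10_240  # character quota cited in the text


def policy_size(policy: dict) -> int:
    """Approximate size: characters of the JSON text, white space excluded."""
    compact = json.dumps(policy, separators=(",", ":"))
    return sum(1 for ch in compact if not ch.isspace())


def fits_inline_quota(policy: dict) -> bool:
    """True when the approximate size is within the inline policy quota."""
    return policy_size(policy) <= INLINE_POLICY_QUOTA


# Illustrative policy only; it is not taken from the post.
sample = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "*"}
    ],
}
print(policy_size(sample), fits_inline_quota(sample))
```

A policy that fails this check is a candidate for moving into one or more CMPs.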

IAM provides two other types of custom policies that increase flexibility when managing access in AWS accounts. Customer managed policies (CMPs) are standalone policies that you create and can attach to roles in your AWS accounts to grant or deny access to AWS resources. Permissions boundaries (PBs) provide an advanced feature that specifies the maximum permissions that a role can have. For both CMPs and PBs, you create the custom policy in your account and then attach it to roles. IAM Identity Center now supports attaching both of these to permission sets so you can handle cases where AWS managed policies and inline policies may not be enough.

How CMPs and PBs work with IAM Identity Center

Although you can create IAM users to manage access to AWS accounts and resources, AWS recommends that you use roles instead of IAM users for this purpose. Roles act as an identity (sometimes called an IAM principal), and you assign permissions (identity-based policies) to the role. If you use the AWS Management Console or the AWS Command Line Interface to assume a role, you get the permissions of the role that you assumed. With its simpler way to maintain your users and groups in one AWS location and its ability to centrally manage and assign roles, AWS recommends that you use IAM Identity Center to manage access to your AWS accounts.

With this new IAM Identity Center release, you have the option to specify the names of CMPs and one PB in your permission set (role definition). Doing so modifies how IAM Identity Center provisions roles into accounts. When you assign a user or group to a permission set, IAM Identity Center checks the target account to verify that all specified CMPs and the PB are present. If they are all present, IAM Identity Center creates the role in the account and attaches the specified policies. If any of the specified CMPs or the PB are missing, IAM Identity Center fails the role creation.
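The provisioning rule described above can be modeled in a few lines. The sketch below is not IAM Identity Center code; it only mirrors the described behavior, and the function names and the `OpsBoundary` policy name are hypothetical.

```python
from typing import Optional, Set


def missing_policies(required: Set[str], present: Set[str]) -> Set[str]:
    """Policy names the permission set references but the account lacks."""
    return required - present


def can_provision(cmp_names: Set[str],
                  boundary: Optional[str],
                  account_policies: Set[str]) -> bool:
    """Role creation succeeds only if every CMP and the PB exist by name."""
    required = set(cmp_names)
    if boundary is not None:
        required.add(boundary)
    return not missing_policies(required, account_policies)


# Hypothetical account contents: the CMP exists, the boundary does not.
account = {"AllowCloudWatchForOperations"}
print(can_provision({"AllowCloudWatchForOperations"}, None, account))
print(can_provision({"AllowCloudWatchForOperations"}, "OpsBoundary", account))
```

Running a check like this against each target account before assignment would surface missing policies early, instead of as a failed role creation.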

This all sounds simple enough, but there are important implications to consider.

If you modify the permission set, IAM Identity Center updates the corresponding roles in all accounts to which you assigned the permission set. What is different when using CMPs and PBs is that IAM Identity Center is uninvolved in the creation or maintenance of the CMPs or PBs. It’s your responsibility to make sure that the CMPs and PBs are created and managed in all of the accounts to which you assign permission sets that use the CMPs and PBs. This means that you must be careful in how you name, create, and maintain these policies in your accounts, to avoid unintended consequences. For example, if you do not apply changes to CMPs consistently across all your accounts, the behavior of an IAM Identity Center created role will vary between accounts.

What CMPs do for you

By using CMPs with permission sets, you gain four main benefits:

  1. If you federate to your accounts directly and have CMPs already, you can reuse your CMPs with permission sets in IAM Identity Center. We describe exceptions later in this post.
  2. If you are running out of space in your permission set inline policies, you can add CMPs to increase the aggregate size of your policies.
  3. Policies often need to refer to account-specific resources by Amazon Resource Name (ARN). Designing an inline policy that does this correctly across all your accounts can be challenging and, in some cases, may not be possible. By specifying a CMP in a permission set, you can tailor the CMPs in each of your accounts to reference the resources of the account. When IAM Identity Center creates the role and attaches the CMPs of the account, the policies used by the IAM Identity Center–generated role are now specific to the account. We highlight this example later in this post.
  4. You get the benefit of a central location to define your roles, which gives you visibility of all the policies that are in use across the accounts where you assigned permission sets. This enables you to have a list of CMP and PB names that you should monitor for change across your accounts. This helps you ensure that you are maintaining your policies correctly.

Considerations and best practices

Start simple, avoid complex – If you’re just starting out, try using AWS managed policies first. With managed policies, you don’t need to know JSON policy to get started. If you need more advanced policies, start by creating identity-based inline custom policies in the permission set. These policies are provisioned as inline policies, and they will be identical in all your accounts. If you need larger policies or more advanced capabilities, use CMPs as your next option. In most cases, you can accomplish what you need with inline and customer managed policies. When you can’t achieve your objective using CMPs, use PBs. For information about intended use cases for PBs, see the blog post When and where to use IAM permissions boundaries.

Permissions boundaries don’t constrain IAM Identity Center admins who create permission sets – IAM Identity Center administrators (your staff) that you authorize to create permission sets can create inline policies and attach CMPs and PBs to permission sets, without restrictions. Permissions boundary policies set the maximum permissions of a role and the maximum permissions that the role can grant within an account through IAM only. For example, PBs can set the maximum permissions of a role that uses IAM to create other roles for use by code or services. However, a PB doesn’t set maximum permissions of the IAM Identity Center permission set creator. What does that mean? Suppose you created an IAM Identity Center Admin permission set that has a PB attached, and you assigned it to John Doe. John Doe can then sign in to IAM Identity Center and modify permission sets with any policy, regardless of what you put in the PB. The PB doesn’t restrict the policies that John Doe can put into a permission set.

In short, use PBs only for roles that need to create IAM roles for use by code or services. Don’t use PBs for permission sets that authorize IAM Identity Center admins who create permission sets.
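As a reminder of what a PB does inside an account, the effective permissions of a role are the intersection of what its identity-based policies allow and what the boundary allows. The sketch below is a deliberately simplified model: it treats permissions as plain sets of action names and ignores explicit denies, wildcards, resources, and conditions. The action sets are illustrative.

```python
def effective_actions(identity_allows: set, boundary_allows: set) -> set:
    """Actions allowed by both the identity-based policies and the boundary."""
    return identity_allows & boundary_allows


# Illustrative action sets, not taken from the post.
identity_policy = {"iam:CreateRole", "s3:GetObject", "ec2:RunInstances"}
boundary = {"iam:CreateRole", "s3:GetObject"}

print(sorted(effective_actions(identity_policy, boundary)))
```

Nothing in this model touches the permission set definition itself, which is why a PB cannot limit what an IAM Identity Center administrator puts into a permission set.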

Create and use a policy naming plan – IAM Identity Center doesn’t consider the content of a named policy that you attach to a permission set. If you assign a permission set in multiple accounts, make sure that all referenced policies have the same intent. Failure to do this will result in unexpected and inconsistent role behavior between different accounts. Imagine a CMP named “S3” that grants S3 read access in account A, and another CMP named “S3” that grants S3 administrative permissions over all S3 buckets in account B. A permission set that attaches the S3 policy and is assigned in accounts A and B will be confusing at best, because the level of access is quite different in each of the accounts. It’s better to have more specific names, such as “S3Reader” and “S3Admin,” for your policies and ensure they are identical except for the account-specific resource ARNs.
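One way to check the "identical except for the account-specific resource ARNs" property is to mask account IDs and compare the remaining text. The following is a minimal sketch under the assumption that account IDs are the only 12-digit numbers in the policies; the function names and sample policies are hypothetical.

```python
import json
import re

ACCOUNT_ID = re.compile(r"\b\d{12}\b")  # assumption: only account IDs match


def normalize(policy: dict) -> str:
    """Serialize with sorted keys and mask every 12-digit account ID."""
    text = json.dumps(policy, sort_keys=True)
    return ACCOUNT_ID.sub("<account-id>", text)


def identical_apart_from_account_id(policy_a: dict, policy_b: dict) -> bool:
    return normalize(policy_a) == normalize(policy_b)


def make_policy(account_id: str, action: str) -> dict:
    """Build a small illustrative policy for one account."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": action,
            "Resource": f"arn:aws:logs:us-east-1:{account_id}:log-group:Ops:*",
        }],
    }


policy_a = make_policy("111122223333", "logs:GetLogEvents")
policy_b = make_policy("444455556666", "logs:GetLogEvents")
policy_c = make_policy("444455556666", "logs:DeleteLogGroup")

print(identical_apart_from_account_id(policy_a, policy_b))
print(identical_apart_from_account_id(policy_a, policy_c))
```

The comparison is textual, so it flags any difference beyond the account ID, including differences that happen to be harmless.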

Use automation to provision policies in accounts – Using tools such as AWS CloudFormation StackSets, or other infrastructure-as-code tools, can help ensure that naming and policies are consistent across your accounts. It also helps reduce the potential for administrators to modify policies in undesirable ways.

Policies must match the capabilities of IAM Identity Center – Although IAM Identity Center supports most IAM semantics, there are exceptions:

  1. If you use an identity provider as your identity source, IAM Identity Center passes only PrincipalTag attributes that come through SAML assertions to IAM. IAM Identity Center doesn’t process or forward other SAML assertions to IAM. If you have CMPs or PBs that rely on other information from SAML assertions, they won’t work. For example, IAM Identity Center doesn’t provide multi-factor authentication (MFA) context keys or SourceIdentity.
  2. Resource policies that reference role names or tags as part of trust policies don’t work with IAM Identity Center. You can use resource policies that use attribute-based access control (ABAC). IAM Identity Center role names are not static, and you can’t tag the roles that IAM Identity Center creates from its permission sets.
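Before reusing an existing CMP or PB, it can help to scan it for condition keys that fall under the first exception. The sketch below assumes that the affected keys are the global condition keys `aws:MultiFactorAuthPresent`, `aws:MultiFactorAuthAge`, and `aws:SourceIdentity`; that list is an interpretation of the text above, not an official one, and the function name is hypothetical.

```python
# Assumed list of keys, compared case-insensitively.
UNAVAILABLE_KEYS = {
    "aws:multifactorauthpresent",
    "aws:multifactorauthage",
    "aws:sourceidentity",
}


def unavailable_condition_keys(policy: dict) -> set:
    """Return condition keys in the policy that appear in the assumed list."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement is also valid JSON policy
        statements = [statements]
    found = set()
    for statement in statements:
        for operator_block in statement.get("Condition", {}).values():
            for key in operator_block:
                if key.lower() in UNAVAILABLE_KEYS:
                    found.add(key)
    return found


# Illustrative policy that depends on an MFA context key.
mfa_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "ec2:StopInstances",
        "Resource": "*",
        "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}},
    }],
}
print(unavailable_condition_keys(mfa_policy))
```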

How to use CMPs with permission sets

Now that you understand permission sets and how they work with CMPs and PBs, let’s take a look at how you can configure a permission set to use CMPs.

In this example, we show you how to use one or more permission sets that attach a CMP that enables Amazon CloudWatch operations to the log group of specified accounts. Specifically, the AllowCloudWatch_PermissionSet permission set attaches a CMP named AllowCloudWatchForOperations. When we assign the permission set in two separate accounts, the assigned users can perform CloudWatch operations against the log groups of the assigned account only. Because the CloudWatch operations policies are in CMPs rather than inline policies, the log groups can be account specific, and you can reuse the CMPs in other permission sets if you want to have CloudWatch operations available through multiple permission sets.

Note: For this blog post, we demonstrate using CMPs by utilizing the AWS Management Console to create policies and assignments. We recommend that after you learn how to do this, you create your policies through automation for production environments. For example, use AWS CloudFormation. The intent of this example is to demonstrate how you can have a policy in two separate accounts that refers to different resources, something that is harder to accomplish using inline policies. The use case itself is not that advanced, but the use of CMPs to have different resources referenced in each account is a more advanced idea. We kept this simple to make it easier to focus on the feature rather than the use case.


In this example, we assume that you know how to use the AWS Management Console, create accounts, navigate between accounts, and create customer managed policies. You also need administrative privileges to enable IAM Identity Center and to create policies in your accounts.

Before you begin, enable IAM Identity Center in your AWS Organizations management account in an AWS Region of your choice. You need to create at least two accounts within your AWS Organization. In this example, the account names are member-account and member-account-1. After you set up the accounts, you can optionally configure IAM Identity Center for administration in a delegated member account.

Configure an IAM Identity Center permission set to use a CMP

Follow these four procedures to use a CMP with a permission set:

  1. Create CMPs with consistent names in your target accounts
  2. Create a permission set that references the CMP that you created
  3. Assign groups or users to the permission set in accounts where you created CMPs
  4. Test your assignments

Step 1: Create CMPs with consistent names in your target accounts

In this step, you create a customer managed policy named AllowCloudWatchForOperations in two member accounts. The policy allows your cloud operations users to access a predefined CloudWatch log group in the account.

To create CMPs in your target accounts

  1. Sign in to AWS.

    Note: You can sign in to IAM Identity Center if you have existing permission sets that enable you to create policies in member accounts. Alternatively, you can sign in using IAM federation or as an IAM user that has access to roles that enable you to navigate to other accounts where you can create policies. Your sign-in should also give you access to a role that can administer IAM Identity Center permission sets.

  2. Navigate to an AWS Organizations member account.

    Note: If you signed in through IAM Identity Center, use the user portal page to navigate to the account and role. If you signed in by using IAM federation or as an IAM user, choose your sign-in name that is displayed in the upper right corner of the AWS Management Console and then choose switch role, as shown in Figure 1.

    Figure 1: Switch role for IAM user or IAM federation

  3. Open the IAM console.
  4. In the navigation pane, choose Policies.
  5. In the upper right of the page, choose Create policy.
  6. On the Create Policy page, choose the JSON tab.
  7. Paste the following policy into the JSON text box. Replace <account-id> with the ID of the account in which the policy is created.

    Tip: To copy your account number, choose your sign-in name that is displayed in the upper right corner of the AWS Management Console, and then choose the copy icon next to the account ID, as shown in Figure 2.

    Figure 2: Copy account number

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": [ ... ],
                "Effect": "Allow",
                "Resource": "arn:aws:logs:us-east-1:<account-id>:log-group:OperationsLogGroup:*"
            },
            {
                "Action": [ ... ],
                "Effect": "Allow",
                "Resource": "arn:aws:logs:us-east-1:<account-id>:log-group::log-stream:*"
            }
        ]
    }

  8. Choose Next: Tags, and then choose Next: Review.
  9. On the Create Policy/Review Policy page, in the Name field, enter AllowCloudWatchForOperations. This is the name that you will use when you attach the CMP to the permission set in the next procedure (Step 2).
  10. Repeat steps 1 through 9 in at least one other member account. Be sure to replace the <account-id> element in the policy with the account ID of each account where you create the policy. The only difference between the policies in each account is the <account-id> in the policy.
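The per-account substitution in step 7 can also be scripted, which reduces copy-and-paste mistakes. The following is a minimal sketch, not the authors' tooling: `build_policy` and the two account IDs are hypothetical, and the action lists in particular are assumptions, because the actions in the step 7 listing are not shown. Substitute the actions that your operations users actually need.

```python
import json

# Assumption: placeholder CloudWatch Logs actions chosen for illustration only.
ASSUMED_STREAM_ACTIONS = ["logs:DescribeLogStreams", "logs:GetLogEvents"]
ASSUMED_GROUP_ACTIONS = ["logs:DescribeLogGroups"]


def build_policy(account_id: str, region: str = "us-east-1") -> dict:
    """Return the policy document with <account-id> filled in for one account."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("account ID must be 12 digits")
    prefix = f"arn:aws:logs:{region}:{account_id}:log-group"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": ASSUMED_STREAM_ACTIONS,
                "Effect": "Allow",
                "Resource": f"{prefix}:OperationsLogGroup:*",
            },
            {
                "Action": ASSUMED_GROUP_ACTIONS,
                "Effect": "Allow",
                "Resource": f"{prefix}::log-stream:*",
            },
        ],
    }


# Placeholder account IDs; replace them with your member accounts.
for account in ("111122223333", "444455556666"):
    print(json.dumps(build_policy(account), indent=4))
```

Each printed document can be pasted into the JSON tab in step 7, or handed to whatever automation creates the policy named AllowCloudWatchForOperations in that account.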

Step 2: Create a permission set that references the CMP that you created

At this point, you have at least two member accounts containing the same policy with the same policy name. However, the Resource ARN in each policy refers to log groups that belong to the respective accounts. In this step, you create a permission set and attach the policy to the permission set. Importantly, you attach only the name of the policy to the permission set. The actual attachment of the policy to the role that IAM Identity Center creates happens when you assign the permission set to a user or group in Step 3.

To create a permission set that references the CMP

  1. Sign in to the Organizations management account or the IAM Identity Center delegated administration account.
  2. Open the IAM Identity Center console.
  3. In the navigation pane, choose Permission sets, and then choose Create permission set.
  4. On the Select permission set type page, select Custom permission set and choose Next.
    Figure 3: Select custom permission set

  5. On the Specify policies and permissions boundary page, expand the Customer managed policies option, and choose Attach policies.
    Figure 4: Specify policies and permissions boundary

  6. For Policy names, enter the name of the policy. This name must match the name of the policy that you created in Step 1. In our example, the name is AllowCloudWatchForOperations. Choose Next.
  7. On the Permission set details page, enter a name for your permission set. In this example, use AllowCloudWatch_PermissionSet. You can also specify additional details for your permission set, such as session duration and relay state (a link to a specific AWS Management Console page of your choice).
    Figure 5: Permission set details

  8. Choose Next, and then choose Create.
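The same procedure can be scripted once you are comfortable with the console flow. The sketch below assumes the boto3 `sso-admin` client, whose `create_permission_set` and `attach_customer_managed_policy_reference_to_permission_set` operations correspond to the console steps. The helper name is hypothetical, the client is passed in so that the function can be exercised with a stub, and error handling is omitted.

```python
def create_permission_set_with_cmp(sso_admin, instance_arn: str, name: str,
                                   cmp_name: str, cmp_path: str = "/") -> str:
    """Create a permission set and attach a CMP to it by name only.

    sso_admin is expected to behave like boto3.client("sso-admin").
    """
    created = sso_admin.create_permission_set(
        InstanceArn=instance_arn,
        Name=name,
    )
    permission_set_arn = created["PermissionSet"]["PermissionSetArn"]

    # Only the policy name (and path) is stored. The policy itself must
    # already exist in every account where the permission set is assigned.
    sso_admin.attach_customer_managed_policy_reference_to_permission_set(
        InstanceArn=instance_arn,
        PermissionSetArn=permission_set_arn,
        CustomerManagedPolicyReference={"Name": cmp_name, "Path": cmp_path},
    )
    return permission_set_arn


# Example call (requires boto3, credentials, and your own instance ARN):
#   import boto3
#   arn = create_permission_set_with_cmp(
#       boto3.client("sso-admin"),
#       "<your-instance-arn>",
#       "AllowCloudWatch_PermissionSet",
#       "AllowCloudWatchForOperations",
#   )
```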

Step 3: Assign groups or users to the permission set in accounts where you created your CMPs

In the preceding steps, you created a customer managed policy in two or more member accounts, and a permission set with the customer managed policy attached. In this step, you assign users to the permission set in your accounts.

To assign groups or users to the permission set

  1. Sign in to the Organizations management account or the IAM Identity Center delegated administration account.
  2. Open the IAM Identity Center console.
  3. In the navigation pane, choose AWS accounts.
    Figure 6: AWS account

  4. For testing purposes, in the AWS Organization section, select all the accounts where you created the customer managed policy. This means that any users or groups that you assign during the process will have access to the AllowCloudWatch_PermissionSet role in each account. Then, on the top right, choose Assign users or groups.
  5. Choose the Users or Groups tab and then select the users or groups that you want to assign to the permission set. You can select multiple users and multiple groups in this step. For this example, we recommend that you select a single user for which you have credentials, so that you can sign in as that user to test the setup later. After selecting the users or groups that you want to assign, choose Next.
    Figure 7: Assign users and groups to AWS accounts

  6. Select the permission set that you created in Step 2 and choose Next.
  7. Review the users and groups that you are assigning and choose Submit.
  8. You will see a message that IAM Identity Center is configuring the accounts. In this step, IAM Identity Center creates roles in each of the accounts that you selected. It does this for each account, so it looks in the account for the CMP that you specified in the permission set. If the name of the CMP that you specified in the permission set matches the name that you provided when creating the CMP, IAM Identity Center creates a role from the permission set. If the names don’t match or if the CMP isn’t present in the account to which you assigned the permission set, you see an error message associated with that account. After successful submission, you will see the following message: We reprovisioned your AWS accounts successfully and applied the updated permission set to the accounts.
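For repeatable setups, the assignment in this step can be expressed as one request per account. The sketch below only builds the request parameters. The parameter names are those of the `create_account_assignment` operation on the boto3 `sso-admin` client, while the helper name and all ARNs and IDs are placeholders.

```python
def assignment_requests(instance_arn: str, permission_set_arn: str,
                        principal_id: str, account_ids: list,
                        principal_type: str = "USER") -> list:
    """Build one create_account_assignment request per target account."""
    if principal_type not in ("USER", "GROUP"):
        raise ValueError("principal_type must be USER or GROUP")
    return [
        {
            "InstanceArn": instance_arn,
            "TargetId": account_id,
            "TargetType": "AWS_ACCOUNT",
            "PermissionSetArn": permission_set_arn,
            "PrincipalType": principal_type,
            "PrincipalId": principal_id,
        }
        for account_id in account_ids
    ]


# Placeholder values; replace them with your own ARNs and IDs.
requests = assignment_requests(
    "<your-instance-arn>",
    "<your-permission-set-arn>",
    "<identity-store-user-id>",
    ["111122223333", "444455556666"],
)
for request in requests:
    print(request["TargetId"], request["PrincipalType"])
```

Each dictionary can be passed as keyword arguments, for example `client.create_account_assignment(**request)`. As in the console, an assignment fails for any account that lacks the referenced CMP.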

Step 4: Test your assignments

Congratulations! You have successfully created CMPs in multiple AWS accounts, created a permission set and attached the CMPs by name, and assigned the permission set to users and groups in the accounts. Now it’s time to test the results.

To test your assignments

  1. Go to the IAM Identity Center console.
  2. Navigate to the Settings page.
  3. Copy the user portal URL, and then paste the user portal URL into your browser.
  4. At the sign-in prompt, sign in as one of the users that you assigned to the permission set.
  5. The IAM Identity Center user portal shows the accounts and roles that you can access. In the example shown in Figure 8, the user has access to the AllowCloudWatch_PermissionSet created in two accounts.
    Figure 8: User portal

    If you choose AllowCloudWatch_PermissionSet in the member-account, you will have access to the CloudWatch log group in the member-account account. If you choose the role in member-account-1, you will have access to the CloudWatch log group in member-account-1.

  6. Test the access by choosing Management Console for the AllowCloudWatch_PermissionSet in the member-account.
  7. Open the CloudWatch console.
  8. In the navigation pane, choose Log groups. You should be able to access log groups, as shown in Figure 9.
    Figure 9: CloudWatch log groups

  9. Open the IAM console. You shouldn’t have permissions to see the details on this console, as shown in Figure 10. This is because AllowCloudWatch_PermissionSet provides only CloudWatch log access.
    Figure 10: Blocked access to the IAM console

  10. Return to the IAM Identity Center user portal.
  11. Repeat steps 4 through 8 using member-account-1.

Answers to key questions

What happens if I delete a CMP or PB that is attached to a role that IAM Identity Center created?
IAM prevents you from deleting policies that are attached to IAM roles.

How can I delete a CMP or PB that is attached to a role that IAM Identity Center created?
Remove the CMP or PB reference from all your permission sets. Then re-provision the roles in your accounts. This detaches the CMP or PB from IAM Identity Center–created roles. If the policies are unused by other IAM roles in your account or by IAM users, you can delete the policy.

What happens if I modify a CMP or PB that is attached to an IAM Identity Center provisioned role?
The IAM Identity Center role picks up the policy change the next time that someone assumes the role.


In this post, you learned how IAM Identity Center works with customer managed policies and permissions boundaries that you create in your AWS accounts. You learned different ways that this capability can help you, and some of the key considerations and best practices to succeed in your deployments. That includes the principle of starting simple and avoiding unnecessarily complex configurations. Remember these four principles:

  1. In most cases, you can accomplish everything you need by starting with custom (inline) policies.
  2. Use customer managed policies for more advanced cases.
  3. Use permissions boundary policies only when necessary.
  4. Use CloudFormation to manage your customer managed policies and permissions boundaries rather than having administrators deploy them manually in accounts.

To learn more about this capability, see the IAM Identity Center User Guide. If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS IAM re:Post or contact AWS Support.

Ron Cully

Ron is a Principal Product Manager at AWS where he leads feature and roadmap planning for workforce identity products. Ron has over 20 years of industry experience in product and program management of networking and directory related products. He is passionate about delivering secure, reliable solutions that help make it easier for customers to migrate directory aware applications and workloads to the cloud.

Nitin Kulkarni

Nitin is a Solutions Architect on the AWS Identity Solutions team. He helps customers build secure and scalable solutions on the AWS platform. He also enjoys hiking, baseball and linguistics.

Build an end-to-end attribute-based access control strategy with AWS SSO and Okta

Post Syndicated from Louay Shaat original https://aws.amazon.com/blogs/security/build-an-end-to-end-attribute-based-access-control-strategy-with-aws-sso-and-okta/

This blog post discusses the benefits of using an attribute-based access control (ABAC) strategy and also describes how to use ABAC with AWS Single Sign-On (AWS SSO) when you’re using Okta as an identity provider (IdP).

Over the past two years, Amazon Web Services (AWS) has invested heavily in making ABAC available across the majority of our services. With ABAC, you can simplify your access control strategy by granting access to groups of resources, which are specified by tags, instead of managing long lists of individual resources. Each tag is a label that consists of a user-defined key and value, and you can use these to assign metadata to your AWS resources. Tags can help you manage, identify, organize, search for, and filter resources. You can create tags to categorize resources by purpose, owner, environment, or other criteria. To learn more about tags and AWS best practices for tagging, see Tagging AWS resources.

The ability to include tags in sessions—combined with the ability to tag AWS Identity and Access Management (IAM) users and roles—means that you can now incorporate user attributes from your identity provider as part of your tagging and authorization strategy. Additionally, user attributes help organizations to make permissions more intuitive, because the attributes are easier to relate to teams and functions. A tag that represents a team or a job function is easier to audit and understand.

For more information on ABAC in AWS, see our ABAC documentation.

Why use ABAC?

ABAC is a strategy that can help organizations to innovate faster. Implementing a purely role-based access control (RBAC) strategy requires identity and security teams to define a large number of RBAC policies, which can lead to complexity and time delays. With ABAC, you can make use of attributes to build more dynamic policies that provide access based on matching the attribute conditions. AWS supports both RBAC and ABAC as co-existing strategies, so you can use ABAC alongside your existing RBAC strategy.

A good example that uses ABAC is the scenario where you have two teams that require similar access to their secrets in AWS Secrets Manager. By using ABAC, you can build a single role or policy with a condition based on the Department attribute from your IdP. When the user is authenticated, you can pass the Department attribute value and use a condition to provide access to resources that have the identical tag, as shown in the following code snippet. In this post, I show how to use ABAC for this example scenario.

"Condition": {
    "StringEquals": {
        "secretsmanager:ResourceTag/Department": "${aws:PrincipalTag/Department}"
    }
}

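To make the effect of that condition concrete, the following sketch models how a tag-matching StringEquals test behaves. It is a simplified illustration, not the IAM policy engine: the function name and the tag values are hypothetical, and it assumes that a missing tag on either side makes the condition fail.

```python
def department_condition_matches(principal_tags: dict, resource_tags: dict,
                                 key: str = "Department") -> bool:
    """Case-sensitive equality between a principal tag and a resource tag."""
    principal_value = principal_tags.get(key)
    resource_value = resource_tags.get(key)
    if principal_value is None or resource_value is None:
        return False  # assumption: a missing tag never matches
    return principal_value == resource_value


# Hypothetical session and secrets.
finance_user = {"Department": "Finance"}
finance_secret = {"Department": "Finance"}
hr_secret = {"Department": "HR"}

print(department_condition_matches(finance_user, finance_secret))
print(department_condition_matches(finance_user, hr_secret))
```

One policy with this condition serves both teams, because the comparison is driven by the tags rather than by a list of resources.
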
ABAC provides organizations with a more dynamic way of working with permissions. There are four main benefits for organizations that use ABAC:

  • Scale your permissions as you innovate: As developers create new project resources, administrators can require specific attributes to be applied when resources are created. This can include applying tags with attributes that give developers immediate access to the new resources they create, without requiring an update to their own permissions.
  • Help your teams to change and grow quickly: Because permissions are based on user attributes from a corporate identity source such as an IdP, changing user attributes in the IdP that you use for access control in AWS automatically updates your permissions in AWS.
  • Create fewer AWS SSO permission sets and IAM roles: With ABAC, multiple users who are using the same AWS SSO permission set and IAM role can still get unique permissions, because permissions are now based on user attributes. Administrators can author IAM policies that grant users access only to AWS resources that have matching attributes. This helps to reduce the number of IAM roles you need to create for various use cases in a single AWS account.
  • Efficiently audit who performed an action: By using attributes that are logged in AWS CloudTrail next to every action that is performed in AWS by using an IAM role, you can make it easier for security administrators to determine the identity that takes actions in a role session.


In this section, I describe some higher-level prerequisites for using ABAC effectively. ABAC in AWS relies on the use of tags for access-control decisions, so it’s important to have in place a tagging strategy for your resources. To help you develop an effective strategy, see the AWS Tagging Strategies whitepaper.

Organizations that implement ABAC can enhance the use of tags across their resources for the purpose of identity access. Making sure that tagging is enforced and secure is essential to an enterprise-wide strategy. For more information about enforcing a tagging policy, see the blog post Enforce Centralized Tag Compliance Using AWS Service Catalog, DynamoDB, Lambda, and CloudWatch Events.

You can use the service AWS Resource Groups to identify untagged resources and to find resources to tag. You can also use Resource Groups to remediate untagged resources.

Use AWS SSO with Okta as an IdP

AWS SSO gives you an efficient way to centrally manage access to multiple AWS accounts and business applications, and to provide users with single sign-on access to all their assigned accounts and applications from one place. With AWS SSO, you can manage access and user permissions to all of your accounts in AWS Organizations centrally. AWS SSO configures and maintains all the necessary permissions for your accounts automatically, without requiring any additional setup in the individual accounts.

AWS SSO supports access control attributes from any IdP. This blog post focuses on how you can use ABAC attributes with AWS SSO when you’re using Okta as an external IdP.

Use other single sign-on services with ABAC

This post describes how to turn on ABAC in AWS SSO. To turn on ABAC with other federation services, see these links:

Implement the solution

Follow these steps to set up Okta as an IdP in AWS SSO and turn on ABAC.

To set up Okta and turn on ABAC

  1. Set up Okta as an IdP for AWS SSO. To do so, follow the instructions in the blog post Single Sign-On Between Okta Universal Directory and AWS. For more information on the supported actions in AWS SSO with Okta, see our documentation.
  2. Enable attributes for access control (in other words, turn on ABAC) in AWS SSO by using these steps:
    1. In the AWS Management Console, navigate to AWS SSO in the AWS Region you selected for your implementation.
    2. On the Dashboard tab, select Choose your identity source.
    3. Next to Attributes for access control, choose Enable.

      Figure 1: Turn on ABAC in AWS SSO

    You should see the message “Attributes for access control has been successfully enabled.”

  3. Enable updates for user attributes in Okta provisioning. Now that you’ve turned on ABAC in AWS SSO, you need to verify that automatic provisioning for Okta has attribute updates enabled. Log in to Okta as an administrator and locate the application you created for AWS SSO. Navigate to the Provisioning tab, choose Edit, and verify that Update User Attributes is enabled.

    Figure 2: Enable automatic provisioning for ABAC updates

  4. Configure user attributes in Okta for use in AWS SSO by following these steps:
    1. From the same application that you created earlier, navigate to the Sign On tab.
    2. Choose Edit, and then expand the Attributes (optional) section.
    3. In the Attribute Statements (optional) section, for each attribute that you will use for access control in AWS SSO, do the following:
      1. For Name, enter https://aws.amazon.com/SAML/Attributes/AccessControl:<AttributeName>. Replace <AttributeName> with the name of the attribute you’re expecting in AWS SSO, for example https://aws.amazon.com/SAML/Attributes/AccessControl:Department.
      2. For Name Format, choose URI reference.
      3. For Value, enter user.<AttributeName>. Replace <AttributeName> with the Okta default user profile variable name, for example user.department. To view the Okta default user profile, see these instructions.


    Figure 3: Configure two attributes for users in Okta

    In the example shown here, I added two attributes, Department and Division. The result should be similar to the configuration shown in Figure 3.

  5. Add attributes to your users by using these steps:
    1. In your Okta portal, log in as administrator. Navigate to Directory, and then choose People.
    2. Locate a user, navigate to the Profile tab, and then choose Edit.
    3. Add values to the attributes you selected.
    Figure 4: Addition of user attributes in Okta

  6. Confirm that attributes are mapped. Because you’ve enabled automatic provisioning updates from Okta, you should be able to see the attributes for your user immediately in AWS SSO. To confirm this:
    1. In the console, navigate to AWS SSO in the Region you selected for your implementation.
    2. On the Users tab, select a user that has attributes from Okta. You should be able to see the attributes that you mapped from Okta.
    Figure 5: User attributes in Okta

Now that you have ABAC attributes for your users in AWS SSO, you can create permission sets based on those attributes.

Note: Step 4 ensures that users will not be successfully authenticated unless the attributes configured are present. If you don’t want this enforcement, do not perform step 4.
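The attribute statements configured in step 4 follow a fixed naming pattern, which can be sketched in Python. This is only an illustration, not part of any Okta or AWS SSO tooling: the function name attribute_statement is hypothetical, and it assumes that the Okta profile variable is the attribute name in lower case, as in the Department and user.department example.

```python
# Prefix that AWS SSO expects on SAML attributes used for access control.
ACCESS_CONTROL_PREFIX = "https://aws.amazon.com/SAML/Attributes/AccessControl:"


def attribute_statement(attribute_name, okta_variable=None):
    """Return the Name / Name Format / Value triple for one attribute statement.

    Assumption: when no Okta variable is given, the Okta profile variable is
    the attribute name in lower case (Department -> user.department).
    """
    if not attribute_name or ":" in attribute_name or " " in attribute_name:
        raise ValueError(f"unexpected attribute name: {attribute_name!r}")
    variable = okta_variable or attribute_name.lower()
    return {
        "Name": ACCESS_CONTROL_PREFIX + attribute_name,
        "Name Format": "URI reference",
        "Value": f"user.{variable}",
    }


# The two attributes used in this walkthrough.
statements = [attribute_statement(name) for name in ("Department", "Division")]
for statement in statements:
    print(statement)
```

If an Okta profile variable does not follow the lower-case convention, pass it explicitly as the second argument.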

Build an ABAC permission set in AWS SSO

For demonstration purposes, I’ll show how you can build a permission set that is based on ABAC attributes for AWS Secrets Manager. The permission set will match resource tags to user tags, in order to control which resources can be managed by Secrets Manager administrators. You can apply this single permission set to multiple teams.

To build the ABAC permission set

  1. In the console, navigate to AWS SSO, and choose AWS Accounts.
  2. Choose the Permission sets tab.
  3. Choose Create permission set, and then choose Create a custom permission set.
  4. Fill in the fields as follows.
    1. For Name, enter a name for your permission set that will be visible to your users, for example, SecretsManager-Profile.
    2. For Description, enter ABAC SecretsManager Profile.
    3. Select the appropriate session duration.
    4. For Relay State, enter the URL that users should land on after they sign in. For my example, I’ll enter the URL for Secrets Manager: https://console.aws.amazon.com/secretsmanager/home. This gives a better user experience, because users who sign in to AWS SSO are automatically redirected to the Secrets Manager console.
    5. For the field What policies do you want to include in your permission set?, choose Create a custom permissions policy.
    6. Under Create a custom permissions policy, paste the following policy.
          {
              "Version": "2012-10-17",
              "Statement": [
                  {
                      "Sid": "SecretsManagerABAC",
                      "Effect": "Allow",
                      "Action": [ ... ],
                      "Resource": "*",
                      "Condition": {
                          "StringEquals": {
                              "secretsmanager:ResourceTag/Department": "${aws:PrincipalTag/Department}"
                          }
                      }
                  },
                  {
                      "Sid": "NeededPermissions",
                      "Effect": "Allow",
                      "Action": [ ... ],
                      "Resource": "*"
                  }
              ]
          }

    This policy allows Secrets Manager users to create, list, and manage only the secrets that belong to their department. You can modify this policy to match on more attributes, in order to have more granular permissions.

    Note: The RDS permissions in the policy enable users to select an RDS instance for the secret, and the Lambda permissions enable custom key rotation.

    If you look closely at the condition

    "secretsmanager:ResourceTag/Department": "${aws:PrincipalTag/Department}"

    …the condition states that the user can only access Secrets Manager resources that have a Department tag, where the value of that tag is identical to the value of the Department tag from the user.

  5. Choose Next: Tags.
  6. Tag your permission set. For my example, I’ll add Key: Service and Value: SecretsManager.
  7. Choose Next: Review and create.
  8. Assign the permission set to a user or group and to the appropriate accounts that you have in AWS Organizations.
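To match on more attributes, as suggested in step 4 of the procedure above, the condition block can be generated rather than written by hand. The following is a minimal Python sketch: abac_condition is a hypothetical helper, not an AWS API, and it assumes that each resource tag key has the same name as the corresponding principal tag.

```python
import json


def abac_condition(attributes, service_prefix="secretsmanager"):
    """Build a StringEquals block that matches resource tags to principal tags.

    Each attribute yields one key/value pair. IAM requires every pair in the
    block to match, so adding attributes narrows access.
    """
    if not attributes:
        raise ValueError("at least one attribute is required")
    return {
        "StringEquals": {
            f"{service_prefix}:ResourceTag/{name}": "${aws:PrincipalTag/" + name + "}"
            for name in attributes
        }
    }


# Match on both attributes that were mapped from Okta earlier in this post.
condition = abac_condition(["Department", "Division"])
print(json.dumps(condition, indent=4))
```

The printed block can replace the Condition element of the first statement in the custom permissions policy.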

Test an ABAC permission set

Now you can test the ABAC permission set that you just created for Secrets Manager.

To test the ABAC permission set

  1. In the AWS SSO console, on the Dashboard page, navigate to the User Portal URL.
  2. Sign in as a user who has the attributes that you configured earlier in AWS SSO. You will assume the permission set that you just created.
  3. Choose Management console. This will take you to the console that you specified in the Relay State setting for the permission set, which in my example is the Secrets Manager console.

    Figure 6: AWS SSO ABAC profile access

  4. Try to create a secret with no tags:
    1. Choose Store a new secret.
    2. Choose Other type of secrets.
    3. You can add any values you like for the other options, and then choose Next.
    4. Give your secret a name, but don’t add any tags. Choose Next.
    5. On the Configure automatic rotation page, choose Next, and then choose Store.

    You should receive an error stating that the user failed to create the secret, because the user is not authorized to perform the secretsmanager:CreateSecret action.

    Figure 7: Failure to create a secret (no attributes)

  5. Choose Previous twice, and then add the appropriate tag. For my example, I’ll add a tag with the key Department and the value Serverless.

    Figure 8: Adding tags for a secret

  6. Choose Next twice, and then choose Store. You should see a message that your secret creation was successful.

    Figure 9: Successful secret creation

Now administrators who assume this permission set can view, create, and manage only the secrets that belong to their team or department, based on the tags that you defined. You can reuse this permission set across a large number of teams, which can reduce the number of permission sets you need to create and manage.
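The behavior observed in this test can be reproduced with a small simulation of the tag-matching condition. This is a simplified Python sketch for illustration only: it handles a single StringEquals block and the ${aws:PrincipalTag/...} variable, and it is not how IAM evaluates policies internally. The function name and the Finance tag value are assumptions made for the example.

```python
import re

CONDITION = {
    "StringEquals": {
        "secretsmanager:ResourceTag/Department": "${aws:PrincipalTag/Department}"
    }
}

# Matches policy variables of the form ${aws:PrincipalTag/<TagName>}.
_VARIABLE = re.compile(r"\$\{aws:PrincipalTag/([^}]+)\}")


def condition_matches(condition, resource_tags, principal_tags):
    """Return True only if every resource tag equals the resolved principal tag."""
    for key, template in condition["StringEquals"].items():
        tag_name = key.split("ResourceTag/", 1)[1]
        resource_value = resource_tags.get(tag_name)
        if resource_value is None:
            return False  # a missing resource tag never satisfies StringEquals
        names = _VARIABLE.findall(template)
        if any(name not in principal_tags for name in names):
            return False  # unresolved variable: treated as no match
        expected = _VARIABLE.sub(lambda m: principal_tags[m.group(1)], template)
        if resource_value != expected:
            return False
    return True


user = {"Department": "Serverless"}
print(condition_matches(CONDITION, {}, user))                            # untagged secret
print(condition_matches(CONDITION, {"Department": "Serverless"}, user))  # matching tag
print(condition_matches(CONDITION, {"Department": "Finance"}, user))     # another department
```

Only the second call returns True, which mirrors the walkthrough: creating the untagged secret fails, and creating the secret tagged with the user’s own department succeeds.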


In this post, I’ve talked about the benefits organizations can gain from embracing an ABAC strategy, and walked through how to turn on ABAC attributes in Okta and AWS SSO. I’ve also shown how you can create ABAC-driven permission sets to simplify your permission set management. For more information on AWS services that support ABAC—in other words, authorization based on tags—see our updated AWS services that work with IAM page.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Single Sign-On forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Louay Shaat

Louay is a Security Solutions Architect with AWS. He spends his days working with customers, from startups to the largest of enterprises, helping them build cool new capabilities and accelerating their cloud journey. He has a strong focus on security and automation, helping customers improve their security, risk, and compliance in the cloud.

How to delegate management of identity in AWS Single Sign-On

Post Syndicated from Louay Shaat original https://aws.amazon.com/blogs/security/how-to-delegate-management-of-identity-in-aws-single-sign-on/

In this blog post, I show how you can use AWS Single Sign-On (AWS SSO) to delegate administration of user identities. Delegation is the process of providing your teams permissions to manage accounts and identities associated with their teams. You can achieve this by using the existing integration that AWS SSO has with AWS Organizations, and by using tags and conditions in AWS Identity and Access Management (IAM).

AWS SSO makes it easy to centrally manage access to multiple Amazon Web Services (AWS) accounts and business applications, and to provide users with single sign-on access to all their assigned accounts and applications from one place.

AWS SSO uses permission sets—a collection of administrator-defined policies—to determine a user’s effective permissions to access a given AWS account. Permission sets can contain either AWS managed policies or custom policies that are stored in AWS SSO. Policies are documents that act as containers for one or more permission statements. These statements represent individual access controls (allow or deny) for various tasks, which determine what tasks users can or cannot perform within the AWS account. Permission sets are provisioned as IAM roles in your organizational accounts, and are managed centrally using AWS SSO.

AWS SSO is tightly integrated with AWS Organizations, and runs in your AWS Organizations management account. This integration enables AWS SSO to retrieve and manage permission sets across your AWS Organizations configuration.

As you continue to build more of your workloads on AWS, managing access to AWS accounts and services becomes more time consuming for team members that manage identities. With a centralized identity approach that uses AWS SSO, there’s an increased need to delegate control of permission sets and accounts to domain and application owners. Although this is a valid use case, access to the management account in Organizations should be tightly guarded as a security best practice. As an administrator in the management account of an organization, you can control how teams and users access your AWS accounts and applications.

This post shows how you can build comprehensive delegation models in AWS SSO to securely and effectively delegate control of identities to various teams.

Solution overview

Suppose you’ve implemented AWS SSO in Organizations to manage identity across your entire AWS environment. Your organization is growing and the number of accounts and teams that need access to your AWS environment is also growing. You have a small Identity team that is constantly adding, updating, or deleting users or groups and permission sets to enable your teams to gain access to their required services and accounts.

Note: You can learn how to enable AWS SSO from the Introducing AWS Single Sign-On blog post.

As the number of teams grows, you want to start using a delegation model to enable account and application owners to manage access to their resources, in order to reduce the heavy lifting that is done by teams that manage identities.

Figure 1 shows a simple organizational structure that your organization implemented.

Figure 1: AWS SSO with AWS Organizations

In this scenario, you’ve already built a collection of organizational-approved permission sets that are used across your organization. You have a tagging strategy for permission sets, and you’ve implemented two tags across all your permission sets:

  • Environment: The values for this tag are Production or Development. You only apply Production permission sets to Production accounts.
  • OU: This tag identifies the organizational unit (OU) that the permission set belongs to.

A value of All can be assigned to either tag to identify organization-wide use of the permission set.

Based on the setup just described, your Identity team has identified three delegation use cases that they want to implement:

  • A simple delegation model for a team to manage all permission sets for a set of accounts.
  • A delegation model for support teams to apply read-only permission sets to all accounts.
  • A delegation model based on AWS Organizations, where a team can manage only the permission sets intended for a specific OU.

The AWS SSO delegation model enables three key conditions for restricting user access:

  • Permission sets
  • Accounts
  • Tags that use the condition aws:ResourceTag, to ensure that tags are present on your permission sets as part of your delegation model.

In the rest of this blog post, I show you how AWS SSO administrators can use these conditions to implement the use cases highlighted here to build a delegation model.

See Delegating permission set administration and Actions, resources, and condition keys for AWS SSO for more information.

Important: The use cases that follow are examples that can be adopted by your organization. The permission sets in these use cases show only what is needed to delegate the components discussed. You need to add additional policies to give users and groups access to AWS SSO.

Some examples:

Identify your permission set and AWS SSO instance IDs

You can use either the AWS Command Line Interface (AWS CLI) v2 or the AWS Management Console to identify your permission set and AWS SSO instance IDs.

Use the AWS CLI

To use the AWS CLI to identify the Amazon resource names (ARNs) of the AWS SSO instance and permission set, make sure you have AWS CLI v2 installed.

To list the AWS SSO instance ID ARN

Run the following command:

aws sso-admin list-instances

To list the permission set ARN

Run the following command:

aws sso-admin list-permission-sets --instance-arn <instance arn from above>
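The same lookups can be scripted. The following is a minimal Python sketch, assuming that the third-party boto3 SDK is installed and that credentials are configured; the helper names are hypothetical. It uses the list_instances and list_permission_sets operations of the sso-admin client and follows NextToken so that long lists are not truncated.

```python
def _collect(call, result_key, **kwargs):
    """Call a list operation repeatedly, following NextToken until exhausted."""
    items = []
    while True:
        page = call(**kwargs)
        items.extend(page.get(result_key, []))
        token = page.get("NextToken")
        if not token:
            return items
        kwargs["NextToken"] = token


def list_instance_arns(client):
    """Return the ARN of every AWS SSO instance visible to the caller."""
    instances = _collect(client.list_instances, "Instances")
    return [instance["InstanceArn"] for instance in instances]


def list_permission_set_arns(client, instance_arn):
    """Return the ARNs of all permission sets in the given instance."""
    return _collect(
        client.list_permission_sets, "PermissionSets", InstanceArn=instance_arn
    )


def main():
    # Imported here so the helpers above can be exercised without the SDK.
    import boto3

    client = boto3.client("sso-admin")
    for instance_arn in list_instance_arns(client):
        print(instance_arn)
        for permission_set_arn in list_permission_set_arns(client, instance_arn):
            print("  " + permission_set_arn)
```

Call main() from an environment that has permission to call these operations in the Region where AWS SSO is enabled.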

Use the console

You can also use the console to identify your permission sets and AWS SSO instance IDs.

To list the AWS SSO Instance ID ARN

  1. Navigate to AWS SSO in your Region. Choose Dashboard, and then select Choose your identity source.
  2. Copy the AWS SSO ARN ID.
Figure 2: AWS SSO ID ARN

To list the permission set ARN

  1. Navigate to AWS SSO in your Region. Choose AWS Accounts, and then choose Permission sets.
  2. Select the permission set you want to use.
  3. Copy the ARN of the permission set.
Figure 3: Permission set ARN

Use case 1: Accounts-based delegation model

In this use case, you create a single policy to allow administrators to assign any permission set to a specific set of accounts.

First, you need to create a custom permission set to use with the following example policy.

The example policy is as follows.

        {
            "Sid": "DelegatedAdminsAccounts",
            "Effect": "Allow",
            "Action": [ ... ],
            "Resource": [ ... ]
        }

This policy specifies that delegated admins are allowed to provision any permission set to the three accounts listed in the policy.

Note: To apply this permission set to your environment, replace the account numbers following Resource with your account numbers.
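An accounts-based policy of this shape can also be generated from a list of account numbers. The following Python sketch is one way to do that under stated assumptions: the action names and ARN formats are taken from the AWS SSO service authorization reference rather than from the policy in this post, the function name is hypothetical, and the instance ID and account numbers are placeholders.

```python
import json

# Assumed action list: the AWS SSO actions used to assign and provision
# permission sets. Adjust it to match your own delegation requirements.
ASSIGNMENT_ACTIONS = [
    "sso:CreateAccountAssignment",
    "sso:DeleteAccountAssignment",
    "sso:ProvisionPermissionSet",
]


def accounts_delegation_policy(instance_id, account_ids):
    """Allow assigning any permission set, but only to the listed accounts."""
    for account_id in account_ids:
        if not (account_id.isdigit() and len(account_id) == 12):
            raise ValueError(f"not a 12-digit account ID: {account_id!r}")
    resources = [
        f"arn:aws:sso:::instance/{instance_id}",
        f"arn:aws:sso:::permissionSet/{instance_id}/*",  # any permission set
    ]
    resources += [f"arn:aws:sso:::account/{a}" for a in account_ids]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DelegatedAdminsAccounts",
                "Effect": "Allow",
                "Action": ASSIGNMENT_ACTIONS,
                "Resource": resources,
            }
        ],
    }


policy = accounts_delegation_policy(
    "ssoins-EXAMPLE",  # placeholder instance ID
    ["111122223333", "444455556666", "777788889999"],  # placeholder accounts
)
print(json.dumps(policy, indent=4))
```

Validating the account numbers before building the policy catches a mistyped ID early, before it silently produces a resource ARN that matches nothing.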

Use case 2: Permission-based delegation model

In this use case, you create a single policy to allow administrators to assign a specific permission set to any account. The policy is as follows.

        {
            "Sid": "DelegatedPermissionsAdmin",
            "Effect": "Allow",
            "Action": [ ... ],
            "Resource": [ ... ]
        }



This policy specifies that delegated admins are allowed to provision only the specific permission set listed in the policy to any account.


Use case 3: OU-based delegation model

In this use case, the Identity team wants to delegate the management of the Development permission sets (identified by the tag key Environment) to the Test OU (identified by the tag key OU). You use the Environment and OU tags on permission sets to restrict access to only the permission sets that contain both tags.

To build this permission set for delegation, you need to create two policies in the same permission set:

  • A policy that filters the permission sets based on both tags—Environment and OU.
  • A policy that filters the accounts belonging to the Development OU.

The policies are as follows.

        {
            "Sid": "DelegatedOUAdmin",
            "Effect": "Allow",
            "Action": [ ... ],
            "Resource": "arn:aws:sso:::permissionSet/*/*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/Environment": "Development",
                    "aws:ResourceTag/OU": "Test"
                }
            }
        },
        {
            "Sid": "Instance",
            "Effect": "Allow",
            "Action": [ ... ],
            "Resource": [ ... ]
        }


In the delegated policy, the user or group is only allowed to provision permission sets that have the Environment tag set to “Development” and the OU tag set to “Test”, and only to accounts in the Development OU.

Note: In the example above arn:aws:sso:::instance/ssoins-11112222233333 is the ARN for the AWS SSO Instance ID. To get your AWS SSO Instance ID, refer to Identify your permission set and AWS SSO Instance IDs.
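To see which permission sets fall inside this delegation, the tag condition can be applied to a tag inventory offline. The following Python sketch is an illustration under stated assumptions: the permission set names and their tags are made up, and the function only mimics the StringEquals match on aws:ResourceTag/Environment and aws:ResourceTag/OU. It does not call AWS.

```python
# Tag values required by the condition in the OU-based delegation policy.
REQUIRED_TAGS = {"Environment": "Development", "OU": "Test"}


def delegable(permission_sets, required_tags=REQUIRED_TAGS):
    """Return the names of permission sets whose tags match every required pair."""
    return [
        name
        for name, tags in permission_sets.items()
        if all(tags.get(key) == value for key, value in required_tags.items())
    ]


# Hypothetical inventory: permission set name -> tags.
inventory = {
    "dev-test-poweruser": {"Environment": "Development", "OU": "Test"},
    "prod-test-readonly": {"Environment": "Production", "OU": "Test"},
    "dev-all-readonly": {"Environment": "Development", "OU": "All"},
    "untagged": {},
}

print(delegable(inventory))  # ['dev-test-poweruser']
```

Note that a permission set tagged with the organization-wide value All does not satisfy an exact match on Test. If delegated admins should also manage those permission sets, the condition needs to list All as an additional allowed value.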

Create a delegated admin profile in AWS SSO

Now that you know what’s required to delegate permissions, you can create a delegated profile and deploy that to your users and groups.

To create a delegated AWS SSO profile

  1. Sign in to your management account, open the AWS SSO console, and browse to the Region where AWS SSO is provisioned.
  2. Navigate to AWS Accounts and choose Permission sets, and then choose Create permission set.
    Figure 4: AWS SSO permission sets menu

  3. Choose Create a custom permission set.
    Figure 5: Create a new permission set

  4. Give a name to your permission set based on your naming standards and select a session duration from your organizational policies.
  5. For Relay state, enter the following URL:

    where <region> is the AWS Region in which you deployed AWS SSO.

    The relay state will automatically redirect the user to the Accounts section in the AWS SSO console, for simplicity.

    Figure 6: Custom permission set

  6. Choose Create new permission set. Here is where you can decide the level of delegation required for your application or domain administrators.
    Figure 7: Assign users

    See some of the examples in the earlier sections of this post for the permission set.

  7. If you’re using AWS SSO with AWS Directory Service for Microsoft Active Directory, you’ll need to provide access to AWS Directory Service in order for your administrator to assign permission sets to users and groups.

    To provide this access, navigate to the AWS Accounts screen in the AWS SSO console, and select your management account. Assign the required users or groups, and select the permission set that you created earlier. Then choose Finish.

  8. To test this delegation, sign in to AWS SSO. You’ll see the newly created permission set.
    Figure 8: AWS SSO sign-on page

  9. Next to developer-delegated-admin, choose Management console. This should automatically redirect you to AWS SSO in the AWS Accounts submenu.

If you try to assign a permission set, or provision access to an account, that the policies you specified earlier don’t explicitly allow, you will receive the following error.

Figure 9: Error based on lack of permissions

Otherwise, the provisioning will be successful.


You’ve seen that by using conditions and tags on permission sets, application and account owners can use delegation models to manage the deployment of permission sets across the accounts they manage, providing them and their teams with secure access to AWS accounts and services.

Additionally, because AWS SSO supports attribute-based access control (ABAC), you can create a more dynamic delegation model based on attributes from your identity provider, to match the tags on the permission set.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Single Sign-On forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Louay Shaat

Louay is a Security Solutions Architect with AWS. He spends his days working with customers, from startups to the largest of enterprises, helping them build cool new capabilities and accelerating their cloud journey. He has a strong focus on security and automation to help customers improve their security, risk, and compliance in the cloud.

Friday Squid Blogging: Do Cephalopods Contain Alien DNA?

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/06/friday_squid_bl_627.html

Maybe not DNA, but biological somethings.

“Cause of Cambrian explosion — Terrestrial or Cosmic?”:

Abstract: We review the salient evidence consistent with or predicted by the Hoyle-Wickramasinghe (H-W) thesis of Cometary (Cosmic) Biology. Much of this physical and biological evidence is multifactorial. One particular focus are the recent studies which date the emergence of the complex retroviruses of vertebrate lines at or just before the Cambrian Explosion of ~500 Ma. Such viruses are known to be plausibly associated with major evolutionary genomic processes. We believe this coincidence is not fortuitous but is consistent with a key prediction of H-W theory whereby major extinction-diversification evolutionary boundaries coincide with virus-bearing cometary-bolide bombardment events. A second focus is the remarkable evolution of intelligent complexity (Cephalopods) culminating in the emergence of the Octopus. A third focus concerns the micro-organism fossil evidence contained within meteorites as well as the detection in the upper atmosphere of apparent incoming life-bearing particles from space. In our view the totality of the multifactorial data and critical analyses assembled by Fred Hoyle, Chandra Wickramasinghe and their many colleagues since the 1960s leads to a very plausible conclusion — life may have been seeded here on Earth by life-bearing comets as soon as conditions on Earth allowed it to flourish (about or just before 4.1 Billion years ago); and living organisms such as space-resistant and space-hardy bacteria, viruses, more complex eukaryotic cells, fertilised ova and seeds have been continuously delivered ever since to Earth so being one important driver of further terrestrial evolution which has resulted in considerable genetic diversity and which has led to the emergence of mankind.

Two commentaries.

This is almost certainly not true.

As usual, you can also use this squid post to talk about the security stories in the news that I haven’t covered.

Read my blog posting guidelines here.

Protecting coral reefs with Nemo-Pi, the underwater monitor

Post Syndicated from Janina Ander original https://www.raspberrypi.org/blog/coral-reefs-nemo-pi/

The German charity Save Nemo works to protect coral reefs, and they are developing Nemo-Pi, an underwater “weather station” that monitors ocean conditions. Right now, you can vote for Save Nemo in the Google.org Impact Challenge.

Nemo-Pi — Save Nemo

Save Nemo

The organisation says there are two major threats to coral reefs: divers, and climate change. To make diving safer for reefs, Save Nemo installs buoy anchor points where diving tour boats can anchor without damaging corals in the process.

reef damaged by anchor
boat anchored at buoy

In addition, they provide dos and don’ts for how to behave on a reef dive.

The Nemo-Pi

To monitor the effects of climate change, and to help divers decide whether conditions are right at a reef while they’re still on shore, Save Nemo is also in the process of perfecting Nemo-Pi.

Nemo-Pi schematic — Nemo-Pi — Save Nemo

This Raspberry Pi-powered device is made up of a buoy, a solar panel, a GPS device, a Pi, and an array of sensors. Nemo-Pi measures water conditions such as current, visibility, temperature, carbon dioxide and nitrogen oxide concentrations, and pH. It also uploads its readings live to a public webserver.

Inside the Nemo-Pi device — Save Nemo

The Save Nemo team is currently doing long-term tests of Nemo-Pi off the coast of Thailand and Indonesia. They are also working on improving the device’s power consumption and durability, and testing prototypes with the Raspberry Pi Zero W.

web dashboard — Nemo-Pi — Save Nemo

The web dashboard showing live Nemo-Pi data

Long-term goals

Save Nemo aims to install a network of Nemo-Pis at shallow reefs (up to 60 metres deep) in South East Asia. Then diving tour companies can check the live data online and decide day-to-day whether tours are feasible. This will lower the impact of humans on reefs and help the local flora and fauna survive.

Coral reefs with fishes

A healthy coral reef

Nemo-Pi data may also be useful for groups lobbying for reef conservation, and for scientists and activists who want to shine a spotlight on the awful effects of climate change on sea life, such as coral bleaching caused by rising water temperatures.

Bleached coral

A bleached coral reef

Vote now for Save Nemo

If you want to help Save Nemo in their mission today, vote for them to win the Google.org Impact Challenge:

  1. Head to the voting web page
  2. Click “Abstimmen” in the footer of the page to vote
  3. Click “JA” in the footer to confirm

Voting is open until 6 June. You can also follow Save Nemo on Facebook or Twitter. We think this organisation is doing valuable work, and that their projects could be expanded to reefs across the globe. It’s fantastic to see the Raspberry Pi being used to help protect ocean life.

The post Protecting coral reefs with Nemo-Pi, the underwater monitor appeared first on Raspberry Pi.

The devil wears Pravda

Post Syndicated from Robert Graham original https://blog.erratasec.com/2018/05/the-devil-wears-pravda.html

Classic Bond villain, Elon Musk, has a new plan to create a website dedicated to measuring the credibility and adherence to “core truth” of journalists. He is, without any sense of irony, going to call this “Pravda”. This is not simply wrong but evil.

Musk has a point. Journalists do suck, and many suck consistently. I see this in my own industry, cybersecurity, and I frequently criticize them for their suckage.

But what he’s doing here is not correcting them when they make mistakes (or what Musk sees as mistakes), but questioning their legitimacy. This legitimacy isn’t measured by whether they follow established journalism ethics, but whether their “core truths” agree with Musk’s “core truths”.

An example of the problem is how the press fixates on Tesla car crashes due to its “autopilot” feature. Pretty much every autopilot crash makes national headlines, while the press ignores the other 40,000 car crashes that happen in the United States each year. Musk spies on Tesla drivers (hello, classic Bond villain everyone) so he can see the dip in autopilot usage every time such a news story breaks. He’s got good reason to be concerned about this.

He argues that autopilot is safer than humans driving, and he’s got the statistics and government studies to back this up. Therefore, the press’s fixation on Tesla crashes is illegitimate “fake news”, titillating the audience with distorted truth.

But here’s the thing: that’s still only Musk’s version of the truth. Yes, on a mile-per-mile basis, autopilot is safer, but there’s nuance here. Autopilot is used primarily on freeways, which already have a low mile-per-mile accident rate. People choose autopilot only when conditions are incredibly safe and drivers are unlikely to have an accident anyway. Musk is therefore being intentionally deceptive comparing apples to oranges. Autopilot may still be safer, it’s just that the numbers Musk uses don’t demonstrate this.

And then there is the truth of calling it “autopilot” to begin with, because it isn’t. The public is overrating the capabilities of the feature. It’s little different than “lane keeping” and “adaptive cruise control” you can now find in other cars. In many ways, the technology is behind — my Tesla doesn’t beep at me when a pedestrian walks behind my car while backing up, but virtually every new car on the market does.

Yes, the press unduly covers Tesla autopilot crashes, but Musk has only himself to blame by unduly exaggerating his car’s capabilities by calling it “autopilot”.

What’s “core truth” is thus rather difficult to obtain. What the press satisfies itself with instead is smaller truths, what they can document. The facts in such cases are that the accident happened, and they try to get Tesla or Musk to comment on it.

What you can criticize a journalist for is therefore not “core truth” but whether they did journalism correctly. When such stories criticize “autopilot”, but don’t do their diligence in getting Tesla’s side of the story, then that’s a violation of journalistic practice. When I criticize journalists for their poor handling of stories in my industry, I try to focus on which journalistic principles they get wrong. For example, the NYTimes reporters do a lot of stories quoting anonymous government sources in clear violation of journalistic principles.

If “credibility” is the concern, then it’s the classic Bond villain here that’s the problem: Musk himself. His track record on business statements is abysmal. For example, when he announced the Model 3 he claimed production targets that every Wall Street analyst claimed were absurd. He didn’t make those targets, he didn’t come close. Model 3 production is still lagging behind Musk’s twice adjusted targets.


So who has a credibility gap here, the press, or Musk himself?

Not only is Musk’s credibility problem ironic, so is the name he chose, “Pravda”, the Russian word for truth that was the name of the Soviet Union Communist Party’s official newspaper. This is so absurd this has to be a joke, yet Musk claims to be serious about all this.

Yes, the press has a lot of problems, and if Musk were some journalism professor concerned about journalists meeting the objective standards of their industry (e.g. abusing anonymous sources), then this would be a fine thing. But it’s not. It’s Musk who is upset the press’s version of “core truth” does not agree with his version — a version that he’s proven time and time again differs from “real truth”.

Just in case Musk is serious, I’ve already registered “www.antipravda.com” to start measuring the credibility of statements by billionaire playboy CEOs. Let’s see who blinks first.

I stole the title, with permission, from this tweet:

Analyze Apache Parquet optimized data using Amazon Kinesis Data Firehose, Amazon Athena, and Amazon Redshift

Post Syndicated from Roy Hasson original https://aws.amazon.com/blogs/big-data/analyzing-apache-parquet-optimized-data-using-amazon-kinesis-data-firehose-amazon-athena-and-amazon-redshift/

Amazon Kinesis Data Firehose is the easiest way to capture and stream data into a data lake built on Amazon S3. This data can be anything—from AWS service logs like AWS CloudTrail log files, Amazon VPC Flow Logs, Application Load Balancer logs, and others. It can also be IoT events, game events, and much more. To efficiently query this data, a time-consuming ETL (extract, transform, and load) process is required to massage and convert the data to an optimal file format, which increases the time to insight. This situation is less than ideal, especially for real-time data that loses its value over time.

To solve this common challenge, Kinesis Data Firehose can now save data to Amazon S3 in Apache Parquet or Apache ORC format. These are optimized columnar formats that are highly recommended for best performance and cost-savings when querying data in S3. This feature directly benefits you if you use Amazon Athena, Amazon Redshift, AWS Glue, Amazon EMR, or any other big data tools that are available from the AWS Partner Network and through the open-source community.

Amazon Connect is a simple-to-use, cloud-based contact center service that makes it easy for any business to provide a great customer experience at a lower cost than common alternatives. Its open platform design enables easy integration with other systems. One of those systems is Amazon Kinesis—in particular, Kinesis Data Streams and Kinesis Data Firehose.

What’s really exciting is that you can now save events from Amazon Connect to S3 in Apache Parquet format. You can then perform analytics using Amazon Athena and Amazon Redshift Spectrum in real time, taking advantage of this key performance and cost optimization. Of course, Amazon Connect is only one example. This new capability opens the door for a great deal of opportunity, especially as organizations continue to build their data lakes.

Amazon Connect includes an array of analytics views in the Administrator dashboard. But you might want to run other types of analysis. In this post, I describe how to set up a data stream from Amazon Connect through Kinesis Data Streams and Kinesis Data Firehose and out to S3, and then perform analytics using Athena and Amazon Redshift Spectrum. I focus primarily on the Kinesis Data Firehose support for Parquet and its integration with the AWS Glue Data Catalog, Amazon Athena, and Amazon Redshift.

Solution overview

Here is how the solution is laid out:



The following sections walk you through each of these steps to set up the pipeline.

1. Define the schema

When Kinesis Data Firehose processes incoming events and converts the data to Parquet, it needs to know which schema to apply. The reason is that many times, incoming events contain all or some of the expected fields based on which values the producers are advertising. A typical process is to normalize the schema during a batch ETL job so that you end up with a consistent schema that can easily be understood and queried. Doing this introduces latency due to the nature of the batch process. To overcome this issue, Kinesis Data Firehose requires the schema to be defined in advance.

To see the available columns and structures, see Amazon Connect Agent Event Streams. For the purpose of simplicity, I opted to make all the columns of type String rather than create the nested structures. But you can definitely do that if you want.
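To make the "fixed schema, all strings" idea concrete, here is a minimal Python sketch of mapping an incoming event onto the nine columns used in this post. It only illustrates the concept; it is not how Kinesis Data Firehose is implemented, and the sample event is a hypothetical, heavily trimmed one.

```python
import json

# Column names taken from the CREATE TABLE statement in this post.
COLUMNS = [
    "awsaccountid", "agentarn", "currentagentsnapshot", "eventid",
    "eventtimestamp", "eventtype", "instancearn",
    "previousagentsnapshot", "version",
]

def normalize_event(event: dict) -> dict:
    """Map an incoming event onto the fixed, all-string schema."""
    lowered = {key.lower(): value for key, value in event.items()}
    row = {}
    for column in COLUMNS:
        value = lowered.get(column)
        if value is None:
            row[column] = None               # field absent in this event
        elif isinstance(value, (dict, list)):
            row[column] = json.dumps(value)  # keep nested data as a JSON string
        else:
            row[column] = str(value)
    return row

# Hypothetical event used only for illustration.
sample = {
    "EventType": "LOGIN",
    "EventId": "abc-123",
    "CurrentAgentSnapshot": {"AgentStatus": {"Name": "Available"}},
}
row = normalize_event(sample)
print(row["eventtype"], row["currentagentsnapshot"], row["version"])
```

Every event ends up with the same columns, whether or not the producer sent all of them, which is the property a predefined schema is meant to give you.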

The simplest way to define the schema is to create a table in the Amazon Athena console. Open the Athena console, and paste the following create table statement, substituting your own S3 bucket and prefix for where your event data will be stored. A Data Catalog database is a logical container that holds the different tables that you can create. The default database name shown here should already exist. If it doesn’t, you can create it or use another database that you’ve already created.

CREATE EXTERNAL TABLE default.kfhconnectblog (
  awsaccountid string,
  agentarn string,
  currentagentsnapshot string,
  eventid string,
  eventtimestamp string,
  eventtype string,
  instancearn string,
  previousagentsnapshot string,
  version string
)
STORED AS parquet
LOCATION 's3://your_bucket/kfhconnectblog/'
TBLPROPERTIES ("parquet.compression"="SNAPPY")

That’s all you have to do to prepare the schema for Kinesis Data Firehose.

2. Define the data streams

Next, you need to define the Kinesis data streams that will be used to stream the Amazon Connect events.  Open the Kinesis Data Streams console and create two streams.  You can configure them with only one shard each because you don’t have a lot of data right now.

3. Define the Kinesis Data Firehose delivery stream for Parquet

Let’s configure the Data Firehose delivery stream using the data stream as the source and Amazon S3 as the output. Start by opening the Kinesis Data Firehose console and creating a new data delivery stream. Give it a name, and associate it with the Kinesis data stream that you created in Step 2.

As shown in the following screenshot, enable Record format conversion (1) and choose Apache Parquet (2). As you can see, Apache ORC is also supported. Scroll down and provide the AWS Glue Data Catalog database name (3) and table names (4) that you created in Step 1. Choose Next.

To make things easier, the output S3 bucket and prefix fields are automatically populated using the values that you defined in the LOCATION parameter of the create table statement from Step 1. Pretty cool. Additionally, you have the option to save the raw events into another location as defined in the Source record S3 backup section. Don’t forget to add a trailing forward slash (“/”) to the prefix so that Data Firehose creates the date partitions inside that prefix.
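The effect of the trailing slash is easy to see with a small Python sketch. It assumes the default behavior in which Data Firehose appends a UTC time prefix of the form YYYY/MM/DD/HH/ directly to whatever prefix you supply; the function name is made up for this illustration.

```python
from datetime import datetime, timezone

def object_key_prefix(prefix: str, when: datetime) -> str:
    """Append a UTC YYYY/MM/DD/HH/ time prefix to the supplied prefix, unchanged."""
    return prefix + when.astimezone(timezone.utc).strftime("%Y/%m/%d/%H/")

when = datetime(2018, 5, 1, 14, 30, tzinfo=timezone.utc)
print(object_key_prefix("kfhconnectblog/", when))  # kfhconnectblog/2018/05/01/14/
print(object_key_prefix("kfhconnectblog", when))   # kfhconnectblog2018/05/01/14/
```

Without the slash, the year is glued onto the prefix name, so the objects no longer land under the location that the table points to.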

On the next page, in the S3 buffer conditions section, there is a note about configuring a large buffer size. The Parquet file format is highly efficient in how it stores and compresses data. Increasing the buffer size allows you to pack more rows into each output file, which is preferred and gives you the most benefit from Parquet.

Compression using Snappy is automatically enabled for both Parquet and ORC. You can change the compression algorithm by using the Kinesis Data Firehose API to update the OutputFormatConfiguration.
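As a rough sketch of what such an update might carry, the following Python builds the record format conversion settings as a plain dictionary, following the shape of the Firehose API's DataFormatConversionConfiguration. The helper name and the role ARN are hypothetical; the database and table names are the ones from Step 1.

```python
# Compression values accepted by the Parquet serializer in the Firehose API.
ALLOWED_COMPRESSION = {"UNCOMPRESSED", "GZIP", "SNAPPY"}

def parquet_conversion_config(database: str, table: str, role_arn: str,
                              compression: str = "SNAPPY") -> dict:
    """Build a DataFormatConversionConfiguration-shaped dict for Parquet output."""
    if compression not in ALLOWED_COMPRESSION:
        raise ValueError(f"unsupported Parquet compression: {compression}")
    return {
        "Enabled": True,
        "SchemaConfiguration": {
            "DatabaseName": database,
            "TableName": table,
            "RoleARN": role_arn,
        },
        # Incoming records are JSON, so a JSON deserializer is used on input.
        "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
        "OutputFormatConfiguration": {
            "Serializer": {"ParquetSerDe": {"Compression": compression}}
        },
    }

config = parquet_conversion_config(
    "default", "kfhconnectblog",
    "arn:aws:iam::123456789012:role/hypothetical-firehose-role",  # placeholder ARN
    compression="GZIP",
)
print(config["OutputFormatConfiguration"])
```

With boto3, a dictionary like this would typically be passed as the DataFormatConversionConfiguration inside the ExtendedS3DestinationUpdate argument of the Firehose client's update_destination call, together with the delivery stream name, its current version ID, and the destination ID.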

Be sure to also enable Amazon CloudWatch Logs so that you can debug any issues that you might run into.

Lastly, finalize the creation of the Firehose delivery stream, and continue on to the next section.

4. Set up the Amazon Connect contact center

After setting up the Kinesis pipeline, you now need to set up a simple contact center in Amazon Connect. The Getting Started page provides clear instructions on how to set up your environment, acquire a phone number, and create an agent to accept calls.

After setting up the contact center, in the Amazon Connect console, choose your Instance Alias, and then choose Data Streaming. Under Agent Event, choose the Kinesis data stream that you created in Step 2, and then choose Save.

At this point, your pipeline is complete.  Agent events from Amazon Connect are generated as agents go about their day. Events are sent via Kinesis Data Streams to Kinesis Data Firehose, which converts the event data from JSON to Parquet and stores it in S3. Athena and Amazon Redshift Spectrum can simply query the data without any additional work.

So let’s generate some data. Go back into the Administrator console for your Amazon Connect contact center, and create an agent to handle incoming calls. In this example, I creatively named mine Agent One. After it is created, Agent One can get to work and log into their console and set their availability to Available so that they are ready to receive calls.

To make the data a bit more interesting, I also created a second agent, Agent Two. I then made some incoming and outgoing calls and caused some failures to occur, so I now have enough data available to analyze.

5. Analyze the data with Athena

Let’s open the Athena console and run some queries. One thing you’ll notice is that when we created the schema for the dataset, we defined some of the fields as Strings even though in the documentation they were complex structures.  The reason for doing that was simply to show some of the flexibility of Athena to be able to parse JSON data. However, you can define nested structures in your table schema so that Kinesis Data Firehose applies the appropriate schema to the Parquet file.

Let’s run the first query to see which agents have logged into the system.

The query might look complex, but it’s fairly straightforward:

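WITH dataset AS (
  SELECT
    from_iso8601_timestamp(eventtimestamp) AS event_ts,
    eventtype,
    -- current state
    json_extract_scalar(currentagentsnapshot, '$.agentstatus.name') AS current_status,
    from_iso8601_timestamp(json_extract_scalar(currentagentsnapshot, '$.agentstatus.starttimestamp')) AS current_starttimestamp,
    json_extract_scalar(currentagentsnapshot, '$.configuration.firstname') AS current_firstname,
    json_extract_scalar(currentagentsnapshot, '$.configuration.lastname') AS current_lastname,
    json_extract_scalar(currentagentsnapshot, '$.configuration.username') AS current_username,
    json_extract_scalar(currentagentsnapshot, '$.configuration.routingprofile.defaultoutboundqueue.name') AS current_outboundqueue,
    json_extract_scalar(currentagentsnapshot, '$.configuration.routingprofile.inboundqueues[0].name') AS current_inboundqueue,
    -- previous state
    json_extract_scalar(previousagentsnapshot, '$.agentstatus.name') AS prev_status,
    from_iso8601_timestamp(json_extract_scalar(previousagentsnapshot, '$.agentstatus.starttimestamp')) AS prev_starttimestamp,
    json_extract_scalar(previousagentsnapshot, '$.configuration.firstname') AS prev_firstname,
    json_extract_scalar(previousagentsnapshot, '$.configuration.lastname') AS prev_lastname,
    json_extract_scalar(previousagentsnapshot, '$.configuration.username') AS prev_username,
    json_extract_scalar(previousagentsnapshot, '$.configuration.routingprofile.defaultoutboundqueue.name') AS prev_outboundqueue,
    json_extract_scalar(previousagentsnapshot, '$.configuration.routingprofile.inboundqueues[0].name') AS prev_inboundqueue
  FROM kfhconnectblog
  WHERE eventtype <> 'HEART_BEAT'
)
SELECT
  current_status AS status,
  current_username AS username,
  event_ts
FROM dataset
WHERE eventtype = 'LOGIN' AND current_username <> ''
ORDER BY event_ts DESC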
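If it helps to see what Athena's json_extract_scalar function (used in the queries in this post) does with a string column that holds JSON, here is a simplified Python stand-in. It is a sketch only: it handles plain '$.a.b.c' paths, not array subscripts such as inboundqueues[0], and the snapshot value is a hypothetical, trimmed example.

```python
import json
from typing import Optional

def extract_scalar(json_text: str, path: str) -> Optional[str]:
    """Follow a simple '$.a.b.c' path; return the scalar as a string, else None."""
    try:
        node = json.loads(json_text)
    except (TypeError, ValueError):
        return None  # not valid JSON, comparable to a NULL result
    keys = path[2:].split(".") if path.startswith("$.") else path.split(".")
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    if node is None or isinstance(node, (dict, list)):
        return None  # only scalar values are returned
    return str(node)

# Hypothetical snapshot string, as it might be stored in currentagentsnapshot.
snapshot = '{"agentstatus": {"name": "Available"}, "configuration": {"username": "agent.one"}}'
print(extract_scalar(snapshot, "$.agentstatus.name"))
print(extract_scalar(snapshot, "$.configuration.username"))
```

The column stays a plain string in the table; the structure is only interpreted at query time, which is the flexibility the all-String schema relies on.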

The query output looks something like this:

Here is another query that shows the sessions that each of the agents engaged in. It tells us whether the calls were incoming or outgoing, whether they were completed, and whether there were missed or failed calls.

WITH src AS (
  SELECT
    json_extract_scalar(currentagentsnapshot, '$.configuration.username') as username,
    cast(json_extract(currentagentsnapshot, '$.contacts') AS ARRAY(JSON)) as c,
    cast(json_extract(previousagentsnapshot, '$.contacts') AS ARRAY(JSON)) as p
  from kfhconnectblog
),
src2 AS (
  SELECT *
  FROM src CROSS JOIN UNNEST (c, p) AS contacts(c_item, p_item)
),
dataset AS (
  SELECT
    username,
    json_extract_scalar(c_item, '$.contactid') as c_contactid,
    json_extract_scalar(c_item, '$.channel') as c_channel,
    json_extract_scalar(c_item, '$.initiationmethod') as c_direction,
    json_extract_scalar(c_item, '$.queue.name') as c_queue,
    json_extract_scalar(c_item, '$.state') as c_state,
    from_iso8601_timestamp(json_extract_scalar(c_item, '$.statestarttimestamp')) as c_ts,
    json_extract_scalar(p_item, '$.contactid') as p_contactid,
    json_extract_scalar(p_item, '$.channel') as p_channel,
    json_extract_scalar(p_item, '$.initiationmethod') as p_direction,
    json_extract_scalar(p_item, '$.queue.name') as p_queue,
    json_extract_scalar(p_item, '$.state') as p_state,
    from_iso8601_timestamp(json_extract_scalar(p_item, '$.statestarttimestamp')) as p_ts
  FROM src2
)
SELECT
  username,
  c_channel as channel,
  c_direction as direction,
  p_state as prev_state,
  c_state as current_state,
  c_ts as current_ts,
  c_contactid as id
FROM dataset
WHERE c_contactid = p_contactid
ORDER BY id DESC, current_ts ASC
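The CROSS JOIN UNNEST (c, p) step pairs the current and previous contact arrays position by position, padding the shorter array with NULLs, and the final WHERE clause keeps only the pairs that refer to the same contact. A small Python sketch of that logic, using hypothetical contact data, might look like this:

```python
from itertools import zip_longest

def changed_contacts(current: list, previous: list) -> list:
    """Pair contacts by position and keep pairs that share a contact ID."""
    rows = []
    # zip_longest pads the shorter list with None, like UNNEST pads with NULL.
    for c_item, p_item in zip_longest(current, previous):
        if c_item is None or p_item is None:
            continue  # a NULL side can never satisfy c_contactid = p_contactid
        if c_item["contactid"] == p_item["contactid"]:
            rows.append({
                "id": c_item["contactid"],
                "prev_state": p_item["state"],
                "current_state": c_item["state"],
            })
    return rows

# Hypothetical snapshots for one agent event.
current = [{"contactid": "c-1", "state": "CONNECTED"},
           {"contactid": "c-2", "state": "INCOMING"}]
previous = [{"contactid": "c-1", "state": "INCOMING"}]
result = changed_contacts(current, previous)
print(result)
```

Because the pairing is positional, a contact that sits at a different index in the two snapshots is not matched. That is a property of the query as written rather than of the data.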

The query output looks similar to the following:

6. Analyze the data with Amazon Redshift Spectrum

With Amazon Redshift Spectrum, you can query data directly in S3 using your existing Amazon Redshift data warehouse cluster. Because the data is already in Parquet format, Redshift Spectrum gets the same great benefits that Athena does.

Here is a simple query to show querying the same data from Amazon Redshift. Note that to do this, you need to first create an external schema in Amazon Redshift that points to the AWS Glue Data Catalog.

SELECT
  json_extract_path_text(currentagentsnapshot, 'agentstatus','name') AS current_status,
  json_extract_path_text(currentagentsnapshot, 'configuration','firstname') AS current_firstname,
  json_extract_path_text(currentagentsnapshot, 'configuration','lastname') AS current_lastname,
  json_extract_path_text(currentagentsnapshot,
    'configuration','routingprofile','defaultoutboundqueue','name') AS current_outboundqueue
FROM default_schema.kfhconnectblog

The following shows the query output:


In this post, I showed you how to use Kinesis Data Firehose to ingest and convert data to columnar file format, enabling real-time analysis using Athena and Amazon Redshift. This great feature enables a level of optimization in both cost and performance that you need when storing and analyzing large amounts of data. This feature is equally important if you are investing in building data lakes on AWS.


Additional Reading

If you found this post useful, be sure to check out Analyzing VPC Flow Logs with Amazon Kinesis Firehose, Amazon Athena, and Amazon QuickSight and Work with partitioned data in AWS Glue.

About the Author

Roy Hasson is a Global Business Development Manager for AWS Analytics. He works with customers around the globe to design solutions to meet their data processing, analytics, and business intelligence needs. Roy is a big Manchester United fan who enjoys cheering his team on and hanging out with his family.




10 visualizations to try in Amazon QuickSight with sample data

Post Syndicated from Karthik Kumar Odapally original https://aws.amazon.com/blogs/big-data/10-visualizations-to-try-in-amazon-quicksight-with-sample-data/

If you’re not already familiar with building visualizations for quick access to business insights using Amazon QuickSight, consider this your introduction. In this post, we’ll walk through some common scenarios with sample datasets to provide an overview of how you can connect your data, perform advanced analysis, and access the results from any web browser or mobile device.

The following visualizations are built from the public datasets available in the links below. Before we jump into that, let’s take a look at the supported data sources, file formats and a typical QuickSight workflow to build any visualization.

Which data sources does Amazon QuickSight support?

At the time of publication, you can use the following data methods:

  • Connect to AWS data sources, including:
    • Amazon RDS
    • Amazon Aurora
    • Amazon Redshift
    • Amazon Athena
    • Amazon S3
  • Upload Excel spreadsheets or flat files (CSV, TSV, CLF, and ELF)
  • Connect to on-premises databases like Teradata, SQL Server, MySQL, and PostgreSQL
  • Import data from SaaS applications like Salesforce and Snowflake
  • Use big data processing engines like Spark and Presto

This list is constantly growing. For more information, see Supported Data Sources.

Answers in instants

SPICE is the Amazon QuickSight super-fast, parallel, in-memory calculation engine, designed specifically for ad hoc data visualization. SPICE stores your data in a system architected for high availability, where it is saved until you choose to delete it. Improve the performance of database datasets by importing the data into SPICE instead of using a direct database query. To calculate how much SPICE capacity your dataset needs, see Managing SPICE Capacity.

Typical Amazon QuickSight workflow

When you create an analysis, the typical workflow is as follows:

  1. Connect to a data source, and then create a new dataset or choose an existing dataset.
  2. (Optional) If you created a new dataset, prepare the data (for example, by changing field names or data types).
  3. Create a new analysis.
  4. Add a visual to the analysis by choosing the fields to visualize. Choose a specific visual type, or use AutoGraph and let Amazon QuickSight choose the most appropriate visual type, based on the number and data types of the fields that you select.
  5. (Optional) Modify the visual to meet your requirements (for example, by adding a filter or changing the visual type).
  6. (Optional) Add more visuals to the analysis.
  7. (Optional) Add scenes to the default story to provide a narrative about some aspect of the analysis data.
  8. (Optional) Publish the analysis as a dashboard to share insights with other users.

The following graphic illustrates a typical Amazon QuickSight workflow.

Visualizations created in Amazon QuickSight with sample datasets

Visualizations for a data analyst

Source:  https://data.worldbank.org/

Download and Resources:  https://datacatalog.worldbank.org/dataset/world-development-indicators

Data catalog:  The World Bank invests into multiple development projects at the national, regional, and global levels. It’s a great source of information for data analysts.

The following graph shows the percentage of the population that has access to electricity (rural and urban) during 2000 in Asia, Africa, the Middle East, and Latin America.

The following graph shows the share of healthcare costs that are paid out-of-pocket (private vs. public). You can also hover over the graph to get detailed statistics at a glance.

Visualizations for a trading analyst

Source:  Deutsche Börse Public Dataset (DBG PDS)

Download and resources:  https://aws.amazon.com/public-datasets/deutsche-boerse-pds/

Data catalog:  The DBG PDS project makes real-time data derived from Deutsche Börse’s trading market systems available to the public for free. This is the first time that such detailed financial market data has been shared freely and continually from the source provider.

The following graph shows the market trend of max trade volume for different EU banks. It builds on the data available on XETRA engines, which is made up of a variety of equities, funds, and derivative securities. This graph can be scrolled to visualize trade for a period of an hour or more.

The following graph shows the common stock beating the rest of the maximum trade volume over a period of time, grouped by security type.

Visualizations for a data scientist

Source:  https://catalog.data.gov/

Download and resources:  https://catalog.data.gov/dataset/road-weather-information-stations-788f8

Data catalog:  Data derived from different sensor stations placed on the city bridges and surface streets are a core information source. The road weather information station has a temperature sensor that measures the temperature of the street surface. It also has a sensor that measures the ambient air temperature at the station each second.

The following graph shows the present max air temperature in Seattle from different RWI station sensors.

The following graph shows the minimum temperature of the road surface at different times, which helps predict road conditions at a particular time of the year.

Visualizations for a data engineer

Source:  https://www.kaggle.com/

Download and resources:  https://www.kaggle.com/datasnaek/youtube-new/data

Data catalog:  Kaggle has come up with a platform where people can donate open datasets. Data engineers and other community members can have open access to these datasets and can contribute to the open data movement. They have more than 350 datasets in total, with more than 200 as featured datasets. It has a few interesting datasets on the platform that are not present at other places, and it’s a platform to connect with other data enthusiasts.

The following graph shows the trending YouTube videos and presents the max likes for the top 20 channels. This is one of the most popular datasets for data engineers.

The following graph shows the YouTube daily statistics for the max views of video titles published during a specific time period.

Visualizations for a business user

Source:  New York Taxi Data

Download and resources:  https://data.cityofnewyork.us/Transportation/2016-Green-Taxi-Trip-Data/hvrh-b6nb

Data catalog: NYC Open Data hosts some very popular open datasets for all New Yorkers. This platform lets you get involved and dive deep into the data to pull out some useful visualizations. The 2016 Green Taxi Trip dataset includes trip records from all trips completed in green taxis in NYC in 2016. Records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts.

The following graph presents the maximum fare amount grouped by passenger count over a period of time during a day. This can be expanded to cover different days of the month, based on the business need.

The following graph shows New York taxi data from January 2016, showing the dip in the number of taxi rides on January 23, 2016, across all types of taxis.

A quick search for that date and location shows you the following news report:


Using Amazon QuickSight, you can see patterns across time-series data by building visualizations, performing ad hoc analysis, and quickly generating insights. We hope you’ll give it a try today!


Additional Reading

If you found this post useful, be sure to check out Amazon QuickSight Adds Support for Combo Charts and Row-Level Security and Visualize AWS Cloudtrail Logs Using AWS Glue and Amazon QuickSight.

Karthik Odapally is a Sr. Solutions Architect in AWS. His passion is to build cost effective and highly scalable solutions on the cloud. In his spare time, he bakes cookies and cupcakes for family and friends here in the PNW. He loves vintage racing cars.




Pranabesh Mandal is a Solutions Architect in AWS. He has over a decade of IT experience. He is passionate about cloud technology and focuses on Analytics. In his spare time, he likes to hike and explore the beautiful nature and wild life of most divine national parks around the United States alongside his wife.





Easier way to control access to AWS regions using IAM policies

Post Syndicated from Sulay Shah original https://aws.amazon.com/blogs/security/easier-way-to-control-access-to-aws-regions-using-iam-policies/

We made it easier for you to comply with regulatory standards by controlling access to AWS Regions using IAM policies. For example, if your company requires users to create resources in a specific AWS region, you can now add a new condition to the IAM policies you attach to your IAM principal (user or role) to enforce this for all AWS services. In this post, I review conditions in policies, introduce the new condition, and review a policy example to demonstrate how you can control access across multiple AWS services to a specific region.

Condition concepts

Before I introduce the new condition, let’s review the condition element of an IAM policy. A condition is an optional IAM policy element that lets you specify special circumstances under which the policy grants or denies permission. A condition includes a condition key, operator, and value for the condition. There are two types of conditions: service-specific conditions and global conditions. Service-specific conditions are specific to certain actions in an AWS service. For example, the condition key ec2:InstanceType supports specific EC2 actions. Global conditions support all actions across all AWS services.

Now that I’ve reviewed the condition element in an IAM policy, let me introduce the new condition.

AWS:RequestedRegion condition key

The new global condition key, aws:RequestedRegion, supports all actions across all AWS services. You can use any string operator and specify any AWS region for its value.

| Condition key | Description | Operator(s) | Value |
| --- | --- | --- | --- |
| aws:RequestedRegion | Allows you to specify the region to which the IAM principal (user or role) can make API calls | All string operators (for example, StringEquals) | Any AWS region (for example, us-east-1) |

I’ll now demonstrate the use of the new global condition key.

Example: Policy with region-level control

Let’s say a group of software developers in my organization is working on a project using Amazon EC2 and Amazon RDS. The project requires a web server running on an EC2 instance using Amazon Linux and a MySQL database instance in RDS. The developers also want to test AWS Lambda, an event-driven platform, to retrieve data from the MySQL DB instance in RDS for future use.

My organization requires all the AWS resources to remain in the Frankfurt (eu-central-1) region. To make sure this project follows these guidelines, I create a single IAM policy for all the AWS services that this group is going to use and apply the new global condition key aws:RequestedRegion for all the services. This way I can ensure that any new EC2 instances launched or any database instances created using RDS are in Frankfurt. This policy also ensures that any Lambda functions this group creates for testing are also in the Frankfurt region.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [ ... ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [ ... ],
            "Resource": "*",
            "Condition": {"StringEquals": {"aws:RequestedRegion": "eu-central-1"}}
        },
        {
            "Effect": "Allow",
            "Action": ["iam:PassRole"],
            "Resource": "arn:aws:iam::account-id:role/*"
        }
    ]
}

The first statement in the above example contains all the read-only actions that let my developers use the console for EC2, RDS, and Lambda. The permissions for IAM-related actions are required to launch EC2 instances with a role, enable enhanced monitoring in RDS, and for AWS Lambda to assume the IAM execution role to execute the Lambda function. I’ve combined all the read-only actions into a single statement for simplicity. The second statement is where I give write access to my developers for the three services and restrict the write access to the Frankfurt region using the aws:RequestedRegion condition key. You can also list multiple AWS regions with the new condition key if your developers are allowed to create resources in multiple regions. The third statement grants permissions for the IAM action iam:PassRole required by AWS Lambda. For more information on allowing users to create a Lambda function, see Using Identity-Based Policies for AWS Lambda.
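To illustrate the multi-region variant mentioned above, here is a hedged Python sketch that builds a statement whose aws:RequestedRegion condition lists several regions, plus a toy check of how such a condition behaves. The helper names are made up for this example, the single action is only a placeholder, and the check is a simplification rather than the real IAM policy evaluation logic.

```python
import json

def region_restricted_statement(actions, regions):
    """Build an Allow statement limited to the given regions."""
    return {
        "Effect": "Allow",
        "Action": list(actions),
        "Resource": "*",
        "Condition": {"StringEquals": {"aws:RequestedRegion": list(regions)}},
    }

def region_condition_matches(statement, requested_region):
    """Toy check: a StringEquals with several values matches if any value is equal."""
    allowed = statement["Condition"]["StringEquals"]["aws:RequestedRegion"]
    if isinstance(allowed, str):
        allowed = [allowed]  # a single region may be given as a plain string
    return requested_region in allowed

statement = region_restricted_statement(
    ["ec2:RunInstances"],            # placeholder action for illustration
    ["eu-central-1", "eu-west-1"],
)
print(json.dumps(statement, indent=4))
print(region_condition_matches(statement, "eu-west-1"))
print(region_condition_matches(statement, "us-east-1"))
```

Listing the regions as an array inside one condition keeps the policy to a single statement instead of one statement per region.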


You can now use the aws:RequestedRegion global condition key in your IAM policies to specify the region to which the IAM principal (user or role) can invoke an API call. This capability makes it easier for you to restrict the AWS regions your IAM principals can use to comply with regulatory standards and improve account security. For more information about this global condition key and policy examples using aws:RequestedRegion, see the IAM documentation.

If you have comments about this post, submit them in the Comments section below. If you have questions about or suggestions for this solution, start a new thread on the IAM forum.

Want more AWS Security news? Follow us on Twitter.

[$] wait_var_event()

Post Syndicated from corbet original https://lwn.net/Articles/750774/rss

One of the trickiest aspects to concurrency in the kernel is waiting for a
specific event to take place. There is a wide variety of possible events,
including a process exiting, the last reference to a data structure going
away, a device completing an operation, or a timeout occurring.
Waiting is surprisingly hard to get right — race conditions abound to trap
the unwary — so the kernel has
accumulated a
large set of wait_event_*() macros to make the task easier. An
attempt to add a new one, though, has led to the generalization of specific
types of waits for 4.17.

Facebook and Cambridge Analytica

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/03/facebook_and_ca.html

In the wake of the Cambridge Analytica scandal, news articles and commentators have focused on what Facebook knows about us. A lot, it turns out. It collects data from our posts, our likes, our photos, things we type and delete without posting, and things we do while not on Facebook and even when we’re offline. It buys data about us from others. And it can infer even more: our sexual orientation, political beliefs, relationship status, drug use, and other personality traits — even if we didn’t take the personality test that Cambridge Analytica developed.

But for every article about Facebook’s creepy stalker behavior, thousands of other companies are breathing a collective sigh of relief that it’s Facebook and not them in the spotlight. Because while Facebook is one of the biggest players in this space, there are thousands of other companies that spy on and manipulate us for profit.

Harvard Business School professor Shoshana Zuboff calls it “surveillance capitalism.” And as creepy as Facebook is turning out to be, the entire industry is far creepier. It has existed in secret far too long, and it’s up to lawmakers to force these companies into the public spotlight, where we can all decide if this is how we want society to operate and — if not — what to do about it.

There are 2,500 to 4,000 data brokers in the United States whose business is buying and selling our personal data. Last year, Equifax was in the news when hackers stole personal information on 150 million people, including Social Security numbers, birth dates, addresses, and driver’s license numbers.

You certainly didn’t give it permission to collect any of that information. Equifax is one of those thousands of data brokers, most of them you’ve never heard of, selling your personal information without your knowledge or consent to pretty much anyone who will pay for it.

Surveillance capitalism takes this one step further. Companies like Facebook and Google offer you free services in exchange for your data. Google’s surveillance isn’t in the news, but it’s startlingly intimate. We never lie to our search engines. Our interests and curiosities, hopes and fears, desires and sexual proclivities, are all collected and saved. Add to that the websites we visit that Google tracks through its advertising network, our Gmail accounts, our movements via Google Maps, and what it can collect from our smartphones.

That phone is probably the most intimate surveillance device ever invented. It tracks our location continuously, so it knows where we live, where we work, and where we spend our time. It’s the first and last thing we check in a day, so it knows when we wake up and when we go to sleep. We all have one, so it knows who we sleep with. Uber used just some of that information to detect one-night stands; your smartphone provider and any app you allow to collect location data knows a lot more.

Surveillance capitalism drives much of the internet. It’s behind most of the “free” services, and many of the paid ones as well. Its goal is psychological manipulation, in the form of personalized advertising to persuade you to buy something or do something, like vote for a candidate. And while the individualized profile-driven manipulation exposed by Cambridge Analytica feels abhorrent, it’s really no different from what every company wants in the end. This is why all your personal information is collected, and this is why it is so valuable. Companies that can understand it can use it against you.

None of this is new. The media has been reporting on surveillance capitalism for years. In 2015, I wrote a book about it. Back in 2010, the Wall Street Journal published an award-winning two-year series about how people are tracked both online and offline, titled “What They Know.”

Surveillance capitalism is deeply embedded in our increasingly computerized society, and if the extent of it came to light there would be broad demands for limits and regulation. But because this industry can largely operate in secret, only occasionally exposed after a data breach or investigative report, we remain mostly ignorant of its reach.

This might change soon. In 2016, the European Union passed the comprehensive General Data Protection Regulation, or GDPR. The details of the law are far too complex to explain here, but some of the things it mandates are that personal data of EU citizens can only be collected and saved for “specific, explicit, and legitimate purposes,” and only with explicit consent of the user. Consent can’t be buried in the terms and conditions, nor can it be assumed unless the user opts in. This law will take effect in May, and companies worldwide are bracing for its enforcement.

Because pretty much all surveillance capitalism companies collect data on Europeans, this will expose the industry like nothing else. Here’s just one example. In preparation for this law, PayPal quietly published a list of over 600 companies it might share your personal data with. What will it be like when every company has to publish this sort of information, and explicitly explain how it’s using your personal data? We’re about to find out.

In the wake of this scandal, even Mark Zuckerberg said that his industry probably should be regulated, although he’s certainly not wishing for the sorts of comprehensive regulation the GDPR is bringing to Europe.

He’s right. Surveillance capitalism has operated without constraints for far too long. And advances in both big data analysis and artificial intelligence will make tomorrow’s applications far creepier than today’s. Regulation is the only answer.

The first step to any regulation is transparency. Who has our data? Is it accurate? What are they doing with it? Who are they selling it to? How are they securing it? Can we delete it? I don’t see any hope of Congress passing a GDPR-like data protection law anytime soon, but it’s not too far-fetched to demand laws requiring these companies to be more transparent in what they’re doing.

One of the responses to the Cambridge Analytica scandal is that people are deleting their Facebook accounts. It’s hard to do right, and doesn’t do anything about the data that Facebook collects about people who don’t use Facebook. But it’s a start. The market can put pressure on these companies to reduce their spying on us, but it can only do that if we force the industry out of its secret shadows.

This essay previously appeared on CNN.com.

EDITED TO ADD (4/2): Slashdot thread.

Performing Unit Testing in an AWS CodeStar Project

Post Syndicated from Jerry Mathen Jacob original https://aws.amazon.com/blogs/devops/performing-unit-testing-in-an-aws-codestar-project/

In this blog post, I will show how you can perform unit testing as a part of your AWS CodeStar project. AWS CodeStar helps you quickly develop, build, and deploy applications on AWS. With AWS CodeStar, you can set up your continuous delivery (CD) toolchain and manage your software development from one place.

Because unit testing tests individual units of application code, it is helpful for quickly identifying and isolating issues. As a part of an automated CI/CD process, it can also be used to prevent bad code from being deployed into production.

Many of the AWS CodeStar project templates come preconfigured with a unit testing framework so that you can start deploying your code with more confidence. The unit testing is configured to run in the provided build stage so that, if the unit tests do not pass, the code is not deployed. For a list of AWS CodeStar project templates that include unit testing, see AWS CodeStar Project Templates in the AWS CodeStar User Guide.

The scenario

As a big fan of superhero movies, I decided to list my favorites and ask my friends to vote on theirs by using a WebService endpoint I created. The example I use is a Python web service running on AWS Lambda with AWS CodeCommit as the code repository. CodeCommit is a fully managed source control system that hosts Git repositories and works with all Git-based tools.

Here’s how you can create the WebService endpoint:

Sign in to the AWS CodeStar console. Choose Start a project, which will take you to the list of project templates.

create project

For code edits I will choose AWS Cloud9, which is a cloud-based integrated development environment (IDE) that you use to write, run, and debug code.

choose cloud9

Here are the other tasks required by my scenario:

  • Create a database table where the votes can be stored and retrieved as needed.
  • Update the logic in the Lambda function that was created for posting and getting the votes.
  • Update the unit tests (of course!) to verify that the logic works as expected.

For a database table, I’ve chosen Amazon DynamoDB, which offers a fast and flexible NoSQL database.

Getting set up on AWS Cloud9

From the AWS CodeStar console, go to the AWS Cloud9 console, which should take you to your project code. I will open up a terminal at the top-level folder under which I will set up my environment and required libraries.

Use the following command to set the PYTHONPATH environment variable on the terminal.

export PYTHONPATH=/home/ec2-user/environment/vote-your-movie

You should now be able to use the following command to execute the unit tests in your project.

python -m unittest discover vote-your-movie/tests

cloud9 setup

Start coding

Now that you have set up your local environment and have a copy of your code, add a DynamoDB table to the project by defining it through a template file. Open template.yml, which is the Serverless Application Model (SAM) template file. This template extends AWS CloudFormation to provide a simplified way of defining the Amazon API Gateway APIs, AWS Lambda functions, and Amazon DynamoDB tables required by your serverless application.

AWSTemplateFormatVersion: 2010-09-09
Transform:
- AWS::Serverless-2016-10-31
- AWS::CodeStar

Parameters:
  ProjectId:
    Type: String
    Description: CodeStar projectId used to associate new resources to team members

Resources:
  # The DB table to store the votes.
  MovieVoteTable:
    Type: AWS::Serverless::SimpleTable
    Properties:
      PrimaryKey:
        # Name of the "Candidate" is the partition key of the table.
        Name: Candidate
        Type: String
  # Creating a new lambda function for retrieving and storing votes.
  # (The logical names of the function and its two events are assumed.)
  MovieVoteLambda:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: python3.6
      Environment:
        # Setting environment variables for your lambda function.
        Variables:
          TABLE_NAME: !Ref "MovieVoteTable"
          TABLE_REGION: !Ref "AWS::Region"
      Role:
        Fn::ImportValue:
          !Join ['-', [!Ref 'ProjectId', !Ref 'AWS::Region', 'LambdaTrustRole']]
      Events:
        GetEvent:
          Type: Api
          Properties:
            Path: /
            Method: get
        PostEvent:
          Type: Api
          Properties:
            Path: /
            Method: post

We’ll use Python’s boto3 library to connect to AWS services. And we’ll use Python’s mock library to mock AWS service calls for our unit tests.
Use the following command to install these libraries:

pip install --upgrade boto3 mock -t .

install dependencies

Add these libraries to the buildspec.yml, which is the YAML file that is required for CodeBuild to execute.

version: 0.2

phases:
  install:
    commands:
      # Upgrade AWS CLI to the latest version
      - pip install --upgrade awscli boto3 mock

  pre_build:
    commands:
      # Discover and run unit tests in the 'tests' directory. For more information, see <https://docs.python.org/3/library/unittest.html#test-discovery>
      - python -m unittest discover tests

  build:
    commands:
      # Use AWS SAM to package the application by using AWS CloudFormation
      - aws cloudformation package --template template.yml --s3-bucket $S3_BUCKET --output-template template-export.yml

artifacts:
  type: zip
  files:
    - template-export.yml

Open the index.py where we can write the simple voting logic for our Lambda function.

import json
import datetime
import boto3
import os

table_name = os.environ['TABLE_NAME']
table_region = os.environ['TABLE_REGION']

VOTES_TABLE = boto3.resource('dynamodb', region_name=table_region).Table(table_name)
CANDIDATES = {"A": "Black Panther", "B": "Captain America: Civil War", "C": "Guardians of the Galaxy", "D": "Thor: Ragnarok"}

def handler(event, context):
    if event['httpMethod'] == 'GET':
        resp = VOTES_TABLE.scan()
        return {'statusCode': 200,
                'body': json.dumps({item['Candidate']: int(item['Votes']) for item in resp['Items']}),
                'headers': {'Content-Type': 'application/json'}}

    elif event['httpMethod'] == 'POST':
        # Reject request bodies that are missing or are not valid JSON.
        try:
            body = json.loads(event['body'])
        except (TypeError, ValueError):
            return {'statusCode': 400,
                    'body': 'Invalid input! Expecting a JSON.',
                    'headers': {'Content-Type': 'application/json'}}
        if 'candidate' not in body:
            return {'statusCode': 400,
                    'body': 'Missing "candidate" in request.',
                    'headers': {'Content-Type': 'application/json'}}
        if body['candidate'] not in CANDIDATES.keys():
            return {'statusCode': 400,
                    'body': 'You must vote for one of the following candidates - {}.'.format(get_allowed_candidates()),
                    'headers': {'Content-Type': 'application/json'}}

        resp = VOTES_TABLE.update_item(
            Key={'Candidate': CANDIDATES.get(body['candidate'])},
            UpdateExpression='ADD Votes :incr',
            ExpressionAttributeValues={':incr': 1},
            # Return the updated item so the new vote total can be reported.
            ReturnValues='ALL_NEW'
        )
        return {'statusCode': 200,
                'body': "{} now has {} votes".format(CANDIDATES.get(body['candidate']), resp['Attributes']['Votes']),
                'headers': {'Content-Type': 'application/json'}}

def get_allowed_candidates():
    l = []
    for key in CANDIDATES:
        l.append("'{}' for '{}'".format(key, CANDIDATES.get(key)))
    return ", ".join(l)

Our code takes the incoming HTTP request as an event. If it is a GET request, it returns the current vote totals from the table. If it is a POST request, it records a vote for the chosen candidate. We also validate the input in the POST request and reject malformed or unknown values with a 400 response, so only valid votes are stored in the table.

In the example code provided, we use a CANDIDATES variable to store our candidates, but you can store the candidates in a JSON file and use Python’s json library instead.
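A minimal sketch of that alternative follows. It assumes a hypothetical candidates.json file that maps each vote key to a movie title; the file name and the load_candidates helper are illustrative and not part of the CodeStar project template.

```python
import json


def load_candidates(path):
    """Read the vote-key -> movie-title mapping from a JSON file."""
    with open(path) as f:
        candidates = json.load(f)
    # Fail early if the file is not a flat object of string keys and titles.
    if not isinstance(candidates, dict) or not all(
        isinstance(key, str) and isinstance(title, str)
        for key, title in candidates.items()
    ):
        raise ValueError("candidates file must map string keys to string titles")
    return candidates


# In index.py this could replace the hard-coded dictionary, for example:
# CANDIDATES = load_candidates("candidates.json")  # hypothetical file name
```

Loading the file once at module level, as the commented line suggests, keeps the lookup out of the per-request path.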

Let’s update the tests now. Under the tests folder, open the test_handler.py and modify it to verify the logic.

import os
# Some mock environment variables that would be used by the mock for DynamoDB
os.environ['TABLE_NAME'] = "MockHelloWorldTable"
os.environ['TABLE_REGION'] = "us-east-1"

# The library containing our logic.
import index

# Boto3's core library
import botocore
# For handling JSON.
import json
# Unit test library
import unittest
## Getting StringIO based on your setup.
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO
## Python mock library
from mock import patch, call
from decimal import Decimal

class TestCandidateVotes(unittest.TestCase):

    ## Test the HTTP GET request flow. 
    ## We expect to get back a successful response with results of votes from the table (mocked).
    @patch('botocore.client.BaseClient._make_api_call')
    def test_get_votes(self, boto_mock):
        # Input event to our method to test.
        expected_event = {'httpMethod': 'GET'}
        # The mocked values in our DynamoDB table.
        items_in_db = [{'Candidate': 'Black Panther', 'Votes': Decimal('3')},
                        {'Candidate': 'Captain America: Civil War', 'Votes': Decimal('8')},
                        {'Candidate': 'Guardians of the Galaxy', 'Votes': Decimal('8')},
                        {'Candidate': "Thor: Ragnarok", 'Votes': Decimal('1')}
                    ]
        # The mocked DynamoDB response.
        expected_ddb_response = {'Items': items_in_db}
        # The mocked response we expect back by calling DynamoDB through boto.
        response_body = botocore.response.StreamingBody(StringIO(str(expected_ddb_response)),
                                                        len(str(expected_ddb_response)))
        # Setting the expected value in the mock.
        boto_mock.side_effect = [expected_ddb_response]
        # Expecting that there would be a call to DynamoDB Scan function during execution with these parameters.
        expected_calls = [call('Scan', {'TableName': os.environ['TABLE_NAME']})]

        # Call the function to test.
        result = index.handler(expected_event, {})

        # Run unit test assertions to verify the expected calls to mock have occurred and verify the response.
        assert result.get('headers').get('Content-Type') == 'application/json'
        assert result.get('statusCode') == 200

        result_body = json.loads(result.get('body'))
        # Verifying that the results match to that from the table.
        assert len(result_body) == len(items_in_db)
        for i in range(len(result_body)):
            assert result_body.get(items_in_db[i].get("Candidate")) == int(items_in_db[i].get("Votes"))

        assert boto_mock.call_count == 1

    ## Test the HTTP POST request flow that places a vote for a selected candidate.
    ## We expect to get back a successful response with a confirmation message.
    @patch('botocore.client.BaseClient._make_api_call')
    def test_place_valid_candidate_vote(self, boto_mock):
        # Input event to our method to test.
        expected_event = {'httpMethod': 'POST', 'body': "{\"candidate\": \"D\"}"}
        # The mocked response in our DynamoDB table.
        expected_ddb_response = {'Attributes': {'Candidate': "Thor: Ragnarok", 'Votes': Decimal('2')}}
        # The mocked response we expect back by calling DynamoDB through boto.
        response_body = botocore.response.StreamingBody(StringIO(str(expected_ddb_response)),
                                                        len(str(expected_ddb_response)))
        # Setting the expected value in the mock.
        boto_mock.side_effect = [expected_ddb_response]
        # Expecting that there would be a call to DynamoDB UpdateItem function during execution with these parameters.
        expected_calls = [call('UpdateItem', {
                                                'TableName': os.environ['TABLE_NAME'], 
                                                'Key': {'Candidate': 'Thor: Ragnarok'},
                                                'UpdateExpression': 'ADD Votes :incr',
                                                'ExpressionAttributeValues': {':incr': 1},
                                                'ReturnValues': 'ALL_NEW'
                                            })]
        # Call the function to test.
        result = index.handler(expected_event, {})
        # Run unit test assertions to verify the expected calls to mock have occurred and verify the response.
        assert result.get('headers').get('Content-Type') == 'application/json'
        assert result.get('statusCode') == 200

        assert result.get('body') == "{} now has {} votes".format(
            expected_ddb_response['Attributes']['Candidate'],
            expected_ddb_response['Attributes']['Votes'])

        assert boto_mock.call_count == 1

    ## Test the HTTP POST request flow that places a vote for a non-existent candidate.
    ## We expect to get back a failed (400) response with an appropriate error message.
    @patch('botocore.client.BaseClient._make_api_call')
    def test_place_invalid_candidate_vote(self, boto_mock):
        # Input event to our method to test.
        # The valid IDs for the candidates are A, B, C, and D
        expected_event = {'httpMethod': 'POST', 'body': "{\"candidate\": \"E\"}"}
        # Call the function to test.
        result = index.handler(expected_event, {})
        # Run unit test assertions to verify the expected calls to mock have occurred and verify the response.
        assert result.get('headers').get('Content-Type') == 'application/json'
        assert result.get('statusCode') == 400
        assert result.get('body') == 'You must vote for one of the following candidates - {}.'.format(index.get_allowed_candidates())

    ## Test the HTTP POST request flow that places a vote for a selected candidate but associated with an invalid key in the POST body.
    ## We expect to get back a failed (400) response with an appropriate error message.
    @patch('botocore.client.BaseClient._make_api_call')
    def test_place_invalid_data_vote(self, boto_mock):
        # Input event to our method to test.
        # "name" is not the expected input key.
        expected_event = {'httpMethod': 'POST', 'body': "{\"name\": \"D\"}"}
        # Call the function to test.
        result = index.handler(expected_event, {})
        # Run unit test assertions to verify the expected calls to mock have occurred and verify the response.
        assert result.get('headers').get('Content-Type') == 'application/json'
        assert result.get('statusCode') == 400
        assert result.get('body') == 'Missing "candidate" in request.'

    ## Test the HTTP POST request flow that places a vote for a selected candidate but not as a JSON string which the body of the request expects.
    ## We expect to get back a failed (400) response with an appropriate error message.
    @patch('botocore.client.BaseClient._make_api_call')
    def test_place_malformed_json_vote(self, boto_mock):
        # Input event to our method to test.
        # "body" receives a string rather than a JSON string.
        expected_event = {'httpMethod': 'POST', 'body': "Thor: Ragnarok"}
        # Call the function to test.
        result = index.handler(expected_event, {})
        # Run unit test assertions to verify the expected calls to mock have occurred and verify the response.
        assert result.get('headers').get('Content-Type') == 'application/json'
        assert result.get('statusCode') == 400
        assert result.get('body') == 'Invalid input! Expecting a JSON.'

if __name__ == '__main__':
    unittest.main()

I am keeping the code samples well commented so that it’s clear what each unit test accomplishes. It tests the success conditions and the failure paths that are handled in the logic.

In my unit tests I use the patch decorator (@patch) in the mock library. @patch helps mock the function you want to call (in this case, the botocore library’s _make_api_call function in the BaseClient class).
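To make the mechanism concrete, here is a small self-contained sketch of the same idea. It uses unittest.mock from the Python 3 standard library, which offers the same patch API as the mock package, and a stand-in FakeClient class rather than botocore itself; FakeClient, count_items, and run_patched are hypothetical names used only for illustration.

```python
from unittest.mock import patch


class FakeClient:
    """Stand-in for a service client; this is not a real botocore class."""

    def _make_api_call(self, operation_name, api_params):
        # The real method would go out to the network.
        raise RuntimeError("would reach the network")

    def scan(self, **params):
        return self._make_api_call("Scan", params)


def count_items(client, table_name):
    """Code under test: count the items returned by a scan."""
    return len(client.scan(TableName=table_name)["Items"])


@patch.object(FakeClient, "_make_api_call")
def run_patched(api_mock):
    # While this function runs, the method is replaced by a mock object.
    api_mock.return_value = {"Items": [{"Candidate": "Thor: Ragnarok"}]}
    result = count_items(FakeClient(), "MockTable")
    # The mock records how it was called, so the call can be verified.
    api_mock.assert_called_once_with("Scan", {"TableName": "MockTable"})
    return result
```

Once run_patched returns, the original method is restored, so the patch cannot leak into other tests.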
Before we commit our changes, let’s run the tests locally. On the terminal, run the tests again. If all the unit tests pass, you should expect to see a result like this:

You:~/environment $ python -m unittest discover vote-your-movie/tests
.....
----------------------------------------------------------------------
Ran 5 tests in 0.003s

OK
You:~/environment $

Upload to AWS

Now that the tests have passed, it’s time to commit and push the code to the source repository!

Add your changes

From the terminal, go to the project’s folder and use the following command to verify the changes you are about to push.

git status

To add the modified files only, use the following command:

git add -u

Commit your changes

To commit the changes (with a message), use the following command:

git commit -m "Logic and tests for the voting webservice."

Push your changes to AWS CodeCommit

To push your committed changes to CodeCommit, use the following command:

git push

In the AWS CodeStar console, you can see your changes flowing through the pipeline and being deployed. There are also links in the AWS CodeStar console that take you to this project’s build runs so you can see your tests running on AWS CodeBuild. The latest link under the Build Runs table takes you to the logs.

unit tests at codebuild

After the deployment is complete, AWS CodeStar should now display the AWS Lambda function and DynamoDB table created and synced with this project. The Project link in the AWS CodeStar project’s navigation bar displays the AWS resources linked to this project.

codestar resources

Because this is a new database table, there should be no data in it. So, let’s put in some votes. You can download Postman to test your application endpoint for POST and GET calls. The endpoint you want to test is the URL displayed under Application endpoints in the AWS CodeStar console.

Now let’s open Postman and look at the results. Let’s create some votes through POST requests. Based on this example, a valid vote has a value of A, B, C, or D.
Here’s what a successful POST request looks like:

POST success

Here’s what it looks like if I use some value other than A, B, C, or D:



Now I am going to use a GET request to fetch the results of the votes from the database.

GET success
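The same POST and GET calls can also be scripted instead of being sent from Postman. The sketch below uses only the Python standard library; the ENDPOINT value is a placeholder to be replaced with the URL shown under Application endpoints, and the helper names are illustrative.

```python
import json
import urllib.request

# Placeholder; substitute the URL shown under Application endpoints.
ENDPOINT = "https://example.execute-api.us-east-1.amazonaws.com/Prod/"


def build_vote_request(endpoint, candidate_key):
    """Build (but do not send) a POST request that votes for one candidate."""
    payload = json.dumps({"candidate": candidate_key}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_results_request(endpoint):
    """Build (but do not send) a GET request that fetches the vote totals."""
    return urllib.request.Request(endpoint, method="GET")


def send(request):
    """Send a request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Example (requires a deployed endpoint):
# print(send(build_vote_request(ENDPOINT, "D")))
# print(send(build_results_request(ENDPOINT)))
```

Note that urlopen raises urllib.error.HTTPError for a 400 response, so an invalid vote surfaces as an exception rather than as a returned body.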

And that’s it! You have now created a simple voting web service using AWS Lambda, Amazon API Gateway, and DynamoDB and used unit tests to verify your logic so that you ship good code.
Happy coding!

New – Amazon DynamoDB Continuous Backups and Point-In-Time Recovery (PITR)

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/new-amazon-dynamodb-continuous-backups-and-point-in-time-recovery-pitr/

The Amazon DynamoDB team is back with another useful feature hot on the heels of encryption at rest. At AWS re:Invent 2017 we launched global tables and on-demand backup and restore of your DynamoDB tables and today we’re launching continuous backups with point-in-time recovery (PITR).

You can enable continuous backups with a single click in the AWS Management Console, a simple API call, or with the AWS Command Line Interface (CLI). DynamoDB can back up your data with per-second granularity and restore to any single second from the time PITR was enabled up to the prior 35 days. We built this feature to protect against accidental writes or deletes. If a developer runs a script against production instead of staging or if someone fat-fingers a DeleteItem call, PITR has you covered. We also built it for the scenarios you can’t normally predict. You can still keep your on-demand backups for as long as needed for archival purposes but PITR works as additional insurance against accidental loss of data. Let’s see how this works.
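As a rough illustration of that restore window, the sketch below computes the earliest point a table could be restored to, given when PITR was enabled. It only encodes the 35-day figure stated above; the function name is hypothetical, and the values DynamoDB actually reports may differ slightly.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=35)  # retention period stated in the post


def earliest_restore_time(pitr_enabled_at, now):
    """Return the oldest point in time the table could be restored to."""
    # The window starts when PITR was enabled, but never reaches back
    # further than the retention period.
    return max(pitr_enabled_at, now - RETENTION)


now = datetime(2018, 4, 1, tzinfo=timezone.utc)  # arbitrary example date
print(earliest_restore_time(now - timedelta(days=10), now))
print(earliest_restore_time(now - timedelta(days=60), now))
```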

Continuous Backup

To enable this feature in the console we navigate to our table and select the Backups tab. From there simply click Enable to turn on the feature. I could also turn on continuous backups via the UpdateContinuousBackups API call.
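For reference, a minimal sketch of that API call through Boto 3 might look like the following. The enable_pitr wrapper is a hypothetical helper; it takes the client as a parameter so it can be exercised without AWS credentials.

```python
def enable_pitr(client, table_name):
    """Turn on point-in-time recovery for one table."""
    return client.update_continuous_backups(
        TableName=table_name,
        PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
    )


# Example (requires boto3 and AWS credentials):
# import boto3
# enable_pitr(boto3.client("dynamodb"), "VerySuperImportantTable")
```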

After continuous backup is enabled we should be able to see an Earliest restore date and Latest restore date.

Let’s imagine a scenario where I have a lot of old user profiles that I want to delete.

I really only want to send service updates to our active users based on their last_update date. I decided to write a quick Python script to delete all the users that haven’t used my service in a while.

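import boto3
table = boto3.resource("dynamodb").Table("VerySuperImportantTable")
items = table.scan(
    FilterExpression="last_update >= :date",
    ExpressionAttributeValues={":date": "2014-01-01T00:00:00"},
    # Fetch only the key attribute; "user_id" is an assumed key name.
    ProjectionExpression="user_id",
)["Items"]
print("Deleting {} Items! Dangerous.".format(len(items)))
with table.batch_writer() as batch:
    for item in items:
        # Each projected item holds only the key, so it can be passed as Key.
        batch.delete_item(Key=item)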
Great! This should delete all those pesky non-users of my service that haven’t logged in since 2013. So,— CTRL+C CTRL+C CTRL+C CTRL+C (interrupt the currently executing command).

Yikes! Do you see where I went wrong? I’ve just deleted my most important users! Oh, no! Where I had a greater-than sign, I meant to put a less-than! Quick, before Jeff Barr can see, I’m going to restore the table. (I probably could have prevented that typo with Boto 3’s handy DynamoDB conditions: Attr("last_update").lt("2014-01-01T00:00:00"))
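One cheap safeguard is a dry run over a handful of sample records before deleting anything. The sketch below is plain Python with made-up sample data and a hypothetical select_inactive helper; it relies on the fact that ISO 8601 timestamps in the same format sort lexicographically, so a string comparison orders them by time.

```python
CUTOFF = "2014-01-01T00:00:00"


def select_inactive(items, cutoff=CUTOFF):
    """Return the items whose last_update is strictly before the cutoff."""
    return [item for item in items if item["last_update"] < cutoff]


# Made-up sample records, used only to check the direction of the comparison.
sample = [
    {"user_id": "old-user", "last_update": "2013-06-30T12:00:00"},
    {"user_id": "active-user", "last_update": "2018-03-01T09:30:00"},
]
print([item["user_id"] for item in select_inactive(sample)])
```

If the active user shows up in the printed list, the comparison is pointing the wrong way.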


Luckily for me, restoring a table is easy. In the console I’ll navigate to the Backups tab for my table and click Restore to point-in-time.

I’ll specify the time (a few seconds before I started my deleting spree) and a name for the table I’m restoring to.
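The same restore can be requested through the API. Here is a minimal sketch using the Boto 3 client's restore_table_to_point_in_time call; the wrapper function, the target table name in the usage comment, and the example timestamp are all hypothetical.

```python
from datetime import datetime, timezone

# Arbitrary example timestamp; use a moment just before the bad write.
EXAMPLE_RESTORE_TIME = datetime(2018, 3, 26, 12, 0, 0, tzinfo=timezone.utc)


def restore_to_point_in_time(client, source_table, target_table, restore_time):
    """Restore source_table into a new table as it was at restore_time."""
    # A naive datetime would leave the intended instant ambiguous.
    if restore_time.tzinfo is None:
        raise ValueError("restore_time must be timezone-aware")
    return client.restore_table_to_point_in_time(
        SourceTableName=source_table,
        TargetTableName=target_table,
        RestoreDateTime=restore_time,
    )


# Example (requires boto3 and AWS credentials):
# import boto3
# restore_to_point_in_time(boto3.client("dynamodb"), "VerySuperImportantTable",
#                          "VerySuperImportantTable-restored", EXAMPLE_RESTORE_TIME)
```

The restore always creates a new table, so the application has to be pointed at the target table afterwards.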

For a relatively small and evenly distributed table like mine, the restore is quite fast.

The time it takes to restore a table varies based on multiple factors, and restore times are not necessarily correlated with the size of the table. If your dataset is evenly distributed across your primary keys, you’ll be able to take advantage of parallelization, which will speed up your restores.

Learn More & Try It Yourself
There’s plenty more to learn about this new feature in the documentation here.

Pricing for continuous backups varies by region and is based on the current size of the table and all indexes.

A few things to note:

  • PITR works with encrypted tables.
  • If you disable PITR and later reenable it, you reset the start time from which you can recover.
  • Just like on-demand backups, there are no performance or availability impacts to enabling this feature.
  • Stream settings, Time To Live settings, PITR settings, tags, Amazon CloudWatch alarms, and auto scaling policies are not copied to the restored table.
  • Jeff, it turns out, knew I restored the table all along because every PITR API call is recorded in AWS CloudTrail.

Let us know how you’re going to use continuous backups and PITR on Twitter and in the comments.

Six more companies adopt GPLv3 termination language

Post Syndicated from corbet original https://lwn.net/Articles/749758/rss

Red Hat has announced
that six more companies (CA Technologies, Cisco, HPE, Microsoft, SAP, and
SUSE) have agreed to apply the GPLv3 termination conditions (wherein a
violator’s license is automatically restored if the problem is fixed in a
timely manner) to GPLv2-licensed code. “GPL version 3 (GPLv3)
introduced an approach to termination that offers distributors of the code
an opportunity to correct errors and mistakes in license compliance. This
approach allows for enforcement of license compliance consistent with a
community in which heavy-handed approaches to enforcement, including for
financial gain, are out of place.