Enabling Amazon QuickSight federation with Azure AD

Post Syndicated from Adnan Hasan original https://aws.amazon.com/blogs/big-data/enabling-amazon-quicksight-federation-with-azure-ad/

Customers today want to establish a single identity and access strategy across all of their apps, whether on-premises apps, third-party cloud apps (SaaS), or apps in AWS. If your organization uses Azure Active Directory (Azure AD) for cloud applications, you can enable single sign-on (SSO) for applications like Amazon QuickSight without needing to create another user account or remember passwords. You can also enable role-based access control to make sure users get appropriate role permissions in QuickSight based on their entitlement stored in Active Directory attributes or granted through Active Directory group membership. The setup also allows administrators to focus on managing a single source of truth for user identities in Azure AD while having the convenience of configuring access to other AWS accounts and apps centrally.

In this post, we walk through the steps required to configure federated SSO between QuickSight and Azure AD. We also demonstrate ways to assign a QuickSight role based on Azure AD group membership. Administrators can publish the QuickSight app in the Azure App portal to enable users to SSO to QuickSight using their Azure AD or Active Directory credentials.

The solution in this post uses identity provider (IdP)-initiated SSO, which means your end users must log in to Azure AD and choose the published QuickSight app in the Azure Apps portal to sign in to QuickSight.

Registering a QuickSight application in Azure AD

Your first step is to create a QuickSight application in Azure AD.

  1. Log in to your Azure portal using the administrator account in the Azure AD tenant where you want to register the QuickSight application.
  2. Under Azure Services, open Azure Active Directory and under Manage, choose Enterprise Application.
  3. Choose New Application.
  4. Select Non-gallery application.
  5. For Name, enter Amazon QuickSight.
  6. Choose Add to register the application.

Creating users and groups in Azure AD

You can now create new users and groups or choose existing users and groups that can access QuickSight.

  1. Under Manage, choose All applications and open Amazon QuickSight.
  2. Under Getting Started, choose Assign users and groups.
  3. For this post, you create three groups, one for each QuickSight role:
    1. QuickSight-Admin
    2. QuickSight-Author
    3. QuickSight-Reader

For instructions on creating groups in Azure AD, see Create a basic group and add members using Azure Active Directory.

Configuring SSO in Azure AD

You can now start configuring the SSO settings for the app.

  1. Under Manage, choose Single sign-on.
  2. For Select a single sign-on method, choose SAML.
  3. To configure the sections, choose Edit.
  4. In the Basic SAML Configuration section, for Identifier (Entity ID), enter URN:AMAZON:WEBSERVICES.

This is the entity ID passed during the SAML exchange. Azure requires that this value be unique for each application. For additional AWS applications, you can append a number to the string; for example, URN:AMAZON:WEBSERVICES2.

  5. For Reply URL, enter https://signin.aws.amazon.com/saml.
  6. Leave Sign on URL blank.
  7. For Relay State, enter https://quicksight.aws.amazon.com.
  8. Leave Logout Url blank.
  9. Under SAML Signing Certificate, choose Download next to Federation Metadata XML.

You use this XML document later when setting up the SAML provider in AWS Identity and Access Management (IAM).

  10. Leave this tab open in your browser while moving on to the next steps.

Creating Azure AD as your SAML IdP in AWS

You now configure Azure AD as your SAML IdP.

  1. Open a new tab in your browser.
  2. Log in to the IAM console in your AWS account with admin permissions.
  3. On the IAM console, choose Identity providers.
  4. Choose Create provider.
  5. For Provider name, enter AzureActiveDirectory.
  6. Choose Choose File to upload the metadata document you downloaded earlier.
  7. Choose Next Step.
  8. Verify the provider information and choose Create.
  9. On the summary page, record the value for the provider ARN (arn:aws:iam::<AccountID>:saml-provider/AzureActiveDirectory).

You need this ARN to configure claims rules later in this post.

You can also complete this configuration using the AWS Command Line Interface (AWS CLI).

Configuring IAM policies

In this step, you create three IAM policies for different role permissions in QuickSight:

  • QuickSight-Federated-Admin
  • QuickSight-Federated-Author
  • QuickSight-Federated-Reader

Use the following steps to set up the QuickSight-Federated-Admin policy. This policy grants admin privileges in QuickSight to the federated user:

  1. On the IAM console, choose Policies.
  2. Choose Create Policy.
  3. Choose JSON and replace the existing text with the following code:
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Action": "quicksight:CreateAdmin",
                    "Resource": "*"
                }
            ]
        }

  4. Choose Review policy.
  5. For Name, enter QuickSight-Federated-Admin.
  6. Choose Create policy.
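
The admin policy above and the author and reader policies in this section differ only in the QuickSight action they allow. As a minimal sketch, the following Python script generates all three documents so they can be pasted into the IAM console. The names POLICY_ACTIONS and build_policy are illustrative and not part of the original walkthrough; the script only prints JSON and does not call AWS.

```python
import json

# Policy name -> QuickSight action, as listed in this post.
POLICY_ACTIONS = {
    "QuickSight-Federated-Admin": "quicksight:CreateAdmin",
    "QuickSight-Federated-Author": "quicksight:CreateUser",
    "QuickSight-Federated-Reader": "quicksight:CreateReader",
}


def build_policy(action: str) -> dict:
    """Return an IAM policy document that allows a single QuickSight action."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": action, "Resource": "*"},
        ],
    }


if __name__ == "__main__":
    for name, action in POLICY_ACTIONS.items():
        print(f"--- {name} ---")
        print(json.dumps(build_policy(action), indent=4))
```

Generating the documents from one template keeps the three policies structurally identical, so the only thing to review per policy is the action name.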

Now repeat the steps to create the QuickSight-Federated-Author and QuickSight-Federated-Reader policies, using the following JSON for each policy:


The following policy grants author privileges in QuickSight to the federated user:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "quicksight:CreateUser",
                "Resource": "*"
            }
        ]
    }


The following policy grants reader privileges in QuickSight to the federated user:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "quicksight:CreateReader",
                "Resource": "*"
            }
        ]
    }

Configuring IAM roles

Next, create the roles that your Azure AD users assume when federating into QuickSight. The following steps set up the admin role:

  1. On the IAM console, choose Roles.
  2. Choose Create role.
  3. For Select type of trusted entity, choose SAML 2.0 federation.
  4. For SAML provider, choose the provider you created earlier (AzureActiveDirectory).
  5. Select Allow programmatic and AWS Management Console access.
  6. For Attribute, make sure SAML:aud is selected.
  7. Value should show https://signin.aws.amazon.com/saml.
  8. Choose Next: Permissions.
  9. Choose the QuickSight-Federated-Admin IAM policy you created earlier.
  10. Choose Next: Tags.
  11. Choose Next: Review.
  12. For Role name, enter QuickSight-Admin-Role.
  13. For Role description, enter a description.
  14. Choose Create role.
  15. On the IAM console, in the navigation pane, choose Roles.
  16. Choose the QuickSight-Admin-Role role you created to open the role’s properties.
  17. Record the role ARN to use later.
  18. On the Trust Relationships tab, choose Edit Trust Relationship.
  19. Under Trusted Entities, verify that the IdP you created is listed.
  20. Under Conditions, verify that SAML:aud with a value of https://signin.aws.amazon.com/saml is present.
  21. Repeat these steps to create your author and reader roles and attach the appropriate policies:
    1. For QuickSight-Author-Role, use the policy QuickSight-Federated-Author.
    2. For QuickSight-Reader-Role, use the policy QuickSight-Federated-Reader.
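
Steps 18 to 20 ask you to verify the trusted entity and the SAML:aud condition. As a rough sketch of what that trust relationship typically contains, the following Python function builds a trust policy for SAML federation. The function name saml_trust_policy is illustrative, and the document the console actually generates may differ in formatting; treat this as a reference for what to look for, not as the exact output.

```python
import json


def saml_trust_policy(provider_arn: str,
                      audience: str = "https://signin.aws.amazon.com/saml") -> dict:
    """Build a trust policy for a role assumed through SAML federation."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The SAML provider created in IAM is the trusted entity.
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithSAML",
                # The assertion's audience must match this value.
                "Condition": {"StringEquals": {"SAML:aud": audience}},
            }
        ],
    }


if __name__ == "__main__":
    # <AccountID> is the placeholder used in this post; substitute your own.
    arn = "arn:aws:iam::<AccountID>:saml-provider/AzureActiveDirectory"
    print(json.dumps(saml_trust_policy(arn), indent=4))
```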

Configuring user attributes and claims in Azure AD

In this step, you return to the application in Azure portal and configure the user claims that Azure AD sends to AWS.

By default, several SAML attributes are populated for the new application, but you don’t need these attributes for federation into QuickSight. Under Additional Claims, select the unnecessary claims and choose Delete.

For this post, you create three claims:

  • Role
  • RoleSessionName
  • SAML_SUBJECT

Creating the Role claim

To create the Role claim, complete the following steps:

  1. Under Manage, choose Single sign-on.
  2. Choose Edit in the User Attributes & Claims section.
  3. Choose Add new claim.
  4. For Name, enter Role.
  5. For Namespace, enter https://aws.amazon.com/SAML/Attributes.
  6. Under Claim conditions, add a condition for the admin, author, and reader roles. Use the parameters in the following table:
User Type | Scoped Group | Source | Value
Any | QuickSight-Admin | Attribute | arn:aws:iam::253914981264:role/QuickSight-Admin-Role,arn:aws:iam::253914981264:saml-provider/AzureActiveDirectory
Any | QuickSight-Author | Attribute | arn:aws:iam::253914981264:role/QuickSight-Author-Role,arn:aws:iam::253914981264:saml-provider/AzureActiveDirectory
Any | QuickSight-Reader | Attribute | arn:aws:iam::253914981264:role/QuickSight-Reader-Role,arn:aws:iam::253914981264:saml-provider/AzureActiveDirectory

In each value, replace 253914981264 with your own AWS account ID.
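
Each Role claim value is the role ARN and the SAML provider ARN joined by a comma. The following Python sketch builds those values for the three groups so they can be copied into the claim conditions. The helper role_claim_value, the GROUP_TO_ROLE mapping, and the account ID 123456789012 are illustrative placeholders; the role names are the ones entered in the Configuring IAM roles section.

```python
def role_claim_value(account_id: str, role_name: str, provider_name: str) -> str:
    """Return '<role ARN>,<SAML provider ARN>' for the Role claim."""
    role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
    provider_arn = f"arn:aws:iam::{account_id}:saml-provider/{provider_name}"
    return f"{role_arn},{provider_arn}"


# Azure AD group -> IAM role name, as created earlier in this post.
GROUP_TO_ROLE = {
    "QuickSight-Admin": "QuickSight-Admin-Role",
    "QuickSight-Author": "QuickSight-Author-Role",
    "QuickSight-Reader": "QuickSight-Reader-Role",
}


if __name__ == "__main__":
    account_id = "123456789012"  # hypothetical account ID, replace with yours
    for group, role in GROUP_TO_ROLE.items():
        print(group, "->", role_claim_value(account_id, role, "AzureActiveDirectory"))
```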

Creating the RoleSessionName claim

To create your RoleSessionName claim, complete the following steps:

  1. Choose Add new claim.
  2. For Name, enter RoleSessionName.
  3. For Namespace, enter https://aws.amazon.com/SAML/Attributes.
  4. For Source, choose Transformation.
  5. For Transformation, enter ExtractMailPrefix().
  6. For Parameter 1, enter user.userprincipalname.

We use the ExtractMailPrefix() function to extract the name from the userprincipalname attribute. For example, the function extracts the name joe from a user principal name of the form joe@<domain>. IAM uses RoleSessionName to build the role session ID for the user signing in to QuickSight. The role session ID is made up of the role name and RoleSessionName, in Role/RoleSessionName format. Users are registered in QuickSight with the role session ID as the user name.
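
To make the resulting QuickSight user name concrete, the following Python sketch mimics the prefix extraction and the Role/RoleSessionName format described above. It is an approximation written for illustration: extract_mail_prefix and role_session_id are hypothetical helpers rather than Azure AD or IAM functions, and example.com is a placeholder domain.

```python
def extract_mail_prefix(value: str) -> str:
    """Return the part of an address-like string before the first '@'."""
    return value.split("@", 1)[0]


def role_session_id(role_name: str, user_principal_name: str) -> str:
    """Combine the role name and RoleSessionName as Role/RoleSessionName."""
    return f"{role_name}/{extract_mail_prefix(user_principal_name)}"


if __name__ == "__main__":
    # example.com is a placeholder domain, not taken from the original post.
    print(role_session_id("QuickSight-Admin-Role", "joe@example.com"))
    # prints: QuickSight-Admin-Role/joe
```

Because the role name is part of the user name, the same person federating through two different roles appears as two distinct QuickSight users.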

Creating the SAML_SUBJECT claim

To create your final claim, SAML_SUBJECT, complete the following steps:

  1. Choose Add new claim.
  2. For Name, enter SAML_SUBJECT.
  3. For Namespace, enter https://aws.amazon.com/SAML/Attributes.
  4. For Source, choose Attribute.
  5. For Source attribute, enter “Azure AD - QuickSight SSO”.

Testing the application

You’re now ready to test the application.

  1. In the Azure portal, on the Azure Active Directory page, choose All groups.
  2. Update the group membership of the QuickSight-Admin group by adding the current user to it.
  3. Under Enterprise Applications, choose Amazon QuickSight.
  4. Under Manage, choose Single sign-on.
  5. Choose Test this application to test the authentication flow.
  6. Log in to QuickSight as an admin.

The following screenshot shows you the QuickSight dashboard for the admin user.

  7. Remove the current user from the QuickSight-Admin Azure AD group and add it to the QuickSight-Author group.

When you test the application flow, you log in to QuickSight as an author.

  8. Remove the current user from the QuickSight-Author group and add it to the QuickSight-Reader group.

When you test the application flow again, you log in as a reader.

Removing the user from the Azure AD group does not automatically remove the registered user in QuickSight. You have to remove the user manually in the QuickSight admin console. User management inside QuickSight is documented in this article.

Deep-linking QuickSight dashboards

You can share QuickSight dashboards using the sign-on URL for the QuickSight application published in the Azure Apps portal. This allows users to federate directly into the QuickSight dashboard without having to land first on the QuickSight homepage.

To deep-link to a specific QuickSight dashboard with SSO, complete the following steps:

  1. Under Enterprise Applications, choose Amazon QuickSight.
  2. Under Manage, choose Properties.
  3. Locate the User access URL.
  4. Append ?RelayState= followed by the URL of your dashboard to the end of the User access URL. For example, https://myapps.microsoft.com/signin/Amazon%20QuickSight/a06d28e5-4aa4-4888-bb99-91d6c2c4eae8?RelayState=https://us-east-1.quicksight.aws.amazon.com/sn/dashboards/224103be-0470-4de4-829f-390e55b3ef96.

You can test it by creating a custom sign-in URL using the RelayState parameter pointing to an existing dashboard. Make sure the user signing in to the dashboard has been granted proper access.
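
If you share many dashboards, building these links by hand gets error-prone. The following Python sketch assembles a deep link from a User access URL and a dashboard URL in the same shape as the example above. The function deep_link is a hypothetical helper, and the optional encode flag is an assumption: this post uses the dashboard URL unencoded, so percent-encoding is offered only as an option to try if the plain form does not work in your environment.

```python
from urllib.parse import quote


def deep_link(user_access_url: str, dashboard_url: str, encode: bool = False) -> str:
    """Append a RelayState parameter pointing at a QuickSight dashboard."""
    target = quote(dashboard_url, safe="") if encode else dashboard_url
    # Use '&' if the access URL already carries a query string.
    separator = "&" if "?" in user_access_url else "?"
    return f"{user_access_url}{separator}RelayState={target}"


if __name__ == "__main__":
    access_url = ("https://myapps.microsoft.com/signin/Amazon%20QuickSight/"
                  "a06d28e5-4aa4-4888-bb99-91d6c2c4eae8")
    dashboard = ("https://us-east-1.quicksight.aws.amazon.com/sn/dashboards/"
                 "224103be-0470-4de4-829f-390e55b3ef96")
    print(deep_link(access_url, dashboard))
```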


This post provided step-by-step instructions to configure a federated SSO with Azure AD as the IdP. We also discussed how to map users and groups in Azure AD to IAM roles for secure access into QuickSight.

If you have any questions or feedback, please leave a comment.

About the Author

Adnan Hasan is a Global GTM Analytics Specialist at Amazon Web Services, helping customers transform their business using data, machine learning and advanced analytics. 


Federating single sign-on access to your Amazon Redshift cluster with PingIdentity

Post Syndicated from Rajesh Francis original https://aws.amazon.com/blogs/big-data/federating-single-sign-on-access-to-your-amazon-redshift-cluster-with-pingidentity/

Single sign-on (SSO) gives users a seamless experience while accessing various applications in the organization. If you’re responsible for setting up security and database access privileges for users and tasked with enabling SSO for Amazon Redshift, you can set up SSO authentication using ADFS, PingIdentity, Okta, Azure AD, or other SAML browser-based identity providers.

With federation, you can centralize management and governance of authentication and permissions by managing users and groups within the enterprise identity provider (IdP) and use them to authenticate to Amazon Redshift. For more information about the federation workflow using IAM and an identity provider, see Federate Database User Authentication Easily with IAM and Amazon Redshift.

This post shows you how to set up PingOne as your IdP. I provide step-by-step guidance to set up a trial account at pingidentity.com, build users and groups within your organization’s directory, and enable federated SSO into Amazon Redshift to maintain group-level access controls for your data warehouse.

Solution overview

The steps in this post are structured into the following sections:

  • IdP (PingOne) groups configuration – Create groups and assign users to logical groups in PingOne.
  • IdP (PingOne) application configuration – Create PingOne application(s) and configure AWS Identity and Access Management (IAM) roles, and groups allowed to be passed to Amazon Redshift.
  • IAM SAML federation configuration – Set up a role that allows PingOne to access Amazon Redshift by establishing a trust relationship between PingOne IdP and AWS.
  • Amazon Redshift groups and privileges setup – Set up groups within the Amazon Redshift database to match the PingOne groups. You also authorize these groups to access certain schemas and tables.
  • Amazon Redshift server and client setup and test SSO – Finally, configure SQL client tools to use your enterprise credentials and sign in to Amazon Redshift.

The process flow for federated authentication is shown in the following diagram and steps:

  1. The user logs in using a JDBC/ODBC SQL client.
  2. The IdP authenticates using the corporate user name and password, and returns a SAML assertion.
  3. The client uses AWS SDK to call AWS Security Token Service (AWS STS) to assume a role with SAML.
  4. AWS STS returns temporary AWS credentials.
  5. The client uses the temporary AWS credentials to get temporary cluster credentials.
  6. The client connects to Amazon Redshift using the temporary credentials.

Setting up PingOne provider groups and users

Before you get started, sign up for a free trial of PingOne for Enterprise. You then create the users and groups, and assign the users to the groups they belong to and are authorized to access.

You create groups and users in the PingOne user directory. You can set up the groups according to the read/write access privileges or by business functions in your organization to control access to the database objects.

In this post, we set up groups based on ReadOnly and ReadWrite privileges across all functions.

  1. After you have a PingOne account, log in to the PingOne admin dashboard.
  2. Choose Setup from the menu bar.
  3. On the Identity Repository tab, choose Connect to an Identity Repository.
  4. For Select an Identity Repository, you will see options for PingOne Directory, Active Directory, PingFederate and others. Choose PingOne Directory and go to Next.

After you connect to the PingOne repository, you should see the status CONFIGURED.

You can now create your groups and assign users.

  1. Choose Users from the menu bar.
  2. On the User Directory tab, choose Groups.
  3. Choose Add Group.
  4. For Name, enter readonly.
  5. For Directly Applied Role, select No Access.
  6. Choose Save.
  7. Repeat these steps for your readwrite group.
  8. To create the users, choose Users from the menu bar.
  9. On the User Directory tab, choose Users.
  10. Choose Add Users.

For this post, we create two users, Bob and Rachel.

  1. Under Group Memberships, for Memberships, select the group to add your user to.

For this post, we add Bob to readonly and Rachel to readwrite.

  1. Choose Add.
  2. Choose Save.
  3. Repeat these steps to create both users.

Configuring your IdP (PingOne) application

The next step is to set up the applications in the IdP for Amazon Redshift. Because we decided to control access through two groups, we create two applications.

  1. On the PingOne dashboard, choose Applications from the menu bar.
  2. On the My Applications tab, choose SAML.
  3. Choose Add Application.
  4. Choose New SAML Application.
  5. For Application Name, enter AmazonRedshiftReadOnly.
  6. Choose Continue to Next Step.
  7. On the Application Configuration page, for Assertion Consumer Service (ACS), enter http://localhost:7890/redshift/.
  8. For Entity ID, enter urn:amazon:webservices.
  9. For Signing, select Sign Assertion.
  10. For Signing Algorithm, choose RSA_SHA256.
  11. Choose Continue to Next Step.
  12. On the SSO Attribute Mapping page, add the following application attributes:
Application Attribute | Identity Bridge | As Literal
https://aws.amazon.com/SAML/Attributes/Role | arn:aws:iam::<AWSAccount>:role/pingreadonlyrole,arn:aws:iam::<AWSAccount>:saml-provider/pingreadonlyprov |
https://aws.amazon.com/SAML/Attributes/RoleSessionName | Email |
https://redshift.amazon.com/SAML/Attributes/AutoCreate | true | True
https://redshift.amazon.com/SAML/Attributes/DbUser | Email |

pingreadonlyrole is the name of the IAM role you create in the next step.

pingreadonlyprov is the IdP name in IAM where the metadata is imported. You use this name in the next step to create your IdP and import the metadata downloaded from this PingOne application configuration.

For the attribute that passes group membership to Amazon Redshift (https://redshift.amazon.com/SAML/Attributes/DbGroups, sourced from memberOf), choose Advanced and, for Function, choose ExtractByRegularExpression. For Expression, enter (readonly|readwrite).+

This regular expression removes the @directory value from the PingIdentity group name so that it matches the Amazon Redshift DB group names, and sends only the relevant groups to the application.

Refer to the PingIdentity documentation for more details on parsing the memberof attribute in PingOne.
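
To see what the expression does to a group name, the following Python sketch applies the same pattern with the standard re module. This is only an approximation of PingOne's ExtractByRegularExpression: it assumes the function matches from the start of the value and returns the first capture group, which you should confirm against the PingIdentity documentation. The helper extract_db_group and the sample values are illustrative.

```python
import re
from typing import Optional

# Same expression as entered in the PingOne attribute mapping.
GROUP_PATTERN = re.compile(r"(readonly|readwrite).+")


def extract_db_group(member_of_value: str) -> Optional[str]:
    """Return 'readonly' or 'readwrite' if the value starts with one, else None."""
    match = GROUP_PATTERN.match(member_of_value)
    return match.group(1) if match else None


if __name__ == "__main__":
    for value in ("readonly@directory", "readwrite@directory", "finance@directory"):
        print(value, "->", extract_db_group(value))
```

Under these assumptions, groups other than readonly and readwrite produce no match, which is how only the relevant groups reach Amazon Redshift.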

  1. Choose Continue to Next Step.
  2. On the Group Access page, add the groups that this application can access.

This adds the users who are members of that group so they can SSO to the application.

  1. On the Review Setup page, for SAML Metadata, choose Download.
  2. Save the file as ping-saml-readonly.xml.

You use this file later to import the metadata to create the PingOne IdP.

  1. Record the URL for Initiate Single Sign-On (SSO).

You use this URL to set up the SQL client for federated SSO.

  1. Choose Finish.
  2. Repeat these steps to create the second application, AmazonRedshiftReadWrite, with the following changes:
    1. On the SSO Attribute Mapping page, use the IAM role name pingreadwriterole and IdP name pingreadwriteprov.
    2. Save the SAML metadata file as ping-saml-readwrite.xml.

You should now see the two application names on the My Applications tab.

Configuring IAM SAML federation

To set up your IAM SAML configuration, you create the IAM IdP and the roles and policies for the groups.

Setting up the IAM SAML IdP

You set up the IAM IdP and the roles used in the AmazonRedshiftReadOnly and AmazonRedshiftReadWrite applications to establish a trust relationship between the IdP and AWS. You need to create two IAM IdPs, one for each application. Complete the following steps:

  1. On the IAM console, under Access management, choose Identity providers.
  2. Choose Create Provider.
  3. For Provider Type, choose SAML.
  4. For Provider name, enter pingreadonlyprov.
  5. For Metadata Document, choose the metadata XML file you downloaded from the AmazonRedshiftReadOnly application.
  6. Repeat these steps to create the provider pingreadwriteprov.
    1. Choose the metadata XML file you downloaded from the AmazonRedshiftReadWrite application.

You now have two IdPs: pingreadonlyprov and pingreadwriteprov.

Creating the IAM role and policy for the groups

You control access privileges to database objects for specific user groups by using IAM roles. In this section, you create separate IAM roles with policies to map to each of the groups defined in PingOne. These roles allow the user to access Amazon Redshift through the IdP.

You use the same role names that you used to set up applications in PingOne: pingreadonlyrole and pingreadwriterole.

Before you create the roles, create the policies with the appropriate JoinGroup privileges.

  1. On the IAM console, under Access Management, choose Policies.
  2. Choose Create policy.
  3. On the JSON tab, enter the following code to create the two policies.
    1. Replace <cluster> with your cluster name and <dbname> with your database name.

The only difference between the two policies is the resource in the redshift:JoinGroup statement:

  • pingreadonlypolicy allows users to join the readonly group
  • pingreadwritepolicy allows users to join the readwrite group

The group membership lasts only for the duration of the user session, and there is no CreateGroup permission because you need to manually create groups and grant DB privileges in Amazon Redshift.

The following code is the pingreadonlypolicy policy:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GetClusterCredsStatement",
                "Effect": "Allow",
                "Action": ["redshift:GetClusterCredentials"],
                "Resource": [
                    "arn:aws:redshift:*:*:dbname:<cluster>/<dbname>",
                    "arn:aws:redshift:*:*:dbuser:<cluster>/${redshift:DbUser}"
                ],
                "Condition": {
                    "StringLike": {"aws:userid": "*:${redshift:DbUser}"}
                }
            },
            {
                "Sid": "CreateClusterUserStatement",
                "Effect": "Allow",
                "Action": ["redshift:CreateClusterUser"],
                "Resource": ["arn:aws:redshift:*:*:dbuser:<cluster>/${redshift:DbUser}"]
            },
            {
                "Sid": "RedshiftJoinGroupStatement",
                "Effect": "Allow",
                "Action": ["redshift:JoinGroup"],
                "Resource": ["arn:aws:redshift:*:*:dbgroup:<cluster>/readonly"]
            }
        ]
    }

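The following code is the pingreadwritepolicy policy:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GetClusterCredsStatement",
                "Effect": "Allow",
                "Action": ["redshift:GetClusterCredentials"],
                "Resource": [
                    "arn:aws:redshift:*:*:dbname:<cluster>/<dbname>",
                    "arn:aws:redshift:*:*:dbuser:<cluster>/${redshift:DbUser}"
                ],
                "Condition": {
                    "StringLike": {"aws:userid": "*:${redshift:DbUser}"}
                }
            },
            {
                "Sid": "CreateClusterUserStatement",
                "Effect": "Allow",
                "Action": ["redshift:CreateClusterUser"],
                "Resource": ["arn:aws:redshift:*:*:dbuser:<cluster>/${redshift:DbUser}"]
            },
            {
                "Sid": "RedshiftJoinGroupStatement",
                "Effect": "Allow",
                "Action": ["redshift:JoinGroup"],
                "Resource": ["arn:aws:redshift:*:*:dbgroup:<cluster>/readwrite"]
            }
        ]
    }

  1. On the IAM console, choose Roles.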
  2. Choose Create role.
  3. For Select type of trusted entity, choose SAML 2.0 federation.
  4. For SAML provider, choose the provider you created.
  5. Select Allow programmatic access only.
  6. For Attribute, choose SAML:aud.
  7. For Value, enter http://localhost:7890/redshift/.
  8. Select pingreadonlypolicy for the first role and pingreadwritepolicy for the second role.
  9. Enter a name and description for each role.

The following screenshot shows your new roles: pingreadonlyrole and pingreadwriterole.

Setting up your groups and privileges in Amazon Redshift

In this section, you create the database groups in Amazon Redshift. These group names should match the group names you used when you set up your PingOne groups. Then you assign privileges to the groups to access the database objects including schemas and tables. User assignment to groups is done only one time in PingOne; you don’t assign users to groups in Amazon Redshift.

  1. Log in to your Amazon Redshift cluster with an admin account using the admin database credentials.
  2. Use the following scripts to create groups that match the IdP group names and grant the appropriate permissions to tables and schemas:
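    -- Create database groups that match the PingOne group names
    CREATE GROUP readonly;
    CREATE GROUP readwrite;
    -- readonly: SELECT on existing tables, plus default privileges for tables created later
    ALTER DEFAULT PRIVILEGES IN SCHEMA finance GRANT SELECT on TABLES to GROUP readonly;
    GRANT USAGE on SCHEMA finance to GROUP readonly;
    GRANT SELECT on ALL TABLES in SCHEMA finance to GROUP readonly;
    -- readwrite: all privileges on existing tables, plus default privileges for tables created later
    ALTER DEFAULT PRIVILEGES IN SCHEMA finance GRANT ALL on TABLES to GROUP readwrite;
    GRANT USAGE on SCHEMA finance to GROUP readwrite;
    GRANT ALL on ALL TABLES in SCHEMA finance to GROUP readwrite;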

Setting up your Amazon Redshift server and client and testing SSO

In these final steps, you set up your client tools to use your enterprise credentials and sign in to Amazon Redshift.

Configuring the JDBC SQL Client using SQL Workbench/J

If you haven’t installed the JDBC driver, you can download the Amazon Redshift JDBC driver from the console. You then set up a new connection to your cluster using your PingOne IdP credentials.

  1. Create two new connection profiles, Redshift-ReadOnly and Redshift-ReadWrite.
  2. For URL, enter jdbc:redshift:iam://<cluster endpoint>.

IAM authentication requires using the JDBC driver with the AWS SDK included or making sure the AWS SDK is within your Java classpath.

You don’t need to enter a user name or password in the JDBC settings. PingIdentity prompts you to log in on the web browser.

  1. Choose Extended Properties to define the SSO parameters for login_url and plugin_name.
  2. In the Edit extended properties section, enter the following properties and values:
Property | Value
login_url | https://sso.connect.PingOne.com/sso/sp/initsso?saasid=<saasid>&idpid=<idpid>
plugin_name | com.amazon.redshift.plugin.BrowserSamlCredentialsProvider
listen_port | 7890
idp_response_timeout | 60

The login_url is the Initiate Single Sign-On (SSO) URL from the PingOne applications you set up earlier. Use the SSO URL from the AmazonRedshiftReadOnly application for the Redshift-ReadOnly connection and the SSO URL from the AmazonRedshiftReadWrite application for the Redshift-ReadWrite connection.

The configuration in your extended properties screen should look like the screenshot below:

  1. Choose OK.
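
The listen_port value has to line up with the port in the Assertion Consumer Service URL configured in PingOne (http://localhost:7890/redshift/), because that is where the browser delivers the SAML response. The following Python sketch checks a set of extended properties for that kind of mismatch before you try a connection. The function check_properties and the dictionary layout are illustrative and not part of SQL Workbench/J or the Redshift driver; <saasid> and <idpid> are the placeholders from the PingOne login URL.

```python
from urllib.parse import urlparse

# ACS URL entered in the PingOne application configuration.
ACS_URL = "http://localhost:7890/redshift/"

EXTENDED_PROPERTIES = {
    "login_url": "https://sso.connect.PingOne.com/sso/sp/initsso?saasid=<saasid>&idpid=<idpid>",
    "plugin_name": "com.amazon.redshift.plugin.BrowserSamlCredentialsProvider",
    "listen_port": "7890",
    "idp_response_timeout": "60",
}

REQUIRED_KEYS = ("login_url", "plugin_name", "listen_port", "idp_response_timeout")


def check_properties(props: dict, acs_url: str) -> list:
    """Return a list of problems found in the extended properties (empty if none)."""
    problems = [f"missing property: {key}" for key in REQUIRED_KEYS if not props.get(key)]
    acs_port = urlparse(acs_url).port
    listen_port = props.get("listen_port")
    if listen_port and str(acs_port) != str(listen_port):
        problems.append(f"listen_port {listen_port} does not match ACS port {acs_port}")
    return problems


if __name__ == "__main__":
    issues = check_properties(EXTENDED_PROPERTIES, ACS_URL)
    print("OK" if not issues else "\n".join(issues))
```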

Testing SSO authentication and access privileges

When you log in from the SQL client, you’re redirected to the browser to sign in with your PingOne user name and password.

Log in as user bob with the IdP password.

This user can SELECT from all tables in the finance schema but has no INSERT or UPDATE access. You can enter the following statements to test your access.

The following query shows the results from the finance.revenue table:

/* Finance ReadOnly Query */
select * from finance.revenue limit 10;

customer          salesamt
ABC Company       12000
Tech Logistics    175400
XYZ Industry      24355
The tax experts   186577

When you run an INSERT statement, you get the message that you’re not authorized to insert data:

/* Finance ReadWrite Insert */
insert into finance.revenue
values (10001, 'ABC Company', 12000);

You should see the results below:

INSERT INTO finance.revenue not successful
An error occurred when executing the SQL command:
insert into finance.revenue
values(10001, 'ABC Company', 12000)

[Amazon](500310) Invalid operation: permission denied for relation revenue;
1 statement failed.
Execution time: 0.05s

You can repeat these steps for the user rachel, who has access to read and write (INSERT) data into the finance schema.

Configuring the ODBC client

To configure your ODBC client, complete the following steps.

  1. Open the ODBC Data source administrator from your desktop.
  2. On the System DSN tab, choose Add.
  3. For Server, enter your Amazon Redshift ODBC endpoint.
  4. For Port, enter 5439.
  5. For Database, enter your database name.
  6. For Auth Type, choose Identity Provider: Browser SAML to use browser-based authentication.
  7. For Cluster ID, enter your cluster ID.
  8. For Preferred Role, enter your IAM role ARN.
  9. For Login URL, enter your PingOne login URL from the application configuration (https://sso.connect.PingOne.com/sso/sp/initsso?saasid=<saasid>&idpid=<idpid>).
  10. For Listen port, enter 7890 (default).
  11. For Timeout, enter 60.


In this blog post, I walked you through a step-by-step guide to configure and use PingOne as your IdP and enable federated SSO to an Amazon Redshift cluster. You can follow these steps to set up federated SSO for your organization and manage access privileges based on read/write privileges or business function by passing group membership defined in your PingOne IdP to your Amazon Redshift cluster.

About the Authors

Rajesh Francis is a Sr. Analytics Specialist Solutions Architect at AWS. He specializes in Amazon Redshift and works with customers to build scalable Analytic solutions.






Birthday Week on Cloudflare TV: Announcing 24 Hours of Live Discussions on the Future of the Internet

Post Syndicated from Jason Kincaid original https://blog.cloudflare.com/birthday-week-on-cloudflare-tv-announcing-24-hours-of-live-discussions-on-the-future-of-the-internet/

This week marks Cloudflare’s 10th birthday, and we’re excited to continue our annual tradition of launching an array of products designed to help give back to the Internet. (Check back here each morning for the latest!)

We also see this milestone as an opportunity to reflect on where the Internet was ten years ago, and where it might be headed over the next decade. So we reached out to some of the people we respect most to see if they’d be interested in joining us for a series of Fireside Chats on Cloudflare TV.

We’ve been blown away by the response, and are thrilled to announce our lineup of speakers, featuring many of the most celebrated names in tech and beyond. Among the highlights: Apple co-founder Steve Wozniak, Zoom CEO Eric Yuan, OpenTable CEO Debby Soo, Stripe co-founder and President John Collison, former Google CEO and Executive Chairman and Schmidt Futures co-founder Eric Schmidt, former McAfee CEO Chris Young, Magic Leap CEO and longtime Microsoft executive Peggy Johnson, former Seal Team 6 Commander Dave Cooper, Project Include CEO Ellen Pao, and so many more. All told, we have over 24 hours of live discussions scheduled throughout the week.

To tune in, just head to Cloudflare TV (no registration required). You can view the details for each session by clicking the links below, where you’ll find handy Add to Calendar buttons to make sure you don’t miss anything. We’ll also be rebroadcasting these talks throughout the week, so they’ll be easily accessible in different timezones.

A tremendous thank you to everyone on this list for helping us celebrate Cloudflare’s 10th annual Birthday Week!

Jay Adelson

Founder of Equinix and Chairman & Co-Founder of Scorbit

Thursday, October 1, 10:00 AM (PDT) // Add to Calendar

Shellye Archambeau

Fortune 500 Board Member and Author & Former CEO of MetricStream

Thursday, October 1, 6:30 PM (PDT) // Add to Calendar

Abhinav Asthana

Founder & CEO of Postman

Wednesday, September 30, 3:30 PM (PDT) // Add to Calendar

Azeem Azhar

Founder of Exponential View

Friday, October 2, 9:00 AM (PDT) // Add to Calendar

John Battelle

Co-Founder & CEO of Recount Media

Wednesday, September 30, 8:30 AM (PDT) // Add to Calendar

Christian Beedgen

CTO & Co-Founder of SumoLogic

Details coming soon

Scott Belsky

Chief Product Officer and Executive Vice President, Creative Cloud at Adobe

Wednesday, September 30, 11:00 AM (PDT) // Add to Calendar

Gleb Budman

CEO & Co-Founder of Backblaze

Wednesday, September 30, 2:00 PM (PDT) // Add to Calendar

Hayden Brown

CEO of Upwork

Details coming soon

Stewart Butterfield

CEO of Slack

Thursday, October 1, 8:30 AM (PDT) // Add to Calendar

John P. Carlin

Former Assistant Attorney General for the US Department of Justice’s National Security Division and current Chair of Morrison & Foerster’s Global Risk + Crisis Management practice

Tuesday, September 29, 12:00 PM (PDT) // Add to Calendar

John Collison

Co-Founder & President of Stripe

Friday, October 2, 3:00 PM (PDT) // Add to Calendar

Dave Cooper

Former Seal Team 6 Commander

Tuesday, September 29, 10:30 AM (PDT) // Add to Calendar

Scott Galloway

Founder & Chair of L2

Wednesday, September 30, 12:00 PM (PDT) // Add to Calendar

Kara Goldin

Founder & CEO of Hint Inc.

Thursday, October 1, 12:30 PM (PDT) // Add to Calendar

David Gosset

Founder of Europe China Forum

Monday, September 28, 5:00 PM (PDT) // Add to Calendar

Jon Green

VP and Chief Technologist for Security at Aruba, a Hewlett Packard Enterprise company

Wednesday, September 30, 9:00 AM (PDT) // Add to Calendar

Arvind Gupta

Former CEO of MyGov, Govt. of India and current Head & Co-Founder of Digital India Foundation

Monday, September 28, 8:00 PM (PDT) // Add to Calendar

Anu Hariharan

Partner at Y Combinator

Monday, September 28, 1:00 PM (PDT) // Add to Calendar

Brett Hautop

VP of Global Design + Build at LinkedIn

Friday, October 2, 11:00 AM (PDT) // Add to Calendar

Erik Hersman


Details coming soon

Jennifer Hyman

CEO & Co-Founder of Rent the Runway

Wednesday, September 30, 1:00 PM (PDT) // Add to Calendar

Peggy Johnson

CEO of Magic Leap and former Executive at Microsoft and Qualcomm

Details coming soon

David Kaye

Former UN Special Rapporteur

Details coming soon

Pam Kostka

CEO of All Raise

Thursday, October 1, 1:30 PM (PDT) // Add to Calendar

Raffi Krikorian

Managing Director at Emerson Collective and former Engineering Executive at Twitter & Uber

Friday, October 2, 1:30 PM (PDT) // Add to Calendar

Albert Lee

Co-Founder of MyFitnessPal

Monday, September 28, 12:00 PM (PDT) // Add to Calendar

Aaron Levie

CEO & Co-Founder of Box

Thursday, October 1, 4:30 PM (PDT) // Add to Calendar

Alexander Macgillivray

Co-Founder & GC of Alloy and former Deputy CTO of US Government

Thursday, October 1, 11:30 AM (PDT) // Add to Calendar

Ellen Pao

Former CEO of Reddit and current CEO of Project Include

Tuesday, September 29, 2:00 PM (PDT) // Add to Calendar

Keith Rabois

General Partner at Founders Fund and former COO of Square

Wednesday, September 30, 3:00 PM (PDT) // Add to Calendar

Eric Schmidt

Former CEO of Google and current Technical Advisor at Alphabet, Inc.

Monday, September 28, 12:30 PM (PDT) // Add to Calendar

Pradeep Sindhu

Founder & Chief Scientist at Juniper Networks, and Founder & CEO at Fungible

Wednesday, September 30, 11:30 AM (PDT) // Add to Calendar

Karan Singh

Co-Founder & Chief Operating Officer of Ginger

Monday, September 28, 3:00 PM (PDT) // Add to Calendar

Debby Soo

CEO of OpenTable and former Chief Commercial Officer of KAYAK

Details coming soon

Dan Springer

CEO of DocuSign

Thursday, October 1, 1:00 PM (PDT) // Add to Calendar

Bonita Stewart

Vice President, Global Partnerships & Americas Partnerships Solutions at Google

Friday, October 2, 2:00 PM (PDT) // Add to Calendar

Hemant Taneja

Managing Director at General Catalyst

Friday, October 2, 4:00 PM (PDT) // Add to Calendar

Bret Taylor

President & Chief Operating Officer of Salesforce

Friday, October 2, 12:00 PM (PDT) // Add to Calendar

Jennifer Tejada

CEO of PagerDuty

Details coming soon

Robert Thomson

Chief Executive at News Corp and former Editor-in-Chief at The Wall Street Journal & Dow Jones

Thursday, October 1, 12:00 PM (PDT) // Add to Calendar

Robin Thurston

Founder & CEO of Pocket Outdoor Media and former EVP, Chief Digital Officer of Under Armour

Monday, September 28, 12:00 PM (PDT) // Add to Calendar

Selina Tobaccowala

Chief Digital Officer at Openfit, Co-Founder of Gixo, and former President & CTO of SurveyMonkey

Details coming soon

Michael Wolf

Founder & CEO of Activate and former President and Chief Operating Officer of MTV Networks

Tuesday, September 29, 3:30 PM (PDT) // Add to Calendar

Josh Wolfe

Co-Founder and Managing Partner of Lux Capital

Details coming soon

Steve Wozniak

Co-Founder of Apple, Inc.

Wednesday, September 30, 10:00 AM (PDT) // Add to Calendar

Chris Young

Former CEO of McAfee

Thursday, October 1, 11:00 AM (PDT) // Add to Calendar

Eric Yuan

Founder & Chief Executive Officer of Zoom

Monday, September 28, 3:30 PM (PDT) // Add to Calendar

[$] Mercurial planning to transition away from SHA-1

Post Syndicated from coogle original https://lwn.net/Articles/832111/rss

Recently, the Mercurial project has
been discussing its plans to migrate away from the compromised SHA-1 hashing algorithm in favor of
a more secure alternative. So far, the discussion is in the planning stages
of algorithm selection and migration strategy, with a general transition plan
for users. The project, for the moment, is favoring the BLAKE2 hashing algorithm.

OpenSSH 8.4 released

Post Syndicated from original https://lwn.net/Articles/832857/rss

OpenSSH 8.4 is out. The SHA-1 algorithm is deprecated and the “ssh-rsa”
public key signature algorithm will be disabled by default “in a
near-future release.” They note that it is possible to perform
chosen-prefix attacks against the SHA-1 algorithm for less than USD$50K.

Security updates for Monday

Post Syndicated from original https://lwn.net/Articles/832831/rss

Security updates have been issued by Debian (curl, libdbi-perl, linux-4.19, lua5.3, mediawiki, nfdump, openssl1.0, qt4-x11, qtbase-opensource-src, ruby-gon, and yaws), Fedora (f2fs-tools, grub2, libxml2, perl-DBI, singularity, xawtv, and xen), Mageia (cifs-utils, kio-extras, libproxy, mbedtls, nodejs, novnc, and pdns), openSUSE (bcm43xx-firmware, chromium, conmon, fuse-overlayfs, libcontainers-common, podman, firefox, libqt4, libqt5-qtbase, openldap2, ovmf, pdns, rubygem-actionpack-5_1, and tiff), SUSE (firefox, go1.14, ImageMagick, and libqt5-qtbase), and Ubuntu (firefox, gnuplot, libquicktime, miniupnpd, ruby-sanitize, and sudo).

Choosing between messaging services for serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/choosing-between-messaging-services-for-serverless-applications/

Most serverless application architectures use a combination of different AWS services, microservices, and AWS Lambda functions. Messaging services are important in allowing distributed applications to communicate with each other, and are fundamental to most production serverless workloads.

Messaging services can improve the resilience, availability, and scalability of applications, when used appropriately. They can also enable your applications to communicate beyond your workload or even the AWS Cloud, and provide extensibility for future service features and versions.

In this blog post, I compare the primary messaging services offered by AWS and how you can use these in your serverless application architectures. I also show how you use and deploy these integrations with the AWS Serverless Application Model (AWS SAM).

Examples in this post refer to code that can be downloaded from this GitHub repository. The README.md file explains how to deploy and run each example.


Three of the most useful messaging patterns for serverless developers are queues, publish/subscribe, and event buses. In AWS, these are provided by Amazon SQS, Amazon SNS, and Amazon EventBridge respectively. All of these services are fully managed and highly available, so there is no infrastructure to manage. All three integrate with Lambda, allowing you to publish messages via the AWS SDK and invoke functions as targets. Each of these services has an important role to play in serverless architectures.

SNS enables you to send messages reliably between parts of your infrastructure. It uses a robust retry mechanism for when downstream targets are unavailable. When the delivery policy is exhausted, it can optionally send those messages to a dead-letter queue for further processing. SNS uses topics to logically separate messages into channels, and your Lambda functions interact with these topics.

SQS provides queues for your serverless applications. You can use a queue to send, store, and receive messages between different services in your workload. Queues are an important mechanism for providing fault tolerance in distributed systems, and help decouple different parts of your application. SQS scales elastically, and there is no limit to the number of messages per queue. The service durably persists messages until they are processed by a downstream consumer.

EventBridge is a serverless event bus service, simplifying routing events between AWS services, software as a service (SaaS) providers, and your own applications. It logically separates routing using event buses, and you implement the routing logic using rules. You can filter and transform incoming messages at the service level, and route events to multiple targets, including Lambda functions.

Integrating an SQS queue with AWS SAM

The first example shows an AWS SAM template defining a serverless application with two Lambda functions and an SQS queue:

Producer-consumer example

You can declare an SQS queue in an AWS SAM template with the AWS::SQS::Queue resource:

  MySqsQueue:
    Type: AWS::SQS::Queue

To publish to the queue, the publisher function must have permission to send messages. Using an AWS SAM policy template, you can apply a policy that allows sending messages to one specific queue:

      Policies:
        - SQSSendMessagePolicy:
            QueueName: !GetAtt MySqsQueue.QueueName

The AWS SAM template passes the queue name into the Lambda function as an environment variable. The function uses the sendMessage method of the AWS.SQS class to publish the message:

const AWS = require('aws-sdk')
AWS.config.region = process.env.AWS_REGION 
const sqs = new AWS.SQS({apiVersion: '2012-11-05'})

// The Lambda handler
exports.handler = async (event) => {
  // Params object for SQS
  const params = {
    MessageBody: `Message at ${Date()}`,
    QueueUrl: process.env.SQSqueueName
  }
  // Send to SQS
  const result = await sqs.sendMessage(params).promise()
  console.log(result)
}

When the SQS queue receives the message, the Lambda service polls the queue and invokes the consuming Lambda function with the messages it finds. To configure this integration in AWS SAM, the consumer function is granted the SQSPollerPolicy policy. The function’s event source is set to receive messages from the queue in batches of up to 10:

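  QueueConsumerFunction:    # logical ID is illustrative
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: code/
      Handler: consumer.handler
      Runtime: nodejs12.x
      Timeout: 3
      MemorySize: 128
      Policies:
        - SQSPollerPolicy:
            QueueName: !GetAtt MySqsQueue.QueueName
      Events:
        MySQSEvent:         # event name is illustrative
          Type: SQS
          Properties:
            Queue: !GetAtt MySqsQueue.Arn
            BatchSize: 10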

The payload for the consumer function is the message from SQS. This is an array of messages up to the batch size, containing a body attribute with the publishing function’s MessageBody. You can see this in the CloudWatch log for the function:

CloudWatch log result

Integrating an SNS topic with AWS SAM

The second example shows an AWS SAM template defining a serverless application with three Lambda functions and an SNS topic:

SNS fanout to Lambda functions

You declare an SNS topic and the subscribing Lambda functions with the AWS::SNS::Topic resource:

  MySnsTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
        - Protocol: lambda
          Endpoint: !GetAtt TopicConsumerFunction1.Arn    
        - Protocol: lambda
          Endpoint: !GetAtt TopicConsumerFunction2.Arn

You provide the SNS service with permission to invoke the Lambda functions by defining an AWS::Lambda::Permission for each:

  TopicConsumerFunction1Permission:    # logical ID is illustrative
    Type: 'AWS::Lambda::Permission'
    Properties:
      Action: 'lambda:InvokeFunction'
      FunctionName: !Ref TopicConsumerFunction1
      Principal: sns.amazonaws.com

The SNSPublishMessagePolicy policy template grants permission to the publishing function to send messages to the topic. In the function, the publish method of the AWS.SNS class handles publishing:

const AWS = require('aws-sdk')
AWS.config.region = process.env.AWS_REGION 
const sns = new AWS.SNS({apiVersion: '2010-03-31'})

// The Lambda handler
exports.handler = async (event) => {
  // Params object for SNS
  const params = {
    Message: `Message at ${Date()}`,
    Subject: 'New message from publisher',
    TopicArn: process.env.SNStopic
  }
  // Publish to SNS
  const result = await sns.publish(params).promise()
  console.log(result)
}

The payload for the consumer functions is the message from SNS. This is an array of messages, containing subject and message attributes from the publishing function. You can see this in the CloudWatch log for the function:

CloudWatch log result

Differences between SQS and SNS configurations

SQS queues and SNS topics offer different functionality, though both can publish to downstream Lambda functions.

An SQS message is stored on the queue for up to 14 days until it is successfully processed by a subscriber. SNS does not retain messages so if there are no subscribers for a topic, the message is discarded.

SNS topics may broadcast to multiple targets. This behavior is called fan-out. It can be used to parallelize work across Lambda functions or send messages to multiple environments (such as test or development). An SNS topic can have up to 12,500,000 subscribers, providing highly scalable fan-out capabilities. The targets may include HTTP/S endpoints, SMS text messaging, SNS mobile push, email, SQS, and Lambda functions.

In AWS SAM templates, you can retrieve properties such as ARNs and names of queues and topics, using the following intrinsic functions:

Property | Amazon SQS | Amazon SNS
Channel type | Queue | Topic
Get ARN | !GetAtt MySqsQueue.Arn | !Ref MySnsTopic
Get name | !GetAtt MySqsQueue.QueueName | !GetAtt MySnsTopic.TopicName

Integrating with EventBridge in AWS SAM

The third example shows the AWS SAM template defining a serverless application with two Lambda functions and an EventBridge rule:

EventBridge integration with AWS SAM

The default event bus already exists in every AWS account. You declare a rule that filters events in the event bus using the AWS::Events::Rule resource:

  EventRule:
    Type: AWS::Events::Rule
    Properties:
      Description: "EventRule"
      EventPattern:
        source:
          - "demo.event"
        detail:
          state:
            - "new"
      State: "ENABLED"
      Targets:
        - Arn: !GetAtt EventConsumerFunction.Arn
          Id: "ConsumerTarget"

The rule describes an event pattern specifying matching JSON attributes. Events that match this pattern are routed to the list of targets. You provide the EventBridge service with permission to invoke the Lambda functions in the target list:

  PermissionForEventsToInvokeLambda:    # logical ID is illustrative
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName:
        Ref: "EventConsumerFunction"
      Action: "lambda:InvokeFunction"
      Principal: "events.amazonaws.com"
      SourceArn: !GetAtt EventRule.Arn
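
To make the idea of an event pattern concrete, here is a simplified matcher. It assumes the rule matches on the source value demo.event and on a detail.state value of new, and it handles exact-value matching only. It is a sketch for intuition and not EventBridge's actual implementation, which supports many more operators.

```javascript
// Simplified matcher: every leaf in the pattern is an array of allowed
// values, and nested objects are matched recursively. Exact-value matching
// only; this is not EventBridge's actual implementation.
const matchesPattern = (pattern, event) =>
  Object.entries(pattern).every(([key, expected]) => {
    const actual = event ? event[key] : undefined
    if (Array.isArray(expected)) return expected.includes(actual)
    return typeof actual === 'object' && actual !== null &&
      matchesPattern(expected, actual)
  })

// Assumed shape of the rule's pattern
const pattern = { source: ['demo.event'], detail: { state: ['new'] } }

// An event like the one the publisher function sends
const published = {
  source: 'demo.event',
  'detail-type': 'Message',
  detail: { message: 'Hello from publisher', state: 'new' }
}

console.log(matchesPattern(pattern, published)) // true
console.log(matchesPattern(pattern, { ...published, source: 'other.app' })) // false
```

Fields that the pattern does not mention, such as detail-type here, are ignored, so an event only needs to satisfy the attributes the rule lists.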

The AWS SAM template uses an IAM policy statement to grant permission to the publishing function to put events on the event bus:

  EventPublisherFunction:    # logical ID is illustrative
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: code/
      Handler: publisher.handler
      Timeout: 3
      Runtime: nodejs12.x
      Policies:
        - Statement:
          - Effect: Allow
            Resource: '*'
            Action:
              - events:PutEvents      

The publishing function then uses the putEvents method of the AWS.EventBridge class, which returns after the events have been durably stored in EventBridge:

const AWS = require('aws-sdk')
AWS.config.update({region: 'us-east-1'})
const eventbridge = new AWS.EventBridge()

exports.handler = async (event) => {
  const params = {
    Entries: [ 
      {
        Detail: JSON.stringify({
          "message": "Hello from publisher",
          "state": "new"
        }),
        DetailType: 'Message',
        EventBusName: 'default',
        Source: 'demo.event',
        Time: new Date()
      }
    ]
  }
  const result = await eventbridge.putEvents(params).promise()
  console.log(result)
}

The payload for the consumer function is the event from EventBridge. This is a JSON object whose detail attribute contains the message and state values from the publishing function, alongside metadata such as source and detail-type. You can see this in the CloudWatch log for the function:

CloudWatch log result

Comparing SNS with EventBridge

SNS and EventBridge have many similarities. Both can be used to decouple publishers and subscribers, filter messages or events, and provide fan-in or fan-out capabilities. However, there are differences in the list of targets and features for each service, and your choice of service depends on the needs of your use-case.

EventBridge offers two newer capabilities that are not available in SNS. The first is software as a service (SaaS) integration. This enables you to authorize supported SaaS providers to send events directly from their EventBridge event bus to partner event buses in your account. This replaces the need for polling or webhook configuration, and creates a highly scalable way to ingest SaaS events directly into your AWS account.

The second feature is the Schema Registry, which makes it easier to discover and manage OpenAPI schemas for events. EventBridge can infer schemas based on events routed through an event bus by using schema discovery. This can be used to generate code bindings directly to your IDE for type-safe languages like Python, Java, and TypeScript. This can help accelerate development by automating the generation of classes and code directly from events.

This table compares the major features of both services:

Feature | Amazon SNS | Amazon EventBridge
Number of targets | 10 million (soft) | 5
Availability SLA | 99.9% | 99.99%
Limits | 100,000 topics. 12,500,000 subscriptions per topic. | 100 event buses. 300 rules per event bus.
Publish throughput | Varies by Region. Soft limits. | Varies by Region. Soft limits.
Input transformation | No | Yes – see details.
Message filtering | Yes – see details. | Yes, including IP address matching – see details.
Message size maximum | 256 KB | 256 KB
Billing | Per 64 KB
Format | Raw or JSON | JSON
Receive events from AWS CloudTrail | No | Yes
Targets | HTTP(S), SMS, SNS Mobile Push, Email/Email-JSON, SQS, Lambda functions. | 15 targets including AWS Lambda, Amazon SQS, Amazon SNS, AWS Step Functions, Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose.
SaaS integration | No | Yes – see integrations.
Schema Registry integration | No | Yes – see details.
Dead-letter queues supported | Yes | No
FIFO ordering available | No | No
Public visibility | Can create public topics | Cannot create public buses
Pricing | $0.50/million requests + variable delivery cost + data transfer out cost. SMS varies. | $1.00/million events. Free for AWS events. No charge for delivery.
Billable request size | 1 request = 64 KB | 1 event = 64 KB
AWS Free Tier eligible | Yes | No
Cross-Region | You can subscribe your AWS Lambda functions to an Amazon SNS topic in any Region. | Targets must be in the same Region. You can publish across Regions to another event bus.
Retry policy | For SQS/Lambda, exponential backoff over 23 days. For SMTP, SMS and Mobile push, exponential backoff over 6 hours. | At-least-once event delivery to targets, including retry with exponential backoff for up to 24 hours.
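
The 64 KB billing unit in the table means that larger payloads count as several requests or events. The arithmetic can be sketched as follows, using the per-million prices listed above. The function names are illustrative, and the estimate covers the publish charge only, leaving out delivery, data transfer, and any free tier.

```javascript
const CHUNK_BYTES = 64 * 1024 // one billable unit, per the table above

// Number of billable requests (SNS) or events (EventBridge) for one payload
const billableUnits = (payloadBytes) =>
  Math.max(1, Math.ceil(payloadBytes / CHUNK_BYTES))

// Publish cost in USD for a number of equally sized payloads, given the
// price per million units from the table (0.50 for SNS, 1.00 for EventBridge)
const publishCostUsd = (count, payloadBytes, pricePerMillion) =>
  (count * billableUnits(payloadBytes) * pricePerMillion) / 1e6

console.log(billableUnits(10 * 1024))   // 1
console.log(billableUnits(256 * 1024))  // 4
console.log(publishCostUsd(1e6, 256 * 1024, 1.0)) // 4
```

In other words, one million maximum-size (256 KB) events on EventBridge would be billed as four million events under these assumptions.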


Messaging is an important part of serverless applications and AWS services provide queues, publish/subscribe, and event routing capabilities. This post reviews the main features of SNS, SQS, and EventBridge and how they provide different capabilities for your workloads.

I show three example applications that publish and consume events from the three services. I walk through AWS SAM syntax for deploying these resources in your applications. Finally, I compare differences between the services.

To learn more about building decoupled architectures, see this Learning Path series on EventBridge. For more serverless learning resources, visit https://serverlessland.com.

Introducing Cron Triggers for Cloudflare Workers

Post Syndicated from Nancy Gao original https://blog.cloudflare.com/introducing-cron-triggers-for-cloudflare-workers/

Today the Cloudflare Workers team is thrilled to announce the launch of Cron Triggers. Before now, Workers were triggered purely by incoming HTTP requests but starting today you’ll be able to set a scheduler to run your Worker on a timed interval. This was a highly requested feature that we know a lot of developers will find useful, and we’ve heard your feedback after Serverless Week.

We are excited to offer this feature at no additional cost, and it will be available on both the Workers free tier and the paid tier, now called Workers Bundled. Since it doesn’t matter which city a Cron Trigger routes the Worker through, we are able to maximize Cloudflare’s distributed system and send scheduled jobs to underutilized machinery. Running jobs on these quiet machines is both efficient and cost effective, and we are able to pass those cost savings down to you.

What is a Cron Trigger and how might I use such a feature?

In case you’re not familiar with Unix systems, the cron pattern allows you to schedule jobs to run periodically at fixed intervals or at scheduled times. Cron Triggers in the context of Workers allow users to set time-based invocations for the job. These Workers run on a recurring schedule, and differ from traditional Workers in that they do not fire on HTTP requests.

Most developers are familiar with the cron pattern and its usefulness across a wide range of applications. Pulling the latest data from APIs or running regular integration tests on a preset schedule are common examples of this.

“We’re excited about Cron Triggers. Workers is crucial to our stack, so using this feature for live integration tests will boost the developer experience.” – Brian Marks, Software Engineer at Bazaarvoice

How much does it cost to use Cron Triggers?

Triggers are included at no additional cost! Scheduled Workers count towards your request cap for both the free tier and Workers Bundled, but rest assured that there will be no hidden or extra fees. Our competitors charge extra for cron events, or in some cases offer a very limited free tier. We want to make this feature widely accessible and have decided not to charge on a per-trigger basis. While there are no limits for the number of triggers you can have across an account, note that there is a limit of 3 triggers per Worker script for this feature. You can read more about limits on Workers plans in this documentation.

How are you able to offer this feature at no additional cost?

Cloudflare supports a massive distributed system that spans the globe with 200+ cities. Our nodes are named for the IATA airport code that they are closest to. Most of the time we run Workers close to the request origin for performance reasons (for example, SFO if you are in the Bay Area, or CDG if you are lucky enough to be in Paris 🥐🍷🧀). In a typical HTTP Worker, we do this because we know that performance is of material importance when someone is waiting for the response.

In the case of Cron Triggers, where the user is running a task on a timed basis, those performance needs are different. A few milliseconds of extra latency do not matter as much when the user isn’t actively waiting for the response. The nature of the feature gives us much more flexibility on where to run the job, since it doesn’t have to necessarily be in a city close to the end user.

Cron Triggers are run on underutilized machines to make the best use of our capacity and route traffic efficiently. For example, a job scheduled from San Francisco at 7pm Pacific Time might be sent to Paris because it’s 4am there and traffic across Europe is low.  Sending traffic to these machines during quiet hours is very efficient, and we are more than happy to pass those cost savings down to you. Aside from this scheduling optimization, Workers that are called by Cron Triggers behave similarly to and have all of the same performance and security benefits as typical HTTP Workers.

What’s happening under the hood?

At a high level, schedules created through our API create records in our database. These records contain the information necessary to execute the Worker on the given cron schedule. These records are then picked up by another service which continuously evaluates the state of our edge and distributes the schedules among cities. Once the schedules have been distributed to the edge, a service running in the node polls for changes to the schedules and makes sure they get sent to our runtime at the appropriate time.

If you want to know more details about how we implemented this feature, please refer to the technical blog.

What’s coming next?

With this feature, we’ve expanded what’s possible to build with Workers, and further simplified the developer experience. While Workers previously only ran on web requests, we believe the future of edge computing isn’t strictly tied to HTTP requests and responses.  We want to introduce more types of Workers in the future.

We plan to expand our triggers to include different types, such as data or event-based triggers. Our goal is to give users more flexibility and control over when their Workers run. Cron Triggers are our first step in this direction. In addition, we plan to keep iterating on Cron Triggers to make edge infrastructure selection even more sophisticated and optimized. For example, we might even consider triggers that allow our users to run in the most energy-efficient data centers.

How to try Cron Triggers

Cron Triggers are live today! You can try them in the Workers dashboard by creating a new Worker and setting up a Cron Trigger.

Making Time for Cron Triggers: A Look Inside

Post Syndicated from Aaron Lisman original https://blog.cloudflare.com/cron-triggers-for-scheduled-workers/

Today, we are excited to launch Cron Triggers to the Cloudflare Workers serverless compute platform. We’ve heard the developer feedback, and we want to give our users the ability to run a given Worker on a scheduled basis. In case you’re not familiar with Unix systems, the cron pattern allows developers to schedule jobs to run at fixed intervals. This pattern is ideal for running any types of periodic jobs like maintenance or calling third party APIs to get up-to-date data. Cron Triggers has been a highly requested feature even inside Cloudflare and we hope that you will find this feature as useful as we have!

Where are Cron Triggers going to be run?

Cron Triggers are executed from the edge. At Cloudflare, we believe strongly in edge computing and wanted our new feature to get all of the performance and reliability benefits of running on our edge. Thus, we wrote a service in core that is responsible for distributing schedules to a new edge service through Quicksilver which will then trigger the Workers themselves.

What’s happening under the hood?

At a high level, schedules created through our API create records in our database with the information necessary to execute the Worker and the given cron schedule. These records are then picked up by another service which continuously evaluates the state of our edge and distributes the schedules between cities.

Once the schedules have been distributed to the edge, a service running in the edge node polls for changes to the schedules and makes sure they get sent to our runtime at the appropriate time.

New Event Type

Cron Triggers gave us the opportunity to finally recognize a new Worker ‘type’ in our API. While Workers currently only run on web requests, we have lots of ideas for the future of edge computing that aren’t strictly tied to HTTP requests and responses. Expect to see even more new handlers in the future for other non-HTTP events like log information from your Worker (think custom wrangler tail!) or even TCP Workers.

Here’s an example of the new JavaScript API:

addEventListener('scheduled', event => {
  // handleScheduled is a placeholder name for your own async function.
  event.waitUntil(handleScheduled(event))
})

Where event has the following interface in TypeScript:

interface ScheduledEvent {
  type: 'scheduled';
  scheduledTime: number; // milliseconds since the Unix epoch
}

As long as your Worker has a handler for this new event type, you’ll be able to give it a schedule.

New APIs

PUT /client/v4/accounts/:account_identifier/workers/scripts/:name

The script upload API remains the same, but during script validation we now detect and return the registered event handlers.

PUT /client/v4/accounts/:account_identifier/workers/scripts/:name/schedules

[
  {"cron": "* * * * *"}
]

This will create or modify all schedules for a script, removing all schedules not in the list. For now, there’s a limit of 3 distinct cron schedules. Schedules can run as often as once per minute, and the API doesn’t accept schedules with years in them (sorry, you’ll have to run your Y3K migration script another way).
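
As a rough illustration of those limits, here is a minimal client-side sketch that builds the request body for the schedules endpoint and rejects input the API would presumably refuse. The helper name `buildSchedulesBody` and the five-field check are our own assumptions, not part of the Cloudflare API, and authentication is left out entirely.

```javascript
// Hypothetical helper (not part of the Cloudflare API): builds the JSON body
// for PUT .../workers/scripts/:name/schedules and checks the stated limits.
const MAX_SCHEDULES = 3; // "a limit of 3 distinct cron schedules"

function buildSchedulesBody(cronExpressions) {
  // Normalize whitespace so "*  * * * *" and "* * * * *" count as one schedule.
  const normalized = cronExpressions.map(expr => expr.trim().split(/\s+/).join(" "));
  const distinct = [...new Set(normalized)];

  if (distinct.length > MAX_SCHEDULES) {
    throw new Error(`At most ${MAX_SCHEDULES} distinct cron schedules are allowed`);
  }
  for (const expr of distinct) {
    // Assumption: five fields (minute, hour, day of month, month, day of week),
    // which leaves no room for a year field.
    if (expr.split(" ").length !== 5) {
      throw new Error(`Expected 5 cron fields (no year field): "${expr}"`);
    }
  }
  return JSON.stringify(distinct.map(cron => ({ cron })));
}
```

The returned string would be sent as the body of the PUT request shown above. Because the endpoint replaces the whole list, the caller always passes every schedule the script should keep.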

GET /client/v4/accounts/:account_identifier/workers/scripts/:name/schedules

{
  "schedules": [
    {
      "cron": "* * * * *",
      "created_on": <time>,
      "modified_on": <time>
    }
  ]
}

The Scheduler service is responsible for reading the schedules from Postgres and generating per-node schedules to place into Quicksilver. For now, the service simply avoids trying to execute your Worker on an edge node that may be disabled for some reason, but such an approach also gives us a lot of flexibility in deciding where your Worker executes.
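
The post does not describe the actual placement algorithm, so the following is only a sketch of the general idea: skip disabled nodes and spread schedules over the rest. The `assignSchedules` function, the node shape (`id`, `enabled`), and the round-robin strategy are all assumptions made for illustration.

```javascript
// Illustrative only: the real Scheduler's placement logic is not described in the post.
// `nodes` is assumed to look like [{ id: "node-a", enabled: true }, ...].
function assignSchedules(schedules, nodes) {
  const enabled = nodes.filter(node => node.enabled);
  if (enabled.length === 0) {
    throw new Error("No enabled edge nodes available");
  }

  // One (initially empty) schedule list per enabled node.
  const perNode = new Map(enabled.map(node => [node.id, []]));

  // Round-robin over enabled nodes; disabled nodes never receive work.
  schedules.forEach((schedule, index) => {
    const target = enabled[index % enabled.length];
    perNode.get(target.id).push(schedule);
  });
  return perNode;
}
```

Swapping the round-robin step for a cost-based or latency-based choice is where the flexibility mentioned above would come in.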

In addition to edge node availability, we could optimize for compute cost, bandwidth, or even latency in the future!

What’s actually executing these schedules?

To consume the schedules and actually trigger the Worker, we built a new service in Rust and deployed it to our edge using HashiCorp Nomad. Nomad ensures that the schedule runner remains running in the edge node and can move it between machines as necessary. Rust was the best choice for this service since it needed to be fast and highly available, with Cap’n Proto RPC support for calling into the runtime. With Tokio, Anyhow, Clap, and Serde, it was easy to quickly get the service up and running without having to really worry about async, error handling, or configuration.

On top of that, due to our specific needs for cron parsing, we built a specialized cron parser using nom that allowed us to quickly parse and compile expressions into values that check against a given time to determine if we should run a schedule.
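
To make "compile expressions into values that check against a given time" concrete, here is a small sketch of the same idea in JavaScript. It is not the Rust/nom parser described above. It assumes five-field expressions evaluated in UTC, supports only `*`, numbers, ranges, lists, and steps, and requires every field to match (standard cron's special handling of day-of-month versus day-of-week is not implemented). The names `compileField` and `compileCron` are ours.

```javascript
function toInt(text) {
  if (!/^\d+$/.test(text)) {
    throw new Error(`Invalid number in cron field: "${text}"`);
  }
  return Number(text);
}

// Compile one field such as "*", "5", "1-5", "*/15", or "0,30" into a Set of allowed values.
function compileField(field, min, max) {
  const allowed = new Set();
  for (const part of field.split(",")) {
    const [rangeText, stepText] = part.split("/");
    const step = stepText === undefined ? 1 : toInt(stepText);
    let start = min;
    let end = max;
    if (rangeText !== "*") {
      const bounds = rangeText.split("-").map(toInt);
      start = bounds[0];
      end = bounds.length > 1 ? bounds[1] : bounds[0];
    }
    if (step < 1 || start < min || end > max || start > end) {
      throw new Error(`Cron field out of range: "${part}"`);
    }
    for (let value = start; value <= end; value += step) {
      allowed.add(value);
    }
  }
  return allowed;
}

// Compile a whole expression once; the returned predicate is cheap to call every minute.
function compileCron(expression) {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error("Expected exactly 5 cron fields");
  }
  const minutes = compileField(fields[0], 0, 59);
  const hours = compileField(fields[1], 0, 23);
  const daysOfMonth = compileField(fields[2], 1, 31);
  const months = compileField(fields[3], 1, 12);
  const daysOfWeek = compileField(fields[4], 0, 6); // 0 = Sunday

  return date =>
    minutes.has(date.getUTCMinutes()) &&
    hours.has(date.getUTCHours()) &&
    daysOfMonth.has(date.getUTCDate()) &&
    months.has(date.getUTCMonth() + 1) &&
    daysOfWeek.has(date.getUTCDay());
}
```

For example, `compileCron("*/15 * * * *")` returns a function that yields `true` for 10:30 UTC and `false` for 10:31 UTC. Doing the parsing up front means the per-minute check is only a handful of set lookups.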

Once the schedule runner has the schedules, it checks the time and selects the Workers that need to be run. To let the runtime know it’s time to run, we send a Cap’n Proto RPC message. The runtime then does its thing, calling the new ‘scheduled’ event handler instead of ‘fetch’.

How can I try this?

As of today, the Cron Triggers feature is live! Please try it out by creating a Worker and finding the Triggers tab – we’re excited to see what you build with it!

Workers Durable Objects Beta: A New Approach to Stateful Serverless

Post Syndicated from Kenton Varda original https://blog.cloudflare.com/introducing-workers-durable-objects/

We launched Cloudflare Workers® in 2017 with a radical vision: code running at the network edge could not only improve performance, but also be easier to deploy and cheaper to run than code running in a single datacenter. That vision means Workers is about more than just edge compute — we’re rethinking how applications are built.

Using a “serverless” approach has allowed us to make deploys dead simple, and using isolate technology has allowed us to deliver serverless more cheaply and without the lengthy cold starts that hold back other providers. We added easy-to-use eventually-consistent edge storage to the platform with Workers KV.

But up until today, it hasn’t been possible to manage state with strong consistency, or to coordinate in real time between multiple clients, entirely on the edge. Thus, these parts of your application still had to be hosted elsewhere.

Durable Objects provide a truly serverless approach to storage and state: consistent, low-latency, distributed, yet effortless to maintain and scale. They also provide an easy way to coordinate between clients, whether it be users in a particular chat room, editors of a particular document, or IoT devices in a particular smart home. Durable Objects are the missing piece in the Workers stack that makes it possible for whole applications to run entirely on the edge, with no centralized “origin” server at all.

Today we are beginning a closed beta of Durable Objects.

Request a beta invite »

What is a “Durable Object”?

I’m going to be honest: naming this product was hard, because it’s not quite like any other cloud technology that is widely-used today. This proverbial bike shed has many layers of paint, but ultimately we settled on “Unique Durable Objects”, or “Durable Objects” for short. Let me explain what they are by breaking that down:

  • Objects: Durable Objects are objects in the sense of Object-Oriented Programming. A Durable Object is an instance of a class — literally, a class definition written in JavaScript (or your language of choice). The class has methods which define its public interface. An object is an instance of this class, combining the code with some private state.
  • Unique: Each object has a globally-unique identifier. That object exists in only one location in the whole world at a time. Any Worker running anywhere in the world that knows the object’s ID can send messages to it. All those messages end up delivered to the same place.
  • Durable: Unlike a normal object in JavaScript, Durable Objects can have persistent state stored on disk. Each object’s durable state is private to it, which means not only that access to storage is fast, but the object can even safely maintain a consistent copy of the state in memory and operate on it with zero latency. The in-memory object will be shut down when idle and recreated later on-demand.

What can they do?

Durable Objects have two primary abilities:

  • Storage: Each object has attached durable storage. Because this storage is private to a specific object, the storage is always co-located with the object. This means the storage can be very fast while providing strong, transactional consistency. Durable Objects apply the serverless philosophy to storage, splitting the traditional large monolithic databases up into many small, logical units. In doing so, we get the advantages you’ve come to expect from serverless: effortless scaling with zero maintenance burden.
  • Coordination: Historically, with Workers, each request would be randomly load-balanced to a Worker instance. Since there was no way to control which instance received a request, there was no way to force two clients to talk to the same Worker, and therefore no way for clients to coordinate through Workers. Durable Objects change that: requests related to the same topic can be forwarded to the same object, which can then coordinate between them, without any need to touch storage. For example, this can be used to facilitate real-time chat, collaborative editing, video conferencing, pub/sub message queues, game sessions, and much more.

The astute reader may notice that many coordination use cases call for WebSockets — and indeed, conversely, most WebSocket use cases require coordination. Because of this complementary relationship, along with the Durable Objects beta, we’ve also added WebSocket support to Workers. For more on this, see the Q&A below.

Region: Earth

When using Durable Objects, Cloudflare automatically determines the Cloudflare datacenter that each object will live in, and can transparently migrate objects between locations as needed.

Traditional databases and stateful infrastructure usually require you to think about geographical “regions”, so that you can be sure to store data close to where it is used. Thinking about regions can often be an unnatural burden, especially for applications that are not inherently geographical.

With Durable Objects, you instead design your storage model to match your application’s logical data model. For example, a document editor would have an object for each document, while a chat app would have an object for each chat. There is no problem creating millions or billions of objects, as each object has minimal overhead.

Killer app: Real-time collaborative document editing

Let’s say you have a spreadsheet editor application — or, really, any kind of app where users edit a complex document. It works great for one user, but now you want multiple users to be able to edit it at the same time. How do you accomplish this?

For the standard web application stack, this is a hard problem. Traditional databases simply aren’t designed to be real-time. When Alice and Bob are editing the same spreadsheet, you want every one of Alice’s keystrokes to appear immediately on Bob’s screen, and vice versa. But if you merely store the keystrokes to a database, and have the users repeatedly poll the database for new updates, at best your application will have poor latency, and at worst you may find database transactions repeatedly fail as users on opposite sides of the world fight over editing the same content.

The secret to solving this problem is to have a live coordination point. Alice and Bob connect to the same coordinator, typically using WebSockets. The coordinator then forwards Alice’s keystrokes to Bob and Bob’s keystrokes to Alice, without having to go through a storage layer. When Alice and Bob edit the same content at the same time, the coordinator resolves conflicts instantly. The coordinator can then take responsibility for updating the document in storage — but because the coordinator keeps a live copy of the document in-memory, writing back to storage can happen asynchronously.
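
The coordinator pattern described above can be sketched in a few lines of plain JavaScript. This is a simplified illustration, not code from the post: `DocumentCoordinator` is a hypothetical class, a "socket" is any object with a `send(text)` method, and conflicts are resolved in the simplest possible way, by applying edits in arrival order.

```javascript
// Hypothetical coordinator. In a real deployment `socket` would be a WebSocket;
// here it is any object exposing send(text).
class DocumentCoordinator {
  constructor() {
    this.clients = new Map(); // clientId -> socket
    this.document = [];       // live in-memory copy: ordered list of applied edits
  }

  join(clientId, socket) {
    this.clients.set(clientId, socket);
    // Bring the newcomer up to date from memory, not from storage.
    socket.send(JSON.stringify({ type: "snapshot", edits: this.document }));
  }

  leave(clientId) {
    this.clients.delete(clientId);
  }

  // Apply an edit in arrival order and relay it to everyone except the sender.
  applyEdit(senderId, edit) {
    const sequenced = { seq: this.document.length, from: senderId, edit };
    this.document.push(sequenced);
    for (const [clientId, socket] of this.clients) {
      if (clientId !== senderId) {
        socket.send(JSON.stringify({ type: "edit", ...sequenced }));
      }
    }
    return sequenced.seq;
  }
}
```

Because every edit passes through one object, the sequence number gives all clients the same total order. Writing `this.document` back to storage can then happen later without blocking the relay.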

Every big-name real-time collaborative document editor works this way. But for many web developers, especially those building on serverless infrastructure, this kind of solution has long been out-of-reach. Standard serverless infrastructure — and even cloud infrastructure more generally — just does not make it easy to assign these coordination points and direct users to talk to the same instance of your server.

Durable Objects make this easy. Not only do they make it easy to assign a coordination point, but Cloudflare will automatically create the coordinator close to the users using it and migrate it as needed, minimizing latency. The availability of local, durable storage means that changes to the document can be saved reliably in an instant, even if the eventual long-term storage is slower. Or, you can even store the entire document on the edge and abandon your database altogether.

With Durable Objects lowering the barrier, we hope to see real-time collaboration become the norm across the web. There’s no longer any reason to make users refresh for updates.

Example: An atomic counter

Here’s a very simple example of a Durable Object which can be incremented, decremented, and read over HTTP. This counter is consistent even when receiving simultaneous requests from multiple clients — none of the increments or decrements will be lost. At the same time, reads are served entirely from memory, no disk access needed.

export class Counter {
  // Constructor called by the system when the object is needed to
  // handle requests.
  constructor(controller, env) {
    // `controller.storage` is an interface to access the object's
    // on-disk durable storage.
    this.storage = controller.storage;
  }

  // Private helper method called from fetch(), below.
  async initialize() {
    let stored = await this.storage.get("value");
    this.value = stored || 0;
  }

  // Handle HTTP requests from clients.
  // The system calls this method when an HTTP request is sent to
  // the object. Note that these requests strictly come from other
  // parts of your Worker, not from the public internet.
  async fetch(request) {
    // Make sure we're fully initialized from storage.
    if (!this.initializePromise) {
      this.initializePromise = this.initialize();
    }
    await this.initializePromise;

    // Apply requested action. Capture the result before awaiting so that
    // a concurrent request cannot change what this request reports.
    let url = new URL(request.url);
    let currentValue = this.value;
    switch (url.pathname) {
      case "/increment":
        currentValue = ++this.value;
        await this.storage.put("value", this.value);
        break;
      case "/decrement":
        currentValue = --this.value;
        await this.storage.put("value", this.value);
        break;
      case "/":
        // Just serve the current value. No storage calls needed!
        break;
      default:
        return new Response("Not found", {status: 404});
    }

    // Return current value.
    return new Response(String(currentValue));
  }
}

Once the class has been bound to a Durable Object namespace, a particular instance of Counter can be accessed from anywhere in the world using code like:

// Derive the ID for the counter object named "my-counter".
// This name is associated with exactly one instance in the
// whole world.
let id = COUNTER_NAMESPACE.idFromName("my-counter");

// Send a request to it.
let response = await COUNTER_NAMESPACE.get(id).fetch(request);

Demo: Chat

Chat is arguably real-time collaboration in its purest form. And to that end, we have built a demo open source chat app that runs entirely at the edge using Durable Objects.

Try the live demo »
See the source code on GitHub »

This chat app uses a Durable Object to control each chat room. Users connect to the object using WebSockets. Messages from one user are broadcast to all the other users. The chat history is also stored in durable storage, but this is only for history. Real-time messages are relayed directly from one user to others without going through the storage layer.

Additionally, this demo uses Durable Objects for a second purpose: Applying a rate limit to messages from any particular IP. Each IP is assigned a Durable Object that tracks recent request frequency, so that users who send too many messages can be temporarily blocked — even across multiple chat rooms. Interestingly, these objects don’t actually store any durable state at all, because they only care about very recent history, and it’s not a big deal if a rate limiter randomly resets on occasion. So, these rate limiter objects are an example of a pure coordination object with no storage.
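
A rate limiter of this kind needs nothing but memory. The sketch below is not the demo's actual code. It is a minimal sliding-window limiter with hypothetical names (`RateLimiter`, `tryAcquire`) and an injectable clock, to show why losing the state on an occasional reset is harmless.

```javascript
// Hypothetical in-memory rate limiter: one instance per IP address.
// Nothing is persisted, so a restart simply forgets recent history.
class RateLimiter {
  constructor({ limit, windowMs, now = () => Date.now() }) {
    this.limit = limit;       // max requests allowed per window
    this.windowMs = windowMs; // window length in milliseconds
    this.now = now;           // injectable clock, handy for tests
    this.timestamps = [];     // times of recently accepted requests
  }

  // Returns true if the request is allowed, false if the caller should be blocked.
  tryAcquire() {
    const current = this.now();
    const cutoff = current - this.windowMs;
    // Drop timestamps that have aged out of the window.
    this.timestamps = this.timestamps.filter(time => time > cutoff);
    if (this.timestamps.length >= this.limit) {
      return false;
    }
    this.timestamps.push(current);
    return true;
  }
}
```

Placing one such instance inside an object addressed by IP means every chat room consults the same counter for that user, which is what makes the limit apply across rooms.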

This chat app is only a few hundred lines of code. The deployment configuration is only a few lines. Yet, it will scale seamlessly to any number of chat rooms, limited only by Cloudflare’s available resources. Of course, any individual chat room’s scalability has a limit, since each object is single-threaded. But, that limit is far beyond what a human participant could keep up with anyway.

Other use cases

Durable Objects have infinite uses. Here are just a few ideas, beyond the ones described above:

  • Shopping cart: An online storefront could track a user’s shopping cart in an object. The rest of the storefront could be served as a fully static web site. Cloudflare will automatically host the cart object close to the end user, minimizing latency.
  • Game server: A multiplayer game could track the state of a match in an object, hosted on the edge close to the players.
  • IoT coordination: Devices within a family’s house could coordinate through an object, avoiding the need to talk to distant servers.
  • Social feeds: Each user could have a Durable Object that aggregates their subscriptions.
  • Comment/chat widgets: A web site that is otherwise static content can add a comment widget or even a live chat widget on individual articles. Each article would use a separate Durable Object to coordinate. This way the origin server can focus on static content only.

The Future: True Edge Databases

We see Durable Objects as a low-level primitive for building distributed systems. Some applications, like those mentioned above, can use objects directly to implement a coordination layer, or maybe even as their sole storage layer.

However, Durable Objects today are not a complete database solution. Each object can see only its own data. To perform a query or transaction across multiple objects, the application needs to do some extra work.

That said, every big distributed database – whether it be relational, document, graph, etc. – is, at some low level, composed of “chunks” or “shards” that store one piece of the overall data. The job of a distributed database is to coordinate between chunks.

We see a future of edge databases that store each “chunk” as a Durable Object. By doing so, it will be possible to build databases that operate entirely at the edge, fully distributed with no regions or home location. These databases need not be built by us; anyone can potentially build them on top of Durable Objects. Durable Objects are only the first step in the edge storage journey.

Join the Beta

Storing data is a big responsibility which we do not take lightly. Because of the critical importance of getting it right, we are being careful. We will be making Durable Objects available gradually over the next several months.

As with any beta, this product is a work in progress, and some of what is described in this post is not fully enabled yet. Full details of beta limitations can be found in the documentation.

If you’d like to try out Durable Objects now, tell us about your use case. We’ll be selecting the most interesting use cases for early access.

Request a beta invite »


Q&A

Can Durable Objects serve WebSockets?

As part of the Durable Objects beta, we’ve made it possible for Workers to act as WebSocket endpoints — including as a client or as a server. Before now, Workers could proxy WebSocket connections on to a back-end server, but could not speak the protocol directly.

While technically any Worker can speak WebSocket in this way, WebSockets are most useful when combined with Durable Objects. When a client connects to your application using a WebSocket, you need a way for server-generated events to be sent back to the existing socket connection. Without Durable Objects, there’s no way to send an event to the specific Worker holding a WebSocket. With Durable Objects, you can now forward the WebSocket to an Object. Messages can then be addressed to that Object by its unique ID, and the Object can then forward those messages down the WebSocket to the client.

The chat app demo presented above uses WebSockets. Check out the source code to see how it works.

How does this compare to Workers KV?

Two years ago, we introduced Workers KV, a global key-value data store. KV is a fairly minimalist global data store that serves certain purposes well, but is not for everyone. KV is eventually consistent, which means that writes made in one location may not be visible in other locations immediately. Moreover, it implements “last write wins” semantics, which means that if a single key is being modified from multiple locations in the world at once, it’s easy for those writes to overwrite each other. KV is designed this way to support low-latency reads for data that doesn’t frequently change. However, these design decisions make KV inappropriate for state that changes frequently, or when changes need to be immediately visible worldwide.

Durable Objects, in contrast, are not primarily a storage product at all: many use cases for them do not actually utilize durable storage. To the extent that they do provide storage, Durable Objects sit at the opposite end of the storage spectrum from KV. They are extremely well-suited to workloads requiring transactional guarantees and immediate consistency. However, transactions inherently must be coordinated in a single location, and clients on the opposite side of the world from that location will experience moderate latency due to the inherent limitations of the speed of light. Durable Objects will combat this problem by auto-migrating to live close to where they are used.

In short, Workers KV remains the best way to serve static content, configuration, and other rarely-changing data around the world, while Durable Objects are better for managing dynamic state and coordination.

Going forward, we plan to utilize Durable Objects in the implementation of Workers KV itself, in order to deliver even better performance.

Why not use CRDTs?

You can build CRDT-based storage on top of Durable Objects, but Durable Objects do not require you to use CRDTs.

Conflict-free Replicated Data Types (CRDTs), or their cousins, Operational Transforms (OTs), are a technology that allows data to be edited from multiple places in the world simultaneously without synchronization, and without data loss. For example, these technologies are commonly used in the implementation of real-time collaborative document editors, so that a user’s keypresses can show up in their local copy of the document in real time, without waiting to see if anyone else edited another part of the document first. Without getting into details, you can think of these techniques like a real time version of “git fork” and “git merge”, where all merge conflicts are resolved automatically in a deterministic way, so that everyone ends up with the same state in the end.

CRDTs are a powerful technology, but applying them correctly can be challenging. Only certain kinds of data structures lend themselves to automatic conflict resolution in a way that doesn’t lead to easy data loss. Any developer familiar with git can see the problem: arbitrary conflict resolution is hard, and any automated algorithm for it will likely get things wrong sometimes. It’s all the more difficult if the algorithm has to handle merges in arbitrary order and still get the same answer.
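
For readers who have not seen a CRDT, one of the simplest is a grow-only counter (often called a G-Counter). The sketch below is a generic textbook construction, not something from Durable Objects, and the function names are ours. Each replica increments only its own slot, and merging takes the per-replica maximum, so merges give the same answer in any order.

```javascript
// G-Counter state is a plain object mapping replicaId -> count.

// Each replica only ever increments its own slot.
function increment(state, replicaId, amount = 1) {
  return { ...state, [replicaId]: (state[replicaId] || 0) + amount };
}

// Merge takes the per-replica maximum, which makes it commutative,
// associative, and idempotent.
function merge(a, b) {
  const merged = { ...a };
  for (const [replicaId, count] of Object.entries(b)) {
    merged[replicaId] = Math.max(merged[replicaId] || 0, count);
  }
  return merged;
}

// The counter's value is the sum over all replicas.
function value(state) {
  return Object.values(state).reduce((sum, count) => sum + count, 0);
}
```

The limitation is visible immediately: this structure cannot express a decrement, and supporting one requires a different and more involved design. That is the kind of constraint that makes CRDTs awkward for arbitrary application state.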

We feel that, for most applications, CRDTs are overly complex and not worth the effort. Worse, the set of data structures that can be represented as a CRDT is too limited for many applications. It’s usually much easier to assign a single authoritative coordination point for each document, which is exactly what Durable Objects accomplish.

With that said, CRDTs can be used on top of Durable Objects. If an object’s state lends itself to CRDT treatment, then an application could replicate that object into several objects serving different regions, which then synchronize their states via CRDT. This would make sense for applications to implement as an optimization if and when they find it is worth the effort.

Last thoughts: What does it mean for state to be “serverless”?

Traditionally, serverless has focused on stateless compute. In serverless architectures, the logical unit of compute is reduced to something fine-grained: a single event, such as an HTTP request. This works especially well because events just happen to be the logical unit of work that we think about when designing server applications. No one thinks about their business logic in units of “servers” or “containers” or “processes”; we think about events. It is exactly because of this semantic alignment that serverless succeeds in shifting so much of the logistical burden of maintaining servers away from the developer and towards the cloud provider.

However, serverless architecture has traditionally been stateless. Each event executes in isolation. If you wanted to store data, you had to connect to a traditional database. If you wanted to coordinate between requests, you had to connect to some other service that provides that ability. These external services have tended to re-introduce the operational concerns that serverless was intended to avoid. Developers and service operators have to worry not just about scaling their databases to handle increasing load, but also about how to split their database into “regions” to effectively handle global traffic. The latter concern can be especially cumbersome.

So how can we apply the serverless philosophy to state? Just like serverless compute is about splitting compute into fine-grained pieces, serverless state is about splitting state into fine-grained pieces. Again, we seek to find a unit of state that corresponds to logical units in our application. The logical unit of state in an application is not a “table” or a “collection” or a “graph”. Instead, it depends on the application. The logical unit of state in a chat app is a chat room. The logical unit of state in an online spreadsheet editor is a spreadsheet. The logical unit of state in an online storefront is a shopping cart. By making the physical unit of storage provided by the storage layer match the logical unit of state inherent in the application, we can allow the underlying storage provider (Cloudflare) to take responsibility for a wide array of logistical concerns that previously fell on the developer, including scalability and regionality.

This is what Durable Objects do.

On Executive Order 12333

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/09/on-executive-order-12333.html

Mark Jaycox has written a long article on the US Executive Order 12333: “No Oversight, No Limits, No Worries: A Primer on Presidential Spying and Executive Order 12,333”:

Abstract: Executive Order 12,333 (“EO 12333”) is a 1980s Executive Order signed by President Ronald Reagan that, among other things, establishes an overarching policy framework for the Executive Branch’s spying powers. Although electronic surveillance programs authorized by EO 12333 generally target foreign intelligence from foreign targets, its permissive targeting standards allow for the substantial collection of Americans’ communications containing little to no foreign intelligence value. This fact alone necessitates closer inspection.

This working draft conducts such an inspection by collecting and coalescing the various declassifications, disclosures, legislative investigations, and news reports concerning EO 12333 electronic surveillance programs in order to provide a better understanding of how the Executive Branch implements the order and the surveillance programs it authorizes. The Article pays particular attention to EO 12333’s designation of the National Security Agency as primarily responsible for conducting signals intelligence, which includes the installation of malware, the analysis of internet traffic traversing the telecommunications backbone, the hacking of U.S.-based companies like Yahoo and Google, and the analysis of Americans’ communications, contact lists, text messages, geolocation data, and other information.

After exploring the electronic surveillance programs authorized by EO 12333, this Article proposes reforms to the existing policy framework, including narrowing the aperture of authorized surveillance, increasing privacy standards for the retention of data, and requiring greater transparency and accountability.

ElasticSearch Multitenancy With Routing

Post Syndicated from Bozho original https://techblog.bozho.net/elasticsearch-multitenancy-with-routing/

Elasticsearch is great, but optimizing it for high load is always tricky. This won’t be yet another “Tips and tricks for optimizing Elasticsearch” article – there are many great ones out there. I’m going to focus on one narrow use-case – multitenant systems, i.e. those that support multiple customers/users (tenants).

You can build a multitenant search engine in three different ways:

  • Cluster per tenant – this is the hardest to manage and requires a lot of devops automation. Depending on the types of customers it may be worth it to completely isolate them, but that’s rarely the case.
  • Index per tenant – this can be fine initially, and requires little additional coding (you just parameterize the “index” parameter in the URL of the queries), but it’s likely to cause problems as the customer base grows. Also, supporting consistent mappings and settings across indexes may be trickier than it sounds (e.g. some may reject an update and others may not depending on what’s indexed). Moving data to colder indexes also becomes more complex.
  • Tenant-based routing – this means you put everything in one cluster but you configure your search routing to be tenant-specific, which allows you to logically isolate data within a single index.

The last one seems to be the preferred option in general. What is routing? The Elasticsearch blog has a good overview and documentation. The idea lies in the way Elasticsearch handles indexing and searching – it splits data into shards (each shard is a separate Lucene index and can be replicated on more than one node). A shard is a logical grouping within a single Elasticsearch node. When no custom routing is used, and an index request comes in, the ID is used to determine which shard is going to be used to store the data. However, during search, Elasticsearch doesn’t know which shards have the data, so it has to ask multiple shards and gather the results. Related to that, there’s the newly introduced adaptive replica selection, where the proper shard replica is selected intelligently, rather than using round-robin.

Custom routing allows you to specify a routing value when indexing a document and then a search can be directed only to the shard that has the same routing value. For example, at LogSentinel when we index a log entry, we use the data source id (applicationId) for routing. So each application (data source) that generates logs has a separate identifier which allows us to query only that data source’s shard. That way, even though we may have a thousand clients with a hundred data sources each, a query will be precisely targeted to where the data for that particular customer’s data source lies.
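
Here is a small sketch of what that looks like from the client side. The `routing` query parameter on index and search requests is part of the Elasticsearch REST API. The index name `logs`, the helper names, and the sample IDs are made up for illustration. The shard calculation is a simplified model (hash of the routing value modulo the number of primary shards) and uses FNV-1a as a stand-in hash, so it will not reproduce the shard numbers Elasticsearch actually computes.

```javascript
// Stand-in hash (FNV-1a, 32-bit). Elasticsearch uses its own hash function,
// so this only illustrates "same routing value -> same shard".
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Simplified model of shard selection for a routing value.
function shardFor(routingKey, numPrimaryShards) {
  return fnv1a(routingKey) % numPrimaryShards;
}

// REST path for indexing a document with a routing value.
function indexPath(index, docId, routingKey) {
  return `/${index}/_doc/${encodeURIComponent(docId)}?routing=${encodeURIComponent(routingKey)}`;
}

// REST path for a search restricted to the shard(s) for a routing value.
function searchPath(index, routingKey) {
  return `/${index}/_search?routing=${encodeURIComponent(routingKey)}`;
}
```

Indexing with `indexPath("logs", "1", "app-42")` and later searching with `searchPath("logs", "app-42")` sends both requests to the same shard, which is the whole point of using the applicationId as the routing value.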

This is key for horizontally scaling multitenant applications. When there are terabytes of data and billions of documents, many shards will be needed (in order to avoid large and heavy shards that cause performance issues). Finding data in this haystack requires the ability to know where to look.

Note that you can (and probably should) make routing required in these cases – each indexed document must be required to have a routing key, otherwise an implementation oversight may lead to a slow index.
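
In Elasticsearch this is a mapping setting: `_routing` with `required: true`. Below is a minimal sketch of an index-creation body that turns it on, written as a JavaScript object. The variable name is ours, and a real index would also carry field mappings and settings that are omitted here.

```javascript
// Hypothetical index-creation body; only the `_routing` block matters here.
// It would be sent as the JSON body of a PUT request that creates the index.
const createIndexBody = {
  mappings: {
    // Reject any index request that does not specify a routing value.
    _routing: { required: true },
  },
};

const createIndexJson = JSON.stringify(createIndexBody);
```

With this in place, an index request that omits routing is rejected instead of silently landing on a shard chosen by document ID.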

Using custom routing you are practically turning one large Elasticsearch cluster into smaller sections, logically separated based on meaningful identifiers. In our case, it is not a userId/customerId, but one level deeper – there are multiple shards per customer, but depending on the use-case, it can be one shard per customer, using the userId/customerId. Using more than one shard per customer may complicate things a little – for example having too many shards per customer may require searches that span too many shards, but that’s not necessarily worse than not using routing.

There are some caveats – the isolation of customer data has to be handled in the application layer (whereas for the first two approaches data is segregated operationally). If there’s an application bug or lack of proper access checks, one user can query data from other users’ shards by specifying their routing key. It’s the role of the application in front of Elasticsearch to only allow queries with routing keys belonging to the currently authenticated user.
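
A sketch of such a check is shown below. Everything about its shape is an assumption: the `user.applicationIds` list, the `buildTenantSearch` name, and a keyword field called `applicationId` on each document. It also adds an explicit filter, because several routing values can hash to the same shard and routing alone does not exclude another tenant's documents stored there.

```javascript
// Hypothetical guard placed in front of Elasticsearch.
// `user.applicationIds` is assumed to list the data sources the
// authenticated user is allowed to query.
function buildTenantSearch(user, index, applicationId, userQuery) {
  if (!user.applicationIds.includes(applicationId)) {
    throw new Error("Routing key does not belong to the authenticated user");
  }
  return {
    path: `/${index}/_search?routing=${encodeURIComponent(applicationId)}`,
    body: {
      query: {
        bool: {
          must: [userQuery],
          // Routing picks the shard; this filter keeps other tenants'
          // documents on that shard out of the results.
          filter: [{ term: { applicationId } }],
        },
      },
    },
  };
}
```

The important design point is that the routing value is never taken from the request on trust. It is checked against the authenticated user before any query is built.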

There are cases when the first two approaches to multitenancy are viable (e.g. a few very large customers), but in general the routing approach is the most scalable one.

The post ElasticSearch Multitenancy With Routing appeared first on Bozho's tech blog.

Raspberry Pi High Quality Camera takes photos through thousands of straws

Post Syndicated from Ashley Whittaker original https://www.raspberrypi.org/blog/raspberry-pi-high-quality-camera-takes-photos-through-thousands-of-straws/

Adrian Hanft is our favourite kind of maker: weird. He’s also the guy who invented the Lego camera, 16 years ago. This time, he spent more than a year creating what he describes as “one of the strangest cameras you may ever hear about.”

What? Looks normal from here. Massive, but normal

What’s with all the straws?

OK, here’s why it’s weird: it takes photos with a Raspberry Pi High Quality Camera through a ‘lens’ of tiny drinking straws packed together. 23,248 straws, to be exact, are inside the wooden box-shaped bit of the machine above. The camera itself sits at the slim end of the black and white part. The Raspberry Pi, power bank, and controller all sit on top of the wooden box full of straws.

Here’s what an image of Yoda looks like, photographed through that many straws:

Mosaic, but make it techy

Ground glass lenses

The concept isn’t as easy as it may look. As you can see from the images below, if you hold up a load of straws, you can only see the light through a few of them. Adrian turned to older technology for a solution, taking a viewfinder from an old camera which had ground glass (which ‘collects’ light) on the surface.

Left: looking through straws at light with the naked eye
Right: the same straws viewed through a ground glass lens

Even though Adrian was completely new to both Raspberry Pi and Python, it only took him a week of evenings and weekends to code the software needed to control the Raspberry Pi High Quality Camera.

Long story short, on the left is the final camera, with all the prototypes queued up behind it

An original Nintendo controller runs the show and connects to the Raspberry Pi with a USB adapter. The buttons are mapped to the functions of Adrian’s software.

A super satisfying time-lapse of the straws being loaded

What does the Nintendo controller do?

In his original post, Adrian explains what all the buttons on the controller do in order to create images:

“The Start button launches a preview of what the camera is seeing. The A button takes a picture. The Up and Down buttons increase or decrease the exposure time by 1 second. The Select button launches a gallery of photos so I can see the last photo I took. The Right and Left buttons cycle between photos in the gallery. I am saving the B button for something else in the future. Maybe I will use it for uploading to Dropbox, I haven’t decided yet.”
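The button mapping Adrian describes can be pictured as a small state machine. The Python sketch below is hypothetical and is not his code: the class name `CameraState`, the button labels, the one-second minimum exposure and the wrap-around gallery are all assumptions, and no real camera or controller is touched.

```python
from dataclasses import dataclass, field


@dataclass
class CameraState:
    """Toy model of the controller-driven camera software."""

    exposure_s: int = 1                          # exposure time in seconds
    photos: list = field(default_factory=list)   # names of captured photos
    gallery_index: int = -1                      # photo shown in the gallery
    mode: str = "idle"                           # "idle", "preview" or "gallery"

    def press(self, button):
        if button == "START":
            self.mode = "preview"                # show what the camera sees
        elif button == "A":
            name = "photo_{}_{}s".format(len(self.photos) + 1, self.exposure_s)
            self.photos.append(name)             # stand-in for taking a picture
        elif button == "UP":
            self.exposure_s += 1
        elif button == "DOWN":
            # Assumed lower bound of one second.
            self.exposure_s = max(1, self.exposure_s - 1)
        elif button == "SELECT" and self.photos:
            self.mode = "gallery"
            self.gallery_index = len(self.photos) - 1   # start at the last photo
        elif button in ("LEFT", "RIGHT") and self.mode == "gallery":
            step = 1 if button == "RIGHT" else -1
            self.gallery_index = (self.gallery_index + step) % len(self.photos)
        # "B" is deliberately left unassigned, as in the quote above.
```

Pressing START, UP, UP, A, DOWN, A, SELECT, LEFT in that order leaves two photos in the list, an exposure of 2 seconds, and the gallery showing the first photo.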

Adrian made a Lego mount for the Raspberry Pi camera
The Lego mount makes it easy to switch between cameras and lenses

A mobile phone serves as a wireless display so he can keep an eye on what’s going on. The phone communicates with the Raspberry Pi connected to the camera via a VPN app.

One of the prototypes in action

Follow Adrian on Instagram to keep up with all the photography captured using the final camera, as well as the prototypes that came before it.

The post Raspberry Pi High Quality Camera takes photos through thousands of straws appeared first on Raspberry Pi.
