Security updates for Tuesday

Post Syndicated from corbet original https://lwn.net/Articles/895521/

Security updates have been issued by Debian (cifs-utils, ffmpeg, libxml2, and vim), Fedora (rsyslog), Mageia (chromium-browser-stable), SUSE (chromium, containerd, docker, e2fsprogs, gzip, jackson-databind, jackson-dataformats-binary, jackson-annotations, jackson-bom, jackson-core, kernel, nodejs8, openldap2, pidgin, podofo, slurm, and tiff), and Ubuntu (clamav, containerd, libxml2, and openldap).

Debugging Hardware Performance on Gen X Servers

Post Syndicated from Yasir Jamal original https://blog.cloudflare.com/debugging-hardware-performance-on-gen-x-servers/

In Cloudflare's global network, every server runs the whole software stack. Therefore, it's critical that every server performs to its maximum potential capacity. To give us better flexibility from a supply chain perspective, we buy server hardware with the exact same configuration from multiple vendors. However, after the deployment of our Gen X AMD EPYC Zen 2 (Rome) servers, we noticed that servers from one vendor (which we'll call SKU-B) were consistently performing 5-10% worse than servers from the second vendor (which we'll call SKU-A).

The graph below shows the performance discrepancy between the two SKUs as a percentage difference. Performance is measured in requests per second (RPS), and the data is an average of observations captured over 24 hours.

Machines before implementing performance improvements. The average RPS for SKU-B is approximately 10% below SKU-A.

Compute performance via DGEMM

The initial debugging efforts centered around the compute performance. We ran AMD’s DGEMM high performance computing tool to determine if CPU performance was the cause. DGEMM is designed to measure the sustained floating-point computation rate of a single server. Specifically, the code measures the floating point rate of execution of a real matrix–matrix multiplication with double precision. We ran a modified version of DGEMM equipped with specific AMD libraries to support the EPYC processor instruction set.
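
A rough way to reproduce this kind of measurement without AMD's tooling is to time a large double-precision matrix multiplication and convert the elapsed time into GFLOPS. The sketch below uses Python with numpy (which delegates to the system BLAS); it illustrates the metric only and is not a substitute for the vendor-tuned DGEMM binary.

    import time
    import numpy as np

    n = 8192                                  # matrix dimension; adjust to fit available memory
    a = np.random.rand(n, n)                  # numpy defaults to float64, i.e. double precision
    b = np.random.rand(n, n)

    start = time.perf_counter()
    c = a @ b                                 # real matrix-matrix multiply (the DGEMM path in BLAS)
    elapsed = time.perf_counter() - start

    flops = 2 * n**3                          # multiplies plus adds for an n x n x n GEMM
    print(f"{flops / elapsed / 1e9:.1f} GFLOPS in {elapsed:.2f} s")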

The DGEMM test brought out a few points related to the processor’s Thermal Design Power (TDP). TDP refers to the maximum power that a processor can draw for a thermally significant period while running a software application. The underperforming servers were only drawing 215 to 220 watts of power when fully stressed, whereas the max supported TDP on the processors is 240 watts. Additionally, their floating-point computation rate was ~100 gigaflops (GFLOPS) behind the better performing servers.

Screenshot from the DGEMM run of a good system:

Screenshot from an underperforming system:

To debug the issue, we first tried disabling idle power saving mode (also known as C-states) in the CPU BIOS configuration. The servers then reported the expected GFLOPS numbers and achieved max TDP. Believing this could have been the root cause, we moved the servers back into the production test environment for data collection.

However, the performance delta was still there.

Further debugging

We started the debugging process all over again by comparing the BIOS settings logs of both SKU-A and SKU-B. Once this debugging option was exhausted, the focus shifted to the network interface. We ran the open source network performance tool iPerf to check if there were any bottlenecks; no issues were observed either.

During this activity, we noticed that the BIOS on both SKUs was not using the AMD Preferred I/O functionality, which allows devices on a single PCIe bus to obtain improved DMA write performance. We enabled the Preferred I/O option on SKU-B and tested the change in production. Unfortunately, there were no performance gains. After the focus on network activity, the team started looking into memory configuration and operating speed.

AMD HSMP tool & Infinity Fabric diagnosis

The Gen X systems are configured with DDR4 memory modules that can support a maximum of 2933 megatransfers per second (MT/s). With the BIOS memory clock frequency setting on Auto, the 2933 MT/s rating automatically configures a memory clock frequency of 1467 MHz. Double Data Rate (DDR) technology allows the memory signal to be sampled twice per clock cycle: once on the rising edge and once on the falling edge of the clock signal. Because of this, the reported memory speed rate is twice the true memory clock frequency. Memory bandwidth was validated by running the STREAM benchmark and found to be performing well.
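
For reference, the arithmetic behind those numbers is straightforward. The channel count below is an assumption based on the EPYC Rome platform (eight DDR4 channels per socket) rather than something measured in this investigation:

    transfers_per_second = 2933e6        # 2933 MT/s, as configured in the BIOS
    clock_hz = transfers_per_second / 2  # DDR samples on both clock edges
    print(clock_hz / 1e6)                # 1466.5, reported as ~1467 MHz

    channels = 8                         # assumed: 8 DDR4 channels per EPYC Rome socket
    bytes_per_transfer = 8               # 64-bit data bus per channel
    peak = channels * bytes_per_transfer * transfers_per_second / 1e9
    print(peak)                          # ~187.7 GB/s theoretical peak per socket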

Running out of debugging options, we reached out to AMD for assistance and were provided a debug tool called HSMP that lets users access the Host System Management Port. This tool provides a wide variety of processor management options, such as reading and changing processor TDP limits, reading processor and memory temperatures, and reading memory and processor clock frequencies. When we ran the HSMP tool, we discovered that the memory was being clocked at a different rate from AMD’s Infinity Fabric system — the architecture which connects the memory to the processor core. As shown below, the memory clock frequency was set to 1467 MHz while the Infinity Fabric clock frequency was set to 1333 MHz.

Ideally, the two clock frequencies need to be equal for optimized workloads sensitive to latency and throughput. Below is a snapshot for an SKU-A server where both memory and Infinity Fabric frequencies are equal.

Improved Performance

Since the Infinity Fabric clock setting was not exposed as a tunable parameter in the BIOS, we asked the vendor to provide a new BIOS that set the frequency to 1467 MHz at compile time. Once we deployed the new BIOS on the underperforming systems in production, we saw that all servers started performing to their expected levels. Below is a snapshot of the same set of systems, with data captured and averaged over a 24-hour window on the same day of the week as the previous dataset. Now, the requests per second performance of SKU-B equals, and in some cases exceeds, the performance of SKU-A!

Internet Requests Per Second (RPS) for four SKU-A machines and four SKU-B machines after implementing the BIOS fix on SKU-B. The RPS of SKU-B now equals the RPS of SKU-A.

Hello, I am Yasir Jamal. I recently joined Cloudflare as a Hardware Engineer with the goal of helping provide a better Internet to everyone. If you share the same interest, come join us!

Announcing our Spring Developer Challenge

Post Syndicated from Albert Zhao original https://blog.cloudflare.com/announcing-our-spring-developer-challenge/

After many announcements from Platform Week, we’re thrilled to make one more: our Spring Developer Challenge!

The theme for this challenge is building real-time, collaborative applications, one of the most exciting use cases emerging in the Cloudflare ecosystem. This is an opportunity for developers to merge their ideas with our newly released features, earn recognition on our blog, and take home our best swag yet.

Here’s a list of our tools that will get you started:

  • Workers can either be powerful middleware connecting your app to different APIs and an origin, or they can be the entire application itself. We recommend using Worktop, a popular framework for Workers, if you need TypeScript support, routing, and well-organized submodules. Worktop can also complement your existing app even if it already uses a framework, such as Svelte.
  • Cloudflare Pages makes it incredibly easy to deploy sites, which you can make into truly dynamic apps by putting a Worker in front or using the Pages Functions (beta).
  • Durable Objects are great for collaborative apps because you can use websockets while coordinating state at the edge, as seen in this chat demo. To help scale any load, we also recommend Durable Object Groups.
  • Workers KV provides a global key-value data store that securely stores and quickly serves data across Cloudflare’s network. R2 allows you to store enormous amounts of data without trapping you with costly egress services.

Last year, our Developer Spotlight series highlighted how developers around the world built entire applications on Cloudflare. Our Discord server maintained that momentum with users demonstrating that any type of application can be built. Need a way to organize thousands of lines of JSON? JSON Hero, built with Remix and deployed with Workers, provides an incredibly readable UI for your JSON files. Trying to deploy a GraphQL server for your app that scales? helix-flare deploys a GraphQL server easily through Workers and uses Durable Objects to coordinate data.

We hope developers continue to explore the boundaries of what they can build on Cloudflare as our platform evolves. During our Summer Developer Challenge in 2021, we received over 1,200 submissions that revealed you can build almost any app imaginable with Workers, Pages, and the rest of the developer ecosystem. We sent out hundreds of swag boxes to participants, to show our appreciation. The ensuing unboxing videos on Twitter and YouTube thrilled our team.

This year’s Spring Developer Challenge is all about making real-time, collaborative apps such as chat rooms, games, web-based editing tools, or anything else in your imagination! Here are the rules:

  • You must be at least 18 years old to participate
  • You can work in teams of up to 10 people per submission
  • The deadline to submit your repo is May 24

Enter the challenge by going to this site.

As you build your app, join our Discord if you or your team need any help. We will be enthusiastically reviewing submissions, promoting them on Twitter, and sending out swag boxes.

If you’re new to Cloudflare or have an exciting idea as a developer, this is your opportunity to see how far our platform has evolved and get rewarded for it!

Attacks on Managed Service Providers Expected to Increase

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/attacks-on-managed-service-providers-expected-to-increase.html

CISA, NSA, FBI, and similar organizations in the other Five Eyes countries are warning that attacks on MSPs — as a vector to their customers — are likely to increase. No details about what this prediction is based on. Makes sense, though. The SolarWinds attack was incredibly successful for the Russian SVR, and a blueprint for future attacks.

News articles.

Service architecture revamp

Post Syndicated from Grab Tech original https://engineering.grab.com/service-architecture-revamp

Background

Prior to 2021, Grab’s search architecture was designed to only support textual matching, which takes in a user query and looks for exact matches within the ecosystem through an inverted index. This legacy system meant that only textual matching results could be fetched.

In the second half of 2021, the Deliveries search team worked on improving this architecture to make it smarter, more scalable and also unlock future growth for different search use cases at Grab. The figure below shows a simplified overview of the legacy architecture.

Legacy architecture

Problem statement

With the legacy system, we noticed several problems.

Search results were textually matched without considering intention and context

If a user types in the query “Roti Prata” (flatbread), they are likely looking for Roti Prata dishes, so matches on the dish name should be prioritised over matches on the merchant-partner’s name or matches on other entities.

In the legacy system, all entities whose names partially matched “Roti Prata” were displayed and ranked according to hard-coded weights, and matches with merchant-partner names were always prioritised, even if the user’s intention was clearly to search for the “Roti Prata” dish itself.

This problem was more common in Mart, as users often intended to search for items instead of shops. Besides the lack of intention recognition, the search system was also unable to take context into consideration; users searching the same keyword query at different times and locations could have different objectives. For example, users searching for “Bread” during the day may be looking for cafes, while searches at night could be for breakfast the next day.

Search results from multiple business verticals were not blended effectively

In Grab’s context, results from multiple verticals were often merged. For example, in Mart searches, Ads and Mart organic search results were displayed together; in Food searches, Ads, Food and Mart organic results were blended together.

In the legacy architecture, multiple business verticals were merged on the Deliveries API layer, which resulted in a leak of abstraction and a loss of useful data, because data from the search recall stage was not taken into account during the merge stage.

Inability to quickly scale to new search use cases and difficulty in reusing existing components

The legacy code base was not written in a structured way that could scale to new use cases easily. Because new search use cases could not be built on top of the existing system, the same functionality had to be tediously rebuilt every time a new search use case appeared.

Solution

In this section, solutions from both architecture and implementation perspectives are presented to address the above problem statements.

Architecture

In the new architecture, the flow is extended from lexical recall only to multiple layers, including boosting, multi-recall, and ranking. The addition of boosting enables capabilities like intent recognition and query expansion, while the change from a single lexical recall to multi-recall opens up the potential for other recall methods, e.g. embedding-based and graph-based.

These changes help address the first problem statement. Furthermore, the multi-recall framework enables fetching results from multiple business verticals, addressing the second problem statement. In the new framework, results from different verticals and different recall methods are grouped and ranked together without any leak of abstraction or loss of useful data from the search recall stage during ranking.

Upgraded architecture

Implementation

We believe that the key to a platform’s success is modularisation and flexible assembling of plugins to enable quick product iteration. That is why we implemented a combination of a framework defined by the platform and plugins provided by service teams. In this implementation, plugins are assembled through configurations, which addresses the third problem statement and has two advantages:

  • Separation of concern. With the main flow abstracted and maintained by the platform, service team developers could focus on the application logic by writing plugins and fitting them into the main flow. In this case, developers without search experience could quickly enable new search flows.
  • Reusing plugins and economies of scale. With more use cases onboarded, more plugins are written by service teams, and these plugins are reusable assets, resulting in a scale effect. For example, an Ads recall plugin could be reused in Food keyword or non-keyword searches, Mart keyword or non-keyword searches, and universal search flows, as all these searches contain non-organic Ads. Similarly, a Mart recall plugin could be reused in Mart keyword or non-keyword searches, universal search, and Food keyword search flows, as all these flows contain Mart results. With more plugins accumulated on our platform, developers might be able to ship a new search flow by just reusing and assembling the existing plugins (a minimal sketch of this idea follows this list).
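
The snippet below is not Grab's implementation; it is a minimal Python sketch of the configuration-driven assembly idea described above, in which the platform owns the main flow and a per-use-case configuration selects which registered plugins run at each stage. All names are illustrative.

    PLUGINS = {}

    def plugin(name):
        """Service teams register their plugins with the platform under a name."""
        def register(cls):
            PLUGINS[name] = cls
            return cls
        return register

    @plugin("ads_recall")
    class AdsRecall:
        def run(self, query, results):
            return results + [f"ad result for '{query}'"]

    @plugin("mart_recall")
    class MartRecall:
        def run(self, query, results):
            return results + [f"mart result for '{query}'"]

    def run_search(query, config):
        """Platform-owned main flow: each stage simply runs the plugins the config names."""
        results = []
        for stage in ("boosting", "recall", "ranking"):
            for name in config.get(stage, []):
                results = PLUGINS[name]().run(query, results)
        return results

    # A new search flow is assembled purely through configuration:
    food_keyword_search = {"recall": ["ads_recall", "mart_recall"]}
    print(run_search("roti prata", food_keyword_search))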

Conclusion

Our platform now has a smart search with intent recognition and semantic (embedding-based) search. We added intention recognition to the boosting step and embedding-based retrieval as an additional recall method in the multi-recall step, and the process of adding new modules like these is now more straightforward. These modules can be easily reused by other use cases.

On top of that, we also have a framework for mixing Ads and organic results. This means that data from the recall stage, e.g. text relevance, is taken into consideration, and Ads can now be ranked together with organic results.

With a modularised design and plugins provided by the platform, it is easier for clients to use our platform with a simple onboarding process. Furthermore, plugins can be reused to cater to new use cases and achieve a scale effect.

Join us

Grab is the leading superapp platform in Southeast Asia, providing everyday services that matter to consumers. More than just a ride-hailing and food delivery app, Grab offers a wide range of on-demand services in the region, including mobility, food, package and grocery delivery services, mobile payments, and financial services across 428 cities in eight countries.

Powered by technology and driven by heart, our mission is to drive Southeast Asia forward by creating economic empowerment for everyone. If this mission speaks to you, join our team today!

Federate access to Amazon Redshift query editor V2 with Active Directory Federation Services (AD FS): Part 3

Post Syndicated from Sumeet Joshi original https://aws.amazon.com/blogs/big-data/federate-access-to-amazon-redshift-query-editor-v2-with-active-directory-federation-services-ad-fs-part-3/

In the first post of this series, Federate access to your Amazon Redshift cluster with Active Directory Federation Services (AD FS): Part 1, you set up Microsoft Active Directory Federation Services (AD FS) and Security Assertion Markup Language (SAML) based authentication and tested the SAML federation using a web browser.

In Part 2, you learned to set up an Amazon Redshift cluster and use federated authentication with AD FS to connect from a JDBC SQL client tool.

In this post, we walk through the steps to configure Amazon Redshift query editor v2 to work with AD FS federation SSO.

Organizations want to enable their end-users such as data analysts, data scientists, and database developers to use the query editor v2 to accelerate self-service analytics. Amazon Redshift query editor v2 lets users explore, analyze, and collaborate on data. You can use the query editor to create databases, schemas, tables, and load data from Amazon Simple Storage Service (Amazon S3) using the COPY command or by using a wizard. You can browse multiple databases and run queries on your Amazon Redshift data warehouse or data lake, or run federated queries to operational databases such as Amazon Aurora.

In this post, we show how you can use your corporate Active Directory (AD) and the SAML 2.0 AD FS identity provider (IdP) to enable your users to easily access Amazon Redshift clusters through query editor v2 using corporate user names without managing database users and passwords. We also demonstrate how you can limit the access for your users to use only the query editor without giving them access to perform any admin functions on the AWS Management Console.

Solution overview

After you follow the steps explained in Part 1, you set up a deep link for federated users via the SAML 2.0 RelayState parameter in AD FS. You use the user you set up in your AD in Part 1 (Bob) to authenticate using AD FS and control access to database objects based on the group the user is assigned to. You also test if user Bob is integrated with Amazon Redshift database groups as controlled in AD groups.

By the end of this post, you will have created a unique deep link that authenticates the user Bob using AD FS and redirects them directly to the query editor v2 console, where they’re authenticated using the federation SSO option.

The sign-in process is as follows:

  1. The user chooses a deep link that redirects to the IdP for authentication with the information about the destination (query editor v2, in our case) URL embedded in the RelayState parameter. The user enters their credentials on the login page.
  2. Your IdP (AD FS in this case) verifies the user’s identity in your organization.
  3. Your IdP generates a SAML authentication response that includes assertions that identify the user and attributes about the user. The IdP sends this response to the user’s browser.
  4. The user’s browser is redirected to the AWS sign-in endpoint for SAML and posts the SAML assertion and the RelayState parameter.
  5. The endpoint calls the AssumeRoleWithSAML API action to request temporary credentials for the AWS Identity and Access Management (IAM) role specified in the SAML assertion, and creates a query editor v2 console sign-in URL that uses those credentials. The IAM role trusts the SAML federation entity and also has a policy that grants access to query editor v2. If the SAML authentication response includes attributes that map to multiple IAM roles, the user is first prompted to choose the role to use for access to the query editor v2 console. The sign-in URL is the one specified by the RelayState parameter. (A minimal sketch of this STS call follows the flow description.)
  6. AWS sends the sign-in URL back to the user’s browser as a redirect.
  7. The user’s browser is redirected to the Amazon Redshift query editor v2 console defined by the RelayState parameter.

The following diagram illustrates this flow.
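
The AWS sign-in endpoint performs step 5 on the user's behalf. For reference, a minimal boto3 sketch of the equivalent STS call is shown below; the ARNs are placeholders, and the SAML assertion would be the base64-encoded response captured from AD FS.

    import boto3

    saml_assertion = "<base64-encoded SAML response from AD FS>"      # placeholder

    sts = boto3.client("sts")
    resp = sts.assume_role_with_saml(
        RoleArn="arn:aws:iam::111122223333:role/ADFZ-Dev",            # role chosen from the assertion
        PrincipalArn="arn:aws:iam::111122223333:saml-provider/adfs",  # the IAM SAML provider
        SAMLAssertion=saml_assertion,
        DurationSeconds=3600,
    )
    creds = resp["Credentials"]   # temporary AccessKeyId, SecretAccessKey, and SessionToken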

In this post, we walk you through the following steps:

  1. Set up the Sales group in AD and set up the PrincipalTag claim rules in AD FS.
  2. Update the IAM roles.
  3. Construct the SSO URL to authenticate and redirect users to the Amazon Redshift query editor v2 console.
  4. Set up Amazon Redshift database groups and permissions on the Amazon Redshift cluster.
  5. Set up Amazon Redshift query editor v2 to use federated authentication with AD FS to connect directly from the query editor interface.
  6. Query Amazon Redshift objects to validate your authorization.

Prerequisites

For this walkthrough, complete the following prerequisite steps:

  1. Create an Amazon Redshift cluster. For instructions, refer to Create a sample Amazon Redshift cluster or complete the steps in Part 2 of this series.
  2. Complete the steps in Part 1 to set up SAML federation with AD FS:
    1. Set up an AD domain controller using an AWS CloudFormation template on a Windows 2016 Amazon Elastic Compute Cloud (Amazon EC2) instance.
    2. Configure federation in AD FS.
    3. Configure AWS as the relying party with AD FS using an IAM SAML provider and SAML roles with an attached policy to allow access to the Amazon Redshift cluster.
    4. Configure claim rules.
    5. Test the SAML authentication using a web browser.
  3. Verify that your IdP supports RelayState and is enabled. If you’re using AD FS 2.0, you need to download and install either Update Rollup 3 or Update Rollup 2 from Microsoft to enable the RelayState parameter.

Configure AD and AD FS

After you configure your AD FS and AD services by following the instructions in Part 1, you can set up the following AD group and claim rules.

In this post, you use the user Bob to log in to Amazon Redshift and check if Bob can access the Sales and Marketing schemas on the Amazon Redshift cluster. To create the sales group and assign the user [email protected] to it, log in to your AD FS server (Amazon EC2 machine) that you created in Part 1 and use the Windows command tool to run the following command:

dsadd group "cn=RSDB-sales, cn=Users, dc=adfsredshift, dc=com" -members "cn=Bob, cn=Users, dc=adfsredshift, dc=com"

Now you’re ready to create your custom claim rules: PrincipalTag:RedshiftDbUser and PrincipalTag:RedshiftDbGroup.

PrincipalTag:RedshiftDbUser

The custom claim rule PrincipalTag:RedshiftDbUser is mapped to the user principal name (UPN) in AD FS. When a user authenticates through federated SSO, this claim rule is mapped to the user name. If the user doesn’t exist in the Amazon Redshift database, the user is automatically created. The auto-create option is granted through an IAM policy that is attached to the IAM role; the CreateClusterUser permission allows auto creation of the user (you set this up in Part 1 as a prerequisite).
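
Query editor v2 requests temporary database credentials on the user's behalf; the following boto3 sketch is a rough equivalent of what the GetClusterCredentials, CreateClusterUser, and JoinGroup permissions allow, with placeholder names.

    import boto3

    redshift = boto3.client("redshift", region_name="eu-west-1")
    creds = redshift.get_cluster_credentials(
        ClusterIdentifier="redshift-cluster-1",   # placeholder cluster identifier
        DbName="dev",                             # placeholder database name
        DbUser="bob",                             # in the federated flow this comes from PrincipalTag:RedshiftDbUser
        DbGroups=["marketing", "sales"],          # in the federated flow these come from PrincipalTag:RedshiftDbGroups
        AutoCreate=True,                          # allowed by the redshift:CreateClusterUser permission
        DurationSeconds=900,
    )
    # creds["DbUser"] and creds["DbPassword"] are the temporary database credentials.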

Complete the following steps to create your custom claim rule:

    1. On the AD FS management console, choose Relying Party Trusts.
    2. Choose Edit Claim Issuance Policy.
    3. Choose Add Rule.
    4. On the Choose Rule Type page, for Claim rule template, choose Send Claims Using a Custom Rule.
    5. Choose Next.
    6. For Claims rule name, enter RedshiftDbUser.
    7. Add the following custom rule:
      c:[Type == "http://schemas.microsoft.com/ws/2008/06/identity/claims/windowsaccountname", Issuer == "AD AUTHORITY"]
       => issue(store = "Active Directory", types = ("https://aws.amazon.com/SAML/Attributes/PrincipalTag:RedshiftDbUser"), query = ";userPrincipalName;{0}", param = c.Value);

    8. Choose Finish.
    9. Capture the claim rules sent in a SAML assertion response through your browser. For instructions, refer to How to view a SAML response in your browser for troubleshooting.

In my example, I use the following SAML attribute for the RedshiftDbUser PrincipalTag:

<Attribute Name="https://aws.amazon.com/SAML/Attributes/PrincipalTag:RedshiftDbUser">
  <AttributeValue>[email protected]</AttributeValue>
</Attribute>

PrincipalTag:RedshiftDbGroups

The custom claim rule PrincipalTag:RedshiftDbGroups is built from the AD groups that the user is a member of. This rule is mapped to the Amazon Redshift database groups; the AD group and Amazon Redshift database group names should match. The JoinGroup permission set in the IAM policy allows the user to assume a database group, and the membership is session based. If the user is mapped to multiple AD groups, the SAML assertion response should send those groups as colon-separated (:) values and not as multiple-value claims. The following steps demonstrate how to send AD groups as colon-separated values.

In this example, the user Bob is assigned to the marketing and sales groups. The following code shows how to send multiple groups through the SAML response when the user is in multiple groups, and also how to handle the situation when a user doesn’t exist in any particular group.

  1. Follow the same steps as in the previous section to create the rule Marketing, using the following code for the custom rule:
    c:[Type == " http://temp/groups", Value =~ "RSDB-Marketing"] => add(Type = " http://temp/marketing", Value = c.Value);

  2. Create the rule MarketingNotExists using the following code:
    NOT EXISTS([Type == "http://temp/groups", Value =~ "RSDB-marketing"]) => add(Type = "http://temp/marketing", Value = "");

  3. Create the rule sales using the following code:
    c:[Type == " http://temp/groups", Value =~ "RSDB-sales"] => add(Type = " http://temp/marketing", Value = c.Value);

  4. Create the rule SalesNotExists using the following code:
    NOT EXISTS([Type == "http://temp/groups", Value =~ "RSDB-sales"])
     => add(Type = "http://temp/sales", Value = ""); 

  5. Create the rule RedshiftDbGroups using the following code:
    c:[Type == "http://temp/marketing"]
     && c2:[Type == "http://temp/sales"]
     => issue(Type = "https://aws.amazon.com/SAML/Attributes/PrincipalTag:RedshiftDbGroups", Value = c.Value + ":" + c2.Value);

The following screenshot shows the list of rules that I created in my AD FS. Note the number of rules and the order in which they’re positioned. We created rules 6–11 as part of this post.

If you see a similar SAML response for RedshiftDbGroups, your setup is good:

<Attribute Name="https://redshift.amazon.com/SAML/Attributes/PrincipalTag:RedshiftDbGroups"> <AttributeValue> marketing:sales</AttributeValue>

If a user doesn’t exist in one of the groups, an empty value is passed to the claim rule. For example, if the user Bob is removed from the marketing group, the SAML response for PrincipalTag:RedshiftDbGroups would be :sales.

Update IAM roles

In Part 1 of this series, you created two IAM roles: ADFZ-Dev and ADFZ-Production. These two roles aren’t yet set up with grants on the query editor. In this section, you update these roles with query editor permissions.

Amazon Redshift query editor v2 provides multiple managed policies to access the query editor. For a list of all the managed policies, refer to Configuring your AWS account. For this post, we attach the AmazonRedshiftQueryEditorV2ReadSharing managed policy to the roles.

  1. On the IAM console, choose Roles in the navigation pane.
  2. Choose the role ADFZ-Dev.
  3. Choose Add permissions and then Attach policies.
  4. Under Other permission policies, search for the AmazonRedshiftQueryEditorV2ReadSharing managed policy.
  5. Select the policy and choose Attach policies.
  6. Modify the trust relationship for your role to add the sts:TagSession action: with the role open, select Trust relationships and choose Edit trust policy. When using session tags, the trust policies of all roles connected to the IdP passing tags must include the sts:TagSession permission; for roles without this permission in the trust policy, the AssumeRole operation fails. (A minimal sketch of the updated trust policy follows this list.)
  7. Choose Update policy.
  8. Repeat these steps to attach the AmazonRedshiftQueryEditorV2ReadSharing managed policy to the ADFZ-Production role.
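
A trust policy that satisfies step 6 looks roughly like the following. The sketch applies it with boto3; the account ID and SAML provider name are placeholders.

    import json
    import boto3

    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Federated": "arn:aws:iam::111122223333:saml-provider/adfs"},  # placeholder
            "Action": ["sts:AssumeRoleWithSAML", "sts:TagSession"],
            "Condition": {"StringEquals": {"SAML:aud": "https://signin.aws.amazon.com/saml"}},
        }],
    }

    iam = boto3.client("iam")
    iam.update_assume_role_policy(RoleName="ADFZ-Dev", PolicyDocument=json.dumps(trust_policy))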

Limiting user access to the query editor only

If you would like to limit users to accessing only the query editor, update the policy redshift-marketing that you created in the Part 1 blog post as shown below.

Note: Once updated, users will lose admin privileges such as creating clusters.

Replace the region, account, and cluster parameters. This custom policy grants access to Amazon Redshift to get cluster credentials, create users, and allow users to join groups.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RedshiftClusterPermissions",
            "Effect": "Allow",
            "Action": [
                "redshift:GetClusterCredentials",
                "redshift:CreateClusterUser",
                "redshift:JoinGroup"
            ],
            "Resource": [
                "arn:aws:redshift:<region>:<account>:cluster:<cluster>,
                "arn:aws:redshift:<region>:<account>:dbuser:<cluster>/${aws:PrincipalTag/RedshiftDbUser}",
                "arn:aws:redshift:<region>:<account>:dbgroup:<cluster>/marketing",
                "arn:aws:redshift:<region>:<account>:dbgroup:<cluster>/sales",
                "arn:aws:redshift:<region>:<account>:dbname:<cluster>/${redshift:DBName}"
            ]
        }
    ]
}

There are a few important things to note:

  1. The group membership lasts only for the duration of the user session.
  2. There is no CreateGroup permission because groups need to be manually created and granted DB privileges.

Generate the SSO URL with the Amazon Redshift query editor v2 console as the destination

In this step, you construct the sign-in URL for the AD FS IdP to redirect users to the Amazon Redshift query editor v2 console. For instructions, refer to How to Use SAML to Automatically Direct Federated Users to a Specific AWS Management Console Page.

To provide a full SSO experience for the end-users, the SAML response can include an optional parameter called RelayState. This parameter contains the destination URL.

Microsoft provides a tool to help generate these SSO URLs for AD FS called the AD FS 2.0 RelayState Generator.

To build this URL, you need three pieces of information:

  • IdP URL string – The string is in the format https://ADFSSERVER/adfs/ls/idpinitiatedsignon.aspx. For this post, we use https://EC2AMAZ-F9TJOIC.adfsredshift.com/adfs/ls/IdpInitiatedSignOn.aspx.
  • Relying party identifier – For AWS, this is urn:amazon:webservices.
  • Relay state or target app – This is the AWS Management Console URL that you want your authenticated users to be redirected to. In this case, it’s https://eu-west-1.console.aws.amazon.com/sqlworkbench/home?. For this post, we use the eu-west-1 Region, but you can adjust this as needed.

I followed the instructions in How to Use SAML to Automatically Direct Federated Users to a Specific AWS Management Console Page and used the AD FS 2.0 RelayState Generator to generate the URL shown in the following screenshot.

The following is an example of the final URL that you use to get authenticated and also get redirected to Amazon Redshift query editor v2 (this URL won’t work in your setup because it has been created specifically for an AD FS server in my account): https://EC2AMAZ-F9TJOIC.adfsredshift.com/adfs/ls/IdpInitiatedSignOn.aspx?RelayState=RPID%3Durn%253Aamazon%253Awebservices%26RelayState%3Dhttps%253A%252F%252Feu-west-1.console.aws.amazon.com%252Fsqlworkbench%252Fhome%253Fregion%253Deu-west-1%2523%252Fclient
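
If you prefer to construct the deep link yourself instead of using the RelayState generator, the encoding can be reproduced in a few lines of Python. Note that the target URL is URL-encoded twice, which is why the final link contains sequences such as %253A; the host name below is the example AD FS server from this post.

    from urllib.parse import quote

    idp = "https://EC2AMAZ-F9TJOIC.adfsredshift.com/adfs/ls/IdpInitiatedSignOn.aspx"
    relying_party = "urn:amazon:webservices"
    target = "https://eu-west-1.console.aws.amazon.com/sqlworkbench/home?region=eu-west-1#/client"

    inner = "RPID=" + quote(relying_party, safe="") + "&RelayState=" + quote(target, safe="")
    deep_link = idp + "?RelayState=" + quote(inner, safe="")
    print(deep_link)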

You can now save this URL and use it from anywhere you can reach your AD FS server. After you enter the URL in a browser, you first authenticate to AD FS, then you’re redirected to the Amazon Redshift query editor v2 console.

Set up DB groups on Amazon Redshift cluster

In this step, you set up your Amazon Redshift database group. This step is necessary because when the user is authenticated, they have to be part of an Amazon Redshift DB group with proper permissions set on a schema or table (or view).

In Active Directory, the user Bob is part of two groups: Sales and Marketing. In your Amazon Redshift database, you have three database groups: Sales, Marketing, and Finance.

When user Bob logs in via federated authentication, the user assumes the Sales and Marketing database groups, so this user can query tables in both the Sales and Marketing schemas. Because the user Bob isn’t part of the Finance group, when they try to access the Finance schema, they receive a permission denied error.

The following diagram illustrates this configuration.

Complete the following steps to set up your DB groups:

  1. Connect as awsuser (a superuser).
  2. Create three database groups:
    CREATE GROUP sales;
    CREATE GROUP marketing;
    CREATE GROUP finance;

  3. Create three schemas:
    CREATE SCHEMA sales;
    CREATE SCHEMA marketing;
    CREATE SCHEMA finance;

  4. Create a table in each schema:
    CREATE TABLE IF NOT EXISTS marketing.employee
    (
    	n_empkey INTEGER   
    	,n_name CHAR(25)   
    	,n_regionkey INTEGER   
    	,n_comment VARCHAR(152)   
    )
    DISTSTYLE AUTO
     SORTKEY (n_empkey);
    
    CREATE TABLE IF NOT EXISTS sales.employee_sales
    (
    	n_empkey INTEGER   
    	,n_name CHAR(25)   
    	,n_regionkey INTEGER   
    	,n_comment VARCHAR(152)   
    )
    DISTSTYLE AUTO
     SORTKEY (n_empkey);
    
    
    CREATE TABLE IF NOT EXISTS finance.accounts
    (
    	account_id INTEGER   
    	,account_name CHAR(25)   
    	 
    )
    DISTSTYLE AUTO
     SORTKEY (account_id);

  5. Insert sample data into the three tables:
    INSERT INTO marketing.employee
    VALUES(1, 'Bob', 0, 'Marketing');
    
    INSERT INTO sales.employee_sales
    VALUES(1, 'John', 0, 'Sales');
    
    INSERT INTO finance.accounts
    VALUES(1, 'online company');

  6. Validate the data is available in the tables:
    Select * from marketing.employee;
    Select * from sales.employee_sales;
    Select * from finance.accounts;

You can now set up appropriate privileges for the sales, finance, and marketing groups. Groups are collections of users who are all granted privileges associated with the group. You can use groups to assign privileges by job function. For example, you can create different groups for sales, administration, and support, and give the users in each group the appropriate access to the data they require for their work. You can grant or revoke privileges at the group level, and those changes apply to all members of the group, except for superusers.

  1. Enter the following SQL queries to grant access to all tables in the sales schema to the sales group, access to all tables in the marketing schema to the marketing group, and access to all tables in the finance schema to the finance group:
ALTER DEFAULT PRIVILEGES IN SCHEMA sales
GRANT SELECT on TABLES to GROUP sales;
GRANT USAGE on SCHEMA sales to GROUP sales;
GRANT SELECT on ALL TABLES in SCHEMA sales to GROUP sales;

ALTER DEFAULT PRIVILEGES IN SCHEMA marketing
GRANT SELECT on TABLES to GROUP marketing;
GRANT USAGE on SCHEMA marketing to GROUP marketing;
GRANT SELECT on ALL TABLES in SCHEMA marketing to GROUP marketing;

ALTER DEFAULT PRIVILEGES IN SCHEMA finance
GRANT SELECT on TABLES to GROUP finance;
GRANT USAGE on SCHEMA finance to GROUP finance;
GRANT SELECT on ALL TABLES in SCHEMA finance to GROUP finance;

Access Amazon Redshift query editor v2 through federated authentication

Now that you have completed your SAML integration, deep link setup, and DB groups and access rights configuration, you can set up Amazon Redshift query editor v2 to use federated authentication with AD FS to connect directly from the query editor interface.

  1. Navigate to the deep link URL you created earlier.
    You’re redirected to the AD FS login page.
  2. Sign in as [email protected].
    For this post, I accessed this URL from an Amazon EC2 machine, but you can access it from any location where you can reach the AD FS IdP.
    After AD FS successfully authenticates you, it redirects to the AWS sign-in endpoint for SAML and posts the SAML assertion and RelayState parameter. Because you configured two IAM roles on the AWS side, you’re prompted to select a role.
  3. Select a role (for this example, ADFZ-Dev) and choose Sign In.

    AWS sends the sign-in URL that is based on the RelayState value back to your browser as a redirect. Your browser is redirected to the query editor v2 console automatically.
  4. Right-click your Amazon Redshift cluster (for this post, redshift-cluster-1) and choose Edit connection.

    The value for User name is automatically populated, and Federated Access is automatically selected.
  5. Choose Edit connection to save the connection and log in to the database.

    After you’re successfully logged in, you can browse the database objects in the left pane.
  6. Test the connection by running the following query:
    select * from stv_sessions;

The following screenshot shows the output.

The output shows that the user [email protected] was authenticated using AD FS. The user also joined the marketing and sales database groups, as enforced by the AD FS PrincipalTag:RedshiftDbGroups claim rule and the policy associated with the ADFZ-Dev role, which the user assumes during this session.

Run queries to validate authorization through federated groups

In this final step, you validate how the groups and membership configured in AD are seamlessly integrated with Amazon Redshift database groups.

Run the following query against the marketing and sales schema:

select * from marketing.employee;
select * from sales.employee_sales;

The following screenshots show the output.

The preceding images show that the AD user Bob is part of the AD groups RSDB-marketing and RSDB-sales, which are mapped to the DB groups marketing and sales. These DB groups have select access to the marketing and sales schemas and all tables in those schemas. Therefore, the user can successfully query the tables.

To run a query against the finance schema, enter the following code:

select * from finance.accounts;

The following screenshot shows the output.

The output shows that Bob is only part of the AD groups RSDB-marketing and RSDB-sales. Due to the way the claim rule is set up, Bob doesn’t have access to the database group finance, and therefore the query returns with a permission denied error.

Clean up

To avoid incurring future charges, delete the resources by deleting the CloudFormation stack. This cleans up all the resources from your AWS account that you set up in Part 1.

Conclusion

In this post, we demonstrated how to set up an AD FS server, configure different PrincipalTag attributes used for Amazon Redshift query editor v2, and generate an SSO URL with the query editor as the destination location. You then connected to the Amazon Redshift DB cluster using a database user with administrator privileges to set up DB groups and permissions, and used a federated user authentication with AD FS to run several queries. This solution enables you to control access to your Amazon Redshift database objects by using AD groups and memberships seamlessly.

If you have any feedback or questions, please leave them in the comments.


About the Authors

Sumeet Joshi is an Analytics Specialist Solutions Architect based out of New York. He specializes in building large-scale data warehousing solutions. He has over 16 years of experience in the data warehousing and analytical space.

Bhanu Pittampally is an Analytics Specialist Solutions Architect based out of Dallas. He specializes in building analytical solutions. His background is in data and analytics for over 14 years. His LinkedIn profile can be found here.

Erol Murtezaoglu, a Technical Product Manager at AWS, is an inquisitive and enthusiastic thinker with a drive for self-improvement and learning. He has a strong and proven technical background in software development and architecture, balanced with a drive to deliver commercially successful products. Erol highly values the process of understanding customer needs and problems, in order to deliver solutions that exceed expectations.

Yanis Telaoumaten is a Software Development Engineer at AWS. His passions are building reliable software and creating tools to allow other engineers to work more efficiently. In the past years, he has worked on the identity, security, and reliability of Redshift services.

Choosing the right certificate revocation method in ACM Private CA

Post Syndicated from Arthur Mnev original https://aws.amazon.com/blogs/security/choosing-the-right-certificate-revocation-method-in-acm-private-ca/

AWS Certificate Manager Private Certificate Authority (ACM PCA) is a highly available, fully managed private certificate authority (CA) service that allows you to create CA hierarchies and issue X.509 certificates from the CAs you create in ACM PCA. You can then use these certificates for scenarios such as encrypting TLS communication channels, cryptographically signing code, authenticating users, and more. But what happens if you decide to change your TLS endpoint or update your code signing entity? How do you revoke a certificate so that others no longer accept it?

In this blog post, we will cover two fully managed certificate revocation status checking mechanisms provided by ACM PCA: the Online Certificate Status Protocol (OCSP) and certificate revocation lists (CRLs). OCSP and CRLs both enable you to manage how you can notify services and clients about ACM PCA–issued certificates that you revoke. We’ll explain how these standard mechanisms work, we’ll highlight appropriate deployment use cases, and we’ll identify the advantages and downsides of each. We won’t cover configuration topics directly, but will provide you with links to that information as we go.

Certificate revocation

An X.509 certificate is a static, cryptographically signed document that represents a user, an endpoint, an IoT device, or a similar end entity. Because certificates provide a mechanism to authenticate these end entities, they are valid for a fixed period of time that you specify in the expiration date attribute when you generate a certificate. The expiration attribute is important, because it validates and regulates an end entity’s identity, and provides a means to schedule the termination of a certificate’s validity. However, there are situations where a certificate might need to be revoked before its scheduled expiration. These scenarios can include a compromised private key, the end of agreement between signed and signing organizations, user or configuration error when issuing certificates, and more. Although you can use certificates in many ways, we will refer to the predominant use case of TLS-based client-server implementations for the remainder of this blog post.

Certificate revocation can be used to identify certificates that are no longer trusted, and CRLs and OCSP are the standard mechanisms used to publish the revocation information. In addition, the special use case of OCSP stapling provides a more efficient mechanism that is supported in TLS 1.2 and later versions.

ACM PCA gives you the flexibility to use either of these mechanisms, or both. More importantly, as an ACM PCA administrator, the mechanism you choose to use is reflected in the certificate, and you must know how you want to manage revocation before you create the certificate. Therefore, you need to understand how the mechanisms work, select your strategy based on its appropriateness to your needs, and then create and deploy your certificates. Let’s look at how each mechanism works, the use cases for each, and issues to be aware of when you select a revocation strategy.

Certificate revocation using CRLs

As the name suggests, a CRL contains a list of revoked certificates. A CRL is cryptographically signed and issued by a CA, and made available for download by clients (for example, web browsers for TLS) through a CRL distribution point (CDP) such as a web server or a Lightweight Directory Access Protocol (LDAP) endpoint.

A CRL contains the revocation date and the serial number of revoked certificates. It also includes extensions, which specify whether the CA administrator temporarily suspended or irreversibly revoked the certificate. The CRL is signed and timestamped by the CA and can be verified by using the public key of the CA and the cryptographic algorithm included in the certificate. Clients download the CRL by using the address provided in the CDP extension and trust a certificate by verifying the signature, expiration date, and revocation status in the CRL.
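
To make this concrete, the following sketch uses the Python cryptography package to do what a client does with a downloaded CRL: verify the CA's signature on it and look up a certificate's serial number. The file names and serial number are placeholders.

    from cryptography import x509

    with open("ca.pem", "rb") as f:                      # the issuing CA certificate (placeholder path)
        ca_cert = x509.load_pem_x509_certificate(f.read())
    with open("crl.der", "rb") as f:                     # the CRL downloaded from the CDP (placeholder path)
        crl = x509.load_der_x509_crl(f.read())

    assert crl.is_signature_valid(ca_cert.public_key())  # verify the CA's signature on the CRL

    serial = 0x1122334455667788                          # serial number of the certificate being checked (placeholder)
    revoked = crl.get_revoked_certificate_by_serial_number(serial)
    if revoked is not None:
        print("revoked on", revoked.revocation_date)
    else:
        print("not listed on this CRL")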

CRLs provide an easy way to verify certificate validity. They can be cached and reused, which makes them resilient to network disruptions, and are an excellent choice for a server that is getting requests from many clients for the same CA. All major web browsers, OpenSSL, and other major TLS implementations support the CRL method of validating certificates.

However, the size of CRLs can lead to inefficiency for clients that are validating server identities. An example is the scenario of browsing multiple websites and downloading a CRL for each site that is visited. CRLs can also grow large over time as you revoke more certificates. Consider the World Wide Web and the number of invalidations that take place daily, which makes CRLs an inefficient choice for small-memory devices (for example, mobile, IoT, and similar devices). In addition, CRLs are not suited for real-time use cases. CRLs are downloaded periodically, a value that can be hours, days, or weeks, and cached for memory management. Many default TLS implementations, such as Mozilla, Chrome, Windows OS, and similar, cache CRLs for 24 hours, leaving a window of up to a day where an endpoint might incorrectly trust a revoked certificate. Cached CRLs also open opportunities for non-trusted sites to establish secure connections until the server refreshes the list, leading to security risks such as data breaches and identity theft.

Implementing CRLs by using ACM PCA

ACM PCA supports CRLs and stores them in an Amazon Simple Storage Service (Amazon S3) bucket for high availability and durability. You can refer to this blog post for an overview of how to securely create and store your CRLs for ACM PCA. Figure 1 shows how CRLs are implemented by using ACM PCA.

Figure 1: Certificate validation with a CRL

The workflow in Figure 1 is as follows:

  1. On certificate revocation, ACM PCA updates the Amazon S3 CRL bucket with a new CRL.

    Note: An update to the CRL may take up to 30 minutes after a certificate is revoked.

  2. The client requests a TLS connection and receives the server’s certificate.
  3. The client retrieves the current CRL file from the Amazon S3 bucket and validates it.

The refresh interval is the period between when an administrator revokes a certificate and when all parties consider that certificate revoked. The length of the refresh interval can depend on how quickly new information is published and how long clients cache revocation information to improve performance.

When you revoke a certificate, ACM PCA publishes a new CRL. ACM PCA waits 5 minutes after a RevokeCertificate API call before publishing a new CRL. This process exists to accommodate multiple revocation requests in a short time frame. An update to the CRL can take up to 30 minutes to propagate. If the CRL update fails, ACM PCA makes further attempts every 15 minutes.

CRLs also have a validity period, which you define as part of the CRL configuration by using ExpirationInDays. ACM PCA uses the value in the ExpirationInDays parameter to calculate the nextUpdate field in the CRL (the day and time when ACM PCA will publish the next CRL). If there are no changes to the CRL, the CRL is refreshed at half the interval of the next update. Clients may cache CRLs while they are still valid, so not all clients will have the updated CRL with the newly revoked certificates until the previous published CRL has expired.
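
The CRL behavior described in this section is driven by the CA's revocation configuration. Below is a minimal boto3 sketch of enabling CRL generation and revoking a certificate; the CA ARN, bucket name, and serial number are placeholders.

    import boto3

    acmpca = boto3.client("acm-pca")
    ca_arn = "arn:aws:acm-pca:eu-west-1:111122223333:certificate-authority/EXAMPLE"  # placeholder

    # Enable CRL generation; ExpirationInDays drives the nextUpdate field described above.
    acmpca.update_certificate_authority(
        CertificateAuthorityArn=ca_arn,
        RevocationConfiguration={
            "CrlConfiguration": {
                "Enabled": True,
                "ExpirationInDays": 7,
                "S3BucketName": "my-crl-bucket",   # placeholder bucket name
            }
        },
    )

    # Revoking a certificate triggers publication of a new CRL, subject to the delays noted above.
    acmpca.revoke_certificate(
        CertificateAuthorityArn=ca_arn,
        CertificateSerial="1122334455667788",      # placeholder serial, as a hex string
        RevocationReason="KEY_COMPROMISE",
    )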

Certificate revocation using OCSP

OCSP removes the burden of downloading the CRL from the client. With OCSP, clients provide the serial number and obtain the certificate status for a single certificate from an OCSP Responder. The OCSP Responder can be the CA or an endpoint managed by the CA. The certificate that is returned to the client contains an authorityInfoAccess extension, which provides an accessMethod (for example, OCSP), and identifies the OCSP Responder by a URL (for example, http://example-responder:<port>) in the accessLocation. You can also specify the OCSP Responder location manually in the CA profile. The certificate status response that is returned by the OCSP Responder can be good, revoked, or unknown, and is signed by using a process similar to the CRL for protection against forgery.
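
As a rough illustration of the exchange, the sketch below builds and sends an OCSP query using the Python cryptography and requests packages. The certificate file paths are placeholders, and the responder URL is read from the certificate's authorityInfoAccess extension as described above.

    import requests
    from cryptography import x509
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.x509 import ocsp
    from cryptography.x509.oid import AuthorityInformationAccessOID

    with open("server.pem", "rb") as f:                  # certificate being checked (placeholder path)
        cert = x509.load_pem_x509_certificate(f.read())
    with open("issuer.pem", "rb") as f:                  # its issuing CA certificate (placeholder path)
        issuer = x509.load_pem_x509_certificate(f.read())

    aia = cert.extensions.get_extension_for_class(x509.AuthorityInformationAccess).value
    responder_url = next(
        d.access_location.value for d in aia if d.access_method == AuthorityInformationAccessOID.OCSP
    )

    # SHA-1 CertIDs are the most widely supported by responders.
    req = ocsp.OCSPRequestBuilder().add_certificate(cert, issuer, hashes.SHA1()).build()
    resp = requests.post(
        responder_url,
        data=req.public_bytes(serialization.Encoding.DER),
        headers={"Content-Type": "application/ocsp-request"},
    )
    ocsp_resp = ocsp.load_der_ocsp_response(resp.content)
    print(ocsp_resp.certificate_status)                  # GOOD, REVOKED, or UNKNOWN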

OCSP status checks are conducted in real time and are a good choice for time-sensitive devices, as well as mobile and IoT devices with limited memory.

However, the certificate status needs to be checked against the OCSP Responder for every connection, therefore requiring an extra hop. This can overwhelm the responder endpoint that needs to be designed for high availability, low latency, and protection against network and system failures. We will cover how ACM PCA addresses these availability and latency concerns in the next section.

Another thing to be mindful of is that the OCSP protocol performs status checks over unencrypted HTTP, which poses privacy risks. When a client requests a certificate status, the CA receives information about the endpoint that is being connected to (for example, domain, IP address, and related information), which can easily be intercepted by a middle party. We will address how OCSP stapling can be used to mitigate these privacy concerns in the OCSP stapling section.

Implementing OCSP by using ACM PCA

ACM PCA provides a highly available, fully managed OCSP solution to notify endpoints that certificates have been revoked. The OCSP implementation uses AWS managed OCSP responders and a globally available Amazon CloudFront distribution that caches OCSP responses closer to you, so you don’t need to set up and operate any infrastructure by yourself. You can enable OCSP on new or existing CAs using the ACM PCA console, the API, the AWS Command Line Interface (AWS CLI), or through AWS CloudFormation. Figure 2 shows how OCSP is implemented on ACM PCA.

Note: OCSP Responders, and the CloudFront distribution that caches the OCSP response for client requests, are managed by AWS.

Figure 2: Certificate validation with OCSP

The workflow in Figure 2 is as follows:

  1. On certificate revocation, the ACM PCA updates the OCSP Responder, which generates the OCSP response.
  2. The client requests a TLS connection and receives the server’s certificate.
  3. The client sends a query to the OCSP endpoint on CloudFront.

    Note: If the response is still valid in the CloudFront cache, it will be served to the client from the cache.

  4. If the response is invalid or missing in the CloudFront cache, the request is forwarded to the OCSP Responder.
  5. The OCSP Responder sends the OCSP response to the CloudFront cache.
  6. CloudFront caches the OCSP response and returns it to the client.

The ACM PCA OCSP Responder generates an OCSP response that gets cached by CloudFront for 60 minutes. When a certificate is revoked, ACM PCA updates the OCSP Responder to generate a new OCSP response. During the caching interval, clients continue to receive responses from the CloudFront cache. As with CRLs, clients may also cache OCSP responses, which means that not all clients will have the updated OCSP response for the newly revoked certificate until the previously published (client-cached) OCSP response has expired. Another thing to be mindful of is that while the response is cached, a compromised certificate can be used to spoof a client.
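
Enabling the managed OCSP described here is a small change to the CA's revocation configuration. A minimal boto3 sketch, with a placeholder CA ARN:

    import boto3

    acmpca = boto3.client("acm-pca")
    acmpca.update_certificate_authority(
        CertificateAuthorityArn="arn:aws:acm-pca:eu-west-1:111122223333:certificate-authority/EXAMPLE",  # placeholder
        RevocationConfiguration={"OcspConfiguration": {"Enabled": True}},
    )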

Certificate revocation using OCSP stapling

With both CRLs and OCSP, the client is responsible for validating the certificate status. OCSP stapling addresses the client validation overhead and privacy concerns that we mentioned earlier by having the server obtain status checks for certificates that the server holds, directly from the CA. These status checks are periodic (based on a user-defined value), and the responses are stored on the web server. During TLS connection establishment, the server staples the certificate status in the response that is sent to the client. This improves connection establishment speed by combining requests and reduces the number of requests that are sent to the OCSP endpoint. Because clients are no longer directly connecting to OCSP Responders or the CAs, the privacy risks that we mentioned earlier are also mitigated.

Implementing OCSP stapling by using ACM PCA

OCSP stapling is supported by ACM PCA. You simply use the OCSP Certificate Status Response passthrough to add the stapling extension in the TLS response that is sent from the server to the client. Figure 3 shows how OCSP stapling works with ACM PCA.

Figure 3: Certificate validation with OCSP stapling

The workflow in Figure 3 is as follows:

  1. On certificate revocation, the ACM PCA updates the OCSP Responder, which generates the OCSP response.
  2. The client requests a TLS connection and receives the server’s certificate.
  3. In the case of a cache miss on the server, the server queries the OCSP endpoint on CloudFront.

    Note: If the response is still valid in the CloudFront cache, it will be returned to the server from the cache.

  4. If the response is invalid or missing in the CloudFront cache, the request is forwarded to the OCSP Responder.
  5. The OCSP Responder sends the OCSP response to the CloudFront cache.
  6. CloudFront caches the OCSP response and returns it to the server, which also caches the response.
  7. The server staples the certificate status in its TLS connection response (for TLS 1.2 and later versions).

OCSP stapling is supported with TLS 1.2 and later versions.

Selecting the correct path with OCSP and CRLs

All certificate revocation offerings from AWS run on a highly available, distributed, and performance-optimized infrastructure. We strongly recommend that you enable a certificate validation and revocation strategy in your environment that best reflects your use case. You can opt to use CRLs, OCSP, or both. Without a revocation and validation process in place, you risk unauthorized access. We recommend that you review your business requirements and evaluate the risk profile of access with an invalid certificate versus the availability requirements for your application.

In the following sections, we’ll provide some recommendations on when to select which certificate validation and revocation strategy. We’ll cover client-server TLS communication, and also provide recommendations for mutual TLS (mTLS) authentication scenarios.

Recommended scenarios for OCSP stapling and OCSP Must-Staple

If your organization requires support for TLS 1.2 and later versions, you should use OCSP stapling. If you want to reduce the application availability risk for a client that is configured to fail the TLS connection establishment when it is unable to validate the certificate, you should consider using the OCSP Must-Staple extension.

OCSP stapling

If your organization requires support for TLS 1.2 and later versions, you should use OCSP stapling. With OCSP stapling, you reduce your client’s load and connectivity requirements, which helps if your network connectivity is unpredictable. For example, if your application client is a mobile device, you should anticipate network failures, low bandwidth, limited processing capacity, and impatient users. In this scenario, you will likely benefit the most from a system that relies on OCSP stapling.

Although the majority of web browsers support OCSP stapling, not all servers support it. OCSP stapling is, therefore, typically implemented together with CRLs that provide an alternate validation mechanism or as a passthrough for when the OCSP response fails or is invalid.

OCSP Must-Staple

If you want to rely on OCSP alone and avoid implementing CRLs, you can use the OCSP Must-Staple certificate extension, which tells the connecting client to expect a stapled response. You can then use OCSP Must-Staple as a flag for your client to fail the connection if the client does not receive a valid OCSP response during connection establishment.
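
As an illustration, the sketch below uses the Python cryptography library to request the TLS Feature (status_request) extension, commonly called OCSP Must-Staple, in a CSR. Whether that extension ends up in the issued certificate depends on the CA and template configuration, so treat this as a hypothetical example rather than an ACM PCA-specific recipe:

from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

# Throwaway key and placeholder common name for the example.
key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# Request the TLS Feature extension (status_request), i.e. OCSP Must-Staple.
csr = (
    x509.CertificateSigningRequestBuilder()
    .subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "api.example.com")]))
    .add_extension(x509.TLSFeature([x509.TLSFeatureType.status_request]), critical=False)
    .sign(key, hashes.SHA256())
)

print(csr.public_bytes(serialization.Encoding.PEM).decode())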

Recommended scenarios for CRLs, OCSP (without stapling), and combinational strategies

If your application needs to support legacy, now deprecated protocols such as TLS 1.0 or 1.1, or if your server doesn’t support OCSP stapling, you could use a CRL, OCSP, or both together. To determine which option is best, you should consider your sensitivity to CA availability, recently revoked certificates, the processing capacity of your application client, and network latency.

CRLs

If your application needs to be available independent of your CA connectivity, you should consider using a CRL. CRLs are much larger files that, from a practical standpoint, require much longer cache times to be of use, but they will be present and available for verification on your system regardless of the status of your network connection. In addition, the lookup time of a certificate within a CRL is local and therefore shorter than a network round trip to an OCSP Responder, because there are no network connection or DNS lookup times.
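
As a sketch of what a client-side CRL check involves, the following Python example downloads a CRL and looks up a serial number. The URL and serial are placeholders; in practice the client reads the CRL Distribution Points extension from the certificate, and the CRL may be PEM- or DER-encoded depending on how it is published:

import urllib.request

from cryptography import x509

# Placeholder CRL distribution point and serial number.
CRL_URL = "http://crl.example.com/crl/example.crl"
SERIAL_TO_CHECK = 0x0A1B2C3D

with urllib.request.urlopen(CRL_URL) as resp:
    crl = x509.load_der_x509_crl(resp.read())  # use load_pem_x509_crl for PEM CRLs

revoked = crl.get_revoked_certificate_by_serial_number(SERIAL_TO_CHECK)
if revoked is not None:
    print(f"Certificate revoked at {revoked.revocation_date}")
else:
    print("Certificate is not listed in this CRL")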

OCSP (without stapling)

If you are sensitive to the processing capacity of your application client, you should use OCSP. The size of an OCSP message is much smaller compared to a CRL, which allows you to configure shorter caching times that are better suited for your risk profile. To optimize your OCSP and OCSP stapling process, you should review your DNS configuration because it plays a significant role in the amount of time your application will take to receive a response.

For example, if you’re building an application that will be hosted on infrastructure that doesn’t support OCSP stapling, you will benefit from clients making an OCSP request and caching it for a short period. In this scenario, your application client will make a single OCSP request during its connection setup, cache the response, and reuse the certificate state for the duration of its application session.
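
To make the request/response flow concrete, here is a minimal sketch of a direct OCSP check using the Python cryptography and requests libraries. The file names and responder URL are placeholders; a real client reads the responder URL from the certificate's Authority Information Access extension and caches the result as described above:

import requests
from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.x509 import ocsp

# Placeholder inputs: the end-entity certificate and its issuer's certificate.
with open("cert.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())
with open("issuer.pem", "rb") as f:
    issuer = x509.load_pem_x509_certificate(f.read())

# Build a DER-encoded OCSP request for this certificate.
request = ocsp.OCSPRequestBuilder().add_certificate(cert, issuer, hashes.SHA256()).build()

# Placeholder responder URL; normally taken from the AIA extension.
http_response = requests.post(
    "http://ocsp.example.com",
    data=request.public_bytes(serialization.Encoding.DER),
    headers={"Content-Type": "application/ocsp-request"},
)

ocsp_response = ocsp.load_der_ocsp_response(http_response.content)
if ocsp_response.response_status == ocsp.OCSPResponseStatus.SUCCESSFUL:
    print("Certificate status:", ocsp_response.certificate_status)  # GOOD, REVOKED, or UNKNOWN
else:
    print("Responder returned:", ocsp_response.response_status)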

Combining CRLs and OCSP

You can also choose to implement both CRLs and OCSP for your certificate revocation and validation needs. For example, if your application needs to support legacy TLS protocols while providing resiliency to network failures, you can implement both CRLs and OCSP. When you use CRLs and OCSP together, you verify certificates primarily by using OCSP; however, in case your client is unable to reach the OCSP endpoint, you can fail over to an alternative validation method (for example, CRL). This approach of combining CRLs and OCSP gives you all the benefits of OCSP mentioned earlier, while providing a backup mechanism for failure scenarios such as an unreachable OCSP Responder, invalid response from the OCSP Responder, and similar. However, while this approach adds resilience to your application, it will add management overhead because you will need to set up CRL-based and OCSP-based revocation separately. Also, remember that clients with reduced computing power or poor network connectivity might struggle as they attempt to download and process the CRL.
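
The failover logic itself can stay small. The sketch below assumes two hypothetical helpers, check_ocsp and check_crl (for example, built along the lines of the earlier snippets), each returning True when the certificate is revoked and raising an exception when its source is unreachable or returns an invalid response. How to behave when both sources fail is a policy decision for your risk profile, not something prescribed by ACM PCA:

# check_ocsp(cert, issuer) and check_crl(cert) are hypothetical helpers:
# they return True if the certificate is revoked, False if not, and raise
# an exception when the OCSP responder or CRL cannot be reached or parsed.

def is_revoked(cert, issuer) -> bool:
    try:
        return check_ocsp(cert, issuer)   # primary: OCSP
    except Exception:
        pass                              # responder unreachable or response invalid
    try:
        return check_crl(cert)            # fallback: CRL
    except Exception:
        # Both sources failed; failing closed (treat as revoked) is the
        # conservative choice, but adjust this to your availability needs.
        return True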

Recommendations for mTLS authentication scenarios

You should consider network latency and revocation propagation delays when optimizing your server infrastructure for mTLS authentication. In a typical scenario, server certificate changes are infrequent, so caching an OCSP response or CRL on your client and an OCSP-stapled response on a server will improve performance. For mTLS, you can revoke a client certificate at any time; therefore, cached responses could introduce the risk of invalid access. You should consider designing your system such that a copy of a CRL for client certificates is maintained on the server and refreshed based on your business needs. For example, you can use S3 ETags to determine whether an object has changed, and flush the server’s cache in response.
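
As an example of the ETag approach, a server-side refresh job might look like the following boto3 sketch. The bucket, key, and cache-flush hook are placeholders for your own environment:

import boto3

# Placeholders: the S3 location where the CRL is published and a
# cache-flush hook that is specific to your server.
CRL_BUCKET = "example-crl-bucket"
CRL_KEY = "crl/example.crl"

s3 = boto3.client("s3")
_last_etag = None

def refresh_crl_if_changed(flush_cache):
    """Re-download the CRL and flush the server cache only when the ETag changes."""
    global _last_etag
    head = s3.head_object(Bucket=CRL_BUCKET, Key=CRL_KEY)
    if head["ETag"] != _last_etag:
        body = s3.get_object(Bucket=CRL_BUCKET, Key=CRL_KEY)["Body"].read()
        flush_cache(body)  # hand the fresh CRL to the server's cache
        _last_etag = head["ETag"]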

Conclusion

This blog post covered two certificate revocation methods, OCSP and CRLs, that are available on ACM PCA. Remember, when you deploy CA hierarchies for public key infrastructure (PKI), it’s important to define how to handle certificate revocation. The certificate revocation information must be included in the certificate when it is issued, so the choice to enable either CRL or OCSP, or both, has to happen before the certificate is issued. It’s also important to have highly available CRL and OCSP endpoints for certificate lifecycle management. ACM PCA provides a highly available, fully managed CA service that you can use to meet your certificate revocation and validation requirements. Get started using ACM PCA.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Arthur Mnev

Arthur is a Senior Specialist Security Architect for Global Accounts. He spends his day working with customers and designing innovative approaches to help customers move forward with their initiatives, increase their security posture, and reduce security risks in their cloud journeys. Outside of work, Arthur enjoys being a father, skiing, scuba diving, and Krav Maga.

Basit Hussain

Basit is a Senior Security Specialist Solutions Architect based out of Seattle, focused on data protection in transit and at rest. In his almost 7 years at Amazon, Basit has gained diverse experience working across different teams. In his current role, he helps customers secure their workloads on AWS and provides innovative solutions to unblock them in their cloud journey.

Trevor Freeman

Trevor is an innovative and solutions-oriented Product Manager at Amazon Web Services, focusing on ACM PCA. With over 20 years of experience in software and service development, he became an expert in Cloud Services, Security, Enterprise Software, and Databases. Being adept in product architecture and quality assurance, Trevor takes great pride in providing exceptional customer service.

Umair Rehmat

Umair is a cloud solutions architect and technologist based out of the Seattle WA area working on greenfield cloud migrations, solutions delivery, and any-scale cloud deployments. Umair specializes in telecommunications and security, and helps customers onboard, as well as grow, on AWS.

AWS Week in Review – May 16, 2022

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/aws-week-in-review-may-16-2022/

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

I had been on the road for the last five weeks and attended many of the AWS Summits in Europe. It was great to talk to so many of you in person. The Serverless Developer Advocates are going around many of the AWS Summits with the Serverlesspresso booth. If you attend an event that has the booth, say “Hi 👋” to my colleagues, and have a coffee while asking all your serverless questions. You can find all the upcoming AWS Summits in the events section at the end of this post.

Last week’s launches
Here are some launches that got my attention during the previous week.

AWS Step Functions announced a new console experience to debug your state machine executions – Now you can opt in to the new console experience of Step Functions, which makes it easier to analyze, debug, and optimize Standard Workflows. The new page allows you to inspect executions using three different views (graph, table, and event view) and adds many new features to enhance the navigation and analysis of the executions. To learn about all the features and how to use them, read Ben’s blog post.

Example on how the Graph View looks

AWS Lambda now supports Node.js 16.x runtime – Now you can start using the Node.js 16 runtime when you create a new function or update your existing functions to use it. You can also use the new container image base that supports this runtime. To learn more about this launch, check Dan’s blog post.

AWS Amplify announces its Android library designed for Kotlin – The Amplify Android library has been rewritten for Kotlin, and it is now available in preview. This new library provides better debugging capabilities and visibility into the underlying state management. It also uses the new AWS SDK for Kotlin that was released last year in preview. Read the What’s New post for more information.

Three new APIs for batch data retrieval in AWS IoT SiteWise – With this new launch AWS IoT SiteWise now supports batch data retrieval from multiple asset properties. The new APIs allow you to retrieve current values, historical values, and aggregated values. Read the What’s New post to learn how you can start using the new APIs.

AWS Secrets Manager now publishes secret usage metrics to Amazon CloudWatch – This launch is very useful to see the number of secrets in your account and set alarms for any unexpected increase or decrease in the number of secrets. Read the documentation on Monitoring Secrets Manager with Amazon CloudWatch for more information.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other launches and news that you may have missed:

IBM signed a deal with AWS to offer its software portfolio as a service on AWS. This allows customers using AWS to access IBM software for automation, data and artificial intelligence, and security that is built on Red Hat OpenShift Service on AWS.

Podcast Charlas Técnicas de AWS – If you understand Spanish, this podcast is for you. Podcast Charlas Técnicas is one of the official AWS podcasts in Spanish. This week’s episode introduces you to Amazon DynamoDB and shares stories on how different customers use this database service. You can listen to all the episodes directly from your favorite podcast app or the podcast web page.

AWS Open Source News and Updates – Ricardo Sueiras, my colleague from the AWS Developer Relations team, runs this newsletter. It brings you all the latest open-source projects, posts, and more. Read edition #112 here.

Upcoming AWS Events
It’s AWS Summits season and here are some virtual and in-person events that might be close to you:

You can register for re:MARS to get fresh ideas on topics such as machine learning, automation, robotics, and space. The conference will be in person in Las Vegas, June 21–24.

That’s all for this week. Check back next Monday for another Week in Review!

— Marcia

[$] Dynamically allocated pseudo-filesystems

Post Syndicated from jake original https://lwn.net/Articles/895111/

It is perhaps unusual to have a kernel tracing developer leading a filesystem session, Steven Rostedt said, at the beginning of such a session at the 2022 Linux Storage, Filesystem, Memory-management and BPF Summit (LSFMM). But he was doing so to try to find a good way to dynamically allocate kernel data structures for some of the pseudo-filesystems, such as sysfs, debugfs, and tracefs, in the kernel. Avoiding static allocations would save memory, especially on systems that are not actually using any of the files in those filesystems.

A digital decade for children and youth: the new European strategy for a better internet for kids (BIK+)

Post Syndicated from nellyo original https://nellyo.wordpress.com/2022/05/16/bik/

The European Commission has published COM/2022/212 final, "A digital decade for children and youth: the new European strategy for a better internet for kids (BIK+)".

The first European strategy for a better internet for children (Better Internet for Kids, BIK) was drawn up in 2012. The updated strategy for a better internet for children (BIK+) set out in this communication will ensure that children are protected, respected, and empowered online during the new digital decade, the Communication says.

The list of concerns raised by children themselves in their answers to the Commission's consultation includes viewing harmful content that may glorify and encourage self-harm, suicide, violence, hate speech, sexual harassment, drug use, risky online challenges, eating disorders, and dangerous diets. Such violent, frightening, or otherwise age-inappropriate content is easily accessible. Children report seeing pornography from an early age.

Despite existing EU law (the Audiovisual Media Services Directive and the General Data Protection Regulation), age-verification mechanisms and parental-consent tools are in many cases still not effective.

The recently approved Digital Services Act is expected to bring a significant improvement in the safety of all users, including children, and to enable them to make informed choices online. As part of the risk-management framework of the Digital Services Act, special attention is required with respect to systemic risks concerning children.

BIK+, a flagship initiative of the 2022 European Year of Youth, proposes actions around three pillars:

1. safe digital experiences, to protect children from harmful and illegal online contacts, content, conduct, and consumer risks, and to improve their online well-being through a safe, age-appropriate digital environment designed with children's interests in mind;

2. digital empowerment, so that children acquire the skills and competences they need to make sound choices and express themselves safely and responsibly in the online environment;

3. active participation that respects children, giving them a say in the digital environment, with more child-led activities to foster innovative, creative, and safe digital experiences.

Implementing the BIK+ strategy requires evidence-based policymaking as well as cooperation and coordination at the European and international level.

My warmest thanks go to everyone who, once again, stood up for "цифрово десетилетие" (digital decade) and "цифрови технологии" (digital technologies), rather than the "дигитални" calques, in the Bulgarian language version of the Communication.

New members of the CEM

Post Syndicated from nellyo original https://nellyo.wordpress.com/2022/05/16/cem-36/

The decisions of the National Assembly and the president's decree appointing the new members of the Council for Electronic Media (CEM) have been promulgated:

  • State Gazette of May 10, 2022 – parliamentary quota

Prolet Velkova

Simona Veleva

  • State Gazette of May 13, 2022 – presidential quota

Gabriela Naplatanova

The continuing members are Sonya Momchilova from the presidential quota, who chairs the council, and Galina Georgieva from the parliamentary quota.

How we’re using projects to build projects

Post Syndicated from Jed Verity original https://github.blog/2022-05-16-how-were-using-projects-to-build-projects/

At GitHub, we use GitHub to build our own products, whether that be moving our entire Engineering team over to Codespaces for the majority of GitHub.com development, or utilizing GitHub Actions to coordinate our GitHub Mobile releases. And while GitHub Issues has been a part of the GitHub experience since the early days and is an integral part of how we work together as Hubbers internally, the addition of powerful project planning has given us more opportunities to test out some of our most exciting products.

In this post, I’m going to share how we’ve been utilizing the new projects experience across our team (from an engineer like myself all the way to our VPs and team leads). We love working so closely with developers to ship requested features and updates (all of which roll up into the Changelogs you see), and using the new projects helps us stay consistent in our shipping cadence.

How we think about shipping

Our core team consists of members of the product, engineering, design, and user research teams. We recognize that good ideas can come from anywhere. Our process is designed to inspire, surface, and implement those ideas, whether they come from users, individual contributors, managers, directors, or VPs. To get the proper alignment for this group, we’ve agreed on a few guiding principles that drive what our roadmap will look like:

💭 The pitch: When it comes to what we’re going to work on (outside of the big pieces of work on our roadmap) people within our team can pitch ideas in our team’s repository for upcoming cycles (which we define as 6-8 weeks of work, inclusive of planning, engineering work, and an unstructured passion project week); these can be features, fixes, or even maintenance work. Every pitch must clearly state the problem it’s solving and why it’s important for us to prioritize. Some features that have come from this process include live updates, burn up charts for insights, and more. Note: these are all the changes you see as a developer, but we also have a lot of pitches come in from my fellow engineers focused around the developer experience. For example, a couple successful pitches have included reducing our CI time to 10 minutes, and streamlining our release process by switching to a ring deployment model and adding ChatOps.

💡 In addition to using issues to propose and converse on pitches from the team, we use the new projects experience to track and manage all the pitches from the team so we can see them in an all-up table or board view.

✂ Keep it small: We knew for ourselves, and for developers, that we didn’t want to lock them into a specific planning methodology and over-complicate a team’s planning and tracking process. For us, we wanted to plan shorter cycles for our team to increase our tempo and focus, so we opted for six-week cycles to break up and build features. Check out how we recommend getting started with this approach in a recent blog post.

📬 Ship to learn: Similar to how we ship a lot of our products, we knew developers and customers were going to be heavily intertwined with each and every ship, giving us immediate feedback in both the private and public beta. Their feedback both influenced what we built and then how we iterated and continued to better the experience once something did ship. While there are so many people to thank, we’re extremely grateful for all our customers along the way for being our partners in building GitHub Issues into the place for developers to project plan.

How we used projects to do it

We love that the product we’re building doesn’t lock in a specific project management methodology, but equips users with powerful primitives that they can compose into their preferred experiences and workflows. This allows the many people involved in building and developing products at GitHub (not just us engineers, but also team leads, marketing, design, sales, and so on) to use the product in a way that makes sense for them.

With the above principles in mind, once a pitch has been agreed upon to move forward on building, that pitch issue becomes a tracking issue in a project table or board that we convert into pieces of work that fit into an upcoming cycle. A great example of this was when we updated the GitHub Issues icons to lessen confusion among developers. This came in as a pitch from a designer on the team, and was soon accepted and moved into epic planning in which the team responsible began to track the individual pieces of work needed to make this happen.

IC approach

Let’s start with how my fellow engineers, individual contributors and I use projects for day-to-day development within cycles. From our perspective on any given day, we’re hyper-focused on tackling what issues and pull requests are assigned to us (fun fact: we recently added the assignee:me filter to make this even easier) in a given cycle, so we work from more individually scoped project tables or boards that stem from the larger epic and iteration tracking. Because of this, we can easily zoom out from our individual tasks to see how our work fits into a given cycle, and then even zoom out more into how it fits into larger initiatives.

💡 In addition to scoping more specifically a given table or board, engineers across our organization utilize a personal project table or board to track all the things specific to themselves like what issues are assigned to them—even work not connected to a given cycle, like open source work.

EM approach

If we pull back to engineering managers overseeing those smaller cycles, they’re focused on kicking off an accepted pitch’s work, breaking it first into cycles and then into smaller iterations in which they can assign out work. A given cycle’s table or board view gives managers a complete view across their team and lets them focus on the things that matter most to them, like all the open pull requests, which engineers are assigned to them, and which pull requests have been merged, deployed, and so on.

💡 Check out what this looks like in our team board.

Team lead approach

Now, if we put ourselves in the shoes of our team leads and Directors/VPs, we see that they’re using the new projects experience primarily to get the full picture of where product and feature development currently sits. They told me the main team roadmap and backlog are where they can get answers to questions like:

  • Which projects do we have in flight in which product area right now?
  • Who’s the key decision maker for this project?
  • Which engineers are working on which projects?
  • Which projects are at risk and need help (progress/status)?

What’s great about this is that they can quickly glance at what’s in motion and then click into any cycles or status to get more context on open issues, pull requests, and how everything is connected.

💡 Outside of being able to check in on what’s being worked on and where the organization’s current focus is, our leads have found additional use cases that may not be applicable for an engineer like me. They use private projects for more sensitive tasks, like managing our team’s hiring, tracking upcoming start dates, making sure they’re staying on top of career development, organizational change management, and more.

Wrap-up

This is how we as the planning and tracking team at GitHub are using the very product we’re building for you to build the new projects experience. There are many other teams across GitHub that utilize the new project tables and boards, but we hope this gives you a little bit of inspiration about how to think about project planning on GitHub and how to optimize for all the stakeholders involved in building and shipping products.

What’s great about project planning on GitHub is that our focus on powerful primitives as the approach to project management means there is an unlimited amount of flexibility for you and your team to play around with, and likely many, many ways to use the product that we haven’t even thought of. So, please let us know how you’re using it and how we can improve the experience!

SFC v. Vizio remanded back to California state courts

Post Syndicated from jake original https://lwn.net/Articles/895405/

Software Freedom Conservancy (SFC) has announced that it succeeded with its motion in US Federal Court to send the case back to California, where it was originally filed. The suit was filed in October 2021 by SFC, as an owner of Vizio televisions, to get the company to comply with the GPL on some of the code in the TVs. Back in November, Vizio had asked to move the case to Federal Court, because the GPL is only a copyright license (which is a dispute handled at the Federal level) and not a contract (that could be adjudicated in state court). Friday's ruling disagreed with that premise:

The May 13 ruling by the Honorable Josephine L. Staton stated that the claim from Software Freedom Conservancy succeeded in the “extra element test” and was not preempted by copyright claims, and the court finds “that the enforcement of ‘an additional contractual promise separate and distinct from any rights provided by the copyright laws’ amounts to an ‘extra element,’ and therefore, SFC’s claims are not preempted.”

“The ruling is a watershed moment in the history of copyleft licensing. This ruling shows that the GPL agreements function both as copyright licenses and as contractual agreements,” says Karen M. Sandler, executive director of Software Freedom Conservancy. Sandler noted that many in the Free and Open Source Software (FOSS) legal community argue incorrectly that the GPL and other copyleft licenses only function as copyright licenses.

[$] The netfslib helper library

Post Syndicated from jake original https://lwn.net/Articles/894589/

A new helper library for network filesystems, called netfslib, was the subject of a filesystem session at the 2022 Linux Storage, Filesystem, Memory-management and BPF Summit (LSFMM). David Howells developed netfslib, which was merged for 5.13 a year ago, and led the session. Some filesystems, like AFS and Ceph, are already using some of the services that netfslib provides, while others are starting to look into it.

Use direct service integrations to optimize your architecture

Post Syndicated from Jerome Van Der Linden original https://aws.amazon.com/blogs/architecture/use-direct-service-integrations-to-optimize-your-architecture/

When designing an application, you must integrate and combine several AWS services in the most optimized way for an effective and efficient architecture:

  • Optimize for performance by reducing the latency between services
  • Optimize for cost, operability, and sustainability by avoiding unnecessary components and reducing the workload footprint
  • Optimize for resiliency by removing potential points of failure
  • Optimize for security by minimizing the attack surface

As stated in the Serverless Application Lens of the Well-Architected Framework, “If your AWS Lambda function is not performing custom logic while integrating with other AWS services, chances are that it may be unnecessary.” In addition, Amazon API Gateway, AWS AppSync, AWS Step Functions, Amazon EventBridge, and Lambda Destinations can directly integrate with a number of services. These optimizations can offer you more value and less operational overhead.

This blog post will show how to optimize an architecture with direct integration.

Workflow example and initial architecture

Figure 1 shows a typical workflow for the creation of an online bank account. The customer fills out a registration form with personal information and adds a picture of their ID card. The application then validates the ID and address, and checks whether a user with that name already exists. If everything checks out, a backend application will be notified to create the account. Finally, the user is notified of successful completion.

Figure 1. Bank account application workflow

The workflow architecture is shown in Figure 2 (click on the picture to get full resolution).

Figure 2. Initial account creation architecture

This architecture contains 13 Lambda functions. If you look at the code on GitHub, you can see that:

Five of these Lambda functions are basic and perform simple operations:

Additional Lambda functions perform other tasks, such as verification and validation:

  • One function generates a presigned URL to upload ID card pictures to Amazon Simple Storage Service (Amazon S3)
  • One function uses the Amazon Textract API to extract information from the ID card
  • One function verifies the identity of the user against the information extracted from the ID card
  • One function performs simple HTTP request to a third-party API to validate the address

Finally, four functions concern the websocket (connect, message, and disconnect) and notifications to the user.

Opportunities for improvement

If you further analyze the code of the five basic functions (see startWorkflow on GitHub, for example), you will notice that there are actually three lines of fundamental code that start the workflow. The other 38 lines involve imports, input validation, error handling, logging, and tracing. Remember that all this code must be tested and maintained.

import os
import json
import boto3
from aws_lambda_powertools import Tracer
from aws_lambda_powertools import Logger
import re

logger = Logger()
tracer = Tracer()

sfn = boto3.client('stepfunctions')

PATTERN = re.compile(r"^arn:(aws[a-zA-Z-]*)?:states:[a-z]{2}((-gov)|(-iso(b?)))?-[a-z]+-\d{1}:\d{12}:stateMachine:[a-zA-Z0-9-_]+$")

if ('STATE_MACHINE_ARN' not in os.environ
    or os.environ['STATE_MACHINE_ARN'] is None
    or not PATTERN.match(os.environ['STATE_MACHINE_ARN'])):
    raise RuntimeError('STATE_MACHINE_ARN env var is not set or incorrect')

STATE_MACHINE_ARN = os.environ['STATE_MACHINE_ARN']

@logger.inject_lambda_context
@tracer.capture_lambda_handler
def handler(event, context):
    try:
        event['requestId'] = context.aws_request_id

        sfn.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            input=json.dumps(event)
        )

        return {
            'requestId': event['requestId']
        }
    except Exception as error:
        logger.exception(error)
        raise RuntimeError('Internal Error - cannot start the creation workflow') from error

After running this workflow several times and reviewing the AWS X-Ray traces (Figure 3), we can see that it takes about 2–3 seconds when functions are warmed:

Figure 3. X-Ray traces when Lambda functions are warmed

But the process takes around 10 seconds with cold starts, as shown in Figure 4:

Figure 4. X-Ray traces when Lambda functions are cold

We use an asynchronous architecture to avoid waiting time for the user, as this can be a long process. We also use WebSockets to notify the user when it’s finished. This adds some complexity, new components, and additional costs to the architecture. Now let’s look at how we can optimize this architecture.

Improving the initial architecture

Direct integration with Step Functions

Step Functions can directly integrate with some AWS services, including DynamoDB, Amazon SQS, and EventBridge, as well as more than 10,000 APIs from 200+ AWS services. With these integrations, you can replace Lambda functions when they do not provide value. We recommend using Lambda functions to transform data, not to transport data from one service to another.

In our bank account creation use case, there are four Lambda functions we can replace with direct service integrations (see large arrows in Figure 5):

  • Query a DynamoDB table to search for a user
  • Send a message to an SQS queue when the extraction fails
  • Create the user in DynamoDB
  • Send an event on EventBridge to notify the backend
Figure 5. Lambda functions that can be replaced

It is not as clear that we need to replace the other Lambda functions. Here are some considerations:

  • To extract information from the ID card, we use Amazon Textract. It is available through the SDK integration in Step Functions. However, the API’s response provides too much information. We recommend using a library such as amazon-textract-response-parser to parse the result. For this, you’ll need a Lambda function.
  • The identity cross-check performs a simple comparison between the data provided in the web form and the one extracted in the ID card. We can perform this comparison in Step Functions using a Choice state and several conditions. If the business logic becomes more complex, consider using a Lambda function.
  • To validate the address, we query a third-party API. Step Functions cannot directly call a third-party HTTP endpoint, but because it’s integrated with API Gateway, we can create a proxy for this endpoint.

If you only need to retrieve data from an API or make a simple API call, use the direct integration. If you need to implement some logic, use a Lambda function.
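
As an illustration of what such a replacement can look like, here is a sketch of an Amazon States Language task (shown as a Python dictionary for readability) that queries DynamoDB directly from Step Functions, with no Lambda function in between. The state, table, and key names are placeholders rather than the exact ones used in the sample application:

import json

# Sketch of a direct DynamoDB integration replacing a "search for a user"
# Lambda function. Names and paths are placeholders.
check_user_state = {
    "CheckIfUserExists": {
        "Type": "Task",
        "Resource": "arn:aws:states:::dynamodb:getItem",
        "Parameters": {
            "TableName": "AccountsTable",
            "Key": {"userName": {"S.$": "$.userName"}},
        },
        "ResultPath": "$.existingUser",
        "Next": "UserExistsChoice",
    }
}

print(json.dumps(check_user_state, indent=2))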

Direct integration with API Gateway

API Gateway also provides service integrations. In particular, we can start the workflow without using a Lambda function. In the console, select the integration type “AWS Service”, the AWS service “Step Functions”, the action “StartExecution”, and “POST” method, as shown in Figure 6.

Figure 6. API Gateway direct integration with Step Functions

After that, use a mapping template in the integration request to define the parameters as shown here:

{
  "stateMachineArn":"arn:aws:states:eu-central-1:123456789012:stateMachine:accountCreationWorkflow",
  "input":"$util.escapeJavaScript($input.json('$'))"
}

We can go further and remove the websockets and associated Lambda functions connect, message, and disconnect. By using Synchronous Express Workflows and the StartSyncExecution API, we can start the workflow and wait for the result in a synchronous fashion. API Gateway will then directly return the result of the workflow to the client.
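
To get a feel for the synchronous behavior behind this integration, here is a hedged boto3 sketch that calls StartSyncExecution directly (the state machine ARN and input are placeholders; API Gateway performs the equivalent call on your behalf in the architecture above):

import json
import boto3

sfn = boto3.client("stepfunctions")

# Placeholder ARN; StartSyncExecution only works with Express Workflows.
STATE_MACHINE_ARN = (
    "arn:aws:states:eu-central-1:123456789012:stateMachine:accountCreationWorkflow"
)

response = sfn.start_sync_execution(
    stateMachineArn=STATE_MACHINE_ARN,
    input=json.dumps({"firstName": "Jane", "lastName": "Doe"}),
)

# The call blocks until the workflow completes and returns its status and output.
print(response["status"])        # SUCCEEDED, FAILED, or TIMED_OUT
print(response.get("output"))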

Final optimized architecture

After applying these optimizations, we have the updated architecture shown in Figure 7. It uses only two Lambda functions out of the initial 13. The rest have been replaced by direct service integrations or implemented in Step Functions.

Figure 7. Final optimized architecture

We were able to remove 11 Lambda functions and their associated fees. In this architecture, the cost is mainly driven by Step Functions, and the main price difference will be your use of Express Workflows instead of Standard Workflows. If you need to keep some Lambda functions, use AWS Lambda Power Tuning to configure your function correctly and benefit from the best price/performance ratio.

One of the main benefits of this architecture is performance. With the final workflow architecture, it now takes about 1.5 seconds when the Lambda function is warmed and 3 seconds on cold starts (versus up to 10 seconds previously), as shown in Figure 8:

Figure 8. X-Ray traces for the final architecture

The process can now be synchronous. It reduces the complexity of the architecture and vastly improves the user experience.

An added benefit is that by reducing the overall complexity and removing the unnecessary Lambda functions, we have also reduced the risk of failures. These can be errors in the code, memory or timeout issues due to bad configuration, lack of permissions, network issues between components, and more. This increases the resiliency of the application and eases its maintenance.

Testing

Testability is an important consideration when building your workflow. Unit testing a Lambda function is straightforward, and you can use your preferred testing framework and validate methods. Adopting a hexagonal architecture also helps remove dependencies on the cloud.

When removing functions and using an approach with direct service integrations, you are by definition directly connected to the cloud. You still must verify that the overall process is working as expected, and validate these integrations.

You can achieve this kind of testing locally using Step Functions Local and the recently announced Mocked Service Integrations. By mocking service integrations (for example, retrieving an item in DynamoDB), you can validate the different paths of your state machine.

You also have to perform integration tests, but this is true whether you use direct integrations or Lambda functions.

Conclusion

This post describes how to simplify your architecture and optimize for performance, resiliency, and cost by using direct integrations in Step Functions and API Gateway. Although many Lambda functions were reduced, some remain useful for handling more complex business logic and data transformation. Try this out now by visiting the GitHub repository.

For further reading:

Maximize Your VM Investment: Fix Vulnerabilities Faster With Automox + Rapid7

Post Syndicated from Nicholas Colyer original https://blog.rapid7.com/2022/05/16/maximize-your-vm-investment-fix-vulnerabilities-faster-with-automox-rapid7/

Maximize Your VM Investment: Fix Vulnerabilities Faster With Automox + Rapid7

The Rapid7 InsightConnect Extension library is getting bigger! We’ve teamed up with IT operations platform, Automox, to release a new plugin and technology alliance that closes the aperture of attack for vulnerability findings and automates remediation. Using the Automox Plugin for Rapid7 InsightConnect in conjunction with InsightVM, customers are able to:

  • Automate discovery-to-remediation of vulnerability findings
  • Query Automox device details via Slack or Microsoft Teams

Getting started with Automox within InsightConnect

Automox is an IT Operations platform that fully automates the process of endpoint management across Windows, macOS, Linux, and third-party software — including Adobe, Java, Firefox, Chrome, and Windows.

The Automox InsightConnect Plugin allows mutual customers of Rapid7 and Automox to expand their capabilities between products, ultimately streamlining cyber security outcomes and operational effectiveness. Seamlessly transition CVE-based vulnerability findings through discovery to remediation, and perform device queries without needing to leave Slack or Microsoft Teams!

Example workflows you can start leveraging now with the Automox Plugin

  • Generate Rapid7 InsightVM Report and Upload to Automox Vulnerability Sync: An example workflow that leverages threat context for assets and prioritizes them for remediation. An InsightVM report is automatically generated and uploaded using Automox’s Vulnerability Sync for easy remediation, saving internal teams precious time and effort in managing critical emerging threats – from start to finish.
  • Automox Device Lookup from Microsoft Teams: An example workflow that lets a user query a device in Automox directly from Microsoft Teams.
  • Automox Device Lookup from Slack: An example workflow that lets a user query a device in Automox directly from Slack.

For more information or to start using this plugin, access and install the Automox Plugin for Rapid7 InsightConnect through the Rapid7 Extension Library.

Additional reading:


How Ramadan shows up in Internet trends

Post Syndicated from João Tomé original https://blog.cloudflare.com/how-ramadan-shows-up-in-internet-trends/

How Ramadan shows up in Internet trends

What happens to Internet traffic in countries where many observe Ramadan? Depending on the country, there are clear shifts and changing patterns in Internet use, particularly before dawn and after sunset.

This year, Ramadan started on April 2 and continued until May 1, 2022 (dates vary and depend on the appearance of the crescent moon). For Muslims, it is a period of introspection, communal prayer, and fasting every day from dawn to sunset. That means that people only eat at night (Iftar is the first meal after sunset that breaks the fast, and often also a family or community event) and before sunrise (Suhur).

In some countries, the impact is so big that we can see in our Internet traffic charts when the sun sets. Sunrise is harder to spot in the charts, but in the most affected countries people wake up much earlier than usual and use the Internet in the early morning because of that.

Cloudflare Radar data shows that Internet traffic was impacted in several countries by Ramadan, with a clear increase in traffic before sunrise, and a bigger than usual decrease after sunset. All times in this blog post are local. The data in the charts is bucketed into hours. So, for example, when we show an increase in traffic at 0400 we are showing that an increase occurred between 0400 and 0459 local time.

Indonesia is a clear example of that, showing trends that continued until the end of Ramadan:

How Ramadan shows up in Internet trends

In the next table, we show a country ranking by order of impact. Here, we include traffic changes before dawn and after sunset. In the last column, you can also see the change in traffic right after sunset once Ramadan had ended. In this case, we're looking at Wednesday, May 4, right after Eid al-fitr (the May 2-3, 2022 holiday of breaking the fast), compared with the previous Wednesday at the same time (when Ramadan was ongoing):

Internet traffic: Ramadan’s impact | Before sunrise | After sunset | Post-Ramadan, May 4 (after sunset)
Afghanistan | +203% | -28% | +20%
Pakistan | +119% | -39% | +13%
Indonesia | +98% | -13% |
Morocco | +90% | -36% | +44%
Libya | +81% | -27% | +48%
Turkey | +78% | -19% | +22%
Bangladesh | +62% | -40% | +12%
Saudi Arabia | +55% | -45% | -5%
United Arab Emirates | +52% | -13% | +4%
Bahrain | +44% | -31% | +21%
Malaysia | +41% | -8% | -9%
Qatar | +35% | -23% | +5%
Egypt | +31% | -32% | +56%
Tunisia | +25% | -43% | +101%
Iran | +24% | +10% | -12%
Singapore | +8% | -5% | +4%
India | -15% | |

Afghanistan, Pakistan, Indonesia, Morocco, Libya, and Turkey saw the biggest increases in traffic before sunrise. After sunset, it was (in order of impact) Saudi Arabia, Tunisia, Bangladesh, and Pakistan that showed the clearest decreases in traffic.

Here’s the impact of the start of Ramadan on Bangladesh, with more highlights inside the next chart:

How Ramadan shows up in Internet trends

Waking up earlier

There’s a clear pattern in most of the countries: Internet traffic was much higher than usual between 04:00 and 04:59 local time (usually the hour with the lowest traffic).

The same early spike is seen in Turkey and the United Arab Emirates. In the case of the United Arab Emirates, the period before sunrise, around the Suhur meal, showed more mobile usage than usual, meaning people were relying on their mobile devices to access the Internet more than they typically do at that hour.

That’s also the case for Pakistan, where traffic in the 04:00 to 04:59 hour on April 3 was 119% higher than on the previous Sunday, and likewise in Qatar (sunrise at 05:25, with a 35% spike) and Afghanistan. In the latter, the spike was 203%:

How Ramadan shows up in Internet trends

We also saw the same trend in Indonesia, where sunrise was at 05:55 local time at the beginning of April; there is a clear spike in traffic in the 04:00 to 04:59 hour, with 98% growth in requests.

Northern African countries like Egypt, Tunisia, Morocco, and Libya (sunrise at 06:54) show the same 04:00 to 04:59 hour spike. In Libya, traffic was 81% higher on Sunday, April 3, than it was the previous Sunday at the same time. Usually, the 04:00 to 04:59 hour is the lowest point in traffic in the country, but on April 3 and the following days the low point moved to 08:00.

Saudi Arabia shows a similar pattern in terms of Internet traffic on Sunday, April 3, 2022: sunrise was at 05:44, and there was 55% more Internet use than at the same time on the previous Sunday, before Ramadan.

How Ramadan shows up in Internet trends

Does daily total Internet traffic go up or down?

The short answer is: it depends on the country, given that among the most affected countries there are examples of both a general increase and a general decrease in traffic. We see similar trends for the sunset and sunrise times of day, but it’s a different story throughout the 30 days of Ramadan.

Iran, in general, shows an increase in traffic after Ramadan started on April 2, and a decrease after it ended on May 3 (of around 15%).

How Ramadan shows up in Internet trends

Something similar is seen in Pakistan, which had a general decrease in traffic the week after Ramadan ended but, in the 18:00 to 18:59 hour on May 4, had 13% more traffic than at the same time on the previous Wednesday, when Ramadan was being observed and the iftar meal would have fallen in that hour.

How Ramadan shows up in Internet trends

The opposite happens in Libya, where traffic generally declined during Ramadan and picked up afterwards: comparing Wednesday, May 4, 2022, with the previous Wednesday during the 19:00 to 19:59 hour, traffic grew by around 48%. The same trend is seen in another North African country, Morocco (growth of 44% after Ramadan ended).

How Ramadan shows up in Internet trends

After Ramadan, sunsets ‘bring’ more Internet traffic

Another pattern, unsurprisingly visible in our chart at the beginning of this blog post, is how the sunset period changes when Ramadan (and the holiday that follows) ends: in most cases, traffic clearly increases at around 18:00 or 19:00.

Of the 16 countries with the biggest Ramadan impact, only four had a decrease in traffic after sunset on May 4: Iran, Indonesia, Saudi Arabia, and Malaysia. All of these countries saw an increase in (or sustained) daily traffic during Ramadan and lost daily Internet usage after it ended (in May).

Here’s the example of Indonesia through the Ramadan period that includes April and May:

How Ramadan shows up in Internet trends

And a zoomed-in Indonesia chart after Ramadan ended (May 1, but bear in mind that May 2-3 is the holiday Eid al-fitr) that shows not only the general decrease in traffic, but also how the sunset period doesn’t have a clear drop in requests as seen in the Ramadan period:

How Ramadan shows up in Internet trends

Conclusion: a human impact

Ramadan has a clear impact on Internet traffic patterns as humans change their habits.

The Internet may be the network of networks, where there are many bots (friendly and less friendly), but it continues to be a human-powered network, made by humans for humans.

Follow our Internet trends (including details about ASNs) on Cloudflare Radar, and also on Radar’s Twitter account.
