Tag Archives: announcements

IAM Access Analyzer simplifies inspection of unused access in your organization

Post Syndicated from Achraf Moussadek-Kabdani original https://aws.amazon.com/blogs/security/iam-access-analyzer-simplifies-inspection-of-unused-access-in-your-organization/

AWS Identity and Access Management (IAM) Access Analyzer offers tools that help you set, verify, and refine permissions. You can use IAM Access Analyzer external access findings to continuously monitor your AWS Organizations organization and Amazon Web Services (AWS) accounts for public and cross-account access to your resources, and verify that only intended external access is granted. Now, you can use IAM Access Analyzer unused access findings to identify unused access granted to IAM roles and users in your organization.

If you lead a security team, your goal is to manage security for your organization at scale and make sure that your team follows best practices, such as the principle of least privilege. When your developers build on AWS, they create IAM roles for applications and team members to interact with AWS services and resources. They might start with broad permissions while they explore AWS services for their use cases. To identify unused access, you can review the IAM last accessed information for a given IAM role or user and refine permissions gradually. If your company has a multi-account strategy, your roles and policies are created in multiple accounts. You then need visibility across your organization to make sure that teams are working with just the required access.

Now, IAM Access Analyzer simplifies inspection of unused access by reporting unused access findings across your IAM roles and users. IAM Access Analyzer continuously analyzes the accounts in your organization to identify unused access and creates a centralized dashboard with findings. From a delegated administrator account for IAM Access Analyzer, you can use the dashboard to review unused access findings across your organization and prioritize the accounts to inspect based on the volume and type of findings. The findings highlight unused roles, unused access keys for IAM users, and unused passwords for IAM users. For active IAM users and roles, the findings provide visibility into unused services and actions. With the IAM Access Analyzer integration with Amazon EventBridge and AWS Security Hub, you can automate and scale rightsizing of permissions by using event-driven workflows.

In this post, we’ll show you how to set up and use IAM Access Analyzer to identify and review unused access in your organization.

Generate unused access findings

To generate unused access findings, you need to create an analyzer. An analyzer is an IAM Access Analyzer resource that continuously monitors your accounts or organization for a given finding type. You can create an analyzer for the following findings:

An analyzer for unused access findings is a new analyzer that continuously monitors roles and users, looking for permissions that are granted but not actually used. This analyzer is different from an analyzer for external access findings; you need to create a new analyzer for unused access findings even if you already have an analyzer for external access findings.

You can centrally view unused access findings across your accounts by creating an analyzer at the organization level. If you operate a standalone account, you can get unused access findings by creating an analyzer at the account level. This post focuses on the organization-level analyzer setup and management by a central team.


IAM Access Analyzer charges for unused access findings based on the number of IAM roles and users analyzed per analyzer per month. You can still use IAM Access Analyzer external access findings at no additional cost. For more details on pricing, see IAM Access Analyzer pricing.

Create an analyzer for unused access findings

To enable unused access findings for your organization, you need to create your analyzer by using the IAM Access Analyzer console or APIs in your management account or a delegated administrator account. A delegated administrator is a member account of the organization that you can delegate with administrator access for IAM Access Analyzer. A best practice is to use your management account only for tasks that require the management account and use a delegated administrator for other tasks. For steps on how to add a delegated administrator for IAM Access Analyzer, see Delegated administrator for IAM Access Analyzer.

To create an analyzer for unused access findings (console)

  1. From the delegated administrator account, open the IAM Access Analyzer console, and in the left navigation pane, select Analyzer settings.
  2. Choose Create analyzer.
  3. On the Create analyzer page, do the following, as shown in Figure 1:
    1. For Findings type, select Unused access analysis.
    2. Provide a Name for the analyzer.
    3. Select a Tracking period. The tracking period is the threshold beyond which IAM Access Analyzer considers access to be unused. For example, if you select a tracking period of 90 days, IAM Access Analyzer highlights the roles that haven’t been used in the last 90 days.
    4. Set your Selected accounts. For this example, we select Current organization to review unused access across the organization.
    5. Select Create.
    Figure 1: Create analyzer page

    Figure 1: Create analyzer page

Now that you’ve created the analyzer, IAM Access Analyzer starts reporting findings for unused access across the IAM users and roles in your organization. IAM Access Analyzer will periodically scan your IAM roles and users to update unused access findings. Additionally, if one of your roles, users or policies is updated or deleted, IAM Access Analyzer automatically updates existing findings or creates new ones. IAM Access Analyzer uses a service-linked role to review last accessed information for all roles, user access keys, and user passwords in your organization. For active IAM roles and users, IAM Access Analyzer uses IAM service and action last accessed information to identify unused permissions.

Note: Although IAM Access Analyzer is a regional service (that is, you enable it for a specific AWS Region), unused access findings are linked to IAM resources that are global (that is, not tied to a Region). To avoid duplicate findings and costs, enable your analyzer for unused access in the single Region where you want to review and operate findings.

IAM Access Analyzer findings dashboard

Your analyzer aggregates findings from across your organization and presents them on a dashboard. The dashboard aggregates, in the selected Region, findings for both external access and unused access—although this post focuses on unused access findings only. You can use the dashboard for unused access findings to centrally review the breakdown of findings by account or finding types to identify areas to prioritize for your inspection (for example, sensitive accounts, type of findings, type of environment, or confidence in refinement).

Unused access findings dashboard – Findings overview

Review the findings overview to identify the total findings for your organization and the breakdown by finding type. Figure 2 shows an example of an organization with 100 active findings. The finding type Unused access keys is present in each of the accounts, with the most findings for unused access. To move toward least privilege and to avoid long-term credentials, the security team should clean up the unused access keys.

Figure 2: Unused access finding dashboard

Figure 2: Unused access finding dashboard

Unused access findings dashboard – Accounts with most findings

Review the dashboard to identify the accounts with the highest number of findings and the distribution per finding type. In Figure 2, the Audit account has the highest number of findings and might need attention. The account has five unused access keys and six roles with unused permissions. The security team should prioritize this account based on volume of findings and review the findings associated with the account.

Review unused access findings

In this section, we’ll show you how to review findings. We’ll share two examples of unused access findings, including unused access key findings and unused permissions findings.

Finding example: unused access keys

As shown previously in Figure 2, the IAM Access Analyzer dashboard showed that accounts with the most findings were primarily associated with unused access keys. Let’s review a finding linked to unused access keys.

To review the finding for unused access keys

  1. Open the IAM Access Analyzer console, and in the left navigation pane, select Unused access.
  2. Select your analyzer to view the unused access findings.
  3. In the search dropdown list, select the property Findings type, the Equals operator, and the value Unused access key to get only Findings type = Unused access key, as shown in Figure 3.
    Figure 3: List of unused access findings

    Figure 3: List of unused access findings

  4. Select one of the findings to get a view of the available access keys for an IAM user, their status, creation date, and last used date. Figure 4 shows an example in which one of the access keys has never been used, and the other was used 137 days ago.
    Figure 4: Finding example - Unused IAM user access keys

    Figure 4: Finding example – Unused IAM user access keys

From here, you can investigate further with the development teams to identify whether the access keys are still needed. If they aren’t needed, you should delete the access keys.

Finding example: unused permissions

Another goal that your security team might have is to make sure that the IAM roles and users across your organization are following the principle of least privilege. Let’s walk through an example with findings associated with unused permissions.

To review findings for unused permissions

  1. On the list of unused access findings, apply the filter on Findings type = Unused permissions.
  2. Select a finding, as shown in Figure 5. In this example, the IAM role has 148 unused actions on Amazon Relational Database Service (Amazon RDS) and has not used a service action for 200 days. Similarly, the role has unused actions for other services, including Amazon Elastic Compute Cloud (Amazon EC2), Amazon Simple Storage Service (Amazon S3), and Amazon DynamoDB.
    Figure 5: Finding example - Unused permissions

    Figure 5: Finding example – Unused permissions

The security team now has a view of the unused actions for this role and can investigate with the development teams to check if those permissions are still required.

The development team can then refine the permissions granted to the role to remove the unused permissions.

Unused access findings notify you about unused permissions for all service-level permissions and for 200 services at the action-level. For the list of supported actions, see IAM action last accessed information services and actions.

Take actions on findings

IAM Access Analyzer categorizes findings as active, resolved, and archived. In this section, we’ll show you how you can act on your findings.

Resolve findings

You can resolve unused access findings by deleting unused IAM roles, IAM users, IAM user credentials, or permissions. After you’ve completed this, IAM Access Analyzer automatically resolves the findings on your behalf.

To speed up the process of removing unused permissions, you can use IAM Access Analyzer policy generation to generate a fine-grained IAM policy based on your access analysis. For more information, see the blog post Use IAM Access Analyzer to generate IAM policies based on access activity found in your organization trail.

Archive findings

You can suppress a finding by archiving it, which moves the finding from the Active tab to the Archived tab in the IAM Access Analyzer console. To archive a finding, open the IAM Access Analyzer console, select a Finding ID, and in the Next steps section, select Archive, as shown in Figure 6.

Figure 6: Archive finding in the AWS management console

Figure 6: Archive finding in the AWS management console

You can automate this process by creating archive rules that archive findings based on their attributes. An archive rule is linked to an analyzer, which means that you can have archive rules exclusively for unused access findings.

To illustrate this point, imagine that you have a subset of IAM roles that you don’t expect to use in your tracking period. For example, you might have an IAM role that is used exclusively for break glass access during your disaster recovery processes—you shouldn’t need to use this role frequently, so you can expect some unused access findings. For this example, let’s call the role DisasterRecoveryRole. You can create an archive rule to automatically archive unused access findings associated with roles named DisasterRecoveryRole, as shown in Figure 7.

Figure 7: Example of an archive rule

Figure 7: Example of an archive rule


IAM Access Analyzer exports findings to both Amazon EventBridge and AWS Security Hub. Security Hub also forwards events to EventBridge.

Using an EventBridge rule, you can match the incoming events associated with IAM Access Analyzer unused access findings and send them to targets for processing. For example, you can notify the account owners so that they can investigate and remediate unused IAM roles, user credentials, or permissions.

For more information, see Monitoring AWS Identity and Access Management Access Analyzer with Amazon EventBridge.


With IAM Access Analyzer, you can centrally identify, review, and refine unused access across your organization. As summarized in Figure 8, you can use the dashboard to review findings and prioritize which accounts to review based on the volume of findings. The findings highlight unused roles, unused access keys for IAM users, and unused passwords for IAM users. For active IAM roles and users, the findings provide visibility into unused services and actions. By reviewing and refining unused access, you can improve your security posture and get closer to the principle of least privilege at scale.

Figure 8: Process to address unused access findings

Figure 8: Process to address unused access findings

The new IAM Access Analyzer unused access findings and dashboard are available in AWS Regions, excluding the AWS GovCloud (US) Regions and AWS China Regions. To learn more about how to use IAM Access Analyzer to detect unused accesses, see the IAM Access Analyzer documentation.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Achraf Moussadek-Kabdani

Achraf Moussadek-Kabdani

Achraf is a Senior Security Specialist at AWS. He works with global financial services customers to assess and improve their security posture. He is both a builder and advisor, supporting his customers to meet their security objectives while making security a business enabler.


Yevgeniy Ilyin

Yevgeniy is a Solutions Architect at AWS. He has over 20 years of experience working at all levels of software development and solutions architecture and has used programming languages from COBOL and Assembler to .NET, Java, and Python. He develops and code clouds native solutions with a focus on big data, analytics, and data engineering.

Mathangi Ramesh

Mathangi Ramesh

Mathangi is the product manager for IAM. She enjoys talking to customers and working with data to solve problems. Outside of work, Mathangi is a fitness enthusiast and a Bharatanatyam dancer. She holds an MBA degree from Carnegie Mellon University.

Zonal autoshift – Automatically shift your traffic away from Availability Zones when we detect potential issues

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/zonal-autoshift-automatically-shift-your-traffic-away-from-availability-zones-when-we-detect-potential-issues/

Today we’re launching zonal autoshift, a new capability of Amazon Route 53 Application Recovery Controller that you can enable to automatically and safely shift your workload’s traffic away from an Availability Zone when AWS identifies a potential failure affecting that Availability Zone and shift it back once the failure is resolved.

When deploying resilient applications, you typically deploy your resources across multiple Availability Zones in a Region. Availability Zones are distinct groups of physical data centers at a meaningful distance apart (typically miles) to make sure that they have diverse power, connectivity, network devices, and flood plains.

To help you protect against an application’s errors, like a failed deployment, an error of configuration, or an operator error, we introduced last year the ability to manually or programmatically trigger a zonal shift. This enables you to shift the traffic away from one Availability Zone when you observe degraded metrics in that zone. It does so by configuring your load balancer to direct all new connections to infrastructure in healthy Availability Zones only. This allows you to preserve your application’s availability for your customers while you investigate the root cause of the failure. Once fixed, you stop the zonal shift to ensure the traffic is distributed across all zones again.

Zonal shift works at the Application Load Balancer (ALB) or Network Load Balancer (NLB) level only when cross-zone load balancing is turned off, which is the default for NLB. In a nutshell, load balancers offer two levels of load balancing. The first level is configured in the DNS. Load balancers expose one or more IP addresses for each Availability Zone, offering a client-side load balancing between zones. Once the traffic hits an Availability Zone, the load balancer sends traffic to registered healthy targets, typically an Amazon Elastic Compute Cloud (Amazon EC2) instance. By default, ALBs send traffic to targets across all Availability Zones. For zonal shift to properly work, you must configure your load balancers to disable cross-zone load balancing.

When zonal shift starts, the DNS sends all traffic away from one Availability Zone, as illustrated by the following diagram.

ARC Zonal Shift

Manual zonal shift helps to protect your workload against errors originating from your side. But when there is a potential failure in an Availability Zone, it is sometimes difficult for you to identify or detect the failure. Detecting an issue in an Availability Zone using application metrics is difficult because, most of the time, you don’t track metrics per Availability Zone. Moreover, your services often call dependencies across Availability Zone boundaries, resulting in errors seen in all Availability Zones. With modern microservice architectures, these detection and recovery steps must often be performed across tens or hundreds of discrete microservices, leading to recovery times of multiple hours.

Customers asked us if we could take the burden off their shoulders to detect a potential failure in an Availability Zone. After all, we might know about potential issues through our internal monitoring tools before you do.

With this launch, you can now configure zonal autoshift to protect your workloads against potential failure in an Availability Zone. We use our own AWS internal monitoring tools and metrics to decide when to trigger a network traffic shift. The shift starts automatically; there is no API to call. When we detect that a zone has a potential failure, such as a power or network disruption, we automatically trigger an autoshift of your infrastructure’s NLB or ALB traffic, and we shift the traffic back when the failure is resolved.

Obviously, shifting traffic away from an Availability Zone is a delicate operation that must be carefully prepared. We built a series of safeguards to ensure we don’t degrade your application availability by accident.

First, we have internal controls to ensure we shift traffic away from no more than one Availability Zone at a time. Second, we practice the shift on your infrastructure for 30 minutes every week. You can define blocks of time when you don’t want the practice to happen, for example, 08:00–18:00, Monday through Friday. Third, you can define two Amazon CloudWatch alarms to act as a circuit breaker during the practice run: one alarm to prevent starting the practice run at all and one alarm to monitor your application health during a practice run. When either alarm triggers during the practice run, we stop it and restore traffic to all Availability Zones. The state of application health alarm at the end of the practice run indicates its outcome: success or failure.

According to the principle of shared responsibility, you have two responsibilities as well.

First you must ensure there is enough capacity deployed in all Availability Zones to sustain the increase of traffic in remaining Availability Zones after traffic has shifted. We strongly recommend having enough capacity in remaining Availability Zones at all times and not relying on scaling mechanisms that could delay your application recovery or impact its availability. When zonal autoshift triggers, AWS Auto Scaling might take more time than usual to scale your resources. Pre-scaling your resource ensures a predictable recovery time for your most demanding applications.

Let’s imagine that to absorb regular user traffic, your application needs six EC2 instances across three Availability Zones (2×3 instances). Before configuring zonal autoshift, you should ensure you have enough capacity in the remaining Availability Zones to absorb the traffic when one Availability Zone is not available. In this example, it means three instances per Availability Zone (3×3 = 9 instances with three Availability Zones in order to keep 2×3 = 6 instances to handle the load when traffic is shifted to two Availability Zones).

In practice, when operating a service that requires high reliability, it’s normal to operate with some redundant capacity online for eventualities such as customer-driven load spikes, occasional host failures, etc. Topping up your existing redundancy in this way both ensures you can recover rapidly during an Availability Zone issue but can also give you greater robustness to other events.

Second, you must explicitly enable zonal autoshift for the resources you choose. AWS applies zonal autoshift only on the resources you chose. Applying a zonal autoshift will affect the total capacity allocated to your application. As I just described, your application must be prepared for that by having enough capacity deployed in the remaining Availability Zones.

Of course, deploying this extra capacity in all Availability Zones has a cost. When we talk about resilience, there is a business tradeoff to decide between your application availability and its cost. This is another reason why we apply zonal autoshift only on the resources you select.

Let’s see how to configure zonal autoshift
To show you how to configure zonal autoshift, I deploy my now-famous TicTacToe web application using a CDK script. I open the Route 53 Application Recovery Controller page of the AWS Management Console. On the left pane, I select Zonal autoshift. Then, on the welcome page, I select Configure zonal autoshift for a resource.

Zonal autoshift - 1

I select the load balancer of my demo application. Remember that currently, only load balancers with cross-zone load balancing turned off are eligible for zonal autoshift. As the warning on the console reminds me, I also make sure my application has enough capacity to continue to operate with the loss of one Availability Zone.

Zonal autoshift - 2

I scroll down the page and configure the times and days I don’t want AWS to run the 30-minute practice. At first, and until I’m comfortable with autoshift, I block the practice 08:00–18:00, Monday through Friday. Pay attention that hours are expressed in UTC, and they don’t vary with daylight saving time. You may use a UTC time converter application for help. While it is safe for you to exclude business hours at the start, we recommend configuring the practice run also during your business hours to ensure capturing issues that might not be visible when there is low or no traffic on your application. You probably most need zonal autoshift to work without impact at your peak time, but if you have never tested it, how confident are you? Ideally, you don’t want to block any time at all, but we recognize that’s not always practical.

Zonal autoshift - 3

Further down on the same page, I enter the two circuit breaker alarms. The first one prevents the practice from starting. You use this alarm to tell us this is not a good time to start a practice run. For example, when there is an issue ongoing with your application or when you’re deploying a new version of your application to production. The second CloudWatch alarm gives the outcome of the practice run. It enables zonal autoshift to judge how your application is responding to the practice run. If the alarm stays green, we know all went well.

If either of these two alarms triggers during the practice run, zonal autoshift stops the practice and restores the traffic to all Availability Zones.

Finally, I acknowledge that a 30-minute practice run will run weekly and that it might reduce the availability of my application.

Then, I select Create.

Zonal autoshift - 4And that’s it.

After a few days, I see the history of the practice runs on the Zonal shift history for resource tab of the console. I monitor the history of my two circuit breaker alarms to stay confident everything is correctly monitored and configured.

ARC Zonal Shift - practice run

It’s not possible to test an autoshift itself. It triggers automatically when we detect a potential issue in an Availability Zone. I asked the service team if we could shut down an Availability Zone to test the instructions I shared in this post; they politely declined my request :-).

To test your configuration, you can trigger a manual shift, which behaves identically to an autoshift.

A few more things to know
Zonal autoshift is now available at no additional cost in all AWS Regions, except for China and GovCloud.

We recommend applying the crawl, walk, run methodology. First, you get started with manual zonal shifts to acquire confidence in your application. Then, you turn on zonal autoshift configured with practice runs outside of your business hours. Finally, you modify the schedule to include practice zonal shifts during your business hours. You want to test your application response to an event when you least want it to occur.

We also recommend that you think holistically about how all parts of your application will recover when we move traffic away from one Availability Zone and then back. The list that comes to mind (although certainly not complete) is the following.

First, plan for extra capacity as I discussed already. Second, think about possible single points of failure in each Availability Zone, such as a self-managed database running on a single EC2 instance or a microservice that leaves in a single Availability Zone, and so on. I strongly recommend using managed databases, such as Amazon DynamoDB or Amazon Aurora for applications requiring zonal shifts. These have built-in replication and fail-over mechanisms in place. Third, plan the switch back when the Availability Zone will be available again. How much time do you need to scale your resources? Do you need to rehydrate caches?

You can learn more about resilient architectures and methodologies with this great series of articles from my colleague Adrian.

Finally, remember that only load balancers with cross-zone load balancing turned off are currently eligible for zonal autoshift. To turn off cross-zone load balancing from a CDK script, you need to remove stickinessCookieDuration and add load_balancing.cross_zone.enabled=false on the target group. Here is an example with CDK and Typescript:

    // Add the auto scaling group as a load balancing
    // target to the listener.
    const targetGroup = listener.addTargets('MyApplicationFleet', {
      port: 8080,
      // for zonal shift, stickiness & cross-zones load balancing must be disabled
      // stickinessCookieDuration: Duration.hours(1),
      targets: [asg]
    // disable cross zone load balancing
    targetGroup.setAttribute("load_balancing.cross_zone.enabled", "false");

Now it’s time for you to select your applications that would benefit from zonal autoshift. Start by reviewing your infrastructure capacity in each Availability Zone and then define the circuit breaker alarms. Once you are confident your monitoring is correctly configured, go and enable zonal autoshift.

— seb

Amazon SageMaker Studio adds web-based interface, Code Editor, flexible workspaces, and streamlines user onboarding

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/amazon-sagemaker-studio-adds-web-based-interface-code-editor-flexible-workspaces-and-streamlines-user-onboarding/

Today, we are announcing an improved Amazon SageMaker Studio experience! The new SageMaker Studio web-based interface loads faster and provides consistent access to your preferred integrated development environment (IDE) and SageMaker resources and tooling, irrespective of your IDE choice. In addition to JupyterLab and RStudio, SageMaker Studio now includes a fully managed Code Editor based on Code-OSS (Visual Studio Code Open Source).

Both Code Editor and JupyterLab can be launched using a flexible workspace. With spaces, you can scale the compute and storage for your IDE up and down as you go, customize runtime environments, and pause-and-resume coding anytime from anywhere. You can spin up multiple such spaces, each configured with a different combination of compute, storage, and runtimes.

SageMaker Studio now also comes with a streamlined onboarding and administration experience to help both individual users and enterprise administrators get started in minutes. Let me give you a quick tour of some of these highlights.

New SageMaker Studio web-based interface
The new SageMaker Studio web-based interface acts as a command center for launching your preferred IDE and accessing your SageMaker tools to build, train, tune, and deploy models. You can now view SageMaker training jobs and endpoints in SageMaker Studio and access foundation models (FMs) via SageMaker JumpStart. Also, you no longer need to manually upgrade SageMaker Studio.

Amazon SageMaker Studio

New Code Editor based on Code-OSS (Visual Studio Code Open Source)
As a data scientist or machine learning (ML) practitioner, you can now sign in to SageMaker Studio and launch Code Editor directly from your browser. With Code Editor, you have access to thousands of VS Code compatible extensions from Open VSX registry and the preconfigured AWS toolkit for Visual Studio Code for developing and deploying applications on AWS. You can also use the artificial intelligence (AI)-powered coding companion and security scanning tool powered by Amazon CodeWhisperer and Amazon CodeGuru.

Amazon SageMaker Studio

Launch Code Editor and JupyterLab in a flexible workspace
You can launch both Code Editor and JupyterLab using private spaces that only the user creating the space has access to. This flexible workspace is designed to provide a faster and more efficient coding environment.

The spaces come preconfigured with a SageMaker distribution that contains popular ML frameworks and Python packages. With the help of the AI-powered coding companions and security tools, you can quickly generate, debug, explain, and refactor your code.

In addition, SageMaker Studio comes with an improved collaboration experience. You can use the built-in Git integration to share and version code or bring your own shared file storage using Amazon EFS to access a collaborative filesystem across different users or teams.

Amazon SageMaker Studio

Amazon SageMaker Studio

Amazon SageMaker Studio

Streamlined user onboarding and administration
With redesigned setup and onboarding workflows, you can now set up SageMaker Studio domains within minutes. As an individual user, you can now use a one-click experience to launch SageMaker Studio using default presets and without the need to learn about domains or AWS IAM roles.

As an enterprise administrator, step-by-step instructions help you choose the right authentication method, connect to your third-party identity providers, integrate networking and security configurations, configure fine-grained access policies, and choose the right applications to enable in SageMaker Studio. You can also update settings at any time.

To get started, navigate to the SageMaker console and select either Set up for single user or Set up for organization.

Amazon SageMaker Studio

The single-user setup will start deploying a SageMaker Studio domain using default presets and will be ready within a few minutes. The setup for organizations will guide you through the configuration step-by-step. Note that you can choose to keep working with the classic SageMaker Studio experience or start exploring the new experience.

Amazon SageMaker Studio

Now available
The new Amazon SageMaker Studio experience is available today in all AWS Regions where SageMaker Studio is available. Starting today, new SageMaker Studio domains will default to the new web-based interface. If you have an existing setup and want to start using the new experience, check out the SageMaker Developer Guide for instructions on how to migrate your existing domains.

Give it a try, and let us know what you think. You can send feedback to AWS re:Post for Amazon SageMaker Studio or through your usual AWS contacts.

Start building your ML projects with Amazon SageMaker Studio today!

— Antje

Three new capabilities for Amazon Inspector broaden the realm of vulnerability scanning for workloads

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/three-new-capabilities-for-amazon-inspector-broaden-the-realm-of-vulnerability-scanning-for-workloads/

Today, Amazon Inspector adds three new capabilities to increase the realm of possibilities when scanning your workloads for software vulnerabilities:

  • Amazon Inspector introduces a new set of open source plugins and an API allowing you to assess your container images for software vulnerabilities at build time directly from your continuous integration and continuous delivery (CI/CD) pipelines wherever they are running.
  • Amazon Inspector can now continuously monitor your Amazon Elastic Compute Cloud (Amazon EC2) instances without installing an agent or additional software (in preview).
  • Amazon Inspector uses generative artificial intelligence (AI) and automated reasoning to provide assisted code remediation for your AWS Lambda functions.

Amazon Inspector is a vulnerability management service that continually scans your AWS workloads for known software vulnerabilities and unintended network exposure. Amazon Inspector automatically discovers and scans running EC2 instances, container images in Amazon Elastic Container Registry (Amazon ECR) and within your CI/CD tools, and Lambda functions.

We all know engineering teams often face challenges when it comes to promptly addressing vulnerabilities. This is because of the tight release deadlines that force teams to prioritize development over tackling issues in their vulnerability backlog. But it’s also due to the complex and ever-evolving nature of the security landscape. As a result, a study showed that organizations take 250 days on average to resolve critical vulnerabilities. It is therefore crucial to identify potential security issues early in the development lifecycle to prevent their deployment into production.

Detecting vulnerabilities in your AWS Lambda functions code
Let’s start close to the developer with Lambda functions code.

In November 2022 and June 2023, Amazon Inspector added the capability to scan your function’s dependencies and code. Today, we’re adding generative AI and automated reasoning to analyze your code and automatically create remediation as code patches.

Amazon Inspector can now provide in-context code patches for multiple classes of vulnerabilities detected during security scans. Amazon Inspector extends the assessment of your code for security issues like injection flaws, data leaks, weak cryptography, or missing encryption. Thanks to generative AI, Amazon Inspector now provides suggestions how to fix it. It shows affected code snippets in context with suggested remediation.

Here is an example. I wrote a short snippet of Python code with a hardcoded AWS secret key. Never do that!

def create_session_noncompliant():
    import boto3
    # Noncompliant: uses hardcoded secret access key.
    sample_key = "AjWnyxxxxx45xxxxZxxxX7ZQxxxxYxxx1xYxxxxx"
    return response

I deploy the code. This triggers the assessment. I open the AWS Management Console and navigate to the Amazon Inspector page. In the Findings section, I find the vulnerability. It gives me the Vulnerability location and the Suggested remediation in a plain natural language explanation but also in diff text and graphical formats.

Inspector automated code remediation

Detecting vulnerabilities in your container CI/CD pipeline
Now, let’s move to your CI/CD pipelines when building containers.

Until today, Amazon Inspector was able to assess container images once they were built and stored in Amazon Elastic Container Registry (Amazon ECR). Starting today, Amazon Inspector can detect security issues much sooner in the development process by assessing container images during their build within CI/CD tools. Assessment results are returned in near real-time directly to the CI/CD tool’s dashboard. There is no need to enable Amazon Inspector to use this new capability.

We provide ready-to-use CI/CD plugins for Jenkins and JetBrain’s TeamCity, with more to come. There is also a new API (inspector-scan) and command (inspector-sbomgen) available from our AWS SDKs and AWS Command Line Interface (AWS CLI). This new API allows you to integrate Amazon Inspector in the CI/CD tool of your choice.

Upon execution, the plugin runs a container extraction engine on the configured resource and generates a CycloneDX-compatible software bill of materials (SBOM). Then, the plugin sends the SBOM to Amazon Inspector for analysis. The plugin receives the result of the scan in near real-time. It parses the response and generates outputs that Jenkins or TeamCity uses to pass or fail the execution of the pipeline.

To use the plugin with Jenkins, I first make sure there is a role attached to the EC2 instance where Jenkins is installed, or I have an AWS access key and secret access key with permissions to call the Amazon Inspector API.

I install the plugin directly from Jenkins (Jenkins Dashboard > Manage Jenkins > Plugins)

Inspect CICD Install Jenkins plugin

Then, I add an Amazon Inspector Scan step in my pipeline.

Inspector CICD - add Jenkins step

I configure the step with the IAM Role I created (or an AWS access key and secret access key when running on premises), my Docker Credentials, the AWS Region, and the Image Id.

Inspector CICD - configure jenkins plugins

When Amazon Inspector detects vulnerabilities, it reports them to the plugin. The build fails, and I can view the details directly in Jenkins.

Inspector CICD - findings in jenkins

The SBOM generation understands packages or applications for popular operating systems, such as Alpine, Amazon Linux, Debian, Ubuntu, and Red Hat packages. It also detects packages for Go, Java, NodeJS, C#, PHP, Python, Ruby, and Rust programming languages.

Detecting vulnerabilities on Amazon EC2 without installing agents (in preview)
Finally, let’s talk about agentless inspection of your EC2 instances.

Currently, Amazon Inspector uses AWS Systems Manager and the AWS Systems Manager Agent (SSM Agent) to collect information about the inventory of your EC2 instances. To ensure Amazon Inspector can communicate with your instances, you have to ensure three conditions. First, a recent version of the SSM Agent is installed on the instance. Second, the SSM Agent is started. And third, you attached an IAM role to the instance to allow the SSM Agent to communicate back to the SSM service. This seems fair and simple. But it is not when considering large deployments across multiple OS versions, AWS Regions, and accounts, or when you manage legacy applications. Each instance launched that doesn’t satisfy these three conditions is a potential security gap in your infrastructure.

With agentless scanning (in preview), Amazon Inspector doesn’t require the SSM Agent to scan your instances. It automatically discovers existing and new instances and schedules a vulnerability assessment for them. It does so by taking a snapshot of the instance’s EBS volumes and analyzing the snapshot. This technique has the extra advantage of not consuming any CPU cycle or memory on your instances, leaving 100 percent of the (virtual) hardware available for your workloads. After the analysis, Amazon Inspector deletes the snapshot.

To get started, enable hybrid scanning under EC2 scanning settings in the Amazon Inspector section of the AWS Management Console. Hybrid mode means Amazon Inspector continues to use the SSM Agent–based scanning for instances managed by SSM and automatically switches to agentless for instances that are not managed by SSM.

Inspector enable hybrid scanning

Under Account management, I can verify the list of scanned instances. I can see which instances are scanned with the SSM Agent and which are not.

Inspector list of instances monitored

Under Findings, I can filter by vulnerability, by account, by instance, and so on. I select by instance and select the agentless instance I want to review.

For that specific instance, Amazon Inspector lists more than 200 findings, sorted by severity.

Inspector list of findings

As usual, I can see the details of a finding to understand what the risk is and how to mitigate it.

Inspector details of a finding

Pricing and availability
Amazon Inspector code remediation for Lambda functions is available in ten Regions: US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland, London, Stockholm). It is available at no additional cost.

Amazon Inspector agentless vulnerability scanning for Amazon EC2 is available in preview in three AWS Regions: US East (N. Virginia), US West (Oregon), and Europe (Ireland).

The new API to scan containers at build time is available in the 21 AWS Regions where Amazon Inspector is available today.

There are no upfront or subscription costs. We charge on-demand based on the volume of activity. There is a price per EC2 instance or container image scan. As usual, the Amazon Inspector pricing page has the details.

Start today by adding the Jenkins or TeamCity agent to your containerized application CI/CD pipelines or activate the agentless Amazon EC2 inspection.

Now go build!

— seb

Fall 2023 SOC reports now available with 171 services in scope

Post Syndicated from Ryan Wilks original https://aws.amazon.com/blogs/security/fall-2023-soc-reports-now-available-with-171-services-in-scope/

At Amazon Web Services (AWS), we’re committed to providing our customers with continued assurance over the security, availability, confidentiality, and privacy of the AWS control environment.

We’re proud to deliver the Fall 2023 System and Organizational (SOC) 1, 2, and 3 reports to support your confidence in AWS services. The reports cover the period October 1, 2022, to September 30, 2023. We extended the period of coverage to 12 months so that you have a full year of assurance from a single report. We also updated the associated infrastructure supporting our in-scope products and services to reflect new edge locations, AWS Wavelength zones, and AWS Local Zones.

The SOC 2 report includes the Security, Availability, Confidentiality, and Privacy Trust Service Criteria that cover both the design and operating effectiveness of controls over a period of time. The SOC 2 Privacy Trust Service Criteria, developed by the American Institute of Certified Public Accountants (AICPA), establishes the criteria for evaluating controls and how personal information is collected, used, retained, disclosed, and disposed of. For more information about our privacy commitments supporting the SOC 2 Type 2 report, see the AWS Customer Agreement.

The scope of the Fall 2023 SOC 2 Type 2 report includes information about how we handle the content that you upload to AWS, and how we protect that content across the services and locations that are in scope for the latest AWS SOC reports.

The Fall 2023 SOC reports include an additional 13 services in scope, for a total of 171 services. See the full list on our Services in Scope by Compliance Program page.

Here are the 13 additional services in scope for the Fall 2023 SOC reports:

Customers can download the Fall 2023 SOC reports through AWS Artifact in the AWS Management Console. You can also download the SOC 3 report as a PDF file from AWS.

AWS strives to bring services into the scope of its compliance programs to help you meet your architectural and regulatory needs. If there are additional AWS services that you would like us to add to the scope of our SOC reports (or other compliance programs), reach out to your AWS representatives.

We value your feedback and questions. Feel free to reach out to the team through the Contact Us page. If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to-content, news, and feature announcements? Follow us on Twitter.

ryan wilks

Ryan Wilks

Ryan is a Compliance Program Manager at AWS. He leads multiple security and privacy initiatives within AWS. Ryan has 13 years of experience in information security. He has a bachelor of arts degree from Rutgers University and holds ITIL, CISM, and CISA certifications.

Nathan Samuel

Nathan Samuel

Nathan is a Compliance Program Manager at AWS. He leads multiple security and privacy initiatives within AWS. Nathan has a bachelor of commerce degree from the University of the Witwatersrand, South Africa, and has over 20 years of experience in security assurance. He holds the CISA, CRISC, CGEIT, CISM, CDPSE, and Certified Internal Auditor certifications.

Brownell Combs

Brownell Combs

Brownell is a Compliance Program Manager at AWS. He leads multiple security and privacy initiatives within AWS. Brownell holds a master of science degree in computer science from the University of Virginia and a bachelor of science degree in computer science from Centre College. He has over 20 years of experience in IT risk management and CISSP, CISA, and CRISC certifications.

Paul Hong

Paul Hong

Paul is a Compliance Program Manager at AWS. He leads multiple security, compliance, and training initiatives within AWS, and has 10 years of experience in security assurance. Paul holds CISSP, CEH, and CPA certifications, and holds a master’s degree in accounting information systems and a bachelor’s degree in business administration from James Madison University, Virginia.

Package and deploy models faster with new tools and guided workflows in Amazon SageMaker

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/package-and-deploy-models-faster-with-new-tools-and-guided-workflows-in-amazon-sagemaker/

I’m happy to share that Amazon SageMaker now comes with an improved model deployment experience to help you deploy traditional machine learning (ML) models and foundation models (FMs) faster.

As a data scientist or ML practitioner, you can now use the new ModelBuilder class in the SageMaker Python SDK to package models, perform local inference to validate runtime errors, and deploy to SageMaker from your local IDE or SageMaker Studio notebooks.

In SageMaker Studio, new interactive model deployment workflows give you step-by-step guidance on which instance type to choose to find the most optimal endpoint configuration. SageMaker Studio also provides additional interfaces to add models, test inference, and enable auto scaling policies on the deployed endpoints.

New tools in SageMaker Python SDK
The SageMaker Python SDK has been updated with new tools, including ModelBuilder and SchemaBuilder classes that unify the experience of converting models into SageMaker deployable models across ML frameworks and model servers. Model builder automates the model deployment by selecting a compatible SageMaker container and capturing dependencies from your development environment. Schema builder helps to manage serialization and deserialization tasks of model inputs and outputs. You can use the tools to deploy the model in your local development environment to experiment with it, fix any runtime errors, and when ready, transition from local testing to deploy the model on SageMaker with a single line of code.

Amazon SageMaker ModelBuilder

Let me show you how this works. In the following example, I choose the Falcon-7B model from the Hugging Face model hub. I first deploy the model locally, run a sample inference, perform local benchmarking to find the optimal configuration, and finally deploy the model with the suggested configuration to SageMaker.

First, import the updated SageMaker Python SDK and define a sample model input and output that matches the prompt format for the selected model.

import sagemaker
from sagemaker.serve.builder.model_builder import ModelBuilder
from sagemaker.serve.builder.schema_builder import SchemaBuilder
from sagemaker.serve import Mode

prompt = "Falcons are"
response = "Falcons are small to medium-sized birds of prey related to hawks and eagles."

sample_input = {
    "inputs": prompt,
    "parameters": {"max_new_tokens": 32}

sample_output = [{"generated_text": response}]

Then, create a ModelBuilder instance with the Hugging Face model ID, a SchemaBuilder instance with the sample model input and output, define a local model path, and set the mode to LOCAL_CONTAINER to deploy the model locally. The schema builder generates the required functions for serializing and deserializing the model inputs and outputs.

model_builder = ModelBuilder(
    schema_builder=SchemaBuilder(sample_input, sample_output),
	env_vars={"HF_TRUST_REMOTE_CODE": "True"}

Next, call build() to convert the PyTorch model into a SageMaker deployable model. The build function generates the required artifacts for the model server, including the inferency.py and serving.properties files.

local_mode_model = model_builder.build()

For FMs, such as Falcon, you can optionally run tune() in local container mode that performs local benchmarking to find the optimal model serving configuration. This includes the tensor parallel degree that specifies the number of GPUs to use if your environment has multiple GPUs available. Once ready, call deploy() to deploy the model in your local development environment.

tuned_model = local_mode_model.tune()

Let’s test the model.

updated_sample_input = model_builder.schema_builder.sample_input

{'inputs': 'Falcons are',
 'parameters': {'max_new_tokens': 32}}

In my demo, the model returns the following response:

a type of bird that are known for their sharp talons and powerful beaks. They are also known for their ability to fly at high speeds […]

When you’re ready to deploy the model on SageMaker, call deploy() again, set the mode to SAGEMAKLER_ENDPOINT, and provide an AWS Identity and Access Management (IAM) role with appropriate permissions.

sm_predictor = tuned_model.deploy(

This starts deploying your model on a SageMaker endpoint. Once the endpoint is ready, you can run predictions.

new_input = {'inputs': 'Eagles are','parameters': {'max_new_tokens': 32}}

New SageMaker Studio model deployment experience
You can start the new interactive model deployment workflows by selecting one or more models to deploy from the models landing page or SageMaker JumpStart model details page or by creating a new endpoint from the endpoints details page.

Amazon SageMaker - New Model Deployment Experience

The new workflows help you quickly deploy the selected model(s) with minimal inputs. If you used SageMaker Inference Recommender to benchmark your model, the dropdown will show instance recommendations from that benchmarking.

Model deployment experience in SageMaker Studio

Without benchmarking your model, the dropdown will display prospective instances that SageMaker predicts could be a good fit based on its own heuristics. For some of the most popular SageMaker JumpStart models, you’ll see an AWS pretested optimal instance type. For other models, you’ll see generally recommended instance types. For example, if I select the Falcon 40B Instruct model in SageMaker JumpStart, I can see the recommended instance types.

Model deployment experience in SageMaker Studio

Model deployment experience in SageMaker Studio

However, if I want to optimize the deployment for cost or performance to meet my specific use cases, I could open the Alternate configurations panel to view more options based on data from before benchmarking.

Model deployment experience in SageMaker Studio

Once deployed, you can test inference or manage auto scaling policies.

Model deployment experience in SageMaker Studio

Things to know
Here are a couple of important things to know:

Supported ML models and frameworks – At launch, the new SageMaker Python SDK tools support model deployment for XGBoost and PyTorch models. You can deploy FMs by specifying the Hugging Face model ID or SageMaker JumpStart model ID using the SageMaker LMI container or Hugging Face TGI-based container. You can also bring your own container (BYOC) or deploy models using the Triton model server in ONNX format.

Now available
The new set of tools is available today in all AWS Regions where Amazon SageMaker real-time inference is available. There is no cost to use the new set of tools; you pay only for any underlying SageMaker resources that get created.

Learn more

Get started
Explore the new SageMaker model deployment experience in the AWS Management Console today!

— Antje

Use natural language to explore and prepare data with a new capability of Amazon SageMaker Canvas

Post Syndicated from Irshad Buchh original https://aws.amazon.com/blogs/aws/use-natural-language-to-explore-and-prepare-data-with-a-new-capability-of-amazon-sagemaker-canvas/

Today, I’m happy to introduce the ability to use natural language instructions in Amazon SageMaker Canvas to explore, visualize, and transform data for machine learning (ML).

SageMaker Canvas now supports using foundation model- (FM) powered natural language instructions to complement its comprehensive data preparation capabilities for data exploration, analysis, visualization, and transformation. Using natural language instructions, you can now explore and transform your data to build highly accurate ML models. This new capability is powered by Amazon Bedrock.

Data is the foundation for effective machine learning, and transforming raw data to make it suitable for ML model building and generating predictions is key to better insights. Analyzing, transforming, and preparing data to build ML models is often the most time-consuming part of the ML workflow. With SageMaker Canvas, data preparation for ML is seamless and fast with 300+ built-in transforms, analyses, and an in-depth data quality insights report without writing any code. Starting today, the process of data exploration and preparation is faster and simpler in SageMaker Canvas using natural language instructions for exploring, visualizing, and transforming data.

Data preparation tasks are now accelerated through a natural language experience using queries and responses. You can quickly get started with contextual, guided prompts to understand and explore your data.

Say I want to build an ML model to predict house prices Using SageMaker Canvas. First, I need to prepare my housing dataset to build an accurate model. To get started with the new natural language instructions, I open the SageMaker Canvas application, and in the left navigation pane, I choose Data Wrangler. Under the Data tab and from the list of available datasets, I select the canvas-housing-sample.csv as the dataset, then select Create a data flow and choose Create. I see the tabular view of my dataset and an introduction to the new Chat for data prep capability.


I select Chat for data prep, and it displays the chat interface with a set of guided prompts relevant to my dataset. I can use any of these prompts or query the data for something else.


First, I want to understand the quality of my dataset to identify any outliers or anomalies. I ask SageMaker Canvas to generate a data quality report to accomplish this task.


I see there are no major issues with my data. I would now like to visualize the distribution of a couple of features in the data. I ask SageMaker Canvas to plot a chart.


I now want to filter certain rows to transform my data. I ask SageMaker Canvas to remove rows where the population is less than 1,000. Canvas removes those rows, shows me a preview of the transformed data, and also gives me the option to view and update the code that generated the transform.


I am happy with the preview and add the transformed data to my list of data transform steps on the right. SageMaker Canvas adds the step along with the code.


Now that my data is transformed, I can go on to build my ML model to predict house prices and even deploy the model into production using the same visual interface of SageMaker Canvas, without writing a single line of code.

Data preparation has never been easier for ML!

The new capability in Amazon SageMaker Canvas to explore and transform data using natural language queries is available in all AWS Regions where Amazon SageMaker Canvas and Amazon Bedrock are supported.

Learn more
Amazon SageMaker Canvas product page

Go build!

— Irshad

Amazon SageMaker adds new inference capabilities to help reduce foundation model deployment costs and latency

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/amazon-sagemaker-adds-new-inference-capabilities-to-help-reduce-foundation-model-deployment-costs-and-latency/

Today, we are announcing new Amazon SageMaker inference capabilities that can help you optimize deployment costs and reduce latency. With the new inference capabilities, you can deploy one or more foundation models (FMs) on the same SageMaker endpoint and control how many accelerators and how much memory is reserved for each FM. This helps to improve resource utilization, reduce model deployment costs on average by 50 percent, and lets you scale endpoints together with your use cases.

For each FM, you can define separate scaling policies to adapt to model usage patterns while further optimizing infrastructure costs. In addition, SageMaker actively monitors the instances that are processing inference requests and intelligently routes requests based on which instances are available, helping to achieve on average 20 percent lower inference latency.

Key components
The new inference capabilities build upon SageMaker real-time inference endpoints. As before, you create the SageMaker endpoint with an endpoint configuration that defines the instance type and initial instance count for the endpoint. The model is configured in a new construct, an inference component. Here, you specify the number of accelerators and amount of memory you want to allocate to each copy of a model, together with the model artifacts, container image, and number of model copies to deploy.

Amazon SageMaker - MME

Let me show you how this works.

New inference capabilities in action
You can start using the new inference capabilities from SageMaker Studio, the SageMaker Python SDK, and the AWS SDKs and AWS Command Line Interface (AWS CLI). They are also supported by AWS CloudFormation.

For this demo, I use the AWS SDK for Python (Boto3) to deploy a copy of the Dolly v2 7B model and a copy of the FLAN-T5 XXL model from the Hugging Face model hub on a SageMaker real-time endpoint using the new inference capabilities.

Create a SageMaker endpoint configuration

import boto3
import sagemaker

role = sagemaker.get_execution_role()
sm_client = boto3.client(service_name="sagemaker")

        "VariantName": "AllTraffic",
        "InstanceType": "ml.g5.12xlarge",
        "InitialInstanceCount": 1,
		"RoutingConfig": {
            "RoutingStrategy": "LEAST_OUTSTANDING_REQUESTS"

Create the SageMaker endpoint


Before you can create the inference component, you need to create a SageMaker-compatible model and specify a container image to use. For both models, I use the Hugging Face LLM Inference Container for Amazon SageMaker. These deep learning containers (DLCs) include the necessary components, libraries, and drivers to host large models on SageMaker.

Prepare the Dolly v2 model

from sagemaker.huggingface import get_huggingface_llm_image_uri

# Retrieve the container image URI
hf_inference_dlc = get_huggingface_llm_image_uri(

# Configure model container
dolly7b = {
    'Image': hf_inference_dlc,
    'Environment': {

# Create SageMaker Model
    ModelName        = "dolly-v2-7b",
    ExecutionRoleArn = role,
    Containers       = [dolly7b]

Prepare the FLAN-T5 XXL model

# Configure model container
flant5xxlmodel = {
    'Image': hf_inference_dlc,
    'Environment': {

# Create SageMaker Model
    ModelName        = "flan-t5-xxl",
    ExecutionRoleArn = role,
    Containers       = [flant5xxlmodel]

Now, you’re ready to create the inference component.

Create an inference component for each model
Specify an inference component for each model you want to deploy on the endpoint. Inference components let you specify the SageMaker-compatible model and the compute and memory resources you want to allocate. For CPU workloads, define the number of cores to allocate. For accelerator workloads, define the number of accelerators. RuntimeConfig defines the number of model copies you want to deploy.

# Inference compoonent for Dolly v2 7B
        "ModelName": "dolly-v2-7b",
        "ComputeResourceRequirements": {
		    "NumberOfAcceleratorDevicesRequired": 2, 
			"NumberOfCpuCoresRequired": 2, 
			"MinMemoryRequiredInMb": 1024
    RuntimeConfig={"CopyCount": 1},

# Inference component for FLAN-T5 XXL
        "ModelName": "flan-t5-xxl",
        "ComputeResourceRequirements": {
		    "NumberOfAcceleratorDevicesRequired": 2, 
			"NumberOfCpuCoresRequired": 1, 
			"MinMemoryRequiredInMb": 1024
    RuntimeConfig={"CopyCount": 1},

Once the inference components have successfully deployed, you can invoke the models.

Run inference
To invoke a model on the endpoint, specify the corresponding inference component.

import json
sm_runtime_client = boto3.client(service_name="sagemaker-runtime")
payload = {"inputs": "Why is California a great place to live?"}

response_dolly = sm_runtime_client.invoke_endpoint(
    InferenceComponentName = "IC-dolly-v2-7b",

response_flant5 = sm_runtime_client.invoke_endpoint(
    InferenceComponentName = "IC-flan-t5-xxl",

result_dolly = json.loads(response_dolly['Body'].read().decode())
result_flant5 = json.loads(response_flant5['Body'].read().decode())

Next, you can define separate scaling policies for each model by registering the scaling target and applying the scaling policy to the inference component. Check out the SageMaker Developer Guide for detailed instructions.

The new inference capabilities provide per-model CloudWatch metrics and CloudWatch Logs and can be used with any SageMaker-compatible container image across SageMaker CPU- and GPU-based compute instances. Given support by the container image, you can also use response streaming.

Now available
The new Amazon SageMaker inference capabilities are available today in AWS Regions US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Jakarta, Mumbai, Seoul, Singapore, Sydney, Tokyo), Canada (Central), Europe (Frankfurt, Ireland, London, Stockholm), Middle East (UAE), and South America (São Paulo). For pricing details, visit Amazon SageMaker Pricing. To learn more, visit Amazon SageMaker.

Get started
Log in to the AWS Management Console and deploy your FMs using the new SageMaker inference capabilities today!

— Antje

Leverage foundation models for business analysis at scale with Amazon SageMaker Canvas

Post Syndicated from Irshad Buchh original https://aws.amazon.com/blogs/aws/leverage-foundation-models-for-business-analysis-at-scale-with-amazon-sagemaker-canvas/

Today, I’m excited to introduce a new capability in Amazon SageMaker Canvas to use foundation models (FMs) from Amazon Bedrock and Amazon SageMaker Jumpstart through a no-code experience. This new capability makes it easier for you to evaluate and generate responses from FMs for your specific use case with high accuracy.

Every business has its own set of unique domain-specific vocabulary that generic models are not trained to understand or respond to. The new capability in Amazon SageMaker Canvas bridges this gap effectively. SageMaker Canvas trains the models for you so you don’t need to write any code using our company data so that the model output reflects your business domain and use case such as completing a marketing analysis. For the fine-tuning process, SageMaker Canvas creates a new custom model in your account, and the data used for fine-tuning is not used to train the original FM, ensuring the privacy of your data.

Earlier this year, we expanded support for ready-to-use models in Amazon SageMaker Canvas to include foundation models (FMs). This allows you to access, evaluate, and query FMs such as Claude 2, Amazon Titan, and Jurassic-2 (powered by Amazon Bedrock), as well as publicly available models such as Falcon and MPT (powered by Amazon SageMaker JumpStart) through a no-code interface. Extending this experience, we enabled the ability to query the FMs to generate insights from a set of documents in your own enterprise document index, such as Amazon Kendra. While it is valuable to query FMs, customers want to build FMs that generate responses and insights for their use cases. Starting today, a new capability to build FMs addresses this need to generate custom responses.

To get started, I open the SageMaker Canvas application and in the left navigation pane, I choose My models. I select the New model button, select Fine-tune foundation model, and select Create.


I select the training dataset and can choose up to three models to tune. I choose the input column with the prompt text and the output column with the desired output text. Then, I initiate the fine-tuning process by selecting Fine-tune.


Once the fine-tuning process is completed, SageMaker Canvas gives me an analysis of the fine-tuned model with different metrics such as perplexity and loss curves, training loss, validation loss, and more. Additionally, SageMaker Canvas provides a model leaderboard that gives me the ability to measure and compare metrics around model quality for the generated models.


Now, I am ready to test the model and compare responses with the original base model. To test, I select Test in Ready-to-use models from the Analyze page. The fine-tuned model is automatically deployed and is now available for me to chat and compare responses.


Now, I am ready to generate and evaluate insights specific to my use case. The icing on the cake was to achieve this without writing a single line of code.

Learn more

Go build!

— Irshad

PS: Writing a blog post at AWS is always a team effort, even when you see only one name under the post title. In this case, I want to thank Shyam Srinivasan for his technical assistance.

Amazon SageMaker Clarify makes it easier to evaluate and select foundation models (preview)

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/amazon-sagemaker-clarify-makes-it-easier-to-evaluate-and-select-foundation-models-preview/

I’m happy to share that Amazon SageMaker Clarify now supports foundation model (FM) evaluation (preview). As a data scientist or machine learning (ML) engineer, you can now use SageMaker Clarify to evaluate, compare, and select FMs in minutes based on metrics such as accuracy, robustness, creativity, factual knowledge, bias, and toxicity. This new capability adds to SageMaker Clarify’s existing ability to detect bias in ML data and models and explain model predictions.

The new capability provides both automatic and human-in-the-loop evaluations for large language models (LLMs) anywhere, including LLMs available in SageMaker JumpStart, as well as models trained and hosted outside of AWS. This removes the heavy lifting of finding the right model evaluation tools and integrating them into your development environment. It also simplifies the complexity of trying to adopt academic benchmarks to your generative artificial intelligence (AI) use case.

Evaluate FMs with SageMaker Clarify
With SageMaker Clarify, you now have a single place to evaluate and compare any LLM based on predefined criteria during model selection and throughout the model customization workflow. In addition to automatic evaluation, you can also use the human-in-the-loop capabilities to set up human reviews for more subjective criteria, such as helpfulness, creative intent, and style, by using your own workforce or managed workforce from SageMaker Ground Truth.

To get started with model evaluations, you can use curated prompt datasets that are purpose-built for common LLM tasks, including open-ended text generation, text summarization, question answering (Q&A), and classification. You can also extend the model evaluation with your own custom prompt datasets and metrics for your specific use case. Human-in-the-loop evaluations can be used for any task and evaluation metric. After each evaluation job, you receive an evaluation report that summarizes the results in natural language and includes visualizations and examples. You can download all metrics and reports and also integrate model evaluations into SageMaker MLOps workflows.

In SageMaker Studio, you can find Model evaluation under Jobs in the left menu. You can also select Evaluate directly from the model details page of any LLM in SageMaker JumpStart.

Evaluate foundation models with Amazon SageMaker Clarify

Select Evaluate a model to set up the evaluation job. The UI wizard will guide you through the selection of automatic or human evaluation, model(s), relevant tasks, metrics, prompt datasets, and review teams.

Evaluate foundation models with Amazon SageMaker Clarify

Once the model evaluation job is complete, you can view the results in the evaluation report.

Evaluate foundation models with Amazon SageMaker Clarify

In addition to the UI, you can also start with example Jupyter notebooks that walk you through step-by-step instructions on how to programmatically run model evaluation in SageMaker.

Evaluate models anywhere with the FMEval open source library
To run model evaluation anywhere, including models trained and hosted outside of AWS, use the FMEval open source library. The following example demonstrates how to use the library to evaluate a custom model by extending the ModelRunner class.

For this demo, I choose GPT-2 from the Hugging Face model hub and define a custom HFModelConfig and HuggingFaceCausalLLMModelRunner class that works with causal decoder-only models from the Hugging Face model hub such as GPT-2. The example is also available in the FMEval GitHub repo.

!pip install fmeval

# ModelRunners invoke FMs
from amazon_fmeval.model_runners.model_runner import ModelRunner

# Additional imports for custom model
import warnings
from dataclasses import dataclass
from typing import Tuple, Optional
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

class HFModelConfig:
    model_name: str
    max_new_tokens: int
    normalize_probabilities: bool = False
    seed: int = 0
    remove_prompt_from_generated_text: bool = True

class HuggingFaceCausalLLMModelRunner(ModelRunner):
    def __init__(self, model_config: HFModelConfig):
        self.config = model_config
        self.model = AutoModelForCausalLM.from_pretrained(self.config.model_name)
        self.tokenizer = AutoTokenizer.from_pretrained(self.config.model_name)

    def predict(self, prompt: str) -> Tuple[Optional[str], Optional[float]]:
        input_ids = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
        generations = self.model.generate(
        generation_contains_input = (
            input_ids["input_ids"][0] == generations[0][: input_ids["input_ids"].shape[1]]
        if self.config.remove_prompt_from_generated_text and not generation_contains_input:
                "Your model does not return the prompt as part of its generations. "
                "`remove_prompt_from_generated_text` does nothing."
        if self.config.remove_prompt_from_generated_text and generation_contains_input:
            output = self.tokenizer.batch_decode(generations[:, input_ids["input_ids"].shape[1] :])[0]
            output = self.tokenizer.batch_decode(generations, skip_special_tokens=True)[0]

        with torch.inference_mode():
            input_ids = self.tokenizer(self.tokenizer.bos_token + prompt, return_tensors="pt")["input_ids"]
            model_output = self.model(input_ids, labels=input_ids)
            probability = -model_output[0].item()

        return output, probability

Next, create an instance of HFModelConfig and HuggingFaceCausalLLMModelRunner with the model information.

hf_config = HFModelConfig(model_name="gpt2", max_new_tokens=32)
model = HuggingFaceCausalLLMModelRunner(model_config=hf_config)

Then, select and configure the evaluation algorithm.

# Let's evaluate the FM for FactualKnowledge
from amazon_fmeval.fmeval import get_eval_algorithm
from amazon_fmeval.eval_algorithms.factual_knowledge import FactualKnowledgeConfig

eval_algorithm_config = FactualKnowledgeConfig("<OR>")
eval_algorithm = get_eval_algorithm("factual_knowledge", eval_algorithm_config)

Let’s first test with one sample. The evaluation score is the percentage of factually correct responses.

model_output = model.predict("London is the capital of")[0]

    target_output="UK<OR>England<OR>United Kingdom", 
the UK, and the UK is the largest producer of food in the world.

The UK is the world's largest producer of food in the world.
[EvalScore(name='factual_knowledge', value=1)]

Although it’s not a perfect response, it includes “UK.”

Next, you can evaluate the FM using built-in datasets or define your custom dataset. If you want to use a custom evaluation dataset, create an instance of DataConfig:

config = DataConfig(

eval_output = eval_algorithm.evaluate(
    prompt_template="$feature", #$feature is replaced by the input value in the dataset 

The evaluation results will return a combined evaluation score across the dataset and detailed results for each model input stored in a local output path.

Join the preview
FM evaluation with Amazon SageMaker Clarify is available today in public preview in AWS Regions US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Tokyo), Europe (Frankfurt), and Europe (Ireland). The FMEval open source library] is available on GitHub. To learn more, visit Amazon SageMaker Clarify.

Get started
Log in to the AWS Management Console and start evaluating your FMs with SageMaker Clarify today!

— Antje

Amazon Redshift adds new AI capabilities, including Amazon Q, to boost efficiency and productivity

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/amazon-redshift-adds-new-ai-capabilities-to-boost-efficiency-and-productivity/

Amazon Redshift puts artificial intelligence (AI) at your service to optimize efficiencies and make you more productive with two new capabilities that we are launching in preview today.

First, Amazon Redshift Serverless becomes smarter. It scales capacity proactively and automatically along dimensions such as the complexity of your queries, their frequency, the size of the dataset, and so on to deliver tailored performance optimizations. This allows you to spend less time tuning your data warehouse instances and more time getting value from your data.

Second, Amazon Q generative SQL in Amazon Redshift Query Editor generates SQL recommendations from natural language prompts. This helps you to be more productive in extracting insights from your data.

Let’s start with Amazon Redshift Serverless
When you use Amazon Redshift Serverless, you can now opt in for a preview of AI-driven scaling and optimizations. When enabled, the system observes and learns from your usage patterns, such as the concurrent number of queries, their complexity, and the time it takes to run them. Then, it automatically optimizes your serverless endpoint to meet your price performance target. Based on AWS internal testing, this new capability may give you up to ten times better price performance for variable workloads without any manual intervention.

AI-driven scaling and optimizations eliminate the time and effort to manually resize your workgroup and plan background optimizations based on workload needs. It continually runs automatic optimizations when they are most valuable for better performance, avoiding performance cliffs and time-outs.

This new capability goes beyond the existing self-tuning capabilities of Amazon Redshift Serverless, such as machine learning (ML)-enhanced techniques to adjust your compute, modify the physical schema of the database, create or drop materialized views as needed (the one we manage automatically, not yours), and vacuum tables. This new capability brings more intelligence to decide how to adjust the compute, what background optimizations are required, and when to apply them, and it makes its decisions based on more dimensions. We also orchestrate ML-based optimizations for materialized views, table optimizations, and workload management when your queries need it.

During the preview, you must opt in to enable these AI-driven scaling and optimizations on your workgroups. You configure the system to balance the optimization for price or performance. There is only one slider to adjust in the console.

Redshift serverless - AI driven workgoups

As usual, you can track resource usage and associated changes through the console, Amazon CloudWatch metrics, and the system table SYS_SERVERLESS_USAGE.

Now, let’s look at Amazon Q generative SQL in Amazon Redshift Query Editor
What if you could use generative AI to help analysts write effective SQL queries more rapidly? This is the new experience we introduce today in Amazon Redshift Query Editor, our web-based SQL editor.

You can now describe the information you want to extract from your data in natural language, and we generate the SQL query recommendations for you. Behind the scenes, Amazon Q generative SQL uses a large language model (LLM) and Amazon Bedrock to generate the SQL query. We use different techniques, such as prompt engineering and Retrieval Augmented Generation (RAG), to query the model based on your context: the database you’re connected to, the schema you’re working on, your query history, and optionally the query history of other users connected to the same endpoint. The system also remembers previous questions. You can ask it to refine a previously generated query.

The SQL generation model uses metadata specific to your data schema to generate relevant queries. For example, it uses the table and column names and the relationship between the tables in your database. In addition, your database administrator can authorize the model to use the query history of all users in your AWS account to generate even more relevant SQL statements. We don’t share your query history with other AWS accounts and we don’t train our generation models with any data coming from your AWS account. We maintain the high level of privacy and security that you expect from us.

Using generated SQL queries helps you to get started when discovering new schemas. It does the heavy lifting of discovering the column names and relationships between tables for you. Senior analysts also benefit from asking what they want in natural language and having the SQL statement automatically generated. They can review the queries and run them directly from their notebook.

Let’s explore a schema and extract information
For this demo, let’s pretend I am a data analyst at a company that sells concert tickets. The database schema and data are available for you to download. My manager asks me to analyze the ticket sales data to send a thank you note with discount coupons to the highest-spending customers in Seattle.

I connect to Amazon Redshift Query Editor and connect the analytic endpoint. I create a new tab for a Notebook (SQL generation is available from notebooks only).

Instead of writing a SQL statement, I open the chat panel and type, “Find the top five users from Seattle who bought the most number of tickets in 2022.” I take the time to verify the generated SQL statement. It seems correct, so I decide to run it. I select Add to notebook and then Run. The query returns the list of the top five buyers in Seattle.

sql generation - top 5 users

I had no previous knowledge of the data schema, and I did not type a single line of SQL to find the information I needed.

But generative SQL is not limited to a single interaction. I can chat with it to dynamically refine the queries. Here is another example.

I ask “Which state has the most venues?” Generative SQL proposes the following query. The answer is New York, with 49 venues, if you’re curious.

generative sql chat 01

I changed my mind, and I want to know the top three cities with the most venues. I simply rephrase my question: “What about the top three venues?

generative sql chat 02

I add the query to the notebook and run it. It returns the expected result.

generative sql chat 03

Best practices for prompting
Here are a couple of tips and tricks to get the best results out of your prompts.

Be specific – When asking questions in natural language, be as specific as possible to help the system understand exactly what you need. For example, instead of writing “find the top venues that sold the most tickets,” provide more details like “find the names of the top three venues that sold the most tickets in 2022.” Use consistent entity names like venue, ticket, and location instead of referring to the same entity in different ways, which can confuse the system.

Iterate – Break your complex requests into multiple simple statements that are easier for the system to interpret. Iteratively ask follow-up questions to get more detailed analysis from the system. For example, start by asking, “Which state has the most venues?” Then, based on the response, ask a follow-up question like “Which is the most popular venue from this state?”

Verify – Review the generated SQL before running it to ensure accuracy. If the generated SQL query has errors or does not match your intent, provide instructions to the system on how to correct it instead of rephrasing the entire request. For example, if the query is missing a filter clause on year, write “provide venues from year 2022.”

Availability and pricing
AI-driven scaling and optimizations are in preview in six AWS Regions: US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland, Stockholm). They come at no additional cost. You pay only for the compute capacity your data warehouse consumes when it is active. Pricing is per Redshift Processing Unit (RPU) per hour. The billing is per second of used capacity. The pricing page for Amazon Redshift has the details.

Amazon Q generative SQL for Amazon Redshift Query Editor is in preview in two AWS Regions today: US East (N. Virginia) and US West (Oregon). There is no charge during the preview period.

These are two examples of how AI helps to optimize performance and increase your productivity, either by automatically adjusting the price-performance ratio of your Amazon Redshift Serverless endpoints or by generating correct SQL statements from natural language prompts.

Previews are essential for us to capture your feedback before we make these capabilities available for all. Experiment with these today and let us know what you think on the re:Post forums or using the feedback button on the bottom left side of the console.

— seb

AWS Clean Rooms Differential Privacy enhances privacy protection of your users data (preview)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-clean-rooms-differential-privacy-enhances-privacy-protection-of-your-users-data-preview/

Starting today, you can use AWS Clean Rooms Differential Privacy (preview) to help protect the privacy of your users with mathematically backed and intuitive controls in a few steps. As a fully managed capability of AWS Clean Rooms, no prior differential privacy experience is needed to help you prevent the reidentification of your users.

AWS Clean Rooms Differential Privacy obfuscates the contribution of any individual’s data in generating aggregate insights in collaborations so that you can run a broad range of SQL queries to generate insights about advertising campaigns, investment decisions, clinical research, and more.

Quick overview on differential privacy
Differential privacy is not new. It is a strong, mathematical definition of privacy compatible with statistical and machine learning based analysis, and has been used by the United States Census Bureau as well as companies with vast amounts of data.

Differential privacy helps with a wide variety of use cases involving large datasets, where adding or removing a few individuals has a small impact on the overall result, such as population analyses using count queries, histograms, benchmarking, A/B testing, and machine learning.

The following illustration shows how differential privacy works when it is applied to SQL queries.

When an analyst runs a query, differential privacy adds a carefully calibrated amount of error (also referred to as noise) to query results at run-time, masking the contribution of individuals while still keeping the query results accurate enough to provide meaningful insights. The noise is carefully fine-tuned to mask the presence or absence of any possible individual in the dataset.

Differential privacy also has another component called privacy budget. The privacy budget is a finite resource consumed each time a query is run and thus controls the number of queries that can be run on your datasets, helping ensure that the noise cannot be averaged out to reveal any private information about an individual. When the privacy budget is fully exhausted, no more queries can be run on your tables until it is increased or refreshed.

However, differential privacy is not easy to implement because this technique requires an in-depth understanding of mathematically rigorous formulas and theories to apply it effectively. Configuring differential privacy is also a complex task because customers need to calculate the right level of noise in order to preserve the privacy of their users without negatively impacting the utility of query results.

Customers also want to enable their partners to conduct a wide variety of analyses including highly complex and customized queries on their data. This requirement is hard to support with differential privacy because of the intricate nature of the calculations involved in calibrating the noise while processing various query components such as aggregations, joins, and transformations.

We created AWS Clean Rooms Differential Privacy to help you protect the privacy of your users with mathematically backed controls in a few clicks.

How differential privacy works in AWS Clean Rooms
While differential privacy is quite a sophisticated technique, AWS Clean Rooms Differential Privacy makes it easy for you to apply it and protect the privacy of your users with mathematically backed, flexible, and intuitive controls. You can begin using it with just a few steps after starting or joining an AWS Clean Rooms collaboration as a member with abilities to contribute data.

You create a configured table, which is a reference to your table in the AWS Glue Data Catalog, and choose to turn on differential privacy while adding a custom analysis rule to the configured table.

Next, you associate the configured table to your AWS Clean Rooms collaboration and configure a differential privacy policy in the collaboration to make your table available for querying. You can use a default policy to quickly complete the setup or customize it to meet your specific requirements. As part of this step, you will configure the following:

Privacy budget
Quantified as a value that we call epsilon, the privacy budget controls the level of privacy protection. It is a common, finite resource that is applied for all of your tables protected with differential privacy in the collaboration because the goal is to preserve the privacy of your users whose information can be present in multiple tables. The privacy budget is consumed every time a query is run on your tables. You have the flexibility to increase the privacy budget value any time during the collaboration and automatically refresh it each calendar month.

Noise added per query
Measured in terms of the number of users whose contributions you want to obscure, this input parameter governs the rate at which the privacy budget is depleted.

In general, you need to balance your privacy needs against the number of queries you want to permit and the accuracy of those queries. AWS Clean Rooms makes it easy for you to complete this step by helping you understand the resulting utility you are providing to your collaboration partner. You can also use the interactive examples to understand how your chosen settings would impact the results for different types of SQL queries.

Now that you have successfully enabled differential privacy protection for your data, let’s see AWS Clean Rooms Differential Privacy in action. For this demo, let’s assume I am your partner in the AWS Clean Rooms collaboration.

Here, I’m running a query to count the number of overlapping customers and the result shows there are 3,227,643 values for tv.customer_id.

Now, if I run the same query again after removing records about an individual from coffee_customers table, it shows a different result, 3,227,604 tv.customer_id. This variability in the query results prevents me from identifying the individuals from observing the difference in query results.

I can also see the impact of differential privacy, including the remaining queries I can run.

Available for preview
Join this preview and start protecting the privacy of your users with AWS Clean Rooms Differential Privacy. During this preview period, you can use AWS Clean Rooms Differential Privacy wherever AWS Clean Rooms is available. To learn more on how to get started, visit the AWS Clean Rooms Differential Privacy page.

Happy collaborating!

AWS Clean Rooms ML helps customers and partners apply ML models without sharing raw data (preview)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-clean-rooms-ml-helps-customers-and-partners-apply-ml-models-without-sharing-raw-data-preview/

Today, we’re introducing AWS Clean Rooms ML (preview), a new capability of AWS Clean Rooms that helps you and your partners apply machine learning (ML) models on your collective data without copying or sharing raw data with each other. With this new capability, you can generate predictive insights using ML models while continuing to protect your sensitive data.

During this preview, AWS Clean Rooms ML introduces its first model specialized to help companies create lookalike segments for marketing use cases. With AWS Clean Rooms ML lookalike, you can train your own custom model, and you can invite partners to bring a small sample of their records to collaborate and generate an expanded set of similar records while protecting everyone’s underlying data.

In the coming months, AWS Clean Rooms ML will release a healthcare model. This will be the first of many models that AWS Clean Rooms ML will support next year.

AWS Clean Rooms ML helps you to unlock various opportunities for you to generate insights. For example:

  • Airlines can take signals about loyal customers, collaborate with online booking services, and offer promotions to users with similar characteristics.
  • Auto lenders and car insurers can identify prospective auto insurance customers who share characteristics with a set of existing lease owners.
  • Brands and publishers can model lookalike segments of in-market customers and deliver highly relevant advertising experiences.
  • Research institutions and hospital networks can find candidates similar to existing clinical trial participants to accelerate clinical studies (coming soon).

AWS Clean Rooms ML lookalike modeling helps you apply an AWS managed, ready-to-use model that is trained in each collaboration to generate lookalike datasets in a few clicks, saving months of development work to build, train, tune, and deploy your own model.

How to use AWS Clean Rooms ML to generate predictive insights
Today I will show you how to use lookalike modeling in AWS Clean Rooms ML and assume you have already set up a data collaboration with your partner. If you want to learn how to do that, check out the AWS Clean Rooms Now Generally Available — Collaborate with Your Partners without Sharing Raw Data post.

With your collective data in the AWS Clean Rooms collaboration, you can work with your partners to apply ML lookalike modeling to generate a lookalike segment. It works by taking a small sample of representative records from your data, creating a machine learning (ML) model, then applying the particular model to identify an expanded set of similar records from your business partner’s data.

The following screenshot shows the overall workflow for using AWS Clean Rooms ML.

By using AWS Clean Rooms ML, you don’t need to build complex and time-consuming ML models on your own. AWS Clean Rooms ML trains a custom, private ML model, which saves months of your time while still protecting your data.

Eliminating the need to share data
As ML models are natively built within the service, AWS Clean Rooms ML helps you protect your dataset and customer’s information because you don’t need to share your data to build your ML model.

You can specify the training dataset using the AWS Glue Data Catalog table, which contains user-item interactions.

Under Additional columns to train, you can define numerical and categorical data. This is useful if you need to add more features to your dataset, such as the number of seconds spent watching a video, the topic of an article, or the product category of an e-commerce item.

Applying custom-trained AWS-built models
Once you have defined your training dataset, you can now create a lookalike model. A lookalike model is a machine learning model used to find similar profiles in your partner’s dataset without either party having to share their underlying data with each other.

When creating a lookalike model, you need to specify the training dataset. From a single training dataset, you can create many lookalike models. You also have the flexibility to define the date window in your training dataset using Relative range or Absolute range. This is useful when you have data that is constantly updated within AWS Glue, such as articles read by users.

Easy-to-tune ML models
After you create a lookalike model, you need to configure it to use in AWS Clean Rooms collaboration. AWS Clean Rooms ML provides flexible controls that enable you and your partners to tune the results of the applied ML model to garner predictive insights.

On the Configure lookalike model page, you can choose which Lookalike model you want to use and define the Minimum matching seed size you need. This seed size defines the minimum number of profiles in your seed data that overlap with profiles in the training data.

You also have the flexibility to choose whether the partner in your collaboration receives metrics in Metrics to share with other members.

With your lookalike models properly configured, you can now make the ML models available for your partners by associating the configured lookalike model with a collaboration.

Creating lookalike segments
Once the lookalike models have been associated, your partners can now start generating insights by selecting Create lookalike segment and choosing the associated lookalike model for your collaboration.

Here on the Create lookalike segment page, your partners need to provide the Seed profiles. Examples of seed profiles include your top customers or all customers who purchased a specific product. The resulting lookalike segment will contain profiles from the training data that are most similar to the profiles from the seed.

Lastly, your partner will get the Relevance metrics as the result of the lookalike segment using the ML models. At this stage, you can use the Score to make a decision.

Export data and use programmatic API
You also have the option to export the lookalike segment data. Once it’s exported, the data is available in JSON format and you can process this output by integrating with AWS Clean Rooms API and your applications.

Join the preview
AWS Clean Rooms ML is now in preview and available via AWS Clean Rooms in US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Seoul, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland, London). Support for additional models is in the works.

Learn how to apply machine learning with your partners without sharing underlying data on the AWS Clean Rooms ML page.

Happy collaborating!
— Donnie

Analyze large amounts of graph data to get insights and find trends with Amazon Neptune Analytics

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/introducing-amazon-neptune-analytics-a-high-performance-graph-analytics/

I am happy to announce the general availability of Amazon Neptune Analytics, a new analytics database engine that makes it faster for data scientists and application developers to quickly analyze large amounts of graph data. With Neptune Analytics, you can now quickly load your dataset from Amazon Neptune or your data lake on Amazon Simple Storage Service (Amazon S3), run your analysis tasks in near real time, and optionally terminate your graph afterward.

Graph data enables the representation and analysis of intricate relationships and connections within diverse data domains. Common applications include social networks, where it aids in identifying communities, recommending connections, and analyzing information diffusion. In supply chain management, graphs facilitate efficient route optimization and bottleneck identification. In cybersecurity, they reveal network vulnerabilities and identify patterns of malicious activity. Graph data finds application in knowledge management, financial services, digital advertising, and network security, performing tasks such as identifying money laundering networks in banking transactions and predicting network vulnerabilities.

Since the launch of Neptune in May 2018, thousands of customers have embraced the service for storing their graph data and performing updates and deletion on specific subsets of the graph. However, analyzing data for insights often involves loading the entire graph into memory. For instance, a financial services company aiming to detect fraud may need to load and correlate all historical account transactions.

Performing analyses on extensive graph datasets, such as running common graph algorithms, requires specialized tools. Utilizing separate analytics solutions demands the creation of intricate pipelines to transfer data for processing, which is challenging to operate, time-consuming, and prone to errors. Furthermore, loading large datasets from existing databases or data lakes to a graph analytic solution can take hours or even days.

Neptune Analytics offers a fully managed graph analytics experience. It takes care of the infrastructure heavy lifting, enabling you to concentrate on problem-solving through queries and workflows. Neptune Analytics automatically allocates compute resources according to the graph’s size and quickly loads all the data in memory to run your queries in seconds. Our initial benchmarking shows that Neptune Analytics loads data from Amazon S3 up to 80x faster than existing AWS solutions.

Neptune Analytics supports 5 families of algorithms covering 15 different algorithms, each with multiple variants. For example, we provide algorithms for path-finding, detecting communities (clustering), identifying important data (centrality), and quantifying similarity. Path-finding algorithms are used for use cases such as route planning for supply chain optimization. Centrality algorithms like page rank identify the most influential sellers in a graph. Algorithms like connected components, clustering, and similarity algorithms can be used for fraud-detection use cases to determine whether the connected network is a group of friends or a fraud ring formed by a set of coordinated fraudsters.

Neptune Analytics facilitates the creation of graph applications using openCypher, presently one of the widely adopted graph query languages. Developers, business analysts, and data scientists appreciate openCypher’s SQL-inspired syntax, finding it familiar and structured for composing graph queries.

Let’s see it at work
As we usually do on the AWS News blog, let’s show how it works. For this demo, I first navigate to Neptune in the AWS Management Console. There is a new Analytics section on the left navigation pane. I select Graphs and then Create graph.

Neptune Analytics - create graph 1

On the Create graph page, I enter the details of my graph analytics database engine. I won’t detail each parameter here; their names are self-explanatory.

Neptune Analytics - Create graph 1

Pay attention to Allow from public because, the vast majority of the time, you want to keep your graph only available from the boundaries of your VPC. I also create a Private endpoint to allow private access from machines and services inside my account VPC network.

Neptune Analytics - Create graph 2

In addition to network access control, users will need proper IAM permissions to access the graph.

Finally, I enable Vector search to perform similarity search using embeddings in the dataset. The dimension of the vector depends on the large language model (LLM) that you use to generate the embedding.

Neptune Analytics - Create graph 3

When I am ready, I select Create graph (not shown here).

After a few minutes, my graph is available. Under Connectivity & security, I take note of the Endpoint. This is the DNS name I will use later to access my graph from my applications.

I can also create Replicas. A replica is a warm standby copy of the graph in another Availability Zone. You might decide to create one or more replicas for high availability. By default, we create one replica, and depending on your availability requirements, you can choose not to create replicas.

Neptune Analytics - create graph 3

Business queries on graph data
Now that the Neptune Analytics graph is available, let’s load and analyze data. For the rest of this demo, imagine I’m working in the finance industry.

I have a dataset obtained from the US Securities and Exchange Commission (SEC). This dataset contains the list of positions held by investors that have more than $100 million in assets. Here is a diagram to illustrate the structure of the dataset I use in this demo.

Nuptune graph analytics - dataset structure

I want to get a better understanding of the positions held by one investment firm (let’s name it “Seb’s Investments LLC”). I wonder what its top five holdings are and who else holds more than $1 billion in the same companies. I am also curious to know what are other investment companies that have a similar portfolio as Seb’s Investments LLC.

To start my analysis, I create a Jupyter notebook in the Neptune section of the AWS Management Console. In the notebook, I first define my analytics endpoint and load the data set from an S3 bucket. It takes only 18 seconds to load 17 million records.

Neptune Analytics - load data

Then, I start to explore the dataset using openCypher queries. I start by defining my parameters:

params = {'name': "Seb's Investments LLC", 'quarter': '2023Q4'}

First, I want to know what the top five holdings are for Seb’s Investments LLC in this quarter and who else holds more than $1 billion in the same companies. In openCypher, it translates to the query hereafter. The $name parameter’s value is “Seb’s Investment LLC” and the $quarter parameter’s value is 2023Q4.

MATCH p=(h:Holder)-->(hq1)-[o:owns]->(holding)
WHERE h.name = $name AND hq1.name = $quarter
WITH DISTINCT holding as holding, o ORDER BY o.value DESC LIMIT 5
MATCH (holding)<-[o2:owns]-(hq2)<--(coholder:Holder)
WHERE hq2.name = '2023Q4'
WITH sum(o2.value) AS totalValue, coholder, holding
WHERE totalValue > 1000000000
RETURN coholder.name, collect(holding.name)

Neptune Analytics - query 1

Then, I want to know what the other top five companies are that have similar holdings as “Seb’s Investments LLC.” I use the topKByNode() function to perform a vector search.

MATCH (n:Holder)
WHERE n.name = $name
CALL neptune.algo.vectors.topKByNode(n)
YIELD node, score
WHERE score >0
RETURN node.name LIMIT 5

This query identifies a specific Holder node with the name “Seb’s Investments LLC.” Then, it utilizes the Neptune Analytics custom vector similarity search algorithm on the embedding property of the Holder node to find other nodes in the graph that are similar. The results are filtered to include only those with a positive similarity score, and the query finally returns the names of up to five related nodes.

Neptune Analytics - query 2

Pricing and availability
Neptune Analytics is available today in seven AWS Regions: US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Singapore, Tokyo), and Europe (Frankfurt, Ireland).

AWS charges for the usage on a pay-as-you-go basis, with no recurring subscriptions or one-time setup fees.

Pricing is based on configurations of memory-optimized Neptune capacity units (m-NCU). Each m-NCU corresponds to one hour of compute and networking capacity and 1 GiB of memory. You can choose configurations starting with 128 m-NCUs and up to 4096 m-NCUs. In addition to m-NCU, storage charges apply for graph snapshots.

I invite you to read the Neptune pricing page for more details

Neptune Analytics is a new analytics database engine to analyze large graph datasets. It helps you discover insights faster for use cases such as fraud detection and prevention, digital advertising, cybersecurity, transportation logistics, and bioinformatics.

Get started
Log in to the AWS Management Console to give Neptune Analytics a try.

— seb

Amazon Titan Image Generator, Multimodal Embeddings, and Text models are now available in Amazon Bedrock

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/amazon-titan-image-generator-multimodal-embeddings-and-text-models-are-now-available-in-amazon-bedrock/

Today, we’re introducing two new Amazon Titan multimodal foundation models (FMs): Amazon Titan Image Generator (preview) and Amazon Titan Multimodal Embeddings. I’m also happy to share that Amazon Titan Text Lite and Amazon Titan Text Express are now generally available in Amazon Bedrock. You can now choose from three available Amazon Titan Text FMs, including Amazon Titan Text Embeddings.

Amazon Titan models incorporate 25 years of artificial intelligence (AI) and machine learning (ML) innovation at Amazon and offer a range of high-performing image, multimodal, and text model options through a fully managed API. AWS pre-trained these models on large datasets, making them powerful, general-purpose models built to support a variety of use cases while also supporting the responsible use of AI.

You can use the base models as is, or you can privately customize them with your own data. To enable access to Amazon Titan FMs, navigate to the Amazon Bedrock console and select Model access on the bottom left menu. On the model access overview page, choose Manage model access and enable access to the Amazon Titan FMs.

Amazon Titan Models

Let me give you a quick tour of the new models.

Amazon Titan Image Generator (preview)
As a content creator, you can now use Amazon Titan Image Generator to quickly create and refine images using English natural language prompts. This helps companies in advertising, e-commerce, and media and entertainment to create studio-quality, realistic images in large volumes and at low cost. The model makes it easy to iterate on image concepts by generating multiple image options based on the text descriptions. The model can understand complex prompts with multiple objects and generates relevant images. It is trained on high-quality, diverse data to create more accurate outputs, such as realistic images with inclusive attributes and limited distortions.

Titan Image Generator’s image editing features include the ability to automatically edit an image with a text prompt using a built-in segmentation model. The model supports inpainting with an image mask and outpainting to extend or change the background of an image. You can also configure image dimensions and specify the number of image variations you want the model to generate.

In addition, you can customize the model with proprietary data to generate images consistent with your brand guidelines or to generate images in a specific style, for example, by fine-tuning the model with images from a previous marketing campaign. Titan Image Generator also mitigates harmful content generation to support the responsible use of AI. All images generated by Amazon Titan contain an invisible watermark, by default, designed to help reduce the spread of misinformation by providing a discreet mechanism to identify AI-generated images.

Amazon Titan Image Generator in action
You can start using the model in the Amazon Bedrock console by submitting either an English natural language prompt to generate images or by uploading an image for editing. In the following example, I show you how to generate an image with Amazon Titan Image Generator using the AWS SDK for Python (Boto3).

First, let’s have a look at the configuration options for image generation that you can specify in the body of the inference request. For task type, I choose TEXT_IMAGE to create an image from a natural language prompt.

import boto3
import json

bedrock = boto3.client(service_name="bedrock")
bedrock_runtime = boto3.client(service_name="bedrock-runtime")

# ImageGenerationConfig Options:
#   numberOfImages: Number of images to be generated
#   quality: Quality of generated images, can be standard or premium
#   height: Height of output image(s)
#   width: Width of output image(s)
#   cfgScale: Scale for classifier-free guidance
#   seed: The seed to use for reproducibility  

body = json.dumps(
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {
            "text": "green iguana",   # Required
#           "negativeText": "<text>"  # Optional
        "imageGenerationConfig": {
            "numberOfImages": 1,   # Range: 1 to 5 
            "quality": "premium",  # Options: standard or premium
            "height": 768,         # Supported height list in the docs 
            "width": 1280,         # Supported width list in the docs
            "cfgScale": 7.5,       # Range: 1.0 (exclusive) to 10.0
            "seed": 42             # Range: 0 to 214783647

Next, specify the model ID for Amazon Titan Image Generator and use the InvokeModel API to send the inference request.

response = bedrock_runtime.invoke_model(

Then, parse the response and decode the base64-encoded image.

import base64
from PIL import Image
from io import BytesIO

response_body = json.loads(response.get("body").read())
images = [Image.open(BytesIO(base64.b64decode(base64_image))) for base64_image in response_body.get("images")]

for img in images:

Et voilà, here’s the green iguana (one of my favorite animals, actually):

Green iguana generated by Amazon Titan Image Generator

To learn more about all the Amazon Titan Image Generator features, visit the Amazon Titan product page. (You’ll see more of the iguana over there.)

Next, let’s use this image with the new Amazon Titan Multimodal Embeddings model.

Amazon Titan Multimodal Embeddings
Amazon Titan Multimodal Embeddings helps you build more accurate and contextually relevant multimodal search and recommendation experiences for end users. Multimodal refers to a system’s ability to process and generate information using distinct types of data (modalities). With Titan Multimodal Embeddings, you can submit text, image, or a combination of the two as input.

The model converts images and short English text up to 128 tokens into embeddings, which capture semantic meaning and relationships between your data. You can also fine-tune the model on image-caption pairs. For example, you can combine text and images to describe company-specific manufacturing parts to understand and identify parts more effectively.

By default, Titan Multimodal Embeddings generates vectors of 1,024 dimensions, which you can use to build search experiences that offer a high degree of accuracy and speed. You can also configure smaller vector dimensions to optimize for speed and price performance. The model provides an asynchronous batch API, and the Amazon OpenSearch Service will soon offer a connector that adds Titan Multimodal Embeddings support for neural search.

Amazon Titan Multimodal Embeddings in action
For this demo, I create a combined image and text embedding. First, I base64-encode my image, and then I specify either inputText, inputImage, or both in the body of the inference request.

# Maximum image size supported is 2048 x 2048 pixels
with open("iguana.png", "rb") as image_file:
    input_image = base64.b64encode(image_file.read()).decode('utf8')

# You can specify either text or image or both
body = json.dumps(
        "inputText": "Green iguana on tree branch",
        "inputImage": input_image

Next, specify the model ID for Amazon Titan Multimodal Embeddings and use the InvokeModel API to send the inference request.

response = bedrock_runtime.invoke_model(

Let’s see the response.

response_body = json.loads(response.get("body").read())

[0.005087942, -0.004392853, -0.04764151, -0.024312444, 0.049922388, 0.0132532045, 0.014374298, 0.005523709, -0.015199458, 0.02182385, ...]

I redacted the output for brevity. The distance between multimodal embedding vectors, measured with metrics like cosine similarity or euclidean distance, shows how similar or different the represented information is across modalities. Smaller distances mean more similarity, while larger distances mean more dissimilarity.

As a next step, you could build an image database by storing and indexing the multimodal embeddings in a vector store or vector database. To implement text-to-image search, query the database with inputText. For image-to-image search, query the database with inputImage. For image+text-to-image search, query the database with both inputImage and inputText.

Amazon Titan Text
Amazon Titan Text Lite and Amazon Titan Text Express are large language models (LLMs) that support a wide range of text-related tasks, including summarization, translation, and conversational chatbot systems. They can also generate code and are optimized to support popular programming languages and text formats like JSON and CSV.

Titan Text Express – Titan Text Express has a maximum context length of 8,192 tokens and is ideal for a wide range of tasks, such as open-ended text generation and conversational chat, and support within Retrieval Augmented Generation (RAG) workflows.

Titan Text Lite – Titan Text Lite has a maximum context length of 4,096 tokens and is a price-performant version that is ideal for English-language tasks. The model is highly customizable and can be fine-tuned for tasks such as article summarization and copywriting.

Amazon Titan Text in action
For this demo, I ask Titan Text to write an email to my team members suggesting they organize a live stream: “Compose a short email from Antje, Principal Developer Advocate, encouraging colleagues in the developer relations team to organize a live stream to demo our new Amazon Titan V1 models.”

body = json.dumps({
    "inputText": prompt, 

Titan Text FMs support temperature and topP inference parameters to control the randomness and diversity of the response, as well as maxTokenCount and stopSequences to control the length of the response.

Next, choose the model ID for one of the Titan Text models and use the InvokeModel API to send the inference request.

response = bedrock_runtime.invoke_model(
	# Choose modelID
	# Titan Text Express: "amazon.titan-text-express-v1"
	# Titan Text Lite: "amazon.titan-text-lite-v1"

Let’s have a look at the response.

response_body = json.loads(response.get('body').read())
outputText = response_body.get('results')[0].get('outputText')

text = outputText[outputText.index('\n')+1:]
email = text.strip()

Subject: Demo our new Amazon Titan V1 models live!

Dear colleagues,

I hope this email finds you well. I am excited to announce that we have recently launched our new Amazon Titan V1 models, and I believe it would be a great opportunity for us to showcase their capabilities to the wider developer community.

I suggest that we organize a live stream to demo these models and discuss their features, benefits, and how they can help developers build innovative applications. This live stream could be hosted on our YouTube channel, Twitch, or any other platform that is suitable for our audience.

I believe that showcasing our new models will not only increase our visibility but also help us build stronger relationships with developers. It will also provide an opportunity for us to receive feedback and improve our products based on the developer’s needs.

If you are interested in organizing this live stream, please let me know. I am happy to provide any support or guidance you may need. Together, let’s make this live stream a success and showcase the power of Amazon Titan V1 models to the world!

Best regards,
Principal Developer Advocate

Nice. I could send this email right away!

Availability and pricing
Amazon Titan Text FMs are available today in AWS Regions US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore, Tokyo), and Europe (Frankfurt). Amazon Titan Multimodal Embeddings is available today in the AWS Regions US East (N. Virginia) and US West (Oregon). Amazon Titan Image Generator is available in public preview in the AWS Regions US East (N. Virginia) and US West (Oregon). For pricing details, see the Amazon Bedrock Pricing page.

Learn more

Go to the AWS Management Console to start building generative AI applications with Amazon Titan FMs on Amazon Bedrock today!

— Antje

Amazon DynamoDB zero-ETL integration with Amazon OpenSearch Service is now available

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/amazon-dynamodb-zero-etl-integration-with-amazon-opensearch-service-is-now-generally-available/

Today, we are announcing the general availability of Amazon DynamoDB zero-ETL integration with Amazon OpenSearch Service, which lets you perform a search on your DynamoDB data by automatically replicating and transforming it without custom code or infrastructure. This zero-ETL integration reduces the operational burden and cost involved in writing code for a data pipeline architecture, keeping the data in sync, and updating code with frequent application changes, enabling you to focus on your application.

With this zero-ETL integration, Amazon DynamoDB customers can now use the powerful search features of Amazon OpenSearch Service, such as full-text search, fuzzy search, auto-complete, and vector search for machine learning (ML) capabilities to offer new experiences that boost user engagement and improve satisfaction with their applications.

This zero-ETL integration uses Amazon OpenSearch Ingestion to synchronize the data between Amazon DynamoDB and Amazon OpenSearch Service. You choose the DynamoDB table whose data needs to be synchronized and Amazon OpenSearch Ingestion synchronizes the data to an Amazon OpenSearch managed cluster or serverless collection within seconds of it being available.

You can also specify index mapping templates to ensure that your Amazon DynamoDB fields are mapped to the correct fields in your Amazon OpenSearch Service indexes. Also, you can synchronize data from multiple DynamoDB tables into one Amazon OpenSearch Service managed cluster or serverless collection to offer holistic insights across several applications.

Getting started with this zero-ETL integration
With a few clicks, you can synchronize data from DynamoDB to OpenSearch Service. To create an integration between DynamoDB and OpenSearch Service, choose the Integrations menu in the left pane of the DynamoDB console and the DynamoDB table whose data you want to synchronize.

You must turn on point-in-time recovery (PITR) and the DynamoDB Streams feature. This feature allows you to capture item-level changes in your table and push the changes to a stream. Choose Turn on for PITR and enable DynamoDB Streams in the Exports and streams tab.

After turning on PITR and DynamoDB Stream, choose Create to set up an OpenSearch Ingestion pipeline in your account that replicates the data to an OpenSearch Service managed domain.

In the first step, enter a unique pipeline name and set up pipeline capacity and compute resources to automatically scale your pipeline based on the current ingestion workload.

Now you can configure the pre-defined pipeline configuration in YAML file format. You can browse resources to look up and paste information to build the pipeline configuration. This pipeline is a combination of a source part from DyanmoDB settings and a sink part for OpenSearch Service.

You must set multiple IAM roles (sts_role_arn) with the necessary permissions to read data from the DynamoDB table and write to an OpenSearch domain. This role is then assumed by OpenSearch Ingestion pipelines to ensure that the right security posture is always maintained when moving the data from source to destination. To learn more, see Setting up roles and users in Amazon OpenSearch Ingestion in the AWS documentation.

After entering all required values, you can validate the pipeline configuration to ensure that your configuration is valid. To learn more, see Creating Amazon OpenSearch Ingestion pipelines in the AWS documentation.

Take a few minutes to set up the OpenSearch Ingestion pipeline, and you can see your integration is completed in the DynamoDB table.

Now you can search synchronized items in the OpenSearch Dashboards.

Things to know
Here are a couple of things that you should know about this feature:

  • Custom schema – You can specify your custom data schema along with the index mappings used by OpenSearch Ingestion when writing data from Amazon DynamoDB to OpenSearch Service. This experience is added to the console within Amazon DynamoDB so that you have full control over the format of indices that are created on OpenSearch Service.
  • Pricing – There will be no additional cost to use this feature apart from the cost of the existing underlying components. Note that Amazon OpenSearch Ingestion charges OpenSearch Compute Units (OCUs) which will be used to replicate data between Amazon DynamoDB and Amazon OpenSearch Service. Furthermore, this feature uses Amazon DynamoDB streams for the change data capture (CDC) and you will incur the standard costs for Amazon DynamoDB Streams.
  • Monitoring – You can monitor the state of the pipelines by checking the status of the integration on the DynamoDB console or using the OpenSearch Ingestion dashboard. Additionally, you can use Amazon CloudWatch to provide real-time metrics and logs, which lets you to set up alerts in case of a breach of user-defined thresholds.

Now available
Amazon DynamoDB zero-ETL integration with Amazon OpenSearch Service is now generally available in all AWS Regions where OpenSearch Ingestion is available today.


New Amazon Q in QuickSight uses generative AI assistance for quicker, easier data insights (preview)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/new-amazon-q-in-quicksight-uses-generative-ai-assistance-for-quicker-easier-data-insights-preview/

Today, I’m happy to share that Amazon Q in QuickSight is available for preview. Now you can experience the Generative BI capabilities in Amazon QuickSight announced on July 26, as well as two additional capabilities for business users.

Turning insights into impact faster with Amazon Q in QuickSight
With this announcement, business users can now generate compelling sharable stories examining their data, see executive summaries of dashboards surfacing key insights from data in seconds, and confidently answer questions of data not answered by dashboards and reports with a reimagined Q&A experience.

Before we go deeper into each capability, here’s a quick summary:

  • Stories — This is a new and visually compelling way to present and share insights. Stories can automatically generated in minutes using natural language prompts, customized using point-and-click options, and shared securely with others.
  • Executive summaries — With this new capability, Amazon Q helps you to understand key highlights in your dashboard.
  • Data Q&A — This capability provides a new and easy-to-use natural-language Q&A experience to help you get answers for questions beyond what is available in existing dashboards and reports.​​

To get started, you need to enable Preview Q Generative Capabilities in Preview manager.

Once enabled, you’re ready to experience what Amazon Q in QuickSight brings for business users and business analysts building dashboards.

Stories automatically builds formatted narratives
Business users often need to share their findings of data with others to inform team decisions; this has historically involved taking data out of the business intelligence (BI) system. Stories are a new feature enabling business users to create beautifully formatted narratives that describe data, and include visuals, images, and text in document or slide format directly that can easily be shared with others within QuickSight.

Now, business users can use natural language to ask Amazon Q to build a story about their data by starting from the Amazon Q Build menu on an Amazon QuickSight dashboard. Amazon Q extracts data insights and statistics from selected visuals, then uses large language models (LLMs) to build a story in multiple parts, examining what the data may mean to the business and suggesting ideas to achieve specific goals.

For example, a sales manager can ask, “Build me a story about overall sales performance trends. Break down data by product and region. Suggest some strategies for improving sales.” Or, “Write a marketing strategy that uses regional sales trends to uncover opportunities that increase revenue.” Amazon Q will build a story exploring specific data insights, including strategies to grow sales.

Once built, business users get point-and-click tools augmented with artificial intelligence- (AI) driven rewriting capabilities to customize stories using a rich text editor to refine the message, add ideas, and highlight important details.

Stories can also be easily and securely shared with other QuickSight users by email.

Executive summaries deliver a quick snapshot of important information
Executive summaries are now available with a single click using the Amazon Q Build menu in Amazon QuickSight. Amazon QuickSight automatically determines interesting facts and statistics, then use LLMs to write about interesting trends.

This new capability saves time in examining detailed dashboards by providing an at-a-glance view of key insights described using natural language.

The executive summaries feature provides two advantages. First, it helps business users generate all the key insights without the need to browse through tens of visuals on the dashboard and understand changes from each. Secondly, it enables readers to find key insights based on information in the context of dashboards and reports with minimum effort.

New data Q&A experience
Once an interesting insight is discovered, business users frequently need to dig in to understand data more deeply than they can from existing dashboards and reports. Natural language query (NLQ) solutions designed to solve this problem frequently expect that users already know what fields may exist or how they should be combined to answer business questions. However, business users aren’t always experts in underlying data schemas, and their questions frequently come in more general terms, like “How were sales last week in NY?” Or, “What’s our top campaign?”

The new Q&A experience accessed within the dashboards and reports helps business users confidently answer questions about data. It includes AI-suggested questions and a profile of what data can be asked about and automatically generated multi-visual answers with narrative summaries explaining data context.

Furthermore, Amazon Q brings the ability to answer vague questions and offer alternatives for specific data. For example, customers can ask a vague question, such as “Top products,” and Amazon Q will provide an answer that breaks down products by sales and offers alternatives for products by customer count and products by profit. Amazon Q explains answer context in a narrative summarizing total sales, number of products, and picking out the sales for the top product.

Customers can search for specific data values and even a single word such as, for example, the product name “contactmatcher.” Amazon Q returns a complete set of data related to that product and provides a natural language breakdown explaining important insights like total units sold. Specific visuals from the answers can also be added to a pinboard for easy future access.

Watch the demo
To see these new capabilities in action, have a look at the demo.

Things to Know
Here are a few additional things that you need to know:

Join the preview
Amazon Q in QuickSight product page

Happy building!
— Donnie

Introducing Amazon Q, a new generative AI-powered assistant (preview)

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/introducing-amazon-q-a-new-generative-ai-powered-assistant-preview/

Today, we are announcing Amazon Q, a new generative artificial intelligence- (AI)-powered assistant designed for work that can be tailored to your business. You can use Amazon Q to have conversations, solve problems, generate content, gain insights, and take action by connecting to your company’s information repositories, code, data, and enterprise systems. Amazon Q provides immediate, relevant information and advice to employees to streamline tasks, accelerate decision-making and problem-solving, and help spark creativity and innovation at work.

Amazon Q offers user-based plans, so you get features, pricing, and options tailored to how you use the product. Amazon Q can adapt its interactions to each individual user based on the existing identities, roles, and permissions of your business. AWS never uses customers’ content from Amazon Q to train the underlying models. In other words, your company information remains secure and private.

In this post, I’ll give you a quick tour of how you can use Amazon Q for general business use. 

Amazon Q is your business expert
Let’s look at a few examples of how Amazon Q can help business users complete tasks using simple natural language prompts. As a marketing manager, you could ask Amazon Q to transform a press release into a blog post, create a summary of the press release, or create an email draft based on the provided release. Amazon Q searches through your company content, which can include internal style guides, for example, to provide a response appropriate to your company’s brand standards. Then, you could ask Amazon Q to generate tailored social media prompts to promote your story through each of your social media channels. Later, you can ask Amazon Q to analyze the results of your campaign and summarize them for leadership reviews.

Amazon Q

In the following example, I deployed Amazon Q with access to my AWS News Blog posts from 2023 and called the assistant “AWS Blog Expert.”

Amazon Q

Coming back to my previous example, let’s assume I’m a marketing manager and want Amazon Q to help me create social media posts for recent company blog posts.

I enter the following prompt: “Summarize the key insights from Antje’s recent AWS Weekly Roundup posts and craft a compelling social media post that not only highlights the most important points but also encourages engagement. Consider our target audience and aim for a tone that aligns with our brand identity. The social media post should be concise, informative, and enticing to encourage readers to click through and read the full articles. Please ensure the content is shareable and includes relevant hashtags for maximum visibility.”

Amazon Q

Behind the scenes, Amazon Q searches the documents in connected data sources and creates a relevant and detailed suggestion for a social media post based on my blog posts. Amazon Q also tells me which document was used to generate the answer. In this case, it is PDF file of the blog posts in question.

As an administrator, you can define the context for responses, restrict irrelevant topics, and configure whether to respond only using trusted company information or complement responses with knowledge from the underlying model. Restricting responses to trusted company information helps mitigate hallucinations, a common phenomenon where the underlying model generates responses that sound plausible but are based on misinterpreted or nonexistent data.

Amazon Q provides fine-grained access controls that restrict responses to only using data or acting based on the employee’s level of access and provides citations and references to the original sources for fact-checking and traceability. You can choose among 40+ built-in connectors for popular data sources and enterprise systems, including Amazon S3, Google Drive, Microsoft SharePoint, Salesforce, ServiceNow, and Slack.

How to tailor Amazon Q to your business
To tailor Amazon Q to your business, navigate to Amazon Q in the console, select Applications in the left menu, and choose Create application.

Amazon Q

This starts the following workflow.

Step 1. Create application. Provide an application name and create a new or select an existing AWS Identity and Access Management (IAM) service role that Amazon Q is allowed to assume. I call my application AWS-Blog-Expert. Then, choose Create.

Amazon Q

Step 2. Select retriever. A retriever pulls data from the index in real time during a conversation. You can choose between two options: use the Amazon Q native retriever or use an existing Amazon Kendra retriever. The native retriever can connect to the Amazon Q supported data sources. If you already use Amazon Kendra, you can select the existing Amazon Kendra retriever to connect the associated data sources to your Amazon Q application. I select the native retriever option. Then, choose Next.

Amazon Q

Step 3. Connect data sources. Amazon Q comes with built-in connectors for popular data sources and enterprise systems. For this demo, I choose Amazon S3 and configure the data source by pointing to my S3 bucket with the PDFs of my blog posts.

Amazon Q
Once the data source sync is successfully complete and the retriever shows the accurate document count, you can preview the web experience and start a conversation. Note that the data source sync can take from a few minutes to a few hours, depending on the amount and size of data to index.

You can also connect plugins that manage access to enterprise systems, including ServiceNow, Jira, Salesforce, and Zendesk. Plugins enable Amazon Q to perform user-requested tasks, such as creating support tickets or analyzing sales forecasts.

Amazon Q

Preview and deploy web experience
In the application overview, choose Preview web experience. This opens the web experience with the conversational interface to chat with the tailored Amazon Q AWS Blog Expert. In the final step, you deploy the Amazon Q web experience. You can integrate your SAML 2.0–compliant external identity provider (IdP) using IAM. Amazon Q can work with any IdP that’s compliant with SAML 2.0. Amazon Q uses service-initiated single sign-on (SSO) to authenticate users.

Join the preview
Amazon Q is available today in preview in AWS Regions US East (N. Virginia) and US West (Oregon). Visit the product page to learn how Amazon Q can become your expert in your business.

Also, check out the Amazon Q Slack Gateway GitHub repository that shows how to make Amazon Q available to users as a Slack Bot application.Amazon Q Slack Bot

Learn more

— Antje

Upgrade your Java applications with Amazon Q Code Transformation (preview)

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/upgrade-your-java-applications-with-amazon-q-code-transformation-preview/

As our applications age, it takes more and more effort just to keep them secure and running smoothly. Developers managing the upgrades must spend time relearning the intricacies and nuances of breaking changes and performance optimizations others have already discovered in past upgrades. As a result, it’s difficult to balance the focus between new features and essential maintenance work.

Today, we are introducing in preview Amazon Q Code Transformation. This new capability simplifies upgrading and modernizing existing application code using Amazon Q, a new type of assistant powered by generative artificial intelligence (AI). Amazon Q is specifically designed for work and can be tailored to your business.

Amazon Q Code Transformation can perform Java application upgrades now, from version 8 and 11 to version 17, a Java Long-Term Support (LTS) release, and it will soon be able to transform Windows-based .NET Framework applications to cross-platform .NET.

Previously, developers could spend two to three days upgrading each application. Our internal testing shows that the transformation capability can upgrade an application in minutes compared to the days or weeks typically required for manual upgrades, freeing up time to focus on new business requirements. For example, an internal Amazon team of five people successfully upgraded one thousand production applications from Java 8 to 17 in 2 days. It took, on average, 10 minutes to upgrade applications, and the longest one took less than an hour.

Amazon Q Code Transformation automatically analyzes the existing code, generates a transformation plan, and completes the transformation tasks suggested by the plan. While doing so, it identifies and updates package dependencies and refactors deprecated and inefficient code components, switching to new language frameworks and incorporating security best practices. Once complete, you can review the transformed code, complete with build and test results, before accepting the changes.

In this way, you can keep applications updated and supported in just a few steps, gain performance benefits, and remove vulnerabilities from using unsupported versions, freeing up time to focus on new business requirements. Let’s see how this works in practice.

Upgrading a Java application from version 8 to 17
I am using IntelliJ IDEA in this walkthrough (the same is available for Visual Studio Code). To have Amazon Q Code Transformation in my IDE, I install the latest version of the AWS Toolkit for IntelliJ IDEA and sign in using the AWS IAM Identity Center credentials provided by my organization. Note that to access Amazon Q Code Transformation, the CodeWhisperer administrator needs to explicitly give access to Amazon Q features in the profile used by the organization.

I open an old project that I never had the time to update to a more recent version of Java. The project is using Apache Maven to manage the build. The project object model (POM) file (pom.xml), an XML representation of the project, is in the root directory.

First, in the project settings, I check that the project is configured to use the correct SDK version (1.8 in this case). I choose AWS Toolkit on the left pane and then the Amazon Q + CodeWhisperer tab. In the Amazon Q (Preview) section, I choose Transform.

IDE screenshot.

This opens a dialog where I check that the correct Maven module is selected for the upgrade before proceeding with the transformation.

IDE screenshot.

I follow the progress in the Transformation Hub window. The upgrade completes in a few minutes for my small application, while larger ones might take more than an hour to complete.

The end-to-end application upgrade consists of three steps:

  1. Identifying and analyzing the application – The code is copied to a managed environment in the cloud where the build process is set up based on the instructions in the repository. At this stage, the components to be upgraded are identified.
  2. Creating a transformation plan – The code is analyzed to create a transformation plan that lists the steps that Amazon Q Code Transformation will take to upgrade the code, including updating dependencies, building the upgraded code, and then iteratively fixing any build errors encountered during the upgrade.
  3. Code generation, build testing, and finalization – The transformation plan is followed iteratively to update existing code and configuration files, generate new files where needed, perform build validation using the tests provided with the code, and fix issues identified in failed builds.

IDE screenshot.

After a few minutes, the transformation terminates successfully. From here, I can open the plan and a summary of the transformation. I choose View diff to see the proposed changes. In the Apply Patch dialog, I see a recap of the files that have been added, modified, or deleted.

IDE screenshot.

First, I select the pom.xml file and then choose Show Difference (the icon with the left/right arrows) to have a side-by-side view of the current code in the project and the proposed changes. For example, I see that the version of one of the dependencies (Project Lombok) has been increased for compatibility with the target Java version.

IDE screenshot.

In the Java file, the annotations used by the upgraded dependency have been updated. With the new version, @With has been promoted, and @Wither (which was experimental) deprecated. These changes are reflected in the import statements.

IDE screenshot.

There is also a summary file that I keep in the code repo to quickly look up the changes made to complete the upgrade.

I spend some time reviewing the files. Then, I choose OK to accept all changes.

Now the patch has been successfully applied, and the proposed changes merged with the code. I commit changes to my repo and move on to focus on business-critical changes that have been waiting for the migration to be completed.

Things to know
The preview of Amazon Q Code Transformation is available today for customers on the Amazon CodeWhisperer Professional Tier in the AWS Toolkit for IntelliJ IDEA and the AWS Toolkit for Visual Studio Code. To use Amazon Q Code Transformation, the CodeWhisperer administrator needs to give access to the profile used by the organization.

There is no additional cost for using Amazon Q Code Transformation during the preview. You can upgrade Java 8 and 11 applications that are built using Apache Maven to Java version 17. The project must have the POM file (pom.xml) in the root directory. We’ll soon add the option to transform Windows-based .NET Framework applications to cross-platform .NET and help accelerate migrations to Linux.

Once a transformation job is complete, you can use a diff view to verify and accept the proposed changes. The final transformation summary provides details of the dependencies updated and code files changed by Amazon Q Code Transformation. It also provides details of any build failures encountered in the final build of the upgraded code that you can use to fix the issues and complete the upgrade.

Combining Amazon’s long-term investments in automated reasoning and static code analysis with the power of generative AI, Amazon Q Code Transformation incorporates foundation models that we found to be essential for context-specific code transformations that often require updating a long tail of Java libraries with backward-incompatible changes.

In addition to generative AI-powered code transformations built by AWS, Amazon Q Code Transformation uses parts of OpenRewrite to further accelerate Java upgrades for customers. At AWS, many of our services are built with open source components and promoting the long-term sustainability of these communities is critical to us and our customers. That is why it’s important for us to contribute back to communities like OpenRewrite, helping ensure the whole industry can continue to benefit from their innovations. AWS plans to contribute to OpenRewrite recipes and improvements developed as part of Amazon Q Code Transformation to open source.

“The ability for software to adapt at a much faster pace is one of the most fundamental advantages any business can have. That’s why we’re excited to see AWS using OpenRewrite, the open source automated code refactoring technology, as a component of their service,” said Jonathan Schneider, CEO and Co-founder of Moderne (the sponsor of OpenRewrite). “We’re happy to have AWS join the OpenRewrite community and look forward to their contributions to make it even easier to migrate frameworks, patch vulnerabilities, and update APIs.”

Upgrade your Java applications now
Amazon Q Code Transformation product page


Improve developer productivity with generative-AI powered Amazon Q in Amazon CodeCatalyst (preview)

Post Syndicated from Irshad Buchh original https://aws.amazon.com/blogs/aws/improve-developer-productivity-with-generative-ai-powered-amazon-q-in-amazon-codecatalyst-preview/

Today, I’m excited to introduce the preview of new generative artificial intelligence (AI) capabilities within Amazon CodeCatalyst that accelerate software delivery using Amazon Q.

Accelerate feature development – The feature development capability in Amazon Q can help you accelerate the implementation of software development tasks such as adding comments and READMEs, refining issue descriptions, generating small classes and unit tests, and updating CodeCatalyst workflows — tedious and undifferentiated tasks that take up developers’ time. Developers can go from an idea in an issue to fully tested, merge-ready, running code with only natural language inputs, in just a few clicks. AI does the heavy lifting of converting the human prompt to an actionable plan, summarizing source code repositories, generating code, unit tests, and workflows, and summarizing any changes in a pull request which is assigned back to the developer. You can also provide feedback to Amazon Q directly on the published pull request and ask it to generate a new revision. If the code change falls short of expectations, you can create a development environment directly from the pull request, make any necessary adjustments manually, publish a new revision, and proceed with the merge upon approval.

Example: make an API change in an existing application
In the navigation pane, I choose Issues and then I choose Create issue. I give the issue the title, Change the get_all_mysfits() API to return mysfits sorted by the Age attribute. I then assign this issue to Amazon Q and choose Create issue.


Amazon Q will automatically move the issue into the In progress state while it analyzes the issue title and description to formulate a potential solution approach. If there is already some discussion on the issue, it should be summarized in the description to help Q understand what needs to be done. As it works, Amazon Q will report on its progress by leaving comments on the issue at every stage. It will attempt to create a solution based on its understanding of the code already present in the repository and the approach it formulated. If Amazon Q is able to successfully generate a potential solution, it will create a branch and commit code to that branch. It will then create a pull request that will merge the changes into the default branch once approved. Once the pull request is published, Amazon Q will change the issue status to In Review so that you and your team know that the code is now ready for you to review.


Summarize a change – Pull request authors can save time by asking Amazon Q to summarize the change they are publishing for review. Today pull request authors have to write the description manually or they may choose not to write it at all. If the author does not provide a description, it makes it harder for reviewers to understand what changes are being made and why, delaying the review process and slowing down software delivery.

Pull request authors and reviewers can also save time by asking Amazon Q to summarize the comments left on the pull request. The summary is useful for the author because they can easily see common feedback themes. For the reviewers it is useful because they can quickly catch up on the conversation and feedback from themselves and other team members. The overall benefits are streamlined collaboration, accelerated review process, and faster software delivery.

Join the preview
Amazon Q is available in Amazon CodeCatalyst today for spaces in AWS Region US West (Oregon).

Learn more