
AWS Weekly Roundup: Amazon EC2 U7i Instances, Bedrock Converse API, AWS World IPv6 Day and more (June 3, 2024)

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-ec2-u7i-instances-bedrock-converse-api-aws-world-ipv6-day-and-more-june-3-2024/

Life is not always happy; there are difficult times. However, we can share our joys and sufferings with those we work with. The AWS Community is no exception.

Jeff Barr introduced two members of the AWS community who are dealing with health issues. Farouq Mousa is an AWS Community Builder who is fighting brain cancer. Allen Helton is an AWS Serverless Hero whose young daughter is fighting leukemia.

Please donate to support Farouq and Olivia, Allen’s daughter, as they fight their illnesses.

Last week’s launches
Here are some launches that got my attention:

Amazon EC2 high memory U7i Instances – These instances with up to 32 TiB of DDR5 memory and 896 vCPUs are powered by custom fourth generation Intel Xeon Scalable Processors (Sapphire Rapids). These high memory instances are designed to support large, in-memory databases including SAP HANA, Oracle, and SQL Server. To learn more, visit Jeff’s blog post.

New Amazon Connect analytics data lake – You can use a single source for contact center data including contact records, agent performance, Contact Lens insights, and more — eliminating the need to build and maintain complex data pipelines. Your organization can create your own custom reports using Amazon Connect data or combine data queried from third-party sources. To learn more, visit Donnie’s blog post.

Amazon Bedrock Converse API – This API gives developers a consistent way to invoke Amazon Bedrock models, removing the need to adjust for model-specific differences such as inference parameters. With this API, you can write code once and use it with different models in Amazon Bedrock. To learn more, visit Dennis’s blog post to get started.

New Document widget for PartyRock – You can build, use, and share generative AI-powered apps for fun and for boosting personal productivity, using PartyRock. Its widgets display content, accept input, connect with other widgets, and generate outputs like text, images, and chats using foundation models. You can now use the new Document widget to integrate text content from files and documents directly into a PartyRock app.

30 days of alarm history in Amazon CloudWatch – You can view the history of your alarm state changes for up to 30 days prior. Previously, CloudWatch provided 2 weeks of alarm history. This extended history makes it easier to observe past behavior and review incidents over a longer period of time. To learn more, visit the CloudWatch alarms documentation section.

10x faster startup time in Amazon SageMaker Canvas – You can launch SageMaker Canvas in less than a minute and get started with your visual, no-code interface for machine learning 10x faster than before. Now, all new user profiles created in existing or new SageMaker domains can experience this accelerated startup time.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.
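To make the Converse API item above more concrete, here is a minimal sketch of how one request shape could be reused across models. It assumes a boto3 `bedrock-runtime` client is passed in as `client`; the helper names and both model IDs are illustrative assumptions, not taken from the announcement.

```python
from typing import Any, Dict


def build_converse_request(model_id: str, prompt: str,
                           max_tokens: int = 256,
                           temperature: float = 0.5) -> Dict[str, Any]:
    """Build keyword arguments for a Bedrock Runtime `converse` call."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens, "temperature": temperature},
    }


def ask(client: Any, model_id: str, prompt: str) -> str:
    """Send one user message and return the first text block of the reply."""
    response = client.converse(**build_converse_request(model_id, prompt))
    return response["output"]["message"]["content"][0]["text"]


# Only modelId changes between models; the message format stays the same.
# Both IDs are assumptions for illustration; check the Bedrock console for
# the IDs enabled in your account.
MODEL_IDS = [
    "mistral.mistral-small-2402-v1:0",
    "anthropic.claude-3-haiku-20240307-v1:0",
]
example_requests = [
    build_converse_request(model_id, "Summarize IPv6 in one sentence.")
    for model_id in MODEL_IDS
]
```

With boto3 installed and credentials configured, a call would look like `ask(boto3.client("bedrock-runtime"), MODEL_IDS[0], "Hello")`.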

Other AWS news
Here are some additional news items and a Twitch show that you might find interesting:

Let us manage your relational database! – Jeff Barr ran a poll to better understand why some AWS customers still choose to host their own databases in the cloud. Working backwards, he highlights four issues that AWS managed database services address. Consider these before hosting your own database.

Amazon Bedrock Serverless Prompt Chaining – This repository provides examples of using AWS Step Functions and Amazon Bedrock to build complex, serverless, and highly scalable generative AI applications with prompt chaining.

AWS Merch Store Spring Sale – Do you want to buy AWS branded t-shirts, hats, bags, and so on? Get 15% off on all items now through June 7th.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS World IPv6 Day – Join us for a free in-person celebration event on June 6, with technical presentations from AWS experts plus a workshop and whiteboarding session. You will learn how to get started with IPv6 and hear from customers who have started on the journey of IPv6 adoption. Find the city nearest you: San Francisco, Seattle, New York, London, Mumbai, Bangkok, Singapore, Kuala Lumpur, Beijing, Manila, and Sydney.

AWS Summits — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Stockholm (June 4), Madrid (June 5), and Washington, DC (June 26–27).

AWS re:Inforce — Join us for AWS re:Inforce (June 10–12) in Philadelphia, PA. AWS re:Inforce is a learning conference focused on AWS security solutions, cloud security, compliance, and identity. Connect with the AWS teams that build the security tools and meet AWS customers to learn about their security journeys.

AWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), New Zealand (August 15), Nigeria (August 24), and New York (August 28).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Channy

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Simplify custom contact center insights with Amazon Connect analytics data lake

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/simplify-custom-contact-center-insights-with-amazon-connect-analytics-data-lake/

Analytics are vital to the success of a contact center. Having insights into each touchpoint of the customer experience allows you to accurately measure performance and adapt to shifting business demands. While you can find common metrics in the Amazon Connect console, sometimes you need to have more details and custom requirements for reporting based on the unique needs of your business. 

Starting today, the Amazon Connect analytics data lake is generally available. Announced in preview last year, this new capability helps you eliminate the need to build and maintain complex data pipelines. Amazon Connect analytics data lake is zero-ETL capable, so no extract, transform, or load (ETL) is needed.

Here’s a quick look at the Amazon Connect analytics data lake:

Improving your customer experience with Amazon Connect
Amazon Connect analytics data lake helps you to unify disparate data sources, including customer contact records and agent activity, into a single location. By having your data in a centralized location, you now have access to analyze contact center performance and gain insights while reducing the costs associated with implementing complex data pipelines.

With Amazon Connect analytics data lake, you can access and analyze contact center data, such as contact trace records and Amazon Connect Contact Lens data. This gives you the flexibility to prepare and analyze data with Amazon Athena and use the business intelligence (BI) tools of your choice, such as Amazon QuickSight and Tableau.

Get started with the Amazon Connect analytics data lake
To get started with the Amazon Connect analytics data lake, you’ll first need to have an Amazon Connect instance set up. You can follow the steps in the Create an Amazon Connect instance page to create a new Amazon Connect instance. Because I’ve already created my Amazon Connect instance, I will go straight to showing you how you can get started with Amazon Connect analytics data lake.

First, I navigate to the Amazon Connect console and select my instance.

Then, on the next page, I can set up my analytics data lake by navigating to Analytics tools and selecting Add data share.

This brings up a pop-up dialog, and I first need to define the target AWS account ID. With this option, I can set up a centralized account to receive all data from Amazon Connect instances running in multiple accounts. Then, under Data types, I can select the types I need to share with the target AWS account. To learn more about the data types that you can share in the Amazon Connect analytics data lake, please visit Associate tables for Analytics data lake.

Once it’s done, I can see the list of all the target AWS account IDs with which I have shared all the data types.

Besides using the AWS Management Console, I can also use the AWS Command Line Interface (AWS CLI) to associate my tables with the analytics data lake. The following is a sample command:

$ aws connect batch-associate-analytics-data-set --cli-input-json file://input_batch_association.json

Where input_batch_association.json is a JSON file that contains association details. Here’s a sample:

{
	"InstanceId": "<YOUR_INSTANCE_ID>",
	"DataSetIds": [
		"<DATA_SET_ID>"
	],
	"TargetAccountId": "<YOUR_ACCOUNT_ID>"
}
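If you associate tables for several instances, it can be convenient to generate this file from a script. The following is a minimal sketch using only the Python standard library; the function name and the example values in the usage note are hypothetical, and the only validation it performs is that the account ID is a 12-digit string and that at least one data set ID is given.

```python
import json
from pathlib import Path
from typing import List


def write_association_file(path: str, instance_id: str,
                           data_set_ids: List[str],
                           target_account_id: str) -> dict:
    """Write the batch-association payload to `path` and return it."""
    # AWS account IDs are 12-digit strings; keep them quoted so that
    # leading zeros are preserved in the JSON file.
    if not (target_account_id.isdigit() and len(target_account_id) == 12):
        raise ValueError("target_account_id must be a 12-digit string")
    if not data_set_ids:
        raise ValueError("at least one data set ID is required")

    payload = {
        "InstanceId": instance_id,
        "DataSetIds": list(data_set_ids),
        "TargetAccountId": target_account_id,
    }
    Path(path).write_text(json.dumps(payload, indent=2))
    return payload
```

For example, `write_association_file("input_batch_association.json", "<YOUR_INSTANCE_ID>", ["<DATA_SET_ID>"], "123456789012")` produces a file with the same shape as the sample above.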

Next, I need to approve (or reject) the request in the AWS Resource Access Manager (RAM) console in the target account. RAM is a service to help you securely share resources across AWS accounts. I navigate to AWS RAM and select Resource shares in the Shared with me section.

Then, I select the resource and select Accept resource share.

At this stage, I can access shared resources from Amazon Connect. Now, I can start creating linked tables from shared tables in AWS Lake Formation. In the Lake Formation console, I navigate to the Tables page and select Create table.

I need to create a Resource link to a shared table. Then, I fill in the details and select the available Database and the Shared table’s region.

Then, when I select Shared table, it will list all the available shared tables that I can access.

Once I select the shared table, it will automatically populate Shared table’s database and Shared table’s owner ID. Once I’m happy with the configuration, I select Create.

To run some queries on the data, I go to the Amazon Athena console. The following is an example of a query that I ran:

With this configuration, I have access to certain Amazon Connect data types. I can even visualize the data by integrating with Amazon QuickSight. The following screenshot shows some visuals in the Amazon QuickSight dashboard with data from Amazon Connect.
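The same kind of exploratory query can also be submitted from code. The following is a minimal sketch, not the query from the walkthrough: it assumes a boto3 Athena client is passed in as `athena`, and the database name, resource link name, and S3 output location are hypothetical placeholders that you would replace with your own.

```python
from typing import Any


def start_sample_query(athena: Any, database: str, table: str,
                       output_location: str) -> str:
    """Submit a small exploratory query and return its execution ID."""
    # `table` is the name of the resource link created in Lake Formation.
    query = f'SELECT * FROM "{database}"."{table}" LIMIT 10'
    response = athena.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

A call might look like `start_sample_query(boto3.client("athena"), "my_connect_db", "my_contact_record_link", "s3://my-athena-results/")`, where all three names are placeholders.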

Customer voice
During the preview period, we heard lots of feedback from our customers about Amazon Connect analytics data lake. Here’s what one of our customers says:

Joulica is an analytics platform supporting insights for software like Amazon Connect and Salesforce. Tony McCormack, founder and CEO of Joulica, said, “Our core business is providing real-time and historical contact center analytics to Amazon Connect customers of all sizes. In the past, we frequently had to set up complex data pipelines, and so we are excited about using Amazon Connect analytics data lake to simplify the process of delivering actionable intelligence to our shared customers.”

Things you need to know

  • Pricing — Amazon Connect analytics data lake is available for you to use up to 2 years of data without any additional charges in Amazon Connect. You only need to pay for any services you use to interact with the data.
  • Availability — Amazon Connect analytics data lake is generally available in the following AWS Regions: US East (N. Virginia), US West (Oregon), Africa (Cape Town), Asia Pacific (Mumbai, Seoul, Singapore, Sydney, Tokyo), Canada (Central), and Europe (Frankfurt, London).
  • Learn more — For more information, visit the Analytics data lake documentation page.

Happy building,
Donnie

AWS analytics services streamline user access to data, permissions setting, and auditing

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-analytics-services-streamline-user-access-to-data-permissions-setting-and-auditing/

I am pleased to announce a new use case based on trusted identity propagation, a recently introduced capability of AWS IAM Identity Center.

Tableau, a commonly used business intelligence (BI) application, can now propagate end-user identity down to Amazon Redshift. This has a triple benefit. It simplifies the sign-in experience for end users. It allows data owners to define access based on real end-user identity. It allows auditors to verify data access by users.

Trusted identity propagation allows applications that consume data (such as Tableau, Amazon QuickSight, Amazon Redshift Query Editor, Amazon EMR Studio, and others) to propagate the user’s identity and group memberships to the services that store and manage access to the data, such as Amazon Redshift, Amazon Athena, Amazon Simple Storage Service (Amazon S3), Amazon EMR, and others. Trusted identity propagation is a capability of IAM Identity Center that improves the sign-in experience across multiple analytics applications, simplifies data access management, and simplifies audit. End users benefit from single sign-on and do not have to specify the IAM roles they want to assume to connect to the system.

Before diving into more details, let’s agree on terminology.

I use the term “identity providers” to refer to the systems that hold user identities and group memberships. These are the systems that prompt the user for credentials and perform the authentication. Examples include Azure Active Directory, Okta, and Ping Identity. Check the full list of identity providers we support.

I use the term “user-facing applications” to designate the applications that consume data, such as Tableau, Microsoft Power BI, QuickSight, Amazon Redshift Query Editor, and others.

And finally, when I write “downstream services”, I refer to the analytics engines and storage services that process, store, or manage access to your data: Amazon Redshift, Athena, S3, EMR, and others.

Trusted Identity Propagation - high-level diagram

To understand the benefit of trusted identity propagation, let’s briefly talk about how data access was granted until today. When a user-facing application accesses data from a downstream service, the application either uses generic credentials (such as “tableau_user”) or assumes an IAM role to authenticate against the downstream service. This is the source of two challenges.

First, it makes it difficult for the downstream service administrator to define access policies that are fine-tuned for the actual user making the request. As seen from the downstream service, all requests originate from that common user or IAM role. If Jeff and Jane are both mapped to the BusinessAnalytics IAM role, then it is not possible to give them different levels of access, for example, read-only and read-write. Furthermore, if Jeff is also in the Finance group, he needs to choose a role in which to operate; he cannot access data from both groups in the same session.

Secondly, the task of associating a data-access event to an end user involves some undifferentiated heavy lifting. If the request originates from an IAM role called BusinessAnalytics, then additional work is required to figure out which user was behind that action.

Well, this particular example might look very simple, but in real life, organizations have hundreds of users and thousands of groups to match to hundreds of datasets. There was an opportunity for us to Invent and Simplify.

Once configured, the new trusted identity propagation provides a technical mechanism for user-facing applications to access data on behalf of the actual user behind the keyboard. Knowing the actual user identity offers three main advantages.

First, it allows downstream service administrators to create and manage access policies based on actual user identities, the groups they belong to, or a combination of the two. Downstream service administrators can now assign access in terms of users, groups, and datasets. This is the way most of our customers naturally think about access to data—intermediate mappings to IAM roles are no longer necessary to achieve these patterns.

Second, auditors now have access to the original user identity in system logs and can verify that policies are implemented correctly and follow all requirements of the company or industry-level policies.

Third, users of BI applications can benefit from single sign-on between applications. Your end-users no longer need to understand your company’s AWS accounts and IAM roles. Instead, they can sign in to EMR Studio (for example) using their corporate single sign-on that they’re used to for so many other things they do at work.

How does trusted identity propagation work?
Trusted identity propagation relies on standard mechanisms from our industry: OAuth2 and JWT. OAuth2 is an open standard for access delegation that allows users to grant third-party user-facing applications access to data on other services (downstream services) without exposing their credentials. JWT (JSON Web Token) is a compact, URL-safe means of representing identities and claims to be transferred between two parties. JWTs are signed, which means their integrity and authenticity can be verified.

How to configure trusted identity propagation
Configuring trusted identity propagation requires setup in IAM Identity Center, at the user-facing application, and at the downstream service because each of these needs to be told to work with end-user identities. Although the particulars will be different for each application, they will all follow this pattern:

  1. Configure an identity source in AWS IAM Identity Center. AWS recommends enabling automated provisioning if your identity provider supports it, as most do. Automated provisioning works through the SCIM synchronization standard to synchronize your directory users and groups into IAM Identity Center. You probably have configured this already if you currently use IAM Identity Center to federate your workforce into the AWS Management Console. This is a one-time configuration, and you don’t have to repeat this step for each user-facing application.
  2. Configure your user-facing application to authenticate its users with your identity provider. For example, configure Tableau to use Okta.
  3. Configure the connection between the user-facing application and the downstream service. For example, configure Tableau to access Amazon Redshift. In some cases, it requires using the ODBC or JDBC driver for Redshift.

Then comes the configuration specific to trusted identity propagation. For example, imagine your organization has developed a user-facing web application that authenticates the users with your identity provider, and that you want to access data in AWS on behalf of the current authenticated user. For this use case, you would create a trusted token issuer in IAM Identity Center. This powerful new construct gives you a way to map your application’s authenticated users to the users in your IAM Identity Center directory so that it can make use of trusted identity propagation. My colleague Becky wrote a blog post to show you how to develop such an application. This additional configuration is required only when using third-party applications, such as Tableau, or a customer-developed application, that authenticate outside of AWS. When using user-facing applications managed by AWS, such as Amazon QuickSight, no further setup is required.

setup an external IdP to issue trusted token

Finally, downstream service administrators must configure the access policies based on the user identity and group memberships. The exact configuration varies from one downstream service to the other. If the application reads or writes data in Amazon S3, the data owner may use S3 Access Grants in the Amazon S3 console to grant access for users and groups to prefixes in Amazon S3. If the application makes queries to an Amazon Redshift data warehouse, the data owner must configure IAM Identity Center trusted connection in the Amazon Redshift console and match the audience claim (aud) from the identity provider.

Now that you have a high-level overview of the configuration, let’s dive into the most important part: the user experience.

The end-user experience
Although the precise experience of the end user will obviously be different for different applications, in all cases, it will be simpler and more familiar to workforce users than before. The user interaction will begin with a redirect-based authentication single sign-on flow that takes the user to their identity provider, where they can sign in with credentials, multi-factor authentication, and so on.

Let’s look at the details of how an end user might interact with Okta and Tableau when trusted identity propagation has been configured.

Here is an illustration of the flow and the main interactions between systems and services.

Trusted Identity Propagation flow

Here’s how it goes.

1. As a user, I attempt to sign in to Tableau.

2. Tableau initiates a browser-based flow and redirects to the Okta sign-in page where I can enter my sign-in credentials. On successful authentication, Okta issues an authentication token (ID and access token) to Tableau.

3. Tableau initiates a JDBC connection with Amazon Redshift and includes the access token in the connection request. The Amazon Redshift JDBC driver makes a call to Amazon Redshift. Because your Amazon Redshift administrator enabled IAM Identity Center, Amazon Redshift forwards the access token to IAM Identity Center.

4. IAM Identity Center verifies and validates the access token and exchanges the access token for an Identity Center issued token.

5. Amazon Redshift will resolve the Identity Center token to determine the corresponding Identity Center user and authorize access to the resource. Upon successful authorization, I can connect from Tableau to Amazon Redshift.

Once authenticated, I can start to use Tableau as usual.

Trusted Identity Propagation - Tableau usage

And when I connect to Amazon Redshift Query Editor, I can observe the sys_query_history table to check who was the user who made the query. It correctly reports awsidc:<email address>, the Okta email address I used when I connected from Tableau.

Trusted Identity Propagation - audit in Redshift

You can read Tableau’s documentation for more details about this configuration.

Pricing and availability
Trusted identity propagation is provided at no additional cost in the 26 AWS Regions where AWS IAM Identity Center is available today.

Here are more details about trusted identity propagation and downstream service configurations.

Happy reading!

With trusted identity propagation, you can now configure analytics systems to propagate the actual user identity, group membership, and attributes to AWS services such as Amazon Redshift, Amazon Athena, or Amazon S3. It simplifies the management of access policies on these services. It also lets auditors verify your organization’s compliance posture, because they can see the real identity of the users accessing data.

Get started now and configure your Tableau integration with Amazon Redshift.

— seb

PS: Writing a blog post at AWS is always a team effort, even when you see only one name under the post title. In this case, I want to thank Eva Mineva, Laura Reith, and Roberto Migli for their much-appreciated help in understanding the many subtleties and technical details of trusted identity propagation.

Amazon EC2 high memory U7i Instances for large in-memory databases

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-ec2-high-memory-u7i-instances-for-large-in-memory-databases/

Announced in preview form at re:Invent 2023, Amazon Elastic Compute Cloud (Amazon EC2) U7i instances with up to 32 TiB of DDR5 memory and 896 vCPUs are now available. Powered by custom fourth generation Intel Xeon Scalable Processors (Sapphire Rapids), these high memory instances are designed to support large, in-memory databases including SAP HANA, Oracle, and SQL Server. Here are the specs:

Instance Name | vCPUs | Memory (DDR5) | EBS Bandwidth | Network Bandwidth
u7i-12tb.224xlarge | 896 | 12,288 GiB | 60 Gbps | 100 Gbps
u7in-16tb.224xlarge | 896 | 16,384 GiB | 100 Gbps | 200 Gbps
u7in-24tb.224xlarge | 896 | 24,576 GiB | 100 Gbps | 200 Gbps
u7in-32tb.224xlarge | 896 | 32,768 GiB | 100 Gbps | 200 Gbps

The new instances deliver the best compute price performance for large in-memory workloads, and offer the highest memory and compute power of any SAP-certified virtual instance from a leading cloud provider.

Thanks to the AWS Nitro System, all of the memory on the instance is available for use. For example, here’s the 32 TiB instance:

In comparison to the previous generation of EC2 High Memory instances, the U7i instances offer more than 135% of the compute performance, up to 115% more memory performance, and 2.5x the EBS bandwidth. This increased bandwidth allows you to transfer 30 TiB of data from EBS into memory in an hour or less, making data loads and cache refreshes faster than ever before. The instances also support ENA Express with 25 Gbps of bandwidth per flow, and provide an 85% improvement in P99.9 latency between instances.
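The "30 TiB in an hour or less" figure can be sanity-checked with a quick calculation. The sketch below assumes the 100 Gbps EBS bandwidth listed for the u7in sizes is sustained for the whole transfer and ignores protocol overhead, so it is a best-case estimate; the function name is hypothetical.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def transfer_minutes(size_tib: float, bandwidth_gbps: float) -> float:
    """Minutes needed to move `size_tib` TiB over a `bandwidth_gbps` Gbps link."""
    bits = size_tib * TIB * 8
    seconds = bits / (bandwidth_gbps * 1e9)
    return seconds / 60


print(f"{transfer_minutes(30, 100):.0f} minutes")  # prints "44 minutes"
```

At the 60 Gbps listed for the u7i-12tb size, the same calculation gives roughly 73 minutes, so the one-hour figure corresponds to the 100 Gbps sizes.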

Each U7i instance supports attachment of up to 128 General Purpose (gp2 and gp3) or Provisioned IOPS (io1 and io2 Block Express) EBS volumes. Each io2 Block Express volume can be as big as 64 TiB and can deliver up to 256K IOPS at up to 32 Gbps, making them a great match for U7i instances.

The instances are SAP certified to run Business Suite on HANA, Business Suite S/4HANA, Business Warehouse on HANA (BW), and SAP BW/4HANA in production environments. To learn more, consult the Certified and Supported SAP HANA Hardware and the SAP HANA to AWS Migration Guide. Also, be sure to take a look at the AWS Launch Wizard for SAP.

Things to Know
Here are a couple of things that you should know about these new instances:

Regions – U7i instances are available in the US East (N. Virginia), US West (Oregon), and Asia Pacific (Seoul, Sydney) AWS Regions.

Operating Systems – Supported operating systems include Amazon Linux, Red Hat Enterprise Linux, SUSE Linux Enterprise Server, Ubuntu, and Windows Server.

Larger Instances – We are also working on offering even larger instances later this year with increased compute to meet our customers’ needs.

Jeff;

AWS Weekly Roundup – LlamaIndex support for Amazon Neptune, force AWS CloudFormation stack deletion, and more (May 27, 2024)

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-llamaindex-support-for-amazon-neptune-force-aws-cloudformation-stack-deletion-and-more-may-27-2024/

Last week, Dr. Matt Wood, VP for AI Products at Amazon Web Services (AWS), delivered the keynote at the AWS Summit Los Angeles. Matt and guest speakers shared the latest advancements in generative artificial intelligence (generative AI), developer tooling, and foundational infrastructure, showcasing how they come together to change what’s possible for builders. You can watch the full keynote on YouTube.

AWS Summit LA 2024 keynote

Announcements during the LA Summit included two new Amazon Q courses as part of Amazon’s AI Ready initiative to provide free AI skills training to 2 million people globally by 2025. The courses are part of the Amazon Q learning plan. But that’s not all that happened last week.

Last week’s launches
Here are some launches that got my attention:

LlamaIndex support for Amazon Neptune — You can now build Graph Retrieval Augmented Generation (GraphRAG) applications by combining knowledge graphs stored in Amazon Neptune and LlamaIndex, a popular open source framework for building applications with large language models (LLMs) such as those available in Amazon Bedrock. To learn more, check the LlamaIndex documentation for Amazon Neptune Graph Store.

AWS CloudFormation launches a new parameter called DeletionMode for the DeleteStack API – You can use the AWS CloudFormation DeleteStack API to delete your stacks and stack resources. However, certain stack resources can prevent the DeleteStack API from completing successfully, for example, when you attempt to delete non-empty Amazon Simple Storage Service (Amazon S3) buckets. In such scenarios, the stack can enter the DELETE_FAILED state. With this launch, you can now pass the FORCE_DELETE_STACK value to the new DeletionMode parameter and delete such stacks. To learn more, check the DeleteStack API documentation.

Mistral Small now available in Amazon Bedrock – The Mistral Small foundation model (FM) from Mistral AI is now generally available in Amazon Bedrock. This is a fast follow to our recent announcements of Mistral 7B and Mixtral 8x7B in March, and Mistral Large in April. Mistral Small, developed by Mistral AI, is a highly efficient large language model (LLM) optimized for high-volume, low-latency language-based tasks. To learn more, check Esra’s post.

New Amazon CloudFront edge location in Cairo, Egypt — The new AWS edge location brings the full suite of benefits provided by Amazon CloudFront, a secure, highly distributed, and scalable content delivery network (CDN) that delivers static and dynamic content, APIs, and live and on-demand video with low latency and high performance. Customers in Egypt can expect up to 30 percent improvement in latency, on average, for data delivered through the new edge location. To learn more about AWS edge locations, visit CloudFront edge locations.

Amazon OpenSearch Service zero-ETL integration with Amazon S3 — This Amazon OpenSearch Service integration offers a new efficient way to query operational logs in Amazon S3 data lakes, eliminating the need to switch between tools to analyze data. You can get started by installing out-of-the-box dashboards for AWS log types such as Amazon VPC Flow Logs, AWS WAF Logs, and Elastic Load Balancing (ELB). To learn more, check out the Amazon OpenSearch Service Integrations page and the Amazon OpenSearch Service Developer Guide.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.
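As an illustration of the DeletionMode item above, here is a minimal sketch of a forced deletion. It assumes a boto3 CloudFormation client is passed in as `cfn` and, as an assumption based on the description above, only uses FORCE_DELETE_STACK when the stack is already in the DELETE_FAILED state. The function name is hypothetical.

```python
from typing import Any


def delete_stack_with_fallback(cfn: Any, stack_name: str) -> bool:
    """Delete a stack, forcing deletion only if a previous attempt failed.

    Returns True when FORCE_DELETE_STACK was used, False otherwise.
    """
    stacks = cfn.describe_stacks(StackName=stack_name)["Stacks"]
    status = stacks[0]["StackStatus"]
    if status == "DELETE_FAILED":
        # A normal delete already failed, so retry in force mode.
        cfn.delete_stack(StackName=stack_name, DeletionMode="FORCE_DELETE_STACK")
        return True
    # First attempt: let CloudFormation try a standard deletion.
    cfn.delete_stack(StackName=stack_name)
    return False
```

Forcing deletion can leave behind resources that CloudFormation could not delete, such as a non-empty S3 bucket, so it is worth reviewing the stack events afterwards.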

Other AWS news
Here are some additional news items and a Twitch show that you might find interesting:

Build On Generative AI – Now streaming every Thursday, 2:00 PM US PT on twitch.tv/aws, my colleagues Tiffany and Mike discuss different aspects of generative AI and invite guest speakers to demo their work. Check out show notes and the full list of episodes on community.aws.

Amazon Bedrock Studio bootstrapper script — We’ve heard your feedback! To everyone who struggled setting up the required AWS Identity and Access Management (IAM) roles and permissions to get started with Amazon Bedrock Studio: You can now use the Bedrock Studio bootstrapper script to automate the creation of the permissions boundary, service role, and provisioning role.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Summits – It’s AWS Summit season! Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Dubai (May 29), Bangkok (May 30), Stockholm (June 4), Madrid (June 5), and Washington, DC (June 26–27).

AWS re:Inforce – Join us for AWS re:Inforce (June 10–12) in Philadelphia, PA. AWS re:Inforce is a learning conference focused on AWS security solutions, cloud security, compliance, and identity. Connect with the AWS teams that build the security tools and meet AWS customers to learn about their security journeys.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), New Zealand (August 15), Nigeria (August 24), and New York (August 28).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Antje

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Optimized for low-latency workloads, Mistral Small now available in Amazon Bedrock

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/optimized-for-low-latency-workloads-mistral-small-now-available-in-amazon-bedrock/

Today, I am happy to announce that the Mistral Small foundation model (FM) from Mistral AI is now generally available in Amazon Bedrock. This is a fast follow to our recent announcements of Mistral 7B and Mixtral 8x7B in March, and Mistral Large in April. You can now access four high-performing models from Mistral AI in Amazon Bedrock, including Mistral Small, Mistral Large, Mistral 7B, and Mixtral 8x7B, further expanding model choice.

Mistral Small, developed by Mistral AI, is a highly efficient large language model (LLM) optimized for high-volume, low-latency language-based tasks. Mistral Small is perfectly suited for straightforward tasks that can be performed in bulk, such as classification, customer support, or text generation. It provides outstanding performance at a cost-effective price point.

Some key features of Mistral Small you need to know about:

  • Retrieval-Augmented Generation (RAG) specialization – Mistral Small ensures that important information is retained even in long context windows, which can extend up to 32K tokens.
  • Coding proficiency – Mistral Small excels in code generation, review, and commenting, supporting major coding languages.
  • Multilingual capability – Mistral Small delivers top-tier performance in French, German, Spanish, and Italian, in addition to English. It also supports dozens of other languages.

Getting started with Mistral Small
I first need access to the model to get started with Mistral Small. I go to the Amazon Bedrock console, choose Model access, and then choose Manage model access. I expand the Mistral AI section, choose Mistral Small, and then choose Save changes.

I now have model access to Mistral Small, and I can start using it in Amazon Bedrock. I refresh the Base models table to view the current status.

I use the following template to build a prompt for the model to get optimal outputs:

<s>[INST] Instruction [/INST]

Note that <s> is a special token for beginning of string (BOS) while [INST] and [/INST] are regular strings.
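As a minimal sketch of applying this template in code, the helper below wraps a plain instruction in the markers shown above. The function name build_mistral_prompt is hypothetical and is not part of any AWS or Mistral AI library.

```python
def build_mistral_prompt(instruction: str) -> str:
    """Wrap a plain instruction in the <s>[INST] ... [/INST] template.

    Hypothetical helper for illustration: it only concatenates strings and
    makes no call to Amazon Bedrock.
    """
    # Strip stray whitespace so the markers are separated by a single space.
    return f"<s>[INST] {instruction.strip()} [/INST]"


if __name__ == "__main__":
    print(build_mistral_prompt("Geben Sie mir zwei Sätze zum Thema Nachhaltigkeit."))
```

Keeping the template in one place makes it easier to reuse the same instruction text across the console, the AWS CLI, and SDK calls.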

I try the following prompt to see a classification example:

Prompt:

<s>[INST] Classify the following email to determine if it is spam or not. Only respond with the exact words "Spam" or "Not spam".
🎉 Limited time offer! Buy 2, Get 1 Free! Offer ends today! Don't miss out the CHANCE TO SAVE, please click on the link below: https://bit.ly/buy-2-get-1-free [/INST]

Mistral 7B, Mixtral 8x7B, and Mistral Large can all correctly classify this email as “Spam.” Mistral Small is also able to classify this accurately, just as the larger models can. I also try several similar tasks, such as generating a Bash script from a text prompt and generating a recipe to prepare yoghurt, and get good results. For this reason, Mistral Small is the most cost-effective and efficient option of the Mistral AI models in Amazon Bedrock for such tasks.

Mistral Small excels in multilingual tasks. Along with English, it delivers superior performance in French, German, Spanish, and Italian. I test the model’s understanding of the German language by asking it to give me two sentences about sustainability:

Prompt:

<s>[INST] Geben Sie mir zwei Sätze zum Thema Nachhaltigkeit. [/INST]

Output:

1. Nachhaltigkeit bedeutet, die Bedürfnisse der heutigen Generation zu befriedigen, ohne die Möglichkeiten künftiger Generationen zu gefährden.
2. Ein nachhaltiger Lebensstil umfasst den bewussten Umgang mit Ressourcen, wie Wasser, Energie und Rohstoffen, sowie den Schutz von Umwelt und Klima.
Diese Sätze sollten Ihnen einen guten Überblick über das Thema Nachhaltigkeit geben.

Programmatically interact with Mistral Small
I can use AWS Command Line Interface (AWS CLI) and AWS Software Development Kit (SDK) to programmatically interact with Mistral Small using Amazon Bedrock APIs. I use the following code in Python, which interacts with Amazon Bedrock Runtime APIs with AWS SDK, asking, “What is the color of the sky?”:

import argparse
import boto3
from botocore.exceptions import ClientError
import json

accept = "application/json"
content_type = "application/json"

def invoke_model(model_id, input_data, region, streaming):
  client = boto3.client('bedrock-runtime', region_name=region)
  try:
    if streaming:
      response = client.invoke_model_with_response_stream(body=input_data, modelId=model_id, accept=accept, contentType=content_type)
      # The streaming body is an event stream, so print each JSON chunk as it arrives.
      for event in response['body']:
        chunk = event.get('chunk')
        if chunk:
          print(json.loads(chunk['bytes']))
    else:
      response = client.invoke_model(body=input_data, modelId=model_id, accept=accept, contentType=content_type)
      # The non-streaming body is a single stream that holds the full JSON response.
      print(json.loads(response['body'].read()))
  except ClientError as e:
    print(e)

if __name__ == "__main__":
  parser = argparse.ArgumentParser(description="Bedrock Testing Tool")
  parser.add_argument("--prompt", type=str, help="prompt to use", default="Hello")
  parser.add_argument("--max-tokens", type=int, default=64)
  parser.add_argument("--streaming", choices=["true", "false"], help="whether to stream or not", default="false")
  args = parser.parse_args()
  streaming = False
  if args.streaming == "true":
    streaming = True
  input_data = json.dumps({
    "prompt": f"<s>[INST]{args.prompt}[/INST]",
    "max_tokens": args.max_tokens
  })
  invoke_model(model_id="mistral.mistral-small-2402-v1:0", input_data=input_data, region="us-east-1", streaming=streaming)

I get the following output:

{'outputs': [{'text': ' The color of the sky can vary depending on the time of day, weather,', 'stop_reason': 'length'}]}
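The response is a dictionary with an outputs list, and the stop_reason of length here suggests the completion was cut off by the max_tokens limit (64 by default in the script above). The following sketch, which uses a hypothetical helper name, pulls out the text and flags truncated completions. It works on the already-parsed dictionary, so it needs no AWS access.

```python
def extract_completions(response_body: dict) -> list:
    """Return (text, stop_reason) pairs from a parsed Mistral response body.

    Hypothetical helper: it assumes the shape shown in the sample output,
    a top-level "outputs" list whose items carry "text" and "stop_reason".
    """
    return [
        (output.get("text", "").strip(), output.get("stop_reason", ""))
        for output in response_body.get("outputs", [])
    ]


if __name__ == "__main__":
    sample = {'outputs': [{'text': ' The color of the sky can vary depending on the time of day, weather,', 'stop_reason': 'length'}]}
    for text, stop_reason in extract_completions(sample):
        # A stop_reason of "length" means the token limit was reached before the model finished.
        note = " (truncated, consider raising --max-tokens)" if stop_reason == "length" else ""
        print(f"{text}{note}")
```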

Now available
The Mistral Small model is now available in Amazon Bedrock in the US East (N. Virginia) Region.

To learn more, visit the Mistral AI in Amazon Bedrock product page. For pricing details, review the Amazon Bedrock pricing page.

To get started with Mistral Small in Amazon Bedrock, visit the Amazon Bedrock console and Amazon Bedrock User Guide.

— Esra

The Oral History of Selling World of Warcraft Server Blades

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/the-oral-history-of-selling-world-of-warcraft-server-blades-hp-hpe-amd/

We record the oral history of how World of Warcraft server blades were sold via a charity auction where fans could buy their online realms

The post The Oral History of Selling World of Warcraft Server Blades appeared first on ServeTheHome.

AWS Weekly Roundup – Application Load Balancer IPv6, Amazon S3 pricing update, Amazon EC2 Flex instances, and more (May 20, 2024)

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-application-load-balancer-ipv6-amazon-s3-pricing-update-amazon-ec2-flex-instances-and-more-may-20-2024/

AWS Summit season is in full swing around the world, with last week’s events in Bengaluru, Berlin, and  Seoul, where my blog colleague Channy delivered one of the keynotes.

AWS Summit Seoul Keynote

Last week’s launches
Here are some launches that got my attention:

Amazon S3 will no longer charge for several HTTP error codes – A customer reported how he was charged for Amazon S3 API requests he didn’t initiate and which resulted in AccessDenied errors. The Amazon Simple Storage Service (Amazon S3) service team updated the service to no longer charge for such API requests. As always when talking about pricing, the exact wording is important, so please read the What’s New post for the details.

Introducing Amazon EC2 C7i-flex instances – These instances deliver up to 19 percent better price performance compared to C6i instances. Using C7i-flex instances is the easiest way for you to get price performance benefits for a majority of compute-intensive workloads. The new instances are powered by the 4th generation Intel Xeon Scalable custom processors (Sapphire Rapids) that are available only on AWS and offer 5 percent lower prices compared to C7i.

Application Load Balancer launches IPv6 only support for internet clients – Application Load Balancer now allows customers to provision load balancers without IPv4 addresses for clients that can connect using just IPv6. To connect, clients can resolve AAAA DNS records that are assigned to Application Load Balancer. The Application Load Balancer is still dual stack for communication between the load balancer and targets. With this new capability, you have the flexibility to use either IPv4 or IPv6 for your application targets while avoiding IPv4 charges for clients that don’t require it.

Amazon VPC Lattice now supports TLS Passthrough – We announced the general availability of TLS passthrough for Amazon VPC Lattice, which allows customers to enable end-to-end authentication and encryption using their existing TLS or mTLS implementations. Prior to this launch, VPC Lattice supported only the HTTP and HTTPS listener protocols, which terminate TLS and perform request-level routing and load balancing based on information in HTTP headers.

Amazon DocumentDB zero-ETL integration with Amazon OpenSearch Service – This new integration provides you with advanced search capabilities, such as fuzzy search, cross-collection search and multilingual search, on your Amazon DocumentDB (with MongoDB compatibility) documents using the OpenSearch API. With a few clicks in the AWS Management Console, you can now synchronize your data from Amazon DocumentDB to Amazon OpenSearch Service, eliminating the need to write any custom code to extract, transform, and load the data.

Amazon EventBridge now supports customer managed keys (CMK) for event buses – This capability allows you to encrypt your events using your own keys instead of an AWS owned key (which is used by default). With support for CMK, you now have more fine-grained security control over your events, satisfying your company’s security requirements and governance policies.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional news items, open source projects, and Twitch shows that you might find interesting:

The Four Pillars of Managing Email Reputation – Dustin Taylor is the manager of anti-abuse and email deliverability for Amazon Simple Email Service (SES). He wrote a remarkable post exploring Amazon SES approach to managing domain and IP reputation. Maintaining a high reputation ensures optimal recipient inboxing. His post outlines how Amazon SES protects its network reputation to help you deliver high-quality email consistently. A worthy read, even if you’re not sending email at scale. I learned a lot.

Build On Generative AI – Season 3 of your favorite weekly Twitch show about all things generative artificial intelligence (AI) is in full swing! Streaming every Monday, 9:00 AM US PT, my colleagues Tiffany and Darko discuss different aspects of generative AI and invite guest speakers to demo their work.

AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter, in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Hong Kong (May 22), Milan (May 23), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore 2.5 days of immersive cloud security learning in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), Nigeria (August 24), and New York (August 28).

Browse all upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— seb

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

New compute-optimized (C7i-flex) Amazon EC2 Flex instances

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/new-compute-optimized-c7i-flex-amazon-ec2-flex-instances/

The vast majority of applications don’t run the CPU flat-out at 100% utilization continuously. Take a web application, for instance. It typically fluctuates between periods of high and low demand, but hardly ever uses a server’s compute at full capacity.

a graph showing how a typical application runs with low-to-moderate CPU utilization most of the time with occasional peaks.

CPU utilization for many common workloads that customers run in the AWS Cloud today. (source: AWS Documentation)

One easy and cost-effective way to run such workloads is to use the Amazon EC2 M7i-flex instances which we introduced last August. These are lower-priced variants of the Amazon EC2 M7i instances offering the same next-generation specs for general purpose compute for the most popular sizes with the added benefit of giving you better price/performance if you don’t need full compute power 100 percent of the time. This makes them a great first choice if you are looking to reduce your running cost while meeting the same performance benchmarks.

This flexibility resonated really well with customers so, today, we are expanding our Flex portfolio by launching Amazon EC2 C7i-flex instances offering similar benefits of price/performance and lower costs for compute-intensive workloads. These are lower-priced variants of the Amazon EC2 C7i instances that offer a baseline level of CPU performance with the ability to scale up to the full compute performance 95% of the time.

C7i-flex instances
C7i-flex offers five of the most common sizes from large to 8xlarge, delivering up to 19 percent better price performance than Amazon EC2 C6i instances.

Instance name | vCPU | Memory (GiB) | Instance storage (GB) | Network bandwidth (Gbps) | EBS bandwidth (Gbps)
c7i-flex.large | 2 | 4 | EBS-only | up to 12.5 | up to 10
c7i-flex.xlarge | 4 | 8 | EBS-only | up to 12.5 | up to 10
c7i-flex.2xlarge | 8 | 16 | EBS-only | up to 12.5 | up to 10
c7i-flex.4xlarge | 16 | 32 | EBS-only | up to 12.5 | up to 10
c7i-flex.8xlarge | 32 | 64 | EBS-only | up to 12.5 | up to 10

Should I use C7i-flex or C7i?
Both C7i-flex and C7i are compute-optimized instances powered by custom 4th Generation Intel Xeon Scalable processors, which are only available at Amazon Web Services (AWS). They offer up to 15 percent better performance over comparable x86-based Intel processors used by other cloud providers.

They both also use DDR5 memory, feature a 2:1 ratio of memory to vCPU, and are ideal for running applications such as web and application servers, databases, caches, Apache Kafka, and Elasticsearch.

So why would you use one over the other? Here are three things to consider when deciding which one is right for you.

Usage pattern
EC2 flex instances are a great fit for when you don’t need to fully utilize all compute resources.

You can achieve 5 percent better price performance and 5 percent lower prices due to efficient use of compute resources. Typically, this is a great fit for most applications, so C7i-flex instances should be the first choice for compute-intensive workloads.

However, if your application requires continuous high CPU usage, then you should use C7i instances instead. They are likely more suitable for workloads such as batch processing, distributed analytics, high performance computing (HPC), ad serving, highly scalable multiplayer gaming, and video encoding.

Instance sizes
C7i-flex instances offer the most common sizes used by a majority of workloads going up to a maximum of 8xlarge in size.

If you need higher specs, then you should look into the larger C7i instances, which include 12xlarge, 16xlarge, 24xlarge, 48xlarge, and two bare metal options with metal-24xl and metal-48xl sizes.
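To make the sizing decision concrete, here is a small sketch that picks the smallest C7i-flex size covering a given vCPU and memory requirement, using the figures from the size table earlier in this post. The function name and the fall-back behavior (returning None when nothing fits) are illustrative assumptions, not an AWS API.

```python
# (instance name, vCPUs, memory in GiB), ordered from smallest to largest.
C7I_FLEX_SIZES = [
    ("c7i-flex.large", 2, 4),
    ("c7i-flex.xlarge", 4, 8),
    ("c7i-flex.2xlarge", 8, 16),
    ("c7i-flex.4xlarge", 16, 32),
    ("c7i-flex.8xlarge", 32, 64),
]


def smallest_c7i_flex(vcpus_needed, memory_gib_needed):
    """Return the smallest C7i-flex size that meets both requirements.

    Returns None when even 8xlarge is too small, which is the point at
    which the larger C7i sizes are worth a look.
    """
    for name, vcpus, memory_gib in C7I_FLEX_SIZES:
        if vcpus >= vcpus_needed and memory_gib >= memory_gib_needed:
            return name
    return None


if __name__ == "__main__":
    print(smallest_c7i_flex(6, 10))
    print(smallest_c7i_flex(48, 96))
```

This only compares vCPU and memory. Sustained CPU usage patterns and bandwidth needs, covered in the surrounding sections, still have to be weighed separately.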

Network bandwidth
Larger sizes also offer higher network and Amazon Elastic Block Store (Amazon EBS) bandwidth, so you may need to use one of the larger C7i instances depending on your requirements. C7i-flex instances offer up to 12.5 Gbps of network bandwidth and up to 10 Gbps of Amazon EBS bandwidth, which should be suitable for most applications.

Things to know
Regions – Visit AWS Services by Region to check whether C7i-flex instances are available in your preferred regions.

Purchasing options – C7i-flex and C7i instances are available in On-Demand, Savings Plan, Reserved Instance, and Spot form. C7i instances are also available in Dedicated Host and Dedicated Instance form.

To learn more, visit Amazon EC2 C7i and C7i-flex instances.

Matheus Guimaraes

AWS Weekly Roundup: New capabilities in Amazon Bedrock, AWS Amplify Gen 2, Amazon RDS and more (May 13, 2024)

Post Syndicated from Abhishek Gupta original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-new-capabilities-in-amazon-bedrock-aws-amplify-gen-2-amazon-rds-and-more-may-13-2024/

AWS Summit is in full swing around the world, with the most recent one being AWS Summit Singapore! Here is a sneak peek of the AWS staff and ASEAN community members at the Developer Lounge booth. It featured AWS Community speakers giving lightning talks on serverless, Amazon Elastic Kubernetes Service (Amazon EKS), security, generative AI, and more.

Last week’s launches
Here are some launches that caught my attention. Not surprisingly, a lot of interesting generative AI features!

Amazon Titan Text Premier is now available in Amazon Bedrock – This is the latest addition to the Amazon Titan family of large language models (LLMs) and offers optimized performance for key features like Retrieval Augmented Generation (RAG) on Knowledge Bases for Amazon Bedrock, and function calling on Agents for Amazon Bedrock.

Amazon Bedrock Studio is now available in public preview – Amazon Bedrock Studio offers a web-based experience to accelerate the development of generative AI applications by providing a rapid prototyping environment with key Amazon Bedrock features, including Knowledge Bases, Agents, and Guardrails.

Amazon Bedrock Studio

Agents for Amazon Bedrock now supports Provisioned Throughput pricing model – As agentic applications scale, they require higher input and output model throughput compared to on-demand limits. The Provisioned Throughput pricing model makes it possible to purchase model units for the specific base model.

MongoDB Atlas is now available as a vector store in Knowledge Bases for Amazon Bedrock – With MongoDB Atlas vector store integration, you can build RAG solutions to securely connect your organization’s private data sources to foundation models (FMs) in Amazon Bedrock.

Amazon RDS for PostgreSQL supports pgvector 0.7.0 – You can use the open-source PostgreSQL extension for storing vector embeddings and add retrieval-augmented generation (RAG) capability in your generative AI applications. This release includes features that increase the number of dimensions of vectors you can index, reduce index size, and add support for using CPU SIMD in distance computations. Also, Amazon RDS Performance Insights now supports the Oracle Multitenant configuration on Amazon RDS for Oracle.

Amazon EC2 Inf2 instances are now available in new regions – These instances are optimized for generative AI workloads and are generally available in the Asia Pacific (Sydney), Europe (London), Europe (Paris), Europe (Stockholm), and South America (Sao Paulo) Regions.

New Generative Engine in Amazon Polly is now generally available – The generative engine in Amazon Polly is its most advanced text-to-speech (TTS) model and currently includes two American English voices, Ruth and Matthew, and one British English voice, Amy.

AWS Amplify Gen 2 is now generally available – AWS Amplify offers a code-first developer experience for building full-stack apps using TypeScript and enables developers to express app requirements like the data models, business logic, and authorization rules in TypeScript. AWS Amplify Gen 2 has added a number of features since the preview, including a new Amplify console with features such as custom domains, data management, and pull request (PR) previews.

Amazon EMR Serverless now includes performance monitoring of Apache Spark jobs with Amazon Managed Service for Prometheus – This lets you analyze, monitor, and optimize your jobs using job-specific engine metrics and information about Spark event timelines, stages, tasks, and executors. Also, Amazon EMR Studio is now available in the Asia Pacific (Melbourne) and Israel (Tel Aviv) Regions.

Amazon MemoryDB launched two new condition keys for IAM policies – The new condition keys let you create AWS Identity and Access Management (IAM) policies or Service Control Policies (SCPs) to enhance security and meet compliance requirements. Also, Amazon ElastiCache has updated its minimum TLS version to 1.2.

Amazon Lightsail now offers a larger instance bundle – This includes 16 vCPUs and 64 GB memory. You can now scale your web applications and run more compute and memory-intensive workloads in Lightsail.

Amazon Elastic Container Registry (ECR) adds pull through cache support for GitLab Container Registry – ECR customers can create a pull through cache rule that maps an upstream registry to a namespace in their private ECR registry. Once the rule is configured, images can be pulled through ECR from GitLab Container Registry. ECR automatically creates new repositories for cached images and keeps them in sync with the upstream registry.

AWS Resilience Hub expands application resilience drift detection capabilities – This new enhancement detects changes, such as the addition or deletion of resources within the application’s input sources.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional projects and blog posts that you might find interesting.

Building games with LLMs – Check out this fun experiment by Banjo Obayomi to generate Super Mario levels using different LLMs on Amazon Bedrock!

Troubleshooting with Amazon Q –  Ricardo Ferreira walks us through how he solved a nasty data serialization problem while working with Apache Kafka, Go, and Protocol Buffers.

Getting started with Amazon Q in VS Code – Check out this excellent step-by-step guide by Rohini Gaonkar that covers installing the extension for features like code completion, chat, and productivity-boosting capabilities powered by generative AI.

AWS open source news and updates – My colleague Ricardo writes about open source projects, tools, and events from the AWS Community. Check out Ricardo’s page for the latest updates.

Upcoming AWS events
Check your calendars and sign up for upcoming AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Bengaluru (May 15–16), Seoul (May 16–17), Hong Kong (May 22), Milan (May 23), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore 2.5 days of immersive cloud security learning in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Turkey (May 18), Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), Nigeria (August 24), and New York (August 28).

Browse all upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Abhishek

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

A new generative engine and three voices are now generally available on Amazon Polly

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/a-new-generative-engine-and-three-voices-are-now-generally-available-on-amazon-polly/

Today, we are announcing the general availability of the generative engine of Amazon Polly with three voices: Ruth and Matthew in American English and Amy in British English. The new generative engine was trained with publicly available and proprietary data, a variety of voices, languages, and styles. It performs with the highest precision to render context-dependent prosody, pausing, spelling, dialectal properties, foreign word pronunciation, and more.

Amazon Polly is a machine learning (ML) service that converts text to lifelike speech, called text-to-speech (TTS) technology. Now, Amazon Polly includes high-quality, natural-sounding human-like voices in dozens of languages, so you can select the ideal voice and distribute your speech-enabled applications in many locales or countries.

With Amazon Polly, you can select various voice options, including neural, long-form, and generative voices, which deliver ground-breaking improvements in speech quality and produce human-like, highly expressive, and emotionally adept voices. You can store speech output in standard formats like MP3 or OGG, adjust the speech rate, pitch, or volume with Speech Synthesis Markup Language (SSML) tags, and quickly deliver lifelike voices and conversational user experiences with consistently fast response times.

What’s the new generative engine?
Amazon Polly now supports four voice engines: standard, neural, long-form, and generative.

Standard TTS voices, introduced in 2016, use traditional concatenative synthesis. This method strings together the phonemes of recorded speech, producing very natural-sounding synthesized speech. However, the inevitable variations in speech and the techniques used to segment the waveforms limit the quality of speech.

Neural TTS (NTTS) voices, introduced in 2019, use a sequence-to-sequence neural network that converts a sequence of phonemes into spectrograms, and a neural vocoder that converts the spectrograms into a continuous audio signal. NTTS produces even higher-quality, human-like voices than the standard voices.

Long-form voices, introduced in 2023, are developed with cutting-edge deep learning TTS technology and designed to captivate listeners’ attention for longer content, such as news articles, training materials, or marketing videos.

In February 2024, Amazon scientists introduced a new research TTS model called Big Adaptive Streamable TTS with Emergent abilities (BASE). With this technology, the Polly generative engine is able to create human-like, synthetically generated voices. You can use these voices as a knowledgeable customer assistant, a virtual trainer, or an experienced marketer.

Here are the new generative voices:

Name | Locale | Gender | Language | Sample prompt
Ruth | en_US | Female | English (US) | Selma was lying on the ground halfway down the steps. 'Selma! Selma!' we shouted in panic.
Matthew | en_US | Male | English (US) | The guards were standing outside with some of our neighbours, listening to a transistor radio. 'Any good news?' I asked. 'No, we're listening to the names of people who were killed yesterday,' Bruno replied.
Amy | en_GB | Female | English (British) | 'What are you looking at?' he said as he stood over me. They got off the bus and started searching the baggage compartment. The tension on the bus was like a dark, menacing cloud that hovered above us.

You can choose from these voice options to suit your application and use case. To learn more about the generative engine, visit Generative voices in the AWS documentation.

Get started with using generative voices
You can access the new voices using the AWS Management Console, AWS Command Line Interface (AWS CLI), or the AWS SDKs.

To get started, go to the Amazon Polly console in the US East (N. Virginia) Region and choose the Text-to-Speech menu in the left pane. If you select the voice of Ruth or Matthew in the language of English, US or Amy in English, UK, you can choose Generative engine. Input your text and listen to or download the generated voice output.

Using the CLI, you can list the voices that use the new generative engine:

$ aws polly describe-voices --output json --region us-east-1 \
| jq -r '.Voices[] | select(.SupportedEngines | index("generative")) | .Name'

Matthew
Amy
Ruth

Now, run the synthesize-speech CLI command to synthesize sample text to an audio file (hello.mp3) with the parameters of generative engine and a supported voice ID.

$ aws polly synthesize-speech --output-format mp3 --region us-east-1 \
  --text "Hello. This is my first generative voices!" \
  --voice-id Matthew --engine generative hello.mp3
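The same request can be made from Python. The sketch below wraps the AWS SDK for Python (Boto3) synthesize_speech call in a hypothetical helper, synthesize_to_file, which takes an already-created Polly client so that the function can also be exercised with a stand-in client during testing.

```python
from pathlib import Path


def synthesize_to_file(polly_client, text, out_path, voice_id="Matthew"):
    """Synthesize text with the generative engine and save the MP3 to out_path.

    Hypothetical helper: polly_client is expected to be a Boto3 Polly client
    (or any object exposing a compatible synthesize_speech method).
    Returns the number of audio bytes written.
    """
    response = polly_client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,     # Ruth, Matthew, or Amy, per the voice list above
        Engine="generative",
    )
    # The audio comes back as a streaming body, so read it fully before writing.
    audio = response["AudioStream"].read()
    Path(out_path).write_bytes(audio)
    return len(audio)


# Typical use (requires boto3 and AWS credentials):
#   import boto3
#   polly = boto3.client("polly", region_name="us-east-1")
#   synthesize_to_file(polly, "Hello. This is my first generative voice!", "hello.mp3")
```

Passing the client in, instead of creating it inside the function, keeps the Region and credentials choice with the caller.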

To see more code examples using AWS SDKs, visit Code and Application Examples in the AWS documentation. You can use Java and Python code examples, application examples such as web applications using Java or Python, or iOS and Android applications.

Now available
The new generative voices of Amazon Polly are available today in the US East (N. Virginia) Region. You only pay for what you use based on the number of characters of text that you convert to speech. To learn more, visit our Amazon Polly Pricing page.

Give new generative voices a try in the Amazon Polly console today and send feedback to AWS re:Post for Amazon Polly or through your usual AWS Support contacts.

Channy

Build RAG and agent-based generative AI applications with new Amazon Titan Text Premier model, available in Amazon Bedrock

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/build-rag-and-agent-based-generative-ai-applications-with-new-amazon-titan-text-premier-model-available-in-amazon-bedrock/

Today, we’re happy to welcome a new member of the Amazon Titan family of models: Amazon Titan Text Premier, now available in Amazon Bedrock.

Following Amazon Titan Text Lite and Titan Text Express, Titan Text Premier is the latest large language model (LLM) in the Amazon Titan family of models, further increasing your model choice within Amazon Bedrock. You can now choose between the following Titan Text models in Bedrock:

  • Titan Text Premier is the most advanced Titan LLM for text-based enterprise applications. With a maximum context length of 32K tokens, it has been specifically optimized for enterprise use cases, such as building Retrieval Augmented Generation (RAG) and agent-based applications with Knowledge Bases and Agents for Amazon Bedrock. As with all Titan LLMs, Titan Text Premier has been pre-trained on multilingual text data but is best suited for English-language tasks. You can further custom fine-tune (preview) Titan Text Premier with your own data in Amazon Bedrock to build applications that are specific to your domain, organization, brand style, and use case. I’ll dive deeper into model highlights and performance in the following sections of this post.
  • Titan Text Express is ideal for a wide range of tasks, such as open-ended text generation and conversational chat. The model has a maximum context length of 8K tokens.
  • Titan Text Lite is optimized for speed, is highly customizable, and is ideal to be fine-tuned for tasks such as article summarization and copywriting. The model has a maximum context length of 4K tokens.
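As a rough illustration of how the context lengths above might feed into model choice, the following sketch returns the smallest Titan Text model whose context window fits a request. It rests on two assumptions that the list does not state: that "K" means 1,024 tokens, and that prompt and output tokens both count toward the limit. Context length is only one factor, so the task fit described in each bullet still matters.

```python
# Context limits taken from the list above, ordered from smallest to largest.
# Assumption: "K" is read as 1,024 tokens.
TITAN_TEXT_CONTEXT_TOKENS = [
    ("Titan Text Lite", 4 * 1024),
    ("Titan Text Express", 8 * 1024),
    ("Titan Text Premier", 32 * 1024),
]


def smallest_fitting_titan_model(prompt_tokens, max_output_tokens=0):
    """Return the smallest Titan Text model whose context fits the request.

    Assumption: prompt and output tokens share the same context budget.
    Returns None when the request exceeds the largest context length.
    """
    needed = prompt_tokens + max_output_tokens
    for name, limit in TITAN_TEXT_CONTEXT_TOKENS:
        if needed <= limit:
            return name
    return None


if __name__ == "__main__":
    print(smallest_fitting_titan_model(6000, 512))
```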

Now, let’s discuss Titan Text Premier in more detail.

Amazon Titan Text Premier model highlights
Titan Text Premier has been optimized for high-quality RAG and agent-based applications and customization through fine-tuning while incorporating responsible artificial intelligence (AI) practices.

Optimized for RAG and agent-based applications – Titan Text Premier has been specifically optimized for RAG and agent-based applications in response to customer feedback, where respondents named RAG as one of their key components in building generative AI applications. The model training data includes examples for tasks like summarization, Q&A, and conversational chat and has been optimized for integration with Knowledge Bases and Agents for Amazon Bedrock. The optimization includes training the model to handle the nuances of these features, such as their specific prompt formats.

  • High-quality RAG through integration with Knowledge Bases for Amazon Bedrock – With a knowledge base, you can securely connect foundation models (FMs) in Amazon Bedrock to your company data for RAG. You can now choose Titan Text Premier with Knowledge Bases to implement question-answering and summarization tasks over your company’s proprietary data.
    Amazon Titan Text Premier support in Knowledge Bases
  • Automating tasks through integration with Agents for Amazon Bedrock – You can also create custom agents that can perform multistep tasks across different company systems and data sources using Titan Text Premier with Agents for Amazon Bedrock. Using agents, you can automate tasks for your internal or external customers, such as managing retail orders or processing insurance claims.
    Amazon Titan Text Premier with Agents for Amazon Bedrock

We already see customers exploring Titan Text Premier to implement interactive AI assistants that create summaries from unstructured data such as emails. They’re also exploring the model to extract relevant information across company systems and data sources to create more meaningful product summaries.

Here’s a demo video created by my colleague Brooke Jamieson that shows an example of how you can put Titan Text Premier to work for your business.

Custom fine-tuning of Amazon Titan Text Premier (preview) – You can fine-tune Titan Text Premier with your own data in Amazon Bedrock to increase model accuracy by providing your own task-specific labeled training dataset. Customizing Titan Text Premier helps to further specialize your model and create unique user experiences that reflect your company’s brand, style, voice, and services.

Built responsibly – Amazon Titan Text Premier incorporates safe, secure, and trustworthy practices. The AWS AI Service Card for Amazon Titan Text Premier documents the model’s performance across key responsible AI benchmarks from safety and fairness to veracity and robustness. The model also integrates with Guardrails for Amazon Bedrock so you can implement additional safeguards customized to your application requirements and responsible AI policies. Amazon indemnifies customers who responsibly use Amazon Titan models against claims that generally available Amazon Titan models or their outputs infringe on third-party copyrights.

Amazon Titan Text Premier model performance
Titan Text Premier has been built to deliver broad intelligence and utility relevant for enterprises. The following table shows evaluation results on public benchmarks that assess critical capabilities, such as instruction following, reading comprehension, and multistep reasoning against price-comparable models. The strong performance across these diverse and challenging benchmarks highlights that Titan Text Premier is built to handle a wide range of use cases in enterprise applications, offering great price performance. For all benchmarks listed below, a higher score is a better score.

Capability | Benchmark | Description | Amazon Titan Text Premier | Google Gemini Pro 1.0 | OpenAI GPT-3.5
--- | --- | --- | --- | --- | ---
General | MMLU (Paper) | Representation of questions in 57 subjects | 70.4% (5-shot) | 71.8% (5-shot) | 70.0% (5-shot)
Instruction following | IFEval (Paper) | Instruction-following evaluation for large language models | 64.6% (0-shot) | not published | not published
Reading comprehension | RACE-H (Paper) | Large-scale reading comprehension | 89.7% (5-shot) | not published | not published
Reasoning | HellaSwag (Paper) | Common-sense reasoning | 92.6% (10-shot) | 84.7% (10-shot) | 85.5% (10-shot)
Reasoning | DROP, F1 score (Paper) | Reasoning over text | 77.9 (3-shot) | 74.1 (Variable Shots) | 64.1 (3-shot)
Reasoning | BIG-Bench Hard (Paper) | Challenging tasks requiring multistep reasoning | 73.7% (3-shot CoT) | 75.0% (3-shot CoT) | not published
Reasoning | ARC-Challenge (Paper) | Common-sense reasoning | 85.8% (5-shot) | not published | 85.2% (25-shot)

Note: Benchmarks evaluate model performance using a variation of few-shot and zero-shot prompting. With few-shot prompting, you provide the model with a number of concrete examples (three for 3-shot, five for 5-shot, etc.) of how to solve a specific task. This demonstrates the model’s ability to learn from example, called in-context learning. With zero-shot prompting on the other hand, you evaluate a model’s ability to perform tasks by relying only on its preexisting knowledge and general language understanding without providing any examples.
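To make the difference concrete, here is a minimal sketch of how a zero-shot prompt and a 3-shot prompt could be assembled. The function name build_prompt and the Q/A layout are illustrative assumptions, not the format used by any of the benchmarks listed above.

```python
def build_prompt(question, examples=()):
    """Return a zero-shot prompt (no examples) or a k-shot prompt (k examples)."""
    parts = []
    # Each solved example is shown to the model before the real question.
    for example_question, example_answer in examples:
        parts.append(f"Q: {example_question}\nA: {example_answer}")
    # The final question is left unanswered for the model to complete.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


# Zero-shot: the model sees only the question.
zero_shot = build_prompt("What is 7 + 5?")

# 3-shot: the model first sees three worked examples.
three_shot = build_prompt(
    "What is 7 + 5?",
    examples=[
        ("What is 1 + 1?", "2"),
        ("What is 2 + 3?", "5"),
        ("What is 4 + 4?", "8"),
    ],
)
```

The only difference between the two prompts is the number of worked examples placed ahead of the question, which is what the shot count in the table refers to.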

Get started with Amazon Titan Text Premier
To enable access to Amazon Titan Text Premier, navigate to the Amazon Bedrock console and choose Model access on the bottom left pane. On the Model access overview page, choose the Manage model access button in the upper right corner and enable access to Amazon Titan Text Premier.

Select Amazon Titan Text Premier in Amazon Bedrock model access page

To use Amazon Titan Text Premier in the Bedrock console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model and select Amazon as the category and Titan Text Premier as the model. To explore the model, you can load examples. The following screenshot shows one of those examples that demonstrates the model’s chain of thought (CoT) and reasoning capabilities.

Amazon Titan Text Premier in the Amazon Bedrock chat playground

By choosing View API request, you can get a code example of how to invoke the model using the AWS Command Line Interface (AWS CLI) with the current example prompt. You can also access Amazon Bedrock and available models using the AWS SDKs. In the following example, I will use the AWS SDK for Python (Boto3).

Amazon Titan Text Premier in action
For this demo, I ask Amazon Titan Text Premier to summarize one of my previous AWS News Blog posts that announced the availability of Amazon Titan Image Generator and the watermark detection feature.

For summarization tasks, a recommended prompt template looks like this:

The following is text from a {{Text Category}}:
{{Text}}
Summarize the {{Text Category}} in {{length of summary}}

For more prompting best practices, check out the Amazon Titan Text Prompt Engineering Guidelines.
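As a minimal sketch, the template above can also be filled in programmatically. The helper name fill_summary_template, the placeholder names, and the sample inputs are hypothetical and only illustrate the substitution.

```python
# The three placeholders mirror {{Text Category}}, {{Text}}, and
# {{length of summary}} from the recommended template.
SUMMARY_TEMPLATE = (
    "The following is text from a {text_category}:\n"
    "{text}\n"
    "Summarize the {text_category} in {length_of_summary}"
)


def fill_summary_template(text_category, text, length_of_summary):
    """Substitute the three placeholders of the summarization template."""
    return SUMMARY_TEMPLATE.format(
        text_category=text_category,
        text=text,
        length_of_summary=length_of_summary,
    )


# Hypothetical inputs, for illustration only.
example_prompt = fill_summary_template(
    text_category="meeting transcript",
    text="Alice: Let's ship on Friday. Bob: Agreed.",
    length_of_summary="two sentences",
)
```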

I adapt this template to my example and define the prompt. In preparation, I saved my News Blog post as a text file and read it into the post string variable.

prompt = """
The following is text from a AWS News Blog post:

<text>
%s
</text>

Summarize the above AWS News Blog post in a short paragraph.
""" % post

Similar to previous Amazon Titan Text models, Amazon Titan Text Premier supports temperature and topP inference parameters to control the randomness and diversity of the response, as well as maxTokenCount and stopSequences to control the length of the response.

import boto3
import json

bedrock_runtime = boto3.client(service_name="bedrock-runtime")

body = json.dumps({
    "inputText": prompt, 
    "textGenerationConfig":{  
        "maxTokenCount":256,
        "stopSequences":[],
        "temperature":0,
        "topP":0.9
    }
})

Then, I use the InvokeModel API to send the inference request.

response = bedrock_runtime.invoke_model(
    body=body,
    modelId="amazon.titan-text-premier-v1:0",
    accept="application/json",
    contentType="application/json"
)

response_body = json.loads(response.get('body').read())
print(response_body.get('results')[0].get('outputText'))

And here’s the response:

Amazon Titan Image Generator is now generally available in Amazon Bedrock, giving you an easy way to build and scale generative AI applications with new image generation and image editing capabilities, including instant customization of images. Watermark detection for Titan Image Generator is now generally available in the Amazon Bedrock console. Today, we’re also introducing a new DetectGeneratedContent API (preview) in Amazon Bedrock that checks for the existence of this watermark and helps you confirm whether an image was generated by Titan Image Generator.

For more examples in different programming languages, check out the code examples section in the Amazon Bedrock User Guide.
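If you call the model from several places, the request and response handling shown above could be wrapped in two small helpers. This is a sketch under assumptions: the helper names are hypothetical, the default values simply repeat the ones used in this demo, and the sample response body is hand-written rather than real model output.

```python
import json


def build_titan_request(prompt, max_token_count=256, temperature=0.0,
                        top_p=0.9, stop_sequences=None):
    """Serialize a request body in the shape used earlier in this post."""
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {
            "maxTokenCount": max_token_count,
            "stopSequences": list(stop_sequences or []),
            "temperature": temperature,
            "topP": top_p,
        },
    })


def extract_output_text(response_body):
    """Return the first generated text, or raise if the body has no results."""
    results = response_body.get("results") or []
    if not results:
        raise ValueError("response body contains no results")
    return results[0].get("outputText", "")


# Hand-written stand-in for a parsed response body, for illustration only.
sample_body = {"results": [{"outputText": "A short summary."}]}
summary = extract_output_text(sample_body)
```

The string returned by build_titan_request can be passed as the body argument of invoke_model, and extract_output_text can be applied to the parsed response body.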

More resources
Here are some additional resources that you might find helpful:

Intended use cases and more — Check out the AWS AI Service Card for Amazon Titan Text Premier to learn more about the models’ intended use cases, design, and deployment, as well as performance optimization best practices.

AWS Generative AI CDK Constructs — Amazon Titan Text Premier is supported by the AWS Generative AI CDK Constructs, an open source extension of the AWS Cloud Development Kit (AWS CDK), providing sample implementations of AWS CDK for common generative AI patterns.

Amazon Titan models — If you’re curious to learn more about Amazon Titan models in general, check out the following video. Dr. Sherry Marcus, Director of Applied Science for Amazon Bedrock, shares how the Amazon Titan family of models incorporates the 25 years of experience Amazon has innovating with AI and machine learning (ML) across its business.

Now available
Amazon Titan Text Premier is available today in the AWS US East (N. Virginia) Region. Custom fine-tuning for Amazon Titan Text Premier is available today in preview in the AWS US East (N. Virginia) Region. Check the full Region list for future updates. To learn more about the Amazon Titan family of models, visit the Amazon Titan product page. For pricing details, review the Amazon Bedrock pricing page.

Give Amazon Titan Text Premier a try in the Amazon Bedrock console today, send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS contacts, and engage with the generative AI builder community at community.aws.

— Antje

Build generative AI applications with Amazon Bedrock Studio (preview)

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/build-generative-ai-applications-with-amazon-bedrock-studio-preview/

Today, we’re introducing Amazon Bedrock Studio, a new web-based generative artificial intelligence (generative AI) development experience, in public preview. Amazon Bedrock Studio accelerates the development of generative AI applications by providing a rapid prototyping environment with key Amazon Bedrock features, including Knowledge Bases, Agents, and Guardrails.

As a developer, you can now use your company’s single sign-on credentials to sign in to Bedrock Studio and start experimenting. You can build applications using a wide array of top performing models, evaluate, and share your generative AI apps within Bedrock Studio. The user interface guides you through various steps to help improve a model’s responses. You can experiment with model settings, and securely integrate your company data sources, tools, and APIs, and set guardrails. You can collaborate with team members to ideate, experiment, and refine your generative AI applications—all without requiring advanced machine learning (ML) expertise or AWS Management Console access.

As an Amazon Web Services (AWS) administrator, you can be confident that developers will only have access to the features provided by Bedrock Studio, and won’t have broader access to AWS infrastructure and services.

Amazon Bedrock Studio

Now, let me show you how to get started with Amazon Bedrock Studio.

Get started with Amazon Bedrock Studio
As an AWS administrator, you first need to create an Amazon Bedrock Studio workspace, then select and add users you want to give access to the workspace. Once the workspace is created, you can share the workspace URL with the respective users. Users with access privileges can sign in to the workspace using single sign-on, create projects within their workspace, and start building generative AI applications.

Create Amazon Bedrock Studio workspace
Navigate to the Amazon Bedrock console and choose Bedrock Studio on the bottom left pane.

Amazon Bedrock Studio in the Bedrock console

Before creating a workspace, you need to configure and secure the single sign-on integration with your identity provider (IdP) using the AWS IAM Identity Center. For detailed instructions on how to configure various IdPs, such as AWS Directory Service for Microsoft Active Directory, Microsoft Entra ID, or Okta, check out the AWS IAM Identity Center User Guide. For this demo, I configured user access with the default IAM Identity Center directory.

Next, choose Create workspace, enter your workspace details, and create any required AWS Identity and Access Management (IAM) roles.

If you want, you can also select default generative AI models and embedding models for the workspace. Once you’re done, choose Create.

Next, select the created workspace.

Amazon Bedrock Studio, workspace created

Then, choose User management and Add users or groups to select the users you want to give access to this workspace.

Add users to your Amazon Bedrock Studio workspace

Back in the Overview tab, you can now copy the Bedrock Studio URL and share it with your users.

Amazon Bedrock Studio, share workspace URL

Build generative AI applications using Amazon Bedrock Studio
As a builder, you can now navigate to the provided Bedrock Studio URL and sign in with your single sign-on user credentials. Welcome to Amazon Bedrock Studio! Let me show you how to choose from industry leading FMs, bring your own data, use functions to make API calls, and safeguard your applications using guardrails.

Choose from multiple industry leading FMs
By choosing Explore, you can start selecting available FMs and explore the models using natural language prompts.

Amazon Bedrock Studio UI

If you choose Build, you can start building generative AI applications in a playground mode, experiment with model configurations, iterate on system prompts to define the behavior of your application, and prototype new features.

Amazon Bedrock Studio - start building applications

Bring your own data
With Bedrock Studio, you can securely bring your own data to customize your application by providing a single file or by selecting a knowledge base created in Amazon Bedrock.

Amazon Bedrock Studio - start building applications

Use functions to make API calls and make model responses more relevant
A function call allows the FM to dynamically access and incorporate external data or capabilities when responding to a prompt. The model determines which function it needs to call based on an OpenAPI schema that you provide.

Functions enable a model to include information in its response that it doesn’t have direct access to or prior knowledge of. For example, a function could allow the model to retrieve and include the current weather conditions in its response, even though the model itself doesn’t have that information stored.
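To illustrate the weather example, here is a minimal sketch of what such an OpenAPI schema could look like, written as a Python dictionary. The API title, the /weather path, the getCurrentWeather operation, and the city parameter are all hypothetical; the exact schema requirements are described in the Bedrock Studio documentation.

```python
import json

# Hypothetical OpenAPI 3.0 description of a single weather lookup operation.
WEATHER_API_SCHEMA = {
    "openapi": "3.0.0",
    "info": {"title": "Weather API (hypothetical)", "version": "1.0.0"},
    "paths": {
        "/weather": {
            "get": {
                "operationId": "getCurrentWeather",
                "description": "Get the current weather conditions for a city.",
                "parameters": [
                    {
                        "name": "city",
                        "in": "query",
                        "required": True,
                        "schema": {"type": "string"},
                    }
                ],
                "responses": {
                    "200": {"description": "Current weather conditions"}
                },
            }
        }
    },
}


def list_operations(schema):
    """Map each operationId in the schema to its (HTTP method, path) pair."""
    operations = {}
    for path, methods in schema["paths"].items():
        for method, details in methods.items():
            operations[details["operationId"]] = (method.upper(), path)
    return operations


operations = list_operations(WEATHER_API_SCHEMA)
schema_json = json.dumps(WEATHER_API_SCHEMA, indent=2)
```

The description fields matter here, because they are what the model can rely on to decide whether an operation is relevant to a prompt.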

Amazon Bedrock Studio - Add functions

Safeguard your applications using Guardrails for Amazon Bedrock
You can create guardrails to promote safe interactions between users and your generative AI applications by implementing safeguards customized to your use cases and responsible AI policies.

Amazon Bedrock Studio - Add Guardrails

When you create applications in Amazon Bedrock Studio, the corresponding managed resources such as knowledge bases, agents, and guardrails are automatically deployed in your AWS account. You can use the Amazon Bedrock API to access those resources in downstream applications.
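As one possible example of downstream access, the following sketch sends a prompt to a deployed agent through the InvokeAgent API of the AWS SDK for Python (Boto3). The helper name ask_agent and the placeholder IDs are assumptions; you would need to look up the actual agent ID and alias ID in your account.

```python
import uuid


def ask_agent(client, agent_id, agent_alias_id, prompt):
    """Send one prompt to an agent and join the streamed text chunks."""
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=str(uuid.uuid4()),  # a fresh session for each call
        inputText=prompt,
    )
    text_parts = []
    # The completion is an event stream; only chunk events carry answer text.
    for event in response["completion"]:
        chunk = event.get("chunk")
        if chunk:
            text_parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(text_parts)


# Usage against AWS (requires boto3, credentials, and real IDs):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   print(ask_agent(client, "AGENT_ID", "AGENT_ALIAS_ID", "Where is my order?"))
```

Passing the client in as an argument keeps the helper easy to test with a stub and lets you reuse one client across calls.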

Here’s a short demo video of Amazon Bedrock Studio created by my colleague Banjo Obayomi.

Join the preview
Amazon Bedrock Studio is available today in public preview in AWS Regions US East (N. Virginia) and US West (Oregon). To learn more, visit the Amazon Bedrock Studio page and User Guide.

Give Amazon Bedrock Studio a try today and let us know what you think! Send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS contacts, and engage with the generative AI builder community at community.aws.

— Antje

AWS Weekly Roundup: Amazon Q, Amazon QuickSight, AWS CodeArtifact, Amazon Bedrock, and more (May 6, 2024)

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-q-amazon-quicksight-aws-codeartifact-amazon-bedrock-and-more-may-6-2024/

April has been packed with new releases! Last week continued that trend with many new releases supporting a variety of domains such as security, analytics, devops, and many more, as well as more exciting new capabilities within generative AI.

If you missed the AWS Summit London 2024, you can now watch the sessions on demand, including the keynote by Tanuja Randery, VP & Managing Director, EMEA, and many of the break-out sessions, which will continue to be released over the coming weeks.

Last week’s launches
Here are some of the highlights that caught my attention this week:

Manual and automatic rollback from any stage in AWS CodePipeline – You can now roll back any stage, other than Source, to any previously known good state if you use a V2 pipeline in AWS CodePipeline. You can configure automatic rollback, which uses the source changes from the most recent successful pipeline execution in the case of failure, or you can initiate a manual rollback for any stage from the console, API, or SDK and choose which pipeline execution you want to use for the rollback.

AWS CodeArtifact now supports RubyGems – Ruby community, rejoice, you can now store your gems in AWS CodeArtifact! You can integrate it with RubyGems.org, and CodeArtifact will automatically fetch any gems requested by the client and store them locally in your CodeArtifact repository. That means that you can have a centralized place for both your first-party and public gems so developers can access their dependencies from a single source.

Ruby-repo screenshot

Create a repository in AWS CodeArtifact and choose “rubygems-store” to connect your repository to RubyGems.org on the “Public upstream repositories” dropdown.

Amazon EventBridge Pipes now supports event delivery through AWS PrivateLink – You can now deliver events to an Amazon EventBridge Pipes target without traversing the public internet by using AWS PrivateLink. You can poll for events in a private subnet in your Amazon Virtual Private Cloud (VPC) without having to deploy any additional infrastructure to keep your traffic private.

Amazon Bedrock launches continue. You can now run scalable, enterprise-grade generative AI workloads with Cohere Command R & R+. And Amazon Titan Text V2 is now optimized for improving Retrieval-Augmented Generation (RAG).

AWS Trusted Advisor – last year we launched Trusted Advisor APIs enabling you to programmatically consume recommendations. A new API is available now that you can use to exclude resources from recommendations.

Amazon EC2 – there have been two new great launches this week for EC2 users. You can now mark your AMIs as “protected” to avoid them being deregistered by accident. You can also now easily discover your active AMIs by simply describing them.

Amazon CodeCatalyst – you can now view your git commit history in the CodeCatalyst console.

General Availability
Many new services and capabilities became generally available this week.

Amazon Q in QuickSight – Amazon Q has brought generative BI to Amazon QuickSight, giving you the ability to build beautiful dashboards automatically using natural language, and it’s now generally available. To get started, head to the QuickSight Pricing page to explore all options, or start a 30-day free trial, which allows up to 4 users per QuickSight account to use all the new generative AI features.

With the new generative AI features enabled by Amazon Q in Amazon QuickSight you can use natural language queries to build, sort and filter dashboards. (source: AWS Documentation)

Amazon Q Business (GA) and Amazon Q Apps (Preview) – Also generally available now is Amazon Q Business which we launched last year at AWS re:Invent 2023 with the ability to connect seamlessly with over 40 popular enterprise systems, including Microsoft 365, Salesforce, Amazon Simple Storage Service (Amazon S3), Gmail, and so many more. This allows Amazon Q Business to know about your business so your employees can generate content, solve problems, and take actions that are specific to your business.

We have also launched support for custom plug-ins, so now you can create your own integrations with any third-party application.

Q-business screenshot

With general availability of Amazon Q Business we have also launched the ability to create your own custom plugins to connect to any third-party API.

Another highlight of this release is the launch of Amazon Q Apps, which enables you to quickly generate an app from your conversation with Amazon Q Business, or by describing what you would like it to generate for you. All guardrails from Amazon Q Business apply, and it’s easy to share your apps with colleagues through an admin-managed library. Amazon Q Apps is in preview now.

Check out Channy Yun’s post for a deeper dive into Amazon Q Business and Amazon Q Apps, which guides you through these new features.

Amazon Q Developer – you can use Q Developer to completely change your developer flow. It has all the capabilities of what was previously known as Amazon CodeWhisperer, such as Q&A, diagnosing common errors, generating code including tests, and many more. Now it has expanded, so you can use it to generate SQL, and build data integration pipelines using natural language. In preview, it can describe resources in your AWS account and help you retrieve and analyze cost data from AWS Cost Explorer.

For a full list of AWS announcements, be sure to keep an eye on the ‘What’s New with AWS?‘ page.

Other AWS news
Here are some additional projects, blog posts, and news items that you might find interesting:

AWS open source news and updates – My colleague Ricardo writes about open source projects, tools, and events from the AWS Community.

Discover Claude 3 – If you’re a developer looking for a good source to get started with Claude 3, then I recommend this great post from my colleague Haowen Huang: Mastering Amazon Bedrock with Claude 3: Developer’s Guide with Demos.

Upcoming AWS events
Check your calendars and sign up for upcoming AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Singapore (May 7), Seoul (May 16–17), Hong Kong (May 22), Milan (May 23), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore 2.5 days of immersive cloud security learning in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Turkey (May 18), Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), Nigeria (August 24), and New York (August 28).

GOTO EDA Day London – Join us in London on May 14 to learn about event-driven architectures (EDA) for building highly scalable, fault tolerant, and extensible applications. This conference is organized by GOTO, AWS, and partners.

Browse all upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Matheus Guimaraes

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Build RAG applications with MongoDB Atlas, now available in Knowledge Bases for Amazon Bedrock

Post Syndicated from Abhishek Gupta original https://aws.amazon.com/blogs/aws/build-rag-applications-with-mongodb-atlas-now-available-in-knowledge-bases-for-amazon-bedrock/

Foundation models (FMs) are trained on large volumes of data and use billions of parameters. However, to answer customers’ questions related to domain-specific private data, they need to reference an authoritative knowledge base outside of the model’s training data sources. This is commonly achieved using a technique known as Retrieval Augmented Generation (RAG). By fetching data from the organization’s internal or proprietary sources, RAG extends the capabilities of FMs to specific domains, without needing to retrain the model. It is a cost-effective approach to improving model output so it remains relevant, accurate, and useful in various contexts.

Knowledge Bases for Amazon Bedrock is a fully managed capability that helps you implement the entire RAG workflow from ingestion to retrieval and prompt augmentation without having to build custom integrations to data sources and manage data flows.

Today, we are announcing the availability of MongoDB Atlas as a vector store in Knowledge Bases for Amazon Bedrock. With MongoDB Atlas vector store integration, you can build RAG solutions to securely connect your organization’s private data sources to FMs in Amazon Bedrock. This integration adds to the list of vector stores supported by Knowledge Bases for Amazon Bedrock, including Amazon Aurora PostgreSQL-Compatible Edition, vector engine for Amazon OpenSearch Serverless, Pinecone, and Redis Enterprise Cloud.

Build RAG applications with MongoDB Atlas and Knowledge Bases for Amazon Bedrock
Vector Search in MongoDB Atlas is powered by the vectorSearch index type. In the index definition, you must specify the field that contains the vector data as the vector type. Before using MongoDB Atlas vector search in your application, you will need to create an index, ingest source data, create vector embeddings and store them in a MongoDB Atlas collection. To perform queries, you will need to convert the input text into a vector embedding, and then use an aggregation pipeline stage to perform vector search queries against fields indexed as the vector type in a vectorSearch type index.
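For reference, the query side of these manual steps can be sketched as an aggregation pipeline that starts with a $vectorSearch stage. The helper name, the index name vector_index, and the field names embedding and text are hypothetical, and the three-element query vector is only a stand-in for a real embedding, which must have as many dimensions as the index.

```python
def build_vector_search_pipeline(query_vector, index_name, vector_path,
                                 text_field, limit=5, num_candidates=100):
    """Build an aggregation pipeline whose first stage is $vectorSearch."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,        # name of the vectorSearch index
                "path": vector_path,        # field indexed as the vector type
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            # Keep only the text and the similarity score of each match.
            "$project": {
                "_id": 0,
                text_field: 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_vector_search_pipeline(
    query_vector=[0.1, 0.2, 0.3],
    index_name="vector_index",
    vector_path="embedding",
    text_field="text",
)

# With a driver such as PyMongo, the pipeline would be run as:
#   results = collection.aggregate(pipeline)
```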

Thanks to the MongoDB Atlas integration with Knowledge Bases for Amazon Bedrock, most of the heavy lifting is taken care of. Once the vector search index and knowledge base are configured, you can incorporate RAG into your applications. Behind the scenes, Amazon Bedrock will convert your input (prompt) into embeddings, query the knowledge base, augment the FM prompt with the search results as contextual information and return the generated response.

Let me walk you through the process of setting up MongoDB Atlas as a vector store in Knowledge Bases for Amazon Bedrock.

Configure MongoDB Atlas
Start by creating a MongoDB Atlas cluster on AWS. Choose an M10 dedicated cluster tier. Once the cluster is provisioned, create a database and collection. Next, create a database user and grant it the Read and write to any database role. Select Password as the Authentication Method. Finally, configure network access to modify the IP Access List – add IP address 0.0.0.0/0 to allow access from anywhere.

Use the following index definition to create the Vector Search index:

{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "AMAZON_BEDROCK_CHUNK_VECTOR",
      "similarity": "cosine",
      "type": "vector"
    },
    {
      "path": "AMAZON_BEDROCK_METADATA",
      "type": "filter"
    },
    {
      "path": "AMAZON_BEDROCK_TEXT_CHUNK",
      "type": "filter"
    }
  ]
}

Configure the knowledge base
Create an AWS Secrets Manager secret to securely store the MongoDB Atlas database user credentials. Choose Other as the Secret type. Create an Amazon Simple Storage Service (Amazon S3) storage bucket and upload the Amazon Bedrock documentation user guide PDF. Later, you will use the knowledge base to ask questions about Amazon Bedrock.

You can also use another document of your choice because Knowledge Bases for Amazon Bedrock supports multiple file formats (including text, HTML, and CSV).
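If you prefer to script the secret creation, a minimal sketch with the AWS SDK for Python (Boto3) could look like the following. The helper names are hypothetical, and the key names username and password are an assumption that you should confirm against the Amazon Bedrock User Guide.

```python
import json


def build_atlas_secret_string(username, password):
    """Serialize the database user credentials as a JSON secret string."""
    # Assumption: the knowledge base expects keys named "username" and
    # "password"; check the Amazon Bedrock User Guide before relying on this.
    return json.dumps({"username": username, "password": password})


def create_atlas_secret(client, name, username, password):
    """Create the secret and return its ARN (client is a Secrets Manager client)."""
    response = client.create_secret(
        Name=name,
        SecretString=build_atlas_secret_string(username, password),
    )
    return response["ARN"]


# Usage against AWS (requires boto3 and credentials):
#   import boto3
#   client = boto3.client("secretsmanager")
#   arn = create_atlas_secret(client, "my-atlas-secret", "db_user", "db_password")
```

The returned ARN is the value you provide later when you configure the vector store.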

Navigate to the Amazon Bedrock console and refer to the Amazon Bedrock User Guide to configure the knowledge base. In the Select embeddings model and configure vector store step, choose Titan Embeddings G1 – Text as the embedding model. From the list of databases, choose MongoDB Atlas.

Enter the basic information for the MongoDB Atlas cluster (Hostname, Database name, etc.) as well as the ARN of the AWS Secrets Manager secret you had created earlier. In the Metadata field mapping attributes, enter the vector store specific details. They should match the vector search index definition you used earlier.

Initiate the knowledge base creation. Once complete, synchronize the data source (S3 bucket data) with the MongoDB Atlas vector search index.

Once the synchronization is complete, navigate to MongoDB Atlas to confirm that the data has been ingested into the collection you created.

Notice the following attributes in each of the MongoDB Atlas documents:

  • AMAZON_BEDROCK_TEXT_CHUNK – Contains the raw text for each data chunk.
  • AMAZON_BEDROCK_CHUNK_VECTOR – Contains the vector embedding for the data chunk.
  • AMAZON_BEDROCK_METADATA – Contains additional data for source attribution and rich query capabilities.

Test the knowledge base
It’s time to ask questions about Amazon Bedrock by querying the knowledge base. You will need to choose a foundation model. I picked Claude v2 in this case and used “What is Amazon Bedrock” as my input (query).

If you are using a different source document, adjust the questions accordingly.

You can also change the foundation model. For example, I switched to Claude 3 Sonnet. Notice the difference in the output and select Show source details to see the chunks cited for each footnote.

Integrate knowledge base with applications
To build RAG applications on top of Knowledge Bases for Amazon Bedrock, you can use the RetrieveAndGenerate API which allows you to query the knowledge base and get a response.

Here is an example using the AWS SDK for Python (Boto3):

import boto3

bedrock_agent_runtime = boto3.client(
    service_name = "bedrock-agent-runtime"
)

def retrieveAndGenerate(query, kbId):
    # Query the knowledge base and generate a response with the given model.
    # The parameter is named query to avoid shadowing the built-in input().
    return bedrock_agent_runtime.retrieve_and_generate(
        input={
            'text': query
        },
        retrieveAndGenerateConfiguration={
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kbId,
                'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
            }
        }
    )

response = retrieveAndGenerate("What is Amazon Bedrock?", "BFT0P4NR1U")["output"]["text"]
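The full response (before indexing into ["output"]["text"]) also carries the citations behind the generated answer. As a sketch, the cited source documents could be collected like this; the helper name collect_sources is hypothetical, it assumes the sources are stored in Amazon S3, and the sample response is hand-written for illustration.

```python
def collect_sources(response):
    """Return the distinct S3 URIs cited in a RetrieveAndGenerate response."""
    uris = []
    for citation in response.get("citations", []):
        for reference in citation.get("retrievedReferences", []):
            uri = reference.get("location", {}).get("s3Location", {}).get("uri")
            # Preserve first-seen order and skip duplicates.
            if uri and uri not in uris:
                uris.append(uri)
    return uris


# Hand-written stand-in for a response, for illustration only.
sample_response = {
    "output": {"text": "Amazon Bedrock is a fully managed service."},
    "citations": [
        {
            "retrievedReferences": [
                {
                    "content": {"text": "Amazon Bedrock is a fully managed service."},
                    "location": {
                        "type": "S3",
                        "s3Location": {"uri": "s3://example-bucket/bedrock-ug.pdf"},
                    },
                }
            ]
        }
    ],
}
sources = collect_sources(sample_response)
```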

If you want to further customize your RAG solutions, consider using the Retrieve API, which returns the semantic search responses that you can use for the remaining part of the RAG workflow.

import boto3

bedrock_agent_runtime = boto3.client(
    service_name = "bedrock-agent-runtime"
)

def retrieve(query, kbId, numberOfResults=5):
    return bedrock_agent_runtime.retrieve(
        retrievalQuery= {
            'text': query
        },
        knowledgeBaseId=kbId,
        retrievalConfiguration= {
            'vectorSearchConfiguration': {
                'numberOfResults': numberOfResults
            }
        }
    )

response = retrieve("What is Amazon Bedrock?", "BGU0Q4NU0U")["retrievalResults"]
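One way to handle the remaining part of the RAG workflow is to fold the retrieved chunks into the prompt you send to a model. The following is a minimal sketch: the helper name build_augmented_prompt, the prompt wording, and the sample results are assumptions, and the results are taken in the order the API returned them.

```python
def build_augmented_prompt(question, retrieval_results, max_chunks=3):
    """Combine the top retrieved chunks and the question into one prompt."""
    # Each retrieval result carries its chunk text under content.text.
    chunks = [result["content"]["text"] for result in retrieval_results[:max_chunks]]
    context = "\n\n".join(
        f"[{number}] {chunk}" for number, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hand-written stand-in for retrievalResults, for illustration only.
sample_results = [
    {"content": {"text": "Amazon Bedrock is a fully managed service."}, "score": 0.9},
    {"content": {"text": "It offers foundation models through an API."}, "score": 0.8},
]
augmented_prompt = build_augmented_prompt("What is Amazon Bedrock?", sample_results)
```

Numbering the chunks makes it possible to ask the model to cite which chunk supports each statement.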

Things to know

  • MongoDB Atlas cluster tier – This integration requires an Atlas cluster tier of at least M10.
  • AWS PrivateLink – For the purposes of this demo, MongoDB Atlas database IP Access List was configured to allow access from anywhere. For production deployments, AWS PrivateLink is the recommended way to have Amazon Bedrock establish a secure connection to your MongoDB Atlas cluster. Refer to the Amazon Bedrock User guide (under MongoDB Atlas) for details.
  • Vector embedding size – The dimension size of the vector index and the embedding model should be the same. For example, if you plan to use Cohere Embed (which has a dimension size of 1024) as the embedding model for the knowledge base, make sure to configure the vector search index accordingly.
  • Metadata filters – You can add metadata for your source files to retrieve a well-defined subset of the semantically relevant chunks based on applied metadata filters. Refer to the documentation to learn more about how to use metadata filters.
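
To make the metadata filter point more concrete, here is a hedged sketch that builds a retrievalConfiguration dictionary with an equality filter for the Retrieve API. The helper name build_retrieval_configuration and the metadata keys (department, year) are hypothetical. Check the documentation for the full set of filter operators before relying on this.

```python
def build_retrieval_configuration(number_of_results=5, metadata_equals=None):
    """Build a retrievalConfiguration dict, optionally filtering on exact metadata matches."""
    vector_config = {"numberOfResults": number_of_results}
    if metadata_equals:
        conditions = [
            {"equals": {"key": key, "value": value}}
            for key, value in metadata_equals.items()
        ]
        # A single condition is passed directly; several are combined with andAll.
        vector_config["filter"] = conditions[0] if len(conditions) == 1 else {"andAll": conditions}
    return {"vectorSearchConfiguration": vector_config}


# Hypothetical metadata keys, for illustration only.
single = build_retrieval_configuration(3, {"department": "finance"})
combined = build_retrieval_configuration(3, {"department": "finance", "year": 2024})
print(single)
print(combined)
```

The resulting dictionary can be passed as the retrievalConfiguration argument of the retrieve call.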

Now available
MongoDB Atlas vector store in Knowledge Bases for Amazon Bedrock is available in the US East (N. Virginia) and US West (Oregon) Regions. Be sure to check the full Region list for future updates.

Learn more

Try out the MongoDB Atlas integration with Knowledge Bases for Amazon Bedrock! Send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS contacts and engage with the generative AI builder community at community.aws.

Abhishek

Leaky HPE SGI Cheyenne Supercomputer for Sale at Perhaps a Deal

Post Syndicated from Cliff Robinson original https://www.servethehome.com/leaky-hpe-sgi-cheyenne-supercomputer-for-sale-at-perhaps-a-deal-intel-supermicro-mellanox/

A leaky HPE SGI Cheyenne Supercomputer is on auction for about the price of a single NVIDIA H100 GPU system.

The post Leaky HPE SGI Cheyenne Supercomputer for Sale at Perhaps a Deal appeared first on ServeTheHome.

Stop the CNAME chain struggle: Simplified management with Route 53 Resolver DNS Firewall

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/stop-the-cname-chain-struggle-simplified-management-with-route-53-resolver-dns-firewall/

Starting today, you can configure your DNS Firewall to automatically trust all domains in a resolution chain (such as a CNAME, DNAME, or Alias chain).

Let’s walk through this in nontechnical terms for those unfamiliar with DNS.

Why use DNS Firewall?
DNS Firewall provides protection for outbound DNS requests from your private network in the cloud (Amazon Virtual Private Cloud (Amazon VPC)). These requests route through Amazon Route 53 Resolver for domain name resolution. Firewall administrators can configure rules to filter and regulate the outbound DNS traffic.

DNS Firewall helps to protect against multiple security risks.

Let’s imagine a malicious actor managed to install and run some code on your Amazon Elastic Compute Cloud (Amazon EC2) instances or containers running inside one of your virtual private clouds (VPCs). The malicious code is likely to initiate outgoing network connections. It might do so to connect to a command server and receive commands to execute on your machine. Or it might initiate connections to a third-party service in a coordinated distributed denial of service (DDoS) attack. It might also try to exfiltrate data it managed to collect on your network.

Fortunately, your network and security groups are correctly configured. They block all outgoing traffic except traffic to the well-known API endpoints used by your app. So far so good: the malicious code cannot dial back home using regular TCP or UDP connections.

But what about DNS traffic? The malicious code may send DNS requests to an authoritative DNS server that the attacker controls, either to send control commands or to send encoded data, and it can receive data back in the response. I’ve illustrated the process in the following diagram.

DNS exfiltration illustrated

To prevent these scenarios, you can use a DNS Firewall to monitor and control the domains that your applications can query. You can deny access to the domains that you know to be bad and allow all other queries to pass through. Alternatively, you can deny access to all domains except those you explicitly trust.

What is the challenge with CNAME, DNAME, and Alias records?
Imagine you configured your DNS Firewall to allow DNS queries only to specific well-known domains and blocked all others. Your application communicates with alexa.amazon.com; therefore, you created a rule allowing DNS traffic to resolve that hostname.

However, the DNS system has multiple types of records. The ones of interest in this article are:

  • A records that map a DNS name to an IP address,
  • CNAME records that are synonyms for other DNS names,
  • DNAME records that provide redirection from a part of the DNS name tree to another part of the DNS name tree, and
  • Alias records that provide a Route 53-specific extension to DNS functionality. Alias records let you route traffic to selected AWS resources, such as Amazon CloudFront distributions and Amazon S3 buckets.

When querying alexa.amazon.com, I see it’s actually a CNAME record that points to pitangui.amazon.com, which is another CNAME record that points to tp.5fd53c725-frontier.amazon.com, which, in turn, is a CNAME to d1wg1w6p5q8555.cloudfront.net. Only the last name (d1wg1w6p5q8555.cloudfront.net) has an A record associated with an IP address (3.162.42.28). The IP address is likely to be different for you. It points to the closest Amazon CloudFront edge location, likely the one in Paris (CDG52) for me.

A similar redirection mechanism happens when resolving DNAME or Alias records.

DNS resolution for alexa.amazon.com
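
To make the chain explicit, here is a small sketch that models the records described above as a Python dictionary and follows the CNAME redirections until it reaches an A record. This is a toy model and not a real resolver: the RECORDS table is hard-coded from the names mentioned in this post, and the helper name follow_chain and the max_hops limit are my own.

```python
# Records copied from the resolution described in this post (the IP address will differ for you).
RECORDS = {
    "alexa.amazon.com": ("CNAME", "pitangui.amazon.com"),
    "pitangui.amazon.com": ("CNAME", "tp.5fd53c725-frontier.amazon.com"),
    "tp.5fd53c725-frontier.amazon.com": ("CNAME", "d1wg1w6p5q8555.cloudfront.net"),
    "d1wg1w6p5q8555.cloudfront.net": ("A", "3.162.42.28"),
}


def follow_chain(name, records, max_hops=10):
    """Follow CNAME records from name until an A record is found; return (chain, ip)."""
    chain = [name]
    for _ in range(max_hops):
        record_type, value = records[name]
        if record_type == "A":
            return chain, value
        # A CNAME points to another name, so keep following it.
        name = value
        chain.append(name)
    raise RuntimeError("too many redirections")


chain, ip = follow_chain("alexa.amazon.com", RECORDS)
print(" -> ".join(chain))
print(ip)
```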

To allow the complete resolution of such a CNAME chain, you could be tempted to configure your DNS Firewall rule to allow all names under amazon.com (*.amazon.com), but that would fail to resolve the last CNAME that goes to cloudfront.net.

Worse, the DNS CNAME chain is controlled by the service your application connects to. The chain might change at any time, forcing you to manually maintain the list of rules and authorized domains inside your DNS Firewall rules.

Introducing DNS Firewall redirection chain authorization
Based on this explanation, you’re now equipped to understand the new capability we are launching today. We added a parameter to the UpdateFirewallRule API (also available on the AWS Command Line Interface (AWS CLI) and AWS Management Console) to configure the DNS Firewall so that it follows and automatically trusts all the domains in a CNAME, DNAME, or Alias chain.

With this parameter, firewall administrators only need to allow the domains their applications query. The firewall will automatically trust all intermediate domains in the chain until it reaches the A record with the IP address.
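
The following toy simulation illustrates the difference this makes. It is only a sketch of the behavior described in this post, not how Route 53 Resolver is implemented: the record table is hard-coded, the wildcard matching is simplified, and the names matches, resolve, and trust_redirection are my own.

```python
RECORDS = {
    "alexa.amazon.com": ("CNAME", "pitangui.amazon.com"),
    "pitangui.amazon.com": ("CNAME", "tp.5fd53c725-frontier.amazon.com"),
    "tp.5fd53c725-frontier.amazon.com": ("CNAME", "d1wg1w6p5q8555.cloudfront.net"),
    "d1wg1w6p5q8555.cloudfront.net": ("A", "3.162.42.28"),
}


def matches(name, pattern):
    """Simplified matching: '*.example.com' matches any subdomain, otherwise exact match."""
    if pattern.startswith("*."):
        return name.endswith(pattern[1:])
    return name == pattern


def resolve(name, records, allowed, trust_redirection=False, max_hops=10):
    """Return (names_seen, ip); ip is None when the firewall stops the resolution."""
    def is_allowed(candidate):
        return any(matches(candidate, pattern) for pattern in allowed)

    if not is_allowed(name):
        return [], None
    seen = [name]
    for _ in range(max_hops):
        record_type, value = records[name]
        if record_type == "A":
            return seen, value
        seen.append(value)
        # Without trust, every name in the chain must itself be on the allow list.
        if not (trust_redirection or is_allowed(value)):
            return seen, None
        name = value
    raise RuntimeError("too many redirections")


strict = resolve("alexa.amazon.com", RECORDS, ["alexa.amazon.com"])
wildcard = resolve("alexa.amazon.com", RECORDS, ["*.amazon.com"])
trusted = resolve("alexa.amazon.com", RECORDS, ["alexa.amazon.com"], trust_redirection=True)
print(strict)
print(wildcard)
print(trusted)
```

In this model, the strict allow list stops at the first CNAME, the wildcard stops at the cloudfront.net name, and only the trusted configuration reaches the IP address.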

Let’s see it in action
I start with a DNS Firewall already configured with a domain list, a rule group, and a rule with the ALLOW action for queries to the domain alexa.amazon.com. The rule group is attached to a VPC where I have an EC2 instance running.

When I connect to that EC2 instance and issue a DNS query to resolve alexa.amazon.com, it only returns the first name in the domain chain (pitangui.amazon.com) and stops there. This is expected because pitangui.amazon.com is not authorized to be resolved.

DNS query for alexa.amazon.com is blocked at first CNAME

To solve this, I update the firewall rule to trust the entire redirection chain. I use the AWS CLI to call the update-firewall-rule API with a new parameter firewall-domain-redirection-action set to TRUST_REDIRECTION_DOMAIN.

AWS CLI to update the DNS firewall rule
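
If you prefer the AWS SDK for Python (Boto3), the same update might look like the sketch below. It assumes a Boto3 version recent enough to include the new parameter. The two IDs are hypothetical placeholders, and the helper names build_update_rule_request and trust_redirection_chain are my own. The request is built separately so you can inspect it without calling AWS.

```python
def build_update_rule_request(rule_group_id, domain_list_id):
    """Build the UpdateFirewallRule arguments that trust the whole redirection chain."""
    return {
        "FirewallRuleGroupId": rule_group_id,
        "FirewallDomainListId": domain_list_id,
        "FirewallDomainRedirectionAction": "TRUST_REDIRECTION_DOMAIN",
    }


def trust_redirection_chain(rule_group_id, domain_list_id):
    """Call UpdateFirewallRule; requires AWS credentials and a recent Boto3."""
    import boto3  # imported here so the request builder works without the SDK installed

    client = boto3.client("route53resolver")
    return client.update_firewall_rule(
        **build_update_rule_request(rule_group_id, domain_list_id)
    )


# Hypothetical IDs, for illustration only; replace them with your own.
request = build_update_rule_request("rslvr-frg-EXAMPLE", "rslvr-fdl-EXAMPLE")
print(request)
```

Calling trust_redirection_chain with your own rule group ID and domain list ID sends the update to your account.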

The following diagram illustrates the setup at this stage.

DNS Firewall rule diagram

Back on the EC2 instance, I try the DNS query again. This time, it works. It resolves the entire redirection chain, down to the IP address 🎉.

DNS resolution for the full CNAME chain

Thanks to the trusted chain redirection, network administrators now have an easy way to implement a strategy to block all domains and authorize only known domains in their DNS Firewall without having to care about CNAME, DNAME, or Alias chains.

This capability is available at no additional cost in all AWS Regions. Try it out today!

— seb