Tag Archives: Thought Leadership

How AWS can help you navigate the complexity of digital sovereignty

Post Syndicated from Max Peterson original https://aws.amazon.com/blogs/security/how-aws-can-help-you-navigate-the-complexity-of-digital-sovereignty/

Customers from around the world often tell me that digital sovereignty is a top priority as they look to meet new compliance and industry regulations. In fact, 82% of global organizations are either currently using, planning to use, or considering sovereign cloud solutions in the next two years, according to the International Data Corporation (IDC). However, many leaders face complexity as policies and requirements continue to evolve rapidly, and have concerns about acquiring the right knowledge and skills, at an affordable cost, to simplify their efforts to meet digital sovereignty goals.

At Amazon Web Services (AWS), we understand that protecting your data in a world with changing regulations, technology, and risks takes teamwork. We’re committed to making sure that the AWS Cloud remains sovereign-by-design, as it has been from day one, and providing customers with more choice to help meet their unique sovereignty requirements across our offerings in AWS Regions around the world, dedicated sovereign cloud infrastructure solutions, and the recently announced independent European Sovereign Cloud. In this blog post, I’ll share how the cloud is helping organizations meet their digital sovereignty needs, and ways that we can help you navigate the ever-evolving landscape.

Digital sovereignty needs of customers vary based on multiple factors

Digital sovereignty means different things to different people, and every country or region has its own requirements. Adding to the complexity is the fact that no uniform guidance exists for the types of workloads, industries, and sectors that must adhere to these requirements.

Although digital sovereignty needs vary based on multiple factors, key themes that we’ve identified by listening to customers, partners, and regulators include data residency, operator access restriction, resiliency, and transparency. AWS works closely with customers to understand the digital sovereignty outcomes that they’re focused on to determine the right AWS solutions that can help to meet them.

Meet requirements without compromising the benefits of the cloud

We introduced the AWS Digital Sovereignty Pledge in 2022 as part of our commitment to offer all AWS customers the most advanced set of sovereignty controls and security features available in the cloud. We continue to deeply engage with regulators to help make sure that AWS meets various standards and achieves certifications that our customers directly inherit, allowing them to meet requirements while driving continuous innovation. AWS was recently named a leader in Sovereign Cloud Infrastructure Services (EU) by Information Services Group (ISG), a global technology research and IT advisory firm.

Customers who use our global infrastructure with sovereign-by-design features can optimize for increased scale, agility, speed, and reduced costs while getting the highest levels of security and protection. Our AWS Regions are powered by the AWS Nitro System, which helps ensure the confidentiality and integrity of customer data. Building on our commitment to provide greater transparency and assurances on how AWS services are designed and operated, the security design of our Nitro System was validated in an independent public report by the global cybersecurity consulting firm NCC Group.

Customers have full control of their data on AWS and determine where their data is stored, how it’s stored, and who has access to it. We provide tools to help you automate and monitor your storage location and encrypt your data, including data residency guardrails in AWS Control Tower. We recently announced more than 65 new digital sovereignty controls that you can choose from to help prevent actions, enforce configurations, and detect undesirable changes.

All AWS services support encryption, and most services also support encryption with customer managed keys that AWS can’t access, through services such as AWS Key Management Service (AWS KMS), AWS CloudHSM, and AWS KMS External Key Store (XKS). Both the hardware used in AWS KMS and the firmware used in AWS CloudHSM are FIPS 140-2 Level 3 compliant, as certified by a NIST-accredited laboratory.
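
To make this concrete, here is a minimal sketch, using the AWS SDK for Python (Boto3), of encrypting and decrypting data with a customer managed key; the key alias and Region are placeholders you would replace with your own.

```python
import boto3

# Assumes a customer managed KMS key already exists under this alias (placeholder).
kms = boto3.client("kms", region_name="eu-central-1")

# Encrypt a small payload with the customer managed key.
ciphertext = kms.encrypt(
    KeyId="alias/sovereignty-demo-key",  # hypothetical key alias
    Plaintext=b"customer data that must stay under your control",
)["CiphertextBlob"]

# Decrypt it again; AWS KMS enforces the key policy and IAM permissions that you define.
plaintext = kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"]
print(plaintext.decode("utf-8"))
```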

Infrastructure choice to support your unique needs and local regulations

AWS provides hybrid cloud storage and edge computing capabilities so that you can use the same infrastructure, services, APIs, and tools across your environments. We think of our AWS infrastructure and services as a continuum that helps meet your requirements wherever you need it. Having a consistent experience across environments helps to accelerate innovation, increase operational efficiencies and reduce costs by using the same skills and toolsets, and meet specific security standards by adopting cloud security wherever applications and data reside.

We work closely with customers to support infrastructure decisions that meet unique workload needs and local regulations, and continue to invent based on what we hear from customers. To help organizations comply with stringent regulatory requirements, we launched AWS Dedicated Local Zones. This is a type of infrastructure that is fully managed by AWS, built for exclusive use by a customer or community, and placed in a customer-specified location or data center to run sensitive or other regulated industry workloads. At AWS re:Invent 2023, I sat down with Cheow Hoe Chan, Government Chief Digital Technology Officer of Singapore, to discuss how we collaborated with Singapore’s Smart Nation and Digital Government Group to define and build this dedicated infrastructure.

We also recently announced our plans to launch the AWS European Sovereign Cloud to provide customers in highly regulated industries with more choice to help meet varying data residency, operational autonomy, and resiliency requirements. This is a new, independent cloud located and operated within the European Union (EU) that will have the same security, availability, and performance that our customers get from existing AWS Regions today, with important features specific to evolving EU regulations.

Build confidently with AWS and our AWS Partners

In addition to our AWS offerings, you can access our global network of more than 100,000 AWS Partners specialized in various competencies and industry verticals to get local guidance and services.

There is a lot of complexity involved with navigating the evolving digital sovereignty landscape—but you don’t have to do it alone. Using the cloud and working with AWS and our partners can help you move faster and more efficiently while keeping costs low. We’re committed to helping you meet necessary requirements while accelerating innovation, and can’t wait to see the kinds of advancements that you’ll continue to drive.

 

Max Peterson

Max is the Vice President of AWS Sovereign Cloud. He leads efforts to ensure that all AWS customers around the world have the most advanced set of sovereignty controls, privacy safeguards, and security features available in the cloud. Before his current role, Max served as the VP of AWS Worldwide Public Sector (WWPS) and created and led the WWPS International Sales division, with a focus on empowering government, education, healthcare, aerospace and satellite, and nonprofit organizations to drive rapid innovation while meeting evolving compliance, security, and policy requirements. Max has over 30 years of public sector experience and served in other technology leadership roles before joining Amazon. Max has earned both a Bachelor of Arts in Finance and Master of Business Administration in Management Information Systems from the University of Maryland.

AWS named as a Leader in 2023 Gartner Magic Quadrant for Strategic Cloud Platform Services for the thirteenth year in a row

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/read-the-2023-gartner-magic-quadrant-for-strategic-cloud-platform-services/

On December 4, 2023, AWS was named as a Leader in the 2023 Magic Quadrant for Strategic Cloud Platform Services (SCPS). AWS is the longest-running Magic Quadrant Leader, with Gartner naming AWS a Leader for the thirteenth consecutive year. AWS is placed highest on the Ability to Execute axis.

SCPS, previously known as Magic Quadrant for Cloud Infrastructure and Platform Services (CIPS), is defined as “standardized, automated, public cloud offerings integrating infrastructure services (for example, computing, network, and storage), platform services (for example, managed application and data services) and transformation services (programs/resources that help customers adopt cloud-oriented IT delivery models).”

I have the chance to talk with our customers every single week. When I ask the main reasons why they choose AWS, I consistently hear the following responses:

Breadth and depth. AWS offers more cloud services and features than other providers, including compute, storage, databases, machine learning (ML), data analytics, and Internet of Things (IoT). This makes it faster, easier, and more cost-effective to migrate existing applications to the cloud and to build new ones. AWS also has the deepest functionality within services, such as a wide variety of purpose-built databases optimized for cost and performance.

A rapid pace of innovation. AWS enables faster experimentation and innovation through the latest technologies. We continually accelerate our pace of innovation to invent new technologies that can transform your business. For example, in 2014, we launched the serverless computing service AWS Lambda, eliminating server provisioning and management for developers. In 2017, we launched the AWS Nitro System, a combination of dedicated hardware and a lightweight hypervisor that enables better performance, increased security, and cost savings for Amazon EC2 instances. At re:Invent 2018, we announced AWS Graviton, a family of processors designed to deliver the best price performance for your cloud workloads running in Amazon Elastic Compute Cloud (Amazon EC2). And today, we continue to innovate with generative artificial intelligence (AI) services such as Amazon Q and Amazon CodeWhisperer, your coding productivity tool, available in developers’ integrated development environments (IDEs) and on the command line (CLI).

A large community of customers and partners. AWS has a large, active community with millions of customers and tens of thousands of partners globally. Customers in most industries and of varied sizes use AWS for diverse applications. The AWS Partner Network includes thousands of systems integrators specializing in AWS and tens of thousands of independent software vendors (ISVs) adapting their technologies for AWS.

You also benefit from the global AWS infrastructure, including the 33 Regions where you can deploy your workloads and store your data. We pre-announced four future Regions in Malaysia, New Zealand, Thailand, and the AWS European Sovereign Cloud.

An AWS Region is a physical location in the world where we have multiple Availability Zones. Availability Zones consist of one or more discrete data centers, each with redundant power, networking, and connectivity, housed in separate facilities. Unlike other cloud providers, which often define a region as a single data center, AWS Regions have multiple Availability Zones, which allows you to operate production applications and databases that are more highly available, fault tolerant, and scalable than would be possible from a single data center.

AWS has more than 17 years of experience building its global infrastructure. And, as Werner Vogels, Amazon CTO, keeps repeating, “There’s no compression algorithm for experience,” especially when it comes to scale, security, and performance.

Here is the graphical representation of the 2023 Magic Quadrant for Strategic Cloud Platform Services.

Gartner | 2023 Magic Quadrant for Strategic Cloud Platform Services

The full Gartner report has details about the features and factors they reviewed. It explains the methodology used and the recognitions. This report can serve as a guide when choosing a cloud provider that helps you innovate on behalf of your customers.

— seb

Gartner, 2023 Magic Quadrant for Strategic Cloud Platform Services, 4 December 2023, David Wright, Dennis Smith, et al.

Gartner does not endorse any vendor, product or service depicted in its research publications and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

GARTNER is a registered trademark and service mark of Gartner and Magic Quadrant is a registered trademark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and are used herein with permission. All rights reserved.

This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from AWS.

AWS re:Invent 2023: Security, identity, and compliance recap

Post Syndicated from Nisha Amthul original https://aws.amazon.com/blogs/security/aws-reinvent-2023-security-identity-and-compliance-recap/

In this post, we share the key announcements related to security, identity, and compliance at AWS re:Invent 2023, and offer details on how you can learn more through on-demand videos of sessions and relevant blog posts. AWS re:Invent returned to Las Vegas in November 2023. The conference featured more than 2,250 sessions and hands-on labs, with more than 52,000 attendees across five days. If you couldn’t join us in person or want to revisit the security, identity, and compliance announcements and on-demand sessions, this post is for you.

At re:Invent 2023, key themes ran throughout the AWS security service announcements, underscoring the security challenges that we help customers address through knowledge sharing and continuous development of our native security services. These themes include helping you architect for zero trust, scale identity and access management, integrate security early in the development cycle, enhance container security, and use generative artificial intelligence (AI) to improve security services and mean time to remediation.

Key announcements

To help you more efficiently manage identity and access at scale, we introduced several new features:

  • A week before re:Invent, we announced two new features of Amazon Verified Permissions:
    • Batch authorization — Batch authorization is a new way for you to process authorization decisions within your application. Using this new API, you can process up to 30 authorization decisions for a single principal or resource in a single API call. This can help you optimize user experience (UX) flows that require multiple permission checks (a minimal API sketch follows this list).
    • Visual schema editor — This new visual schema editor offers an alternative to editing policies directly in the JSON editor. View relationships between entity types, manage principals and resources visually, and review the actions that apply to principal and resource types for your application schema.
  • We launched two new features for AWS Identity and Access Management (IAM) Access Analyzer:
    • Unused access — The new analyzer continuously monitors IAM roles and users in your organization in AWS Organizations or within AWS accounts, identifying unused permissions, access keys, and passwords. Using this new capability, you can benefit from a dashboard to help prioritize which accounts need attention based on the volume of excessive permissions and unused access findings. You can set up automated notification workflows by integrating IAM Access Analyzer with Amazon EventBridge. In addition, you can aggregate these new findings about unused access with your existing AWS Security Hub findings.
    • Custom policy checks — This feature helps you validate that IAM policies adhere to your security standards ahead of deployments. Custom policy checks use the power of automated reasoning—security assurance backed by mathematical proof—to empower security teams to detect non-conformant updates to policies proactively. You can move AWS applications from development to production more quickly by automating policy reviews within your continuous integration and continuous delivery (CI/CD) pipelines. Security teams automate policy reviews before deployments by collaborating with developers to configure custom policy checks within AWS CodePipeline pipelines, AWS CloudFormation hooks, GitHub Actions, and Jenkins jobs.
  • We announced AWS IAM Identity Center trusted identity propagation to manage and audit access to AWS Analytics services, including Amazon QuickSight, Amazon Redshift, Amazon EMR, AWS Lake Formation, and Amazon Simple Storage Service (Amazon S3) through S3 Access Grants. This feature of IAM Identity Center simplifies data access management for users, enhances auditing granularity, and improves the sign-in experience for analytics users across multiple AWS analytics applications.
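
As a rough illustration of the batch authorization feature mentioned above, the following sketch calls the Amazon Verified Permissions BatchIsAuthorized API through Boto3 for one principal and several resources. The policy store ID, entity types, and action names are hypothetical, and you should confirm the request and response shapes against the current SDK documentation.

```python
import boto3

avp = boto3.client("verifiedpermissions")

# Hypothetical policy store and entities; replace with your own identifiers.
POLICY_STORE_ID = "PSEXAMPLEabcdefg1111"
principal = {"entityType": "PhotoApp::User", "entityId": "alice"}

# Build up to 30 authorization requests for a single principal in one call.
requests = [
    {
        "principal": principal,
        "action": {"actionType": "PhotoApp::Action", "actionId": "ViewPhoto"},
        "resource": {"entityType": "PhotoApp::Photo", "entityId": photo_id},
    }
    for photo_id in ["vacation.jpg", "receipt.png", "team.jpg"]
]

response = avp.batch_is_authorized(policyStoreId=POLICY_STORE_ID, requests=requests)

# Each result mirrors its request and carries an ALLOW or DENY decision.
for result in response["results"]:
    print(result["request"]["resource"]["entityId"], result["decision"])
```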

To help you improve your security outcomes with generative AI and automated reasoning, we introduced the following new features:

AWS Control Tower launched a set of 65 purpose-built controls designed to help you meet your digital sovereignty needs. In November 2022, we launched the AWS Digital Sovereignty Pledge, our commitment to offering all AWS customers the most advanced set of sovereignty controls and features available in the cloud. Introducing AWS Control Tower controls that support digital sovereignty is an additional step in our roadmap of capabilities for data residency, granular access restriction, encryption, and resilience. AWS Control Tower offers you a consolidated view of the controls enabled, your compliance status, and controls evidence across multiple accounts.
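
For teams automating this setup, the sketch below shows how one of these controls might be enabled programmatically with the AWS Control Tower EnableControl API; the control identifier and organizational unit ARN are placeholders, and the real identifiers come from the Control Tower controls library documentation.

```python
import boto3

controltower = boto3.client("controltower")

# Placeholder ARNs: a control from the controls library and a target OU in your organization.
CONTROL_ARN = "arn:aws:controltower:eu-west-1::control/EXAMPLE_SOVEREIGNTY_CONTROL"
TARGET_OU_ARN = (
    "arn:aws:organizations::111122223333:ou/o-exampleorgid/ou-examplerootid-exampleouid"
)

# Enable the control on the organizational unit; the call returns an operation you can poll.
operation = controltower.enable_control(
    controlIdentifier=CONTROL_ARN,
    targetIdentifier=TARGET_OU_ARN,
)

status = controltower.get_control_operation(
    operationIdentifier=operation["operationIdentifier"]
)
print(status["controlOperation"]["status"])  # for example, IN_PROGRESS or SUCCEEDED
```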

We announced two new feature expansions for Amazon GuardDuty to provide the broadest threat detection coverage:

We launched two new capabilities for Amazon Inspector in addition to Amazon Inspector code remediation for Lambda function to help you detect software vulnerabilities at scale:

We introduced four new capabilities in AWS Security Hub to help you address security gaps across your organization and enhance the user experience for security teams, providing increased visibility:

  • Central configuration — Streamline and simplify how you set up and administer Security Hub in your multi-account, multi-Region organizations. With central configuration, you can use the delegated administrator account as a single pane of glass for your security findings—and also for your organization’s configurations in Security Hub.
  • Customize security controls — You can now refine the best practices monitored by Security Hub controls to meet more specific security requirements. There is support for customer-specific inputs in Security Hub controls, so you can customize your security posture monitoring on AWS.
  • Metadata enrichment for findings — This enrichment adds resource tags, a new AWS application tag, and account name information to every finding ingested into Security Hub. This includes findings from AWS security services such as GuardDuty, Amazon Inspector, and IAM Access Analyzer, in addition to a large and growing list of AWS Partner Network (APN) solutions. Using this enhancement, you can better contextualize, prioritize, and act on your security findings.
  • Dashboard enhancements — You can now filter and customize your dashboard views, and access a new set of widgets that we carefully chose to help reflect the modern cloud security threat landscape and relate to potential threats and vulnerabilities in your AWS cloud environment. This improvement makes it simpler for you to focus on risks that require your attention, providing a more comprehensive view of your cloud security.

We added three new capabilities for Amazon Detective in addition to Amazon Detective finding group summaries to simplify the security investigation process:

We introduced AWS Secrets Manager batch retrieval of secrets to identify and retrieve a group of secrets for your application at once with a single API call. The new API, BatchGetSecretValue, provides greater simplicity for common developer workflows, especially when you need to incorporate multiple secrets into your application.
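
For reference, here is a minimal Boto3 sketch of the new API; the secret names are hypothetical, and you could also select secrets with a filter such as a shared tag instead of an explicit list.

```python
import boto3

secretsmanager = boto3.client("secretsmanager")

# Retrieve a group of related secrets in a single API call (hypothetical secret names).
response = secretsmanager.batch_get_secret_value(
    SecretIdList=["prod/app/db-password", "prod/app/api-key", "prod/app/signing-key"]
)

for secret in response["SecretValues"]:
    print(secret["Name"], "retrieved")  # secret["SecretString"] holds the value

# Secrets that could not be retrieved are reported per item rather than failing the whole batch.
for error in response.get("Errors", []):
    print(error["SecretId"], error["ErrorCode"])
```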

We worked closely with AWS Partners to create offerings that make it simpler for you to protect your cloud workloads:

  • AWS Built-in Competency — AWS Built-in Competency Partner solutions help minimize the time it takes for you to figure out the best AWS services to adopt, regardless of use case or category.
  • AWS Cyber Insurance Competency — AWS has worked with leading cyber insurance partners to help simplify the process of obtaining cyber insurance. This makes it simpler for you to find affordable insurance policies from AWS Partners that integrate their security posture assessment through a user-friendly customer experience with Security Hub.

Experience content on demand

If you weren’t able to join in person or you want to watch a session again, you can see the many sessions that are available on demand.

Keynotes, innovation talks, and leadership sessions

Catch the AWS re:Invent 2023 keynote where AWS chief executive officer Adam Selipsky shares his perspective on cloud transformation and provides an exclusive first look at AWS innovations in generative AI, machine learning, data, and infrastructure advancements. You can also replay the other AWS re:Invent 2023 keynotes.

The security landscape is evolving as organizations adapt and embrace new technologies. In this talk, discover the AWS vision for security that drives business agility. Stream the innovation talk from Amazon chief security officer, Steve Schmidt, and AWS chief information security officer, Chris Betz, to learn their insights on key topics such as Zero Trust, builder security experience, and generative AI.

At AWS, we work closely with customers to understand their requirements for their critical workloads. Our work with the Singapore Government’s Smart Nation and Digital Government Group (SNDGG) to build a Smart Nation for their citizens and businesses illustrates this approach. Watch the leadership session with Max Peterson, vice president of Sovereign Cloud at AWS, and Chan Cheow Hoe, government chief digital technology officer of Singapore, as they share how AWS is helping Singapore advance on its cloud journey to build a Smart Nation.

Breakout sessions and new launch talks

Stream breakout sessions and new launch talks on demand to learn about the following topics:

  • Discover how AWS, customers, and partners work together to raise their security posture with AWS infrastructure and services.
  • Learn about trends in identity and access management, detection and response, network and infrastructure security, data protection and privacy, and governance, risk, and compliance.
  • Dive into our launches! Learn about the latest announcements from security experts, and uncover how new services and solutions can help you meet core security and compliance requirements.

Consider joining us for more in-person security learning opportunities by saving the date for AWS re:Inforce 2024, which will occur June 10-12 in Philadelphia, Pennsylvania. We look forward to seeing you there!

If you’d like to discuss how these new announcements can help your organization improve its security posture, AWS is here to help. Contact your AWS account team today.


Nisha Amthul

Nisha is a Senior Product Marketing Manager at AWS Security, specializing in detection and response solutions. She has a strong foundation in product management and product marketing within the domains of information security and data protection. When not at work, you’ll find her cake decorating, strength training, and chasing after her two energetic kiddos, embracing the joys of motherhood.

Himanshu Verma

Himanshu is a Worldwide Specialist for AWS Security Services. He leads the go-to-market creation and execution for AWS security services, field enablement, and strategic customer advisement. Previously, he held leadership roles in product management, engineering, and development, working on various identity, information security, and data protection technologies. He loves brainstorming disruptive ideas, venturing outdoors, photography, and trying new restaurants.

Marshall Jones

Marshall is a Worldwide Security Specialist Solutions Architect at AWS. His background is in AWS consulting and security architecture, focused on a variety of security domains including edge, threat detection, and compliance. Today, he is focused on helping enterprise AWS customers adopt and operationalize AWS security services to increase security effectiveness and reduce risk.

Building a security-first mindset: three key themes from AWS re:Invent 2023

Post Syndicated from Clarke Rodgers original https://aws.amazon.com/blogs/security/building-a-security-first-mindset-three-key-themes-from-aws-reinvent-2023/

Amazon CSO Stephen Schmidt

AWS re:Invent drew 52,000 attendees from across the globe to Las Vegas, Nevada, November 27 to December 1, 2023.

Now in its 12th year, the conference featured 5 keynotes, 17 innovation talks, and over 2,250 sessions and hands-on labs offering immersive learning and networking opportunities.

With dozens of service and feature announcements—and innumerable best practices shared by AWS executives, customers, and partners—the air of excitement was palpable. We were on site to experience all of the innovations and insights, but summarizing highlights isn’t easy. This post details three key security themes that caught our attention.

Security culture

When we think about cybersecurity, it’s natural to focus on technical security measures that help protect the business. But organizations are made up of people—not technology. The best way to protect ourselves is to foster a proactive, resilient culture of cybersecurity that supports effective risk mitigation, incident detection and response, and continuous collaboration.

In Sustainable security culture: Empower builders for success, AWS Global Services Security Vice President Hart Rossman and AWS Global Services Security Organizational Excellence Leader Sarah Currey presented practical strategies for building a sustainable security culture.

Rossman noted that many customers who meet with AWS about security challenges are attempting to manage security as a project, a program, or a side workstream. To strengthen your security posture, he said, you have to embed security into your business.

“You’ve got to understand early on that security can’t be effective if you’re running it like a project or a program. You really have to run it as an operational imperative—a core function of the business. That’s when magic can happen.” — Hart Rossman, Global Services Security Vice President at AWS

Three best practices can help:

  1. Be consistently persistent. Routinely and emphatically thank employees for raising security issues. It might feel repetitive, but treating security events and escalations as learning opportunities helps create a positive culture—and it’s a practice that can spread to other teams. An empathetic leadership approach encourages your employees to see security as everyone’s responsibility, share their experiences, and feel like collaborators.
  2. Brief the board. Engage executive leadership in regular, business-focused meetings. By providing operational metrics that tie your security culture to the impact that it has on customers, crisply connecting data to business outcomes, and providing an opportunity to ask questions, you can help build the support of executive leadership, and advance your efforts to establish a sustainable proactive security posture.
  3. Have a mental model for creating a good security culture. Rossman presented a diagram (Figure 1) that highlights three elements of security culture he has observed at AWS: a student, a steward, and a builder. If you want to be a good steward of security culture, you should be a student who is constantly learning, experimenting, and passing along best practices. As your stewardship grows, you can become a builder, and progress the culture in new directions.
Figure 1: Sample mental model for building security culture

Thoughtful investment in the principles of inclusivity, empathy, and psychological safety can help your team members to confidently speak up, take risks, and express ideas or concerns. This supports an escalation-friendly culture that can reduce employee burnout, and empower your teams to champion security at scale.

In Shipping securely: How strong security can be your strategic advantage, AWS Enterprise Strategy Director Clarke Rodgers reiterated the importance of security culture to building a security-first mindset.

Rodgers highlighted three pillars of progression (Figure 2)—aware, bolted-on, and embedded—that are based on meetings with more than 800 customers. As organizations mature from a reactive security posture to a proactive, security-first approach, he noted, security culture becomes a true business enabler.

“When organizations have a strong security culture and everyone sees security as their responsibility, they can move faster and achieve quicker and more secure product and service releases.” — Clarke Rodgers, Director of Enterprise Strategy at AWS
Figure 2: Shipping with a security-first mindset

Human-centric AI

CISOs and security stakeholders are increasingly pivoting to a human-centric focus to establish effective cybersecurity, and ease the burden on employees.

According to Gartner, by 2027, 50% of large enterprise CISOs will have adopted human-centric security design practices to minimize cybersecurity-induced friction and maximize control adoption.

As Amazon CSO Stephen Schmidt noted in Move fast, stay secure: Strategies for the future of security, focusing on technology first is fundamentally wrong. Security is a people challenge for threat actors, and for defenders. To keep up with evolving changes and securely support the businesses we serve, we need to focus on dynamic problems that software can’t solve.

Maintaining that focus means providing security and development teams with the tools they need to automate and scale some of their work.

“People are our most constrained and most valuable resource. They have an impact on every layer of security. It’s important that we provide the tools and the processes to help our people be as effective as possible.” — Stephen Schmidt, CSO at Amazon

Organizations can use artificial intelligence (AI) to impact all layers of security—but AI doesn’t replace skilled engineers. When used in coordination with other tools, and with appropriate human review, it can help make your security controls more effective.

Schmidt highlighted the internal use of AI at Amazon to accelerate our software development process, as well as new generative AI-powered Amazon Inspector, Amazon Detective, AWS Config, and Amazon CodeWhisperer features that complement the human skillset by helping people make better security decisions, using a broader collection of knowledge. This pattern of combining sophisticated tooling with skilled engineers is highly effective, because it positions people to make the nuanced decisions required for effective security that AI can’t make on its own.

In How security teams can strengthen security using generative AI, AWS Senior Security Specialist Solutions Architects Anna McAbee and Marshall Jones, and Principal Consultant Fritz Kunstler featured a virtual security assistant (chatbot) that can address common security questions and use cases based on your internal knowledge bases, and trusted public sources.

Figure 3: Generative AI-powered chatbot architecture

The generative AI-powered solution depicted in Figure 3—which includes Retrieval Augmented Generation (RAG) with Amazon Kendra, Amazon Security Lake, and Amazon Bedrock—can help you automate mundane tasks, expedite security decisions, and increase your focus on novel security problems.

It’s available on GitHub with ready-to-use code, so you can start experimenting with a variety of large and multimodal language models, settings, and prompts in your own AWS account.
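
The repository contains the full implementation; as a rough sketch of the underlying Retrieval Augmented Generation pattern, the snippet below retrieves passages from an Amazon Kendra index and passes them to a model on Amazon Bedrock as context. The index ID, model choice, and prompt format are assumptions you would adapt to your own deployment.

```python
import json
import boto3

kendra = boto3.client("kendra")
bedrock = boto3.client("bedrock-runtime")

QUESTION = "How do we rotate IAM access keys?"

# 1. Retrieve relevant passages from the knowledge base (hypothetical index ID).
retrieved = kendra.retrieve(
    IndexId="11111111-2222-3333-4444-555555555555", QueryText=QUESTION
)
context = "\n".join(item["Content"] for item in retrieved["ResultItems"][:3])

# 2. Ask the model to answer using only the retrieved context (Claude-style prompt, assumed format).
prompt = f"\n\nHuman: Using only this context:\n{context}\n\nAnswer the question: {QUESTION}\n\nAssistant:"
body = json.dumps({"prompt": prompt, "max_tokens_to_sample": 300})

response = bedrock.invoke_model(modelId="anthropic.claude-v2", body=body)
print(json.loads(response["body"].read())["completion"])
```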

Secure collaboration

Collaboration is key to cybersecurity success, but evolving threats, flexible work models, and a growing patchwork of data protection and privacy regulations have made maintaining secure and compliant messaging a challenge.

An estimated 3.09 billion mobile phone users access messaging apps to communicate, and this figure is projected to grow to 3.51 billion users in 2025.

The use of consumer messaging apps for business-related communications makes it more difficult for organizations to verify that data is being adequately protected and retained. This can lead to increased risk, particularly in industries with unique recordkeeping requirements.

In How the U.S. Army uses AWS Wickr to deliver lifesaving telemedicine, Matt Quinn, Senior Director at the U.S. Army Telemedicine & Advanced Technology Research Center (TATRC), Laura Baker, Senior Manager at Deloitte, and Arvind Muthukrishnan, AWS Wickr Head of Product, highlighted how the TATRC National Emergency Tele-Critical Care Network (NETCCN) was integrated with AWS Wickr—a HIPAA-eligible secure messaging and collaboration service—and AWS Private 5G, a managed service for deploying and scaling private cellular networks.

During the session, Quinn, Baker, and Muthukrishnan described how TATRC achieved a low-resource, cloud-enabled, virtual health solution that facilitates secure collaboration between onsite and remote medical teams for real-time patient care in austere environments. Using Wickr, medics on the ground were able to treat injuries that exceeded their previous training (Figure 4) with the help of end-to-end encrypted video calls, messaging, and file sharing with medical professionals, and securely retain communications in accordance with organizational requirements.

“Incorporating Wickr into Military Emergency Tele-Critical Care Platform (METTC-P) not only provides the security and privacy of end-to-end encrypted communications, it gives combat medics and other frontline caregivers the ability to gain instant insight from medical experts around the world—capabilities that will be needed to address the simultaneous challenges of prolonged care, and the care of large numbers of casualties on the multi-domain operations (MDO) battlefield.” — Matt Quinn, Senior Director at TATRC
Figure 4: Telemedicine workflows using AWS Wickr

In a separate Chalk Talk titled Bolstering Incident Response with AWS Wickr and Amazon EventBridge, Senior AWS Wickr Solutions Architects Wes Wood and Charles Chowdhury-Hanscombe demonstrated how to integrate Wickr with Amazon EventBridge and Amazon GuardDuty to strengthen incident response capabilities with an integrated workflow (Figure 5) that connects your AWS resources to Wickr bots. Using this approach, you can quickly alert appropriate stakeholders to critical findings through a secure communication channel, even on a potentially compromised network.

Figure 5: AWS Wickr integration for incident response communications

Security is our top priority

AWS re:Invent featured many more highlights on a variety of topics, including adaptive access control with Zero Trust, AWS cyber insurance partners, Amazon CTO Dr. Werner Vogels’ popular keynote, and the security partnerships showcased on the Expo floor. It was a whirlwind experience, but one thing is clear: AWS is working hard to help you build a security-first mindset, so that you can meaningfully improve both technical and business outcomes.

To watch on-demand conference sessions, visit the AWS re:Invent Security, Identity, and Compliance playlist on YouTube.


Clarke Rodgers

Clarke is a Director of Enterprise Security at AWS. Clarke has more than 25 years of experience in the security industry, and works with enterprise security, risk, and compliance-focused executives to strengthen their security posture, and understand the security capabilities of the cloud. Prior to AWS, Clarke was a CISO for the North American operations of a multinational insurance company.

Anne Grahn

Anne is a Senior Worldwide Security GTM Specialist at AWS, based in Chicago. She has more than 13 years of experience in the security industry, and focuses on effectively communicating cybersecurity risk. She maintains a Certified Information Systems Security Professional (CISSP) certification.

Building a generative AI Marketing Portal on AWS

Post Syndicated from Tristan Nguyen original https://aws.amazon.com/blogs/messaging-and-targeting/building-a-generative-ai-marketing-portal-on-aws/

Introduction

In the preceding entries of this series, we examined the transformative impact of Generative AI on marketing strategies in “Building Generative AI into Marketing Strategies: A Primer” and delved into the intricacies of Prompt Engineering to enhance the creation of marketing content with services such as Amazon Bedrock in “From Prompt Engineering to Auto Prompt Optimisation”. We also explored the potential of Large Language Models (LLMs) to refine prompts for more effective customer engagement.

Continuing this exploration, we will articulate how Amazon Bedrock, Amazon Personalize, and Amazon Pinpoint can be leveraged to construct a marketer portal that not only facilitates AI-driven content generation but also personalizes and distributes this content effectively. The aim is to provide a clear blueprint for deploying a system that crafts, personalizes, and distributes marketing content efficiently. This blog will guide you through the deployment process, underlining the real-world utility of these services in optimizing marketing workflows. Through use cases and a code demonstration, we’ll see these technologies in action, offering a hands-on perspective on enhancing your marketing pipeline with AI-driven solutions.

The Challenge with Content Generation in Marketing

Many companies struggle to streamline their marketing operations effectively, facing hurdles at various stages of the marketing operations pipeline. Below, we list the challenges at three main stages of the pipeline: content generation, content personalization, and content distribution.

Content Generation

Creating high-quality, engaging content is often easier said than done. Companies need to invest in skilled copywriters or content creators who understand not just the product but also the target audience. Even with the right talent, the process can be time-consuming and costly. Moreover, generating content at scale while maintaining quality and compliance with industry regulations is the key blocker for many companies considering adopting generative AI technologies in production environments.

Content Personalization

Once the content is created, the next hurdle is personalization. In today’s digital age, generic content rarely captures attention. Customers expect content tailored to their needs, preferences, and behaviors. However, personalizing content is not straightforward. It requires a deep understanding of customer data, which often resides in siloed databases, making it difficult to create a 360-degree view of the customer.

Content Distribution

Finally, even the most captivating, personalized content is ineffective if it doesn’t reach the right audience at the right time. Companies often grapple with choosing the appropriate channels for content distribution, be it email, social media, or mobile notifications. Additionally, ensuring that the content complies with various regulations and doesn’t end up in spam folders adds another layer of complexity to the distribution phase. Sending at scale requires attention to deliverability, security, and reliability, which often poses significant challenges for marketers.

By addressing these challenges, companies can significantly improve their marketing operations and empower their marketers to be more effective. But how can this be achieved efficiently and at scale? The answer lies in leveraging the power of Amazon Bedrock, Amazon Personalize, and Amazon Pinpoint, as we will explore in the following solution.

The Solution In Action

Before we dive into the details of the implementation, let’s take a look at the end result through the linked demo video.

Use Case 1: Banking/Financial Services Industry

You are a relationship manager working in the Consumer Banking department of a fictitious company called AnyCompany Bank. You are assigned a group of customers and would like to send personalized, targeted communications to every member of this group through each customer’s channel of choice.

Behind the scenes, the marketer is utilizing Amazon Pinpoint to create the segment of customers they would like to target. The customers’ information and the marketer’s prompt are then fed into Amazon Bedrock to generate the marketing content, which is then sent to the customer via SMS and email using Amazon Pinpoint.
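
To make that flow concrete, here is a minimal sketch of the content-generation step: customer attributes from a Pinpoint segment are interpolated into a prompt and sent to a model on Amazon Bedrock. The model ID, prompt wording, and customer fields are illustrative assumptions, not the exact prompts used in the demo.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

# Illustrative customer record, as it might come from an Amazon Pinpoint segment.
customer = {"FirstName": "Priya", "Product": "fixed deposit", "Channel": "SMS", "Language": "English"}

prompt = (
    "\n\nHuman: You are a marketing copywriter for AnyCompany Bank. "
    f"Write a short, compliant {customer['Channel']} message in {customer['Language']} "
    f"for {customer['FirstName']} promoting our {customer['Product']} offering. "
    "Keep it under 160 characters and avoid making guarantees.\n\nAssistant:"
)

response = bedrock.invoke_model(
    modelId="anthropic.claude-v2",  # assumed model choice
    body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 200}),
)
print(json.loads(response["body"].read())["completion"].strip())
```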

  • In the Prompt Iterator page, you can employ a process called “prompt engineering” to further optimize your prompt and maximize the effectiveness of your marketing campaigns. Please refer to this blog post for the process behind engineering the prompt, as well as how to apply an additional LLM model for auto-prompting. To get started, simply copy the sample banking prompt on this page, which has already gone through the prompt engineering process.
  • Next, you can either upload your customer group by uploading a .csv file (through “Importing a Segment”) or specify a customer group using pre-defined filter criteria based on your current customer database using Amazon Pinpoint.

For example: the screenshot shows a sample filtered segment named ManagementOrRetired that includes only customers who are in management or retired.

  • Once done, you can log into the marketer portal and choose the relevant segment that you’ve just created within the Amazon Pinpoint console.

  • You can then preview the customers and their information stored in your Amazon Pinpoint’s customer database. Once satisfied, we’re ready to start generating content for those customers!
  • Click on the 1:1 Content Generator tab, and content is automatically generated for your first customer. Here, you can cycle through your customers one by one; depending on each customer’s preferred language and channel, an email or SMS in the preferred language is automatically generated for them.
    • Generated SMS in English

    • A negative example showing proper prompt engineering at work to moderate content. This happens if we try to insert data that does not make sense for the marketing content generator. In this case, the generator refuses (justifiably) to output an advertisement for a secured instalment loan aimed at a 6-year-old.

  • Finally, we choose to send the generated content via Amazon Pinpoint by clicking on “Send with Amazon Pinpoint”. In the back end, Amazon Pinpoint orchestrates the sending of the email or SMS through the appropriate channels (a minimal sending sketch follows this list).
    • Alternatively, if the auto-generated content still did not meet your needs and you want to generate another draft, you can Disagree and try again.
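
For reference, sending the generated content through Amazon Pinpoint can look roughly like the sketch below, which uses the SendMessages API for a direct SMS send; the project ID and phone number are placeholders, and an EmailMessage block could be used instead for the email channel.

```python
import boto3

pinpoint = boto3.client("pinpoint")

APPLICATION_ID = "examplePinpointProjectId"  # placeholder Amazon Pinpoint project ID
generated_message = (
    "Hi Priya, grow your savings with AnyCompany Bank fixed deposits. Reply STOP to opt out."
)

# Send the generated content as an SMS to a single endpoint (placeholder phone number).
pinpoint.send_messages(
    ApplicationId=APPLICATION_ID,
    MessageRequest={
        "Addresses": {"+6512345678": {"ChannelType": "SMS"}},
        "MessageConfiguration": {
            "SMSMessage": {"Body": generated_message, "MessageType": "PROMOTIONAL"}
        },
    },
)
```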

Use Case 2: Travel & Hospitality

You are a marketing executive working for an online air ticketing agency. You’ve been tasked with promoting a specific flight from Singapore to Hong Kong for AnyCompany airline. You’d first like to identify which customers would be prime candidates for this flight leg and then send out hyper-personalized messages to them.

Behind the scenes, instead of using Amazon Pinpoint to manually define the segment, the marketer in this case is leveraging the AI/ML capabilities of Amazon Personalize to define the best group of customers to recommend the specific flight leg to. Similar to the above use case, the customers’ information and the LLM prompt are fed into Amazon Bedrock, which generates the marketing content that is eventually sent out via Amazon Pinpoint.
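
Behind that step, the batch segmentation can be driven by the Amazon Personalize CreateBatchSegmentJob API, sketched below. The solution version ARN, S3 paths, and IAM role are placeholders, and the input file would contain the item (the flight leg) for which you want an audience.

```python
import boto3

personalize = boto3.client("personalize")

# Placeholders: a trained solution version, an input file of item IDs, and an output location.
response = personalize.create_batch_segment_job(
    jobName="sin-hkg-flight-audience",
    solutionVersionArn="arn:aws:personalize:us-east-1:111122223333:solution/flights/abc123",
    numResults=250,  # number of users to return for the item
    jobInput={"s3DataSource": {"path": "s3://example-bucket/input/flight-legs.json"}},
    jobOutput={"s3DataDestination": {"path": "s3://example-bucket/output/"}},
    roleArn="arn:aws:iam::111122223333:role/PersonalizeBatchRole",
)
print(response["batchSegmentJobArn"])  # poll this job until it completes, then read the output from S3
```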

  • Similar to the above use case, you’d need to go through a prompt engineering process to ensure that the content the LLM model generates will be relevant and safe for use. To get started quickly, go to the Prompt Iterator page, where you can use the sample airlines prompt and iterate from there.
  • Your company offers many different flight legs, aggregated from many different carriers. You first filter down to the flight leg that you want to promote using the Filters on the left. In this case, we are filtering for flights originating from Singapore (SRCCity) and going to Hong Kong (DSTCity), operated by AnyCompany Airlines.

  • Now, choose the number of customers that you’d like Amazon Personalize to return in the segment. Once satisfied, you choose to start the batch segmentation job.
  • In the background, Amazon Personalize generates a group of customers that are most likely to be interested in this flight leg based on past interactions with similar flight itineraries.
  • Once the segmentation job is finished as shown, you can fetch the recommended group of customers and start generating content for them immediately, similar to the first use case.

Setup instructions

The setup instructions and deployment details can be found in the GitHub link.

Conclusion

In this blog, we’ve explored the transformative potential of integrating Amazon Bedrock, Amazon Personalize, and Amazon Pinpoint to address the common challenges in marketing operations. By automating the content generation with Amazon Bedrock, personalizing at scale with Amazon Personalize, and ensuring precise content distribution with Amazon Pinpoint, companies can not only streamline their marketing processes but also elevate the customer experience.

The benefits are clear: time-saving through automation, increased operational efficiency, and enhanced customer satisfaction through personalized engagement. This integrated solution empowers marketers to focus on strategy and creativity, leaving the heavy lifting to AWS’s robust AI and ML services.

For those ready to take the next step, we’ve provided a comprehensive guide and resources to implement this solution. By following the setup instructions and leveraging the provided prompts as a starting point, you can deploy this solution and begin customizing the marketer portal to your business needs.

Call to Action

Don’t let the challenges of content generation, personalization, and distribution hold back your marketing potential. Deploy the Generative AI Marketer Portal today, adapt it to your specific needs, and watch as your marketing operations transform. For a hands-on start and to see this solution in action, visit the GitHub repository for detailed setup instructions.

Have a question? Share your experiences or leave your questions in the comment section.

About the Authors

Tristan (Tri) Nguyen

Tristan (Tri) Nguyen is an Amazon Pinpoint and Amazon Simple Email Service Specialist Solutions Architect at AWS. At work, he specializes in technical implementation of communications services in enterprise systems and architecture/solutions design. In his spare time, he enjoys chess, rock climbing, hiking and triathlon.

Philipp Kaindl

Philipp Kaindl is a Senior Artificial Intelligence and Machine Learning Solutions Architect at AWS. With a background in data science and mechanical engineering, his focus is on empowering customers to create lasting business impact with the help of AI. Outside of work, Philipp enjoys tinkering with 3D printers, sailing, and hiking.

Bruno Giorgini

Bruno Giorgini is a Senior Solutions Architect specializing in Pinpoint and SES. With over two decades of experience in the IT industry, Bruno has been dedicated to assisting customers of all sizes in achieving their objectives. When he is not crafting innovative solutions for clients, Bruno enjoys spending quality time with his wife and son, exploring the scenic hiking trails around the SF Bay Area.

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1

Post Syndicated from Raghavarao Sodabathina original https://aws.amazon.com/blogs/big-data/architectural-patterns-for-real-time-analytics-using-amazon-kinesis-data-streams-part-1/

We’re living in the age of real-time data and insights, driven by low-latency data streaming applications. Today, everyone expects a personalized experience in any application, and organizations are constantly innovating to increase their speed of business operation and decision making. The volume of time-sensitive data produced is increasing rapidly, with different formats of data being introduced across new businesses and customer use cases. Therefore, it is critical for organizations to embrace a low-latency, scalable, and reliable data streaming infrastructure to deliver real-time business applications and better customer experiences.

This is the first post in a blog series that offers common architectural patterns for building real-time data streaming infrastructures using Kinesis Data Streams for a wide range of use cases. It aims to provide a framework to create low-latency streaming applications on the AWS Cloud using Amazon Kinesis Data Streams and AWS purpose-built data analytics services.

In this post, we will review the common architectural patterns of two use cases: Time Series Data Analysis and Event Driven Microservices. In the subsequent post in our series, we will explore the architectural patterns in building streaming pipelines for real-time BI dashboards, contact center agent, ledger data, personalized real-time recommendation, log analytics, IoT data, Change Data Capture, and real-time marketing data. All these architecture patterns are integrated with Amazon Kinesis Data Streams.

Real-time streaming with Kinesis Data Streams

Amazon Kinesis Data Streams is a cloud-native, serverless streaming data service that makes it easy to capture, process, and store real-time data at any scale. With Kinesis Data Streams, you can collect and process hundreds of gigabytes of data per second from hundreds of thousands of sources, allowing you to easily write applications that process information in real time. The collected data is available in milliseconds to enable real-time analytics use cases such as real-time dashboards, real-time anomaly detection, and dynamic pricing. By default, the data within a Kinesis data stream is stored for 24 hours, with an option to increase the retention to 365 days. If customers want to process the same data in real time with multiple applications, they can use the enhanced fan-out (EFO) feature. Prior to this feature, every application consuming data from the stream shared the 2 MB/second/shard output. By configuring stream consumers to use enhanced fan-out, each data consumer receives a dedicated 2 MB/second pipe of read throughput per shard, which further reduces the latency of data retrieval.
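
As a concrete starting point, the sketch below writes a small batch of records to a stream and registers an enhanced fan-out consumer with the AWS SDK for Python (Boto3); the stream and consumer names are placeholders.

```python
import json
import boto3

kinesis = boto3.client("kinesis")
STREAM_NAME = "example-clickstream"  # placeholder stream name

# Write a small batch of records; the partition key determines shard placement.
records = [
    {"Data": json.dumps({"user": "u1", "action": "click"}).encode(), "PartitionKey": "u1"},
    {"Data": json.dumps({"user": "u2", "action": "view"}).encode(), "PartitionKey": "u2"},
]
kinesis.put_records(StreamName=STREAM_NAME, Records=records)

# Register an enhanced fan-out consumer so this application gets its own 2 MB/second per shard.
stream_arn = kinesis.describe_stream_summary(StreamName=STREAM_NAME)[
    "StreamDescriptionSummary"
]["StreamARN"]
kinesis.register_stream_consumer(StreamARN=stream_arn, ConsumerName="analytics-app")
```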

Kinesis Data Streams provides high availability and durability by synchronously replicating streamed data across three Availability Zones in an AWS Region, and gives you the option to retain data for up to 365 days. For security, Kinesis Data Streams provides server-side encryption so you can meet strict data management requirements by encrypting your data at rest, as well as Amazon Virtual Private Cloud (VPC) interface endpoints to keep traffic between your Amazon VPC and Kinesis Data Streams private.

Kinesis Data Streams has native integrations with other AWS services such as AWS Glue and Amazon EventBridge to build real-time streaming applications on AWS. Refer to Amazon Kinesis Data Streams integrations for additional details.

Modern data streaming architecture with Kinesis Data Streams

A modern streaming data architecture with Kinesis Data Streams can be designed as a stack of five logical layers; each layer is composed of multiple purpose-built components that address specific requirements, as illustrated in the following diagram:

The architecture consists of the following key components:

  • Streaming sources – Your sources of streaming data include clickstream data, sensors, social media, Internet of Things (IoT) devices, log files generated by your web and mobile applications, and mobile devices that generate semi-structured and unstructured data as continuous streams at high velocity.
  • Stream ingestion – The stream ingestion layer is responsible for ingesting data into the stream storage layer. It provides the ability to collect data from tens of thousands of data sources and ingest it in real time. You can use the Kinesis SDK for ingesting streaming data through APIs, the Kinesis Producer Library for building high-performance and long-running streaming producers, or a Kinesis agent for collecting a set of files and ingesting them into Kinesis Data Streams. In addition, you can use many pre-built integrations such as AWS Database Migration Service (AWS DMS), Amazon DynamoDB, and AWS IoT Core to ingest data in a no-code fashion. You can also ingest data from third-party platforms such as Apache Spark and Apache Kafka Connect.
  • Stream storage – Kinesis Data Streams offers two capacity modes to support data throughput: On-Demand and Provisioned. On-Demand mode, now the default choice, can elastically scale to absorb variable throughput, so customers don’t need to worry about capacity management and pay by data throughput. On-Demand mode automatically scales up to 2x the stream capacity over its historic maximum data ingestion to provide sufficient capacity for unexpected spikes. Alternatively, customers who want granular control over stream resources can use Provisioned mode and proactively scale the number of shards up and down to meet their throughput requirements. Additionally, Kinesis Data Streams stores streaming data for 24 hours by default, but retention can be extended to 7 days or up to 365 days depending on the use case. Multiple applications can consume the same stream.
  • Stream processing – The stream processing layer is responsible for transforming data into a consumable state through data validation, cleanup, normalization, transformation, and enrichment. The streaming records are read in the order they are produced, allowing for real-time analytics, event-driven applications, or streaming ETL (extract, transform, and load). You can use Amazon Managed Service for Apache Flink for complex stream data processing, AWS Lambda for stateless stream data processing (see the sketch after this list), and AWS Glue and Amazon EMR for near-real-time compute. You can also build customized consumer applications with the Kinesis Client Library, which takes care of many complex tasks associated with distributed computing.
  • Destination – The destination layer is a purpose-built destination depending on your use case. You can stream data directly to Amazon Redshift for data warehousing or to Amazon EventBridge for building event-driven applications. You can also use Amazon Kinesis Data Firehose for streaming integration, where you can apply light stream processing with AWS Lambda and then deliver the processed stream to destinations such as an Amazon S3 data lake, OpenSearch Service for operational analytics, a Redshift data warehouse, NoSQL databases such as Amazon DynamoDB, and relational databases such as Amazon RDS to consume real-time streams into business applications. The destination can be an event-driven application for real-time dashboards, automatic decisions based on processed streaming data, real-time alerting, and more.
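
To illustrate the stateless stream processing option called out in the stream processing layer, here is a minimal sketch of an AWS Lambda handler consuming a batch of Kinesis records; the validation and enrichment logic is purely illustrative.

```python
import base64
import json

def handler(event, context):
    """Minimal Kinesis-triggered Lambda handler: decode, validate, and enrich each record."""
    processed = []
    for record in event["Records"]:
        # Kinesis record payloads arrive base64-encoded in the Lambda event.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))

        # Illustrative validation: drop malformed records (or route them to a dead-letter queue).
        if "event_type" not in payload:
            continue

        # Illustrative enrichment: tag each record with the stream it came from.
        payload["source_stream"] = record["eventSourceARN"].split("/")[-1]
        processed.append(payload)

    # Hand the cleaned records to a downstream destination (omitted here).
    return {"processed_count": len(processed)}
```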

Real-time analytics architecture for time series

Time series data is a sequence of data points recorded over a time interval for measuring events that change over time. Examples are stock prices over time, webpage clickstreams, and device logs over time. Customers can use time series data to monitor changes over time, so that they can detect anomalies, identify patterns, and analyze how certain variables are influenced over time. Time series data is typically generated from multiple sources in high volumes, and it needs to be cost-effectively collected in near real time.

Typically, there are three primary goals that customers want to achieve in processing time-series data:

  • Gain real-time insights into system performance and detect anomalies
  • Understand end-user behavior to track trends and query/build visualizations from these insights
  • Have a durable storage solution to ingest and store both archival and frequently accessed data

With Kinesis Data Streams, customers can continuously capture terabytes of time series data from thousands of sources for cleaning, enrichment, storage, analysis, and visualization.

The following architecture pattern illustrates how real-time analytics can be achieved for time series data with Kinesis Data Streams:

Build a serverless streaming data pipeline for time series data

The workflow steps are as follows:

  1. Data Ingestion & Storage – Kinesis Data Streams can continuously capture and store terabytes of data from thousands of sources.
  2. Stream Processing – An application created with Amazon Managed Service for Apache Flink can read the records from the data stream to detect and clean any errors in the time series data and enrich the data with specific metadata to optimize operational analytics. Using a data stream in the middle provides the advantage of using the time series data in other processes and solutions at the same time. A Lambda function is then invoked with these events, and can perform time series calculations in memory.
  3. Destinations – After cleaning and enrichment, the processed time series data can be streamed to an Amazon Timestream database for real-time dashboarding and analysis, or stored in databases such as DynamoDB for end-user queries. The raw data can be streamed to Amazon S3 for archiving. A minimal consumer sketch that writes to Timestream follows these steps.
  4. Visualization and insights – Customers can query, visualize, and create alerts using Amazon Managed Service for Grafana. Grafana supports data sources that are storage backends for time series data. To access your data from Timestream, you need to install the Timestream plugin for Grafana. End users can query data from the DynamoDB table with Amazon API Gateway acting as a proxy.
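The following is a minimal sketch of the consumer in step 3: a Lambda function that decodes Kinesis records and writes one measurement per record to Timestream. The database name, table name, and payload fields are hypothetical placeholders.

```python
import base64
import json
import time
import boto3

timestream = boto3.client("timestream-write")

def handler(event, context):
    """Lambda consumer for the enriched time series stream.

    Decodes the base64-encoded Kinesis records and writes one measurement per
    record to Amazon Timestream.
    """
    records = []
    for record in event["Records"]:
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        records.append({
            "Dimensions": [{"Name": "device_id", "Value": payload["device_id"]}],
            "MeasureName": "temperature",
            "MeasureValue": str(payload["temperature"]),
            "MeasureValueType": "DOUBLE",
            "Time": str(int(time.time() * 1000)),  # epoch milliseconds
        })
    if records:
        timestream.write_records(
            DatabaseName="iot_metrics",      # placeholder database
            TableName="device_telemetry",    # placeholder table
            Records=records,
        )
```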

Refer to Near Real-Time Processing with Amazon Kinesis, Amazon Timestream, and Grafana, which showcases a serverless streaming pipeline that processes and stores device telemetry IoT data in a time series optimized data store such as Amazon Timestream.

Enriching & replaying data in real time for event-sourcing microservices

Microservices are an architectural and organizational approach to software development where software is composed of small independent services that communicate over well-defined APIs. When building event-driven microservices, customers want to achieve both high scalability to handle the volume of incoming events and reliable event processing that maintains system functionality in the face of failures.

Customers use microservice architecture patterns to accelerate innovation and time-to-market for new features, because they make applications easier to scale and faster to develop. However, it is challenging to enrich and replay data in a network call to another microservice, because doing so can impact the reliability of the application and make it difficult to debug and trace errors. To solve this problem, event sourcing is an effective design pattern that centralizes historic records of all state changes for enrichment and replay, and decouples read workloads from write workloads. Customers can use Kinesis Data Streams as the centralized event store for event-sourcing microservices, because Kinesis Data Streams can (1) handle gigabytes of data throughput per second per stream and deliver data in milliseconds, meeting the requirements for high scalability and near-real-time latency; (2) integrate with Flink and Amazon S3 for data enrichment and archiving while remaining completely decoupled from the microservices; and (3) allow retries and asynchronous reads at a later time, because it retains data records for 24 hours by default and optionally up to 365 days.
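The replay property is what makes the stream work as an event store. The following sketch re-reads retained events from a chosen point in time using the low-level boto3 API; the stream name and the process function are hypothetical placeholders.

```python
from datetime import datetime, timezone
import boto3

kinesis = boto3.client("kinesis")

def replay_events(stream_name, replay_from):
    """Re-read retained events from a given timestamp, shard by shard."""
    shards = kinesis.list_shards(StreamName=stream_name)["Shards"]
    for shard in shards:
        iterator = kinesis.get_shard_iterator(
            StreamName=stream_name,
            ShardId=shard["ShardId"],
            ShardIteratorType="AT_TIMESTAMP",
            Timestamp=replay_from,
        )["ShardIterator"]
        while iterator:
            result = kinesis.get_records(ShardIterator=iterator, Limit=1000)
            for record in result["Records"]:
                process(record["Data"])  # hypothetical downstream handler
            if not result["Records"] and result["MillisBehindLatest"] == 0:
                break  # caught up with the tip of the shard
            iterator = result["NextShardIterator"]

# Example: replay everything since midnight UTC on January 1, 2024
# replay_events("order-events", datetime(2024, 1, 1, tzinfo=timezone.utc))
```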

The following architectural pattern is a generic illustration of how Kinesis Data Streams can be used for Event-Sourcing Microservices:

The steps in the workflow are as follows:

  1. Data Ingestion and Storage – You can aggregate the input from your microservices into your Kinesis data stream for storage.
  2. Stream processing – Apache Flink Stateful Functions simplifies building distributed stateful event-driven applications. It can receive the events from an input Kinesis data stream and route the resulting stream to an output data stream. You can create a stateful functions cluster with Apache Flink based on your application business logic.
  3. State snapshot in Amazon S3 – You can store the state snapshot in Amazon S3 for tracking.
  4. Output streams – The output streams can be consumed by Lambda remote functions over HTTP or gRPC through API Gateway.
  5. Lambda remote functions – Lambda functions can act as microservices for various application and business logic to serve business applications and mobile apps.

To learn how other customers built their event-based microservices with Kinesis Data Streams, refer to the following:

Key considerations and best practices

The following are considerations and best practices to keep in mind:

  • Data discovery should be your first step in building modern data streaming applications. You must define the business value and then identify your streaming data sources and user personas to achieve the desired business outcomes.
  • Choose your streaming data ingestion tool based on your streaming data source. For example, you can use the Kinesis SDK for ingesting streaming data through APIs, the Kinesis Producer Library for building high-performance and long-running streaming producers, a Kinesis agent for collecting a set of files and ingesting them into Kinesis Data Streams, AWS DMS for CDC streaming use cases, and AWS IoT Core for ingesting IoT device data into Kinesis Data Streams. You can ingest streaming data directly into Amazon Redshift to build low-latency streaming applications. You can also use third-party libraries like Apache Spark and Apache Kafka to ingest streaming data into Kinesis Data Streams.
  • Choose your streaming data processing service based on your specific use case and business requirements. For example, you can use Amazon Managed Service for Apache Flink for advanced streaming use cases with multiple streaming destinations and complex stateful stream processing, or if you want to monitor business metrics over time windows (such as every hour). Lambda is good for event-based and stateless processing. You can use Amazon EMR for streaming data processing if you want to use your favorite open source big data frameworks. AWS Glue is good for near-real-time streaming data processing for use cases such as streaming ETL.
  • Kinesis Data Streams on-demand mode charges by usage and automatically scales up resource capacity, so it’s good for spiky streaming workloads and hands-free maintenance. Provisioned mode charges by capacity and requires proactive capacity management, so it’s good for predictable streaming workloads.
  • You can use the Kinesis Shard Calculator to estimate the number of shards needed for provisioned mode. You don't need to be concerned about shards with on-demand mode.
  • When granting permissions, you decide who gets which permissions to which Kinesis Data Streams resources, and you enable only the specific actions that you want to allow on those resources. Therefore, grant only the permissions that are required to perform a task. You can also encrypt data at rest by using an AWS KMS customer managed key (CMK); a short sketch of the encryption and retention API calls follows this list.
  • You can update the retention period via the Kinesis Data Streams console or by using the IncreaseStreamRetentionPeriod and the DecreaseStreamRetentionPeriod operations based on your specific use cases.
  • Kinesis Data Streams supports resharding. The recommended API for this is UpdateShardCount, which lets you modify the number of shards in your stream to adapt to changes in the rate of data flow through the stream. The lower-level resharding APIs (SplitShard and MergeShards) are typically used to handle hot shards.
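As a minimal sketch of the encryption and retention settings mentioned above, the following boto3 calls enable encryption at rest with a customer managed key and extend retention to 7 days; the stream name and key alias are placeholders.

```python
import boto3

kinesis = boto3.client("kinesis")

# Encrypt data at rest with a customer managed KMS key (alias is a placeholder).
kinesis.start_stream_encryption(
    StreamName="clickstream-events",
    EncryptionType="KMS",
    KeyId="alias/my-stream-cmk",
)

# Extend the retention period from the 24-hour default to 7 days (168 hours).
kinesis.increase_stream_retention_period(
    StreamName="clickstream-events",
    RetentionPeriodHours=168,
)
```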

Conclusion

This post demonstrated various architectural patterns for building low-latency streaming applications with Kinesis Data Streams. You can use the information in this post to build your own low-latency streaming applications.

For detailed architectural patterns, refer to the following resources:

If you want to build a data vision and strategy, check out the AWS Data-Driven Everything (D2E) program.


About the Authors

Raghavarao Sodabathina is a Principal Solutions Architect at AWS, focusing on Data Analytics, AI/ML, and cloud security. He engages with customers to create innovative solutions that address customer business problems and to accelerate the adoption of AWS services. In his spare time, Raghavarao enjoys spending time with his family, reading books, and watching movies.

Hang Zuo is a Senior Product Manager on the Amazon Kinesis Data Streams team at Amazon Web Services. He is passionate about developing intuitive product experiences that solve complex customer problems and enable customers to achieve their business goals.

Shwetha Radhakrishnan is a Solutions Architect for AWS with a focus in Data Analytics. She has been building solutions that drive cloud adoption and help organizations make data-driven decisions within the public sector. Outside of work, she loves dancing, spending time with friends and family, and traveling.

Brittany Ly is a Solutions Architect at AWS. She is focused on helping enterprise customers with their cloud adoption and modernization journey and has an interest in the security and analytics field. Outside of work, she loves to spend time with her dog and play pickleball.

Best practices for scaling AWS CDK adoption within your organization

Post Syndicated from David Hessler original https://aws.amazon.com/blogs/devops/best-practices-for-scaling-aws-cdk-adoption-within-your-organization/

Enterprises are constantly seeking ways to accelerate their journey to the cloud. Infrastructure as code (IaC) is crucial for automating and managing cloud resources efficiently. The AWS Cloud Development Kit (AWS CDK) lets you define your cloud infrastructure as code in your favorite programming language and deploy it using AWS CloudFormation. In this post, we will discuss strategies and best practices for accelerating CDK adoption within your organization. Our discussion begins after your organization has successfully completed a pilot. In this post, you will learn how to scale the lessons learned from the pilot project across your organization through platform engineering. You will learn how to reduce complexity through building reusable components, deploy with speed and safety via builder tooling, and accelerate project startup with an internal developer portal (IDP). We will conclude by discussing ways to participate in and benefit from the broader CDK community.

Before we dive in, let’s briefly discuss a new trend in technology: Platform Engineering. DevOps practices have helped IT organizations deliver software to customers more frequently and with higher quality. A recent evolution in DevOps is the introduction of platform engineering teams to build services, toolchains, and documentation to support workload teams. An important responsibility of the platform engineering team is governance of the software delivery process.

At Amazon, we have a long and storied history of leveraging platform engineering to accelerate deployments. This is why we are able to maintain 143 different compliance certifications and attestations while deploying 150 million times per year. Platform engineering increases productivity, reduces friction between ideas and implementation, and improves agility by accelerating the delivery of workloads via a secure, scalable, and reusable set of resources and components through self-service portals and developer tools. Platform engineering comprises seven capabilities: Platform Architecture, Data Architecture, Platform Product Engineering, Data Engineering, Provisioning & Orchestration, Modern App Development, and CI/CD. For more information on platform engineering, see the AWS Cloud Adoption Framework.

Establishing these capabilities takes several platform and workload teams working together. From an operating model standpoint, a workload team interacts with Platform Engineering in one of the three following ways (for more information, see Building a Cloud Operating Model):

The image describes three cloud operating models. The first model is transitional: separate Application Engineering and Application Operations teams are both supported by Cloud Platform Engineering. The second model is strategic: Application Engineering and Cloud Platform Engineering own the responsibility equally. The third model is also strategic: Application Engineering and Cloud Platform Engineering jointly own responsibility, but Application Engineering owns most of it.

Reduce Builder Complexity and Cognitive load with Reusable Components

So, how can the platform team incorporate CDK to accomplish their goals? One of the common objectives of the Platform Engineering team is to publish and curate reusable patterns called Constructs. Constructs provide a mechanism to create reusable, extensible, and common components that can be shared across multiple teams and projects.

Many customers write their own implementations of constructs to enforce security best practices such as encryption and specific AWS Identity and Access Management (IAM) policies. For example, you might create a MyCompanyBucket that implements your organization's security requirements in place of the default Amazon S3 Bucket construct. This bucket configuration can be implemented and extended by multiple teams to ensure they are using components that are validated by your security and compliance teams.
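A minimal sketch of such a construct, written here in Python and assuming hypothetical organizational defaults (KMS-managed encryption, blocked public access, TLS-only access, and versioning), might look like the following.

```python
from aws_cdk import aws_s3 as s3
from constructs import Construct

class MyCompanyBucket(Construct):
    """Opinionated S3 bucket that bakes in illustrative organizational defaults."""

    def __init__(self, scope: Construct, construct_id: str) -> None:
        super().__init__(scope, construct_id)
        self.bucket = s3.Bucket(
            self,
            "Bucket",
            # Defaults chosen for illustration; a real construct would encode the
            # settings your security and compliance teams have approved.
            encryption=s3.BucketEncryption.KMS_MANAGED,
            block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
            enforce_ssl=True,
            versioned=True,
        )
```

Workload teams would then instantiate MyCompanyBucket instead of the default s3.Bucket, inheriting the approved defaults without having to re-implement them.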

For customers focused on data governance, CDK constructs can automatically add in best practices for recovery time objectives and recovery point objectives by ensuring backups and architecture meet an organization's resilience policies. For advanced customers looking to enforce data lifecycle policies, create uniform access controls, or emit required KPIs, CDK constructs can provide avenues to create safe and secure configurations by default. Applying CDK constructs to DataOps, customers can benefit from templated ETL pipelines that ensure data lineage metadata is maintained and data cleansing occurs.

Customers also build constructs for non-AWS resources. Teams can build constructs for third-party builder tooling, observability systems, testing apparatuses, and more. In this way, workload teams can codify AWS and non-AWS resources in one code base. When writing your own constructs, there is a balance between ensuring standardization and preserving the freedom and flexibility to take advantage of the growing ecosystem of CDK packages. AWS Solutions Constructs illustrate this balance, as they are typically built upon standard constructs. If you don't extend standard constructs, the constructs you build will be harder for consumers to integrate with the larger CDK ecosystem, which relies on standardized interfaces.

Construct Hub is a central destination for discovering and sharing cloud application design patterns and reference architectures defined for CDK that are built and published by the AWS community. While AWS provides a public Construct Hub, enterprises can maintain their own private Construct Hub inside their AWS accounts (see construct-hub, the GitHub repository, or the CDK Workshop for more details). The primary objective in either case remains consistent: to provide shared libraries that can be readily utilized by different workload teams. This approach ensures enhanced consistency and reusability, and ultimately leads to cost reduction and faster development timelines.

One of the pitfalls customers often encounter with this approach is that Platform Engineering cannot keep up with building reusable components for the latest technology enhancements. This is where the lessons learned from a pilot really can help. A pilot team works with platform engineering to research and implement security best practices. Some customers have the platform engineering team act as approvers of new constructs in addition to authors of new constructs. In this model, a pilot team works to build constructs for a new technology, and the platform engineers approve them, ensuring the pilot team meets required standards such as encryption at rest, encryption in transit, and least privilege. After approval, the pilot team can publish the new constructs to Construct Hub. In this way, platform engineering can enable experimentation and innovation, rather than become a gatekeeper. Additionally, platform engineering teams can encourage and curate an inner-sourcing model for construct creation rather than being the sole creator of constructs.

Deploy Applications Using DevSecOps Best Practices

Application builders are most productive when their expertise is channeled towards writing code that directly addresses business challenges. While creating applications is a skill well within the grasp of many software developers, the complex task of deploying and operating these applications in line with organizational standards can be overwhelming, especially for those new to a team. This complexity often acts as a bottleneck, slowing down the experimentation process and delaying the realization of value from new application initiatives.

A solution to this challenge lies in automating the deployment pipeline and operational model. By employing thoroughly tested CDK (Cloud Development Kit) components that are shared across teams and validated through a robust CI/CD (Continuous Integration/Continuous Deployment) process, the burden on developers is significantly reduced. They no longer need to delve into the complexities of the organization’s deployment strategies, allowing them to concentrate on writing unique, innovative code. This approach not only streamlines the development process but also bridges the gap between development and operations, leading to more cohesive teams and faster, more efficient releases.

One key to high-quality software delivery is to have a proper Continuous Integration and Continuous Delivery (CI/CD) process in place. You can see CDK Pipelines: Continuous delivery for AWS CDK applications for practical examples. This high-level construct, powered by AWS CodePipeline, comes in handy when you need to go beyond test deployments with the cdk deploy command and build automated pipelines for production deployments to multiple environments in different regions and/or accounts.

Whenever you commit your AWS CDK app's source code to an AWS CodeCommit, GitHub, GitLab, BitBucket, or Amazon CodeCatalyst source repository, CDK Pipelines automatically builds, tests, and deploys a new version of the application. The pipeline automatically reconfigures itself as the resources in your stacks or the environments being deployed to change. For GitHub Actions users, see CDK Pipelines for GitHub Workflows.
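The following is a minimal CDK Pipelines sketch in Python. The repository name, branch, and connection ARN are placeholders, and the synth commands assume a Python CDK app.

```python
from aws_cdk import Stack, pipelines
from constructs import Construct

class PlatformPipelineStack(Stack):
    """Self-mutating pipeline that builds, synthesizes, and deploys the CDK app."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        pipeline = pipelines.CodePipeline(
            self,
            "Pipeline",
            synth=pipelines.ShellStep(
                "Synth",
                # Placeholder repository, branch, and CodeStar connection ARN.
                input=pipelines.CodePipelineSource.connection(
                    "my-org/my-cdk-app",
                    "main",
                    connection_arn="arn:aws:codestar-connections:REGION:ACCOUNT:connection/EXAMPLE",
                ),
                commands=[
                    "pip install -r requirements.txt",
                    "npm install -g aws-cdk",
                    "cdk synth",
                ],
            ),
        )
        # Deployment stages (for example, staging and production) are attached
        # with pipeline.add_stage(...), as sketched in the next section.
```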

A number of teams are extending these pipelines and adding their own stages to ensure deployed code meets the organization's quality, security, risk, compliance, and cloud financial management criteria. For best practices on what automation to put inside the pipeline, see the AWS Deployment Pipeline Reference Architecture. By creating fully functional pipelines, platform engineering teams can reduce the cognitive load placed on development teams and improve the developer experience. This strategy has two implementations: QuickStart pipelines and golden pipelines.

QuickStart pipelines are created as constructs in your Construct Hub and treated much like the reusable components discussed above. While these pipelines offer simplified interfaces and a reduction in cognitive load, workload teams remain in control of the pipeline and are free to modify it. As a result, quality gates such as security or compliance tooling can be disabled by workload teams, and controls inside the pipeline aren't provable. This is suboptimal for organizations looking to reduce the costs of compliance and audit. As the number of versions of the construct grows, teams can also have difficulty governing which versions are in use.

In golden pipelines, the pipelines are created as constructs, but deployed via a centralized team. Workload teams cannot control or modify these pipelines, so quality gates such as security and compliance tooling cannot be disabled. These controls become provable to stakeholders in security, risk and compliance such as auditors. Removing permissions from workload teams comes with costs. With golden pipelines, platform engineering teams often spend a majority of their time troubleshooting workload teams’ deployments. With so much time spent on troubleshooting, teams have little time to introduce new tooling to raise the security and quality standard, improve environment setup and organizational consistency, or improve audit evidence and enforcement.

Two mechanisms can augment these strategies. Traditional change control boards (CCBs) can provide provability in situations where gathering evidence and enforcement are difficult. CCBs can benefit from CDK constructs that integrate IT Service Management (ITSM) approvals and fleet management processes into the pipeline and account creation processes. Alternatively, there is an emerging story with Supply-chain Levels for Software Artifacts (SLSA). These artifacts can be used as digital proof. In the Kubernetes space, we see this pattern with tools like Tekton Chains, where attestations are associated with OCI images and Kyverno is used to enforce the presence of those attestations (see Protect the pipe! Secure CI/CD pipelines with a policy-based approach using Tekton and Kyverno for details).

Multi-account and cross-region deployment with CDK

DevOps best practices suggest multiple stages of deployment and testing before deploying to production. On top of that, AWS recommends a dedicated account for each stage to simplify resource isolation and access control. This multi-account strategy helps organizations make best use of AWS resources and provides fine-grain controls (see Recommended OUs and accounts).

Often, you will have a designated AWS account, where all CI/CD pipelines reside. A deployment is executed by these pipelines to publish to other AWS accounts, which may correspond to development, staging, or production stages. For more information about a cross-account strategy in reference to CI/CD pipelines on AWS, see Building a Secure Cross-Account Continuous Delivery Pipeline.
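A common way to express this with CDK is to model each stage as a CDK Stage bound to an explicit account and Region. The account IDs and Regions below are placeholders; in a CDK Pipelines setup, each stage would be attached to the pipeline with pipeline.add_stage().

```python
from aws_cdk import App, Environment, Stack, Stage
from constructs import Construct

class WorkloadStage(Stage):
    """One deployable copy of the application; real stacks would be added here."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        Stack(self, "WorkloadStack")  # placeholder application stack

app = App()

# Account IDs and Regions are placeholders. The pipeline account deploys into
# the staging and production accounts it has been bootstrapped to trust.
WorkloadStage(app, "Staging",
              env=Environment(account="111111111111", region="eu-west-1"))
WorkloadStage(app, "Prod",
              env=Environment(account="222222222222", region="us-east-1"))

app.synth()
```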

Automated Governance

Many enterprise customers leverage CDK to enforce security controls and policies, and can prevent security issues before deployment with tooling that analyzes code as part of the deployment pipeline. Using industry-standard tooling such as cdk-nag, many teams check applications for best practices using a combination of the available rule packs. We are also seeing enterprises build their own Aspects to enforce additional requirements, such as tagging, to manage and organize their deployed resources.

Customers can take the CloudFormation templates that CDK synthesizes and add additional checkpoints with CloudFormation Guard to verify the output using policy-as-code domain-specific language (DSL) rules. Platform engineering teams can build the rules, and workload teams can consume them and run CloudFormation Guard inside the pipeline. There is an official construct that makes it easy to add CloudFormation Guard checks to your application.
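For example, cdk-nag rule packs are applied as Aspects across an entire app. The following minimal sketch assumes the cdk-nag Python package is installed.

```python
import aws_cdk as cdk
from cdk_nag import AwsSolutionsChecks

app = cdk.App()
# ... your stacks are defined here ...

# Apply the AWS Solutions rule pack to every construct in the app so findings
# surface at synth time, before anything reaches an environment.
cdk.Aspects.of(app).add(AwsSolutionsChecks(verbose=True))

app.synth()
```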

With AWS CDK, infrastructure is code. So, the standard tooling you already use to ensure quality and improve the builder experience should be used with CDK. If your organization has a code quality program, treat CDK applications no differently than web applications or microservices. Similarly, with Amazon CodeGuru Security and Amazon CodeWhisperer, builders can get actionable recommendations on how to improve both the security and quality on their CDK code as they would with any other type of application.

With Aspects, cdk-nag, and code quality tools, organizations can prevent security issues before they are deployed. However, it is also important to create controls that work after a deployment occurs. AWS CloudFormation Hooks allow customers to inspect resources before CloudFormation stacks or CDK applications create, update, or delete them. With CloudFormation Hooks, platform engineering teams can issue warnings or block the provisioning of non-compliant resources. These hooks can be created via CDK (see Build and Deploy CloudFormation Hooks using A CI/CD Pipeline for details).

Finally, you can deploy AWS Config conformance packs via CDK. These collections of rules help your organization enforce security standards at scale. If your organization wishes to build custom rules, teams can build reactive controls using higher-level constructs for AWS Config rules. While many of these patterns existed prior to CDK, CDK helps accelerate building and deploying cloud applications and controls by leveraging reusable components that are shared within the enterprise or by the community at large.
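As a small sketch of a detective control deployed with CDK, the following stack creates a single AWS Config managed rule. The rule choice is illustrative, and an AWS Config recorder is assumed to already be enabled in the account.

```python
from aws_cdk import Stack
from aws_cdk import aws_config as config
from constructs import Construct

class GovernanceStack(Stack):
    """Detective control sketch: one AWS Config managed rule deployed with CDK."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # Flags S3 buckets that do not have server-side encryption enabled.
        config.ManagedRule(
            self,
            "BucketSseEnabled",
            identifier=config.ManagedRuleIdentifiers.S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED,
        )
```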

Operate the Application using Observability

The open-source community provides high-level construct libraries that expand the basic monitoring capabilities of CDK applications. The cdk-monitoring-constructs project makes it easy to monitor CDK apps. Similarly, Cdk-wakeful takes that a step further, adding many additional services and providing easily configurable interfaces to automatically notify you through AWS Systems Manager Incident Manager, AWS Chatbot, or Amazon Simple Notification Service. By leveraging prebuilt solutions from the open-source community, you can focus on creating custom metrics and thresholds around your business logic. Platform engineering teams can modify and extend these open-source projects to help workload teams simplify their operations and emit health and status to centralized systems.

Accelerate New Project Startup with an Internal Developer Platform

An Internal Developer Platform (IDP) is built by platform engineering teams to create golden paths and enable developer self-service. These golden paths are expressed as a series of templates that define the structure of a source control repository and the files stored inside it. When the IDP uses these templates to create source code repositories, the resulting repository contains the following:

  • A getting-started tutorial (usually in a README.md)
  • Reference documentation
  • Skeleton source code
  • Dependency Management
  • CI/CD pipeline template
  • IaC template
  • Observability configuration

With CDK, the CI/CD pipeline, IaC template, and observability configuration can all be a part of a single CDK application.

Platform engineering teams build golden paths and expose them using tools like Backstage, Humanitec, or Port. When building golden paths, there are two common approaches to the underlying project structure. Some organizations choose the approach where their IaC code repository is separate from the application code. Others choose to include everything in one repository. There is a healthy tension between how much to place inside a golden path vs a reusable component. In both strategies, platform engineering teams can avoid code duplication by leveraging CDK. The approach your organization chooses will dictate how you organize your reusable components. Below, we will walk through both options and the implications on reusable constructs.

Option 1: Everything in one repository

In this approach, all the code is contained in one repository: infrastructure, application, configuration, and deployment. This approach enables builders to collaborate, build features, and innovate together quickly, which is why it is the recommended approach. For more details, refer to the Best practices documentation. For examples, see AWS Deployment Reference Architecture for Applications.

This approach works best in teams that are "value-stream aligned." Value-stream aligned teams have development and operations capabilities within the same team. These teams are organized around solving problems for customers rather than around technical capabilities. Within the project, teams can organize around logical units such as application tier (API, database, and so on) or business capabilities (order management, product catalog, delivery services, and so on). In organizations that are value-stream aligned, larger, highly conventionalized reusable components work better. An extreme example of this type of construct is a single construct that contains all the code for an entire microservice. In these teams, the cognitive load focuses on the customer problem, so reducing the complexity of developing applications is critical to success.

Option 2: Separated application code pipeline

In this alternative approach, you can decouple your application code from your infrastructure by storing them in separate repositories and having separate pipelines. Separating the pipelines often leads to siloes and less collaboration between workload builders, who shift focus to developing features, and infrastructure engineers, who limit their efforts to building the infrastructure on which those applications run.

This approach works best in teams that are “matrixed.” A matrix organization is structured around technical capabilities (development, operations, security, business, etc.). In these cases, more modular constructs work better than constructs that are highly conventionalized. Experts from each organization can use CDK constructs as mechanisms to share their expertise across the entire organization. Examples of these types of constructs are monitoring, alerting, or security constructs prebuilt with hooks to plug in to centralized monitoring.

Building a Community of Practice with Platform Engineering

Scaling any new technology within a large organization requires the creation and enablement of a community that fosters collaboration, establishes best practices, and stays up to date with the changes in the ecosystem. In order to enable the creation of these communities of practice within your organization, AWS supports multiple public communities centered around the creation of content to educate and enable CDK users. Members of your organization’s community of practice can connect with other CDK development teams around the world through these public AWS supported communities.

Communities of Practice

A Community of Practice (CoP) is a group of people with a shared interest who come together to learn, collaborate, and develop expertise in a specific domain through informal interactions and knowledge sharing. Within your organization, establishing communities of practice around CDK has proven to enable mentorship, problem solving, and reusable assets. To get started, your platform engineering team – the creators of reusable constructs and builder tooling with CDK – become early content creators for the community of practice. This establishes a feedback loop where CDK creators publicize their achievements via the CoP and consumers can ask questions and provide direct feedback to creators. Once the CoP has sustainably expanded beyond the initial group that established it, it can start to add hack-a-thons or game days within your organization, which can spark innovation and solve organization-wide challenges. Fully mature communities of practice own curated wikis or databases of knowledge. They use mechanisms such as townhalls, office hours, newsletters, and chat channels to keep the community up to date. In this way, CDK expertise is diffused across the organization. At AWS, this diffusion of expertise has led to teams other than platform engineering becoming creators of reusable constructs. By expanding who can create reusable constructs, we are able to accelerate our own innovation.

Communities

There is a growing community that supports CDK, with many different platforms available providing content, code, examples, and meetups. CDK is currently maintained by AWS with support from the community on the AWS CDK GitHub page, where you can contribute to the platform, raise issues, see the backlog, and join discussions with active community members.

CDK.dev is the community driven hub around the CDK ecosystem. This site brings together all the latest blogs, videos, and educational content. It also provides links to join the community Slack platform.

CDK Patterns houses an open source collection of AWS serverless architecture patterns built with CDK for developers to use. These patterns are sourced from AWS Community Builders and AWS Heroes.

Finally, AWS re:Post provides a question-and-answer portal where the community can ask and resolve questions.

The AWS Community Builders program offers technical resources, education, and networking opportunities to AWS technical enthusiasts and emerging thought leaders who are passionate about sharing knowledge and connecting with the technical community.

Communities of practice can leverage AWS public communities like cdk.dev to fill gaps in knowledge. Townhalls can benefit from speakers from AWS Heroes or community builders, frequent contributors to GitHub or re:Post, or speakers from CDK Day. Newsletters can aggregate and summarize the latest news from across all AWS channels. Once your community of practice establishes CDK competencies, this collaboration can also be bidirectional. For example, experts in your organization’s community can become AWS Heroes. Success stories can be shared via CDK Day, guest blog posts, and you might even speak at one of our major events such as AWS Summits, AWS re:Invent, AWS re:Inforce, or AWS re:Mars.

Final Thoughts

As we’ve said throughout this blog, with CDK, Infrastructure is code. This has enabled a paradigm shift in the infrastructure management space. Today, we see many customers such as Liberty Mutual, Scenario, Checkmarx, and Registers of Scotland establishing mature ecosystems using CDK. With an active open-source community, an AWS dev team for long term support, and multiple platforms for knowledge sharing, your builders can quickly learn, build, and innovate. Due to successful pilots, many organizations adopt CDK, become more agile, and innovate faster. This is exactly what happened at Amazon, where CDK is the first choice for building new services.

Organizations often scale and reduce complexity through platform engineering. These teams build higher-level constructs by applying best practices, and provide CI/CD pipelines to accelerate deployments. Deployments become safer through unit testing of your infrastructure as code and through robust security controls that guide builders at every stage, from author to operate.

Finally, establishing a community enables your organization to build its own mature ecosystem. Through both internal and open-source communities your builders can connect, discover, and grow.


David Hessler

Prior to joining AWS, David spent a decade serving as a principal technologist and establishing Platform Engineering and SRE teams for the United States government. Since joining AWS in 2020, David has spent his time helping customers accelerate deployment speed and safety for some of AWS’s largest commercial and public sector customers. Today, as a part of the DevSecOps team within Global Services Security, he is building the next generation of DevSecOps tooling for AWS customers.


Amritha Shetty

Amritha is a Solutions Architect at AWS. She works with public sector customers to help them migrate and modernize in the cloud. She loves helping citizens get more from public sector institutions through rapid innovation in the cloud. She brings over twelve years of software design and development experience and is passionate about helping customers implement the next-generation development experience.


Chris Scudder

Chris is a Senior Solutions Architect with the UK Public Sector team. His primary focus is helping Public Sector customers adopt cloud technologies for their workloads and streamline their development and operational processes. He has a background in application development and has created multiple Industry Solutions for UK Local Government. He has an interest in machine learning and delivers AWS DeepRacer events alongside his day-to-day role.


Kumar Karra

Kumar Karra is a Senior Field Solutions Architect for AWS Small and Medium Business Customers. He has a strong background in designing and developing applications, ranging from small consumer-facing applications to large mission-critical applications for enterprises. He specializes in NextGen Developer Experience tools and enjoys helping customers shorten their time to value by guiding them on strategies to implement fast, repeatable, testable, and scalable tools and architectures.

AWS Security Profile: Arynn Crow, Sr. Manager for AWS User AuthN

Post Syndicated from Maddie Bacon original https://aws.amazon.com/blogs/security/aws-security-profile-arynn-crow-sr-manager-for-aws-user-authn/

In the AWS Security Profile series, I interview some of the humans who work in AWS Security and help keep our customers safe and secure. In this profile, I interviewed Arynn Crow, senior manager for AWS User AuthN in AWS Identity.


How long have you been at AWS, and what do you do in your current role?

I’ve been at Amazon for over 10 years now, and AWS for three of those years. I lead a team of product managers in AWS Identity who define the strategy for our user authentication related services. This includes internal and external services that handle AWS sign-in, account creation, threat mitigation, and underlying authentication components that support other AWS services. It’s safe to say that I’m thinking about something different nearly every day, which keeps it fun.

How do you explain your job to non-technical friends and family?

I tell people that my job is about figuring out how to make sure that people are who they say they are online. If they want to know a bit more, sometimes I will relate this to examples they’re increasingly likely to encounter in their everyday lives—getting text or email messages for additional security when they try to sign in to their favorite website, or using their fingerprint or facial scan to sign in instead of entering a password. There’s a lot more to identity and authentication, of course, but this usually gets the point across!

You haven’t always been in security. Tell me a little bit about your journey and how you got started in this space?

More than 10 years ago now, I started in one of our call centers as a temporary customer service agent. I was handling Kindle support calls (this was back when our Kindles still had physical keyboards on them, and "Alexa" wasn't even part of our lexicon yet). After New Year's 2013, I was converted to a full-time employee and resumed my college education—I earned both of my degrees (a BA in International Affairs and an MA in political science) while working at Amazon. Over the next few years, I moved into different positions including our Back Office team, a Kindle taskforce role supporting the launch of a new service, and Executive Customer Relations. Throughout these roles, I continued to manage projects related to anti-abuse and security. I got a lot of fulfillment out of these projects—protecting our customers, employees, and business against fraud and data loss is very gratifying work. When a position opened up in our Customer Service Security team, I got the role thanks in part to my prior experience working with that team to deliver security solutions within our operations centers.

After that, things moved fast—I started first with a project on account recovery and access control for our internal workforce, and continuously expanded my portfolio into increasingly broad and more technical projects that all related to what I now know is the field of Identity and Access Management. Eventually, I started leading our identity strategy for customer service as a whole, including our internal authentication and access management as well as external customer authentication to our call centers. I also began learning about and engaging more with the security and identity community that existed outside of Amazon by attending conferences and getting involved with organizations working on standards development like the FIDO Alliance. Moving to AWS Identity a few years later was an obvious next step to gain exposure to broader applications of identity.

What advice do you have for people who want to get into security but don’t have the traditional background?

First, it can be hard. This journey wasn’t easy for me, and I’m still working to learn more every day. I want to say that because if someone is having trouble landing their first security job, or feeling like they still don’t “fit” at first when they do get the job, they should know it doesn’t mean they’re failing. There are a lot of inspiring stories out there about people who seemingly naturally segued into this field from other projects and work, but there are just as many people working very hard to find their footing. Everyone doubts themselves sometimes. Don’t let it hold you back.

Next for the practical advice, whatever you’re doing now, there are probably opportunities to begin looking at your space with a security lens, and start helping wherever you find problems to address or processes to improve by bringing them to your security teams. This will help your organization while also helping you build relationships. Be insatiably curious! Cybersecurity is community-oriented, and I find that people in this field are very passionate about what we do. Many people I met were excited that I was interested in learning about what they do and how they do it. Sometimes, they’d agree to take a couple hours with me each month for me to ask questions about how things worked, and narrow down what resources were the best use of my time.

Finally, there are a lot of resources for learning. We have highly competent, successful security professionals that learned on the job and don’t hold a roster of certifications, so I don’t think these are essential for success. But, I do think these programs can be beneficial to familiarize you with basic concepts and give you access to a common language. Various certification and training courses exist, from basic, free computer science courses online to security-specific ones like CISSP, SANS, COMPTIA Security+, and CIDPro, to name just a few. AWS offers AWS-specific cloud security training, too, like our Ramp-Up Guide. You don’t have to learn to code beautifully to succeed in security, but I think developing a working understanding of systems and principles will help build credibility and extract deeper learning out of experiences you have.

In your opinion, why is it important to have people with different backgrounds working in security?

Our backgrounds color the way we think about and approach problems, and considering all of these different approaches helps make us well-rounded. And particularly in the current context, in which women and marginalized communities are underrepresented in STEM, expanding our thinking about what skills make a good security practitioner makes room for more people at the table while giving us a more comprehensive toolkit to tackle our toughest problems. As for myself, I apply my training in political science. Security sometimes looks like a series of technical challenges and solutions, but it’s interwoven with a complex array of regulatory and social considerations, too—this makes the systems-based and abstract thinking I honed in my education useful. I know other folks who came to identity from social science, mathematics, and biology backgrounds who feel the same about skills learned from their respective fields.

Pivoting a bit, what’s something that you’re working on right now that you’re excited about?

It's a very interesting time to be working on authentication. Many people who aren't working in enterprises or regulated industries are still hesitant to adopt controls like multi-factor authentication (MFA). And beyond MFA, organizations like NIST and CISA are emphasizing the importance of phishing-resistant MFA. So, at the same time that we're continuously working to innovate in our MFA and other authentication offerings to customers, we're collaborating with the rest of the industry to advance technologies for strong authentication and their adoption across sectors. I represent Amazon to the FIDO Alliance, which is an industry association that supports the development of a set of protocols collectively known as FIDO2 for strong, phishing-resistant authentication. With FIDO and its various member companies, we're working to increase the usability, awareness, and adoption of FIDO2 security keys and passkeys, which are a newer implementation of FIDO2 that improves ease of use by enabling customers to use phishing-resistant keys across devices and platforms.

In your opinion, what is the coolest thing happening in identity right now?

What I think is the most important thing happening in identity is the convergence of digital and "traditional" identities. The industry is working through challenging questions with emerging technology right now to balance innovation with concern for equity, privacy, and sustainability. Ease of use and improved security for users, as well as abuse prevention for businesses, are driving the conversion of real-life identities and credentials (such as driver's licenses) to digital formats, including digital driver's licenses, wallets, and emerging verifiable credentials.

What are you most proud of in your career?

I’m most grateful for the opportunities I’ve had to help define the next chapter of the AWS account protection strategy. Some of our work also translates to features we get to ship to customers, like when we extended support for multiple MFA devices for AWS Identity and Access Management (IAM) late last year, and this year we announced that in 2024 we will require MFA when customers sign in to the AWS Management Console. Seeing how excited people were for a security feature was really awesome. Account protection has always been important, but this is especially true in the years following the COVID-19 outbreak when we saw a rapid acceleration of resources going digital. This kind of work definitely isn’t a one-person show, and as fulfilling as it is to see the impact I have here, what I’m really proud of is that I get to work with and learn from so many really smart, competent, and kind team members that are just as passionate about this space as I am.

If you were to do anything other than security, what would you want to do?

Before I discovered my interest for security, I was trying to decide if I would continue on from my master’s program in political science to do a PhD in either political science or public health. Towards the end of my degree program, I became really interested in how research-driven public policy could drive improvements in maternal and infant health outcomes in areas with acute opioid-related health crises, which is an ongoing struggle for my home place. I’m still very invested in that topic and try to keep on top of the latest research—I could easily see myself moving back towards that if I ever decide it’s time to close this chapter.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Maddie Bacon

Maddie (she/her) is a technical writer for Amazon Security with a passion for creating meaningful content that focuses on the human side of security and encourages a security-first mindset. She previously worked as a reporter and editor, and has a BA in Mathematics. In her spare time, she enjoys reading, traveling, and staunchly defending the Oxford comma.


Arynn Crow

Arynn Crow is a Manager of Product Management for AWS Identity. Arynn started at Amazon in 2012, trying out many different roles over the years before finding her happy place in security and identity in 2017. Arynn now leads the product team responsible for developing user authentication services at AWS.

AWS Security Profile: Chris Betz, CISO of AWS

Post Syndicated from Chris Betz original https://aws.amazon.com/blogs/security/aws-security-profile-chris-betz-ciso-of-aws/

In the AWS Security Profile series, we feature the people who work in Amazon Web Services (AWS) Security and help keep our customers safe and secure. This interview is with Chris Betz, Chief Information Security Officer (CISO), who began his role as CISO of AWS in August of 2023.


How did you get started in security? What prompted you to pursue this field?

I’ve always had a passion for technology, and for keeping people out of harm’s way. When I found computer science and security in the Air Force, this world opened up to me that let me help others, be a part of building amazing solutions, and engage my competitive spirit. Security has the challenges of the ultimate chess game, though with real and impactful consequences. I want to build reliable, massively scaled systems that protect people from malicious actors. This is really hard to do and a huge challenge I undertake every day. It’s an amazing team effort that brings together the smartest folks that I know, competing with threat actors.

What are you most excited about in your new role?

One of the most exciting things about my role is that I get to work with some of the smartest people in the field of security, people who inspire, challenge, and teach me something new every day. It's exhilarating to work together to make a significant difference in the lives of people all around the world, who trust us at AWS to keep their information secure. Security is constantly changing, so we get to learn, adapt, and get better every single day. I get to spend my time helping to build a team and culture that customers can depend on, and I'm constantly impressed and amazed at the caliber of the folks I get to work with here.

How does being a former customer influence your role as AWS CISO?

I was previously the CISO at Capital One and was an AWS customer. As a former customer, I know exactly what it’s like to be a customer who relies on a partner for significant parts of their security. There needs to be a lot of trust, a lot of partnership across the shared responsibility model, and consistent focus on what’s being done to keep sensitive data secure. Every moment that I’m here at AWS, I’m reminded about things from the customer perspective and how I can minimize complexity, and help customers leverage the “super powers” that the cloud provides for CISOs who need to defend the breadth of their digital estate. I know how important it is to earn and keep customer trust, just like the trust I needed when I was in their shoes. This mindset influences me to learn as much as I can, never be satisfied with ”good enough,” and grab every opportunity I can to meet and talk with customers about their security.

What’s been the most dramatic change you’ve seen in the security industry recently?

This is pretty easy to answer: artificial intelligence (AI). This is a really exciting time. AI is dominating the news and is on the mind of every security professional, everywhere. We’re witnessing something very big happening, much like when the internet came into existence and we saw how the world dramatically changed because of it. Every single sector was impacted, and AI has the same potential. Many customers use AWS machine learning (ML) and AI services to help improve signal-to-noise ratio, take over common tasks to free up valuable time to dig deeper into complex cases, and analyze massive amounts of threat intelligence to determine the right action in less time. The combination of Data + Compute power + AI is a huge advantage for cloud companies.

AI and ML have been a focus for Amazon for more than 25 years, and we get to build on an amazing foundation. And it’s exciting to take advantage of and adapt to the recent big changes and the impact this is having on the world. At AWS, we’re focused on choice and broadening access to generative AI and foundation models at every layer of the ML stack, including infrastructure (chips), developer tools, and AI services. What a great time to be in security!

What’s the most challenging part of being a CISO?

Maintaining a culture of security involves each person, each team, and each leader. That’s easy to say, but the challenge is making it tangible—making sure that each person sees that, even though their title doesn’t have “security” in it, they are still an integral part of security. We often say, “If you have access, you have responsibility.” We work hard to limit that access. And CISOs must constantly work to build and maintain a culture of security and help every single person who has access to data understand that security is an important part of their job.

What’s your short- and long-term vision for AWS Security?

Customers trust AWS to protect their data so they can innovate and grow quickly, so in that sense, our vision is for security to be a growth lever for our customers, not added friction. Cybersecurity is key to unlocking innovation, so managing risk and aligning the security posture of AWS with our business objectives will continue for the immediate future and long term. For our customers, my vision is to continue helping them understand that investing in security helps them move faster and take the right risks—the kind of risks they need to remain competitive and innovative. When customers view security as a business accelerator, they achieve new technical capabilities and operational excellence. Strong security is the ultimate business enabler.

If you could give one piece of advice to all CISOs, what would it be?

Nail Zero Trust. Zero Trust is the path to the strongest, most effective security, and getting back to the core concepts is important. While Zero Trust is a different journey for every organization, it’s a natural evolution of cybersecurity and defense in depth in particular. No matter what’s driving organizations toward Zero Trust—policy considerations or the growing patchwork of data protection and privacy regulations—Zero Trust meaningfully improves security outcomes through an iterative process. When companies get this right, they can quickly identify and investigate threats and take action to contain or disrupt unwanted activity.

What are you most proud of in your career?

I’m proud to have worked—and still be working with—such talented, capable, and intelligent security professionals who care deeply about security and are passionate about making the world a safer place. Being among the world’s top security experts really makes me grateful and humble for all the amazing opportunities I’ve had to work alongside them, working together to solve problems and being part of creating a legacy to make security better.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.


Chris Betz

Chris is CISO at AWS. He oversees security teams and leads the development and implementation of security policies, with the aim of managing risk and aligning the company’s security posture with business objectives. Chris joined Amazon in August 2023, after holding CISO and security leadership roles at leading companies. He lives in Northern Virginia with his family.

Lisa Maher

Lisa Maher joined AWS in February 2022 and leads AWS Security Thought Leadership PR. Before joining AWS, she led crisis communications for clients experiencing data security incidents at two global PR firms. Lisa, a former journalist, is a graduate of Texas A&M School of Law, where she specialized in Cybersecurity Law & Policy, Risk Management & Compliance.

Happy anniversary, Amazon CloudFront: 15 years of evolution and internet advancements

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/happy-anniversary-amazon-cloudfront-15-years-of-evolution-and-internet-advancements/

I can’t believe it’s been 15 years since Amazon CloudFront was launched! When Amazon S3 became available in 2006, developers loved the flexibility and started to build a new kind of globally distributed application where storage was not a bottleneck. These applications needed to be performant, reliable, and cost-efficient for every user on the planet. So in 2008, a small team (a “two-pizza team”) launched CloudFront in just 200 days. Jeff Barr hinted at the new and yet unnamed service in September and introduced CloudFront two months later.

Since the beginning, CloudFront has provided an easy way to distribute content to end users with low latency, high data transfer speeds, and no long-term commitments. What started as a simple cache for Amazon S3 quickly evolved into a fully featured content delivery network. Now CloudFront delivers applications at blazing speeds across the globe, supporting live sporting events such as the NFL, the Cricket World Cup, and the FIFA World Cup.

At the same time, we also want to provide you with the best tools to secure applications. In 2015, we announced AWS WAF integration with CloudFront to provide fast and secure access control at the edge. Then, we focused on developing robust threat intelligence by combining signals across services. This threat intelligence integrates with CloudFront and AWS Shield to help protect applications from common exploits and distributed denial of service (DDoS) attacks. For example, we recently detected an unusual spike in HTTP/2 requests to Amazon CloudFront. We quickly realized that CloudFront had automatically mitigated a new type of HTTP request flood DDoS event.

A lot also happens at lower levels than HTTP. For example, when you serve your application with CloudFront, all of the packets received by the application are inspected by a fully inline DDoS mitigation system which doesn’t introduce any observable latency. In this way, L3/L4 DDoS attacks against CloudFront distributions are mitigated in real time.

We also made under-the-hood improvements like s2n-tls (short for “signal to noise”), an open-source implementation of the TLS protocol that has been designed to be small and fast with simplicity as a priority. Another similar improvement is s2n-quic, an open-source QUIC protocol implementation written in Rust.

With CloudFront, you can also control access to content through a number of capabilities. You can restrict access to only authenticated viewers or, through geo-restriction capability, configure the specific geographic locations that can access content.

Security is always important, but not every organization has dedicated security experts on staff. To make robust security more accessible, CloudFront now includes built-in protections such as one-click web application firewall setup, security recommendations, and an intuitive security dashboard. With these integrated security features, teams can put critical safeguards in place without deep security expertise. Our goal is to empower all customers to easily implement security best practices.

Web applications delivery
During the past 15 years, web applications have become much more advanced and essential to end users. When CloudFront launched, our focus was helping deliver content stored in S3 buckets. Dynamic content was introduced to optimize web applications where portions of a website change for each user. Dynamic content also improves access to APIs that need to be delivered globally.

As applications became more distributed, we looked at ways to help developers make efficient use of CloudFront’s global footprint and resources at the edge. To allow customization and personalization of content close to end users and minimize latency, Lambda@Edge was introduced.

When fewer compute resources are needed, CloudFront Functions can run lightweight JavaScript functions across edge locations for low-latency HTTP manipulations and personalized content delivery. Recently, CloudFront Functions expanded to further customize responses, including modifying HTTP status codes and response bodies.

Today, CloudFront handles over 3 trillion HTTP requests daily and uses a global network of more than 600 points of presence and 13 Regional edge caches in more than 100 cities across 50 countries. This scale helps power the most demanding online events. For example, during the 2023 Amazon Prime Day, CloudFront handled peak loads of over 500 million HTTP requests per minute, totaling over 1 trillion HTTP requests.

Amazon CloudFront has more than 600,000 active developers building and delivering applications to end users. To help teams work at their full speed, CloudFront introduced continuous deployment so developers can test and validate configuration changes on a portion of traffic before full deployment.

Media and entertainment
It’s now common to stream music, movies, and TV series to our homes, but 15 years ago, renting DVDs was still the norm. Running streaming servers was technically complex, requiring long-term contracts to access the global infrastructure needed for high performance.

First, we added support for audio and video streaming capabilities using custom protocols since technical standards were still evolving. To handle large audiences and simplify cost-effective delivery of live events, CloudFront launched live HTTP streaming and, shortly after, improved support for both Flash-based (popular at the time) and Apple iOS devices.

As the media industry continued moving to internet-based delivery, AWS acquired Elemental, a pioneer in software-defined video solutions. Integrating Elemental offerings helped provide services, software, and appliances that efficiently and economically scale video infrastructures for use cases such as broadcast and content production.

The evolution of technologies and infrastructure makes new ways of communicating possible, such as when NASA delivered the first-ever live 4K stream from space using CloudFront.

Today, the world’s largest events and leading video platforms rely on CloudFront to deliver massive video catalogs and live stream content to millions. For example, CloudFront delivered streams for the FIFA World Cup 2022 on behalf of more than 19 major broadcasters globally. More recently, CloudFront handled over 120 Tbps of peak data transfer during one of the Thursday Night Football games of the NFL season on Prime Video and helped deliver the Cricket World Cup to millions of viewers across the globe.

What’s next?
Many things have changed during these 15 years, but the focus on security, performance, and scalability remains the same. At AWS, it’s always Day 1, and the CloudFront team is constantly looking for ways to improve based on your feedback.

The rise of botnets is driving an ever-evolving and highly dynamic threat landscape. Layer 7 DDoS attacks are becoming increasingly prevalent, and the pervasiveness of bot traffic is increasing exponentially. As this occurs, we are evolving how we mitigate threats at the network border, at the edge, and in the Region, making it simpler for customers to configure the right security options.

Web applications are becoming more complex and interactive, and viewer expectations on latency and resiliency are even more stringent. This will drive new innovation. As new applications use generative artificial intelligence (AI), needs will evolve. These trends will continue to grow, so our investments will be focused on improving security and edge compute capabilities to support these new use cases.

With the current macroeconomic environment, many customers, especially small and medium-sized businesses and startups, look at how they can reduce their costs. Providing optimal price-performance has always been a priority for CloudFront. Cacheable data transferred to CloudFront edge locations from AWS resources does not incur additional fees. Also, 1 TB of data transfer from CloudFront to the internet per month is included in the free tier. CloudFront operates on a pay-as-you-go model with no upfront costs or minimum usage requirements. For more info, see CloudFront pricing.

As we approach AWS re:Invent, take note of these sessions that can help you learn about the latest innovations and connect with experts:

To learn more on how to speed up your websites and APIs and keep them protected, see the Application Security and Performance section of the AWS Developer Center.

Reduce latency and improve the security for your applications with Amazon CloudFront.

Danilo

AWS Speaker Profile: Zach Miller, Senior Worldwide Security Specialist Solutions Architect

Post Syndicated from Roger Park original https://aws.amazon.com/blogs/security/aws-speaker-profile-zach-miller-senior-worldwide-security-specialist-solutions-architect/

In the AWS Speaker Profile series, we interview Amazon Web Services (AWS) thought leaders who help keep our customers safe and secure. This interview features Zach Miller, Senior Worldwide Security Specialist SA and re:Invent 2023 presenter of Securely modernize payment applications with AWS and Centrally manage application secrets with AWS Secrets Manager. Zach shares thoughts on the data protection and cloud security landscape, his unique background, his upcoming re:Invent sessions, and more.


How long have you been at AWS?

I’ve been at AWS for more than four years, and I’ve enjoyed every minute of it! I started as a consultant in Professional Services, and I’ve been a Security Solutions Architect for around three years.

How do you explain your job to your non-tech friends?

Well, my mother doesn’t totally understand my role, and she’s been known to tell her friends that I’m the cable company technician that installs your internet modem and router. I usually tell my non-tech friends that I help AWS customers protect their sensitive data. If I mention cryptography, I typically only get asked questions about cryptocurrency—which I’m not qualified to answer. If someone asks what cryptography is, I usually say it’s protecting data by using mathematics.

How did you get started in data protection and cryptography? What about it piqued your interest?

I originally went to school to become a network engineer, but I discovered that moving data packets from point A to point B wasn’t as interesting to me as securing those data packets. Early in my career, I was an intern at an insurance company, and I had a mentor who set up ethical hacking lessons for me—for example, I’d come into the office and he’d have a compromised workstation preconfigured. He’d ask me to do an investigation and determine how the workstation was compromised and what could be done to isolate it and collect evidence. Other times, I’d come in and find my desk cabinets were locked with a padlock, and he wanted me to pick the lock. Security is particularly interesting because it’s an ever-evolving field, and I enjoy learning new things.

What’s been the most dramatic change you’ve seen in the data protection landscape?

One of the changes that I’ve been excited to see is an emphasis on encrypting everything. When I started my career, we’d often have discussions about encryption in the context of tradeoffs. If we needed to encrypt sensitive data, we’d have a conversation with application teams about the potential performance impact of encryption and decryption operations on their systems (for example, their databases), when to schedule downtime for the application to encrypt the data or rotate the encryption keys protecting the data, how to ensure the durability of their keys and make sure they didn’t lose data, and so on.

When I talk to customers about encryption on AWS today—of course, it’s still useful to talk about potential performance impact—but the conversation has largely shifted from “Should I encrypt this data?” to “How should I encrypt this data?” This is due to services such as AWS Key Management Service (AWS KMS) making it simpler for customers to manage encryption keys and encrypt and decrypt data in their applications with minimal performance impact or application downtime. AWS KMS has also made it simple to enable encryption of sensitive data—with over 120 AWS services integrated with AWS KMS, and services such as Amazon Simple Storage Service (Amazon S3) encrypting new S3 objects by default.

You are a frequent contributor to the AWS Security Blog. What were some of your recent posts about?

My last two posts covered how to use AWS Identity and Access Management (IAM) condition context keys to create enterprise controls for certificate management and how to use AWS Secrets Manager to securely manage and retrieve secrets in hybrid or multicloud workloads. I like writing posts that show customers how to use a new feature, or highlight a pattern that many customers ask about.

You are speaking in a couple of sessions at AWS re:Invent; what will your sessions focus on? What do you hope attendees will take away from your session?

I’m delivering two sessions at re:Invent this year. The first is a chalk talk, Centrally manage application secrets with AWS Secrets Manager (SEC221), that I’m delivering with Ritesh Desai, who is the General Manager of Secrets Manager. We’re discussing how you can securely store and manage secrets in your workloads inside and outside of AWS. We will highlight some recommended practices for managing secrets, and answer your questions about how Secrets Manager integrates with services such as AWS KMS to help protect application secrets.

The second session is also a chalk talk, Securely modernize payment applications with AWS (SEC326). I’m delivering this talk with Mark Cline, who is the Senior Product Manager of AWS Payment Cryptography. We will walk through an example scenario on creating a new payment processing application. We will discuss how to use AWS Payment Cryptography, as well as other services such as AWS Lambda, to build a simple architecture to help process and secure credit card payment data. We will also include common payment industry use cases such as tokenization of sensitive data, and how to include basic anti-fraud detection, in our example app.

What are you currently working on that you’re excited about?

My re:Invent sessions are definitely something that I’m excited about. Otherwise, I spend most of my time talking to customers about AWS Cryptography services such as AWS KMS, AWS Secrets Manager, and AWS Private Certificate Authority. I also lead a program at AWS that enables our subject matter experts to create and publish videos to demonstrate new features of AWS Security Services. I like helping people create videos, and I hope that our videos provide another mechanism for viewers who prefer information in a video format. Visual media can be more inclusive for customers with certain disabilities or for neurodiverse customers who find it challenging to focus on written text. Plus, you can consume videos differently than a blog post or text documentation. If you don’t have the time or desire to read a blog post or AWS public doc, you can listen to an instructional video while you work on other tasks, eat lunch, or take a break. I invite folks to check out the AWS Security Services Features Demo YouTube video playlist.

Is there something you wish customers would ask you about more often?

I always appreciate when customers provide candid feedback on our services. AWS is a customer-obsessed company, and we build our service roadmaps based on what our customers tell us they need. You should feel comfortable letting AWS know when something could be easier, more efficient, or less expensive. Many customers I’ve worked with have provided actionable feedback on our services and influenced service roadmaps, just by speaking up and sharing their experiences.

How about outside of work, any hobbies?

I have two toddlers that keep me pretty busy, so most of my hobbies are what they like to do. So I tend to spend a lot of time building elaborate toy train tracks, pushing my kids on the swings, and pretending to eat wooden toy food that they “cook” for me. Outside of that, I read a lot of fiction and indulge in binge-worthy TV.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Roger Park

Roger is a Senior Security Content Specialist at AWS Security focusing on data protection. He has worked in cybersecurity for almost ten years as a writer and content producer. In his spare time, he enjoys trying new cuisines, gardening, and collecting records.

Zach Miller

Zach is a Senior Worldwide Security Specialist Solutions Architect at AWS. His background is in data protection and security architecture, focused on a variety of security domains, including cryptography, secrets management, and data classification. Today, he is focused on helping enterprise AWS customers adopt and operationalize AWS security services to increase security effectiveness and reduce risk.

Implement Apache Flink near-online data enrichment patterns

Post Syndicated from Luis Morales original https://aws.amazon.com/blogs/big-data/implement-apache-flink-near-online-data-enrichment-patterns/

Stream data processing allows you to act on data in real time. Real-time data analytics can help you respond in a timely and optimized way while improving the overall customer experience.

Data streaming workloads often require data in the stream to be enriched via external sources (such as databases or other data streams). Pre-loading of reference data provides low latency and high throughput. However, this pattern may not be suitable for certain types of workloads:

  • Reference data updates with high frequency
  • The streaming application needs to make an external call to compute the business logic
  • Accuracy of the output is important and the application shouldn’t use stale data
  • Cardinality of reference data is very high, and the reference dataset is too big to be held in the state of the streaming application

For example, if you’re receiving temperature data from a sensor network and need to get additional metadata about the sensors to analyze how these sensors map to physical geographic locations, you need to enrich the stream with sensor metadata.

Apache Flink is a distributed computation framework that allows for stateful real-time data processing. It provides a single set of APIs for building batch and streaming jobs, making it easy for developers to work with bounded and unbounded data. Amazon Managed Service for Apache Flink (successor to Amazon Kinesis Data Analytics) is an AWS service that provides a serverless, fully managed infrastructure for running Apache Flink applications. Developers can build highly available, fault tolerant, and scalable Apache Flink applications with ease and without needing to become an expert in building, configuring, and maintaining Apache Flink clusters on AWS.

You can use several approaches to enrich your real-time data in Amazon Managed Service for Apache Flink depending on your use case and Apache Flink abstraction level. Each method has different effects on the throughput, network traffic, and CPU (or memory) utilization. For a general overview of data enrichment patterns, refer to Common streaming data enrichment patterns in Amazon Managed Service for Apache Flink.

This post covers how you can implement data enrichment for near-online streaming events with Apache Flink and how you can optimize performance. To compare the performance of the enrichment patterns, we ran performance testing based on synthetic data. The result of this test is useful as a general reference. It’s important to note that the actual performance of your Flink workload will depend on various factors, such as API latency, throughput, size of the event, and cache hit ratio.

We discuss three enrichment patterns, summarized in the following comparison.

Synchronous enrichment
  • Enrichment approach: Synchronous, blocking per-record requests to the external endpoint
  • Data freshness: Always up-to-date enrichment data
  • Development complexity: Simple model
  • Error handling: Straightforward
  • Impact on enrichment API: Max: one request per message
  • Application latency: Sensitive to enrichment API latency
  • Other considerations: None
  • Result of the comparative test (throughput): ~350 events per second

Asynchronous enrichment
  • Enrichment approach: Non-blocking parallel requests to the external endpoint, using asynchronous I/O
  • Data freshness: Always up-to-date enrichment data
  • Development complexity: Harder to debug, due to multi-threading
  • Error handling: More complex, using callbacks
  • Impact on enrichment API: Max: one request per message
  • Application latency: Less sensitive to enrichment API latency
  • Other considerations: None
  • Result of the comparative test (throughput): ~2,000 events per second

Synchronous cached enrichment
  • Enrichment approach: Frequently accessed information is cached in the Flink application state, with a fixed TTL
  • Data freshness: Enrichment data may be stale, up to the TTL
  • Development complexity: Harder to debug, due to relying on Flink state
  • Error handling: Straightforward
  • Impact on enrichment API: Reduced I/O to the enrichment API (depends on cache TTL)
  • Application latency: Reduced application latency (depends on cache hit ratio)
  • Other considerations: Customizable TTL; only a synchronous implementation is possible as of Flink 1.17
  • Result of the comparative test (throughput): ~28,000 events per second

Solution overview

For this post, we use an example of a temperature sensor network (component 1 in the following architecture diagram) that emits sensor information, such as temperature, sensor ID, status, and the timestamp this event was produced. These temperature events get ingested into Amazon Kinesis Data Streams (2). Downstream systems also require the brand and country code information of the sensors, in order to analyze, for example, the reliability per brand and the temperature per plant site.

Based on the sensor ID, we enrich this sensor information from the Sensor Info API (3), which provides us with information about the brand, the location, and an image. The resulting enriched stream is sent to another Kinesis data stream and can then be analyzed in an Amazon Managed Service for Apache Flink Studio notebook (4).

Solution overview
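
For reference, here is a minimal sketch of what the Temperature event POJO could look like, based on the fields described above. The field types and anything beyond getSensorId() are assumptions; the actual classes live in the event package of the sample repository and may differ in naming and structure.

public class Temperature {

    private String sensorId;
    private double temperature;
    private String status;
    private long timestamp;

    // Used later by keyBy() and to build the Sensor Info API request URL
    public String getSensorId() {
        return sensorId;
    }

    // Getters and setters for the remaining fields are omitted for brevity
}

An EnrichedTemperature event then combines these fields with the sensor metadata (such as brand and country code) returned by the Sensor Info API.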

Prerequisites

To get started with implementing near-online data enrichment patterns, you can clone or download the code from the GitHub repository. This repository implements the Flink streaming application we described. You can find the instructions on how to set up Flink in either Amazon Managed Service for Apache Flink or other available Flink deployment options in the README.md file.

If you want to learn how these patterns are implemented and how to optimize performance for your Flink application, you can simply follow along with this post without deploying the samples.

Project overview

The project is structured as follows:

docs/                               -- Contains project documentation
src/
├── main/java/...                   -- Contains all the Flink application code
│   ├── ProcessTemperatureStream    -- Main class that decides on the enrichment strategy
│   ├── enrichment.                 -- Contains the different enrichment strategies (sync, async and cached)
│   ├── event.                      -- Event POJOs
│   ├── serialize.                  -- Utils for serialization
│   └── utils.                      -- Utils for Parameter parsing
└── test/                           -- Contains all the Flink testing code

The main method in the ProcessTemperatureStream class sets up the run environment and either takes the parameters from the command line, if it is a local environment, or uses the application properties from Amazon Managed Service for Apache Flink. Based on the parameter EnrichmentStrategy, it decides which implementation to pick: synchronous enrichment (default), asynchronous enrichment, or cached enrichment based on the Flink concept of KeyedState.

public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    ParameterTool parameter = ParameterToolUtils.getParameters(args, env);

    String strategy = parameter.get("EnrichmentStrategy", "SYNC");
    switch (strategy) {
        case "SYNC":
            new SyncProcessTemperatureStreamStrategy().run(env, parameter);
            break;
        case "ASYNC":
            new AsyncProcessTemperatureStreamStrategy().run(env, parameter);
            break;
        case "CACHED":
            new CachedProcessTemperatureStreamStrategy().run(env, parameter);
            break;
        default:
            throw new InvalidParameterException("Please choose one of the existing enrichment strategies (SYNC|ASYNC|CACHED)");
    }
}

We go over the three approaches in the following sections.

Synchronous data enrichment

When you want to enrich your data from an external provider, you can use synchronous per-record lookup. When your Flink application processes an incoming event, it makes an external HTTP call and, after sending each request, it has to wait until it receives the response.

As Flink processes events synchronously, the thread that is running the enrichment is blocked until it receives the HTTP response. This results in the processor staying idle for a significant period of processing time. On the other hand, the synchronous model is easier to design, debug, and trace. It also allows you to always have the latest data.

It can be integrated into your streaming application as follows:

DataStream<EnrichedTemperature> enrichedTemperatureDataStream =
        temperatureDataStream
                .map(new SyncEnrichmentFunction(parameter.get("SensorApiUrl", DEFAULT_API_URL)));

The implementation of the enrichment function looks like the following code:

public class SyncEnrichmentFunction extends RichMapFunction<Temperature, EnrichedTemperature> {

    // Setup of HTTP client and ObjectMapper

    @Override
    public EnrichedTemperature map(Temperature temperature) throws Exception {
        String url = this.getRequestUrl + temperature.getSensorId();

        // Retrieve response from sensor info API
        Response response = client
                .prepareGet(url)
                .execute()
                .toCompletableFuture()
                .get();

        // Parse the sensor info
        SensorInfo sensorInfo = parseSensorInfo(response.getResponseBody());

        // Merge the temperature sensor data and sensor info data
        return getEnrichedTemperature(temperature, sensorInfo);
    }

    // ...
}

To optimize the performance of synchronous enrichment, you can enable the KeepAlive flag so that the HTTP client reuses connections across multiple events.
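
As a rough sketch, assuming the samples build their HTTP client with the AsyncHttpClient library (suggested by the prepareGet and execute calls in the snippets above), the client could be configured with keep-alive and explicit timeouts as follows; the exact builder options and values are assumptions and may differ between client versions:

// Assumption: the samples construct their client roughly like this, using
// org.asynchttpclient.AsyncHttpClient and org.asynchttpclient.Dsl
AsyncHttpClient client = Dsl.asyncHttpClient(
        Dsl.config()
                .setKeepAlive(true)        // reuse TCP connections across per-record requests
                .setConnectTimeout(2_000)  // assumed value, in milliseconds
                .setRequestTimeout(5_000)); // assumed value, in milliseconds

Connection reuse only helps if the enrichment endpoint supports persistent connections, so verify the behavior against your Sensor Info API.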

For applications with I/O-bound operators (such as external data enrichment), it can also make sense to increase the application parallelism without increasing the resources dedicated to the application. You can do this by increasing the ParallelismPerKPU setting of the Amazon Managed Service for Apache Flink application. This configuration describes the number of parallel subtasks an application can perform per Kinesis Processing Unit (KPU), and a higher value of ParallelismPerKPU can lead to full utilization of KPU resources. But keep in mind that increasing the parallelism doesn’t work in all cases, such as when you are consuming from sources with few shards or partitions.

In our synthetic testing with Amazon Managed Service for Apache Flink, we saw a throughput of approximately 350 events per second on a single KPU with 4 parallelism per KPU and the default settings.

Synchronous enrichment performance

Asynchronous data enrichment

Synchronous enrichment doesn’t take full advantage of computing resources because Flink spends much of its time waiting for HTTP responses. However, Flink offers asynchronous I/O for external data access. This allows you to enrich stream events asynchronously: the application can send requests for other elements in the stream while it waits for the response for the first element, and requests can be batched for greater efficiency.

Sync I/O vs Async I/O

While using this pattern, you have to decide between unorderedWait (where it emits the result to the next operator as soon as the response is received, disregarding the order of the elements on the stream) and orderedWait (where it waits until all inflight I/O operations complete, then sends the results to the next operator in the same order as the original elements were placed on the stream). When your use case doesn’t require event ordering, unorderedWait provides better throughput and less idle time. Refer to Enrich your data stream asynchronously using Amazon Managed Service for Apache Flink to learn more about this pattern.

The asynchronous enrichment can be added as follows:

SingleOutputStreamOperator<EnrichedTemperature> asyncEnrichedTemperatureSingleOutputStream =
        AsyncDataStream
                .unorderedWait(
                        temperatureDataStream,
                        new AsyncEnrichmentFunction(parameter.get("SensorApiUrl", DEFAULT_API_URL)),
                        ASYNC_OPERATOR_TIMEOUT,
                        TimeUnit.MILLISECONDS,
                        ASYNC_OPERATOR_CAPACITY);
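
For comparison, if your use case requires the enriched events to be emitted in their original order, a sketch of the orderedWait variant uses the same signature; only the method name changes, at the cost of potential head-of-line blocking when individual responses are slow:

// Sketch of the ordered variant; variable name is illustrative
SingleOutputStreamOperator<EnrichedTemperature> orderedEnrichedTemperatureSingleOutputStream =
        AsyncDataStream
                .orderedWait(
                        temperatureDataStream,
                        new AsyncEnrichmentFunction(parameter.get("SensorApiUrl", DEFAULT_API_URL)),
                        ASYNC_OPERATOR_TIMEOUT,
                        TimeUnit.MILLISECONDS,
                        ASYNC_OPERATOR_CAPACITY);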

The enrichment function works similarly to the synchronous implementation. It first retrieves the sensor info as a Java Future, which represents the result of an asynchronous computation. As soon as it’s available, it parses the information and then merges both objects into an EnrichedTemperature:

public class AsyncEnrichmentFunction extends RichAsyncFunction<Temperature, EnrichedTemperature> {

    // Setup of HTTP client and ObjectMapper

    @Override
    public void asyncInvoke(final Temperature temperature, final ResultFuture<EnrichedTemperature> resultFuture) {
        String url = this.getRequestUrl + temperature.getSensorId();

        // Retrieve response from sensor info API
        Future<Response> future = client
                .prepareGet(url)
                .execute();
        CompletableFuture
                .supplyAsync(() -> {
                    try {
                        Response response = future.get();

                        // Parse the sensor info as soon as it is available
                        return parseSensorInfo(response.getResponseBody());
                    } catch (Exception e) {
                        return null;
                    }
                })
                .thenAccept((SensorInfo sensorInfo) ->

                    // Merge the temperature sensor data and sensor info data
                    resultFuture.complete(getEnrichedTemperature(temperature, sensorInfo)));
    }

    // ...
}

In our testing with Amazon Managed Service for Apache Flink, we saw a throughput of 2,000 events per second on a single KPU with 2 parallelism per KPU and the default settings.

Async enrichment performance

Synchronous cached data enrichment

Although numerous operations in a data flow focus on individual events independently, such as event parsing, there are certain operations that retain information across multiple events. These operations, such as window operators, are referred to as stateful due to their ability to maintain state.

The keyed state is stored within an embedded key-value store, conceptualized as a part of Flink’s architecture. This state is partitioned and distributed in conjunction with the streams that are consumed by the stateful operators. As a result, access to the key-value state is limited to keyed streams, meaning it can only be accessed after a keyed or partitioned data exchange, and is restricted to the values associated with the current event’s key. For more information about the concepts, refer to Stateful Stream Processing.

You can use the keyed state for frequently accessed information that doesn’t change often, such as the sensor information. This will not only allow you to reduce the load on downstream resources, but also increase the efficiency of your data enrichment, because no round trip to an external resource is necessary for already fetched keys and there’s no need to recompute the information. Keep in mind that Amazon Managed Service for Apache Flink stores transient data in a RocksDB backend, which adds some latency to retrieving the information. However, because RocksDB is local to the node processing the data, this is still faster than reaching out to external resources, as you can see in the following example.

To use keyed streams, you have to partition your stream using the .keyBy(...) method, which ensures that events for the same key (in this case, the sensor ID) are routed to the same worker. You can implement it as follows:

SingleOutputStreamOperator<EnrichedTemperature> cachedEnrichedTemperatureSingleOutputStream = temperatureDataStream
        .keyBy(Temperature::getSensorId)
        .process(new CachedEnrichmentFunction(
                parameter.get("SensorApiUrl", DEFAULT_API_URL),
                parameter.get("CachedItemsTTL", String.valueOf(CACHED_ITEMS_TTL))));

We are using the sensor ID as the key to partition the stream and later enrich it. This way, we can then cache the sensor information as part of the keyed state. When picking a partition key for your use case, choose one that has a high cardinality. This leads to an even distribution of events across different workers.

To store the sensor information, we use the ValueState. To configure the state management, we have to describe the state type by using the TypeHint. Additionally, we can configure how long a certain state will be cached by specifying a time-to-live (TTL), after which the state will be cleaned up and has to be retrieved or recomputed again.

public class CachedEnrichmentFunction extends KeyedProcessFunction<String, Temperature, EnrichedTemperature> {

    // Setup of HTTP client and ObjectMapper...

    private transient ValueState<SensorInfo> cachedSensorInfoLight;
    
    @Override
    public void open(Configuration configuration) throws Exception {
        // Initialize HTTP client
    
        ValueStateDescriptor<SensorInfo> descriptor =
                new ValueStateDescriptor<>("sensorInfo", TypeInformation.of(new TypeHint<SensorInfo>() {}));
    
        StateTtlConfig ttlConfig = StateTtlConfig
                .newBuilder(Time.seconds(this.ttl))
                .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
                .build();
        descriptor.enableTimeToLive(ttlConfig);
    
        cachedSensorInfoLight = getRuntimeContext().getState(descriptor);
    }
    
    // ...
}

As of Flink 1.17, access to the state is not possible in asynchronous functions, so the implementation must be synchronous.

It first checks whether the sensor information for this particular key exists; if so, the event gets enriched with the cached value. Otherwise, it retrieves the sensor information, parses it, and then merges both objects into an EnrichedTemperature:

public class CachedEnrichmentFunction extends KeyedProcessFunction<String, Temperature, EnrichedTemperature> {

    // Setup of HTTP client, ObjectMapper and ValueState

    @Override
    public void processElement(Temperature temperature, KeyedProcessFunction<String, Temperature, EnrichedTemperature>.Context ctx, Collector<EnrichedTemperature> out) throws Exception {
        SensorInfo sensorInfoCachedEntry = cachedSensorInfoLight.value();

        // Check if sensor info is cached
        if (sensorInfoCachedEntry != null) {
            out.collect(getEnrichedTemperature(temperature, sensorInfoCachedEntry));
        } else {
            String url = this.getRequestUrl + temperature.getSensorId();

            // Retrieve response from sensor info API
            Response response = client
                    .prepareGet(url)
                    .execute()
                    .toCompletableFuture()
                    .get();

            // Parse the sensor info
            SensorInfo sensorInfo = parseSensorInfo(response.getResponseBody());

            // Cache the sensor info
            cachedSensorInfoLight.update(sensorInfo);

            // Merge the temperature sensor data and sensor info data
            out.collect(getEnrichedTemperature(temperature, sensorInfo));
        }
    }

    // ...
}

In our synthetic testing with Amazon Managed Service for Apache Flink, we saw a throughput of 28,000 events per second on a single KPU with 4 parallelism per KPU and the default settings.

Sync+Cached enrichment performance

You can also see the impact and reduced load on the downstream sensor API.

Impact on Enrichment API

Test your workload on Amazon Managed Service for Apache Flink

This post compared different approaches to run an application on Amazon Managed Service for Apache Flink with 1 KPU. Testing with a single KPU gives a good performance baseline that allows you to compare the enrichment patterns without generating a full-scale production workload.

It’s important to understand that the actual performance of the enrichment patterns depends on the actual workload and other external systems the Flink application interacts with. For example, performance of cached enrichment may vary with the cache hit ratio. Synchronous enrichment may behave differently depending on the response latency of the enrichment endpoint.

To evaluate which approach best suits your workload, you should first perform scaled-down tests with 1 KPU and a limited throughput of realistic data, possibly experimenting with different values of Parallelism per KPU. After you identify the best approach, it’s important to test the implementation at full scale, with real data and integrating with real external systems, before moving to production.

Summary

This post explored different approaches to implement near-online data enrichment using Flink, focusing on three communication patterns: synchronous enrichment, asynchronous enrichment, and caching with Flink KeyedState.

We compared the throughput achieved by each approach, with caching using Flink KeyedState being up to 14 times faster than using asynchronous I/O, in this particular experiment with synthetic data. Furthermore, we delved into optimizing the performance of Apache Flink, specifically on Amazon Managed Service for Apache Flink. We discussed strategies and best practices to maximize the performance of Flink applications in a managed environment, enabling you to fully take advantage of the capabilities of Flink for your near-online data enrichment needs.

This overview offers insights into different data enrichment patterns, their performance characteristics, and optimization techniques when using Apache Flink, particularly in the context of near-online data enrichment scenarios on Amazon Managed Service for Apache Flink.

We welcome your feedback. Please leave your thoughts and questions in the comments section.


About the authors

Luis Morales works as Senior Solutions Architect with digital-native businesses to support them in constantly reinventing themselves in the cloud. He is passionate about software engineering, cloud-native distributed systems, test-driven development, and all things code and security.

Lorenzo Nicora works as Senior Streaming Solution Architect helping customers across EMEA. He has been building cloud-native, data-intensive systems for several years, working in the finance industry both through consultancies and for fin-tech product companies. He leveraged open source technologies extensively and contributed to several projects, including Apache Flink.

AWS Security Profile: Tom Scholl, VP and Distinguished Engineer, AWS

Post Syndicated from Tom Scholl original https://aws.amazon.com/blogs/security/aws-security-profile-tom-scholl-vp-and-distinguished-engineer-aws/

In the AWS Security Profile series, we feature the people who work in Amazon Web Services (AWS) Security and help keep our customers safe and secure. This interview is with Tom Scholl, VP and Distinguished Engineer for AWS.

What do you do in your current role and how long have you been at AWS?

I’m currently a vice president and distinguished engineer in the infrastructure organization at AWS. My role includes working on the AWS global network backbone, as well as focusing on denial-of-service detection and mitigation systems. I’ve been with AWS for over 12 years.

What initially got you interested in networking and how did that lead you to the anti-abuse and distributed denial of service (DDoS) space?

My interest in large global network infrastructure started when I was a teenager in the 1990s. I remember reading a magazine at the time that cataloged all the large IP transit providers on the internet, complete with network topology maps and sizes of links. It inspired me to want to work on the engineering teams that supported that. Over time, I was fortunate enough to move from working at a small ISP to a telecom carrier where I was able to work on their POP and backbone designs. It was there that I learned about the internet peering ecosystem and started collaborating with network operators from around the globe.

For the last 20-plus years, DDoS was always something I had to deal with to some extent. Namely, from the networking lens of preventing network congestion through traffic-engineering and capacity planning, as well as supporting the integration of DDoS traffic scrubbers into network infrastructure.

About three years ago, I became especially intrigued by the network abuse and DDoS space after using AWS network telemetry to observe the size of malicious events in the wild. I started to be interested in how mitigation could be improved, and how to break down the problem into smaller pieces to better understand the true sources of attack traffic. Instead of merely being an observer, I wanted to be a part of the solution and make it better. This required me to immerse myself into the domain, both from the perspective of learning the technical details and by getting hands-on and understanding the DDoS industry and environment as a whole. Part of how I did this was by engaging with my peers in the industry at other companies and absorbing years of knowledge from them.

How do you explain your job to your non-technical friends and family?

I try to explain both areas that I work on. First, that I help build the global network infrastructure that connects AWS and its customers to the rest of the world. I explain that for a home user to reach popular destinations hosted on AWS, data has to traverse a series of networks and physical cables that are interconnected so that the user’s home computer or mobile phone can send packets to another part of the world in less than a second. All that requires coordination with external networks, which have their own practices and policies on how they handle traffic. AWS has to navigate that complexity and build and operate our infrastructure with customer availability and security in mind. Second, when it comes to DDoS and network abuse, I explain that there are bad actors on the internet that use DDoS to cause impairment for a variety of reasons. It could be someone wanting to disrupt online gaming, video conferencing, or regular business operations for any given website or company. I work to prevent those events from causing any sort of impairment, and to trace back the source and disrupt the infrastructure launching them so that it can’t be effective in the future.

Recently, you were awarded the J.D. Falk Award by the Messaging Malware Mobile Anti-Abuse Working Group (M3AAWG) for IP Spoofing Mitigation. Congratulations! Please tell us more about the efforts that led to this.

Basically, there are three main types of DDoS attacks we observe: botnet-based unicast floods, spoofed amplification/reflection attacks, and proxy-driven HTTP request floods. The amplification/reflection aspect is interesting because it requires DDoS infrastructure providers to acquire compute resources behind providers that permit IP spoofing. IP spoofing itself has a long history on the internet, with a request for comment/best current practice (RFC/BCP) first written back in 2000 recommending that providers prevent this from occurring. However, adoption of this practice is still spotty on the internet.

At NANOG76, there was a proposal that these sorts of spoofed attacks could be traced by network operators in the path of the pre-amplification/reflection traffic (before it bounced off the reflectors). I personally started getting involved in this effort about two years ago. AWS operates a large global network and has network telemetry data that would help me identify pre-amplification/reflection traffic entering our network. This would allow me to triangulate the source network generating it. I then started engaging directly with the various networks that we connect to, providing them with timestamps, spoofed source IP addresses, and the specific protocols and ports involved with the traffic, hoping they could use their network telemetry to identify the customer generating it. From there, they’d engage with their customer to get the source shut down or, failing that, implement packet filters on their customer to prevent spoofing.

Initially, only a few networks were capable of doing this well. This meant I had to spend a fair amount of energy in educating various networks around the globe on what spoofed traffic is, how to use their network telemetry to find it, and how to handle it. This was the most complicated and challenging part because this wasn’t on the radar of many networks out there. Up to this time, frontline network operations and abuse teams at various networks, including some very large ones, were not proficient in dealing with this.

The education I did included a variety of engagements: sharing drawings of the day in the life of a spoofed packet in a reflection attack, providing instructions on how to use their network telemetry tools, connecting them with their network telemetry vendors to help them, and even going so far as using other, more exotic methods to identify which of their customers were spoofing and pointing out who they needed to analyze more deeply. In the end, it’s about getting other networks to be responsive and take action and, in the best cases, find spoofing on their own and act upon it.

Incredible! How did it feel accepting the award at the M3AAWG General Meeting in Brooklyn?

It was an honor to accept it and see some acknowledgement for the behind-the-scenes work that goes on to make the internet a better place.

What’s next for you in your work to suppress IP spoofing?

Continuing tracing exercises and engaging with external providers, in particular the network providers that experience challenges in dealing with spoofing, and helping them improve their operations. Also, finding more effective ways to educate the hosting providers where IP spoofing is a common issue and getting them to implement proper default controls that don’t allow this behavior. Another aspect is being a force multiplier to enable others to spread the word and be part of the education process.

Looking ahead, what are some of your other goals for improving users’ online experiences and security?

Continually focusing on improving our DDoS defense strategies and working with customers to build tailored solutions that address some of their unique requirements. Across AWS, we have many services that are architected in different ways, so a key part of this is figuring out how to raise the bar from a DDoS defense perspective across each of them. AWS customers also have their own unique architectures and protocols that can require developing new solutions to address their specific needs. On the disruption front, we will continue to focus on disrupting DDoS-as-a-service provider infrastructure, moving beyond disrupting spoofing to disrupting botnets and the infrastructure associated with HTTP request floods.

With HTTP request floods being much more popular than byte-heavy and packet-heavy threat methods, it’s important to highlight the risks open proxies on the internet pose. Some of this emphasizes why there need to be some defaults in software packages to prevent misuse, in addition to network operators proactively identifying open proxies and taking appropriate action. Hosting providers should also recognize when their customer resources are communicating with large fleets of proxies and consider taking appropriate mitigations.

What are the most critical skills you would advise people need to be successful in network security?

I’m a huge proponent of being hands-on and diving into problems to truly understand how things are operating. Putting yourself outside your comfort zone, diving deep into the data to understand something, and translating that into outcomes and actions is something I highly encourage. After you immerse yourself in a particular domain, you can be much more effective at developing strategies and rapid prototyping to move forward. You can make incremental progress with small actions. You don’t have to wait for the perfect and complete solution to make some progress. I also encourage collaboration with others because there is incredible value in seeking out diverse opinions. There are resources out there to engage with, provided you’re willing to put in the work to learn and determine how you want to give back. The best people I’ve worked with don’t do it for public attention, blog posts, or social media status. They work in the background and don’t expect anything in return. They do it because of their desire to protect their customers and, where possible, the internet at large.

Lastly, if you had to pick an industry outside of security for your career, what would you be doing?

I’m over the maximum age allowed to start as an air traffic controller, so I suppose an air transport pilot or a locomotive engineer would be pretty neat.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Tom Scholl

Tom is Vice President and Distinguished Engineer at AWS.

Amanda Mahoney

Amanda is the public relations lead for the AWS portfolio of security, identity, and compliance services. She joined AWS in September 2022.

A phased approach towards a complex HITRUST r2 validated assessment

Post Syndicated from Abdul Javid original https://aws.amazon.com/blogs/security/a-phased-approach-towards-a-complex-hitrust-r2-validated-assessment/

Health Information Trust Alliance (HITRUST) offers healthcare organizations a comprehensive and standardized approach to information security, privacy, and compliance. HITRUST Common Security Framework (HITRUST CSF) can be used by organizations to establish a robust security program, ensure patient data privacy, and assist with compliance with industry regulations. HITRUST CSF enhances security, streamlines compliance efforts, reduces risk, and contributes to overall security resiliency and the trustworthiness of healthcare entities in an increasingly challenging cybersecurity landscape.

While HITRUST primarily focuses on the healthcare industry, its framework and certification program are adaptable and applicable to other industries. The HITRUST CSF is a set of controls and requirements that organizations must comply with to achieve HITRUST certification. The HITRUST r2 assessment is the process by which organizations are evaluated against the requirements of the HITRUST CSF. During the assessment, an independent third-party assessor examines the organization’s technical security controls, operational policies and procedures, and the implementation of all controls to determine whether they meet the specified HITRUST requirements.

HITRUST r2 validated assessment certification is a comprehensive process that involves meeting numerous assessment requirements. The number of requirements can vary significantly, ranging from 500 to 2,000, depending on your environment’s risk factors and regulatory requirements. Attempting to address all of these requirements simultaneously, especially when migrating systems to Amazon Web Services (AWS), can be overwhelming. By using a strategy that separates your compliance journey into environments and applications, you can streamline the process and achieve HITRUST compliance more efficiently and within a realistic timeframe.

In this blog post, we start by exploring the HITRUST domain structure, highlighting the security objective of each domain. We then show how you can use AWS configurable services to help meet these objectives.

Lastly, we present a simple and practical reference architecture with an AWS multi-account implementation that you can use as the foundation for hosting your AWS application, highlighting the phased approach for HITRUST compliance. Please note that this blog is intended to assist with using AWS services in a manner that supports an organization’s HITRUST compliance, but a HITRUST assessment is at an organizational level and involves controls that extend beyond the organization’s use of AWS.

HITRUST certification journey – Scope application systems on AWS infrastructure:

The HITRUST controls needed for certification are structured within 19 HITRUST domains, covering a wide range of technical and administrative control requirements. To efficiently manage the scope of your certification assessment, start by focusing on the AWS landing zone, which serves as a critical foundational infrastructure component for running applications. When establishing the AWS landing zone, verify that it aligns with the AWS HITRUST security control requirements that are dependent on the scope of your assessment. Note that these 19 domains are a combination of technical controls and foundational administrative controls.

After you’ve set up a HITRUST compliant landing zone, you can begin evaluating your applications for HITRUST compliance as you migrate them to AWS. When you expand and migrate applications to the HITRUST-certified AWS landing zone assessed by your third party assessor, you can inherit the HITRUST controls required for application assessment directly from the landing zone. This simplifies and narrows the scope of your assessment activities.

Figure 1 that follows shows the two key phases and how a bottom-up phased approach can be structured with related HITRUST controls.

Figure 1: HITRUST Phase 1 and Phase 2 high-level components

The diagram illustrates:

  • An AWS landing zone environment as Phase 1 and its related HITRUST domain controls
  • An application system as Phase 2 and its related application system specific controls

HITRUST domain security objectives:

The HITRUST CSF based certification consists of 19 domains, which are broad categories that encompass various aspects of information security and privacy controls. These domains serve as a framework for your organization to assess and enhance its security posture. These domains cover a wide range of controls and practices related to information security, privacy, risk management, and compliance. Each domain consists of a set of control objectives and requirements that your organization must meet to achieve HITRUST certification.

The following table lists each domain, the key security objectives expected, and the AWS configurable services relevant to the security objectives. These are listed as a reference to give you an idea of the scope of each domain; the actual services and tools to meet specific HITRUST requirements will vary depending upon your scope and its HITRUST requirements.

Note: The information in this post is a general guideline and recommendation based on a phased approach for HITRUST r2 validated assessment. The examples are based on the information available at the time of publication and are not a full solution.

HITRUST domains, security objectives, and related AWS services

For each HITRUST domain listed below, we summarize the key security objectives expected in that domain, followed by the related AWS configurable services.
1. Information Protection Program
  • Implement information security management program.
  • Verify role suitability for employees, contractors, and third-party users.
  • Provide management guidance aligned with business goals and regulations.
  • Safeguard an organization’s information and assets.
  • Enhance awareness of information security among stakeholders.
AWS Artifact
AWS Service Catalog
AWS Config
Amazon Cybersecurity Awareness Training
2. Endpoint Protection
  • Protect information and software from unauthorized or malicious code.
  • Safeguard information in networks and the supporting network infrastructure
AWS Systems Manager
AWS Config
Amazon Inspector
AWS Shield
AWS WAF
3. Portable Media Security
  • Ensure the protection of information assets, prevent unauthorized disclosure, alteration, deletion, or harm, and maintain uninterrupted business operations.
AWS Identity and Access Management (IAM)
Amazon Simple Storage Service (Amazon S3)
AWS Key Management Service (AWS KMS)
AWS CloudTrail
Amazon Macie
Amazon Cognito
Amazon Workspaces Family
4. Mobile Device Security
  • Ensure information security while using mobile computing devices and remote work facilities.
AWS Database Migration Service (AWS DMS)
AWS IoT Device Defender
AWS Snowball
AWS Config
5. Wireless Security
  • Ensure the safeguarding of information within networks and the security of the underlying network infrastructure.
AWS Certificate Manager (ACM)
6. Configuration Management
  • Ensure adherence to organizational security policies and standards for information systems.
  • Control system files, access, and program source code for security.
  • Document, maintain, and provide operating procedures to users.
  • Strictly control project and support environments for secure development of application system software and information.
AWS Config
AWS Trusted Advisor
Amazon CloudWatch
AWS Security Hub
Systems Manager
7. Vulnerability Management
  • Implement effective and repeatable technical vulnerability management to mitigate risks from exploited vulnerabilities.
  • Establish ownership and defined responsibilities for the protection of information assets within management.
  • Design controls in applications, including user-developed ones, to prevent errors, loss, unauthorized modification, or misuse of information. These controls should encompass input data validation, internal processing, and output data.
Amazon Inspector
CloudWatch
Security Hub
8. Network Protection
  • Secure information across networks and network infrastructure.
  • Prevent unauthorized access to networked services.
  • Ensure unauthorized access prevention to information in application systems.
  • Implement controls within applications to prevent errors, loss, unauthorized modification, or misuse of information.
Amazon Route 53
AWS Control Tower
Amazon Virtual Private Cloud (Amazon VPC)
AWS Transit Gateway
Network Load Balancer
AWS Direct Connect
AWS Site-to-Site VPN
AWS CloudFormation
AWS WAF
ACM
9. Transmission Protection
  • Ensure robust protection of information within networks and their underlying infrastructure.
  • Facilitate secure information exchange both internally and externally, adhering to applicable laws and agreements.
  • Ensure the security of electronic commerce services and their use.
  • Employ cryptographic methods to ensure confidentiality, authenticity, and integrity of information.
  • Formulate cryptographic control policies and institute key management to bolster their implementation.
Systems Manager
ACM
10. Password Management
  • Register, track, and periodically validate authorized user accounts to prevent unauthorized access to information systems.
AWS Secrets Manager, Systems Manager Parameter Store, AWS KMS
11. Access Control
  • Monitor and log security events to detect unauthorized activities in compliance with legal requirements.
  • Prevent unauthorized access, compromise, or theft of information, assets, and user entry.
  • Safeguard against unauthorized access to networked services, operating systems, and application information.
  • Manage access rights and asset recovery for terminated or transferred personnel and contractors.
  • Ensure adherence to applicable laws, regulations, contracts, and security requirements throughout information systems’ lifecycle.
IAM
AWS Resource Access Manager (AWS RAM)
Amazon GuardDuty
AWS Identity Center
12. Audit Logging & Monitoring
  • Comply with laws, regulations, contracts, and security mandates in information systems’ design, operation, use, and management.
  • Document, maintain, and share operating procedures with relevant users.
  • Monitor, record, and uncover unauthorized information processing in line with legal requirements.
AWS Control Tower
Amazon S3
CloudTrail
GuardDuty
AWS Config
CloudWatch
Amazon VPC Flow logs
Amazon OpenSearch Service
13. Education, Training and Awareness
  • Secure information when using mobile devices and teleworking.
  • Make employees, contractors, and third-party users aware of security threats, and responsibilities and reduce human error.
  • Ensure information systems comply with laws, regulations, contracts, and security requirements.
  • Assign ownership and defined responsibilities for protecting information assets.
  • Protect information and software integrity from unauthorized code.
  • Securely exchange information within and outside the organization, following relevant laws and agreements.
  • Develop strategies to counteract business interruptions, protect critical processes, and resume them promptly after system failures or disasters.
Security Hub
Amazon Cybersecurity Awareness Training
Trusted Advisor
14. Third-Party Assurance
  • Safeguard information and assets by mitigating risks linked to external products or services.
  • Verify third-party service providers adhere to security requirements and maintain agreed upon service levels.
  • Enforce stringent controls over development, project, and support environments to ensure software and information security.
AWS Artifact
AWS Service Organization Controls (SOC) Reports
ISO27001 reports
15. Incident Management
  • Address security events and vulnerabilities promptly for timely correction.
  • Foster awareness among employees, contractors, and third-party users to reduce human errors.
  • Consistently manage information security incidents for effective response.
  • Handle security events to facilitate timely corrective measures.
AWS Incident Detection and Response
Security Hub
Amazon Inspector
CloudTrail
AWS Config
Amazon Simple Notification Service (Amazon SNS)
GuardDuty
AWS WAF
Shield
CloudFormation
16. Business Continuity & Disaster Recovery
  • Maintain, protect, and make organizational information available.
  • Develop strategies and plans to prevent disruptions to business activities, safeguard critical processes from system failures or disasters, and ensure their prompt recovery.
AWS Backup & Restore
CloudFormation
Amazon Aurora
CrossRegion replication
AWS Backup
Disaster Recovery: Pilot Light, Warm Standby, Multi Site Active-Active
17. Risk Management
  • Integrate security as a vital element within information systems.
  • Develop and implement a risk management program encompassing risk assessment, mitigation, and evaluation
Trusted Advisor
AWS Config Rules
18. Physical & Environmental Security
  • Secure the organization’s premises and information from unauthorized physical access, damage, and interference.
  • Prevent unauthorized access to networked services.
  • Safeguard assets, prevent loss, damage, theft, or compromise, and ensure uninterrupted organizational activities.
  • Protect information assets from unauthorized disclosure, modification, removal, or destruction, and prevent interruptions to business activities.
AWS Data Centers
Amazon CloudFront
AWS Regions and Global Infrastructure
19. Data Protection & Privacy
  • Ensure the security of the organization’s information and assets when using external products or services.
  • Ensure planning, operation, use, and control of information systems align with applicable laws, regulations, contracts, and security requirements.
Amazon S3
AWS KMS
Aurora
OpenSearch Service
AWS Artifact
Macie

Note: You can use AWS HITRUST-certified services to support your HITRUST compliance requirements. Using these services in their default state doesn’t automatically ensure HITRUST certifiability. You must demonstrate compliance through formally defined policies, procedures, and implementations tailored to your scope. This involves configuring and customizing AWS HITRUST-certified services to align with the HITRUST requirements in your scope, as well as implementing controls that fall outside your use of AWS services (such as appropriate organization-wide policies and procedures).

HITRUST phased approach – Reference architecture:

Figure 2 shows the recommended HITRUST Phase 1 and Phase 2 accounts and components within a landing zone.

Figure 2: HITRUST Phases 1 and 2 architecture including accounts and components

The reference architecture shown in Figure 2 illustrates:

  • A high-level structure of AWS accounts arranged in HITRUST Phase 1 and Phase 2
  • The accounts in HITRUST Phase 1 include:
    • Management account: The management account in the AWS landing zone is the primary account responsible for governing and managing the entire AWS environment.
    • Security account: The security account is dedicated to security and compliance functions, providing a centralized location for security-related tools and monitoring.
    • Central logging account: This account is designed for centralized logging and storage of logs from all other accounts, aiding in security analysis and troubleshooting.
    • Central audit: The central audit account is used for compliance monitoring, logging audit events, and verifying adherence to security standards.
    • DevOps account: DevOps accounts are used for software development and deployment, enabling continuous integration and delivery (CI/CD) processes.
    • Networking account: Networking accounts focus on network management, configuration, and monitoring to support reliable connectivity within the AWS environment.
    • DevSecOps account: DevSecOps accounts combine development, security, and operations to embed security practices throughout the software development lifecycle.
    • Shared services account: Shared services accounts host common resources, such as IAM services, that are shared across other accounts for centralized management.

The account group for HITRUST Phase 2 includes:

  • Tenant A – sample application workloads
  • Tenant B – sample application workloads
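
To keep track of which accounts fall into each phase, you could group them into organizational units (OUs) or tag them. The following minimal Python sketch (the OU ID is a placeholder, and the OU-per-phase grouping is an assumption) uses the AWS Organizations API to list the member accounts under a given OU, for example to document which workload accounts are in scope for Phase 2.

  # Minimal sketch: list the member accounts under an organizational unit (OU),
  # for example to document which accounts are in scope for Phase 1 or Phase 2.
  # The OU ID is a placeholder; run this from the management account (or a
  # delegated administrator) with AWS Organizations read permissions.
  import boto3

  org = boto3.client("organizations")

  def list_accounts_in_ou(ou_id):
      paginator = org.get_paginator("list_accounts_for_parent")
      for page in paginator.paginate(ParentId=ou_id):
          for account in page["Accounts"]:
              yield account["Id"], account["Name"], account["Status"]

  if __name__ == "__main__":
      for account_id, name, status in list_accounts_in_ou("ou-xxxx-examplephase2"):
          print(account_id, name, status)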

HITRUST Phase 1 – HITRUST foundational landing zone assessment phase:

In this phase you define the scope of assessment, including the specific AWS landing zone components and configurations that must be HITRUST compliant. The primary focus here is to evaluate the foundational infrastructure’s compliance with HITRUST controls. This involves a comprehensive review of policies and procedures, and implementation of all requirements within the landing zone scope. Assessing this phase separately enables you to verify that your foundational infrastructure adheres to HITRUST controls. Some of the policies, procedures, and configurations that are HITRUST assessed in this phase can be inherited across multiple applications’ assessments in later phases. Assessing this infrastructure once and then inheriting these controls for applications can be more efficient than assessing each application individually.

By establishing a secure and compliant foundation at the start, you can plan application assessments in later phases, making it simpler for subsequent applications to adhere to HITRUST requirements. This can streamline the compliance process and reduce the overall time and effort required. By assessing the landing zone separately, you can identify and address compliance gaps or issues in your foundational infrastructure, reducing the risk of non-compliance for the applications built upon it. Use the following high-level technical approach for this phase of assessment.

  1. Build your AWS landing zone with HITRUST controls. See Building a landing zone for more information.
  2. Configure AWS services according to the HITRUST requirements that are applicable to your infrastructure scope (a minimal example follows this list).
  3. Use the HITRUST on AWS Quick Start guide as a reference for building a HITRUST-aligned environment in a single account, and as a starting point for building a multi-account architecture.
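
As a small, hypothetical example of step 2, the following Python sketch uses boto3 to deploy a single AWS Config managed rule (checking that S3 buckets have server-side encryption enabled) as one technical guardrail in a landing zone account. The rule shown is only illustrative; the guardrails you actually need depend on your HITRUST scope, and the sketch assumes the AWS Config recorder is already enabled in the account and Region.

  # Hypothetical example for step 2: deploy one AWS Config managed rule as a
  # technical guardrail. Assumes the AWS Config recorder is already enabled;
  # the rule name is a placeholder, and the full rule set depends on your scope.
  import boto3

  config = boto3.client("config")

  config.put_config_rule(
      ConfigRule={
          "ConfigRuleName": "s3-bucket-sse-enabled",  # placeholder name
          "Description": "Checks that S3 buckets have server-side encryption enabled.",
          "Source": {
              "Owner": "AWS",
              "SourceIdentifier": "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED",
          },
          "Scope": {"ComplianceResourceTypes": ["AWS::S3::Bucket"]},
      }
  )
  print("Config rule deployed.")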

HITRUST Phase 2 – HITRUST application assessment phase:

During this phase, you examine your AWS workload application accounts to conduct HITRUST assessments for application systems that are running within the AWS landing zone. You have the option to inherit environment-related controls that have been certified as HITRUST compliant within the landing zone in the previous phase.

The following key steps are recommended in this phase:

  1. Readiness assessment for application scope: Conduct a thorough readiness assessment focused on the application scope, and define the assessment boundaries for the scoped applications (AWS workload accounts).
  2. HITRUST application controls: Gather the HITRUST requirements specific to the application scope by creating a HITRUST object for it.
  3. Scoped requirements analysis: Analyze the requirements and identify those that can be inherited from the Phase 1 infrastructure assessment.
  4. Gap analysis: Work with subject matter experts to conduct a gap analysis, and develop policies, procedures, and implementations for application-specific controls.
  5. Remediation: Remediate the gaps identified during the gap analysis activity.
  6. Formal r2 assessment: Work with a third-party assessor to initiate a formal r2 validated assessment with HITRUST.

Conclusion

By breaking the compliance process into distinct phases, you can concentrate your resources on specific areas and prioritize essential assets accordingly. This approach supports a focused strategy, systematically addressing critical controls, and helping you to fulfill compliance requirements in a scalable manner. Obtaining the initial certification for the infrastructure and platform layers establishes a robust foundational architecture for subsequent phases, which involve application systems.

Earning certification at each phase provides tangible evidence of progress in your compliance journey. This achievement instills confidence in both internal and external stakeholders, affirming your organization’s commitment to security and compliance.

For guidance on achieving, maintaining, and automating compliance in the cloud, reach out to AWS Security Assurance Services (AWS SAS) or your account team. AWS SAS is a PCI QSAC and HITRUST External Assessor that can help by tying together applicable audit standards to AWS service-specific features and functionality. They can help you build on frameworks such as PCI DSS, HITRUST CSF, NIST, SOC 2, HIPAA, ISO 27001, GDPR, and CCPA.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security, Identity, & Compliance re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Abdul Javid

Abdul is a Senior Security Assurance Consultant and PCI DSS Qualified Security Assessor with AWS Security Assurance Services, and has more than 25 years of IT governance, operations, security, risk, and compliance experience. Abdul leverages his experience and knowledge to provide AWS customers with guidance and advice on their compliance journey. Abdul earned an M.S. in Computer Science from IIT, Chicago, and holds various sought-after, industry-recognized certifications in security, program management, and risk management from prominent organizations such as AWS, HITRUST, ISACA, PMI, PCI DSS, and ISC2.

Cate Ciccolone

Cate is a Senior Security Consultant for Amazon Web Services (AWS) where she provides technical and advisory consulting services to global healthcare organizations to help them secure their regulated workloads, minimize risk, and meet compliance goals. Her experience spans cybersecurity engineering, healthcare compliance, electronic health record architecture, and clinical application security. Cate is an AWS Certified Solutions Architect and holds several certifications including EC-Council Certified Incident Handler (E|CIH) and HITRUST Certified Practitioner (CCSFP).

Unstructured data management and governance using AWS AI/ML and analytics services

Post Syndicated from Sakti Mishra original https://aws.amazon.com/blogs/big-data/unstructured-data-management-and-governance-using-aws-ai-ml-and-analytics-services/

Unstructured data is information that doesn’t conform to a predefined schema or isn’t organized according to a preset data model. Unstructured information may have a little or a lot of structure but in ways that are unexpected or inconsistent. Text, images, audio, and videos are common examples of unstructured data. Most companies produce and consume unstructured data such as documents, emails, web pages, engagement center phone calls, and social media. By some estimates, unstructured data can make up to 80–90% of all new enterprise data and is growing many times faster than structured data. After decades of digitizing everything in your enterprise, you may have an enormous amount of data, but with dormant value. However, with the help of AI and machine learning (ML), new software tools are now available to unearth the value of unstructured data.

In this post, we discuss how AWS can help you successfully address the challenges of extracting insights from unstructured data. We discuss various design patterns and architectures for extracting and cataloging valuable insights from unstructured data using AWS. Additionally, we show how to use AWS AI/ML services for analyzing unstructured data.

Why it’s challenging to process and manage unstructured data

Unstructured data makes up a large proportion of the data in the enterprise that can’t be stored in a traditional relational database management system (RDBMS). Understanding the data, categorizing it, storing it, and extracting insights from it can be challenging. In addition, identifying incremental changes requires specialized patterns, and detecting sensitive data and meeting compliance requirements call for sophisticated functions. It can be difficult to integrate unstructured data with structured data from existing information systems. Some view structured and unstructured data as apples and oranges, instead of being complementary. But most important of all, the assumed dormant value in the unstructured data is a question mark, which can only be answered after these sophisticated techniques have been applied. Therefore, there is a need to be able to analyze and extract value from the data economically and flexibly.

Solution overview

Data and metadata discovery is one of the primary requirements in data analytics, where data consumers explore what data is available and in what format, and then consume or query it for analysis. If you can apply a schema on top of the dataset, then it’s straightforward to query because you can load the data into a database or impose a virtual table schema for querying. But in the case of unstructured data, metadata discovery is challenging because the raw data isn’t easily readable.

You can integrate different technologies or tools to build a solution. In this post, we explain how to integrate different AWS services to provide an end-to-end solution that includes data extraction, management, and governance.

The solution integrates data in three tiers. The first is the raw input data that gets ingested from source systems, the second is the output data that gets extracted from the input data using AI, and the third is the metadata layer that maintains a relationship between them for data discovery.

The following is a high-level architecture of the solution we can build to process the unstructured data, assuming the input data is being ingested to the raw input object store.

Unstructured Data Management - Block Level Architecture Diagram

The steps of the workflow are as follows:

  1. Integrated AI services extract data from the unstructured data.
  2. These services write the output to a data lake.
  3. A metadata layer helps build the relationship between the raw data and AI extracted output. When the data and metadata are available for end-users, we can break the user access pattern into additional steps.
  4. In the metadata catalog discovery step, we can use query engines to access the metadata for discovery and apply filters as per our analytics needs. Then we move to the next stage of accessing the actual data extracted from the raw unstructured data.
  5. The end-user accesses the output of the AI services and uses the query engines to query the structured data available in the data lake. We can optionally integrate additional tools that help control access and provide governance.
  6. There might be scenarios where, after accessing the AI extracted output, the end-user wants to access the original raw object (such as media files) for further analysis. Additionally, we need to make sure we have access control policies so the end-user has access only to the respective raw data they want to access.

Now that we understand the high-level architecture, let’s discuss what AWS services we can integrate in each step of the architecture to provide an end-to-end solution.

The following diagram is the enhanced version of our solution architecture, where we have integrated AWS services.

Unstructured Data Management - AWS Native Architecture

Let’s understand how these AWS services are integrated in detail. We have divided the steps into two broad user flows: data processing and metadata enrichment (Steps 1–3) and end-users accessing the data and metadata with fine-grained access control (Steps 4–6).

  1. Various AI services (which we discuss in the next section) extract data from the unstructured datasets.
  2. The output is written to an Amazon Simple Storage Service (Amazon S3) bucket (labeled Extracted JSON in the preceding diagram). Optionally, we can restructure the input raw objects for better partitioning, which can help while implementing fine-grained access control on the raw input data (labeled as the Partitioned bucket in the diagram).
  3. After the initial data extraction phase, we can apply additional transformations to enrich the datasets using AWS Glue. We also build an additional metadata layer, which maintains a relationship between the raw S3 object path, the AI extracted output path, the optional enriched version S3 path, and any other metadata that will help the end-user discover the data.
  4. In the metadata catalog discovery step, we use the AWS Glue Data Catalog as the technical catalog, Amazon Athena and Amazon Redshift Spectrum as query engines, AWS Lake Formation for fine-grained access control, and Amazon DataZone for additional governance.
  5. The AI-extracted output is expected to be available as a delimited file or in JSON format. We can create an AWS Glue Data Catalog table for querying using Athena or Redshift Spectrum, as shown in the sketch after this list. Like the previous step, we can use Lake Formation policies for fine-grained access control.
  6. Lastly, the end-user accesses the raw unstructured data available in Amazon S3 for further analysis. We have proposed integrating Amazon S3 Access Points for access control at this layer. We explain this in detail later in this post.
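
As a simplified illustration of step 5 (the database, table, bucket names, and schema are all placeholder assumptions), the following Python sketch registers the AI-extracted JSON output as an AWS Glue Data Catalog table and then runs an Athena query against it.

  # Simplified sketch of step 5: register the AI-extracted JSON output as a
  # Glue Data Catalog table and query it with Athena. The database, table,
  # bucket names, and schema are placeholders for illustration.
  import boto3

  glue = boto3.client("glue")
  athena = boto3.client("athena")

  glue.create_table(
      DatabaseName="unstructured_metadata",      # placeholder database
      TableInput={
          "Name": "transcripts_extracted",       # placeholder table
          "TableType": "EXTERNAL_TABLE",
          "StorageDescriptor": {
              "Columns": [
                  {"Name": "source_object", "Type": "string"},
                  {"Name": "transcript_text", "Type": "string"},
                  {"Name": "sentiment", "Type": "string"},
              ],
              "Location": "s3://example-extracted-json-bucket/transcripts/",
              "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
              "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
              "SerdeInfo": {"SerializationLibrary": "org.openx.data.jsonserde.JsonSerDe"},
          },
      },
  )

  athena.start_query_execution(
      QueryString=(
          "SELECT source_object, sentiment "
          "FROM unstructured_metadata.transcripts_extracted "
          "WHERE sentiment = 'NEGATIVE' LIMIT 10"
      ),
      ResultConfiguration={"OutputLocation": "s3://example-athena-results-bucket/"},
  )

In practice, you might let an AWS Glue crawler infer the schema instead of defining it by hand, and attach Lake Formation permissions to the database and table before exposing them to analysts.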

Now let’s expand the following parts of the architecture to understand the implementation better:

  • Using AWS AI services to process unstructured data
  • Using S3 Access Points to integrate access control on raw S3 unstructured data

Process unstructured data with AWS AI services

As we discussed earlier, unstructured data can come in a variety of formats, such as text, audio, video, and images, and each type of data requires a different approach for extracting metadata. AWS AI services are designed to extract metadata from different types of unstructured data. The following are the most commonly used services for unstructured data processing:

  • Amazon Comprehend – This natural language processing (NLP) service uses ML to extract metadata from text data. It can analyze text in multiple languages, detect entities, extract key phrases, determine sentiment, and more. With Amazon Comprehend, you can easily gain insights from large volumes of text data, such as extracting product entities, customer names, and sentiment from social media posts.
  • Amazon Transcribe – This speech-to-text service uses ML to convert speech to text and extract metadata from audio data. It can recognize multiple speakers, transcribe conversations, identify keywords, and more. With Amazon Transcribe, you can convert unstructured data such as customer support recordings into text and further derive insights from it.
  • Amazon Rekognition – This image and video analysis service uses ML to extract metadata from visual data. It can recognize objects, people, faces, and text, detect inappropriate content, and more. With Amazon Rekognition, you can easily analyze images and videos to gain insights such as identifying entity type (human or other) and identifying if the person is a known celebrity in an image.
  • Amazon Textract – You can use this ML service to extract metadata from scanned documents and images. It can extract text, tables, and forms from images, PDFs, and scanned documents. With Amazon Textract, you can digitize documents and extract data such as customer name, product name, product price, and date from an invoice.
  • Amazon SageMaker – This service enables you to build and deploy custom ML models for a wide range of use cases, including extracting metadata from unstructured data. With SageMaker, you can build custom models that are tailored to your specific needs, which can be particularly useful for extracting metadata from unstructured data that requires a high degree of accuracy or domain-specific knowledge.
  • Amazon Bedrock – This fully managed service offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon with a single API. It also offers a broad set of capabilities to build generative AI applications, simplifying development while maintaining privacy and security.

With these specialized AI services, you can efficiently extract metadata from unstructured data and use it for further analysis and insights. It’s important to note that each service has its own strengths and limitations, and choosing the right service for your specific use case is critical for achieving accurate and reliable results.

AWS AI services are available via various APIs, which enables you to integrate AI capabilities into your applications and workflows. AWS Step Functions is a serverless workflow service that allows you to coordinate and orchestrate multiple AWS services, including AI services, into a single workflow. This can be particularly useful when you need to process large amounts of unstructured data and perform multiple AI-related tasks, such as text analysis, image recognition, and NLP.

With Step Functions and AWS Lambda functions, you can create sophisticated workflows that include AI services and other AWS services. For instance, you can use Amazon S3 to store input data, invoke a Lambda function to trigger an Amazon Transcribe job to transcribe an audio file, and use the output to trigger an Amazon Comprehend analysis job to generate sentiment metadata for the transcribed text. This enables you to create complex, multi-step workflows that are straightforward to manage, scalable, and cost-effective.
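
The following is a minimal Python sketch of that pattern; the bucket names, job naming, event shape, and media format are illustrative assumptions rather than a complete implementation. One Lambda handler starts an Amazon Transcribe job for a newly uploaded audio object, and a helper function runs Amazon Comprehend sentiment analysis on the finished transcript.

  # Minimal sketch of the orchestration pattern described above. Bucket names,
  # job naming, the S3 event shape, and the media format are assumptions.
  import json
  import uuid

  import boto3

  transcribe = boto3.client("transcribe")
  comprehend = boto3.client("comprehend")
  s3 = boto3.client("s3")

  def start_transcription(event, context):
      """Lambda handler: triggered by an S3 upload event, starts a Transcribe job."""
      record = event["Records"][0]["s3"]
      bucket, key = record["bucket"]["name"], record["object"]["key"]
      job_name = f"transcribe-{uuid.uuid4()}"
      transcribe.start_transcription_job(
          TranscriptionJobName=job_name,
          Media={"MediaFileUri": f"s3://{bucket}/{key}"},
          MediaFormat="mp3",                                  # assumed input format
          LanguageCode="en-US",
          OutputBucketName="example-extracted-json-bucket",   # placeholder bucket
      )
      return {"jobName": job_name}

  def analyze_sentiment(transcript_bucket, transcript_key):
      """Reads a finished Transcribe output file from S3 and derives sentiment metadata."""
      body = s3.get_object(Bucket=transcript_bucket, Key=transcript_key)["Body"].read()
      text = json.loads(body)["results"]["transcripts"][0]["transcript"]
      result = comprehend.detect_sentiment(Text=text[:5000], LanguageCode="en")  # API text size limit
      return {"sentiment": result["Sentiment"], "scores": result["SentimentScore"]}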

The following is an example architecture that shows how Step Functions can help invoke AWS AI services using Lambda functions.

AWS AI Services - Lambda Event Workflow -Unstructured Data

The workflow steps are as follows:

  1. Unstructured data, such as text files, audio files, and video files, are ingested into the S3 raw bucket.
  2. A Lambda function is triggered to read the data from the S3 bucket and call Step Functions to orchestrate the workflow required to extract the metadata.
  3. The Step Functions workflow checks the type of file, calls the corresponding AWS AI service APIs, checks the job status, and performs any postprocessing required on the output.
  4. AWS AI services can be accessed via APIs and invoked as batch jobs. To extract metadata from different types of unstructured data, you can use multiple AI services in sequence, with each service processing the corresponding file type.
  5. After the Step Functions workflow completes the metadata extraction process and performs any required postprocessing, the resulting output is stored in an S3 bucket for cataloging.

Next, let’s understand how we can implement security and access control on both the extracted output and the raw input objects.

Implement access control on raw and processed data in Amazon S3

When managing unstructured data, we need to consider access controls for three types of data: the AI-extracted semi-structured output, the metadata, and the raw unstructured original files. The AI-extracted output is in JSON format and can be restricted via Lake Formation and Amazon DataZone. We recommend keeping the metadata (information that captures which unstructured datasets are already processed by the pipeline and available for analysis) open to your organization, which will enable metadata discovery across the organization.

To control access to raw unstructured data, you can integrate S3 Access Points and explore additional support in the future as AWS services evolve. S3 Access Points simplify data access for any AWS service or customer application that stores data in Amazon S3. Access points are named network endpoints that are attached to buckets that you can use to perform S3 object operations. Each access point has distinct permissions and network controls that Amazon S3 applies for any request that is made through that access point. Each access point enforces a customized access point policy that works in conjunction with the bucket policy that is attached to the underlying bucket. With S3 Access Points, you can create unique access control policies for each access point to easily control access to specific datasets within an S3 bucket. This works well in multi-tenant or shared bucket scenarios where users or teams are assigned to unique prefixes within one S3 bucket.

An access point can support a single user or application, or groups of users or applications within and across accounts, allowing separate management of each access point. Every access point is associated with a single bucket and contains a network origin control and a Block Public Access control. For example, you can create an access point with a network origin control that only permits storage access from your virtual private cloud (VPC), a logically isolated section of the AWS Cloud. You can also create an access point with the access point policy configured to only allow access to objects with a defined prefix or to objects with specific tags. You can also configure custom Block Public Access settings for each access point.
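
For illustration, the following Python sketch creates an access point for one tenant and attaches a policy that only allows object reads under that tenant’s prefix. The account ID, Region, bucket, access point name, role, and prefix are all placeholders.

  # Illustrative sketch: create an S3 Access Point for one tenant and attach a
  # policy allowing GetObject only under that tenant's prefix. The account ID,
  # Region, bucket, access point name, role, and prefix are placeholders.
  import json

  import boto3

  s3control = boto3.client("s3control")
  ACCOUNT_ID = "111122223333"  # placeholder account ID

  s3control.create_access_point(
      AccountId=ACCOUNT_ID,
      Name="tenant-a-raw-data",
      Bucket="example-raw-unstructured-bucket",
  )

  policy = {
      "Version": "2012-10-17",
      "Statement": [
          {
              "Effect": "Allow",
              "Principal": {"AWS": f"arn:aws:iam::{ACCOUNT_ID}:role/TenantARole"},
              "Action": "s3:GetObject",
              "Resource": (
                  f"arn:aws:s3:us-east-1:{ACCOUNT_ID}:accesspoint/"
                  "tenant-a-raw-data/object/tenant-a/*"
              ),
          }
      ],
  }

  s3control.put_access_point_policy(
      AccountId=ACCOUNT_ID,
      Name="tenant-a-raw-data",
      Policy=json.dumps(policy),
  )

Note that the underlying bucket policy must also permit, or delegate, access for requests made through the access point for this pattern to work as expected.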

The following architecture provides an overview of how an end-user can get access to specific S3 objects by assuming a specific AWS Identity and Access Management (IAM) role. If you have a large number of S3 objects to control access, consider grouping the S3 objects, assigning them tags, and then defining access control by tags.

S3 Access Points - Unstructured Data Management - Access Control

If you are implementing a solution that integrates S3 data available in multiple AWS accounts, you can take advantage of cross-account support for S3 Access Points.

Conclusion

This post explained how you can use AWS AI services to extract readable data from unstructured datasets, build a metadata layer on top of them to allow data discovery, and build an access control mechanism on top of the raw S3 objects and extracted data using Lake Formation, Amazon DataZone, and S3 Access Points.

In addition to AWS AI services, you can also integrate large language models with vector databases to enable semantic or similarity search on top of unstructured datasets. To learn more about how to enable semantic search on unstructured data by integrating Amazon OpenSearch Service as a vector database, refer to Try semantic search with the Amazon OpenSearch Service vector engine.

As of this writing, S3 Access Points are one of the best solutions for implementing access control on raw S3 objects using tagging, but as AWS service features evolve, you can explore alternative options as well.


About the Authors

Sakti Mishra is a Principal Solutions Architect at AWS, where he helps customers modernize their data architecture and define their end-to-end data strategy, including data security, accessibility, governance, and more. He is also the author of the book Simplify Big Data Analytics with Amazon EMR. Outside of work, Sakti enjoys learning new technologies, watching movies, and visiting places with family.

Bhavana Chirumamilla is a Senior Resident Architect at AWS with a strong passion for data and machine learning operations. She brings a wealth of experience and enthusiasm to help enterprises build effective data and ML strategies. In her spare time, Bhavana enjoys spending time with her family and engaging in various activities such as traveling, hiking, gardening, and watching documentaries.

Sheela Sonone is a Senior Resident Architect at AWS. She helps AWS customers make informed choices and trade-offs about accelerating their data, analytics, and AI/ML workloads and implementations. In her spare time, she enjoys spending time with her family—usually on tennis courts.

Daniel Bruno is a Principal Resident Architect at AWS. He has been building analytics and machine learning solutions for over 20 years and splits his time between helping customers build data science programs and designing impactful ML products.

AWS Digital Sovereignty Pledge: Announcing a new, independent sovereign cloud in Europe

Post Syndicated from Matt Garman original https://aws.amazon.com/blogs/security/aws-digital-sovereignty-pledge-announcing-a-new-independent-sovereign-cloud-in-europe/


From day one, Amazon Web Services (AWS) has always believed it is essential that customers have control over their data, and choices for how they secure and manage that data in the cloud. Last year, we introduced the AWS Digital Sovereignty Pledge, our commitment to offering AWS customers the most advanced set of sovereignty controls and features available in the cloud. We pledged to work to understand the evolving needs and requirements of both customers and regulators, and to rapidly adapt and innovate to meet them. We committed to expanding our capabilities to allow customers to meet their digital sovereignty needs, without compromising on the performance, innovation, security, or scale of the AWS Cloud.

AWS offers the largest and most comprehensive cloud infrastructure globally. Our approach from the beginning has been to make AWS sovereign-by-design. We built data protection features and controls in the AWS Cloud with input from financial services, healthcare, and government customers—who are among the most security- and data privacy-conscious organizations in the world. This has led to innovations like the AWS Nitro System, which powers all our modern Amazon Elastic Compute Cloud (Amazon EC2) instances and provides a strong physical and logical security boundary to enforce access restrictions so that nobody, including AWS employees, can access customer data running in Amazon EC2. The security design of the Nitro System has also been independently validated by the NCC Group in a public report.

With AWS, customers have always had control over the location of their data. In Europe, customers that need to comply with European data residency requirements have the choice to deploy their data to any of our eight existing AWS Regions (Ireland, Frankfurt, London, Paris, Stockholm, Milan, Zurich, and Spain) to keep their data securely in Europe. To run their sensitive workloads, European customers can leverage the broadest and deepest portfolio of services, including AI, analytics, compute, database, Internet of Things (IoT), machine learning, mobile services, and storage. To further support customers, we’ve innovated to offer more control and choice over their data. For example, we announced further transparency and assurances, and new dedicated infrastructure options with AWS Dedicated Local Zones.

Announcing the AWS European Sovereign Cloud

When we speak to public sector and regulated industry customers in Europe, they share how they are facing incredible complexity and changing dynamics with an evolving sovereignty landscape. Customers tell us they want to adopt the cloud, but are facing increasing regulatory scrutiny over data location, European operational autonomy, and resilience. We’ve learned that these customers are concerned that they will have to choose between the full power of AWS or feature-limited sovereign cloud solutions. We’ve had deep engagements with European regulators, national cybersecurity authorities, and customers to understand how the sovereignty needs of customers can vary based on multiple factors, like location, sensitivity of workloads, and industry. These factors can impact their workload requirements, such as where their data can reside, who can access it, and the controls needed. AWS has a proven track record of innovation to address specialized workloads around the world.

Today, we’re excited to announce our plans to launch the AWS European Sovereign Cloud, a new, independent cloud for Europe, designed to help public sector organizations and customers in highly regulated industries meet their evolving sovereignty needs. We’re designing the AWS European Sovereign Cloud to be separate and independent from our existing Regions, with infrastructure located wholly within the European Union (EU), with the same security, availability, and performance our customers get from existing Regions today. To deliver enhanced operational resilience within the EU, only EU residents who are located in the EU will have control of the operations and support for the AWS European Sovereign Cloud. As with all current Regions, customers using the AWS European Sovereign Cloud will benefit from the full power of AWS with the same familiar architecture, expansive service portfolio, and APIs that millions of customers use today. The AWS European Sovereign Cloud will launch its first AWS Region in Germany available to all European customers.

The AWS European Sovereign Cloud will be sovereign-by-design, and will be built on more than a decade of experience operating multiple independent clouds for the most critical and restricted workloads. Like existing Regions, the AWS European Sovereign Cloud will be built for high availability and resiliency, and powered by the AWS Nitro System, to help ensure the confidentiality and integrity of customer data. Customers will have the control and assurance that AWS will not access or use customer data for any purpose without their agreement. AWS gives customers the strongest sovereignty controls among leading cloud providers. For customers with enhanced data residency needs, the AWS European Sovereign cloud is designed to go further and will allow customers to keep all metadata they create (such as the roles, permissions, resource labels, and configurations they use to run AWS) in the EU. The AWS European Sovereign Cloud will also be built with separate, in-Region billing and usage metering systems.

Delivering operational autonomy

The AWS European Sovereign Cloud will provide customers the capability to meet stringent operational autonomy and data residency requirements. To deliver enhanced data residency and operational resilience within the EU, the AWS European Sovereign Cloud infrastructure will be operated independently from existing AWS Regions. To assure independent operation of the AWS European Sovereign Cloud, only personnel who are EU residents, located in the EU, will have control of day-to-day operations, including access to data centers, technical support, and customer service.

We’re taking learnings from our deep engagements with European regulators and national cybersecurity authorities and applying them as we build the AWS European Sovereign Cloud, so that customers using the AWS European Sovereign Cloud can meet their data residency, operational autonomy, and resilience requirements. For example, we are looking forward to continuing to partner with Germany’s Federal Office for Information Security (BSI).

“The development of a European AWS Cloud will make it much easier for many public sector organizations and companies with high data security and data protection requirements to use AWS services. We are aware of the innovative power of modern cloud services and we want to help make them securely available for Germany and Europe. The C5 (Cloud Computing Compliance Criteria Catalogue), which was developed by the BSI, has significantly shaped cybersecurity cloud standards and AWS was in fact the first cloud service provider to receive the BSI’s C5 testate. In this respect, we are very pleased to constructively accompany the local development of an AWS Cloud, which will also contribute to European sovereignty, in terms of security.”
— Claudia Plattner, President of the German Federal Office for Information Security (BSI)

Control without compromise

Though separate, the AWS European Sovereign Cloud will offer the same industry-leading architecture built for security and availability as other AWS Regions. This will include multiple Availability Zones (AZs), infrastructure that is placed in separate and distinct geographic locations, with enough distance to significantly reduce the risk of a single event impacting customers’ business continuity. Each AZ will have multiple layers of redundant power and networking to provide the highest level of resiliency. All AZs in the AWS European Sovereign Cloud will be interconnected with fully redundant, dedicated metro fiber, providing high-throughput, low-latency networking between AZs. All traffic between AZs will be encrypted. Customers who need more options to address stringent isolation and in-country data residency needs will be able to use Dedicated Local Zones or AWS Outposts to deploy AWS European Sovereign Cloud infrastructure in locations they select.

Continued AWS investment in Europe

The AWS European Sovereign Cloud represents continued AWS investment in Europe. AWS is committed to innovating to support European values and Europe’s digital future. We drive economic development through investing in infrastructure, jobs, and skills in communities and countries across Europe. We are creating thousands of high-quality jobs and investing billions of euros in European economies. Amazon has created more than 100,000 permanent jobs across the EU. Some of our largest AWS development teams are located in Europe, with key centers in Dublin, Dresden, and Berlin. As part of our continued commitment to contribute to the development of digital skills, we will hire and develop additional local personnel to operate and support the AWS European Sovereign Cloud.

Customers, partners, and regulators welcome the AWS European Sovereign Cloud

In the EU, hundreds of thousands of organizations of all sizes and across all industries are using AWS – from start-ups, to small and medium-sized businesses, to the largest enterprises, including telecommunication companies, public sector organizations, educational institutions, and government agencies. Organizations across Europe support the introduction of the AWS European Sovereign Cloud.

“As the market leader in enterprise application software with strong roots in Europe, SAP and AWS have long collaborated on behalf of customers to accelerate digital transformation around the world. The AWS European Sovereign Cloud provides further opportunities to strengthen our relationship in Europe by enabling us to expand the choices we offer to customers as they move to the cloud. We appreciate the ongoing partnership with AWS, and the new possibilities this investment can bring for our mutual customers across the region.”
— Peter Pluim, President, SAP Enterprise Cloud Services and SAP Sovereign Cloud Services.

“The new AWS European Sovereign Cloud can be a game-changer for highly regulated business segments in the European Union. As a leading telecommunications provider in Germany, our digital transformation focuses on innovation, scalability, agility, and resilience to provide our customers with the best services and quality. This will now be paired with the highest levels of data protection and regulatory compliance that AWS delivers, and with a particular focus on digital sovereignty requirements. I am convinced that this new infrastructure offering has the potential to boost cloud adaptation of European companies and accelerate the digital transformation of regulated industries across the EU.”
— Mallik Rao, Chief Technology and Information Officer, O2 Telefónica in Germany

“Deutsche Telekom welcomes the announcement of the AWS European Sovereign Cloud, which highlights AWS’s dedication to continuous innovation for European businesses. The AWS solution will provide greater choice for organizations when moving regulated workloads to the cloud and additional options to meet evolving digital governance requirements in the EU.”
— Greg Hyttenrauch, senior vice president, Global Cloud Services at T-Systems

“Today, we stand at the cusp of a transformative era. The introduction of the AWS European Sovereign Cloud does not merely represent an infrastructural enhancement, it is a paradigm shift. This sophisticated framework will empower Dedalus to offer unparalleled services for storing patient data securely and efficiently in the AWS cloud. We remain committed, without compromise, to serving our European clientele with best-in-class solutions underpinned by trust and technological excellence.”
— Andrea Fiumicelli, Chairman, Dedalus

“At de Volksbank, we believe in investing in a better Netherlands. To do this effectively, we need to have access to the latest technologies in order for us to continually be innovating and improving services for our customers. For this reason, we welcome the announcement of the European Sovereign Cloud which will allow European customers to easily demonstrate compliance with evolving regulations while still benefitting from the scale, security, and full suite of AWS services.”
— Sebastiaan Kalshoven, Director IT/CTO, de Volksbank

“Eviden welcomes the launch of the AWS European Sovereign Cloud. This will help regulated industries and the public sector address the requirements of their sensitive workloads with a fully featured AWS cloud wholly operated in Europe. As an AWS Premier Tier Services Partner and leader in cybersecurity services in Europe, Eviden has an extensive track record in helping AWS customers formalize and mitigate their sovereignty risks. The AWS European Sovereign Cloud will allow Eviden to address a wider range of customers’ sovereignty needs.”
— Yannick Tricaud, Head of Southern and Central Europe, Middle East, and Africa, Eviden, Atos Group

“We welcome the commitment of AWS to expand its infrastructure with an independent European cloud. This will give businesses and public sector organizations more choice in meeting digital sovereignty requirements. Cloud services are essential for the digitization of the public administration. With the “German Administration Cloud Strategy” and the “EVB-IT Cloud” contract standard, the foundations for cloud use in the public administration have been established. I am very pleased to work together with AWS to practically and collaboratively implement sovereignty in line with our cloud strategy.”
— Dr. Markus Richter, CIO of the German federal government, Federal Ministry of the Interior

Our commitments to our customers

We remain committed to giving our customers control and choices to help meet their evolving digital sovereignty needs. We continue to innovate sovereignty features, controls, and assurances globally with AWS, without compromising on the full power of AWS.

You can discover more about the AWS European Sovereign Cloud and learn more about our customers in the Press Release and on our European Digital Sovereignty website. You can also get more information in the AWS News Blog.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Matt Garman

Matt is currently the Senior Vice President of AWS Sales, Marketing and Global Services at AWS, and also sits on Amazon’s executive leadership S-Team. Matt joined Amazon in 2006, and has held several leadership positions in AWS over that time. Matt previously served as Vice President of the Amazon EC2 and Compute Services businesses for AWS for over 10 years. Matt was responsible for P&L, product management, and engineering and operations for all compute and storage services in AWS. He started at Amazon when AWS first launched in 2006 and served as one of the first product managers, helping to launch the initial set of AWS services. Prior to Amazon, he spent time in product management roles at early stage Internet startups. Matt earned a BS and MS in Industrial Engineering from Stanford University, and an MBA from the Kellogg School of Management at Northwestern University.


Max Peterson

Max is the Vice President of AWS Sovereign Cloud. He leads efforts to ensure that all AWS customers around the world have the most advanced set of sovereignty controls, privacy safeguards, and security features available in the cloud. Before his current role, Max served as the VP of AWS Worldwide Public Sector (WWPS) and created and led the WWPS International Sales division, with a focus on empowering government, education, healthcare, aerospace and satellite, and nonprofit organizations to drive rapid innovation while meeting evolving compliance, security, and policy requirements. Max has over 30 years of public sector experience and served in other technology leadership roles before joining Amazon. Max has earned both a Bachelor of Arts in Finance and Master of Business Administration in Management Information Systems from the University of Maryland.


French

AWS Digital Sovereignty Pledge : Un nouveau cloud souverain, indépendant en Europe

Depuis sa création, Amazon Web Services (AWS) est convaincu qu’il est essentiel que les clients aient le contrôle de leurs données et puissent choisir la manière dont ils les sécurisent et les gèrent dans le cloud. L’année dernière, nous avons annoncé l’AWS Digital Sovereignty Pledge, notre engagement à offrir aux clients d’AWS l’ensemble le plus avancé de contrôles et de fonctionnalités de souveraineté disponibles dans le cloud. Nous nous sommes engagés à travailler pour comprendre les besoins et les exigences en constante évolution de nos clients et des régulateurs, et à nous adapter et innover rapidement pour y répondre. Nous nous sommes engagés à développer notre offre afin de permettre à nos clients de répondre à leurs besoins en matière de souveraineté numérique, sans compromis sur les performances, l’innovation, la sécurité ou encore l’étendue du Cloud AWS.

AWS propose l’infrastructure cloud la plus étendue et la plus complète au monde. Dès l’origine, notre approche a été de rendre AWS souverain dès la conception. Nous avons développé des fonctionnalités et des contrôles de protection des données dans le Cloud AWS en nous appuyant sur les retours de clients du secteur financier, de la santé et du secteur public, qui figurent parmi les organisations les plus soucieuses de la sécurité et de la confidentialité des données au monde. Cela a donné lieu à des innovations telles que AWS Nitro System, qui alimente toutes nos instances modernes Amazon Elastic Compute Cloud (Amazon EC2) et fournit une solide barrière de sécurité physique et logique pour implémenter les restrictions d’accès afin que personne, y compris les employés d’AWS, ne puisse accéder aux données des clients traitées dans Amazon EC2. La conception de la sécurité du Système Nitro a également été validée de manière indépendante par NCC Group dans un rapport public.

Avec AWS, les clients ont toujours eu le contrôle de l’emplacement de leurs données. En Europe, les clients qui doivent se conformer aux exigences européennes en matière de localisation des données peuvent choisir de déployer leurs données dans l’une de nos huit Régions AWS existantes (Irlande, Francfort, Londres, Paris, Stockholm, Milan, Zurich et Espagne) afin de conserver leurs données en toute sécurité en Europe. Pour exécuter leurs applications sensibles, les clients européens peuvent avoir recours à l’offre de services la plus étendue et la plus complète, de l’intelligence artificielle à l’analyse, du calcul aux bases de données, en passant par l’Internet des objets (IoT), l’apprentissage automatique, les services mobiles et le stockage. Pour soutenir davantage nos clients, nous avons innové pour offrir un plus grand choix en matière de contrôle sur leurs données. Par exemple, nous avons annoncé une transparence et des garanties de sécurité accrues, ainsi que de nouvelles options d’infrastructure dédiée appelées AWS Dedicated Local Zones.

Annonce de l’AWS European Sovereign Cloud
Lorsque nous parlons à des clients du secteur public et des industries régulées en Europe, ils nous font part de l’incroyable complexité à laquelle ils sont confrontés dans un contexte de souveraineté en pleine évolution. Les clients nous disent qu’ils souhaitent adopter le cloud, mais qu’ils sont confrontés à des exigences réglementaires croissantes en matière de localisation des données, d’autonomie opérationnelle européenne et de résilience. Nous entendons que ces clients craignent de devoir choisir entre la pleine puissance d’AWS et des solutions de cloud souverain aux fonctionnalités limitées. Nous avons eu des contacts approfondis avec les régulateurs européens, les autorités nationales de cybersécurité et les clients afin de comprendre comment ces besoins de souveraineté peuvent varier en fonction de multiples facteurs tels que la localisation, la sensibilité des applications et le secteur d’activité. Ces facteurs peuvent avoir une incidence sur leurs exigences, comme l’endroit où leurs données peuvent être localisées, les personnes autorisées à y accéder et les contrôles nécessaires. AWS a fait ses preuves en matière d’innovation pour les applications spécialisées dans le monde entier.

Aujourd’hui, nous sommes heureux d’annoncer le lancement de l’AWS European Sovereign Cloud, un nouveau cloud indépendant pour l’Europe, conçu pour aider les organisations du secteur public et les clients des industries régulées à répondre à leurs besoins évolutifs en matière de souveraineté. Nous concevons l’AWS European Sovereign Cloud de manière à ce qu’il soit distinct et indépendant de nos Régions existantes, avec une infrastructure entièrement située dans l’Union européenne (UE), avec les mêmes niveaux de sécurité, de disponibilité et de performance que ceux dont bénéficient nos clients aujourd’hui dans les Régions existantes. Pour offrir une résilience opérationnelle accrue au sein de l’UE, seuls des résidents de l’UE qui se trouvent dans l’UE auront le contrôle sur l’exploitation et le support de l’AWS European Sovereign Cloud. Comme dans toutes les Régions existantes, les clients utilisant l’AWS European Sovereign Cloud bénéficieront de toute la puissance d’AWS avec la même architecture à laquelle ils sont habitués, le même portefeuille de services étendu et les mêmes API que ceux utilisés par des millions de clients aujourd’hui. L’AWS European Sovereign Cloud lancera sa première Région AWS en Allemagne, disponible pour tous les clients européens.

L’AWS European Sovereign Cloud sera souverain dès la conception et s’appuiera sur plus d’une décennie d’expérience dans l’exploitation de clouds indépendants pour les applications les plus critiques et les plus sensibles. À l’instar des Régions existantes, l’AWS European Sovereign Cloud sera conçu pour offrir une haute disponibilité et un haut niveau de résilience, et sera basé sur le Système AWS Nitro afin de garantir la confidentialité et l’intégrité des données des clients. Les clients auront le contrôle de leurs données et l’assurance qu’AWS n’y accèdera pas, ni ne les utilisera à aucune fin sans leur accord. AWS offre à ses clients les contrôles de souveraineté les plus puissants parmi les principaux fournisseurs de cloud. Pour les clients ayant des besoins accrus en matière de localisation des données, l’AWS European Sovereign Cloud est conçu pour aller plus loin et permettra aux clients de conserver toutes les métadonnées qu’ils créent (telles que les rôles de compte, les autorisations, les étiquettes de données et les configurations qu’ils utilisent au sein d’AWS) dans l’UE. L’AWS European Sovereign Cloud sera également doté de systèmes de facturation et de mesure de l’utilisation distincts et propres.

Apporter l’autonomie opérationnelle
L’AWS European Sovereign Cloud permettra aux clients de répondre à des exigences strictes en matière d’autonomie opérationnelle et de localisation des données. Pour améliorer la localisation des données et la résilience opérationnelle au sein de l’UE, l’infrastructure de l’AWS European Sovereign Cloud sera exploitée indépendamment des Régions AWS existantes. Afin de garantir le fonctionnement indépendant de l’AWS European Sovereign Cloud, seul le personnel résident de l’UE et situé dans l’UE contrôlera les opérations quotidiennes, y compris l’accès aux centres de données, l’assistance technique et le service client.

Nous tirons les enseignements de nos échanges approfondis auprès des régulateurs européens et des autorités nationales de cybersécurité et les appliquons à la création de l’AWS European Sovereign Cloud, afin que les clients qui l’utilisent puissent répondre à leurs exigences en matière de localisation des données, d’autonomie opérationnelle et de résilience. Par exemple, nous nous réjouissons de poursuivre notre partenariat avec l’Office fédéral allemand de la sécurité de l’information (BSI).

« Le développement d’un cloud AWS européen facilitera grandement l’utilisation des services AWS pour de nombreuses organisations du secteur public et des entreprises ayant des exigences élevées en matière de sécurité et de protection des données. Nous sommes conscients du pouvoir d’innovation des services cloud modernes et nous voulons contribuer à les rendre disponibles en toute sécurité pour l’Allemagne et l’Europe. Le C5 (Catalogue des critères de conformité du cloud computing), développé par le BSI, a considérablement façonné les normes de cybersécurité dans le cloud et AWS a été en fait le premier fournisseur de services cloud à recevoir la certification C5 du BSI. À cet égard, nous sommes très heureux d’accompagner de manière constructive le développement local d’un cloud AWS, qui contribuera également à la souveraineté européenne en termes de sécurité ».
— Claudia Plattner, Présidente de l’Office fédéral allemand de la sécurité de l’information (BSI)

Contrôle sans compromis
Bien que distinct, l’AWS European Sovereign Cloud proposera la même architecture à la pointe de l’industrie, conçue pour offrir la même sécurité et la même disponibilité que les autres Régions AWS. Cela inclura plusieurs Zones de Disponibilité (AZ), des infrastructures physiques placées dans des emplacements géographiques séparés et distincts, avec une distance suffisante pour réduire de manière significative le risque qu’un seul événement ait un impact sur la continuité des activités des clients. Chaque AZ disposera de plusieurs couches d’alimentation et de réseau redondantes pour fournir le plus haut niveau de résilience. Toutes les Zones de Disponibilité de l’AWS European Sovereign Cloud seront interconnectées par un réseau métropolitain de fibres dédié entièrement redondant, fournissant un réseau à haut débit et à faible latence entre les Zones de Disponibilité. Tous les échanges entre les AZ seront chiffrés. Les clients recherchant davantage d’options pour répondre à des besoins stricts en matière d’isolement et de localisation des données dans le pays pourront tirer parti des Dedicated Local Zones ou d’AWS Outposts pour déployer l’infrastructure de l’AWS European Sovereign Cloud sur les sites de leur choix.

Un investissement continu d’AWS en Europe
L’AWS European Sovereign Cloud s’inscrit dans un investissement continu d’AWS en Europe. AWS s’engage à innover pour soutenir les valeurs européennes et l’avenir numérique de l’Europe.

Nous créons des milliers d’emplois qualifiés et investissons des milliards d’euros dans l’économie européenne. Amazon a créé plus de 100 000 emplois permanents dans l’UE.

Nous favorisons le développement économique en investissant dans les infrastructures, les emplois et les compétences dans les territoires et les pays d’Europe. Certaines des plus grandes équipes de développement d’AWS sont situées en Europe, avec des centres majeurs à Dublin, Dresde et Berlin. Dans le cadre de notre engagement continu à contribuer au développement des compétences numériques, nous recruterons et formerons du personnel local supplémentaire pour exploiter et soutenir l’AWS European Sovereign Cloud.

Les clients, partenaires et régulateurs accueillent favorablement l’AWS European Sovereign Cloud
Dans l’UE, des centaines de milliers d’organisations de toutes tailles et de tous secteurs utilisent AWS, qu’il s’agisse de start-ups, de petites et moyennes entreprises ou de grandes entreprises, y compris des sociétés de télécommunications, des organisations du secteur public, des établissements d’enseignement ou des agences gouvernementales. Des organisations de toute l’Europe accueillent favorablement l’AWS European Sovereign Cloud.

« En tant que leader du marché des logiciels de gestion d’entreprise fortement ancré en Europe, SAP collabore depuis longtemps avec AWS pour le compte de ses clients, afin d’accélérer la transformation numérique dans le monde entier. L’AWS European Sovereign Cloud offre de nouvelles opportunités de renforcer notre relation en Europe en nous permettant d’élargir les choix que nous offrons aux clients lorsqu’ils passent au cloud. Nous apprécions le partenariat en cours avec AWS, et les nouvelles possibilités que cet investissement peut apporter à nos clients mutuels dans toute la région. »
— Peter Pluim, Président, SAP Enterprise Cloud Services et SAP Sovereign Cloud Services.

« Le nouvel AWS European Sovereign Cloud peut changer la donne pour les secteurs d’activité très réglementés de l’Union européenne. En tant que fournisseur de télécommunications de premier plan en Allemagne, notre transformation numérique se concentre sur l’innovation, l’évolutivité, l’agilité et la résilience afin de fournir à nos clients les meilleurs services et la meilleure qualité. Cela sera désormais associé aux plus hauts niveaux de protection des données et de conformité réglementaire qu’offre AWS, avec un accent particulier sur les exigences de souveraineté numérique. Je suis convaincu que cette nouvelle offre d’infrastructure a le potentiel de stimuler l’adaptation au cloud des entreprises européennes et d’accélérer la transformation numérique des industries réglementées à travers l’UE. »
— Mallik Rao, Chief Technology and Information Officer, O2 Telefónica, Allemagne

« Deutsche Telekom se réjouit de l’annonce de l’AWS European Sovereign Cloud, qui souligne la volonté d’AWS d’innover en permanence pour les entreprises européennes. La solution d’AWS offrira un plus grand choix aux organisations lorsqu’elles migreront des applications réglementées vers le cloud, ainsi que des options supplémentaires pour répondre à l’évolution des exigences en matière de gouvernance numérique dans l’UE. »
— Greg Hyttenrauch, vice-président senior, Global Cloud Services chez T-Systems

« Aujourd’hui, nous sommes à l’aube d’une ère de transformation. Le lancement de l’AWS European Sovereign Cloud ne représente pas seulement une amélioration de l’infrastructure, c’est un changement de paradigme. Ce cadre sophistiqué permettra à Dedalus d’offrir des services inégalés pour le stockage sécurisé et efficace des données des patients dans le cloud AWS. Nous restons engagés, sans compromis, à servir notre clientèle européenne avec les meilleures solutions de leur catégorie, étayées par la confiance et l’excellence technologique ».
— Andrea Fiumicelli, Chairman de Dedalus

« Chez de Volksbank, nous croyons qu’il faut investir dans l’avenir des Pays-Bas. Pour y parvenir efficacement, nous devons avoir accès aux technologies les plus récentes afin d’innover en permanence et d’améliorer les services offerts à nos clients. C’est pourquoi nous nous réjouissons de l’annonce de l’AWS European Sovereign Cloud, qui permettra aux clients européens de démontrer facilement leur conformité aux réglementations en constante évolution tout en bénéficiant de l’étendue, de la sécurité et de la suite complète de services AWS ».
— Sebastiaan Kalshoven, Director IT/CTO, de Volksbank

« Eviden se réjouit du lancement de l’AWS European Sovereign Cloud. Celui-ci aidera les industries réglementées et le secteur public à satisfaire leurs exigences pour les applications les plus sensibles, grâce à un Cloud AWS doté de toutes ses fonctionnalités et entièrement opéré en Europe. En tant que partenaire AWS Premier Tier Services et leader des services de cybersécurité en Europe, Eviden a une longue expérience dans l’accompagnement de clients AWS pour formaliser et maîtriser leurs risques en termes de souveraineté. L’AWS European Sovereign Cloud permettra à Eviden de répondre à un plus grand nombre de besoins de ses clients en matière de souveraineté ».
— Yannick Tricaud, Head of Southern and Central Europe, Middle East and Africa, Eviden, Atos Group

« Nous saluons l’engagement d’AWS d’étendre son infrastructure avec un cloud européen indépendant. Les entreprises et les organisations du secteur public auront ainsi plus de choix pour répondre aux exigences de souveraineté numérique. Les services cloud sont essentiels pour la numérisation de l’administration publique. La “stratégie de l’administration allemande en matière de cloud” et la norme contractuelle “EVB-IT Cloud” ont constitué les bases de l’utilisation du cloud dans l’administration publique. Je suis très heureux de travailler avec AWS pour mettre en œuvre de manière pratique et collaborative la souveraineté, conformément à notre stratégie cloud. »
— Markus Richter, DSI du gouvernement fédéral allemand, ministère fédéral de l’Intérieur.

Nos engagements envers nos clients
Nous restons déterminés à donner à nos clients le contrôle et les choix nécessaires pour répondre à l’évolution de leurs besoins en matière de souveraineté numérique. Nous continuons d’innover en matière de fonctionnalités, de contrôles et de garanties de souveraineté au niveau mondial au sein d’AWS, tout en fournissant sans compromis et sans restriction la pleine puissance d’AWS.

Pour en savoir plus sur l’AWS European Sovereign Cloud et en apprendre davantage sur nos clients, consultez notre communiqué de presse et notre site web sur la souveraineté numérique européenne. Vous pouvez également obtenir plus d’informations en lisant l’AWS News Blog.


German

AWS Digital Sovereignty Pledge: Ankündigung der neuen, unabhängigen AWS European Sovereign Cloud

Amazon Web Services (AWS) war immer der Meinung, dass es wichtig ist, dass Kunden die volle Kontrolle über ihre Daten haben. Kunden sollen die Wahl haben, wie sie diese Daten in der Cloud absichern und verwalten.

Letztes Jahr haben wir unseren „AWS Digital Sovereignty Pledge“ vorgestellt: Unser Versprechen, allen AWS-Kunden ohne Kompromisse die fortschrittlichsten Steuerungsmöglichkeiten für Souveränitätsanforderungen und Funktionen in der Cloud anzubieten. Wir haben uns dazu verpflichtet, die sich wandelnden Anforderungen von Kunden und Aufsichtsbehörden zu verstehen und sie mit innovativen Angeboten zu adressieren. Wir bauen unser Angebot so aus, dass Kunden ihre Bedürfnisse an digitale Souveränität erfüllen können, ohne Kompromisse bei der Leistungsfähigkeit, Innovationskraft, Sicherheit und Skalierbarkeit der AWS-Cloud einzugehen.

AWS bietet die größte und umfassendste Cloud-Infrastruktur weltweit. Von Anfang an haben wir bei der AWS-Cloud einen „sovereign-by-design“-Ansatz verfolgt. Wir haben mit Hilfe von Kunden aus besonders regulierten Branchen, wie z.B. Finanzdienstleistungen, Gesundheit, Staat und Verwaltung, Funktionen und Steuerungsmöglichkeiten für Datenschutz und Datensicherheit entwickelt. Dieses Vorgehen hat zu Innovationen wie dem AWS Nitro System geführt, das heute die Grundlage für alle modernen Amazon Elastic Compute Cloud (Amazon EC2) Instanzen und Confidential Computing auf AWS bildet. AWS Nitro setzt auf eine starke physikalische und logische Sicherheitsabgrenzung und realisiert damit Zugriffsbeschränkungen, die unautorisierte Zugriffe auf Kundendaten in EC2 unmöglich machen – das gilt auch für AWS-Personal. Die NCC Group hat das Sicherheitsdesign von AWS Nitro im Rahmen einer unabhängigen Untersuchung in einem öffentlichen Bericht validiert.

Mit AWS hatten und haben Kunden stets die Kontrolle über den Speicherort ihrer Daten. Kunden, die spezifische europäische Vorgaben zum Ort der Datenverarbeitung einhalten müssen, haben die Wahl, ihre Daten in jeder unserer bestehenden acht AWS-Regionen (Frankfurt, Irland, London, Mailand, Paris, Stockholm, Spanien und Zürich) zu verarbeiten und sicher innerhalb Europas zu speichern. Europäische Kunden können ihre kritischen Workloads auf Basis des weltweit umfangreichsten und am weitesten verbreiteten Portfolios an Diensten betreiben – dazu zählen AI, Analytics, Compute, Datenbanken, Internet of Things (IoT), Machine Learning (ML), Mobile Services und Storage. Wir haben Innovationen in den Bereichen Datenverwaltung und Kontrolle realisiert, um unsere Kunden besser zu unterstützen. Zum Beispiel haben wir weitergehende Transparenz und zusätzliche Zusicherungen sowie neue Optionen für dedizierte Infrastruktur mit AWS Dedicated Local Zones angekündigt.

Ankündigung der AWS European Sovereign Cloud
Kunden aus dem öffentlichen Sektor und aus regulierten Industrien in Europa berichten uns immer wieder, mit welcher Komplexität und Dynamik sie im Bereich Souveränität konfrontiert werden. Wir hören von unseren Kunden, dass sie die Cloud nutzen möchten, aber gleichzeitig zusätzliche Anforderungen im Zusammenhang mit dem Ort der Datenverarbeitung, der betrieblichen Autonomie und der operativen Souveränität erfüllen müssen.

Kunden befürchten, dass sie sich zwischen der vollen Leistung von AWS und souveränen Cloud-Lösungen mit eingeschränkter Funktion entscheiden müssen. Wir haben intensiv mit Aufsichts- und Cybersicherheitsbehörden sowie Kunden aus Deutschland und anderen europäischen Ländern zusammengearbeitet, um zu verstehen, wie Souveränitätsbedürfnisse aufgrund verschiedener Faktoren wie Standort, Klassifikation der Workloads und Branche variieren können. Diese Faktoren können sich auf Workload-Anforderungen auswirken, z. B. darauf, wo sich diese Daten befinden dürfen, wer darauf zugreifen kann und welche Steuerungsmöglichkeiten erforderlich sind. AWS hat eine nachgewiesene Erfolgsbilanz insbesondere für innovative Lösungen zur Verarbeitung spezialisierter Workloads auf der ganzen Welt.

Wir freuen uns, heute die AWS European Sovereign Cloud ankündigen zu können: Eine neue, unabhängige Cloud für Europa. Sie soll Kunden aus dem öffentlichen Sektor und stark regulierten Industrien (z.B. Betreiber kritischer Infrastrukturen („KRITIS“)) dabei helfen, spezifische gesetzliche Anforderungen an den Ort der Datenverarbeitung und den Betrieb der Cloud zu erfüllen. Die AWS European Sovereign Cloud wird sich in der Europäischen Union (EU) befinden und dort betrieben. Sie wird physisch und logisch von den bestehenden AWS-Regionen getrennt sein und dieselbe Sicherheit, Verfügbarkeit und Leistung wie die bestehenden AWS-Regionen bieten. Die Kontrolle über den Betrieb und den Support der AWS European Sovereign Cloud wird ausschließlich von AWS-Personal ausgeübt, das in der EU ansässig ist und sich in der EU aufhält.

Wie schon bei den bestehenden AWS-Regionen, werden Kunden, welche die AWS European Sovereign Cloud nutzen, von dem gesamten AWS-Leistungsumfang profitieren. Dazu zählen die gewohnte Architektur, das umfangreiche Service-Portfolio und die APIs, die heute schon von Millionen von Kunden verwendet werden. Die AWS European Sovereign Cloud wird mit ihrer ersten AWS-Region in Deutschland starten und allen Kunden in Europa zur Verfügung stehen.

Die AWS European Sovereign Cloud wird “sovereign-by-design” sein und basiert auf mehr als zehn Jahren Erfahrung beim Betrieb mehrerer unabhängiger Clouds für besonders kritische und vertrauliche Workloads. Wie schon bei unseren bestehenden AWS-Regionen wird die AWS European Sovereign Cloud für Hochverfügbarkeit und Ausfallsicherheit ausgelegt sein und auf dem AWS Nitro System aufbauen, um die Vertraulichkeit und Integrität von Kundendaten sicherzustellen. Kunden haben die Kontrolle und Gewissheit darüber, dass AWS nicht ohne ihr Einverständnis auf Kundendaten zugreift oder sie für andere Zwecke verwendet. Die AWS European Sovereign Cloud ist so gestaltet, dass nicht nur alle Kundendaten, sondern auch alle Metadaten, die durch Kunden angelegt werden (z.B. Rollen, Zugriffsrechte, Labels für Ressourcen und Konfigurationsinformationen), innerhalb der EU verbleiben. Die AWS European Sovereign Cloud verfügt über unabhängige Systeme für das Rechnungswesen und zur Nutzungsmessung.

„Die neue AWS European Sovereign Cloud kann ein Game Changer für stark regulierte Geschäftsbereiche in der Europäischen Union sein. Als führender Telekommunikationsanbieter in Deutschland konzentriert sich unsere digitale Transformation auf Innovation, Skalierbarkeit, Agilität und Resilienz, um unseren Kunden die besten Dienste und die beste Qualität zu bieten. Dies wird nun von AWS mit dem höchsten Datenschutzniveau unter Einhaltung der regulatorischen Anforderungen vereint mit einem besonderen Schwerpunkt auf die Anforderungen an digitale Souveränität. Ich bin überzeugt, dass dieses neue Infrastrukturangebot das Potenzial hat, die Cloud-Adaption von europäischen Unternehmen voranzutreiben und die digitale Transformation regulierter Branchen in der EU zu beschleunigen.“
— Mallik Rao, Chief Technology and Information Officer bei O2 Telefónica in Deutschland

Sicherstellung operativer Autonomie
Die AWS European Sovereign Cloud bietet Kunden die Möglichkeit, strenge Anforderungen an Betriebsautonomie und den Ort der Datenverarbeitung zu erfüllen. Um eine Datenverarbeitung und operative Souveränität innerhalb der EU zu gewährleisten, wird die AWS European Sovereign Cloud-Infrastruktur unabhängig von bestehenden AWS-Regionen betrieben. Um den unabhängigen Betrieb der AWS European Sovereign Cloud zu gewährleisten, hat nur Personal, das in der EU ansässig ist und sich in der EU aufhält, die Kontrolle über den täglichen Betrieb. Dazu zählen der Zugang zu Rechenzentren, der technische Support und der Kundenservice.

Wir nutzen die Erkenntnisse aus unserer intensiven Zusammenarbeit mit Aufsichts- und Cybersicherheitsbehörden in Europa beim Aufbau der AWS European Sovereign Cloud, damit Kunden ihren Anforderungen an die Kontrolle über den Speicher- und Verarbeitungsort ihrer Daten, an die betriebliche Autonomie und an die operative Souveränität gerecht werden können. Wir freuen uns, mit dem Bundesamt für Sicherheit in der Informationstechnik (BSI) auch bei der Umsetzung der AWS European Sovereign Cloud zu kooperieren:

„Der Aufbau einer europäischen AWS-Cloud wird es für viele Behörden und Unternehmen mit hohen Anforderungen an die Datensicherheit und den Datenschutz deutlich leichter machen, die AWS-Services zu nutzen. Wir wissen um die Innovationskraft moderner Cloud-Dienste und wir wollen mithelfen, sie für Deutschland und Europa sicher verfügbar zu machen. Das BSI hat mit dem Kriterienkatalog C5 die Cybersicherheit im Cloud Computing bereits maßgeblich beeinflusst, und tatsächlich war AWS der erste Cloud Service Provider, der das C5-Testat des BSI erhalten hat. Insofern freuen wir uns sehr, den hiesigen Aufbau einer AWS-Cloud, die auch einen Beitrag zur europäischen Souveränität leisten wird, im Hinblick auf die Sicherheit konstruktiv zu begleiten.“
— Claudia Plattner, Präsidentin, deutsches Bundesamt für Sicherheit in der Informationstechnik (BSI)

Kontrolle ohne Kompromisse
Obwohl sie separat betrieben wird, bietet die AWS European Sovereign Cloud dieselbe branchenführende Architektur, die auf Sicherheit und Verfügbarkeit ausgelegt ist wie andere AWS-Regionen. Dazu gehören mehrere Verfügbarkeitszonen (Availability Zones, AZs) – eine Infrastruktur, die sich an verschiedenen voneinander getrennten geografischen Standorten befindet. Diese räumliche Trennung verringert signifikant das Risiko, dass ein Zwischenfall an einem einzelnen Standort den Geschäftsbetrieb des Kunden beeinträchtigt. Jede Verfügbarkeitszone besitzt eine autarke Stromversorgung und Kühlung und verfügt über redundante Netzwerkanbindungen, um ein Höchstmaß an Ausfallsicherheit zu gewährleisten. Zudem zeichnet sich jede Verfügbarkeitszone durch eine hohe physische Sicherheit aus. Alle AZs in der AWS European Sovereign Cloud werden über vollständig redundante, dedizierte Metro-Glasfaser miteinander verbunden und ermöglichen so eine Vernetzung mit hohem Durchsatz und niedriger Latenz zwischen den AZs. Der gesamte Datenverkehr zwischen AZs wird verschlüsselt. Für besonders strikte Anforderungen an die Trennung von Daten und den Ort der Datenverarbeitung innerhalb eines Landes bieten bestehende Angebote wie AWS Dedicated Local Zones oder AWS Outposts zusätzliche Optionen. Damit können Kunden die AWS European Sovereign Cloud Infrastruktur auf selbstgewählte Standorte erweitern.

Kontinuierliche AWS-Investitionen in Deutschland und Europa
Mit der AWS European Sovereign Cloud setzt AWS seine Investitionen in Deutschland und Europa fort. AWS entwickelt Innovationen, um europäische Werte und die digitale Zukunft in Deutschland und Europa zu unterstützen. Wir treiben die wirtschaftliche Entwicklung voran, indem wir in Infrastruktur, Arbeitsplätze und Ausbildung in ganz Europa investieren. Wir schaffen Tausende von hochwertigen Arbeitsplätzen und investieren Milliarden von Euro in europäische Volkswirtschaften. Amazon hat mehr als 100.000 dauerhafte Arbeitsplätze innerhalb der EU geschaffen.

„Die deutsche und europäische Wirtschaft befindet sich auf Digitalisierungskurs. Insbesondere der starke deutsche Mittelstand braucht eine souveräne Digitalinfrastruktur, die höchsten Anforderungen genügt, um auch weiterhin wettbewerbsfähig im globalen Markt zu sein. Für unsere digitale Unabhängigkeit ist wichtig, dass Rechenleistungen vor Ort in Deutschland entstehen und in unseren Digitalstandort investiert wird. Wir begrüßen daher die Ankündigung von AWS, die Cloud für Europa in Deutschland anzusiedeln.“
— Stefan Schnorr, Staatssekretär im deutschen Bundesministerium für Digitales und Verkehr

Einige der größten Entwicklungsteams von AWS sind in Deutschland und Europa angesiedelt, mit Standorten in Aachen, Berlin, Dresden, Tübingen und Dublin. Da wir uns verpflichtet fühlen, einen langfristigen Beitrag zur Entwicklung digitaler Kompetenzen zu leisten, wird AWS zusätzliches Personal vor Ort für die AWS European Sovereign Cloud einstellen und ausbilden.

Kunden, Partner und Aufsichtsbehörden begrüßen die AWS European Sovereign Cloud
In der EU nutzen Hunderttausende Organisationen aller Größen und Branchen AWS – von Start-ups über kleine und mittlere Unternehmen bis hin zu den größten Unternehmen, einschließlich Telekommunikationsunternehmen, Organisationen des öffentlichen Sektors, Bildungseinrichtungen und Regierungsbehörden. Europaweit unterstützen Organisationen die Einführung der AWS European Sovereign Cloud. Für Kunden wird die AWS European Sovereign Cloud neue Möglichkeiten im Cloudeinsatz eröffnen.

„Wir begrüßen das Engagement von AWS, seine Infrastruktur mit einer unabhängigen europäischen Cloud auszubauen. So erhalten Unternehmen und Organisationen der öffentlichen Hand mehr Auswahlmöglichkeiten bei der Erfüllung der Anforderungen an digitale Souveränität. Cloud-Services sind für die Digitalisierung der öffentlichen Verwaltung unerlässlich. Mit der Deutschen Verwaltungscloud-Strategie und dem Vertragsstandard EVB-IT Cloud wurden die Grundlagen für die Cloud-Nutzung in der Verwaltung geschaffen. Ich freue mich sehr, gemeinsam mit AWS Souveränität im Sinne unserer Cloud-Strategie praktisch und partnerschaftlich umzusetzen.”
— Dr. Markus Richter, Staatssekretär im deutschen Bundesministerium des Innern und für Heimat sowie Beauftragter der Bundesregierung für Informationstechnik (CIO des Bundes)

„Als Marktführer für Geschäftssoftware mit starken Wurzeln in Europa, arbeitet SAP seit langem im Interesse der Kunden mit AWS zusammen, um die digitale Transformation auf der ganzen Welt zu beschleunigen. Die AWS European Sovereign Cloud bietet weitere Möglichkeiten, unsere Beziehung in Europa zu stärken, indem wir die Möglichkeiten, die wir unseren Kunden beim Wechsel in die Cloud bieten, erweitern können. Wir schätzen die fortlaufende Zusammenarbeit mit AWS und die neuen Möglichkeiten, die diese Investition für unsere gemeinsamen Kunden in der gesamten Region mit sich bringen kann.“
— Peter Pluim, President – SAP Enterprise Cloud Services und SAP Sovereign Cloud Services

„Heute stehen wir an der Schwelle zu einer transformativen Ära. Die Einführung der AWS European Sovereign Cloud stellt nicht nur eine infrastrukturelle Erweiterung dar, sondern ist ein Paradigmenwechsel. Dieses hochentwickelte Framework wird Dedalus in die Lage versetzen, unvergleichliche Dienste für die sichere und effiziente Speicherung von Patientendaten in der AWS-Cloud anzubieten. Wir bleiben kompromisslos dem Ziel verpflichtet, unseren europäischen Kunden erstklassige Lösungen zu bieten, die auf Vertrauen und technologischer Exzellenz basieren.“
— Andrea Fiumicelli, Chairman bei Dedalus

„Die Deutsche Telekom begrüßt die Ankündigung der AWS European Sovereign Cloud, die das Engagement von AWS für fortwährende Innovationen für europäische Unternehmen unterstreicht. Diese AWS-Lösung wird Unternehmen eine noch größere Auswahl bieten, wenn sie kritische Workloads in die AWS-Cloud verlagern, und zusätzliche Optionen zur Erfüllung der sich entwickelnden Anforderungen an die digitale Governance in der EU.”
— Greg Hyttenrauch, Senior Vice President, Global Cloud Services bei T-Systems

„Wir begrüßen die AWS European Sovereign Cloud als neues Angebot innerhalb von AWS, um die komplexesten regulatorischen Anforderungen an die Datenresidenz und betrieblichen Erfordernisse in ganz Europa zu adressieren.“
— Bernhard Wagensommer, Vice President Prinect bei der Heidelberger Druckmaschinen AG

„Die AWS European Sovereign Cloud wird neue Branchenmaßstäbe setzen und sicherstellen, dass Finanzdienstleistungsunternehmen noch mehr Optionen innerhalb von AWS haben, um die wachsenden Anforderungen an die digitale Souveränität hinsichtlich der Datenresidenz und operativen Autonomie in der EU zu erfüllen.“
— Gerhard Koestler, Chief Information Officer bei Raisin

„Mit einem starken Fokus auf Datenschutz, Sicherheit und regulatorischer Compliance unterstreicht die AWS European Sovereign Cloud das Engagement von AWS, die höchsten Standards für die digitale Souveränität von Finanzdienstleistern zu fördern. Dieser zusätzliche robuste Rahmen ermöglicht es Unternehmen wie unserem, in einer sicheren Umgebung erfolgreich zu sein, in der Daten geschützt sind und die Einhaltung höchster Standards leichter denn je wird.“
— Andreas Schranzhofer, Chief Technology Officer bei Scalable Capital

„Die AWS European Sovereign Cloud ist ein wichtiges, zusätzliches Angebot von AWS, das hochregulierten Branchen, Organisationen der öffentlichen Hand und Regierungsbehörden in Deutschland weitere Optionen bietet, um strengste regulatorische Anforderungen an den Datenschutz in der Cloud noch einfacher umzusetzen. Als AWS Advanced Tier Services Partner, AWS Solution Provider und AWS Public Sector Partner beraten und unterstützen wir kritische Infrastrukturen (KRITIS) bei der erfolgreichen Implementierung. Das neue Angebot von AWS ist ein wichtiger Impuls für Innovationen und Digitalisierung in Deutschland.“
— Martin Wibbe, CEO bei Materna

„Als eines der größten deutschen IT-Unternehmen und strategischer AWS-Partner begrüßt msg ausdrücklich die Ankündigung der AWS European Sovereign Cloud. Für uns als Anbieter von Software as a Service (SaaS) und Consulting Advisor für Kunden mit spezifischen Datenschutzanforderungen ermöglicht die Schaffung einer eigenständigen europäischen Cloud, unseren Kunden dabei zu helfen, die Einhaltung sich entwickelnder Vorschriften leichter nachzuweisen. Diese spannende Ankündigung steht im Einklang mit unserer Cloud-Strategie. Wir betrachten dies als Chance, um unsere Partnerschaft mit AWS zu stärken und die Entwicklung der Cloud in Deutschland voranzutreiben.“
— Dr. Jürgen Zehetmaier, CEO von msg

Unsere Verpflichtung gegenüber unseren Kunden
Um Kunden bei der Erfüllung der sich wandelnden Souveränitätsanforderungen zu unterstützen, entwickelt AWS fortlaufend innovative Features, Kontrollen und Zusicherungen, ohne die Leistungsfähigkeit der AWS Cloud zu beeinträchtigen.

Weitere Informationen zur AWS European Sovereign Cloud und über unsere Kunden finden Sie in der Pressemitteilung und auf unserer Website zur europäischen digitalen Souveränität. Sie finden auch weitere Informationen im AWS News Blog.


Italian

AWS Digital Sovereignty Pledge: Annuncio di un nuovo cloud sovrano e indipendente in Europa

Fin dal primo giorno, abbiamo sempre creduto che fosse essenziale che tutti i clienti avessero il controllo sui propri dati e sulle scelte di come proteggerli e gestirli nel cloud. L’anno scorso abbiamo introdotto l’AWS Digital Sovereignty Pledge, il nostro impegno a offrire ai clienti AWS il set più avanzato di controlli e funzionalità di sovranità disponibili nel cloud. Ci siamo impegnati a lavorare per comprendere le esigenze e le necessità in costante evoluzione sia dei clienti che delle autorità di regolamentazione, e per adattarci e innovare rapidamente per soddisfarli. Ci siamo impegnati ad espandere le nostre funzionalità per consentire ai clienti di soddisfare le loro esigenze di sovranità digitale senza compromettere le prestazioni, l’innovazione, la sicurezza o la scalabilità del cloud AWS.

AWS offre l’infrastruttura cloud più grande e completa a livello globale. Il nostro approccio fin dall’inizio è stato quello di rendere il cloud AWS sovrano by design. Abbiamo creato funzionalità e controlli di protezione dei dati nel cloud AWS confrontandoci con i clienti che operano in settori quali i servizi finanziari e l’assistenza sanitaria, che sono in assoluto tra le organizzazioni più attente alla sicurezza e alla privacy dei dati. Ciò ha portato a innovazioni come AWS Nitro System, che alimenta tutte le nostre moderne istanze Amazon Elastic Compute Cloud (Amazon EC2) e fornisce un solido standard di sicurezza fisico e logico-infrastrutturale al fine di imporre restrizioni di accesso in modo che nessuno, compresi i dipendenti AWS, possa accedere ai dati dei clienti in esecuzione in EC2. Il design di sicurezza del sistema Nitro è stato inoltre convalidato in modo indipendente dal gruppo NCC in un report pubblico.

Con AWS i clienti hanno sempre avuto il controllo sulla posizione dei propri dati. I clienti che devono rispettare i requisiti europei di residenza dei dati possono scegliere di distribuire i propri dati in una delle otto regioni AWS esistenti (Irlanda, Francoforte, Londra, Parigi, Stoccolma, Milano, Zurigo e Spagna) per conservare i propri dati in modo sicuro in Europa. Per gestire i propri carichi di lavoro sensibili, i clienti europei possono sfruttare il portafoglio di servizi più ampio e completo, tra cui intelligenza artificiale, analisi ed elaborazione dati, database, Internet of Things (IoT), apprendimento automatico, servizi mobili e storage. Per supportare ulteriormente i clienti, abbiamo introdotto alcune innovazioni per offrire loro maggiore controllo e scelta sulla gestione dei dati. Ad esempio, abbiamo annunciato ulteriore trasparenza e garanzie e nuove opzioni di infrastruttura dedicate con AWS Dedicated Local Zones.

Annuncio AWS European Sovereign Cloud
Quando in Europa parliamo con i clienti del settore pubblico e delle industrie regolamentate, riceviamo continue conferme di come si trovano ad affrontare una incredibile complessità e mutevoli dinamiche di un panorama di sovranità in continua evoluzione. I clienti ci dicono che vogliono adottare il cloud, ma si trovano ad affrontare crescenti interventi normativi in relazione alla residenza dei dati, all’autonomia operativa ed alla resilienza europea. Abbiamo appreso che questi clienti temono di dover scegliere tra tutta la potenza di AWS e soluzioni cloud sovrane ma con funzionalità limitate. Abbiamo collaborato intensamente con le autorità di regolamentazione europee, le agenzie nazionali per la sicurezza informatica e i nostri clienti per comprendere come le esigenze di sovranità possano variare in base a molteplici fattori come la residenza, la sensibilità dei carichi di lavoro e il settore. Questi fattori possono influire sui requisiti del carico di lavoro, ad esempio dove possono risiedere i dati, chi può accedervi e i controlli necessari, ed AWS ha una comprovata esperienza di innovazione per affrontare carichi di lavoro specializzati in tutto il mondo.

Oggi siamo lieti di annunciare il nostro programma di lancio dell’AWS European Sovereign Cloud, un nuovo cloud indipendente per l’Europa, progettato per aiutare le organizzazioni del settore pubblico e i clienti in settori altamente regolamentati a soddisfare le loro esigenze di sovranità in continua evoluzione. Stiamo progettando il cloud sovrano europeo AWS in modo che sia separato e indipendente dalle nostre regioni esistenti, con un’infrastruttura situata interamente all’interno dell’Unione Europea (UE), con la stessa sicurezza, disponibilità e prestazioni che i nostri clienti ottengono dalle regioni esistenti oggi. Per garantire una maggiore resilienza operativa all’interno dell’UE, solo i residenti dell’UE che si trovano nell’UE avranno il controllo delle operazioni e il supporto per l’AWS European Sovereign Cloud. Come per tutte le regioni attuali, i clienti che utilizzeranno l’AWS European Sovereign Cloud trarranno vantaggio da tutta la potenza di AWS con la stessa architettura, un ampio portafoglio di servizi e API che milioni di clienti già utilizzano oggi. L’AWS European Sovereign Cloud lancerà la sua prima regione AWS in Germania, disponibile per tutti i clienti europei.

Il cloud sovrano europeo AWS sarà progettato per garantire l’indipendenza operativa e la resilienza all’interno dell’UE e sarà gestito e supportato solamente da dipendenti AWS che si trovano nell’UE e che vi risiedono. Questo design offrirà ai clienti una scelta aggiuntiva per soddisfare le diverse esigenze di residenza dei dati, autonomia operativa e resilienza.

Il cloud sovrano europeo AWS sarà sovrano by design e si baserà su oltre un decennio di esperienza nella gestione di più cloud indipendenti per carichi di lavoro critici e soggetti a restrizioni. Come le regioni esistenti, il cloud sovrano europeo AWS sarà progettato per garantire disponibilità e resilienza elevate e sarà alimentato da AWS Nitro System per contribuire a garantire la riservatezza e l’integrità dei dati dei clienti. I clienti avranno il controllo e la garanzia che AWS non potrà accedere ai loro dati né utilizzarli per alcuno scopo senza il loro consenso. AWS offre ai clienti i controlli di sovranità più rigorosi tra quelli offerti dai principali cloud provider. Per i clienti con esigenze avanzate di residenza dei dati, il cloud sovrano europeo AWS è progettato per andare oltre, e consentirà ai clienti di conservare tutti i metadati che creano (come le etichette dei dati, le categorie, i ruoli degli account e le configurazioni che utilizzano per eseguire AWS) nell’UE. L’AWS European Sovereign Cloud sarà inoltre realizzato con sistemi separati di fatturazione e misurazione dell’utilizzo a livello regionale.

Garantire autonomia operativa
L’AWS European Sovereign Cloud fornirà ai clienti la capacità di soddisfare rigorosi requisiti di autonomia operativa e residenza dei dati. Per offrire un maggiore controllo sulla residenza dei dati e sulla resilienza operativa all’interno dell’UE, l’infrastruttura AWS European Sovereign Cloud sarà gestita indipendentemente dalle regioni AWS esistenti. Per garantire il funzionamento indipendente dell’AWS European Sovereign Cloud, solo il personale residente nell’UE, situato nell’UE, avrà il controllo delle operazioni quotidiane, compreso l’accesso ai data center, il supporto tecnico e il servizio clienti.

Stiamo attingendo alle nostre profonde collaborazioni con le autorità di regolamentazione europee e le agenzie nazionali per la sicurezza informatica per applicarle nella realizzazione del cloud sovrano europeo AWS, di modo che i clienti che utilizzano AWS European Sovereign Cloud possano soddisfare i loro requisiti di residenza dei dati, di controllo, di autonomia operativa e resilienza. Ne è un esempio la stretta collaborazione con l’Ufficio federale tedesco per la sicurezza delle informazioni (BSI).

“Lo sviluppo di un cloud AWS europeo renderà molto più semplice l’utilizzo dei servizi AWS per molte organizzazioni del settore pubblico e per aziende con elevati requisiti di sicurezza e protezione dei dati. Siamo consapevoli della forza innovativa dei moderni servizi cloud e vogliamo contribuire a renderli disponibili in modo sicuro per la Germania e l’Europa. Il C5 (Cloud Computing Compliance Criteria Catalogue), sviluppato da BSI, ha plasmato in modo significativo gli standard cloud di sicurezza informatica e AWS è stato infatti il primo fornitore di servizi cloud a ricevere l’attestato C5 di BSI. In questo senso, siamo molto lieti di accompagnare in modo costruttivo lo sviluppo locale di un Cloud AWS, che contribuirà anche alla sovranità europea, in termini di sicurezza”.
— Claudia Plattner, Presidente dell’Ufficio federale tedesco per la sicurezza informatica (BSI)

Controllo senza compromessi
Sebbene separato, l’AWS European Sovereign Cloud offrirà la stessa architettura leader del settore creata per la sicurezza e la disponibilità delle altre regioni AWS. Ciò includerà multiple zone di disponibilità (AZ) e un’infrastruttura collocata in aree geografiche separate e distinte, con una distanza sufficiente a ridurre in modo significativo il rischio che un singolo evento influisca sulla continuità aziendale dei clienti. Ogni AZ disporrà di più livelli di alimentazione e rete ridondanti per fornire il massimo livello di resilienza. Tutte le AZ del cloud sovrano europeo AWS saranno interconnesse con fibra metropolitana dedicata e completamente ridondata, che fornirà reti ad alta velocità e bassa latenza tra le AZ. Tutto il traffico tra le AZ sarà crittografato. I clienti che necessitano di più opzioni per far fronte ai rigorosi requisiti di isolamento e residenza dei dati all’interno del Paese potranno sfruttare le zone locali dedicate o AWS Outposts per distribuire l’infrastruttura AWS European Sovereign Cloud nelle località da loro selezionate.

Continui investimenti di AWS in Europa
L’AWS European Sovereign Cloud è parte del continuo impegno ad investire in Europa di AWS. AWS si impegna a innovare per sostenere i valori europei e il futuro digitale dell’Europa. Promuoviamo lo sviluppo economico investendo in infrastrutture, posti di lavoro e competenze nelle comunità e nei paesi di tutta Europa. Stiamo creando migliaia di posti di lavoro di alta qualità e investendo miliardi di euro nelle economie europee. Amazon ha creato più di 100.000 posti di lavoro permanenti in tutta l’UE. Alcuni dei nostri team di sviluppo AWS più grandi si trovano in Europa, con centri di eccellenza a Dublino, Dresda e Berlino. Nell’ambito del nostro continuo impegno a contribuire allo sviluppo delle competenze digitali, assumeremo e svilupperemo ulteriore personale locale per gestire e supportare l’AWS European Sovereign Cloud.

Clienti, partner e autorità di regolamentazione accolgono con favore il cloud sovrano europeo AWS
Nell’UE, centinaia di migliaia di organizzazioni di tutte le dimensioni e in tutti i settori utilizzano AWS, dalle start-up alle piccole e medie imprese, alle grandi imprese, alle società di telecomunicazioni, alle organizzazioni del settore pubblico, agli istituti di istruzione e alle agenzie governative. Organizzazioni di tutta Europa sostengono l’introduzione dell’AWS European Sovereign Cloud.

“In qualità di leader di mercato nel software applicativo aziendale con forti radici in Europa, SAP collabora da tempo con AWS per conto dei clienti per accelerare la trasformazione digitale in tutto il mondo. L’AWS European Sovereign Cloud offre ulteriori opportunità per rafforzare le nostre relazioni in Europa consentendoci di ampliare le scelte che offriamo ai clienti mentre passano al cloud. Apprezziamo la partnership continua con AWS e le nuove possibilità che questo investimento può offrire ai nostri comuni clienti in tutta la regione.”
— Peter Pluim, Presidente, SAP Enterprise Cloud Services e SAP Sovereign Cloud Services

“Il nuovo AWS European Sovereign Cloud può rappresentare un punto di svolta per i segmenti di business altamente regolamentati nell’Unione Europea. In qualità di fornitore leader di telecomunicazioni in Germania, la nostra trasformazione digitale si concentra su innovazione, scalabilità, agilità e resilienza per fornire ai nostri clienti i migliori servizi e la migliore qualità. Ciò sarà ora abbinato ai più alti livelli di protezione dei dati e conformità normativa offerti da AWS e con un’attenzione particolare ai requisiti di sovranità digitale. Sono convinto che questa nuova offerta di infrastrutture abbia il potenziale per stimolare l’adozione del cloud da parte delle aziende europee e accelerare la trasformazione digitale delle industrie regolamentate in tutta l’UE”.
— Mallik Rao, Chief Technology & Information Officer (CTIO) presso O2 Telefónica in Germania

“Deutsche Telekom accoglie l’annuncio dell’AWS European Sovereign Cloud, che evidenzia l’impegno di AWS a un’innovazione continua nel mercato europeo. Questa soluzione AWS offrirà opportunità importanti per le aziende e le organizzazioni nella migrazione di carichi di lavoro regolamentati sul cloud e opzioni addizionali per soddisfare i requisiti di sovranità digitale europei in continua evoluzione”.
— Greg Hyttenrauch, Senior Vice President, Global Cloud Services presso T-Systems

“Oggi siamo al culmine di un’era di trasformazione. L’introduzione dell’AWS European Sovereign Cloud non rappresenta semplicemente un miglioramento infrastrutturale, è un cambio di paradigma. Questo sofisticato framework consentirà a Dedalus di offrire servizi senza precedenti per l’archiviazione dei dati dei pazienti in modo sicuro ed efficiente nel cloud AWS. Rimaniamo impegnati, senza compromessi, a servire la nostra clientela europea con soluzioni best-in-class sostenute da fiducia ed eccellenza tecnologica”.
— Andrea Fiumicelli, Presidente di Dedalus

“Noi di de Volksbank crediamo nell’investire per migliorare i Paesi Bassi. Ma perché questo avvenga in modo efficace, dobbiamo avere accesso alle tecnologie più recenti per poter innovare e migliorare continuamente i servizi per i nostri clienti. Per questo motivo, accogliamo con favore l’annuncio dello European Sovereign Cloud che consentirà ai clienti europei di rispettare facilmente la conformità alle normative in evoluzione, beneficiando comunque della scalabilità, della sicurezza e della suite completa dei servizi AWS”.
— Sebastiaan Kalshoven, Direttore IT/CTO della Volksbank

“Eviden accoglie con favore il lancio dell’AWS European Sovereign Cloud, che aiuterà le industrie regolamentate e il settore pubblico a soddisfare i requisiti dei loro carichi di lavoro sensibili con un cloud AWS completo e interamente gestito in Europa. In qualità di partner AWS Premier Tier Services e leader nei servizi di sicurezza informatica in Europa, Eviden ha una vasta esperienza nell’aiutare i clienti AWS a formalizzare e mitigare i rischi di sovranità. L’AWS European Sovereign Cloud consentirà a Eviden di soddisfare una gamma più ampia di esigenze di sovranità dei clienti”.
— Yannick Tricaud, Responsabile Europa meridionale e centrale, Medio Oriente e Africa, Eviden, Gruppo Atos

“Accogliamo con favore l’impegno di AWS di espandere la propria infrastruttura con un cloud europeo indipendente. Ciò offrirà alle imprese e alle organizzazioni del settore pubblico una scelta più ampia nel soddisfare i requisiti di sovranità digitale. I servizi cloud sono essenziali per la digitalizzazione della pubblica amministrazione. Con la “Strategia cloud per l’Amministrazione tedesca” e lo standard contrattuale “EVB-IT Cloud”, sono state gettate le basi per l’utilizzo del cloud nella pubblica amministrazione. Sono molto lieto di collaborare con AWS per implementare in modo pratico e collaborativo la sovranità in linea con la nostra strategia cloud.”
— Dr. Markus Richter, CIO del governo federale tedesco, Ministero federale degli interni

I nostri impegni nei confronti dei nostri clienti
Manteniamo il nostro impegno a fornire ai nostri clienti il controllo e la possibilità di scelta per contribuire a soddisfare le loro esigenze in continua evoluzione in materia di sovranità digitale. Continueremo a innovare le funzionalità, i controlli e le garanzie di sovranità del dato all’interno del cloud AWS globale e a fornirli senza compromessi sfruttando tutta la potenza di AWS.

Puoi scoprire di più sull’AWS European Sovereign Cloud nel Comunicato Stampa o sul nostro sito European Digital Sovereignty. Puoi anche ottenere ulteriori informazioni nel blog AWS News.


Spanish

Compromiso de Soberanía Digital de AWS: anuncio de una nueva nube soberana independiente en la Unión Europea

Desde el primer día, en Amazon Web Services (AWS) siempre hemos creído que es esencial que los clientes tengan el control sobre sus datos y capacidad para proteger y gestionar los mismos en la nube. El año pasado, anunciamos el Compromiso de Soberanía Digital de AWS, nuestra garantía de que ofrecemos a todos los clientes de AWS los controles y funcionalidades de soberanía más avanzados que estén disponibles en la nube. Nos comprometimos a trabajar para comprender las necesidades y los requisitos cambiantes tanto de los clientes como de los reguladores, y a adaptarnos e innovar rápidamente para satisfacerlos. Asimismo, nos comprometimos a ampliar nuestras capacidades para permitir a los clientes satisfacer sus necesidades de soberanía digital sin reducir el rendimiento, la innovación, la seguridad o la escalabilidad de la nube de AWS.

AWS ofrece la infraestructura de nube más amplia y completa del mundo. Nuestro enfoque desde el principio ha sido hacer que AWS sea una nube soberana por diseño. Creamos funcionalidades y controles de protección de datos en la nube de AWS teniendo en cuenta las aportaciones de clientes de sectores como los servicios financieros, sanidad y entidades gubernamentales, que se encuentran entre los más preocupados por la seguridad y la privacidad de los datos en el mundo. Esto ha dado lugar a innovaciones como el sistema Nitro de AWS, que impulsa todas nuestras instancias de Amazon Elastic Compute Cloud (Amazon EC2) y proporciona un límite de seguridad físico y lógico sólido para imponer restricciones de acceso, de modo que nadie, incluidos los empleados de AWS, pueda acceder a los datos de los clientes que se ejecutan en Amazon EC2. El diseño de seguridad del sistema Nitro también ha sido validado de forma independiente por el Grupo NCC en un informe público.

Con AWS, los clientes siempre han tenido el control sobre la ubicación de sus datos. En Europa, los clientes que deben cumplir con los requisitos de residencia de datos europeos tienen la opción de implementar sus datos en cualquiera de las ocho Regiones de AWS existentes (Irlanda, Frankfurt, Londres, París, Estocolmo, Milán, Zúrich y España) para mantener sus datos de forma segura en Europa. Para ejecutar sus cargas de trabajo sensibles, los clientes europeos pueden aprovechar la cartera de servicios más amplia y completa, que incluye inteligencia artificial, análisis, computación, bases de datos, Internet de las cosas (IoT), aprendizaje automático, servicios móviles y almacenamiento. Para apoyar aún más a los clientes, hemos innovado ofreciendo más control y opciones sobre sus datos. Por ejemplo, anunciamos una mayor transparencia y garantías, y nuevas opciones de infraestructura de uso exclusivo con Zonas Locales Dedicadas de AWS.

Anunciamos AWS European Sovereign Cloud
Cuando hablamos con clientes del sector público y de sectores regulados en Europa, nos comparten cómo se enfrentan a una gran complejidad y a una dinámica cambiante en el panorama de la soberanía, que está en constante evolución. Los clientes nos dicen que quieren adoptar la nube, pero se enfrentan a un creciente escrutinio regulatorio en relación con la ubicación de los datos, la autonomía operativa europea y la resiliencia. Sabemos que a estos clientes les preocupa tener que elegir entre toda la potencia de AWS o soluciones de nube soberana con funciones limitadas. Hemos mantenido conversaciones muy provechosas con los reguladores europeos, las autoridades nacionales de ciberseguridad y los clientes para entender cómo las necesidades de soberanía de los clientes pueden variar en función de diferentes factores, como la ubicación, la sensibilidad de las cargas de trabajo y el sector. Estos factores pueden impactar en los requisitos aplicables a sus cargas de trabajo, como dónde pueden residir sus datos, quién puede acceder a ellos y los controles necesarios. AWS tiene un historial comprobado de innovación para abordar cargas de trabajo sensibles o especiales en todo el mundo.

Hoy nos complace anunciar nuestros planes de lanzar la Nube Soberana Europea de AWS, una nueva nube independiente para la Unión Europea, diseñada para ayudar a las organizaciones del sector público y a los clientes de sectores altamente regulados a satisfacer sus necesidades de soberanía en constante evolución. Estamos diseñando la Nube Soberana Europea de AWS para que sea independiente y separada de nuestras Regiones actuales, con una infraestructura ubicada íntegramente dentro de la Unión Europea y con la misma seguridad, disponibilidad y rendimiento que nuestros clientes obtienen en las Regiones actuales. Para ofrecer una mayor resiliencia operativa dentro de la UE, solo los residentes de la UE que se encuentren en la UE tendrán el control de las operaciones y el soporte de la Nube Soberana Europea de AWS. Como ocurre con todas las Regiones actuales, los clientes que utilicen la Nube Soberana Europea de AWS se beneficiarán de toda la potencia de AWS con la misma arquitectura conocida, una amplia cartera de servicios y las APIs que utilizan millones de clientes en la actualidad. La Nube Soberana Europea de AWS lanzará su primera Región de AWS en Alemania, disponible para todos los clientes en Europa.

La Nube Soberana Europea de AWS será soberana por diseño y se basará en más de una década de experiencia en la gestión de múltiples nubes independientes para las cargas de trabajo más críticas y restringidas. Al igual que las Regiones existentes, la Nube Soberana Europea de AWS se diseñará para ofrecer una alta disponibilidad y resiliencia, y contará con la tecnología del sistema Nitro de AWS, a fin de garantizar la confidencialidad e integridad de los datos de los clientes. Los clientes tendrán el control y la seguridad de que AWS no accederá a los datos de los clientes ni los utilizará para ningún propósito sin su consentimiento. AWS ofrece a los clientes los controles de soberanía más estrictos entre los principales proveedores de servicios en la nube. Para los clientes con necesidades de residencia de datos mejoradas, la Nube Soberana Europea de AWS está diseñada para ir más allá y permitirá a los clientes conservar todos los metadatos que crean (como los roles de cuenta, los permisos, las etiquetas de recursos y las configuraciones que utilizan para ejecutar AWS) dentro de la UE. La Nube Soberana Europea de AWS también se construirá con sistemas independientes de facturación y medición del uso dentro de la Región.

Ofreciendo autonomía operativa
La Nube Soberana Europea de AWS proporcionará a los clientes la capacidad de cumplir con los estrictos requisitos de autonomía operativa y residencia de datos que sean de aplicación a cada cliente. Para proporcionar una mejor residencia de los datos y resiliencia operativa en la UE, la infraestructura de la Nube Soberana Europea de AWS se gestionará de forma independiente del resto de las Regiones de AWS existentes. Para garantizar el funcionamiento independiente de la Nube Soberana Europea de AWS, solo el personal residente en la UE y ubicado en la UE tendrá el control de las operaciones diarias, incluido el acceso a los centros de datos, el soporte técnico y el servicio de atención al cliente.

Estamos aprendiendo de nuestras intensas conversaciones con los reguladores europeos y las autoridades nacionales de ciberseguridad, aplicando estos aprendizajes a medida que construimos la Nube Soberana Europea de AWS, de modo que los clientes que la utilicen puedan cumplir sus requisitos de residencia, autonomía operativa y resiliencia de los datos. Por ejemplo, esperamos continuar colaborando con la Oficina Federal de Seguridad de la Información (BSI) de Alemania.

«El desarrollo de una nube europea de AWS facilitará mucho el uso de los servicios de AWS a muchas organizaciones y empresas del sector público con altos requisitos de seguridad y protección de datos. Somos conscientes del poder innovador de los servicios en la nube modernos y queremos contribuir a que estén disponibles de forma segura en Alemania y Europa. El C5 (Cloud Computing Compliance Criteria Catalogue), desarrollado por la BSI, ha influido considerablemente en los estándares de ciberseguridad en la nube y, de hecho, AWS fue el primer proveedor de servicios en la nube en recibir el certificado C5 de la BSI. En este sentido, nos complace acompañar de manera constructiva el desarrollo local de una nube de AWS, que también contribuirá a la soberanía europea en términos de seguridad».
— Claudia Plattner, presidenta de la Oficina Federal Alemana de Seguridad de la Información (BSI)

Control sin concesiones
A pesar de ser independiente, la Nube Soberana Europea de AWS ofrecerá la misma arquitectura líder en el sector que otras Regiones de AWS, creada para garantizar la seguridad y la disponibilidad. Esto incluirá varias Zonas de Disponibilidad, una infraestructura distribuida en ubicaciones geográficas separadas y distintas, con una distancia suficiente para reducir el riesgo de que un incidente afecte a la continuidad del negocio de los clientes. Cada Zona de Disponibilidad tendrá varias fuentes de alimentación eléctrica y redes redundantes para ofrecer el máximo nivel de resiliencia. Todas las Zonas de Disponibilidad de la Nube Soberana Europea de AWS estarán interconectadas mediante fibra de uso exclusivo y totalmente redundante, lo que proporcionará una red de alto rendimiento y baja latencia entre las Zonas de Disponibilidad. Todo el tráfico entre las Zonas de Disponibilidad se encriptará. Los clientes que necesiten más opciones para abordar estrictas necesidades de aislamiento y residencia de datos en el país podrán utilizar las Zonas Locales Dedicadas o AWS Outposts para implementar la infraestructura de Nube Soberana Europea de AWS en las ubicaciones que elijan.

Inversión continua de AWS en Europa
La Nube Soberana Europea de AWS representa una inversión continua de AWS en la UE. AWS se compromete a innovar para respaldar los valores y el futuro digital de la Unión Europea. Impulsamos el desarrollo económico mediante la inversión en infraestructura, empleos y habilidades en comunidades y países de toda Europa. Estamos creando miles de puestos de trabajo de alta calidad e invirtiendo miles de millones de euros en las economías europeas. Amazon ha creado más de 100 000 puestos de trabajo permanentes en toda la UE. Algunos de nuestros equipos de desarrollo de AWS más importantes se encuentran en Europa, con centros clave en Dublín, Dresde y Berlín. Como parte de nuestro compromiso continuo de contribuir al desarrollo de las habilidades digitales, contrataremos y capacitaremos a más personal local para gestionar y apoyar la Nube Soberana Europea de AWS.

Los clientes, socios y reguladores dan la bienvenida a la Nube Soberana Europea de AWS
En la UE, cientos de miles de organizaciones de todos los tamaños y sectores utilizan AWS, desde startups hasta PYMEs y grandes compañías, incluyendo empresas de telecomunicaciones, organizaciones del sector público, instituciones educativas, ONGs y agencias gubernamentales. Organizaciones de toda Europa apoyan la introducción de la Nube Soberana Europea de AWS.

“As the market leader in enterprise application software with strong roots in Europe, SAP has long collaborated with AWS on behalf of customers to accelerate digital transformation around the world. The AWS European Sovereign Cloud provides new opportunities to strengthen our relationship in Europe, as it allows us to expand the choices we offer customers as they move to the cloud. We value our existing partnership with AWS and the new possibilities this investment can offer our mutual customers across the region.”
– Peter Pluim, President, SAP Enterprise Cloud Services and SAP Sovereign Cloud Services

“The new AWS European Sovereign Cloud can be a game changer for highly regulated business segments in the European Union. As a leading telecommunications provider in Germany, our digital transformation focuses on innovation, scalability, agility, and resilience to provide our customers with the best services and quality. This will now be combined with the highest levels of data protection and regulatory compliance that AWS delivers, with a particular focus on digital sovereignty requirements. I am convinced that this new infrastructure offering has the potential to boost the cloud adaptation of European companies and accelerate the digital transformation of regulated industries across the EU.”
— Mallik Rao, Chief Technology and Information Officer at O2 Telefónica in Germany

“Today we stand on the cusp of a transformative era. The introduction of the AWS European Sovereign Cloud does not just represent an infrastructure improvement; it is a paradigm shift. This sophisticated framework will enable Dedalus to deliver unparalleled services for storing patient data securely and efficiently in the AWS cloud. We remain committed, without compromise, to serving our European clientele with best-in-class solutions underpinned by trust and technological excellence.”
— Andrea Fiumicelli, Chairman at Dedalus

“At de Volksbank, we believe in investing in a better Netherlands. To do this effectively, we need access to the latest technologies so that we can continually innovate and improve services for our customers. That is why we welcome the announcement of the European Sovereign Cloud, which will allow European customers to easily demonstrate compliance with evolving regulations while still benefiting from the scale, security, and full range of AWS services.”
— Sebastian Kalshoven, Director of IT and CTO, de Volksbank

“Eviden welcomes the launch of the AWS European Sovereign Cloud. It will help regulated industries and the public sector address the requirements of their sensitive workloads with a fully featured AWS cloud operating exclusively in Europe. As an AWS Premier Tier Services Partner and a leader in cybersecurity services in Europe, Eviden has an extensive track record of helping AWS customers formalize and mitigate their sovereignty risks. The AWS European Sovereign Cloud will allow Eviden to address a broader range of customers' sovereignty needs.”
— Yannick Tricaud, Head of Central and Southern Europe, Middle East, and Africa, Eviden, Atos Group

Our commitments to our customers
We remain committed to giving our customers the control and the choices that help them meet their evolving digital sovereignty needs. We will continue to innovate on sovereignty features, controls, and assurances globally, and deliver this without compromising the full power of AWS.

You can learn more about the AWS European Sovereign Cloud and about our customers in our press release and on the European Digital Sovereignty website. You can also find more information on the AWS News Blog.

AWS Security Profile: Liam Wadman, Senior Solutions Architect, AWS Identity

Post Syndicated from Maddie Bacon original https://aws.amazon.com/blogs/security/aws-security-profile-liam-wadman-sr-solutions-architect-aws-identity/

In the AWS Security Profile series, I interview some of the humans who work in AWS Security and help keep our customers safe and secure. In this profile, I interviewed Liam Wadman, Senior Solutions Architect for AWS Identity.

Pictured: Liam making quick informed decisions about risk and reward



How long have you been at AWS and what do you do in your current role?

My first day was 1607328000 — for those who don’t speak fluent UTC, that’s December 2020. I’m a member of the Identity Solutions team. Our mission is to make it simpler for customers to implement access controls that protect their data in a straightforward and consistent manner across AWS services.

I spend a lot of time talking with security, identity, and cloud teams at some of our largest and most complex customers, understanding their problems, and working with teams across AWS to make sure that we’re building solutions that meet their diverse security requirements.

I’m a big fan of working with customers and fellow Amazonians on threat modeling and helping them make informed decisions about risks and the controls they put in place. It’s such a productive exercise because many people don’t have that clear model about what they’re protecting, and what they’re protecting it from.

When I work with AWS service teams, I advocate for making services that are simple to secure and simple for customers to configure. It’s not enough to offer only good security controls; the service should be simple to understand and straightforward to apply to meet customer expectations.
 

How did you get started in security? What about it piqued your interest?

I got started in security at a very young age: by circumventing network controls at my high school so that I could play Flash games circa 2004. Ever since then, I’ve had a passion for deeply understanding a system’s rules and how they can be bent or broken. I’ve been lucky enough to have a diverse set of experiences throughout my career, including working in a network operations center and a security operations center, Linux and Windows server administration, telephony, investigations, content delivery, perimeter security, and security architecture. I think having such a broad base of experience allows me to empathize with all the different people who are AWS customers on a day-to-day basis.

As I progressed through my career, I became very interested in the psychology of security and the mindsets of defenders, unauthorized users, and operators of computer systems. Security is about so much more than technology—it starts with people and processes.
 

How do you explain your job to non-technical friends and family?

I get to practice this question a lot! Very few of my family and friends work in tech.

I always start with something relatable to the person. I start with a website, mobile app, or product that they use, tell the story of how it uses AWS, then tie that in around how my team works to support many of the products they use in their everyday lives. You don’t have to look far into our customer success stories or AWS re:Invent presentations to see a product or company that’s meaningful to almost anyone you’d talk to.

I got to practice this very recently because the software used by my personal trainer is hosted on AWS. So when she asked what I actually do for a living, I was ready for her.
 

In your opinion, what’s the coolest thing happening in identity right now?

You left this question wide open, so I’m going to give you more than one answer.

First, outside of AWS, it’s the rise of ubiquitous, easy-to-use personal identity technology. I’m talking about products such as password managers, sign-in with Google or Apple, and passkeys. I’m excited to see the industry is finally offering services to consumers at no extra cost that you don’t need to be an expert to use and that will work on almost any device you sign in to. Everyday people can benefit from their use, and I have successfully converted many of the people I care about.

At AWS, it’s the work that we’re doing to enable data perimeters and provable security. We hear quite regularly from customers that data perimeters are super important to them, and they want to see us do more in that space and keep refining that journey. I’m all too happy to oblige. Provable security, while identity adjacent, is about getting real answers to questions such as “Can this resource be accessed publicly?” It’s making it simple for customers who don’t want to spend the time or money building the operational expertise to answer tough questions, and I think that’s incredible.
 

You presented at AWS re:Inforce 2023. What was your session about and what do you hope attendees took away from it?

My session was IAM336: Best practices for delegating access on IAM. I initially delivered this session at re:Inforce 2022, where customers gave it the highest overall rating for an identity session, so we brought it back for 2023! 

The talk dives deep into some AWS Identity and Access Management (IAM) primitives and provides a lot of candor on what we feel are best practices based on many of the real-world engagements I’ve had with customers. The top thing that I hope attendees learned is how they can safely empower their developers to have some self service and autonomy when working with IAM and help transform central teams from blockers to enablers.

I’m also presenting at re:Invent 2023 in November. I’ll be doing a chalk talk called Best practices for setting up AWS Organizations policies. We’re targeting it towards a more general audience, not just customers whose primary jobs are AWS security or identity. I’m excited about this presentation because I usually talk to a lot of customers who have very mature security and identity practices, and this is a great chance to get feedback from customers who do not.

I’d like to thank all the customers who attended the sessions over the years — the best part of AWS events is the customer interactions and fantastic discussions that we have.
 

Is there anything you wish customers would ask about more often?

I wish more customers would frame their problems within a threat model. Many customer engagements start with a specific problem, but it isn’t in the context of the risk this poses to their business, and often focuses too much on specific technical controls for very specific issues, rather than an outcome that they’re trying to arrive at or a risk that they’re trying to mitigate. I like to take a step back and work with the customer to frame the problem that they’re talking about in a bigger picture, then have a more productive conversation around how we can mitigate these risks and other considerations that they may not have thought of.
 

Where do you see the identity space heading in the future?

I think the industry is really getting ready for an identity renaissance as we start shifting towards more modern and Zero Trust architectures. I’m really excited to start seeing adoption of technologies such as token exchange to help applications avoid impersonating users to downstream systems, or mechanisms such as proof of possession to provide scalable ways to bind a given credential to a system that it’s intended to be used from.

On the AWS Identity side: More controls. Simpler. Scalable. Provable.
 

What are you most proud of in your career?

Getting involved with speaking at AWS: presenting at summits, re:Inforce, and re:Invent. It’s something I never would have seen myself doing before. I grew up with a pretty bad speech impediment that I’m always working against.

I think my proudest moment in particular is when I had customers come to my re:Invent session because they saw me at AWS Summits earlier in the year and liked what I did there. I get a little emotional thinking about it.

Being a speaker also allowed me to go to Disneyland for the first time last year before the Anaheim Summit, and that would have made 5-year-old Liam proud.
 

If you had to pick a career outside of tech, what would you want to do?

I think I’d certainly be involved in something in forestry, resource management, or conservation. I spend most of my free time in the forests of British Columbia. I’m a big believer in shinrin-yoku, and I believe in being a good steward of the land. We’ve only got one earth.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Maddie Bacon

Maddie (she/her) is a technical writer for Amazon Security with a passion for creating meaningful content that focuses on the human side of security and encourages a security-first mindset. She previously worked as a reporter and editor, and has a BA in mathematics. In her spare time, she enjoys reading, traveling, and staunchly defending the Oxford comma.

Liam Wadman


Liam is a Senior Solutions Architect with the Identity Solutions team. When he’s not building exciting solutions on AWS or helping customers, he’s often found in the hills of British Columbia on his mountain bike. Liam points out that you cannot spell LIAM without IAM.

Blue/Green deployments using AWS CDK Pipelines and AWS CodeDeploy

Post Syndicated from Luiz Decaro original https://aws.amazon.com/blogs/devops/blue-green-deployments-using-aws-cdk-pipelines-and-aws-codedeploy/

Customers often ask for help with implementing Blue/Green deployments to Amazon Elastic Container Service (Amazon ECS) using AWS CodeDeploy. Their use cases usually involve cross-Region and cross-account deployment scenarios. These requirements are challenging enough on their own, but in addition to those, there are specific design decisions that need to be considered when using CodeDeploy. These include how to configure CodeDeploy, when and how to create CodeDeploy resources (such as Application and Deployment Group), and how to write code that can be used to deploy to any combination of account and Region.

Today, I will discuss those design decisions in detail and how to use CDK Pipelines to implement a self-mutating pipeline that deploys services to Amazon ECS in cross-account and cross-Region scenarios. At the end of this blog post, I also introduce a demo application, available in Java, that follows best practices for developing and deploying cloud infrastructure using AWS Cloud Development Kit (AWS CDK).

The Pipeline

CDK Pipelines is an opinionated construct library used for building pipelines with different deployment engines. It abstracts implementation details that developers or infrastructure engineers need to solve when implementing a cross-Region or cross-account pipeline. For example, in cross-Region scenarios, AWS CloudFormation needs artifacts to be replicated to the target Region. For that reason, AWS Key Management Service (AWS KMS) keys, an Amazon Simple Storage Service (Amazon S3) bucket, and policies need to be created for the secondary Region. This enables artifacts to be moved from one Region to another. In cross-account scenarios, CodeDeploy requires a cross-account role with access to the KMS key used to encrypt configuration files. This is the sort of detail that our customers want to avoid dealing with manually.

AWS CodeDeploy is a deployment service that automates application deployment across different scenarios. It deploys to Amazon EC2 instances, on-premises instances, serverless Lambda functions, or Amazon ECS services. It integrates with AWS Identity and Access Management (IAM) to implement access control over who can deploy or redeploy versions of an application. In the Blue/Green deployment type, it is possible to automate the rollback of a deployment using Amazon CloudWatch alarms.

CDK Pipelines was designed to automate AWS CloudFormation deployments. Using AWS CDK, these CloudFormation deployments may include deploying application software to instances or containers. However, some customers prefer using CodeDeploy to deploy application software. In this blog post, CDK Pipelines will deploy using CodeDeploy instead of CloudFormation.

A pipeline built with CDK Pipelines that deploys to Amazon ECS using AWS CodeDeploy. It contains at least 5 stages: Source, Build, UpdatePipeline, Assets, and at least one deployment stage.

Design Considerations

In this post, I’m considering the use of CDK Pipelines to implement different use cases for deploying a service to any combination of accounts (single-account & cross-account) and regions (single-Region & cross-Region) using CodeDeploy. More specifically, there are four problems that need to be solved: how to configure CodeDeploy, how to encapsulate the pipeline in a toolchain, when and where to create the CodeDeploy application and deployment group, and which roles and permissions CDK Pipelines needs for cross-account deployments. The following sections address each of these in turn.

CodeDeploy Configuration

The most popular options for implementing a Blue/Green deployment type with CodeDeploy are CloudFormation Hooks or a CodeDeploy construct. I decided to operate CodeDeploy using its configuration files. This is a flexible design that doesn’t rely on custom resources, which is another technique customers have used to solve this problem. On each run, the pipeline pushes a container image to a repository in Amazon Elastic Container Registry (Amazon ECR) and creates a tag. CodeDeploy needs that information to deploy the container.

I recommend creating a pipeline action to scan the AWS CDK cloud assembly and retrieve the repository and tag information. The same action can create the CodeDeploy configuration files. Three configuration files are required to configure CodeDeploy: appspec.yaml, taskdef.json and imageDetail.json. This pipeline action should be executed before the CodeDeploy deployment action. I recommend creating template files for appspec.yaml and taskdef.json. The following script can be used to implement the pipeline action:

#!/bin/sh
##
#
# Action Configure AWS CodeDeploy
# It customizes the files template-appspec.yaml and template-taskdef.json to the environment
#
# $1 Account     = The target account id
# $2 Region      = Name of the region (us-east-1, us-east-2)
# $3 AppName     = Name of the application
# $4 StageName   = Name of the stage
# $5 PipelineId  = Id of the pipeline
# $6 ServiceName = Name of the service. It is used to name the task execution role and the task definition
#
# Primary output directory is codedeploy/. All 3 files created (appspec.yaml, imageDetail.json and
# taskdef.json) will be located inside the codedeploy/ directory
#
##
Account=$1
Region=$2
AppName=$3
StageName=$4
PipelineId=$5
ServiceName=$6

# Extract the ECR repository name and image tag from the CDK cloud assembly assets manifest
repo_name=$(cat assembly*$PipelineId-$StageName/*.assets.json | jq -r '.dockerImages[] | .destinations[] | .repositoryName' | head -1)
tag_name=$(cat assembly*$PipelineId-$StageName/*.assets.json | jq -r '.dockerImages | to_entries[0].key')
echo ${repo_name}
echo ${tag_name}

# Write the image URI consumed by the CodeDeploy deployment action
printf '{"ImageURI":"%s"}' "$Account.dkr.ecr.$Region.amazonaws.com/${repo_name}:${tag_name}" > codedeploy/imageDetail.json

# Render the appspec and task definition from their templates
sed 's#APPLICATION#'$AppName'#g' codedeploy/template-appspec.yaml > codedeploy/appspec.yaml
sed 's#APPLICATION#'$AppName'#g' codedeploy/template-taskdef.json | sed 's#TASK_EXEC_ROLE#arn:aws:iam::'$Account':role/'$ServiceName'#g' | sed 's#fargate-task-definition#'$ServiceName'#g' > codedeploy/taskdef.json

cat codedeploy/appspec.yaml
cat codedeploy/taskdef.json
cat codedeploy/imageDetail.json

Using a Toolchain

A good strategy is to encapsulate the pipeline inside a Toolchain to abstract how to deploy to different accounts and regions. This helps decouple clients from details such as how the pipeline is created, how CodeDeploy is configured, and how cross-account and cross-Region deployments are implemented. To create the pipeline, deploy a Toolchain stack. Out of the box, it allows different environments to be added as needed. Depending on the requirements, the pipeline may be customized to reflect the different stages or waves that different components might require. For more information, please refer to our best practices on how to automate safe, hands-off deployments and its reference implementation.

In detail, the Toolchain stack follows the builder pattern used throughout the CDK for Java. This is a convenience that allows complex objects to be created using a single statement:

 Toolchain.Builder.create(app, Constants.APP_NAME+"Toolchain")
        .stackProperties(StackProps.builder()
                .env(Environment.builder()
                        .account(Demo.TOOLCHAIN_ACCOUNT)
                        .region(Demo.TOOLCHAIN_REGION)
                        .build())
                .build())
        .setGitRepo(Demo.CODECOMMIT_REPO)
        .setGitBranch(Demo.CODECOMMIT_BRANCH)
        .addStage(
                "UAT",
                EcsDeploymentConfig.CANARY_10_PERCENT_5_MINUTES,
                Environment.builder()
                        .account(Demo.SERVICE_ACCOUNT)
                        .region(Demo.SERVICE_REGION)
                        .build())                                                                                                             
        .build();

In the statement above, the continuous deployment pipeline is created in the TOOLCHAIN_ACCOUNT and TOOLCHAIN_REGION. It implements a stage that builds the source code and creates the Java archive (JAR) using Apache Maven.  The pipeline then creates a Docker image containing the JAR file.

The UAT stage will deploy the service to the SERVICE_ACCOUNT and SERVICE_REGION using the deployment configuration CANARY_10_PERCENT_5_MINUTES. This means 10 percent of the traffic is shifted in the first increment and the remaining 90 percent is deployed 5 minutes later.

To create additional deployment stages, you need a stage name, a CodeDeploy deployment configuration and an environment where it should deploy the service. As mentioned, the pipeline is, by default, a self-mutating pipeline. For example, to add a Prod stage, update the code that creates the Toolchain object and submit this change to the code repository. The pipeline will run and update itself adding a Prod stage after the UAT stage. Next, I show in detail the statement used to add a new Prod stage. The new stage deploys to the same account and Region as in the UAT environment:

... 
        .addStage(
                "Prod",
                EcsDeploymentConfig.CANARY_10_PERCENT_5_MINUTES,
                Environment.builder()
                        .account(Demo.SERVICE_ACCOUNT)
                        .region(Demo.SERVICE_REGION)
                        .build())                                                                                                                                      
        .build();

In the statement above, the Prod stage will deploy new versions of the service using the CodeDeploy deployment configuration CANARY_10_PERCENT_5_MINUTES: 10 percent of the traffic is shifted in the first increment, and the remaining traffic is shifted to the new version of the application after 5 minutes. Please refer to the Organizing Your AWS Environment Using Multiple Accounts whitepaper for best practices on how to isolate and manage your business applications.

Some customers might find this approach interesting and decide to provide this as an abstraction to their application development teams. In this case, I advise creating a construct that builds such a pipeline. Using a construct would allow for further customization. Examples are stages that promote quality assurance or deploy the service in a disaster recovery scenario.

The implementation creates a stack for the toolchain and another stack for each deployment stage. As an example, consider a toolchain created with a single deployment stage named UAT. After running successfully, the DemoToolchain and DemoService-UAT stacks should be created as in the next image:

Two stacks are needed to create a Pipeline that deploys to a single environment. One stack deploys the Toolchain with the Pipeline and another stack deploys the Service compute infrastructure and CodeDeploy Application and DeploymentGroup. In this example, for an application named Demo that deploys to an environment named UAT, the stacks deployed are: DemoToolchain and DemoService-UAT

CodeDeploy Application and Deployment Group

CodeDeploy configuration requires an application and a deployment group. Depending on the use case, you need to create these in the same account as the toolchain (pipeline) or in a different one. The pipeline includes the CodeDeploy deployment action that performs the blue/green deployment. My recommendation is to create the CodeDeploy application and deployment group as part of the Service stack. This approach aligns the lifecycle of the CodeDeploy application and deployment group with that of the related Service stack instance.

CodePipeline allows you to create a CodeDeploy deployment action that references a non-existing CodeDeploy application and deployment group. This allows us to implement the following approach:

  • Toolchain stack deploys the pipeline with CodeDeploy deployment action referencing a non-existing CodeDeploy application and deployment group
  • When the pipeline executes, it first deploys the Service stack that creates the related CodeDeploy application and deployment group
  • The next pipeline action executes the CodeDeploy deployment action. When the pipeline executes the CodeDeploy deployment action, the related CodeDeploy application and deployment group will already exist.

Below is the pipeline code that references the (initially non-existing) CodeDeploy application and deployment group.

private IEcsDeploymentGroup referenceCodeDeployDeploymentGroup(
        final Environment env, 
        final String serviceName, 
        final IEcsDeploymentConfig ecsDeploymentConfig, 
        final String stageName) {

    IEcsApplication codeDeployApp = EcsApplication.fromEcsApplicationArn(
            this,
            Constants.APP_NAME + "EcsCodeDeployApp-"+stageName,
            Arn.format(ArnComponents.builder()
                    .arnFormat(ArnFormat.COLON_RESOURCE_NAME)
                    .partition("aws")
                    .region(env.getRegion())
                    .service("codedeploy")
                    .account(env.getAccount())
                    .resource("application")
                    .resourceName(serviceName)
                    .build()));

    IEcsDeploymentGroup deploymentGroup = EcsDeploymentGroup.fromEcsDeploymentGroupAttributes(
            this,
            Constants.APP_NAME + "-EcsCodeDeployDG-"+stageName,
            EcsDeploymentGroupAttributes.builder()
                    .deploymentGroupName(serviceName)
                    .application(codeDeployApp)
                    .deploymentConfig(ecsDeploymentConfig)
                    .build());

    return deploymentGroup;
}

To make this work, you should use the same application name and deployment group name values when creating the CodeDeploy deployment action in the pipeline and when creating the CodeDeploy application and deployment group in the Service stack (where the Amazon ECS infrastructure is deployed). This approach is necessary to avoid a circular dependency error when trying to create the CodeDeploy application and deployment group inside the Service stack and reference these objects to configure the CodeDeploy deployment action inside the pipeline. Below is the code that uses the Service stack construct ID to name the CodeDeploy application and deployment group. I set the Service stack construct ID to the same name I used when creating the CodeDeploy deployment action in the pipeline.

   // configure AWS CodeDeploy Application and DeploymentGroup
   EcsApplication app = EcsApplication.Builder.create(this, "BlueGreenApplication")
           .applicationName(id)
           .build();

   EcsDeploymentGroup.Builder.create(this, "BlueGreenDeploymentGroup")
           .deploymentGroupName(id)
           .application(app)
           .service(albService.getService())
           .role(createCodeDeployExecutionRole(id))
           .blueGreenDeploymentConfig(EcsBlueGreenDeploymentConfig.builder()
                   .blueTargetGroup(albService.getTargetGroup())
                   .greenTargetGroup(tgGreen)
                   .listener(albService.getListener())
                   .testListener(listenerGreen)
                   .terminationWaitTime(Duration.minutes(15))
                   .build())
           .deploymentConfig(deploymentConfig)
           .build();

CDK Pipelines roles and permissions

CDK Pipelines creates the roles and permissions the pipeline uses to execute deployments in different combinations of Regions and accounts. When using CodeDeploy in cross-account scenarios, CDK Pipelines deploys a cross-account support stack that creates a pipeline action role for the CodeDeploy action. This cross-account support stack is defined in a JSON file that needs to be published to the AWS CDK assets bucket in the target account. If the pipeline has the self-mutation feature on (the default), the UpdatePipeline stage will run cdk deploy to deploy changes to the pipeline. In cross-account scenarios, this deployment also involves deploying or updating the cross-account support stack. For this, the SelfMutate action in the UpdatePipeline stage needs to assume the CDK file-publishing and deploy roles in the remote account.

The IAM role associated with the AWS CodeBuild project that runs the UpdatePipeline stage does not have these permissions by default. CDK Pipelines cannot grant these permissions automatically, because the information about the permissions that the cross-account stack needs is only available after the AWS CDK app finishes synthesizing. At that point, the permissions that the pipeline has are already locked in. Hence, for cross-account scenarios, the toolchain should extend the permissions of the pipeline’s UpdatePipeline stage to include the file-publishing and deploy roles.

In cross-account environments, it is possible to manually add these permissions to the UpdatePipeline stage, and the Toolchain stack can be used to hide this implementation detail. A method like the one below adds the missing permissions: for each stage-to-environment mapping in the pipeline, it checks whether the target account differs from the account where the pipeline is deployed. When that is the case, it grants the UpdatePipeline stage permission to assume the CDK bootstrap roles in the target account (roles tagged with the key aws-cdk:bootstrap-role and a value of file-publishing or deploy). The example below shows how to add these permissions to the UpdatePipeline stage:

private void grantUpdatePipelineCrossAccoutPermissions(Map<String, Environment> stageNameEnvironment) {

    if (!stageNameEnvironment.isEmpty()) {

        this.pipeline.buildPipeline();
        for (String stage : stageNameEnvironment.keySet()) {

            HashMap<String, String[]> condition = new HashMap<>();
            condition.put(
                    "iam:ResourceTag/aws-cdk:bootstrap-role",
                    new String[] {"file-publishing", "deploy"});
            pipeline.getSelfMutationProject()
                    .getRole()
                    .addToPrincipalPolicy(PolicyStatement.Builder.create()
                            .actions(Arrays.asList("sts:AssumeRole"))
                            .effect(Effect.ALLOW)
                            .resources(Arrays.asList("arn:*:iam::"
                                    + stageNameEnvironment.get(stage).getAccount() + ":role/*"))
                            .conditions(new HashMap<String, Object>() {{
                                    put("ForAnyValue:StringEquals", condition);
                            }})
                            .build());
        }
    }
}

The Deployment Stage

Let’s consider a pipeline that has a single deployment stage, UAT. The UAT stage deploys a DemoService, which requires four actions: DemoService-UAT.Prepare, DemoService-UAT.Deploy, ConfigureBlueGreenDeploy, and Deploy.

When using CodeDeploy, the deployment stage is expected to have four actions: two actions to create the CloudFormation change set and deploy the ECS or compute infrastructure, one action to configure CodeDeploy, and a final action that deploys the application using CodeDeploy. In the diagram these are, in order: DemoService-UAT.Prepare, DemoService-UAT.Deploy, ConfigureBlueGreenDeploy, and Deploy.

The DemoService-UAT.Deploy action will create the ECS resources and the CodeDeploy application and deployment group. The ConfigureBlueGreenDeploy action will read the AWS CDK cloud assembly. It uses the configuration files to identify the Amazon Elastic Container Registry (Amazon ECR) repository and the container image tag pushed. The pipeline will send this information to the Deploy action. The Deploy action starts the deployment using CodeDeploy.

Solution Overview

As a convenience, I created an application, written in Java, that solves all these challenges and can be used as an example. The application deployment follows the same 5 steps for all deployment scenarios of account and Region, and this includes the scenarios represented in the following design:

A pipeline created by a Toolchain should be able to deploy to any combination of accounts and regions. This includes four scenarios: single-account and single-Region, single-account and cross-Region, cross-account and single-Region and cross-account and cross-Region

Conclusion

In this post, I identified, explained and solved challenges associated with the creation of a pipeline that deploys a service to Amazon ECS using CodeDeploy in different combinations of accounts and regions. I also introduced a demo application that implements these recommendations. The sample code can be extended to implement more elaborate scenarios. These scenarios might include automated testing, automated deployment rollbacks, or disaster recovery. I wish you success in your transformative journey.

Luiz Decaro

Luiz is a Principal Solutions architect at Amazon Web Services (AWS). He focuses on helping customers from the Financial Services Industry succeed in the cloud. Luiz holds a master’s in software engineering and he triggered his first continuous deployment pipeline in 2005.

Automated data governance with AWS Glue Data Quality, sensitive data detection, and AWS Lake Formation

Post Syndicated from Shoukat Ghouse original https://aws.amazon.com/blogs/big-data/automated-data-governance-with-aws-glue-data-quality-sensitive-data-detection-and-aws-lake-formation/

Data governance is the process of ensuring the integrity, availability, usability, and security of an organization’s data. Due to the volume, velocity, and variety of data being ingested in data lakes, it can get challenging to develop and maintain policies and procedures to ensure data governance at scale for your data lake. Data confidentiality and data quality are the two essential themes for data governance. Data confidentiality refers to the protection and control of sensitive and private information to prevent unauthorized access, especially when dealing with personally identifiable information (PII). Data quality focuses on maintaining accurate, reliable, and consistent data across the organization. Poor data quality can lead to erroneous decisions, inefficient operations, and compromised business performance.

Companies need to ensure data confidentiality is maintained throughout the data pipeline and that high-quality data is available to consumers in a timely manner. A lot of this effort is manual, where data owners and data stewards define and apply the policies statically up front for each dataset in the lake. This gets tedious and delays the data adoption across the enterprise.

In this post, we showcase how to use AWS Glue with AWS Glue Data Quality, sensitive data detection transforms, and AWS Lake Formation tag-based access control to automate data governance.

Solution overview

Let’s consider a fictional company, OkTank. OkTank has multiple ingestion pipelines that populate multiple tables in the data lake. OkTank wants to ensure the data lake is governed with data quality rules and access policies in place at all times.

Multiple personas consume data from the data lake, such as business leaders, data scientists, data analysts, and data engineers. For each set of users, a different level of governance is needed. For example, business leaders need top-quality and highly accurate data, data scientists cannot see PII data and need data within an acceptable quality range for their model training, and data engineers can see all data except PII.

Currently, these requirements are hard-coded and managed manually for each set of users. OkTank wants to scale this and is looking for ways to control governance in an automated way. Primarily, they are looking for the following features:

  • When new data and tables get added to the data lake, the governance policies (data quality checks and access controls) get automatically applied for them. Unless the data is certified to be consumed, it shouldn’t be accessible to the end-users. For example, they want to ensure basic data quality checks are applied on all new tables and provide access to the data based on the data quality score.
  • Due to changes in source data, the existing data profile of data lake tables may drift. It’s required to ensure the governance is met as defined. For example, the system should automatically mark columns as sensitive if sensitive data is detected in a column that was earlier marked as public and was available publicly for users. The system should hide the column from unauthorized users accordingly.

For the purpose of this post, the following governance policies are defined:

  • No PII data should exist in tables or columns tagged as public.
  • If  a column has any PII data, the column should be marked as sensitive. The table should then also be marked sensitive.
  • The following data quality rules should be applied on all tables:
    • All tables should have a minimum set of columns: data_key, data_load_date, and data_location.
    • data_key is a key column and should meet key requirements of being unique and complete.
    • data_location should match with locations defined in a separate reference (base) table.
    • The data_load_date column should be complete.
  • User access to tables is controlled as per the following table.
User       | Can Access Sensitive Tables | Can Access Sensitive Columns | Min Data Quality Threshold Needed to Consume Data
Category 1 | Yes                         | Yes                          | 100%
Category 2 | Yes                         | No                           | 50%
Category 3 | No                          | No                           | 0%

In this post, we use AWS Glue Data Quality and sensitive data detection features. We also use Lake Formation tag-based access control to manage access at scale.

The following diagram illustrates the solution architecture.

The governance requirements highlighted in the previous table are translated to the following Lake Formation LF-Tags.

IAM User   | LF-Tag: tbl_class | LF-Tag: col_class | LF-Tag: dq_tag
Category 1 | sensitive, public | sensitive, public | DQ100
Category 2 | sensitive, public | public            | DQ100, DQ90, DQ50_80, DQ80_90
Category 3 | public            | public            | DQ90, DQ100, DQ_LT_50, DQ50_80, DQ80_90

This post uses AWS Step Functions to orchestrate the governance jobs, but you can use any other orchestration tool of choice. To simulate data ingestion, we manually place the files in an Amazon Simple Storage Service (Amazon S3) bucket. In this post, we trigger the Step Functions state machine manually for ease of understanding. In practice, you can integrate or invoke the jobs as part of a data ingestion pipeline, via event triggers like AWS Glue crawler or Amazon S3 events, or schedule them as needed.
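
If you do want to invoke the workflow from code instead of the console, the following is a minimal sketch using the AWS SDK for Python (Boto3). The state machine ARN and bucket name below are placeholders; substitute the AutoGovMachine ARN and the bucket created by the CloudFormation stack in your account. The input format matches the JSON used later in this post.

import json

import boto3

sfn = boto3.client("stepfunctions", region_name="us-east-1")

# Placeholder ARN: replace with the AutoGovMachine ARN from your account
response = sfn.start_execution(
    stateMachineArn="arn:aws:states:us-east-1:111122223333:stateMachine:AutoGovMachine",
    input=json.dumps({
        "Comment": "Auto Governance with AWS Glue and AWS LakeFormation",
        "BucketName": "auto-gov-blog",  # replace with your bucket name
    }),
)
print(response["executionArn"])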

In this post, we use an AWS Glue database named oktank_autogov_temp and a target table named customer on which we apply the governance rules. We use AWS CloudFormation to provision the resources. AWS CloudFormation lets you model, provision, and manage AWS and third-party resources by treating infrastructure as code.

Prerequisites

Complete the following prerequisite steps:

  1. Identify an AWS Region in which you want to create the resources and ensure you use the same Region throughout the setup and verifications.
  2. Have a Lake Formation administrator role to run the CloudFormation template and grant permissions.

Sign in to the Lake Formation console and add yourself as a Lake Formation data lake administrator if you aren’t already an admin. If you are setting up Lake Formation for the first time in your Region, then you can do this in the following pop-up window that appears up when you connect to the Lake Formation console and select the desired Region.

Otherwise, you can add data lake administrators by choosing Administrative roles and tasks in the navigation pane on the Lake Formation console and choosing Add administrators. Then select Data lake administrator, identify your users and roles, and choose Confirm.

Deploy the CloudFormation stack

Run the provided CloudFormation stack to create the solution resources.

You need to provide a unique bucket name and specify passwords for the three users reflecting three different user personas (Category 1, Category 2, and Category 3) that we use for this post.

The stack provisions an S3 bucket to store the dummy data, AWS Glue scripts, results of sensitive data detection, and Amazon Athena query results in their respective folders.

The stack copies the AWS Glue scripts into the scripts folder and creates two AWS Glue jobs, Data-Quality-PII-Checker_Job and LF-Tag-Handler_Job, pointing to the corresponding scripts.

The AWS Glue job Data-Quality-PII-Checker_Job applies the data quality rules and publishes the results. It also checks for sensitive data in the columns. In this post, we check for the PERSON_NAME and EMAIL data types. If any columns with sensitive data are detected, it persists the sensitive data detection results to the S3 bucket.
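
The detection itself happens inside the Glue job using the AWS Glue sensitive data detection transform. As a rough illustration of the same idea, the sketch below scans sample values for name and email entities using Amazon Comprehend's DetectPiiEntities API. Note that this is a different mechanism than the Glue transform the job actually uses, and the column values shown are made up.

import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")

# Hypothetical sample values; in the real job, values come from the customer table
sample_values = ["john.doe@example.com", "order placed on 2023-09-01"]

for value in sample_values:
    entities = comprehend.detect_pii_entities(Text=value, LanguageCode="en")
    types = {e["Type"] for e in entities["Entities"]}
    # Mirror the post's policy: email or person-name entities mark the value as sensitive
    if types & {"EMAIL", "NAME"}:
        print(f"sensitive value detected ({types}): {value}")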

AWS Glue Data Quality uses Data Quality Definition Language (DQDL) to author the data quality rules.

The data quality requirements as defined earlier in this post are written as the following DQDL in the script:

Rules = [
    ReferentialIntegrity "data_location" "reference.data_location" = 1.0,
    IsPrimaryKey "data_key",
    ColumnExists "data_load_date",
    IsComplete "data_load_date"
]
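
The job evaluates these rules in-line. If you want to run similar checks outside the job, the sketch below registers a ruleset and starts an evaluation run through the AWS Glue Data Quality APIs. The ruleset name is a placeholder, the role ARN stands in for the GlueServiceDQRole created by the stack, and the referential integrity rule is omitted here because a standalone run would also need the reference (base) dataset wired up, which the post's Glue job handles.

import boto3

glue = boto3.client("glue", region_name="us-east-1")

# Trimmed version of the post's DQDL (ReferentialIntegrity left out, see above)
dqdl = """Rules = [
    IsPrimaryKey "data_key",
    ColumnExists "data_load_date",
    IsComplete "data_load_date"
]"""

glue.create_data_quality_ruleset(
    Name="customer-basic-checks",  # placeholder name
    Ruleset=dqdl,
    TargetTable={"DatabaseName": "oktank_autogov_temp", "TableName": "customer"},
)

run = glue.start_data_quality_ruleset_evaluation_run(
    DataSource={"GlueTable": {"DatabaseName": "oktank_autogov_temp", "TableName": "customer"}},
    Role="arn:aws:iam::111122223333:role/GlueServiceDQRole",  # placeholder role ARN
    RulesetNames=["customer-basic-checks"],
)
print(run["RunId"])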

The following screenshot shows a sample result from the job after it runs. You can see this after you trigger the Step Functions workflow in subsequent steps. To check the results, on the AWS Glue console, choose ETL jobs and choose the job called Data-Quality-PII-Checker_Job. Then navigate to the Data quality tab to view the results.

The AWS Glue job LF-Tag-Handler_Job fetches the data quality metrics published by Data-Quality-PII-Checker_Job. It checks the status of the DataQuality_PIIColumns result. It gets the list of sensitive column names from the sensitive data detection file created in the Data-Quality-PII-Checker_Job and tags those columns as sensitive. The rest of the columns are tagged as public. It also tags the table as sensitive if sensitive columns are detected. The table is marked as public if no sensitive columns are detected.

The job also checks the data quality score for the DataQuality_BasicChecks result set. It maps the data quality score into tags as shown in the following table and applies the corresponding tag on the table.

Data Quality Score | Data Quality Tag
100%               | DQ100
90-100%            | DQ90
80-90%             | DQ80_90
50-80%             | DQ50_80
Less than 50%      | DQ_LT_50
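
To make the mapping concrete, here is a minimal sketch of how a tag-handler job might turn a score into a dq_tag value and apply it with the Lake Formation API. How boundary values such as exactly 90% are bucketed is an assumption on my part; the database and table names are the ones used in this post.

import boto3


def dq_tag_for(score: float) -> str:
    # Thresholds follow the table above; boundary handling is an assumption
    if score >= 100:
        return "DQ100"
    if score >= 90:
        return "DQ90"
    if score >= 80:
        return "DQ80_90"
    if score >= 50:
        return "DQ50_80"
    return "DQ_LT_50"


lf = boto3.client("lakeformation", region_name="us-east-1")
lf.add_lf_tags_to_resource(
    Resource={"Table": {"DatabaseName": "oktank_autogov_temp", "Name": "customer"}},
    LFTags=[{"TagKey": "dq_tag", "TagValues": [dq_tag_for(72.5)]}],  # example score
)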

The CloudFormation stack copies some mock data to the data folder and registers this location under AWS Lake Formation Data lake locations so Lake Formation can govern access on the location using service-linked role for Lake Formation.

The customer subfolder contains the initial customer dataset for the table customer. The base subfolder contains the base dataset, which we use to check referential integrity as part of the data quality checks. The column data_location in the customer table should match with locations defined in this base table.

The stack also copies some additional mock data to the bucket under the data-v1 folder. We use this data to simulate data quality issues.

It also creates the following resources:

  • An AWS Glue database called oktank_autogov_temp and two tables under the database:
    • customer – This is our target table on which we will be governing the access based on data quality rules and PII checks.
    • base – This is the base table that has the reference data. One of the data quality rules checks that the customer data always adheres to locations present in the base table.
  • AWS Identity and Access Management (IAM) users and roles:
    • DataLakeUser_Category1 – The data lake user corresponding to the Category 1 user. This user should be able to access sensitive data but needs 100% accurate data.
    • DataLakeUser_Category2 – The data lake user corresponding to the Category 2 user. This user should not be able to access sensitive columns in the table. It needs more than 50% accurate data.
    • DataLakeUser_Category3 – The data lake user corresponding to the Category 3 user. This user should not be able to access tables containing sensitive data. Data quality can be 0%.
    • GlueServiceDQRole – The role for the data quality and sensitive data detection job.
    • GlueServiceLFTaggerRole – The role for the LF-Tags handler job for applying the tags to the table.
    • StepFunctionRole – The Step Functions role for triggering the AWS Glue jobs.
  • Lake Formation LF-Tags keys and values:
    • tbl_classsensitive, public
    • dq_classDQ100, DQ90, DQ80_90, DQ50_80, DQ_LT_50
    • col_classsensitive, public
  • A Step Functions state machine named AutoGovMachine that you use to trigger the runs for the AWS Glue jobs to check data quality and update the LF-Tags.
  • Athena workgroups named auto_gov_blog_workgroup_temporary_user1, auto_gov_blog_workgroup_temporary_user2, and auto_gov_blog_workgroup_temporary_user3. These workgroups point to different Athena query result locations for each user. Each user is granted access to the corresponding query result location only. This ensures a specific user doesn’t access the query results of other users. You should switch to a specific workgroup to run queries in Athena as part of the test for the specific user.

The CloudFormation stack generates the following outputs. Take note of the values of the IAM users to use in subsequent steps.

Grant permissions

After you launch the CloudFormation stack, complete the following steps:

  1. On the Lake Formation console, under Permissions choose Data lake permissions in the navigation pane.
  2. Search for the database oktank_autogov_temp and table customer.
  3. If IAMAllowedPrincipals access is present, select it and choose Revoke.

  1. Choose Revoke again to revoke the permissions.

Category 1 users can access all data except if the data quality score of the table is below 100%. Therefore, we grant the user the necessary permissions.

  1. Under Permissions in the navigation pane, choose Data lake permissions.
  2. Search for database oktank_autogov_temp and table customer.
  3. Choose Grant
  4. Select IAM users and roles and choose the value for UserCategory1 from your CloudFormation stack output.
  5. Under LF-Tags or catalog resources, choose Add LF-Tag key-value pair.
  6. Add the following key-value pairs:
    1. For the col_class key, add the values public and sensitive.
    2. For the tbl_class key, add the values public and sensitive.
    3. For the dq_tag key, add the value DQ100.

  1. For Table permissions, select Select.
  2. Choose Grant.

Category 2 users can’t access sensitive columns. They can access tables with a data quality score above 50%.

  1. Repeat the preceding steps to grant the appropriate permissions in Lake Formation to UserCategory2:
    1. For the col_class key, add the value public.
    2. For the tbl_class key, add the values public and sensitive.
    3. For the dq_tag key, add the values DQ50_80, DQ80_90, DQ90, and DQ100.

  1. For Table permissions, select Select.
  2. Choose Grant.

Category 3 users can’t access tables that contain any sensitive columns. Such tables are marked as sensitive by the system. They can access tables with any data quality score.

  1. Repeat the preceding steps to grant the appropriate permissions in Lake Formation to UserCategory3:
    1. For the col_class key, add the value public.
    2. For the tbl_class key, add the value public.
    3. For the dq_tag key, add the values DQ_LT_50, DQ50_80, DQ80_90, DQ90, and DQ100.

  1. For Table permissions, select Select.
  2. Choose Grant.

You can verify the LF-Tag permissions assigned in Lake Formation by navigating to the Data lake permissions page and searching for the Resource type LF-Tag expression.
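
The same grants can also be scripted. The following is a sketch of the Category 3 grant expressed as a Lake Formation GrantPermissions call with an LF-Tag expression; the principal ARN is a placeholder for the DataLakeUser_Category3 user created by the stack.

import boto3

lf = boto3.client("lakeformation", region_name="us-east-1")

# Grant SELECT on tables whose LF-Tags match the Category 3 expression
lf.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:user/DataLakeUser_Category3"},
    Resource={
        "LFTagPolicy": {
            "ResourceType": "TABLE",
            "Expression": [
                {"TagKey": "tbl_class", "TagValues": ["public"]},
                {"TagKey": "col_class", "TagValues": ["public"]},
                {"TagKey": "dq_tag", "TagValues": ["DQ_LT_50", "DQ50_80", "DQ80_90", "DQ90", "DQ100"]},
            ],
        }
    },
    Permissions=["SELECT"],
)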

Test the solution

Now we can test the workflow. We test three different use cases in this post. You will notice how the permissions to the tables change based on the values of LF-Tags applied to the customer table and the columns of the table. We use Athena to query the tables.
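
The walkthrough below uses the Athena console. If you prefer to drive the same verification from a script, here is a minimal sketch using the Athena API with the Category 1 user's workgroup; it assumes the calling credentials belong to that user and keeps polling and error handling to a minimum.

import time

import boto3

athena = boto3.client("athena", region_name="us-east-1")

qid = athena.start_query_execution(
    QueryString='select * from "oktank_autogov_temp"."customer" limit 10',
    WorkGroup="auto_gov_blog_workgroup_temporary_user1",
)["QueryExecutionId"]

# Poll until the query finishes
while True:
    state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

print(state)
if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]
    print(f"{len(rows) - 1} data rows returned")  # first row is the header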

Use case 1

In this first use case, a new table was created on the lake and new data was ingested to the table. The data file cust_feedback_v0.csv was copied to the data/customer location in the S3 bucket. This simulates new data ingestion on a new table called customer.

Lake Formation doesn’t allow any users to access this table currently. To test this scenario, complete the following steps:

  1. Sign in to the Athena console with the UserCategory1 user.
  2. Switch the workgroup to auto_gov_blog_workgroup_temporary_user1 in the Athena query editor.
  3. Choose Acknowledge to accept the workgroup settings.

  1. Run the following query in the query editor:
select * from "oktank_autogov_temp"."customer" limit 10

  1. On the Step Functions console, run the AutoGovMachine state machine.
  2. In the Input – optional section, use the following JSON and replace the BucketName value with the bucket name you used for the CloudFormation stack earlier (for this post, we use auto-gov-blog):
{
  "Comment": "Auto Governance with AWS Glue and AWS LakeFormation",
  "BucketName": "<Replace with your bucket name>"
}

The state machine triggers the AWS Glue jobs to check data quality on the table and apply the corresponding LF-Tags.

  1. You can check the LF-Tags applied on the table and the columns. To do so, when the state machine is complete, sign in to Lake Formation with the admin role used earlier to grant permissions.
  2. Navigate to the table customer under the oktank_autogov_temp database and choose Edit LF-Tags to validate the tags applied on the table.

You can also validate that columns customer_email and customer_name are tagged as sensitive for the col_class LF-Tag.

  1. To check this, choose Edit Schema for the customer table.
  2. Select the two columns and choose Edit LF-Tags.

You can check the tags on these columns.

The rest of the columns are tagged as public.

  1. Sign in to the Athena console with UserCategory1 and run the same query again:
select * from "oktank_autogov_temp"."customer" limit 10

This time, the user is able to see the data. This is because the LF-Tag permissions we applied earlier are in effect.

  1. Sign in as UserCategory2 user to verify permissions.
  2. Switch to workgroup auto_gov_blog_workgroup_temporary_user2 in Athena.

This user can access the table but can only see public columns. Therefore, the user shouldn’t be able to see the customer_email and customer_name columns, because these columns contain sensitive data as identified by the system.

  1. Run the same query again:
select * from "oktank_autogov_temp"."customer" limit 10

  1. Sign in to Athena and verify the permissions for DataLakeUser_Category3.
  2. Switch to workgroup auto_gov_blog_workgroup_temporary_user3 in Athena.

This user can’t access the table because the table is marked as sensitive due to the presence of sensitive data columns in the table.

  1. Run the same query again:
select * from "oktank_autogov_temp"."customer" limit 10

Use case 2

Let’s ingest some new data on the table.

  1. Sign in to the Amazon S3 console with the admin role used earlier to grant permissions.
  2. Copy the file cust_feedback_v1.csv from the data-v1 folder in the S3 bucket to the data/customer folder in the S3 bucket using the default options.

This new data file has data quality issues because the column data_location breaks referential integrity with the base table. This data also introduces some sensitive data in column comment1. This column was earlier marked as public because it didn’t have any sensitive data.

The following screenshot shows what the customer folder should look like now.

  1. Run the AutoGovMachine state machine again and use the same JSON as the StartExecution input you used earlier:
{
  "Comment": "Auto Governance with AWS Glue and AWS LakeFormation",
  "BucketName": "<Replace with your bucket name>"
}

The job classifies column comment1 as sensitive on the customer table. It also updates the dq_tag value on the table because the data quality has changed due to the breaking referential integrity check.

You can verify the new tag values via the Lake Formation console as described earlier. The dq_tag value was DQ100. The value is changed to DQ50_80, reflecting the data quality score for the table.

Also, earlier the value for the col_class tag for the comment1 column was public. The value is now changed to sensitive because sensitive data is detected in this column.

Category 2 users shouldn’t be able to access sensitive columns in the table.

  1. Sign in with UserCategory2 to Athena and rerun the earlier query:
select * from "oktank_autogov_temp"."customer" limit 10

The column comment1 is now not available for UserCategory2 as expected. The access permissions are handled automatically.

Also, because the data quality score goes down below 100%, this new dataset is now not available for the Category1 user. This user should have access to data only when the score is 100% as per our defined rules.

  1. Sign in with UserCategory1 to Athena and rerun the earlier query:
select * from "oktank_autogov_temp"."customer" limit 10

You will see the user is not able to access the table now. The access permissions are handled automatically.

Use case 3

Let’s fix the invalid data and remove the data quality issue.

  1. Delete the cust_feedback_v1.csv file from the data/customer Amazon S3 location.
  2. Copy the file cust_feedback_v1_fixed.csv from the data-v1 folder in the S3 bucket to the data/customer S3 location. This data file fixes the data quality issues.
  3. Rerun the AutoGovMachine state machine.

When the state machine is complete, the data quality score goes up to 100% again and the tag on the table gets updated accordingly. You can verify the new tag as shown earlier via the Lake Formation console.

The Category1 user can access the table again.

Clean up

To avoid incurring further charges, delete the CloudFormation stack to delete the resources provisioned as part of this post.

Conclusion

This post covered AWS Glue Data Quality and sensitive detection features and Lake Formation LF-Tag based access control. We explored how you can combine these features and use them to build a scalable automated data governance capability on your data lake. We explored how user permissions changed when data was initially ingested to the table and when data drift was observed as part of subsequent ingestions.

For further reading, refer to the following resources:


About the Author

Shoukat Ghouse is a Senior Big Data Specialist Solutions Architect at AWS. He helps customers around the world build robust, efficient and scalable data platforms on AWS leveraging AWS analytics services like AWS Glue, AWS Lake Formation, Amazon Athena and Amazon EMR.

How AWS protects customers from DDoS events

Post Syndicated from Tom Scholl original https://aws.amazon.com/blogs/security/how-aws-protects-customers-from-ddos-events/

At Amazon Web Services (AWS), security is our top priority. Security is deeply embedded into our culture, processes, and systems; it permeates everything we do. What does this mean for you? We believe customers can benefit from learning more about what AWS is doing to prevent and mitigate customer-impacting security events.

Since late August 2023, AWS has detected and been protecting customer applications from a new type of distributed denial of service (DDoS) event. DDoS events attempt to disrupt the availability of a targeted system, such as a website or application, reducing the performance for legitimate users. Examples of DDoS events include HTTP request floods, reflection/amplification attacks, and packet floods. The DDoS events AWS detected were a type of HTTP/2 request flood, which occurs when a high volume of illegitimate web requests overwhelms a web server’s ability to respond to legitimate client requests.

Between August 28 and August 29, 2023, proactive monitoring by AWS detected an unusual spike in HTTP/2 requests to Amazon CloudFront, peaking at over 155 million requests per second (RPS). Within minutes, AWS determined the nature of this unusual activity and found that CloudFront had automatically mitigated a new type of HTTP request flood DDoS event, now called an HTTP/2 rapid reset attack. Over those two days, AWS observed and mitigated over a dozen HTTP/2 rapid reset events, and through the month of September, continued to see this new type of HTTP/2 request flood. AWS customers who had built DDoS-resilient architectures with services like Amazon CloudFront and AWS Shield were able to protect their applications’ availability.

Figure 1: Global HTTP requests per second, September 13 – 16

Overview of HTTP/2 rapid reset attacks

HTTP/2 allows for multiple distinct logical connections to be multiplexed over a single HTTP session. This is a change from HTTP 1.x, in which each HTTP session was logically distinct. HTTP/2 rapid reset attacks consist of multiple HTTP/2 connections with requests and resets in rapid succession. For example, a series of requests for multiple streams will be transmitted followed up by a reset for each of those requests. The targeted system will parse and act upon each request, generating logs for a request that is then reset, or cancelled, by a client. The system performs work generating those logs even though it doesn’t have to send any data back to a client. A bad actor can abuse this process by issuing a massive volume of HTTP/2 requests, which can overwhelm the targeted system, such as a website or application.

Keep in mind that HTTP/2 rapid reset attacks are just a new type of HTTP request flood. To defend against these sorts of DDoS attacks, you can implement an architecture that helps you specifically detect unwanted requests as well as scale to absorb and block those malicious HTTP requests.

Building DDoS resilient architectures

As an AWS customer, you benefit from both the security built into the global cloud infrastructure of AWS as well as our commitment to continuously improve the security, efficiency, and resiliency of AWS services. For prescriptive guidance on how to improve DDoS resiliency, AWS has built tools such as the AWS Best Practices for DDoS Resiliency. It describes a DDoS-resilient reference architecture as a guide to help you protect your application’s availability. While several built-in forms of DDoS mitigation are included automatically with AWS services, your DDoS resilience can be improved by using an AWS architecture with specific services and by implementing additional best practices for each part of the network flow between users and your application.

For example, you can use AWS services that operate from edge locations, such as Amazon CloudFront, AWS Shield, Amazon Route 53, and Route 53 Application Recovery Controller to build comprehensive availability protection against known infrastructure layer attacks. These services can improve the DDoS resilience of your application when serving any type of application traffic from edge locations distributed around the world. Your application can be on-premises or in AWS when you use these AWS services to help you prevent unnecessary requests reaching your origin servers. As a best practice, you can run your applications on AWS to get the additional benefit of reducing the exposure of your application endpoints to DDoS attacks and to protect your application’s availability and optimize the performance of your application for legitimate users. You can use Amazon CloudFront (and its HTTP caching capability), AWS WAF, and Shield Advanced automatic application layer protection to help prevent unnecessary requests reaching your origin during application layer DDoS attacks.
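As one hedged illustration of these building blocks, the following boto3 sketch creates an AWS WAF web ACL containing a rate-based rule that blocks IP addresses exceeding a request threshold. The ACL name, rule name, and threshold are illustrative assumptions, not a prescriptive configuration for HTTP/2 rapid reset events.

import boto3

# Web ACLs with CLOUDFRONT scope must be created in the us-east-1 Region
wafv2 = boto3.client("wafv2", region_name="us-east-1")

wafv2.create_web_acl(
    Name="rate-limit-demo",  # assumption: illustrative name
    Scope="CLOUDFRONT",
    DefaultAction={"Allow": {}},
    Rules=[
        {
            "Name": "block-high-request-rates",
            "Priority": 0,
            # Block any single IP that exceeds 2,000 requests in the evaluation window
            "Statement": {
                "RateBasedStatement": {"Limit": 2000, "AggregateKeyType": "IP"}
            },
            "Action": {"Block": {}},
            "VisibilityConfig": {
                "SampledRequestsEnabled": True,
                "CloudWatchMetricsEnabled": True,
                "MetricName": "BlockHighRequestRates",
            },
        }
    ],
    VisibilityConfig={
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "RateLimitDemo",
    },
)

You would then associate the web ACL with your CloudFront distribution; with Shield Advanced automatic application layer protection enabled, AWS can also create and manage mitigating rules on your behalf.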

Putting our knowledge to work for AWS customers

AWS remains vigilant, working to help prevent security issues from causing disruption to your business. We believe it’s important to share not only how our services are designed, but also how our engineers take deep, proactive ownership of every aspect of our services. As we work to defend our infrastructure and your data, we look for ways to help protect you automatically. Whenever possible, AWS Security and its systems disrupt threats where that action will be most impactful; often, this work happens largely behind the scenes. We work to mitigate threats by combining our global-scale threat intelligence and engineering expertise to help make our services more resilient against malicious activities. We’re constantly looking around corners to improve the efficiency and security of services including the protocols we use in our services, such as Amazon CloudFront, as well as AWS security tools like AWS WAF, AWS Shield, and Amazon Route 53 Resolver DNS Firewall.

In addition, our work extends security protections and improvements far beyond the bounds of AWS itself. AWS regularly works with the wider community, such as computer emergency response teams (CERT), internet service providers (ISP), domain registrars, or government agencies, so that they can help disrupt an identified threat. We also work closely with the security community, other cloud providers, content delivery networks (CDNs), and collaborating businesses around the world to isolate and take down threat actors. For example, in the first quarter of 2023, we stopped over 1.3 million botnet-driven DDoS attacks, and we traced back and worked with external parties to dismantle the sources of 230 thousand L7/HTTP DDoS attacks. The effectiveness of our mitigation strategies relies heavily on our ability to quickly capture, analyze, and act on threat intelligence. By taking these steps, AWS is going beyond just typical DDoS defense, and moving our protection beyond our borders. To learn more about this effort, read How AWS threat intelligence deters threat actors.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Tom Scholl

Tom is Vice President and Distinguished Engineer at AWS.

Mark Ryland

Mark is the director of the Office of the CISO for AWS. He has over 30 years of experience in the technology industry, and has served in leadership roles in cybersecurity, software engineering, distributed systems, technology standardization, and public policy. Previously, he served as the Director of Solution Architecture and Professional Services for the AWS World Public Sector team.