Tag Archives: Foundational (100)

AWS Security Profile: Matt Luttrell, Principal Solutions Architect for AWS Identity

Post Syndicated from Maddie Bacon original https://aws.amazon.com/blogs/security/aws-security-profile-matt-luttrell-principal-solutions-architect-for-aws-identity/

AWS Security Profile: Matt Luttrell, Principal Solutions Architect for AWS Identity

In the AWS Security Profile series, I interview some of the humans who work in Amazon Web Services Security and help keep our customers safe and secure. In this profile, I interviewed Matt Luttrell, Principal Solutions Architect for AWS Identity.


How long have you been at AWS and what do you do in your current role?

I’ve been at AWS around five years and have worked in a variety of roles from Professional Services consulting as an application architect to a solutions architect. In my current role, I work on the Identity Solutions team, which is a group of solutions architects who are embedded directly in the Identity and Control Services team. We have both internal-facing and external-facing functions. Internally, we work with product managers, drive concepts like data perimeters, and generally act as the voice of the customer to our product teams. Externally, we have conversations with customers, present at events, and so on.

How did you get started in security?

My background is in software development. I’ve always had a side interest in security and have always worked for very security-conscious companies. Early in my career, I became CISSP certified and that’s what got me kickstarted in security-specific domains and conversations. At AWS, being involved in security isn’t an optional thing. So, even before I joined the Identity Solutions team, I spent a lot of time working on identity and AWS Identity and Access Management (IAM) in particular, as well as AWS IAM Access Analyzer, while working with security-conscious customers in the financial services industry. As I got involved in that, I was able to dive deep in the security elements of AWS, but I’ve always had a background in security.

How do you explain your job to non-technical friends and family?

I typically tell them that I work in the cloud computing division at Amazon and that my job title is Solutions Architect. Naturally, the next question is, “what does a solutions architect do? I’ve never heard of that.” I explain that I work with customers to figure out how to put the building blocks together that we offer them. We offer a bunch of different services and features, and my job is to teach customers how they all work and interact with each other.

What are you currently working on that you’re excited about?

One of the things our team is working on is data perimeters. Our customers will see continued guidance on data perimeters. We’ve done a lot of work in this space—workshops and presentations at some of our big conferences, as well as blog posts and example repositories.

I’m also putting together some videos that go in depth on IAM policy evaluation and offer prescriptive guidance on writing IAM policies.

In your opinion, what’s one of the coolest things happening in identity right now?

I might be biased here, but I think there’s been a shift in the security industry at large from network-based perimeters in the traditional on-premises world to identity-based perimeters in the cloud. This is where the concept of data perimeters comes into play. Because your resources and identities are distributed, you can no longer look at your server and touch your server that’s sitting right next to you. This really puts an extra emphasis on your authentication and authorization controls, as well as the need for visibility into those controls. I think there’s a lot of innovation happening in the identity world because of this increased focus on identity perimeters. You’re hearing about concepts in this area like zero trust, data perimeters, and general identity awareness in all levels of the application and infrastructure stacks. You have services like IAM Access Analyzer to help give you that visibility into your AWS environment and what your identities are doing in terms of who can access what. I think we’ll continue to see growth in these areas because workloads are not becoming less distributed over time.

Tell me about something fun that you’ve done recently at AWS.

Roberto Migli and I presented a 400-level workshop at re:Invent 2022 on IAM policy evaluation, AWS Identity and Access Management (IAM) policy evaluation in action. This workshop introduced a new mental model for thinking about policy evaluation and walked attendees through a number of different policy evaluation scenarios. The idea behind the workshop is that we introduce a scenario and have the attendee try to figure out what the result of the evaluation would be. It spends some extra time comparing how the evaluation of resource-based policies differs from that of identity-based policies. I hope attendees walked away with a better understanding of how policy evaluations work at a deeper level and how they can write better, more secure IAM policies. We presented practical advice on how to structure different types of IAM policies and the different tradeoffs when writing a policy one way compared to another. I hope the mental model we introduced helps customers better reason about how policies will evaluate when they write them in their environment.

What is your favorite Amazon Leadership Principle and why?

This is an easy one. For me, it’s definitely Learn and Be Curious. Something I try to do is put myself in uncomfortable situations because I feel that when I’m uncomfortable, I’m learning and growing because it means I don’t know something. I find comfortable situations boring at times, so I’m always trying to dig in and learn how things work. This can sometimes be distracting, too, because there’s so much to learn and understand in the identity world.

What’s the thing you’re most proud of in your career?

There’s no particular project that I can point to and say, “this is what I’m most proud of.” I’m proud to be a part of the team I’m on now. For my team, Customer Obsession is more than just a slogan. We really advocate on behalf of the customer, listen to the voice of the customer, and push back on features that might not be the best thing for the customer. I think it’s awesome that I get to work for a company that really does advocate on behalf of the customer, and that my voice is heard when I’m trying to be that advocate. That aspect of working at AWS and with my team is what I’m most proud of.

I’m also proud of the mentoring and teaching that I get to do within AWS and within my role specifically. It’s really fulfilling to watch somebody grow and realize that career growth is not a zero-sum game—just because someone else succeeds does not mean that I have to fail.

If you had to pick an industry outside of security, what would you want to do?

I’d probably choose to be a ski instructor. I’m a big fan of skiing, but I don’t get to ski very often because of where I live. I love being out on the mountains, skiing, and teaching. I’m looking for any excuse to spend my days in the mountains.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Maddie Bacon

Maddie (she/her) is a technical writer for Amazon Security with a passion for creating meaningful content that focuses on the human side of security and encourages a security-first mindset. She previously worked as a reporter and editor, and has a BA in Mathematics. In her spare time, she enjoys reading, traveling, and staunchly defending the Oxford comma.

Author

Matt Luttrell

Matt is a Principal Solutions Architect on the AWS Identity Solutions team. When he’s not spending time chasing his kids around, he enjoys skiing, cycling, and the occasional video game.

SANS Institute uses Amazon QuickSight to drive transformational security awareness maturity within organizations

Post Syndicated from Carl R. Marrelli original https://aws.amazon.com/blogs/big-data/sans-institute-uses-amazon-quicksight-to-drive-transformational-security-awareness-maturity-within-organizations/

This is a guest post by Carl Marrelli from SANS Institute.

The SANS Institute is a world leader in cybersecurity training and certification. For over 30 years, SANS has worked with leading organizations to help ensure security across their organization, as well as with individual IT professionals who want to build and grow their security careers. We partner with over 500 organizations and support over 200,000 IT professionals with more than 90 technical training courses and over 40 professional (GIAC) certifications.

Our Security Awareness products include more than 70 instructional modules and have been deployed to over 6.5 million end-users to bring cybersecurity training to each employee within an organization.

As the Security Awareness department in particular began developing product strategies to deliver data-driven insights to customers, we were clear on using existing analytics services to rapidly build customer-facing analytics solutions. Building on a proven cloud provider would allow us to focus on our core expertise of helping organizations train, learn, and mature their programs instead of spending extra time and resources building and maintaining analytics from scratch.

We identified Amazon QuickSight, a fully managed, cloud-native business intelligence (BI) service, as the product that fit all our criteria. With it, we found an intuitive product with rich visualizations that we could build and grow with rapidly, allowing us to innovate without monetary risks or being locked in to cumbersome contracts. We considered other options, but they couldn’t support the licensing model that fit our needs.

In this post, we go over how we use QuickSight to serve our security customers.

Helping manage human risk with data-driven insights

SANS Security Awareness helps organizations use best-in-class security awareness and training solutions to transform their ability to measure and manage human risk. Security awareness programs are initiatives aimed at educating individuals about the importance of information security and the best practices for maintaining the confidentiality, integrity, and availability of information. We deliver expertly authored training materials to organizations, including computer-based video training sessions, interactive learning modules, supplemental materials, and reinforcement curriculum to keep security top-of-mind for all employees.

As organizations rapidly adopt and expand their use of digital technologies in their day-to-day work, the number of touchpoints with humans increases. As threat landscapes become increasingly more severe, managing human risk is critical to the success of the security program in any organization. Not only do organizations have to conduct security awareness training programs, but they also need insights into data and metrics that identify points of weakness to take data-driven corrective courses of action. As a leader in the space, we wanted to innovate by bringing relevant data-driven insights to our Security Awareness partners and customers in the journey to ensuring human-centered security across their organizations.

New data products to enhance and gamify risk assessment

We built one of our first insights products to support our Behavioral Risk Assessment. This service allows senior security and risk leaders to assess human risk with data handling, digital behavior, and compliance in an organization by individual, team, geography, business unit, and more. Leaders use the assessment to mature their security awareness capability with risk-informed interventions, identify process and procedure gaps, surface shadow IT, and reduce overall awareness training costs by focusing attention on the most important areas of risk.

Behavioral Risk Assessment dashboard with various charts

Delivered via a survey customized to the data types and risk profile of an organization, this assessment allows risk management leaders to more easily understand the data handling practices across roles and departments. Dashboards built in QuickSight empower stakeholders to quickly visualize what areas may need added attention by way of training intervention or updated policy.

Another product area where we invested in analytics to help organizations identify human risk is in gamified awareness training. The SANS Scavenger Hunt utilizes QuickSight in a unique way as a real-time game scoreboard. Players compete in the hunt while solving cybersecurity-related challenges, giving security teams a fun way to engage the workforce and promote good cyber behaviors.

Security Awareness challenge dashboard

The Scavenger Hunt was widely deployed during global Cybersecurity Awareness Month—a time for security awareness practitioners to shine a light on the purpose and mission of security awareness and also have a little fun. Typically, programs run during this time take place outside any regulated training cycle and are typically not delivered as mandatory training. This being the case, we identified dashboards as a way to gamify the experience to increase engagement among participants. These dashboards, built using QuickSight, provided users access to a leaderboard to not only track their own progress, but to also see how they compared to their fellow participants.

Building on the success of their experience with QuickSight and the Scavenger Hunt, we wanted to push the gamification and dashboards concept further so Chief Information and Security Officers (CISOs) and security teams could identify and mitigate the human side of ransomware risk. We developed Snack Attack!, a gamified learning experience that shows an organization how employees are performing in six key defensive areas where ransomware can be prevented. In 2021, over 80% of cyber breaches involved human error of some kind. Employees must have a fundamental awareness of cybersecurity and the ability to apply cyber knowledge within the scope of their jobs. Snack Attack! and QuickSight proved to be a great product to visualize and action on areas of human risk and sentiment for senior leadership.

A screenshot of a Snack Attack! dashboard

With Snack Attack!, we looked at Cybersecurity Awareness Month from the viewpoint of the awareness practitioner. The program itself focuses on driving engagement through an entertaining storyline with creative visuals. We chose to use the data from the training to help our customers build their awareness programs going forward. The dashboards included in Snack Attack! give the security awareness practitioner insights into the learned behavior of their users. Quick visualizations of learners’ scoring in Snack Attack! can act as an audit of the effectiveness of their existing program and provide a roadmap for future trainings.

Paving the way in using analytics for customer security

The SANS Institute brings together security awareness training programs with a metrics-based approach through out-of-the-box analytics dashboards so our customers can assess and manage human risk successfully. With QuickSight, we were able to rapidly innovate, developing valuable data products at a speed we could not have otherwise. Without up-front investments to get started and with the low cost to try with usage-based pricing, we were able to quickly ideate, build, and deploy customer-facing analytic products to drive security awareness within our customer organizations. Our analytics solutions differentiate us from existing enterprise products. With QuickSight, we are able to show organizations where they have cyber risk.

With the delivery of analytics solutions to customers, the SANS Institute is not only a top cybersecurity training, learning, and certification platform, but also a technology provider that helps customers use data and insights to make meaningful change in their organization. Moving forward, we have identified an expansion of QuickSight dashboards into our larger suite of assessments as the next logical step. Along with the Behavioral Risk Assessment, we offer Knowledge and Culture assessments to help security awareness practitioners better understand where and how to apply training and gauge the effectiveness of their programs. Because of the success we have had with QuickSight on our existing projects, we feel that similar dashboards can provide even more value to our customers.

To learn more about how QuickSight can help your business with dashboards, reports, and more, visit Amazon QuickSight.


About the Author

Carl R. Marrelli is the Director of Business Development and Digital Programs at SANS Institute. Based in Charlotte, NC, he has extensive experience in cross-functional team leadership, product management, and product marketing. Previously as Head of Product at SANS, Carl led the product management team for the Online Training and Security Awareness divisions through a significant growth period. Carl’s unique perspective and innovative ideas, support SANS as the company continues its mission to empower cybersecurity practitioners around the world.

New Global AWS Data Processing Addendum

Post Syndicated from Chad Woolf original https://aws.amazon.com/blogs/security/new-global-aws-data-processing-addendum/

Navigating data protection laws around the world is no simple task. Today, I’m pleased to announce that AWS is expanding the scope of the AWS Data Processing Addendum (Global AWS DPA) so that it applies globally whenever customers use AWS services to process personal data, regardless of which data protection laws apply to that processing. AWS is proud to be the first major cloud provider to adopt this global approach to help you meet your compliance needs for data protection.

The Global AWS DPA is designed to help you satisfy requirements under data protection laws worldwide, without having to create separate country-specific data processing agreements for every location where you use AWS services. By introducing this global, one-stop addendum, we are simplifying contracting procedures and helping to reduce the time that you spend assessing contractual data privacy requirements on a country-by-country basis.

If you have signed a copy of the previous AWS General Data Protection Regulation (GDPR) DPA, then you do not need to take any action and can continue to rely on that addendum to satisfy data processing requirements. AWS is always innovating to help you meet your compliance obligations wherever you operate. We’re confident that this expanded Global AWS DPA will help you on your journey. If you have questions or need more information, see Data Protection & Privacy at AWS and GDPR Center.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Chad Woolf

Chad joined Amazon in 2010 and built the AWS compliance functions from the ground up, including audit and certifications, privacy, contract compliance, control automation engineering and security process monitoring. Chad’s work also includes enabling public sector and regulated industry adoption of the AWS cloud and leads the AWS trade and product compliance team.

It’s the Amazon QuickSight Community’s 1st birthday!

Post Syndicated from Kristin Mandia original https://aws.amazon.com/blogs/big-data/its-the-amazon-quicksight-communitys-1st-birthday/

Happy birthday Amazon QuickSight Community! We are celebrating 1 year since the launch of our new Community. The Amazon QuickSight Community website is a one-stop-shop where business intelligence (BI) authors and developers from across the globe can ask and answer questions, stay up to date, network, and learn together about Amazon QuickSight. In this post, we celebrate the rapid growth of our first year of the QuickSight Community, discuss new Community features, and share voices from our Community. We also invite you to sign up for the QuickSight Community (it’s easy and free) and get started on your learning journey.

Happy 1st birthday Amazon QuickSight Community!

We’re growing strong

Since last year’s announcement of the new QuickSight Community, we are witnessing hundreds of thousands of visits to the QuickSight Community each month. Answers to thousands of searchable questions have been posted, and our online Learning Series is now running every week. An events calendar has been added, and hundreds of learning resources have been posted (how-to videos and articles, Getting Started resources, blog posts, and What’s New posts). Furthermore, a Community Manager has been brought on board. And Community Experts and QuickSight Solution Architects are posting robust answers in the Q&A, supported by AWS Professional Services, AWS Data Lab, and Amazon QuickSight Service Delivery Partners—led by West Loop Strategy. In addition, other QuickSight Partners and content creators are bringing new life to the Community.

QuickSgiht Community members at an Amazon conference

New QuickSight Community members at an internal Amazon conference

Why are we excited? Community members share their voices

As we celebrate year one, QuickSight Community members share their voices:

“The community has been an awesome space to exchange ideas, problems, and solutions, which has really helped every one of us in adopting QuickSight more effectively!” #thank you #community — Sagnik Mukherjee, Data and Analytics Architect, QuickSight Expert

“I am so excited for the QuickSight Learning Series! And I am happy to know that there is a supportive community…as I begin my QS journey!“ — Rachel Krentz, Configuration Analyst, Madaket

“The community has evolved so much in just one year!” — Darren Demicoli, BI Team Lead, The Mill Adventure, QuickSight Expert

“Happy Birthday QuickSight Community! I’m pretty new here but have found this to be a really great resource with answers to every question I have been able to come up with so far. Cheers!” — Brian Jager, Sales Operations Lead, Amazon Business

“I love being connected directly with other users and solutions architects…The QuickSight Community helps me out a ton not having to spend as much of my time on training when I can point new users to a resource where their questions have likely already been answered.” — Ryane Brady, Programmer Analyst, GROWMARK, Inc., QuickSight Expert

Hear more voices from our Community on our birthday post.

The QuickSight Community: Your one-stop-shop for BI learning

Here’s a quick tour of the QuickSight Community website as well as its new features.

The following figure shows the homepage view and everything that’s available.

A One-Stop-Shop for your Business Intelligence journey includes resource toolbar, searchable Q&A, learning resources, what’s new, blog, events and featured content

Learning Center: What’s new

In addition to growing a robust, searchable Q&A, in the last year, we’ve added tons of new learning resources to the QuickSight Community, including:

When you select Learning Center on the homepage, you can choose from these various resources.

Amazon QuickSight Learning Center – includes getting started resources, how to videos, webinar videos, articles, and other resources.

New: Learning events

In the last year, we’ve added an Events section to the QuickSight Community. We now offer weekly and monthly learning webinars as well as featured in-person and online events. Ongoing virtual events include:

New: Experts and user groups

We’ve been excited to see QuickSight user groups pop up over the past year, including one in the Twin Cities and one in Chicago—both hosted by QuickSight Service Delivery Partners (Charter Solutions and West Loop Strategy, respectively).

In addition, we’ve launched our QuickSight Experts program, honoring top question-answerers in the Community. These rock stars have relentlessly helped their peers take their learning to the next level.

QuickSight Community Expert

Ryane Brady, Programmer Analyst, GROWMARK, Inc., is a QuickSight Expert and top question-answerer in the Community. Here she is showing off her Expert swag. See the 2022 to Q1 2023 list of Experts at the bottom of this post.

First External Viz Challenge

In the last year, AWS hosted its very first external data visualization design competition to invited QuickSight Service Delivery Partners. This enabled QuickSight Partners to show off their visual analytics and storytelling skills. We were proud to feature the QuickSight Partner Viz Challenge winners of 2022 on the QuickSight Community.

Amazon QuickSight Service Delivery Partner Viz Challenge winners

Want to be a part of the QuickSight Community?

Sign up now to join the QuickSight Community (it’s free and easy) and take your QuickSight learning to the next level.

As part of the Community, you can:

  • Encourage your teams and colleagues to create QuickSight Community user profiles—signup up is easy
  • Click the bell icon in the top right corner of the page to get notifications and keep up to date with QuickSight news
  • Share your community ideas with the Community Manager

Come in on the ground floor and help us take the QuickSight Community to the next level:

  • Become an Expert and share your knowledge
  • Pioneer a user group online or in your area
  • Contribute a how-to article

If you’re interested in any of these opportunities, reach out to Kristin, the Sr. Online Community Manager for QuickSight, at [email protected].

We’re just getting started

Thank you to everyone who has helped the QuickSight Community grow over the last year! We can’t wait to see what this next year has in store!

Special thanks to our QuickSight Experts who make the QuickSight Community possible (listed alphabetically): Ryane Brady, Biswajit Dash, Darren Demicoli, Max Engelhard, Charles Greenacre, Naveed Hashimi, Todd Hoffman, Mike Khuns, Thomas Kotzurek, Sanjeeb Mohapatra, Sagnik Mukherjee and David Wong. Incredible gratitude to Lillie Atkins and Mia Heard, who launched the Community a year ago. Also, thanks to Jose Kunnackal John, Jill Florant, the QuickSight Solution Architects, Amazon QuickSight Service Delivery Partners—led by West Loop Strategy, AWS Service Acceleration team, AWS ProServe team, and AWS Data Lab team for your incredible investment in the QuickSight Community.


About the Authors

Kristin Mandia is Senior Online Community Manager for Amazon QuickSight, Amazon Web Service’s cloud-native, fully managed BI service.


Ian McNamara
is a Program Manager and writer on the Customer Success Team for Amazon QuickSight, Amazon Web Service’s cloud-native, fully managed BI service.

Showpad accelerates data maturity to unlock innovation using Amazon QuickSight

Post Syndicated from Shruthi Panicker original https://aws.amazon.com/blogs/big-data/showpad-accelerates-data-maturity-to-unlock-innovation-using-amazon-quicksight/

Showpad aligns sales and marketing teams around impactful content and powerful training, helping sellers engage with buyers and generate the insights needed to continuously improve conversion rates. In 2021, Showpad set forth the vision to use the power of data to unlock innovations and drive business decisions across its organization. Showpad’s legacy solution was fragmented and expensive, with different tools providing conflicting insights and lengthening time to insight. The company decided to use AWS to unify its business intelligence (BI) and reporting strategy for both internal organization-wide use cases and in-product embedded analytics targeted at its customers.

Showpad built new customer-facing embedded dashboards within Showpad eOSTM and migrated its legacy dashboards to Amazon QuickSight, a unified BI service providing modern interactive dashboards, natural language querying, paginated reports, machine learning (ML) insights, and embedded analytics at scale.

In this post, we share how Showpad used QuickSight to streamline data and insights access across teams and customers. Showpad migrated over 70 dashboards with over 1,000 visuals. They have rolled out the solution to all its 600 employees, increased dashboard development activity by three times, and reduced dashboard turnaround time from months to weeks. Showpad also launched dashboards and reports to over 1,300 customers worldwide, providing access to tens of thousands of users across all its customers.

Streamlining data-driven decisions by defragmenting the data and reporting architecture

Founded in 2011, dual headquartered in Belgium and Chicago and with offices around the world, Showpad provides a single destination for sales representatives to access all sales content and information, along with coaching and training tools to create informed, upskilled, and trusted buying teams. The platform also provides analytics and insights to support successful information sharing and fuel continuous improvement. In 2021, Showpad decided to take the next step in its data evolution and set forth the vision to power innovation, product decisions, and customer engagement using data-driven insights. This required Showpad to accelerate its data maturity as a company by mindfully using data and technology holistically to help its customers.

But the company’s legacy BI solution and data were fragmented across multiple tools, some with proprietary logic. “Each of these tools were getting data from a different place, and that’s where it gets difficult,” says Jeroen Minnaert, head of data at Showpad. “If each tool tells a different story because it has different data, we won’t have alignment within the business on what this data means.” Showpad also struggled with data quality issues in terms of consistency, ownership, and insufficient data access across its targeted user base due to a complex BI access process, licensing challenges, and insufficient education.

Showpad wanted to unify all the data into a single unified interface through a data lake, democratize that data through a BI solution such that autonomous teams across the company could effectively use data, and drive and unlock innovation in the company through advanced insights data, artificial intelligence, and ML. The company already used AWS in other aspects of its business and found that QuickSight would not only meet all its BI and reporting needs with seamless integrations into the AWS stack, but also bring with it several unique benefits unlike incumbents and other tools evaluated. “We chose QuickSight because of its embedded analytic capabilities, serverless architecture, and consumption-based pricing,” says Minnaert. “Using QuickSight to launch interactive dashboards and reporting for our customers, along with the ability for our customer success teams to create or alter dashboard prototypes on our internal-facing QuickSight instance and then promote those dashboards to customers through the product, was a very compelling use case.”

QuickSight would help local data stewards, who weren’t technical but knew the use cases intimately, to create their own dashboards and prototype them with their customers before promoting them through the product. “The serverless model was also compelling because we did not have to pay for server instances nor license fees per reader. With QuickSight, we pay for usage. This makes it easy for us to provide access to everyone by default. This is a key pillar in our ability to democratize data,” says Minnaert.

A screenshot of Showpad's Platform/Adoption dashboard

Architecting a portable data layer and migrating to QuickSight to accelerate time to value

After choosing QuickSight as its solution in November 2021, Showpad took on two streams of development: migrating internal organization-wide BI reporting and building in-product reporting using embedded analytics. Showpad worked closely alongside the QuickSight team to have a smooth rollout. The company involved members of the QuickSight team in its internal communications and set up meetings every 2 weeks to resolve difficult issues quickly.

On the internal reporting front, the data team took a “working backwards” approach to make sure it had the right approach before going all in with all existing dashboards. Showpad selected five difficult and complex dashboards and took 3 months to explore various possibilities using QuickSight. For example, the team determined that user log-in would be automated with single sign-on and Okta and built a plan for assets organization, asset promotion, and access controls. The company also used the opportunity to reimagine its data pipeline and architecture. A key architectural decision that Showpad took during this time was to create a portable data layer by decoupling the data transformation from visualization, ML, or ad hoc querying tools and centralizing its business logic. The portable data layer facilitated the creation of data products for varied use cases, made available within various tools based on the need of the consumer—be it a business analyst, data scientist, or business user.

After the solution approach was decided on and the foundation was built, the team wanted to scale. However, with 70 dashboards with over 1,000 visuals with varying levels of complexities, including proprietary logic unique to certain tools, and data from over 1,000 tables ingesting data from over 20 data sources, the team decided to take a measured approach to the migration. The entire data team of 20 people were “all on hands on deck” for the project.

First, Showpad created a landscape of all the dashboards, connecting data sources and dependencies before prioritizing the migration order. The company decided to start with dashboards with the fewest dependencies, like product and engineering dashboards that had a single data source, followed by revenue operations dashboards with a couple of data sources, and lastly with customer success and marketing dashboards that combined product and engineering and revenue operations data. After migration was complete, Showpad validated all the numbers and worked with business stakeholders for quality assurance. It launched the first set of dashboards in April 2022, followed by customer success and marketing dashboards in July 2022. As of January 2023, Showpad’s QuickSight instance includes over 2,433 datasets and 199 dashboards.

Showpad also achieves benefits for its customers by using QuickSight to deliver a wide variety of insights, including usage dashboards, industry comparisons, user comparison, group comparison, and revenue attribution. On the second workstream of in-product customer-facing reporting, Showpad released its first version of QuickSight reporting to customers in June 2022. “We went through user research, development, and beta tests in a span of 6 months, which was a fast turnaround and a big win for us,” says Minnaert. And Showpad aims to further accelerate the turnaround time to make a report and ship it to a customer.

With the foundational architecture now in place, shipping to a customer can happen in a few sprints, with most of the time spent on iterating and fine-tuning insights instead of engineering a scalable reporting solution. Showpad can also improve the insights it offers to customers. “Using QuickSight in our product makes it a powerful tool,” says Minnaert. “We can launch reporting for our customers so that they can look at and interpret data by themselves.” Showpad can then follow up with tailor-made reporting for each customer using the same data so that it tells a consistent story.

After a dashboard is agreed on, the dashboard can go through Showpad’s automated dashboard promotion process that can take an idea from development to production to a smile on a customer’s face in weeks, not months.

A screenshot of their Shared Space engagement dashboard

Unlocking innovation with self-service BI and rapid prototyping

By providing dashboard and report building to analysts and nontechnical users, Showpad drastically reduced overall turnaround time to build and deliver insights, down from months to weeks. Showpad also increased dashboard development activity by three times across the organization.

Showpad users can quickly prototype reports in a well-known environment—building reports using QuickSight, and then testing them with customers by helping customers understand how the reporting would look and function. “After we settle on reports or dashboards, it does not take much engineering effort to bring them to production,” says Minnaert. “We can make a lot of innovation happen by quickly prototyping and bringing validated prototypes to production.” Showpad’s users and customers also benefit from performance gains with 10 times increased speed when using SPICE (Super-fast, Parallel, In-memory Calculation Engine), which is the robust in-memory engine that QuickSight uses. It takes only seconds to load dashboards.

Because QuickSight is serverless and uses a session-based pricing model, Showpad expects to see cost savings. By paying per use, Showpad can easily provide access to all its users and customers without purchasing expensive per-reader licenses. Showpad also doesn’t need to pay for server instances or maintain infrastructure for BI. In addition, Showpad can deprecate custom reporting, infrastructure, and multiple tools with the new data architecture and QuickSight. “Much of our cost savings will come from being able to deprecate the custom reporting that we’ve made in the past,” says Minnaert. “The custom reporting used a lot of infrastructure that we no longer need to maintain.” Showpad expects to see a three times increase in projected return on investment in the upcoming year.

Showpad completed its internal BI migration to QuickSight by the end of 2022. Showpad also continues to expand the in-product reporting while continuing to optimize performance for the best customer experience. Showpad hopes to further reduce the time it takes to load a dashboard to under 1 second.

In conjunction with Showpad’s new portable data layer, QuickSight helps users of all types across its organization and customers self-serve data and insights rapidly. Everyone in Showpad gets access to data and insights the day they onboard with Showpad. To make self-service even easier, Showpad will soon launch embedded Amazon QuickSight Q so anyone can ask questions in natural language and receive accurate answers with relevant visualizations that help them gain insights from the data. By helping business users and experts rapidly prototype dashboards and reports in line with user and customer needs, Showpad uses the power of data to unlock innovation and drive growth across its organization. “QuickSight has become our go-to tool for any BI requirement at Showpad—both internal and external customer facing, especially when it comes to correlating data across departments and business units,” says Minnaert.

Get started with QuickSight

Migrating to QuickSight enabled Showpad to streamline data and insights access across teams and customers and reduced overall turnaround time to build and deliver insights from months to weeks.

Learn more about unleashing your organization’s ability to accelerate revenue growth with Showpad. To learn more about how QuickSight can help your business with dashboards, reports, and more, visit Amazon QuickSight.


About the Author

Shruthi Panicker is a Senior Product Marketing Manager with Amazon QuickSight at AWS. As an engineer turned product marketer, Shruthi has spent over 15 years in the technology industry in various roles from software engineering, to solution architecting to product marketing. She is passionate about working at the intersection of technology and business to tell great product stories that help drive customer value.

Create threshold alerts on tables and pivot tables in Amazon QuickSight

Post Syndicated from Lillie Atkins original https://aws.amazon.com/blogs/big-data/create-threshold-alerts-on-tables-and-pivot-tables-in-amazon-quicksight/

Amazon QuickSight previously launched threshold alerts on KPIs and gauge charts. Now, QuickSight supports creating threshold alerts on tables and pivot tables—our most popular visual types. This allows readers and authors to track goals or key performance indicators (KPIs) and be notified via email when they are met. These alerts allow readers and authors to relax and rely on notifications for when their attention is needed. In this post, we share how to create threshold alerts on tables or pivot tables to track important metrics.

Background information

Threshold alerts are a QuickSight Enterprise Edition feature and available for dashboards consumed on the QuickSight website. Threshold alerts aren’t yet available in embedded QuickSight dashboards or on the mobile app.

Alerts are created based on the visual at that point in time and are not affected by potential future changes to the visual’s design. This means the visual can be changed or deleted and the alert continues to work as long as the data in the dataset remains valid. In addition, you can create multiple alerts off of one visual, and rename them as appropriate.

Finally, alerts respect RLS and CLS rules.

Set up an alert on a table or pivot table

Threshold alerts are configured for dashboards. On a dashboard, there are three different ways to create an alert on a table or pivot table.

First, you can create directly from a pivot table or table. You click directly on the cell you would like to create an alert on (if there is another action enabled, you may have to right-click to get this option to show). This needs to be on a numeric value (no dates or strings allow for creation of alerts). Then choose Create Alert to start creating the alert.

Let’s assume you want to track the profit coming from online purchases for auto-related merchandise being shipped first class. Choose the appropriate cell and then choose Create Alert.

Create Alert

You’re presented with the creation pane for alerts. The only difference from KPIs or gauge visual alerts is that here you’ll find the other dimensions in the row that you’re creating the alert on. This will help you identify what value from the table you have selected, because there can be duplicates of the numeric values.

In the following screenshot, the value to track is profit, which currently is $437.39. This is the value that will be compared to the threshold you set. You will also see the dimensions being used to define this alert, which are taken from the row of the table. In this case, the Category is Auto, the Segement is Online, and the Ship Mode is First Class.

Now that you have checked that the value is correct, you can update the name of the alert that is automatically filled with the name of the visual it is created off of, set the condition (between Is above, Is below, and Is equal to), and pick the threshold value, notification frequency, and whether you want to be emailed when there is no data.

In the following example, the alert has been configured so that you will receive an email when the profit is above the threshold of $1,000. You’ve also left the notification frquency at Daily at most and haven’t requested to be emailed when there is no data.

If you have a date field, you also will see an option to control the date. This will automatically set the date field to be the most recent of whatever aggregation you’re looking at, such as hour, week, day, month. However, you could override to use the specific date applied to the value you have selected if you would prefer.

Below is an example where the data was aggregated based on the week and so Latest Week has been selected rather than the historical Week of Jan 4, 2015.

You can then choose Save if you’re happy with the alert and it will load the Manage alert pane.

The Create Alert button is also at the bottom of the pane. This is the second way you can start creating an alert off of a table or pivot table.

You can also get to this pane from the upper right alert button on the dashboard.

Create Alert through the icon on dashboard

If you have no alerts, this will automatically drop you into the creation pane. There you will be asked to select a visual that supports alerts to begin creating an alert. If you already have alerts (as previously demonstrated), then all you need to do is choose Create Alert.

Then select a visual and choose Next.

You’re prompted to select a cell if you have picked a table or pivot table visual.

Then you repeat the same steps as creating off a cell within a table or pivot table.

Finally, you can start creating an alert from the bell icon on the pivot table or table. This is the third way to create an alert.

bell icon

You’ll be prompted to select a cell from the table, and the creation pane appears.

After you choose the cell that you want to track, you start the creation process just like the first two examples.

Update and delete alerts

To update or delete an alert, you need to navigate back to the Manage alerts pane. You get there from the bell icon on the top right corner of the dashboard.

Create Alert through the icon on dashboard

You can then choose the options menu (three dots) on the alert you want to manage. Three options appear: Edit alert, View history (to view recent times the alert has breached and notified you), and Delete.

Notifications

You’ll receive an email when your alert breaches the rule you set. The following is an example of what that looks like (the alert has been adjusted to be alerted if profit is over $100 and to be notified as frequently as possible).

notification if alert is breached

The current profit breach is highlighted and the historical numbers are shown along with the date and time of the recorded breaches. You can also navigate to the dashboard by choosing View Dashboard.

Evaluate alerts

The evaluation schedule for threshold alerts is based on the dataset. For SPICE datasets, alert rules are checked against the data after a successful data refresh. With datasets querying your data sources directly, alerts are evaluated daily at a random time between 6:00 PM and 8:00 AM. This is based on the the timezone of the AWS Region your dataset was created in. Dataset owners can set up their own schedules for checking alerts and increase the frequency up to hourly (to learn more, refer to Working with threshold alerts in Amazon QuickSight).

Restrict alerts

The admin for the QuickSight account can restrict who has access to set threshold alerts through custom permissions. For more information, see the section Customizing user permissions in Embed multi-tenant analytics in applications with Amazon QuickSight.

Pricing

Threshold alerts are billed for each evaluation, and follow the familiar pricing used for anomaly detection, starting at $0.50 per 1,000 evaluations. For example, if you set up an alert on a SPICE dataset that refreshes daily, you have 30 evaluations of the alert rule in a month, which costs 30 * $0.5/1000 = $0.015 in a month. For more information, refer to Amazon QuickSight Pricing.

Conclusion

In this post, you learned how to create threshold alerts on tables and pivot tables within QuickSight dashboards so that you can track important metrics. For more information about how to create threshold alerts on KPIs or gauge charts, refer to Create threshold-based alerts in Amazon QuickSight. Additional information is available in the Amazon QuickSight User Guide.


About the Author

Lillie Atkins is a Product Manager for Amazon QuickSight, Amazon Web Service’s cloud-native, fully managed BI service.

The National Intelligence Center of Spain and AWS collaborate to promote public sector cybersecurity

Post Syndicated from Borja Larrumbide original https://aws.amazon.com/blogs/security/the-national-intelligence-center-of-spain-and-aws-collaborate-to-promote-public-sector-cybersecurity/

Spanish version »

The National Intelligence Center and National Cryptological Center (CNI-CCN)—attached to the Spanish Ministry of Defense—and Amazon Web Services (AWS) have signed a strategic collaboration agreement to jointly promote cybersecurity and innovation in the public sector through AWS Cloud technology.

Under the umbrella of this alliance, the CNI-CCN will benefit from the help of AWS in defining the security roadmap of public administrations with different maturity levels and help the CNI-CCN comply with their security requirements and the National Security Scheme.

In addition, CNI-CCN and AWS will collaborate in various areas of cloud security. First, they will work together on cybersecurity training and awareness-raising focused on government institutions, through education, training, and certification programs for professionals who are experts in technology and security. Second, both organizations will collaborate in creating security guidelines for the use of cloud technology, incorporating the shared security model in the AWS cloud and contributing their experience and knowledge to help organizations with the challenges they face. Third, CNI-CCN and AWS will take part in joint events that demonstrate best practices for the deployment and secure use of the cloud, both for public and private sector organizations. Finally, AWS will support cybersecurity operations centers that are responsible for surveillance, early warning, and response to security incidents in public administrations.

Today, AWS has achieved certification in the National Security Scheme (ENS) High category and has been the first cloud provider to accredit several security services in the CNI-CCN’s STIC Products and Services catalog (CPSTIC), meaning that its infrastructure meets the highest levels of security and compliance for state agencies and public organizations in Spain. All of this gives Spanish organizations, including startups, large companies, and the public sector, access to AWS infrastructure that allows them to make use of advanced technologies such as data analysis, artificial intelligence, databases, Internet of Things (IoT), machine learning, and mobile or serverless services, to promote innovation.

In addition, the new cloud infrastructure of AWS in Spain, the AWS Europe (Spain) Region, allows customers who have data residency requirements to store their content in Spain, with the assurance that they maintain full control over the location of their data. This is a critical element for those who have data residency requirements. The launch of the AWS Europe (Spain) Region provides customers building applications that comply with General Data Protection Regulation (GDPR) access to another secure AWS Region in the European Union (EU) that helps meet the highest levels of security, compliance, and data protection. AWS is also Esquema Nacional de Seguridad (ENS) High certified, meaning its infrastructure meets the highest levels of security and compliance for government agencies and public organizations in Spain.
 


El Centro Nacional de Inteligencia de España y AWS colaboran para promover la ciberseguridad en el sector público

El Centro Nacional de Inteligencia y Centro Criptológico Nacional (CNI-CCN)–adscrito al Ministerio de Defensa de España- y Amazon Web Services (AWS) han firmado un acuerdo de colaboración estratégica para impulsar de forma conjunta la ciberseguridad y la innovación del sector público a través de la tecnología en la Nube de AWS.

Bajo el paraguas de esta alianza, el CNI-CCN se beneficiará de la ayuda de AWS en definir la hoja de ruta de seguridad de las administraciones públicas, con distintos niveles de madurez y ayudar al CNI-CCN en el cumplimiento de sus requisitos de seguridad y el Esquema Nacional de Seguridad.

Además, CNI-CCN y AWS colaborarán en diversos ámbitos en materia de seguridad en la nube. En primer lugar, trabajarán conjuntamente en formación y concienciación en ciberseguridad enfocadas a instituciones gubernamentales, a través de programas de educación, capacitación y certificación de profesionales expertos en tecnología y seguridad. En segundo lugar, ambas organizaciones colaborarán en la creación de guías de seguridad para el uso de tecnología en la nube, incorporando el modelo de seguridad compartida en la nube de AWS y aportando su experiencia y conocimiento para ayudar a las organizaciones con los desafíos a los que se enfrentan. En tercer lugar, CNI-CCN y AWS participarán en eventos conjuntos que demuestren las mejores prácticas de despliegue y uso seguro de la nube, tanto para organizaciones del sector público como privado. Finalmente, AWS apoyará a los centros de operaciones de ciberseguridad encargados de la vigilancia, alerta temprana y respuesta a incidentes de seguridad en las administraciones públicas.

Hoy AWS cuenta con la certificación del Esquema Nacional de Seguridad (ENS) categoría Alta y ha sido el primer proveedor de la nube en acreditar varios servicios de seguridad en el catálogo de Productos y Servicios STIC (CPSTIC) del CNI-CCN, los cual significa que su infraestructura cumple con los más altos niveles de seguridad y cumplimiento para agencias estatales y organizaciones públicas en España. Todo ello concede a las organizaciones españolas, incluyendo a startups, grandes empresas, así como al sector público, acceso a infraestructura de AWS que les permita hacer uso de tecnologías avanzadas como análisis de datos, inteligencia artificial, bases de datos, Internet de las Cosas (IoT), aprendizaje automático, y servicios móviles o serverless, para impulsar la innovación.

Además, la nueva infraestructura de nube de AWS en España, la Región AWS Europa (España) permite a los clientes almacenar su contenido en España, con la seguridad de que mantienen el control total sobre la localización de sus datos. Esto es un elemento crítico para quienes tienen requisitos de residencia de datos. Los clientes que desarrollan aplicaciones en cumplimiento con el Reglamento General de Protección de Datos (RGPD) tendrán acceso a otra región de infraestructura segura de AWS en la Unión Europea (UE), respetando los más altos estándares de seguridad, cumplimiento normativo y protección de datos. Hoy AWS también cuenta con la certificación del Esquema Nacional de Seguridad (ENS) categoría Alta, lo cual significa que su infraestructura cumple con los más altos niveles de seguridad y cumplimiento para agencias estatales y organizaciones públicas en España.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Borja Larrumbide

Borja Larrumbide

Borja is a Security Assurance Manager for AWS in Spain and Portugal. He received a bachelor’s degree in Computer Science from Boston University (USA). Since then, he has worked in companies such as Microsoft and BBVA, where he has performed in different roles and sectors. Borja is a seasoned security assurance practitioner with many years of experience engaging key stakeholders at national and international levels. His areas of interest include security, privacy, risk management, and compliance.

Deep Pool boosts software quality control using Amazon QuickSight

Post Syndicated from Shruthi Panicker original https://aws.amazon.com/blogs/big-data/deep-pool-boosts-software-quality-control-using-amazon-quicksight/

Deep Pool Financial Solutions, an investor servicing and compliance solutions supplier, was looking to build key performance indicators to track its software tests, failures, and successful fixes to pinpoint the specific areas for improvement in its client software. Deep Pool was unable to access the large amounts of data that its project management software provided, so it used AWS to access, manage, and analyze that data more efficiently.

During a larger migration to the AWS Cloud, the company discovered Amazon QuickSight, a cloud-native, serverless business intelligence (BI) service that powers interactive dashboards that let companies make better data-driven decisions. With QuickSight, Deep Pool could democratize access to this unused data and pinpoint areas for improvement in its software development processes, thereby improving the overall quality of its software.

According to Brett Promisel, Chief Operating Officer for Deep Pool, the company wanted to manage the data that it was collecting from a BI point of view to help it make more informed decisions. Because word of mouth and high-quality software are critical in Deep Pool’s industry, the company wanted to add additional rigorous quality controls to its product development and testing so that it continues to provide top-notch, stable software that its clients can rely on.

In this post, we share how Deep Pool boosted its software quality control using QuickSight.

Enhancing software testing to with data-driven insights

Continuous improvement is a hallmark of leading organizations. Deep Pool wanted achieve greater software quality and decided to improve how it monitored and managed its software testing processes using data.

Typical development processes involve extensive testing. First, the original developer tests the code, and then the code is unit tested. Next, larger groups test the code. For all this testing to be successful and result in product improvement, it needs to be measured and tracked so that developers can learn from it and implement improvements during the development process.

Using QuickSight for data-driven insights, Deep Pool has implemented significant software testing and control. It can now count the number of bugs found or tests failed and time how long it took to create patches and repair issues. It can also better track its work backlog and the progress of functionality requests. Monitoring this information lets the company know that it has successfully implemented improvements, because a decreasing number of bugs over time is a strong indicator of quality control.

Monitoring development to increase efficiency

Better software test management benefits two groups: internal teams and external customers. Deep Pool can now log and communicate important information, such as when a request is made, how it’s being resolved or addressed, and how it’s being tested. In addition to helping internal teams streamline their processes, the company can also use this data to track communications with customers, which are also stored in its project management software. Such knowledge helps the company determine whether customer requests are being promptly addressed and identify common trends that require action on a larger scale.

Seven development teams at Deep Pool independently write the code for the components of the company’s software products, and they must integrate those components to create the final products. With the granularity of the data that is provided by QuickSight, Deep Pool can thoroughly analyze the development and testing of these products. The company now has the ability to trace software bugs down to their original coding, which makes it simple to quickly locate and address any issues that come up. Deep Pool can also measure the results of those mitigating actions and determine whether its repairs were successful.

Attention to detail helps Deep Pool improve software quality, leading to better products and customers who are more likely to give positive referrals.

“Amazon QuickSight is extremely valuable to us when performing quality control. We can now expose our whole development team to how we’re managing databases and servers and measuring performance in a more optimal way,” says Promisel.

Deep Pool has successfully proven that it can use QuickSight to measure its quality control to improve its software and, ultimately, better support its customers. Since the migration to AWS, the number of software issues discovered and logged has dropped by 57%.

The increased quality control that has come from the company’s focus on accessing all its data and optimizing its use has led to better efficiency, which results in the ability to expand its growth without increasing its costs.

“We should be able to increase our customer base without adding the equivalent costs. Making sure we are as efficient as possible lets me manage that way,” says Promisel.

Expanding into more data sources

Deep Pool is also exploring how to expand its use of QuickSight to extract and use data from even more of its databases. In the future, it hopes to analyze its internal metrics, such as sales data, and its external client-related information, such as assets and holdings, to guide how it builds its client software and provides even more custom products.

Deep Pool is also committed to helping its employees be innovative and successful by investing in their futures and skill sets. It understands that well-trained employees can optimize the use of their tools, which results in better products. As such, the company will continue to invest in the training offered by AWS. Using cutting-edge tools and promoting its intent to invest in its employees indicate to Deep Pool’s customers that the company plans to stay innovative and ahead of the technological curve.

To learn more about how QuickSight can help your business with dashboards, reports, and more, visit Amazon QuickSight.


About the Author

Shruthi Panicker is a Senior Product Marketing Manager with Amazon QuickSight at AWS. As an engineer turned product marketer, Shruthi has spent over 15 years in the technology industry in various roles from software engineering, to solution architecting to product marketing. She is passionate about working at the intersection of technology and business to tell great product stories that help drive customer value.

How to use Amazon Macie to reduce the cost of discovering sensitive data

Post Syndicated from Nicholas Doropoulos original https://aws.amazon.com/blogs/security/how-to-use-amazon-macie-to-reduce-the-cost-of-discovering-sensitive-data/

Amazon Macie is a fully managed data security service that uses machine learning and pattern matching to discover and help protect your sensitive data, such as personally identifiable information (PII), payment card data, and Amazon Web Services (AWS) credentials. Analyzing large volumes of data for the presence of sensitive information can be expensive, due to the nature of compute-intensive operations involved in the process.

Macie offers several capabilities to help customers reduce the cost of discovering sensitive data, including automated data discovery, which can reduce your spend with new data sampling techniques that are custom-built for Amazon Simple Storage Service (Amazon S3). In this post, we will walk through such Macie capabilities and best practices so that you can cost-efficiently discover sensitive data with Macie.

Overview of the Macie pricing plan

Let’s do a quick recap of how customers pay for the Macie service. With Macie, you are charged based on three dimensions: the number of S3 buckets evaluated for bucket inventory and monitoring, the number of S3 objects monitored for automated data discovery, and the quantity of data inspected for sensitive data discovery. You can read more about these dimensions on the Macie pricing page.

The majority of the cost incurred by customers is driven by the quantity of data inspected for sensitive data discovery. For this reason, we will limit the scope of this post to techniques that you can use to optimize the quantity of data that you scan with Macie.

Not all security use cases require the same quantity of data for scanning

Broadly speaking, you can choose to scan your data in two ways—run full scans on your data or sample a portion of it. When to use which method depends on your use cases and business objectives. Full scans are useful when customers have identified what they want to scan. A few examples of such use cases are: scanning buckets that are open to the internet, monitoring a bucket for unintentionally added sensitive data by scanning every new object, or performing analysis on a bucket after a security incident.

The other option is to use sampling techniques for sensitive data discovery. This method is useful when security teams want to reduce data security risks. For example, just knowing that an S3 bucket contains credit card numbers is enough information to prioritize it for remediation activities.

Macie offers both options, and you can discover sensitive data either by creating and running sensitive data discovery jobs that perform full scans on targeted locations, or by configuring Macie to perform automated sensitive data discovery for your account or organization. You can also use both options simultaneously in Macie.

Use automated data discovery as a best practice

Automated data discovery in Macie minimizes the quantity of data scanning that is needed to a fraction of your S3 estate.

When you enable Macie for the first time, automated data discovery is enabled by default. When you already use Macie in your organization, you can enable automatic data discovery in the management console of the Amazon Macie administrator account. This Macie capability automatically starts discovering sensitive data in your S3 buckets and builds a sensitive data profile for each bucket. The profiles are organized in a visual, interactive data map, and you can use the data map to identify data security risks that need immediate attention.

Automated data discovery in Macie starts to evaluate the level of sensitivity of each of your buckets by using intelligent and fully managed data sampling techniques to minimize the quantity of data scanning needed. During evaluation, objects are organized with similar S3 metadata, such as bucket names, object-key prefixes, file-type extensions, and storage class, into groups that are likely to have similar content. Macie then selects small, but representative, samples from each identified group of objects and scans them to detect the presence of sensitive data. Macie has a feedback loop that uses the results of previously scanned samples to prioritize the next set of samples to inspect.

The automated sensitive data discovery feature is designed to detect sensitive data at scale across hundreds of buckets and accounts, which makes it easier to identify the S3 buckets that need to be prioritized for more focused scanning. Because the amount of data that needs to be scanned is reduced, this task can be done at fraction of the cost of running a full data inspection across all your S3 buckets. The Macie console displays the scanning results as a heat map (Figure 1), which shows the consolidated information grouped by account, and whether a bucket is sensitive, not sensitive, or not analyzed yet.

Figure 1: A heat map showing the results of automated sensitive data discovery

Figure 1: A heat map showing the results of automated sensitive data discovery

There is a 30-day free trial period when you enable automatic data discovery on your AWS account. During the trial period, in the Macie console, you can see the estimated cost of running automated sensitive data discovery after the trial period ends. After the evaluation period, we charge based on the total quantity of S3 objects in your account, as well as the bytes that are scanned for sensitive content. Charges are prorated per day. You can disable this capability at any time.

Tune your monthly spend on automated sensitive data discovery

To further reduce your monthly spend on automated sensitive data, Macie allows you to exclude buckets from automated discovery. For example, you might consider excluding buckets that are used for storing operational logs, if you’re sure they don’t contain any sensitive information. Your monthly spend is reduced roughly by the percentage of data in those excluded buckets compared to your total S3 estate.

Figure 2 shows the setting in the heatmap area of the Macie console that you can use to exclude a bucket from automated discovery.

Figure 2: Excluding buckets from automated sensitive data discovery from the heatmap

Figure 2: Excluding buckets from automated sensitive data discovery from the heatmap

You can also use the automated data discovery settings page to specify multiple buckets to be excluded, as shown in Figure 3.

Figure 3: Excluding buckets from the automated sensitive data discovery settings page

Figure 3: Excluding buckets from the automated sensitive data discovery settings page

How to run targeted, cost-efficient sensitive data discovery jobs

Making your sensitive data discovery jobs more targeted make them more cost-efficient, because it reduces the quantity of data scanned. Consider using the following strategies:

  1. Make your sensitive data discovery jobs as targeted and specific as possible in their scope by using the Object criteria settings on the Refine the scope page, shown in Figure 4.
    Figure 4: Adjusting the scope of a sensitive data discovery job

    Figure 4: Adjusting the scope of a sensitive data discovery job

    Options to make discovery jobs more targeted include:

    • Include objects by using the “last modified” criterion — If you are aware of the frequency at which your classifiable S3-hosted objects get modified, and you want to scan the resources that changed at a particular point in time, include in your scope the objects that were modified at a certain date or time by using the “last modified” criterion.
    • Don’t scan CloudTrail logs — Identify the S3 bucket prefixes that contain AWS CloudTrail logs and exclude them from scanning.
    • Consider using random object sampling — With this option, you specify the percentage of eligible S3 objects that you want Macie to analyze when a sensitive data discovery job runs. If this value is less than 100%, Macie selects eligible objects to analyze at random, up to the specified percentage, and analyzes the data in those objects. If your data is highly consistent and you want to determine whether a specific S3 bucket, rather than each object, contains sensitive information, adjust the sampling depth accordingly.
    • Include objects with specific extensions, tags, or storage size — To fine tune the scope of a sensitive data discovery job, you can also define custom criteria that determine which S3 objects Macie includes or excludes from a job’s analysis. These criteria consist of one or more conditions that derive from properties of S3 objects. You can exclude objects with specific file name extensions, exclude objects by using tags as the criterion, and exclude objects on the basis of their storage size. For example, you can use a criteria-based job to scan the buckets associated with specific tag key/value pairs such as Environment: Production.
  2. Specify S3 bucket criteria in your job — Use a criteria-based job to scan only buckets that have public read/write access. For example, if you have 100 buckets with 10 TB of data, but only two of those buckets containing 100 GB are public, you could reduce your overall Macie cost by 99% by using a criteria-based job to classify only the public buckets.
  3. Consider scheduling jobs based on how long objects live in your S3 buckets. Running jobs at a higher frequency than needed can result in unnecessary costs in cases where objects are added and deleted frequently. For example, if you’ve determined that the S3 objects involved contain high velocity data that is expected to reside in your S3 bucket for a few days, and you’re concerned that sensitive data might remain, scheduling your jobs to run at a lower frequency will help in driving down costs. In addition, you can deselect the Include existing objects checkbox to scan only new objects.
    Figure 5: Specifying the frequency of a sensitive data discovery job

    Figure 5: Specifying the frequency of a sensitive data discovery job

  4. As a best practice, review your scheduled jobs periodically to verify that they are still meaningful to your organization. If you aren’t sure whether one of your periodic jobs continues to be fit for purpose, you can pause it so that you can investigate whether it is still needed, without incurring potentially unnecessary costs in the meantime. If you determine that a periodic job is no longer required, you can cancel it completely.
    Figure 6: Pausing a scheduled sensitive data discovery job

    Figure 6: Pausing a scheduled sensitive data discovery job

  5. If you don’t know where to start to make your jobs more targeted, use the results of Macie automated data discovery to plan your scanning strategy. Start with small buckets and the ones that have policy findings associated with them.
  6. In multi-account environments, you can monitor Macie’s usage across your organization in AWS Organizations through the usage page of the delegated administrator account. This will enable you to identify member accounts that are incurring higher costs than expected, and you can then investigate and take appropriate actions to keep expenditure low.
  7. Take advantage of the Macie pricing calculator so that you get an estimate of your Macie fees in advance.

Conclusion

In this post, we highlighted the best practices to keep in mind and configuration options to use when you discover sensitive data with Amazon Macie. We hope that you will walk away with a better understanding of when to use the automated data discovery capability and when to run targeted sensitive data discovery jobs. You can use the pointers in this post to tune the quantity of data you want to scan with Macie, so that you can continuously optimize your Macie spend.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on Amazon Macie re:Post.

Want more AWS Security news? Follow us on Twitter.

Nicholas Doropoulos

Nicholas Doropoulos

Nicholas is an AWS Cloud Security Engineer, a Bestselling Udemy Instructor, and a subject matter expert in AWS Shield, GuardDuty and Certificate Manager. Outside work, he enjoys spending his time with his wife and their beautiful baby son.

Koulick Ghosh

Koulick Ghosh

Koulick is a Senior Product Manager in AWS Security based in Seattle, WA. He loves speaking with customers on how AWS Security services can help make them more secure. In his free-time, he enjoys playing the guitar, reading, and exploring the Pacific Northwest.

New AWS Security Blog homepage

Post Syndicated from Anna Brinkmann original https://aws.amazon.com/blogs/security/new-aws-security-blog-homepage/

We’ve launched a new AWS Security Blog homepage! While we currently have no plans to deprecate our existing list-view homepage, we have recently launched a new, security-centered homepage to provide readers with more blog info and easy access to the rest of AWS Security. Please bookmark the new page, and let us know what you think, by adding a comment, below.

Thumbnail view of the new page

The new AWS Security Blog homepage

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Author

Anna Brinkmann

Anna is a technical editor and writer, and she manages the AWS Security Blog. She enjoys creating helpful content and running short, streamlined meetings. In her free time, you can find her hanging out with her son, reading, and cooking with her air fryer.

Ivy Lin

Ivy Lin

Ivy is a seasoned web production expert. With the nickname “IvyBOT” from previous jobs, she has a passion for transforming raw documents and design comps into web pages with exceptional user experiences. In her spare time, she enjoys sharing information about delicious foods from her mother country of Taiwan with her friends.

2022 H2 IRAP report is now available on AWS Artifact for Australian customers

Post Syndicated from Patrick Chang original https://aws.amazon.com/blogs/security/2022-h2-irap-report-is-now-available-on-aws-artifact-for-australian-customers/

Amazon Web Services (AWS) is excited to announce that a new Information Security Registered Assessors Program (IRAP) report (2022 H2) is now available through AWS Artifact. An independent Australian Signals Directorate (ASD) certified IRAP assessor completed the IRAP assessment of AWS in December 2022.

The new IRAP report includes an additional six AWS services, as well as the new AWS Melbourne Region, that are now assessed at the PROTECTED level under IRAP. This brings the total number of services assessed at the PROTECTED level to 139.

The following are the six newly assessed services:

For the full list of services, see the IRAP tab on the AWS Services in Scope by Compliance Program page.

AWS has developed an IRAP documentation pack to assist Australian government agencies and their partners to plan, architect, and assess risk for their workloads when they use AWS Cloud services.

We developed this pack in accordance with the Australian Cyber Security Centre (ACSC) Cloud Security Guidance and Anatomy of a Cloud Assessment and Authorisation framework, which addresses guidance within the Australian Government Information Security Manual (ISM), the Attorney-General’s Protective Security Policy Framework (PSPF), and the Digital Transformation Agency Secure Cloud Strategy.

The IRAP pack on AWS Artifact also includes newly updated versions of the AWS Consumer Guide and the whitepaper Reference Architectures for ISM PROTECTED Workloads in the AWS Cloud.

Reach out to your AWS representatives to let us know which additional services you would like to see in scope for upcoming IRAP assessments. We strive to bring more services into scope at the PROTECTED level under IRAP to support your requirements.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Patrick Chang

Patrick Chang

Patrick is the APJ Audit Lead based in Hong Kong. He leads security audits, certifications and compliance programs across the APJ region. He is a technology risk and audit professional with over a decade of experience. He is passionate about delivering assurance programs that build trust with customers and provide them assurance on cloud security.

AWS Melbourne Region has achieved HCF Strategic Certification

Post Syndicated from Lori Klaassen original https://aws.amazon.com/blogs/security/aws-melbourne-region-has-achieved-hcf-strategic-certification/

Amazon Web Services (AWS) is delighted to confirm that our new AWS Melbourne Region has achieved Strategic Certification for the Australian Government’s Hosting Certification Framework (HCF).

We know that maintaining security and resiliency to keep critical data and infrastructure safe is a top priority for the Australian Government and all our customers in Australia. The Strategic Certification of both the existing Sydney and the new Melbourne Regions reinforces our ongoing commitment to meet security expectations for cloud service providers and means Australian citizens can now have even greater confidence that the Government is securing their data.

The HCF provides guidance to government customers to identify cloud providers that meet enhanced privacy, sovereignty, and security requirements. The expanded scope of the AWS HCF Strategic Certification gives Australian Government customers additional architectural options, including the ability to store backup data in geographically separated locations within Australia.

Our AWS infrastructure is custom-built for the cloud and designed to meet the most stringent security requirements in the world, and is monitored 24/7 to help support the confidentiality, integrity, and availability of customers’ data. All data flowing across the AWS global network that interconnects our data centers and Regions is automatically encrypted at the physical layer before it leaves our secured facilities. We will continue to expand the scope of our security assurance programs at AWS and are pleased that Australian Government customers can continue to innovate at a rapid pace and be confident AWS meets the Government’s requirements to support the secure management of government systems and data.

The Melbourne Region was officially added to the AWS HCF Strategic Certification on December 21, 2022, and the Sydney Region was certified in October 2021. AWS compliance status is available on the HCF Certified Service Providers website, and the Certificate of Compliance is available through AWS Artifact. AWS Artifact is a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact. AWS has also achieved many international certifications and accreditations, demonstrating compliance with third-party assurance frameworks such as ISO 27017 for cloud security, ISO 27018 for cloud privacy, and SOC 1, SOC 2, and SOC 3.

To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

Please reach out to your AWS account team if you have questions or feedback about HCF compliance.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Lori Klaassen

Lori Klaassen

Lori is a Senior Regulatory Specialist on the AWS Security Assurance team. She supports the operationalisation and ongoing assurance of direct regulatory oversight programs in ANZ.

Top 2022 AWS data protection service and cryptography tool launches

Post Syndicated from Marta Taggart original https://aws.amazon.com/blogs/security/top-2022-aws-data-protection-service-and-cryptography-tool-launches/

Given the pace of Amazon Web Services (AWS) innovation, it can be challenging to stay up to date on the latest AWS service and feature launches. AWS provides services and tools to help you protect your data, accounts, and workloads from unauthorized access. AWS data protection services provide encryption capabilities, key management, and sensitive data discovery. Last year, we saw growth and evolution in AWS data protection services as we continue to give customers features and controls to help meet their needs. Protecting data in the AWS Cloud is a top priority because we know you trust us to help protect your most critical and sensitive asset: your data. This post will highlight some of the key AWS data protection launches in the last year that security professionals should be aware of.

AWS Key Management Service
Create and control keys to encrypt or digitally sign your data

In April, AWS Key Management Service (AWS KMS) launched hash-based message authentication code (HMAC) APIs. This feature introduced the ability to create AWS KMS keys that can be used to generate and verify HMACs. HMACs are a powerful cryptographic building block that incorporate symmetric key material within a hash function to create a unique keyed message authentication code. HMACs provide a fast way to tokenize or sign data such as web API requests, credit card numbers, bank routing information, or personally identifiable information (PII). This technology is used to verify the integrity and authenticity of data and communications. HMACs are often a higher performing alternative to asymmetric cryptographic methods like RSA or elliptic curve cryptography (ECC) and should be used when both message senders and recipients can use AWS KMS.

At AWS re:Invent in November, AWS KMS introduced the External Key Store (XKS), a new feature for customers who want to protect their data with encryption keys that are stored in an external key management system under their control. This capability brings new flexibility for customers to encrypt or decrypt data with cryptographic keys, independent authorization, and audit in an external key management system outside of AWS. XKS can help you address your compliance needs where encryption keys for regulated workloads must be outside AWS and solely under your control. To provide customers with a broad range of external key manager options, AWS KMS developed the XKS specification with feedback from leading key management and hardware security module (HSM) manufacturers as well as service providers that can help customers deploy and integrate XKS into their AWS projects.

AWS Nitro System
A combination of dedicated hardware and a lightweight hypervisor enabling faster innovation and enhanced security

In November, we published The Security Design of the AWS Nitro System whitepaper. The AWS Nitro System is a combination of purpose-built server designs, data processors, system management components, and specialized firmware that serves as the underlying virtualization technology that powers all Amazon Elastic Compute Cloud (Amazon EC2) instances launched since early 2018. This new whitepaper provides you with a detailed design document that covers the inner workings of the AWS Nitro System and how it is used to help secure your most critical workloads. The whitepaper discusses the security properties of the Nitro System, provides a deeper look into how it is designed to eliminate the possibility of AWS operator access to a customer’s EC2 instances, and describes its passive communications design and its change management process. Finally, the paper surveys important aspects of the overall system design of Amazon EC2 that provide mitigations against potential side-channel vulnerabilities that can exist in generic compute environments.

AWS Secrets Manager
Centrally manage the lifecycle of secrets

In February, AWS Secrets Manager added the ability to schedule secret rotations within specific time windows. Previously, Secrets Manager supported automated rotation of secrets within the last 24 hours of a specified rotation interval. This new feature added the ability to limit a given secret rotation to specific hours on specific days of a rotation interval. This helps you avoid having to choose between the convenience of managed rotations and the operational safety of application maintenance windows. In November, Secrets Manager also added the capability to rotate secrets as often as every four hours, while providing the same managed rotation experience.

In May, Secrets Manager started publishing secrets usage metrics to Amazon CloudWatch. With this feature, you have a streamlined way to view how many secrets you are using in Secrets Manager over time. You can also set alarms for an unexpected increase or decrease in number of secrets.

At the end of December, Secrets Manager added support for managed credential rotation for service-linked secrets. This feature helps eliminate the need for you to manage rotation Lambda functions and enables you to set up rotation without additional configuration. Amazon Relational Database Service (Amazon RDS) has integrated with this feature to streamline how you manage your master user password for your RDS database instances. Using this feature can improve your database’s security by preventing the RDS master user password from being visible during the database creation workflow. Amazon RDS fully manages the master user password’s lifecycle and stores it in Secrets Manager whenever your RDS database instances are created, modified, or restored. To learn more about how to use this feature, see Improve security of Amazon RDS master database credentials using AWS Secrets Manager.

AWS Private Certificate Authority
Create private certificates to identify resources and protect data

In September, AWS Private Certificate Authority (AWS Private CA) launched as a standalone service. AWS Private CA was previously a feature of AWS Certificate Manager (ACM). One goal of this launch was to help customers differentiate between ACM and AWS Private CA. ACM and AWS Private CA have distinct roles in the process of creating and managing the digital certificates used to identify resources and secure network communications over the internet, in the cloud, and on private networks. This launch coincided with the launch of an updated console for AWS Private CA, which includes accessibility improvements to enhance screen reader support and additional tab key navigation for people with motor impairment.

In October, AWS Private CA introduced a short-lived certificate mode, a lower-cost mode of AWS Private CA that is designed for issuing short-lived certificates. With this new mode, public key infrastructure (PKI) administrators, builders, and developers can save money when issuing certificates where a validity period of 7 days or fewer is desired. To learn more about how to use this feature, see How to use AWS Private Certificate Authority short-lived certificate mode.

Additionally, AWS Private CA supported the launches of certificate-based authentication with Amazon AppStream 2.0 and Amazon WorkSpaces to remove the logon prompt for the Active Directory domain password. AppStream 2.0 and WorkSpaces certificate-based authentication integrates with AWS Private CA to automatically issue short-lived certificates when users sign in to their sessions. When you configure your private CA as a third-party root CA in Active Directory or as a subordinate to your Active Directory Certificate Services enterprise CA, AppStream 2.0 or WorkSpaces with AWS Private CA can enable rapid deployment of end-user certificates to seamlessly authenticate users. To learn more about how to use this feature, see How to use AWS Private Certificate Authority short-lived certificate mode.

AWS Certificate Manager
Provision and manage SSL/TLS certificates with AWS services and connected resources

In early November, ACM launched the ability to request and use Elliptic Curve Digital Signature Algorithm (ECDSA) P-256 and P-384 TLS certificates to help secure your network traffic. You can use ACM to request ECDSA certificates and associate the certificates with AWS services like Application Load Balancer or Amazon CloudFront. Previously, you could only request certificates with an RSA 2048 key algorithm from ACM. Now, AWS customers who need to use TLS certificates with at least 120-bit security strength can use these ECDSA certificates to help meet their compliance needs. The ECDSA certificates have a higher security strength—128 bits for P-256 certificates and 192 bits for P-384 certificates—when compared to 112-bit RSA 2048 certificates that you can also issue from ACM. The smaller file footprint of ECDSA certificates makes them ideal for use cases with limited processing capacity, such as small Internet of Things (IoT) devices.

Amazon Macie
Discover and protect your sensitive data at scale

Amazon Macie introduced two major features at AWS re:Invent. The first is a new capability that allows for one-click, temporary retrieval of up to 10 samples of sensitive data found in Amazon Simple Storage Service (Amazon S3). With this new capability, you can more readily view and understand which contents of an S3 object were identified as sensitive, so you can review, validate, and quickly take action as needed without having to review every object that a Macie job returned. Sensitive data samples captured with this new capability are encrypted by using customer-managed AWS KMS keys and are temporarily viewable within the Amazon Macie console after retrieval.

Additionally, Amazon Macie introduced automated sensitive data discovery, a new feature that provides continual, cost-efficient, organization-wide visibility into where sensitive data resides across your Amazon S3 estate. With this capability, Macie automatically samples and analyzes objects across your S3 buckets, inspecting them for sensitive data such as personally identifiable information (PII) and financial data; builds an interactive data map of where your sensitive data in S3 resides across accounts; and provides a sensitivity score for each bucket. Macie uses multiple automated techniques, including resource clustering by attributes such as bucket name, file types, and prefixes, to minimize the data scanning needed to uncover sensitive data in your S3 buckets. This helps you continuously identify and remediate data security risks without manual configuration and lowers the cost to monitor for and respond to data security risks.

Support for new open source encryption libraries

In February, we announced the availability of s2n-quic, an open source Rust implementation of the QUIC protocol, in our AWS encryption open source libraries. QUIC is a transport layer network protocol used by many web services to provide lower latencies than classic TCP. AWS has long supported open source encryption libraries for network protocols; in 2015 we introduced s2n-tls as a library for implementing TLS over HTTP. The name s2n is short for signal to noise and is a nod to the act of encryption—disguising meaningful signals, like your critical data, as seemingly random noise. Similar to s2n-tls, s2n-quic is designed to be small and fast, with simplicity as a priority. It is written in Rust, so it has some of the benefits of that programming language, such as performance, threads, and memory safety.

Cryptographic computing for AWS Clean Rooms (preview)

At re:Invent, we also announced AWS Clean Rooms, currently in preview, which includes a cryptographic computing feature that allows you to run a subset of queries on encrypted data. Clean rooms help customers and their partners to match, analyze, and collaborate on their combined datasets—without sharing or revealing underlying data. If you have data handling policies that require encryption of sensitive data, you can pre-encrypt your data by using a common collaboration-specific encryption key so that data is encrypted even when queries are run. With cryptographic computing, data that is used in collaborative computations remains encrypted at rest, in transit, and in use (while being processed).

If you’re looking for more opportunities to learn about AWS security services, read our AWS re:Invent 2022 Security recap post or watch the Security, Identity, and Compliance playlist.

Looking ahead in 2023

With AWS, you control your data by using powerful AWS services and tools to determine where your data is stored, how it is secured, and who has access to it. In 2023, we will further the AWS Digital Sovereignty Pledge, our commitment to offering AWS customers the most advanced set of sovereignty controls and features available in the cloud.

You can join us at our security learning conference, AWS re:Inforce 2023, in Anaheim, CA, June 13–14, for the latest advancements in AWS security, compliance, identity, and privacy solutions.

Stay updated on launches by subscribing to the AWS What’s New RSS feed and reading the AWS Security Blog.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Marta Taggart

Marta is a Seattle-native and Senior Product Marketing Manager in AWS Security Product Marketing, where she focuses on data protection services. Outside of work you’ll find her trying to convince Jack, her rescue dog, not to chase squirrels and crows (with limited success).

AWS completes CCAG 2022 pooled audit by European FSI customers

Post Syndicated from Manuel Mazarredo original https://aws.amazon.com/blogs/security/aws-completes-ccag-2022-pooled-audit-by-european-fsi-customers/

We are excited to announce that Amazon Web Services (AWS) has completed its annual Collaborative Cloud Audit Group (CCAG) Cloud Community audit with European financial service institutions (FSIs).

Security at AWS is the highest priority. As customers embrace the scalability and flexibility of AWS, we are helping them evolve security, identity, and compliance into key business enablers. At AWS, we are obsessed with earning and maintaining customer trust, and providing our FSI customers and their regulatory bodies with the assurance that AWS has the necessary controls in place to protect their most sensitive material and regulated workloads. The AWS Compliance Program helps customers understand the robust controls that are in place at AWS. By tying together governance-focused, audit-friendly service features with applicable compliance or audit standards, AWS Compliance helps customers to set up and operate in an AWS security control environment.

An example of how AWS supports customers’ risk management and regulatory efforts is our annual audit engagement with the CCAG. For the fourth year, the CCAG pooled audit thoroughly assessed the AWS controls that enable us to help protect our customers’ data and material workloads, while satisfying strict European and national regulatory obligations. CCAG currently represents more than 50 leading European FSIs and has grown steadily since its inception in 2017. Given the importance of cloud computing for the operations of FSI customers, the financial industry is coming under greater regulatory scrutiny. Similar to prior years, the CCAG 2022 audit was conducted based on customers’ right to conduct an audit of their service providers under European Banking Authority (EBA) outsourcing recommendations to cloud service providers (CSPs). The EBA suggests using pooled audits to use audit resources more efficiently and to decrease the organizational burden on both the clients and the CSP. Figure 1 illustrates the improved cost-effectiveness of pooled audits as compared to individual audits.

Figure 1: Efforts and costs are shared and reduced when a collaborative approach is followed

Figure 1: Efforts and costs are shared and reduced when a collaborative approach is followed

CCAG audit process

Although there are many security frameworks available, CCAG uses the Cloud Controls Matrix (CCM) of the Cloud Security Alliance (CSA) as the framework of reference for their CSP audits. The CSA is a not-for-profit organization with a mission, as stated on its website, to “promote the use of best practices for providing security assurance within cloud computing, and to provide education on the uses of cloud computing to help secure all other forms of computing.” CCM is specifically designed to provide fundamental security principles to guide cloud vendors and to assist cloud customers in assessing the overall security risk of a cloud provider.

Between February and December 2022, CCAG audited the AWS controls environment by following a hybrid approach, remotely and onsite in Seattle (USA), Dublin (IRL), and Frankfurt (DEU). For the scope of the 2022 CCAG audit, the participating auditors assessed AWS measures with regards to (1) keeping customer data sovereign, secure, and private, (2) effectively managing threats and vulnerabilities, (3) offering a highly available and resilient infrastructure, (4) preventing and responding rapidly to security events, and (5) enforcing strong authentication mechanisms and strict identity and access management constraint conditions to grant access to resources only under the need-to-know and need-to-have principles.

The scope of the audit encompassed individual services provided by AWS, and the policies, controls, and procedures for (and practice of) managing and maintaining them. Customers will still need to have their auditors assess the environments they create by using these services, and their policies and procedures for (and practices of) managing and maintaining these environments, on their side of the shared responsibility lines of demarcation for the AWS services involved.

CCAG audit results

CCAG members expressed their gratitude to AWS for the audit experience:

“The AWS Security Assurance team provided CCAG auditors with the needed logistical and technical assistance, by navigating the AWS organization to find the required information, performing advocacy of the CCAG audit rights, creating awareness and education, as well as exercising constant pressure for the timely delivery of information.”

The results of the CCAG pooled audit are available to the participants and their respective regulators only, and provide CCAG members with assurance regarding the AWS controls environment, enabling members to work to remove compliance blockers, accelerate their adoption of AWS services, and obtain confidence and trust in the security controls of AWS.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Manuel Mazarredo

Manuel Mazarredo

Manuel is a security audit program manager at AWS based in Amsterdam, the Netherlands. Manuel leads security audits, attestations, and certification programs across Europe, and is responsible for the BeNeLux area. For the past 18 years, he has worked in information systems audits, ethical hacking, project management, quality assurance, and vendor management across a variety of industries.

Andreas Terwellen

Andreas Terwellen

Andreas is a senior manager in security audit assurance at AWS, based in Frankfurt, Germany. His team is responsible for third-party and customer audits, attestations, certifications, and assessments across Europe. Previously, he was a CISO in a DAX-listed telecommunications company in Germany. He also worked for different consulting companies managing large teams and programs across multiple industries and sectors.

Julian Herlinghaus

Julian Herlinghaus

Julian is a Manager in AWS Security Assurance based in Berlin, Germany. He leads third-party and customer security audits across Europe and specifically the DACH region. He has previously worked as Information Security department lead of an accredited certification body and has multiple years of experience in information security and security assurance & compliance.

AWS now licensed by DESC to operate as a Tier 1 cloud service provider in the Middle East (UAE) Region

Post Syndicated from Ioana Mecu original https://aws.amazon.com/blogs/security/aws-now-licensed-by-desc-to-operate-as-a-tier-1-cloud-service-provider-in-the-middle-east-uae-region/

We continue to expand the scope of our assurance programs at Amazon Web Services (AWS) and are pleased to announce that our Middle East (UAE) Region is now certified by the Dubai Electronic Security Centre (DESC) to operate as a Tier 1 cloud service provider (CSP). This alignment with DESC requirements demonstrates our continuous commitment to adhere to the heightened expectations for CSPs. AWS government customers can run their applications in the AWS Cloud certified Regions in confidence.

AWS was evaluated by independent third-party auditor BSI on behalf of DESC on January 23, 2023. The Certificate of Compliance illustrating the AWS compliance status is available through AWS Artifact. AWS Artifact is a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.

As of this writing, 62 services offered in the Middle East (UAE) Region are in scope of this certification. For up-to-date information, including when additional services are added, visit the AWS Services in Scope by Compliance Program webpage and choose DESC CSP.

AWS strives to continuously bring services into scope of its compliance programs to help you meet your architectural and regulatory needs. Please reach out to your AWS account team if you have questions or feedback about DESC compliance.

To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

 
If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Ioana Mecu

Ioana Mecu

Ioana is a Security Audit Program Manager at AWS based in Madrid, Spain. She leads security audits, attestations, and certification programs across Europe and the Middle East. Ioana has previously worked in risk management, security assurance, and technology audits in the financial sector industry for the past 15 years.

AWS Security Profile: Jana Kay, Cloud Security Strategist

Post Syndicated from Roger Park original https://aws.amazon.com/blogs/security/aws-security-profile-jana-kay-cloud-security-strategist/

In the AWS Security Profile series, we interview Amazon Web Services (AWS) thought leaders who help keep our customers safe and secure. This interview features Jana Kay, Cloud Security Strategist. Jana shares her unique career journey, insights on the Security and Resiliency of the Cloud Tabletop Exercise (TTX) program, thoughts on the data protection and cloud security landscape, and more.


How long have you been at AWS and what do you do in your current role?
I’ve been at AWS a little over four years. I started in 2018 as a Cloud Security Strategist, and in my opinion, I have one of the coolest jobs at AWS. I get to help customers think through how to use the cloud to address some of their most difficult security challenges, by looking at trends and emerging and evolving issues, and anticipating those that might still be on the horizon. I do this through various means, such as whitepapers, short videos, and tabletop exercises. I love working on a lot of different projects, which all have an impact on customers and give me the opportunity to learn new things myself all the time!

How did you get started in the security space? What about it piqued your interest?
After college, I worked in the office of a United States senator, which led me to apply to the Harvard Kennedy School for a graduate degree in public policy. When I started graduate school, I wasn’t sure what my focus would be, but my first day of class was September 11, 2001, which obviously had a tremendous impact on me and my classmates. I first heard about the events of September 11 while I was in an international security policy class, taught by the late Dr. Ash Carter. My classmates and I came from a range of backgrounds, cultures, and professions, and Dr. Carter challenged us to think strategically and objectively—but compassionately—about what was unfolding in the world and our responsibility to effect change. That experience led me to pursue a career in security. I concentrated in international security and political economy, and after graduation, accepted a Presidential Management Fellowship in the Office of the Secretary of Defense at the Pentagon, where I worked for 16 years before coming to AWS.

What’s been the most dramatic change you’ve seen in the security industry?
From the boardroom to builder teams, the understanding that security has to be integrated into all aspects of an organization’s ecosystem has been an important shift. Acceptance of security as foundational to the health of an organization has been evolving for a while, and a lot of organizations have more work to do, but overall there is prioritization of security within organizations.

I understand you’ve helped publish a number of papers at AWS. What are they and how can customers find them?
Good question! AWS publishes a lot of great whitepapers for customers. A few that I’ve worked on are Accreditation Models for Secure Cloud Adoption, Security at the Edge: Core Principles, and Does data localization cause more problems than it solves? To stay updated on the latest whitepapers, see AWS Whitepapers & Guides.

What are your thoughts on the security of the cloud today?
There are a lot of great technologies—such as AWS Data Protection services—that can help you with data protection, but it’s equally important to have the right processes in place and to create a strong culture of data protection. Although one of the biggest shifts I’ve seen in the industry is recognition of the importance of security, we still have a ways to go for people to understand that security and data protection is everyone’s job, not just the job of security experts. So when we talk about data protection and privacy issues, a lot of the conversation focuses on things like encryption, but the conversation shouldn’t end there because ultimately, security is only as good as the processes and people who implement it.

Do you have a favorite AWS Security service and why?
I like anything that helps simplify my life, so AWS Control Tower is one of my favorites. It has so much functionality. Not only does AWS Control Tower help you set up multi-account AWS environments, you can use it to help identify which of your resources are compliant. The dashboard, which allows for visibility of provisioned accounts, controls enabled policy enforcement and can help you detect noncompliant resources.

What are you currently working on that you’re excited about?
Currently, my focus is the Security and Resiliency of the Cloud Tabletop Exercise (TTX). It’s a 3-hour interactive event about incident response in which participants discuss how to prevent, detect, contain, and eradicate a simulated cyber incident. I’ve had the opportunity to conduct the TTX in South America, the Middle East, Europe, and the US, and it’s been so much fun meeting customers and hearing the discussions during the TTX and how much participants enjoy the experience. It scales well for groups of different sizes—and for a single customer or industry or for multiple customers or industries—and it’s been really interesting to see how the conversations change depending on the participants.

How does the Security and Resiliency of the Cloud Tabletop Exercise help security professionals hone their skills?
One of the great things about the tabletop is that it involves interacting with other participants. So it’s an opportunity for security professionals and business leaders to learn from their peers, hear different perspectives, and understand all facets of the problem and potential solutions. Often our participants range from CISOs to policymakers to technical experts, who come to the exercise with different priorities for data protection and different ideas on how to solve the scenarios that we present. The TTX isn’t a technical exercise, but participants link their collective understanding of what capabilities are needed in a given scenario to what services are available to them and then finally how to implement those services. One of the things that I hope participants leave with is a better understanding of the AWS tools and services that are available to them.

How can customers learn more about the Security and Resiliency of the Cloud Tabletop Exercise?
To learn more about the TTX, reach out to your account manager.

Is there something you wish customers would ask you about more often?
I wish they’d ask more about what they should be doing to prepare for a cyber incident. It’s one thing to have an incident response plan; it’s another thing to be confident that it’s going to work if you ever need it. If you don’t practice the plan, how do you know that it’s effective, if it has gaps, or if everyone knows their role in an incident?

How about outside of work—any hobbies?
I’m the mother of a teenager and tween, so between keeping up with their activities, I wish I had more time for hobbies! But someday soon, I’d like to get back to traveling more for leisure, reading for fun, and playing tennis.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Roger Park

Roger Park

Roger is a Senior Security Content Specialist at AWS Security focusing on data protection. He has worked in cybersecurity for almost ten years as a writer and content producer. In his spare time, he enjoys trying new cuisines, gardening, and collecting records.

Author

Jana Kay

Since 2018, Jana has been a cloud security strategist with the AWS Security Growth Strategies team. She develops innovative ways to help AWS customers achieve their objectives, such as security table top exercises and other strategic initiatives. Previously, she was a cyber, counter-terrorism, and Middle East expert for 16 years in the Pentagon’s Office of the Secretary of Defense.

Updated ebook: Protecting your AWS environment from ransomware

Post Syndicated from Megan O'Neil original https://aws.amazon.com/blogs/security/updated-ebook-protecting-your-aws-environment-from-ransomware/

Amazon Web Services is excited to announce that we’ve updated the AWS ebook, Protecting your AWS environment from ransomware. The new ebook includes the top 10 best practices for ransomware protection and covers new services and features that have been released since the original published date in April 2020.

We know that customers care about ransomware. Security teams across the board are ramping up their protective, detective, and reactive measures. AWS serves all customers, including those most sensitive to disruption like teams responsible for critical infrastructure, healthcare organizations, manufacturing, educational institutions, and state and local governments. We want to empower our customers to protect themselves against ransomware by using a range of security capabilities. These capabilities provide unparalleled visibility into your AWS environment, as well as the ability to update and patch efficiently, to seamlessly and cost-effectively back up your data, and to templatize your environment, enabling a rapid return to a known good state. Keep in mind that there is no single solution or quick fix to mitigate ransomware. In fact, the mitigations and controls outlined in this document are general security best practices. We hope you find this information helpful and take action.

For example, to help protect against a security event that impacts stored backups in the source account, AWS Backup supports cross-account backups and the ability to centrally define backup policies for accounts in AWS Organizations by using the management account. Also, AWS Backup Vault Lock enforces write-once, read-many (WORM) backups to help protect backups (recovery points) in your backup vaults from inadvertent or malicious actions. You can copy backups to a known logically isolated destination account in the organization, and you can restore from the destination account or, alternatively, to a third account. This gives you an additional layer of protection if the source account experiences disruption from accidental or malicious deletion, disasters, or ransomware.

Learn more about solutions like these by checking out our Protecting against ransomware webpage, which discusses security resources that can help you secure your AWS environments from ransomware.

Download the ebook Protecting your AWS environment from ransomware.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Megan O’Neil

Megan is a Principal Specialist Solutions Architect focused on threat detection and incident response. Megan and her team enable AWS customers to implement sophisticated, scalable, and secure solutions that solve their business challenges.

Merritt Baer

Merritt Baer

Merritt is a Principal in the Office of the CISO. She can be found on Twitter at @merrittbaer and looks forward to meeting you on her Twitch show, at re:Invent, or in your next executive conversation.

How Amazon Devices scaled and optimized real-time demand and supply forecasts using serverless analytics

Post Syndicated from Avinash Kolluri original https://aws.amazon.com/blogs/big-data/how-amazon-devices-scaled-and-optimized-real-time-demand-and-supply-forecasts-using-serverless-analytics/

Every day, Amazon devices process and analyze billions of transactions from global shipping, inventory, capacity, supply, sales, marketing, producers, and customer service teams. This data is used in procuring devices’ inventory to meet Amazon customers’ demands. With data volumes exhibiting a double-digit percentage growth rate year on year and the COVID pandemic disrupting global logistics in 2021, it became more critical to scale and generate near-real-time data.

This post shows you how we migrated to a serverless data lake built on AWS that consumes data automatically from multiple sources and different formats. Furthermore, it created further opportunities for our data scientists and engineers to use AI and machine learning (ML) services to continuously feed and analyze data.

Challenges and design concerns

Our legacy architecture primarily used Amazon Elastic Compute Cloud (Amazon EC2) to extract the data from various internal heterogeneous data sources and REST APIs with the combination of Amazon Simple Storage Service (Amazon S3) to load the data and Amazon Redshift for further analysis and generating the purchase orders.

We found this approach resulted in a few deficiencies and therefore drove improvements in the following areas:

  • Developer velocity – Due to the lack of unification and discovery of schema, which are primary reasons for runtime failures, developers often spent time dealing with operational and maintenance issues.
  • Scalability – Most of these datasets are shared across the globe. Therefore, we must meet the scaling limits while querying the data.
  • Minimal infrastructure maintenance – The current process spans multiple computes depending on the data source. Therefore, reducing infrastructure maintenance is critical.
  • Responsiveness to data source changes – Our current system gets data from various heterogeneous data stores and services. Any updates to those services takes months of developer cycles. The response times for these data sources are critical to our key stakeholders. Therefore, we must take a data-driven approach to select a high-performance architecture.
  • Storage and redundancy – Due to the heterogeneous data stores and models, it was challenging to store the different datasets from various business stakeholder teams. Therefore, having versioning along with incremental and differential data to compare will provide a remarkable ability to generate more optimized plans
  • Fugitive and accessibility – Due to the volatile nature of logistics, a few business stakeholder teams have a requirement to analyze the data on demand and generate the near-real-time optimal plan for the purchase orders. This introduces the need for both polling and pushing the data to access and analyze in near-real time.

Implementation strategy

Based on these requirements, we changed strategies and started analyzing each issue to identify the solution. Architecturally, we chose a serverless model, and the data lake architecture action line refers to all the architectural gaps and challenging features we determined were part of the improvements. From an operational standpoint, we designed a new shared responsibility model for data ingestion using AWS Glue instead of internal services (REST APIs) designed on Amazon EC2 to extract the data. We also used AWS Lambda for data processing. Then we chose Amazon Athena as our query service. To further optimize and improve the developer velocity for our data consumers, we added Amazon DynamoDB as a metadata store for different data sources landing in the data lake. These two decisions drove every design and implementation decision we made.

The following diagram illustrates the architecture

arch

In the following sections, we look at each component in the architecture in more detail as we move through the process flow.

AWS Glue for ETL

To meet customer demand while supporting the scale of new businesses’ data sources, it was critical for us to have a high degree of agility, scalability, and responsiveness in querying various data sources.

AWS Glue is a serverless data integration service that makes it easy for analytics users to discover, prepare, move, and integrate data from multiple sources. You can use it for analytics, ML, and application development. It also includes additional productivity and DataOps tooling for authoring, running jobs, and implementing business workflows.

With AWS Glue, you can discover and connect to more than 70 diverse data sources and manage your data in a centralized data catalog. You can visually create, run, and monitor extract, transform, and load (ETL) pipelines to load data into your data lakes. Also, you can immediately search and query cataloged data using Athena, Amazon EMR, and Amazon Redshift Spectrum.

AWS Glue made it easy for us to connect to the data in various data stores, edit and clean the data as needed, and load the data into an AWS-provisioned store for a unified view. AWS Glue jobs can be scheduled or called on demand to extract data from the client’s resource and from the data lake.

Some responsibilities of these jobs are as follows:

  • Extracting and converting a source entity to data entity
  • Enrich the data to contain year, month, and day for better cataloging and include a snapshot ID for better querying
  • Perform input validation and path generation for Amazon S3
  • Associate the accredited metadata based on the source system

Querying REST APIs from internal services is one of our core challenges, and considering the minimal infrastructure, we wanted to use them in this project. AWS Glue connectors assisted us in adhering to the requirement and goal. To query data from REST APIs and other data sources, we used PySpark and JDBC modules.

AWS Glue supports a wide variety of connection types. For more details, refer to Connection Types and Options for ETL in AWS Glue.

S3 bucket as landing zone

We used an S3 bucket as the immediate landing zone of the extracted data, which is further processed and optimized.

Lambda as AWS Glue ETL Trigger

We enabled S3 event notifications on the S3 bucket to trigger Lambda, which further partitions our data. The data is partitioned on InputDataSetName, Year, Month, and Date. Any query processor running on top of this data will scan only a subset of data for better cost and performance optimization. Our data can be stored in various formats, such as CSV, JSON, and Parquet.

The raw data isn’t ideal for most of our use cases to generate the optimal plan because it often has duplicates or incorrect data types. Most importantly, the data is in multiple formats, but we quickly modified the data and observed significant query performance gains from using the Parquet format. Here, we used one of the performance tips in Top 10 performance tuning tips for Amazon Athena.

AWS Glue jobs for ETL

We wanted better data segregation and accessibility, so we chose to have a different S3 bucket to improve performance further. We used the same AWS Glue jobs to further transform and load the data into the required S3 bucket and a portion of extracted metadata into DynamoDB.

DynamoDB as metadata store

Now that we have the data, various business stakeholders further consume it. This leaves us with two questions: which source data resides on the data lake and what version. We chose DynamoDB as our metadata store, which provides the latest details to the consumers to query the data effectively. Every dataset in our system is uniquely identified by snapshot ID, which we can search from our metadata store. Clients access this data store with an API’s.

Amazon S3 as data lake

For better data quality, we extracted the enriched data into another S3 bucket with the same AWS Glue job.

AWS Glue Crawler

Crawlers are the “secret sauce” that enables us to be responsive to schema changes. Throughout the process, we chose to make each step as schema-agnostic as possible, which allows any schema changes to flow through until they reach AWS Glue. With a crawler, we could maintain the agnostic changes happening to the schema. This helped us automatically crawl the data from Amazon S3 and generate the schema and tables.

AWS Glue Data Catalog

The Data Catalog helped us maintain the catalog as an index to the data’s location, schema, and runtime metrics in Amazon S3. Information in the Data Catalog is stored as metadata tables, where each table specifies a single data store.

Athena for SQL queries

Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries you run. We considered operational stability and increasing developer velocity as our key improvement factors.

We further optimized the process to query Athena so that users can plug in the values and the queries to get data out of Athena by creating the following:

  • An AWS Cloud Development Kit (AWS CDK) template to create Athena infrastructure and AWS Identity and Access Management (IAM) roles to access data lake S3 buckets and the Data Catalog from any account
  • A library so that client can provide an IAM role, query, data format, and output location to start an Athena query and get the status and result of the query run in the bucket of their choice.

To query Athena is a two-step process:

  • StartQueryExecution – This starts the query run and gets the run ID. Users can provide the output location where the output of the query will be stored.
  • GetQueryExecution – This gets the query status because the run is asynchronous. When successful, you can query the output in an S3 file or via API.

The helper method for starting the query run and getting the result would be in the library.

Data lake metadata service

This service is custom developed and interacts with DynamoDB to get the metadata (dataset name, snapshot ID, partition string, timestamp, and S3 link of the data) in the form of a REST API. When the schema is discovered, clients use Athena as their query processor to query the data.

Because all datasets have a snapshot ID are partitioned, the join query doesn’t result in a full table scan but only a partition scan on Amazon S3. We used Athena as our query processor because of its ease in not managing our query infrastructure. Later, if we feel we need something more, we can use either Redshift Spectrum or Amazon EMR.

Conclusion

Amazon Devices teams discovered significant value by moving to a data lake architecture using AWS Glue, which enabled multiple global business stakeholders to ingest data in more productive ways. This enabled the teams to generate the optimal plan to place purchase orders for devices by analyzing the different datasets in near-real time with appropriate business logic to solve the problems of the supply chain, demand, and forecast.

From an operational perspective, the investment has already started to pay off:

  • It standardized our ingestion, storage, and retrieval mechanisms, saving onboarding time. Before the implementation of this system, one dataset took 1 month to onboard. Due to our new architecture, we were able to onboard 15 new datasets in less than 2 months, which improved our agility by 70%.
  • It removed scaling bottlenecks, creating a homogeneous system that can quickly scale to thousands of runs.
  • The solution added schema and data quality validation before accepting any inputs and rejecting them if data quality violations are discovered.
  • It made it easy to retrieve datasets while supporting future simulations and back tester use cases requiring versioned inputs. This will make launching and testing models simpler.
  • The solution created a common infrastructure that can be easily extended to other teams across DIAL having similar issues with data ingestion, storage, and retrieval use cases.
  • Our operating costs have fallen by almost 90%.
  • This data lake can be accessed efficiently by our data scientists and engineers to perform other analytics and have a predictive approach as a future opportunity to generate accurate plans for the purchase orders.

The steps in this post can help you plan to build a similar modern data strategy using AWS-managed services to ingest data from different sources, automatically create metadata catalogs, share data seamlessly between the data lake and data warehouse, and create alerts in the event of an orchestrated data workflow failure.


About the authors

avinash_kolluriAvinash Kolluri is a Senior Solutions Architect at AWS. He works across Amazon Alexa and Devices to architect and design modern distributed solutions. His passion is to build cost-effective and highly scalable solutions on AWS. In his spare time, he enjoys cooking fusion recipes and traveling.

vipulVipul Verma is a Sr.Software Engineer at Amazon.com. He has been with Amazon since 2015,solving real-world challenges through technology that directly impact and improve the life of Amazon customers. In his spare time, he enjoys hiking.

Amazon EMR launches support for Amazon EC2 C7g (Graviton3) instances to improve cost performance for Spark workloads by 7–13%

Post Syndicated from Al MS original https://aws.amazon.com/blogs/big-data/amazon-emr-launches-support-for-amazon-ec2-c7g-graviton3-instances-to-improve-cost-performance-for-spark-workloads-by-7-13/

Amazon EMR provides a managed service to easily run analytics applications using open-source frameworks such as Apache Spark, Hive, Presto, Trino, HBase, and Flink. The Amazon EMR runtime for Spark and Presto includes optimizations that provide over twice the performance improvements compared to open-source Apache Spark and Presto.

With Amazon EMR release 6.7, you can now use Amazon Elastic Compute Cloud (Amazon EC2) C7g instances, which use the AWS Graviton3 processors. These instances improve price-performance of running Spark workloads on Amazon EMR by 7.93–13.35% over previous generation instances, depending on the instance size. In this post, we describe how we estimated the price-performance benefit.

Amazon EMR runtime performance with EC2 C7g instances

We ran TPC-DS 3 TB benchmark queries on Amazon EMR 6.9 using the Amazon EMR runtime for Apache Spark (compatible with Apache Spark 3.3) with C7g instances. Data was stored in Amazon Simple Storage Service (Amazon S3), and results were compared to equivalent C6g clusters from the previous generation instance family. We measured performance improvements using the total query runtime and geometric mean of the query runtime across TPC-DS 3 TB benchmark queries.

Our results showed 13.65–18.73% improvement in total query runtime performance and 16.98–20.28% improvement in geometric mean on EMR clusters with C7g compared to equivalent EMR clusters with C6g instances, depending on the instance size. In comparing costs, we observed 7.93–13.35% reduction in cost on the EMR cluster with C7g compared to the equivalent with C6g, depending on the instance size. We did not benchmark the C6g xlarge instance because it didn’t have sufficient memory to run the queries.

The following table shows the results from running the TPC-DS 3 TB benchmark queries using Amazon EMR 6.9 compared to equivalent C7g and C6g instance EMR clusters.

Instance Size 16 XL 12 XL 8 XL 4 XL 2 XL
Total size of the cluster (1 leader + 5 core nodes) 6 6 6 6 6
Total query runtime on C6g (seconds) 2774.86205 2752.84429 3173.08086 5108.45489 8697.08117
Total query runtime on C7g (seconds) 2396.22799 2336.28224 2698.72928 4151.85869 7249.58148
Total query runtime improvement with C7g 13.65% 15.13% 14.95% 18.73% 16.64%
Geometric mean query runtime C6g (seconds) 22.2113 21.75459 23.38081 31.97192 45.41656
Geometric mean query runtime C7g (seconds) 18.43905 17.65898 19.01684 25.48695 37.43737
Geometric mean query runtime improvement with C7g 16.98% 18.83% 18.66% 20.28% 17.57%
EC2 C6g instance price ($ per hour) $2.1760 $1.6320 $1.0880 $0.5440 $0.2720
EMR C6g instance price ($ per hour) $0.5440 $0.4080 $0.2720 $0.1360 $0.0680
(EC2 + EMR) instance price ($ per hour) $2.7200 $2.0400 $1.3600 $0.6800 $0.3400
Cost of running on C6g ($ per instance) $2.09656 $1.55995 $1.19872 $0.96493 $0.82139
EC2 C7g instance price ($ per hour) $2.3200 $1.7400 $1.1600 $0.5800 $0.2900
EMR C7g price ($ per hour per instance) $0.5800 $0.4350 $0.2900 $0.1450 $0.0725
(EC2 + EMR) C7g instance price ($ per hour) $2.9000 $2.1750 $1.4500 $0.7250 $0.3625
Cost of running on C7g ($ per instance) $1.930290 $1.411500 $1.086990 $0.836140 $0.729990
Total cost reduction with C7g including performance improvement -7.93% -9.52% -9.32% -13.35% -11.13%

The following graph shows per-query improvements observed on C7g 2xlarge instances compared to equivalent C6g generations.

Benchmarking methodology

The benchmark used in this post is derived from the industry-standard TPC-DS benchmark, and uses queries from the Spark SQL Performance Tests GitHub repo with the following fixes applied.

We calculated TCO by multiplying cost per hour by number of instances in the cluster and time taken to run the queries on the cluster. We used on-demand pricing in the US East (N. Virginia) Region for all instances.

Conclusion

In this post, we described how we estimated the cost-performance benefit from using Amazon EMR with C7g instances compared to using equivalent previous generation instances. Using these new instances with Amazon EMR improves cost-performance by an additional 7–13%.


About the authors

AI MSAl MS is a product manager for Amazon EMR at Amazon Web Services.

Kyeonghyun Ryoo is a Software Development Engineer for EMR at Amazon Web Services. He primarily works on designing and building automation tools for internal teams and customers to maximize their productivity. Outside of work, he is a retired world champion in professional gaming who still enjoy playing video games.

Yuzhou Sun is a software development engineer for EMR at Amazon Web Services.

Steve Koonce is an Engineering Manager for EMR at Amazon Web Services.

AWS Lake Formation 2022 year in review

Post Syndicated from Jason Berkowitz original https://aws.amazon.com/blogs/big-data/aws-lake-formation-2022-year-in-review/

Data governance is the collection of policies, processes, and systems that organizations use to ensure the quality and appropriate handling of their data throughout its lifecycle for the purpose of generating business value. Data governance is increasingly top-of-mind for customers as they recognize data as one of their most important assets. Effective data governance enables better decision-making by improving data quality, reducing data management costs, and ensuring secure access to data for stakeholders. In addition, data governance is required to comply with an increasingly complex regulatory environment with data privacy (such as GDPR and CCPA) and data residency regulations (such as in the EU, Russia, and China).

For AWS customers, effective data governance improves decision-making, increases business agility, provides a competitive advantage, and reduces the risk of fines due to non-compliance with regulatory obligations. We understand the unique opportunity to provide our customers a comprehensive end-to-end data governance solution that is seamlessly integrated into our portfolio of services, and AWS Lake Formation and the AWS Glue Data Catalog are key to solving these challenges.

In this post, we are excited to summarize the features that the AWS Glue Data Catalog, AWS Glue crawler, and Lake Formation teams delivered in 2022. We have collected some of the key talks and solutions on data governance, data mesh, and modern data architecture published and presented in AWS re:Invent 2022, and a few data lake solutions built by customers and AWS Partners for easy reference. Whether you are a data platform builder, data engineer, data scientist, or any technology leader interested in data lake solutions, this post is for you.

To learn more about how customers are securing and sharing data with Lake Formation, we recommend going deeper into GoDaddy’s decentralized data mesh, Novo Nordisk’s modern data architecture, and JPMorgan’s improvements to their Federated Data Lake, a governed data mesh implementation using Lake Formation. Also, you can learn how AWS Partners integrated with Lake Formation to help customers build unique data lakes, in Starburst’s data mesh solution, Informatica’s automated data sharing solution, Ahana’s Presto integration with Lake Formation, Ascending’s custom data governance system, how PBS used machine learning on their data lakes, and how hc1 provides personalized health insights for customers.

You can review how Lake Formation is used by customers to build modern data architectures in the following re:Invent 2022 talks:

The Lake Formation team listened to customer feedback and made improvements in the areas of cross-account data governance, expanding the source of data lakes, enabling unified data governance of a business data catalog, making secure business-to-business data sharing possible, and expanding the coverage area for fine-grained access controls to Amazon Redshift. In the rest of this post, we are happy to share the progress we made in 2022.

Enhancing cross-account governance

Lake Formation provides the foundation for customers to share data across accounts within their organization. You can share AWS Glue Data Catalog resources to AWS Identity and Access Management (IAM) principals within an account as well as other AWS accounts using two methods. The first one is called the named-resource method, where users can select the names of databases and tables and choose the type of permissions to share. The second method uses LF-Tags, where users can create and associate LF-Tags to databases and tables and grant permission to IAM principals using LF-Tag policies and expressions.

In November 2022, Lake Formation introduced version 3 of its cross-account sharing feature. With this new version, Lake Formation users can share catalog resources using LF-Tags at the AWS Organizations level. Sharing data using LF-tags helps scale permissions and reduces the admin work for data lake builders. The cross-account sharing version 3 also allows you to share resources to specific IAM principals in other accounts, providing data owners control over who can access their data in other accounts. Lastly, we have removed the overhead of writing and maintaining Data Catalog resource policies by introducing AWS Resource Access Manager (AWS RAM) invites with LF-Tags-based policies in the cross-account sharing version 3. We encourage you to further explore cross-account sharing in Lake Formation.

Extending Lake Formation permissions to new data

Until re:Invent 2022, Lake Formation provided permissions management for IAM principals on Data Catalog resources with underlying data primarily on Amazon Simple Storage Service (Amazon S3). At re:Invent 2022, we introduced Lake Formation permissions management for Amazon Redshift data shares in preview mode. Amazon Redshift is a fully-managed, petabyte-scale data warehouse service in the AWS Cloud. The data sharing feature allows data owners to group databases, tables, and views in an Amazon Redshift cluster and share it with other Amazon Redshift clusters within or across AWS accounts. Data sharing reduces the need to keep multiple copies of the same data in different data warehouses to accelerate business decision-making across an organization. Lake Formation further enhances sharing data within Amazon Redshift data shares by providing fine-grained access control on tables and views.

For additional details on this feature, refer to AWS Lake Formation-managed Redshift datashares (preview) and How Redshift data share can be managed by Lake Formation.

Amazon EMR is a managed cluster platform to run big data applications using Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto at scale. You can use Amazon EMR to run batch and stream processing analytics jobs on your S3 data lakes. Starting with Amazon EMR release 6.7.0, we introduced Lake Formation permissions management on a runtime IAM role used with the EMR Steps API. This feature enables you to submit Apache Spark and Apache Hive applications to an EMR cluster through the EMR Steps API that enforces table-level and column-level permissions using Lake Formation to that IAM role submitting the application. This Lake Formation integration with Amazon EMR allows you to share an EMR cluster across multiple users in an organization with different permissions by isolating your applications through a runtime IAM role. We encourage you to check this feature in the Lake Formation workshop Integration with Amazon EMR using Runtime Roles. To explore a use case, see Introducing runtime roles for Amazon EMR steps: Use IAM roles and AWS Lake Formation for access control with Amazon EMR.

Amazon SageMaker Studio is a fully integrated development environment (IDE) for machine learning (ML) that enables data scientists and developers to prepare data for building, training, tuning, and deploying models. Studio offers a native integration with Amazon EMR so that data scientists and data engineers can interactively prepare data at petabyte scale using open-source frameworks such as Apache Spark, Presto, and Hive using Studio notebooks. With the release of Lake Formation permissions management on a runtime IAM role, Studio now supports table-level and column-level access with Lake Formation. When users connect to EMR clusters from Studio notebooks, they can choose the IAM role (called the runtime IAM role) that they want to connect with. If data access is managed by Lake Formation, users can enforce table-level and column-level permissions using policies attached to the runtime role. For more details, refer to Apply fine-grained data access controls with AWS Lake Formation and Amazon EMR from Amazon SageMaker Studio.

Ingest and catalog varied data

A robust data governance model includes data from an organization’s many data sources and methods to discover and catalog those varied data assets. AWS Glue crawlers provide the ability to discover data from sources including Amazon S3, Amazon Redshift, and NoSQL databases, and populate the AWS Glue Data Catalog.

In 2022, we launched AWS Glue crawler support for Snowflake and AWS Glue crawler support for Delta Lake tables. These integrations allow AWS Glue crawlers to create and update Data Catalog tables based on these popular data sources. This makes it even easier to create extract, transform, and load (ETL) jobs with AWS Glue based on these Data Catalog tables as sources and targets.

In 2022, the AWS Glue crawlers UI was redesigned to offer a better user experience. One of the main enhancements delivered as part of this revision is the greater insights into AWS Glue crawler history. The crawler history UI provides an easy view of crawler runs, schedules, data sources, and tags. For each crawl, the crawler history offers a summary of changes in the database schema or Amazon S3 partition changes. Crawler history also provides detailed info about DPU hours and reduces the time spent analyzing and debugging crawler operations and costs. To explore the new functionalities added to the crawlers UI, refer to Set up and monitor AWS Glue crawlers using the enhanced AWS Glue UI and crawler history.

In 2022, we also extended support for crawlers based on Amazon S3 event notifications to support catalog tables. With this feature, incremental crawling can be offloaded from data pipelines to the scheduled AWS Glue crawler, reducing crawls to incremental S3 events. For more information, refer to Build incremental crawls of data lakes with existing Glue catalog tables.

More ways to share data beyond the data lake

During re:Invent 2022, we announced a preview of AWS Data Exchange for AWS Lake Formation, a new feature that enables data subscribers to find and subscribe to third-party datasets that are managed directly through Lake Formation. Until now, AWS Data Exchange subscribers could access third-party datasets by exporting providers’ files to their own S3 buckets, calling providers’ APIs through Amazon API Gateway, or querying producers’ Amazon Redshift data shares from their Amazon Redshift cluster. With the new Lake Formation integration, data providers curate AWS Data Exchange datasets using Lake Formation tags. Data subscribers are able to query and explore the databases and tables associated with those tags, just like any other AWS Glue Data Catalog resource. Organizations can apply resource-based Lake Formation permissions to share the licensed datasets within the same account or across accounts using AWS License Manager. AWS Data Exchange for Lake Formation streamlines data licensing and sharing operations by accelerating data onboarding, reducing the amount of ETL required for end-users to access third-party data, and centralizing governance and access controls for third-party data.

At re:Invent 2022, we also announced Amazon DataZone, a new data management service that makes it faster and easier for you to catalog, discover, share, and govern data stored across AWS, on-premises, and third-party sources. Amazon DataZone is a business data catalog service that supplements the technical metadata in the AWS Glue Data Catalog. Amazon DataZone is integrated with Lake Formation permissions management so that you can effectively manage and govern access to your data, and audit who is accessing what data and for what purpose. With the publisher-subscriber model of Amazon DataZone, data assets can be shared and accessed across Regions. For additional details about the service and its capabilities, refer to the Amazon DataZone FAQs and re:Invent launch.

Conclusion

Data is transforming every field and every business. However, with data growing faster than most companies can keep track of, collecting, securing, and getting value out of that data is a challenging thing to do. A modern data strategy can help you create better business outcomes with data. AWS provides the most complete set of services for the end-to-end data journey to help you unlock value from your data and turn it into insight.

At AWS, we work backward from customer requirements. From the Lake Formation team, we worked hard to deliver the features described in this post, and we invite you to check them out. With our continued focus to invent, we hope to play a key role in empowering organizations to build new data governance models that help you derive more business value at lightning speed.

You can get started with Lake Formation by exploring our hands-on workshop modules and Getting started tutorials. We look forward to hearing from you, our customers, on your data lake and data governance use cases. Please get in touch through your AWS account team and share your comments.


About the Authors

Jason Berkowitz is a Senior Product Manager with AWS Lake Formation. He comes from a background in machine learning and data lake architectures. He helps customers become data-driven.

Aarthi Srinivasan is a Senior Big Data Architect with AWS Lake Formation. She enjoys building data lake solutions for AWS customers and partners. When not on the keyboard, she explores the latest science and technology trends and spends time with her family.

Leonardo Gómez is a Senior Analytics Specialist Solutions Architect at AWS. Based in Toronto, Canada, he has over a decade of experience in data management, helping customers around the globe address their business and technical needs.