Tag Archives: gaming

How gaming companies can use Amazon Redshift Serverless to build scalable analytical applications faster and easier

Post Syndicated from Satesh Sonti original https://aws.amazon.com/blogs/big-data/how-gaming-companies-can-use-amazon-redshift-serverless-to-build-scalable-analytical-applications-faster-and-easier/

This post provides guidance on how to build scalable analytical solutions for gaming industry use cases using Amazon Redshift Serverless. It covers how to use a conceptual, logical architecture for some of the most popular gaming industry use cases like event analysis, in-game purchase recommendations, measuring player satisfaction, telemetry data analysis, and more. This post also discusses the art of the possible with newer innovations in AWS services around streaming, machine learning (ML), data sharing, and serverless capabilities.

Our gaming customers tell us that their key business objectives include the following:

  • Increased revenue from in-app purchases
  • High average revenue per user and lifetime value
  • Improved stickiness with better gaming experience
  • Improved event productivity and high ROI

Our gaming customers also tell us that while building analytics solutions, they want the following:

  • Low-code or no-code model – Out-of-the-box solutions are preferred to building customized solutions.
  • Decoupled and scalable – Serverless, auto scaled, and fully managed services are preferred over manually managed services. Each service should be easy to replace or enhance with little or no dependency on the others. Solutions should be flexible to scale up and down.
  • Portability to multiple channels – Solutions should be compatible with most endpoint channels, such as PC, mobile, and gaming platforms.
  • Flexible and easy to use – The solutions should provide less restrictive, easy-to-access, and ready-to-use data. They should also provide optimal performance with low or no tuning.

Analytics reference architecture for gaming organizations

In this section, we discuss how gaming organizations can use a data hub architecture to address the analytical needs of an enterprise, which requires the same data at multiple levels of granularity and different formats, and is standardized for faster consumption. A data hub is a center of data exchange that constitutes a hub of data repositories and is supported by data engineering, data governance, security, and monitoring services.

A data hub contains data at multiple levels of granularity and is often not integrated. It differs from a data lake by offering data that is pre-validated and standardized, allowing for simpler consumption by users. Data hubs and data lakes can coexist in an organization, complementing each other. Data hubs are more focused around enabling businesses to consume standardized data quickly and easily. Data lakes are more focused around storing and maintaining all the data in an organization in one place. And unlike data warehouses, which are primarily analytical stores, a data hub is a combination of all types of repositories—analytical, transactional, operational, reference, and data I/O services, along with governance processes. A data warehouse is one of the components in a data hub.

The following diagram is a conceptual analytics data hub reference architecture. This architecture resembles a hub-and-spoke approach. Data repositories represent the hub. External processes are the spokes feeding data to and from the hub. This reference architecture partly combines a data hub and data lake to enable comprehensive analytics services.

Let’s look at the components of the architecture in more detail.

Sources

Data can be loaded from multiple sources, such as systems of record, data generated from applications, operational data stores, enterprise-wide reference data and metadata, data from vendors and partners, machine-generated data, social sources, and web sources. The source data is usually in either structured or semi-structured formats, which are highly and loosely formatted, respectively.

Data inbound

This section consists of components to process and load the data from multiple sources into data repositories. Loading can be in batch mode, continuous, pub/sub, or any other custom integration. ETL (extract, transform, and load) technologies, streaming services, APIs, and data exchange interfaces are the core components of this pillar. Unlike pure ingestion processes, the data can be transformed according to business rules before loading. You can apply technical or business data quality rules and load raw data as well. Essentially, this pillar provides the flexibility to get the data into repositories in its most usable form.

Data repositories

This section consists of a group of data stores, which includes data warehouses, transactional or operational data stores, reference data stores, domain data stores housing purpose-built business views, and enterprise datasets (file storage). The file storage component is usually a common component between a data hub and a data lake to avoid data duplication and provide comprehensiveness. Data can also be shared among all these repositories without physically moving it, using features such as data sharing and federated queries. However, data copy and duplication are allowed considering various consumption needs in terms of formats and latency.

Data outbound

Data is often consumed using structured queries for analytical needs. Datasets are also accessed for ML, data export, and publishing needs. This section consists of components to query, export, and exchange data, along with APIs. In terms of implementation, the same technologies may be used for both inbound and outbound, but the functions are different. However, it’s not mandatory to use the same technologies. These processes aren’t transformation heavy because the data is already standardized and almost ready to consume. The focus is on ease of consumption and integration with consuming services.

Consumption

This pillar consists of various consumption channels for enterprise analytical needs. It includes business intelligence (BI) users, canned and interactive reports, dashboards, data science workloads, Internet of Things (IoT), web apps, and third-party data consumers. Popular consumption entities in many organizations are queries, reports, and data science workloads. Because there are multiple data stores maintaining data at different granularity and formats to service consumer needs, these consumption components depend on data catalogs for finding the right source.

Data governance

Data governance is key to the success of a data hub reference architecture. It constitutes components like metadata management, data quality, lineage, masking, and stewardship, which are required for organized maintenance of the data hub. Metadata management helps organize the technical and business metadata catalog, and consumers can reference this catalog to learn what data is available, in which repository, and at what granularity, along with its format, owners, refresh frequency, and so on. Along with metadata management, data quality is important to increase confidence for consumers. This includes data cleansing, validation, conformance, and data controls.

Security and monitoring

Users and application access should be controlled at multiple levels. It starts with authentication, then authorizing who and what should be accessed, policy management, encryption, and applying data compliance rules. It also includes monitoring components to log the activity for auditing and analysis.

Analytics data hub solution architecture on AWS

The following reference architecture provides an AWS stack for the solution components.

Let’s look at each component again and the relevant AWS services.

Data inbound services

AWS Glue and Amazon EMR services are ideal for batch processing. They scale automatically and are able to process most of the industry standard data formats. Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, and Amazon Managed Streaming for Apache Kafka (Amazon MSK) enable you to build streaming process applications. These streaming services integrate well with the Amazon Redshift streaming feature. This helps you process real-time sources, IoT data, and data from online channels. You can also ingest data with third-party tools like Informatica, dbt, and Matillion.
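
For example, a game client or backend service might push telemetry events into a Kinesis data stream, which Redshift streaming ingestion (or a Firehose delivery stream) can then consume. The following is a minimal sketch using boto3; the stream name and event fields are hypothetical.

```python
import json
import boto3

kinesis = boto3.client("kinesis")

# Hypothetical telemetry event; adapt the fields to your game's schema.
event = {
    "player_id": "p-1234",
    "event_type": "level_complete",
    "level": 7,
    "occurred_at": "2023-01-15T18:22:31Z",
}

kinesis.put_record(
    StreamName="game-telemetry",              # hypothetical stream name
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey=event["player_id"],          # keeps one player's events on the same shard
)
```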

You can build RESTful APIs and WebSocket APIs using Amazon API Gateway and AWS Lambda, which will enable real-time two-way communication with web sources, social, and IoT sources. AWS Data Exchange helps with subscribing to third-party data in AWS Marketplace. Data subscription and access is fully managed with this service. Refer to the respective service documentation for further details.
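
As a rough sketch of the Lambda side of such an API, the handler below accepts a JSON payload posted through API Gateway and acknowledges it; the payload shape is hypothetical, and a real handler would validate the request and forward it to a stream or data store.

```python
import json

def lambda_handler(event, context):
    """Minimal handler for a POST route behind Amazon API Gateway."""
    body = json.loads(event.get("body") or "{}")

    # In practice you would validate the payload and forward it, for example
    # to Kinesis Data Streams or directly to a data repository.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"accepted": True, "event_type": body.get("event_type")}),
    }
```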

Data repository services

Amazon Redshift is the recommended data storage service for OLAP (Online Analytical Processing) workloads such as cloud data warehouses, data marts, and other analytical data stores. This service is the core of this reference architecture on AWS and can address most analytical needs out of the box. You can use simple SQL to analyze structured and semi-structured data across data warehouses, data marts, operational databases, and data lakes to deliver the best price performance at any scale. The Amazon Redshift data sharing feature provides instant, granular, and high-performance access without data copies and data movement across multiple Amazon Redshift data warehouses in the same or different AWS accounts, and across Regions.
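
To illustrate the data sharing feature, the following statements (shown as Python string constants you could submit through the Redshift Data API or Query Editor) are a minimal sketch; the datashare, schema, and namespace identifiers are placeholders.

```python
# Run on the producer data warehouse (placeholder names throughout).
PRODUCER_SQL = """
CREATE DATASHARE game_share;
ALTER DATASHARE game_share ADD SCHEMA analytics;
ALTER DATASHARE game_share ADD ALL TABLES IN SCHEMA analytics;
GRANT USAGE ON DATASHARE game_share TO NAMESPACE '<consumer-namespace-guid>';
"""

# Run on the consumer data warehouse to query the shared data without copying it.
CONSUMER_SQL = """
CREATE DATABASE game_share_db FROM DATASHARE game_share OF NAMESPACE '<producer-namespace-guid>';
SELECT COUNT(*) FROM game_share_db.analytics.game_events;
"""
```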

For ease of use, Amazon Redshift offers a serverless option. Amazon Redshift Serverless automatically provisions and intelligently scales data warehouse capacity to deliver fast performance for even the most demanding and unpredictable workloads, and you pay only for what you use. Just load your data and start querying right away in Amazon Redshift Query Editor or in your favorite BI tool and continue to enjoy the best price performance and familiar SQL features in an easy-to-use, zero administration environment.
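
For instance, a minimal sketch of querying a Redshift Serverless workgroup through the Redshift Data API with boto3 might look like the following; the workgroup, database, and table names are hypothetical.

```python
import time
import boto3

rsd = boto3.client("redshift-data")

# Submit a query to a hypothetical serverless workgroup.
resp = rsd.execute_statement(
    WorkgroupName="gaming-analytics",
    Database="dev",
    Sql="SELECT event_type, COUNT(*) FROM game_events GROUP BY 1 ORDER BY 2 DESC LIMIT 10;",
)

# Poll until the statement finishes, then fetch the result set.
statement_id = resp["Id"]
while rsd.describe_statement(Id=statement_id)["Status"] not in ("FINISHED", "FAILED", "ABORTED"):
    time.sleep(1)

for record in rsd.get_statement_result(Id=statement_id)["Records"]:
    print(record)
```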

Amazon Relational Database Service (Amazon RDS) is a fully managed service for building transactional and operational data stores. You can choose from many popular engines such as MySQL, PostgreSQL, MariaDB, Oracle, and SQL Server. With the Amazon Redshift federated query feature, you can query transactional and operational data in place without moving the data. The federated query feature currently supports Amazon RDS for PostgreSQL, Amazon Aurora PostgreSQL-Compatible Edition, Amazon RDS for MySQL, and Amazon Aurora MySQL-Compatible Edition.
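
A hedged sketch of setting up a federated schema over an Aurora PostgreSQL (or Amazon RDS for PostgreSQL) database follows; the host, role, and secret ARNs are placeholders, and the statements are shown as Python strings you could run through the Data API or Query Editor.

```python
# Placeholder connection details; the secret must contain the PostgreSQL credentials.
CREATE_FEDERATED_SCHEMA_SQL = """
CREATE EXTERNAL SCHEMA ops_pg
FROM POSTGRES
DATABASE 'gameops' SCHEMA 'public'
URI 'ops-db.cluster-ro-abc123.us-east-1.rds.amazonaws.com' PORT 5432
IAM_ROLE 'arn:aws:iam::111122223333:role/redshift-federated-role'
SECRET_ARN 'arn:aws:secretsmanager:us-east-1:111122223333:secret:ops-db-creds';
"""

# Join live operational rows with warehouse tables without moving the data.
FEDERATED_QUERY_SQL = """
SELECT w.player_id, w.wallet_balance, s.lifetime_spend
FROM ops_pg.player_wallets AS w
JOIN analytics.player_spend AS s USING (player_id);
"""
```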

Amazon Simple Storage Service (Amazon S3) is the recommended service for multi-format storage layers in the architecture. It offers industry-leading scalability, data availability, security, and performance. Organizations typically store data in Amazon S3 using open file formats. Open file formats enable analysis of the same Amazon S3 data using multiple processing and consumption layer components. Data in Amazon S3 can be easily queried in place using SQL with Amazon Redshift Spectrum. It helps you query and retrieve structured and semi-structured data from files in Amazon S3 without having to load the data. Multiple Amazon Redshift data warehouses can concurrently query the same datasets in Amazon S3 without the need to make copies of the data for each data warehouse.
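
As a sketch of querying files in Amazon S3 in place with Redshift Spectrum, the statements below register an external schema against the AWS Glue Data Catalog and query an external table; the database, role, and table names are hypothetical.

```python
# Placeholder names; the IAM role needs access to Amazon S3 and the Glue Data Catalog.
CREATE_SPECTRUM_SCHEMA_SQL = """
CREATE EXTERNAL SCHEMA IF NOT EXISTS game_lake
FROM DATA CATALOG
DATABASE 'game_lake_db'
IAM_ROLE 'arn:aws:iam::111122223333:role/redshift-spectrum-role'
CREATE EXTERNAL DATABASE IF NOT EXISTS;
"""

# Query Parquet or CSV files in S3 through the external table without loading them.
SPECTRUM_QUERY_SQL = """
SELECT event_date, COUNT(*) AS purchases
FROM game_lake.purchase_events
WHERE event_date >= '2023-01-01'
GROUP BY event_date
ORDER BY event_date;
"""
```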

Data outbound services

Amazon Redshift comes with the web-based analytics workbench Query Editor V2.0, which helps you run queries, explore data, create SQL notebooks, and collaborate on data with your teams in SQL through a common interface. AWS Transfer Family helps securely transfer files using SFTP, FTPS, FTP, and AS2 protocols. It supports thousands of concurrent users and is a fully managed, low-code service. Similar to inbound processes, you can utilize Amazon API Gateway and AWS Lambda for data pull using the Amazon Redshift Data API. And AWS Data Exchange helps publish your data to third parties for consumption through AWS Marketplace.

Consumption services

Amazon QuickSight is the recommended service for creating reports and dashboards. It enables you to create interactive dashboards, visualizations, and advanced analytics with ML insights. Amazon SageMaker is the ML platform for all your data science workload needs. It helps you build, train, and deploy models consuming the data from repositories in the data hub. You can use AWS front-end web and mobile services and AWS IoT services to build web, mobile, and IoT endpoint applications to consume data out of the data hub.

Data governance services

The AWS Glue Data Catalog and AWS Lake Formation are the core data governance services AWS currently offers. These services help manage metadata centrally for all the data repositories and manage access controls. They also help with data classification and can automatically handle schema changes. You can use Amazon DataZone to discover and share data at scale across organizational boundaries with built-in governance and access controls. AWS is investing in this space to provide a more unified experience for AWS services. There are many partner products, such as Collibra, Alation, Amorphic, Informatica, and more, which you can also use for data governance functions with AWS services.

Security and monitoring services

AWS Identity and Access Management (AWS IAM) manages identities for AWS services and resources. You can define users, groups, roles, and policies for fine-grained access management of your workforce and workloads. AWS Key Management Service (AWS KMS) manages AWS managed keys and customer managed keys for your applications. Amazon CloudWatch and AWS CloudTrail help provide monitoring and auditing capabilities. You can collect metrics and events and analyze them for operational efficiency.
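
As an illustration, a narrowly scoped identity-based policy for an analytics application that only needs to run queries through the Redshift Data API might look like the sketch below; the policy is an assumption for illustration, and your own policies should follow least privilege for your specific resources.

```python
import json

# Hypothetical least-privilege policy for an application that queries through the Data API.
DATA_API_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "redshift-data:ExecuteStatement",
                "redshift-data:DescribeStatement",
                "redshift-data:GetStatementResult",
            ],
            "Resource": "*",
        }
    ],
}

print(json.dumps(DATA_API_POLICY, indent=2))
```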

In this post, we’ve discussed the most common AWS services for the respective solution components. However, you aren’t limited to only these services. There are many other AWS services for specific use cases that may be more appropriate for your needs than what we discussed here. You can reach out to AWS Analytics Solutions Architects for appropriate guidance.

Example architectures for gaming use cases

In this section, we discuss example architectures for two gaming use cases.

Game event analysis

In-game events (also called timed or live events) encourage player engagement through excitement and anticipation. Events entice players to interact with the game, increasing player satisfaction and revenue from in-game purchases. Events have become increasingly important as games shift from static pieces of entertainment, played as is, to dynamic experiences that use player information to make decisions about game play while the game is being played. This enables games to change as players play, shows what works and what doesn’t, and gives any game a potentially infinite lifespan.

This capability of in-game events to offer fresh content and activities within a familiar framework is how you keep players engaged and playing for months to years. Players can enjoy new experiences and challenges within the familiar framework or world that they have grown to love.

The following example shows how such an architecture might appear, including changes to support various sections of the process like breaking the data into separate containers to accommodate scalability, charge-back, and ownership.

To fully understand how events are viewed by the players and to make decisions about future events, you need information on how the latest event actually performed. This means gathering a lot of data as the players play in order to build key performance indicators (KPIs) that measure the effectiveness of, and player satisfaction with, each event. This requires analytics that capture, analyze, and report on the player experience for each event. These KPIs include the following:

  • Initial user flow interactions – What actions users are taking after they first receive or download an event update in a game. Are there any clear drop-off points or bottlenecks that are turning people off the event?
  • Monetization – When, where, and what users are spending money on during the event, whether it’s buying in-game currency, responding to ads, specials, and so on.
  • Game economy – How users can earn and spend virtual currencies or goods during an event, using in-game money, trades, or barter.
  • In-game activity – Player wins, losses, leveling up, competition wins, or player achievements within the event.
  • User-to-user interactions – Invitations, gifting, chats (private and group), challenges, and so on during an event.

These are just some of the KPIs and metrics that are key for predictive modeling of events as the game acquires new players while keeping existing users involved, engaged, and playing.
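
To make this concrete, a monetization KPI such as daily revenue per paying player for a single event could be computed with a query along the lines of the following sketch; the schema, table, and event identifier are hypothetical.

```python
# Hypothetical purchases table populated by the inbound pipeline.
MONETIZATION_KPI_SQL = """
SELECT
    DATE_TRUNC('day', purchase_ts)                AS day,
    COUNT(DISTINCT player_id)                     AS paying_players,
    SUM(amount_usd)                               AS revenue_usd,
    SUM(amount_usd) / COUNT(DISTINCT player_id)   AS revenue_per_paying_player
FROM analytics.purchases
WHERE event_id = 'winter-festival-2023'
GROUP BY 1
ORDER BY 1;
"""
```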

In-game activity analysis

In-game activity analysis essentially looks at any meaningful, purposeful activity the player might show, with the goal of trying to understand what actions are taken, their timing, and outcomes. This includes situational information about the players, including where they are playing (both geographical and cultural), how often, how long, what they undertake on each login, and other activities.

The following example shows how such an architecture might appear, including changes to support various sections of the process like breaking the data into separate warehouses. The multi-cluster warehouse approach helps scale the workload independently, provides flexibility to the implemented charge-back model, and supports decentralized data ownership.

The solution essentially logs information to help understand the behavior of your players, which can lead to insights that increase retention of existing players, and acquisition of new ones. This can provide the ability to do the following:

  • Provide in-game purchase recommendations
  • Measure player trends in the short term and over time
  • Plan events the players will engage in
  • Understand what parts of your game are most successful and which are less so

You can use this understanding to make decisions about future game updates, make in-game purchase recommendations, determine when and how your game economy may need to be balanced, and even allow players to change their character or play as the game progresses by injecting this information and accompanying decisions back into the game.

Conclusion

This reference architecture, while showing examples of only a few analysis types, provides a faster technology path for enabling game analytics applications. The decoupled, hub-and-spoke approach brings the agility and flexibility to implement different approaches to analytics and to understanding the performance of game applications. The purpose-built AWS services described in this architecture provide comprehensive capabilities to easily collect, store, measure, analyze, and report game and event metrics. This helps you efficiently perform in-game and event analysis, measure player satisfaction, provide tailor-made recommendations to players, organize events, and increase retention rates.

Thanks for reading the post. If you have any feedback or questions, please leave them in the comments.


About the authors

Satesh Sonti is a Sr. Analytics Specialist Solutions Architect based out of Atlanta, specialized in building enterprise data platforms, data warehousing, and analytics solutions. He has over 16 years of experience in building data assets and leading complex data platform programs for banking and insurance clients across the globe.

Tanya Rhodes is a Senior Solutions Architect based out of San Francisco, focused on games customers with emphasis on analytics, scaling, and performance enhancement of games and supporting systems. She has over 25 years of experience in enterprise and solutions architecture specializing in very large business organizations across multiple lines of business including games, banking, healthcare, higher education, and state governments.

Best Practices for Hosting Regulated Gaming Workloads in AWS Local Zones and on AWS Outposts

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/best-practices-for-hosting-regulated-gaming-workloads-in-aws-local-zones-and-on-aws-outposts/

This blog post is written by Shiv Bhatt, Manthan Raval, and Pawan Matta, who are Senior Solutions Architects with AWS.

Many industries are subject to regulations that are created to protect the interests of the various stakeholders. For some industries, the specific details of the regulatory requirements influence not only the organization’s operations, but also their decisions for adopting new technology. In this post, we highlight the workload residency challenges that you may encounter when you deploy regulated gaming workloads, and how AWS Local Zones and AWS Outposts can help you address those challenges.

Regulated gaming workloads and residency requirements

A regulated gaming workload is a type of workload that’s subject to federal, state, local, or tribal laws related to the regulation of gambling and real money gaming. Examples of these workloads include sports betting, horse racing, casino, poker, lottery, bingo, and fantasy sports. The operators provide gamers with access to these workloads through online and land-based channels, and they’re required to follow various regulations required in their jurisdiction. Some regulations define specific workload residency requirements, and depending on the regulatory agency, the regulations could require that workloads be hosted within a specific city, state, province, or country. For example, in the United States, different state and tribal regulatory agencies dictate whether and where gaming operations are legal in a state, and who can operate. The agencies grant licenses to the operators of regulated gaming workloads, which then govern who can operate within the state, and sometimes, specifically where these workloads can be hosted. In addition, federal legislation can also constrain how regulated gaming workloads can be operated. For example, the United States Federal Wire Act makes it illegal to facilitate bets or wagers on sporting events across state lines. This regulation requires that operators make sure that users who place bets in a specific state are also within the borders of that state.

Benefits of using AWS edge infrastructure with regulated gaming workloads

The use of AWS edge infrastructure, specifically Local Zones and Outposts to host a regulated gaming workload, can help you meet workload residency requirements. You can manage Local Zones and Outposts by using the AWS Management Console or by using control plane API operations, which lets you seamlessly consume compute, storage, and other AWS services.

Local Zones

Local Zones are a type of AWS infrastructure deployment that places compute, storage, database, and other select services closer to large population, industry, and IT centers. Like AWS Regions, Local Zones enable you to innovate more quickly and bring new products to market sooner without having to worry about hardware and data center space procurement, capacity planning, and other forms of undifferentiated heavy lifting. Local Zones have their own connections to the internet, and support AWS Direct Connect, so that workloads hosted in the Local Zone can serve local end users with very low-latency communications. Local Zones are by default connected to a parent Region via Amazon’s redundant and high-bandwidth private network. This lets you extend Amazon Virtual Private Cloud (Amazon VPC) in the AWS Region to Local Zones. Furthermore, this provides applications hosted in AWS Local Zones with fast, secure, and seamless access to the broader portfolio of AWS services in the AWS Region. You can see the full list of AWS services supported in Local Zones on the AWS Local Zones features page.

You can start using Local Zones right away by enabling them in your AWS account. There are no setup fees, and as with the AWS Region, you pay only for the services that you use. There are three ways to pay for Amazon Elastic Compute Cloud (Amazon EC2) instances in Local Zones: On-Demand, Savings Plans, and Spot Instances. See the full list of cities where Local Zones are available on the Local Zones locations page.
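
For example, opting in to a Local Zone group and extending an existing VPC into it can be done with a couple of EC2 API calls, sketched below with boto3; the zone group, VPC ID, and CIDR block are hypothetical.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Opt the account in to a Local Zone group (hypothetical choice of Boston).
ec2.modify_availability_zone_group(
    GroupName="us-east-1-bos-1",
    OptInStatus="opted-in",
)

# Extend an existing VPC into the Local Zone by creating a subnet there.
ec2.create_subnet(
    VpcId="vpc-0123456789abcdef0",         # hypothetical VPC ID
    CidrBlock="10.0.32.0/24",              # hypothetical CIDR block
    AvailabilityZone="us-east-1-bos-1a",   # the Local Zone name
)
```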

Outposts

Outposts is a family of fully-managed solutions that deliver AWS infrastructure and services to most customer data center locations for a consistent hybrid experience. For a full list of countries and territories where Outposts is available, see the Outposts rack FAQs and Outposts servers FAQs. Outposts is available in various form factors, from 1U and 2U Outposts servers to 42U Outposts racks, and multiple rack deployments. To learn more about specific configuration options and pricing, see Outposts rack and Outposts servers.

You configure Outposts to work with a specific AWS Region using AWS Direct Connect or an internet connection, which lets you extend Amazon VPC in the AWS Region to Outposts. Like Local Zones, this provides applications hosted on Outposts with fast, secure, and seamless access to the broader portfolio of AWS services in the AWS Region. See the full list of AWS services supported on Outposts rack and on Outposts servers.

Choosing between AWS Regions, Local Zones, and Outposts

When you build and deploy a regulated gaming workload, you must assess the residency requirements carefully to make sure that your workload complies with regulations. As you make your assessment, we recommend that you consider separating your regulated gaming workload into regulated and non-regulated components. For example, for a sports betting workload, the regulated components might include sportsbook operation, and account and wallet management, while non-regulated components might include marketing, the odds engine, and responsible gaming. In describing the following scenarios, it’s assumed that regulated and non-regulated components must be fault-tolerant.

For hosting the non-regulated components of your regulated gaming workload, we recommend that you consider using an AWS Region instead of a Local Zone or Outpost. An AWS Region offers higher availability, larger scale, and a broader selection of AWS services.

For hosting regulated components, the type of AWS infrastructure that you choose will depend on which of the following scenarios applies to your situation:

  1. Scenario one: An AWS Region is available in your jurisdiction and local regulators have approved the use of cloud services for your regulated gaming workload.
  2. Scenario two: An AWS Region isn’t available in your jurisdiction, but a Local Zone is available, and local regulators have approved the use of cloud services for your regulated gaming workload.
  3. Scenario three: An AWS Region or Local Zone isn’t available in your jurisdiction, or local regulators haven’t approved the use of cloud services for your regulated gaming workload, but Outposts is available.

Let’s look at each of these scenarios in detail.

Scenario one: Use an AWS Region for regulated components

When local regulators have approved the use of cloud services for regulated gaming workloads, and an AWS Region is available in your jurisdiction, consider using an AWS Region rather than a Local Zone and Outpost. For example, in the United States, the State of Ohio has announced that it will permit regulated gaming workloads to be deployed in the cloud on infrastructure located within the state when sports betting goes live in January 2023. By using the US East (Ohio) Region, operators in the state don’t need to procure and manage physical infrastructure and data center space. Instead, they can use various compute, storage, database, analytics, and artificial intelligence/machine learning (AI/ML) services that are readily available in the AWS Region. You can host a regulated gaming workload entirely in a single AWS Region, which includes Availability Zones (AZs) – multiple, isolated locations within each AWS Region. By deploying your workload redundantly across at least two AZs, you can help ensure high availability, as shown in the following figure.

AWS Region hosting regulated and non-regulated components

Scenario two: Use a Local Zone for regulated components

A second scenario might be that local regulators have approved the use of cloud services for regulated gaming workloads, and an AWS Region isn’t available in your jurisdiction, but a Local Zone is available. In this scenario, consider using a Local Zone rather than Outposts. A Local Zone can support more elasticity in a more cost-effective way than Outposts can. However, you might also consider using a Local Zone and Outposts together to increase availability and scalability for regulated components. Let’s consider the State of Illinois, in the United States, which allows regulated gaming workloads to be deployed in the cloud, if workload residency requirements are met. Operators in this state can host regulated components in a Local Zone in Chicago, and they can also use Outposts in their data center in the same state, for high availability and disaster recovery, as shown in the following figure.

Route ingress gaming traffic through an AWS Region hosting non-regulated components, with a Local Zone and Outposts hosting regulated components

Scenario three: Use of Outposts for regulated components

When local regulators haven’t approved the use of cloud services for regulated gaming workloads, or when an AWS Region or Local Zone isn’t available in your jurisdiction, you can still choose to host your regulated gaming workloads on Outposts for a consistent cloud experience, if Outposts is available in your jurisdiction. If you choose to use Outposts, then note that as part of the shared responsibility model, customers are responsible for attesting to physical security and access controls around the Outpost, as well as environmental requirements for the facility, networking, and power. Using Outposts requires you to procure and manage a data center within the city, state, province, or country boundary (as required by local regulations) that is suitable to host the regulated components. Furthermore, you should procure and configure supported network connections between Outposts and the parent AWS Region. During the Outposts ordering process, you should account for the compute and network capacity required to support your peak load and availability design.

For a higher availability level, you should consider procuring and deploying two or more Outposts racks or Outposts servers in a data center. You might also consider deploying redundant network paths between Outposts and the parent AWS Region. However, depending on your business service level agreement (SLA) for the regulated gaming workload, you might choose to spread Outposts racks across two or more isolated data centers within the same regulated boundary, as shown in the following figure.

Route ingress gaming traffic through an AWS Region hosting non-regulated components, with Outposts hosting regulated components

Options to route ingress gaming traffic

You have two options to route ingress gaming traffic coming into your regulated and non-regulated components when you deploy the configurations that we described previously in Scenarios two and three. Your gaming traffic can come through to the AWS Region, or through the Local Zones or Outposts. Note that the benefits that we mentioned previously around selecting the AWS Region for deploying regulated and non-regulated components are the same when you select an ingress route.

Let’s discuss the benefits and trade-offs for each of these options.

Option one: Route ingress gaming traffic through an AWS Region

If you choose to route ingress gaming traffic through an AWS Region, your regulated gaming workloads benefit from access to the wide range of tools, services, and capacity available in the AWS Region. For example, native AWS security services, like AWS WAF and AWS Shield, which provide protection against DDoS attacks, are currently only available in AWS Regions. Only traffic that you route into your workload through an AWS Region benefits from these services.

If you route gaming traffic through an AWS Region, and non-regulated components are hosted in an AWS Region, then traffic has a direct path to non-regulated components. In addition, gaming traffic destined to regulated components, hosted in a Local Zone and on Outposts, can be routed through your non-regulated components and a few native AWS services in the AWS Region, as shown in Figure 2.

Option two: Route ingress gaming traffic through a Local Zone or Outposts

Choosing to route ingress gaming traffic through a Local Zone or Outposts requires careful planning to make sure that tools, services, and capacity are available in that jurisdiction, as shown in the following figure. In addition, consider how choosing this route will influence the pillars of the AWS Well-Architected Framework. This route might require deploying and managing most of your non-regulated components in a Local Zone or on Outposts as well, including native AWS services that aren’t available in Local Zones or on Outposts. If you plan to implement this topology, then we recommend that you consider using AWS Partner solutions to replace the native AWS services that aren’t available in Local Zones or Outposts.

Route ingress gaming traffic through a Local Zone and Outposts that are hosting regulated and non-regulated components, with an AWS Region hosting limited non-regulated components

Conclusion

If you’re building regulated gaming workloads, then you might have to follow strict workload residency and availability requirements. In this post, we’ve highlighted how Local Zones and Outposts can help you meet these workload residency requirements by bringing AWS services closer to where they’re needed. We also discussed the benefits of using AWS Regions in complement to the AWS edge infrastructure, and several reliability and cost design considerations.

Although this post provides information to consider when making choices about using AWS for regulated gaming workloads, you’re ultimately responsible for maintaining compliance with the gaming regulations and laws in your jurisdiction. You’re in the best position to determine whether activities are legal, including evaluating the jurisdiction of the activities, how the activities are made available, and whether specific technologies or services are required to ensure compliance with the applicable law. You should always review these regulations and laws before you deploy regulated gaming workloads on AWS.

Demystifying Interviewing for Backend Engineers @ Netflix

Post Syndicated from Netflix Technology Blog original https://netflixtechblog.com/demystifying-interviewing-for-backend-engineers-netflix-aceb26a83495

By Karen Casella, Director of Engineering, Access & Identity Management

Have you ever experienced one of the following scenarios while looking for your next role?

  • You study and practice coding interview problems for hours/days/weeks/months, only to be asked to merge two sorted lists.
  • You apply for multiple roles at the same company and proceed through the interview process with each hiring team separately, despite the fact that there is tremendous overlap in the roles.
  • You go through the interview process, do really well, get really excited about the company and the people you meet, and in the end, you are “matched” to a role that does not excite you, working with a manager and team you have not even met during the interview process.

Interviewing can be a daunting endeavor and how companies, and teams, approach the process varies greatly. We hope that by demystifying the process, you will feel more informed and confident about your interview experience.

Backend Engineering Interview Loop

When you apply for a backend engineering role at Netflix, or if one of our recruiters or hiring managers find your LinkedIn profile interesting, a recruiter or hiring manager reviews your technical background and experience to see if your experience is aligned with our requirements. If so, we invite you to begin the interview process.

Most backend engineering teams follow a process very similar to what is shown below. While this is a relatively streamlined process, it is not as efficient if a candidate is interested in or qualified for multiple roles within the organization.

Following is a brief description of each of these stages.

Recruiter Phone Screen: A member of our talent team contacts you to explain the process and to assess your high-level qualifications. The recruiter also reviews the relevant open roles to see if you have a strong affinity for one or another. If your interests and experience align well with one or more of the roles, they schedule a phone screen with one of the hiring managers.

Manager Phone Screen: The purpose of this discussion is to get a sense for your technical background, your approach to problem solving, and how you work. It’s also a great opportunity for you to learn more about the available roles, the technical challenges the teams are facing and what it’s like to work on a backend engineering team at Netflix.

Technical Screen: The final screen before on-site interviews is used to assess your technical skills and match for the team. For many roles, you will be given a choice between a take-home coding exercise or a one-hour discussion with one of the engineers from the team. The problems you are asked to solve are related to the work of the team.

Round 1 Interviews: If you are invited on-site, the first round interview is with four or five people for 45 minutes each. The interview panel consists of two or three engineers, a hiring manager and a recruiter. The engineers assess your technical skills by asking you to solve various design and coding problems. These questions reflect actual challenges that our teams face.

Round 2 Interviews: You meet with two or three additional people, for 45 minutes each. The interview panel comprises an engineering director, a partner engineer or manager, and another engineering leader. The focus of this round is to assess how well you partner with other teams and your non-technical skills.

Decision & Offer: After round 2, we review the feedback and decide whether or not we will be offering you a role. If so, you will work with the recruiter to discuss compensation expectations, answer any questions that remain for you, and discuss a start date with your new team.

Enter Centralized Hiring

Some Netflix backend engineering teams, seeking stunning colleagues with similar backgrounds and talents, are joining forces and adopting a centralized hiring model. Centralized hiring is an approach of making multiple hiring decisions through one unified hiring process across multiple teams with shared needs in skill, function and experience level.

The interview approach does not vary much from what is shown above, with one big exception: there are several potential “pivot points” where you and / or Netflix may decide to focus on a particular role based on your experience and preference. At each stage of the process, we consider your preference and skills and may focus your remaining interviews with a specific team if we both consider it a strong match. It’s important to note that, even though your experience may not be an exact match for one team, you might be more closely aligned with another team. In that case, we would pivot you to another team rather than disqualify you from the process.

Interview Tips

Interviewing can be intimidating and stressful! Being prepared can help you minimize stress and anxiety. Following are a few quick tips to help you prepare:

  • Review your profile and make connections between your experience and the job description.
  • Think about your past work experiences and prepare some examples of when you achieved something amazing, or had some tough challenges.
  • We recommend against practicing puzzle-type coding exercises, as we don’t ask those types of questions. If you want to practice, focus on medium-difficulty real-world problems you might encounter in a software engineering role.
  • Be sure to have questions prepared to ask the interviewers. This is a conversation, not an inquisition!

We are here to accommodate any accessibility needs you may have, to ensure that you’re set up for success during your interview. Let us know if you need any assistive technology or other accommodations ahead of time, and we’ll be sure to work with you to get it set up.

We want to see you at your best — we are not trying to trick you or trip you up! Try to relax, remember to breathe, and be honest and curious. Remember, this is not just about whether Netflix thinks you are a fit for the role, it’s about you deciding that Netflix and the role are right for you!

Yes, We Are Hiring!

Several of our backend engineering teams are searching for our next stunning colleagues. Some of the areas for which we are actively seeking backend engineers include Streaming & Gaming Technologies, Product Innovation, Infrastructure, and Studio Technologies. If any of the high-level descriptions below are of interest to you and seem like a good match for your experience and career goals, we’d like to hear from you! Simply click on the job description link and submit your application through our jobs site.

Streaming & Gaming Technologies

(https://jobs.netflix.com/jobs/175726412)

  • You are a distributed systems engineer working on product backend systems that support streaming video and/or mobile & cloud games.
  • You’re passionate about resilience, scalability, availability, and observability. Passion for large data sets, APIs, access & identity management, or delivering backend systems that enable mobile and cloud gaming is a big plus.
  • Your work centers around architecting, building and operating fault-tolerant distributed systems at massive scale.

Product Innovation

(https://jobs.netflix.com/jobs/175728345)

  • You are a distributed systems engineer working on core backend services that support our user journeys in signup, subscription, search, personalization and messaging.
  • You’re passionate about working at the intersection of business, product and technology at large scale.
  • Your work centers around building fault-tolerant backend systems and services that make a direct impact on users and the business.

Infrastructure

(https://jobs.netflix.com/jobs/122163878)

  • You are a distributed systems engineer working on infrastructure and platforms that enable or amplify the work of other engineering teams or systems.
  • You’re passionate about scalable and highly available complex distributed systems and have a deep understanding of how they operate and fail.
  • Your work centers around raising levels of abstraction to improve development at scale and creating engineering efficiencies.

Studio Technologies

(https://jobs.netflix.com/jobs/175745345)

  • You are a software engineer who builds products and services used by creative partners across the studio and external productions to produce and manage all of Netflix’s global content. Our products enable the entire workflow of content acquisition, production, promotion and financing from script to screen. We create innovative solutions that develop and manage entertainment at scale while helping entertain the world as members find joy in the shows and movies they love.
  • You’re passionate about innovation, scalability, functionality, shipping high-value features quickly and are committed to delivering exceptional backend systems for our consumers. You’re humble, curious, and looking to deliver results with other stunning colleagues.
  • Your work centers around building products and services targeting creative partners producing/managing global content.

Conclusion

Netflix has a Freedom & Responsibility culture in which every Netflix employee has the freedom to do their best work and the responsibility to achieve excellence. We value strong judgment, communication, impact, curiosity, innovation, courage, passion, integrity, selflessness, inclusion, and diversity. For more information on the culture, see http://jobs.netflix.com/culture.

Karen Casella is the Director of Engineering for Access & Identity Management technologies for Netflix streaming and gaming products. Connect with Karen on LinkedIn or Twitter.



AWS Shield threat landscape review: 2020 year-in-review

Post Syndicated from Mario Pinho original https://aws.amazon.com/blogs/security/aws-shield-threat-landscape-review-2020-year-in-review/

AWS Shield is a managed service that protects applications that are running on Amazon Web Services (AWS) against external threats, such as bots and distributed denial of service (DDoS) attacks. Shield detects network and web application-layer volumetric events that may indicate a DDoS attack, web content scraping, or other unauthorized non-human traffic that is interacting with AWS resources.

In this blog post, I’ll show you some of the volumetric event trends from network traffic and web request patterns that we observed in 2020 as more workloads moved to the cloud. It includes insights that are broadly applicable to cloud applications and insights that are specific to gaming applications. I will also share tips and best practices that you can follow to protect the availability of the applications that you run on AWS.

DDoS trends as more developers rely on the cloud

In 2020, we saw an increase in developers building applications on AWS and protecting their availability with AWS Shield Advanced, which includes AWS WAF at no additional cost. The DDoS threat vectors we observed were similar to the ones that were observed in 2019, but they occurred with greater frequency. Between February 2020 and April 2020, we observed a 72% increase in the monthly number of events that were detected by Shield.

TCP SYN floods and UDP reflection attacks, which attempt to reflect and amplify packets off legitimate services running on the internet, were among the most common infrastructure-layer events detected by AWS Shield in 2020. (In this blog post, we’ll use the term infrastructure layer to refer to Layers 3 and 4 of the OSI model.) These tactics attempt to affect the availability of an application by overwhelming its ability to process packets or establish new connections on behalf of legitimate users. One of the oldest UDP reflection vectors, DNS reflection, remains the most common, at 15.5% of all infrastructure-layer events detected by Shield. TCP SYN floods were the second most common at 13.8%. This is unsurprising, because web applications commonly rely upon both DNS and TCP traffic. Bad actors can find a consistent supply of systems on the internet that can be used as reflectors, due to the properties of these protocols, or system misconfiguration.

Bad actors may use application-layer requests, in isolation or together with infrastructure-layer attacks, in their attempt to affect the availability of an application. The most common application-layer attack observed by Shield in 2020 was the web request flood, an observation that is consistent with prior years. This vector gives a bad actor more leverage, meaning that they can have a greater effect with less traffic and effort. Instead of having to exhaust the capacity of a network path, device, or other lower-level component, they only need to send more web requests than the application is able to handle. This attack vector was a significant cause of increased volumetric events detected by Shield in the first half of 2020. For more information about events detected by Shield during 2020, see Figure 1.
 

Figure 1: Monthly number of volumetric events detected by AWS Shield in 2020

A closer look at web application-layer attacks

The request volume of web application-layer events that are detected by AWS Shield has increased, an indication that bad actors are making greater investments in tactics that are more challenging to detect and mitigate than infrastructure-layer events. Shield continuously monitors DDoS activity and alerts customers if there is an elevated threat at any point in time. In 2020, Shield reported elevated threats on 53 days, 33 of which were caused by high-volume web request floods. There were 55 events with a volume of greater than 500,000 requests per second (RPS), some of which reached millions of RPS. The RPS of the 99th percentile (P99) of the volume of web request floods detected by Shield nearly doubled between the first and second halves of the year. (The 99th percentile is the request volume in RPS, below which 99% of request floods were observed.) For more information about the volume of web request floods detected by Shield in 2020, see Figure 2.
 

Figure 2: Quarterly P90 and P99 volume of web request floods detected by AWS Shield in 2020

It’s important to protect web applications against DDoS attacks of any size. The more common request floods are relatively small, but smaller attacks can affect an application if it isn’t architected for DDoS resiliency. You can follow these best practices to help protect your web application against request floods and other DDoS attacks:

  • Protect internet-facing resources with AWS Shield Advanced. You can use AWS Shield Advanced to protect your applications that are running on AWS against most common, frequently occurring network and transport layer DDoS attacks. When you add protected resources in AWS Shield Advanced, network volumetric attacks against those resources are detected and mitigated more quickly. You also receive visibility into security events by using the AWS Shield console, API, or Amazon CloudWatch metrics. If you need assistance during an active event, you can quickly engage with AWS Shield experts or escalate to the AWS Shield Response Team (SRT).
  • Access greater network and request capacity with Amazon CloudFront and Amazon Route 53. You can use these services to serve static and dynamic web content, as well as DNS answers, by using the global network of AWS edge locations. This provides you with greater capacity to help mitigate large volumetric attacks. Applications that are fronted by Amazon CloudFront and Amazon Route 53 also benefit from inline mitigation that continually inspects all traffic and mitigates most infrastructure-layer DDoS attempts in less than one second. CloudFront and the AWS Shield DDoS mitigation systems use SYN cookies to verify new connections, which protects against SYN floods and other traffic floods that aren’t valid for the application. (A SYN cookie is a technique by which the Shield infrastructure encodes connection setup information into the SYN response (SYN-ACK packet) in such a way that the TCP connection resources are only consumed for legitimate clients who complete the TCP handshake.)
  • Use AWS WAF and rate-based rules to mitigate application-layer attacks. AWS Shield Advanced provides you with protection against infrastructure-layer attacks that can be mitigated with network-based DDoS mitigation systems. When you add Shield Advanced protection to CloudFront or Application Load Balancer (ALB) for serving web content, you receive AWS WAF at no additional cost. AWS Managed Rules for AWS WAF makes it easy to select and apply pre-configured rules, depending on your specific requirements. You also receive web request flood detection and can mitigate security events by configuring rate-based rules to match and temporarily block IP addresses that are sending traffic above a rate that you define. For larger applications, or applications that span multiple AWS accounts, you can use AWS Firewall Manager to deploy and manage rules across all of your resources. A minimal sketch of a rate-based rule follows this list.
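
The following is a minimal sketch of a web ACL with a single rate-based rule, created with boto3 for a CloudFront distribution; the names, the 2,000-request limit, and the Region are illustrative, not a recommendation for your workload.

```python
import boto3

# The CLOUDFRONT scope requires the us-east-1 Region; use Scope="REGIONAL" for an ALB.
wafv2 = boto3.client("wafv2", region_name="us-east-1")

wafv2.create_web_acl(
    Name="game-frontend-acl",                 # hypothetical name
    Scope="CLOUDFRONT",
    DefaultAction={"Allow": {}},
    VisibilityConfig={
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "game-frontend-acl",
    },
    Rules=[
        {
            "Name": "rate-limit-per-ip",
            "Priority": 1,
            # Block source IPs that exceed roughly 2,000 requests in a 5-minute window.
            "Statement": {
                "RateBasedStatement": {"Limit": 2000, "AggregateKeyType": "IP"}
            },
            "Action": {"Block": {}},
            "VisibilityConfig": {
                "SampledRequestsEnabled": True,
                "CloudWatchMetricsEnabled": True,
                "MetricName": "rate-limit-per-ip",
            },
        }
    ],
)
```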

Considerations unique to gaming use cases

On AWS, you can build and protect any kind of application. Internet-facing applications are more likely to receive DDoS attacks, particularly if a bad actor is motivated to disrupt the normal function of the application. We looked across AWS Shield data and found that one type of application stood out as the most likely to be targeted by DDoS attacks: gaming servers. Gaming servers host matches between players on their personal computers or gaming consoles. 16% of infrastructure-layer events detected by Shield in 2020 targeted gaming applications. The application might be targeted simply out of malice, or to gain an advantage in the game. Between Q1 2020 and Q2 2020, we observed a 46% increase in the frequency of events that were detected on behalf of gaming applications. This increase aligns with the increased use of residential internet networks during the same time.

There are unique considerations for protecting a gaming application against DDoS attacks. Many gaming applications rely upon UDP traffic, which makes it infeasible to block UDP as a countermeasure against the most common DDoS attacks, like UDP reflection attacks or UDP floods. You can nevertheless protect your gaming application and the experience of your players by using Elastic IP addresses and protecting these resources with AWS Shield Advanced. Shield Advanced has the ability to perform deep packet inspection of all traffic, even at extremely high PPS rates. Using that powerful tool, the AWS Shield Response Team (SRT) can work with you to understand your application and build a custom mitigation that allows only valid player traffic.

Reacting to extortion attempts

From August 2020 through November 2020, we saw a revival of DDoS extortion attempts, a tactic that is now more than six years old. Each extortion attempt reported by customers to the AWS SRT had familiar characteristics. A malicious actor would target an application that wasn’t running on AWS as a proof of concept and then threaten a larger, follow-on attack if a ransom wasn’t paid. Although it’s very uncommon for the follow-on attack to actually occur, application owners take these threats seriously and use the opportunity to assess their own protection and operational readiness. In approximately 90% of AWS support cases related to these attempts, the SRT assisted the application owners directly with their preparation. We also assisted Shield Advanced customers who weren’t directly targeted by extortion attempts but were aware of other extortion campaigns.

One question that we frequently hear is how AWS can help developers monitor their applications and take quick action if a possible DDoS attack is detected. When you protect your resources with AWS Shield Advanced, you have the option to associate an Amazon Route 53 health check. The status of the health check is used to improve the decisions that are made by the Shield detection system. If you have Shield Advanced proactive engagement enabled, the SRT is automatically engaged any time a Shield event corresponds to an unhealthy Route 53 health check that is associated to your protected resource. Based on the contact information provided in the Shield console, an SRT engineer will contact you to coordinate a response to the detected event. If you’re running a web application, you can choose to delegate access to your Shield Advanced and AWS WAF APIs to the SRT and provide the team with copies of your AWS WAF logs. During an escalation, an SRT engineer will evaluate your logs for DDoS signatures and robotic patterns and assist in building effective mitigations.
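
A rough sketch of wiring this up with boto3 follows: create a Route 53 health check for your application endpoint and associate it with an existing Shield Advanced protection. The domain, path, and protection ID are placeholders.

```python
import boto3

route53 = boto3.client("route53")
shield = boto3.client("shield")

# Create a health check against a hypothetical public endpoint.
health_check = route53.create_health_check(
    CallerReference="game-frontend-hc-001",
    HealthCheckConfig={
        "Type": "HTTPS",
        "FullyQualifiedDomainName": "play.example.com",
        "ResourcePath": "/healthz",
        "RequestInterval": 30,
        "FailureThreshold": 3,
    },
)

# Associate the health check with an existing Shield Advanced protection.
shield.associate_health_check(
    ProtectionId="a1b2c3d4-5678-90ab-cdef-EXAMPLE11111",   # placeholder protection ID
    HealthCheckArn="arn:aws:route53:::healthcheck/" + health_check["HealthCheck"]["Id"],
)
```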

Summary

In this blog post, I shared some of the trends that were observed by AWS Shield in 2020, as well as steps that you can take to protect the availability of your applications against DDoS attacks. If you’d like to learn more about DDoS protection on AWS and configuring AWS Shield Advanced, check out the following resources:

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Shield forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Mário Pinho

Mário is a Security Engineer at AWS. He has a background in network engineering and consulting, and feels at his best when breaking apart complex topics and processes into their simpler components. In his free time, he pretends to be an artist by playing piano and doing landscape photography.