Today, I am happy to announce Redis 6 compatibility to Amazon ElastiCache for Redis. This release brings several new and important features to Amazon ElastiCache for Redis:
Managed Role-Based Access Control – Amazon ElastiCache for Redis 6 now provides you with the ability to create and manage users and user groups that can be used to set up Role-Based Access Control (RBAC) for Redis commands. You can now simplify your architecture while maintaining security boundaries by having several applications use the same Redis cluster without being able to access each other’s data. You can also take advantage of granular access control and authorization to create administration and read-only user groups. Amazon ElastiCache enhances the new Access Control Lists (ACL) introduced in open source Redis 6 to provide a managed RBAC experience, making it easy to set up access control across several Amazon ElastiCache for Redis clusters.
Client-Side Caching – Amazon ElastiCache for Redis 6 comes with server-side enhancements to deliver efficient client-side caching to further improve your application performance. Redis clusters now support client-side caching by tracking client requests and sending invalidation messages for data stored on the client. In addition, you can also take advantage of a broadcast mode that allows clients to subscribe to a set of notifications from Redis clusters.
Significant Operational Improvements – This release also includes several enhancements that improve application availability and reliability. Specifically, Amazon ElastiCache has improved replication under low memory conditions, especially for workloads with medium/large sized keys, by reducing latency and the time it takes to perform snapshots. Open source Redis enhancements include improvements to expiry algorithm for faster eviction of expired keys and various bug fixes.
Note that open source Redis 6 also announced support for encryption-in-transit, a capability that is already available in Amazon ElastiCache for Redis 4.0.10 onwards. This release of Amazon ElastiCache for Redis 6 does not impact Amazon ElastiCache for Redis’ existing support for encryption-in-transit.
In order to apply RBAC to a new or existing Redis 6 cluster, we first need to ensure you have a user and user group created. We’ll review the process to do this below.
Using Role-Based Access Control – How it works An alternative to Authenticating Users with the Redis AUTH Command, Amazon ElastiCache for Redis 6 offers Role-Based Access Control (RBAC). With RBAC, you create users and assign them specific permissions via an Access String.
If you want to create, modify, and delete users and user groups, you will need to select to the User Management and User Group Management sections in the ElastiCache console.
ElastiCache will automatically configure a default user with user ID and user name “default”, and then you can add it or new created users to new groups in User Group Management.
If you want to change the default user with your own password and access setting, you need to create a new user with the username set to “default” and can then swap it with the original default user. We recommend using your own strong password for a default user.
The following example shows how to swap the original default user with another default that has a modified access string via AWS CLI.
Also, you can modify a user’s password or change its access permissions using modify-user command, or remove a specific user using delete-user command. It will be removed from any user groups to which it belongs.
Similarly you can modify a user group by adding new users and/or removing current users using modify-user-group command, or delete a user group using delete-user-group command. Note that the user group itself, not the users belonging to the group, will be deleted.
Once you have created a user group and added users, you can assign the user group to a replication group, or migrate between Redis AUTH and RBAC. For more information, see the documentation in detail.
Redis 6 cluster for ElastiCache – Getting Started As usual, you can use the ElastiCache Console, CLI, APIs, or a CloudFormation template to create to new Redis 6 cluster. I’ll use the Console, choose Redis from the navigation pane and click Create with the following settings:
Select “Encryption in-transit” checkbox to ensure you can see the “Access Control” options. You can select an option of Access Control either User Group Access Control List by RBAC features or Redis AUTH default user. If you select RBAC, you can choose one of the available user groups.
My cluster is up and running within minutes. You can also use the in-place upgrade feature on existing cluster. By selecting the cluster, click Action and Modify. You can change the Engine Version from 5.0.6-compatible engine to 6.x.
Now Available Amazon ElastiCache for Redis 6 is now available in all AWS regions. For a list of ElastiCache for Redis supported versions, refer to the documentation. Please send us feedback either in the AWS forum for Amazon ElastiCache or through AWS support, or your account team.
Founded in 2008, Dream11 is India’s leading sports-tech startup with a growing base of more than 45 million users playing multiple sports such as fantasy cricket, football, kabaddi, and basketball.
Dream11 uses Amazon Aurora with Amazon ElastiCache to serve 1 million concurrent users within 50ms response time, serving at an average 3 million requests per minute (rpm), which can surge to 3X in a 30-second time span. In this video, we’ll be shedding some light on our architecture, along with our battle plan to handle transactions without locking. We’ll also talk about the features of Aurora and ElastiCache that helped Dream11 to handle 5 million daily active users with 4X growth YoY.
This post was co-written with Lucas Ceballos, CTO of Smadex
Introduction
Showing ads may seem to be a simple task, but it’s not. Showing the right ad to the right user is an incredibly complex challenge that involves multiple disciplines such as artificial intelligence, data science, and software engineering. Doing it one million times per second with a 100-ms constraint is even harder.
In the ad-tech business, speed and infrastructure costs are the keys to success. The less the final user waits for an ad, the higher the probability of that user clicking on the ad. Doing that while keeping infrastructure costs under control is crucial for business profitability.
About Smadex
Smadex is the leading mobile-first programmatic advertising platform specifically built to deliver best user acquisition performance and complete transparency.
Its state-of-the-art digital signal processing (DSP) technology provides advertisers with the tools they need to achieve their goals and ROI, with measurable results from web forms, post-app install events, store visits, and sales.
Smadex advertising architecture
What does showing ads look like under the hood? At Smadex, our technology works based on the OpenRTB (Real-Time Bidding) protocol.
RTB is a means by which advertising inventory is bought and sold on a per-impression basis, via programmatic instantaneous auction, which is similar to financial markets.
To show ads, we participate in auctions deciding in real time which ad to show and how much to bid trying to optimize the cost of every impression.
The final user browses the publisher’s website or app.
Ad-exchange is called to start a new auction.
Smadex receives the bid request and has to decide which ad to show and how much to offer in just 100 ms (and this is happening one million times per second).
If Smadex won the auction, the ad must be sent and rendered on the publisher’s website or app.
In the end, the user interacts with the ad sending new requests and events to Smadex platform.
Flow of data
As you can see in the previous diagram, showing ads is just one part of the challenge. After the ad is shown, the final user interacts with it in multiple ways, such as clicking it, installing an application, subscribing to a service, etc. This happens during a determined period that we call the “attribution window.” All of those interactions must be tracked and linked to the original bid transaction (using the request_id parameter).
Doing this is complicated: billions of bid transactions must be stored and available so that they can be quickly accessed every time the user interacts with the ad. The longer we store the transactions, the longer we can “wait” for an interaction to take place, and the better for our business and our clients, too.
Challenge #1: Cost
The challenge is: What kind of database can store billions of records per day, with at least a 30-day retention capacity (attribution window), be accessed by key-value, and all by spending as little as possible?
The answer is…none! Based on our research, all the available options that met the technical requirements were way out of our budget.
So…how to solve it? Here is when creativity and the combination of different AWS services comes into place.
We started to analyze the time dispersion of the events trying to find some clues. The interesting thing we spotted was that 90% of what we call “post-bid events” (impression, click, install, etc.) happened within one hour after the auction took place.
That means that we can process 90% of post-bid events by storing just one hour of bids.
Under our current workload, in one hour we participate in approximately 3.7 billion auctions generating 100 million bid records of an average 600 bytes each. This adds up to 55 gigabytes per hour, an easier amount of data to process.
Instead of thinking about one single database to store all the bid requests, we decided to split bids into two different categories:
Hot Bid: A request that took place within the last hour (small amount and frequently accessed)
Cold Bid: A request that took place more than our hour ago (huge amount and infrequently accessed)
Amazon ElastiCache for Redis is the best option to store 55 GB of data in memory, which gives us the ability to query in a key-value way with the lowest possible latency.
Hot Bids flow
Every new bid is a hot bid by definition so it’s going to be stored in the hot bids Redis cluster.
At the moment of the user interaction with the ad, the Smadex tracker component receives an HTTPS notification, including the bid request UUID that originated it.
Based on the date of occurrence extracted from the received UUID, the tracker component can determine if it’s looking for a hot bid or not. If it’s a hot bid, the tracker reads it directly from Redis performing a key-value lookup query.
It’s been easy so far but what to do with the other 29 days and 23 hours we need to store?
Challenge #2: Performance
As we previously mentioned, cold bids are a huge infrequently accessed number of records with only 10% of post-bid events pointing to them. That sounds like a good use case for an inexpensive and slower data store like Amazon S3.
Thanks to the S3 low-cost storage prices combined with the ability to query S3 objects directly using Amazon Athena, we were able to optimize our costs by storing and querying cold bids by implementing a serverless architecture.
Cold Bids Flow
Incoming bids are buffered by Fluentd and flushed to S3 every one minute in JSON format. Every single file flushed to S3 contains all the bids processed by a specific EC2 instance for one minute.
An AWS Lambda function is automatically triggered on every new PutObject event from S3. This function transforms the JSON records to Parquet format and will save it back the S3 bucket, but this time into a specific partition folder based on file creation timestamp.
As seen on the hot bids flow, the tracker component will determine if it’s looking for a hot or a cold bid based on the extracted timestamp of the request UUID. In this case, the cold bid will be retrieved by running an Amazon Athena look-up query leveraging the use of partitions and Parquet format to reduce as much as possible the latency and data that needs to be scanned.
Conclusion
Thanks to this combined approach using different technologies and a variety of AWS services we were able to extend our attribution window from 30 to 90 days while reducing the infrastructure costs by 45%.
In-memory data stores are widely used for application scalability, and developers have long appreciated their benefits for storing frequently accessed data, whether volatile or persistent. Systems like Redis help decouple databases and backends from incoming traffic, shedding most of the traffic that would had otherwise reached them, and reducing application latency for users.
Obviously, managing these servers is a critical task, and great care must be taken to keep them up and running no matter what. In a previous job, my team had to move a cluster of physical cache servers across hosting suites: one by one, they connected them to external batteries, unplugged external power, unracked them, and used an office trolley (!) to roll them to the other suite where they racked them again! It happened without any service interruption, but we all breathed a sigh of relief once this was done… Lose cache data on a high-traffic platform, and things get ugly. Fast. Fortunately, cloud infrastructure is more flexible! To help minimize service disruption should an incident occur, we have added many high-availability features to Amazon ElastiCache, our managed in-memory data store for Memcached and Redis: cluster mode, multi-AZ with automatic failover, etc.
As Redis is often used to serve low latency traffic to global users, customers have told us that they’d love to be able to replicate Amazon ElastiCache clusters across AWS regions. We listened to them, got to work, and today, we’re very happy to announce that this replication capability is now available for Redis clusters.
Introducing Amazon ElastiCache Global Datastore For Redis In a nutshell, Amazon ElastiCache Global Datastore for Redis lets you replicate a cluster in one region to clusters in up to two other regions. Customers typically do this in order to:
Bring cached data closer to your users, in order to reduce network latency and improve application responsiveness.
Build disaster recovery capabilities, should a region be partially or totally unavailable.
Setting up a global datastore is extremely easy. First, you pick a cluster to be the primary cluster receiving writes from applications: this can either be a new cluster, or an existing cluster provided that it runs Redis 5.0.6 or above. Then, you add up to two secondary clusters in other regions which will receive updates from the primary.
This setup is available for all Redis configurations except single node clusters: of course, you can convert a single node cluster to a replication group cluster, and then use it as a primary cluster.
Last but not least, clusters that are part of a global datastore can be modified and resized as usual (adding or removing nodes, changing node type, adding or removing shards, adding or removing replica nodes).
Let’s do a quick demo.
Replicating a Redis Cluster Across Regions Let me show you how to build from scratch a three-cluster global datastore: the primary cluster will be located in the us-east-1 region, and the two secondary clusters will be located in the us-west-1 and us-west-2 regions. For the sake of simplicity, I’ll use the same default configuration for all clusters: three cache.r5.large nodes, multi-AZ, one shard.
Heading out to the AWS Console, I click on ‘Global Datastore’, and then on ‘Create’ to create my global datastore. I’m asked if I’d like to create a new cluster supporting the datastore, or if I’d rather use an existing cluster. I go for the former, and create a cluster named global-ds-1-useast1.
I click on ‘Next’, and fill in details for a secondary cluster hosted in the us-west-1 region. I unimaginatively name it global-ds-1-us-west1.
Then, I add another secondary cluster in the us-west-2 region, named global-ds-1-uswest2: I go to ‘Global Datastore’, click on ‘Add Region’, and fill in cluster details.
A little while later, all three clusters are up, and have been associated to the global datastore.
Using the redis-cli client running on an Amazon Elastic Compute Cloud (EC2) instance hosted in the us-east-1 region, I can quickly connect to the cluster endpoint and check that it’s indeed operational.
[us-east-1-instance] $ redis-cli -h $US_EAST_1_CLUSTER_READWRITE > ping PONG > set paris france OK > set berlin germany OK > set london uk OK > keys * 1) "london" 2) "berlin" 3) "paris" > get paris "france"
This looks fine. Using an EC2 instance hosted in the us-west-1 region, let’s now check that the data we stored in the primary cluster has been replicated to the us-west-1 secondary cluster.
[us-west-1-instance] $ redis-cli -h $US_WEST_1_CLUSTER_READONLY > keys * 1) "london" 2) "berlin" 3) "paris" > get paris "france"
Nice. Now let’s add some more data on the primary cluster…
> hset Parsifal composer "Richard Wagner" date 1882 acts 3 language "German" > hset DonGiovanni composer "W.A. Mozart" date 1787 acts 2 language "Italian" > hset Tosca composer "Giacomo Puccini" date 1900 acts 3 language "Italian"
…and check as quickly as possible on the secondary cluster.
That was fast: by the time I switched to the other terminal and ran the command, the new data was already there. That’s not really surprising since the typical network latency for cross region traffic ranges from 60 milliseconds to 200 milliseconds depending on regions.
Now, what would happen if something went wrong with our primary cluster hosted in us-east-1? Well, we could easily promote one of the secondary clusters to full read/write capabilities.
For good measure, I also remove the us-east-1 cluster from the global datastore. Once this is complete, the global datastore looks like this.
Now, using my EC2 instance in the us-west-1 region, and connecting to the read/write endpoint of my cluster, I add more data…
[us-west-1-instance] $ redis-cli -h $US_WEST_1_CLUSTER_READWRITE > hset Lohengrin composer "Richard Wagner" date 1850 acts 3 language "German"
… and check that it’s been replicated to the us-west-2 cluster.
It’s all there. Global datastores make it really easy to replicate Amazon ElastiCache data across regions!
Now Available! This new global datastore feature is available today in US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Asia Pacific (Seoul), Asia Pacific (Sydney), Asia Pacific (Singapore), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (London). Please give it a try and send us feedback, either on the AWS forum for Amazon ElastiCache, or through your usual AWS support contacts.
Today, I’m going to tell you about a new site we launched at re:Invent, the Amazon Builders’ Library, a collection of living articles covering topics across architecture, software delivery, and operations. You get to peek under the hood of how Amazon architects, releases, and operates the software underpinning Amazon.com and AWS.
Want to know how Amazon.com does what it does? This is for you. In this two-part series (the next one coming December 23), I’ll highlight some of the best architecture articles written by Amazon’s senior technical leaders and engineers.
Avoiding insurmountable queue backlogs
In queueing theory, the behavior of queues when they are short is relatively uninteresting. After all, when a queue is short, everyone is happy. It’s only when the queue is backlogged, when the line to an event goes out the door and around the corner, that people start thinking about throughput and prioritization.
In this article, I discuss strategies we use at Amazon to deal with queue backlog scenarios – design approaches we take to drain queues quickly and to prioritize workloads. Most importantly, I describe how to prevent queue backlogs from building up in the first place. In the first half, I describe scenarios that lead to backlogs, and in the second half, I describe many approaches used at Amazon to avoid backlogs or deal with them gracefully.
Whenever one service or system calls another, failures can happen. These failures can come from a variety of factors. They include servers, networks, load balancers, software, operating systems, or even mistakes from system operators. We design our systems to reduce the probability of failure, but impossible to build systems that never fail. So in Amazon, we design our systems to tolerate and reduce the probability of failure, and avoid magnifying a small percentage of failures into a complete outage. To build resilient systems, we employ three essential tools: timeouts, retries, and backoff.
The moment we added our second server, distributed systems became the way of life at Amazon. When I started at Amazon in 1999, we had so few servers that we could give some of them recognizable names like “fishy” or “online-01”. However, even in 1999, distributed computing was not easy. Then as now, challenges with distributed systems involved latency, scaling, understanding networking APIs, marshalling and unmarshalling data, and the complexity of algorithms such as Paxos. As the systems quickly grew larger and more distributed, what had been theoretical edge cases turned into regular occurrences.
Developing distributed utility computing services, such as reliable long-distance telephone networks, or Amazon Web Services (AWS) services, is hard. Distributed computing is also weirder and less intuitive than other forms of computing because of two interrelated problems. Independent failures and nondeterminism cause the most impactful issues in distributed systems. In addition to the typical computing failures most engineers are used to, failures in distributed systems can occur in many other ways. What’s worse, it’s impossible always to know whether something failed.
At Amazon, the services we build must meet extremely high availability targets. This means that we need to think carefully about the dependencies that our systems take. We design our systems to stay resilient even when those dependencies are impaired. In this article, we’ll define a pattern that we use called static stability to achieve this level of resilience. We’ll show you how we apply this concept to Availability Zones, a key infrastructure building block in AWS and therefore a bedrock dependency on which all of our services are built.
Learn about AWS Services & Solutions – September AWS Online Tech Talks
Join us this September to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!
Note – All sessions are free and in Pacific Time.
Tech talks this month:
Compute:
September 23, 2019 | 11:00 AM – 12:00 PM PT – Build Your Hybrid Cloud Architecture with AWS – Learn about the extensive range of services AWS offers to help you build a hybrid cloud architecture best suited for your use case.
September 25, 2019 | 1:00 PM – 2:00 PM PT – What’s New in Amazon DocumentDB (with MongoDB compatibility) – Learn what’s new in Amazon DocumentDB, a fully managed MongoDB compatible database service designed from the ground up to be fast, scalable, and highly available.
September 23, 2019 | 9:00 AM – 10:00 AM PT – Training Machine Learning Models Faster – Learn how to train machine learning models quickly and with a single click using Amazon SageMaker.
September 30, 2019 | 11:00 AM – 12:00 PM PT – Using Containers for Deep Learning Workflows – Learn how containers can help address challenges in deploying deep learning environments.
September 24, 2019 | 11:00 AM – 12:00 PM PT – Application Migrations Using AWS Server Migration Service (SMS) – Learn how to use AWS Server Migration Service (SMS) for automating application migration and scheduling continuous replication, from your on-premises data centers or Microsoft Azure to AWS.
September 30, 2019 | 1:00 PM – 2:00 PM PT – AWS Office Hours: Amazon CloudFront – Just getting started with Amazon CloudFront and [email protected]? Get answers directly from our experts during AWS Office Hours.
Robotics:
October 1, 2019 | 11:00 AM – 12:00 PM PT – Robots and STEM: AWS RoboMaker and AWS Educate Unite! – Come join members of the AWS RoboMaker and AWS Educate teams as we provide an overview of our education initiatives and walk you through the newly launched RoboMaker Badge.
October 2, 2019 | 9:00 AM – 10:00 AM PT – Deep Dive on Amazon EventBridge – Learn how to optimize event-driven applications, and use rules and policies to route, transform, and control access to these events that react to data from SaaS apps.
Earlier this year we announced Redis 4.0 compatibility for Amazon ElastiCache. In that post, Randall explained how ElastiCache for Redis clusters can scale to terabytes of memory and millions of reads and writes per second! Other recent improvements to Amazon ElastiCache for Redis include:
Read Replica Scaling – Support for adding or removing read replica nodes to a Redis Cluster, along with a reduction of up to 40% in cluster creation time.
PCI DSS Compliance – Certification as Payment Card Industry Data Security Standard (PCI DSS) compliant. This allows you to use ElastiCache for Redis (engine versions 4.0.10 and higher) to build low-latency, high-throughput applications that process sensitive payment card data.
FedRAMP Authorized and Available in AWS GovCloud (US) – United States government customers and their partners can use ElastiCache for Redis to process and store their FedRAMP systems and data for mission-critical, high-impact workloads in the AWS GovCloud (US) Region, and at moderate impact level in the other AWS Regions in the US. To learn more, read the ElastiCache for Redis Compliance documentation.
In-Place Upgrades – Support for upgrading a Redis Cluster to a newer engine version in-place and while maintaining availability except for a failover period measured in seconds.
New Instance Types – Support for the use of M5 and R5 instances, with significant performance improvements.
5.0 Compatibility Today I am happy to announce Redis 5.0 compatibility to Amazon ElastiCache for Redis. This version of Redis includes support for a new Streams data type and new commands (ZPOPMIN and ZPOPMAX) for use on Sorted Sets, and also does a better job of defragmenting memory. To learn more, read What’s New in Redis 5?
I can also use the in-place upgrade feature that I mentioned earlier on my existing 4.0-compatible cluster. I select the cluster, click Modify, and the 5.0-compatible engine is already selected. I confirm the other settings and click Modify to proceed:
Streams in Action The new Stream data type is very powerful! Each Stream has a name, and can be created by simply referencing it as part of an XADD command. Let’s say that I have a long-running process that generates files that need to be scanned and validated. For testing purposes, I can add a bunch of files to a stream name Files from the shell like this:
I can retrieve values starting from the beginning of the stream using the command XREAD BLOCK 1000 STREAMS Files 0:
I can also read the values that are after a given ID:
In most cases, I would be doing the reads and the writes from code rather than from the command line, of course. This is a very simple example of the power of Redis 5 Streams and I am sure that you can do better!
Available Now You can upgrade existing 4.0-compatible clusters and create new 5.0-compatible clusters today in all commercial AWS regions.
Contributed by Ruchita Arora, Sr. Product Manager, Allen Farris, Software Dev Engineer, and Itay Maoz, Sr. Software Engineering Manager
Earlier this year, Amazon EC2 introduced two exciting new instance families, M5 and R5. These instances are based on the new AWS Nitro system, a combination of dedicated hardware and lightweight hypervisor that aims to deliver performance indistinguishable from bare-metal performance. These new instance families deliver up to 25 Gbps of aggregate network bandwidth, with enhanced networking based on the Elastic Network Adapter (ENA).
R5 and M5 instances feature custom hardware and custom Intel Xeon Scalable processors to enable a sustained all core frequency of up to 3.1 GHz and support Intel Advanced Vector Extension 512 (AVX-512). The latest fifth generation EC2 instances offer up to 50% more vCPUs and 60% more memory over the previous generation instances, and larger r5.24xlarge and m5.24xlarge instances.
Amazon ElastiCache
Amazon ElastiCache offers a Redis or Memcached-compatible, fully managed, in-memory data store and caching service in the cloud. The service embodies much of what makes fast data a reality for customers who are looking to process a high volume of data at incredible rates, faster than traditional databases.
As part of adding support for M5 and R5 instances in ElastiCache, we spent the time to take advantage of the AWS Nitro-based system and optimize these instances for ElastiCache for Redis. Developers love the performance, simplicity, and in-memory capabilities of Redis, making it among the most popular NoSQL key-value stores. Redis’s microsecond latency has made it a default choice for caching. The support for advanced data structures (for example, lists, sets, and sorted sets) also enables a variety of in-memory use cases such as leaderboards, in-memory analytics, messaging, and more.
Optimizing performance for ElastiCache for Redis
We started with the M5 and R5 instances and tuned performance by optimizing the Amazon Linux operating system configuration on these instances to maximize network performance for running in-memory workloads.
Using the open-source benchmarking tool rpc-perf, we ran a Redis benchmark with 14.7 million unique keys, 200-byte string values, 80% gets, 20% sets, and no command pipelining. We ran this benchmark on 20 client instances connecting to an optimized R5 instance in the same Availability Zone. We saw up to 30% more transactions per second than running ElastiCache for Redis on the same size instance with the default Linux configuration. For details, see the following table.
Vanilla R4
Vanilla R5
Tuned R5
Vanilla R4 to Tuned R5 Improvement
large
88,000 RPS
179,000 RPS
215,000 RPS
144%
xlarge
93,000 RPS
180,000 RPS
207,000 RPS
122%
2xlarge
107,000 RPS
187,000 RPS
217,000 RPS
102%
4xlarge
131,000 RPS
208,000 RPS
225,000 RPS
71%
8xlarge/12xlarge
128,000 RPS
211,000 RPS
247,000 RPS
92%
16xlarge/24xlarge
149,000 RPS
181,000 RPS
237,000 RPS
59%
We also reduced average (p50) and tail (p99) latencies up to 23%, resulting in average latencies as low as 350 microseconds after these optimizations. The optimized M5 instances yielded 9%-42% incremental requests per second and better CPU utilization for ElastiCache for Redis workloads.
For the same caching use case scenario, ElastiCache for Redis optimized R5 instances benefited from a significant performance improvement over self-managed Redis on R4 instances. The optimized R5 instances supported 59%-144% more transactions per second than similarly sized R4 instances.
We saw similar incremental performance improvements on optimized M5 instances relative to previous generation M4 instances. The optimized M5 instances benefited from throughput improvements of up to 356% relative to previous generation M4 instances.
Among the M5 instances, the most significant improvements were in the smaller size of the M5 family. They take advantage of ENA performance with burst networking up to 10 Gbps for the m5.large through m5.4xlarge sizes, which is useful for handling infrequent traffic spikes.
Summary
We are excited to bring these new instances to customers. You benefit from less hypervisor overhead and better networking, but you also see a dramatic upside from the performance tuning work that the ElastiCache team did to take advantage of the AWS Nitro system. This is just the beginning.
Our performance team is continuing to enhance the full system for optimal ElastiCache for Redis performance, which we are rolling out in the coming months. To get started with ElastiCache on the new M5 and R5 EC2 instances, see the AWS Management Console.
Amazon ElastiCache for Redis has achieved the Payment Card Industry Data Security Standard (PCI DSS). This means that you can now use ElastiCache for Redis for low-latency and high-throughput in-memory processing of sensitive payment card data, such as Customer Cardholder Data (CHD). ElastiCache for Redis is a Redis-compatible, fully-managed, in-memory data store and caching service in the cloud. It delivers sub-millisecond response times with millions of requests per second.
To create a PCI-Compliant ElastiCache for Redis cluster, you must use the latest Redis engine version 4.0.10 or higher and current generation node types. The service offers various data security controls to store, process, and transmit sensitive financial data. These controls include in-transit encryption (TLS), at-rest encryption, and Redis AUTH. There’s no additional charge for PCI DSS compliant ElastiCache for Redis.
In addition to PCI, ElastiCache for Redis is a HIPAA eligible service. If you want to use your existing Redis clusters that process healthcare information to also process financial information while meeting PCI requirements, you must upgrade your Redis clusters from 3.2.6 to 4.0.10. For more details, see Upgrading Engine Versions and ElastiCache for Redis Compliance.
Meeting these high bars for security and compliance means ElastiCache for Redis can be used for secure database and application caching, session management, queues, chat/messaging, and streaming analytics in industries as diverse as financial services, gaming, retail, e-commerce, and healthcare. For example, you can use ElastiCache for Redis to build an internet-scale, ride-hailing application and add digital wallets that store customer payment card numbers, thus enabling people to perform financial transactions securely and at industry standards.
Amazon ElastiCache makes it easy for you to set up a fully managed in-memory data store and cache with Redis or Memcached. Today we’re pleased to launch compatibility with Redis 4.0 in ElastiCache. You can now launch Redis 4.0 compatible ElastiCache nodes or clusters, in all commercial AWS regions. ElastiCache Redis clusters can scale to terabytes of memory and millions of reads / writes per second to serve the most demanding needs of games, IoT devices, financial applications, and web applications.
Launching a Redis cluster in the AWS Management Console or AWS Command Line Interface (CLI) remains simple. I’m going to create a small cluster to play with the new Redis 4.0 features, to use the new version I just select a 4.0 release in “Engine version compatibility”. This will launch, at the time of this writing, a 4.0.10 compatible cluster.
New Features
Least Frequently Used (LFU) cache eviction policy – Redis 4.0 launched with a number of caching improvements including a new LFU cache eviction algorithm, customers may see better performance from LFU over Least Recently Used (LRU). Antirez’s blog has a deep dive on some of the changes.
Asynchronous FLUSHDB, FLUSHALL, and UNLINK – using the ASYNC option of the FLUSH commands allows users to make a non-blocking call to clear databases. Using UNLINK instead of DEL allows users to asynchronously delete individual keys. There’s also the SWAPDB command which can be useful to atomically switch between entire datasets.
Active memory defragmentation – Redis can now defragment memory while running which allows more efficient utilization of memory for customer data. This is off by default but you can modify the parameter group to turn it on. Customers should probably only turn it on if they’re running into fragmentation issues.
Online Cluster Resizing and Encryption in transit – with Redis 4.0 you can now use encryption in transit and online cluster resizing at the same time. With Online Cluster Resizing you can add and remove shards from a running cluster to dynamically scale-out or scale-in your Redis cluster and adapt to changes on demand. Previously this feature wasn’t able to be used with encryption in transit but now you can use both features simultaneously. This helps with workloads that require encryption for compliance purposes.
MEMORY commands – a whole new family of memory commands: DOCTOR, USAGE, STATS, PURGE, and MALLOC-STATS are available for gathering statistics or usage information on your redis nodes. Running MEMORY DOCTOR will tell you about any memory issues (and it will give you a nice sci-fi easter egg if no problems are detected). The MEMORY STATS command will return some useful statistics like “bytes-per-key” that aren’t available in the INFO commands.
We have several upcoming tech talks in the month of April and early May. Come join us to learn about AWS services and solution offerings. We’ll have AWS experts online to help answer questions in real-time. Sign up now to learn more, we look forward to seeing you.
Note – All sessions are free and in Pacific Time.
April & early May — 2018 Schedule
Compute
April 30, 2018 | 01:00 PM – 01:45 PM PT – Best Practices for Running Amazon EC2 Spot Instances with Amazon EMR (300) – Learn about the best practices for scaling big data workloads as well as process, store, and analyze big data securely and cost effectively with Amazon EMR and Amazon EC2 Spot Instances.
May 1, 2018 | 01:00 PM – 01:45 PM PT – How to Bring Microsoft Apps to AWS (300) – Learn more about how to save significant money by bringing your Microsoft workloads to AWS.
May 2, 2018 | 01:00 PM – 01:45 PM PT – Deep Dive on Amazon EC2 Accelerated Computing (300) – Get a technical deep dive on how AWS’ GPU and FGPA-based compute services can help you to optimize and accelerate your ML/DL and HPC workloads in the cloud.
April 25, 2018 | 01:00 PM – 01:45 PM PT – Intro to Open Source Databases on AWS (200) – Learn how to tap the benefits of open source databases on AWS without the administrative hassle.
April 24, 2018 | 11:00 AM – 11:45 AM PT – Deploy your Desktops and Apps on AWS (300) – Learn how to deploy your desktops and apps on AWS with Amazon WorkSpaces and Amazon AppStream 2.0
IoT
May 2, 2018 | 11:00 AM – 11:45 AM PT – How to Easily and Securely Connect Devices to AWS IoT (200) – Learn how to easily and securely connect devices to the cloud and reliably scale to billions of devices and trillions of messages with AWS IoT.
April 30, 2018 | 11:00 AM – 11:45 AM PT – Offline GraphQL Apps with AWS AppSync (300) – Come learn how to enable real-time and offline data in your applications with GraphQL using AWS AppSync.
Networking
May 2, 2018 | 09:00 AM – 09:45 AM PT – Taking Serverless to the Edge (300) – Learn how to run your code closer to your end users in a serverless fashion. Also, David Von Lehman from Aerobatic will discuss how they used [email protected] to reduce latency and cloud costs for their customer’s websites.
May 3, 2018 | 09:00 AM – 09:45 AM PT – Protect Your Game Servers from DDoS Attacks (200) – Learn how to use the new AWS Shield Advanced for EC2 to protect your internet-facing game servers against network layer DDoS attacks and application layer attacks of all kinds.
May 1, 2018 | 11:00 AM – 11:45 AM PT – Building Data Lakes That Cost Less and Deliver Results Faster (300) – Learn how Amazon S3 Select And Amazon Glacier Select increase application performance by up to 400% and reduce total cost of ownership by extending your data lake into cost-effective archive storage.
May 3, 2018 | 11:00 AM – 11:45 AM PT – Integrating On-Premises Vendors with AWS for Backup (300) – Learn how to work with AWS and technology partners to build backup & restore solutions for your on-premises, hybrid, and cloud native environments.
AWS has achieved Spain’s Esquema Nacional de Seguridad (ENS) High certification across 29 services. To successfully achieve the ENS High Standard, BDO España conducted an independent audit and attested that AWS meets confidentiality, integrity, and availability standards. This provides the assurance needed by Spanish Public Sector organizations wanting to build secure applications and services on AWS.
The National Security Framework, regulated under Royal Decree 3/2010, was developed through close collaboration between ENAC (Entidad Nacional de Acreditación), the Ministry of Finance and Public Administration and the CCN (National Cryptologic Centre), and other administrative bodies.
The following AWS Services are ENS High accredited across our Dublin and Frankfurt Regions:
Microservice-application requirements have changed dramatically in recent years. These days, applications operate with petabytes of data, need almost 100% uptime, and end users expect sub-second response times. Typical N-tier applications can’t deliver on these requirements.
Reactive Manifesto, published in 2014, describes the essential characteristics of reactive systems including: responsiveness, resiliency, elasticity, and being message driven.
Being message driven is perhaps the most important characteristic of reactive systems. Asynchronous messaging helps in the design of loosely coupled systems, which is a key factor for scalability. In order to build a highly decoupled system, it is important to isolate services from each other. As already described, isolation is an important aspect of the microservices pattern. Indeed, reactive systems and microservices are a natural fit.
Implemented Use Case This reference architecture illustrates a typical ad-tracking implementation.
Many ad-tracking companies collect massive amounts of data in near-real-time. In many cases, these workloads are very spiky and heavily depend on the success of the ad-tech companies’ customers. Typically, an ad-tracking-data use case can be separated into a real-time part and a non-real-time part. In the real-time part, it is important to collect data as fast as possible and ask several questions including:, “Is this a valid combination of parameters?,””Does this program exist?,” “Is this program still valid?”
Because response time has a huge impact on conversion rate in advertising, it is important for advertisers to respond as fast as possible. This information should be kept in memory to reduce communication overhead with the caching infrastructure. The tracking application itself should be as lightweight and scalable as possible. For example, the application shouldn’t have any shared mutable state and it should use reactive paradigms. In our implementation, one main application is responsible for this real-time part. It collects and validates data, responds to the client as fast as possible, and asynchronously sends events to backend systems.
The non-real-time part of the application consumes the generated events and persists them in a NoSQL database. In a typical tracking implementation, clicks, cookie information, and transactions are matched asynchronously and persisted in a data store. The matching part is not implemented in this reference architecture. Many ad-tech architectures use frameworks like Hadoop for the matching implementation.
The system can be logically divided into the data collection partand the core data updatepart. The data collection part is responsible for collecting, validating, and persisting the data. In the core data update part, the data that is used for validation gets updated and all subscribers are notified of new data.
Components and Services
Main Application The main application is implemented using Java 8 and uses Vert.x as the main framework. Vert.x is an event-driven, reactive, non-blocking, polyglot framework to implement microservices. It runs on the Java virtual machine (JVM) by using the low-level IO library Netty. You can write applications in Java, JavaScript, Groovy, Ruby, Kotlin, Scala, and Ceylon. The framework offers a simple and scalable actor-like concurrency model. Vert.x calls handlers by using a thread known as an event loop. To use this model, you have to write code known as “verticles.” Verticles share certain similarities with actors in the actor model. To use them, you have to implement the verticle interface. Verticles communicate with each other by generating messages in a single event bus. Those messages are sent on the event bus to a specific address, and verticles can register to this address by using handlers.
With only a few exceptions, none of the APIs in Vert.x block the calling thread. Similar to Node.js, Vert.x uses the reactor pattern. However, in contrast to Node.js, Vert.x uses several event loops. Unfortunately, not all APIs in the Java ecosystem are written asynchronously, for example, the JDBC API. Vert.x offers a possibility to run this, blocking APIs without blocking the event loop. These special verticles are called worker verticles. You don’t execute worker verticles by using the standard Vert.x event loops, but by using a dedicated thread from a worker pool. This way, the worker verticles don’t block the event loop.
Our application consists of five different verticles covering different aspects of the business logic. The main entry point for our application is the HttpVerticle, which exposes an HTTP-endpoint to consume HTTP-requests and for proper health checking. Data from HTTP requests such as parameters and user-agent information are collected and transformed into a JSON message. In order to validate the input data (to ensure that the program exists and is still valid), the message is sent to the CacheVerticle.
This verticle implements an LRU-cache with a TTL of 10 minutes and a capacity of 100,000 entries. Instead of adding additional functionality to a standard JDK map implementation, we use Google Guava, which has all the features we need. If the data is not in the L1 cache, the message is sent to the RedisVerticle. This verticle is responsible for data residing in Amazon ElastiCache and uses the Vert.x-redis-client to read data from Redis. In our example, Redis is the central data store. However, in a typical production implementation, Redis would just be the L2 cache with a central data store like Amazon DynamoDB. One of the most important paradigms of a reactive system is to switch from a pull- to a push-based model. To achieve this and reduce network overhead, we’ll use Redis pub/sub to push core data changes to our main application.
Vert.x also supports direct Redis pub/sub-integration, the following code shows our subscriber-implementation:
vertx.eventBus().<JsonObject>consumer(REDIS_PUBSUB_CHANNEL_VERTX, received -> {
JsonObject value = received.body().getJsonObject("value");
String message = value.getString("message");
JsonObject jsonObject = new JsonObject(message);
eb.send(CACHE_REDIS_EVENTBUS_ADDRESS, jsonObject);
});
redis.subscribe(Constants.REDIS_PUBSUB_CHANNEL, res -> {
if (res.succeeded()) {
LOGGER.info("Subscribed to " + Constants.REDIS_PUBSUB_CHANNEL);
} else {
LOGGER.info(res.cause());
}
});
The verticle subscribes to the appropriate Redis pub/sub-channel. If a message is sent over this channel, the payload is extracted and forwarded to the cache-verticle that stores the data in the L1-cache. After storing and enriching data, a response is sent back to the HttpVerticle, which responds to the HTTP request that initially hit this verticle. In addition, the message is converted to ByteBuffer, wrapped in protocol buffers, and send to an Amazon Kinesis Data Stream.
The following example shows a stripped-down version of the KinesisVerticle:
Kinesis Consumer This AWS Lambda function consumes data from an Amazon Kinesis Data Stream and persists the data in an Amazon DynamoDB table. In order to improve testability, the invocation code is separated from the business logic. The invocation code is implemented in the class KinesisConsumerHandler and iterates over the Kinesis events pulled from the Kinesis stream by AWS Lambda. Each Kinesis event is unwrapped and transformed from ByteBuffer to protocol buffers and converted into a Java object. Those Java objects are passed to the business logic, which persists the data in a DynamoDB table. In order to improve duration of successive Lambda calls, the DynamoDB-client is instantiated lazily and reused if possible.
Redis Updater From time to time, it is necessary to update core data in Redis. A very efficient implementation for this requirement is using AWS Lambda and Amazon Kinesis. New core data is sent over the AWS Kinesis stream using JSON as data format and consumed by a Lambda function. This function iterates over the Kinesis events pulled from the Kinesis stream by AWS Lambda. Each Kinesis event is unwrapped and transformed from ByteBuffer to String and converted into a Java object. The Java object is passed to the business logic and stored in Redis. In addition, the new core data is also sent to the main application using Redis pub/sub in order to reduce network overhead and converting from a pull- to a push-based model.
The following example shows the source code to store data in Redis and notify all subscribers:
public void updateRedisData(final TrackingMessage trackingMessage, final Jedis jedis, final LambdaLogger logger) {
try {
ObjectMapper mapper = new ObjectMapper();
String jsonString = mapper.writeValueAsString(trackingMessage);
Map<String, String> map = marshal(jsonString);
String statusCode = jedis.hmset(trackingMessage.getProgramId(), map);
}
catch (Exception exc) {
if (null == logger)
exc.printStackTrace();
else
logger.log(exc.getMessage());
}
}
public void notifySubscribers(final TrackingMessage trackingMessage, final Jedis jedis, final LambdaLogger logger) {
try {
ObjectMapper mapper = new ObjectMapper();
String jsonString = mapper.writeValueAsString(trackingMessage);
jedis.publish(Constants.REDIS_PUBSUB_CHANNEL, jsonString);
}
catch (final IOException e) {
log(e.getMessage(), logger);
}
}
Similarly to our Kinesis Consumer, the Redis-client is instantiated somewhat lazily.
Infrastructure as Code As already outlined, latency and response time are a very critical part of any ad-tracking solution because response time has a huge impact on conversion rate. In order to reduce latency for customers world-wide, it is common practice to roll out the infrastructure in different AWS Regions in the world to be as close to the end customer as possible. AWS CloudFormation can help you model and set up your AWS resources so that you can spend less time managing those resources and more time focusing on your applications that run in AWS.
You create a template that describes all the AWS resources that you want (for example, Amazon EC2 instances or Amazon RDS DB instances), and AWS CloudFormation takes care of provisioning and configuring those resources for you. Our reference architecture can be rolled out in different Regions using an AWS CloudFormation template, which sets up the complete infrastructure (for example, Amazon Virtual Private Cloud (Amazon VPC), Amazon Elastic Container Service (Amazon ECS) cluster, Lambda functions, DynamoDB table, Amazon ElastiCache cluster, etc.).
Conclusion In this blog post we described reactive principles and an example architecture with a common use case. We leveraged the capabilities of different frameworks in combination with several AWS services in order to implement reactive principles—not only at the application-level but also at the system-level. I hope I’ve given you ideas for creating your own reactive applications and systems on AWS.
About the Author
Sascha Moellering is a Senior Solution Architect. Sascha is primarily interested in automation, infrastructure as code, distributed computing, containers and JVM. He can be reached at [email protected]
Today we are launching our 18th AWS Region, our fourth in Europe. Located in the Paris area, AWS customers can use this Region to better serve customers in and around France.
The Paris Region will benefit from three AWS Direct Connect locations. Telehouse Voltaire is available today. AWS Direct Connect will also become available at Equinix Paris in early 2018, followed by Interxion Paris.
All AWS infrastructure regions around the world are designed, built, and regularly audited to meet the most rigorous compliance standards and to provide high levels of security for all AWS customers. These include ISO 27001, ISO 27017, ISO 27018, SOC 1 (Formerly SAS 70), SOC 2 and SOC 3 Security & Availability, PCI DSS Level 1, and many more. This means customers benefit from all the best practices of AWS policies, architecture, and operational processes built to satisfy the needs of even the most security sensitive customers.
AWS is certified under the EU-US Privacy Shield, and the AWS Data Processing Addendum (DPA) is GDPR-ready and available now to all AWS customers to help them prepare for May 25, 2018 when the GDPR becomes enforceable. The current AWS DPA, as well as the AWS GDPR DPA, allows customers to transfer personal data to countries outside the European Economic Area (EEA) in compliance with European Union (EU) data protection laws. AWS also adheres to the Cloud Infrastructure Service Providers in Europe (CISPE) Code of Conduct. The CISPE Code of Conduct helps customers ensure that AWS is using appropriate data protection standards to protect their data, consistent with the GDPR. In addition, AWS offers a wide range of services and features to help customers meet the requirements of the GDPR, including services for access controls, monitoring, logging, and encryption.
From Our Customers Many AWS customers are preparing to use this new Region. Here’s a small sample:
Societe Generale, one of the largest banks in France and the world, has accelerated their digital transformation while working with AWS. They developed SG Research, an application that makes reports from Societe Generale’s analysts available to corporate customers in order to improve the decision-making process for investments. The new AWS Region will reduce latency between applications running in the cloud and in their French data centers.
SNCF is the national railway company of France. Their mobile app, powered by AWS, delivers real-time traffic information to 14 million riders. Extreme weather, traffic events, holidays, and engineering works can cause usage to peak at hundreds of thousands of users per second. They are planning to use machine learning and big data to add predictive features to the app.
Radio France, the French public radio broadcaster, offers seven national networks, and uses AWS to accelerate its innovation and stay competitive.
Les Restos du Coeur, a French charity that provides assistance to the needy, delivering food packages and participating in their social and economic integration back into French society. Les Restos du Coeur is using AWS for its CRM system to track the assistance given to each of their beneficiaries and the impact this is having on their lives.
AlloResto by JustEat (a leader in the French FoodTech industry), is using AWS to to scale during traffic peaks and to accelerate their innovation process.
AWS Consulting and Technology Partners We are already working with a wide variety of consulting, technology, managed service, and Direct Connect partners in France. Here’s a partial list:
AWS in France We have been investing in Europe, with a focus on France, for the last 11 years. We have also been developing documentation and training programs to help our customers to improve their skills and to accelerate their journey to the AWS Cloud.
As part of our commitment to AWS customers in France, we plan to train more than 25,000 people in the coming years, helping them develop highly sought after cloud skills. They will have access to AWS training resources in France via AWS Academy, AWSome days, AWS Educate, and webinars, all delivered in French by AWS Technical Trainers and AWS Certified Trainers.
Use it Today The EU (Paris) Region is open for business now and you can start using it today!
Today we launched our 17th Region globally, and the second in China. The AWS China (Ningxia) Region, operated by Ningxia Western Cloud Data Technology Co. Ltd. (NWCD), is generally available now and provides customers another option to run applications and store data on AWS in China.
Operating Partner To comply with China’s legal and regulatory requirements, AWS has formed a strategic technology collaboration with NWCD to operate and provide services from the AWS China (Ningxia) Region. Founded in 2015, NWCD is a licensed datacenter and cloud services provider, based in Ningxia, China. NWCD joins Sinnet, the operator of the AWS China China (Beijing) Region, as an AWS operating partner in China. Through these relationships, AWS provides its industry-leading technology, guidance, and expertise to NWCD and Sinnet, while NWCD and Sinnet operate and provide AWS cloud services to local customers. While the cloud services offered in both AWS China Regions are the same as those available in other AWS Regions, the AWS China Regions are different in that they are isolated from all other AWS Regions and operated by AWS’s Chinese partners separately from all other AWS Regions. Customers using the AWS China Regions enter into customer agreements with Sinnet and NWCD, rather than with AWS.
Use it Today The AWS China (Ningxia) Region, operated by NWCD, is open for business, and you can start using it now! Starting today, Chinese developers, startups, and enterprises, as well as government, education, and non-profit organizations, can leverage AWS to run their applications and store their data in the new AWS China (Ningxia) Region, operated by NWCD. Customers already using the AWS China (Beijing) Region, operated by Sinnet, can select the AWS China (Ningxia) Region directly from the AWS Management Console, while new customers can request an account at www.amazonaws.cn to begin using both AWS China Regions.
Glenn Gore here, Chief Architect for AWS. I’m in Las Vegas this week — with 43K others — for re:Invent 2017. We have a lot of exciting announcements this week. I’m going to post to the AWS Architecture blog each day with my take on what’s interesting about some of the announcements from a cloud architectural perspective.
Why not start at the beginning? At the Midnight Madness launch on Sunday night, we announced Amazon Sumerian, our platform for VR, AR, and mixed reality. The hype around VR/AR has existed for many years, though for me, it is a perfect example of how a working end-to-end solution often requires innovation from multiple sources. For AR/VR to be successful, we need many components to come together in a coherent manner to provide a great experience.
First, we need lightweight, high-definition goggles with motion tracking that are comfortable to wear. Second, we need to track movement of our body and hands in a 3-D space so that we can interact with virtual objects in the virtual world. Third, we need to build the virtual world itself and populate it with assets and define how the interactions will work and connect with various other systems.
There has been rapid development of the physical devices for AR/VR, ranging from iOS devices to Oculus Rift and HTC Vive, which provide excellent capabilities for the first and second components defined above. With the launch of Amazon Sumerian we are solving for the third area, which will help developers easily build their own virtual worlds and start experimenting and innovating with how to apply AR/VR in new ways.
Already, within 48 hours of Amazon Sumerian being announced, I have had multiple discussions with customers and partners around some cool use cases where VR can help in training simulations, remote-operator controls, or with new ideas around interacting with complex visual data sets, which starts bringing concepts straight out of sci-fi movies into the real (virtual) world. I am really excited to see how Sumerian will unlock the creative potential of developers and where this will lead.
Amazon MQ I am a huge fan of distributed architectures where asynchronous messaging is the backbone of connecting the discrete components together. Amazon Simple Queue Service (Amazon SQS) is one of my favorite services due to its simplicity, scalability, performance, and the incredible flexibility of how you can use Amazon SQS in so many different ways to solve complex queuing scenarios.
While Amazon SQS is easy to use when building cloud-native applications on AWS, many of our customers running existing applications on-premises required support for different messaging protocols such as: Java Message Service (JMS), .Net Messaging Service (NMS), Advanced Message Queuing Protocol (AMQP), MQ Telemetry Transport(MQTT), Simple (or Streaming) Text Orientated Messaging Protocol (STOMP), and WebSockets. One of the most popular applications for on-premise message brokers is Apache ActiveMQ. With the release of Amazon MQ, you can now run Apache ActiveMQ on AWS as a managed service similar to what we did with Amazon ElastiCache back in 2012. For me, there are two compelling, major benefits that Amazon MQ provides:
Integrate existing applications with cloud-native applications without having to change a line of application code if using one of the supported messaging protocols. This removes one of the biggest blockers for integration between the old and the new.
Remove the complexity of configuring Multi-AZ resilient message broker services as Amazon MQ provides out-of-the-box redundancy by always storing messages redundantly across Availability Zones. Protection is provided against failure of a broker through to complete failure of an Availability Zone.
I believe that Amazon MQ is a major component in the tools required to help you migrate your existing applications to AWS. Having set up cross-data center Apache ActiveMQ clusters in the past myself and then testing to ensure they work as expected during critical failure scenarios, technical staff working on migrations to AWS benefit from the ease of deploying a fully redundant, managed Apache ActiveMQ cluster within minutes.
Who would have thought I would have been so excited to revisit Apache ActiveMQ in 2017 after using SQS for many, many years? Choice is a wonderful thing.
Amazon GuardDuty Maintaining application and information security in the modern world is increasingly complex and is constantly evolving and changing as new threats emerge. This is due to the scale, variety, and distribution of services required in a competitive online world.
At Amazon, security is our number one priority. Thus, we are always looking at how we can increase security detection and protection while simplifying the implementation of advanced security practices for our customers. As a result, we released Amazon GuardDuty, which provides intelligent threat detection by using a combination of multiple information sources, transactional telemetry, and the application of machine learning models developed by AWS. One of the biggest benefits of Amazon GuardDuty that I appreciate is that enabling this service requires zero software, agents, sensors, or network choke points. which can all impact performance or reliability of the service you are trying to protect. Amazon GuardDuty works by monitoring your VPC flow logs, AWS CloudTrail events, DNS logs, as well as combing other sources of security threats that AWS is aggregating from our own internal and external sources.
The use of machine learning in Amazon GuardDuty allows it to identify changes in behavior, which could be suspicious and require additional investigation. Amazon GuardDuty works across all of your AWS accounts allowing for an aggregated analysis and ensuring centralized management of detected threats across accounts. This is important for our larger customers who can be running many hundreds of AWS accounts across their organization, as providing a single common threat detection of their organizational use of AWS is critical to ensuring they are protecting themselves.
Detection, though, is only the beginning of what Amazon GuardDuty enables. When a threat is identified in Amazon GuardDuty, you can configure remediation scripts or trigger Lambda functions where you have custom responses that enable you to start building automated responses to a variety of different common threats. Speed of response is required when a security incident may be taking place. For example, Amazon GuardDuty detects that an Amazon Elastic Compute Cloud (Amazon EC2) instance might be compromised due to traffic from a known set of malicious IP addresses. Upon detection of a compromised EC2 instance, we could apply an access control entry restricting outbound traffic for that instance, which stops loss of data until a security engineer can assess what has occurred.
Whether you are a customer running a single service in a single account, or a global customer with hundreds of accounts with thousands of applications, or a startup with hundreds of micro-services with hourly release cycle in a devops world, I recommend enabling Amazon GuardDuty. We have a 30-day free trial available for all new customers of this service. As it is a monitor of events, there is no change required to your architecture within AWS.
Stay tuned for tomorrow’s post on AWS Media Services and Amazon Neptune.
We can’t believe that there are just few days left before re:Invent 2017. If you are attending this year, you’ll want to check out our Big Data sessions! The Big Data and Machine Learning categories are bigger than ever. As in previous years, you can find these sessions in various tracks, including Analytics & Big Data, Deep Learning Summit, Artificial Intelligence & Machine Learning, Architecture, and Databases.
We have great sessions from organizations and companies like Vanguard, Cox Automotive, Pinterest, Netflix, FINRA, Amtrak, AmazonFresh, Sysco Foods, Twilio, American Heart Association, Expedia, Esri, Nextdoor, and many more. All sessions are recorded and made available on YouTube. In addition, all slide decks from the sessions will be available on SlideShare.net after the conference.
This post highlights the sessions that will be presented as part of the Analytics & Big Data track, as well as relevant sessions from other tracks like Architecture, Artificial Intelligence & Machine Learning, and IoT. If you’re interested in Machine Learning sessions, don’t forget to check out our Guide to Machine Learning at re:Invent 2017.
This year’s session catalog contains the following breakout sessions.
Raju Gulabani, VP, Database, Analytics and AI at AWS will discuss the evolution of database and analytics services in AWS, the new database and analytics services and features we launched this year, and our vision for continued innovation in this space. We are witnessing an unprecedented growth in the amount of data collected, in many different forms. Storage, management, and analysis of this data require database services that scale and perform in ways not possible before. AWS offers a collection of database and other data services—including Amazon Aurora, Amazon DynamoDB, Amazon RDS, Amazon Redshift, Amazon ElastiCache, Amazon Kinesis, and Amazon EMR—to process, store, manage, and analyze data. In this session, we provide an overview of AWS database and analytics services and discuss how customers are using these services today.
Deep dive customer use cases
ABD401 – How Netflix Monitors Applications in Near Real-Time with Amazon Kinesis Thousands of services work in concert to deliver millions of hours of video streams to Netflix customers every day. These applications vary in size, function, and technology, but they all make use of the Netflix network to communicate. Understanding the interactions between these services is a daunting challenge both because of the sheer volume of traffic and the dynamic nature of deployments. In this session, we first discuss why Netflix chose Kinesis Streams to address these challenges at scale. We then dive deep into how Netflix uses Kinesis Streams to enrich network traffic logs and identify usage patterns in real time. Lastly, we cover how Netflix uses this system to build comprehensive dependency maps, increase network efficiency, and improve failure resiliency. From this session, you will learn how to build a real-time application monitoring system using network traffic logs and get real-time, actionable insights.
In this session, learn how Nextdoor replaced their home-grown data pipeline based on a topology of Flume nodes with a completely serverless architecture based on Kinesis and Lambda. By making these changes, they improved both the reliability of their data and the delivery times of billions of records of data to their Amazon S3–based data lake and Amazon Redshift cluster. Nextdoor is a private social networking service for neighborhoods.
ABD205 – Taking a Page Out of Ivy Tech’s Book: Using Data for Student Success Data speaks. Discover how Ivy Tech, the nation’s largest singly accredited community college, uses AWS to gather, analyze, and take action on student behavioral data for the betterment of over 3,100 students. This session outlines the process from inception to implementation across the state of Indiana and highlights how Ivy Tech’s model can be applied to your own complex business problems.
ABD207 – Leveraging AWS to Fight Financial Crime and Protect National Security Banks aren’t known to share data and collaborate with one another. But that is exactly what the Mid-Sized Bank Coalition of America (MBCA) is doing to fight digital financial crime—and protect national security. Using the AWS Cloud, the MBCA developed a shared data analytics utility that processes terabytes of non-competitive customer account, transaction, and government risk data. The intelligence produced from the data helps banks increase the efficiency of their operations, cut labor and operating costs, and reduce false positive volumes. The collective intelligence also allows greater enforcement of Anti-Money Laundering (AML) regulations by helping members detect internal risks—and identify the challenges to detecting these risks in the first place. This session demonstrates how the AWS Cloud supports the MBCA to deliver advanced data analytics, provide consistent operating models across financial institutions, reduce costs, and strengthen national security.
ABD208 – Cox Automotive Empowered to Scale with Splunk Cloud & AWS and Explores New Innovation with Amazon Kinesis Firehose In this session, learn how Cox Automotive is using Splunk Cloud for real time visibility into its AWS and hybrid environments to achieve near instantaneous MTTI, reduce auction incidents by 90%, and proactively predict outages. We also introduce a highly anticipated capability that allows you to ingest, transform, and analyze data in real time using Splunk and Amazon Kinesis Firehose to gain valuable insights from your cloud resources. It’s now quicker and easier than ever to gain access to analytics-driven infrastructure monitoring using Splunk Enterprise & Splunk Cloud.
ABD209 – Accelerating the Speed of Innovation with a Data Sciences Data & Analytics Hub at Takeda Historically, silos of data, analytics, and processes across functions, stages of development, and geography created a barrier to R&D efficiency. Gathering the right data necessary for decision-making was challenging due to issues of accessibility, trust, and timeliness. In this session, learn how Takeda is undergoing a transformation in R&D to increase the speed-to-market of high-impact therapies to improve patient lives. The Data and Analytics Hub was built, with Deloitte, to address these issues and support the efficient generation of data insights for functions such as clinical operations, clinical development, medical affairs, portfolio management, and R&D finance. In the AWS hosted data lake, this data is processed, integrated, and made available to business end users through data visualization interfaces, and to data scientists through direct connectivity. Learn how Takeda has achieved significant time reductions—from weeks to minutes—to gather and provision data that has the potential to reduce cycle times in drug development. The hub also enables more efficient operations and alignment to achieve product goals through cross functional team accountability and collaboration due to the ability to access the same cross domain data.
ABD210 – Modernizing Amtrak: Serverless Solution for Real-Time Data Capabilities As the nation’s only high-speed intercity passenger rail provider, Amtrak needs to know critical information to run their business such as: Who’s onboard any train at any time? How are booking and revenue trending? Amtrak was faced with unpredictable and often slow response times from existing databases, ranging from seconds to hours; existing booking and revenue dashboards were spreadsheet-based and manual; multiple copies of data were stored in different repositories, lacking integration and consistency; and operations and maintenance (O&M) costs were relatively high. Join us as we demonstrate how Deloitte and Amtrak successfully went live with a cloud-native operational database and analytical datamart for near-real-time reporting in under six months. We highlight the specific challenges and the modernization of architecture on an AWS native Platform as a Service (PaaS) solution. The solution includes cloud-native components such as AWS Lambda for microservices, Amazon Kinesis and AWS Data Pipeline for moving data, Amazon S3 for storage, Amazon DynamoDB for a managed NoSQL database service, and Amazon Redshift for near-real time reports and dashboards. Deloitte’s solution enabled “at scale” processing of 1 million transactions/day and up to 2K transactions/minute. It provided flexibility and scalability, largely eliminate the need for system management, and dramatically reduce operating costs. Moreover, it laid the groundwork for decommissioning legacy systems, anticipated to save at least $1M over 3 years.
ABD211 – Sysco Foods: A Journey from Too Much Data to Curated Insights In this session, we detail Sysco’s journey from a company focused on hindsight-based reporting to one focused on insights and foresight. For this shift, Sysco moved from multiple data warehouses to an AWS ecosystem, including Amazon Redshift, Amazon EMR, AWS Data Pipeline, and more. As the team at Sysco worked with Tableau, they gained agile insight across their business. Learn how Sysco decided to use AWS, how they scaled, and how they became more strategic with the AWS ecosystem and Tableau.
ABD217 – From Batch to Streaming: How Amazon Flex Uses Real-time Analytics to Deliver Packages on Time Reducing the time to get actionable insights from data is important to all businesses, and customers who employ batch data analytics tools are exploring the benefits of streaming analytics. Learn best practices to extend your architecture from data warehouses and databases to real-time solutions. Learn how to use Amazon Kinesis to get real-time data insights and integrate them with Amazon Aurora, Amazon RDS, Amazon Redshift, and Amazon S3. The Amazon Flex team describes how they used streaming analytics in their Amazon Flex mobile app, used by Amazon delivery drivers to deliver millions of packages each month on time. They discuss the architecture that enabled the move from a batch processing system to a real-time system, overcoming the challenges of migrating existing batch data to streaming data, and how to benefit from real-time analytics.
ABD218 – How EuroLeague Basketball Uses IoT Analytics to Engage Fans IoT and big data have made their way out of industrial applications, general automation, and consumer goods, and are now a valuable tool for improving consumer engagement across a number of industries, including media, entertainment, and sports. The low cost and ease of implementation of AWS analytics services and AWS IoT have allowed AGT, a leader in IoT, to develop their IoTA analytics platform. Using IoTA, AGT brought a tailored solution to EuroLeague Basketball for real-time content production and fan engagement during the 2017-18 season. In this session, we take a deep dive into how this solution is architected for secure, scalable, and highly performant data collection from athletes, coaches, and fans. We also talk about how the data is transformed into insights and integrated into a content generation pipeline. Lastly, we demonstrate how this solution can be easily adapted for other industries and applications.
ABD222 – How to Confidently Unleash Data to Meet the Needs of Your Entire Organization Where are you on the spectrum of IT leaders? Are you confident that you’re providing the technology and solutions that consistently meet or exceed the needs of your internal customers? Do your peers at the executive table see you as an innovative technology leader? Innovative IT leaders understand the value of getting data and analytics directly into the hands of decision makers, and into their own. In this session, Daren Thayne, Domo’s Chief Technology Officer, shares how innovative IT leaders are helping drive a culture change at their organizations. See how transformative it can be to have real-time access to all of the data that’ is relevant to YOUR job (including a complete view of your entire AWS environment), as well as understand how it can help you lead the way in applying that same pattern throughout your entire company
ABD303 – Developing an Insights Platform – Sysco’s Journey from Disparate Systems to Data Lake and Beyond Sysco has nearly 200 operating companies across its multiple lines of business throughout the United States, Canada, Central/South America, and Europe. As the global leader in food services, Sysco identified the need to streamline the collection, transformation, and presentation of data produced by the distributed units and systems, into a central data ecosystem. Sysco’s Business Intelligence and Analytics team addressed these requirements by creating a data lake with scalable analytics and query engines leveraging AWS services. In this session, Sysco will outline their journey from a hindsight reporting focused company to an insights driven organization. They will cover solution architecture, challenges, and lessons learned from deploying a self-service insights platform. They will also walk through the design patterns they used and how they designed the solution to provide predictive analytics using Amazon Redshift Spectrum, Amazon S3, Amazon EMR, AWS Glue, Amazon Elasticsearch Service and other AWS services.
ABD309 – How Twilio Scaled Its Data-Driven Culture As a leading cloud communications platform, Twilio has always been strongly data-driven. But as headcount and data volumes grew—and grew quickly—they faced many new challenges. One-off, static reports work when you’re a small startup, but how do you support a growth stage company to a successful IPO and beyond? Today, Twilio’s data team relies on AWS and Looker to provide data access to 700 colleagues. Departments have the data they need to make decisions, and cloud-based scale means they get answers fast. Data delivers real-business value at Twilio, providing a 360-degree view of their customer, product, and business. In this session, you hear firsthand stories directly from the Twilio data team and learn real-world tips for fostering a truly data-driven culture at scale.
ABD310 – How FINRA Secures Its Big Data and Data Science Platform on AWS FINRA uses big data and data science technologies to detect fraud, market manipulation, and insider trading across US capital markets. As a financial regulator, FINRA analyzes highly sensitive data, so information security is critical. Learn how FINRA secures its Amazon S3 Data Lake and its data science platform on Amazon EMR and Amazon Redshift, while empowering data scientists with tools they need to be effective. In addition, FINRA shares AWS security best practices, covering topics such as AMI updates, micro segmentation, encryption, key management, logging, identity and access management, and compliance.
ABD331 – Log Analytics at Expedia Using Amazon Elasticsearch Service Expedia uses Amazon Elasticsearch Service (Amazon ES) for a variety of mission-critical use cases, ranging from log aggregation to application monitoring and pricing optimization. In this session, the Expedia team reviews how they use Amazon ES and Kibana to analyze and visualize Docker startup logs, AWS CloudTrail data, and application metrics. They share best practices for architecting a scalable, secure log analytics solution using Amazon ES, so you can add new data sources almost effortlessly and get insights quickly
ABD316 – American Heart Association: Finding Cures to Heart Disease Through the Power of Technology Combining disparate datasets and making them accessible to data scientists and researchers is a prevalent challenge for many organizations, not just in healthcare research. American Heart Association (AHA) has built a data science platform using Amazon EMR, Amazon Elasticsearch Service, and other AWS services, that corrals multiple datasets and enables advanced research on phenotype and genotype datasets, aimed at curing heart diseases. In this session, we present how AHA built this platform and the key challenges they addressed with the solution. We also provide a demo of the platform, and leave you with suggestions and next steps so you can build similar solutions for your use cases
ABD319 – Tooling Up for Efficiency: DIY Solutions @ Netflix At Netflix, we have traditionally approached cloud efficiency from a human standpoint, whether it be in-person meetings with the largest service teams or manually flipping reservations. Over time, we realized that these manual processes are not scalable as the business continues to grow. Therefore, in the past year, we have focused on building out tools that allow us to make more insightful, data-driven decisions around capacity and efficiency. In this session, we discuss the DIY applications, dashboards, and processes we built to help with capacity and efficiency. We start at the ten thousand foot view to understand the unique business and cloud problems that drove us to create these products, and discuss implementation details, including the challenges encountered along the way. Tools discussed include Picsou, the successor to our AWS billing file cost analyzer; Libra, an easy-to-use reservation conversion application; and cost and efficiency dashboards that relay useful financial context to 50+ engineering teams and managers.
ABD312 – Deep Dive: Migrating Big Data Workloads to AWS Customers are migrating their analytics, data processing (ETL), and data science workloads running on Apache Hadoop, Spark, and data warehouse appliances from on-premise deployments to AWS in order to save costs, increase availability, and improve performance. AWS offers a broad set of analytics services, including solutions for batch processing, stream processing, machine learning, data workflow orchestration, and data warehousing. This session will focus on identifying the components and workflows in your current environment; and providing the best practices to migrate these workloads to the right AWS data analytics product. We will cover services such as Amazon EMR, Amazon Athena, Amazon Redshift, Amazon Kinesis, and more. We will also feature Vanguard, an American investment management company based in Malvern, Pennsylvania with over $4.4 trillion in assets under management. Ritesh Shah, Sr. Program Manager for Cloud Analytics Program at Vanguard, will describe how they orchestrated their migration to AWS analytics services, including Hadoop and Spark workloads to Amazon EMR. Ritesh will highlight the technical challenges they faced and overcame along the way, as well as share common recommendations and tuning tips to accelerate the time to production.
ABD402 – How Esri Optimizes Massive Image Archives for Analytics in the Cloud Petabyte scale archives of satellites, planes, and drones imagery continue to grow exponentially. They mostly exist as semi-structured data, but they are only valuable when accessed and processed by a wide range of products for both visualization and analysis. This session provides an overview of how ArcGIS indexes and structures data so that any part of it can be quickly accessed, processed, and analyzed by reading only the minimum amount of data needed for the task. In this session, we share best practices for structuring and compressing massive datasets in Amazon S3, so it can be analyzed efficiently. We also review a number of different image formats, including GeoTIFF (used for the Public Datasets on AWS program, Landsat on AWS), cloud optimized GeoTIFF, MRF, and CRF as well as different compression approaches to show the effect on processing performance. Finally, we provide examples of how this technology has been used to help image processing and analysis for the response to Hurricane Harvey.
ABD329 – A Look Under the Hood – How Amazon.com Uses AWS Services for Analytics at Massive Scale Amazon’s consumer business continues to grow, and so does the volume of data and the number and complexity of the analytics done in support of the business. In this session, we talk about how Amazon.com uses AWS technologies to build a scalable environment for data and analytics. We look at how Amazon is evolving the world of data warehousing with a combination of a data lake and parallel, scalable compute engines such as Amazon EMR and Amazon Redshift.
ABD327 – Migrating Your Traditional Data Warehouse to a Modern Data Lake In this session, we discuss the latest features of Amazon Redshift and Redshift Spectrum, and take a deep dive into its architecture and inner workings. We share many of the recent availability, performance, and management enhancements and how they improve your end user experience. You also hear from 21st Century Fox, who presents a case study of their fast migration from an on-premises data warehouse to Amazon Redshift. Learn how they are expanding their data warehouse to a data lake that encompasses multiple data sources and data formats. This architecture helps them tie together siloed business units and get actionable 360-degree insights across their consumer base. MCL202 – Ally Bank & Cognizant: Transforming Customer Experience Using Amazon Alexa Given the increasing popularity of natural language interfaces such as Voice as User technology or conversational artificial intelligence (AI), Ally® Bank was looking to interact with customers by enabling direct transactions through conversation or voice. They also needed to develop a capability that allows third parties to connect to the bank securely for information sharing and exchange, using oAuth, an authentication protocol seen as the future of secure banking technology. Cognizant’s Architecture team partnered with Ally Bank’s Enterprise Architecture group and identified the right product for oAuth integration with Amazon Alexa and third-party technologies. In this session, we discuss how building products with conversational AI helps Ally Bank offer an innovative customer experience; increase retention through improved data-driven personalization; increase the efficiency and convenience of customer service; and gain deep insights into customer needs through data analysis and predictive analytics to offer new products and services.
MCL317 – Orchestrating Machine Learning Training for Netflix Recommendations At Netflix, we use machine learning (ML) algorithms extensively to recommend relevant titles to our 100+ million members based on their tastes. Everything on the member home page is an evidence-driven, A/B-tested experience that we roll out backed by ML models. These models are trained using Meson, our workflow orchestration system. Meson distinguishes itself from other workflow engines by handling more sophisticated execution graphs, such as loops and parameterized fan-outs. Meson can schedule Spark jobs, Docker containers, bash scripts, gists of Scala code, and more. Meson also provides a rich visual interface for monitoring active workflows and inspecting execution logs. It has a powerful Scala DSL for authoring workflows as well as the REST API. In this session, we focus on how Meson trains recommendation ML models in production, and how we have re-architected it to scale up for a growing need of broad ETL applications within Netflix. As a driver for this change, we have had to evolve the persistence layer for Meson. We talk about how we migrated from Cassandra to Amazon RDS backed by Amazon Aurora
MCL350 – Humans vs. the Machines: How Pinterest Uses Amazon Mechanical Turk’s Worker Community to Improve Machine Learning Ever since the term “crowdsourcing” was coined in 2006, it’s been a buzzword for technology companies and social institutions. In the technology sector, crowdsourcing is instrumental for verifying machine learning algorithms, which, in turn, improves the user’s experience. In this session, we explore how Pinterest adapted to an increased reliability on human evaluation to improve their product, with a focus on how they’ve integrated with Mechanical Turk’s platform. This presentation is aimed at engineers, analysts, program managers, and product managers who are interested in how companies rely on Mechanical Turk’s human evaluation platform to better understand content and improve machine learning algorithms. The discussion focuses on the analysis and product decisions related to building a high quality crowdsourcing system that takes advantage of Mechanical Turk’s powerful worker community.
ABD201 – Big Data Architectural Patterns and Best Practices on AWS In this session, we simplify big data processing as a data bus comprising various stages: collect, store, process, analyze, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architectures, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost
ABD202 – Best Practices for Building Serverless Big Data Applications Serverless technologies let you build and scale applications and services rapidly without the need to provision or manage servers. In this session, we show you how to incorporate serverless concepts into your big data architectures. We explore the concepts behind and benefits of serverless architectures for big data, looking at design patterns to ingest, store, process, and visualize your data. Along the way, we explain when and how you can use serverless technologies to streamline data processing, minimize infrastructure management, and improve agility and robustness and share a reference architecture using a combination of cloud and open source technologies to solve your big data problems. Topics include: use cases and best practices for serverless big data applications; leveraging AWS technologies such as Amazon DynamoDB, Amazon S3, Amazon Kinesis, AWS Lambda, Amazon Athena, and Amazon EMR; and serverless ETL, event processing, ad hoc analysis, and real-time analytics.
ABD206 – Building Visualizations and Dashboards with Amazon QuickSight Just as a picture is worth a thousand words, a visual is worth a thousand data points. A key aspect of our ability to gain insights from our data is to look for patterns, and these patterns are often not evident when we simply look at data in tables. The right visualization will help you gain a deeper understanding in a much quicker timeframe. In this session, we will show you how to quickly and easily visualize your data using Amazon QuickSight. We will show you how you can connect to data sources, generate custom metrics and calculations, create comprehensive business dashboards with various chart types, and setup filters and drill downs to slice and dice the data.
ABD203 – Real-Time Streaming Applications on AWS: Use Cases and Patterns To win in the marketplace and provide differentiated customer experiences, businesses need to be able to use live data in real time to facilitate fast decision making. In this session, you learn common streaming data processing use cases and architectures. First, we give an overview of streaming data and AWS streaming data capabilities. Next, we look at a few customer examples and their real-time streaming applications. Finally, we walk through common architectures and design patterns of top streaming data use cases.
ABD213 – How to Build a Data Lake with AWS Glue Data Catalog As data volumes grow and customers store more data on AWS, they often have valuable data that is not easily discoverable and available for analytics. The AWS Glue Data Catalog provides a central view of your data lake, making data readily available for analytics. We introduce key features of the AWS Glue Data Catalog and its use cases. Learn how crawlers can automatically discover your data, extract relevant metadata, and add it as table definitions to the AWS Glue Data Catalog. We will also explore the integration between AWS Glue Data Catalog and Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.
ABD214 – Real-time User Insights for Mobile and Web Applications with Amazon Pinpoint With customers demanding relevant and real-time experiences across a range of devices, digital businesses are looking to gather user data at scale, understand this data, and respond to customer needs instantly. This requires tools that can record large volumes of user data in a structured fashion, and then instantly make this data available to generate insights. In this session, we demonstrate how you can use Amazon Pinpoint to capture user data in a structured yet flexible manner. Further, we demonstrate how this data can be set up for instant consumption using services like Amazon Kinesis Firehose and Amazon Redshift. We walk through example data based on real world scenarios, to illustrate how Amazon Pinpoint lets you easily organize millions of events, record them in real-time, and store them for further analysis.
ABD223 – IT Innovators: New Technology for Leveraging Data to Enable Agility, Innovation, and Business Optimization Companies of all sizes are looking for technology to efficiently leverage data and their existing IT investments to stay competitive and understand where to find new growth. Regardless of where companies are in their data-driven journey, they face greater demands for information by customers, prospects, partners, vendors and employees. All stakeholders inside and outside the organization want information on-demand or in “real time”, available anywhere on any device. They want to use it to optimize business outcomes without having to rely on complex software tools or human gatekeepers to relevant information. Learn how IT innovators at companies such as MasterCard, Jefferson Health, and TELUS are using Domo’s Business Cloud to help their organizations more effectively leverage data at scale.
ABD301 – Analyzing Streaming Data in Real Time with Amazon Kinesis Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. In this session, we present an end-to-end streaming data solution using Kinesis Streams for data ingestion, Kinesis Analytics for real-time processing, and Kinesis Firehose for persistence. We review in detail how to write SQL queries using streaming data and discuss best practices to optimize and monitor your Kinesis Analytics applications. Lastly, we discuss how to estimate the cost of the entire system
ABD302 – Real-Time Data Exploration and Analytics with Amazon Elasticsearch Service and Kibana In this session, we use Apache web logs as example and show you how to build an end-to-end analytics solution. First, we cover how to configure an Amazon ES cluster and ingest data using Amazon Kinesis Firehose. We look at best practices for choosing instance types, storage options, shard counts, and index rotations based on the throughput of incoming data. Then we demonstrate how to set up a Kibana dashboard and build custom dashboard widgets. Finally, we review approaches for generating custom, ad-hoc reports.
ABD304 – Best Practices for Data Warehousing with Amazon Redshift & Redshift Spectrum Most companies are over-run with data, yet they lack critical insights to make timely and accurate business decisions. They are missing the opportunity to combine large amounts of new, unstructured big data that resides outside their data warehouse with trusted, structured data inside their data warehouse. In this session, we take an in-depth look at how modern data warehousing blends and analyzes all your data, inside and outside your data warehouse without moving the data, to give you deeper insights to run your business. We will cover best practices on how to design optimal schemas, load data efficiently, and optimize your queries to deliver high throughput and performance.
ABD305 – Design Patterns and Best Practices for Data Analytics with Amazon EMR Amazon EMR is one of the largest Hadoop operators in the world, enabling customers to run ETL, machine learning, real-time processing, data science, and low-latency SQL at petabyte scale. In this session, we introduce you to Amazon EMR design patterns such as using Amazon S3 instead of HDFS, taking advantage of both long and short-lived clusters, and other Amazon EMR architectural best practices. We talk about lowering cost with Auto Scaling and Spot Instances, and security best practices for encryption and fine-grained access control. Finally, we dive into some of our recent launches to keep you current on our latest features.
ABD307 – Deep Analytics for Global AWS Marketing Organization To meet the needs of the global marketing organization, the AWS marketing analytics team built a scalable platform that allows the data science team to deliver custom econometric and machine learning models for end user self-service. To meet data security standards, we use end-to-end data encryption and different AWS services such as Amazon Redshift, Amazon RDS, Amazon S3, Amazon EMR with Apache Spark and Auto Scaling. In this session, you see real examples of how we have scaled and automated critical analysis, such as calculating the impact of marketing programs like re:Invent and prioritizing leads for our sales teams.
ABD311 – Deploying Business Analytics at Enterprise Scale with Amazon QuickSight One of the biggest tradeoffs customers usually make when deploying BI solutions at scale is agility versus governance. Large-scale BI implementations with the right governance structure can take months to design and deploy. In this session, learn how you can avoid making this tradeoff using Amazon QuickSight. Learn how to easily deploy Amazon QuickSight to thousands of users using Active Directory and Federated SSO, while securely accessing your data sources in Amazon VPCs or on-premises. We also cover how to control access to your datasets, implement row-level security, create scheduled email reports, and audit access to your data.
ABD315 – Building Serverless ETL Pipelines with AWS Glue Organizations need to gain insight and knowledge from a growing number of Internet of Things (IoT), APIs, clickstreams, unstructured and log data sources. However, organizations are also often limited by legacy data warehouses and ETL processes that were designed for transactional data. In this session, we introduce key ETL features of AWS Glue, cover common use cases ranging from scheduled nightly data warehouse loads to near real-time, event-driven ETL flows for your data lake. We discuss how to build scalable, efficient, and serverless ETL pipelines using AWS Glue. Additionally, Merck will share how they built an end-to-end ETL pipeline for their application release management system, and launched it in production in less than a week using AWS Glue.
ABD318 – Architecting a data lake with Amazon S3, Amazon Kinesis, and Amazon Athena Learn how to architect a data lake where different teams within your organization can publish and consume data in a self-service manner. As organizations aim to become more data-driven, data engineering teams have to build architectures that can cater to the needs of diverse users – from developers, to business analysts, to data scientists. Each of these user groups employs different tools, have different data needs and access data in different ways. In this talk, we will dive deep into assembling a data lake using Amazon S3, Amazon Kinesis, Amazon Athena, Amazon EMR, and AWS Glue. The session will feature Mohit Rao, Architect and Integration lead at Atlassian, the maker of products such as JIRA, Confluence, and Stride. First, we will look at a couple of common architectures for building a data lake. Then we will show how Atlassian built a self-service data lake, where any team within the company can publish a dataset to be consumed by a broad set of users.
Companies have valuable data that they may not be analyzing due to the complexity, scalability, and performance issues of loading the data into their data warehouse. However, with the right tools, you can extend your analytics to query data in your data lake—with no loading required. Amazon Redshift Spectrum extends the analytic power of Amazon Redshift beyond data stored in your data warehouse to run SQL queries directly against vast amounts of unstructured data in your Amazon S3 data lake. This gives you the freedom to store your data where you want, in the format you want, and have it available for analytics when you need it. Join a discussion with AWS solution architects to ask question.
ABD330 – Combining Batch and Stream Processing to Get the Best of Both Worlds Today, many architects and developers are looking to build solutions that integrate batch and real-time data processing, and deliver the best of both approaches. Lambda architecture (not to be confused with the AWS Lambda service) is a design pattern that leverages both batch and real-time processing within a single solution to meet the latency, accuracy, and throughput requirements of big data use cases. Come join us for a discussion on how to implement Lambda architecture (batch, speed, and serving layers) and best practices for data processing, loading, and performance tuning
ABD335 – Real-Time Anomaly Detection Using Amazon Kinesis Amazon Kinesis Analytics offers a built-in machine learning algorithm that you can use to easily detect anomalies in your VPC network traffic and improve security monitoring. Join us for an interactive discussion on how to stream your VPC flow Logs to Amazon Kinesis Streams and identify anomalies using Kinesis Analytics.
ABD339 – Deep Dive and Best Practices for Amazon Athena Amazon Athena is an interactive query service that enables you to process data directly from Amazon S3 without the need for infrastructure. Since its launch at re:invent 2016, several organizations have adopted Athena as the central tool to process all their data. In this talk, we dive deep into the most common use cases, including working with other AWS services. We review the best practices for creating tables and partitions and performance optimizations. We also dive into how Athena handles security, authorization, and authentication. Lastly, we hear from a customer who has reduced costs and improved time to market by deploying Athena across their organization.
We look forward to meeting you at re:Invent 2017!
About the Author
Roy Ben-Alta is a solution architect and principal business development manager at Amazon Web Services in New York. He focuses on Data Analytics and ML Technologies, working with AWS customers to build innovative data-driven products.
CodeBuild is a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy. As part of the build process, developers often require access to resources that should be isolated from the public Internet. Now CodeBuild builds can be optionally configured to have VPC connectivity and access these resources directly.
Accessing Resources in a VPC
You can configure builds to have access to a VPC when you create a CodeBuild project or you can update an existing CodeBuild project with VPC configuration attributes. Here’s how it looks in the console:
To configure VPC connectivity: select a VPC, one or more subnets within that VPC, and one or more VPC security groups that CodeBuild should apply when attaching to your VPC. Once configured, commands running as part of your build will be able to access resources in your VPC without transiting across the public Internet.
Use Cases
The availability of VPC connectivity from CodeBuild builds unlocks many potential uses. For example, you can:
Run integration tests from your build against data in an Amazon RDS instance that’s isolated on a private subnet.
Query data in an ElastiCache cluster directly from tests.
Interact with internal web services hosted on Amazon EC2, Amazon ECS, or services that use internal Elastic Load Balancing.
Retrieve dependencies from self-hosted, internal artifact repositories such as PyPI for Python, Maven for Java, npm for Node.js, and so on.
Access objects in an Amazon S3 bucket configured to allow access only through a VPC endpoint.
Query external web services that require fixed IP addresses through the Elastic IP address of the NAT gateway associated with your subnet(s).
… and more! Your builds can now access any resource that’s hosted in your VPC without any compromise on network isolation.
Internet Connectivity
CodeBuild requires access to resources on the public Internet to successfully execute builds. At a minimum, it must be able to reach your source repository system (such as AWS CodeCommit, GitHub, Bitbucket), Amazon Simple Storage Service (Amazon S3) to deliver build artifacts, and Amazon CloudWatch Logs to stream logs from the build process. The interface attached to your VPC will not be assigned a public IP address so to enable Internet access from your builds, you will need to set up a managed NAT Gateway or NAT instance for the subnets you configure. You must also ensure your security groups allow outbound access to these services.
IP Address Space
Each running build will be assigned an IP address from one of the subnets in your VPC that you designate for CodeBuild to use. As CodeBuild scales to meet your build volume, ensure that you select subnets with enough address space to accommodate your expected number of concurrent builds.
Service Role Permissions
CodeBuild requires new permissions in order to manage network interfaces on your VPCs. If you create a service role for your new projects, these permissions will be included in that role’s policy automatically. For existing service roles, you can edit the policy document to include the additional actions. For the full policy document to apply to your service role, see Advanced Setup in the CodeBuild documentation.
For more information, see VPC Support in the CodeBuild documentation. We hope you find the ability to access internal resources on a VPC useful in your build processes! If you have any questions or feedback, feel free to reach out to us through the AWS CodeBuild forum or leave a comment!
Contributed by Otavio Ferreira, Manager, Software Development, AWS Messaging
Like other developers around the world, you may be tackling increasingly complex business problems. A key success factor, in that case, is the ability to break down a large project scope into smaller, more manageable components. A service-oriented architecture guides you toward designing systems as a collection of loosely coupled, independently scaled, and highly reusable services. Microservices take this even further. To improve performance and scalability, they promote fine-grained interfaces and lightweight protocols.
However, the communication among isolated microservices can be challenging. Services are often deployed onto independent servers and don’t share any compute or storage resources. Also, you should avoid hard dependencies among microservices, to preserve maintainability and reusability.
If you apply the pub/sub design pattern, you can effortlessly decouple and independently scale out your microservices and serverless architectures. A pub/sub messaging service, such as Amazon SNS, promotes event-driven computing that statically decouples event publishers from subscribers, while dynamically allowing for the exchange of messages between them. An event-driven architecture also introduces the responsiveness needed to deal with complex problems, which are often unpredictable and asynchronous.
What is event-driven computing?
Given the context of microservices, event-driven computing is a model in which subscriber services automatically perform work in response to events triggered by publisher services. This paradigm can be applied to automate workflows while decoupling the services that collectively and independently work to fulfil these workflows. Amazon SNS is an event-driven computing hub, in the AWS Cloud, that has native integration with several AWS publisher and subscriber services.
Which AWS services publish events to SNS natively?
Several AWS services have been integrated as SNS publishers and, therefore, can natively trigger event-driven computing for a variety of use cases. In this post, I specifically cover AWS compute, storage, database, and networking services, as depicted below.
Compute services
Auto Scaling: Helps you ensure that you have the correct number of Amazon EC2 instances available to handle the load for your application. You can configure Auto Scaling lifecycle hooks to trigger events, as Auto Scaling resizes your EC2 cluster.As an example, you may want to warm up the local cache store on newly launched EC2 instances, and also download log files from other EC2 instances that are about to be terminated. To make this happen, set an SNS topic as your Auto Scaling group’s notification target, then subscribe two Lambda functions to this SNS topic. The first function is responsible for handling scale-out events (to warm up cache upon provisioning), whereas the second is in charge of handling scale-in events (to download logs upon termination).
AWS Elastic Beanstalk: An easy-to-use service for deploying and scaling web applications and web services developed in a number of programming languages. You can configure event notifications for your Elastic Beanstalk environment so that notable events can be automatically published to an SNS topic, then pushed to topic subscribers.As an example, you may use this event-driven architecture to coordinate your continuous integration pipeline (such as Jenkins CI). That way, whenever an environment is created, Elastic Beanstalk publishes this event to an SNS topic, which triggers a subscribing Lambda function, which then kicks off a CI job against your newly created Elastic Beanstalk environment.
Elastic Load Balancing:Automatically distributes incoming application traffic across Amazon EC2 instances, containers, or other resources identified by IP addresses.You can configure CloudWatch alarms on Elastic Load Balancing metrics, to automate the handling of events derived from Classic Load Balancers. As an example, you may leverage this event-driven design to automate latency profiling in an Amazon ECS cluster behind a Classic Load Balancer. In this example, whenever your ECS cluster breaches your load balancer latency threshold, an event is posted by CloudWatch to an SNS topic, which then triggers a subscribing Lambda function. This function runs a task on your ECS cluster to trigger a latency profiling tool, hosted on the cluster itself. This can enhance your latency troubleshooting exercise by making it timely.
Amazon S3:Object storage built to store and retrieve any amount of data.You can enable S3 event notifications, and automatically get them posted to SNS topics, to automate a variety of workflows. For instance, imagine that you have an S3 bucket to store incoming resumes from candidates, and a fleet of EC2 instances to encode these resumes from their original format (such as Word or text) into a portable format (such as PDF).In this example, whenever new files are uploaded to your input bucket, S3 publishes these events to an SNS topic, which in turn pushes these messages into subscribing SQS queues. Then, encoding workers running on EC2 instances poll these messages from the SQS queues; retrieve the original files from the input S3 bucket; encode them into PDF; and finally store them in an output S3 bucket.
Amazon EFS: Provides simple and scalable file storage, for use with Amazon EC2 instances, in the AWS Cloud.You can configure CloudWatch alarms on EFS metrics, to automate the management of your EFS systems. For example, consider a highly parallelized genomics analysis application that runs against an EFS system. By default, this file system is instantiated on the “General Purpose” performance mode. Although this performance mode allows for lower latency, it might eventually impose a scaling bottleneck. Therefore, you may leverage an event-driven design to handle it automatically.Basically, as soon as the EFS metric “Percent I/O Limit” breaches 95%, CloudWatch could post this event to an SNS topic, which in turn would push this message into a subscribing Lambda function. This function automatically creates a new file system, this time on the “Max I/O” performance mode, then switches the genomics analysis application to this new file system. As a result, your application starts experiencing higher I/O throughput rates.
Amazon Glacier: A secure, durable, and low-cost cloud storage service for data archiving and long-term backup.You can set a notification configuration on an Amazon Glacier vault so that when a job completes, a message is published to an SNS topic. Retrieving an archive from Amazon Glacier is a two-step asynchronous operation, in which you first initiate a job, and then download the output after the job completes. Therefore, SNS helps you eliminate polling your Amazon Glacier vault to check whether your job has been completed, or not. As usual, you may subscribe SQS queues, Lambda functions, and HTTP endpoints to your SNS topic, to be notified when your Amazon Glacier job is done.
AWS Snowball: A petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data.You can leverage Snowball notifications to automate workflows related to importing data into and exporting data from AWS. More specifically, whenever your Snowball job status changes, Snowball can publish this event to an SNS topic, which in turn can broadcast the event to all its subscribers.As an example, imagine a Geographic Information System (GIS) that distributes high-resolution satellite images to users via Web browser. In this example, the GIS vendor could capture up to 80 TB of satellite images; create a Snowball job to import these files from an on-premises system to an S3 bucket; and provide an SNS topic ARN to be notified upon job status changes in Snowball. After Snowball changes the job status from “Importing” to “Completed”, Snowball publishes this event to the specified SNS topic, which delivers this message to a subscribing Lambda function, which finally creates a CloudFront web distribution for the target S3 bucket, to serve the images to end users.
Amazon RDS: Makes it easy to set up, operate, and scale a relational database in the cloud.RDS leverages SNS to broadcast notifications when RDS events occur. As usual, these notifications can be delivered via any protocol supported by SNS, including SQS queues, Lambda functions, and HTTP endpoints.As an example, imagine that you own a social network website that has experienced organic growth, and needs to scale its compute and database resources on demand. In this case, you could provide an SNS topic to listen to RDS DB instance events. When the “Low Storage” event is published to the topic, SNS pushes this event to a subscribing Lambda function, which in turn leverages the RDS API to increase the storage capacity allocated to your DB instance. The provisioning itself takes place within the specified DB maintenance window.
Amazon ElastiCache: A web service that makes it easy to deploy, operate, and scale an in-memory data store or cache in the cloud.ElastiCache can publish messages using Amazon SNS when significant events happen on your cache cluster. This feature can be used to refresh the list of servers on client machines connected to individual cache node endpoints of a cache cluster. For instance, an ecommerce website fetches product details from a cache cluster, with the goal of offloading a relational database and speeding up page load times. Ideally, you want to make sure that each web server always has an updated list of cache servers to which to connect.To automate this node discovery process, you can get your ElastiCache cluster to publish events to an SNS topic. Thus, when ElastiCache event “AddCacheNodeComplete” is published, your topic then pushes this event to all subscribing HTTP endpoints that serve your ecommerce website, so that these HTTP servers can update their list of cache nodes.
Amazon Redshift: A fully managed data warehouse that makes it simple to analyze data using standard SQL and BI (Business Intelligence) tools.Amazon Redshift uses SNS to broadcast relevant events so that data warehouse workflows can be automated. As an example, imagine a news website that sends clickstream data to a Kinesis Firehose stream, which then loads the data into Amazon Redshift, so that popular news and reading preferences might be surfaced on a BI tool. At some point though, this Amazon Redshift cluster might need to be resized, and the cluster enters a ready-only mode. Hence, this Amazon Redshift event is published to an SNS topic, which delivers this event to a subscribing Lambda function, which finally deletes the corresponding Kinesis Firehose delivery stream, so that clickstream data uploads can be put on hold.At a later point, after Amazon Redshift publishes the event that the maintenance window has been closed, SNS notifies a subscribing Lambda function accordingly, so that this function can re-create the Kinesis Firehose delivery stream, and resume clickstream data uploads to Amazon Redshift.
AWS DMS: Helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.DMS also uses SNS to provide notifications when DMS events occur, which can automate database migration workflows. As an example, you might create data replication tasks to migrate an on-premises MS SQL database, composed of multiple tables, to MySQL. Thus, if replication tasks fail due to incompatible data encoding in the source tables, these events can be published to an SNS topic, which can push these messages into a subscribing SQS queue. Then, encoders running on EC2 can poll these messages from the SQS queue, encode the source tables into a compatible character set, and restart the corresponding replication tasks in DMS. This is an event-driven approach to a self-healing database migration process.
Amazon Route 53: A highly available and scalable cloud-based DNS (Domain Name System). Route 53 health checks monitor the health and performance of your web applications, web servers, and other resources.You can set CloudWatch alarms and get automated Amazon SNS notifications when the status of your Route 53 health check changes. As an example, imagine an online payment gateway that reports the health of its platform to merchants worldwide, via a status page. This page is hosted on EC2 and fetches platform health data from DynamoDB. In this case, you could configure a CloudWatch alarm for your Route 53 health check, so that when the alarm threshold is breached, and the payment gateway is no longer considered healthy, then CloudWatch publishes this event to an SNS topic, which pushes this message to a subscribing Lambda function, which finally updates the DynamoDB table that populates the status page. This event-driven approach avoids any kind of manual update to the status page visited by merchants.
AWS Direct Connect (AWS DX): Makes it easy to establish a dedicated network connection from your premises to AWS, which can reduce your network costs, increase bandwidth throughput, and provide a more consistent network experience than Internet-based connections.You can monitor physical DX connections using CloudWatch alarms, and send SNS messages when alarms change their status. As an example, when a DX connection state shifts to 0 (zero), indicating that the connection is down, this event can be published to an SNS topic, which can fan out this message to impacted servers through HTTP endpoints, so that they might reroute their traffic through a different connection instead. This is an event-driven approach to connectivity resilience.
In addition to SNS, event-driven computing is also addressed by Amazon CloudWatch Events, which delivers a near real-time stream of system events that describe changes in AWS resources. With CloudWatch Events, you can route each event type to one or more targets, including:
Many AWS services publish events to CloudWatch. As an example, you can get CloudWatch Events to capture events on your ETL (Extract, Transform, Load) jobs running on AWS Glue and push failed ones to an SQS queue, so that you can retry them later.
Conclusion
Amazon SNS is a pub/sub messaging service that can be used as an event-driven computing hub to AWS customers worldwide. By capturing events natively triggered by AWS services, such as EC2, S3 and RDS, you can automate and optimize all kinds of workflows, namely scaling, testing, encoding, profiling, broadcasting, discovery, failover, and much more. Business use cases presented in this post ranged from recruiting websites, to scientific research, geographic systems, social networks, retail websites, and news portals.
Amazon ElastiCache makes it easy to for you to set up a fast, in-memory data store and cache. With support for the two most popular open source offerings (Redis and Memcached), ElastiCache supports the demanding needs of game leaderboards, in-memory analytics, and large-scale messaging.
Today I would like to tell you about an important addition to Amazon ElastiCache for Redis. You can already create clusters with up to 15 shards, each responsible for storing keys and values for a specific set of slots (each cluster has exactly 16,384 slots). A single cluster can expand to store 3.55 terabytes of in-memory data while supporting up to 20 million reads and 4.5 million writes per second.
Now with Online Resizing You can now adjust the number of shards in a running ElastiCache for Redis cluster while the cluster remains online and responding to requests. This gives you the power to respond to changes in traffic and data volume without having to take the cluster offline or to start with an empty cache. You can also rebalance a running cluster to uniformly redistribute slot space without changing the number of shards.
When you initiate a resharding or rebalancing operation, ElastiCache for Redis starts by preparing a plan that will result in an even distribution of slots across the shards in the cluster. Then it transfers slots across shards, moving many in parallel for efficiency. This all happens while the cluster continues to respond to requests, with a modest impact on write throughput for writes to a slot that is in motion. The migration rate is dependent on the instance type, network speed, read/write traffic to the slots, and is generally about 1 gigabyte per minute.
The resharding and rebalancing operations apply to Redis clusters that were created with Cluster Mode enabled:
Resharding a Cluster In general, you will know that it is time to expand a cluster via resharding when it starts to face significant memory pressure or when individual nodes are becoming bottlenecks. You can watch the cluster’s CloudWatch metrics to identify each situation:
To reshard a Redis cluster from the ElastiCache Dashboard, click on the cluster to visit the detail page, and then click on the Add shards button:
Enter the number of shards to add and (optionally) the desired Availability Zones, then click on Add:
The status of the cluster will change to modifying and the resharding process will begin. It can take anywhere from a few minutes to several hours, as indicated above. You can track the progress on the detail page for the cluster:
You can see the slots moving from shard to shard:
You can also watch the Events for the cluster:
During the resharding you should avoid the use of the KEYS and SMEMBERS commands, as well as compute-intensive Lua scripts in order to moderate the load on the cluster shards. You should avoid the FLUSHDB and FLUSHALL commands entirely; using them will interrupt and then abort the resharding process.
The status of each shard will return to available when the process is complete:
The same process takes place when you delete shards.
Rebalancing Slots You can perform this operation by heading to the cluster’s detail page and clicking on Rebalance Slot Distribution:
Things to Know Here are a couple of things to keep in mind about this new feature:
Engine Version – Your cluster must be running version 3.2.10 of the Redis engine.
Migration Size – Slots that contain items that are larger than 256 megabytes after serialization are not migrated.
Cluster Endpoint – The cluster endpoint does not change as a result of a resharding or rebalancing.
Available Now This feature is available now and you can start using it today.
By continuing to use the site, you agree to the use of cookies. more information
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.