By Fred Wurden, General Manager, AWS Enterprise Engineering (Windows, VMware, RedHat, SAP, Benchmarking)
AWS R5b.8xlarge delivers better performance at lower cost than Azure E64_32s_v4 for a SQL Server workload
In this blog, we will review a recent benchmark that Principled Technologies published on 2/25. The benchmark found that an Amazon Elastic Compute Cloud (Amazon EC2) R5b.8xlarge instance delivered better performance for a SQL Server workload at a lower cost when directly tested against an Azure E64_32s_v4 VM.
Behind the study: Understanding how SQL Server performed better, for a lower cost with an AWS EC2 R5b instance
Principled Technologies tested an online transaction processing (OLTP) workload for SQL Server 2019 on both an R5b instance on Amazon EC2 with Amazon Elastic Block Store (EBS) as storage and Azure E64_32s_v4. This particular Azure VM was chosen as an equivalent to the R5b instance, as both instances have comparable specifications for input/output operations per second (IOPS) performance, use Intel Xeon processors from the same generation (Cascade Lake), and offer the same number of cores (32). For storage, Principled Technologies mirrored storage configurations across the Azure VM and the EC2 instance (which used Amazon Elastic Block Store (EBS)), maxing out the IOPS specs on each while offering a direct comparison between instances.
When benchmarking, Principled Technologies ran a TPC-C-like OLTP workload from HammerDB v3.3 on both instances, testing against new orders per minute (NOPM) performance. NOPM shows the number of new-order transactions completed in one minute as part of a serialized business workload. HammerDB claims that because NOPM is “independent of any particular database implementation [it] is the recommended primary metric to use.”
The results: SQL Server on AWS EC2 R5b delivered 2x the performance of the Azure VM and was 62% less expensive
These test results from the Principled Technologies report show the price/performance and performance comparisons. The performance metric is new orders per minute (NOPM); higher is better. The price/performance calculations are based on the cost of On-Demand, License Included SQL Server instances and storage needed to achieve 1,000 NOPM; smaller is better.
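As a minimal sketch of the price/performance arithmetic described above (cost divided by throughput expressed in thousands of NOPM), the following Python helper may be useful. The function name and the input values are placeholders chosen for illustration; they are not figures from the Principled Technologies report.

```python
def price_per_1000_nopm(cost_usd: float, nopm: float) -> float:
    """Return the cost to achieve 1,000 NOPM; smaller is better."""
    if nopm <= 0:
        raise ValueError("nopm must be positive")
    # Express throughput in units of 1,000 NOPM, then divide cost by it.
    return cost_usd / (nopm / 1000.0)


# Placeholder inputs for illustration only, not results from the report.
example = price_per_1000_nopm(cost_usd=10_000.0, nopm=500_000.0)
print(f"${example:.2f} per 1,000 NOPM")
```

Using the same cost period (for example, monthly) for both platforms keeps the comparison like for like.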
An EC2 r5b.8xlarge instance powered by an Intel Xeon Scalable processor delivered better SQL Server NOPM performance on the HammerDB benchmark and a lower price per 1,000 NOPM than an Azure E64_32s_v4 VM powered by similar Intel Xeon Scalable processors.
On top of that, AWS’s storage price-performance exceeded Azure’s. The Azure managed disks offered 53 percent more storage than the EBS storage, but the EC2 instance with EBS storage cost 24 percent less than the Azure VM with managed disks. Even by reducing Azure storage by the difference in storage, something customers cannot do, EBS would have cost 13 percent less per storage GB than the Azure managed disks.
Why AWS is the best cloud to run your Windows and SQL Server workloads
To us, these results aren’t surprising. In fact, they’re in line with the success that customers find running Windows on AWS for over 12 years. Customers like Pearson and Expedia have all found better performance and enhanced cost savings by moving their Windows, SQL Server, and .NET workloads to AWS. In fact, RepricerExpress migrated its Windows and SQL Server environments from Azure to AWS to slash outbound bandwidth costs while gaining agility and performance.
Not only do we offer better price-performance for your Windows workloads, but we also offer better ways to run Windows in the cloud. Whether you want to rehost your databases to EC2, move to a managed service with Amazon Relational Database Service (Amazon RDS) for SQL Server, or even modernize to cloud-native databases, AWS stands ready to help you get the most out of the cloud.
To learn more on migrating Windows Server or SQL Server, visit Windows on AWS. For more stories about customers who have successfully migrated and modernized SQL Server workloads with AWS, visit our Customer Success page. Contact us to start your migration journey today.
This post was written by Santiago Cardenas, Sr. Partner SA, and Nir Mashkowski, Principal Product Manager.
Increasingly, customers turn to software as a service (SaaS) solutions for the potential of lowering the total cost of ownership (TCO). This enables customers to focus their teams on business priorities instead of managing and maintaining software and infrastructure. Startups are building SaaS products for a wide variety of common application types to take advantage of these market needs.
As SaaS accelerates adoption, enterprise customers expect the same capabilities that are available with traditional, on-premises software. They want the ability to customize system behavior and use rich integrations that can help build solutions rapidly.
For customization and extensibility, many independent software vendors (ISVs) are building application programming interfaces (APIs) and integration hooks. To extend these capabilities, many SaaS builders expose a common set of APIs:
Event APIs emit events when SaaS entities change. Synchronous event APIs block the SaaS action until the API completes a request. Asynchronous event APIs are non-blocking and use mechanisms like pub/sub and webhooks to inform the caller of updates. Event APIs are used for many purposes, such as enriching incoming data or triggering workflows.
CRUD APIs allow developers to interact with entities within the SaaS product. They can be used by mobile or web clients to add, update, and remove records, for example.
Schema APIs allow developers to create data entities in the SaaS product, such as tables, key-value stores, or document repositories.
User experience (UX) components. Many SaaS products include an SDK that helps provide a consistent look-and-feel and built-in support for common functions, such as authentication. Components are sometimes delivered as code libraries or as an online API that renders the UI.
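To make the difference between synchronous and asynchronous event APIs in the list above more concrete, here is a small, hypothetical Python sketch. The class and method names (EventApi, register_sync_hook, save_entity, drain_notifications) are invented for illustration and do not correspond to any particular SaaS product. Synchronous hooks run inline and can change the entity before it is saved; asynchronous notifications are queued and delivered later.

```python
import queue
from typing import Callable, Dict, List

Event = Dict[str, str]


class EventApi:
    """Toy event API with blocking hooks and non-blocking notifications."""

    def __init__(self) -> None:
        self._sync_hooks: List[Callable[[Event], Event]] = []
        self._notifications: "queue.Queue[Event]" = queue.Queue()

    def register_sync_hook(self, hook: Callable[[Event], Event]) -> None:
        self._sync_hooks.append(hook)

    def save_entity(self, entity: Event) -> Event:
        # Synchronous: the save waits for every hook to return.
        for hook in self._sync_hooks:
            entity = hook(entity)
        # Asynchronous: enqueue a notification and return immediately.
        self._notifications.put({"type": "entity.saved", "id": entity["id"]})
        return entity

    def drain_notifications(self) -> List[Event]:
        # Stand-in for a pub/sub or webhook delivery worker.
        delivered = []
        while not self._notifications.empty():
            delivered.append(self._notifications.get())
        return delivered


api = EventApi()
api.register_sync_hook(lambda e: {**e, "region": "emea"})  # enrich incoming data
saved = api.save_entity({"id": "42"})
print(saved)
print(api.drain_notifications())
```

In a real product the queue would typically be replaced by a message broker or webhook dispatcher, so that slow subscribers cannot delay the SaaS action.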
Business systems expose different subsets of the APIs based on the application domain. Extensibility models are built on top of those APIs and can take various different forms. ISVs use these APIs to build features such as “no code” workflow engines, UX, and report generators. In those cases, the SaaS product runs a domain-specific language (DSL) where it controls compute, storage, and memory consumption.
This level of customization is acceptable for many business users. However, more sophisticated customization requires the ability to write custom code. When coding is needed, some business systems choose to provide sandboxing for the user code within the service. Others choose to ask developers to host the extensibility model themselves.
The growth of vendor-hosted SaaS extensions
First-generation SaaS products essentially “lift and shift” on-premises enterprise software, where each customer has a copy of the entire stack. This single tenant model offers simplicity, a smaller blast radius, and faster time to market.
Newer, born-in-the-cloud SaaS products implement a multi-tenant approach, where all resources are shared across customers. This model may be easier to maintain but can present challenges for handling security, isolation, and resource allocation.
Multi-tenancy challenges are harder when customers can run custom code inside the SaaS infrastructure. To solve this, SaaS builders may start with a customer-hosted approach, where customers implement their own extensions by consuming SaaS APIs. This means customers must learn and install an SDK, then deploy and maintain an app in their cloud. This often results in higher cost and slower time to market.
To simplify this model, SaaS builders are finding ways to allow developers to write code directly within the SaaS product. The event-driven, pay-per-execution, and polyglot nature of serverless functions provides new capabilities for implementing SaaS extensibility. This model is called vendor-hosted SaaS extensions.
SaaS builders are using AWS Lambda for serverless functions to provide flexible compute options to their customers. The goal is to abstract away and simplify the consumption model. AWS provides SaaS builders with features and controls to customize the execution environments as part of their own SaaS product. This allows SaaS owners more flexibility when deciding on isolation models, usability, and cost considerations.
Isolating tenant requests
Isolation of customer requests is important both at the product level and at the tenant level. Product-level isolation focuses on controlling and enforcing the access to data between tenants. It ensures that one tenant is separated from another tenant’s functions. Tenant-level isolation focuses on resources allocated to serve requests. These may include identity, network and internet access, file system access, and memory/CPU allocation.
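One way to picture the two isolation levels described above is the following hypothetical Python sketch. The names (TenantLimits, ExtensionRegistry, IsolationError) and the specific limit fields are assumptions made for illustration, not part of any AWS API. The ownership check stands in for product-level isolation, and the returned limits stand in for the tenant-level resource allocation.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TenantLimits:
    """Resources allocated to serve one tenant's requests (hypothetical fields)."""
    memory_mb: int
    timeout_seconds: int
    allow_internet: bool


class IsolationError(Exception):
    """Raised when a tenant tries to reach a function it does not own."""


class ExtensionRegistry:
    def __init__(self) -> None:
        self._owners: Dict[str, str] = {}           # function name -> tenant id
        self._limits: Dict[str, TenantLimits] = {}  # tenant id -> limits

    def register(self, tenant_id: str, function_name: str,
                 limits: TenantLimits) -> None:
        self._owners[function_name] = tenant_id
        self._limits[tenant_id] = limits

    def authorize(self, tenant_id: str, function_name: str) -> TenantLimits:
        # Product-level isolation: a tenant may only invoke its own functions.
        if self._owners.get(function_name) != tenant_id:
            raise IsolationError(f"{tenant_id} cannot invoke {function_name}")
        # Tenant-level isolation: the caller runs under its own resource limits.
        return self._limits[tenant_id]


registry = ExtensionRegistry()
registry.register("tenant-a", "enrich-order", TenantLimits(256, 10, False))
print(registry.authorize("tenant-a", "enrich-order"))
```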
SaaS product owners can allow customers to use familiar programming languages within the serverless functions. This allows customers to grow with the service and potentially host and scale independently, using their own infrastructure.
Usability considers the domain and industry of the product. For example, if the SaaS product enables data processing, it may enable invocation of serverless functions during these workflows. Additionally, these functions may provide the customer the context of the user, application, tenant, and the domain. A streamlined, opinionated deployment workflow that abstracts away initial configuration can also aid customer adoption.
Cost is an important factor in driving adoption. It’s an important differentiator to pay only for the resources used, while being able to scale in response to events. This can help reduce costs that are passed on to SaaS customers.
Examples of SaaS product extensibility
Multiple AWS Partners are extending their SaaS product using Lambda for on-demand scalable compute. This enables them to focus on enriching the customer experience that is associated with their business domain. Examples include:
Segment Functions, which seamlessly integrates as a source or destination. The service uses code snippets to allow customers to enrich data, enforce consistency, and connect to APIs and services that power their workflows.
Freshworks’ Neo platform provides extensibility using the concept of apps. These are powered by Lambda functions hosting the core business logic and backends. Apps are triggered by unplanned and scheduled Freshworks events (customer support tickets, IT service cases, contacts, and deal updates), in addition to app-specific and external events.
Netlify Functions enables customers to supercharge frontend code with functions in their development workflow. These can power automated triggers, connect to third-party APIs, or provide user authentication.
All of these SaaS partners abstract away the deployment, versioning, and configuration of custom code using Lambda.
As customers increasingly use SaaS solutions in their businesses, they want the same customization and extensibility available in on-premises solutions. SaaS partners have developed APIs and integration hooks to help address this need. For more sophisticated customization, products enable custom code to run within their SaaS workflows.
This presents SaaS partners with isolation, usability, and cost challenges and many of them are now using serverless functions to address these challenges. Lambda provides a pay-per-value compute service that scales automatically to meet customer demand. Segment Functions, Freshworks, and Netlify Functions have all used Lambda to provide extensibility to their customers.
Lambda continues to develop features and functionality to power the extensibility of SaaS products. We look forward to seeing the new ways you use Lambda to extend your SaaS product for your customers. Share your Lambda extensibility story with us at [email protected].
Fred Wurden, General Manager, AWS Enterprise Engineering (Windows, VMware, RedHat, SAP, Benchmarking)
For companies that rely on Windows Server but find it daunting to move those workloads to the cloud, there is no easier way to run Windows in the cloud than AWS. Customers as diverse as Expedia, Pearson, Seven West Media, and RepricerExpress have chosen AWS over other cloud providers to unlock the Microsoft products they currently rely on, including Windows Server and SQL Server. The reasons are several: by embracing AWS, they’ve achieved cost savings through forthright pricing options and expanded breadth and depth of capabilities. In this blog, we break down these advantages to understand why AWS is the simplest, most popular and secure cloud to run your business-critical Windows Server and SQL Server workloads.
AWS lowers costs and increases choice with flexible pricing options
Customers expect accurate and transparent pricing so that they can make the best decisions for their business. When assessing which cloud to run their Windows workloads on, customers look at the total cost of ownership (TCO) of those workloads.
Not only does AWS provide cost-effective ways to run Windows and SQL Server workloads, we also regularly lower prices to make it even more affordable. Since launching in 2006, AWS has reduced prices 85 times. In fact, we recently dropped pricing by an average of 25% for Amazon RDS for SQL Server Enterprise Edition database instances in the Multi-AZ configuration, for both On-Demand Instance and Reserved Instance types on the latest generation hardware.
The AWS pricing approach makes it simple to understand your costs, even as we actively help you pay AWS less now and in the future. For example, AWS Trusted Advisor provides real-time guidance to provision your resources more efficiently. This means that you spend less money with us. We do this because we know that if we aren’t creating more and more value for you each year, you’ll go elsewhere.
In addition, we have several other industry-leading initiatives to help lower customer costs, including AWS Compute Optimizer, Amazon CodeGuru, and AWS Windows Optimization and Licensing Assessments (AWS OLA). AWS Compute Optimizer recommends optimal AWS Compute resources for your workloads by using machine learning (ML) to analyze historical utilization metrics. Customers who use Compute Optimizer can save up to 25% on applications running on Amazon Elastic Compute Cloud (Amazon EC2). Machine learning also plays a key role in Amazon CodeGuru, which provides intelligent recommendations for improving code quality and identifying an application’s most expensive lines of code. Finally, AWS OLA helps customers to optimize licensing and infrastructure provisioning based on actual resource consumption (ARC) to offer cost-effective Windows deployment options.
Cloud pricing shouldn’t be complicated
Other cloud providers bury key pricing information when making comparisons to other vendors, thereby incorrectly touting pricing advantages. Often those online “pricing calculators” that purport to clarify pricing neglect to include hidden fees, complicating costs through licensing rules (e.g., you can run this workload “for free” if you pay us elsewhere for “Software Assurance”). At AWS, we believe such pricing and licensing tricks are contrary to the fundamental promise of transparent pricing for cloud computing.
By contrast, AWS makes it straightforward for you to run Windows Server applications where you want. With our End-of-Support Migration Program (EMP) for Windows Server, you can easily move your legacy Windows Server applications—without needing any code changes. The EMP technology decouples the applications from the underlying OS. This enables AWS Partners or AWS Professional Services to migrate critical applications from legacy Windows Server 2003, 2008, and 2008 R2 to newer, supported versions of Windows Server on AWS. This allows you to avoid extra charges for extended support that other cloud providers charge.
Other cloud providers also may limit your ability to Bring-Your-Own-License (BYOL) for SQL Server to your preferred cloud provider. Meanwhile, AWS improves the BYOL experience using EC2 Dedicated Hosts and AWS License Manager. With EC2 Dedicated Hosts, you can save costs by moving existing Windows Server and SQL Server licenses that do not have Software Assurance to AWS. AWS License Manager simplifies how you manage your software licenses from software vendors such as Microsoft, SAP, Oracle, and IBM across AWS and on-premises environments. We also work hard to help our customers spend less.
How AWS helps customers save money on Windows Server and SQL Server workloads
The first way AWS helps customers save money is by delivering the most reliable global cloud infrastructure for your Windows workloads. Any downtime costs customers in terms of lost revenue, diminished customer goodwill, and reduced employee productivity.
With respect to pricing, AWS offers multiple pricing options to help our customers save. First, we offer AWS Savings Plans that provide you with a flexible pricing model to save up to 72 percent on your AWS compute usage. You can sign up for Savings Plans for a 1- or 3-year term. Our Savings Plans help you easily manage your plans by taking advantage of recommendations, performance reporting and budget alerts in AWS Cost Explorer, which is a unique benefit only AWS provides. Not only that, but we also offer Amazon EC2 Spot Instances that help you save up to 90 percent on your compute costs vs. On-Demand Instance pricing.
Customers don’t need to walk this migration path alone. In fact, AWS customers often make the most efficient use of cloud resources by working with assessment partners like Cloudamize, CloudChomp, or Migration Evaluator (formerly TSO Logic), which is now part of AWS. By running detailed assessments of their environments with Migration Evaluator before migration, customers can achieve an average of 36 percent savings using AWS over three years. So how do you get from an on-premises Windows deployment to the cloud? AWS makes it simple.
AWS has support programs and tools to help you migrate to the cloud
Though AWS Migration Acceleration Program (MAP) for Windows is a great way to reduce the cost of migrating Windows Server and SQL Server workloads, MAP is more than a cost savings tool. As part of MAP, AWS offers a number of resources to support and sustain your migration efforts. This includes an experienced APN Partner ecosystem to execute migrations, our AWS Professional Services team to provide best practices and prescriptive advice, and a training program to help IT professionals understand and carry out migrations successfully. We help you figure out which workloads to move first, then leverage the combined experience of our Professional Services and partner teams to guide you through the process. For customers who want to save even more (up to 72% in some cases) we are the leaders in helping customers transform legacy systems to modernized managed services.
Why run Windows Server and SQL Server anywhere else but AWS?
Not only does AWS offer significantly more services than any other cloud, with over 48 services without comparable equivalents on other clouds, but AWS also provides better ways to use Microsoft products than any other cloud. This includes Active Directory as a managed service and FSx for Windows File Server, the only fully managed file storage service for Windows. If you’re interested in learning more about how AWS improves the Windows experience, please visit this article on our Modernizing with AWS blog.
Bring your Windows Server and SQL Server workloads to AWS for the most secure, reliable, and performant cloud, providing you with the depth and breadth of capabilities at the lowest cost. To learn more, visit Windows on AWS. Contact us today to learn more on how we can help you move your Windows to AWS or innovate on open source solutions.
About the Author Fred Wurden is the GM of Enterprise Engineering (Windows, VMware, Red Hat, SAP, benchmarking) working to make AWS the most customer-centric cloud platform on Earth. Prior to AWS, Fred worked at Microsoft for 17 years and held positions, including: EU/DOJ engineering compliance for Windows and Azure, interoperability principles and partner engagements, and open source engineering. He lives with his wife and a few four-legged friends since his kids are all in college now.
In this month’s issue of AWS Architecture Monthly, Worldwide Tech Lead for Agriculture, Karen Hildebrand (who’s also a fourth generation farmer) refers to agriculture as “the connective tissue our world needs to survive.” As our expert for August’s Agriculture issue, she also talks about what role cloud will play in future development efforts in this industry and why developing personal connections with our AWS agriculture customers is one of the most important aspects of our jobs.
You’ll also buzz through the world of high tech beehives, milk the information about data analytics-savvy cows, and see what the reference architecture of a Smart Farm looks like.
In August’s Agriculture issue
Ask an Expert: Karen Hildebrand, AWS WW Agriculture Tech Leader
Customer Success Story: Tine & Crayon: Revolutionizing the Norwegian Dairy Industry Using Machine Learning on AWS
Blog Post: Beewise Combines IoT and AI to Offer an Automated Beehive
Reference Architecture: Smart Farm: Enabling Sensor, Computer Vision, and Edge Inference in Agriculture
Customer Success Story: Farmobile: Empowering the Agriculture Industry Through Data
Blog Post: The Cow Collar Wearable: How Halter benefits from FreeRTOS
Related Videos: DuPont, mPrest & Netafim, and Veolia
This month, we’re also asking you to take a 10-question survey about your experiences with this magazine. The survey is hosted by an external company (Qualtrics), so the below survey button doesn’t lead to our website. Please note that AWS will own the data gathered from this survey, and we will not share the results we collect with survey respondents. Your responses to this survey will be subject to Amazon’s Privacy Notice. Please take a few moments to give us your opinions.
AWS makes it fast and easy for the Media and Entertainment (M&E) industry to produce, process, and deliver broadcast and over-the-top video. These pay-as-you-go services and appliance products offer the video infrastructure you need to deliver great viewing experiences to any screen.
For June’s issue of AWS Architecture Monthly, WW Tech Leader Konstantin Wilms talks about industry and architectural pattern trends in the cloud-based M&E space, and explains why this industry is unique in that almost any business problem can be solved using many AWS services. He also offers advice on what prospective customers should be thinking about and considering when moving to AWS.
In June’s Media & Entertainment issue
Ask an Expert: Konstantin Wilms, WW Tech Leader, AWS M&E
Blog: Deploying Your Favorite Post Production Applications on AWS Virtual Desktop Infrastructure
Case Study: L Benfica: Launching a Full-Featured VOD Platform in Just Four Weeks
Quick Start: Cloud Video Editing on AWS
Tutorial: Reimagine Your Studio
Whitepaper: Building Media & Entertainment Predictive Analytics Solutions on AWS
IP networking is often seen as a means to an end, an abstract aspect of your business. You don’t say, “I really want a fast network…just to have a fast network.” Quite the contrary. As a business, you set out to accomplish your mission and goals, and then find you need applications to get there. In turn, these applications must speak to each other. Hence, the requirement for network connectivity is born. But networking is often overlooked; it’s often seen as a byproduct of the main objective of the business. But don’t be fooled—the network is a necessary component of any application.
If we did a comparison of an IP network to a house, the network could be analogous to the plumbing of the house, a foundational component. It must be planned somewhere between the architectural blueprints for the structure of the house and the carpenters closing and finishing the walls. It would be a major reconstruction to open the walls to fix the plumbing. This may be an overly simplified analogy, but it helps illustrate the headache involved for application teams if they forget to spend the time properly designing their network architecture.
Amazon Virtual Private Cloud (Amazon VPC) is a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network. You have complete control to customize and design software-defined networks based on your requirements.
Customers can launch VPCs through the AWS Management Console or use APIs to automate provisioning and operations. Within an AWS account, there is a default VPC where you can launch Amazon Elastic Compute Cloud (Amazon EC2) instances and other AWS resources, without any network configuration. While the default VPC is a great way to start, most customers need multiple isolated networks, which translates into having multiple VPCs (for example, creating a different VPC for prod and non-prod). As an application scales or must be integrated, different connectivity needs arise.
Customers are adopting a multi-account strategy and using the “AWS account” as a logical boundary for segmentation. AWS accounts can segment different environments (dev, test, and prod) and different applications or application teams, while also increasing security and helping to simplify pricing. This influences the networking architecture because every AWS account has one or more VPCs, and oftentimes these VPCs must connect to each other.
Network architecture should be carefully planned before provisioning applications to avoid running into common pitfalls, including:
#1: Overlapping IPs are never fun
When it comes to designing network architectures, there is one universal truth: don’t use overlapping IP addresses. It doesn’t end well. When you want these applications to talk to each other, it won’t work without a Network Address Translation (NAT). You either have NAT in one direction or you may end up with NAT in two directions. This doesn’t scale; it adds an unnecessary operational burden to network operators that slows them down. In addition, if you find yourself in this situation, it’s a tedious process to get out of. The best option is to design a new network environment and slowly migrate resources to it.
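A quick way to catch overlapping ranges before provisioning is to check planned CIDR blocks against each other. The sketch below uses Python's standard ipaddress module; the network names and CIDR values are made up for illustration.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple


def find_overlaps(cidrs: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return every pair of named networks whose address ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical address plan: "dev" sits inside the "prod" range.
plan = {
    "prod": "10.0.0.0/16",
    "dev": "10.0.128.0/17",
    "shared": "10.1.0.0/16",
}
print(find_overlaps(plan))
```

Running a check like this against on-premises ranges as well as VPC ranges can surface conflicts while they are still cheap to fix.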
#2: DNS must be planned for hybrid architectures
Customers migrating to the cloud can have Domain Name System (DNS) architectures that require a hybrid approach that considers the on-premises DNS domains and the requirements for cloud resources to share domain names. Often, on-premises resources need to resolve cloud resources and the other way around. AWS has simplified this with the introduction of Amazon Route 53 Resolver endpoints, which can forward DNS queries to, or receive them from, on-premises servers. This removes the burden of customers having to manage EC2 instances for the purposes of DNS forwarding.
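To illustrate the kind of planning hybrid DNS requires, the following sketch models conditional forwarding as a lookup from domain suffix to target, choosing the most specific matching rule. This is a simplified model written for illustration, not the Route 53 Resolver API; the function name, domain names, and target labels are all hypothetical.

```python
from typing import Dict, Optional


def select_forwarding_target(query_name: str, rules: Dict[str, str]) -> Optional[str]:
    """Pick the target of the most specific rule whose domain matches the query."""
    labels = query_name.rstrip(".").lower().split(".")
    # Try the full name first, then progressively shorter suffixes.
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in rules:
            return rules[candidate]
    return None  # No rule matched; fall back to default resolution.


# Hypothetical rules: corp names are answered on premises, the rest in the cloud.
rules = {
    "corp.example.com": "on-premises-dns",
    "example.com": "cloud-dns",
}
print(select_forwarding_target("db1.corp.example.com.", rules))
print(select_forwarding_target("www.example.com", rules))
```

Writing the rules down in this form early makes it easier to spot domains that both sides expect to be authoritative for.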
#3: Scalable architectures require dedicated network teams for architecture and operations
As networks grow, they become more important to the business. One of the age-old questions remains true: how much money does your business lose for every minute, every hour, and every day that the applications are offline? This helps you clearly see how dependent you are on the uptime and availability of your network. The optimal network architecture puts the control and operations of network management into the hands of the network team and enables application developers to focus on building great applications.
#4: AWS is constantly evolving and innovating, and so are network architectures
New services can simplify architectures, improve performance, and save costs. A service worth mentioning is AWS Transit Gateway, which lets you scale connectivity across thousands of Amazon VPCs, AWS accounts, and on-premises networks. Transit Gateway introduces a hub-and-spoke architecture in which the gateway is the hub and your VPCs are the spokes. It is highly scalable and performant, and it simplifies network operations. In addition, it enables advanced network architectures such as centralized firewalls for outbound (or egress) filtering, centralized NAT, or multicast.
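A rough way to see why a hub-and-spoke design simplifies operations is to count connections. The sketch below compares a hub-and-spoke layout with a full mesh, where every VPC connects directly to every other VPC. The full-mesh baseline is an assumption introduced here for comparison, and the function names are illustrative.

```python
def full_mesh_connections(vpc_count: int) -> int:
    """Pairwise connections needed so every VPC reaches every other VPC directly."""
    return vpc_count * (vpc_count - 1) // 2


def hub_and_spoke_attachments(vpc_count: int) -> int:
    """One attachment per VPC when a central hub routes between them."""
    return vpc_count


for count in (5, 10, 100):
    mesh = full_mesh_connections(count)
    spokes = hub_and_spoke_attachments(count)
    print(f"{count} VPCs: {mesh} mesh connections vs {spokes} hub attachments")
```

The mesh grows quadratically with the number of VPCs while the hub-and-spoke count grows linearly, which is where the operational savings come from.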
For other designs, teams and applications may require tighter integration. That would be a good use case for VPC sharing. VPC sharing allows multiple AWS accounts to provision resources in the same centrally managed VPC. VPC sharing can allow network administrators to centralize network resources and have complete control. With a multi-account strategy, more accounts can be created and VPCs can be shared from the network team across to the application and development teams. This gives the network operations team complete control over the architecture and operations of the networking, and ensures that application teams don’t need to worry about properly configuring the network.
#5: Every network design is different
The AWS Well-Architected Framework provides a methodology to help guide customers to build resilient and fault-tolerant applications. But often, the pillars of the Well-Architected Framework are at odds with each other. For example, building in resiliency may compete with cost optimization. You must define your own requirements. If you’re a large customer, you may prefer to build with more complexity because it keeps costs lower at high scale. In contrast, others may prefer to architect for simplicity and performance with a few added costs.
In closing, I recommend you read the whitepaper, Building a Scalable and Secure Multi-VPC AWS Network Infrastructure. We wrote this paper to help you create scalable, performant, highly available, and secure cloud networks that take your business and applications to the next level. We are in the midst of a technology revolution, and we are your partners in the journey.
Many applications start to grow in complexity as they mature, making it harder for developers to maintain code or add new features. This can lead to monolithic applications, where developers must know more about the entire architecture to make changes. Typically, this causes code to become more fragile, and the rate of development slows down.
This blog post shows how you can use an event-based architecture to decouple services and functional areas of applications. It uses the document repository solution as an example, to compare architecture after shifting to an event-based approach. The new architecture offers both greater extensibility and simplicity for developers adding new functionality in the future. It can help alleviate the problems associated with monolithic applications.
There are some limitations with this design. First, there is a single source bucket for documents, which may not reflect production usage. Also, while it could be modified to allow new file types for indexing, adding new functionality such as translating documents would require refactoring. And despite having multiple Lambda functions, it’s packaged as a single application, which makes it harder to deploy changes.
The new design uses events to decouple each service used to process incoming S3 objects. It can also use one or more buckets as event sources, which you can change dynamically as needed. Most importantly, it can be easier to introduce changes and new functionality, since the application is no longer deployed as a mono-repo. The new architecture uses this design:
Setup and configuration of AWS resources.
Parser function to filter and reformat S3 events for the application.
Converter functions to operate on distinct file types.
Analyzer functions for interpreting the content of the files.
Loader function to import the metadata into Amazon Elasticsearch Service.
The resulting solution is five separate applications, which you deploy in stages. To set up the application, visit the GitHub repo and follow the instructions in the README.md file.
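As a rough illustration of the Parser step listed above, the following sketch filters an incoming event and derives a file type from the object key. It is not the code from the GitHub repo. It assumes the event carries the S3 bucket and key under detail.requestParameters, as S3 API calls recorded by CloudTrail generally do, and the output field names (bucket, key, type) are hypothetical.

```python
import os
from typing import Any, Dict, Optional


def parse_s3_event(event: Dict[str, Any]) -> Optional[Dict[str, str]]:
    """Filter an S3 put event and reformat it with a file-type attribute."""
    detail = event.get("detail", {})
    params = detail.get("requestParameters") or {}
    bucket = params.get("bucketName")
    key = params.get("key")
    # Ignore anything that is not an object being stored.
    if detail.get("eventName") != "PutObject" or not bucket or not key:
        return None
    # Derive the file type from the extension, e.g. "Q1.PDF" -> "pdf".
    extension = os.path.splitext(key)[1].lstrip(".").lower()
    if not extension:
        return None
    return {"bucket": bucket, "key": key, "type": extension}


sample = {
    "detail": {
        "eventName": "PutObject",
        "requestParameters": {"bucketName": "docs", "key": "reports/Q1.PDF"},
    }
}
print(parse_s3_event(sample))
```

Downstream services can then match on the derived type without needing to understand the original event format.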
Setup and configuration
The SAM template in the setup directory creates the S3 buckets, and configures AWS CloudTrail to capture put events in these buckets. This is required as EventBridge consumes S3 events via CloudTrail. Now, when any object is stored in any of these S3 buckets, EventBridge receives an event.
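As a rough illustration of how S3 activity reaches EventBridge through CloudTrail, the sketch below builds an event pattern that matches PutObject API calls for a list of buckets. The function name and bucket names are hypothetical, and the rules in the repo may be written differently.

```python
import json


def s3_put_event_pattern(bucket_names):
    """Build an EventBridge event pattern for S3 PutObject calls recorded by CloudTrail.

    bucket_names is a hypothetical list of source bucket names.
    """
    return {
        "source": ["aws.s3"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": ["s3.amazonaws.com"],
            "eventName": ["PutObject"],
            "requestParameters": {"bucketName": list(bucket_names)},
        },
    }


if __name__ == "__main__":
    # Placeholder bucket names, for illustration only.
    pattern = s3_put_event_pattern(["docrepo-source-1", "docrepo-source-2"])
    print(json.dumps(pattern, indent=2))
```

Adding a bucket to the list widens the pattern without touching any downstream function.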
The setup template also creates a customer managed IAM policy that grants read-only access to the source S3 buckets. This policy can be attached to any Lambda function that must read the contents from one of the S3 buckets. If the pool of source buckets changes in the future, you only need to modify this policy. Any downstream Lambda functions using the policy automatically gain access to the added buckets.
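To make the idea of a shared read-only policy concrete, here is a minimal sketch that generates an IAM policy document for a list of source buckets. The function name and bucket names are assumptions for illustration, and the actual policy in the SAM template may grant a different set of actions.

```python
import json


def read_only_bucket_policy(bucket_names):
    """Return an IAM policy document allowing s3:GetObject on each bucket's objects.

    bucket_names is a hypothetical list; the real list lives in the setup template.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # One object-level ARN per source bucket.
                "Resource": [f"arn:aws:s3:::{name}/*" for name in bucket_names],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(read_only_bucket_policy(["docrepo-source-1"]), indent=2))
```

Because every function references the same policy, extending the bucket list in one place extends access everywhere the policy is attached.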
In the second setup application, the Parser service receives those S3 events and reformats the event for downstream services. Specifically, it creates a new attribute for the file type of the S3 object. After you deploy these two templates, creating any objects in the source S3 buckets generates the following event in the default event bus:
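A minimal sketch of the Parser's reformatting step might look like the following. It assumes the incoming event is a CloudTrail PutObject record delivered by EventBridge, with the bucket and key under detail.requestParameters. The output field names (bucket, key, type) and the detail-type value are hypothetical; only the docRepo.s3 source comes from this post.

```python
import json
import os


def parse_s3_event(event):
    """Reformat a CloudTrail S3 PutObject event into an entry for downstream services."""
    params = event["detail"]["requestParameters"]
    key = params["key"]
    # Derive the file type from the object key's extension, e.g. "pdf" or "docx".
    file_type = os.path.splitext(key)[1].lstrip(".").lower()
    return {
        "Source": "docRepo.s3",
        "DetailType": "PutObject",  # hypothetical detail-type
        "Detail": json.dumps(
            {"bucket": params["bucketName"], "key": key, "type": file_type}
        ),
    }


if __name__ == "__main__":
    sample = {
        "detail": {
            "requestParameters": {"bucketName": "docrepo-source-1", "key": "reports/Q1.PDF"}
        }
    }
    print(parse_s3_event(sample))
```

In a Lambda function, the returned entry could be passed to the boto3 EventBridge client with put_events(Entries=[entry]).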
Building the converter processes
This application uses converters to process incoming objects in the S3 buckets. One converter handles one file type. There are two converters required to replicate the original application’s functionality, for pdf and docx files. An EventBridge rule matches incoming events and triggers the appropriate Lambda function to convert the object. This diagram shows abridged input and output events for these functions:
A matching EventBridge rule invokes the relevant converter function. The function converts the source file into raw text.
The text is split into batches of 5,000 characters.
The functions publish the text batches back to EventBridge, using new detail-type and source attributes.
The SAM template specifies the EventBridge rules, the permissions for EventBridge to invoke the Lambda functions, and the processing Lambda functions. The Lambda functions use the customer managed IAM policy created during the setup for read-only access to the originating S3 bucket. Each converter has its own logic for processing file types differently, and can produce different types of events if needed.
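The batching step can be sketched as follows. This assumes the converter has already extracted raw text from the source file. The NewTextBatch and docRepo.converters names come from this post, while the function names and the fields inside Detail are hypothetical.

```python
import json

BATCH_SIZE = 5000  # characters per batch, as described above


def split_into_batches(text, size=BATCH_SIZE):
    """Split text into consecutive chunks of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def build_batch_entries(bucket, key, text):
    """Build one EventBridge entry per text batch (Detail field names are hypothetical)."""
    batches = split_into_batches(text)
    return [
        {
            "Source": "docRepo.converters",
            "DetailType": "NewTextBatch",
            "Detail": json.dumps(
                {
                    "bucket": bucket,
                    "key": key,
                    "batch": number,
                    "totalBatches": len(batches),
                    "text": chunk,
                }
            ),
        }
        for number, chunk in enumerate(batches, start=1)
    ]


if __name__ == "__main__":
    entries = build_batch_entries("docrepo-source-1", "report.pdf", "x" * 12001)
    print(len(entries), "entries")
```

Note that put_events accepts at most 10 entries per call, so a converter handling long documents would need to send these entries in groups.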
The analyzer functions
In this workflow, any file type containing text is analyzed by Amazon Comprehend to detect entities. The AnalyzeText function is invoked by an EventBridge rule. The rule filters for the NewTextBatch attribute in events from docRepo.converters.
Another EventBridge rule triggers the AnalyzeImage function. This rule filters for jpg and jpeg file types where the event source is docRepo.s3. The function uses Amazon Rekognition to identify labels in the images.
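To illustrate how these two rules route events, the sketch below expresses them as event patterns and checks sample events against them. It assumes NewTextBatch is the detail-type and that the file type is carried in a hypothetical detail.type field. The matches function is a simplified stand-in that handles exact-value matching only, not EventBridge's full pattern syntax.

```python
ANALYZE_TEXT_PATTERN = {
    "source": ["docRepo.converters"],
    "detail-type": ["NewTextBatch"],  # assumed to be the detail-type
}

ANALYZE_IMAGE_PATTERN = {
    "source": ["docRepo.s3"],
    "detail": {"type": ["jpg", "jpeg"]},  # hypothetical field name
}


def matches(pattern, event):
    """Return True if every field in the pattern matches the event.

    Leaf values in a pattern are lists of allowed values; nested dicts recurse.
    """
    for field, expected in pattern.items():
        if field not in event:
            return False
        actual = event[field]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


if __name__ == "__main__":
    image_event = {"source": "docRepo.s3", "detail": {"type": "jpg", "key": "cat.jpg"}}
    print(matches(ANALYZE_IMAGE_PATTERN, image_event))
    print(matches(ANALYZE_TEXT_PATTERN, image_event))
```

An event only reaches a function when every field in that rule's pattern matches, so the same bus can carry both text and image events without the analyzers seeing each other's traffic.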
Both functions produce new events containing the entities and labels, using new detail-type and source attributes. These events are published back to the default bus on EventBridge:
A matching EventBridge rule invokes the relevant analyzer function. The function uses Amazon ML services to detect labels in images and entities in text.
The functions publish the metadata back to EventBridge, using new detail-type and source attributes.
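The two analyzer steps can be sketched as follows. The helper names and returned field names are hypothetical. The clients are passed in as parameters so the logic can be exercised without AWS credentials; in a Lambda function they would typically be created with boto3.client("comprehend") and boto3.client("rekognition").

```python
def analyze_text(comprehend, text, language_code="en"):
    """Detect entities in a text batch using a Comprehend client."""
    response = comprehend.detect_entities(Text=text, LanguageCode=language_code)
    return [
        {"text": entity["Text"], "type": entity["Type"], "score": entity["Score"]}
        for entity in response["Entities"]
    ]


def analyze_image(rekognition, bucket, key, min_confidence=80.0):
    """Detect labels in an image stored in S3 using a Rekognition client.

    min_confidence is an illustrative threshold, not a value from the original post.
    """
    response = rekognition.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=min_confidence,
    )
    return [
        {"name": label["Name"], "confidence": label["Confidence"]}
        for label in response["Labels"]
    ]
```

Each function returns plain metadata that can then be wrapped in a new event and published back to the default bus.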
The Loader function is invoked by an EventBridge rule that filters for events from the analyzer functions. This final function receives those events and loads the labels and entities metadata into Amazon Elasticsearch Service:
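As one possible shape for the loading step, this sketch turns analyzer events into a newline-delimited body for the Elasticsearch _bulk API. It assumes Elasticsearch 7.x or later (where mapping types are not required), and the index name and event field names are hypothetical. The Loader in the repo may index documents differently.

```python
import json


def to_bulk_body(events, index="documents"):
    """Convert analyzer events into an NDJSON body for the _bulk API.

    Each document takes two lines: an action line, then the document source.
    """
    lines = []
    for event in events:
        detail = event["detail"]
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(
            json.dumps(
                {
                    "bucket": detail["bucket"],
                    "key": detail["key"],
                    "labels": detail.get("labels", []),
                    "entities": detail.get("entities", []),
                }
            )
        )
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    sample = [{"detail": {"bucket": "docrepo-source-1", "key": "cat.jpg", "labels": ["Cat"]}}]
    print(to_bulk_body(sample), end="")
```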
Choosing between AWS Step Functions and Amazon EventBridge
In this application, there is a sequence of steps to the workflow that could also be handled by AWS Step Functions. Both services can simplify workflows in distributed applications and make it easier to maintain and modify serverless applications. In many cases, it makes sense to use both services for larger enterprise applications with complex business logic.
However, EventBridge enables you to separate processes into independent applications. It also allows other consumers to build custom logic using your events without impacting your application design or performance. In enterprise applications, this makes it much easier to innovate and develop new application features.
Benefits for developers
With the original monolithic application divided into five separate applications, it’s now easier for different teams to work on this project. It’s also easier and safer to deploy changes to a single microservice without needing to deploy the entire application. Developers only need to understand their own service rather than the complete architecture of the application.
For example, to add more S3 buckets to the source list, you only need to modify the SAM template in the setup part of the application. The Parser function consumes put events from any number of buckets, and downstream functions consume events via EventBridge. To add a new file type, you only need to add a new converter function. Or to change the indexing provider, you create a new loader function to route the metadata to another service. The services of this application are independent, decoupled by EventBridge, and you can add more producers and consumers as required.
Traditionally, one of the challenges with event-based applications is tracking the format of events. Event schemas are typically hard to manage because any service can produce an event. The schema may also change as developers release new versions of a service. To help solve these issues, EventBridge has a feature called schema discovery that can automate the tracking and management of events in your application.
All the microservices in this application publish with a source attribute prefixed with docRepo. If you enable schema discovery, EventBridge quickly identifies these custom event schemas:
The schemas are defined in JSON using the OpenAPI Specification. As you develop new features, you can download code bindings directly from these schemas. For type-safe languages, this allows you to use events as objects directly in your applications, helping to accelerate development. To learn more about how to use code bindings and schema discovery, watch this video:
Larger applications can quickly become monoliths. You can use event-based architectures to decouple services within applications, and maintain flexibility as your application grows. Amazon EventBridge is a serverless event bus that can help simplify your architecture, allowing each service to operate independently with no dependence on event consumers.
In this post, I show how to rearchitect the Serverless Document Repository example into five smaller applications orchestrated using events. I explore the benefits of developing applications using this approach, including the ability to make changes more easily. I also show how EventBridge schema discovery can help automate event schema management.