Tag Archives: load balancing

Cloudflare Partners: A New Program with New Partners

Post Syndicated from Dan Hollinger original https://blog.cloudflare.com/cloudflare-partners-a-new-program-with-new-partners/

Cloudflare Partners: A New Program with New Partners

Many overlook a critical portion of the language in Cloudflare’s mission: “to help build a better Internet.” From the beginning, we knew a mission this bold, an undertaking of this magnitude, couldn’t be done alone. We could only help. To ultimately build a better Internet, it would take a diverse and engaged ecosystem of technologies, customers, partners, and end-users. Fortunately, we’ve been able to work with amazing partners as we’ve grown, and we are eager to announce new, specific programs to grow our ecosystem with an increasingly diverse set of partners.

Today, we’re excited to announce the latest iteration of our partnership program for solutions partners. These categories encompass resellers and referral partners, OEM partners, and the new partner services program. Over the past few years, we’ve grown and learned from some amazing partnerships, and want to bring those best practices to our latest partners at scale—to help them grow their business with Cloudflare’s global network.

Cloudflare Partners: A New Program with New Partners
Cloudflare Partner Tiers

Partner Program for Solution Partners

Every partner program out there has tiers, and Cloudflare’s program is no exception. However, our tiering was built to help our partners ramp up, accelerate and move fast. As Matt Harrell highlighted, we want the process to be as seamless as possible, to help partners find the level of engagement that works best for them‚ with world-class enablement paths and best-in-class revenue share models—built-in from the beginning.

World-Class Enablement

Cloudflare offers complimentary training and enablement to all partners. From self-serve paths, to partner-focused webinars, and instructor-based courses, and certification—we want to ensure our partners can learn and develop Cloudflare and product expertise, to make them as effective as possible when utilizing our massive global network.

Driving Business Value

We want our partners to grow and succeed. From self-serve resellers to our most custom integration, we want to make it frictionless to build a profitable business on Cloudflare. From our tenant API system to dedicated account teams, we’re ready to help you go-to-market with solutions that help end-customers. This includes opportunities to co-market, develop target accounts, and directly partner, to succeed with strategic accounts.

Cloudflare recognizes that, in order to help build a better Internet, we need successful partners—and our latest program is built to help partners build and grow profitable business models around Cloudflare’s solutions.

Partner Services Program – SIs, MSSPs, MSSPs, PSOs

For the first time, we are expanding our program to also include service providers that want to develop and grow profitable services practices around Cloudflare’s solutions.

Our customers face some of the most complex challenges on the Internet. From those challenges, we’ve already seen some amazing opportunities for service providers to create value, grow their business, and make an impact. From customers migrating to the public, hybrid or multi-cloud for the first time, to entirely re-writing applications using Cloudflare Workers®, the need for expertise has increased across our entire customer base. In early pilots, we’ve seen four major categories in building a successful service practice around Cloudflare:

  • Network Digital Transformations – Help customers migrate and modernize their network solution. Cloudflare is the only cloud-native network to give Enterprise-grade control and visibility into on-prem, hybrid, and multi-cloud architectures.
  • Serverless Architecture Development – Provide serverless application development services, thought leadership, and technical consulting around leveraging Cloudflare Workers and Apps.
  • Managed Security & Insights – Enable CISO and IT leaders to obtain single pane of glass of reporting and policy management across all Internet-facing application with Cloudflare’s Security solutions.
  • Managed Performance & Reliability – Keep customer’s High-Availability applications running quickly and efficiently with Cloudflare’s Global Load Balancing, Smart Routing, and Anycast DNS, which allows performance consulting, traffic analysis, and application monitoring.

As we expand this program, we are looking for audacious service providers and system integrators that want to help us build a better Internet for our mutual customers. Cloudflare can be an essential lynchpin to simplify and accelerate digital transformations. We imagine a future where massive applications run within 10ms of 90% of the global population, and where a single-pane solution provides security, performance, and reliability for mission-critical applications—running across multiple clouds. Cloudflare needs help from amazing global integrators and service partners to help realize this future.

If you are interested in learning more about becoming a service partner and growing your business with Cloudflare, please reach out to [email protected] or explore cloudflare.com/partners/services

Just Getting Started

Metcalf’s law states that a network is only as powerful as the amount of nodes within the network. And within the global Cloudflare network, we want as many partner nodes as possible—from agencies to systems integrators, managed security providers, Enterprise resellers, and OEMs. A diverse ecosystem of partners is essential to our mission of helping to build a better Internet, together. We are dedicated to the success of our partners, and we will continue to iterate and develop our programs to make sure our partners can grow and develop on Cloudflare’s global network. Our commitment moving forward is that Cloudflare will be the easiest and most valuable solution for channel partners to sell and support globally.


More Information:

MultiCloud… flare

Post Syndicated from Zack Bloom original https://blog.cloudflare.com/multicloudflare/

If you want to start an intense conversation in the halls of Cloudflare, try describing us as a “CDN”. CDNs don’t generally provide you with Load Balancing, they don’t allow you to deploy Serverless Applications, and they certainly don’t get installed onto your phone. One of the costs of that confusion is many people don’t realize everything Cloudflare can do for people who want to operate in multiple public clouds, or want to operate in both the cloud and on their own hardware.

Load Balancing

Cloudflare has countless servers located in 180 data centers around the world. Each one is capable of acting as a Layer 7 load balancer, directing incoming traffic between origins wherever they may be. You could, for example, add load balancing between a set of machines you have in AWS’ EC2, and another set you keep in Google Cloud.

This load balancing isn’t just round-robining traffic. It supports weighting to allow you to control how much traffic goes to each cluster. It supports latency-based routing to automatically route traffic to the cluster which is closer (so adding geographic distribution can be as simple as spinning up machines). It even supports health checks, allowing it to automatically direct traffic to the cloud which is currently healthy.

Most importantly, it doesn’t run in any of the provider’s clouds and isn’t dependent on them to function properly. Even better, since the load balancing runs near virtually every Internet user around the world it doesn’t come at any performance cost. (Using our Argo technology performance often gets better!).

Argo Tunnel

One of the hardest components to managing a multi-cloud deployment is networking. Each provider has their own method of defining networks and firewalls, and even tools which can deploy clusters across multiple clouds often can’t quite manage to get the networking configuration to work in the same way. The task of setting it up can often be a trial-and-error game where the final config is best never touched again, leaving ‘going multi-cloud’ as a failed experiment within organizations.

At Cloudflare we have a technology called Argo Tunnel which flips networking on its head. Rather than opening ports and directing incoming traffic, each of your virtual machines (or k8s pods) makes outbound tunnels to the nearest Cloudflare PoPs. All of your Internet traffic then flows over those tunnels. You keep all your ports closed to inbound traffic, and never have to think about Internet networking again.

What’s so powerful about this configuration is is makes it trivial to spin up machines in new locations. Want a dozen machines in Australia? As long as they start the Argo Tunnel daemon they will start receiving traffic. Don’t need them any more? Shut them down and the traffic will be routed elsewhere. And, of course, none of this relies on any one public cloud provider, making it reliable even if they should have issues.

Argo Tunnel makes it trivial to add machines in new clouds, or to keep machines on-prem even as you start shifting workloads into the Cloud.

Access Control

One thing you’ll realize about using Argo Tunnel is you now have secure tunnels which connect your infrastructure with Cloudflare’s network. Once traffic reaches that network, it doesn’t necessarily have to flow directly to your machines. It could, for example, have access control applied where we use your Identity Provider (like Okta or Active Directory) to decide who should be able to access what. Rather than wrestling with VPCs and VPN configurations, you can move to a zero-trust model where you use policies to decide exactly who can access what on a per-request basis.

In fact, you can now do this with SSH as well! You can manage all your user accounts in a single place and control with precision who can access which piece of infrastructure, irrespective of which cloud it’s in.

Our Reliability

No computer system is perfect, and ours is no exception. We make mistakes, have bugs in our code, and deal with the pain of operating at the scale of the Internet every day. One great innovation in the recent history of computers, however, is the idea that it is possible to build a reliable system on top of many individually unreliable components.

Each of Cloudflare’s PoPs is designed to function without communication or support from others, or from a central data center. That alone greatly increases our tolerance for network partitions and moves us from maintaining a single system to be closer to maintaining 180 independent clouds, any of which can serve all traffic.

We are also a system built on anycast which allows us to tap into the fundamental reliability of the Internet. The Internet uses a protocol called BGP which asks each system who would like to receive traffic for a particular IP address to ‘advertise’ it. Each router then will decide to forward traffic based on which person advertising an address is the closest. We advertise all of our IP addresses in every one of our data centers. If a data centers goes down, it stops advertising BGP routes, and the very same packets which would have been destined for it arrive in another PoP seamlessly.

Ultimately we are trying to help build a better Internet. We don’t believe that Internet is built on the back of a single provider. Many of the services provided by these cloud providers are simply too complex to be as reliable as the Internet demands.

True reliability and cost control both require existing on multiple clouds. It is clear that the tools which the Internet of the 80s and 90s gave us may be insufficient to move into that future. With a smarter network we can do more, better.

Designing resilient systems beyond retries (Part 2): Bulkheading, Load Balancing, and Fallbacks

Post Syndicated from Grab Tech original https://engineering.grab.com/beyond-retries-part-2

This post is the second of a three-part series on going beyond retries to improve system resiliency. We’ve previously discussed about rate-limiting as a strategy to improve resiliency. In this article, we will cover these techniques: bulkheading, load balancing, and fallbacks.

Introducing Bulkheading (Isolation)

Bulkheading is a fundamental pattern which underpins many other resiliency techniques, especially where microservices are concerned, so it’s worth introducing first. The term actually comes from an ancient technique in ship building, where a ship’s hull would be partitioned into several watertight compartments. If one of the compartments has a leak, then the water fills just that compartment and is contained, rather than flooding the entire ship. We can apply this principle to software applications and microservices: by isolating failures to individual components, we can prevent a single failure from cascading and bringing down the entire system.

Bulkheads also help to prevent single points of failure, by reducing the impact of any failures so services can maintain some level of service.

Level of bulkheads

It is important to note that bulkheads can be applied at multiple levels in software architecture. The two highest levels of bulkheads are at the infrastructure level, and the first is hardware isolation. In a cloud environment, this usually means isolating regions or availability zones. The second is isolating the operating system, which has become a widespread technique with the popularity of virtual machines and now containerization. Previously, it was common for multiple applications to run on a single (very powerful) dedicated server. Unfortunately, this meant that a rogue application could wreak havoc on the entire system in a number of ways, from filling the disk with logs to consuming memory or other resources.

Isolation can be achieved by applying bulkheading at multiple levels
Isolation can be achieved by applying bulkheading at multiple levels

 

This article focuses on resiliency from the application perspective, so below the system level is process-level isolation. In practical terms, this isolation prevents an application crash from affecting multiple system components. By moving those components into separate processes (or microservices), certain classes of application-level failures are prevented from causing cascading failure.

At the lowest level, and perhaps the most common form of bulkheading to software engineers, are the concepts of connection pooling and thread pools. While these techniques are commonly employed for performance reasons (reusing resources is cheaper than acquiring new ones), they also help to put a finite limit on the number of connections or concurrent threads that an operation is allowed to consume. This ensures that if the load of a particular operation suddenly increases unexpectedly (such as due to external load or downstream latency), the impact is contained to only a partial failure.

Bulkheading support in the Hystrix library

The Hystrix library for Go supports a form of bulkheading through its MaxConcurrentRequests parameter. This is conveniently tied to the circuit name, meaning that different levels of isolation can be achieved by choosing an appropriate circuit name. A good rule of thumb is to use a different circuit name for each operation or API call. This ensures that if just one particular endpoint of a remote service is failing, the other circuits are still free to be used for the remaining healthy endpoints, achieving failure isolation.

Load balancing

Global rate-limiting with a central server
Global rate-limiting with a central server

 

Load balancing is where network traffic from a client may be served by one of many backend servers. You can think of load balancers as traffic cops who distribute traffic on the road to prevent congestion and overload. Assuming the traffic is distributed evenly on the network, this effectively increases the computing power of the backend. Adding capacity like this is a common way to handle an increase in load from the clients, such as when a website becomes more popular.

Almost always, load balancers provide high availability for the application. When there is just a single backend server, this server is a ‘single point of failure’, because if it is ever unavailable, there are no servers remaining to serve the clients. However, if there is a pool of backend servers behind a load balancer, the impact is reduced. If there are 4 backend servers and only 1 is unavailable, evenly distributed requests would only fail 25% of the time instead of 100%. This is already an improvement, but modern load balancers are more sophisticated.

Usually, load balancers will include some form of a health check. This is a mechanism that monitors whether servers in the pool are ‘healthy’, ie. able to serve requests. The implementations for the health check vary, but this can be an active check such as sending ‘pings’, or passive monitoring of responses and removing the failing backend server instances.

As with rate-limiting, there are many strategies for load balancing to consider.

There are four main types of load balancer to choose from, each with their own pros and cons:

  • Proxy. This is perhaps the most well-known form of load-balancer, and is the method used by Amazon’s Elastic Load Balancer. The proxy sits on the boundary between the backend servers and the public clients, and therefore also doubles as a security layer: the clients do not know about or have direct access to the backend servers. The proxy will handle all the logic for load balancing and health checking. It is a very convenient and popular approach because it requires no special integration with the client or server code. They also typically perform ‘SSL termination’, decrypting incoming HTTPS traffic and using HTTP to communicate with the backend servers.
  • Client-side. This is where the client performs all of the load-balancing itself, often using a dedicated library built for the purpose. Compared with the proxy, it is more performant because it avoids an extra network ‘hop.’ However, there is a significant cost in developing and maintaining the code, which is necessarily complex and any bugs have serious consequences.
  • Lookaside. This is a hybrid approach where the majority of the load-balancing logic is handled by a dedicated service, but it does not proxy; the client still makes direct connections to the backend. This reduces the burden of the client-side library but maintains high performance, however the load-balancing service becomes another potential point of failure.
  • Service mesh with sidecar. A service mesh is an all-in-one solution for service communication, with many popular open-source products available. They usually include a sidecar, which is a proxy that sits on the same server as the application to route network traffic. Like the traditional proxy load balancer, this handles many concerns of load-balancing for free. However, there is still an extra network hop, and there can be a significant development cost to integrate with existing systems for logging, reporting and so on, so this must be weighed against building a client-side solution in-house.
Comparison of load-balancer architectures
Comparison of load-balancer architectures

 

Grab’s load-balancing implementation

At Grab, we have built our own internal client-side solution called CSDP, which uses the distributed key-value store etcd as its backend store.

Fallbacks

There are scenarios when simply retrying a failed API call doesn’t work. If the remote server is completely down or only returning errors, no amount of retries are going to help; the failure is unrecoverable. When recovery isn’t an option, mitigation is an alternative. This is related to the concept of graceful degradation: sometimes it is preferable to return a less optimal response than fail completely, especially for user-facing applications where user experience is important.

One such mitigation strategy is fallbacks. This is a broad topic with many different sub-strategies, but here are a few of the most common:

Fail silently

Starting with the easiest to implement, one basic fallback strategy is fail silently. This means returning an empty or null response when an error is encountered, as if the call had succeeded. If the data being requested is not critical functionality then this can be considered: missing part of a UI is less noticeable than an error page! For example, UI bubbles showing unread notifications are a common feature. But if the service providing the notifications is failing and the bubble shows 0 instead of N notifications, the user’s experience is unlikely to be significantly affected.

Local computation

A second fallback strategy when a downstream dependency is failing could be to compute the value locally instead. This could mean either returning a default (static) value, or using a simple formula to compute the response. For example, a marketplace application might have a service to calculate shipping costs. If it is unavailable, then using a default price might be acceptable. Or even $0 – users are unlikely to complain about errors that benefit them, and it’s better than losing business!

Cached values

Similarly, cached values are often used as fallbacks. If the service isn’t available to calculate the most up to date value, returning a stale response might be better than returning nothing. If an application is already caching the value with a short expiration to optimize performance, it can be reused as a fallback cache by setting two expiration times: one for normal circumstances, and another when the service providing the response has failed.

Backup service

Finally, if the response is too complex to compute locally or if major functionality of the application is required to have a fallback, then an entirely new service can act as a fallback; a backup service. Such a service is a big investment, so to make it worthwhile some trade-offs must be accepted. The backup service should be considerably simpler than the service it is intended to replace; if it is too complex then it will require constant testing and maintenance, not to mention documentation and training to make sure it is well understood within the engineering team. Also, a complex system is more likely to fail when activated. Usually such systems will have very few or no dependencies, and certainly should not depend on any parts of the original system, since they could have failed, rendering the backup system useless.

Grab’s fallback implementation

At Grab, we make use of various fallback strategies in our services. For example, our microservice framework Grab-Kit has built-in support for returning cached values when a downstream service is unresponsive. We’ve even built a backup service to replicate our core functionality, so we can continue to serve customers despite severe technical difficulties!

Up next, Architecture Patterns and Chaos Engineering…

We’ve covered various techniques in designing reliable and resilient systems in the previous articles. I hope you found them useful. Comments are always welcome.

In our next post, we will look at ways to prevent and reduce failures through architecture patterns and testing.

Please stay tuned!

AWS Achieves Spain’s ENS High Certification Across 29 Services

Post Syndicated from Oliver Bell original https://aws.amazon.com/blogs/security/aws-achieves-spains-ens-high-certification-across-29-services/

AWS has achieved Spain’s Esquema Nacional de Seguridad (ENS) High certification across 29 services. To successfully achieve the ENS High Standard, BDO España conducted an independent audit and attested that AWS meets confidentiality, integrity, and availability standards. This provides the assurance needed by Spanish Public Sector organizations wanting to build secure applications and services on AWS.

The National Security Framework, regulated under Royal Decree 3/2010, was developed through close collaboration between ENAC (Entidad Nacional de Acreditación), the Ministry of Finance and Public Administration and the CCN (National Cryptologic Centre), and other administrative bodies.

The following AWS Services are ENS High accredited across our Dublin and Frankfurt Regions:

  • Amazon API Gateway
  • Amazon DynamoDB
  • Amazon Elastic Container Service
  • Amazon Elastic Block Store
  • Amazon Elastic Compute Cloud
  • Amazon Elastic File System
  • Amazon Elastic MapReduce
  • Amazon ElastiCache
  • Amazon Glacier
  • Amazon Redshift
  • Amazon Relational Database Service
  • Amazon Simple Queue Service
  • Amazon Simple Storage Service
  • Amazon Simple Workflow Service
  • Amazon Virtual Private Cloud
  • Amazon WorkSpaces
  • AWS CloudFormation
  • AWS CloudTrail
  • AWS Config
  • AWS Database Migration Service
  • AWS Direct Connect
  • AWS Directory Service
  • AWS Elastic Beanstalk
  • AWS Key Management Service
  • AWS Lambda
  • AWS Snowball
  • AWS Storage Gateway
  • Elastic Load Balancing
  • VM Import/Export

New AWS Auto Scaling – Unified Scaling For Your Cloud Applications

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-auto-scaling-unified-scaling-for-your-cloud-applications/

I’ve been talking about scalability for servers and other cloud resources for a very long time! Back in 2006, I wrote “This is the new world of scalable, on-demand web services. Pay for what you need and use, and not a byte more.” Shortly after we launched Amazon Elastic Compute Cloud (EC2), we made it easy for you to do this with the simultaneous launch of Elastic Load Balancing, EC2 Auto Scaling, and Amazon CloudWatch. Since then we have added Auto Scaling to other AWS services including ECS, Spot Fleets, DynamoDB, Aurora, AppStream 2.0, and EMR. We have also added features such as target tracking to make it easier for you to scale based on the metric that is most appropriate for your application.

Introducing AWS Auto Scaling
Today we are making it easier for you to use the Auto Scaling features of multiple AWS services from a single user interface with the introduction of AWS Auto Scaling. This new service unifies and builds on our existing, service-specific, scaling features. It operates on any desired EC2 Auto Scaling groups, EC2 Spot Fleets, ECS tasks, DynamoDB tables, DynamoDB Global Secondary Indexes, and Aurora Replicas that are part of your application, as described by an AWS CloudFormation stack or in AWS Elastic Beanstalk (we’re also exploring some other ways to flag a set of resources as an application for use with AWS Auto Scaling).

You no longer need to set up alarms and scaling actions for each resource and each service. Instead, you simply point AWS Auto Scaling at your application and select the services and resources of interest. Then you select the desired scaling option for each one, and AWS Auto Scaling will do the rest, helping you to discover the scalable resources and then creating a scaling plan that addresses the resources of interest.

If you have tried to use any of our Auto Scaling options in the past, you undoubtedly understand the trade-offs involved in choosing scaling thresholds. AWS Auto Scaling gives you a variety of scaling options: You can optimize for availability, keeping plenty of resources in reserve in order to meet sudden spikes in demand. You can optimize for costs, running close to the line and accepting the possibility that you will tax your resources if that spike arrives. Alternatively, you can aim for the middle, with a generous but not excessive level of spare capacity. In addition to optimizing for availability, cost, or a blend of both, you can also set a custom scaling threshold. In each case, AWS Auto Scaling will create scaling policies on your behalf, including appropriate upper and lower bounds for each resource.

AWS Auto Scaling in Action
I will use AWS Auto Scaling on a simple CloudFormation stack consisting of an Auto Scaling group of EC2 instances and a pair of DynamoDB tables. I start by removing the existing Scaling Policies from my Auto Scaling group:

Then I open up the new Auto Scaling Console and selecting the stack:

Behind the scenes, Elastic Beanstalk applications are always launched via a CloudFormation stack. In the screen shot above, awseb-e-sdwttqizbp-stack is an Elastic Beanstalk application that I launched.

I can click on any stack to learn more about it before proceeding:

I select the desired stack and click on Next to proceed. Then I enter a name for my scaling plan and choose the resources that I’d like it to include:

I choose the scaling strategy for each type of resource:

After I have selected the desired strategies, I click Next to proceed. Then I review the proposed scaling plan, and click Create scaling plan to move ahead:

The scaling plan is created and in effect within a few minutes:

I can click on the plan to learn more:

I can also inspect each scaling policy:

I tested my new policy by applying a load to the initial EC2 instance, and watched the scale out activity take place:

I also took a look at the CloudWatch metrics for the EC2 Auto Scaling group:

Available Now
We are launching AWS Auto Scaling today in the US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), and Asia Pacific (Singapore) Regions today, with more to follow. There’s no charge for AWS Auto Scaling; you pay only for the CloudWatch Alarms that it creates and any AWS resources that you consume.

As is often the case with our new services, this is just the first step on what we hope to be a long and interesting journey! We have a long roadmap, and we’ll be adding new features and options throughout 2018 in response to your feedback.

Jeff;

Scale Your Web Application — One Step at a Time

Post Syndicated from Saurabh Shrivastava original https://aws.amazon.com/blogs/architecture/scale-your-web-application-one-step-at-a-time/

I often encounter people experiencing frustration as they attempt to scale their e-commerce or WordPress site—particularly around the cost and complexity related to scaling. When I talk to customers about their scaling plans, they often mention phrases such as horizontal scaling and microservices, but usually people aren’t sure about how to dive in and effectively scale their sites.

Now let’s talk about different scaling options. For instance if your current workload is in a traditional data center, you can leverage the cloud for your on-premises solution. This way you can scale to achieve greater efficiency with less cost. It’s not necessary to set up a whole powerhouse to light a few bulbs. If your workload is already in the cloud, you can use one of the available out-of-the-box options.

Designing your API in microservices and adding horizontal scaling might seem like the best choice, unless your web application is already running in an on-premises environment and you’ll need to quickly scale it because of unexpected large spikes in web traffic.

So how to handle this situation? Take things one step at a time when scaling and you may find horizontal scaling isn’t the right choice, after all.

For example, assume you have a tech news website where you did an early-look review of an upcoming—and highly-anticipated—smartphone launch, which went viral. The review, a blog post on your website, includes both video and pictures. Comments are enabled for the post and readers can also rate it. For example, if your website is hosted on a traditional Linux with a LAMP stack, you may find yourself with immediate scaling problems.

Let’s get more details on the current scenario and dig out more:

  • Where are images and videos stored?
  • How many read/write requests are received per second? Per minute?
  • What is the level of security required?
  • Are these synchronous or asynchronous requests?

We’ll also want to consider the following if your website has a transactional load like e-commerce or banking:

How is the website handling sessions?

  • Do you have any compliance requests—like the Payment Card Industry Data Security Standard (PCI DSS compliance) —if your website is using its own payment gateway?
  • How are you recording customer behavior data and fulfilling your analytics needs?
  • What are your loading balancing considerations (scaling, caching, session maintenance, etc.)?

So, if we take this one step at a time:

Step 1: Ease server load. We need to quickly handle spikes in traffic, generated by activity on the blog post, so let’s reduce server load by moving image and video to some third -party content delivery network (CDN). AWS provides Amazon CloudFront as a CDN solution, which is highly scalable with built-in security to verify origin access identity and handle any DDoS attacks. CloudFront can direct traffic to your on-premises or cloud-hosted server with its 113 Points of Presence (102 Edge Locations and 11 Regional Edge Caches) in 56 cities across 24 countries, which provides efficient caching.
Step 2: Reduce read load by adding more read replicas. MySQL provides a nice mirror replication for databases. Oracle has its own Oracle plug for replication and AWS RDS provide up to five read replicas, which can span across the region and even the Amazon database Amazon Aurora can have 15 read replicas with Amazon Aurora autoscaling support. If a workload is highly variable, you should consider Amazon Aurora Serverless database  to achieve high efficiency and reduced cost. While most mirror technologies do asynchronous replication, AWS RDS can provide synchronous multi-AZ replication, which is good for disaster recovery but not for scalability. Asynchronous replication to mirror instance means replication data can sometimes be stale if network bandwidth is low, so you need to plan and design your application accordingly.

I recommend that you always use a read replica for any reporting needs and try to move non-critical GET services to read replica and reduce the load on the master database. In this case, loading comments associated with a blog can be fetched from a read replica—as it can handle some delay—in case there is any issue with asynchronous reflection.

Step 3: Reduce write requests. This can be achieved by introducing queue to process the asynchronous message. Amazon Simple Queue Service (Amazon SQS) is a highly-scalable queue, which can handle any kind of work-message load. You can process data, like rating and review; or calculate Deal Quality Score (DQS) using batch processing via an SQS queue. If your workload is in AWS, I recommend using a job-observer pattern by setting up Auto Scaling to automatically increase or decrease the number of batch servers, using the number of SQS messages, with Amazon CloudWatch, as the trigger.  For on-premises workloads, you can use SQS SDK to create an Amazon SQS queue that holds messages until they’re processed by your stack. Or you can use Amazon SNS  to fan out your message processing in parallel for different purposes like adding a watermark in an image, generating a thumbnail, etc.

Step 4: Introduce a more robust caching engine. You can use Amazon Elastic Cache for Memcached or Redis to reduce write requests. Memcached and Redis have different use cases so if you can afford to lose and recover your cache from your database, use Memcached. If you are looking for more robust data persistence and complex data structure, use Redis. In AWS, these are managed services, which means AWS takes care of the workload for you and you can also deploy them in your on-premises instances or use a hybrid approach.

Step 5: Scale your server. If there are still issues, it’s time to scale your server.  For the greatest cost-effectiveness and unlimited scalability, I suggest always using horizontal scaling. However, use cases like database vertical scaling may be a better choice until you are good with sharding; or use Amazon Aurora Serverless for variable workloads. It will be wise to use Auto Scaling to manage your workload effectively for horizontal scaling. Also, to achieve that, you need to persist the session. Amazon DynamoDB can handle session persistence across instances.

If your server is on premises, consider creating a multisite architecture, which will help you achieve quick scalability as required and provide a good disaster recovery solution.  You can pick and choose individual services like Amazon Route 53, AWS CloudFormation, Amazon SQS, Amazon SNS, Amazon RDS, etc. depending on your needs.

Your multisite architecture will look like the following diagram:

In this architecture, you can run your regular workload on premises, and use your AWS workload as required for scalability and disaster recovery. Using Route 53, you can direct a precise percentage of users to an AWS workload.

If you decide to move all of your workloads to AWS, the recommended multi-AZ architecture would look like the following:

In this architecture, you are using a multi-AZ distributed workload for high availability. You can have a multi-region setup and use Route53 to distribute your workload between AWS Regions. CloudFront helps you to scale and distribute static content via an S3 bucket and DynamoDB, maintaining your application state so that Auto Scaling can apply horizontal scaling without loss of session data. At the database layer, RDS with multi-AZ standby provides high availability and read replica helps achieve scalability.

This is a high-level strategy to help you think through the scalability of your workload by using AWS even if your workload in on premises and not in the cloud…yet.

I highly recommend creating a hybrid, multisite model by placing your on-premises environment replica in the public cloud like AWS Cloud, and using Amazon Route53 DNS Service and Elastic Load Balancing to route traffic between on-premises and cloud environments. AWS now supports load balancing between AWS and on-premises environments to help you scale your cloud environment quickly, whenever required, and reduce it further by applying Amazon auto-scaling and placing a threshold on your on-premises traffic using Route 53.

Now Open AWS EU (Paris) Region

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-open-aws-eu-paris-region/

Today we are launching our 18th AWS Region, our fourth in Europe. Located in the Paris area, AWS customers can use this Region to better serve customers in and around France.

The Details
The new EU (Paris) Region provides a broad suite of AWS services including Amazon API Gateway, Amazon Aurora, Amazon CloudFront, Amazon CloudWatch, CloudWatch Events, Amazon CloudWatch Logs, Amazon DynamoDB, Amazon Elastic Compute Cloud (EC2), EC2 Container Registry, Amazon ECS, Amazon Elastic Block Store (EBS), Amazon EMR, Amazon ElastiCache, Amazon Elasticsearch Service, Amazon Glacier, Amazon Kinesis Streams, Polly, Amazon Redshift, Amazon Relational Database Service (RDS), Amazon Route 53, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Amazon Simple Storage Service (S3), Amazon Simple Workflow Service (SWF), Amazon Virtual Private Cloud, Auto Scaling, AWS Certificate Manager (ACM), AWS CloudFormation, AWS CloudTrail, AWS CodeDeploy, AWS Config, AWS Database Migration Service, AWS Direct Connect, AWS Elastic Beanstalk, AWS Identity and Access Management (IAM), AWS Key Management Service (KMS), AWS Lambda, AWS Marketplace, AWS OpsWorks Stacks, AWS Personal Health Dashboard, AWS Server Migration Service, AWS Service Catalog, AWS Shield Standard, AWS Snowball, AWS Snowball Edge, AWS Snowmobile, AWS Storage Gateway, AWS Support (including AWS Trusted Advisor), Elastic Load Balancing, and VM Import.

The Paris Region supports all sizes of C5, M5, R4, T2, D2, I3, and X1 instances.

There are also four edge locations for Amazon Route 53 and Amazon CloudFront: three in Paris and one in Marseille, all with AWS WAF and AWS Shield. Check out the AWS Global Infrastructure page to learn more about current and future AWS Regions.

The Paris Region will benefit from three AWS Direct Connect locations. Telehouse Voltaire is available today. AWS Direct Connect will also become available at Equinix Paris in early 2018, followed by Interxion Paris.

All AWS infrastructure regions around the world are designed, built, and regularly audited to meet the most rigorous compliance standards and to provide high levels of security for all AWS customers. These include ISO 27001, ISO 27017, ISO 27018, SOC 1 (Formerly SAS 70), SOC 2 and SOC 3 Security & Availability, PCI DSS Level 1, and many more. This means customers benefit from all the best practices of AWS policies, architecture, and operational processes built to satisfy the needs of even the most security sensitive customers.

AWS is certified under the EU-US Privacy Shield, and the AWS Data Processing Addendum (DPA) is GDPR-ready and available now to all AWS customers to help them prepare for May 25, 2018 when the GDPR becomes enforceable. The current AWS DPA, as well as the AWS GDPR DPA, allows customers to transfer personal data to countries outside the European Economic Area (EEA) in compliance with European Union (EU) data protection laws. AWS also adheres to the Cloud Infrastructure Service Providers in Europe (CISPE) Code of Conduct. The CISPE Code of Conduct helps customers ensure that AWS is using appropriate data protection standards to protect their data, consistent with the GDPR. In addition, AWS offers a wide range of services and features to help customers meet the requirements of the GDPR, including services for access controls, monitoring, logging, and encryption.

From Our Customers
Many AWS customers are preparing to use this new Region. Here’s a small sample:

Societe Generale, one of the largest banks in France and the world, has accelerated their digital transformation while working with AWS. They developed SG Research, an application that makes reports from Societe Generale’s analysts available to corporate customers in order to improve the decision-making process for investments. The new AWS Region will reduce latency between applications running in the cloud and in their French data centers.

SNCF is the national railway company of France. Their mobile app, powered by AWS, delivers real-time traffic information to 14 million riders. Extreme weather, traffic events, holidays, and engineering works can cause usage to peak at hundreds of thousands of users per second. They are planning to use machine learning and big data to add predictive features to the app.

Radio France, the French public radio broadcaster, offers seven national networks, and uses AWS to accelerate its innovation and stay competitive.

Les Restos du Coeur, a French charity that provides assistance to the needy, delivering food packages and participating in their social and economic integration back into French society. Les Restos du Coeur is using AWS for its CRM system to track the assistance given to each of their beneficiaries and the impact this is having on their lives.

AlloResto by JustEat (a leader in the French FoodTech industry), is using AWS to to scale during traffic peaks and to accelerate their innovation process.

AWS Consulting and Technology Partners
We are already working with a wide variety of consulting, technology, managed service, and Direct Connect partners in France. Here’s a partial list:

AWS Premier Consulting PartnersAccenture, Capgemini, Claranet, CloudReach, DXC, and Edifixio.

AWS Consulting PartnersABC Systemes, Atos International SAS, CoreExpert, Cycloid, Devoteam, LINKBYNET, Oxalide, Ozones, Scaleo Information Systems, and Sopra Steria.

AWS Technology PartnersAxway, Commerce Guys, MicroStrategy, Sage, Software AG, Splunk, Tibco, and Zerolight.

AWS in France
We have been investing in Europe, with a focus on France, for the last 11 years. We have also been developing documentation and training programs to help our customers to improve their skills and to accelerate their journey to the AWS Cloud.

As part of our commitment to AWS customers in France, we plan to train more than 25,000 people in the coming years, helping them develop highly sought after cloud skills. They will have access to AWS training resources in France via AWS Academy, AWSome days, AWS Educate, and webinars, all delivered in French by AWS Technical Trainers and AWS Certified Trainers.

Use it Today
The EU (Paris) Region is open for business now and you can start using it today!

Jeff;

 

Now Open – AWS China (Ningxia) Region

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-open-aws-china-ningxia-region/

Today we launched our 17th Region globally, and the second in China. The AWS China (Ningxia) Region, operated by Ningxia Western Cloud Data Technology Co. Ltd. (NWCD), is generally available now and provides customers another option to run applications and store data on AWS in China.

The Details
At launch, the new China (Ningxia) Region, operated by NWCD, supports Auto Scaling, AWS Config, AWS CloudFormation, AWS CloudTrail, Amazon CloudWatch, CloudWatch Events, Amazon CloudWatch Logs, AWS CodeDeploy, AWS Direct Connect, Amazon DynamoDB, Amazon Elastic Compute Cloud (EC2), Amazon Elastic Block Store (EBS), Amazon EC2 Systems Manager, AWS Elastic Beanstalk, Amazon ElastiCache, Amazon Elasticsearch Service, Elastic Load Balancing, Amazon EMR, Amazon Glacier, AWS Identity and Access Management (IAM), Amazon Kinesis Streams, Amazon Redshift, Amazon Relational Database Service (RDS), Amazon Simple Storage Service (S3), Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), AWS Support API, AWS Trusted Advisor, Amazon Simple Workflow Service (SWF), Amazon Virtual Private Cloud, and VM Import. Visit the AWS China Products page for additional information on these services.

The Region supports all sizes of C4, D2, M4, T2, R4, I3, and X1 instances.

Check out the AWS Global Infrastructure page to learn more about current and future AWS Regions.

Operating Partner
To comply with China’s legal and regulatory requirements, AWS has formed a strategic technology collaboration with NWCD to operate and provide services from the AWS China (Ningxia) Region. Founded in 2015, NWCD is a licensed datacenter and cloud services provider, based in Ningxia, China. NWCD joins Sinnet, the operator of the AWS China China (Beijing) Region, as an AWS operating partner in China. Through these relationships, AWS provides its industry-leading technology, guidance, and expertise to NWCD and Sinnet, while NWCD and Sinnet operate and provide AWS cloud services to local customers. While the cloud services offered in both AWS China Regions are the same as those available in other AWS Regions, the AWS China Regions are different in that they are isolated from all other AWS Regions and operated by AWS’s Chinese partners separately from all other AWS Regions. Customers using the AWS China Regions enter into customer agreements with Sinnet and NWCD, rather than with AWS.

Use it Today
The AWS China (Ningxia) Region, operated by NWCD, is open for business, and you can start using it now! Starting today, Chinese developers, startups, and enterprises, as well as government, education, and non-profit organizations, can leverage AWS to run their applications and store their data in the new AWS China (Ningxia) Region, operated by NWCD. Customers already using the AWS China (Beijing) Region, operated by Sinnet, can select the AWS China (Ningxia) Region directly from the AWS Management Console, while new customers can request an account at www.amazonaws.cn to begin using both AWS China Regions.

Jeff;

 

 

Now Available: A New AWS Quick Start Reference Deployment for CJIS

Post Syndicated from Emil Lerch original https://aws.amazon.com/blogs/security/now-available-a-new-aws-quick-start-reference-deployment-for-cjis/

CJIS logo

As part of the AWS Compliance Quick Start program, AWS has published a new Quick Start reference deployment for customers who need to align with Criminal Justice Information Services (CJIS) Security Policy 5.6 and process Criminal Justice Information (CJI) in accordance with this policy. The new Quick Start is AWS Enterprise Accelerator – Compliance: CJIS, and it makes it easier for you to address the list of supported controls you will find in the security controls matrix that accompanies the Quick Start.

As all AWS Quick Starts do, this Quick Start helps you automate the building of a recommended architecture that, when deployed as a package, provides a baseline AWS configuration. The Quick Start uses sets of nested AWS CloudFormation templates and user data scripts to create an example environment with a two-VPC, multi-tiered web service.

The new Quick Start also includes:

The recommended architecture built by the Quick Start supports a wide variety of AWS best practices (all of which are detailed in the Quick Start), including the use of multiple Availability Zones, isolation using public and private subnets, load balancing, and Auto Scaling.

The Quick Start package also includes a deployment guide with detailed instructions and a security controls matrix that describes how the deployment addresses CJIS Security Policy 5.6 controls. You should have your IT security assessors and risk decision makers review the security controls matrix so that they can understand the extent of the implementation of the controls within the architecture. The matrix also identifies the specific resources in the CloudFormation templates that affect each control, and contains cross-references to the CJIS Security Policy 5.6 security controls.

If you have questions about this new Quick Start, contact the AWS Compliance Quick Start team. For more information about the AWS CJIS program, see CJIS Compliance.

– Emil

Running Windows Containers on Amazon ECS

Post Syndicated from Nathan Taber original https://aws.amazon.com/blogs/compute/running-windows-containers-on-amazon-ecs/

This post was developed and written by Jeremy Cowan, Thomas Fuller, Samuel Karp, and Akram Chetibi.

Containers have revolutionized the way that developers build, package, deploy, and run applications. Initially, containers only supported code and tooling for Linux applications. With the release of Docker Engine for Windows Server 2016, Windows developers have started to realize the gains that their Linux counterparts have experienced for the last several years.

This week, we’re adding support for running production workloads in Windows containers using Amazon Elastic Container Service (Amazon ECS). Now, Amazon ECS provides an ECS-Optimized Windows Server Amazon Machine Image (AMI). This AMI is based on the EC2 Windows Server 2016 AMI, and includes Docker 17.06 Enterprise Edition and the ECS Agent 1.16. This AMI provides improved instance and container launch time performance. It’s based on Windows Server 2016 Datacenter and includes Docker 17.06.2-ee-5, along with a new version of the ECS agent that now runs as a native Windows service.

In this post, I discuss the benefits of this new support, and walk you through getting started running Windows containers with Amazon ECS.

When AWS released the Windows Server 2016 Base with Containers AMI, the ECS agent ran as a process that made it difficult to monitor and manage. As a service, the agent can be health-checked, managed, and restarted no differently than other Windows services. The AMI also includes pre-cached images for Windows Server Core 2016 and Windows Server Nano Server 2016. By caching the images in the AMI, launching new Windows containers is significantly faster. When Docker images include a layer that’s already cached on the instance, Docker re-uses that layer instead of pulling it from the Docker registry.

The ECS agent and an accompanying ECS PowerShell module used to install, configure, and run the agent come pre-installed on the AMI. This guarantees there is a specific platform version available on the container instance at launch. Because the software is included, you don’t have to download it from the internet. This saves startup time.

The Windows-compatible ECS-optimized AMI also reports CPU and memory utilization and reservation metrics to Amazon CloudWatch. Using the CloudWatch integration with ECS, you can create alarms that trigger dynamic scaling events to automatically add or remove capacity to your EC2 instances and ECS tasks.

Getting started

To help you get started running Windows containers on ECS, I’ve forked the ECS reference architecture, to build an ECS cluster comprised of Windows instances instead of Linux instances. You can pull the latest version of the reference architecture for Windows.

The reference architecture is a layered CloudFormation stack, in that it calls other stacks to create the environment. Within the stack, the ecs-windows-cluster.yaml file contains the instructions for bootstrapping the Windows instances and configuring the ECS cluster. To configure the instances outside of AWS CloudFormation (for example, through the CLI or the console), you can add the following commands to your instance’s user data:

Import-Module ECSTools
Initialize-ECSAgent

Or

Import-Module ECSTools
Initialize-ECSAgent –Cluster MyCluster -EnableIAMTaskRole

If you don’t specify a cluster name when you initialize the agent, the instance is joined to the default cluster.

Adding -EnableIAMTaskRole when initializing the agent adds support for IAM roles for tasks. Previously, enabling this setting meant running a complex script and setting an environment variable before you could assign roles to your ECS tasks.

When you enable IAM roles for tasks on Windows, it consumes port 80 on the host. If you have tasks that listen on port 80 on the host, I recommend configuring a service for them that uses load balancing. You can use port 80 on the load balancer, and the traffic can be routed to another host port on your container instances. For more information, see Service Load Balancing.

Create a cluster

To create a new ECS cluster, choose Launch stack, or pull the GitHub project to your local machine and run the following command:

aws cloudformation create-stack –template-body file://<path to master-windows.yaml> --stack-name <name>

Upload your container image

Now that you have a cluster running, step through how to build and push an image into a container repository. You use a repository hosted in Amazon Elastic Container Registry (Amazon ECR) for this, but you could also use Docker Hub. To build and push an image to a repository, install Docker on your Windows* workstation. You also create a repository and assign the necessary permissions to the account that pushes your image to Amazon ECR. For detailed instructions, see Pushing an Image.

* If you are building an image that is based on Windows layers, then you must use a Windows environment to build and push your image to the registry.

Write your task definition

Now that your image is built and ready, the next step is to run your Windows containers using a task.

Start by creating a new task definition based on the windows-simple-iis image from Docker Hub.

  1. Open the ECS console.
  2. Choose Task Definitions, Create new task definition.
  3. Scroll to the bottom of the page and choose Configure via JSON.
  4. Copy and paste the following JSON into that field.
  5. Choose Save, Create.
{
   "family": "windows-simple-iis",
   "containerDefinitions": [
   {
     "name": "windows_sample_app",
     "image": "microsoft/iis",
     "cpu": 100,
     "entryPoint":["powershell", "-Command"],
     "command":["New-Item -Path C:\\inetpub\\wwwroot\\index.html -Type file -Value '<html><head><title>Amazon ECS Sample App</title> <style>body {margin-top: 40px; background-color: #333;} </style> </head><body> <div style=color:white;text-align:center><h1>Amazon ECS Sample App</h1> <h2>Congratulations!</h2> <p>Your application is now running on a container in Amazon ECS.</p></body></html>'; C:\\ServiceMonitor.exe w3svc"],
     "portMappings": [
     {
       "protocol": "tcp",
       "containerPort": 80,
       "hostPort": 8080
     }
     ],
     "memory": 500,
     "essential": true
   }
   ]
}

You can now go back into the Task Definition page and see windows-simple-iis as an available task definition.

There are a few important aspects of the task definition file to note when working with Windows containers. First, the hostPort is configured as 8080, which is necessary because the ECS agent currently uses port 80 to enable IAM roles for tasks required for least-privilege security configurations.

There are also some fairly standard task parameters that are intentionally not included. For example, network mode is not available with Windows at the time of this release, so keep that setting blank to allow Docker to configure WinNAT, the only option available today.

Also, some parameters work differently with Windows than they do with Linux. The CPU limits that you define in the task definition are absolute, whereas on Linux they are weights. For information about other task parameters that are supported or possibly different with Windows, see the documentation.

Run your containers

At this point, you are ready to run containers. There are two options to run containers with ECS:

  1. Task
  2. Service

A task is typically a short-lived process that ECS creates. It can’t be configured to actively monitor or scale. A service is meant for longer-running containers and can be configured to use a load balancer, minimum/maximum capacity settings, and a number of other knobs and switches to help ensure that your code keeps running. In both cases, you are able to pick a placement strategy and a specific IAM role for your container.

  1. Select the task definition that you created above and choose Action, Run Task.
  2. Leave the settings on the next page to the default values.
  3. Select the ECS cluster created when you ran the CloudFormation template.
  4. Choose Run Task to start the process of scheduling a Docker container on your ECS cluster.

You can now go to the cluster and watch the status of your task. It may take 5–10 minutes for the task to go from PENDING to RUNNING, mostly because it takes time to download all of the layers necessary to run the microsoft/iis image. After the status is RUNNING, you should see the following results:

You may have noticed that the example task definition is named windows-simple-iis:2. This is because I created a second version of the task definition, which is one of the powerful capabilities of using ECS. You can make the task definitions part of your source code and then version them. You can also roll out new versions and practice blue/green deployment, switching to reduce downtime and improve the velocity of your deployments!

After the task has moved to RUNNING, you can see your website hosted in ECS. Find the public IP or DNS for your ECS host. Remember that you are hosting on port 8080. Make sure that the security group allows ingress from your client IP address to that port and that your VPC has an internet gateway associated with it. You should see a page that looks like the following:

This is a nice start to deploying a simple single instance task, but what if you had a Web API to be scaled out and in based on usage? This is where you could look at defining a service and collecting CloudWatch data to add and remove both instances of the task. You could also use CloudWatch alarms to add more ECS container instances and keep up with the demand. The former is built into the configuration of your service.

  1. Select the task definition and choose Create Service.
  2. Associate a load balancer.
  3. Set up Auto Scaling.

The following screenshot shows an example where you would add an additional task instance when the CPU Utilization CloudWatch metric is over 60% on average over three consecutive measurements. This may not be aggressive enough for your requirements; it’s meant to show you the option to scale tasks the same way you scale ECS instances with an Auto Scaling group. The difference is that these tasks start much faster because all of the base layers are already on the ECS host.

Do not confuse task dynamic scaling with ECS instance dynamic scaling. To add additional hosts, see Tutorial: Scaling Container Instances with CloudWatch Alarms.

Conclusion

This is just scratching the surface of the flexibility that you get from using containers and Amazon ECS. For more information, see the Amazon ECS Developer Guide and ECS Resources.

– Jeremy, Thomas, Samuel, Akram

AWS Systems Manager – A Unified Interface for Managing Your Cloud and Hybrid Resources

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/aws-systems-manager/

AWS Systems Manager is a new way to manage your cloud and hybrid IT environments. AWS Systems Manager provides a unified user interface that simplifies resource and application management, shortens the time to detect and resolve operational problems, and makes it easy to operate and manage your infrastructure securely at scale. This service is absolutely packed full of features. It defines a new experience around grouping, visualizing, and reacting to problems using features from products like Amazon EC2 Systems Manager (SSM) to enable rich operations across your resources.

As I said above, there are a lot of powerful features in this service and we won’t be able to dive deep on all of them but it’s easy to go to the console and get started with any of the tools.

Resource Groupings

Resource Groups allow you to create logical groupings of most resources that support tagging like: Amazon Elastic Compute Cloud (EC2) instances, Amazon Simple Storage Service (S3) buckets, Elastic Load Balancing balancers, Amazon Relational Database Service (RDS) instances, Amazon Virtual Private Cloud, Amazon Kinesis streams, Amazon Route 53 zones, and more. Previously, you could use the AWS Console to define resource groupings but AWS Systems Manager provides this new resource group experience via a new console and API. These groupings are a fundamental building block of Systems Manager in that they are frequently the target of various operations you may want to perform like: compliance management, software inventories, patching, and other automations.

You start by defining a group based on tag filters. From there you can view all of the resources in a centralized console. You would typically use these groupings to differentiate between applications, application layers, and environments like production or dev – but you can make your own rules about how to use them as well. If you imagine a typical 3 tier web-app you might have a few EC2 instances, an ELB, a few S3 buckets, and an RDS instance. You can define a grouping for that application and with all of those different resources simultaneously.

Insights

AWS Systems Manager automatically aggregates and displays operational data for each resource group through a dashboard. You no longer need to navigate through multiple AWS consoles to view all of your operational data. You can easily integrate your exiting Amazon CloudWatch dashboards, AWS Config rules, AWS CloudTrail trails, AWS Trusted Advisor notifications, and AWS Personal Health Dashboard performance and availability alerts. You can also easily view your software inventories across your fleet. AWS Systems Manager also provides a compliance dashboard allowing you to see the state of various security controls and patching operations across your fleets.

Acting on Insights

Building on the success of EC2 Systems Manager (SSM), AWS Systems Manager takes all of the features of SSM and provides a central place to access them. These are all the same experiences you would have through SSM with a more accesible console and centralized interface. You can use the resource groups you’ve defined in Systems Manager to visualize and act on groups of resources.

Automation


Automations allow you to define common IT tasks as a JSON document that specify a list of tasks. You can also use community published documents. These documents can be executed through the Console, CLIs, SDKs, scheduled maintenance windows, or triggered based on changes in your infrastructure through CloudWatch events. You can track and log the execution of each step in the documents and prompt for additional approvals. It also allows you to incrementally roll out changes and automatically halt when errors occur. You can start executing an automation directly on a resource group and it will be able to apply itself to the resources that it understands within the group.

Run Command

Run Command is a superior alternative to enabling SSH on your instances. It provides safe, secure remote management of your instances at scale without logging into your servers, replacing the need for SSH bastions or remote powershell. It has granular IAM permissions that allow you to restrict which roles or users can run certain commands.

Patch Manager, Maintenance Windows, and State Manager

I’ve written about Patch Manager before and if you manage fleets of Windows and Linux instances it’s a great way to maintain a common baseline of security across your fleet.

Maintenance windows allow you to schedule instance maintenance and other disruptive tasks for a specific time window.

State Manager allows you to control various server configuration details like anti-virus definitions, firewall settings, and more. You can define policies in the console or run existing scripts, PowerShell modules, or even Ansible playbooks directly from S3 or GitHub. You can query State Manager at any time to view the status of your instance configurations.

Things To Know

There’s some interesting terminology here. We haven’t done the best job of naming things in the past so let’s take a moment to clarify. EC2 Systems Manager (sometimes called SSM) is what you used before today. You can still invoke aws ssm commands. However, AWS Systems Manager builds on and enhances many of the tools provided by EC2 Systems Manager and allows those same tools to be applied to more than just EC2. When you see the phrase “Systems Manager” in the future you should think of AWS Systems Manager and not EC2 Systems Manager.

AWS Systems Manager with all of this useful functionality is provided at no additional charge. It is immediately available in all public AWS regions.

The best part about these services is that even with their tight integrations each one is designed to be used in isolation as well. If you only need one component of these services it’s simple to get started with only that component.

There’s a lot more than I could ever document in this post so I encourage you all to jump into the console and documentation to figure out where you can start using AWS Systems Manager.

Randall

AWS Fargate: A Product Overview

Post Syndicated from Deepak Dayama original https://aws.amazon.com/blogs/compute/aws-fargate-a-product-overview/

It was just about three years ago that AWS announced Amazon Elastic Container Service (Amazon ECS), to run and manage containers at scale on AWS. With Amazon ECS, you’ve been able to run your workloads at high scale and availability without having to worry about running your own cluster management and container orchestration software.

Today, AWS announced the availability of AWS Fargate – a technology that enables you to use containers as a fundamental compute primitive without having to manage the underlying instances. With Fargate, you don’t need to provision, configure, or scale virtual machines in your clusters to run containers. Fargate can be used with Amazon ECS today, with plans to support Amazon Elastic Container Service for Kubernetes (Amazon EKS) in the future.

Fargate has flexible configuration options so you can closely match your application needs and granular, per-second billing.

Amazon ECS with Fargate

Amazon ECS enables you to run containers at scale. This service also provides native integration into the AWS platform with VPC networking, load balancing, IAM, Amazon CloudWatch Logs, and CloudWatch metrics. These deep integrations make the Amazon ECS task a first-class object within the AWS platform.

To run tasks, you first need to stand up a cluster of instances, which involves picking the right types of instances and sizes, setting up Auto Scaling, and right-sizing the cluster for performance. With Fargate, you can leave all that behind and focus on defining your application and policies around permissions and scaling.

The same container management capabilities remain available so you can continue to scale your container deployments. With Fargate, the only entity to manage is the task. You don’t need to manage the instances or supporting software like Docker daemon or the Amazon ECS agent.

Fargate capabilities are available natively within Amazon ECS. This means that you don’t need to learn new API actions or primitives to run containers on Fargate.

Using Amazon ECS, Fargate is a launch type option. You continue to define the applications the same way by using task definitions. In contrast, the EC2 launch type gives you more control of your server clusters and provides a broader range of customization options.

For example, a RunTask command example is pasted below with the Fargate launch type:

ecs run-task --launch-type FARGATE --cluster fargate-test --task-definition nginx --network-configuration
"awsvpcConfiguration={subnets=[subnet-b563fcd3]}"

Key features of Fargate

Resource-based pricing and per second billing
You pay by the task size and only for the time for which resources are consumed by the task. The price for CPU and memory is charged on a per-second basis. There is a one-minute minimum charge.

Flexible configurations options
Fargate is available with 50 different combinations of CPU and memory to closely match your application needs. You can use 2 GB per vCPU anywhere up to 8 GB per vCPU for various configurations. Match your workload requirements closely, whether they are general purpose, compute, or memory optimized.

Networking
All Fargate tasks run within your own VPC. Fargate supports the recently launched awsvpc networking mode and the elastic network interface for a task is visible in the subnet where the task is running. This provides the separation of responsibility so you retain full control of networking policies for your applications via VPC features like security groups, routing rules, and NACLs. Fargate also supports public IP addresses.

Load Balancing
ECS Service Load Balancing  for the Application Load Balancer and Network Load Balancer is supported. For the Fargate launch type, you specify the IP addresses of the Fargate tasks to register with the load balancers.

Permission tiers
Even though there are no instances to manage with Fargate, you continue to group tasks into logical clusters. This allows you to manage who can run or view services within the cluster. The task IAM role is still applicable. Additionally, there is a new Task Execution Role that grants Amazon ECS permissions to perform operations such as pushing logs to CloudWatch Logs or pulling image from Amazon Elastic Container Registry (Amazon ECR).

Container Registry Support
Fargate provides seamless authentication to help pull images from Amazon ECR via the Task Execution Role. Similarly, if you are using a public repository like DockerHub, you can continue to do so.

Amazon ECS CLI
The Amazon ECS CLI provides high-level commands to help simplify to create and run Amazon ECS clusters, tasks, and services. The latest version of the CLI now supports running tasks and services with Fargate.

EC2 and Fargate Launch Type Compatibility
All Amazon ECS clusters are heterogeneous – you can run both Fargate and Amazon ECS tasks in the same cluster. This enables teams working on different applications to choose their own cadence of moving to Fargate, or to select a launch type that meets their requirements without breaking the existing model. You can make an existing ECS task definition compatible with the Fargate launch type and run it as a Fargate service, and vice versa. Choosing a launch type is not a one-way door!

Logging and Visibility
With Fargate, you can send the application logs to CloudWatch logs. Service metrics (CPU and Memory utilization) are available as part of CloudWatch metrics. AWS partners for visibility, monitoring and application performance management including Datadog, Aquasec, Splunk, Twistlock, and New Relic also support Fargate tasks.

Conclusion

Fargate enables you to run containers without having to manage the underlying infrastructure. Today, Fargate is availabe for Amazon ECS, and in 2018, Amazon EKS. Visit the Fargate product page to learn more, or get started in the AWS Console.

–Deepak Dayama

Amazon EC2 Bare Metal Instances with Direct Access to Hardware

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-amazon-ec2-bare-metal-instances-with-direct-access-to-hardware/

When customers come to us with new and unique requirements for AWS, we listen closely, ask lots of questions, and do our best to understand and address their needs. When we do this, we make the resulting service or feature generally available; we do not build one-offs or “snowflakes” for individual customers. That model is messy and hard to scale and is not the way we work.

Instead, every AWS customer has access to whatever it is that we build, and everyone benefits. VMware Cloud on AWS is a good example of this strategy in action. They told us that they wanted to run their virtualization stack directly on the hardware, within the AWS Cloud, giving their customers access to the elasticity, security, and reliability (not to mention the broad array of services) that AWS offers.

We knew that other customers also had interesting use cases for bare metal hardware and didn’t want to take the performance hit of nested virtualization. They wanted access to the physical resources for applications that take advantage of low-level hardware features such as performance counters and Intel® VT that are not always available or fully supported in virtualized environments, and also for applications intended to run directly on the hardware or licensed and supported for use in non-virtualized environments.

Our multi-year effort to move networking, storage, and other EC2 features out of our virtualization platform and into dedicated hardware was already well underway and provided the perfect foundation for a possible solution. This work, as I described in Now Available – Compute-Intensive C5 Instances for Amazon EC2, includes a set of dedicated hardware accelerators.

Now that we have provided VMware with the bare metal access that they requested, we are doing the same for all AWS customers. I’m really looking forward to seeing what you can do with them!

New Bare Metal Instances
Today we are launching a public preview the i3.metal instance, the first in a series of EC2 instances that offer the best of both worlds, allowing the operating system to run directly on the underlying hardware while still providing access to all of the benefits of the cloud. The instance gives you direct access to the processor and other hardware, and has the following specifications:

  • Processing – Two Intel Xeon E5-2686 v4 processors running at 2.3 GHz, with a total of 36 hyperthreaded cores (72 logical processors).
  • Memory – 512 GiB.
  • Storage – 15.2 terabytes of local, SSD-based NVMe storage.
  • Network – 25 Gbps of ENA-based enhanced networking.

Bare Metal instances are full-fledged members of the EC2 family and can take advantage of Elastic Load Balancing, Auto Scaling, Amazon CloudWatch, Auto Recovery, and so forth. They can also access the full suite of AWS database, IoT, mobile, analytics, artificial intelligence, and security services.

Previewing Now
We are launching a public preview of the Bare Metal instances today; please sign up now if you want to try them out.

You can now bring your specialized applications or your own stack of virtualized components to AWS and run them on Bare Metal instances. If you are using or thinking about using containers, these instances make a great host for CoreOS.

An AMI that works on one of the new C5 instances should also work on an I3 Bare Metal Instance. It must have the ENA and NVMe drivers, and must be tagged for ENA.

Jeff;