Tag Archives: data

Elisa 0.0.80 Released

Post Syndicated from ris original https://lwn.net/Articles/741172/rss

A very early alpha version of the Elisa music player has been released.
Elisa allows you to browse music by album, artist, or all tracks. The music is indexed using either a private indexer or an indexer using Baloo. The private one can be configured to scan music on chosen paths. The Baloo one is much faster because Baloo provides all the needed data from its own database. You can build and play your own playlist.

Managing AWS Lambda Function Concurrency

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/managing-aws-lambda-function-concurrency/

One of the key benefits of serverless applications is the ease in which they can scale to meet traffic demands or requests, with little to no need for capacity planning. In AWS Lambda, which is the core of the serverless platform at AWS, the unit of scale is a concurrent execution. This refers to the number of executions of your function code that are happening at any given time.

Thinking about concurrent executions as a unit of scale is a fairly unique concept. In this post, I dive deeper into this and talk about how you can make use of per function concurrency limits in Lambda.

Understanding concurrency in Lambda

Instead of diving right into the guts of how Lambda works, here’s an appetizing analogy: a magical pizza.
Yes, a magical pizza!

This magical pizza has some unique properties:

  • It has a fixed maximum number of slices, such as 8.
  • Slices automatically re-appear after they are consumed.
  • When you take a slice from the pizza, it does not re-appear until it has been completely consumed.
  • One person can take multiple slices at a time.
  • You can easily ask to have the number of slices increased, but they remain fixed at any point in time otherwise.

Now that the magical pizza’s properties are defined, here’s a hypothetical situation of some friends sharing this pizza.

Shawn, Kate, Daniela, Chuck, Ian and Avleen get together every Friday to share a pizza and catch up on their week. As there are just six of them, they can all easily enjoy a slice of pizza at a time. As they finish each slice, it re-appears in the pizza pan and they can take another. Given the magical properties of their pizza, they can continue to eat all they want, but with two very important constraints:

  • If any of them take too many slices at once, the others may not get as much as they want.
  • If they take too many slices, they might also eat too much and get sick.

One particular week, some of the friends are hungrier than the rest, taking two slices at a time instead of just one. If more than two of them try to take two pieces at a time, this can cause contention for pizza slices. Some of them would wait hungry for the slices to re-appear. They could ask for a pizza with more slices, but then run the same risk again later if more hungry friends join than planned for.

What can they do?

If the friends agreed to accept a limit on the maximum number of slices they each eat concurrently, both of these issues are avoided. Some could have a maximum of 2 of the 8 slices, while others could have higher or lower limits, so long as the total stayed at or under eight slices being eaten at any one time. This would keep anyone from going hungry or eating too much. The six friends can happily enjoy their magical pizza without worry!

Concurrency in Lambda

Concurrency in Lambda actually works similarly to the magical pizza model. Each AWS Account has an overall AccountLimit value that is fixed at any point in time, but can be easily increased as needed, just like the count of slices in the pizza. As of May 2017, the default limit is 1000 “slices” of concurrency per AWS Region.

Also like the magical pizza, each concurrency “slice” can only be consumed individually one at a time. After consumption, it becomes available to be consumed again. Services invoking Lambda functions can consume multiple slices of concurrency at the same time, just like the group of friends can take multiple slices of the pizza.
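To get a rough feel for how many “slices” a workload consumes, a commonly used back-of-envelope estimate is the invocation rate multiplied by the average function duration. Here is a minimal sketch of that arithmetic in Python; the traffic numbers are hypothetical and only for illustration:

# Back-of-envelope estimate of how much concurrency a workload consumes.
# The traffic numbers below are hypothetical.
requests_per_second = 100        # average invocation rate
average_duration_seconds = 3.2   # average function execution time

# Concurrent executions ~= invocation rate x average duration
estimated_concurrency = requests_per_second * average_duration_seconds

print("Estimated concurrent executions:", round(estimated_concurrency))
print("Share of a 1000-execution account limit: {:.0%}".format(
    estimated_concurrency / 1000))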

Let’s take our example of the six friends and bring it back to AWS services that commonly invoke Lambda:

  • Amazon S3
  • Amazon Kinesis
  • Amazon DynamoDB
  • Amazon Cognito

In a single account with the default concurrency limit of 1000 concurrent executions, any of these four services could invoke enough functions to consume the entire limit or some part of it. Just like with the pizza example, there is the possibility for two issues to pop up:

  • One or more of these services could invoke enough functions to consume a majority of the available concurrency capacity. This could starve the other functions of concurrency, causing failed invocations.
  • A service could consume too much concurrent capacity and cause a downstream service or database to be overwhelmed, which could cause failed executions.

For Lambda functions that are launched in a VPC, you have the potential to consume the available IP addresses in a subnet or the maximum number of elastic network interfaces to which your account has access. For more information, see Configuring a Lambda Function to Access Resources in an Amazon VPC. For information about elastic network interface limits, see the Network Interfaces section in the Amazon VPC Limits topic.

One way to solve both of these problems is applying a concurrency limit to the Lambda functions in an account.

Configuring per function concurrency limits

You can now set a concurrency limit on individual Lambda functions in an account. The concurrency limit that you set reserves a portion of your account level concurrency for a given function. All of your functions’ concurrent executions count against this account-level limit by default.

If you set a concurrency limit for a specific function, then that function’s concurrency limit allocation is deducted from the shared pool and assigned to that specific function. AWS also reserves 100 units of concurrency for all functions that don’t have a specified concurrency limit set. This helps make sure that functions without their own reservation, including ones you create in the future, still have capacity available.

Going back to the example of the consuming services, you could set throttles for the functions as follows:

Amazon S3 function = 350
Amazon Kinesis function = 200
Amazon DynamoDB function = 200
Amazon Cognito function = 150
Total = 900

With the 100 reserved for all non-concurrency reserved functions, this totals the account limit of 1000.
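As a sketch of how such an allocation could be applied programmatically rather than through the console (the console and CLI steps are shown later in this post), the following uses boto3 and the put-function-concurrency API with hypothetical function names:

import boto3

lambda_client = boto3.client("lambda")

# Hypothetical function names; substitute the functions invoked by each service.
reservations = {
    "s3-processing-function": 350,
    "kinesis-processing-function": 200,
    "dynamodb-processing-function": 200,
    "cognito-processing-function": 150,
}

for function_name, reserved in reservations.items():
    # Reserve (and cap) concurrency for each function, deducting it from
    # the shared account pool.
    lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=reserved,
    )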

Here’s how this works. To start, create a basic Lambda function that is invoked via Amazon API Gateway. This Lambda function returns a single “Hello World” statement with an added sleep time between 2 and 5 seconds. The sleep time simulates an API providing some sort of capability that can take a varied amount of time. The goal here is to show how an API that is under load can reach its concurrency limit, and what happens when it does.
To create the example function

  1. Open the Lambda console.
  2. Choose Create Function.
  3. For Author from scratch, enter the following values:
    1. For Name, enter a value (such as concurrencyBlog01).
    2. For Runtime, choose Python 3.6.
    3. For Role, choose Create new role from template and enter a name aligned with this function, such as concurrencyBlogRole.
  4. Choose Create function.
  5. The function is created with some basic example code. Replace that code with the following:

import time
from random import randint
seconds = randint(2, 5)

def lambda_handler(event, context):
    time.sleep(seconds)
    return {"statusCode": 200,
            "body": ("Hello world, slept " + str(seconds) + " seconds"),
            "headers": {
                "Access-Control-Allow-Headers": "Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token",
                "Access-Control-Allow-Methods": "GET,OPTIONS",
            }}

  6. Under Basic settings, set Timeout to 10 seconds. While this function should only ever take up to 5-6 seconds (with the 5-second max sleep), this gives you a little bit of room if it takes longer.

  7. Choose Save at the top right.

At this point, your function is configured for this example. Test it and confirm this in the console:

  1. Choose Test.
  2. Enter a name (it doesn’t matter for this example).
  3. Choose Create.
  4. In the console, choose Test again.
  5. You should see output similar to the following:

Now configure API Gateway so that you have an HTTPS endpoint to test against.

  1. In the Lambda console, choose Configuration.
  2. Under Triggers, choose API Gateway.
  3. Open the API Gateway icon now shown as attached to your Lambda function:

  4. Under Configure triggers, leave the default values for API Name and Deployment stage. For Security, choose Open.
  5. Choose Add, Save.

API Gateway is now configured to invoke Lambda at the Invoke URL shown under its configuration. You can take this URL and test it in any browser or command line, using tools such as “curl”:


$ curl https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01
Hello world, slept 2 seconds

Throwing load at the function

Now start throwing some load against your API Gateway + Lambda function combo. Right now, your function is only limited by the total amount of concurrency available in an account. For this example account, you might have 850 unreserved concurrency out of a full account limit of 1000, because a few concurrency limits have already been configured (plus the 100 concurrency reserved for all functions without configured limits). You can find all of this information on the main Dashboard page of the Lambda console:
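The same numbers are also available programmatically. Here is a minimal boto3 sketch; the values it prints will of course differ for your account:

import boto3

lambda_client = boto3.client("lambda")

settings = lambda_client.get_account_settings()
limits = settings["AccountLimit"]

# Total concurrency available in the account (1000 by default as of this writing)
print("Account concurrency limit:", limits["ConcurrentExecutions"])

# Concurrency not reserved by individual functions
print("Unreserved concurrency:", limits["UnreservedConcurrentExecutions"])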

For generating load in this example, use an open source tool called “hey” (https://github.com/rakyll/hey), which works similarly to ApacheBench (ab). The tests in this post run from an Amazon EC2 instance launched with the default Amazon Linux AMI from the EC2 console. For more help with configuring an EC2 instance, follow the steps in the Launch Instance Wizard.

After the EC2 instance is running, SSH into the host and run the following:


sudo yum install go
go get -u github.com/rakyll/hey

“hey” is easy to use. For these tests, specify a total number of requests (5,000) and a concurrency of 50 against the API Gateway URL as follows (replace the URL with your own):


$ ./go/bin/hey -n 5000 -c 50 https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01

The output from “hey” tells you interesting bits of information:


$ ./go/bin/hey -n 5000 -c 50 https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01

Summary:
Total: 381.9978 secs
Slowest: 9.4765 secs
Fastest: 0.0438 secs
Average: 3.2153 secs
Requests/sec: 13.0891
Total data: 140024 bytes
Size/request: 28 bytes

Response time histogram:
0.044 [1] |
0.987 [2] |
1.930 [0] |
2.874 [1803] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
3.817 [1518] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
4.760 [719] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
5.703 [917] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
6.647 [13] |
7.590 [14] |
8.533 [9] |
9.477 [4] |

Latency distribution:
10% in 2.0224 secs
25% in 2.0267 secs
50% in 3.0251 secs
75% in 4.0269 secs
90% in 5.0279 secs
95% in 5.0414 secs
99% in 5.1871 secs

Details (average, fastest, slowest):
DNS+dialup: 0.0003 secs, 0.0000 secs, 0.0332 secs
DNS-lookup: 0.0000 secs, 0.0000 secs, 0.0046 secs
req write: 0.0000 secs, 0.0000 secs, 0.0005 secs
resp wait: 3.2149 secs, 0.0438 secs, 9.4472 secs
resp read: 0.0000 secs, 0.0000 secs, 0.0004 secs

Status code distribution:
[200] 4997 responses
[502] 3 responses

You can see a helpful histogram and latency distribution. Remember that this Lambda function has a random sleep period in it and so isn’t entirely representative of a real-life workload. Those three 502s warrant digging deeper, but could be due to Lambda cold-start timing and the “seconds” variable being at its maximum of 5, causing the Lambda functions to time out. AWS X-Ray and the Amazon CloudWatch logs generated by both API Gateway and Lambda could help you troubleshoot this.

Configuring a concurrency reservation

Now that you’ve established that you can generate this load against the function, I show you how to limit it and protect a backend resource from being overloaded by all of these requests.

  1. In the Lambda console, choose Configuration.
  2. Under Concurrency, for Reserve concurrency, enter 25.

  3. Choose Save in the top right corner.

You could also set this with the AWS CLI by using the Lambda put-function-concurrency command, or view your current concurrency configuration with the Lambda get-function command. Here’s an example command:


$ aws lambda get-function --function-name concurrencyBlog01 --output json --query Concurrency
{
    "ReservedConcurrentExecutions": 25
}

Either way, you’ve set the concurrency reservation to 25 for this function. This acts as both a limit and a reservation: it makes sure that you can always run 25 concurrent executions of this function, and going above that results in the Lambda function being throttled. Depending on the invoking service, throttling can result in a number of different outcomes, as shown in the documentation on Throttling Behavior. This change has also reduced the unreserved account concurrency available to your other functions by 25.

Rerun the same load generation as before and see what happens. Previously, you tested at 50 concurrency, which worked just fine. By limiting the Lambda function to a concurrency of 25, you should see rate limiting kick in. Run the same test again:


$ ./go/bin/hey -n 5000 -c 50 https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01

While this test runs, refresh the Monitoring tab on your function detail page. You see the following warning message:

This is great! It means that your throttle is working as configured and you are now protecting your downstream resources from too much load from your Lambda function.

Here is the output from a new “hey” command:


$ ./go/bin/hey -n 5000 -c 50 https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01
Summary:
Total: 379.9922 secs
Slowest: 7.1486 secs
Fastest: 0.0102 secs
Average: 1.1897 secs
Requests/sec: 13.1582
Total data: 164608 bytes
Size/request: 32 bytes

Response time histogram:
0.010 [1] |
0.724 [3075] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
1.438 [0] |
2.152 [811] |∎∎∎∎∎∎∎∎∎∎∎
2.866 [11] |
3.579 [566] |∎∎∎∎∎∎∎
4.293 [214] |∎∎∎
5.007 [1] |
5.721 [315] |∎∎∎∎
6.435 [4] |
7.149 [2] |

Latency distribution:
10% in 0.0130 secs
25% in 0.0147 secs
50% in 0.0205 secs
75% in 2.0344 secs
90% in 4.0229 secs
95% in 5.0248 secs
99% in 5.0629 secs

Details (average, fastest, slowest):
DNS+dialup: 0.0004 secs, 0.0000 secs, 0.0537 secs
DNS-lookup: 0.0002 secs, 0.0000 secs, 0.0184 secs
req write: 0.0000 secs, 0.0000 secs, 0.0016 secs
resp wait: 1.1892 secs, 0.0101 secs, 7.1038 secs
resp read: 0.0000 secs, 0.0000 secs, 0.0005 secs

Status code distribution:
[502] 3076 responses
[200] 1924 responses

This looks fairly different from the last load test run. A large percentage of these requests failed fast because the concurrency throttle rejected them (those in the 0.724-second bucket). The timing shown in the histogram represents the entire round trip from the EC2 instance to API Gateway, which called Lambda and had the request rejected. It’s also important to note that this example was configured with an edge-optimized endpoint in API Gateway. Under Status code distribution, you see that 3076 of the 5000 requests failed with a 502, showing that the Lambda function behind API Gateway failed the request.

Other uses

Managing function concurrency can be useful in a few other ways beyond just limiting the impact on downstream services and providing a reservation of concurrency capacity. Here are two other uses:

  • Emergency kill switch
  • Cost controls

Emergency kill switch

On occasion, due to issues with applications I’ve managed in the past, I’ve had a need to disable a certain function or capability of an application. By setting the concurrency reservation and limit of a Lambda function to zero, you can do just that.

With the reservation set to zero, every invocation of the Lambda function is throttled. You could then work on the related parts of the infrastructure or application that aren’t working, and then reconfigure the concurrency limit to allow invocations again.
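Here is a minimal boto3 sketch of that kill switch, assuming the example function from this post; it sets the reservation to zero and then removes the per-function limit once the issue is resolved:

import boto3

lambda_client = boto3.client("lambda")

# Emergency kill switch: reserve zero concurrency so every invocation is throttled.
lambda_client.put_function_concurrency(
    FunctionName="concurrencyBlog01",
    ReservedConcurrentExecutions=0,
)

# ...after the underlying issue is fixed, remove the per-function limit so the
# function draws from the unreserved account pool again.
lambda_client.delete_function_concurrency(FunctionName="concurrencyBlog01")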

Cost controls

While I mentioned how you might want to use concurrency limits to control the downstream impact to services or databases that your Lambda function might call, another resource that you might be cautious about is money. Setting the concurrency throttle is another way to help control costs during development and testing of your application.

You might want to prevent a function from performing a recursive action too quickly, or keep a development workload from generating too much concurrency. You might also want to protect development resources connected to this function from generating too much cost, such as APIs that your Lambda function calls.

Conclusion

Concurrent executions as a unit of scale are a fairly unique characteristic of Lambda functions. Placing limits on how many concurrency “slices” your function can consume can prevent a single function from consuming all of the available concurrency in an account. Limits can also prevent a function from overwhelming a backend resource that isn’t as scalable.

Unlike monolithic applications or even microservices, where mixed capabilities live in a single service, Lambda functions encourage a sort of “nano-service” of small business logic tied directly to the integration model connected to the function. I hope you’ve enjoyed this post and configure your concurrency limits today!

Top 10 Most Pirated Movies of The Week on BitTorrent – 12/11/17

Post Syndicated from Ernesto original https://torrentfreak.com/top-10-pirated-movies-week-bittorrent-121117/

This week we have four newcomers in our chart.

Dunkirk is the most downloaded movie.

The data for our weekly download chart is estimated by TorrentFreak, and is for informational and educational reference only. All the movies in the list are Web-DL/Webrip/HDRip/BDrip/DVDrip unless stated otherwise.

RSS feed for the weekly movie download chart.

This week’s most downloaded movies are:
Most downloaded movies via torrents
Rank | Rank last week | Movie name | IMDb Rating / Trailer
1 | (…) | Dunkirk | 8.3 / trailer
2 | (1) | Kingsman: The Golden Circle | 7.2 / trailer
3 | (…) | Mother! | 7.0 / trailer
4 | (2) | The Foreigner | 7.2 / trailer
5 | (3) | American Assassin | 6.3 / trailer
6 | (…) | American Made | 7.2 / trailer
7 | (5) | Valerian and the City of a Thousand Planets | 6.7 / trailer
8 | (7) | Justice League (HDTS) | 7.2 / trailer
9 | (…) | Coco (HDTS) | 8.9 / trailer
10 | (9) | Thor Ragnarok (HDTS/Cam) | 8.2 / trailer

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

The Operations Team Just Got Rich-er!

Post Syndicated from Yev original https://www.backblaze.com/blog/operations-team-just-got-rich-er/

We’re growing at a pretty rapid clip, and as we add more customers, we need people to help keep all of our hard drives spinning. Along with support, the other department that grows linearly with the number of customers who join us is the operations team, and they’ve just added a new member to their team, Rich! He joins us as a Network Systems Administrator! Let’s take a moment to learn more about Rich, shall we?

What is your Backblaze Title?
Network Systems Administrator

Where are you originally from?
The Upper Peninsula of Michigan. Da UP, eh!

What attracted you to Backblaze?
The fact that it is a small tech company packed with highly intelligent people and a place where I can also be friends with my peers. I am also huge on cloud storage and backing up your past!

What do you expect to learn while being at Backblaze?
I look forward to expanding my Networking skills and System Administration skills while helping build the best Cloud Storage and Backup Company there is!

Where else have you worked?
I first started working in Data Centers at Viawest. I was previously an Infrastructure Engineer at Twitter and a Production Engineer at Groupon.

Where did you go to school?
I started at Finlandia University in Northern Michigan, carried on to Northwest Florida State, and graduated with my A.S. from North Lake College in Dallas, TX. I then completed my B.S. degree online at WGU.

What’s your dream job?
Sr. Network Engineer

Favorite place you’ve traveled?
I have traveled around a bit in my life. I really liked Dublin, Ireland, but I have to say my favorite has to be Puerto Vallarta, Mexico! Which is actually where I am getting married in 2019!

Favorite hobby?
Water is my life. I like to wakeboard and wakesurf. I also enjoy biking, hunting, fishing, camping, and anything that has to do with the great outdoors!

Of what achievement are you most proud?
I’m proud of moving up in my career as quickly as I have been. I am also very proud of being able to wakesurf behind a boat without a rope! Lol!

Star Trek or Star Wars?
Star Trek! I grew up on it!

Coke or Pepsi?
H2O 😀

Favorite food?
Mexican Food and Pizza!

Why do you like certain things?
Hmm…. because certain things make other certain things particularly certain!

Anything else you’d like to tell us?
Nope 😀

Who can say no to high quality H2O? Welcome to the team Rich!

The post The Operations Team Just Got Rich-er! appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Now Available: A New AWS Quick Start Reference Deployment for CJIS

Post Syndicated from Emil Lerch original https://aws.amazon.com/blogs/security/now-available-a-new-aws-quick-start-reference-deployment-for-cjis/

CJIS logo

As part of the AWS Compliance Quick Start program, AWS has published a new Quick Start reference deployment for customers who need to align with Criminal Justice Information Services (CJIS) Security Policy 5.6 and process Criminal Justice Information (CJI) in accordance with this policy. The new Quick Start is AWS Enterprise Accelerator – Compliance: CJIS, and it makes it easier for you to address the list of supported controls you will find in the security controls matrix that accompanies the Quick Start.

As all AWS Quick Starts do, this Quick Start helps you automate the building of a recommended architecture that, when deployed as a package, provides a baseline AWS configuration. The Quick Start uses sets of nested AWS CloudFormation templates and user data scripts to create an example environment with a two-VPC, multi-tiered web service.

The new Quick Start also includes:

The recommended architecture built by the Quick Start supports a wide variety of AWS best practices (all of which are detailed in the Quick Start), including the use of multiple Availability Zones, isolation using public and private subnets, load balancing, and Auto Scaling.

The Quick Start package also includes a deployment guide with detailed instructions and a security controls matrix that describes how the deployment addresses CJIS Security Policy 5.6 controls. You should have your IT security assessors and risk decision makers review the security controls matrix so that they can understand the extent of the implementation of the controls within the architecture. The matrix also identifies the specific resources in the CloudFormation templates that affect each control, and contains cross-references to the CJIS Security Policy 5.6 security controls.

If you have questions about this new Quick Start, contact the AWS Compliance Quick Start team. For more information about the AWS CJIS program, see CJIS Compliance.

– Emil

Dutch Film Distributor Wins Right To Chase Pirates, Store Data For 5 Years

Post Syndicated from Andy original https://torrentfreak.com/dutch-film-distributor-wins-right-to-chase-pirates-store-data-for-5-years-171208/

For many years, Dutch Internet users were allowed to download copyrighted content without reprisals, provided it was for their own personal use.

In 2014, however, the European Court of Justice ruled that the country’s “piracy levy” to compensate rightsholders was unlawful. Almost immediately, the government announced a downloading ban.

In March 2016, anti-piracy outfit BREIN followed up by obtaining permission from the Dutch Data Protection Authority to track and store the personal data of alleged BitTorrent pirates. This year, movie distributor Dutch FilmWorks (DFW) made a similar application.

The company said that it would be pursuing alleged pirates to deter future infringement but many suspected that securing cash settlements was its main aim. That was confirmed in August.

“[The letter to alleged pirates] will propose a fee. If someone does not agree [to pay], the organization can start a lawsuit,” said DFW CEO Willem Pruijsserts.

“In Germany, this costs between €800 and €1,000, although we find this a bit excessive. But of course it has to be a deterrent, so it will be more than a tenner or two,” he added.

But despite the grand plans, nothing would be possible without first obtaining the necessary permission from the Data Protection Authority. This Wednesday, however, that arrived.

“DFW has given sufficient guarantees for the proper and careful processing of personal data. This means that DFW has been given a green light from the Data Protection Authority to collect personal data, such as IP addresses, from people downloading from illegal sources,” the Authority announced.

Noting that it received feedback from four entities during the six-week consultation process following the publication of its draft decision during the summer, the Data Protection Authority said that further investigations were duly carried out. All input was considered before handing down the final decision.

The Authority said it was satisfied that personal data would be handled correctly and that the information collected and stored would be encrypted and hashed to ensure integrity. Furthermore, data will not be retained for longer than is necessary.

“DFW has stated…that data from users with Dutch IP addresses who were involved in the exchange of a title owned by DFW, but in respect of which there is no intention to follow up on that within three months after receipt, will be destroyed,” the decision reads.

For any cases that are active and haven’t been discarded in the initial three-month period, DFW will be allowed to hold alleged pirates’ data for a maximum of five years, a period that matches the time a company has to file a claim under the Dutch Civil Code.

“When DFW does follow up on a file, DFW carries out further research into the identity of the users of the IP addresses. For this, it is necessary to contact the Internet service providers of the subscribers who used the IP addresses found in the BitTorrent network,” the Authority notes.

According to the decision, once DFW has a person’s details it can take any of several actions, starting with a simple warning or moving up to an amicable cash settlement. Failing that, it might choose to file a full-on court case in which the distributor seeks an injunction against the alleged pirate plus compensation and costs.

Only time will tell what strategy DFW will deploy against alleged pirates but since these schemes aren’t cheap to run, it’s likely that simple warning letters will be seriously outnumbered by demands for cash settlement.

While it seems unlikely that the Data Protection Authority will change its mind at this late stage, its decision remains open to appeal. Interested parties have just under six weeks to make their voices heard. Failing that, copyright trolling will hit the Netherlands in the weeks and months to come.

The full decision can be found here (Dutch, pdf) via Tweakers

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

timeShift(GrafanaBuzz, 1w) Issue 25

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2017/12/08/timeshiftgrafanabuzz-1w-issue-25/

Welcome to TimeShift

This week, a few of us from Grafana Labs, along with 4,000 of our closest friends, headed down to chilly Austin, TX for KubeCon + CloudNativeCon North America 2017. We got to see a number of great talks and were thrilled to see Grafana make appearances in some of the presentations. We were also a sponsor of the conference and handed out a ton of swag (we overnighted some of our custom Grafana scarves, which came in handy for Thursday’s snow).

We also announced Grafana Labs has joined the Cloud Native Computing Foundation as a Silver member! We’re excited to share our expertise in time series data visualization and open source software with the CNCF community.


Latest Release

Grafana 4.6.2 is available and includes some bug fixes:

  • Prometheus: Fixes bug with new Prometheus alerts in Grafana. Make sure to download this version if you’re using Prometheus for alerting. More details in the issue. #9777
  • Color picker: Bug after using textbox input field to change/paste color string #9769
  • Cloudwatch: build using golang 1.9.2 #9667, thanks @mtanda
  • Heatmap: Fixed tooltip for “time series buckets” mode #9332
  • InfluxDB: Fixed query editor issue when using > or < operators in WHERE clause #9871

Download Grafana 4.6.2 Now


From the Blogosphere

Grafana Labs Joins the CNCF: Grafana Labs has officially joined the Cloud Native Computing Foundation (CNCF). We look forward to working with the CNCF community to democratize metrics and help unify traditionally disparate information.

Automating Web Performance Regression Alerts: Peter and his team needed a faster and easier way to find web performance regressions at the Wikimedia Foundation. Grafana 4’s alerting features were exactly what they needed. This post covers their journey on setting up alerts for both RUM and synthetic testing and shares the alerts they’ve set up on their dashboards.

How To Install Grafana on Ubuntu 17.10: As you probably guessed from the title, this article walks you through installing and configuring Grafana in the latest version of Ubuntu (or earlier releases). It also covers installing plugins using the Grafana CLI tool.

Prometheus: Starting the Server with Alertmanager, cAdvisor and Grafana: Learn how to monitor Docker from scratch using cAdvisor, Prometheus and Grafana in this detailed, step-by-step walkthrough.

Monitoring Java EE Servers with Prometheus and Payara: In this screencast, Adam uses firehose; a Java EE 7+ metrics gateway for Prometheus, to convert the JSON output into Prometheus statistics and visualizes the data in Grafana.

Monitoring Spark Streaming with InfluxDB and Grafana: This article focuses on how to monitor Apache Spark Streaming applications with InfluxDB and Grafana at scale.


GrafanaCon EU, March 1-2, 2018

We are currently reaching out to everyone who submitted a talk to GrafanaCon and will soon publish the final schedule at grafanacon.org.

Join us March 1-2, 2018 in Amsterdam for 2 days of talks centered around Grafana and the surrounding monitoring ecosystem including Graphite, Prometheus, InfluxData, Elasticsearch, Kubernetes, and more.

Get Your Ticket Now


Grafana Plugins

Lots of plugin updates and a new OpenNMS Helm App plugin to announce! To install or update any plugin in an on-prem Grafana instance, use the Grafana-cli tool, or install and update with 1 click on Hosted Grafana.

NEW PLUGIN

OpenNMS Helm App – The new OpenNMS Helm App plugin replaces the old OpenNMS data source. Helm allows users to create flexible dashboards using both fault management (FM) and performance management (PM) data from OpenNMS® Horizon™ and/or OpenNMS® Meridian™. The old data source is now deprecated.


Install Now

UPDATED PLUGIN

PNP Data Source – This data source plugin (that uses PNP4Nagios to access RRD files) received a small, but important update that fixes template query parsing.


Update

UPDATED PLUGIN

Vonage Status Panel – The latest version of the Status Panel comes with a number of small fixes and changes. Below are a few of the enhancements:

  • Threshold settings – removed Show Always option, and replaced it with 2 options:
    • Display Alias – Select when to show the metric alias.
    • Display Value – Select when to show the metric value.
  • Text format configuration (bold / italic) for warning / critical / disabled states.
  • Option to change the corner radius of the panel. Now you can change the panel’s shape to have rounded corners.

Update

UPDATED PLUGIN

Google Calendar Plugin – This plugin received a small update, so be sure to install version 1.0.4.


Update

UPDATED PLUGIN

Carpet Plot Panel – The Carpet Plot Panel received a fix for IE 11, and also added the ability to choose custom colors.


Update


Upcoming Events:

In between code pushes we like to speak at, sponsor and attend all kinds of conferences and meetups. We also like to make sure we mention other Grafana-related events happening all over the world. If you’re putting on just such an event, let us know and we’ll list it here.

Docker Meetup @ Tuenti | Madrid, Spain – Dec 12, 2017: Javier Provecho: Intro to Metrics with Swarm, Prometheus and Grafana

Learn how to gain visibility in real time for your micro services. We’ll cover how to deploy a Prometheus server with persistence and Grafana, how to enable metrics endpoints for various service types (docker daemon, traefik proxy and postgres) and how to scrape, visualize and set up alarms based on those metrics.

RSVP

Grafana Lyon Meetup n°2 | Lyon, France – Dec 14, 2017: This meetup will cover some of the latest innovations in Grafana and a discussion about automation. Also, free beer and chips, so – of course you’re going!

RSVP

FOSDEM | Brussels, Belgium – Feb 3-4, 2018: FOSDEM is a free developer conference where thousands of developers of free and open source software gather to share ideas and technology. Carl Bergquist is managing the Cloud and Monitoring Devroom, and we’ve heard there were some great talks submitted. There is no need to register; all are welcome.


Tweet of the Week

We scour Twitter each week to find an interesting/beautiful dashboard and show it off! #monitoringLove

We were thrilled to see our dashboards bigger than life at KubeCon + CloudNativeCon this week. Thanks for snapping a photo and sharing!


Grafana Labs is Hiring!

We are passionate about open source software and thrive on tackling complex challenges to build the future. We ship code from every corner of the globe and love working with the community. If this sounds exciting, you’re in luck – WE’RE HIRING!

Check out our Open Positions


How are we doing?

Hard to believe this is the 25th issue of timeShift! I have a blast writing these roundups, but let me know what you think. Submit a comment on this article below, or post something at our community forum. Found an article I haven’t included? Send it my way. Help us make timeShift better!

Follow us on Twitter, like us on Facebook, and join the Grafana Labs community.

Looking Forward to 2018

Post Syndicated from Let's Encrypt - Free SSL/TLS Certificates original https://letsencrypt.org//2017/12/07/looking-forward-to-2018.html

Let’s Encrypt had a great year in 2017. We more than doubled the number of active (unexpired) certificates we service to 46 million, we just about tripled the number of unique domains we service to 61 million, and we did it all while maintaining a stellar security and compliance track record. Most importantly though, the Web went from 46% encrypted page loads to 67%, according to statistics from Mozilla – a gain of 21 percentage points in a single year – incredible. We’re proud to have contributed to that, and we’d like to thank all of the other people and organizations who also worked hard to create a more secure and privacy-respecting Web.

While we’re proud of what we accomplished in 2017, we are spending most of the final quarter of the year looking forward rather than back. As we wrap up our own planning process for 2018, I’d like to share some of our plans with you, including both the things we’re excited about and the challenges we’ll face. We’ll cover service growth, new features, infrastructure, and finances.

Service Growth

We are planning to double the number of active certificates and unique domains we service in 2018, to 90 million and 120 million, respectively. This anticipated growth is due to continuing high expectations for HTTPS growth in general in 2018.

Let’s Encrypt helps to drive HTTPS adoption by offering a free, easy to use, and globally available option for obtaining the certificates required to enable HTTPS. HTTPS adoption on the Web took off at an unprecedented rate from the day Let’s Encrypt launched to the public.

One of the reasons Let’s Encrypt is so easy to use is that our community has done great work making client software that works well for a wide variety of platforms. We’d like to thank everyone involved in the development of over 60 client software options for Let’s Encrypt. We’re particularly excited that support for the ACME protocol and Let’s Encrypt is being added to the Apache httpd server.

Other organizations and communities are also doing great work to promote HTTPS adoption, and thus stimulate demand for our services. For example, browsers are starting to make their users more aware of the risks associated with unencrypted HTTP (e.g. Firefox, Chrome). Many hosting providers and CDNs are making it easier than ever for all of their customers to use HTTPS. Government agencies are waking up to the need for stronger security to protect constituents. The media community is working to Secure the News.

New Features

We’ve got some exciting features planned for 2018.

First, we’re planning to introduce an ACME v2 protocol API endpoint and support for wildcard certificates along with it. Wildcard certificates will be free and available globally just like our other certificates. We are planning to have a public test API endpoint up by January 4, and we’ve set a date for the full launch: Tuesday, February 27.

Later in 2018 we plan to introduce ECDSA root and intermediate certificates. ECDSA is generally considered to be the future of digital signature algorithms on the Web due to the fact that it is more efficient than RSA. Let’s Encrypt will currently sign ECDSA keys from subscribers, but we sign with the RSA key from one of our intermediate certificates. Once we have an ECDSA root and intermediates, our subscribers will be able to deploy certificate chains which are entirely ECDSA.

Infrastructure

Our CA infrastructure is capable of issuing millions of certificates per day with multiple redundancy for stability and a wide variety of security safeguards, both physical and logical. Our infrastructure also generates and signs nearly 20 million OCSP responses daily, and serves those responses nearly 2 billion times per day. We expect issuance and OCSP numbers to double in 2018.

Our physical CA infrastructure currently occupies approximately 70 units of rack space, split between two datacenters, consisting primarily of compute servers, storage, HSMs, switches, and firewalls.

When we issue more certificates it puts the most stress on storage for our databases. We regularly invest in more and faster storage for our database servers, and that will continue in 2018.

We’ll need to add a few additional compute servers in 2018, and we’ll also start aging out hardware in 2018 for the first time since we launched. We’ll age out about ten 2u compute servers and replace them with new 1u servers, which will save space and be more energy efficient while providing better reliability and performance.

We’ll also add another infrastructure operations staff member, bringing that team to a total of six people. This is necessary in order to make sure we can keep up with demand while maintaining a high standard for security and compliance. Infrastructure operations staff are systems administrators responsible for building and maintaining all physical and logical CA infrastructure. The team also manages a 24/7/365 on-call schedule and they are primary participants in both security and compliance audits.

Finances

We pride ourselves on being an efficient organization. In 2018 Let’s Encrypt will secure a large portion of the Web with a budget of only $3.0M. For an overall increase in our budget of only 13%, we will be able to issue and service twice as many certificates as we did in 2017. We believe this represents an incredible value and that contributing to Let’s Encrypt is one of the most effective ways to help create a more secure and privacy-respecting Web.

Our 2018 fundraising efforts are off to a strong start with Platinum sponsorships from Mozilla, Akamai, OVH, Cisco, Google Chrome and the Electronic Frontier Foundation. The Ford Foundation has renewed their grant to Let’s Encrypt as well. We are seeking additional sponsorship and grant assistance to meet our full needs for 2018.

We had originally budgeted $2.91M for 2017 but we’ll likely come in under budget for the year at around $2.65M. The difference between our 2017 expenses of $2.65M and the 2018 budget of $3.0M consists primarily of the additional infrastructure operations costs previously mentioned.

Support Let’s Encrypt

We depend on contributions from our community of users and supporters in order to provide our services. If your company or organization would like to sponsor Let’s Encrypt please email us at [email protected]. We ask that you make an individual contribution if it is within your means.

We’re grateful for the industry and community support that we receive, and we look forward to continuing to create a more secure and privacy-respecting Web!

How to Easily Apply Amazon Cloud Directory Schema Changes with In-Place Schema Upgrades

Post Syndicated from Mahendra Chheda original https://aws.amazon.com/blogs/security/how-to-easily-apply-amazon-cloud-directory-schema-changes-with-in-place-schema-upgrades/

Now, Amazon Cloud Directory makes it easier for you to apply schema changes across your directories with in-place schema upgrades. Your directory now remains available while Cloud Directory applies backward-compatible schema changes such as the addition of new fields. Without migrating data between directories or applying code changes to your applications, you can upgrade your schemas. You also can view the history of your schema changes in Cloud Directory by using version identifiers, which help you track and audit schema versions across directories. If you have multiple instances of a directory with the same schema, you can view the version history of schema changes to manage your directory fleet and ensure that all directories are running with the same schema version.

In this blog post, I demonstrate how to perform an in-place schema upgrade and use schema versions in Cloud Directory. I add additional attributes to an existing facet and add a new facet to a schema. I then publish the new schema and apply it to running directories, upgrading the schema in place. I also show how to view the version history of a directory schema, which helps me to ensure my directory fleet is running the same version of the schema and has the correct history of schema changes applied to it.

Note: I share Java code examples in this post. I assume that you are familiar with the AWS SDK and can use Java-based code to build a Cloud Directory code example. You can apply the concepts I cover in this post to other programming languages such as Python and Ruby.

Cloud Directory fundamentals

I will start by covering a few Cloud Directory fundamentals. If you are already familiar with the concepts behind Cloud Directory facets, schemas, and schema lifecycles, you can skip to the next section.

Facets: Groups of attributes. You use facets to define object types. For example, you can define a device schema by adding facets such as computers, phones, and tablets. A computer facet can track attributes such as serial number, make, and model. You can then use the facets to create computer objects, phone objects, and tablet objects in the directory to which the schema applies.

Schemas: Collections of facets. Schemas define which types of objects can be created in a directory (such as users, devices, and organizations) and enforce validation of data for each object class. All data within a directory must conform to the applied schema. As a result, the schema definition is essentially a blueprint to construct a directory with an applied schema.

Schema lifecycle: The four distinct states of a schema: Development, Published, Applied, and Deleted. Schemas in the Published and Applied states have version identifiers and cannot be changed. Schemas in the Applied state are used by directories for validation as applications insert or update data. You can change schemas in the Development state as many times as you need them to. In-place schema upgrades allow you to apply schema changes to an existing Applied schema in a production directory without the need to export and import the data populated in the directory.

How to add attributes to a computer inventory application schema and perform an in-place schema upgrade

To demonstrate how to set up schema versioning and perform an in-place schema upgrade, I will use an example of a computer inventory application that uses Cloud Directory to store relationship data. Let’s say that at my company, AnyCompany, we use this computer inventory application to track all computers we give to our employees for work use. I previously created a ComputerSchema and assigned its version identifier as 1. This schema contains one facet called ComputerInfo that includes attributes for SerialNumber, Make, and Model, as shown in the following schema details.

Schema: ComputerSchema
Version: 1

Facet: ComputerInfo
Attribute: SerialNumber, type: Integer
Attribute: Make, type: String
Attribute: Model, type: String

AnyCompany has offices in Seattle, Portland, and San Francisco. I have deployed the computer inventory application for each of these three locations. As shown in the lower left part of the following diagram, ComputerSchema is in the Published state with a version of 1. The Published schema is applied to SeattleDirectory, PortlandDirectory, and SanFranciscoDirectory for AnyCompany’s three locations. Implementing separate directories for different geographic locations when you don’t have any queries that cross location boundaries is a good data partitioning strategy and gives your application better response times with lower latency.

Diagram of ComputerSchema in Published state and applied to three directories

Legend for the diagrams in this post

The following code example creates the schema in the Development state by using a JSON file, publishes the schema, and then creates directories for the Seattle, Portland, and San Francisco locations. For this example, I assume the schema has been defined in the JSON file. The createSchema API creates a schema Amazon Resource Name (ARN) with the name defined in the variable, SCHEMA_NAME. I can use the putSchemaFromJson API to add specific schema definitions from the JSON file.

// The utility method to get valid Cloud Directory schema JSON
String validJson = getJsonFile("ComputerSchema_version_1.json");

String SCHEMA_NAME = "ComputerSchema";

String developmentSchemaArn = client.createSchema(new CreateSchemaRequest()
        .withName(SCHEMA_NAME))
        .getSchemaArn();

// Put the schema document in the Development schema
PutSchemaFromJsonResult result = client.putSchemaFromJson(new PutSchemaFromJsonRequest()
        .withSchemaArn(developmentSchemaArn)
        .withDocument(validJson));

The following code example takes the schema that is currently in the Development state and publishes the schema, changing its state to Published.

String SCHEMA_VERSION = "1";
String publishedSchemaArn = client.publishSchema(
        new PublishSchemaRequest()
        .withDevelopmentSchemaArn(developmentSchemaArn)
        .withVersion(SCHEMA_VERSION))
        .getPublishedSchemaArn();

// Our Published schema ARN is as follows
// arn:aws:clouddirectory:us-west-2:XXXXXXXXXXXX:schema/published/ComputerSchema/1

The following code example creates a directory named SeattleDirectory and applies the published schema. The createDirectory API call creates a directory by using the published schema provided in the API parameters. Note that Cloud Directory stores a version of the schema in the directory in the Applied state. I will use similar code to create directories for PortlandDirectory and SanFranciscoDirectory.

String DIRECTORY_NAME = "SeattleDirectory"; 

CreateDirectoryResult directory = client.createDirectory(
        new CreateDirectoryRequest()
        .withName(DIRECTORY_NAME)
        .withSchemaArn(publishedSchemaArn));

String directoryArn = directory.getDirectoryArn();
String appliedSchemaArn = directory.getAppliedSchemaArn();

// This code section can be reused to create directories for Portland and San Francisco locations with the appropriate directory names

// Our directory ARN is as follows 
// arn:aws:clouddirectory:us-west-2:XXXXXXXXXXXX:directory/XX_DIRECTORY_GUID_XX

// Our applied schema ARN is as follows 
// arn:aws:clouddirectory:us-west-2:XXXXXXXXXXXX:directory/XX_DIRECTORY_GUID_XX/schema/ComputerSchema/1

Revising a schema

Now let’s say my company, AnyCompany, wants to add more information for computers and to track which employees have been assigned a computer for work use. I modify the schema to add two attributes to the ComputerInfo facet: Description and OSVersion (operating system version). I make Description optional because it is not important for me to track this attribute for the computer objects I create. I make OSVersion mandatory because it is critical for me to track it for all computer objects so that I can make changes such as applying security patches or making upgrades. Because I make OSVersion mandatory, I must provide a default value that Cloud Directory will apply to objects that were created before the schema revision, in order to handle backward compatibility. Note that you can replace the value in any object with a different value.

I also add a new facet to track computer assignment information, shown in the following updated schema as the ComputerAssignment facet. This facet tracks these additional attributes: Name (the name of the person to whom the computer is assigned), EMail (the email address of the assignee), Department, and department CostCenter. Note that Cloud Directory refers to the previously available version identifier as the Major Version. Because I can now add a minor version to a schema, I also denote the changed schema as Minor Version A.

Schema: ComputerSchema
Major Version: 1
Minor Version: A 

Facet: ComputerInfo
Attribute: SerialNumber, type: Integer 
Attribute: Make, type: String
Attribute: Model, type: String
Attribute: Description, type: String, required: NOT_REQUIRED
Attribute: OSVersion, type: String, required: REQUIRED_ALWAYS, default: "Windows 7"

Facet: ComputerAssignment
Attribute: Name, type: String
Attribute: EMail, type: String
Attribute: Department, type: String
Attribute: CostCenter, type: Integer

The following diagram shows the changes that were made when I added another facet to the schema and attributes to the existing facet. The highlighted area of the diagram (bottom left) shows that the schema changes were published.

Diagram showing that schema changes were published

The following code example revises the existing Development schema by adding the new attributes to the ComputerInfo facet and by adding the ComputerAssignment facet. I use a new JSON file for the schema revision, and for the purposes of this example, I am assuming the JSON file has the full schema including planned revisions.

// The utility method to get a valid CloudDirectory schema JSON
String schemaJson = getJsonFile("ComputerSchema_version_1_A.json");

// Put the schema document in the Development schema
PutSchemaFromJsonResult result = client.putSchemaFromJson(
        new PutSchemaFromJsonRequest()
        .withSchemaArn(developmentSchemaArn)
        .withDocument(schemaJson));

Upgrading the Published schema

The following code example performs an in-place schema upgrade of the Published schema with schema revisions (it adds new attributes to the existing facet and another facet to the schema). The upgradePublishedSchema API upgrades the Published schema with backward-compatible changes from the Development schema.

// From an earlier code example, I know the publishedSchemaArn has this value: "arn:aws:clouddirectory:us-west-2:XXXXXXXXXXXX:schema/published/ComputerSchema/1"

// Upgrade publishedSchemaArn to minorVersion A. The Development schema must be backward compatible with 
// the existing publishedSchemaArn. 

String minorVersion = "A";

UpgradePublishedSchemaResult upgradePublishedSchemaResult = client.upgradePublishedSchema(new UpgradePublishedSchemaRequest()
        .withDevelopmentSchemaArn(developmentSchemaArn)
        .withPublishedSchemaArn(publishedSchemaArn)
        .withMinorVersion(minorVersion));

String upgradedPublishedSchemaArn = upgradePublishedSchemaResult.getUpgradedSchemaArn();

// The Published schema ARN after the upgrade shows a minor version as follows 
// arn:aws:clouddirectory:us-west-2:XXXXXXXXXXXX:schema/published/ComputerSchema/1/A

Upgrading the Applied schema

The following diagram shows the in-place schema upgrade for the SeattleDirectory directory. I am performing the schema upgrade so that I can reflect the new schemas in all three directories. As a reminder, I added new attributes to the ComputerInfo facet and also added the ComputerAssignment facet. After the schema and directory upgrade, I can create objects for the ComputerInfo and ComputerAssignment facets in the SeattleDirectory. Any objects that were created with the old facet definition for ComputerInfo will now use the default values for any additional attributes defined in the new schema.

Diagram of the in-place schema upgrade for the SeattleDirectory directory

I use the following code example to perform an in-place upgrade of the SeattleDirectory to a Major Version of 1 and a Minor Version of A. Note that you should change a Major Version identifier in a schema to make backward-incompatible changes such as changing the data type of an existing attribute or dropping a mandatory attribute from your schema. Backward-incompatible changes require directory data migration from a previous version to the new version. You should change a Minor Version identifier in a schema to make backward-compatible upgrades such as adding additional attributes or adding facets, which in turn may contain one or more attributes. The upgradeAppliedSchema API lets me upgrade an existing directory with a different version of a schema.

// This upgrades ComputerSchema version 1 of the Applied schema in SeattleDirectory to Major Version 1 and Minor Version A
// The schema must be backward compatible or the API will fail with IncompatibleSchemaException

UpgradeAppliedSchemaResult upgradeAppliedSchemaResult = client.upgradeAppliedSchema(new UpgradeAppliedSchemaRequest()
        .withDirectoryArn(directoryArn)
        .withPublishedSchemaArn(upgradedPublishedSchemaArn));

String upgradedAppliedSchemaArn = upgradeAppliedSchemaResult.getUpgradedSchemaArn();

// The Applied schema ARN after the in-place schema upgrade will appear as follows
// arn:aws:clouddirectory:us-west-2:XXXXXXXXXXXX:directory/XX_DIRECTORY_GUID_XX/schema/ComputerSchema/1

// This code section can be reused to upgrade directories for the Portland and San Francisco locations with the appropriate directory ARN

Note: Cloud Directory has excluded returning the Minor Version identifier in the Applied schema ARN for backward compatibility and to enable the application to work across older and newer versions of the directory.

The following diagram shows the changes that are made when I perform an in-place schema upgrade in the two remaining directories, PortlandDirectory and SanFranciscoDirectory. I make these calls sequentially, upgrading PortlandDirectory first and then upgrading SanFranciscoDirectory. I use the same code example that I used earlier to upgrade SeattleDirectory. Now, all my directories are running the most current version of the schema. Also, I made these schema changes without having to migrate data and while maintaining my application’s high availability.

Diagram showing the changes that are made with an in-place schema upgrade in the two remaining directories

Schema revision history

I can now view the schema revision history for any of AnyCompany’s directories by using the listAppliedSchemaArns API. Cloud Directory maintains the five most recent versions of applied schema changes. Similarly, to inspect the current Minor Version that was applied to my schema, I use the getAppliedSchemaVersion API. The listAppliedSchemaArns API returns the schema ARNs based on my schema filter as defined in withSchemaArn.

I use the following code example to query an Applied schema for its version history.

// This returns the five most recent Minor Versions associated with a Major Version
ListAppliedSchemaArnsResult listAppliedSchemaArnsResult = client.listAppliedSchemaArns(new ListAppliedSchemaArnsRequest()
        .withDirectoryArn(directoryArn)
        .withSchemaArn(upgradedAppliedSchemaArn));

// Note: The listAppliedSchemaArns API without the SchemaArn filter returns all the Major Versions in a directory

The listAppliedSchemaArns API returns the two ARNs as shown in the following output.

arn:aws:clouddirectory:us-west-2:XXXXXXXXXXXX:directory/XX_DIRECTORY_GUID_XX/schema/ComputerSchema/1
arn:aws:clouddirectory:us-west-2:XXXXXXXXXXXX:directory/XX_DIRECTORY_GUID_XX/schema/ComputerSchema/1/A

The following code example queries an Applied schema for current Minor Version by using the getAppliedSchemaVersion API.

// This returns the current Applied schema's Minor Version ARN 

GetAppliedSchemaVersionResult getAppliedSchemaVersionResult = client.getAppliedSchemaVersion(new GetAppliedSchemaVersionRequest()
        .withSchemaArn(upgradedAppliedSchemaArn));

The getAppliedSchemaVersion API returns the current Applied schema ARN with a Minor Version, as shown in the following output.

arn:aws:clouddirectory:us-west-2:XXXXXXXXXXXX:directory/XX_DIRECTORY_GUID_XX/schema/ComputerSchema/1/A

If you have a lot of directories, these schema revision API calls help you audit your directory fleet and confirm that all directories are running the same version of a schema, which keeps your data definitions consistent across the fleet.
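A minimal sketch of such an audit with the AWS SDK for Java might look like the following (pagination and error handling are omitted, and the client setup and output format are illustrative only):

import com.amazonaws.services.clouddirectory.AmazonCloudDirectory;
import com.amazonaws.services.clouddirectory.AmazonCloudDirectoryClientBuilder;
import com.amazonaws.services.clouddirectory.model.Directory;
import com.amazonaws.services.clouddirectory.model.ListAppliedSchemaArnsRequest;
import com.amazonaws.services.clouddirectory.model.ListAppliedSchemaArnsResult;
import com.amazonaws.services.clouddirectory.model.ListDirectoriesRequest;
import com.amazonaws.services.clouddirectory.model.ListDirectoriesResult;

public class SchemaAudit {
    public static void main(String[] args) {
        AmazonCloudDirectory client = AmazonCloudDirectoryClientBuilder.defaultClient();

        // List every directory in the account (a real audit would also follow NextToken)
        ListDirectoriesResult directories = client.listDirectories(new ListDirectoriesRequest());

        for (Directory directory : directories.getDirectories()) {
            // List the schema ARNs applied to this directory and print them for comparison
            ListAppliedSchemaArnsResult schemas = client.listAppliedSchemaArns(
                    new ListAppliedSchemaArnsRequest().withDirectoryArn(directory.getDirectoryArn()));
            System.out.println(directory.getName() + " -> " + schemas.getSchemaArns());
        }
    }
}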

Summary

You can use in-place schema upgrades to make changes to your directory schema as you evolve your data set to match the needs of your application. An in-place schema upgrade allows you to maintain high availability for your directory and applications while the upgrade takes place. For more information about in-place schema upgrades, see the in-place schema upgrade documentation.

If you have comments about this blog post, submit them in the “Comments” section below. If you have questions about implementing the solution in this post, start a new thread in the Directory Service forum or contact AWS Support.

– Mahendra

 

[$] Mozilla releases tools and data for speech recognition

Post Syndicated from jake original https://lwn.net/Articles/740768/rss

Voice computing has long been a staple of science fiction, but it has only relatively recently made its way into fairly common mainstream use. Gadgets like mobile phones and “smart” home assistant devices (e.g. Amazon Echo, Google Home) have brought voice-based user interfaces to the masses. The voice processing for those gadgets relies on various proprietary services “in the cloud”, which generally leaves the free-software world out in the cold. There have been FOSS speech-recognition efforts over the years, but Mozilla’s recent announcement of the release of its voice-recognition code and voice data set should help further the goal of FOSS voice interfaces.

Running Windows Containers on Amazon ECS

Post Syndicated from Nathan Taber original https://aws.amazon.com/blogs/compute/running-windows-containers-on-amazon-ecs/

This post was developed and written by Jeremy Cowan, Thomas Fuller, Samuel Karp, and Akram Chetibi.

Containers have revolutionized the way that developers build, package, deploy, and run applications. Initially, containers only supported code and tooling for Linux applications. With the release of Docker Engine for Windows Server 2016, Windows developers have started to realize the gains that their Linux counterparts have experienced for the last several years.

This week, we’re adding support for running production workloads in Windows containers using Amazon Elastic Container Service (Amazon ECS). Amazon ECS now provides an ECS-Optimized Windows Server Amazon Machine Image (AMI). This AMI is based on the EC2 Windows Server 2016 Datacenter AMI and includes Docker 17.06.2-ee-5 (Enterprise Edition) and version 1.16 of the ECS agent, which now runs as a native Windows service. It also provides improved instance and container launch time performance.

In this post, I discuss the benefits of this new support, and walk you through getting started running Windows containers with Amazon ECS.

When AWS released the Windows Server 2016 Base with Containers AMI, the ECS agent ran as a process, which made it difficult to monitor and manage. As a service, the agent can be health-checked, managed, and restarted no differently than other Windows services. The AMI also includes pre-cached images for Windows Server Core 2016 and Windows Server Nano Server 2016. Caching these images in the AMI makes launching new Windows containers significantly faster: when a Docker image includes a layer that’s already cached on the instance, Docker re-uses that layer instead of pulling it from the Docker registry.

The ECS agent and an accompanying ECS PowerShell module used to install, configure, and run the agent come pre-installed on the AMI. This guarantees there is a specific platform version available on the container instance at launch. Because the software is included, you don’t have to download it from the internet. This saves startup time.

The Windows-compatible ECS-optimized AMI also reports CPU and memory utilization and reservation metrics to Amazon CloudWatch. Using the CloudWatch integration with ECS, you can create alarms that trigger dynamic scaling events to automatically add or remove capacity to your EC2 instances and ECS tasks.

Getting started

To help you get started running Windows containers on ECS, I’ve forked the ECS reference architecture, to build an ECS cluster comprised of Windows instances instead of Linux instances. You can pull the latest version of the reference architecture for Windows.

The reference architecture is a layered CloudFormation stack, in that it calls other stacks to create the environment. Within the stack, the ecs-windows-cluster.yaml file contains the instructions for bootstrapping the Windows instances and configuring the ECS cluster. To configure the instances outside of AWS CloudFormation (for example, through the CLI or the console), you can add the following commands to your instance’s user data:

Import-Module ECSTools
Initialize-ECSAgent

Or

Import-Module ECSTools
Initialize-ECSAgent -Cluster MyCluster -EnableIAMTaskRole

If you don’t specify a cluster name when you initialize the agent, the instance is joined to the default cluster.

Adding -EnableIAMTaskRole when initializing the agent adds support for IAM roles for tasks. Previously, enabling this setting meant running a complex script and setting an environment variable before you could assign roles to your ECS tasks.

When you enable IAM roles for tasks on Windows, it consumes port 80 on the host. If you have tasks that listen on port 80 on the host, I recommend configuring a service for them that uses load balancing. You can use port 80 on the load balancer, and the traffic can be routed to another host port on your container instances. For more information, see Service Load Balancing.

Create a cluster

To create a new ECS cluster, choose Launch stack, or pull the GitHub project to your local machine and run the following command:

aws cloudformation create-stack --template-body file://<path to master-windows.yaml> --stack-name <name>

Upload your container image

Now that you have a cluster running, step through how to build and push an image into a container repository. You use a repository hosted in Amazon Elastic Container Registry (Amazon ECR) for this, but you could also use Docker Hub. To build and push an image to a repository, install Docker on your Windows* workstation. You also create a repository and assign the necessary permissions to the account that pushes your image to Amazon ECR. For detailed instructions, see Pushing an Image.

* If you are building an image that is based on Windows layers, then you must use a Windows environment to build and push your image to the registry.

Write your task definition

Now that your image is built and ready, the next step is to run your Windows containers using a task.

Start by creating a new task definition based on the windows-simple-iis image from Docker Hub.

  1. Open the ECS console.
  2. Choose Task Definitions, Create new task definition.
  3. Scroll to the bottom of the page and choose Configure via JSON.
  4. Copy and paste the following JSON into that field.
  5. Choose Save, Create.
{
   "family": "windows-simple-iis",
   "containerDefinitions": [
   {
     "name": "windows_sample_app",
     "image": "microsoft/iis",
     "cpu": 100,
     "entryPoint":["powershell", "-Command"],
     "command":["New-Item -Path C:\\inetpub\\wwwroot\\index.html -Type file -Value '<html><head><title>Amazon ECS Sample App</title> <style>body {margin-top: 40px; background-color: #333;} </style> </head><body> <div style=color:white;text-align:center><h1>Amazon ECS Sample App</h1> <h2>Congratulations!</h2> <p>Your application is now running on a container in Amazon ECS.</p></body></html>'; C:\\ServiceMonitor.exe w3svc"],
     "portMappings": [
     {
       "protocol": "tcp",
       "containerPort": 80,
       "hostPort": 8080
     }
     ],
     "memory": 500,
     "essential": true
   }
   ]
}

You can now go back into the Task Definition page and see windows-simple-iis as an available task definition.

There are a few important aspects of the task definition file to note when working with Windows containers. First, the hostPort is configured as 8080 because the ECS agent currently reserves port 80 on the host to enable IAM roles for tasks, which are required for least-privilege security configurations.

There are also some fairly standard task parameters that are intentionally not included. For example, network mode is not available with Windows at the time of this release, so keep that setting blank to allow Docker to configure WinNAT, the only option available today.

Also, some parameters work differently with Windows than they do with Linux. The CPU limits that you define in the task definition are absolute, whereas on Linux they are weights. For information about other task parameters that are supported or possibly different with Windows, see the documentation.

Run your containers

At this point, you are ready to run containers. There are two options to run containers with ECS:

  1. Task
  2. Service

A task is typically a short-lived process that ECS creates; ECS does not actively monitor or scale it. A service is meant for longer-running containers and can be configured to use a load balancer, minimum/maximum capacity settings, and a number of other knobs and switches to help ensure that your code keeps running. In both cases, you are able to pick a placement strategy and a specific IAM role for your container.

  1. Select the task definition that you created above and choose Action, Run Task.
  2. Leave the settings on the next page to the default values.
  3. Select the ECS cluster created when you ran the CloudFormation template.
  4. Choose Run Task to start the process of scheduling a Docker container on your ECS cluster.
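The console steps above can also be scripted. The following is a minimal sketch using the AWS SDK for Java; the cluster name is a placeholder for the cluster created by the CloudFormation template.

import com.amazonaws.services.ecs.AmazonECS;
import com.amazonaws.services.ecs.AmazonECSClientBuilder;
import com.amazonaws.services.ecs.model.RunTaskRequest;
import com.amazonaws.services.ecs.model.RunTaskResult;

public class RunWindowsTask {
    public static void main(String[] args) {
        AmazonECS ecs = AmazonECSClientBuilder.defaultClient();

        // Schedule one copy of the windows-simple-iis task definition on the cluster
        // ("windows-cluster" is a placeholder for your cluster name)
        RunTaskResult result = ecs.runTask(new RunTaskRequest()
                .withCluster("windows-cluster")
                .withTaskDefinition("windows-simple-iis")
                .withCount(1));

        result.getTasks().forEach(task ->
                System.out.println(task.getTaskArn() + " " + task.getLastStatus()));
    }
}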

You can now go to the cluster and watch the status of your task. It may take 5–10 minutes for the task to go from PENDING to RUNNING, mostly because it takes time to download all of the layers necessary to run the microsoft/iis image. After the status is RUNNING, you should see the following results:

You may have noticed that the example task definition is named windows-simple-iis:2. This is because I created a second version of the task definition, which is one of the powerful capabilities of using ECS. You can make the task definitions part of your source code and then version them. You can also roll out new versions and practice blue/green deployments, switching between versions to reduce downtime and improve the velocity of your deployments!

After the task has moved to RUNNING, you can see your website hosted in ECS. Find the public IP or DNS for your ECS host. Remember that you are hosting on port 8080. Make sure that the security group allows ingress from your client IP address to that port and that your VPC has an internet gateway associated with it. You should see a page that looks like the following:

This is a nice start to deploying a simple single-instance task, but what if you had a Web API that needed to scale out and in based on usage? This is where you could define a service and use CloudWatch data to add and remove instances of the task. You could also use CloudWatch alarms to add more ECS container instances to keep up with demand. Task-level scaling is built into the configuration of your service.

  1. Select the task definition and choose Create Service.
  2. Associate a load balancer.
  3. Set up Auto Scaling.

The following screenshot shows an example where you would add an additional task instance when the CPU Utilization CloudWatch metric is over 60% on average over three consecutive measurements. This may not be aggressive enough for your requirements; it’s meant to show you the option to scale tasks the same way you scale ECS instances with an Auto Scaling group. The difference is that these tasks start much faster because all of the base layers are already on the ECS host.
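What this screenshot configures in the console can also be expressed against the Application Auto Scaling API. The following is only a sketch: the cluster, service, and policy names are placeholders, and the CloudWatch alarm on the service's CPU utilization that invokes the policy is configured separately.

import com.amazonaws.services.applicationautoscaling.AWSApplicationAutoScaling;
import com.amazonaws.services.applicationautoscaling.AWSApplicationAutoScalingClientBuilder;
import com.amazonaws.services.applicationautoscaling.model.AdjustmentType;
import com.amazonaws.services.applicationautoscaling.model.PolicyType;
import com.amazonaws.services.applicationautoscaling.model.PutScalingPolicyRequest;
import com.amazonaws.services.applicationautoscaling.model.PutScalingPolicyResult;
import com.amazonaws.services.applicationautoscaling.model.RegisterScalableTargetRequest;
import com.amazonaws.services.applicationautoscaling.model.ScalableDimension;
import com.amazonaws.services.applicationautoscaling.model.ServiceNamespace;
import com.amazonaws.services.applicationautoscaling.model.StepAdjustment;
import com.amazonaws.services.applicationautoscaling.model.StepScalingPolicyConfiguration;

public class ScaleIisService {
    public static void main(String[] args) {
        AWSApplicationAutoScaling autoScaling = AWSApplicationAutoScalingClientBuilder.defaultClient();
        String resourceId = "service/windows-cluster/windows-simple-iis"; // placeholder cluster/service names

        // Allow the service's DesiredCount to scale between 1 and 4 tasks
        autoScaling.registerScalableTarget(new RegisterScalableTargetRequest()
                .withServiceNamespace(ServiceNamespace.Ecs)
                .withResourceId(resourceId)
                .withScalableDimension(ScalableDimension.EcsServiceDesiredCount)
                .withMinCapacity(1)
                .withMaxCapacity(4));

        // Add one task each time the associated CloudWatch alarm fires
        PutScalingPolicyResult policy = autoScaling.putScalingPolicy(new PutScalingPolicyRequest()
                .withPolicyName("scale-out-on-cpu")
                .withServiceNamespace(ServiceNamespace.Ecs)
                .withResourceId(resourceId)
                .withScalableDimension(ScalableDimension.EcsServiceDesiredCount)
                .withPolicyType(PolicyType.StepScaling)
                .withStepScalingPolicyConfiguration(new StepScalingPolicyConfiguration()
                        .withAdjustmentType(AdjustmentType.ChangeInCapacity)
                        .withCooldown(60)
                        .withStepAdjustments(new StepAdjustment()
                                .withMetricIntervalLowerBound(0.0)
                                .withScalingAdjustment(1))));

        System.out.println("Scaling policy ARN: " + policy.getPolicyARN());
    }
}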

Do not confuse task dynamic scaling with ECS instance dynamic scaling. To add additional hosts, see Tutorial: Scaling Container Instances with CloudWatch Alarms.

Conclusion

This is just scratching the surface of the flexibility that you get from using containers and Amazon ECS. For more information, see the Amazon ECS Developer Guide and ECS Resources.

– Jeremy, Thomas, Samuel, Akram

Newly Updated Whitepaper: FERPA Compliance on AWS

Post Syndicated from Chris Gile original https://aws.amazon.com/blogs/security/newly-updated-whitepaper-ferpa-compliance-on-aws/

One of the main tenets of the Family Educational Rights and Privacy Act (FERPA) is the protection of student education records, including personally identifiable information (PII) and directory information. We recently updated our FERPA Compliance on AWS whitepaper to include AWS service-specific guidance for 24 AWS services. The whitepaper describes how these services can be used to help secure protected data. In conjunction with more detailed service-specific documentation, this updated information helps make it easier for you to plan, deploy, and operate secure environments to meet your compliance requirements in the AWS Cloud.

The updated whitepaper is especially useful for educational institutions and their vendors who need to understand:

  • AWS’s Shared Responsibility Model.
  • How AWS services can be used to help deploy educational and PII workloads securely in the AWS Cloud.
  • Key security disciplines in a security program to help you run a FERPA-compliant program (such as auditing, data destruction, and backup and disaster recovery).

In a related effort to help you secure PII, we also added to the whitepaper a mapping of NIST SP 800-122, which provides guidance for protecting PII, as well as a link to our NIST SP 800-53 Quick Start, a CloudFormation template that automatically configures AWS resources and deploys a multi-tier, Linux-based web application. To learn how this Quick Start works, see the Automate NIST Compliance in AWS GovCloud (US) with AWS Quick Start Tools video. The template helps you streamline and automate secure baselines in AWS—from initial design to operational security readiness—by incorporating the expertise of AWS security and compliance subject matter experts.

For more information about AWS Compliance and FERPA or to request support for your organization, contact your AWS account manager.

– Chris Gile, Senior Manager, AWS Security Assurance

Implementing Dynamic ETL Pipelines Using AWS Step Functions

Post Syndicated from Tara Van Unen original https://aws.amazon.com/blogs/compute/implementing-dynamic-etl-pipelines-using-aws-step-functions/

This post contributed by:
Wangechi Dole, AWS Solutions Architect
Milan Krasnansky, ING, Digital Solutions Developer, SGK
Rian Mookencherry, Director – Product Innovation, SGK

Data processing and transformation is a common use case you see in our customer case studies and success stories. Often, customers deal with complex data from a variety of sources that needs to be transformed and customized through a series of steps to make it useful to different systems and stakeholders. This can be difficult due to the ever-increasing volume, velocity, and variety of data. Traditional databases alone often cannot meet these data management challenges.

Workflow automation helps you build solutions that are repeatable, scalable, and reliable. You can use AWS Step Functions for this. A great example is how SGK used Step Functions to automate the ETL processes for their client. With Step Functions, SGK has been able to automate changes within the data management system, substantially reducing the time required for data processing.

In this post, SGK shares the details of how they used Step Functions to build a robust data processing system based on highly configurable business transformation rules for ETL processes.

SGK: Building dynamic ETL pipelines

SGK is a subsidiary of Matthews International Corporation, a diversified organization focusing on brand solutions and industrial technologies. SGK’s Global Content Creation Studio network creates compelling content and solutions that connect brands and products to consumers through multiple assets including photography, video, and copywriting.

We were recently contracted to build a sophisticated and scalable data management system for one of our clients. We chose to build the solution on AWS to leverage advanced, managed services that help to improve the speed and agility of development.

The data management system served two main functions:

  1. Ingesting a large amount of complex data to facilitate both reporting and product funding decisions for the client’s global marketing and supply chain organizations.
  2. Processing the data through normalization and applying complex algorithms and data transformations. The system goal was to provide information in the relevant context—such as strategic marketing, supply chain, product planning, etc. —to the end consumer through automated data feeds or updates to existing ETL systems.

We were faced with several challenges:

  • Output data that needed to be refreshed at least twice a day to provide fresh datasets to both local and global markets. That constant data refresh posed several challenges, especially around data management and replication across multiple databases.
  • The complexity of reporting business rules that needed to be updated on a constant basis.
  • Data that could not be processed as contiguous blocks of typical time-series data. The data was measured across seasons (that is, combinations of dates), which often resulted in up to three overlapping seasons at any given time.
  • Input data that came from 10+ different data sources. Each data source ranged from 1–20K rows with as many as 85 columns per input source.

These challenges meant that our small Dev team heavily invested time in frequent configuration changes to the system and data integrity verification to make sure that everything was operating properly. Maintaining this system proved to be a daunting task and that’s when we turned to Step Functions—along with other AWS services—to automate our ETL processes.

Solution overview

Our solution included the following AWS services:

  • AWS Step Functions: Before Step Functions was available, we were using multiple Lambda functions for this use case and running into memory limit issues. With Step Functions, we can execute steps in parallel simultaneously, in a cost-efficient manner, without running into memory limitations.
  • AWS Lambda: The Step Functions state machine uses Lambda functions to implement the Task states. Our Lambda functions are implemented in Java 8.
  • Amazon DynamoDB provides us with an easy and flexible way to manage business rules. We specify our rules as Keys. These are key-value pairs stored in a DynamoDB table.
  • Amazon RDS: Our ETL pipelines consume source data from our RDS MySQL database.
  • Amazon Redshift: We use Amazon Redshift for reporting purposes because it integrates with our BI tools. Currently we are using Tableau for reporting which integrates well with Amazon Redshift.
  • Amazon S3: We store our raw input files and intermediate results in S3 buckets.
  • Amazon CloudWatch Events: Our users expect results at a specific time. We use CloudWatch Events to trigger Step Functions on an automated schedule.

Solution architecture

This solution uses a declarative approach to defining business transformation rules that are applied by the underlying Step Functions state machine as data moves from RDS to Amazon Redshift. An S3 bucket is used to store intermediate results. A CloudWatch Event rule triggers the Step Functions state machine on a schedule. The following diagram illustrates our architecture:

Here are more details for the above diagram:

  1. A rule in CloudWatch Events triggers the state machine execution on an automated schedule.
  2. The state machine invokes the first Lambda function.
  3. The Lambda function deletes all existing records in Amazon Redshift. Depending on the dataset, the Lambda function can create a new table in Amazon Redshift to hold the data.
  4. The same Lambda function then retrieves Keys from a DynamoDB table. Keys represent specific marketing campaigns or seasons and map to specific records in RDS.
  5. The state machine executes the second Lambda function using the Keys from DynamoDB.
  6. The second Lambda function retrieves the referenced dataset from RDS. The records retrieved represent the entire dataset needed for a specific marketing campaign.
  7. The second Lambda function executes in parallel for each Key retrieved from DynamoDB and stores the output in CSV format temporarily in S3.
  8. Finally, the Lambda function uploads the data into Amazon Redshift.

To understand the above data processing workflow, take a closer look at the Step Functions state machine for this example.

We walk you through the state machine in more detail in the following sections.

Walkthrough

To get started, you need to:

  • Create a schedule in CloudWatch Events
  • Specify conditions for RDS data extracts
  • Create Amazon Redshift input files
  • Load data into Amazon Redshift

Step 1: Create a schedule in CloudWatch Events
Create rules in CloudWatch Events to trigger the Step Functions state machine on an automated schedule. The following is an example cron expression to automate your schedule:
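For illustration, a schedule expression matching the description below, written in the CloudWatch Events cron format (minutes, hours, day-of-month, month, day-of-week, year), would be:

cron(0 3,14 * * ? *)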

In this example, the cron expression invokes the Step Functions state machine at 3:00am and 2:00pm (UTC) every day.

Step 2: Specify conditions for RDS data extracts
We use DynamoDB to store Keys that determine which rows of data to extract from our RDS MySQL database. An example Key is MCS2017, which stands for Marketing Campaign Spring 2017. Each campaign has a specific start and end date, and the corresponding dataset is stored in RDS MySQL. A record in RDS contains about 600 columns, and each Key can represent up to 20K records.

A given day can have multiple campaigns with different start and end dates running simultaneously. In the following example DynamoDB item, three campaigns are specified for the given date.

The state machine example shown above uses Keys 31, 32, and 33 in the first ChoiceState and Keys 21 and 22 in the second ChoiceState. These keys represent marketing campaigns for a given day. For example, on Monday, there are only two campaigns requested. The ChoiceState with Keys 21 and 22 is executed. If three campaigns are requested on Tuesday, for example, then ChoiceState with Keys 31, 32, and 33 is executed. MCS2017 can be represented by Key 21 and Key 33 on Monday and Tuesday, respectively. This approach gives us the flexibility to add or remove campaigns dynamically.

Step 3: Create Amazon Redshift input files
When the state machine begins execution, the first Lambda function is invoked as the resource for FirstState, represented in the Step Functions state machine as follows:

"Comment": ” AWS Amazon States Language.", 
  "StartAt": "FirstState",
 
"States": { 
  "FirstState": {
   
"Type": "Task",
   
"Resource": "arn:aws:lambda:xx-xxxx-x:XXXXXXXXXXXX:function:Start",
    "Next": "ChoiceState" 
  } 

As described in the solution architecture, the purpose of this Lambda function is to delete existing data in Amazon Redshift and retrieve keys from DynamoDB. In our use case, we found that deleting existing records was more efficient and less time-consuming than finding the delta and updating existing records. On average, an Amazon Redshift table can contain about 36 million cells, which translates to roughly 65K records. The following is the code snippet for the first Lambda function in Java 8:

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class LambdaFunctionHandler implements RequestHandler<Map<String, Object>, Map<String, String>> {
    Map<String, String> keys = new HashMap<>();

    public Map<String, String> handleRequest(Map<String, Object> input, Context context) {
        // getConfig(), RedshiftDataService, and DynamoDBDataService are helper classes from this project (not shown)
        Properties config = getConfig();
        // 1. Clean the Amazon Redshift table
        new RedshiftDataService(config).cleaningTable();
        // 2. Read the current campaign keys from DynamoDB
        List<String> keyList = new DynamoDBDataService(config).getCurrentKeys();
        for (int i = 0; i < keyList.size(); i++) {
            keys.put("key" + (i + 1), keyList.get(i));
        }
        // 3. Return the key values and the key count (referenced as $.keyT in the state machine)
        keys.put("keyT", String.valueOf(keyList.size()));
        return keys;
    }
}

The following JSON represents ChoiceState.

"ChoiceState": {
   "Type" : "Choice",
   "Choices": [ 
   {

      "Variable": "$.keyT",
     "StringEquals": "3",
     "Next": "CurrentThreeKeys" 
   }, 
   {

     "Variable": "$.keyT",
    "StringEquals": "2",
    "Next": "CurrentTwooKeys" 
   } 
 ], 
 "Default": "DefaultState"
}

The variable $.keyT represents the number of keys retrieved from DynamoDB. This variable determines which of the parallel branches should be executed. At the time of publication, Step Functions does not support dynamic parallel state. Therefore, choices under ChoiceState are manually created and assigned hardcoded StringEquals values. These values represent the number of parallel executions for the second Lambda function.

For example, if $.keyT equals 3, the second Lambda function is executed three times in parallel with the keys key1, key2, and key3 retrieved from DynamoDB. Similarly, if $.keyT equals 2, the second Lambda function is executed twice in parallel. The following JSON represents this parallel execution:

"CurrentThreeKeys": { 
  "Type": "Parallel",
  "Next": "NextState",
  "Branches": [ 
  {

     "StartAt": “key31",
    "States": { 
       “key31": {

          "Type": "Task",
        "InputPath": "$.key1",
        "Resource": "arn:aws:lambda:xx-xxxx-x:XXXXXXXXXXXX:function:Execution",
        "End": true 
       } 
    } 
  }, 
  {

     "StartAt": “key32",
    "States": { 
     “key32": {

        "Type": "Task",
       "InputPath": "$.key2",
         "Resource": "arn:aws:lambda:xx-xxxx-x:XXXXXXXXXXXX:function:Execution",
       "End": true 
      } 
     } 
   }, 
   {

      "StartAt": “key33",
       "States": { 
          “key33": {

                "Type": "Task",
             "InputPath": "$.key3",
             "Resource": "arn:aws:lambda:xx-xxxx-x:XXXXXXXXXXXX:function:Execution",
           "End": true 
       } 
     } 
    } 
  ] 
} 

Step 4: Load data into Amazon Redshift
The second Lambda function in the state machine extracts the records from RDS associated with the keys retrieved from DynamoDB, processes the data, and then loads it into an Amazon Redshift table. The following is a code snippet for the second Lambda function in Java 8.

import java.util.Properties;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class LambdaFunctionHandler implements RequestHandler<String, String> {
    public static String key = null;

    public String handleRequest(String input, Context context) {
        key = input;
        // 1. Get basic configuration for the helper classes plus an S3 client
        //    (getConfig(), RdsDataService, and RedshiftDataService are project helper classes not shown here)
        Properties config = getConfig();
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        // 2. Export the query results for this key from RDS into an S3 bucket
        new RdsDataService(config).exportDataToS3(s3, key);
        // 3. Import the query results from the S3 bucket into Amazon Redshift
        new RedshiftDataService(config).importDataFromS3(s3, key);
        System.out.println(input);
        return "SUCCESS";
    }
}

After the data is loaded into Amazon Redshift, end users can visualize it using their preferred business intelligence tools.

Lessons learned

  • At the time of publication, the 1.5 GB memory hard limit for Lambda functions was inadequate for processing our complex workload. Step Functions gave us the flexibility to chunk our large datasets and process them in parallel, saving on costs and time.
  • In our previous implementation, we assigned each key a dedicated Lambda function along with CloudWatch rules for schedule automation. This approach proved to be inefficient and quickly became an operational burden. Previously, we processed each key sequentially, with each key adding about five minutes to the overall processing time. For example, processing three keys meant that the total processing time was three times longer. With Step Functions, the entire state machine executes in about five minutes.
  • Using DynamoDB with Step Functions gave us the flexibility to manage keys efficiently. In our previous implementations, keys were hardcoded in Lambda functions, which became difficult to manage due to frequent updates. DynamoDB is a great way to store dynamic data that changes frequently, and it works perfectly with our serverless architectures.

Conclusion

With Step Functions, we were able to fully automate the frequent configuration updates to our dataset resulting in significant cost savings, reduced risk to data errors due to system downtime, and more time for us to focus on new product development rather than support related issues. We hope that you have found the information useful and that it can serve as a jump-start to building your own ETL processes on AWS with managed AWS services.

For more information about how Step Functions makes it easy to coordinate the components of distributed applications and microservices in any workflow, see the use case examples and then build your first state machine in under five minutes in the Step Functions console.

If you have questions or suggestions, please comment below.

Glenn’s Take on re:Invent 2017 – Part 3

Post Syndicated from Glenn Gore original https://aws.amazon.com/blogs/architecture/glenns-take-on-reinvent-2017-part-3/

Glenn Gore here, Chief Architect for AWS. I was in Las Vegas last week — with 43K others — for re:Invent 2017. I checked in to the Architecture blog here and here with my take on what was interesting about some of the bigger announcements from a cloud-architecture perspective.

In the excitement of so many new services being launched, we sometimes overlook feature updates that, while perhaps not as exciting as Amazon DeepLens, have significant impact on how you architect and develop solutions on AWS.

Amazon DynamoDB is used by more than 100,000 customers around the world, handling over a trillion requests every day. From the start, DynamoDB has offered high availability by natively spanning multiple Availability Zones within an AWS Region. As more customers started building and deploying truly-global applications, there was a need to replicate a DynamoDB table to multiple AWS Regions, allowing for read/write operations to occur in any region where the table was replicated. This update is important for providing a globally-consistent view of information — as users may transition from one region to another — or for providing additional levels of availability, allowing for failover between AWS Regions without loss of information.

There are some interesting concurrency-design aspects you need to be aware of and ensure you can handle correctly. For example, we support the “last writer wins” reconciliation where eventual consistency is being used and an application updates the same item in different AWS Regions at the same time. If you require strongly-consistent read/writes then you must perform all of your read/writes in the same AWS Region. The details behind this can be found in the DynamoDB documentation. Providing a globally-distributed, replicated DynamoDB table simplifies many different use cases and allows for the logic of replication, which may have been pushed up into the application layers to be simplified back down into the data layer.
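As a rough sketch (not from the original post), a global table can be created with the AWS SDK for Java once identically named tables, with streams enabled, already exist in each Region; the table name and Regions below are placeholders.

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.CreateGlobalTableRequest;
import com.amazonaws.services.dynamodbv2.model.CreateGlobalTableResult;
import com.amazonaws.services.dynamodbv2.model.Replica;

public class CreateGlobalTableExample {
    public static void main(String[] args) {
        AmazonDynamoDB dynamoDB = AmazonDynamoDBClientBuilder.defaultClient();

        // The "Sessions" table must already exist in both Regions with DynamoDB Streams
        // (new and old images) enabled; the table name and Regions are placeholders
        CreateGlobalTableResult result = dynamoDB.createGlobalTable(new CreateGlobalTableRequest()
                .withGlobalTableName("Sessions")
                .withReplicationGroup(
                        new Replica().withRegionName("us-east-1"),
                        new Replica().withRegionName("eu-west-1")));

        System.out.println(result.getGlobalTableDescription().getGlobalTableStatus());
    }
}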

The other big update for DynamoDB is that you can now back up your DynamoDB table on demand with no impact to performance. One of the features I really like is that when you trigger a backup, it is available instantly, regardless of the size of the table. Behind the scenes, we use snapshots and change logs to ensure a consistent backup. While backup is instant, restoring the table could take some time depending on its size and ranges — from minutes to hours for very large tables.

This feature is super important for those of you who work in regulated industries that often have strict requirements around data retention and backups of data, which sometimes limited the use of DynamoDB or required complex workarounds to implement some sort of backup feature in the past. This often incurred significant, additional costs due to increased read transactions on their DynamoDB tables.
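For a sense of how simple the new backup API is, here is a minimal sketch using the AWS SDK for Java; the table and backup names are placeholders.

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.CreateBackupRequest;
import com.amazonaws.services.dynamodbv2.model.CreateBackupResult;

public class OnDemandBackup {
    public static void main(String[] args) {
        AmazonDynamoDB dynamoDB = AmazonDynamoDBClientBuilder.defaultClient();

        // Trigger an on-demand backup of the "Orders" table (names are placeholders)
        CreateBackupResult backup = dynamoDB.createBackup(new CreateBackupRequest()
                .withTableName("Orders")
                .withBackupName("Orders-2017-12-04"));

        System.out.println("Backup ARN: " + backup.getBackupDetails().getBackupArn());
    }
}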

Amazon Simple Storage Service (Amazon S3) was our first-released AWS service over 11 years ago, and it proved the simplicity and scalability of true API-driven architectures in the cloud. Today, Amazon S3 stores trillions of objects, with transactional requests per second reaching into the millions! Dealing with data as objects opened up an incredibly diverse array of use cases ranging from libraries of static images, game binary downloads, and application log data, to massive data lakes used for big data analytics and business intelligence. With Amazon S3, when you accessed your data in an object, you effectively had to write/read the object as a whole or use the range feature to retrieve a part of the object — if possible — in your individual use case.

Now, with Amazon S3 Select, you can use SQL-like queries to retrieve just the data you need from delimited text and JSON objects, including GZIP-compressed objects. We don’t support encryption during the preview of Amazon S3 Select.

Amazon S3 Select provides two major benefits:

  • Faster access
  • Lower running costs

Serverless Lambda functions, where every millisecond of execution time is billed, benefit greatly from Amazon S3 Select: data retrieval and processing in your Lambda function can see significant speedups and cost reductions. For example, we have seen a 2x speed improvement and an 80% cost reduction with the Serverless MapReduce code.
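As a rough illustration only (not from the original post), the following sketch uses the AWS SDK for Java’s SelectObjectContentRequest to pull a filtered subset of a CSV object; the bucket name, key, and SQL expression are placeholders.

import java.io.InputStream;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.CSVInput;
import com.amazonaws.services.s3.model.CSVOutput;
import com.amazonaws.services.s3.model.CompressionType;
import com.amazonaws.services.s3.model.ExpressionType;
import com.amazonaws.services.s3.model.InputSerialization;
import com.amazonaws.services.s3.model.OutputSerialization;
import com.amazonaws.services.s3.model.SelectObjectContentRequest;
import com.amazonaws.services.s3.model.SelectObjectContentResult;

public class S3SelectSketch {
    public static void main(String[] args) throws Exception {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        // Ask S3 to return only the matching rows of a CSV object instead of the whole object.
        // The bucket, key, and SQL expression below are hypothetical.
        SelectObjectContentRequest request = new SelectObjectContentRequest()
                .withBucketName("my-data-bucket")
                .withKey("logs/2017/12/events.csv")
                .withExpressionType(ExpressionType.SQL)
                .withExpression("SELECT s._1, s._2 FROM S3Object s WHERE s._3 = 'ERROR'")
                .withInputSerialization(new InputSerialization()
                        .withCsv(new CSVInput())
                        .withCompressionType(CompressionType.NONE))
                .withOutputSerialization(new OutputSerialization().withCsv(new CSVOutput()));

        SelectObjectContentResult result = s3.selectObjectContent(request);
        try (InputStream records = result.getPayload().getRecordsInputStream()) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = records.read(buffer)) != -1) {
                System.out.write(buffer, 0, read); // stream the filtered rows to stdout
            }
        }
    }
}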

Other AWS services such as Amazon Athena, Amazon Redshift, and Amazon EMR will support Amazon S3 Select as well as partner offerings including Cloudera and Hortonworks. If you are using Amazon Glacier for longer-term data archival, you will be able to use Amazon Glacier Select to retrieve a subset of your content from within Amazon Glacier.

As the volume of data that can be stored within Amazon S3 and Amazon Glacier continues to scale on a daily basis, we will continue to innovate and develop improved and optimized services that will allow you to work with these magnificently-large data sets while reducing your costs (retrieval and processing). I believe this will also allow you to simplify the transformation and storage of incoming data into Amazon S3 in basic, semi-structured formats as a single copy vs. some of the duplication and reformatting of data sometimes required to do upfront optimizations for downstream processing. Amazon S3 Select largely removes the need for this upfront optimization and instead allows you to store data once and process it based on your individual Amazon S3 Select query per application or transaction need.

Thanks for reading!

Glenn contemplating why CSV format is still relevant in 2017 (Italy).

Mozilla releases its speech-recognition system

Post Syndicated from corbet original https://lwn.net/Articles/740611/rss

Mozilla has announced the initial releases from its “Project DeepSpeech” and “Project Common Voice” efforts. “I’m excited to announce the initial release of Mozilla’s open source speech recognition model that has an accuracy approaching what humans can perceive when listening to the same recordings. We are also releasing the world’s second largest publicly available voice dataset, which was contributed to by nearly 20,000 people globally.”

Top 10 Most Pirated Movies of The Week on BitTorrent – 12/03/17

Post Syndicated from Ernesto original https://torrentfreak.com/top-10-pirated-movies-week-bittorrent-120317/

This week we have three newcomers in our chart.

Kingsman: The Golden Circle is the most downloaded movie again.

The data for our weekly download chart is estimated by TorrentFreak, and is for informational and educational reference only. All the movies in the list are Web-DL/Webrip/HDRip/BDrip/DVDrip unless stated otherwise.

RSS feed for the weekly movie download chart.

This week’s most downloaded movies are:
Most downloaded movies via torrents (rank, last week’s rank, movie name, IMDb rating / trailer):

 1  (1)  Kingsman: The Golden Circle                   7.2 / trailer
 2  (…)  The Foreigner                                 7.2 / trailer
 3  (2)  American Assassin                             6.3 / trailer
 4  (…)  Detroit                                       7.5 / trailer
 5  (3)  Valerian and the City of a Thousand Planets   6.7 / trailer
 6  (4)  Geostorm (Subbed HDRip)                       5.5 / trailer
 7  (…)  Justice League (HDTS)                         7.2 / trailer
 8  (5)  Logan Lucky                                   7.2 / trailer
 9  (9)  Thor Ragnarok (HDTS/Cam)                      8.2 / trailer
10  (6)  Wind River                                    7.8 / trailer

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons