Tag Archives: launch

Implement continuous integration and delivery of serverless AWS Glue ETL applications using AWS Developer Tools

Post Syndicated from Prasad Alle original https://aws.amazon.com/blogs/big-data/implement-continuous-integration-and-delivery-of-serverless-aws-glue-etl-applications-using-aws-developer-tools/

AWS Glue is an increasingly popular way to develop serverless ETL (extract, transform, and load) applications for big data and data lake workloads. Organizations that transform their ETL applications to cloud-based, serverless ETL architectures need a seamless, end-to-end continuous integration and continuous delivery (CI/CD) pipeline: from source code, to build, to deployment, to product delivery. Having a good CI/CD pipeline can help your organization discover bugs before they reach production and deliver updates more frequently. It can also help developers write quality code and automate the ETL job release management process, mitigate risk, and more.

AWS Glue is a fully managed data catalog and ETL service. It simplifies and automates the difficult and time-consuming tasks of data discovery, conversion, and job scheduling. AWS Glue crawls your data sources and constructs a data catalog using pre-built classifiers for popular data formats and data types, including CSV, Apache Parquet, JSON, and more.

When you are developing ETL applications using AWS Glue, you might come across some of the following CI/CD challenges:

  • Iterative development with unit tests
  • Continuous integration and build
  • Pushing the ETL pipeline to a test environment
  • Pushing the ETL pipeline to a production environment
  • Testing ETL applications using real data (live test)
  • Exploring and validating data

In this post, I walk you through a solution that implements a CI/CD pipeline for serverless AWS Glue ETL applications supported by AWS Developer Tools (including AWS CodePipeline, AWS CodeCommit, and AWS CodeBuild) and AWS CloudFormation.

Solution overview

The following diagram shows the pipeline workflow:

This solution uses AWS CodePipeline, which lets you orchestrate and automate the test and deploy stages for ETL application source code. The solution consists of a pipeline that contains the following stages:

1.) Source Control: In this stage, the AWS Glue ETL job source code and the AWS CloudFormation template file for deploying the ETL jobs are both committed to version control. I chose to use AWS CodeCommit for version control.

To get the ETL job source code and AWS CloudFormation template, download the gluedemoetl.zip file. This solution is developed based on a previous post, Build a Data Lake Foundation with AWS Glue and Amazon S3.

2.) LiveTest: In this stage, all resources—including AWS Glue crawlers, jobs, S3 buckets, roles, and other resources that are required for the solution—are provisioned, deployed, live tested, and cleaned up.

The LiveTest stage includes the following actions:

  • Deploy: In this action, all the resources that are required for this solution (crawlers, jobs, buckets, roles, and so on) are provisioned and deployed using an AWS CloudFormation template.
  • AutomatedLiveTest: In this action, all the AWS Glue crawlers and jobs are executed and data exploration and validation tests are performed. These validation tests include, but are not limited to, record counts in both raw tables and transformed tables in the data lake and any other business validations. I used AWS CodeBuild for this action.
  • LiveTestApproval: This action is included for the cases in which a pipeline administrator approval is required to deploy/promote the ETL applications to the next stage. The pipeline pauses in this action until an administrator manually approves the release.
  • LiveTestCleanup: In this action, all the LiveTest stage resources, including test crawlers, jobs, roles, and so on, are deleted using the AWS CloudFormation template. This action helps minimize cost by ensuring that the test resources exist only for the duration of the AutomatedLiveTest and LiveTestApproval

3.) DeployToProduction: In this stage, all the resources are deployed using the AWS CloudFormation template to the production environment.

Try it out

This code pipeline takes approximately 20 minutes to complete the LiveTest test stage (up to the LiveTest approval stage, in which manual approval is required).

To get started with this solution, choose Launch Stack:

This creates the CI/CD pipeline with all of its stages, as described earlier. It performs an initial commit of the sample AWS Glue ETL job source code to trigger the first release change.

In the AWS CloudFormation console, choose Create. After the template finishes creating resources, you see the pipeline name on the stack Outputs tab.

After that, open the CodePipeline console and select the newly created pipeline. Initially, your pipeline’s CodeCommit stage shows that the source action failed.

Allow a few minutes for your new pipeline to detect the initial commit applied by the CloudFormation stack creation. As soon as the commit is detected, your pipeline starts. You will see the successful stage completion status as soon as the CodeCommit source stage runs.

In the CodeCommit console, choose Code in the navigation pane to view the solution files.

Next, you can watch how the pipeline goes through the LiveTest stage of the deploy and AutomatedLiveTest actions, until it finally reaches the LiveTestApproval action.

At this point, if you check the AWS CloudFormation console, you can see that a new template has been deployed as part of the LiveTest deploy action.

At this point, make sure that the AWS Glue crawlers and the AWS Glue job ran successfully. Also check whether the corresponding databases and external tables have been created in the AWS Glue Data Catalog. Then verify that the data is validated using Amazon Athena, as shown following.

Open the AWS Glue console, and choose Databases in the navigation pane. You will see the following databases in the Data Catalog:

Open the Amazon Athena console, and run the following queries. Verify that the record counts are matching.

SELECT count(*) FROM "nycitytaxi_gluedemocicdtest"."data";
SELECT count(*) FROM "nytaxiparquet_gluedemocicdtest"."datalake";

The following shows the raw data:

The following shows the transformed data:

The pipeline pauses the action until the release is approved. After validating the data, manually approve the revision on the LiveTestApproval action on the CodePipeline console.

Add comments as needed, and choose Approve.

The LiveTestApproval stage now appears as Approved on the console.

After the revision is approved, the pipeline proceeds to use the AWS CloudFormation template to destroy the resources that were deployed in the LiveTest deploy action. This helps reduce cost and ensures a clean test environment on every deployment.

Production deployment is the final stage. In this stage, all the resources—AWS Glue crawlers, AWS Glue jobs, Amazon S3 buckets, roles, and so on—are provisioned and deployed to the production environment using the AWS CloudFormation template.

After successfully running the whole pipeline, feel free to experiment with it by changing the source code stored on AWS CodeCommit. For example, if you modify the AWS Glue ETL job to generate an error, it should make the AutomatedLiveTest action fail. Or if you change the AWS CloudFormation template to make its creation fail, it should affect the LiveTest deploy action. The objective of the pipeline is to guarantee that all changes that are deployed to production are guaranteed to work as expected.


In this post, you learned how easy it is to implement CI/CD for serverless AWS Glue ETL solutions with AWS developer tools like AWS CodePipeline and AWS CodeBuild at scale. Implementing such solutions can help you accelerate ETL development and testing at your organization.

If you have questions or suggestions, please comment below.


Additional Reading

If you found this post useful, be sure to check out Implement Continuous Integration and Delivery of Apache Spark Applications using AWS and Build a Data Lake Foundation with AWS Glue and Amazon S3.


About the Authors

Prasad Alle is a Senior Big Data Consultant with AWS Professional Services. He spends his time leading and building scalable, reliable Big data, Machine learning, Artificial Intelligence and IoT solutions for AWS Enterprise and Strategic customers. His interests extend to various technologies such as Advanced Edge Computing, Machine learning at Edge. In his spare time, he enjoys spending time with his family.

Luis Caro is a Big Data Consultant for AWS Professional Services. He works with our customers to provide guidance and technical assistance on big data projects, helping them improving the value of their solutions when using AWS.




RDS for Oracle: Extending Outbound Network Access to use SSL/TLS

Post Syndicated from Surya Nallu original https://aws.amazon.com/blogs/architecture/rds-for-oracle-extending-outbound-network-access-to-use-ssltls/

In December 2016, we launched the Outbound Network Access functionality for Amazon RDS for Oracle, enabling customers to use their RDS for Oracle database instances to communicate with external web endpoints using the utl_http and utl tcp packages, and sending emails through utl_smtp. We extended the functionality by adding the option of using custom DNS servers, allowing such outbound network accesses to make use of any DNS server a customer chooses to use. These releases enabled HTTP, TCP and SMTP communication originating out of RDS for Oracle instances – limited to non-secure (non-SSL) mediums.

To overcome the limitation over SSL connections, we recently published a whitepaper, that guides through the process of creating customized Oracle wallet bundles on your RDS for Oracle instances. By making use of such wallets, you can now extend the Outbound Network Access capability to have external communications happen over secure (SSL/TLS) connections. This opens up new use cases for your RDS for Oracle instances.

With the right set of certificates imported into your RDS for Oracle instances (through Oracle wallets), your database instances can now:

  • Communicate with a HTTPS endpoint: Using utl_http, access a resource such as https://status.aws.amazon.com/robots.txt
  • Download files from Amazon S3 securely: Using a presigned URL from Amazon S3, you can now download any file over SSL
  • Extending Oracle Database links to use SSL: Database links between RDS for Oracle instances can now use SSL as long as the instances have the SSL option installed
  • Sending email over SMTPS:
    • You can now integrate with Amazon SES to send emails from your database instances and any other generic SMTPS with which the provider can be integrated

These are just a few high-level examples of new use cases that have opened up with the whitepaper. As a reminder, always ensure to have best security practices in place when making use of Outbound Network Access (detailed in the whitepaper).

About the Author

Surya Nallu is a Software Development Engineer on the Amazon RDS for Oracle team.

Get Started with Blockchain Using the new AWS Blockchain Templates

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/get-started-with-blockchain-using-the-new-aws-blockchain-templates/

Many of today’s discussions around blockchain technology remind me of the classic Shimmer Floor Wax skit. According to Dan Aykroyd, Shimmer is a dessert topping. Gilda Radner claims that it is a floor wax, and Chevy Chase settles the debate and reveals that it actually is both! Some of the people that I talk to see blockchains as the foundation of a new monetary system and a way to facilitate international payments. Others see blockchains as a distributed ledger and immutable data source that can be applied to logistics, supply chain, land registration, crowdfunding, and other use cases. Either way, it is clear that there are a lot of intriguing possibilities and we are working to help our customers use this technology more effectively.

We are launching AWS Blockchain Templates today. These templates will let you launch an Ethereum (either public or private) or Hyperledger Fabric (private) network in a matter of minutes and with just a few clicks. The templates create and configure all of the AWS resources needed to get you going in a robust and scalable fashion.

Launching a Private Ethereum Network
The Ethereum template offers two launch options. The ecs option creates an Amazon ECS cluster within a Virtual Private Cloud (VPC) and launches a set of Docker images in the cluster. The docker-local option also runs within a VPC, and launches the Docker images on EC2 instances. The template supports Ethereum mining, the EthStats and EthExplorer status pages, and a set of nodes that implement and respond to the Ethereum RPC protocol. Both options create and make use of a DynamoDB table for service discovery, along with Application Load Balancers for the status pages.

Here are the AWS Blockchain Templates for Ethereum:

I start by opening the CloudFormation Console in the desired region and clicking Create Stack:

I select Specify an Amazon S3 template URL, enter the URL of the template for the region, and click Next:

I give my stack a name:

Next, I enter the first set of parameters, including the network ID for the genesis block. I’ll stick with the default values for now:

I will also use the default values for the remaining network parameters:

Moving right along, I choose the container orchestration platform (ecs or docker-local, as I explained earlier) and the EC2 instance type for the container nodes:

Next, I choose my VPC and the subnets for the Ethereum network and the Application Load Balancer:

I configure my keypair, EC2 security group, IAM role, and instance profile ARN (full information on the required permissions can be found in the documentation):

The Instance Profile ARN can be found on the summary page for the role:

I confirm that I want to deploy EthStats and EthExplorer, choose the tag and version for the nested CloudFormation templates that are used by this one, and click Next to proceed:

On the next page I specify a tag for the resources that the stack will create, leave the other options as-is, and click Next:

I review all of the parameters and options, acknowledge that the stack might create IAM resources, and click Create to build my network:

The template makes use of three nested templates:

After all of the stacks have been created (mine took about 5 minutes), I can select JeffNet and click the Outputs tab to discover the links to EthStats and EthExplorer:

Here’s my EthStats:

And my EthExplorer:

If I am writing apps that make use of my private network to store and process smart contracts, I would use the EthJsonRpcUrl.

Stay Tuned
My colleagues are eager to get your feedback on these new templates and plan to add new versions of the frameworks as they become available.



IsoHunt Founder Returns With New Search Tool

Post Syndicated from Ernesto original https://torrentfreak.com/isohunt-founder-returns-with-new-search-tool-180419/

Of all the major torrent sites that dominated the Internet at the beginning of this decade, only a few remain.

One of the sites that fell prey to ever-increasing pressure from the entertainment industry was isoHunt.

Founded by the Canadian entrepreneur Gary Fung, the site was one of the early pioneers in the world of torrents, paving the way for many others. However, this spotlight also caught the attention of the major movie studios.

After a lengthy legal battle isoHunt’s founder eventually shut down the site late 2013. This happened after Fung signed a settlement agreement with Hollywood for no less than $110 million, on paper at least.

Launching a new torrent search engine was never really an option, but Fung decided not to let his expertise go to waste. He focused his time and efforts on a new search project instead, which was unveiled to the public this week.

The new app called “WonderSwipe” has just been added to Apple’s iOS store. It’s a mobile search app that ties into Google’s backend, but with a different user interface. While it has nothing to do with file-sharing, we decided to reach out to isoHunt’s founder to find out more.

Fung tells us that he got the idea for the app because he was frustrated with Google’s default search options on the mobile platform.

“I find myself barely do any search on the smartphone, most of the time waiting until I get to my desktop. I ask why?” Fung tells us.

One of the main issues he identified is the fact that swiping is not an option. Instead, people end up browsing through dozens of mobile browser tabs. So, Fung took Google’s infrastructure and search power, making it swipeable.

“From a UI design perspective, I find swiping through photos on the first iPhone one of the most extraordinary advances in computing. It’s so easy that babies would be doing it before they even learn how to flip open a book!

“Bringing that ease of use to the central way of conducting mobile search and research is the initial eureka I had in starting work on WonderSwipe,” Fung adds.

That was roughly three years ago, and a few hours ago WonderSwipe finally made its way into the App store. Android users will have to wait for now, but the application will eventually be available on that platform as well.

In addition to swiping through search results, the app also promises faster article loading and browsing, a reader mode with condensed search results, and a hands-free mode with automated browsing where summaries are read out loud.


Of course, WonderSwipe is nothing like isoHunt ever was, apart from the fact that Google is a search engine that also links to torrents, indirectly.

This similarity was also brought up during the lawsuit with the MPAA, when Fung’s legal team likened isoHunt to Google in court. However, the Canadian entrepreneur doesn’t expect that Hollywood will have an issue with WonderSwipe in particular.

“isoHunt was similar to Google in how it worked as a search engine, but not in scope. Torrents are a small subset of all the webpages Google indexes,” Fung says.

“WonderSwipe’s aim is to find answers in all webpages, powered by Google search results. It presents results in extracted text and summaries with no connection to BitTorrent clients. As such, WonderSwipe can be bigger than isoHunt has ever been.”

Ironically, in recent years Hollywood has often criticized Google for linking to pirated content in its search results. These results will also be available through WonderSwipe.

However, Fung says that any copyright issues with WonderSwipe will have to be dealt with on the search engine level, by Google.

“If there are links to pirated content, tell search engines so they can take them down!” he says.

WonderSwipe is totally free and Fung tells us that he plans to monetize it with in-app purchases for pro features, and non-intrusive advertising that won’t slow down swiping or search results. More details on the future plans for the app are available here.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

New – Registry of Open Data on AWS (RODA)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-registry-of-open-data-on-aws-roda/

Almost a decade ago, my colleague Deepak Singh introduced the AWS Public Datasets in his post Paging Researchers, Analysts, and Developers. I’m happy to report that Deepak is still an important part of the AWS team and that the Public Datasets program is still going strong!

Today we are announcing a new take on open and public data, the Registry of Open Data on AWS, or RODA. This registry includes existing Public Datasets and allows anyone to add their own datasets so that they can be accessed and analyzed on AWS.

Inside the Registry
The home page lists all of the datasets in the registry:

Entering a search term shrinks the list so that only the matching datasets are displayed:

Each dataset has an associated detail page, including usage examples, license info, and the information needed to locate and access the dataset on AWS:

In this case, I can access the data with a simple CLI command:

I could also access it programmatically, or download data to my EC2 instance.

Adding to the Repository
If you have a dataset that is publicly available and would like to add it to RODA , you can simply send us a pull request. Head over to the open-data-registry repo, read the CONTRIBUTING document, and create a YAML file that describes your dataset, using one of the existing files in the datasets directory as a model:

We’ll review pull requests regularly; you can “star” or watch the repo in order to track additions and changes.

Impress Me
I am looking forward to an inrush of new datasets, along with some blog posts and apps that show how to to use the data in powerful and interesting ways. Let me know what you come up with.



AIY Projects 2: Google’s AIY Projects Kits get an upgrade

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/google-aiy-projects-2/

After the outstanding success of their AIY Projects Voice and Vision Kits, Google has announced the release of upgraded kits, complete with Raspberry Pi Zero WH, Camera Module, and preloaded SD card.

Google AIY Projects Vision Kit 2 Raspberry Pi

Google’s AIY Projects Kits

Google launched the AIY Projects Voice Kit last year, first as a cover gift with The MagPi magazine and later as a standalone product.

Makers needed to provide their own Raspberry Pi for the original kit. The new kits include everything you need, from Pi to SD card.

Within a DIY cardboard box, makers were able to assemble their own voice-activated AI assistant akin to the Amazon Alexa, Apple’s Siri, and Google’s own Google Home Assistant. The Voice Kit was an instant hit that spurred no end of maker videos and tutorials, including our own free tutorial for controlling a robot using voice commands.

Later in the year, the team followed up the success of the Voice Kit with the AIY Projects Vision Kit — the same cardboard box hosting a camera perfect for some pretty nifty image recognition projects.

For more on the AIY Voice Kit, here’s our release video hosted by the rather delightful Rob Zwetsloot.

AIY Projects adds natural human interaction to your Raspberry Pi

Check out the exclusive Google AIY Projects Kit that comes free with The MagPi 57! Grab yourself a copy in stores or online now: http://magpi.cc/2pI6IiQ This first AIY Projects kit taps into the Google Assistant SDK and Cloud Speech API using the AIY Projects Voice HAT (Hardware Accessory on Top) board, stereo microphone, and speaker (included free with the magazine).

AIY Projects 2

So what’s new with version 2 of the AIY Projects Voice Kit? The kit now includes the recently released Raspberry Pi Zero WH, our Zero W with added pre-soldered header pins for instant digital making accessibility. Purchasers of the kits will also get a micro SD card with preloaded OS to help them get started without having to set the card up themselves.

Google AIY Projects Vision Kit 2 Raspberry Pi

Everything you need to build your own Raspberry Pi-powered Google voice assistant

In the newly upgraded AIY Projects Vision Kit v1.2, makers are also treated to an official Raspberry Pi Camera Module v2, the latest model of our add-on camera.

Google AIY Projects Vision Kit 2 Raspberry Pi

“Everything you need to get started is right there in the box,” explains Billy Rutledge, Google’s Director of AIY Projects. “We knew from our research that even though makers are interested in AI, many felt that adding it to their projects was too difficult or required expensive hardware.”

Google AIY Projects Vision Kit 2 Raspberry Pi
Google AIY Projects Vision Kit 2 Raspberry Pi
Google AIY Projects Vision Kit 2 Raspberry Pi

Google is also hard at work producing AIY Projects companion apps for Android, iOS, and Chrome. The Android app is available now to coincide with the launch of the upgraded kits, with the other two due for release soon. The app supports wireless setup of the AIY Kit, though avid coders will still be able to hack theirs to better suit their projects.

Google has also updated the AIY Projects website with an AIY Models section highlighting a range of neural network projects for the kits.

Get your kit

The updated Voice and Vision Kits were announced last night, and in the US they are available now from Target. UK-based makers should be able to get their hands on them this summer — keep an eye on our social channels for updates and links.

The post AIY Projects 2: Google’s AIY Projects Kits get an upgrade appeared first on Raspberry Pi.

Russia’s Encryption War: 1.8m Google & Amazon IPs Blocked to Silence Telegram

Post Syndicated from Andy original https://torrentfreak.com/russias-encryption-war-1-8m-google-amazon-ips-blocked-to-silence-telegram-180417/

The rules in Russia are clear. Entities operating an encrypted messaging service need to register with the authorities. They also need to hand over their encryption keys so that if law enforcement sees fit, users can be spied on.

Free cross-platform messaging app Telegram isn’t playing ball. An impressive 200,000,000 people used the software in March (including a growing number for piracy purposes) and founder Pavel Durov says he will not compromise their security, despite losing a lawsuit against the Federal Security Service which compels him to do so.

“Telegram doesn’t have shareholders or advertisers to report to. We don’t do deals with marketers, data miners or government agencies. Since the day we launched in August 2013 we haven’t disclosed a single byte of our users’ private data to third parties,” Durov said.

“Above all, we at Telegram believe in people. We believe that humans are inherently intelligent and benevolent beings that deserve to be trusted; trusted with freedom to share their thoughts, freedom to communicate privately, freedom to create tools. This philosophy defines everything we do.”

But by not handing over its keys, Telegram is in trouble with Russia. The FSB says it needs access to Telegram messages to combat terrorism so, in response to its non-compliance, telecoms watchdog Rozcomnadzor filed a lawsuit to degrade Telegram via web-blocking. Last Friday, that process ended in the state’s favor.

After an 18-minute hearing, a Moscow court gave the go-ahead for Telegram to be banned in Russia. The hearing was scheduled just the day before, giving Telegram little time to prepare. In protest, its lawyers didn’t even turn up to argue the company’s position.

Instead, Durov took to his VKontakte account to announce that Telegram would take counter-measures.

“Telegram will use built-in methods to bypass blocks, which do not require actions from users, but 100% availability of the service without a VPN is not guaranteed,” Durov wrote.

Telegram can appeal the blocking decision but Russian authorities aren’t waiting around for a response. They are clearly prepared to match Durov’s efforts, no matter what the cost.

In instructions sent out yesterday nationwide, Rozomnadzor ordered ISPs to block Telegram. The response was immediate and massive. Telegram was using both Amazon and Google to provide service to its users so, within hours, huge numbers of IP addresses belonging to both companies were targeted.

Initially, 655,352 Amazon IP addresses were placed on Russia’s nationwide blacklist. It was later reported that a further 131,000 IP addresses were added to that total. But the Russians were just getting started.

Servers.ru reports that a further 1,048,574 IP addresses belonging to Google were also targeted Monday. Rozcomnadzor said the court ruling against Telegram compelled it to take whatever action is needed to take Telegram down but with at least 1,834,996 addresses now confirmed blocked, it remains unclear what effect it’s had on the service.

Friday’s court ruling states that restrictions against Telegram can be lifted provided that the service hands over its encryption keys to the FSB. However, Durov responded by insisting that “confidentiality is not for sale, and human rights should not be compromised because of fear or greed.”

But of course, money is still part of the Telegram equation. While its business model in terms of privacy stands in stark contrast to that of Facebook, Telegram is also involved in the world’s biggest initial coin offering (ICO). According to media reports, it has raised $1.7 billion in pre-sales thus far.

This week’s action against Telegram is the latest in Russia’s war on ‘unauthorized’ encryption.

At the end of March, authorities suggested that around 15 million IP addresses (13.5 million belonging to Amazon) could be blocked to target chat software Zello. While those measures were averted, a further 500 domains belonging to Google were caught in the dragnet.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Now You Can Create Encrypted Amazon EBS Volumes by Using Your Custom Encryption Keys When You Launch an Amazon EC2 Instance

Post Syndicated from Nishit Nagar original https://aws.amazon.com/blogs/security/create-encrypted-amazon-ebs-volumes-custom-encryption-keys-launch-amazon-ec2-instance-2/

Amazon Elastic Block Store (EBS) offers an encryption solution for your Amazon EBS volumes so you don’t have to build, maintain, and secure your own infrastructure for managing encryption keys for block storage. Amazon EBS encryption uses AWS Key Management Service (AWS KMS) customer master keys (CMKs) when creating encrypted Amazon EBS volumes, providing you all the benefits associated with using AWS KMS. You can specify either an AWS managed CMK or a customer-managed CMK to encrypt your Amazon EBS volume. If you use a customer-managed CMK, you retain granular control over your encryption keys, such as having AWS KMS rotate your CMK every year. To learn more about creating CMKs, see Creating Keys.

In this post, we demonstrate how to create an encrypted Amazon EBS volume using a customer-managed CMK when you launch an EC2 instance from the EC2 console, AWS CLI, and AWS SDK.

Creating an encrypted Amazon EBS volume from the EC2 console

Follow these steps to launch an EC2 instance from the EC2 console with Amazon EBS volumes that are encrypted by customer-managed CMKs:

  1. Sign in to the AWS Management Console and open the EC2 console.
  2. Select Launch instance, and then, in Step 1 of the wizard, select an Amazon Machine Image (AMI).
  3. In Step 2 of the wizard, select an instance type, and then provide additional configuration details in Step 3. For details about configuring your instances, see Launching an Instance.
  4. In Step 4 of the wizard, specify additional EBS volumes that you want to attach to your instances.
  5. To create an encrypted Amazon EBS volume, first add a new volume by selecting Add new volume. Leave the Snapshot column blank.
  6. In the Encrypted column, select your CMK from the drop-down menu. You can also paste the full Amazon Resource Name (ARN) of your custom CMK key ID in this box. To learn more about finding the ARN of a CMK, see Working with Keys.
  7. Select Review and Launch. Your instance will launch with an additional Amazon EBS volume with the key that you selected. To learn more about the launch wizard, see Launching an Instance with Launch Wizard.

Creating Amazon EBS encrypted volumes from the AWS CLI or SDK

You also can use RunInstances to launch an instance with additional encrypted Amazon EBS volumes by setting Encrypted to true and adding kmsKeyID along with the actual key ID in the BlockDeviceMapping object, as shown in the following command:

$> aws ec2 run-instances –image-id ami-b42209de –count 1 –instance-type m4.large –region us-east-1 –block-device-mappings file://mapping.json

In this example, mapping.json describes the properties of the EBS volume that you want to create:

"DeviceName": "/dev/sda1",
"Ebs": {
"DeleteOnTermination": true,
"VolumeSize": 100,
"VolumeType": "gp2",
"Encrypted": true,
"kmsKeyID": "arn:aws:kms:us-east-1:012345678910:key/abcd1234-a123-456a-a12b-a123b4cd56ef"

You can also launch instances with additional encrypted EBS data volumes via an Auto Scaling or Spot Fleet by creating a launch template with the above BlockDeviceMapping. For example:

$> aws ec2 create-launch-template –MyLTName –image-id ami-b42209de –count 1 –instance-type m4.large –region us-east-1 –block-device-mappings file://mapping.json

To learn more about launching an instance with the AWS CLI or SDK, see the AWS CLI Command Reference.

In this blog post, we’ve demonstrated a single-step, streamlined process for creating Amazon EBS volumes that are encrypted under your CMK when you launch your EC2 instance, thereby streamlining your instance launch workflow. To start using this functionality, navigate to the EC2 console.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the Amazon EC2 forum or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

TV Broadcaster Wants App Stores Blocked to Prevent Piracy

Post Syndicated from Andy original https://torrentfreak.com/tv-broadcaster-wants-app-stores-blocked-to-prevent-piracy-180416/

After first targeting torrent and regular streaming platforms with blocking injunctions, last year Village Roadshow and studios including Disney, Universal, Warner Bros, Twentieth Century Fox, and Paramount began looking at a new threat.

The action targeted HDSubs+, a reasonably popular IPTV service that provides hundreds of otherwise premium live channels, movies, and sports for a relatively small monthly fee. The application was filed during October 2017 and targeted Australia’s largest ISPs.

In parallel, Hong Kong-based broadcaster Television Broadcasts Limited (TVB) launched a similar action, demanding that the same ISPs (including Telstra, Optus, TPG, and Vocus, plus subsidiaries) block several ‘pirate’ IPTV services, named in court as A1, BlueTV, EVPAD, FunTV, MoonBox, Unblock, and hTV5.

Due to the similarity of the cases, both applications were heard in Federal Court in Sydney on Friday. Neither case is as straightforward as blocking a torrent or basic streaming portal, so both applicants are having to deal with additional complexities.

The TVB case is of particular interest. Up to a couple of dozen URLs maintain the services, which are used to provide the content, an EPG (electronic program guide), updates and sundry other features. While most of these appear to fit the description of an “online location” designed to assist copyright infringement, where the Android-based software for the IPTV services is hosted provides an interesting dilemma.

ComputerWorld reports that the apps – which offer live broadcasts, video-on-demand, and catch-up TV – are hosted on as-yet-unnamed sites which are functionally similar to Google Play or Apple’s App Store. They’re repositories of applications that also carry non-infringing apps, such as those for Netflix and YouTube.

Nevertheless, despite clear knowledge of this dual use, TVB wants to have these app marketplaces blocked by Australian ISPs, which would not only render the illicit apps inaccessible to the public but all of the non-infringing ones too. Part of its argument that this action would be reasonable appears to be that legal apps – such as Netflix’s for example – can also be freely accessed elsewhere.

It will be up to Justice Nicholas to decide whether the “primary purpose” of these marketplaces is to infringe or facilitate the infringement of TVB’s copyrights. However, TVB also appears to have another problem which is directly connected to the copyright status in Australia of its China-focused live programming.

Justice Nicholas questioned whether watching a stream in Australia of TVB’s live Chinese broadcasts would amount to copyright infringement because no copy of that content is being made.

“If most of what is occurring here is a reproduction of broadcasts that are not protected by copyright, then the primary purpose is not to facilitate copyright infringement,” Justice Nicholas said.

One of the problems appears to be that China is not a party to the 1961 Rome Convention for the Protection of Performers, Producers of Phonograms and Broadcasting Organisations. However, TVB is arguing that it should still receive protection because it airs pre-recorded content and the live broadcasts are also archived for re-transmission via catch-up services.

The question over whether unchoreographed live broadcasts receive protection has been raised in other regions but in most cases, a workaround has been found. The presence of broadcaster logos on screen (which receive copyright protection) is a factor and it’s been reported that broadcasters are able to record the ‘live’ action and transmit a copy just a couple of seconds later, thereby broadcasting an already-copyrighted work.

While TVB attempts to overcome its issues, Village Roadshow is facing some of its own in its efforts to take down HDSubs+.

It appears that at least partly in response to the Roadshow legal action, the service has undergone some modifications, including a change of brand to ‘Press Play Extra’. As reported by ZDNet, there have been structural changes too, which means that Roadshow can no longer “see under the hood”.

According to Justice Nicholas, there is no evidence that the latest version of the app infringes copyright but according to counsel for Village Roadshow, the new app is merely transitional and preparing for a possible future change.

“We submit the difference to be drawn is reactive to my clients serving on the operators a notice,” counsel for Roadshow argued, with an expert describing the new app as “almost like a placeholder.”

In short, Roadshow still wants all of the target domains in its original application blocked because the company believes there’s a good chance they’ll be reactivated in the future.

None of the ISPs involved in either case turned up to the hearings on Friday, which removes one layer of complexity in what appears thus far to be less than straightforward cases.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

AWS AppSync – Production-Ready with Six New Features

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-appsync-production-ready-with-six-new-features/

If you build (or want to build) data-driven web and mobile apps and need real-time updates and the ability to work offline, you should take a look at AWS AppSync. Announced in preview form at AWS re:Invent 2017 and described in depth here, AWS AppSync is designed for use in iOS, Android, JavaScript, and React Native apps. AWS AppSync is built around GraphQL, an open, standardized query language that makes it easy for your applications to request the precise data that they need from the cloud.

I’m happy to announce that the preview period is over and that AWS AppSync is now generally available and production-ready, with six new features that will simplify and streamline your application development process:

Console Log Access – You can now see the CloudWatch Logs entries that are created when you test your GraphQL queries, mutations, and subscriptions from within the AWS AppSync Console.

Console Testing with Mock Data – You can now create and use mock context objects in the console for testing purposes.

Subscription Resolvers – You can now create resolvers for AWS AppSync subscription requests, just as you can already do for query and mutate requests.

Batch GraphQL Operations for DynamoDB – You can now make use of DynamoDB’s batch operations (BatchGetItem and BatchWriteItem) across one or more tables. in your resolver functions.

CloudWatch Support – You can now use Amazon CloudWatch Metrics and CloudWatch Logs to monitor calls to the AWS AppSync APIs.

CloudFormation Support – You can now define your schemas, data sources, and resolvers using AWS CloudFormation templates.

A Brief AppSync Review
Before diving in to the new features, let’s review the process of creating an AWS AppSync API, starting from the console. I click Create API to begin:

I enter a name for my API and (for demo purposes) choose to use the Sample schema:

The schema defines a collection of GraphQL object types. Each object type has a set of fields, with optional arguments:

If I was creating an API of my own I would enter my schema at this point. Since I am using the sample, I don’t need to do this. Either way, I click on Create to proceed:

The GraphQL schema type defines the entry points for the operations on the data. All of the data stored on behalf of a particular schema must be accessible using a path that begins at one of these entry points. The console provides me with an endpoint and key for my API:

It also provides me with guidance and a set of fully functional sample apps that I can clone:

When I clicked Create, AWS AppSync created a pair of Amazon DynamoDB tables for me. I can click Data Sources to see them:

I can also see and modify my schema, issue queries, and modify an assortment of settings for my API.

Let’s take a quick look at each new feature…

Console Log Access
The AWS AppSync Console already allows me to issue queries and to see the results, and now provides access to relevant log entries.In order to see the entries, I must enable logs (as detailed below), open up the LOGS, and check the checkbox. Here’s a simple mutation query that adds a new event. I enter the query and click the arrow to test it:

I can click VIEW IN CLOUDWATCH for a more detailed view:

To learn more, read Test and Debug Resolvers.

Console Testing with Mock Data
You can now create a context object in the console where it will be passed to one of your resolvers for testing purposes. I’ll add a testResolver item to my schema:

Then I locate it on the right-hand side of the Schema page and click Attach:

I choose a data source (this is for testing and the actual source will not be accessed), and use the Put item mapping template:

Then I click Select test context, choose Create New Context, assign a name to my test content, and click Save (as you can see, the test context contains the arguments from the query along with values to be returned for each field of the result):

After I save the new Resolver, I click Test to see the request and the response:

Subscription Resolvers
Your AWS AppSync application can monitor changes to any data source using the @aws_subscribe GraphQL schema directive and defining a Subscription type. The AWS AppSync client SDK connects to AWS AppSync using MQTT over Websockets and the application is notified after each mutation. You can now attach resolvers (which convert GraphQL payloads into the protocol needed by the underlying storage system) to your subscription fields and perform authorization checks when clients attempt to connect. This allows you to perform the same fine grained authorization routines across queries, mutations, and subscriptions.

To learn more about this feature, read Real-Time Data.

Batch GraphQL Operations
Your resolvers can now make use of DynamoDB batch operations that span one or more tables in a region. This allows you to use a list of keys in a single query, read records multiple tables, write records in bulk to multiple tables, and conditionally write or delete related records across multiple tables.

In order to use this feature the IAM role that you use to access your tables must grant access to DynamoDB’s BatchGetItem and BatchPutItem functions.

To learn more, read the DynamoDB Batch Resolvers tutorial.

CloudWatch Logs Support
You can now tell AWS AppSync to log API requests to CloudWatch Logs. Click on Settings and Enable logs, then choose the IAM role and the log level:

CloudFormation Support
You can use the following CloudFormation resource types in your templates to define AWS AppSync resources:

AWS::AppSync::GraphQLApi – Defines an AppSync API in terms of a data source (an Amazon Elasticsearch Service domain or a DynamoDB table).

AWS::AppSync::ApiKey – Defines the access key needed to access the data source.

AWS::AppSync::GraphQLSchema – Defines a GraphQL schema.

AWS::AppSync::DataSource – Defines a data source.

AWS::AppSync::Resolver – Defines a resolver by referencing a schema and a data source, and includes a mapping template for requests.

Here’s a simple schema definition in YAML form:

    Type: "AWS::AppSync::GraphQLSchema"
      - AppSyncGraphQLApi
      ApiId: !GetAtt AppSyncGraphQLApi.ApiId
      Definition: |
        schema {
          query: Query
          mutation: Mutation
        type Query {
          singlePost(id: ID!): Post
          allPosts: [Post]
        type Mutation {
          putPost(id: ID!, title: String!): Post
        type Post {
          id: ID!
          title: String!

Available Now
These new features are available now and you can start using them today! Here are a couple of blog posts and other resources that you might find to be of interest:




How to retain system tables’ data spanning multiple Amazon Redshift clusters and run cross-cluster diagnostic queries

Post Syndicated from Karthik Sonti original https://aws.amazon.com/blogs/big-data/how-to-retain-system-tables-data-spanning-multiple-amazon-redshift-clusters-and-run-cross-cluster-diagnostic-queries/

Amazon Redshift is a data warehouse service that logs the history of the system in STL log tables. The STL log tables manage disk space by retaining only two to five days of log history, depending on log usage and available disk space.

To retain STL tables’ data for an extended period, you usually have to create a replica table for every system table. Then, for each you load the data from the system table into the replica at regular intervals. By maintaining replica tables for STL tables, you can run diagnostic queries on historical data from the STL tables. You then can derive insights from query execution times, query plans, and disk-spill patterns, and make better cluster-sizing decisions. However, refreshing replica tables with live data from STL tables at regular intervals requires schedulers such as Cron or AWS Data Pipeline. Also, these tables are specific to one cluster and they are not accessible after the cluster is terminated. This is especially true for transient Amazon Redshift clusters that last for only a finite period of ad hoc query execution.

In this blog post, I present a solution that exports system tables from multiple Amazon Redshift clusters into an Amazon S3 bucket. This solution is serverless, and you can schedule it as frequently as every five minutes. The AWS CloudFormation deployment template that I provide automates the solution setup in your environment. The system tables’ data in the Amazon S3 bucket is partitioned by cluster name and query execution date to enable efficient joins in cross-cluster diagnostic queries.

I also provide another CloudFormation template later in this post. This second template helps to automate the creation of tables in the AWS Glue Data Catalog for the system tables’ data stored in Amazon S3. After the system tables are exported to Amazon S3, you can run cross-cluster diagnostic queries on the system tables’ data and derive insights about query executions in each Amazon Redshift cluster. You can do this using Amazon QuickSight, Amazon Athena, Amazon EMR, or Amazon Redshift Spectrum.

You can find all the code examples in this post, including the CloudFormation templates, AWS Glue extract, transform, and load (ETL) scripts, and the resolution steps for common errors you might encounter in this GitHub repository.

Solution overview

The solution in this post uses AWS Glue to export system tables’ log data from Amazon Redshift clusters into Amazon S3. The AWS Glue ETL jobs are invoked at a scheduled interval by AWS Lambda. AWS Systems Manager, which provides secure, hierarchical storage for configuration data management and secrets management, maintains the details of Amazon Redshift clusters for which the solution is enabled. The last-fetched time stamp values for the respective cluster-table combination are maintained in an Amazon DynamoDB table.

The following diagram covers the key steps involved in this solution.

The solution as illustrated in the preceding diagram flows like this:

  1. The Lambda function, invoke_rs_stl_export_etl, is triggered at regular intervals, as controlled by Amazon CloudWatch. It’s triggered to look up the AWS Systems Manager parameter store to get the details of the Amazon Redshift clusters for which the system table export is enabled.
  2. The same Lambda function, based on the Amazon Redshift cluster details obtained in step 1, invokes the AWS Glue ETL job designated for the Amazon Redshift cluster. If an ETL job for the cluster is not found, the Lambda function creates one.
  3. The ETL job invoked for the Amazon Redshift cluster gets the cluster credentials from the parameter store. It gets from the DynamoDB table the last exported time stamp of when each of the system tables was exported from the respective Amazon Redshift cluster.
  4. The ETL job unloads the system tables’ data from the Amazon Redshift cluster into an Amazon S3 bucket.
  5. The ETL job updates the DynamoDB table with the last exported time stamp value for each system table exported from the Amazon Redshift cluster.
  6. The Amazon Redshift cluster system tables’ data is available in Amazon S3 and is partitioned by cluster name and date for running cross-cluster diagnostic queries.

Understanding the configuration data

This solution uses AWS Systems Manager parameter store to store the Amazon Redshift cluster credentials securely. The parameter store also securely stores other configuration information that the AWS Glue ETL job needs for extracting and storing system tables’ data in Amazon S3. Systems Manager comes with a default AWS Key Management Service (AWS KMS) key that it uses to encrypt the password component of the Amazon Redshift cluster credentials.

The following table explains the global parameters and cluster-specific parameters required in this solution. The global parameters are defined once and applicable at the overall solution level. The cluster-specific parameters are specific to an Amazon Redshift cluster and repeat for each cluster for which you enable this post’s solution. The CloudFormation template explained later in this post creates these parameters as part of the deployment process.

Parameter name Type Description
Global parametersdefined once and applied to all jobs
redshift_query_logs.global.s3_prefix String The Amazon S3 path where the query logs are exported. Under this path, each exported table is partitioned by cluster name and date.
redshift_query_logs.global.tempdir String The Amazon S3 path that AWS Glue ETL jobs use for temporarily staging the data.
redshift_query_logs.global.role> String The name of the role that the AWS Glue ETL jobs assume. Just the role name is sufficient. The complete Amazon Resource Name (ARN) is not required.
redshift_query_logs.global.enabled_cluster_list StringList A comma-separated list of cluster names for which system tables’ data export is enabled. This gives flexibility for a user to exclude certain clusters.
Cluster-specific parametersfor each cluster specified in the enabled_cluster_list parameter
redshift_query_logs.<<cluster_name>>.connection String The name of the AWS Glue Data Catalog connection to the Amazon Redshift cluster. For example, if the cluster name is product_warehouse, the entry is redshift_query_logs.product_warehouse.connection.
redshift_query_logs.<<cluster_name>>.user String The user name that AWS Glue uses to connect to the Amazon Redshift cluster.
redshift_query_logs.<<cluster_name>>.password Secure String The password that AWS Glue uses to connect the Amazon Redshift cluster’s encrypted-by key that is managed in AWS KMS.

For example, suppose that you have two Amazon Redshift clusters, product-warehouse and category-management, for which the solution described in this post is enabled. In this case, the parameters shown in the following screenshot are created by the solution deployment CloudFormation template in the AWS Systems Manager parameter store.

Solution deployment

To make it easier for you to get started, I created a CloudFormation template that automatically configures and deploys the solution—only one step is required after deployment.


To deploy the solution, you must have one or more Amazon Redshift clusters in a private subnet. This subnet must have a network address translation (NAT) gateway or a NAT instance configured, and also a security group with a self-referencing inbound rule for all TCP ports. For more information about why AWS Glue ETL needs the configuration it does, described previously, see Connecting to a JDBC Data Store in a VPC in the AWS Glue documentation.

To start the deployment, launch the CloudFormation template:

CloudFormation stack parameters

The following table lists and describes the parameters for deploying the solution to export query logs from multiple Amazon Redshift clusters.

Property Default Description
S3Bucket mybucket The bucket this solution uses to store the exported query logs, stage code artifacts, and perform unloads from Amazon Redshift. For example, the mybucket/extract_rs_logs/data bucket is used for storing all the exported query logs for each system table partitioned by the cluster. The mybucket/extract_rs_logs/temp/ bucket is used for temporarily staging the unloaded data from Amazon Redshift. The mybucket/extract_rs_logs/code bucket is used for storing all the code artifacts required for Lambda and the AWS Glue ETL jobs.
ExportEnabledRedshiftClusters Requires Input A comma-separated list of cluster names from which the system table logs need to be exported.
DataStoreSecurityGroups Requires Input A list of security groups with an inbound rule to the Amazon Redshift clusters provided in the parameter, ExportEnabledClusters. These security groups should also have a self-referencing inbound rule on all TCP ports, as explained on Connecting to a JDBC Data Store in a VPC.

After you launch the template and create the stack, you see that the following resources have been created:

  1. AWS Glue connections for each Amazon Redshift cluster you provided in the CloudFormation stack parameter, ExportEnabledRedshiftClusters.
  2. All parameters required for this solution created in the parameter store.
  3. The Lambda function that invokes the AWS Glue ETL jobs for each configured Amazon Redshift cluster at a regular interval of five minutes.
  4. The DynamoDB table that captures the last exported time stamps for each exported cluster-table combination.
  5. The AWS Glue ETL jobs to export query logs from each Amazon Redshift cluster provided in the CloudFormation stack parameter, ExportEnabledRedshiftClusters.
  6. The IAM roles and policies required for the Lambda function and AWS Glue ETL jobs.

After the deployment

For each Amazon Redshift cluster for which you enabled the solution through the CloudFormation stack parameter, ExportEnabledRedshiftClusters, the automated deployment includes temporary credentials that you must update after the deployment:

  1. Go to the parameter store.
  2. Note the parameters <<cluster_name>>.user and redshift_query_logs.<<cluster_name>>.password that correspond to each Amazon Redshift cluster for which you enabled this solution. Edit these parameters to replace the placeholder values with the right credentials.

For example, if product-warehouse is one of the clusters for which you enabled system table export, you edit these two parameters with the right user name and password and choose Save parameter.

Querying the exported system tables

Within a few minutes after the solution deployment, you should see Amazon Redshift query logs being exported to the Amazon S3 location, <<S3Bucket_you_provided>>/extract_redshift_query_logs/data/. In that bucket, you should see the eight system tables partitioned by customer name and date: stl_alert_event_log, stl_dlltext, stl_explain, stl_query, stl_querytext, stl_scan, stl_utilitytext, and stl_wlm_query.

To run cross-cluster diagnostic queries on the exported system tables, create external tables in the AWS Glue Data Catalog. To make it easier for you to get started, I provide a CloudFormation template that creates an AWS Glue crawler, which crawls the exported system tables stored in Amazon S3 and builds the external tables in the AWS Glue Data Catalog.

Launch this CloudFormation template to create external tables that correspond to the Amazon Redshift system tables. S3Bucket is the only input parameter required for this stack deployment. Provide the same Amazon S3 bucket name where the system tables’ data is being exported. After you successfully create the stack, you can see the eight tables in the database, redshift_query_logs_db, as shown in the following screenshot.

Now, navigate to the Athena console to run cross-cluster diagnostic queries. The following screenshot shows a diagnostic query executed in Athena that retrieves query alerts logged across multiple Amazon Redshift clusters.

You can build the following example Amazon QuickSight dashboard by running cross-cluster diagnostic queries on Athena to identify the hourly query count and the key query alert events across multiple Amazon Redshift clusters.

How to extend the solution

You can extend this post’s solution in two ways:

  • Add any new Amazon Redshift clusters that you spin up after you deploy the solution.
  • Add other system tables or custom query results to the list of exports from an Amazon Redshift cluster.

Extend the solution to other Amazon Redshift clusters

To extend the solution to more Amazon Redshift clusters, add the three cluster-specific parameters in the AWS Systems Manager parameter store following the guidelines earlier in this post. Modify the redshift_query_logs.global.enabled_cluster_list parameter to append the new cluster to the comma-separated string.

Extend the solution to add other tables or custom queries to an Amazon Redshift cluster

The current solution ships with the export functionality for the following Amazon Redshift system tables:

  • stl_alert_event_log
  • stl_dlltext
  • stl_explain
  • stl_query
  • stl_querytext
  • stl_scan
  • stl_utilitytext
  • stl_wlm_query

You can easily add another system table or custom query by adding a few lines of code to the AWS Glue ETL job, <<cluster-name>_extract_rs_query_logs. For example, suppose that from the product-warehouse Amazon Redshift cluster you want to export orders greater than $2,000. To do so, add the following five lines of code to the AWS Glue ETL job product-warehouse_extract_rs_query_logs, where product-warehouse is your cluster name:

  1. Get the last-processed time-stamp value. The function creates a value if it doesn’t already exist.

salesLastProcessTSValue = functions.getLastProcessedTSValue(trackingEntry=”mydb.sales_2000",job_configs=job_configs)

  1. Run the custom query with the time stamp.

returnDF=functions.runQuery(query="select * from sales s join order o where o.order_amnt > 2000 and sale_timestamp > '{}'".format (salesLastProcessTSValue) ,tableName="mydb.sales_2000",job_configs=job_configs)

  1. Save the results to Amazon S3.


  1. Get the latest time-stamp value from the returned data frame in Step 2.


  1. Update the last-processed time-stamp value in the DynamoDB table.



In this post, I demonstrate a serverless solution to retain the system tables’ log data across multiple Amazon Redshift clusters. By using this solution, you can incrementally export the data from system tables into Amazon S3. By performing this export, you can build cross-cluster diagnostic queries, build audit dashboards, and derive insights into capacity planning by using services such as Athena. I also demonstrate how you can extend this solution to other ad hoc query use cases or tables other than system tables by adding a few lines of code.

Additional Reading

If you found this post useful, be sure to check out Using Amazon Redshift Spectrum, Amazon Athena, and AWS Glue with Node.js in Production and Amazon Redshift – 2017 Recap.

About the Author

Karthik Sonti is a senior big data architect at Amazon Web Services. He helps AWS customers build big data and analytical solutions and provides guidance on architecture and best practices.





MPAA Quietly Shut Down Its ‘Legal’ Movie Search Engine

Post Syndicated from Ernesto original https://torrentfreak.com/mpaa-quietly-shut-down-its-legal-movie-search-engine-180411/

During the fall of 2014, Hollywood launched WhereToWatch, its very own search engine for movies and TV-shows.

The site enabled people to check if and where the latest entertainment was available, hoping to steer U.S. visitors away from pirate sites.

Aside from the usual critics, the launch received a ton of favorable press. This was soon followed up by another release highlighting some of the positive responses and praise from the press.

“The initiative marks a further attempt by the MPAA to combat rampant online piracy by reminding consumers of legal means to watch movies and TV shows,” the LA Times wrote, for example.

Over the past several years, the site hasn’t appeared in the news much, but it did help thousands of people find legal sources for the latest entertainment. However, those who try to access it today will notice that WhereToWatch has been abandoned, quietly.

The MPAA pulled the plug on the service a few months ago. And where the mainstream media covered its launch in detail, the shutdown received zero mentions. So why did the site fold?

According to MPAA Vice President of Corporate Communications, Chris Ortman, it was no longer needed as there are many similar search engines out there.

“Given the many search options commercially available today, which can be found on the MPAA website, WheretoWatch.com was discontinued at the conclusion of 2017,” Ortman informs TF.

“There are more than 140 lawful online platforms in the United States for accessing film and television content, and more than 460 around the world,” he adds.

The MPAA lists several of these alternative search engines on its new website. The old WhereToWatch domain now forwards to the MPAA’s online magazine ‘The Credits,’ which features behind-the-scenes stories and industry profiles.

While the MPAA is right that there are alternative search engines, many of these were already available when WhereToWatch launched. In fact, the site used the services of the competing service GoWatchIt for its search results.

Perhaps the lack of interest from the U.S. public played a role as well. The site never really took off and according to traffic estimates from SimilarWeb and Alexa, most of the visitors came from Iran, where the site was unusable due to a geo-block.

After searching long and hard we were able to track down a former WhereToWatch user on Reddit. This person just started to get into the service and was disappointed to see it go.

“So, does anyone know of better places or simply other places where this information lives in an easily accessible place?” he or she asked.

One person responded by recommending Icefilms.info, a pirate site. This is a response the MPAA would cringe at, but luckily, most people mentioned justwatch.com as the best alternative.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

[$] A new package index for Python

Post Syndicated from jake original https://lwn.net/Articles/751458/rss

The Python Package Index (PyPI) is
the principal repository of libraries for the Python programming language,
serving more than 170 million downloads each week. Fifteen years after PyPI
launched, a new edition is in beta at pypi.org, with features like better
search, a refreshed layout, and Markdown README files
(and with some old
features removed, like viewing GPG package signatures). Starting
April 16, users visiting the site or running pip install will
seamlessly redirected to the new site. Two weeks after that, the legacy site is
expected to be shut down and the team will turn toward new
features; in the meantime, it is worth a look at what the new PyPI brings
to the table.

More power to your Pi

Post Syndicated from James Adams original https://www.raspberrypi.org/blog/pi-power-supply-chip/

It’s been just over three weeks since we launched the new Raspberry Pi 3 Model B+. Although the product is branded Raspberry Pi 3B+ and not Raspberry Pi 4, a serious amount of engineering was involved in creating it. The wireless networking, USB/Ethernet hub, on-board power supplies, and BCM2837 chip were all upgraded: together these represent almost all the circuitry on the board! Today, I’d like to tell you about the work that has gone into creating a custom power supply chip for our newest computer.

Raspberry Pi 3 Model B+, with custome power supply chip

The new Raspberry Pi 3B+, sporting a new, custom power supply chip (bottom left-hand corner)

Successful launch

The Raspberry Pi 3B+ has been well received, and we’ve enjoyed hearing feedback from the community as well as reading the various reviews and articles highlighting the solid improvements in wireless networking, Ethernet, CPU, and thermal performance of the new board. Gareth Halfacree’s post here has some particularly nice graphs showing the increased performance as well as how the Pi 3B+ keeps cool under load due to the new CPU package that incorporates a metal heat spreader. The Raspberry Pi production lines at the Sony UK Technology Centre are running at full speed, and it seems most people who want to get hold of the new board are able to find one in stock.

Powering your Pi

One of the most critical but often under-appreciated elements of any electronic product, particularly one such as Raspberry Pi with lots of complex on-board silicon (processor, networking, high-speed memory), is the power supply. In fact, the Raspberry Pi 3B+ has no fewer than six different voltage rails: two at 3.3V — one special ‘quiet’ one for audio, and one for everything else; 1.8V; 1.2V for the LPDDR2 memory; and 1.2V nominal for the CPU core. Note that the CPU voltage is actually raised and lowered on the fly as the speed of the CPU is increased and decreased depending on how hard the it is working. The sixth rail is 5V, which is the master supply that all the others are created from, and the output voltage for the four downstream USB ports; this is what the mains power adaptor is supplying through the micro USB power connector.

Power supply primer

There are two common classes of power supply circuits: linear regulators and switching regulators. Linear regulators work by creating a lower, regulated voltage from a higher one. In simple terms, they monitor the output voltage against an internally generated reference and continually change their own resistance to keep the output voltage constant. Switching regulators work in a different way: they ‘pump’ energy by first storing the energy coming from the source supply in a reactive component (usually an inductor, sometimes a capacitor) and then releasing it to the regulated output supply. The switches in switching regulators effect this energy transfer by first connecting the inductor (or capacitor) to store the source energy, and then switching the circuit so the energy is released to its destination.

Linear regulators produce smoother, less noisy output voltages, but they can only convert to a lower voltage, and have to dissipate energy to do so. The higher the output current and the voltage difference across them is, the more energy is lost as heat. On the other hand, switching supplies can, depending on their design, convert any voltage to any other voltage and can be much more efficient (efficiencies of 90% and above are not uncommon). However, they are more complex and generate noisier output voltages.

Designers use both types of regulators depending on the needs of the downstream circuit: for low-voltage drops, low current, or low noise, linear regulators are usually the right choice, while switching regulators are used for higher power or when efficiency of conversion is required. One of the simplest switching-mode power supply circuits is the buck converter, used to create a lower voltage from a higher one, and this is what we use on the Pi.

A history lesson

The BCM2835 processor chip (found on the original Raspberry Pi Model B and B+, as well as on the Zero products) has on-chip power supplies: one switch-mode regulator for the core voltage, as well as a linear one for the LPDDR2 memory supply. This meant that in addition to 5V, we only had to provide 3.3V and 1.8V on the board, which was relatively simple to do using cheap, off-the-shelf parts.

Pi Zero sporting a BCM2835 processor which only needs 2 external switchers (the components clustered behind the camera port)

When we moved to the BCM2836 for Raspberry Pi Model 2 (and subsequently to the BCM2837A1 and B0 for Raspberry Pi 3B and 3B+), the core supply and the on-chip LPDDR2 memory supply were not up to the job of supplying the extra processor cores and larger memory, so we removed them. (We also used the recovered chip area to help fit in the new quad-core ARM processors.) The upshot of this was that we had to supply these power rails externally for the Raspberry Pi 2 and models thereafter. Moreover, we also had to provide circuitry to sequence them correctly in order to control exactly when they power up compared to the other supplies on the board.

Power supply design is tricky (but critical)

Raspberry Pi boards take in 5V from the micro USB socket and have to generate the other required supplies from this. When 5V is first connected, each of these other supplies must ‘start up’, meaning go from ‘off’, or 0V, to their correct voltage in some short period of time. The order of the supplies starting up is often important: commonly, there are structures inside a chip that form diodes between supply rails, and bringing supplies up in the wrong order can sometimes ‘turn on’ these diodes, causing them to conduct, with undesirable consequences. Silicon chips come with a data sheet specifying what supplies (voltages and currents) are needed and whether they need to be low-noise, in what order they must power up (and in some cases down), and sometimes even the rate at which the voltages must power up and down.

A Pi3. Power supply components are clustered bottom left next to the micro USB, middle (above LPDDR2 chip which is on the bottom of the PCB) and above the A/V jack.

In designing the power chain for the Pi 2 and 3, the sequencing was fairly straightforward: power rails power up in order of voltage (5V, 3.3V, 1.8V, 1.2V). However, the supplies were all generated with individual, discrete devices. Therefore, I spent quite a lot of time designing circuitry to control the sequencing — even with some design tricks to reduce component count, quite a few sequencing components are required. More complex systems generally use a Power Management Integrated Circuit (PMIC) with multiple supplies on a single chip, and many different PMIC variants are made by various manufacturers. Since Raspberry Pi 2 days, I was looking for a suitable PMIC to simplify the Pi design, but invariably (and somewhat counter-intuitively) these were always too expensive compared to my discrete solution, usually because they came with more features than needed.

One device to rule them all

It was way back in May 2015 when I first chatted to Peter Coyle of Exar (Exar were bought by MaxLinear in 2017) about power supply products for Raspberry Pi. We didn’t find a product match then, but in June 2016 Peter, along with Tuomas Hollman and Trevor Latham, visited to pitch the possibility of building a custom power management solution for us.

I was initially sceptical that it could be made cheap enough. However, our discussion indicated that if we could tailor the solution to just what we needed, it could be cost-effective. Over the coming weeks and months, we honed a specification we agreed on from the initial sketches we’d made, and Exar thought they could build it for us at the target price.

The chip we designed would contain all the key supplies required for the Pi on one small device in a cheap QFN package, and it would also perform the required sequencing and voltage monitoring. Moreover, the chip would be flexible to allow adjustment of supply voltages from their default values via I2C; the largest supply would be capable of being adjusted quickly to perform the dynamic core voltage changes needed in order to reduce voltage to the processor when it is idling (to save power), and to boost voltage to the processor when running at maximum speed (1.4 GHz). The supplies on the chip would all be generously specified and could deliver significantly more power than those used on the Raspberry Pi 3. All in all, the chip would contain four switching-mode converters and one low-current linear regulator, this last one being low-noise for the audio circuitry.

The MXL7704 chip

The project was a great success: MaxLinear delivered working samples of first silicon at the end of May 2017 (almost exactly a year after we had kicked off the project), and followed through with production quantities in December 2017 in time for the Raspberry Pi 3B+ production ramp.

The team behind the power supply chip on the Raspberry Pi 3 Model B+ (group of six men, two of whom are holding Raspberry Pi boards)

Front row: Roger with the very first Pi 3B+ prototypes and James with a MXL7704 development board hacked to power a Pi 3. Back row left to right: Will Torgerson, Trevor Latham, Peter Coyle, Tuomas Hollman.

The MXL7704 device has been key to reducing Pi board complexity and therefore overall bill of materials cost. Furthermore, by being able to deliver more power when needed, it has also been essential to increasing the speed of the (newly packaged) BCM2837B0 processor on the 3B+ to 1.4GHz. The result is improvements to both the continuous output current to the CPU (from 3A to 4A) and to the transient performance (i.e. the chip has helped to reduce the ‘transient response’, which is the change in supply voltage due to a sudden current spike that occurs when the processor suddenly demands a large current in a few nanoseconds, as modern CPUs tend to do).

With the MXL7704, the power supply circuitry on the 3B+ is now a lot simpler than the Pi 3B design. This new supply also provides the LPDDR2 memory voltage directly from a switching regulator rather than using linear regulators like the Pi 3, thereby improving energy efficiency. This helps to somewhat offset the extra power that the faster Ethernet, wireless networking, and processor consume. A pleasing side effect of using the new chip is the symmetric board layout of the regulators — it’s easy to see the four switching-mode supplies, given away by four similar-looking blobs (three grey and one brownish), which are the inductors.

Close-up of the power supply chip on the Raspberry Pi 3 Model B+

The Pi 3B+ PMIC MXL7704 — pleasingly symmetric


It takes a lot of effort to design a new chip from scratch and get it all the way through to production — we are very grateful to the team at MaxLinear for their hard work, dedication, and enthusiasm. We’re also proud to have created something that will not only power Raspberry Pis, but will also be useful for other product designs: it turns out when you have a low-cost and flexible device, it can be used for many things — something we’re fairly familiar with here at Raspberry Pi! For the curious, the product page (including the data sheet) for the MXL7704 chip is here. Particular thanks go to Peter Coyle, Tuomas Hollman, and Trevor Latham, and also to Jon Cronk, who has been our contact in the US and has had to get up early to attend all our conference calls!

The MXL7704 design team celebrating on Pi Day — it takes a lot of people to design a chip!

I hope you liked reading about some of the effort that has gone into creating the new Pi. It’s nice to finally have a chance to tell people about some of the (increasingly complex) technical work that makes building a $35 computer possible — we’re very pleased with the Raspberry Pi 3B+, and we hope you enjoy using it as much as we’ve enjoyed creating it!

The post More power to your Pi appeared first on Raspberry Pi.

DARPA Funding in AI-Assisted Cybersecurity

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/04/darpa_funding_i.html

DARPA is launching a program aimed at vulnerability discovery via human-assisted AI. The new DARPA program is called CHESS (Computers and Humans Exploring Software Security), and they’re holding a proposers day in a week and a half.

This is the kind of thing that can dramatically change the offense/defense balance.