Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/01/podcast_intervi_5.html
Nice interview with the EFF’s director of cybersecurity, Eva Galperin.
Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/01/podcast_intervi_5.html
Nice interview with the EFF’s director of cybersecurity, Eva Galperin.
Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/ec2-instance-update-m5-instances-with-local-nvme-storage-m5d/
Earlier this month we launched the C5 Instances with Local NVMe Storage and I told you that we would be doing the same for additional instance types in the near future!
Today we are introducing M5 instances equipped with local NVMe storage. Available for immediate use in 5 regions, these instances are a great fit for workloads that require a balance of compute and memory resources. Here are the specs:
|Instance Name||vCPUs||RAM||Local Storage||EBS-Optimized Bandwidth||Network Bandwidth|
|m5d.large||2||8 GiB||1 x 75 GB NVMe SSD||Up to 2.120 Gbps||Up to 10 Gbps|
|m5d.xlarge||4||16 GiB||1 x 150 GB NVMe SSD||Up to 2.120 Gbps||Up to 10 Gbps|
|m5d.2xlarge||8||32 GiB||1 x 300 GB NVMe SSD||Up to 2.120 Gbps||Up to 10 Gbps|
|m5d.4xlarge||16||64 GiB||1 x 600 GB NVMe SSD||2.210 Gbps||Up to 10 Gbps|
|m5d.12xlarge||48||192 GiB||2 x 900 GB NVMe SSD||5.0 Gbps||10 Gbps|
|m5d.24xlarge||96||384 GiB||4 x 900 GB NVMe SSD||10.0 Gbps||25 Gbps|
The M5d instances are powered by Custom Intel® Xeon® Platinum 8175M series processors running at 2.5 GHz, including support for AVX-512.
You can use any AMI that includes drivers for the Elastic Network Adapter (ENA) and NVMe; this includes the latest Amazon Linux, Microsoft Windows (Server 2008 R2, Server 2012, Server 2012 R2 and Server 2016), Ubuntu, RHEL, SUSE, and CentOS AMIs.
Here are a couple of things to keep in mind about the local NVMe storage on the M5d instances:
Naming – You don’t have to specify a block device mapping in your AMI or during the instance launch; the local storage will show up as one or more devices (
/dev/nvme*1 on Linux) after the guest operating system has booted.
Encryption – Each local NVMe device is hardware encrypted using the XTS-AES-256 block cipher and a unique key. Each key is destroyed when the instance is stopped or terminated.
Lifetime – Local NVMe devices have the same lifetime as the instance they are attached to, and do not stick around after the instance has been stopped or terminated.
M5d instances are available in On-Demand, Reserved Instance, and Spot form in the US East (N. Virginia), US West (Oregon), EU (Ireland), US East (Ohio), and Canada (Central) Regions. Prices vary by Region, and are just a bit higher than for the equivalent M5 instances.
Post Syndicated from Devin Watson original https://aws.amazon.com/blogs/aws/aws-online-tech-talks-june-2018/
AWS Online Tech Talks – June 2018
Join us this month to learn about AWS services and solutions. New this month, we have a fireside chat with the GM of Amazon WorkSpaces and our 2nd episode of the “How to re:Invent” series. We’ll also cover best practices, deep dives, use cases and more! Join us and register today!
Note – All sessions are free and in Pacific Time.
Tech talks featured this month:
Analytics & Big Data
June 18, 2018 | 11:00 AM – 11:45 AM PT – Get Started with Real-Time Streaming Data in Under 5 Minutes – Learn how to use Amazon Kinesis to capture, store, and analyze streaming data in real-time including IoT device data, VPC flow logs, and clickstream data.
June 20, 2018 | 11:00 AM – 11:45 AM PT – Insights For Everyone – Deploying Data across your Organization – Learn how to deploy data at scale using AWS Analytics and QuickSight’s new reader role and usage based pricing.
June 13, 2018 | 05:00 PM – 05:30 PM PT – Episode 2: AWS re:Invent Breakout Content Secret Sauce – Hear from one of our own AWS content experts as we dive deep into the re:Invent content strategy and how we maintain a high bar.
June 25, 2018 | 01:00 PM – 01:45 PM PT – Accelerating Containerized Workloads with Amazon EC2 Spot Instances – Learn how to efficiently deploy containerized workloads and easily manage clusters at any scale at a fraction of the cost with Spot Instances.
June 26, 2018 | 01:00 PM – 01:45 PM PT – Ensuring Your Windows Server Workloads Are Well-Architected – Get the benefits, best practices and tools on running your Microsoft Workloads on AWS leveraging a well-architected approach.
June 25, 2018 | 09:00 AM – 09:45 AM PT – Running Kubernetes on AWS – Learn about the basics of running Kubernetes on AWS including how setup masters, networking, security, and add auto-scaling to your cluster.
June 18, 2018 | 01:00 PM – 01:45 PM PT – Oracle to Amazon Aurora Migration, Step by Step – Learn how to migrate your Oracle database to Amazon Aurora.
June 20, 2018 | 09:00 AM – 09:45 AM PT – Set Up a CI/CD Pipeline for Deploying Containers Using the AWS Developer Tools – Learn how to set up a CI/CD pipeline for deploying containers using the AWS Developer Tools.
Enterprise & Hybrid
June 18, 2018 | 09:00 AM – 09:45 AM PT – De-risking Enterprise Migration with AWS Managed Services – Learn how enterprise customers are de-risking cloud adoption with AWS Managed Services.
June 19, 2018 | 11:00 AM – 11:45 AM PT – Launch AWS Faster using Automated Landing Zones – Learn how the AWS Landing Zone can automate the set up of best practice baselines when setting up new
June 21, 2018 | 11:00 AM – 11:45 AM PT – Leading Your Team Through a Cloud Transformation – Learn how you can help lead your organization through a cloud transformation.
June 21, 2018 | 01:00 PM – 01:45 PM PT – Enabling New Retail Customer Experiences with Big Data – Learn how AWS can help retailers realize actual value from their big data and deliver on differentiated retail customer experiences.
June 28, 2018 | 01:00 PM – 01:45 PM PT – Fireside Chat: End User Collaboration on AWS – Learn how End User Compute services can help you deliver access to desktops and applications anywhere, anytime, using any device.
June 27, 2018 | 11:00 AM – 11:45 AM PT – AWS IoT in the Connected Home – Learn how to use AWS IoT to build innovative Connected Home products.
June 19, 2018 | 09:00 AM – 09:45 AM PT – Integrating Amazon SageMaker into your Enterprise – Learn how to integrate Amazon SageMaker and other AWS Services within an Enterprise environment.
June 21, 2018 | 09:00 AM – 09:45 AM PT – Building Text Analytics Applications on AWS using Amazon Comprehend – Learn how you can unlock the value of your unstructured data with NLP-based text analytics.
June 20, 2018 | 01:00 PM – 01:45 PM PT – Optimizing Application Performance and Costs with Auto Scaling – Learn how selecting the right scaling option can help optimize application performance and costs.
June 25, 2018 | 11:00 AM – 11:45 AM PT – Drive User Engagement with Amazon Pinpoint – Learn how Amazon Pinpoint simplifies and streamlines effective user engagement.
Security, Identity & Compliance
June 26, 2018 | 09:00 AM – 09:45 AM PT – Understanding AWS Secrets Manager – Learn how AWS Secrets Manager helps you rotate and manage access to secrets centrally.
June 28, 2018 | 09:00 AM – 09:45 AM PT – Using Amazon Inspector to Discover Potential Security Issues – See how Amazon Inspector can be used to discover security issues of your instances.
June 19, 2018 | 01:00 PM – 01:45 PM PT – Productionize Serverless Application Building and Deployments with AWS SAM – Learn expert tips and techniques for building and deploying serverless applications at scale with AWS SAM.
June 26, 2018 | 11:00 AM – 11:45 AM PT – Deep Dive: Hybrid Cloud Storage with AWS Storage Gateway – Learn how you can reduce your on-premises infrastructure by using the AWS Storage Gateway to connecting your applications to the scalable and reliable AWS storage services.
June 27, 2018 | 01:00 PM – 01:45 PM PT – Changing the Game: Extending Compute Capabilities to the Edge – Discover how to change the game for IIoT and edge analytics applications with AWS Snowball Edge plus enhanced Compute instances.
June 28, 2018 | 11:00 AM – 11:45 AM PT – Big Data and Analytics Workloads on Amazon EFS – Get best practices and deployment advice for running big data and analytics workloads on Amazon EFS.
Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/06/e-mail_vulnerab.html
Last week, researchers disclosed vulnerabilities in a large number of encrypted e-mail clients: specifically, those that use OpenPGP and S/MIME, including Thunderbird and AppleMail. These are serious vulnerabilities: An attacker who can alter mail sent to a vulnerable client can trick that client into sending a copy of the plaintext to a web server controlled by that attacker. The story of these vulnerabilities and the tale of how they were disclosed illustrate some important lessons about security vulnerabilities in general and e-mail security in particular.
But first, if you use PGP or S/MIME to encrypt e-mail, you need to check the list on this page and see if you are vulnerable. If you are, check with the vendor to see if they’ve fixed the vulnerability. (Note that some early patches turned out not to fix the vulnerability.) If not, stop using the encrypted e-mail program entirely until it’s fixed. Or, if you know how to do it, turn off your e-mail client’s ability to process HTML e-mail or — even better — stop decrypting e-mails from within the client. There’s even more complex advice for more sophisticated users, but if you’re one of those, you don’t need me to explain this to you.
Consider your encrypted e-mail insecure until this is fixed.
All software contains security vulnerabilities, and one of the primary ways we all improve our security is by researchers discovering those vulnerabilities and vendors patching them. It’s a weird system: Corporate researchers are motivated by publicity, academic researchers by publication credentials, and just about everyone by individual fame and the small bug-bounties paid by some vendors.
Software vendors, on the other hand, are motivated to fix vulnerabilities by the threat of public disclosure. Without the threat of eventual publication, vendors are likely to ignore researchers and delay patching. This happened a lot in the 1990s, and even today, vendors often use legal tactics to try to block publication. It makes sense; they look bad when their products are pronounced insecure.
Over the past few years, researchers have started to choreograph vulnerability announcements to make a big press splash. Clever names — the e-mail vulnerability is called “Efail” — websites, and cute logos are now common. Key reporters are given advance information about the vulnerabilities. Sometimes advance teasers are released. Vendors are now part of this process, trying to announce their patches at the same time the vulnerabilities are announced.
This simultaneous announcement is best for security. While it’s always possible that some organization — either government or criminal — has independently discovered and is using the vulnerability before the researchers go public, use of the vulnerability is essentially guaranteed after the announcement. The time period between announcement and patching is the most dangerous, and everyone except would-be attackers wants to minimize it.
Things get much more complicated when multiple vendors are involved. In this case, Efail isn’t a vulnerability in a particular product; it’s a vulnerability in a standard that is used in dozens of different products. As such, the researchers had to ensure both that everyone knew about the vulnerability in time to fix it and that no one leaked the vulnerability to the public during that time. As you can imagine, that’s close to impossible.
Efail was discovered sometime last year, and the researchers alerted dozens of different companies between last October and March. Some companies took the news more seriously than others. Most patched. Amazingly, news about the vulnerability didn’t leak until the day before the scheduled announcement date. Two days before the scheduled release, the researchers unveiled a teaser — honestly, a really bad idea — which resulted in details leaking.
After the leak, the Electronic Frontier Foundation posted a notice about the vulnerability without details. The organization has been criticized for its announcement, but I am hard-pressed to find fault with its advice. (Note: I am a board member at EFF.) Then, the researchers published — and lots of press followed.
All of this speaks to the difficulty of coordinating vulnerability disclosure when it involves a large number of companies or — even more problematic — communities without clear ownership. And that’s what we have with OpenPGP. It’s even worse when the bug involves the interaction between different parts of a system. In this case, there’s nothing wrong with PGP or S/MIME in and of themselves. Rather, the vulnerability occurs because of the way many e-mail programs handle encrypted e-mail. GnuPG, an implementation of OpenPGP, decided that the bug wasn’t its fault and did nothing about it. This is arguably true, but irrelevant. They should fix it.
Expect more of these kinds of problems in the future. The Internet is shifting from a set of systems we deliberately use — our phones and computers — to a fully immersive Internet-of-things world that we live in 24/7. And like this e-mail vulnerability, vulnerabilities will emerge through the interactions of different systems. Sometimes it will be obvious who should fix the problem. Sometimes it won’t be. Sometimes it’ll be two secure systems that, when they interact in a particular way, cause an insecurity. In April, I wrote about a vulnerability that arose because Google and Netflix make different assumptions about e-mail addresses. I don’t even know who to blame for that one.
It gets even worse. Our system of disclosure and patching assumes that vendors have the expertise and ability to patch their systems, but that simply isn’t true for many of the embedded and low-cost Internet of things software packages. They’re designed at a much lower cost, often by offshore teams that come together, create the software, and then disband; as a result, there simply isn’t anyone left around to receive vulnerability alerts from researchers and write patches. Even worse, many of these devices aren’t patchable at all. Right now, if you own a digital video recorder that’s vulnerable to being recruited for a botnet — remember Mirai from 2016? — the only way to patch it is to throw it away and buy a new one.
Patching is starting to fail, which means that we’re losing the best mechanism we have for improving software security at exactly the same time that software is gaining autonomy and physical agency. Many researchers and organizations, including myself, have proposed government regulations enforcing minimal security standards for Internet-of-things devices, including standards around vulnerability disclosure and patching. This would be expensive, but it’s hard to see any other viable alternative.
Getting back to e-mail, the truth is that it’s incredibly difficult to secure well. Not because the cryptography is hard, but because we expect e-mail to do so many things. We use it for correspondence, for conversations, for scheduling, and for record-keeping. I regularly search my 20-year e-mail archive. The PGP and S/MIME security protocols are outdated, needlessly complicated and have been difficult to properly use the whole time. If we could start again, we would design something better and more user friendlybut the huge number of legacy applications that use the existing standards mean that we can’t. I tell people that if they want to communicate securely with someone, to use one of the secure messaging systems: Signal, Off-the-Record, or — if having one of those two on your system is itself suspicious — WhatsApp. Of course they’re not perfect, as last week’s announcement of a vulnerability (patched within hours) in Signal illustrates. And they’re not as flexible as e-mail, but that makes them easier to secure.
This essay previously appeared on Lawfare.com.
Post Syndicated from Richard Hayler original https://www.raspberrypi.org/blog/build-your-own-weather-station/
One of the most common enquiries I receive at Pi Towers is “How can I get my hands on a Raspberry Pi Oracle Weather Station?” Now the answer is: “Why not build your own version using our guide?”
In 2016 we sent out nearly 1000 Raspberry Pi Oracle Weather Station kits to schools from around the world who had applied to be part of our weather station programme. In the original kit was a special HAT that allows the Pi to collect weather data with a set of sensors.
We designed the HAT to enable students to create their own weather stations and mount them at their schools. As part of the programme, we also provide an ever-growing range of supporting resources. We’ve seen Oracle Weather Stations in great locations with a huge differences in climate, and they’ve even recorded the effects of a solar eclipse.
We only had a single batch of HATs made, and unfortunately we’ve given nearly* all the Weather Station kits away. Not only are the kits really popular, we also receive lots of questions about how to add extra sensors or how to take more precise measurements of a particular weather phenomenon. So today, to satisfy your demand for a hackable weather station, we’re launching our Build your own weather station guide!
Our guide suggests the use of many of the sensors from the Oracle Weather Station kit, so can build a station that’s as close as possible to the original. As you know, the Raspberry Pi is incredibly versatile, and we’ve made it easy to hack the design in case you want to use different sensors.
Many other tutorials for Pi-powered weather stations don’t explain how the various sensors work or how to store your data. Ours goes into more detail. It shows you how to put together a breadboard prototype, it describes how to write Python code to take readings in different ways, and it guides you through recording these readings in a database.
There’s also a section on how to make your station weatherproof. And in case you want to move past the breadboard stage, we also help you with that. The guide shows you how to solder together all the components, similar to the original Oracle Weather Station HAT.
We think this is a great project to tackle at home, at a STEM club, Scout group, or CoderDojo, and we’re sure that many of you will be chomping at the bit to get started. Before you do, please note that we’ve designed the build to be as straight-forward as possible, but it’s still fairly advanced both in terms of electronics and programming. You should read through the whole guide before purchasing any components.
The sensors and components we’re suggesting balance cost, accuracy, and easy of use. Depending on what you want to use your station for, you may wish to use different components. Similarly, the final soldered design in the guide may not be the most elegant, but we think it is achievable for someone with modest soldering experience and basic equipment.
You can build a functioning weather station without soldering with our guide, but the build will be more durable if you do solder it. If you’ve never tried soldering before, that’s OK: we have a Getting started with soldering resource plus video tutorial that will walk you through how it works step by step.
For those of you who are more experienced makers, there are plenty of different ways to put the final build together. We always like to hear about alternative builds, so please post your designs in the Weather Station forum.
Our next step is publishing supplementary guides for adding extra functionality to your weather station. We’d love to hear which enhancements you would most like to see! Our current ideas under development include adding a webcam, making a tweeting weather station, adding a light/UV meter, and incorporating a lightning sensor. Let us know which of these is your favourite, or suggest your own amazing ideas in the comments!
*We do have a very small number of kits reserved for interesting projects or locations: a particularly cool experiment, a novel idea for how the Oracle Weather Station could be used, or places with specific weather phenomena. If have such a project in mind, please send a brief outline to [email protected], and we’ll consider how we might be able to help you.
The post Build your own weather station with our new guide! appeared first on Raspberry Pi.
Post Syndicated from Andy original https://torrentfreak.com/flight-sim-company-threatens-reddit-mods-over-libellous-drm-posts-180604/
The story began when a Reddit user reported something unusual in his download of FlightSimLabs’ A320X module. A file – test.exe – was being flagged up as a ‘Chrome Password Dump’ tool, something which rang alarm bells among flight sim fans.
As additional information was made available, the story became even more sensational. After first dodging the issue with carefully worded statements, FlightSimLabs admitted that it had installed a password dumper onto ALL users’ machines – whether they were pirates or not – in an effort to catch a particular software cracker and launch legal action.
It was an incredible story that no doubt did damage to FlightSimLabs’ reputation. But the now the company is at the center of a new storm, again centered around anti-piracy measures and again focused on Reddit.
Just before the weekend, Reddit user /u/walkday reported finding something unusual in his A320X module, the same module that caused the earlier controversy.
“The latest installer of FSLabs’ A320X puts two cmdhost.exe files under ‘system32\’ and ‘SysWOW64\’ of my Windows directory. Despite the name, they don’t open a command-line window,” he reported.
“They’re a part of the authentication because, if you remove them, the A320X won’t get loaded. Does someone here know more about cmdhost.exe? Why does FSLabs give them such a deceptive name and put them in the system folders? I hate them for polluting my system folder unless, of course, it is a dll used by different applications.”
Needless to say, the news that FSLabs were putting files into system folders named to make them look like system files was not well received.
“Hiding something named to resemble Window’s “Console Window Host” process in system folders is a huge red flag,” one user wrote.
“It’s a malware tactic used to deceive users into thinking the executable is a part of the OS, thus being trusted and not deleted. Really dodgy tactic, don’t trust it and don’t trust them,” opined another.
With a disenchanted Reddit userbase simmering away in the background, FSLabs took to Facebook with a statement to quieten down the masses.
“Over the past few hours we have become aware of rumors circulating on social media about the cmdhost file installed by the A320-X and wanted to clear up any confusion or misunderstanding,” the company wrote.
“cmdhost is part of our eSellerate infrastructure – which communicates between the eSellerate server and our product activation interface. It was designed to reduce the number of product activation issues people were having after the FSX release – which have since been resolved.”
The company noted that the file had been checked by all major anti-virus companies and everything had come back clean, which does indeed appear to be the case. Nevertheless, the critical Reddit thread remained, bemoaning the actions of a company which probably should have known better than to irritate fans after February’s debacle. In response, however, FSLabs did just that once again.
In private messages to the moderators of the /r/flightsim sub-Reddit, FSLabs’ Marketing and PR Manager Simon Kelsey suggested that the mods should do something about the thread in question or face possible legal action.
“Just a gentle reminder of Reddit’s obligations as a publisher in order to ensure that any libelous content is taken down as soon as you become aware of it,” Kelsey wrote.
Noting that FSLabs welcomes “robust fair comment and opinion”, Kelsey gave the following advice.
“The ‘cmdhost.exe’ file in question is an entirely above board part of our anti-piracy protection and has been submitted to numerous anti-virus providers in order to verify that it poses no threat. Therefore, ANY suggestion that current or future products pose any threat to users is absolutely false and libelous,” he wrote, adding:
“As we have already outlined in the past, ANY suggestion that any user’s data was compromised during the events of February is entirely false and therefore libelous.”
Noting that FSLabs would “hate for lawyers to have to get involved in this”, Kelsey advised the /r/flightsim mods to ensure that no such claims were allowed to remain on the sub-Reddit.
But after not receiving the response he would’ve liked, Kelsey wrote once again to the mods. He noted that “a number of unsubstantiated and highly defamatory comments” remained online and warned that if something wasn’t done to clean them up, he would have “no option” than to pass the matter to FSLabs’ legal team.
Like the first message, this second effort also failed to have the desired effect. In fact, the moderators’ response was to post an open letter to Kelsey and FSLabs instead.
“We sincerely disagree that you ‘welcome robust fair comment and opinion’, demonstrated by the censorship on your forums and the attempted censorship on our subreddit,” the mods wrote.
“While what you do on your forum is certainly your prerogative, your rules do not extend to Reddit nor the r/flightsim subreddit. Removing content you disagree with is simply not within our purview.”
The letter, which is worth reading in full, refutes Kelsey’s claims and also suggests that critics of FSLabs may have been subjected to Reddit vote manipulation and coordinated efforts to discredit them.
It’s a little early to say for sure but it seems unlikely that this will end in a net positive for FSLabs, no matter what decision Reddit’s admins take.
Post Syndicated from Andy original https://torrentfreak.com/joe-public-becomes-commercial-pirate-little-knowledge-dangerous-180603/
Back in March and just a few hours before the Anthony Joshua v Joseph Parker fight, I got chatting with some fellow fans in the local pub. While some were intending to pay for the fight, others were going down the Kodi route.
Soon after the conversation switched to IPTV. One of the guys had a subscription and he said that his supplier would be along shortly if anyone wanted a package to watch the fight at home. Of course, I was curious to hear what he had to say since it’s not often this kind of thing is offered ‘offline’.
The guy revealed that he sold more or less exclusively on eBay and called up the page on his phone to show me. The listing made interesting reading.
In common with hundreds of similar IPTV subscription offers easily findable on eBay, the listing offered “All the sports and films you need plus VOD and main UK channels” for the sum of just under £60 per year, which is fairly cheap in the current market. With a non-committal “hmmm” I asked a bit more about the guy’s business and surprisingly he was happy to provide some details.
Like many people offering such packages, the guy was a reseller of someone else’s product. He also insisted that selling access to copyrighted content is OK because it sits in a “gray area”. It’s also easy to keep listings up on eBay, he assured me, as long as a few simple rules are adhered to. Right, this should be interesting.
First of all, sellers shouldn’t be “too obvious” he advised, noting that individual channels or channel lists shouldn’t be listed on the site. Fair enough, but then he said the most important thing of all is to have a disclaimer like his in any listing, written as follows:
“PLEASE NOTE EBAY: THIS IS NOT A DE SCRAMBLER SERVICE, I AM NOT SELLING ANY ILLEGAL CHANNELS OR CHANNEL LISTS NOR DO I REPRESENT ANY MEDIA COMPANY NOR HAVE ACCESS TO ANY OF THEIR CONTENTS. NO TRADEMARK HAS BEEN INFRINGED. DO NOT REMOVE LISTING AS IT IS IN ACCORDANCE WITH EBAY POLICIES.”
Apparently, this paragraph is crucial to keeping listings up on eBay and is the equivalent of kryptonite when it comes to deflecting copyright holders, police, and Trading Standards. Sure enough, a few seconds with Google reveals the same wording on dozens of eBay listings and those offering IPTV subscriptions on external platforms.
It is, of course, absolutely worthless but the IPTV seller insisted otherwise, noting he’d sold “thousands” of subscriptions through eBay without any problems. While a similar logic can be applied to garlic and vampires, a second disclaimer found on many other illicit IPTV subscription listings treads an even more bizarre path.
“THE PRODUCTS OFFERED CAN NOT BE USED TO DESCRAMBLE OR OTHERWISE ENABLE ACCESS TO CABLE OR SATELLITE TELEVISION PROGRAMS THAT BYPASSES PAYMENT TO THE SERVICE PROVIDER. RECEIVING SUBSCRIPTION/BASED TV AIRTIME IS ILLEGAL WITHOUT PAYING FOR IT.”
This disclaimer (which apparently no sellers displaying it have ever read) seems to be have been culled from the Zgemma site, which advertises a receiving device which can technically receive pirate IPTV services but wasn’t designed for the purpose. In that context, the disclaimer makes sense but when applied to dedicated pirate IPTV subscriptions, it’s absolutely ridiculous.
It’s unclear why so many sellers on eBay, Gumtree, Craigslist and other platforms think that these disclaimers are useful. It leads one to the likely conclusion that these aren’t hardcore pirates at all but regular people simply out to make a bit of extra cash who have received bad advice.
What is clear, however, is that selling access to thousands of otherwise subscription channels without permission from copyright owners is definitely illegal in the EU. The European Court of Justice says so (1,2) and it’s been backed up by subsequent cases in the Netherlands.
While the odds of getting criminally prosecuted or sued for reselling such a service are relatively slim, it’s worrying that in 2018 people still believe that doing so is made legal by the inclusion of a paragraph of text. It’s even more worrying that these individuals apparently have no idea of the serious consequences should they become singled out for legal action.
Even more surprisingly, TorrentFreak spoke with a handful of IPTV suppliers higher up the chain who also told us that what they are doing is legal. A couple claimed to be protected by communication intermediary laws, others didn’t want to go into details. Most stopped responding to emails on the topic. Perhaps most tellingly, none wanted to go on the record.
The big take-home here is that following some important EU rulings, knowingly linking to copyrighted content for profit is nearly always illegal in Europe and leaves people open for targeting by copyright holders and the authorities. People really should be aware of that, especially the little guy making a little extra pocket money on eBay.
Of course, people are perfectly entitled to carry on regardless and test the limits of the law when things go wrong. At this point, however, it’s probably worth noting that IPTV provider Ace Hosting recently handed over £600,000 rather than fight the Premier League (1,2) when they clearly had the money to put up a defense.
Given their effectiveness, perhaps they should’ve put up a disclaimer instead?
Post Syndicated from Janina Ander original https://www.raspberrypi.org/blog/coral-reefs-nemo-pi/
The German charity Save Nemo works to protect coral reefs, and they are developing Nemo-Pi, an underwater “weather station” that monitors ocean conditions. Right now, you can vote for Save Nemo in the Google.org Impact Challenge.
The organisation says there are two major threats to coral reefs: divers, and climate change. To make diving saver for reefs, Save Nemo installs buoy anchor points where diving tour boats can anchor without damaging corals in the process.
In addition, they provide dos and don’ts for how to behave on a reef dive.
To monitor the effects of climate change, and to help divers decide whether conditions are right at a reef while they’re still on shore, Save Nemo is also in the process of perfecting Nemo-Pi.
This Raspberry Pi-powered device is made up of a buoy, a solar panel, a GPS device, a Pi, and an array of sensors. Nemo-Pi measures water conditions such as current, visibility, temperature, carbon dioxide and nitrogen oxide concentrations, and pH. It also uploads its readings live to a public webserver.
The Save Nemo team is currently doing long-term tests of Nemo-Pi off the coast of Thailand and Indonesia. They are also working on improving the device’s power consumption and durability, and testing prototypes with the Raspberry Pi Zero W.
Save Nemo aims to install a network of Nemo-Pis at shallow reefs (up to 60 metres deep) in South East Asia. Then diving tour companies can check the live data online and decide day-to-day whether tours are feasible. This will lower the impact of humans on reefs and help the local flora and fauna survive.
Nemo-Pi data may also be useful for groups lobbying for reef conservation, and for scientists and activists who want to shine a spotlight on the awful effects of climate change on sea life, such as coral bleaching caused by rising water temperatures.
If you want to help Save Nemo in their mission today, vote for them to win the Google.org Impact Challenge:
Voting is open until 6 June. You can also follow Save Nemo on Facebook or Twitter. We think this organisation is doing valuable work, and that their projects could be expanded to reefs across the globe. It’s fantastic to see the Raspberry Pi being used to help protect ocean life.
The post Protecting coral reefs with Nemo-Pi, the underwater monitor appeared first on Raspberry Pi.
Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-pay-per-session-pricing-for-amazon-quicksight-another-region-and-lots-more/
Amazon QuickSight is a fully managed cloud business intelligence system that gives you Fast & Easy to Use Business Analytics for Big Data. QuickSight makes business analytics available to organizations of all shapes and sizes, with the ability to access data that is stored in your Amazon Redshift data warehouse, your Amazon Relational Database Service (RDS) relational databases, flat files in S3, and (via connectors) data stored in on-premises MySQL, PostgreSQL, and SQL Server databases. QuickSight scales to accommodate tens, hundreds, or thousands of users per organization.
Today we are launching a new, session-based pricing option for QuickSight, along with additional region support and other important new features. Let’s take a look at each one:
Our customers are making great use of QuickSight and take full advantage of the power it gives them to connect to data sources, create reports, and and explore visualizations.
However, not everyone in an organization needs or wants such powerful authoring capabilities. Having access to curated data in dashboards and being able to interact with the data by drilling down, filtering, or slicing-and-dicing is more than adequate for their needs. Subscribing them to a monthly or annual plan can be seen as an unwarranted expense, so a lot of such casual users end up not having access to interactive data or BI.
In order to allow customers to provide all of their users with interactive dashboards and reports, the Enterprise Edition of Amazon QuickSight now allows Reader access to dashboards on a Pay-per-Session basis. QuickSight users are now classified as Admins, Authors, or Readers, with distinct capabilities and prices:
Authors have access to the full power of QuickSight; they can establish database connections, upload new data, create ad hoc visualizations, and publish dashboards, all for $9 per month (Standard Edition) or $18 per month (Enterprise Edition).
Readers can view dashboards, slice and dice data using drill downs, filters and on-screen controls, and download data in CSV format, all within the secure QuickSight environment. Readers pay $0.30 for 30 minutes of access, with a monthly maximum of $5 per reader.
Admins have all authoring capabilities, and can manage users and purchase SPICE capacity in the account. The QuickSight admin now has the ability to set the desired option (Author or Reader) when they invite members of their organization to use QuickSight. They can extend Reader invites to their entire user base without incurring any up-front or monthly costs, paying only for the actual usage.
To learn more, visit the QuickSight Pricing page.
A New Region
QuickSight is now available in the Asia Pacific (Tokyo) Region:
The UI is in English, with a localized version in the works.
Hourly Data Refresh
Enterprise Edition SPICE data sets can now be set to refresh as frequently as every hour. In the past, each data set could be refreshed up to 5 times a day. To learn more, read Refreshing Imported Data.
Access to Data in Private VPCs
This feature was launched in preview form late last year, and is now available in production form to users of the Enterprise Edition. As I noted at the time, you can use it to implement secure, private communication with data sources that do not have public connectivity, including on-premises data in Teradata or SQL Server, accessed over an AWS Direct Connect link. To learn more, read Working with AWS VPC.
Parameters with On-Screen Controls
QuickSight dashboards can now include parameters that are set using on-screen dropdown, text box, numeric slider or date picker controls. The default value for each parameter can be set based on the user name (QuickSight calls this a dynamic default). You could, for example, set an appropriate default based on each user’s office location, department, or sales territory. Here’s an example:
To learn more, read about Parameters in QuickSight.
URL Actions for Linked Dashboards
You can now connect your QuickSight dashboards to external applications by defining URL actions on visuals. The actions can include parameters, and become available in the Details menu for the visual. URL actions are defined like this:
You can use this feature to link QuickSight dashboards to third party applications (e.g. Salesforce) or to your own internal applications. Read Custom URL Actions to learn how to use this feature.
You can now share QuickSight dashboards across every user in an account.
Larger SPICE Tables
The per-data set limit for SPICE tables has been raised from 10 GB to 25 GB.
Upgrade to Enterprise Edition
The QuickSight administrator can now upgrade an account from Standard Edition to Enterprise Edition with a click. This enables provisioning of Readers with pay-per-session pricing, private VPC access, row-level security for dashboards and data sets, and hourly refresh of data sets. Enterprise Edition pricing applies after the upgrade.
Everything I listed above is available now and you can start using it today!
Previously, I showed you how to rotate Amazon RDS database credentials automatically with AWS Secrets Manager. In addition to database credentials, AWS Secrets Manager makes it easier to rotate, manage, and retrieve API keys, OAuth tokens, and other secrets throughout their lifecycle. You can configure Secrets Manager to rotate these secrets automatically, which can help you meet your compliance needs. You can also use Secrets Manager to rotate secrets on demand, which can help you respond quickly to security events. In this post, I show you how to store an API key in Secrets Manager and use a custom Lambda function to rotate the key automatically. I’ll use a Twitter API key and bearer token as an example; you can reference this example to rotate other types of API keys.
The instructions are divided into four main phases:
For the purpose of this post, I use the placeholder Demo/Twitter_Api_Key to denote the API key, the placeholder Demo/Twitter_bearer_token to denote the bearer token, and placeholder Lambda_Rotate_Bearer_Token to denote the custom Lambda function. Be sure to replace these placeholders with the resource names from your account.
Twitter enables developers to register their applications and retrieve an API key, which includes a consumer_key and consumer_secret. Developers use these to generate a bearer token that applications can then use to authenticate and retrieve information from Twitter. At any given point of time, you can use an API key to create only one valid bearer token.
Start by storing the API key in Secrets Manager. Here’s how:
Next, store the bearer token in Secrets Manager. Here’s how:
While Secrets Manager supports rotating credentials for databases hosted on Amazon RDS natively, it also enables you to meet your unique rotation-related use cases by authoring custom Lambda functions. Now that you’ve stored the API key and bearer token, you’ll create a Lambda function to rotate the bearer token. For this example, I’ll create my Lambda function using Python 3.6.
Here’s what it will look like:
Here’s what it will look like:
Here’s what it will look like:
Note: Resources used in this example are in US East (Ohio) region. If you intend to use another AWS Region, change the SECRETS_MANAGER_ENDPOINT set in the Environment variables to the appropriate region.
You’ve now created a Lambda function that can rotate the bearer token:
Now that you’ve stored the bearer token in Secrets Manager, update the application to retrieve the bearer token from Secrets Manager instead of hard-coding this information in a configuration file or source code. For this example, I show you how to configure a Python application to retrieve this secret from Secrets Manager.
Note: To improve the resiliency of your applications, associate your application with two API keys/bearer tokens. This is a higher availability option because you can continue to use one bearer token while Secrets Manager rotates the other token. Read the AWS documentation to learn how AWS Secrets Manager rotates your secrets.
Now that you’ve stored the secret in Secrets Manager and created a Lambda function to rotate this secret, configure Secrets Manager to rotate the secret Demo/Twitter_bearer_token.
The banner on the next screen confirms that I have successfully configured rotation and the first rotation is in progress, which enables you to verify that rotation is functioning as expected. Secrets Manager will rotate this credential automatically every 30 days.
In this post, I showed you how to configure Secrets Manager to manage and rotate an API key and bearer token used by applications to authenticate and retrieve information from Twitter. You can use the steps described in this blog to manage and rotate other API keys, as well.
Secrets Manager helps you protect access to your applications, services, and IT resources without the upfront investment and on-going maintenance costs of operating your own secrets management infrastructure. To get started, open the Secrets Manager console. To learn more, read the Secrets Manager documentation.
If you have comments about this post, submit them in the Comments section below. If you have questions about anything in this post, start a new thread on the Secrets Manager forum or contact AWS Support.
Want more AWS Security news? Follow us on Twitter.
Post Syndicated from Andy original https://torrentfreak.com/majority-of-canadians-consume-online-content-legally-survey-finds-180531/
Back in January, a coalition of companies and organizations with ties to the entertainment industries called on local telecoms regulator CRTC to implement a national website blocking regime.
Under the banner of Fairplay Canada, members including Bell, Cineplex, Directors Guild of Canada, Maple Leaf Sports and Entertainment, Movie Theatre Association of Canada, and Rogers Media, spoke of an industry under threat from marauding pirates. But just how serious is this threat?
The results of a new survey commissioned by Innovation Science and Economic Development Canada (ISED) in collaboration with the Department of Canadian Heritage (PCH) aims to shine light on the problem by revealing the online content consumption habits of citizens in the Great White North.
While there are interesting findings for those on both sides of the site-blocking debate, the situation seems somewhat removed from the Armageddon scenario predicted by the entertainment industries.
Carried out among 3,301 Canadians aged 12 years and over, the Kantar TNS study aims to cover copyright infringement in six key content areas – music, movies, TV shows, video games, computer software, and eBooks. Attitudes and behaviors are also touched upon while measuring the effectiveness of Canada’s copyright measures.
General Digital Content Consumption
In its introduction, the report notes that 28 million Canadians used the Internet in the three-month study period to November 27, 2017. Of those, 22 million (80%) consumed digital content. Around 20 million (73%) streamed or accessed content, 16 million (59%) downloaded content, while 8 million (28%) shared content.
Music, TV shows and movies all battled for first place in the consumption ranks, with 48%, 48%, and 46% respectively.
According to the study, the majority of Canadians do things completely by the book. An impressive 74% of media-consuming respondents said that they’d only accessed material from legal sources in the preceding three months.
The remaining 26% admitted to accessing at least one illegal file in the same period. Of those, just 5% said that all of their consumption was from illegal sources, with movies (36%), software (36%), TV shows (34%) and video games (33%) the most likely content to be consumed illegally.
Interestingly, the study found that few demographic factors – such as gender, region, rural and urban, income, employment status and language – play a role in illegal content consumption.
“We found that only age and income varied significantly between consumers who infringed by downloading or streaming/accessing content online illegally and consumers who did not consume infringing content online,” the report reads.
“More specifically, the profile of consumers who downloaded or streamed/accessed infringing content skewed slightly younger and towards individuals with household incomes of $100K+.”
Licensed services much more popular than pirate haunts
It will come as no surprise that Netflix was the most popular service with consumers, with 64% having used it in the past three months. Sites like YouTube and Facebook were a big hit too, visited by 36% and 28% of content consumers respectively.
Overall, 74% of online content consumers use licensed services for content while 42% use social networks. Under a third (31%) use a combination of peer-to-peer (BitTorrent), cyberlocker platforms, or linking sites. Stream-ripping services are used by 9% of content consumers.
“Consumers who reported downloading or streaming/accessing infringing content only are less likely to use licensed services and more likely to use peer-to-peer/cyberlocker/linking sites than other consumers of online content,” the report notes.
Attitudes towards legal consumption & infringing content
In common with similar surveys over the years, the Kantar research looked at the reasons why people consume content from various sources, both legal and otherwise.
Convenience (48%), speed (36%) and quality (34%) were the most-cited reasons for using legal sources. An interesting 33% of respondents said they use legal sites to avoid using illegal sources.
On the illicit front, 54% of those who obtained unauthorized content in the previous three months said they did so due to it being free, with 40% citing convenience and 34% mentioning speed.
Almost six out of ten (58%) said lower costs would encourage them to switch to official sources, with 47% saying they’d move if legal availability was improved.
Canada’s ‘Notice-and-Notice’ warning system
People in Canada who share content on peer-to-peer systems like BitTorrent without permission run the risk of receiving an infringement notice warning them to stop. These are sent by copyright holders via users’ ISPs and the hope is that the shock of receiving a warning will turn consumers back to the straight and narrow.
The study reveals that 10% of online content consumers over the age of 12 have received one of these notices but what kind of effect have they had?
“Respondents reported that receiving such a notice resulted in the following: increased awareness of copyright infringement (38%), taking steps to ensure password protected home networks (27%), a household discussion about copyright infringement (27%), and discontinuing illegal downloading or streaming (24%),” the report notes.
While these are all positives for the entertainment industries, Kantar reports that almost a quarter (24%) of people who receive a notice simply ignore them.
Once upon a time, people obtaining music via P2P networks was cited as the music industry’s greatest threat but, with the advent of sites like YouTube, so-called stream-ripping is the latest bogeyman.
According to the study, 11% of Internet users say they’ve used a stream-ripping service. They are most likely to be male (62%) and predominantly 18 to 34 (52%) years of age.
“Among Canadians who have used a service to stream-rip music or entertainment, nearly half (48%) have used stream-ripping sites, one-third have used downloader apps (38%), one-in-seven (14%) have used a stream-ripping plug-in, and one-in-ten (10%) have used stream-ripping software,” the report adds.
Set-Top Boxes and VPNs
Few general piracy studies would be complete in 2018 without touching on set-top devices and Virtual Private Networks and this report doesn’t disappoint.
More than one in five (21%) respondents aged 12+ reported using a VPN, with the main purpose of securing communications and Internet browsing (57%).
A relatively modest 36% said they use a VPN to access free content while 32% said the aim was to access geo-blocked content unavailable in Canada. Just over a quarter (27%) said that accessing content from overseas at a reasonable price was the main motivator.
One in ten (10%) of respondents reported using a set-top box, with 78% stating they use them to access paid-for content. Interestingly, only a small number say they use the devices to infringe.
“A minority use set-top boxes to access other content that is not legal or they are unsure if it is legal (16%), or to access live sports that are not legal or they are unsure if it is legal (11%),” the report notes.
“Individuals who consumed a mix of legal and illegal content online are more likely to use VPN services (42%) or TV set-top boxes (21%) than consumers who only downloaded or streamed/accessed legal content.”
Kantar says that the findings of the report will be used to help policymakers evaluate how Canada’s Copyright Act is coping with a changing market and technological developments.
“This research will provide the necessary information required to further develop copyright policy in Canada, as well as to provide a foundation to assess the effectiveness of the measures to address copyright infringement, should future analysis be undertaken,” it concludes.
The full report can be found here (pdf)
Post Syndicated from Rob Zwetsloot original https://www.raspberrypi.org/blog/magpi-70-home-automation/
Hey folks, Rob here! It’s the last Thursday of the month, and that means it’s time for a brand-new The MagPi. Issue 70 is all about home automation using your favourite microcomputer, the Raspberry Pi.
We think home automation is an excellent use of the Raspberry Pi, hiding it around your house and letting it power your lights and doorbells and…fish tanks? We show you how to do all of that, and give you some excellent tips on how to add even more automation to your home in our ten-page cover feature.
Our other big feature this issue covers upcycling, the hot trend of taking old electronics and making them better than new with some custom code and a tactically placed Raspberry Pi. For this feature, we had a chat with Martin Mander, upcycler extraordinaire, to find out his top tips for hacking your old hardware.
If for some reason you want even more content, you’re in luck! We have some fun tutorials for you to try, like creating a theremin and turning a Babbage into an IoT nanny cam. We also continue our quest to make a video game in C++. Our project showcase is headlined by the Teslonda on page 28, a Honda/Tesla car hybrid that is just wonderful.
All this comes with our definitive reviews and the community section where we celebrate you, our amazing community! You’re all good beans
Issue 70 is available today from WHSmith, Tesco, Sainsbury’s, and Asda. If you live in the US, head over to your local Barnes & Noble or Micro Center in the next few days for a print copy. You can also get the new issue online from our store, or digitally via our Android and iOS apps. And don’t forget, there’s always the free PDF as well.
Want to support the Raspberry Pi Foundation and the magazine? We’ve launched a new way to subscribe to the print version of The MagPi: you can now take out a monthly £4 subscription to the magazine, effectively creating a rolling pre-order system that saves you money on each issue.
You can also take out a twelve-month print subscription and get a Pi Zero W plus case and adapter cables absolutely free! This offer does not currently have an end date.
That’s it for today! See you next month.
Post Syndicated from Yev original https://www.backblaze.com/blog/hiring-a-director-of-sales/
Backblaze is hiring a Director of Sales. This is a critical role for Backblaze as we continue to grow the team. We need a strong leader who has experience in scaling a sales team and who has an excellent track record for exceeding goals by selling Software as a Service (SaaS) solutions. In addition, this leader will need to be highly motivated, as well as able to create and develop a highly-motivated, success oriented sales team that has fun and enjoys what they do.
The History of Backblaze from our CEO
In 2007, after a friend’s computer crash caused her some suffering, we realized that with every photo, video, song, and document going digital, everyone would eventually lose all of their information. Five of us quit our jobs to start a company with the goal of making it easy for people to back up their data.
Like many startups, for a while we worked out of a co-founder’s one-bedroom apartment. Unlike most startups, we made an explicit agreement not to raise funding during the first year. We would then touch base every six months and decide whether to raise or not. We wanted to focus on building the company and the product, not on pitching and slide decks. And critically, we wanted to build a culture that understood money comes from customers, not the magical VC giving tree. Over the course of 5 years we built a profitable, multi-million dollar revenue business — and only then did we raise a VC round.
Fast forward 10 years later and our world looks quite different. You’ll have some fantastic assets to work with:
You might be saying, “sounds like you’ve got this under control — why do you need me?” Don’t be misled. We need you. Here’s why:
So that’s a bit about us. What are we looking for in you?
Experience: As a sales leader, you will strategically build and drive the territory’s sales pipeline by assembling and leading a skilled team of sales professionals. This leader should be familiar with generating, developing and closing software subscription (SaaS) opportunities. We are looking for a self-starter who can manage a team and make an immediate impact of selling our Backup and Cloud Storage solutions. In this role, the sales leader will work closely with the VP of Sales, marketing staff, and service staff to develop and implement specific strategic plans to achieve and exceed revenue targets, including new business acquisition as well as build out our customer success program.
Leadership: We have an experienced team who’s brought us to where we are today. You need to have the people and management skills to get them excited about working with you. You need to be a strong leader and compassionate about developing and supporting your team.
Data driven and creative: The data has to show something makes sense before we scale it up. However, without creativity, it’s easy to say “the data shows it’s impossible” or to find a local maximum. Whether it’s deciding how to scale the team, figuring out what our outbound sales efforts should look like or putting a plan in place to develop the team for career growth, we’ve seen a bit of creativity get us places a few extra dollars couldn’t.
Jive with our culture: Strong leaders affect culture and the person we hire for this role may well shape, not only fit into, ours. But to shape the culture you have to be accepted by the organism, which means a certain set of shared values. We default to openness with our team, our customers, and everyone if possible. We love initiative — without arrogance or dictatorship. We work to create a place people enjoy showing up to work. That doesn’t mean ping pong tables and foosball (though we do try to have perks & fun), but it means people are friendly, non-political, working to build a good service but also a good place to work.
Do the work: Ideas and strategy are critical, but good execution makes them happen. We’re looking for someone who can help the team execute both from the perspective of being capable of guiding and organizing, but also someone who is hands-on themselves.
Additional Responsibilities needed for this role:
Think you want to join us on this adventure?
Send an email to email@example.com with the subject “Director of Sales.” (Recruiters and agencies, please don’t email us.) Include a resume and answer these two questions:
Thank you for taking the time to read this and I hope that this sounds like the opportunity for which you’ve been waiting.
Backblaze is an Equal Opportunity Employer.
Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/amazon-neptune-generally-available/
Amazon Neptune is now Generally Available in US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland). Amazon Neptune is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets. At the core of Neptune is a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with millisecond latencies. Neptune supports two popular graph models, Property Graph and RDF, through Apache TinkerPop Gremlin and SPARQL, allowing you to easily build queries that efficiently navigate highly connected datasets. Neptune can be used to power everything from recommendation engines and knowledge graphs to drug discovery and network security. Neptune is fully-managed with automatic minor version upgrades, backups, encryption, and fail-over. I wrote about Neptune in detail for AWS re:Invent last year and customers have been using the preview and providing great feedback that the team has used to prepare the service for GA.
Now that Amazon Neptune is generally available there are a few changes from the preview:
Launching a Neptune cluster is as easy as navigating to the AWS Management Console and clicking create cluster. Of course you can also launch with CloudFormation, the CLI, or the SDKs.
You can monitor your cluster health and the health of individual instances through Amazon CloudWatch and the console.
We’ve created two repos with some additional tools and examples here. You can expect continuous development on these repos as we add additional tools and examples.
There’s an industry trend where we’re moving more and more onto purpose-built databases. Developers and businesses want to access their data in the format that makes the most sense for their applications. As cloud resources make transforming large datasets easier with tools like AWS Glue, we have a lot more options than we used to for accessing our data. With tools like Amazon Redshift, Amazon Athena, Amazon Aurora, Amazon DynamoDB, and more we get to choose the best database for the job or even enable entirely new use-cases. Amazon Neptune is perfect for workloads where the data is highly connected across data rich edges.
I’m really excited about graph databases and I see a huge number of applications. Looking for ideas of cool things to build? I’d love to build a web crawler in AWS Lambda that uses Neptune as the backing store. You could further enrich it by running Amazon Comprehend or Amazon Rekognition on the text and images found and creating a search engine on top of Neptune.
As always, feel free to reach out in the comments or on twitter to provide any feedback!
Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/built-in-authentication-in-alb/
Today I’m excited to announce built-in authentication support in Application Load Balancers (ALB). ALB can now securely authenticate users as they access applications, letting developers eliminate the code they have to write to support authentication and offload the responsibility of authentication from the backend. The team built a great live example where you can try out the authentication functionality.
Identity-based security is a crucial component of modern applications and as customers continue to move mission critical applications into the cloud, developers are asked to write the same authentication code again and again. Enterprises want to use their on-premises identities with their cloud applications. Web developers want to use federated identities from social networks to allow their users to sign-in. ALB’s new authentication action provides authentication through social Identity Providers (IdP) like Google, Facebook, and Amazon through Amazon Cognito. It also natively integrates with any OpenID Connect protocol compliant IdP, providing secure authentication and a single sign-on experience across your applications.
Authentication is a complicated topic and our readers may have differing levels of expertise with it. I want to cover a few key concepts to make sure we’re all on the same page. If you’re already an authentication expert and you just want to see how ALB authentication works feel free to skip to the next section!
When we get away from the terminology for a bit, all of this boils down to figuring out who a user is and what they’re allowed to do. Doing this securely and efficiently is hard. Traditionally, enterprises have used a protocol called SAML with their IdPs, to provide a single sign-on (SSO) experience for their internal users. SAML is XML heavy and modern applications have started using OIDC with JSON mechanism to share claims. Developers can use SAML in ALB with Amazon Cognito’s SAML support. Web app or mobile developers typically use federated identities via social IdPs like Facebook, Amazon, or Google which, conveniently, are also supported by Amazon Cognito.
ALB Authentication works by defining an authentication action in a listener rule. The ALB’s authentication action will check if a session cookie exists on incoming requests, then check that it’s valid. If the session cookie is set and valid then the ALB will route the request to the target group with
X-AMZN-OIDC-* headers set. The headers contain identity information in JSON Web Token (JWT) format, that a backend can use to identify a user. If the session cookie is not set or invalid then ALB will follow the OIDC protocol and issue an HTTP 302 redirect to the identity provider. The protocol is a lot to unpack and is covered more thoroughly in the documentation for those curious.
I have a simple Python flask app in an Amazon ECS cluster running in some AWS Fargate containers. The containers are in a target group routed to by an ALB. I want to make sure users of my application are logged in before accessing the authenticated portions of my application. First, I’ll navigate to the ALB in the console and edit the rules.
I want to make sure all access to
/account* endpoints is authenticated so I’ll add new rule with a condition to match those endpoints.
Now, I’ll add a new rule and create an Authenticate action in that rule.
I’ll have ALB create a new Amazon Cognito user pool for me by providing some configuration details.
After creating the Amazon Cognito pool, I can make some additional configuration in the advanced settings.
I can change the default cookie name, adjust the timeout, adjust the scope, and choose the action for unauthenticated requests.
I can pick Deny to serve a 401 for all unauthenticated requests or I can pick Allow which will pass through to the application if unauthenticated. This is useful for Single Page Apps (SPAs). For now, I’ll choose Authenticate, which will prompt the IdP, in this case Amazon Cognito, to authenticate the user and reload the existing page.
Now I’ll add a forwarding action for my target group and save the rule.
Over on the Facebook side I just need to add my Amazon Cognito User Pool Domain to the whitelisted OAuth redirect URLs.
I would follow similar steps for other authentication providers.
Now, when I navigate to an authenticated page my Fargate containers receive the originating request with the
X-Amzn-Oidc-* headers set by ALB. Using the information in those headers (claims-data, identity, access-token) my application can implement authorization.
All of this was possible without having to write a single line of code to deal with each of the IdPs. However, it’s still important for the implementing applications to verify the signature on the JWT header to ensure the request hasn’t been tampered with.
Of course everything we’ve seen today is also available in the the API and AWS Command Line Interface (CLI). You can find additional information on the feature in the documentation. This feature is provided at no additional charge.
Be sure to check out the live example as well.
With authentication built-in to ALB, developers can focus on building their applications instead of rebuilding authentication for every application, all the while maintaining the scale, availability, and reliability of ALB. I think this feature is a pretty big deal and I can’t wait to see what customers build with it. Let us know what you think of this feature in the comments or on twitter!
Post Syndicated from Andy original https://torrentfreak.com/fcc-asks-amazon-ebay-to-help-eliminate-pirate-media-box-sales-180530/
Historically, people deploying search terms including “Kodi” or “fully-loaded” were greeted by page after page of Android-type boxes, each ready for illicit plug-and-play entertainment consumption following delivery.
Although the problem persists on both platforms, people are now much less likely to find infringing devices than they were 12 to 24 months ago. Under pressure from entertainment industry groups, both Amazon and eBay have tightened the screws on sellers of such devices. Now, however, both companies have received requests to stem sales from a completetey different direction.
In a letter to eBay CEO Devin Wenig and Amazon CEO Jeff Bezos first spotted by Ars, FCC Commissioner Michael O’Rielly calls on the platforms to take action against piracy-configured boxes that fail to comply with FCC equipment authorization requirements or falsely display FCC logos, contrary to United States law.
“Disturbingly, some rogue set-top box manufacturers and distributors are exploiting the FCC’s trusted logo by fraudulently placing it on devices that have not been approved via the Commission’s equipment authorization process,” O’Rielly’s letter reads.
“Specifically, nine set-top box distributors were referred to the FCC in October for enabling the unlawful streaming of copyrighted material, seven of which displayed the FCC logo, although there was no record of such compliance.”
While O’Rielly admits that the copyright infringement aspects fall outside the jurisdiction of the FCC, he says it’s troubling that many of these devices are used to stream infringing content, “exacerbating the theft of billions of dollars in American innovation and creativity.”
As noted above, both Amazon and eBay have taken steps to reduce sales of pirate boxes on their respective platforms on copyright infringement grounds, something which is duly noted by O’Rielly. However, he points out that devices continue to be sold to members of the public who may believe that the devices are legal since they’re available for sale from legitimate companies.
“For these reasons, I am seeking your further cooperation in assisting the FCC in taking steps to eliminate the non-FCC compliant devices or devices that fraudulently bear the FCC logo,” the Commissioner writes (pdf).
“Moreover, if your company is made aware by the Commission, with supporting evidence, that a particular device is using a fraudulent FCC label or has not been appropriately certified and labeled with a valid FCC logo, I respectfully request that you commit to swiftly removing these products from your sites.”
In the event that Amazon and eBay take action under this request, O’Rielly asks both platforms to hand over information they hold on offending manufacturers, distributors, and suppliers.
Amazon was quick to respond to the FCC. In a letter published by Ars, Amazon’s Public Policy Vice President Brian Huseman assured O’Rielly that the company is not only dedicated to tackling rogue devices on copyright-infringement grounds but also when there is fraudulent use of the FCC’s logos.
Noting that Amazon is a key member of the Alliance for Creativity and Entertainment (ACE) – a group that has been taking legal action against sellers of infringing streaming devices (ISDs) and those who make infringing addons for Kodi-type systems – Huseman says that dealing with the problem is a top priority.
“Our goal is to prevent the sale of ISDs anywhere, as we seek to protect our customers from the risks posed by these devices, in addition to our interest in protecting Amazon Studios content,” Huseman writes.
“In 2017, Amazon became the first online marketplace to prohibit the sale of streaming media players that promote or facilitate piracy. To prevent the sale of these devices, we proactively scan product listings for signs of potentially infringing products, and we also invest heavily in sophisticated, automated real-time tools to review a variety of data sources and signals to identify inauthentic goods.
“These automated tools are supplemented by human reviewers that conduct manual investigations. When we suspect infringement, we take immediate action to remove suspected listings, and we also take enforcement action against sellers’ entire accounts when appropriate.”
Huseman also reveals that since implementing a proactive policy against such devices, “tens of thousands” of listings have been blocked from Amazon. In addition, the platform has been making criminal referrals to law enforcement as well as taking civil action (1,2,3) as part of ACE.
“As noted in your letter, we would also appreciate the opportunity to collaborate further with the FCC to remove non-compliant devices that improperly use the FCC logo or falsely claim FCC certification. If any FCC non-compliant devices are identified, we seek to work with you to ensure they are not offered for sale,” Huseman concludes.
Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/how-to-wipe-a-mac-hard-drive/
What do I do with a Mac that still has personal data on it? Do I take out the disk drive and smash it? Do I sweep it with a really strong magnet? Is there a difference in how I handle a hard drive (HDD) versus a solid-state drive (SSD)? Well, taking a sledgehammer or projectile weapon to your old machine is certainly one way to make the data irretrievable, and it can be enormously cathartic as long as you follow appropriate safety and disposal protocols. But there are far less destructive ways to make sure your data is gone for good. Let me introduce you to secure erasing.
Before we start, you need to know whether you have a HDD or a SSD. To find out, or at least to make sure, you click on the Apple menu and select “About this Mac.” Once there, select the “Storage” tab to see which type of drive is in your system.
The first example, below, shows a SATA Disk (HDD) in the system.
In the next case, we see we have a Solid State SATA Drive (SSD), plus a Mac SuperDrive.
The third screen shot shows an SSD, as well. In this case it’s called “Flash Storage.”
Before you get started, you’ll want to make sure that any important data on your hard drive has moved somewhere else. OS X’s built-in Time Machine backup software is a good start, especially when paired with Backblaze. You can learn more about using Time Machine in our Mac Backup Guide.
With a local backup copy in hand and secure cloud storage, you know your data is always safe no matter what happens.
Once you’ve verified your data is backed up, roll up your sleeves and get to work. The key is OS X Recovery — a special part of the Mac operating system since OS X 10.7 “Lion.”
NOTE: If you’re interested in wiping an SSD, see below.
There are four notches to that Security Options slider. “Fastest” is quick but insecure — data could potentially be rebuilt using a file recovery app. Moving that slider to the right introduces progressively more secure erasing. Disk Utility’s most secure level erases the information used to access the files on your disk, then writes zeroes across the disk surface seven times to help remove any trace of what was there. This setting conforms to the DoD 5220.22-M specification.
Once it’s done, the Mac’s hard drive will be clean as a whistle and ready for its next adventure: a fresh installation of OS X, being donated to a relative or a local charity, or just sent to an e-waste facility. Of course you can still drill a hole in your disk or smash it with a sledgehammer if it makes you happy, but now you know how to wipe the data from your old computer with much less ruckus.
The above instructions apply to older Macintoshes with HDDs. What do you do if you have an SSD?
Most new Macs ship with solid state drives (SSDs). Only the iMac and Mac mini ship with regular hard drives anymore, and even those are available in pure SSD variants if you want.
If your Mac comes equipped with an SSD, Apple’s Disk Utility software won’t actually let you zero the hard drive.
In a tech note posted to Apple’s own online knowledgebase, Apple explains that you don’t need to securely erase your Mac’s SSD:
With an SSD drive, Secure Erase and Erasing Free Space are not available in Disk Utility. These options are not needed for an SSD drive because a standard erase makes it difficult to recover data from an SSD.
In fact, some folks will tell you not to zero out the data on an SSD, since it can cause wear and tear on the memory cells that, over time, can affect its reliability. I don’t think that’s nearly as big an issue as it used to be — SSD reliability and longevity has improved.
If “Standard Erase” doesn’t quite make you feel comfortable that your data can’t be recovered, there are a couple of options.
One way to make sure that your SSD’s data remains secure is to use FileVault. FileVault is whole-disk encryption for the Mac. With FileVault engaged, you need a password to access the information on your hard drive. Without it, that data is encrypted.
There’s one potential downside of FileVault — if you lose your password or the encryption key, you’re screwed: You’re not getting your data back any time soon. Based on my experience working at a Mac repair shop, losing a FileVault key happens more frequently than it should.
When you first set up a new Mac, you’re given the option of turning FileVault on. If you don’t do it then, you can turn on FileVault at any time by clicking on your Mac’s System Preferences, clicking on Security & Privacy, and clicking on the FileVault tab. Be warned, however, that the initial encryption process can take hours, as will decryption if you ever need to turn FileVault off.
With FileVault turned on, you can restart your Mac into its Recovery System (by restarting the Mac while holding down the command and R keys) and erase the hard drive using Disk Utility, once you’ve unlocked it (by selecting the disk, clicking the File menu, and clicking Unlock). That deletes the FileVault key, which means any data on the drive is useless.
FileVault doesn’t impact the performance of most modern Macs, though I’d suggest only using it if your Mac has an SSD, not a conventional hard disk drive.
If you don’t want to take Apple’s word for it, if you’re not using FileVault, or if you just want to, there is a way to securely erase free space on your SSD. It’s a little more involved but it works.
Before we get into the nitty-gritty, let me state for the record that this really isn’t necessary to do, which is why Apple’s made it so hard to do. But if you’re set on it, you’ll need to use Apple’s Terminal app. Terminal provides you with command line interface access to the OS X operating system. Terminal lives in the Utilities folder, but you can access Terminal from the Mac’s Recovery System, as well. Once your Mac has booted into the Recovery partition, click the Utilities menu and select Terminal to launch it.
From a Terminal command line, type:
diskutil secureErase freespace VALUE /Volumes/DRIVE
That tells your Mac to securely erase the free space on your SSD. You’ll need to change VALUE to a number between 0 and 4. 0 is a single-pass run of zeroes; 1 is a single-pass run of random numbers; 2 is a 7-pass erase; 3 is a 35-pass erase; and 4 is a 3-pass erase. DRIVE should be changed to the name of your hard drive. To run a 7-pass erase of your SSD drive in “JohnB-Macbook”, you would enter the following:
diskutil secureErase freespace 2 /Volumes/JohnB-Macbook
And remember, if you used a space in the name of your Mac’s hard drive, you need to insert a leading backslash before the space. For example, to run a 35-pass erase on a hard drive called “Macintosh HD” you enter the following:
diskutil secureErase freespace 3 /Volumes/Macintosh\ HD
Something to remember is that the more extensive the erase procedure, the longer it will take.
If you absolutely, positively need to be sure that all the data on a drive is irretrievable, see this Scientific American article (with contributions by Gleb Budman, Backblaze CEO), How to Destroy a Hard Drive — Permanently.
Post Syndicated from Andy original https://torrentfreak.com/putin-asked-to-investigate-damage-caused-by-telegram-web-blocking-180526/
After a Moscow court gave the go-ahead for Telegram to be banned in Russia last month, the Internet became a battleground.
On the instructions of telecoms watchdog Roscomnadzor, ISPs across Russia tried to block Telegram by blackholing millions of IP addresses. The effect was both dramatic and pathetic. While Telegram remained stubbornly online, countless completely innocent services suffered outages as Roscomnadzor charged ahead with its mission.
Over the past several weeks, Roscomnadzor has gone some way to clean up the mess, partly by removing innocent Google and Amazon IP addresses from Russia’s blacklist. However, the collateral damage was so widespread it’s called into question the watchdog’s entire approach to web-blockades and whether they should be carried out at any cost.
This week, thanks to an annual report presented to President Vladimir Putin by business ombudsman Boris Titov, the matter looks set to be escalated. ‘The Book of Complaints and Suggestions of Russian Business’ contains comments from Internet ombudsman Dmitry Marinichev, who says that the Prosecutor General’s Office should launch an investigation into Roscomnadzor’s actions.
Marinichev said that when attempting to take down Telegram using aggressive technical means, Roscomnadzor relied upon “its own interpretation of court decisions” to provide guidance, TASS reports.
“When carrying out blockades of information resources, Roskomnadzor did not assess the related damage caused to them,” he said.
More than 15 million IP addresses were blocked, many of them with functions completely unrelated to the operations of Telegram. Marinichev said that the consequences were very real for those who suffered collateral damage.
“[The blocking led] to a temporary inaccessibility of Internet resources of a number of Russian enterprises in the Internet sector, including several banks and government information resources,” he reported.
In advice to the President, Marinichev suggests that the Prosecutor General’s Office should look into “the legality and validity of Roskomnadzor’s actions” which led to the “violation of availability of information resources of commercial companies” and “threatened the integrity, sustainability, and functioning of the unified telecommunications network of the Russian Federation and its critical information infrastructure.”
Early May, it was reported that in addition to various web services, around 50 VPN, proxy and anonymization platforms had been blocked for providing access to Telegram. In a May 22 report, that number had swelled to more than 80 although 10 were later unblocked after they stopped providing access to the messaging platform.
This week, Roscomnadzor has continued with efforts to block access to torrent and streaming platforms. In a new wave of action, the telecoms watchdog ordered ISPs to block at least 47 mirrors and proxies providing access to previously blocked sites.
The adoption of Apache Spark has increased significantly over the past few years, and running Spark-based application pipelines is the new normal. Spark jobs that are in an ETL (extract, transform, and load) pipeline have different requirements—you must handle dependencies in the jobs, maintain order during executions, and run multiple jobs in parallel. In most of these cases, you can use workflow scheduler tools like Apache Oozie, Apache Airflow, and even Cron to fulfill these requirements.
Apache Oozie is a widely used workflow scheduler system for Hadoop-based jobs. However, its limited UI capabilities, lack of integration with other services, and heavy XML dependency might not be suitable for some users. On the other hand, Apache Airflow comes with a lot of neat features, along with powerful UI and monitoring capabilities and integration with several AWS and third-party services. However, with Airflow, you do need to provision and manage the Airflow server. The Cron utility is a powerful job scheduler. But it doesn’t give you much visibility into the job details, and creating a workflow using Cron jobs can be challenging.
What if you have a simple use case, in which you want to run a few Spark jobs in a specific order, but you don’t want to spend time orchestrating those jobs or maintaining a separate application? You can do that today in a serverless fashion using AWS Step Functions. You can create the entire workflow in AWS Step Functions and interact with Spark on Amazon EMR through Apache Livy.
In this post, I walk you through a list of steps to orchestrate a serverless Spark-based ETL pipeline using AWS Step Functions and Apache Livy.
For the source data for this post, I use the New York City Taxi and Limousine Commission (TLC) trip record data. For a description of the data, see this detailed dictionary of the taxi data. In this example, we’ll work mainly with the following three columns for the Spark jobs.
|Column name||Column description|
|RateCodeID||Represents the rate code in effect at the end of the trip (for example, 1 for standard rate, 2 for JFK airport, 3 for Newark airport, and so on).|
|FareAmount||Represents the time-and-distance fare calculated by the meter.|
|TripDistance||Represents the elapsed trip distance in miles reported by the taxi meter.|
The trip data is in comma-separated values (CSV) format with the first row as a header. To shorten the Spark execution time, I trimmed the large input data to only 20,000 rows. During the deployment phase, the input file tripdata.csv is stored in Amazon S3 in the <<your-bucket>>/emr-step-functions/input/ folder.
The following image shows a sample of the trip data:
The next few sections describe how Spark jobs are created for this solution, how you can interact with Spark using Apache Livy, and how you can use AWS Step Functions to create orchestrations for these Spark applications.
At a high level, the solution includes the following steps:
Let’s take a look at the Spark application that is used for this solution.
For this example, I built a Spark jar named spark-taxi.jar. It has two different Spark applications:
The following are the expected output columns:
This job first reads the tripdata.csv file and aggregates the fare_amount by the rate_code. After this point, you have two different datasets, both aggregated by rate_code. Finally, the job uses the rate_code field to join two datasets and output the entire rate code status in a single CSV file.
The output columns are as follows:
Note that in this case, you don’t need to run two different Spark jobs to generate that output. The goal of setting up the jobs in this way is just to create a dependency between the two jobs and use them within AWS Step Functions.
Both Spark applications take one input argument called rootPath. It’s the S3 location where the Spark job is stored along with input and output data. Here is a sample of the final output:
The next section discusses how you can use Apache Livy to interact with Spark applications that are running on Amazon EMR.
Apache Livy provides a REST interface to interact with Spark running on an EMR cluster. Livy is included in Amazon EMR release version 5.9.0 and later. In this post, I use Livy to submit Spark jobs and retrieve job status. When Amazon EMR is launched with Livy installed, the EMR master node becomes the endpoint for Livy, and it starts listening on port 8998 by default. Livy provides APIs to interact with Spark.
Let’s look at a couple of examples how you can interact with Spark running on Amazon EMR using Livy.
To list active running jobs, you can execute the following from the EMR master node:
If you want to do the same from a remote instance, just change localhost to the EMR hostname, as in the following (port 8998 must be open to that remote instance through the security group):
To submit a Spark job through Livy pointing to the application jar in S3, you can do the following:
Spark submit through Livy returns a session ID that starts from 0. Using that session ID, you can retrieve the status of the job as shown following:
Through Spark submit, you can pass multiple arguments for the Spark job and Spark configuration settings. You can also do that using Livy, by passing the S3 path through the args parameter, as shown following:
All Apache Livy REST calls return a response as JSON, as shown in the following image:
If you want to pretty-print that JSON response, you can pipe command with Python’s JSON tool as follows:
For a detailed list of Livy APIs, see the Apache Livy REST API page. This post uses GET /batches and POST /batches.
In the next section, you create a state machine and orchestrate Spark applications using AWS Step Functions.
AWS Step Functions automatically triggers and tracks each step and retries when it encounters errors. So your application executes in order and as expected every time. To create a Spark job workflow using AWS Step Functions, you first create a Lambda state machine using different types of states to create the entire workflow.
First, you use the Task state—a simple state in AWS Step Functions that performs a single unit of work. You also use the Wait state to delay the state machine from continuing for a specified time. Later, you use the Choice state to add branching logic to a state machine.
The following is a quick summary of how to use different states in the state machine to create the Spark ETL pipeline:
Here is one of my Task states, MilesPerRateCode, which simply submits a Spark job:
This Task state configuration specifies the Lambda function to execute. Inside the Lambda function, it submits a Spark job through Livy using Livy’s POST API. Using ResultPath, it tells the state machine where to place the result of the executing task. As discussed in the previous section, Spark submit returns the session ID, which is captured with $.jobId and used in a later state.
The following code section shows the Lambda function, which is used to submit the MilesPerRateCode job. It uses the Python request library to submit a POST against the Livy endpoint hosted on Amazon EMR and passes the required parameters in JSON format through payload. It then parses the response, grabs id from the response, and returns it. The Next field tells the state machine which state to go to next.
Just like in the MilesPerRate job, another state submits the RateCodeStatus job, but it executes only when all previous jobs have completed successfully.
Here is the Task state in the state machine that checks the Spark job status:
Just like other states, the preceding Task executes a Lambda function, captures the result (represented by jobStatus), and passes it to the next state. The following is the Lambda function that checks the Spark job status based on a given session ID:
In the Choice state, it checks the Spark job status value, compares it with a predefined state status, and transitions the state based on the result. For example, if the status is success, move to the next state (RateCodeJobStatus job), and if it is dead, move to the MilesPerRate job failed state.
To set up this entire solution, you need to create a few AWS resources. To make it easier, I have created an AWS CloudFormation template. This template creates all the required AWS resources and configures all the resources that are needed to create a Spark-based ETL pipeline on AWS Step Functions.
This CloudFormation template requires you to pass the following four parameters during initiation.
|ClusterSubnetID||The subnet where the Amazon EMR cluster is deployed and Lambda is configured to talk to this subnet.|
|KeyName||The name of the existing EC2 key pair to access the Amazon EMR cluster.|
|VPCID||The ID of the virtual private cloud (VPC) where the EMR cluster is deployed and Lambda is configured to talk to this VPC.|
|S3RootPath||The Amazon S3 path where all required files (input file, Spark job, and so on) are stored and the resulting data is written.|
IMPORTANT: These templates are designed only to show how you can create a Spark-based ETL pipeline on AWS Step Functions using Apache Livy. They are not intended for production use without modification. And if you try this solution outside of the us-east-1 Region, download the necessary files from s3://aws-data-analytics-blog/emr-step-functions, upload the files to the buckets in your Region, edit the script as appropriate, and then run it.
To launch the CloudFormation stack, choose Launch Stack:
Launching this stack creates the following list of AWS resources.
|Logical ID||Resource Type||Description|
|StepFunctionsStateExecutionRole||IAM role||IAM role to execute the state machine and have a trust relationship with the states service.|
|SparkETLStateMachine||AWS Step Functions state machine||State machine in AWS Step Functions for the Spark ETL workflow.|
|LambdaSecurityGroup||Amazon EC2 security group||Security group that is used for the Lambda function to call the Livy API.|
|AWS Lambda function||Lambda function to submit the RateCodeStatus job.|
|MilesPerRateJobSubmitFunction||AWS Lambda function||Lambda function to submit the MilesPerRate job.|
|SparkJobStatusFunction||AWS Lambda function||Lambda function to check the Spark job status.|
|LambdaStateMachineRole||IAM role||IAM role for all Lambda functions to use the lambda trust relationship.|
|EMRCluster||Amazon EMR cluster||EMR cluster where Livy is running and where the job is placed.|
During the AWS CloudFormation deployment phase, it sets up S3 paths for input and output. Input files are stored in the <<s3-root-path>>/emr-step-functions/input/ path, whereas spark-taxi.jar is copied under <<s3-root-path>>/emr-step-functions/.
The following screenshot shows how the S3 paths are configured after deployment. In this example, I passed a bucket that I created in the AWS account s3://tm-app-demos for the S3 root path.
If the CloudFormation template completed successfully, you will see Spark-ETL-State-Machine in the AWS Step Functions dashboard, as follows:
Choose the Spark-ETL-State-Machine state machine to take a look at this implementation. The AWS CloudFormation template built the entire state machine along with its dependent Lambda functions, which are now ready to be executed.
On the dashboard, choose the newly created state machine, and then choose New execution to initiate the state machine. It asks you to pass input in JSON format. This input goes to the first state MilesPerRate Job, which eventually executes the Lambda function blog-miles-per-rate-job-submit-function.
Pass the S3 root path as input:
Then choose Start Execution:
The rootPath value is the same value that was passed when creating the CloudFormation stack. It can be an S3 bucket location or a bucket with prefixes, but it should be the same value that is used for AWS CloudFormation. This value tells the state machine where it can find the Spark jar and input file, and where it will write output files. After the state machine starts, each state/task is executed based on its definition in the state machine.
At a high level, the following represents the flow of events:
This following screenshot shows a successfully completed Spark ETL state machine:
The right side of the state machine diagram shows the details of individual states with their input and output.
When you execute the state machine for the second time, it fails because the S3 path already exists. The state machine turns red and stops at MilePerRate job failed. The following image represents that failed execution of the state machine:
You can also check your Spark application status and logs by going to the Amazon EMR console and viewing the Application history tab:
I hope this walkthrough paints a picture of how you can create a serverless solution for orchestrating Spark jobs on Amazon EMR using AWS Step Functions and Apache Livy. In the next section, I share some ideas for making this solution even more elegant.
The goal of this post is to show a simple example that uses AWS Step Functions to create an orchestration for Spark-based jobs in a serverless fashion. To make this solution robust and production ready, you can explore the following options:
With Lambda and AWS Step Functions, you can create a very robust serverless orchestration for your big data workload.
When you’ve finished testing this solution, remember to clean up all those AWS resources that you created using AWS CloudFormation. Use the AWS CloudFormation console or AWS CLI to delete the stack named Blog-Spark-ETL-Step-Functions.
In this post, I showed you how to use AWS Step Functions to orchestrate your Spark jobs that are running on Amazon EMR. You used Apache Livy to submit jobs to Spark from a Lambda function and created a workflow for your Spark jobs, maintaining a specific order for job execution and triggering different AWS events based on your job’s outcome.
Go ahead—give this solution a try, and share your experience with us!
If you found this post useful, be sure to check out Building a Real World Evidence Platform on AWS and Building High-Throughput Genomics Batch Workflows on AWS: Workflow Layer (Part 4 of 4).
Tanzir Musabbir is an EMR Specialist Solutions Architect with AWS. He is an early adopter of open source Big Data technologies. At AWS, he works with our customers to provide them architectural guidance for running analytics solutions on Amazon EMR, Amazon Athena & AWS Glue. Tanzir is a big Real Madrid fan and he loves to travel in his free time.
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.