The German charity Save Nemo works to protect coral reefs, and they are developing Nemo-Pi, an underwater “weather station” that monitors ocean conditions. Right now, you can vote for Save Nemo in the Google.org Impact Challenge.
Save Nemo
The organisation says there are two major threats to coral reefs: divers, and climate change. To make diving safer for reefs, Save Nemo installs buoy anchor points where diving tour boats can anchor without damaging corals in the process.
In addition, they provide dos and don’ts for how to behave on a reef dive.
The Nemo-Pi
To monitor the effects of climate change, and to help divers decide whether conditions are right at a reef while they’re still on shore, Save Nemo is also in the process of perfecting Nemo-Pi.
This Raspberry Pi-powered device is made up of a buoy, a solar panel, a GPS device, a Pi, and an array of sensors. Nemo-Pi measures water conditions such as current, visibility, temperature, carbon dioxide and nitrogen oxide concentrations, and pH. It also uploads its readings live to a public webserver.
The Save Nemo team is currently doing long-term tests of Nemo-Pi off the coast of Thailand and Indonesia. They are also working on improving the device’s power consumption and durability, and testing prototypes with the Raspberry Pi Zero W.
The web dashboard showing live Nemo-Pi data
Long-term goals
Save Nemo aims to install a network of Nemo-Pis at shallow reefs (up to 60 metres deep) in South East Asia. Then diving tour companies can check the live data online and decide day-to-day whether tours are feasible. This will lower the impact of humans on reefs and help the local flora and fauna survive.
A healthy coral reef
Nemo-Pi data may also be useful for groups lobbying for reef conservation, and for scientists and activists who want to shine a spotlight on the awful effects of climate change on sea life, such as coral bleaching caused by rising water temperatures.
A bleached coral reef
Vote now for Save Nemo
If you want to help Save Nemo in their mission today, vote for them to win the Google.org Impact Challenge:
Click “Abstimmen” in the footer of the page to vote
Click “JA” in the footer to confirm
Voting is open until 6 June. You can also follow Save Nemo on Facebook or Twitter. We think this organisation is doing valuable work, and that their projects could be expanded to reefs across the globe. It’s fantastic to see the Raspberry Pi being used to help protect ocean life.
Backblaze is hiring a Director of Sales. This is a critical role for Backblaze as we continue to grow the team. We need a strong leader who has experience in scaling a sales team and who has an excellent track record for exceeding goals by selling Software as a Service (SaaS) solutions. In addition, this leader will need to be highly motivated, as well as able to create and develop a highly motivated, success-oriented sales team that has fun and enjoys what they do.
The History of Backblaze from our CEO
In 2007, after a friend’s computer crash caused her some suffering, we realized that with every photo, video, song, and document going digital, everyone would eventually lose all of their information. Five of us quit our jobs to start a company with the goal of making it easy for people to back up their data.
Like many startups, for a while we worked out of a co-founder’s one-bedroom apartment. Unlike most startups, we made an explicit agreement not to raise funding during the first year. We would then touch base every six months and decide whether to raise or not. We wanted to focus on building the company and the product, not on pitching and slide decks. And critically, we wanted to build a culture that understood money comes from customers, not the magical VC giving tree. Over the course of 5 years we built a profitable, multi-million dollar revenue business — and only then did we raise a VC round.
Fast forward 10 years and our world looks quite different. You’ll have some fantastic assets to work with:
A brand millions recognize for openness, ease-of-use, and affordability.
A computer backup service that stores over 500 petabytes of data and has recovered over 30 billion files for hundreds of thousands of paying customers, most of whom self-identify as being the people that find and recommend technology products to their friends.
Our B2 service that provides the lowest cost cloud storage on the planet at 1/4th the price Amazon, Google or Microsoft charges. Although it is a newer product on the market, it already has over 100,000 IT professionals and developers signed up, as well as an ecosystem building up around it.
A growing, profitable and cash-flow positive company.
And last, but most definitely not least: a great sales team.
You might be saying, “sounds like you’ve got this under control — why do you need me?” Don’t be misled. We need you. Here’s why:
We have a great team, but we are in the process of expanding and we need to develop a structure that will easily scale and provide the most success to drive revenue.
We just launched our outbound sales efforts and we need someone to help develop that into a fully successful program that’s building a strong pipeline and closing business.
We need someone to work with the marketing department and figure out how to generate more inbound opportunities that the sales team can follow up on and close.
We need someone who will work closely in developing the skills of our current sales team and build a path for career growth and advancement.
We want someone to manage our Customer Success program.
So that’s a bit about us. What are we looking for in you?
Experience: As a sales leader, you will strategically build and drive the territory’s sales pipeline by assembling and leading a skilled team of sales professionals. This leader should be familiar with generating, developing and closing software subscription (SaaS) opportunities. We are looking for a self-starter who can manage a team and make an immediate impact on selling our Backup and Cloud Storage solutions. In this role, the sales leader will work closely with the VP of Sales, marketing staff, and service staff to develop and implement specific strategic plans to achieve and exceed revenue targets, including new business acquisition, as well as building out our customer success program.
Leadership: We have an experienced team who’s brought us to where we are today. You need to have the people and management skills to get them excited about working with you. You need to be a strong leader and compassionate about developing and supporting your team.
Data driven and creative: The data has to show something makes sense before we scale it up. However, without creativity, it’s easy to say “the data shows it’s impossible” or to find a local maximum. Whether it’s deciding how to scale the team, figuring out what our outbound sales efforts should look like or putting a plan in place to develop the team for career growth, we’ve seen a bit of creativity get us places a few extra dollars couldn’t.
Jive with our culture: Strong leaders affect culture and the person we hire for this role may well shape, not only fit into, ours. But to shape the culture you have to be accepted by the organism, which means a certain set of shared values. We default to openness with our team, our customers, and everyone if possible. We love initiative — without arrogance or dictatorship. We work to create a place people enjoy showing up to work. That doesn’t mean ping pong tables and foosball (though we do try to have perks & fun), but it means people are friendly, non-political, working to build a good service but also a good place to work.
Do the work: Ideas and strategy are critical, but good execution makes them happen. We’re looking for someone who can help the team execute both from the perspective of being capable of guiding and organizing, but also someone who is hands-on themselves.
Additional Responsibilities needed for this role:
Recruit, coach, mentor, manage and lead a team of sales professionals to achieve yearly sales targets. This includes closing new business and expanding upon existing clientele.
Expand the customer success program to provide the best customer experience possible resulting in upsell opportunities and a high retention rate.
Develop effective sales strategies and deliver compelling product demonstrations and sales pitches.
Acquire and develop the appropriate sales tools to make the team efficient in their daily work flow.
Apply a thorough understanding of the marketplace, industry trends, funding developments, and products to all management activities and strategic sales decisions.
Ensure that sales department operations function smoothly, with the goal of facilitating sales and/or closings; operational responsibilities include accurate pipeline reporting and sales forecasts.
This position will report directly to the VP of Sales and will be staffed in our headquarters in San Mateo, CA.
Requirements:
7 – 10+ years of successful sales leadership experience as measured by sales performance against goals. Experience in developing skill sets and providing career growth and opportunities through advancement of team members.
Background in selling SaaS technologies with a strong track record of success.
Strong presentation and communication skills.
Must be able to travel occasionally nationwide.
BA/BS degree required
Think you want to join us on this adventure? Send an email to jobscontact@backblaze.com with the subject “Director of Sales.” (Recruiters and agencies, please don’t email us.) Include a resume and answer these two questions:
How would you approach evaluating the current sales team and what is your process for developing a growth strategy to scale the team?
What are the goals you would set for yourself in the 3 month and 1-year timeframes?
Thank you for taking the time to read this and I hope that this sounds like the opportunity for which you’ve been waiting.
This post is courtesy of Alan Protasio, Software Development Engineer, Amazon Web Services
Just like compute and storage, messaging is a fundamental building block of enterprise applications. Message brokers (aka “message-oriented middleware”) enable different software systems, often written in different languages, on different platforms, running in different locations, to communicate and exchange information. Mission-critical applications, such as CRM and ERP, rely on message brokers to work.
A common performance consideration for customers deploying a message broker in a production environment is the throughput of the system, measured as messages per second. This is important to know so that application environments (hosts, threads, memory, etc.) can be configured correctly.
In this post, we demonstrate how to measure the throughput for Amazon MQ, a new managed message broker service for ActiveMQ, using JMS Benchmark. It should take 15–20 minutes to set up the environment and an hour to run the benchmark. We also provide some tips on how to configure Amazon MQ for optimal throughput.
Benchmarking throughput for Amazon MQ
ActiveMQ can be used for a number of use cases. These range from simple fire-and-forget tasks (that is, asynchronous processing) and low-latency request-reply patterns to buffering requests before they are persisted to a database.
The throughput of Amazon MQ is largely dependent on the use case. For example, if you have non-critical workloads such as gathering click events for a non-business-critical portal, you can use ActiveMQ in a non-persistent mode and get extremely high throughput with Amazon MQ.
On the flip side, if you have a critical workload where durability is extremely important (meaning that you can’t lose a message), then you are bound by the I/O capacity of your underlying persistence store. We recommend using mq.m4.large for the best results. The mq.t2.micro instance type is intended for product evaluation. Performance is limited, due to the lower memory and burstable CPU performance.
Tip: To improve your throughput with Amazon MQ, make sure that you have consumers processing messages as fast as (or faster than) your producers are pushing messages.
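To see why this tip matters, here is a minimal sketch of how a backlog builds up on the broker when producers outpace consumers. It assumes steady rates in messages per second; the function name and the numbers are illustrative only and are not part of Amazon MQ or JMS Benchmark.

```python
def backlog_after(seconds: float, produce_rate: float,
                  consume_rate: float, start: int = 0) -> float:
    """Messages waiting on the broker after `seconds` at steady rates (msg/s)."""
    growth = (produce_rate - consume_rate) * seconds
    # A backlog cannot drop below zero: consumers simply go idle.
    return max(0.0, float(start + growth))


# Producers at 1000 msg/s against consumers at 800 msg/s, for one minute.
print(backlog_after(60, produce_rate=1000, consume_rate=800))   # 12000.0
# Consumers faster than producers: the queue stays drained.
print(backlog_after(60, produce_rate=1000, consume_rate=1200))  # 0.0
```

A growing backlog means the broker spends memory and storage holding messages, so measured throughput in a benchmark usually reflects the slower side of the pipeline.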
Because it’s impossible to talk about how the broker (ActiveMQ) behaves for each and every use case, we walk through how to set up your own benchmark for Amazon MQ using our favorite open-source benchmarking tool: JMS Benchmark. We are fans of the JMS Benchmark suite because it’s easy to set up and deploy, and comes with a built-in visualizer of the results.
Non-Persistent Scenarios – Queue latency as you scale producer throughput
Getting started
At the time of publication, you can create an mq.m4.large single-instance broker for testing for $0.30 per hour (US pricing).
Step 2 – Create an EC2 instance to run your benchmark
Launch the EC2 instance using Step 1: Launch an Instance. We recommend choosing the m5.large instance type.
Step 3 – Configure the security groups
Make sure that all the security groups are correctly configured to let the traffic flow between the EC2 instance and your broker.
From the broker list, choose the name of your broker (for example, MyBroker)
In the Details section, under Security and network, choose the name of your security group or choose the expand icon.
From the security group list, choose your security group.
At the bottom of the page, choose Inbound, Edit.
In the Edit inbound rules dialog box, add a rule to allow traffic between your instance and the broker:
• Choose Add Rule.
• For Type, choose Custom TCP.
• For Port Range, type the ActiveMQ SSL port (61617).
• For Source, leave Custom selected and then type the security group of your EC2 instance.
• Choose Save.
Your broker can now accept the connection from your EC2 instance.
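If you prefer to script the same inbound rule, the console steps above can be sketched with boto3 roughly as follows. The security group IDs are placeholders, and the helper names are our own; only `authorize_security_group_ingress` is a real EC2 API call.

```python
ACTIVEMQ_SSL_PORT = 61617  # the ActiveMQ SSL port named in the steps above


def broker_ingress_rule(broker_sg_id: str, instance_sg_id: str) -> dict:
    """Build the arguments for an inbound TCP rule on the broker's security group."""
    return {
        "GroupId": broker_sg_id,
        "IpPermissions": [{
            "IpProtocol": "tcp",
            "FromPort": ACTIVEMQ_SSL_PORT,
            "ToPort": ACTIVEMQ_SSL_PORT,
            # Source is the EC2 instance's security group, not a CIDR range.
            "UserIdGroupPairs": [{"GroupId": instance_sg_id}],
        }],
    }


def apply_rule(rule: dict) -> None:
    """Send the rule to EC2 (needs boto3 and credentials with EC2 permissions)."""
    import boto3  # imported here so the rule builder works without AWS access
    boto3.client("ec2").authorize_security_group_ingress(**rule)


# Placeholder IDs: substitute the groups of your broker and your instance.
rule = broker_ingress_rule("sg-0123broker", "sg-0456instance")
print(rule["IpPermissions"][0]["FromPort"])
```

Keeping the rule builder separate from the API call makes it easy to review the rule before anything is changed in your account.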
Step 4 – Run the benchmark Connect to your EC2 instance using SSH and run the following commands:
After the benchmark finishes, you can find the results in the ~/reports directory. As you may notice, the performance of ActiveMQ varies based on the number of consumers, producers, destinations, and message size.
Amazon MQ architecture
The last bit that’s important to know so that you can better understand the results of the benchmark is how Amazon MQ is architected.
Amazon MQ is architected to be highly available (HA) and durable. For HA, we recommend using the multi-AZ option. After a message is sent to Amazon MQ in persistent mode, the message is written to the highly durable message store that replicates the data across multiple nodes in multiple Availability Zones. Because of this replication, for some use cases you may see a reduction in throughput as you migrate to Amazon MQ. Customers have told us they appreciate the benefits of message replication as it helps protect durability even in the face of the loss of an Availability Zone.
Conclusion
We hope this gives you an idea of how Amazon MQ performs. We encourage you to run tests to simulate your own use cases.
To learn more, see the Amazon MQ website. You can try Amazon MQ for free with the AWS Free Tier, which includes up to 750 hours of a single-instance mq.t2.micro broker and up to 1 GB of storage per month for one year.
A hanging plotter, also known as a polar plotter or polargraph, is a machine for drawing images on a vertical surface. It does so by using motors to control the length of two cords that form a V shape, supporting a pen where they meet. We’ve featured one on this blog before: Norbert “HomoFaciens” Heinz’s video is a wonderfully clear introduction to how a polargraph works and what you have to consider when you’re putting one together.
Today, we look at Inky Lines, by John Proudlock. With it, John is creating a series of captivating and beautiful pieces, and with his most recent work, each rendering of an image is unique.
An evolving project
The project isn’t new – John has been working on it for at least a couple of years – but it is constantly evolving. When we first spotted it, John had just implemented code to allow the plotter to produce mesmeric, spiralling patterns.
But we’re skipping ahead. Let’s go back to the beginning.
From pixels to motor movements
John starts by providing an image, usually no more than 100 pixels wide, to a Raspberry Pi. Custom software that he wrote evaluates the darkness of each pixel and selects a pattern of a suitable density to represent it.
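As a rough illustration of that first step, here is a minimal sketch that maps a pixel's grey value to a pattern density. It is not John's software: the five density levels and the function name are assumptions made for the example.

```python
def density_level(brightness: int, levels: int = 5) -> int:
    """Map an 8-bit grey value (0 = black, 255 = white) to a density 0..levels-1.

    Darker pixels get a higher level, i.e. a denser pattern of pen strokes.
    """
    darkness = 255 - brightness
    return min(levels - 1, darkness * levels // 256)


# One row of a tiny image: white, mid-grey, black.
row = [255, 128, 0]
print([density_level(p) for p in row])  # [0, 2, 4]
```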
The two cords supporting the plotter’s pen are wound around the shafts of two stepper motors, such that the movement of the motors controls the length of the cords: the program next calculates how much each motor must move in order to produce the pattern. The Raspberry Pi passes corresponding instructions to two motor circuits, which transform the signals to a higher voltage and pass them to the stepper motors. These turn by very precise amounts, winding or unwinding the cords and, very slowly, dragging the pen across the paper.
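The geometry behind this calculation can be sketched in a few lines. This is a simplified model rather than the Inky Lines code: it assumes the origin sits at the left motor, y increases downwards, and each motor step pays out a fixed length of cord (`mm_per_step` is a made-up calibration value).

```python
import math


def cord_lengths(x: float, y: float, motor_spacing: float) -> tuple:
    """Lengths of the left and right cords when the pen is at (x, y)."""
    left = math.hypot(x, y)
    right = math.hypot(motor_spacing - x, y)
    return left, right


def steps_for_move(start, end, motor_spacing: float, mm_per_step: float) -> tuple:
    """Steps each motor must turn to move the pen from start to end.

    Positive values unwind cord (the cord gets longer), negative values wind it in.
    """
    l0, r0 = cord_lengths(*start, motor_spacing)
    l1, r1 = cord_lengths(*end, motor_spacing)
    return round((l1 - l0) / mm_per_step), round((r1 - r0) / mm_per_step)


# Motors 800 mm apart; move the pen from (300, 400) to (400, 300).
print(steps_for_move((300, 400), (400, 300), motor_spacing=800, mm_per_step=0.1))
```

In this example the left cord is 500 mm long at both points, so only the right motor has to wind in cord, which shows why straight pen movements generally need both motors to turn by different amounts.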
Suspended in-between the two motors is a print head, made out of a new 3-d modelling material I’ve been prototyping called cardboard. An old coat hanger and some velcro were also used.
The earlier drawings that John made used a repeatable method to render image files as lines on paper. That is, if the machine drew the same image a number of times, each copy would be identical. More recently, though, he has been using a method that yields random movements of the pen:
The pen point is guided around the image, but moves to each new point entirely at random. Up close this looks like a chaotic squiggle, but from a distance of a couple of meters, the human eye (and brain) make order from the chaos and view an infinite number of shades and a smoother, less mechanical image.
This method means that no matter how many times the polargraph repeats the same image, each copy will be unique.
A gallery of work
Inky Lines’ website and its Instagram feed offer a collection of wonderful pieces John has drawn with his polargraph, and he discusses the different techniques and types of image that he is exploring.
They range from holiday photographs, processed to extract particular features and rendered in silhouette, to portraits, made with a single continuous line that can be several hundred metres long, to generative spirograph images like those pictured above, created by an algorithm rather than rendered from a source image.
Side projects are the things you do at home, after work, for your own “entertainment”, or to satisfy your desire to learn new stuff, in case your workplace doesn’t give you that opportunity (or at least not enough of it). Side projects are also a way to build stuff that you think is valuable but not necessarily “commercialisable”. Many side projects are open-sourced sooner or later and some of them contribute to the pool of tools at other people’s disposal.
I’ve outlined one recommendation about side projects before – do them with technologies that are new to you, so that you learn important things that will keep you better positioned in the software world.
But there are more benefits than that – serendipitous benefits, for example. And I’d like to tell some personal stories about that. I’ll focus on a few examples from my list of side projects to show how, through a sort-of butterfly effect, they helped shape my career.
The computoser project, no matter how cool algorithmic music composition is, didn’t manage to have much of a long-term impact. But it did teach me something apart from niche musical theory – how to read a bulk of scientific papers (mostly computer science) and understand them without being formally trained in the particular field. We’ll see how that was useful later.
Then there was the “State alerts” project – a website that scraped content from public institutions in my country (legislation, legislation proposals, decisions by regulators, new tenders, etc.), made them searchable, and “subscribable” – so that you get notified when a keyword of interest is mentioned in newly proposed legislation, for example. (I obviously subscribed for “information technologies” and “electronic”).
And that project turned out to have a significant impact on the following years. First, I chose a new technology to write it with – Scala. Which turned out to be of great use when I started working at TomTom, and on the 3rd day I was transferred to a Scala project, which was way cooler and much more complex than the original one I was hired for. It was a bit ironic, as my colleagues had just read that “I don’t like Scala” a few weeks earlier, but nevertheless, that was one of the most interesting projects I’ve worked on, and it went on for two years. Had I not known Scala, I’d probably be gone from TomTom much earlier (as the other project was restructured a few times), and I would not have learned many of the scalability, architecture and AWS lessons that I did learn there.
But the very same project had an even more important follow-up. Because of its “civic hacking” flavour, I was invited to join an informal group of developers (later formalised as an NGO) who create tools that are useful for society (something like MySociety.org). That group gathered regularly, discussed both tools and policies, and at some point we put up a list of policy priorities that we wanted to lobby policy makers for. One of them was open source for the government, the other one was open data. As a result of our interaction with an interim government, we donated the official open data portal of my country, functioning to this day.
As a result of that, a few months later we got a proposal from the deputy prime minister’s office to “elect” one of the group as an advisor to the cabinet. And we decided that could be me. So I went for it and became advisor to the deputy prime minister. The job has nothing to do with anything one could imagine, and it was challenging and fascinating. We managed to pass legislation, including one that requires open source for custom projects, eID and open data. And all of that would not have been possible without my little side project.
As for my latest side project, LogSentinel – it became my current startup company. And not without help from the previous two mentioned above – the computer science paper reading was of great use when I was navigating the crypto papers landscape, and from the government job I not only gained invaluable legal knowledge, but I also “got” a co-founder.
Some other side projects died without much fanfare, and that’s fine. But the ones above shaped my “story” in a way that would not have been possible otherwise.
And I agree that such a serendipitous chain of events could have happened without side projects – I could’ve gotten these opportunities by meeting someone at a bar (unlikely, but who knows). But we, as software engineers, are capable of tilting chance towards us by utilizing our skills. Side projects are our “extracurricular activities”, and they often lead to unpredictable, but rather positive chains of events. They would rarely be the only factor, but they are certainly great at unlocking potential.
As you can see from my EC2 Instance History post, we add new instance types on a regular and frequent basis. Driven by increasingly powerful processors and designed to address an ever-widening set of use cases, the size and diversity of this list reflects the equally diverse group of EC2 customers!
Near the bottom of that list you will find the new compute-intensive C5 instances. With a 25% to 50% improvement in price-performance over the C4 instances, the C5 instances are designed for applications like batch and log processing, distributed and/or real-time analytics, high-performance computing (HPC), ad serving, highly scalable multiplayer gaming, and video encoding. Some of these applications can benefit from access to high-speed, ultra-low latency local storage. For example, video encoding, image manipulation, and other forms of media processing often necessitate large amounts of I/O to temporary storage. While the input and output files are valuable assets and are typically stored as Amazon Simple Storage Service (S3) objects, the intermediate files are expendable. Similarly, batch and log processing runs in a race-to-idle model, flushing volatile data to disk as fast as possible in order to make full use of compute resources.
New C5d Instances with Local Storage
In order to meet this need, we are introducing C5 instances equipped with local NVMe storage. Available for immediate use in 5 regions, these instances are a great fit for the applications that I described above, as well as others that you will undoubtedly dream up! Here are the specs:
| Instance Name | vCPUs | RAM | Local Storage | EBS Bandwidth | Network Bandwidth |
|---|---|---|---|---|---|
| c5d.large | 2 | 4 GiB | 1 x 50 GB NVMe SSD | Up to 2.25 Gbps | Up to 10 Gbps |
| c5d.xlarge | 4 | 8 GiB | 1 x 100 GB NVMe SSD | Up to 2.25 Gbps | Up to 10 Gbps |
| c5d.2xlarge | 8 | 16 GiB | 1 x 225 GB NVMe SSD | Up to 2.25 Gbps | Up to 10 Gbps |
| c5d.4xlarge | 16 | 32 GiB | 1 x 450 GB NVMe SSD | 2.25 Gbps | Up to 10 Gbps |
| c5d.9xlarge | 36 | 72 GiB | 1 x 900 GB NVMe SSD | 4.5 Gbps | 10 Gbps |
| c5d.18xlarge | 72 | 144 GiB | 2 x 900 GB NVMe SSD | 9 Gbps | 25 Gbps |
Other than the addition of local storage, the C5 and C5d share the same specs. Both are powered by 3.0 GHz Intel Xeon Platinum 8000-series processors, optimized for EC2 and with full control over C-states on the two largest sizes, giving you the ability to run two cores at up to 3.5 GHz using Intel Turbo Boost Technology.
You can use any AMI that includes drivers for the Elastic Network Adapter (ENA) and NVMe; this includes the latest Amazon Linux, Microsoft Windows (Server 2008 R2, Server 2012, Server 2012 R2 and Server 2016), Ubuntu, RHEL, SUSE, and CentOS AMIs.
Here are a couple of things to keep in mind about the local NVMe storage:
Naming – You don’t have to specify a block device mapping in your AMI or during the instance launch; the local storage will show up as one or more devices (/dev/nvme*1 on Linux) after the guest operating system has booted.
Encryption – Each local NVMe device is hardware encrypted using the XTS-AES-256 block cipher and a unique key. Each key is destroyed when the instance is stopped or terminated.
Lifetime – Local NVMe devices have the same lifetime as the instance they are attached to, and do not stick around after the instance has been stopped or terminated.
Available Now
C5d instances are available in On-Demand, Reserved Instance, and Spot form in the US East (N. Virginia), US West (Oregon), EU (Ireland), US East (Ohio), and Canada (Central) Regions. Prices vary by Region, and are just a bit higher than for the equivalent C5 instances.
Thanks to Susan Ferrell, Senior Technical Writer, for a great blog post on how to use CodeCommit branch-level permissions.
AWS CodeCommit users have been asking for a way to restrict commits to some repository branches to just a few people. In this blog post, we’re going to show you how to do that by creating and applying a conditional policy, an AWS Identity and Access Management (IAM) policy that contains a context key.
Why would I do this?
When you create a branch in an AWS CodeCommit repository, the branch is available, by default, to all repository users. Here are some scenarios in which refining access might help you:
You maintain a branch in a repository for production-ready code, and you don’t want to allow changes to this branch except from a select group of people.
You want to limit the number of people who can make changes to the default branch in a repository.
You want to ensure that pull requests cannot be merged to a branch except by an approved group of developers.
We’ll show you how to create a policy in IAM that prevents users from pushing commits to and merging pull requests to a branch named master. You’ll attach that policy to one group or role in IAM, and then test how users in that group are affected when that policy is applied. We’ll explain how it works, so you can create custom policies for your repositories.
What you need to get started
You’ll need to sign in to AWS with sufficient permissions to:
Create and apply policies in IAM.
Create groups in IAM.
Add users to those groups.
Apply policies to those groups.
You can use existing IAM groups, but because you’re going to be changing permissions, you might want to first test this out on groups and users you’ve created specifically for this purpose.
You’ll need a repository in AWS CodeCommit with at least two branches: master and test-branch. For information about how to create repositories, see Create a Repository. For information about how to create branches, see Create a Branch. In this blog post, we’ve named the repository MyDemoRepo. You can use an existing repository with branches of another name, if you prefer.
Let’s get started!
Create two groups in IAM
We’re going to set up two groups in IAM: Developers and Senior_Developers. To start, both groups will have the same managed policy, AWSCodeCommitPowerUser, applied. Users in each group will have exactly the same permissions to perform actions in AWS CodeCommit.
Figure 1: Two example groups in IAM, with distinct users but the same managed policy applied to each group
In the navigation pane, choose Groups, and then choose Create New Group.
In the Group Name box, type Developers, and then choose Next Step.
In the list of policies, select the check box for AWSCodeCommitPowerUser, then choose Next Step.
Choose Create Group.
Now, repeat these steps to create the Senior_Developers group and attach the AWSCodeCommitPowerUser managed policy. You now have two empty groups with the same policy attached.
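The same two groups can also be created from a script. The following is a minimal sketch using boto3's IAM client; the helper names are our own, and the ARN is the AWS managed policy whose name ends in AWSCodeCommitPowerUser. The functions take the client as an argument so that you can inspect or test them without touching a real account.

```python
# ARN of the AWS managed policy attached to both groups in this walkthrough.
CODECOMMIT_POWER_USER_ARN = "arn:aws:iam::aws:policy/AWSCodeCommitPowerUser"


def create_group_with_policy(iam, group_name: str,
                             policy_arn: str = CODECOMMIT_POWER_USER_ARN) -> None:
    """Create an IAM group and attach one managed policy to it."""
    iam.create_group(GroupName=group_name)
    iam.attach_group_policy(GroupName=group_name, PolicyArn=policy_arn)


def create_demo_groups(iam) -> None:
    """Create the two example groups used in this post."""
    for name in ("Developers", "Senior_Developers"):
        create_group_with_policy(iam, name)


# Usage, with credentials that are allowed to manage IAM:
#   import boto3
#   create_demo_groups(boto3.client("iam"))
```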
Create users in IAM
Next, add at least one unique user to each group. You can use existing IAM users, but because you’ll be affecting their access to AWS CodeCommit, you might want to create two users just for testing purposes. Let’s go ahead and create Arnav and Mary.
In the navigation pane, choose Users, and then choose Add user.
For the new user, type Arnav_Desai.
Choose Add another user, and then type Mary_Major.
Select the type of access (programmatic access, access to the AWS Management Console, or both). In this blog post, we’ll be testing everything from the console, but if you want to test AWS CodeCommit using the AWS CLI, make sure you include programmatic access and console access.
For Console password type, choose Custom password. Each user is assigned the password that you type in the box. Write these down so you don’t forget them. You’ll need to sign in to the console using each of these accounts.
Choose Next: Permissions.
On the Set permissions page, choose Add user to group. Add Arnav to the Developers group. Add Mary to the Senior_Developers group.
Choose Next: Review to see all of the choices you made up to this point. When you are ready to proceed, choose Create user.
Sign in as Arnav, and then follow these steps to go to the master branch and add a file. Then sign in as Mary and follow the same steps.
On the Dashboard page, from the list of repositories, choose MyDemoRepo.
In the Code view, choose the branch named master.
Choose Add file, and then choose Create file. Type some text or code in the editor.
Provide information to other users about who added this file to the repository and why.
In Author name, type the name of the user (Arnav or Mary).
In Email address, type an email address so that other repository users can contact you about this change.
In Commit message, type a brief description to help you remember why you added this file or any other details you might find helpful.
Type a name for the file.
Choose Commit file.
Now follow the same steps to add a file in a different branch. (In our example repository, that’s the branch named test-branch.) You should be able to add a file to both branches regardless of whether you’re signed in as Arnav or Mary.
Let’s change that.
Create a conditional policy in IAM
You’re going to create a policy in IAM that will deny API actions if certain conditions are met. We want to prevent users with this policy applied from updating a branch named master, but we don’t want to prevent them from viewing the branch, cloning the repository, or creating pull requests that will merge to that branch. For this reason, we want to pick and choose our APIs carefully. Looking at the Permissions Reference, the logical permissions for this are:
GitPush
PutFile
MergePullRequestByFastForward
Now’s the time to think about what else you might want this policy to do. For example, because we don’t want users with this policy to make changes to this branch, we probably don’t want them to be able to delete it either, right? So let’s add one more permission:
DeleteBranch
The branch in which we want to deny these actions is master. The repository in which the branch resides is MyDemoRepo. We’re going to need more than just the repository name, though. We need the repository ARN. Fortunately, that’s easy to find. Just go to the AWS CodeCommit console, choose the repository, and choose Settings. The repository ARN is displayed on the General tab.
Now we’re ready to create a policy.
1. Open the IAM console at https://console.aws.amazon.com/iam/. Make sure you’re signed in with the account that has sufficient permissions to create policies, and not as Arnav or Mary.
2. In the navigation pane, choose Policies, and then choose Create policy.
3. Choose JSON, and then paste in the following:
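As a minimal sketch, a policy document consistent with the actions and conditions described in this post can be generated with a short Python script and pasted from its output. The ARN below (Region and account ID) is a placeholder assumption. The condition layout is also an assumption, based on this post's description of the refs/heads value and the null check, so compare it with the AWS CodeCommit User Guide before you rely on it.

```python
import json


def build_deny_policy(repo_arn, branch="master"):
    """Return an IAM policy document that denies changes to one branch."""
    ref = f"refs/heads/{branch}"  # Git reference for the branch
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": [
                    "codecommit:GitPush",
                    "codecommit:DeleteBranch",
                    "codecommit:PutFile",
                    "codecommit:MergePullRequestByFastForward",
                ],
                "Resource": repo_arn,
                "Condition": {
                    # Match only requests that target this branch reference.
                    "StringEqualsIfExists": {"codecommit:References": [ref]},
                    # Require that a reference is present, so the initial
                    # reference-less call of a git push is not denied.
                    "Null": {"codecommit:References": "false"},
                },
            }
        ],
    }


# Placeholder ARN (assumption): replace with the ARN of your repository.
EXAMPLE_ARN = "arn:aws:codecommit:us-east-2:111122223333:MyDemoRepo"

if __name__ == "__main__":
    print(json.dumps(build_deny_policy(EXAMPLE_ARN), indent=4))
```

Running the script prints the JSON to paste into the policy editor. Pass a different branch name to build_deny_policy to protect a branch other than master.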
You’ll notice a few things here. First, change the repository ARN to the ARN for your repository and include the repository name. Second, if you want to restrict access to a branch with a name different from our example, master, change that reference too.
Now let’s talk about this policy and what it does. You might be wondering why we’re using a Git reference (refs/heads) value instead of just the branch name. The answer lies in how Git references things, and how AWS CodeCommit, as a Git-based repository service, implements its APIs. A branch in Git is a simple pointer (reference) to the SHA-1 value of the head commit for that branch.
You might also be wondering about the second part of the condition, the nullification language. This is necessary because of the way git push and git-receive-pack work. Without going into too many technical details, when you attempt to push a change from a local repo to AWS CodeCommit, an initial reference call is made to AWS CodeCommit without any branch information. AWS CodeCommit evaluates that initial call to ensure that:
a) You’re authorized to make calls.
b) A repository exists with the name specified in the initial call.
If you left that null out of the policy, users with that policy would be unable to complete any pushes from their local repos to the AWS CodeCommit remote repository at all, regardless of which branch they were trying to push their commits to.
Could you write a policy in such a way that the null is not required? Of course. IAM policy language is flexible. There’s an example of how to do this in the AWS CodeCommit User Guide, if you’re curious. But for the purposes of this blog post, let’s continue with this policy as written.
So what have we essentially said in this policy? We’ve asked IAM to deny the relevant CodeCommit permissions if the request is made to the resource MyDemoRepo and it meets the following condition: the reference is to refs/heads/master. Otherwise, the deny does not apply.
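To make that logic concrete, here is a simplified Python model of when the deny matches a request. It is only an illustration of the reasoning in this post, not how IAM evaluates policies internally. The function name, the placeholder ARN, and the use of None to stand for "no branch information in the request" are all assumptions.

```python
# Placeholder ARN (assumption): Region and account ID are illustrative only.
REPO_ARN = "arn:aws:codecommit:us-east-2:111122223333:MyDemoRepo"
DENIED_ACTIONS = {"GitPush", "PutFile", "MergePullRequestByFastForward", "DeleteBranch"}


def deny_applies(action, resource, reference,
                 repo_arn=REPO_ARN, branch_ref="refs/heads/master"):
    """Simplified model of whether the deny statement matches a request.

    `reference` is the Git reference named in the request, or None when the
    request carries no branch information (the initial call of a git push).
    """
    if action not in DENIED_ACTIONS or resource != repo_arn:
        return False
    # Null check: the deny only applies when a reference is present.
    if reference is None:
        return False
    # The deny applies only to the protected branch reference.
    return reference == branch_ref


if __name__ == "__main__":
    print(deny_applies("GitPush", REPO_ARN, None))                  # initial call
    print(deny_applies("GitPush", REPO_ARN, "refs/heads/master"))   # protected branch
    print(deny_applies("GitPush", REPO_ARN, "refs/heads/test-branch"))
```

In this model, only the second call is denied. The initial reference-less call and the push to test-branch both fall outside the condition, which is the behavior the null check is there to preserve.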
I’m sure you’re wondering if this policy has to be constrained to a specific repository resource like MyDemoRepo. After all, it would be awfully convenient if a single policy could apply to all branches in any repository in an AWS account, particularly since the default branch in any repository is initially the master branch. Good news! Simply replace the ARN with an *, and your policy will affect ALL branches named master in every AWS CodeCommit repository in your AWS account. Make sure that this is really what you want, though. We suggest you start by limiting the scope to just one repository, and then change things when you’ve tested it and are happy with how it works.
When you’re sure you’ve modified the policy for your environment, choose Review policy to validate it. Give this policy a name, such as DenyChangesToMaster, provide a description of its purpose, and then choose Create policy.
Now that you have a policy, it’s time to apply and test it.
Apply the policy to a group
In theory, you could apply the policy you just created directly to any IAM user, but that really doesn’t scale well. You should apply this policy to a group, if you use IAM groups to manage users, or to a role, if your users assume a role when interacting with AWS resources.
In the IAM console, choose Groups, and then choose Developers.
On the Permissions tab, choose Attach Policy.
Choose DenyChangesToMaster, and then choose Attach policy.
Your groups now have a critical difference: users in the Developers group have an additional policy applied that restricts their actions in the master branch. In other words, Mary can continue to add files, push commits, and merge pull requests in the master branch, but Arnav cannot.
Figure 2: Two example groups in IAM, one with an additional policy applied that will prevent users in this group from making changes to the master branch
Test it out. Sign in as Arnav, and do the following:
On the Dashboard page, from the list of repositories, choose MyDemoRepo.
In the Code view, choose the branch named master.
Choose Add file, and then choose Create file, just as you did before. Provide some text, and then add the file name and your user information.
Choose Commit file.
This time you’ll see an error after choosing Commit file. It’s not a pretty message, but at the very end, you’ll see a telling phrase: “explicit deny”. That’s the policy in action. You, as Arnav, are explicitly denied PutFile, which prevents you from adding a file to the master branch. You’ll see similar results if you try other actions denied by that policy, such as deleting the master branch.
Stay signed in as Arnav, but this time add a file to test-branch. You should be able to add a file without seeing any errors. You can create a branch based on the master branch, add a file to it, and create a pull request that will merge to the master branch, all just as before. However, you cannot perform denied actions on that master branch.
Sign out as Arnav and sign in as Mary. You’ll see that as that IAM user, you can add and edit files in the master branch, merge pull requests to it, and even, although we don’t recommend this, delete it.
Conclusion
You can use conditional statements in policies in IAM to refine how users interact with your AWS CodeCommit repositories. This blog post showed how to use such a policy to prevent users from making changes to a branch named master. There are many other options. We hope this blog post will encourage you to experiment with AWS CodeCommit, IAM policies, and permissions. If you have any questions or suggestions, we’d love to hear from you.
In our blog post on Tuesday, Cryptocurrency Security Challenges, we wrote about the two primary challenges faced by anyone interested in safely and profitably participating in the cryptocurrency economy: 1) make sure you’re dealing with reputable and ethical companies and services, and, 2) keep your cryptocurrency holdings safe and secure.
In this post, we’re going to focus on how to make sure you don’t lose any of your cryptocurrency holdings through accident, theft, or carelessness. You do that by backing up the keys needed to sell or trade your currencies.
$34 Billion in Lost Value
Of the 16.4 million bitcoins said to be in circulation in the middle of 2017, close to 3.8 million may have been lost because their owners are no longer able to claim their holdings. Based on today’s valuation, that could total as much as $34 billion in lost value. And that’s just bitcoins. There are now over 1,500 different cryptocurrencies, and we don’t know how many of those have been misplaced or lost.
Now that some cryptocurrencies have reached (at least for now) staggering heights in value, it’s likely that owners will be more careful in keeping track of the keys needed to use their cryptocurrencies. For the ones already lost, however, the owners have been separated from their currencies just as surely as if they had thrown Benjamin Franklins and Grover Clevelands over the railing of a ship.
The Basics of Securing Your Cryptocurrencies
In our previous post, we reviewed how cryptocurrency keys work, and the common ways owners can keep track of them. A cryptocurrency owner needs two keys to use their currencies: a public key, which can be shared with others and is used to receive currency, and a private key, which must be kept secure and is used to spend or trade currency.
Many wallets and applications allow the user to require extra security to access them, such as a password, or iris, face, or thumb print scan. If one of these options is available in your wallets, take advantage of it. Beyond that, it’s essential to back up your wallet, either using the backup feature built into some applications and wallets, or manually backing up the data used by the wallet. When backing up, it’s a good idea to back up the entire wallet, as some wallets require additional private data to operate that might not be apparent.
No matter which backup method you use, it is important to back up often and have multiple backups, preferably in different locations. As with any valuable data, a 3-2-1 backup strategy is good to follow, which ensures that you’ll have a good backup copy if anything goes wrong with one or more copies of your data.
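The 3-2-1 rule is usually stated as: keep at least three copies of your data, on at least two different types of storage, with at least one copy offsite. As a rough sketch, here is how you might check a list of backup copies against that rule in Python. The BackupCopy type and its field names are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    label: str     # where the copy lives, e.g. "laptop" or "cloud"
    medium: str    # type of storage, e.g. "ssd", "usb", "cloud"
    offsite: bool  # True if the copy is kept away from the primary location


def satisfies_3_2_1(copies):
    """Check for at least 3 copies, on at least 2 media, with at least 1 offsite."""
    media = {copy.medium for copy in copies}
    has_offsite = any(copy.offsite for copy in copies)
    return len(copies) >= 3 and len(media) >= 2 and has_offsite


if __name__ == "__main__":
    copies = [
        BackupCopy("laptop", "ssd", offsite=False),
        BackupCopy("usb-drive", "usb", offsite=False),
        BackupCopy("cloud", "cloud", offsite=True),
    ]
    print(satisfies_3_2_1(copies))
```

A wallet kept only on a laptop and a USB drive in the same desk drawer would fail the check on two counts: only two copies, and nothing offsite.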
One more caveat: don’t reuse passwords. This applies to all of your accounts, but is especially important for something as critical as your finances. Don’t ever use the same password for more than one account. If security is breached on one of your accounts, someone could connect your name or ID with other accounts and attempt to use the password there as well. Consider using a password manager such as LastPass or 1Password, which makes creating and using complex and unique passwords easy no matter where you’re trying to sign in.
Approaches to Backing Up Your Cryptocurrency Keys
There are numerous ways to be sure your keys are backed up. Let’s take them one by one.
1. Automatic backups using a backup program
If you’re using a wallet program on your computer, for example, Bitcoin Core, it will store your keys, along with other information, in a file. For Bitcoin Core, that file is wallet.dat. Other currencies will use the same or a different file name and some give you the option to select a name for the wallet file.
To back up the wallet.dat or other wallet file, you might need to tell your backup program to explicitly back up that file. Users of Backblaze Backup don’t have to worry about configuring this, since by default, Backblaze Backup will back up all data files. You should determine where your particular cryptocurrency, wallet, or application stores your keys, and make sure the necessary file(s) are backed up if your backup program requires you to select which files are included in the backup.
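If you prefer to copy the wallet file yourself, here is a minimal sketch in Python that copies it to several backup folders under a timestamped name and verifies each copy with a SHA-256 checksum. The function names and folder layout are assumptions for illustration. The sketch assumes the wallet application is closed while the file is copied, and it does not encrypt anything.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_wallet(wallet_path, backup_dirs):
    """Copy the wallet file into each backup directory and verify each copy."""
    wallet_path = Path(wallet_path)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    expected = sha256_of(wallet_path)
    copies = []
    for directory in backup_dirs:
        directory = Path(directory)
        directory.mkdir(parents=True, exist_ok=True)
        # Timestamped name so that older backups are kept, not overwritten.
        target = directory / f"{wallet_path.stem}-{stamp}{wallet_path.suffix}"
        shutil.copy2(wallet_path, target)
        if sha256_of(target) != expected:
            raise IOError(f"Backup copy does not match the original: {target}")
        copies.append(target)
    return copies
```

For example, back_up_wallet("wallet.dat", ["/mnt/usb/backups", "/mnt/nas/backups"]) would write one verified copy to each folder; both paths are hypothetical.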
Backblaze B2 is an option for those interested in low-cost, high-security cloud storage of their cryptocurrency keys. Backblaze B2 supports 2-factor verification for account access, works with a number of apps that support automatic backups with encryption, error-recovery, and versioning, and offers an API and command-line interface (CLI), as well. The first 10GB of storage is free, which could be all one needs to store encrypted cryptocurrency keys.
2. Backing up by exporting keys to a file
Apps and wallets will let you export your keys from your app or wallet to a file. Once exported, your keys can be stored on a local drive, USB thumb drive, DAS, NAS, or in the cloud with any cloud storage or sync service you wish. Encrypting the file is strongly encouraged (more on that later). If you use 1Password, LastPass, or another secure notes program, you could also store your keys there.
3. Backing up by saving a mnemonic recovery seed
A mnemonic phrase, mnemonic recovery phrase, or mnemonic seed is a list of words that stores all the information needed to recover a cryptocurrency wallet. Many wallets will have the option to generate a mnemonic backup phrase, which can be written down on paper. If the user’s computer no longer works or their hard drive becomes corrupted, they can download the same wallet software again and use the mnemonic recovery phrase to restore their keys.
The phrase can be used by anyone to recover the keys, so it must be kept safe. Mnemonic phrases are an excellent way of backing up and storing cryptocurrency and so they are used by almost all wallets.
A mnemonic recovery seed is represented by a group of easy-to-remember words. The words in the list are chosen to have certain properties:
The first four letters are enough to unambiguously identify the word.
Similar words are avoided (such as: build and built).
Bitcoin and most other cryptocurrencies, such as Litecoin and Ethereum, use mnemonic seeds that are 12 to 24 words long. Other currencies might use different-length seeds.
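To see why the four-letter property matters, here is a small Python sketch that checks a word list for unique prefixes and expands abbreviated words back to full words. The tiny word list is a toy for illustration, not a real wallet word list, and the function names are assumptions. The set of allowed phrase lengths is the one defined by the BIP39 standard, which many wallets follow.

```python
VALID_LENGTHS = {12, 15, 18, 21, 24}  # phrase lengths defined by BIP39


def is_valid_length(phrase_words):
    """True if the phrase has one of the BIP39 word counts."""
    return len(phrase_words) in VALID_LENGTHS


def prefixes_are_unique(wordlist, prefix_len=4):
    """True if no two words share the same first `prefix_len` letters."""
    prefixes = [word[:prefix_len] for word in wordlist]
    return len(prefixes) == len(set(prefixes))


def expand_phrase(abbreviated, wordlist, prefix_len=4):
    """Recover full words from their first letters using the word list."""
    by_prefix = {word[:prefix_len]: word for word in wordlist}
    return [by_prefix[abbr[:prefix_len]] for abbr in abbreviated]


if __name__ == "__main__":
    # Toy word list (assumption): a real list has many more words.
    words = ["abandon", "ability", "able", "about", "above", "absent"]
    print(prefixes_are_unique(words))
    print(prefixes_are_unique(words + ["build", "built"]))
    print(expand_phrase(["aban", "abov"], words))
```

Adding both build and built makes the prefix check fail, which is why a well-designed word list avoids pairs like that.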
4. Physical backups — Paper, Metal
Some cryptocurrency holders believe that their backup, or even all their cryptocurrency account information, should be stored entirely separately from the internet to avoid any risk of their information being compromised through hacks, exploits, or leaks. This type of storage is called “cold storage.” One method of cold storage involves printing out the keys to a piece of paper and then erasing any record of the keys from all computer systems. The keys can be entered into a program from the paper when needed, or scanned from a QR code printed on the paper.
Printed public and private keys
Some who go to extremes suggest separating the mnemonic needed to access an account into individual pieces of paper and storing those pieces in different locations in the home or office, or even different geographical locations. Some say this is a bad idea since it could be possible to reconstruct the mnemonic from one or more pieces. How diligent you wish to be in protecting these codes is up to you.
Mnemonic recovery phrase booklet
There’s another option that could make you the envy of your friends. That’s the Cryptosteel wallet, a stainless steel case that comes with more than 250 stainless steel letter tiles engraved on each side. Codes and passwords are assembled manually from the supplied part-randomized set of tiles. Users are able to store up to 96 characters’ worth of confidential information. Cryptosteel claims to be fireproof, waterproof, and shock-proof.
Cryptosteel cold wallet
Of course, if you leave your Cryptosteel wallet in the pocket of a pair of ripped jeans that gets thrown out by the housekeeper, as happened to the character Russ Hanneman on the TV show Silicon Valley in last Sunday’s episode, then you’re out of luck. That fictional billionaire investor lost a USB drive with $300 million in cryptocoins. Let’s hope that doesn’t happen to you.
Encryption & Security
Whether you store your keys on your computer, an external disk, a USB drive, DAS, NAS, or in the cloud, you want to make sure that no one else can use those keys. The best way to handle that is to encrypt the backup.
With Backblaze Backup for Windows and Macintosh, your backups are encrypted in transmission to the cloud and on the backup server. Users have the option to add an additional level of security by adding a Personal Encryption Key (PEK), which secures their private key. Your cryptocurrency backup files are secure in the cloud. Using our web or mobile interface, previous versions of files can be accessed, as well.
Our object storage cloud offering, Backblaze B2, can be used with a variety of applications for Windows, Macintosh, and Linux. With B2, cryptocurrency users can choose whichever method of encryption they wish to use on their local computers and then upload their encrypted currency keys to the cloud. Depending on the client used, versioning and life-cycle rules can be applied to the stored files.
Other backup programs and systems provide some or all of these capabilities, as well. If you are backing up to a local drive, it is a good idea to encrypt the local backup, which is an option in some backup programs.
Address Security
Some experts recommend using a different address for each cryptocurrency transaction. Since the address is not the same as your wallet, this means that you are not creating a new wallet, but simply using a new identifier for people sending you cryptocurrency. Creating a new address is usually as easy as clicking a button in the wallet.
One of the chief advantages of using a different address for each transaction is anonymity. Each time you use an address, you put more information into the public ledger (blockchain) about where the currency came from or where it went. That means that over time, using the same address repeatedly could mean that someone could map your relationships, transactions, and incoming funds. The more you use that address, the more information someone can learn about you. For more on this topic, refer to Address reuse.
Note that a downside of using a paper wallet with a single key pair (type-0 non-deterministic wallet) is that it has the vulnerabilities listed above. Each transaction using that paper wallet will add to the public record of transactions associated with that address. Newer wallets, i.e., “deterministic” wallets or those using mnemonic code words, support multiple addresses and are now recommended.
There are other approaches to keeping your cryptocurrency transactions secure. Here are a couple of them.
Multi-signature
Multi-signature refers to requiring more than one key to authorize a transaction, much like requiring more than one key to open a safe. It is generally used to divide up responsibility for possession of cryptocurrency. Standard transactions could be called “single-signature transactions” because transfers require only one signature — from the owner of the private key associated with the currency address (public key). Some wallets and apps can be configured to require more than one signature, which means that a group of people, businesses, or other entities all must agree to trade in the cryptocurrencies.
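The "more than one signature" rule is often set up as a threshold, such as any two of three key holders. Here is a toy Python sketch of that threshold check. It only counts approvals and does no cryptography, so it is not how any particular wallet implements multi-signature. The names are made up for the example.

```python
def is_authorized(approvals, key_holders, required):
    """Return True if at least `required` distinct key holders approved.

    Approvals from anyone who is not a key holder are ignored, and a key
    holder who approves twice is only counted once.
    """
    valid_approvals = set(approvals) & set(key_holders)
    return len(valid_approvals) >= required


if __name__ == "__main__":
    holders = {"alice", "bob", "carol"}  # hypothetical key holders
    print(is_authorized(["alice"], holders, required=2))
    print(is_authorized(["alice", "carol"], holders, required=2))
```

With a two-of-three setup like this, losing one key does not lock up the funds, and one compromised key is not enough to spend them.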
Deep Cold Storage
Deep cold storage ensures the entire transaction process happens in an offline environment. There are typically three elements to deep cold storage.
First, the wallet and private key are generated offline, and the signing of transactions happens on a system not connected to the internet in any manner. This ensures the private key is never exposed to a potentially compromised system or connection.
Second, details are secured with encryption to ensure that even if the wallet file ends up in the wrong hands, the information is protected.
Third, storage of the encrypted wallet file or paper wallet is generally at a location or facility that has restricted access, such as a safety deposit box at a bank.
Deep cold storage is used to safeguard a large individual cryptocurrency portfolio held for the long term, or for trustees holding cryptocurrency on behalf of others, and is possibly the safest method to ensure a crypto investment remains secure.
Keep Your Software Up to Date
You should always make sure that you are using the latest version of your app or wallet software, which includes important stability and security fixes. Installing updates for all other software on your computer or mobile device is also important to keep your wallet environment safer.
One Last Thing: Think About Your Testament
Your cryptocurrency funds can be lost forever if you don’t have a backup plan for your peers and family. If the location of your wallets or your passwords is not known by anyone when you are gone, there is no hope that your funds will ever be recovered. Taking a bit of time on these matters can make a huge difference.
To the Moon*
Are you comfortable with how you’re managing and backing up your cryptocurrency wallets and keys? Do you have a suggestion for keeping your cryptocurrencies safe that we missed above? Please let us know in the comments.
*To the Moon — Crypto slang for a currency that reaches an optimistic price projection.
Join us this month to learn about some of the exciting new services and solution best practices at AWS. We also have our first re:Invent 2018 webinar series, “How to re:Invent”. Sign up now to learn more. We look forward to seeing you.
Note – All sessions are free and in Pacific Time.
Tech talks featured this month:
Analytics & Big Data
May 21, 2018 | 11:00 AM – 11:45 AM PT – Integrating Amazon Elasticsearch with your DevOps Tooling – Learn how you can easily integrate Amazon Elasticsearch Service into your DevOps tooling and gain valuable insight from your log data.
May 24, 2018 | 11:00 AM – 11:45 AM PT – Data Transformation Patterns in AWS – Discover how to perform common data transformations on the AWS Data Lake.
May 30, 2018 | 01:00 PM – 01:45 PM PT – Accelerating Life Sciences with HPC on AWS – Learn how you can accelerate your Life Sciences research workloads by harnessing the power of high performance computing on AWS.
Containers
May 24, 2018 | 01:00 PM – 01:45 PM PT – Building Microservices with the 12 Factor App Pattern on AWS – Learn best practices for building containerized microservices on AWS, and how traditional software design patterns evolve in the context of containers.
Databases
May 21, 2018 | 01:00 PM – 01:45 PM PT – How to Migrate from Cassandra to Amazon DynamoDB – Get the benefits, best practices and guides on how to migrate your Cassandra databases to Amazon DynamoDB.
May 23, 2018 | 01:00 PM – 01:45 PM PT – 5 Hacks for Optimizing MySQL in the Cloud – Learn how to optimize your MySQL databases for high availability, performance, and disaster resilience using RDS.
DevOps
May 23, 2018 | 09:00 AM – 09:45 AM PT – .NET Serverless Development on AWS – Learn how to build a modern serverless application in .NET Core 2.0.
Enterprise & Hybrid
May 22, 2018 | 11:00 AM – 11:45 AM PT – Hybrid Cloud Customer Use Cases on AWS – Learn how customers are leveraging AWS hybrid cloud capabilities to easily extend their datacenter capacity, deliver new services and applications, and ensure business continuity and disaster recovery.
IoT
May 31, 2018 | 11:00 AM – 11:45 AM PT – Using AWS IoT for Industrial Applications – Discover how you can quickly onboard your fleet of connected devices, keep them secure, and build predictive analytics with AWS IoT.
Machine Learning
May 22, 2018 | 09:00 AM – 09:45 AM PT – Using Apache Spark with Amazon SageMaker – Discover how to use Apache Spark with Amazon SageMaker for training jobs and application integration.
May 24, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS DeepLens – Learn how AWS DeepLens provides a new way for developers to learn machine learning by pairing the physical device with a broad set of tutorials, examples, source code, and integration with familiar AWS services.
May 30, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS Certificate Manager Private Certificate Authority (CA) – Learn how AWS Certificate Manager (ACM) Private Certificate Authority (CA), a managed private CA service, helps you easily and securely manage the lifecycle of your private certificates.
June 1, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS Firewall Manager – Centrally configure and manage AWS WAF rules across your accounts and applications.
May 30, 2018 | 11:00 AM – 11:45 AM PT – Accelerate Productivity by Computing at the Edge – Learn how AWS Snowball Edge support for compute instances helps accelerate data transfers, execute custom applications, and reduce overall storage costs.
We have a new resource available to help you meet a requirement for physically-separated infrastructure using logical separation in the AWS cloud. Our latest guide, Logical Separation: An Evaluation of the U.S. Department of Defense Cloud Security Requirements for Sensitive Workloads outlines how AWS meets the U.S. Department of Defense’s (DoD) stringent physical separation requirement by pioneering a three-pronged logical separation approach that leverages virtualization, encryption, and deploying compute to dedicated hardware.
This guide explains logical separation in the cloud and demonstrates its advantages over a traditional physical separation model. Embracing this approach can help organizations confidently meet or exceed security requirements found in traditional on-premises environments, while also providing increased security control and flexibility.
Logical Separation is the second guide in the AWS Government Handbook Series, which examines cybersecurity policy initiatives and identifies best practices.
If you have questions or want to learn more, contact your account executive or AWS Support.
Most likely you’ve read the tantalizing stories of big gains from investing in cryptocurrencies. Someone who invested $1,000 into bitcoins five years ago would have over $85,000 in value now. Alternatively, someone who invested in bitcoins three months ago would have seen their investment lose 20% in value. Beyond the big price fluctuations, currency holders are possibly exposed to fraud, bad business practices, and even risk losing their holdings altogether if they are careless in keeping track of the all-important currency keys.
It’s certain that beyond the rewards and risks, cryptocurrencies are here to stay. We can’t ignore how they are changing the game for how money is handled between people and businesses.
Some Advantages of Cryptocurrency
Cryptocurrency is accessible to anyone.
Decentralization means the network operates on a user-to-user (or peer-to-peer) basis.
Transactions can be completed for a fraction of the expense and time required to complete traditional asset transfers.
Transactions are digital and cannot be counterfeited or reversed arbitrarily by the sender, as with credit card charge-backs.
There aren’t usually transaction fees for cryptocurrency exchanges.
Cryptocurrency allows the cryptocurrency holder to send exactly what information is needed and no more to the merchant or recipient, even permitting anonymous transactions (for good or bad).
Cryptocurrency operates at the universal level and hence makes transactions easier internationally.
There is no other electronic cash system in which your account isn’t owned by someone else.
On top of all that, blockchain, the underlying technology behind cryptocurrencies, is already being applied to a variety of business needs and is itself becoming a hot sector of the tech economy. Its applications include traceability and cost-effectiveness in supply-chain management (which also improves quality assurance in areas such as food), reduced errors and improved accounting accuracy, smart contracts that can be automatically validated, signed, and enforced through a blockchain construct, the possibility of secure online voting, and many others.
As with any new, booming market, there are risks involved in these new currencies. Anyone venturing into this domain needs to have their eyes wide open. While the opportunities for making money are real, there are even more ways to lose money.
We’re going to cover two primary approaches to staying safe and avoiding fraud and loss when dealing with cryptocurrencies. The first is to thoroughly vet any person or company you’re dealing with to judge whether they are ethical and likely to succeed in their business segment. The second is keeping your critical cryptocurrency keys safe, which we’ll deal with in this and a subsequent post.
Caveat Emptor — Buyer Beware
The short history of cryptocurrency has already seen the demise of a number of companies that claimed to manage, mine, trade, or otherwise help their customers profit from cryptocurrency. Mt. Gox, GAW Miners, and OneCoin are just three of the many companies that disappeared with their users’ money. This is the traditional equivalent of your bank going out of business and zeroing out your checking account in the process.
That doesn’t happen with banks because of regulatory oversight. But with cryptocurrency, you need to take the time to investigate any company you use to manage or trade your currencies. How long have they been around? Who are their investors? Are they affiliated with any reputable financial institutions? What is the record of their founders and executive management? These are all important questions to consider when evaluating a company in this new space.
Would you give the keys to your house to a service or person you didn’t thoroughly know and trust? Some companies that enable you to buy and sell currencies online will routinely hold your currency keys, which gives them the ability to do anything they want with your holdings, including selling them and pocketing the proceeds if they wish.
That doesn’t mean you shouldn’t ever allow a company to keep your currency keys in escrow. It simply means that you had better know with whom you’re doing business and whether they’re trustworthy enough to be given that responsibility.
Keys To the Cryptocurrency Kingdom — Public and Private
If you’re an owner of cryptocurrency, you know how this all works. If you’re not, bear with me for a minute while I bring everyone up to speed.
Cryptocurrency has no physical manifestation, such as bills or coins. It exists purely as a computer record. And unlike currencies maintained by governments, such as the U.S. dollar, there is no central authority regulating its distribution and value. Cryptocurrencies use a technology called blockchain, which is a decentralized way of keeping track of transactions. There are many copies of a given blockchain, so no single central authority is needed to validate its authenticity or accuracy.
The validity of each cryptocurrency is determined by a blockchain. A blockchain is a continuously growing list of records, called “blocks”, which are linked and secured using cryptography. Blockchains by design are inherently resistant to modification of the data. They perform as an open, distributed ledger that can record transactions between two parties efficiently and in a verifiable, permanent way. A blockchain is typically managed by a peer-to-peer network collectively adhering to a protocol for validating new blocks. Once recorded, the data in any given block cannot be altered retroactively without the alteration of all subsequent blocks, which requires collusion of the network majority. On a large network, this level of collusion is extremely difficult to achieve, which makes blockchain networks effectively immutable and trustworthy.
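To see why changing one block means changing every block after it, here is a toy hash chain in Python. It is a bare-bones illustration of linking records with cryptographic hashes, with no network, mining, or consensus, and it is not the format of any real blockchain. The function and field names are assumptions made for the example.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # stand-in "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link each record to the one before it through its hash."""
    chain = []
    prev_hash = GENESIS_PREV_HASH
    for index, data in enumerate(records):
        current = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev_hash": prev_hash, "hash": current}
        )
        prev_hash = current
    return chain


def is_valid(chain):
    """Recompute every hash and check every link between blocks."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev_hash"]):
            return False
        prev_hash = block["hash"]
    return True


if __name__ == "__main__":
    chain = build_chain(["A pays B 1", "B pays C 2", "C pays A 3"])
    print(is_valid(chain))
    chain[0]["data"] = "A pays B 100"  # tamper with an early record
    print(is_valid(chain))
```

Even if the tampered block's own hash is recomputed, the next block still points at the old hash, so the chain fails validation until every later block is rewritten too.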
The other element common to all cryptocurrencies is their use of public and private keys, which are stored in the currency’s wallet. A cryptocurrency wallet stores the public and private “keys” or “addresses” that can be used to receive or spend the cryptocurrency. With the private key, it is possible to write in the public ledger (blockchain), effectively spending the associated cryptocurrency. With the public key, it is possible for others to send currency to the wallet.
Cryptocurrency “coins” can be lost if the owner loses the private keys needed to spend the currency they own. It’s as if the owner had lost a bank account number and had no way to verify their identity to the bank, or if they lost the U.S. dollars they had in their wallet. The assets are gone and unusable.
The Cryptocurrency Wallet
Given the importance of these keys, and lack of recourse if they are lost, it’s obviously very important to keep track of your keys.
If you’re being careful in choosing reputable exchanges, app developers, and other services with whom to trust your cryptocurrency, you’ve made a good start in keeping your investment secure. But if you’re careless in managing the keys to your bitcoins, ether, Litecoin, or other cryptocurrency, you might as well leave your money on a cafe tabletop and walk away.
What Are the Differences Between Hot and Cold Wallets?
Just like other numbers you might wish to keep track of — credit cards, account numbers, phone numbers, passphrases — cryptocurrency keys can be stored in a variety of ways. Those who use their currencies for day-to-day purchases most likely will want them handy in a smartphone app, hardware key, or debit card that can be used for purchases. These are called “hot” wallets. Some experts advise keeping the balances in these devices and apps to a minimal amount to avoid hacking or data loss. We typically don’t walk around with thousands of dollars in U.S. currency in our old-style wallets, so this is really a continuation of the same approach to managing spending money.
A “hot” wallet, the Bread mobile app
Some investors with large balances keep their keys in “cold” wallets, or “cold storage,” i.e. a device or location that is not connected online. If funds are needed for purchases, they can be transferred to a more easily used payment medium. Cold wallets can be hardware devices, USB drives, or even paper copies of your keys.
A “cold” wallet, the Trezor hardware wallet
A “cold” wallet, the Ledger Nano S
A “cold” Bitcoin paper wallet
Wallets are suited to holding one or more specific cryptocurrencies, and some people have multiple wallets for different currencies and different purposes.
A paper wallet is nothing other than a printed record of your public and private keys. Some prefer their records to be completely disconnected from the internet, and a piece of paper serves that need. Just like writing down an account password on paper, however, it’s essential to keep the paper secure to avoid giving someone the ability to freely access your funds.
How to Keep Your Keys and Cryptocurrency Secure
In a post this coming Thursday, Securing Your Cryptocurrency, we’ll discuss the best strategies for backing up your cryptocurrency so that your currencies don’t become part of the millions that have been lost. We’ll cover the common (and uncommon) approaches to backing up hot wallets, cold wallets, and using paper and metal solutions to keeping your keys safe.
In the meantime, please tell us of your experiences with cryptocurrencies — good and bad — and how you’ve dealt with the issue of cryptocurrency security.
Many companies across the globe use Amazon DynamoDB to store and query historical user-interaction data. DynamoDB is a fast NoSQL database used by applications that need consistent, single-digit millisecond latency.
Often, customers want to turn their valuable data in DynamoDB into insights by analyzing a copy of their table stored in Amazon S3. Doing this separates their analytical queries from their low-latency critical paths. This data can be the primary source for understanding customers’ past behavior, predicting future behavior, and generating downstream business value. Customers often turn to DynamoDB because of its great scalability and high availability. After a successful launch, many customers want to use the data in DynamoDB to predict future behaviors or provide personalized recommendations.
DynamoDB is a good fit for low-latency reads and writes, but it’s not practical to scan all data in a DynamoDB database to train a model. In this post, I demonstrate how you can use DynamoDB table data copied to Amazon S3 by AWS Data Pipeline to predict customer behavior. I also demonstrate how you can use this data to provide personalized recommendations for customers using Amazon SageMaker. You can also run ad hoc queries using Amazon Athena against the data. DynamoDB recently released on-demand backups to create full table backups with no performance impact. However, that feature is not suitable for our purposes in this post, so I chose AWS Data Pipeline instead to create managed backups that are accessible from other services.
To do this, I describe how to read the DynamoDB backup file format in Data Pipeline. I also describe how to convert the objects in S3 to a CSV format that Amazon SageMaker can read. In addition, I show how to schedule regular exports and transformations using Data Pipeline. The sample data used in this post is from the Bank Marketing Data Set in the UCI Machine Learning Repository.
The solution that I describe provides the following benefits:
Separates analytical queries from production traffic on your DynamoDB table, preserving your DynamoDB read capacity units (RCUs) for important production requests
Automatically updates your model to get real-time predictions
Optimizes for performance (so it doesn’t compete with DynamoDB RCUs after the export) and for cost (using data you already have)
Makes it easier for developers of all skill levels to use Amazon SageMaker
All of the code and the dataset used in this post are available in this .zip file.
Solution architecture
The following diagram shows the overall architecture of the solution.
The steps that data follows through the architecture are as follows:
Data Pipeline regularly copies the full contents of a DynamoDB table as JSON into an S3 bucket.
Exported JSON files are converted to comma-separated value (CSV) format to use as a data source for Amazon SageMaker.
Amazon SageMaker renews the model artifact and updates the endpoint.
The converted CSV is available for ad hoc queries with Amazon Athena.
Data Pipeline controls this flow and repeats the cycle based on the schedule defined by customer requirements.
Building the auto-updating model
This section discusses details about how to read the DynamoDB exported data in Data Pipeline and build automated workflows for real-time prediction with a regularly updated model.
Find the automation_script.sh file and edit it for your environment. For example, you need to replace 's3://<your bucket>/<datasource path>/' with your own S3 path to the data source for Amazon ML. In the script, the text enclosed by angle brackets—< and >—should be replaced with your own path.
Upload the json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar file to your S3 path so that the ADD jar command in Apache Hive can refer to it.
For this solution, the banking.csv should be imported into a DynamoDB table.
Export a DynamoDB table
To export the DynamoDB table to S3, open the Data Pipeline console and choose the Export DynamoDB table to S3 template. In this template, Data Pipeline creates an Amazon EMR cluster and performs an export in the EMRActivity activity. Set proper intervals for backups according to your business requirements.
One core node (m3.xlarge) provides the default capacity for the EMR cluster and should be suitable for the solution in this post. Leave the option to resize the cluster before running enabled in the TableBackupActivity activity to let Data Pipeline scale the cluster to match the table size. The process of converting to CSV format and renewing models happens in this EMR cluster.
For a more in-depth look at how to export data from DynamoDB, see Export Data from DynamoDB in the Data Pipeline documentation.
Add the script to an existing pipeline
After you export your DynamoDB table, you add an additional EMR step to EMRActivity by following these steps:
Open the Data Pipeline console and choose the ID for the pipeline that you want to add the script to.
For Actions, choose Edit.
In the editing console, choose the Activities category and add an EMR step using the custom script downloaded in the previous section, as shown below.
Paste the following command into the new step after the data upload step:
The element #{output.directoryPath} references the S3 path where the data pipeline exports DynamoDB data as JSON. The path should be passed to the script as an argument.
The bash script has two goals: converting the data format and renewing the Amazon SageMaker model. Subsequent sections discuss the contents of the automation script.
Automation script: Convert JSON data to CSV with Hive
We use Apache Hive to transform the data into a new format. The Hive QL script to create an external table and transform the data is included in the custom script that you added to the Data Pipeline definition.
When you run the Hive scripts, do so with the -e option. Also, define the Hive table with the 'org.openx.data.jsonserde.JsonSerDe' row format to parse and read JSON format. The SQL creates a Hive EXTERNAL table, and it reads the DynamoDB backup data on the S3 path passed to it by Data Pipeline.
Note: You should create the table with the “EXTERNAL” keyword to avoid the backup data being accidentally deleted from S3 if you drop the table.
The full automation script for converting follows. Add your own bucket name and data source path in the highlighted areas.
After creating an external table, you need to read data. You then use the INSERT OVERWRITE DIRECTORY ~ SELECT command to write CSV data to the S3 path that you designated as the data source for Amazon SageMaker.
Depending on your requirements, you can eliminate or process the columns in the SELECT clause in this step to optimize data analysis. For example, you might remove some columns that have unpredictable correlations with the target value because keeping the wrong columns might expose your model to “overfitting” during the training. In this post, the customer_id column is removed. Overfitting can make your predictions weak. More information about overfitting can be found in the topic Model Fit: Underfitting vs. Overfitting in the Amazon ML documentation.
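To make the conversion step concrete, here is a minimal Python sketch of the same idea outside Hive. It assumes each exported line is a JSON object whose attribute values are wrapped in a one-key type descriptor (for example {"n": "30"} for a number); the column list, sample records, and function names are hypothetical and are not taken from the automation script.

```python
import csv
import io
import json

# Hypothetical subset of columns to keep, in output order. customer_id is
# deliberately left out, mirroring the column removal described above.
COLUMNS = ["age", "job", "marital", "y"]


def unwrap(attribute):
    """Return the bare value from a typed attribute such as {"n": "30"}."""
    # Assumption: every attribute is a one-key dict of {type: value}.
    return next(iter(attribute.values()))


def export_lines_to_csv(lines, columns=COLUMNS):
    """Convert exported JSON lines to CSV text without a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        item = json.loads(line)
        # Missing attributes become empty fields so column positions stay fixed.
        writer.writerow([unwrap(item[c]) if c in item else "" for c in columns])
    return buffer.getvalue()


# Made-up sample records in the assumed export layout.
sample = [
    '{"customer_id":{"s":"c-001"},"age":{"n":"30"},"job":{"s":"admin."},"marital":{"s":"married"},"y":{"n":"0"}}',
    '{"customer_id":{"s":"c-002"},"age":{"n":"41"},"job":{"s":"technician"},"marital":{"s":"single"},"y":{"n":"1"}}',
]
print(export_lines_to_csv(sample), end="")
```

Selecting columns by name in one place makes it easy to drop an attribute later without touching the conversion logic.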
Automation script: Renew the Amazon SageMaker model
After the CSV data is replaced and ready to use, create a new model artifact for Amazon SageMaker with the updated dataset on S3. To renew the model artifact, you must create a new training job. You can run training jobs using the AWS SDK (for example, boto3 for Amazon SageMaker), the Amazon SageMaker Python SDK (which can be installed with the “pip install sagemaker” command), or the AWS CLI for Amazon SageMaker, which is the approach described in this post.
In addition, consider how to smoothly renew your existing model without service impact, because your model is called by applications in real time. To do this, you need to create a new endpoint configuration first and update a current endpoint with the endpoint configuration that is just created.
#!/bin/bash
## Define variable
REGION=$2
DTTIME=`date +%Y-%m-%d-%H-%M-%S`
ROLE="<your AmazonSageMaker-ExecutionRole>"
# Select containers image based on region.
case "$REGION" in
"us-west-2" )
IMAGE="174872318107.dkr.ecr.us-west-2.amazonaws.com/linear-learner:latest"
;;
"us-east-1" )
IMAGE="382416733822.dkr.ecr.us-east-1.amazonaws.com/linear-learner:latest"
;;
"us-east-2" )
IMAGE="404615174143.dkr.ecr.us-east-2.amazonaws.com/linear-learner:latest"
;;
"eu-west-1" )
IMAGE="438346466558.dkr.ecr.eu-west-1.amazonaws.com/linear-learner:latest"
;;
*)
echo "Invalid Region Name"
exit 1 ;
esac
# Start training job and creating model artifact
TRAINING_JOB_NAME=TRAIN-${DTTIME}
S3OUTPUT="s3://<your bucket name>/model/"
INSTANCETYPE="ml.m4.xlarge"
INSTANCECOUNT=1
VOLUMESIZE=5
aws sagemaker create-training-job --training-job-name ${TRAINING_JOB_NAME} --region ${REGION} --algorithm-specification TrainingImage=${IMAGE},TrainingInputMode=File --role-arn ${ROLE} --input-data-config '[{ "ChannelName": "train", "DataSource": { "S3DataSource": { "S3DataType": "S3Prefix", "S3Uri": "s3://<your bucket name>/<datasource path>/", "S3DataDistributionType": "FullyReplicated" } }, "ContentType": "text/csv", "CompressionType": "None" , "RecordWrapperType": "None" }]' --output-data-config S3OutputPath=${S3OUTPUT} --resource-config InstanceType=${INSTANCETYPE},InstanceCount=${INSTANCECOUNT},VolumeSizeInGB=${VOLUMESIZE} --stopping-condition MaxRuntimeInSeconds=120 --hyper-parameters feature_dim=20,predictor_type=binary_classifier
# Wait until job completed
aws sagemaker wait training-job-completed-or-stopped --training-job-name ${TRAINING_JOB_NAME} --region ${REGION}
# Get newly created model artifact and create model
MODELARTIFACT=`aws sagemaker describe-training-job --training-job-name ${TRAINING_JOB_NAME} --region ${REGION} --query 'ModelArtifacts.S3ModelArtifacts' --output text `
MODELNAME=MODEL-${DTTIME}
aws sagemaker create-model --region ${REGION} --model-name ${MODELNAME} --primary-container Image=${IMAGE},ModelDataUrl=${MODELARTIFACT} --execution-role-arn ${ROLE}
# create a new endpoint configuration
CONFIGNAME=CONFIG-${DTTIME}
aws sagemaker create-endpoint-config --region ${REGION} --endpoint-config-name ${CONFIGNAME} --production-variants VariantName=Users,ModelName=${MODELNAME},InitialInstanceCount=1,InstanceType=ml.m4.xlarge
# create or update the endpoint
STATUS=`aws sagemaker describe-endpoint --endpoint-name ServiceEndpoint --query 'EndpointStatus' --output text --region ${REGION} `
if [[ "$STATUS" != "InService" ]] ;
then
aws sagemaker create-endpoint --endpoint-name ServiceEndpoint --endpoint-config-name ${CONFIGNAME} --region ${REGION}
else
aws sagemaker update-endpoint --endpoint-name ServiceEndpoint --endpoint-config-name ${CONFIGNAME} --region ${REGION}
fi
Grant permission
Before you execute the script, you must grant proper permission to Data Pipeline. Data Pipeline uses the DataPipelineDefaultResourceRole role by default. I added the following policy to DataPipelineDefaultResourceRole to allow Data Pipeline to create, delete, and update the Amazon SageMaker model and data source in the script.
After you deploy a model into production using Amazon SageMaker hosting services, your client applications use the InvokeEndpoint API to get inferences from the model hosted at the specified endpoint. This approach is useful for interactive web, mobile, or desktop applications.
The following simple Python code example queries the Amazon SageMaker endpoint by its name (“ServiceEndpoint”) and then uses the response for real-time prediction.
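As a rough illustration of such a client, the sketch below sends one CSV record to the endpoint through the SageMaker runtime invoke_endpoint call and parses the JSON reply. The helper names, the stub client, the feature values, and the response payload are all assumptions made for this example so that it runs offline; with real credentials you would pass boto3.client("sagemaker-runtime") instead of the stub.

```python
import io
import json


def to_csv_row(features):
    """Serialize one record as a single CSV line for a text/csv request body."""
    return ",".join(str(value) for value in features)


def predict(runtime_client, features, endpoint_name="ServiceEndpoint"):
    """Send one record to the endpoint and return the parsed JSON response."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=to_csv_row(features),
    )
    # The real client returns a streaming body; read it fully before parsing.
    return json.loads(response["Body"].read().decode("utf-8"))


class StubRuntime:
    """Offline stand-in for the real client; the response payload is made up."""

    def invoke_endpoint(self, **request):
        self.last_request = request
        return {"Body": io.BytesIO(b'{"predictions": [{"score": 0.5}]}')}


stub = StubRuntime()
result = predict(stub, [35, 1, 0, 2.5])  # hypothetical, already-numeric features
print(stub.last_request["Body"])  # the CSV line that was sent
print(result)
```

Passing the client in as a parameter keeps the request-building logic testable without network access or AWS credentials.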
Data Pipeline exports DynamoDB table data into S3. The original JSON data should be kept to recover the table in the rare event that this is needed. Data Pipeline then converts JSON to CSV so that Amazon SageMaker can read the data. Note: You should select only meaningful attributes when you convert to CSV. For example, if you judge that the “campaign” attribute is not correlated, you can eliminate this attribute from the CSV.
Train the Amazon SageMaker model with the new data source.
When a new customer comes to your site, you can judge how likely it is for this customer to subscribe to your new product based on “predictedScores” provided by Amazon SageMaker.
If the new user subscribes to your new product, your application must update the attribute “y” to the value 1 (for yes). This updated data is provided for the next model renewal as a new data source. It serves to improve the accuracy of your prediction. With each new entry, your application can become smarter and deliver better predictions.
Running ad hoc queries using Amazon Athena
Amazon Athena is a serverless query service that makes it easy to analyze large amounts of data stored in Amazon S3 using standard SQL. Athena is useful for examining data and collecting statistics or informative summaries about data. You can also use the powerful analytic functions of Presto, as described in the topic Aggregate Functions of Presto in the Presto documentation.
With the Data Pipeline scheduled activity, recent CSV data is always located in S3 so that you can run ad hoc queries against the data using Amazon Athena. I show this with example SQL statements following. For an in-depth description of this process, see the post Interactive SQL Queries for Data in Amazon S3 on the AWS News Blog.
Creating an Amazon Athena table and running it
You can create an EXTERNAL table for the CSV data on S3 in the Amazon Athena console.
=== Table Creation ===
CREATE EXTERNAL TABLE datasource (
age int,
job string,
marital string ,
education string,
default string,
housing string,
loan string,
contact string,
month string,
day_of_week string,
duration int,
campaign int,
pdays int ,
previous int ,
poutcome string,
emp_var_rate double,
cons_price_idx double,
cons_conf_idx double,
euribor3m double,
nr_employed double,
y int
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',' ESCAPED BY '\\' LINES TERMINATED BY '\n'
LOCATION 's3://<your bucket name>/<datasource path>/';
The following query calculates the correlation coefficient between the target attribute and other attributes using Amazon Athena.
=== Sample Query ===
SELECT corr(age,y) AS correlation_age_and_target,
corr(duration,y) AS correlation_duration_and_target,
corr(campaign,y) AS correlation_campaign_and_target,
corr(contact,y) AS correlation_contact_and_target
FROM ( SELECT age , duration , campaign , y ,
CASE WHEN contact = 'telephone' THEN 1 ELSE 0 END AS contact
FROM datasource
) datasource ;
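For readers who want to see what the corr aggregate computes, the following Python sketch calculates the same Pearson correlation coefficient on a tiny in-memory sample, including the telephone-to-1 encoding used in the subquery. The function name and the sample values are made up for illustration and are not drawn from the dataset.

```python
import math


def pearson_corr(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


# Made-up rows: contact method and whether the customer subscribed (y).
contact = ["telephone", "cellular", "telephone", "cellular"]
y = [1, 0, 1, 0]

# Same encoding as the CASE expression: telephone -> 1, anything else -> 0.
contact_flag = [1 if c == "telephone" else 0 for c in contact]
print(pearson_corr(contact_flag, y))
```

Values near 1 or -1 suggest a strong linear relationship with the target, while values near 0 suggest the attribute is a candidate for removal from the CSV.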
Conclusion
In this post, I introduce an example of how to analyze data in DynamoDB by using table data in Amazon S3 to optimize DynamoDB table read capacity. You can then use the analyzed data as a new data source to train an Amazon SageMaker model for accurate real-time prediction. In addition, you can run ad hoc queries against the data on S3 using Amazon Athena. I also present how to automate these procedures by using Data Pipeline.
You can adapt this example to your specific use case at hand, and hopefully this post helps you accelerate your development. You can find more examples and use cases for Amazon SageMaker in the video AWS 2017: Introducing Amazon SageMaker on the AWS website.
Yong Seong Lee is a Cloud Support Engineer for AWS Big Data Services. He is interested in every technology related to data/databases and helping customers who have difficulties in using AWS services. His motto is “Enjoy life, be curious and have maximum experience.”
AWS Config enables continuous monitoring of your AWS resources, making it simple to assess, audit, and record resource configurations and changes. AWS Config does this through the use of rules that define the desired configuration state of your AWS resources. AWS Config provides a number of AWS managed rules that address a wide range of security concerns such as checking if you encrypted your Amazon Elastic Block Store (Amazon EBS) volumes, tagged your resources appropriately, and enabled multi-factor authentication (MFA) for root accounts. You can also create custom rules to codify your compliance requirements through the use of AWS Lambda functions.
In this post we’ll show you how to use AWS Config to monitor our Amazon Simple Storage Service (S3) bucket ACLs and policies for violations which allow public read or public write access. If AWS Config finds a policy violation, we’ll have it trigger an Amazon CloudWatch Event rule to trigger an AWS Lambda function which either corrects the S3 bucket ACL, or notifies you via Amazon Simple Notification Service (Amazon SNS) that the policy is in violation and allows public read or public write access. We’ll show you how to do this in five main steps.
Enable AWS Config to monitor Amazon S3 bucket ACLs and policies for compliance violations.
Create an IAM Role and Policy that grants a Lambda function permissions to read S3 bucket policies and send alerts through SNS.
Create and configure a CloudWatch Events rule that triggers the Lambda function when AWS Config detects an S3 bucket ACL or policy violation.
Create a Lambda function that uses the IAM role to review S3 bucket ACLs and policies, correct the ACLs, and notify your team of out-of-compliance policies.
Verify the monitoring solution.
Note: This post assumes your compliance policies require the buckets you monitor not allow public read or write access. If you have intentionally open buckets serving static content, for example, you can use this post as a jumping-off point for a solution tailored to your needs.
At the end of this post, we provide an AWS CloudFormation template that implements the solution outlined. The template enables you to deploy the solution in multiple regions quickly.
Important: The use of some of the resources deployed, including those deployed using the provided CloudFormation template, will incur costs as long as they are in use. AWS Config Rules incur costs in each region they are active.
Architecture
Here’s an architecture diagram of what we’ll implement:
Figure 1: Architecture diagram
Step 1: Enable AWS Config and Amazon S3 Bucket monitoring
The following steps demonstrate how to set up AWS Config to monitor Amazon S3 buckets.
If this is your first time using AWS Config, select Get started. If you’ve already used AWS Config, select Settings.
In the Settings page, under Resource types to record, clear the All resources checkbox. In the Specific types list, select Bucket under S3.
Figure 2: The Settings dialog box showing the “Specific types” list
Choose the Amazon S3 bucket for storing configuration history and snapshots. We’ll create a new Amazon S3 bucket.
Figure 3: Creating an S3 bucket
If you prefer to use an existing Amazon S3 bucket in your account, select the Choose a bucket from your account radio button and, using the dropdown, select an existing bucket.
Figure 4: Selecting an existing S3 bucket
Under Amazon SNS topic, check the box next to Stream configuration changes and notifications to an Amazon SNS topic, and then select the radio button to Create a topic.
Alternatively, you can choose a topic that you have previously created and subscribed to.
Figure 5: Selecting a topic that you’ve previously created and subscribed to
If you created a new SNS topic you need to subscribe to it to receive notifications. We’ll cover this in a later step.
Under AWS Config role, choose Create a role (unless you already have a role you want to use). We’re using the auto-suggested role name.
Figure 6: Creating a role
Select Next.
Configure Amazon S3 bucket monitoring rules:
On the AWS Config rules page, search for S3 and choose the s3-bucket-public-read-prohibited and s3-bucket-public-write-prohibited rules, then click Next.
Figure 7: AWS Config rules dialog
On the Review page, select Confirm. AWS Config is now analyzing your Amazon S3 buckets, capturing their current configurations, and evaluating the configurations against the rules we selected.
If you created a new Amazon SNS topic, open the Amazon SNS Management Console and locate the topic you created:
Figure 8: Amazon SNS topic list
Copy the ARN of the topic (the string that begins with arn:) because you’ll need it in a later step.
Select the checkbox next to the topic, and then, under the Actions menu, select Subscribe to topic.
Select Email as the protocol, enter your email address, and then select Create subscription.
After several minutes, you’ll receive an email asking you to confirm your subscription for notifications for this topic. Select the link to confirm the subscription.
Step 2: Create a Role for Lambda
Our Lambda function will need permissions that enable it to inspect and modify Amazon S3 bucket ACLs and policies, log to CloudWatch Logs, and publish to an Amazon SNS topic. We’ll now set up a custom AWS Identity and Access Management (IAM) policy and role to support these actions and assign them to the Lambda function we’ll create in the next section.
In the AWS Management Console, under Services, select IAM to access the IAM Console.
Create a policy with the following permissions, or copy the following policy:
Select Lambda from the list of services that will use this role.
Select the check box next to the policy you created previously, and then select Next: Review
Name your role, give it a description, and then select Create Role. In this example, we’re naming the role LambdaS3PolicySecuringRole.
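As a rough guide to what a policy covering those permissions might contain, the Python sketch below builds a policy document and prints it as JSON for pasting into the IAM console. This is an assumption-based example: the statement IDs, the broad resource scopes, and the placeholder topic ARN are illustrative, and the actions were chosen to match the calls the Lambda function makes (reading bucket ACLs and policies, resetting ACLs, publishing to SNS, and writing logs).

```python
import json


def build_lambda_policy(topic_arn):
    """Return an IAM policy document (as a dict) for the responder function."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InspectAndFixBuckets",
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketAcl",
                    "s3:GetBucketPolicy",
                    "s3:PutBucketAcl",
                ],
                # Assumption: all buckets in the account are in scope.
                "Resource": "arn:aws:s3:::*",
            },
            {
                "Sid": "PublishAlerts",
                "Effect": "Allow",
                "Action": "sns:Publish",
                "Resource": topic_arn,
            },
            {
                "Sid": "WriteLogs",
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "arn:aws:logs:*:*:*",
            },
        ],
    }


# Placeholder ARN; substitute the ARN of the topic you created in Step 1.
policy = build_lambda_policy("arn:aws:sns:us-east-1:123456789012:example-topic")
print(json.dumps(policy, indent=2))
```

Scoping sns:Publish to the single topic keeps the role from publishing anywhere else; you may want to narrow the S3 resource in the same way.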
Step 3: Create and Configure a CloudWatch Rule
In this section, we’ll create a CloudWatch Rule to trigger the Lambda function when AWS Config determines that your Amazon S3 buckets are non-compliant.
In the AWS Management Console, under Services, select CloudWatch.
On the left-hand side, under Events, select Rules.
Click Create rule.
In Step 1: Create rule, under Event Source, select the dropdown list and select Build custom event pattern.
Copy the following pattern and paste it into the text box:
The pattern matches events generated by AWS Config when it checks the Amazon S3 bucket for public accessibility.
We’ll add a Lambda target later. For now, select your Amazon SNS topic created earlier, and then select Configure details.
Figure 9: The “Create rule” dialog
Give your rule a name and description. For this example, we’ll name ours AWSConfigFoundOpenBucket
Click Create rule.
Step 4: Create a Lambda Function
In this section, we’ll create a new Lambda function to examine an Amazon S3 bucket’s ACL and bucket policy. If the bucket ACL is found to allow public access, the Lambda function overwrites it to be private. If a bucket policy is found, the Lambda function creates an SNS message, puts the policy in the message body, and publishes it to the Amazon SNS topic we created. Bucket policies can be complex, and overwriting your policy may cause unexpected loss of access, so this Lambda function doesn’t attempt to alter your policy in any way.
Get the ARN of the Amazon SNS topic created earlier.
In the AWS Management Console, under Services, select Lambda to go to the Lambda Console.
From the Dashboard, select Create Function. Or, if you were taken directly to the Functions page, select the Create Function button in the upper-right.
On the Create function page:
Choose Author from scratch.
Provide a name for the function. We’re using AWSConfigOpenAccessResponder.
The Lambda function we’ve written is Python 3.6 compatible, so in the Runtime dropdown list, select Python 3.6.
Under Role, select Choose an existing role. Select the role you created in the previous section, and then select Create function.
Figure 10: The “Create function” dialog
We’ll now add a CloudWatch Event based on the rule we created earlier.
In the Add triggers section, select CloudWatch Events. A CloudWatch Events box should appear connected to the left side of the Lambda Function and have a note that states Configuration required.
Figure 11: CloudWatch Events in the “Add triggers” section
From the Rule dropdown box, choose the rule you created earlier, and then select Add.
Scroll up to the Designer section and select the name of your Lambda function.
Delete the default code and paste in the following code:
import boto3
from botocore.exceptions import ClientError
import json
import os

# Annotation text that AWS Config attaches to non-compliant evaluations
ACL_RD_WARNING = "The S3 bucket ACL allows public read access."
PLCY_RD_WARNING = "The S3 bucket policy allows public read access."
ACL_WRT_WARNING = "The S3 bucket ACL allows public write access."
PLCY_WRT_WARNING = "The S3 bucket policy allows public write access."
RD_COMBO_WARNING = ACL_RD_WARNING + PLCY_RD_WARNING
WRT_COMBO_WARNING = ACL_WRT_WARNING + PLCY_WRT_WARNING

def policyNotifier(bucketName, s3client):
    try:
        bucketPolicy = s3client.get_bucket_policy(Bucket = bucketName)
        # notify that the bucket policy may need to be reviewed due to security concerns
        sns = boto3.client('sns')
        subject = "Potential compliance violation in " + bucketName + " bucket policy"
        message = "Potential bucket policy compliance violation. Please review: " + json.dumps(bucketPolicy['Policy'])
        # send SNS message with warning and bucket policy
        response = sns.publish(
            TopicArn = os.environ['TOPIC_ARN'],
            Subject = subject,
            Message = message
        )
    except ClientError as e:
        # error caught due to no bucket policy
        print("No bucket policy found; no alert sent.")

def lambda_handler(event, context):
    # instantiate Amazon S3 client
    s3 = boto3.client('s3')
    resource = list(event['detail']['requestParameters']['evaluations'])[0]
    bucketName = resource['complianceResourceId']
    complianceFailure = event['detail']['requestParameters']['evaluations'][0]['annotation']
    # ACL violations (read or write): reset the bucket ACL to private
    if(complianceFailure == ACL_RD_WARNING or complianceFailure == ACL_WRT_WARNING):
        s3.put_bucket_acl(Bucket = bucketName, ACL = 'private')
    # policy violations: notify only, never modify the policy
    elif(complianceFailure == PLCY_RD_WARNING or complianceFailure == PLCY_WRT_WARNING):
        policyNotifier(bucketName, s3)
    # both ACL and policy violations: reset the ACL and notify
    elif(complianceFailure == RD_COMBO_WARNING or complianceFailure == WRT_COMBO_WARNING):
        s3.put_bucket_acl(Bucket = bucketName, ACL = 'private')
        policyNotifier(bucketName, s3)
    return 0 # done
Scroll down to the Environment variables section. This code uses an environment variable to store the Amazon SNS topic ARN.
For the key, enter TOPIC_ARN.
For the value, enter the ARN of the Amazon SNS topic created earlier.
Under Execution role, select Choose an existing role, and then select the role created earlier from the dropdown.
Leave everything else as-is, and then, at the top, select Save.
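Before relying on the function, it can help to check the event-parsing logic locally. The sketch below is a hypothetical, AWS-free approximation: it pulls the bucket name and annotation out of an event shaped the way the handler expects and decides which actions would apply. The sample event contains only the fields the handler reads, and the helper names and action labels are invented for this example.

```python
ACL_ANNOTATIONS = (
    "The S3 bucket ACL allows public read access.",
    "The S3 bucket ACL allows public write access.",
)
POLICY_ANNOTATIONS = (
    "The S3 bucket policy allows public read access.",
    "The S3 bucket policy allows public write access.",
)


def extract_violation(event):
    """Return (bucket name, annotation) from the first evaluation in the event."""
    evaluation = event["detail"]["requestParameters"]["evaluations"][0]
    return evaluation["complianceResourceId"], evaluation["annotation"]


def plan_actions(annotation):
    """List the responses that apply: reset the ACL, notify about the policy."""
    actions = []
    # Substring checks also cover annotations that combine both warnings.
    if any(text in annotation for text in ACL_ANNOTATIONS):
        actions.append("reset_acl")
    if any(text in annotation for text in POLICY_ANNOTATIONS):
        actions.append("notify_policy")
    return actions


# Hypothetical event trimmed to the fields the handler reads.
sample_event = {
    "detail": {
        "requestParameters": {
            "evaluations": [
                {
                    "complianceResourceId": "myconfigtestbucket",
                    "annotation": "The S3 bucket ACL allows public read access.",
                }
            ]
        }
    }
}

bucket, annotation = extract_violation(sample_event)
print(bucket, plan_actions(annotation))
```

Unlike the handler, which compares the annotation for exact equality, this sketch uses substring matching, so a changed separator between combined warnings would not cause a violation to be skipped.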
Step 5: Verify it Works
We now have the Lambda function, an Amazon SNS topic, AWS Config watching our Amazon S3 buckets, and a CloudWatch Rule to trigger the Lambda function if a bucket is found to be non-compliant. Let’s test them to make sure they work.
We have an Amazon S3 bucket, myconfigtestbucket, that’s been created in the region monitored by AWS Config, as well as the associated Lambda function. This bucket has no public read or write access set in an ACL or a policy, so it’s compliant.
Figure 12: The “Config Dashboard”
Let’s change the bucket’s ACL to allow public listing of objects:
Figure 13: Screen shot of “Permissions” tab showing Everyone granted list access
After saving, the bucket now has public access. After several minutes, the AWS Config Dashboard notes that there is one non-compliant resource:
Figure 14: The “Config Dashboard” shown with a non-compliant resource
In the Amazon S3 Console, we can see that the bucket no longer has public listing of objects enabled after the invocation of the Lambda function triggered by the CloudWatch Rule created earlier.
Figure 15: The “Permissions” tab showing list access no longer allowed
Notice that the AWS Config Dashboard now shows that there are no longer any non-compliant resources:
Figure 16: The “Config Dashboard” showing zero non-compliant resources
Now, let’s try out the Amazon S3 bucket policy check by configuring a bucket policy that allows list access:
Figure 17: A bucket policy that allows list access
A few minutes after setting this bucket policy on the myconfigtestbucket bucket, AWS Config recognizes the bucket is no longer compliant. Because this is a bucket policy rather than an ACL, we publish a notification to the SNS topic we created earlier that lets us know about the potential policy violation:
Figure 18: Notification about potential policy violation
Knowing that the policy allows open listing of the bucket, we can now modify or delete the policy, after which AWS Config will recognize that the resource is compliant.
Conclusion
In this post, we demonstrated how you can use AWS Config to monitor for Amazon S3 buckets with open read and write access ACLs and policies. We also showed how to use Amazon CloudWatch, Amazon SNS, and Lambda to overwrite a public bucket ACL, or to alert you should a bucket have a suspicious policy. You can use the CloudFormation template to deploy this solution in multiple regions quickly. With this approach, you will be able to easily identify and secure open Amazon S3 bucket ACLs and policies. Once you have deployed this solution to multiple regions you can aggregate the results using an AWS Config aggregator. See this post to learn more.
If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the AWS Config forum or contact AWS Support.
Want more AWS Security news? Follow us on Twitter.
As of March 31, 2018 we had 100,110 spinning hard drives. Of that number, there were 1,922 boot drives and 98,188 data drives. This review looks at the quarterly and lifetime statistics for the data drive models in operation in our data centers. We’ll also take a look at why we are collecting and reporting 10 new SMART attributes and take a sneak peek at some 8 TB Toshiba drives. Along the way, we’ll share observations and insights on the data presented and we look forward to you doing the same in the comments.
Background
Since April 2013, Backblaze has recorded and saved daily hard drive statistics from the drives in our data centers. Each entry consists of the date, manufacturer, model, serial number, status (operational or failed), and all of the SMART attributes reported by that drive. Currently there are about 97 million entries totaling 26 GB of data. You can download this data from our website if you want to do your own research, but for starters here’s what we found.
Hard Drive Reliability Statistics for Q1 2018
At the end of Q1 2018 Backblaze was monitoring 98,188 hard drives used to store data. For our evaluation below we remove from consideration those drives which were used for testing purposes and those drive models for which we did not have at least 45 drives. This leaves us with 98,046 hard drives. The table below covers just Q1 2018.
Notes and Observations
If a drive model has a failure rate of 0%, it only means there were no drive failures of that model during Q1 2018.
The overall Annualized Failure Rate (AFR) for Q1 is just 1.2%, well below the Q4 2017 AFR of 1.65%. Remember that quarterly failure rates can be volatile, especially for models that have a small number of drives and/or a small number of Drive Days.
There were 142 drives (98,188 minus 98,046) that were not included in the list above because we did not have at least 45 of a given drive model. We use 45 drives of the same model as the minimum number when we report quarterly, yearly, and lifetime drive statistics.
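The AFR figures in these notes can be reproduced from two counts per drive model: the number of failures and the number of Drive Days. The following is a minimal sketch, assuming the commonly used definition (failures divided by drive years, expressed as a percentage). The function name and the sample numbers are illustrative and are not taken from the Q1 table.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return the annualized failure rate as a percentage.

    Assumes AFR = failures / (drive_days / 365) * 100.
    """
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return 100.0 * failures / drive_years


if __name__ == "__main__":
    # Hypothetical model: 10 failures over 365,000 drive days (1,000 drive years).
    print(f"{annualized_failure_rate(10, 365_000):.2f}%")
```

Because the denominator is Drive Days rather than drive count, a model with few drives or a short time in service has very few drive years, which is why a single failure can swing its quarterly rate sharply.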
Welcome Toshiba 8TB drives, almost…
We mentioned Toshiba 8 TB drives in the first paragraph, but they don’t show up in the Q1 Stats chart. What gives? We only had 20 of the Toshiba 8 TB drives in operation in Q1, so they were excluded from the chart. Why do we have only 20 drives? When we test out a new drive model we start with the “tome test” and it takes 20 drives to fill one tome. A tome is the same drive model in the same logical position in each of the 20 Storage Pods that make up a Backblaze Vault. There are 60 tomes in each vault.
In this test, we created a Backblaze Vault of 8 TB drives, with 59 of the tomes being Seagate 8 TB drives and 1 tome being the Toshiba drives. Then we monitored the performance of the vault and its member tomes to see if, in this case, the Toshiba drives performed as expected.
So far the Toshiba drives are performing fine, but they have been in place for only 20 days. Next up is the “pod test” where we fill a Storage Pod with Toshiba drives and integrate it into a Backblaze Vault composed of like-sized drives. We hope to have a better look at the Toshiba 8 TB drives in our Q2 report, so stay tuned.
Lifetime Hard Drive Reliability Statistics
While the quarterly chart presented earlier gets a lot of interest, the real test of any drive model is over time. Below is the lifetime failure rate chart for all the hard drive models which have 45 or more drives in operation as of March 31st, 2018. For each model, we compute its reliability starting from when its drives were first installed.
Notes and Observations
The failure rates of all of the larger drives (8-, 10- and 12 TB) are very good, 1.2% AFR (Annualized Failure Rate) or less. Many of these drives were deployed in the last year, so there is some volatility in the data, but you can use the Confidence Interval to get a sense of the failure percentage range.
The overall failure rate of 1.84% is the lowest we have ever achieved, besting the previous low of 2.00% from the end of 2017.
Our regular readers and drive stats wonks may have noticed a sizable jump in the number of HGST 8 TB drives (model: HUH728080ALE600), from 45 last quarter to 1,045 this quarter. As the 10 TB and 12 TB drives become more available, the price per terabyte of the 8 TB drives has gone down. This presented an opportunity to purchase the HGST drives at a price in line with our budget.
We purchased and placed into service the 45 original HGST 8 TB drives in Q2 of 2015. They were our first Helium-filled drives and our only ones until the 10 TB and 12 TB Seagate drives arrived in Q3 2017. We’ll take a first look into whether or not Helium makes a difference in drive failure rates in an upcoming blog post.
New SMART Attributes
If you have previously worked with the hard drive stats data or plan to, you’ll notice that we added 10 more columns of data starting in 2018. There are 5 new SMART attributes we are tracking, each with a raw and normalized value:
177 – Wear Range Delta
179 – Used Reserved Block Count Total
181 – Program Fail Count Total or Non-4K Aligned Access Count
182 – Erase Fail Count
235 – Good Block Count AND System(Free) Block Count
The 5 values are all related to SSD drives.
Yes, SSD drives, but before you jump to any conclusions, we used 10 Samsung 850 EVO SSDs as boot drives for a period of time in Q1. This was an experiment to see if we could reduce boot up time for the Storage Pods. In our case, the improved boot up speed wasn’t worth the SSD cost, but it did add 10 new columns to the hard drive stats data.
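For anyone loading the data programmatically, the 10 new columns follow from the 5 attributes having a raw and a normalized value each. The sketch below assumes the column naming pattern `smart_<id>_normalized` / `smart_<id>_raw`; check the header row of the files you download before relying on it.

```python
# The five SMART attributes added in 2018, keyed by attribute ID.
NEW_SMART_ATTRIBUTES = {
    177: "Wear Range Delta",
    179: "Used Reserved Block Count Total",
    181: "Program Fail Count Total or Non-4K Aligned Access Count",
    182: "Erase Fail Count",
    235: "Good Block Count AND System(Free) Block Count",
}


def smart_columns(attribute_ids):
    """Build the normalized/raw column names for the given SMART IDs.

    The naming pattern is an assumption, not confirmed by the post itself.
    """
    columns = []
    for attr_id in sorted(attribute_ids):
        columns.append(f"smart_{attr_id}_normalized")
        columns.append(f"smart_{attr_id}_raw")
    return columns


if __name__ == "__main__":
    for name in smart_columns(NEW_SMART_ATTRIBUTES):
        print(name)
```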
Speaking of hard drive stats data, the complete data set used to create the information used in this review is available on our Hard Drive Test Data page. You can download and use this data for free for your own purpose. All we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone. It is free.
If you just want the summarized data used to create the tables and charts in this blog post, you can download the ZIP file containing the MS Excel spreadsheet.
Good luck and let us know if you find anything interesting.
[Ed: 5/1/2018 – Updated Lifetime chart to fix error in confidence interval for HGST 4TB drive, model: HDS5C4040ALE630]
As ransomware attacks have grown in number in recent months, the tactics and attack vectors also have evolved. While the primary method of attack used to be to target individual computer users within organizations with phishing emails and infected attachments, we’re increasingly seeing attacks that target weaknesses in businesses’ IT infrastructure.
How Ransomware Attacks Typically Work
In our previous posts on ransomware, we described the common vehicles used by hackers to infect organizations with ransomware viruses. Most often, downloaders distribute trojan horses through malicious downloads and spam emails. The emails contain a variety of file attachments, which if opened, will download and run one of the many ransomware variants. Once a user’s computer is infected with a malicious downloader, it will retrieve additional malware, which frequently includes crypto-ransomware. After the files have been encrypted, a ransom payment is demanded of the victim in order to decrypt the files.
What’s Changed With the Latest Ransomware Attacks?
In 2016, a customized ransomware strain called SamSam began attacking servers, primarily in health care institutions. SamSam, unlike more conventional ransomware, is not delivered through downloads or phishing emails. Instead, the attackers behind SamSam use tools to identify unpatched servers running Red Hat’s JBoss enterprise products. Once the attackers have successfully gained entry into one of these servers by exploiting vulnerabilities in JBoss, they use other freely available tools and scripts to collect credentials and gather information on networked computers. Then they deploy their ransomware to encrypt files on these systems before demanding a ransom. Gaining entry to an organization through its IT center rather than its endpoints makes this approach scalable and especially unsettling.
SamSam’s methodology is to scour the Internet searching for accessible and vulnerable JBoss application servers, especially ones used by hospitals. It’s not unlike a burglar rattling doorknobs in a neighborhood to find unlocked homes. When SamSam finds an unlocked home (unpatched server), the software infiltrates the system. It is then free to spread across the company’s network by stealing passwords. As it traverses the network and systems, it encrypts files, preventing access until the victims pay the hackers a ransom, typically between $10,000 and $15,000. The low ransom amount has encouraged some victimized organizations to pay the ransom rather than incur the downtime required to wipe and reinitialize their IT systems.
The success of SamSam is due to its effectiveness rather than its sophistication. SamSam can enter and traverse a network without human intervention. Some organizations are learning too late that securing internet-facing services in their data center from attack is just as important as securing endpoints.
The typical steps in a SamSam ransomware attack are:
1 Attackers gain access to vulnerable server
Attackers exploit vulnerable software or weak/stolen credentials.
2 Attack spreads via remote access tools
Attackers harvest credentials, create SOCKS proxies to tunnel traffic, and abuse RDP to install SamSam on more computers in the network.
3 Ransomware payload deployed
Attackers run batch scripts to execute ransomware on compromised machines.
4 Ransomware demand delivered requiring payment to decrypt files
Demand amounts vary from victim to victim. Relatively low ransom amounts appear to be designed to encourage quick payment decisions.
What all the organizations successfully exploited by SamSam have in common is that they were running unpatched servers that made them vulnerable to SamSam. Some organizations had their endpoints and servers backed up, while others did not. Some of those without backups they could use to recover their systems chose to pay the ransom money.
Timeline of SamSam History and Exploits
Since its appearance in 2016, SamSam has been in the news with many successful incursions into healthcare, business, and government institutions.
March 2016 SamSam appears
SamSam campaign targets vulnerable JBoss servers. Attackers home in on healthcare organizations specifically, as they’re more likely to have unpatched JBoss machines.
April 2016 SamSam finds new targets
SamSam begins targeting schools and government. After initial success targeting healthcare, attackers branch out to other sectors.
April 2017 New tactics include RDP
Attackers shift to targeting organizations with exposed RDP connections, and maintain focus on healthcare. An attack on Erie County Medical Center costs the hospital $10 million over three months of recovery.
January 2018 Municipalities attacked
• Attack on Municipality of Farmington, NM. • Attack on Hancock Health. • Attack on Adams Memorial Hospital. • Attack on Allscripts (Electronic Health Records), which includes 180,000 physicians, 2,500 hospitals, and 7.2 million patients’ health records.
February 2018 Attack volume increases
• Attack on Davidson County, NC. • Attack on Colorado Department of Transportation.
March 2018 SamSam shuts down Atlanta
• Second attack on Colorado Department of Transportation. • City of Atlanta suffers a devastating attack by SamSam. The attack has far-reaching impacts — crippling the court system, keeping residents from paying their water bills, limiting vital communications like sewer infrastructure requests, and pushing the Atlanta Police Department to file paper reports. • SamSam campaign nets $325,000 in 4 weeks. Infections spike as attackers launch new campaigns. Healthcare and government organizations are once again the primary targets.
How to Defend Against SamSam and Other Ransomware Attacks
The best way to respond to a ransomware attack is to avoid having one in the first place. Beyond that, making sure your valuable data is backed up and unreachable by a ransomware infection will ensure that your downtime and data loss are minimal or none if you ever suffer an attack.
In our previous post, How to Recover From Ransomware, we listed the ten ways to protect your organization from ransomware.
Use anti-virus and anti-malware software or other security policies to block known payloads from launching.
Make frequent, comprehensive backups of all important files and isolate them from local and open networks. Cybersecurity professionals view data backup and recovery as by far the most effective response to a successful ransomware attack (74% in a recent survey).
Keep offline backups of data stored in locations inaccessible from any potentially infected computer, such as disconnected external storage drives or the cloud, which prevents them from being accessed by the ransomware.
Install the latest security updates issued by software vendors of your OS and applications. Remember to patch early and patch often to close known vulnerabilities in operating systems, server software, browsers, and web plugins.
Consider deploying security software to protect endpoints, email servers, and network systems from infection.
Exercise cyber hygiene, such as using caution when opening email attachments and links.
Segment your networks to keep critical computers isolated and to prevent the spread of malware in case of attack. Turn off unneeded network shares.
Turn off admin rights for users who don’t require them. Give users the lowest system permissions they need to do their work.
Restrict write permissions on file servers as much as possible.
Educate yourself, your employees, and your family in best practices to keep malware out of your systems. Update everyone on the latest email phishing scams and human engineering aimed at turning victims into abettors.
Please Tell Us About Your Experiences with Ransomware
Have you endured a ransomware attack or have a strategy to avoid becoming a victim? Please tell us of your experiences in the comments.
Abstract: We present a scalable dynamic analysis framework that allows for the automatic evaluation of the privacy behaviors of Android apps. We use our system to analyze mobile apps’ compliance with the Children’s Online Privacy Protection Act (COPPA), one of the few stringent privacy laws in the U.S. Based on our automated analysis of 5,855 of the most popular free children’s apps, we found that a majority are potentially in violation of COPPA, mainly due to their use of third-party SDKs. While many of these SDKs offer configuration options to respect COPPA by disabling tracking and behavioral advertising, our data suggest that a majority of apps either do not make use of these options or incorrectly propagate them across mediation SDKs. Worse, we observed that 19% of children’s apps collect identifiers or other personally identifiable information (PII) via SDKs whose terms of service outright prohibit their use in child-directed apps. Finally, we show that efforts by Google to limit tracking through the use of a resettable advertising ID have had little success: of the 3,454 apps that share the resettable ID with advertisers, 66% transmit other, non-resettable, persistent identifiers as well, negating any intended privacy-preserving properties of the advertising ID.
With AWS Organizations, you can centrally manage policies across multiple AWS accounts without having to use custom scripts and manual processes. For example, you can apply service control policies (SCPs) across multiple AWS accounts that are members of an organization. SCPs allow you to define which AWS service APIs can and cannot be executed by AWS Identity and Access Management (IAM) entities (such as IAM users and roles) in your organization’s member AWS accounts. SCPs are created and applied from the master account, which is the AWS account that you used when you created your organization.
In a previous post, How to Use Service Control Policies in AWS Organizations to Enforce Healthcare Compliance in Your AWS Account, we reviewed how to create and manage SCPs and Organizational Units (OU) within an organization. In this post, I show how to use SCPs for access control in Organizations, with a specific focus on evaluating SCPs when an IAM entity calls an API in a member AWS account. I first cover some key Organizations concepts, and then I show how an SCP attached to an organization impacts which AWS service APIs are available to member accounts. Finally, I demonstrate these concepts with an example.
Organizational structure in Organizations
OUs give you a way to logically group and structure member AWS accounts in your organization. The screenshot shows the tree view of an example organizational structure in my organization with several OUs. Currently, I have selected OrgUnit01, and this is the current view I see in my main window. You can see here that within the OrgUnit01 OU, I have nested two additional OUs (OrgUnit01ChildA and OrgUnit01ChildB) and an AWS account is also contained within OrgUnit01, named “Developer Sandbox Account”.
The parts of the example organizational structure in the screenshot are:
Tree view — The hierarchy of your organization’s root and any OUs you have created
Tree view toggle — Enable and disable tree view
Organizational Units — Any child OUs of the selected root or OU in tree view
Accounts — Any AWS accounts (members or master) in the current OU
In the next section, I explain why at least one SCP must be attached to your root and OUs and introduce SCP evaluation.
How Service Control Policy evaluation logic works
To allow an AWS service API at the member account level, you must allow the API at every level between the member account and the root of your organization. This means you must attach an SCP at every level between your organization’s root and the member account that allows the given AWS service API (such as ec2:RunInstances). For more information, see About Service Control Policies.
Let’s say you want to allow the ec2:RunInstances API in the Developer Sandbox Account in the example structure in the preceding screenshot. To allow this AWS service API, you must allow the API in at least one SCP attached at each of these levels:
The organization’s root
The OU named OrgUnit01
If you don’t allow the AWS service in an SCP attached at each of these two levels, neither IAM entities nor the root user in the Developer Sandbox Account will be able to call ec2:RunInstances, even if an administrator has given them permission to do so (for IAM entities). In terms of policy evaluation, SCPs follow exactly the same policy evaluation logic as IAM does: by default, all requests are denied, an explicit allow overrides this default, and an explicit deny overrides any explicit allows.
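The evaluation rule described above can be sketched in a few lines. This is a simplified model for illustration only, not the AWS implementation: each SCP statement is reduced to a hypothetical `(effect, action_pattern)` pair, wildcard matching is done with Python's `fnmatchcase`, and conditions, resources, and case-insensitive action matching are ignored.

```python
from fnmatch import fnmatchcase


def level_decision(action, statements):
    """Evaluate one level (root or OU) for a single action.

    statements: iterable of (effect, action_pattern) pairs drawn from all
    SCPs attached at that level. Returns "explicit_deny", "allow", or
    "implicit_deny".
    """
    decision = "implicit_deny"  # default: every request is denied
    for effect, pattern in statements:
        if fnmatchcase(action, pattern):
            if effect == "Deny":
                return "explicit_deny"  # overrides any explicit allow
            decision = "allow"  # explicit allow overrides the default
    return decision


def allowed_by_scps(action, levels):
    """The action must be allowed at every level from the root down."""
    return all(level_decision(action, stmts) == "allow" for stmts in levels)


if __name__ == "__main__":
    root = [("Allow", "*")]
    org_unit01 = [("Allow", "ec2:*")]
    # Path from the root to the OU holding the Developer Sandbox Account.
    print(allowed_by_scps("ec2:RunInstances", [root, org_unit01]))
    # No SCP attached at OrgUnit01 allows the action: implicit deny.
    print(allowed_by_scps("ec2:RunInstances", [root, []]))
```

Note that a result of `True` here only means the SCPs do not filter the API out. An IAM entity in the member account still needs its own IAM permissions to make the call.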
What does this look like in practice? In the next section, I share a practical example to demonstrate how this works in Organizations.
An example structure with nested OUs and SCPs
In the previous section, I introduced design aspects of AWS Organizations that help prevent administrators from breaking structures in their Organizations. But because AWS Organizations is flexible enough to address multiple use cases, administrators can make changes that have unintended consequences, such as breaking organizational structures when moving an AWS account from one OU to another. In this section, I show an example with broken OU and SCP structures and explain how you can fix them.
I’ll take a blacklisting approach. That is, I’ll use the FullAWSAccess SCP, which doesn’t filter out any AWS service APIs. Then, I will filter out specific APIs by blacklisting them in subsequent SCPs attached to OUs at various points in my organization’s structure. For further reading on blacklisting and whitelisting with AWS Organizations, review AWS Organizations Terminology and Concepts.
Let’s say I have developed the OU and SCP structure shown in the diagram below. Before taking a close look at that diagram, I’ll briefly outline the goals I’m trying to achieve. Broadly speaking, there’s a small subset of APIs that I want to filter out using SCPs. This means that IAM entities in some AWS accounts in my organization will not have access to particular AWS service APIs, such as those related to Amazon EC2, while other accounts will not have access to APIs associated with Amazon CloudWatch, Amazon S3, and so on. Apart from these special cases, I do want the accounts in my organization to have access to all other APIs. More specifically, my goals are as follows:
1. Any AWS accounts in the Root should not have any API filtered out.
2. Any AWS accounts in OU 001 should have APIs for CloudWatch filtered out, but all other APIs will be accessible.
3. Any AWS accounts in OU 002 should have APIs for both CloudWatch and EC2 filtered out, but all other APIs will be accessible.
4. Any AWS accounts in OU 003 should have APIs for S3 filtered out, but all other APIs will be accessible.
To that end, let’s now look at my initial SCP and OU configuration in the image below that shows the example OU and SCP structure. The arrow shows the direction of inheritance: the OUs below the root (children) inherit SCPs from the root and the OUs above them (parents). This example structure contains the following SCPs:
FullAWSAccess — Allows all AWS service APIs
Deny_CW — Denies all CloudWatch APIs
Deny_EC2 — Denies all Amazon EC2 APIs
Deny_S3 — Denies all Amazon S3 APIs
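For readers who have not seen an SCP document, the four policies listed above might look like the following. `FullAWSAccess` is shown in the allow-everything form of the AWS managed policy. The three deny policies are an assumption about how such SCPs could be written, since the post does not show their JSON, and the `deny_service` helper is hypothetical.

```python
import json

# Allow-everything policy, in the form of the AWS managed FullAWSAccess SCP.
FULL_AWS_ACCESS = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
}


def deny_service(service_prefix):
    """Hypothetical helper: build an SCP denying every API of one service."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Deny", "Action": f"{service_prefix}:*", "Resource": "*"}
        ],
    }


SCPS = {
    "FullAWSAccess": FULL_AWS_ACCESS,
    "Deny_CW": deny_service("cloudwatch"),
    "Deny_EC2": deny_service("ec2"),
    "Deny_S3": deny_service("s3"),
}

if __name__ == "__main__":
    for name, document in SCPS.items():
        print(name)
        print(json.dumps(document, indent=2))
```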
Now that I’ve outlined my intent, and shown you the OU / SCP structure that I’ve created to meet that set of goals, you can probably already see that the structure provided in the image above will not work correctly for my stated goals. In fact, AWS accounts in the Root container and OU 001 will have the intended access, as per my goals (1) and (2). I will not, however, meet my goals (3) and (4) with the above structure: entities in member accounts directly under OU 002 cannot perform any actions, even if they’re granted permissions by IAM access policies. This is because the FullAWSAccess SCP isn’t attached directly to this OU (it’s only inherited).
Why is this important? For an AWS service API to be available to IAM entities in a member account, the API must be specified in an SCP attached at every level all the way down the hierarchy to the relevant member account. Similarly, even though OU 003 does have the FullAWSAccess SCP attached directly to it, the fact that it’s not attached to the parent OU (OU 002) means that IAM entities in member accounts under OU 003 also aren’t able to access any service APIs. This doesn’t happen by default—I have deliberately taken this action to organize my structure in this way, to show both the flexibility and the kind of problems you can encounter when working with OUs and SCPs.
So I now need to fix the problems that I’ve inadvertently created. To start with, I’m going to make one change to OU 002 by attaching the FullAWSAccess SCP directly to that OU. After I do that, OU 002 has the attached and inherited policies that are shown in the following image.
With the FullAWSAccess policy attached to OU 002, member accounts in both OU 002 and OU 003 can access the other non-restricted AWS service APIs (keeping in mind that the FullAWSAccess policy was already applied to OU 003).
I have one final issue to address in this example: OU 003 has an SCP attached that blocks access to the Amazon S3 APIs. However, in this OU, the intent is to allow IAM entities in member accounts to access the EC2 APIs. EC2 API access is blocked because in the parent OU (OU 002), an SCP is attached that denies access to that API (the Deny_EC2 SCP), which means that any actions listed in Deny_EC2 have already been filtered out. An explicit deny always trumps an allow, so to meet goal (4) and have an OU in which EC2 APIs are allowed but access to CloudWatch APIs and S3 APIs is filtered out, I will move OU 003 up one level, placing it directly under OU 001. This change gives me a working OU and policy structure, as shown in the following image.
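The walkthrough above can be checked with a small model of the OU tree. This is a sketch under stated assumptions: the attachments are inferred from the text because the diagrams are not reproduced here, each SCP is reduced to a single `(effect, action_pattern)` pair, and the function and variable names are hypothetical.

```python
from fnmatch import fnmatchcase

# Each SCP reduced to one (effect, action pattern) pair for illustration.
SCPS = {
    "FullAWSAccess": ("Allow", "*"),
    "Deny_CW": ("Deny", "cloudwatch:*"),
    "Deny_EC2": ("Deny", "ec2:*"),
    "Deny_S3": ("Deny", "s3:*"),
}


def is_allowed(action, ou, parents, attached):
    """Walk from the OU up to the root; every level must allow the action."""
    node = ou
    while node is not None:
        rules = [SCPS[name] for name in attached[node]]
        if any(e == "Deny" and fnmatchcase(action, p) for e, p in rules):
            return False  # explicit deny anywhere on the path wins
        if not any(e == "Allow" and fnmatchcase(action, p) for e, p in rules):
            return False  # no SCP attached at this level allows the action
        node = parents[node]
    return True


# Initial (broken) structure, as inferred from the description.
BROKEN_PARENTS = {"Root": None, "OU 001": "Root", "OU 002": "OU 001", "OU 003": "OU 002"}
BROKEN_ATTACHED = {
    "Root": ["FullAWSAccess"],
    "OU 001": ["FullAWSAccess", "Deny_CW"],
    "OU 002": ["Deny_EC2"],  # FullAWSAccess is missing here
    "OU 003": ["FullAWSAccess", "Deny_S3"],
}

# Fixed structure: FullAWSAccess attached to OU 002, OU 003 moved under OU 001.
FIXED_PARENTS = dict(BROKEN_PARENTS, **{"OU 003": "OU 001"})
FIXED_ATTACHED = dict(BROKEN_ATTACHED, **{"OU 002": ["FullAWSAccess", "Deny_EC2"]})

if __name__ == "__main__":
    actions = ["cloudwatch:PutMetricData", "ec2:RunInstances", "s3:GetObject"]
    for label, parents, attached in [
        ("broken", BROKEN_PARENTS, BROKEN_ATTACHED),
        ("fixed", FIXED_PARENTS, FIXED_ATTACHED),
    ]:
        for ou in parents:
            result = {a: is_allowed(a, ou, parents, attached) for a in actions}
            print(label, ou, result)
```

In the broken structure, nothing is allowed in OU 002 or OU 003 because no SCP attached to OU 002 allows any action. In the fixed structure, accounts in OU 003 regain EC2 access while CloudWatch (inherited from OU 001) and S3 remain filtered out.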
I recommend that at each level of your organization’s hierarchy, you directly apply the relevant SCPs. By doing this, you’re less likely to forget to apply an SCP to a particular OU, which can break your permission structure. By directly applying SCPs, you also make your policy structure easier to read.
If you have a group of accounts in your organization that are for testing purposes, I recommend that you experiment with OUs and SCPs. Applying SCPs to OUs and then moving an AWS account around within that structure can show you how SCPs affect IAM entities. For example, if you have an IAM user with the AdministratorAccess policy attached, you should see how SCPs can filter out certain AWS service APIs from specified member accounts.
Conclusion
I showed you how you can effectively apply SCPs to OUs in your organization and avoid some of the common issues that you might experience. I demonstrated an approach to designing a working organizational structure that I hope will help smooth the deployment of your organization and enable you to better centrally secure and manage your AWS accounts.
If you have comments about this post, submit them in the Comments section below. If you have questions about anything in this post, start a new thread on the Organizations forum.
In the wake of the Cambridge Analytica scandal, news articles and commentators have focused on what Facebook knows about us. A lot, it turns out. It collects data from our posts, our likes, our photos, things we type and delete without posting, and things we do while not on Facebook and even when we’re offline. It buys data about us from others. And it can infer even more: our sexual orientation, political beliefs, relationship status, drug use, and other personality traits — even if we didn’t take the personality test that Cambridge Analytica developed.
But for every article about Facebook’s creepy stalker behavior, thousands of other companies are breathing a collective sigh of relief that it’s Facebook and not them in the spotlight. Because while Facebook is one of the biggest players in this space, there are thousands of other companies that spy on and manipulate us for profit.
Harvard Business School professor Shoshana Zuboff calls it “surveillance capitalism.” And as creepy as Facebook is turning out to be, the entire industry is far creepier. It has existed in secret far too long, and it’s up to lawmakers to force these companies into the public spotlight, where we can all decide if this is how we want society to operate and — if not — what to do about it.
There are 2,500 to 4,000 data brokers in the United States whose business is buying and selling our personal data. Last year, Equifax was in the news when hackers stole personal information on 150 million people, including Social Security numbers, birth dates, addresses, and driver’s license numbers.
You certainly didn’t give it permission to collect any of that information. Equifax is one of those thousands of data brokers, most of them you’ve never heard of, selling your personal information without your knowledge or consent to pretty much anyone who will pay for it.
Surveillance capitalism takes this one step further. Companies like Facebook and Google offer you free services in exchange for your data. Google’s surveillance isn’t in the news, but it’s startlingly intimate. We never lie to our search engines. Our interests and curiosities, hopes and fears, desires and sexual proclivities, are all collected and saved. Add to that the websites we visit that Google tracks through its advertising network, our Gmail accounts, our movements via Google Maps, and what it can collect from our smartphones.
That phone is probably the most intimate surveillance device ever invented. It tracks our location continuously, so it knows where we live, where we work, and where we spend our time. It’s the first and last thing we check in a day, so it knows when we wake up and when we go to sleep. We all have one, so it knows who we sleep with. Uber used just some of that information to detect one-night stands; your smartphone provider and any app you allow to collect location data knows a lot more.
Surveillance capitalism drives much of the internet. It’s behind most of the “free” services, and many of the paid ones as well. Its goal is psychological manipulation, in the form of personalized advertising to persuade you to buy something or do something, like vote for a candidate. And while the individualized profile-driven manipulation exposed by Cambridge Analytica feels abhorrent, it’s really no different from what every company wants in the end. This is why all your personal information is collected, and this is why it is so valuable. Companies that can understand it can use it against you.
None of this is new. The media has been reporting on surveillance capitalism for years. In 2015, I wrote a book about it. Back in 2010, the Wall Street Journal published an award-winning two-year series about how people are tracked both online and offline, titled “What They Know.”
Surveillance capitalism is deeply embedded in our increasingly computerized society, and if the extent of it came to light there would be broad demands for limits and regulation. But because this industry can largely operate in secret, only occasionally exposed after a data breach or investigative report, we remain mostly ignorant of its reach.
This might change soon. In 2016, the European Union passed the comprehensive General Data Protection Regulation, or GDPR. The details of the law are far too complex to explain here, but some of the things it mandates are that personal data of EU citizens can only be collected and saved for “specific, explicit, and legitimate purposes,” and only with explicit consent of the user. Consent can’t be buried in the terms and conditions, nor can it be assumed unless the user opts in. This law will take effect in May, and companies worldwide are bracing for its enforcement.
Because pretty much all surveillance capitalism companies collect data on Europeans, this will expose the industry like nothing else. Here’s just one example. In preparation for this law, PayPal quietly published a list of over 600 companies it might share your personal data with. What will it be like when every company has to publish this sort of information, and explicitly explain how it’s using your personal data? We’re about to find out.
In the wake of this scandal, even Mark Zuckerberg said that his industry probably should be regulated, although he’s certainly not wishing for the sorts of comprehensive regulation the GDPR is bringing to Europe.
He’s right. Surveillance capitalism has operated without constraints for far too long. And advances in both big data analysis and artificial intelligence will make tomorrow’s applications far creepier than today’s. Regulation is the only answer.
The first step to any regulation is transparency. Who has our data? Is it accurate? What are they doing with it? Who are they selling it to? How are they securing it? Can we delete it? I don’t see any hope of Congress passing a GDPR-like data protection law anytime soon, but it’s not too far-fetched to demand laws requiring these companies to be more transparent in what they’re doing.
One of the responses to the Cambridge Analytica scandal is that people are deleting their Facebook accounts. It’s hard to do right, and doesn’t do anything about the data that Facebook collects about people who don’t use Facebook. But it’s a start. The market can put pressure on these companies to reduce their spying on us, but it can only do that if we force the industry out of its secret shadows.
Australian public sector customers now have a clear roadmap to use our secure services for sensitive workloads at the PROTECTED level. For the first time, we’ve released our Information Security Registered Assessors Program (IRAP) PROTECTED documentation via AWS Artifact. This information provides the ability to plan, architect, and self-assess systems built in AWS under the Digital Transformation Agency’s Secure Cloud Guidelines.
In short, this documentation gives public sector customers everything needed to evaluate AWS at the PROTECTED level. And we’re making this resource available to download on-demand through AWS Artifact. When you download the guide, you’ll find a mapping of how AWS meets each requirement to securely and compliantly process PROTECTED data.
With the AWS IRAP PROTECTED documentation, the process of adopting our secure services has never been easier. The information enables individual agencies to complete their own assessments and adopt AWS, but we also continue to work with the Australian Signals Directorate to include our services at the PROTECTED level on the Certified Cloud Services List.
Meanwhile, we’re also excited to announce that there are now 46 services in scope, which means more options to build secure and innovative solutions, while also saving money and gaining the productivity of the cloud.
If you have questions about this announcement or would like to inquire about how to use AWS for your regulated workloads, contact your account team.
Today, I’m very pleased to announce that AWS services comply with the General Data Protection Regulation (GDPR). This means that, in addition to benefiting from all of the measures that AWS already takes to maintain services security, customers can deploy AWS services as a key part of their GDPR compliance plans.
This announcement confirms we have completed the entirety of our GDPR service readiness audit, validating that all generally available services and features adhere to the high privacy bar and data protection standards required of data processors by the GDPR. We completed this work two months ahead of the May 25, 2018 enforcement deadline in order to give customers and APN partners an environment in which they can confidently build their own GDPR-compliant products, services, and solutions.
AWS’s GDPR service readiness is only part of the story; we are continuing to work alongside our customers and the AWS Partner Network (APN) to help on their journey toward GDPR compliance. Along with this announcement, I’d like to highlight the following examples of ways AWS can help you accelerate your own GDPR compliance efforts.
Security of Personal Data
During our GDPR service readiness audit, our security and compliance experts confirmed that AWS has in place effective technical and organizational measures for data processors to secure personal data in accordance with the GDPR. Security remains our highest priority, and we continue to innovate and invest in a high bar for security and compliance across all global operations. Our industry-leading functionality provides the foundation for our long list of internationally-recognized certifications and accreditations, demonstrating compliance with rigorous international standards, such as ISO 27001 for technical measures, ISO 27017 for cloud security, ISO 27018 for cloud privacy, SOC 1, SOC 2 and SOC 3, PCI DSS Level 1, and EU-specific certifications such as BSI’s Common Cloud Computing Controls Catalogue (C5). AWS continues to pursue the certifications that assist our customers.
Compliance-enabling Services
Many requirements under the GDPR focus on ensuring effective control and protection of personal data. AWS services give you the capability to implement your own security measures in the ways you need in order to enable your compliance with the GDPR, including specific measures such as:
Encryption of personal data
Ability to ensure the ongoing confidentiality, integrity, availability, and resilience of processing systems and services
Ability to restore the availability and access to personal data in a timely manner in the event of a physical or technical incident
Processes for regularly testing, assessing, and evaluating the effectiveness of technical and organizational measures for ensuring the security of processing
This is an advanced set of security and compliance services that are designed specifically to handle the requirements of the GDPR. There are numerous AWS services that have particular significance for customers focusing on GDPR compliance, including:
Amazon GuardDuty – a security service featuring intelligent threat detection and continuous monitoring
Amazon Macie – a machine learning tool to assist discovery and securing of personal data stored in Amazon S3
Amazon Inspector – an automated security assessment service to help keep applications in conformity with best security practices
AWS Config Rules – a monitoring service that dynamically checks cloud resources for compliance with security rules
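To make the idea of rule-based configuration checks more concrete, here is a minimal sketch in plain Python. It does not call AWS Config or any other AWS API. The resource dictionaries, the rule names, and the `evaluate` function are all hypothetical and exist only to illustrate the pattern of checking an inventory of resources against a set of security rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A named compliance rule (hypothetical; not an AWS Config type)."""
    name: str
    check: Callable[[dict], bool]  # returns True when the resource complies


# Illustrative rules only. A missing attribute is treated as non-compliant.
RULES = [
    Rule("storage-encrypted", lambda r: r.get("encrypted") is True),
    Rule("no-public-access", lambda r: r.get("public") is False),
]


def evaluate(resources: List[dict], rules: List[Rule] = RULES) -> Dict[str, List[str]]:
    """Map each non-compliant resource id to the names of the rules it violates."""
    findings: Dict[str, List[str]] = {}
    for resource in resources:
        failed = [rule.name for rule in rules if not rule.check(resource)]
        if failed:
            findings[resource["id"]] = failed
    return findings


if __name__ == "__main__":
    # Made-up inventory for demonstration purposes.
    inventory = [
        {"id": "bucket-a", "encrypted": True, "public": False},
        {"id": "bucket-b", "encrypted": False, "public": True},
    ]
    print(evaluate(inventory))
```

In this sketch a resource with a missing attribute fails the corresponding rule, which is a deliberately conservative choice for anything that may hold personal data.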
Additionally, we have published a whitepaper, “Navigating GDPR Compliance on AWS,” dedicated to this topic. This paper details how to tie GDPR concepts to specific AWS services, including those relating to monitoring, data access, and key management. Furthermore, our GDPR Center will give you access to the up-to-date resources you need to tackle requirements that directly support your GDPR efforts.
Compliant DPA
We offer a GDPR-compliant Data Processing Addendum (DPA), enabling you to comply with GDPR contractual obligations.
Conformity with a Code of Conduct
GDPR introduces adherence to a “code of conduct” as a mechanism for demonstrating sufficient guarantees of requirements that the GDPR places on data processors. In this context, we previously announced compliance with the CISPE Code of Conduct. The CISPE Code of Conduct provides customers with additional assurances regarding their ability to fully control their data in a safe, secure, and compliant environment when they use services from providers like AWS. More detail about the CISPE Code of Conduct can be found at: https://aws.amazon.com/compliance/cispe/
Training and Summits
We can provide you with training on navigating GDPR compliance using AWS services via our Professional Services team. This team has a GDPR workshop offering, which is a two-day facilitated session customized to your specific needs and challenges. We are also providing GDPR presentations during our AWS Summits in European countries, as well as San Francisco and Tokyo.
Additional Resources
Finally, we have teams of compliance, data protection, and security experts, as well as the APN, helping customers across Europe prepare for running regulated workloads in the cloud as the GDPR becomes enforceable. For additional information on this, please contact your AWS Account Manager.
As we move towards May 25 and beyond, we’ll be posting a series of blogs to dive deeper into GDPR-related concepts along with how AWS can help. Please visit our GDPR Center for more information. We’re excited about being your partner in fully addressing this important regulation.
-Chad Woolf
Vice President, AWS Security Assurance
Interested in additional AWS Security news? Follow the AWS Security Blog on Twitter.