A new PGP vulnerability was announced today. Basically, the vulnerability makes use of the fact that modern e-mail programs allow for embedded HTML objects. Essentially, if an attacker can intercept and modify a message in transit, he can insert code that sends the plaintext in a URL to a remote website. Very clever.
The EFAIL attacks exploit vulnerabilities in the OpenPGP and S/MIME standards to reveal the plaintext of encrypted emails. In a nutshell, EFAIL abuses active content of HTML emails, for example externally loaded images or styles, to exfiltrate plaintext through requested URLs. To create these exfiltration channels, the attacker first needs access to the encrypted emails, for example, by eavesdropping on network traffic, compromising email accounts, email servers, backup systems or client computers. The emails could even have been collected years ago.
The attacker changes an encrypted email in a particular way and sends this changed encrypted email to the victim. The victim’s email client decrypts the email and loads any external content, thus exfiltrating the plaintext to the attacker.
A few initial comments:
1. Being able to intercept and modify e-mails in transit is the sort of thing the NSA can do, but is hard for the average hacker. That being said, there are circumstances where someone can modify e-mails. I don’t mean to minimize the seriousness of this attack, but that is a consideration.
2. The vulnerability isn’t with PGP or S/MIME itself, but in the way they interact with modern e-mail programs. You can see this in the two suggested short-term mitigations: “No decryption in the e-mail client,” and “disable HTML rendering.”
3. I’ve been getting some weird press calls from reporters wanting to know if this demonstrates that e-mail encryption is impossible. No, this just demonstrates that programmers are human and vulnerabilities are inevitable. PGP almost certainly has fewer bugs than your average piece of software, but it’s not bug free.
3. Why is anyone using encrypted e-mail anymore, anyway? Reliably and easily encrypting e-mail is an insurmountably hard problem for reasons having nothing to do with today’s announcement. If you need to communicate securely, use Signal. If having Signal on your phone will arouse suspicion, use WhatsApp.
I’ll post other commentaries and analyses as I find them.
I’ve been busy trying to replicate the “eFail” PGP/SMIME bug. I thought I’d write up some notes.
PGP and S/MIME encrypt emails, so that eavesdroppers can’t read them. The bugs potentially allow eavesdroppers to take the encrypted emails they’ve captured and resend them to you, reformatted in a way that allows them to decrypt the messages.
Disable remote/external content in email
The most important defense is to disable “external” or “remote” content from being automatically loaded. This is when HTML-formatted emails attempt to load images from remote websites. This happens legitimately when they want to display images, but not fill up the email with them. But most of the time this is illegitimate, they hide images on the webpage in order to track you with unique IDs and cookies. For example, this is the code at the end of an email from politician Bernie Sanders to his supporters. Notice the long random number assigned to track me, and the width/height of this image is set to one pixel, so you don’t even see it:
Such trackers are so pernicious they are disabled by default in most email clients. This is an example of the settings in Thunderbird:
The problem is that as you read email messages, you often get frustrated by the fact the error messages and missing content, so you keep adding exceptions:
The correct defense against this eFail bug is to make sure such remote content is disabled and that you have no exceptions, or at least, no HTTP exceptions. HTTPS exceptions (those using SSL) are okay as long as they aren’t to a website the attacker controls. Unencrypted exceptions, though, the hacker can eavesdrop on, so it doesn’t matter if they control the website the requests go to. If the attacker can eavesdrop on your emails, they can probably eavesdrop on your HTTP sessions as well.
Some have recommended disabling PGP and S/MIME completely. That’s probably overkill. As long as the attacker can’t use the “remote content” in emails, you are fine. Likewise, some have recommend disabling HTML completely. That’s not even an option in any email client I’ve used — you can disable sending HTML emails, but not receiving them. It’s sufficient to just disable grabbing remote content, not the rest of HTML email rendering.
I couldn’t replicate the direct exfiltration
There rare two related bugs. One allows direct exfiltration, which appends the decrypted PGP email onto the end of an IMG tag (like one of those tracking tags), allowing the entire message to be decrypted.
An example of this is the following email. This is a standard HTML email message consisting of multiple parts. The trick is that the IMG tag in the first part starts the URL (blog.robertgraham.com/…) but doesn’t end it. It has the starting quotes in front of the URL but no ending quotes. The ending will in the next chunk.
The next chunk isn’t HTML, though, it’s PGP. The PGP extension (in my case, Enignmail) will detect this and automatically decrypt it. In this case, it’s some previous email message I’ve received the attacker captured by eavesdropping, who then pastes the contents into this email message in order to get it decrypted.
What should happen at this point is that Thunderbird will generate a request (if “remote content” is enabled) to the blog.robertgraham.com server with the decrypted contents of the PGP email appended to it. But that’s not what happens. Instead, I get this:
I am indeed getting weird stuff in the URL (the bit after the GET /), but it’s not the PGP decrypted message. Instead what’s going on is that when Thunderbird puts together a “multipart/mixed” message, it adds it’s own HTML tags consisting of lines between each part. In the email client it looks like this:
The HTML code it adds looks like:
That’s what you see in the above URL, all this code up to the first quotes. Those quotes terminate the quotes in the URL from the first multipart section, causing the rest of the content to be ignored (as far as being sent as part of the URL).
So at least for the latest version of Thunderbird, you are accidentally safe, even if you have “remote content” enabled. Though, this is only according to my tests, there may be a work around to this that hackers could exploit.
In the old days, email was sent plaintext over the wire so that it could be passively eavesdropped on. Nowadays, most providers send it via “STARTTLS”, which sorta encrypts it. Attackers can still intercept such email, but they have to do so actively, using man-in-the-middle. Such active techniques can be detected if you are careful and look for them.
Some organizations don’t care. Apparently, some nation states are just blocking all STARTTLS and forcing email to be sent unencrypted. Others do care. The NSA will passively sniff all the email they can in nations like Iraq, but they won’t actively intercept STARTTLS messages, for fear of getting caught.
The consequence is that it’s much less likely that somebody has been eavesdropping on you, passively grabbing all your PGP/SMIME emails. If you fear they have been, you should look (e.g. send emails from GMail and see if they are intercepted by sniffing the wire).
You’ll know if you are getting hacked
If somebody attacks you using eFail, you’ll know. You’ll get an email message formatted this way, with multipart/mixed components, some with corrupt HTML, some encrypted via PGP. This means that for the most part, your risk is that you’ll be attacked only once — the hacker will only be able to get one message through and decrypt it before you notice that something is amiss. Though to be fair, they can probably include all the emails they want decrypted as attachments to the single email they sent you, so the risk isn’t necessarily that you’ll only get one decrypted.
As mentioned above, a lot of attackers (e.g. the NSA) won’t attack you if its so easy to get caught. Other attackers, though, like anonymous hackers, don’t care.
Somebody ought to write a plugin to Thunderbird to detect this.
It only works if attackers have already captured your emails (though, that’s why you use PGP/SMIME in the first place, to guard against that).
It only works if you’ve enabled your email client to automatically grab external/remote content.
It seems to not be easily reproducible in all cases.
Instead of disabling PGP/SMIME, you should make sure your email client hast remote/external content disabled — that’s a huge privacy violation even without this bug.
Notes: The default email client on the Mac enables remote content by default, which is bad:
Bad software is everywhere. One can even claim that every software is bad. Cool companies, tech giants, established companies, all produce bad software. And no, yours is not an exception.
Who’s to blame for bad software? It’s all complicated and many factors are intertwined – there’s business requirements, there’s organizational context, there’s lack of sufficient skilled developers, there’s the inherent complexity of software development, there’s leaky abstractions, reliance on 3rd party software, consequences of wrong business and purchase decisions, time limitations, flawed business analysis, etc. So yes, despite the catchy title, I’m aware it’s actually complicated.
But in every “it’s complicated” scenario, there’s always one or two factors that are decisive. All of them contribute somehow, but the major drivers are usually a handful of things. And in the case of base software, I think it’s the fault of technical people. Developers, architects, ops.
We don’t seem to care about best practices. And I’ll do some nasty generalizations here, but bear with me. We can spend hours arguing about tabs vs spaces, curly bracket on new line, git merge vs rebase, which IDE is better, which framework is better and other largely irrelevant stuff. But we tend to ignore the important aspects that span beyond the code itself. The context in which the code lives, the non-functional requirements – robustness, security, resilience, etc.
We don’t seem to get security. Even trivial stuff such as user authentication is almost always implemented wrong. These days Twitter and GitHub realized they have been logging plain-text passwords, for example, but that’s just the tip of the iceberg. Too often we ignore the security implications.
“But the business didn’t request the security features”, one may say. The business never requested 2-factor authentication, encryption at rest, PKI, secure (or any) audit trail, log masking, crypto shredding, etc., etc. Because the business doesn’t know these things – we do and we have to put them on the backlog and fight for them to be implemented. Each organization has its specifics and tech people can influence the backlog in different ways, but almost everywhere we can put things there and prioritize them.
The other aspect is testing. We should all be well aware by now that automated testing is mandatory. We have all the tools in the world for unit, functional, integration, performance and whatnot testing, and yet many software projects lack the necessary test coverage to be able to change stuff without accidentally breaking things. “But testing takes time, we don’t have it”. We are perfectly aware that testing saves time, as we’ve all had those “not again!” recurring bugs. And yet we think of all sorts of excuses – “let the QAs test it”, we have to ship that now, we’ll test it later”, “this is too trivial to be tested”, etc.
And you may say it’s not our job. We don’t define what has do be done, we just do it. We don’t define the budget, the scope, the features. We just write whatever has been decided. And that’s plain wrong. It’s not our job to make money out of our code, and it’s not our job to define what customers need, but apart from that everything is our job. The way the software is structured, the security aspects and security features, the stability of the code base, the way the software behaves in different environments. The non-functional requirements are our job, and putting them on the backlog is our job.
You’ve probably heard that every software becomes “legacy” after 6 months. And that’s because of us, our sloppiness, our inability to mitigate external factors and constraints. Too often we create a mess through “just doing our job”.
And of course that’s a generalization. I happen to know a lot of great professionals who don’t make these mistakes, who strive for excellence and implement things the right way. But our industry as a whole doesn’t. Our industry as a whole produces bad software. And it’s our fault, as developers – as the only people who know why a certain piece of software is bad.
In a talk of his, Bob Martin warns us of the risks of our sloppiness. We have been building websites so far, but we are more and more building stuff that interacts with the real world, directly and indirectly. Ultimately, lives may depend on our software (like the recent unfortunate death caused by a self-driving car). And I’ll agree with Uncle Bob that it’s high time we self-regulate as an industry, before some technically incompetent politician decides to do that.
How, I don’t know. We’ll have to think more about it. But I’m pretty sure it’s our fault that software is bad, and no amount of blaming the management, the budget, the timing, the tools or the process can eliminate our responsibility.
Why do I insist on bashing my fellow software engineers? Because if we start looking at software development with more responsibility; with the fact that if it fails, it’s our fault, then we’re more likely to get out of our current bug-ridden, security-flawed, fragile software hole and really become the experts of the future.
Data is the soul of an application. As containers make it easier to package and deploy applications faster, testing plays an even more important role in the reliable delivery of software. Given that all applications have data, development teams want a way to reliably control, move, and test using real application data or, at times, obfuscated data.
For many teams, moving application data through a CI/CD pipeline, while honoring compliance and maintaining separation of concerns, has been a manual task that doesn’t scale. At best, it is limited to a few applications, and is not portable across environments. The goal should be to make running and testing stateful containers (think databases and message buses where operations are tracked) as easy as with stateless (such as with web front ends where they are often not).
Why is state important in testing scenarios? One reason is that many bugs manifest only when code is tested against real data. For example, we might simply want to test a database schema upgrade but a small synthetic dataset does not exercise the critical, finer corner cases in complex business logic. If we want true end-to-end testing, we need to be able to easily manage our data or state.
In this blog post, we define a CI/CD pipeline reference architecture that can automate data movement between applications. We also provide the steps to follow to configure the CI/CD pipeline.
Stateful Pipelines: Need for Portable Volumes
As part of continuous integration, testing, and deployment, a team may need to reproduce a bug found in production against a staging setup. Here, the hosting environment is comprised of a cluster with Kubernetes as the scheduler and Portworx for persistent volumes. The testing workflow is then automated by AWS CodeCommit, AWS CodePipeline, and AWS CodeBuild.
Portworx offers Kubernetes storage that can be used to make persistent volumes portable between AWS environments and pipelines. The addition of Portworx to the AWS Developer Tools continuous deployment for Kubernetes reference architecture adds persistent storage and storage orchestration to a Kubernetes cluster. The example uses MongoDB as the demonstration of a stateful application. In practice, the workflow applies to any containerized application such as Cassandra, MySQL, Kafka, and Elasticsearch.
Using the reference architecture, a developer calls CodePipeline to trigger a snapshot of the running production MongoDB database. Portworx then creates a block-based, writable snapshot of the MongoDB volume. Meanwhile, the production MongoDB database continues serving end users and is uninterrupted.
Without the Portworx integrations, a manual process would require an application-level backup of the database instance that is outside of the CI/CD process. For larger databases, this could take hours and impact production. The use of block-based snapshots follows best practices for resilient and non-disruptive backups.
As part of the workflow, CodePipeline deploys a new MongoDB instance for staging onto the Kubernetes cluster and mounts the second Portworx volume that has the data from production. CodePipeline triggers the snapshot of a Portworx volume through an AWS Lambda function, as shown here
AWS Developer Tools with Kubernetes: Integrated Workflow with Portworx
In the following workflow, a developer is testing changes to a containerized application that calls on MongoDB. The tests are performed against a staging instance of MongoDB. The same workflow applies if changes were on the server side. The original production deployment is scheduled as a Kubernetes deployment object and uses Portworx as the storage for the persistent volume.
The continuous deployment pipeline runs as follows:
Developers integrate bug fix changes into a main development branch that gets merged into a CodeCommit master branch.
Amazon CloudWatch triggers the pipeline when code is merged into a master branch of an AWS CodeCommit repository.
AWS CodePipeline sends the new revision to AWS CodeBuild, which builds a Docker container image with the build ID.
AWS CodeBuild pushes the new Docker container image tagged with the build ID to an Amazon ECR registry.
Kubernetes downloads the new container (for the database client) from Amazon ECR and deploys the application (as a pod) and staging MongoDB instance (as a deployment object).
AWS CodePipeline, through a Lambda function, calls Portworx to snapshot the production MongoDB and deploy a staging instance of MongoDB• Portworx provides a snapshot of the production instance as the persistent storage of the staging MongoDB • The MongoDB instance mounts the snapshot.
At this point, the staging setup mimics a production environment. Teams can run integration and full end-to-end tests, using partner tooling, without impacting production workloads. The full pipeline is shown here.
This reference architecture showcases how development teams can easily move data between production and staging for the purposes of testing. Instead of taking application-specific manual steps, all operations in this CodePipeline architecture are automated and tracked as part of the CI/CD process.
This integrated experience is part of making stateful containers as easy as stateless. With AWS CodePipeline for CI/CD process, developers can easily deploy stateful containers onto a Kubernetes cluster with Portworx storage and automate data movement within their process.
The reference architecture and code are available on GitHub:
Ubuntu 18.04, a long-term-support release, is out. “Codenamed ‘Bionic Beaver’, 18.04 LTS continues Ubuntu’s proud tradition of integrating the latest and greatest open source technologies into a high-quality, easy-to-use Linux distribution. The team has been hard at work through this cycle, introducing new features and fixing bugs.” It features a 4.15 kernel, a new GNOME-based desktop environment, and more. See the release notes and this overview for details.
Last week, we shared the first half of our Q&A with Raspberry Pi Trading CEO and Raspberry Pi creator Eben Upton. Today we follow up with all your other questions, including your expectations for a Raspberry Pi 4, Eben’s dream add-ons, and whether we really could go smaller than the Zero.
Get your questions to us now using #AskRaspberryPi on Twitter
With internet security becoming more necessary, will there be automated versions of VPN on an SD card?
There are already third-party tools which turn your Raspberry Pi into a VPN endpoint. Would we do it ourselves? Like the power button, it’s one of those cases where there are a million things we could do and so it’s more efficient to let the community get on with it.
Just to give a counterexample, while we don’t generally invest in optimising for particular use cases, we did invest a bunch of money into optimising Kodi to run well on Raspberry Pi, because we found that very large numbers of people were using it. So, if we find that we get half a million people a year using a Raspberry Pi as a VPN endpoint, then we’ll probably invest money into optimising it and feature it on the website as we’ve done with Kodi. But I don’t think we’re there today.
Have you ever seen any Pis running and doing important jobs in the wild, and if so, how does it feel?
It’s amazing how often you see them driving displays, for example in radio and TV studios. Of course, it feels great. There’s something wonderful about the geographic spread as well. The Raspberry Pi desktop is quite distinctive, both in its previous incarnation with the grey background and logo, and the current one where we have Greg Annandale’s road picture.
And so it’s funny when you see it in places. Somebody sent me a video of them teaching in a classroom in rural Pakistan and in the background was Greg’s picture.
Raspberry Pi 4!?!
There will be a Raspberry Pi 4, obviously. We get asked about it a lot. I’m sticking to the guidance that I gave people that they shouldn’t expect to see a Raspberry Pi 4 this year. To some extent, the opportunity to do the 3B+ was a surprise: we were surprised that we’ve been able to get 200MHz more clock speed, triple the wireless and wired throughput, and better thermals, and still stick to the $35 price point.
We’re up against the wall from a silicon perspective; we’re at the end of what you can do with the 40nm process. It’s not that you couldn’t clock the processor faster, or put a larger processor which can execute more instructions per clock in there, it’s simply about the energy consumption and the fact that you can’t dissipate the heat. So we’ve got to go to a smaller process node and that’s an order of magnitude more challenging from an engineering perspective. There’s more effort, more risk, more cost, and all of those things are challenging.
With 3B+ out of the way, we’re going to start looking at this now. For the first six months or so we’re going to be figuring out exactly what people want from a Raspberry Pi 4. We’re listening to people’s comments about what they’d like to see in a new Raspberry Pi, and I’m hoping by early autumn we should have an idea of what we want to put in it and a strategy for how we might achieve that.
Could you go smaller than the Zero?
The challenge with Zero as that we’re periphery-limited. If you run your hand around the unit, there is no edge of that board that doesn’t have something there. So the question is: “If you want to go smaller than Zero, what feature are you willing to throw out?”
It’s a single-sided board, so you could certainly halve the PCB area if you fold the circuitry and use both sides, though you’d have to lose something. You could give up some GPIO and go back to 26 pins like the first Raspberry Pi. You could give up the camera connector, you could go to micro HDMI from mini HDMI. You could remove the SD card and just do USB boot. I’m inventing a product live on air! But really, you could get down to two thirds and lose a bunch of GPIO – it’s hard to imagine you could get to half the size.
What’s the one feature that you wish you could outfit on the Raspberry Pi that isn’t cost effective at this time? Your dream feature.
Well, more memory. There are obviously technical reasons why we don’t have more memory on there, but there are also market reasons. People ask “why doesn’t the Raspberry Pi have more memory?”, and my response is typically “go and Google ‘DRAM price’”. We’re used to the price of memory going down. And currently, we’re going through a phase where this has turned around and memory is getting more expensive again.
Machine learning would be interesting. There are machine learning accelerators which would be interesting to put on a piece of hardware. But again, they are not going to be used by everyone, so according to our method of pricing what we might add to a board, machine learning gets treated like a $50 chip. But that would be lovely to do.
Which citizen science projects using the Pi have most caught your attention?
I like the wildlife camera projects. We live out in the countryside in a little village, and we’re conscious of being surrounded by nature but we don’t see a lot of it on a day-to-day basis. So I like the nature cam projects, though, to my everlasting shame, I haven’t set one up yet. There’s a range of them, from very professional products to people taking a Raspberry Pi and a camera and putting them in a plastic box. So those are good fun.
How does it feel to go to bed every day knowing you’ve changed the world for the better in such a massive way?
What feels really good is that when we started this in 2006 nobody else was talking about it, but now we’re part of a very broad movement.
We were in a really bad way: we’d seen a collapse in the number of applicants applying to study Computer Science at Cambridge and elsewhere. In our view, this reflected a move away from seeing technology as ‘a thing you do’ to seeing it as a ‘thing that you have done to you’. It is problematic from the point of view of the economy, industry, and academia, but most importantly it damages the life prospects of individual children, particularly those from disadvantaged backgrounds. The great thing about STEM subjects is that you can’t fake being good at them. There are a lot of industries where your Dad can get you a job based on who he knows and then you can kind of muddle along. But if your dad gets you a job building bridges and you suck at it, after the first or second bridge falls down, then you probably aren’t going to be building bridges anymore. So access to STEM education can be a great driver of social mobility.
By the time we were launching the Raspberry Pi in 2012, there was this wonderful movement going on. Code Club, for example, and CoderDojo came along. Lots of different ways of trying to solve the same problem. What feels really, really good is that we’ve been able to do this as part of an enormous community. And some parts of that community became part of the Raspberry Pi Foundation – we merged with Code Club, we merged with CoderDojo, and we continue to work alongside a lot of these other organisations. So in the two seconds it takes me to fall asleep after my face hits the pillow, that’s what I think about.
We’re currently advertising a Programme Manager role in New Delhi, India. Did you ever think that Raspberry Pi would be advertising a role like this when you were bringing together the Foundation?
No, I didn’t.
But if you told me we were going to be hiring somewhere, India probably would have been top of my list because there’s a massive IT industry in India. When we think about our interaction with emerging markets, India, in a lot of ways, is the poster child for how we would like it to work. There have already been some wonderful deployments of Raspberry Pi, for example in Kerala, without our direct involvement. And we think we’ve got something that’s useful for the Indian market. We have a product, we have clubs, we have teacher training. And we have a body of experience in how to teach people, so we have a physical commercial product as well as a charitable offering that we think are a good fit.
It’s going to be massive.
What is your favourite BBC type-in listing?
There was a game called Codename: Druid. There is a famous game called Codename: Droid which was the sequel to Stryker’s Run, which was an awesome, awesome game. And there was a type-in game called Codename: Druid, which was at the bottom end of what you would consider a commercial game.
And I remember typing that in. And what was really cool about it was that the next month, the guy who wrote it did another article that talks about the memory map and which operating system functions used which bits of memory. So if you weren’t going to do disc access, which bits of memory could you trample on and know the operating system would survive.
I still like type-in listings. The Raspberry Pi 2018 Annual has a type-in listing that I wrote for a Babbage versus Bugs game. I will say that’s not the last type-in listing you will see from me in the next twelve months. And if you download the PDF, you could probably copy and paste it into your favourite text editor to save yourself some time.
AWS Glue is an increasingly popular way to develop serverless ETL (extract, transform, and load) applications for big data and data lake workloads. Organizations that transform their ETL applications to cloud-based, serverless ETL architectures need a seamless, end-to-end continuous integration and continuous delivery (CI/CD) pipeline: from source code, to build, to deployment, to product delivery. Having a good CI/CD pipeline can help your organization discover bugs before they reach production and deliver updates more frequently. It can also help developers write quality code and automate the ETL job release management process, mitigate risk, and more.
AWS Glue is a fully managed data catalog and ETL service. It simplifies and automates the difficult and time-consuming tasks of data discovery, conversion, and job scheduling. AWS Glue crawls your data sources and constructs a data catalog using pre-built classifiers for popular data formats and data types, including CSV, Apache Parquet, JSON, and more.
When you are developing ETL applications using AWS Glue, you might come across some of the following CI/CD challenges:
Iterative development with unit tests
Continuous integration and build
Pushing the ETL pipeline to a test environment
Pushing the ETL pipeline to a production environment
Testing ETL applications using real data (live test)
The following diagram shows the pipeline workflow:
This solution uses AWS CodePipeline, which lets you orchestrate and automate the test and deploy stages for ETL application source code. The solution consists of a pipeline that contains the following stages:
1.) Source Control: In this stage, the AWS Glue ETL job source code and the AWS CloudFormation template file for deploying the ETL jobs are both committed to version control. I chose to use AWS CodeCommit for version control.
2.) LiveTest: In this stage, all resources—including AWS Glue crawlers, jobs, S3 buckets, roles, and other resources that are required for the solution—are provisioned, deployed, live tested, and cleaned up.
The LiveTest stage includes the following actions:
Deploy: In this action, all the resources that are required for this solution (crawlers, jobs, buckets, roles, and so on) are provisioned and deployed using an AWS CloudFormation template.
AutomatedLiveTest: In this action, all the AWS Glue crawlers and jobs are executed and data exploration and validation tests are performed. These validation tests include, but are not limited to, record counts in both raw tables and transformed tables in the data lake and any other business validations. I used AWS CodeBuild for this action.
LiveTestApproval: This action is included for the cases in which a pipeline administrator approval is required to deploy/promote the ETL applications to the next stage. The pipeline pauses in this action until an administrator manually approves the release.
LiveTestCleanup: In this action, all the LiveTest stage resources, including test crawlers, jobs, roles, and so on, are deleted using the AWS CloudFormation template. This action helps minimize cost by ensuring that the test resources exist only for the duration of the AutomatedLiveTest and LiveTestApproval
3.) DeployToProduction: In this stage, all the resources are deployed using the AWS CloudFormation template to the production environment.
Try it out
This code pipeline takes approximately 20 minutes to complete the LiveTest test stage (up to the LiveTest approval stage, in which manual approval is required).
To get started with this solution, choose Launch Stack:
This creates the CI/CD pipeline with all of its stages, as described earlier. It performs an initial commit of the sample AWS Glue ETL job source code to trigger the first release change.
In the AWS CloudFormation console, choose Create. After the template finishes creating resources, you see the pipeline name on the stack Outputs tab.
After that, open the CodePipeline console and select the newly created pipeline. Initially, your pipeline’s CodeCommit stage shows that the source action failed.
Allow a few minutes for your new pipeline to detect the initial commit applied by the CloudFormation stack creation. As soon as the commit is detected, your pipeline starts. You will see the successful stage completion status as soon as the CodeCommit source stage runs.
In the CodeCommit console, choose Code in the navigation pane to view the solution files.
Next, you can watch how the pipeline goes through the LiveTest stage of the deploy and AutomatedLiveTest actions, until it finally reaches the LiveTestApproval action.
At this point, if you check the AWS CloudFormation console, you can see that a new template has been deployed as part of the LiveTest deploy action.
At this point, make sure that the AWS Glue crawlers and the AWS Glue job ran successfully. Also check whether the corresponding databases and external tables have been created in the AWS Glue Data Catalog. Then verify that the data is validated using Amazon Athena, as shown following.
Open the AWS Glue console, and choose Databases in the navigation pane. You will see the following databases in the Data Catalog:
Open the Amazon Athena console, and run the following queries. Verify that the record counts are matching.
SELECT count(*) FROM "nycitytaxi_gluedemocicdtest"."data";
SELECT count(*) FROM "nytaxiparquet_gluedemocicdtest"."datalake";
The following shows the raw data:
The following shows the transformed data:
The pipeline pauses the action until the release is approved. After validating the data, manually approve the revision on the LiveTestApproval action on the CodePipeline console.
Add comments as needed, and choose Approve.
The LiveTestApproval stage now appears as Approved on the console.
After the revision is approved, the pipeline proceeds to use the AWS CloudFormation template to destroy the resources that were deployed in the LiveTest deploy action. This helps reduce cost and ensures a clean test environment on every deployment.
Production deployment is the final stage. In this stage, all the resources—AWS Glue crawlers, AWS Glue jobs, Amazon S3 buckets, roles, and so on—are provisioned and deployed to the production environment using the AWS CloudFormation template.
After successfully running the whole pipeline, feel free to experiment with it by changing the source code stored on AWS CodeCommit. For example, if you modify the AWS Glue ETL job to generate an error, it should make the AutomatedLiveTest action fail. Or if you change the AWS CloudFormation template to make its creation fail, it should affect the LiveTest deploy action. The objective of the pipeline is to guarantee that all changes that are deployed to production are guaranteed to work as expected.
In this post, you learned how easy it is to implement CI/CD for serverless AWS Glue ETL solutions with AWS developer tools like AWS CodePipeline and AWS CodeBuild at scale. Implementing such solutions can help you accelerate ETL development and testing at your organization.
If you have questions or suggestions, please comment below.
Prasad Alle is a Senior Big Data Consultant with AWS Professional Services. He spends his time leading and building scalable, reliable Big data, Machine learning, Artificial Intelligence and IoT solutions for AWS Enterprise and Strategic customers. His interests extend to various technologies such as Advanced Edge Computing, Machine learning at Edge. In his spare time, he enjoys spending time with his family.
Luis Caro is a Big Data Consultant for AWS Professional Services. He works with our customers to provide guidance and technical assistance on big data projects, helping them improving the value of their solutions when using AWS.
Writing programs that create things in Minecraft is not only a great way to learn how to code, but it also means that you have a program that you can run again and again to make as many copies of your Minecraft design as you want. You never need to worry about your creation being destroyed by your brother or sister ever again — simply rerun your program and get it back! Whilst it might take a little longer to write the program than to build one house, once it’s finished you can build as many houses as you want.
Co-ordinates in Minecraft
Let’s start with a review of the coordinate system that Minecraft uses to know where to place blocks. If you are already familiar with this, you can skip to the next section. Otherwise, read on.
Plan view of our house design
Minecraft shows us a three-dimensional (3D) view of the world. Imagine that the room you are in is the Minecraft world and you want to describe your location within that room. You can do so with three numbers, as follows:
How far across the room are you? As you move from side to side, you change this number. We can consider this value to be our X coordinate.
How high off the ground are you? If you are upstairs, or if you jump, this value increases. We can consider this value to be our Y coordinate.
How far into the room are you? As you walk forwards or backwards, you change this number. We can consider this value to be our Z coordinate.
You might have done graphs in school with X going across the page and Y going up the page. Coordinates in Minecraft are very similar, except that we have an extra value, Z, for our third dimension. Don’t worry if this still seems a little confusing: once we start to build our house, you will see how these three dimensions work in Minecraft.
Designing our house
It is a good idea to start with a rough design for our house. This will help us to work out the values for the coordinates when we are adding doors and windows to our house. You don’t have to plan every detail of your house right away. It is always fun to enhance it once you have got the basic design written. The image above shows the plan view of the house design that we will be creating in this tutorial. Note that because this is a plan view, it only shows the X and Z co-ordinates; we can’t see how high anything is. Hopefully, you can imagine the house extending up from the screen.
We will build our house close to where the Minecraft player is standing. This a good idea when creating something in Minecraft with Python, as it saves us from having to walk around the Minecraft world to try to find our creation.
Starting our program
Type in the code as you work through this tutorial. You can use any editor you like; we would suggest either Python 3 (IDLE) or Thonny Python IDE, both of which you can find on the Raspberry Pi menu under Programming. Start by selecting the File menu and creating a new file. Save the file with a name of your choice; it must end with .py so that the Raspberry Pi knows that it is a Python program.
It is important to enter the code exactly as it is shown in the listing. Pay particular attention to both the spelling and capitalisation (upper- or lower-case letters) used. You may find that when you run your program the first time, it doesn’t work. This is very common and just means there’s a small error somewhere. The error message will give you a clue about where the error is.
It is good practice to start all of your Python programs with the first line shown in our listing. All other lines that start with a # are comments. These are ignored by Python, but they are a good way to remind us what the program is doing.
The two lines starting with from tell Python about the Minecraft API; this is a code library that our program will be using to talk to Minecraft. The line starting mc = creates a connection between our Python program and the game. Then we get the player’s location broken down into three variables: x, y, and z.
Building the shell of our house
To help us build our house, we define three variables that specify its width, height, and depth. Defining these variables makes it easy for us to change the size of our house later; it also makes the code easier to understand when we are setting the co-ordinates of the Minecraft bricks. For now, we suggest that you use the same values that we have; you can go back and change them once the house is complete and you want to alter its design.
It’s now time to start placing some bricks. We create the shell of our house with just two lines of code! These lines of code each use the setBlocks command to create a complete block of bricks. This function takes the following arguments:
setBlocks(x1, y1, z1, x2, y2, z2, block-id, data)
x1, y1, and z1 are the coordinates of one corner of the block of bricks that we want to create; x1, y1, and z1 are the coordinates of the other corner. The block-id is the type of block that we want to use. Some blocks require another value called data; we will see this being used later, but you can ignore it for now.
We have to work out the values that we need to use in place of x1, y1, z1, x1, y1, z1 for our walls. Note that what we want is a larger outer block made of bricks and that is filled with a slightly smaller block of air blocks. Yes, in Minecraft even air is actually just another type of block.
Once you have typed in the two lines that create the shell of your house, you almost ready to run your program. Before doing so, you must have Minecraft running and displaying the contents of your world. Do not have a world loaded with things that you have created, as they may get destroyed by the house that we are building. Go to a clear area in the Minecraft world before running the program. When you run your program, check for any errors in the ‘console’ window and fix them, repeatedly running the code again until you’ve corrected all the errors.
You should see a block of bricks now, as shown above. You may have to turn the player around in the Minecraft world before you can see your house.
Adding the floor and door
Now, let’s make our house a bit more interesting! Add the lines for the floor and door. Note that the floor extends beyond the boundary of the wall of the house; can you see how we achieve this?
Hint: look closely at how we calculate the x and z attributes as compared to when we created the house shell above. Also note that we use a value of y-1 to create the floor below our feet.
Minecraft doors are two blocks high, so we have to create them in two parts. This is where we have to use the data argument. A value of 0 is used for the lower half of the door, and a value of 8 is used for the upper half (the part with the windows in it). These values will create an open door. If we add 4 to each of these values, a closed door will be created.
Before you run your program again, move to a new location in Minecraft to build the house away from the previous one. Then run it to check that the floor and door are created; you will need to fix any errors again. Even if your program runs without errors, check that the floor and door are positioned correctly. If they aren’t, then you will need to check the arguments so setBlock and setBlocks are exactly as shown in the listing.
Hopefully you will agree that your house is beginning to take shape! Now let’s add some windows. Looking at the plan for our house, we can see that there is a window on each side; see if you can follow along. Add the four lines of code, one for each window.
Now you can move to yet another location and run the program again; you should have a window on each side of the house. Our house is starting to look pretty good!
Adding a roof
The final stage is to add a roof to the house. To do this we are going to use wooden stairs. We will do this inside a loop so that if you change the width of your house, more layers are added to the roof. Enter the rest of the code. Be careful with the indentation: I recommend using spaces and avoiding the use of tabs. After the if statement, you need to indent the code even further. Each indentation level needs four spaces, so below the line with if on it, you will need eight spaces.
Since some of these code lines are lengthy and indented a lot, you may well find that the text wraps around as you reach the right-hand side of your editor window — don’t worry about this. You will have to be careful to get those indents right, however.
Now move somewhere new in your world and run the complete program. Iron out any last bugs, then admire your house! Does it look how you expect? Can you make it better?
Customising your house
Now you can start to customise your house. It is a good idea to use Save As in the menu to save a new version of your program. Then you can keep different designs, or refer back to your previous program if you get to a point where you don’t understand why your new one doesn’t work.
Consider these changes:
Change the size of your house. Are you able also to move the door and windows so they stay in proportion?
Change the materials used for the house. An ice house placed in an area of snow would look really cool!
Add a back door to your house. Or make the front door a double-width door!
We hope that you have enjoyed writing this program to build a house. Now you can easily add a house to your Minecraft world whenever you want to by simply running this program.
In this post, I’ll show you how to create a sample dataset for Amazon Macie, and how you can use Amazon Macie to implement data-centric compliance and security analytics in your Amazon S3 environment. I’ll also dive into the different kinds of credentials, document types, and PII detections supported by Macie. First, I’ll walk through creating a “getting started” sample set of artificial, generated data that you can use to test Macie capabilities and start building your own policies and alerts.
Create a realistic data sample set in S3
I’ll use amazon-macie-activity-generator, which we call “AMG” for short, a sample application developed by AWS that generates realistic content and accesses your test account to create the data. AMG uses AWS CloudFormation, AWS Lambda, and Python’s excellent Faker library to create a data set with artificial—but realistic—data classifications and access patterns to help test some of the features and capabilities of Macie. AMG is released under Amazon Software License 1.0, and we’ll accept pull requests on our GitHub repository and monitor any issues that are opened so we can try to fix bugs and consider new feature requests.
The following diagram shows a high level architecture overview of the components that will be created in your AWS account for AMG. For additional detail about these components and their relationships, review the CloudFormation setup script.
Depending on the data types specified in your JSON configuration template (details below), AMG will periodically generate artificial documents for the specified S3 target with a PutObject action. By default, the CloudFormation stack uses a configuration file that instructs AMG to create a new, private S3 bucket that can only be accessed by authorized AWS users/roles in the same account as the bucket. All the S3 objects with fake data in this bucket have a private ACL and inherit the bucket’s access control configuration. All generated objects feature the header in the example below, and AMG supports all fake data providers offered by https://faker.readthedocs.io/en/latest/index.html, as well as a few of AMG‘s own custom fake data providers requested by our customers: aws_creds, slack_creds, github_creds, facebook_creds, linux_shadow, rsa, linux_passwd, dsa, ec, pgp, cert, itin, swift_code, and cve.
# Sample Report - No identification of actual persons or places is # intended or should be inferred
74323 Julie Field Lake Joshuamouth, OR 30055-3905 1-196-191-4438x974 53001 Paul Union New John, HI 94740 Mastercard Amanda Wells 5135725008183484 09/26 CVV: 550
354-70-6172 242 George Plaza East Lawrencefurt, VA 37287-7620 GB73WAUS0628038988364 587 Silva Village Pearsonburgh, NM 11616-7231 LDNM1948227117807 American Express Brett Garza 347965534580275 05/20 CID: 4758
599.335.2742 JCB 15 digit Michael Arias 210069190253121 03/27 CVC: 861
Create your amazon-macie-activity-generator CloudFormation stack
You can deploy AMG in your AWS account by using either these methods:
Log in to the AWS Console in a region supported by Amazon Macie, which currently includes US East (N. Virginia), US West (Oregon).
Select the One-click CloudFormation launch stack, or launch CloudFormation using the template above.
Read our terms, select the Acknowledgement box, and then select Create.
Creating the data takes a few minutes, and you can periodically refresh CloudWatch to track progress.
Add the new sample data to Macie
Now, I’ll log into the Macie console and add the newly created sample data buckets for analysis by Macie.
Note: If you don’t explicitly specify a bucket for S3 targets in CloudFormation, AMG will use the S3 bucket that’s created by default for the stack, which will be printed out in the CloudFormation stack’s output.
To add buckets for data classification, follow these steps:
Log in to Amazon Macie.
Select Integrations, and then select Services.
Select your account, and then select Details from the Amazon S3 card.
Select your newly created buckets for Full classification, including existing data.
For additional details on configuring Macie, refer to our getting started documentation.
Macie classifies all historical and newly created data in the buckets created by AMG, and the data will be available in the Macie console as it’s classified. Typically, you can expect the data in the sample set to be classified within 60 minutes of the time it was selected for analysis.
Classifying objects with Macie
To see the objects in your test sample set, in Macie, open the Research tab, and then select the S3 Objects index. We’ll use the regular expression search capability in Macie to find any objects written to buckets that start with “amazon-macie-activity-generator-defaults3bucket”. To search for this, type the following text into the Macie search box and select the magnifying glass icon.
From here, you can see a nice breakdown of the kinds of objects that have been classified by Macie, as well as the object-specific details. Create an advanced search using Lucene Query Syntax, and save it as an alert to be matched against any newly created data.
Analyzing accesses to your test data
In addition to classifying data, Macie tracks all control plane and data plane accesses to your content using CloudTrail. To see accesses to your generated environment (created periodically by AMG to mimic user activity), on the Macie navigation bar, select Research, select the CloudTrail data index, and then use the following search to identify our generated role activity:
From this search, you can dive into the user activity (IAM users, assumed roles, federated users, and so on), which is summarized in 5-minute aggregations (user sessions). For example, in the screen shot you can see that one of our AMG-generated users listed objects one time (ListObjects) and wrote 56 objects to S3 (PutObject) during a 5-minute period.
Macie features both predictive (machine learning-based) and basic (rule-based) alerts, including alerts on unencrypted credentials being uploaded to S3 (because this activity might not follow compliance best practices), risky activity such as data exfiltration, and user-defined alerts that are based on saved searches. To see alerts that have been generated based on AMG‘s activity, on the Macie navigation bar, select Alerts.
AMG will continue to run, periodically uploading content to the specified S3 buckets. To stop AMG, delete the AMG CloudFormation stack and associated resources here.
What are the costs?
Macie has a free tier enabling up to 1GB of content to be analyzed per month at no cost to you. By default, AMG will write approximately 10MB of objects to Amazon S3 per day, and you will incur charges for data classification after crossing the 1GB monthly free tier. Running continuously, AMG will generate about 310MB of content per month (10MB/day x 31 days), which will stay below the free tier. Any data use above 1GB will be billed at the Macie public price of $5/GB. For more detail, see the Macie pricing documentation.
If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the Amazon Macie forum or contact AWS Support.
“Syzbot” is an automated system that runs the syzkaller fuzzer on the kernel and reports the resulting crashes. Dmitry Vyukov has announced the availability of a web site displaying the outstanding reports. “The dashboard shows info about active bugs reported by syzbot. There are ~130 active bugs and I think ~2/3 of them are actionable (still happen and have a reproducer or are simple enough to debug).”
Earlier this year we made the AWS SDK developer guides available as GitHub repos (all found within the awsdocs organization) and invited interested parties to contribute changes and improvements in the form of pull requests.
Today we are adding over 138 additional developer and user guides to the organization, and we are looking forward to receiving your requests. You can fix bugs, improve code samples (or submit new ones), add detail, and rewrite sentences and paragraphs in the interest of accuracy or clarity. You can also look at the commit history in order to learn more about new feature and service launches and to track improvements to the documents.
Once you find something to change or improve, visit the HTML version of the document and click on Edit on GitHub button at the top of the page:
This will allow you to edit the document in source form (typically Markdown or reStructuredText). The source code is used to produce the HTML, PDF, and Kindle versions of the documentation.
Once you are in GitHub, click on the pencil icon:
This creates a “fork” — a separate copy of the file that you can edit in isolation.
Next, make an edit. In general, as a new contributor to an open source project, you should gain experience and build your reputation by making small, high-quality edits. I’ll change “dozens of services” to “over one hundred services” in this document:
Then I summarize my change and click Propose file change:
I examine the differences to verify my changes and then click Create pull request:
Then I review the details and click Create pull request again:
The pull request (also known as a PR) makes its way to the Elastic Beanstalk documentation team and they get to decide if they want to accept it, reject it, or to engage in a conversation with me to learn more. The teams endeavor to respond to PRs within 48 hours, and I’ll be notified via GitHub whenever the status of the PR changes.
As is the case with most open source projects, a steady stream of focused, modest-sized pull requests is preferable to the occasional king-sized request with dozens of edits inside.
If I am interested in tracking changes to a repo over time, I can Watch and/or Star it:
If I Watch a repo, I’ll receive an email whenever there’s a new release, issue, or pull request for that service guide.
Go Fork It This launch gives you another way to help us to improve AWS. Let me know what you think!
Bug bounties end up in the news with some regularity, usually for the wrong reasons. I’ve been itching to write about that for a while – but instead of dwelling on the mistakes of the bygone days, I figured it may be better to talk about some of the ways to get vulnerability rewards right.
What do you get out of bug bounties?
There’s plenty of differing views, but I like to think of such programs simply as a bid on researchers’ time. In the most basic sense, you get three benefits:
Improved ability to detect bugs in production before they become major incidents.
A comparatively unbiased feedback loop to help you prioritize and measure other security work.
A robust talent pipeline for when you need to hire.
What bug bounties don’t offer?
You don’t get anything resembling a comprehensive security program or a systematic assessment of your platforms. Researchers end up looking for bugs that offer favorable effort-to-payoff ratios for their skills and given the very imperfect information they have about your enterprise. In other words, you may end up with a hundred people looking for XSS and just one person looking for RCE.
Your reward structure can steer them toward the targets and bugs you care about, but it’s difficult to fully eliminate this inherent skew. There’s only so far you can jack up your top-tier rewards, and only so far you can go lowering the bottom-tier ones.
Don’t you have to outcompete the black market to get all the “good” bugs?
There is a free market price discovery component to it all: if you’re not getting the engagement you were hoping for, you should probably consider paying more.
That said, there are going to be researchers who’d rather hurt you than work for you, no matter how much you pay; you don’t have to win them over, and you don’t have to outspend every authoritarian government or every crime syndicate. A bug bounty is effective simply if it attracts enough eyeballs to make bugs statistically harder to find, and reduces the useful lifespan of any zero-days in black market trade. Plus, most researchers don’t want their work to be used to crack down on dissidents in Egypt or Vietnam.
Another factor is that you’re paying for different things: a black market buyer probably wants a reliable exploit capable of delivering payloads, and then demands silence for months or years to come; a vendor-run bug bounty program is usually perfectly happy with a reproducible crash and doesn’t mind a researcher blogging about their work.
In fact, while money is important, you will probably find out that it’s not enough to retain your top talent; many folks want bug bounties to be more than a business transaction, and find a lot of value in having a close relationship with your security team, comparing notes, and growing together. Fostering that partnership can be more important than adding another $10,000 to your top reward.
How do I prevent it all from going horribly wrong?
Bug bounties are an unfamiliar beast to most lawyers and PR folks, so it’s a natural to be wary and try to plan for every eventuality with pages and pages of impenetrable rules and fine-print legalese.
This is generally unnecessary: there is a strong self-selection bias, and almost every participant in a vulnerability reward program will be coming to you in good faith. The more friendly, forthcoming, and approachable you seem, and the more you treat them like peers, the more likely it is for your relationship to stay positive. On the flip side, there is no faster way to make enemies than to make a security researcher feel that they are now talking to a lawyer or to the PR dept.
Most people have strong opinions on disclosure policies; instead of imposing your own views, strive to patch reported bugs reasonably quickly, and almost every reporter will play along. Demand researchers to cancel conference appearances, take down blog posts, or sign NDAs, and you will sooner or later end up in the news.
But what if that’s not enough?
As with any business endeavor, mistakes will happen; total risk avoidance is seldom the answer. Learn to sincerely apologize for mishaps; it’s not a sign of weakness to say “sorry, we messed up”. And you will almost certainly not end up in the courtroom for doing so.
It’s good to foster a healthy and productive relationship with the community, so that they come to your defense when something goes wrong. Encouraging people to disclose bugs and talk about their experiences is one way of accomplishing that.
What about extortion?
You should structure your program to naturally discourage bad behavior and make it stand out like a sore thumb. Require bona fide reports with complete technical details before any reward decision is made by a panel of named peers; and make it clear that you never demand non-disclosure as a condition of getting a reward.
To avoid researchers accidentally putting themselves in awkward situations, have clear rules around data exfiltration and lateral movement: assure them that you will always pay based on the worst-case impact of their findings; in exchange, ask them to stop as soon as they get a shell and never access any data that isn’t their own.
So… are there any downsides?
Yep. Other than souring up your relationship with the community if you implement your program wrong, the other consideration is that bug bounties tend to generate a lot of noise from well-meaning but less-skilled researchers.
When this happens, do not get frustrated and do not penalize such participants; instead, help them grow. Consider publishing educational articles, giving advice on how to investigate and structure reports, or offering free workshops every now and then.
The other downside is cost; although bug bounties tend to offer far more bang for your buck than your average penetration test, they are more random. The annual expenses tend to be fairly predictable, but there is always some possibility of having to pay multiple top-tier rewards in rapid succession. This is the kind of uncertainty that many mid-level budget planners react badly to.
Finally, you need to be able to fix the bugs you receive. It would be nuts to prefer to not know about the vulnerabilities in the first place – but once you invite the research, the clock starts ticking and you need to ship fixes reasonably fast.
So… should I try it?
There are folks who enthusiastically advocate for bug bounties in every conceivable situation, and people who dislike them with fierce passion; both sentiments are usually strongly correlated with the line of business they are in.
In reality, bug bounties are not a cure-all, and there are some ways to make them ineffectual or even dangerous. But they are not as risky or expensive as most people suspect, and when done right, they can actually be fun for your team, too. You won’t know for sure until you try.
The fourth update to the Ubuntu 16.04 long-term support distribution has been released; it is available from the “Get Ubuntu” web page. “As usual, this point release includes many updates, and updated installation media has been provided so that fewer updates will need to be downloaded after installation. These include security updates and corrections for other high-impact bugs, with a focus on maintaining stability and compatibility with Ubuntu 16.04 LTS.
Kubuntu 16.04.4 LTS, Xubuntu 16.04.4 LTS, Mythbuntu 16.04.4 LTS, Ubuntu GNOME 16.04.4 LTS, Lubuntu 16.04.4 LTS, Ubuntu Kylin 16.04.4 LTS, Ubuntu MATE 16.04.4 LTS and Ubuntu Studio 16.04.4 LTS are also now available.” Information about what has changed can be found in the overall release notes and in the release notes for the various Ubuntu flavors.
Product security is an interesting animal: it is a uniquely cross-disciplinary endeavor that spans policy, consulting, process automation, in-depth software engineering, and cutting-edge vulnerability research. And in contrast to many other specializations in our field of expertise – say, incident response or network security – we have virtually no time-tested and coherent frameworks for setting it up within a company of any size.
In my previous post, I shared some thoughts on nurturing technical organizations and cultivating the right kind of leadership within. Today, I figured it would be fitting to follow up with several notes on what I learned about structuring product security work – and about actually making the effort count.
The “comfort zone” trap
For security engineers, knowing your limits is a sought-after quality: there is nothing more dangerous than a security expert who goes off script and starts dispensing authoritatively-sounding but bogus advice on a topic they know very little about. But that same quality can be destructive when it prevents us from growing beyond our most familiar role: that of a critic who pokes holes in other people’s designs.
The role of a resident security critic lends itself all too easily to a sense of supremacy: the mistaken belief that our cognitive skills exceed the capabilities of the engineers and product managers who come to us for help – and that the cool bugs we file are the ultimate proof of our special gift. We start taking pride in the mere act of breaking somebody else’s software – and then write scathing but ineffectual critiques addressed to executives, demanding that they either put a stop to a project or sign off on a risk. And hey, in the latter case, they better brace for our triumphant “I told you so” at some later date.
Of course, escalations of this type have their place, but they need to be a very rare sight; when practiced routinely, they are a telltale sign of a dysfunctional team. We might be failing to think up viable alternatives that are in tune with business or engineering needs; we might be very unpersuasive, failing to communicate with other rational people in a language they understand; or it might be that our tolerance for risk is badly out of whack with the rest of the company. Whatever the cause, I’ve seen high-level escalations where the security team spoke of valiant efforts to resist inexplicably awful design decisions or data sharing setups; and where product leads in turn talked about pressing business needs randomly blocked by obstinate security folks. Sometimes, simply having them compare their notes would be enough to arrive at a technical solution – such as sharing a less sensitive subset of the data at hand.
To be effective, any product security program must be rooted in a partnership with the rest of the company, focused on helping them get stuff done while eliminating or reducing security risks. To combat the toxic us-versus-them mentality, I found it helpful to have some team members with software engineering backgrounds, even if it’s the ownership of a small open-source project or so. This can broaden our horizons, helping us see that we all make the same mistakes – and that not every solution that sounds good on paper is usable once we code it up.
Getting off the treadmill
All security programs involve a good chunk of operational work. For product security, this can be a combination of product launch reviews, design consulting requests, incoming bug reports, or compliance-driven assessments of some sort. And curiously, such reactive work also has the property of gradually expanding to consume all the available resources on a team: next year is bound to bring even more review requests, even more regulatory hurdles, and even more incoming bugs to triage and fix.
Being more tractable, such routine tasks are also more readily enshrined in SDLs, SLAs, and all kinds of other official documents that are often mistaken for a mission statement that justifies the existence of our teams. Soon, instead of explaining to a developer why they should fix a particular problem right away, we end up pointing them to page 17 in our severity classification guideline, which defines that “severity 2” vulnerabilities need to be resolved within a month. Meanwhile, another policy may be telling them that they need to run a fuzzer or a web application scanner for a particular number of CPU-hours – no matter whether it makes sense or whether the job is set up right.
To run a product security program that scales sublinearly, stays abreast of future threats, and doesn’t erect bureaucratic speed bumps just for the sake of it, we need to recognize this inherent tendency for operational work to take over – and we need to reign it in. No matter what the last year’s policy says, we usually don’t need to be doing security reviews with a particular cadence or to a particular depth; if we need to scale them back 10% to staff a two-quarter project that fixes an important API and squashes an entire class of bugs, it’s a short-term risk we should feel empowered to take.
As noted in my earlier post, I find contingency planning to be a valuable tool in this regard: why not ask ourselves how the team would cope if the workload went up another 30%, but bad financial results precluded any team growth? It’s actually fun to think about such hypotheticals ahead of the time – and hey, if the ideas sound good, why not try them out today?
Living for a cause
It can be difficult to understand if our security efforts are structured and prioritized right; when faced with such uncertainty, it is natural to stick to the safe fundamentals – investing most of our resources into the very same things that everybody else in our industry appears to be focusing on today.
I think it’s important to combat this mindset – and if so, we might as well tackle it head on. Rather than focusing on tactical objectives and policy documents, try to write down a concise mission statement explaining why you are a team in the first place, what specific business outcomes you are aiming for, how do you prioritize it, and how you want it all to change in a year or two. It should be a fluid narrative that reads right and that everybody on your team can take pride in; my favorite way of starting the conversation is telling folks that we could always have a new VP tomorrow – and that the VP’s first order of business could be asking, “why do you have so many people here and how do I know they are doing the right thing?”. It’s a playful but realistic framing device that motivates people to get it done.
In general, a comprehensive product security program should probably start with the assumption that no matter how many resources we have at our disposal, we will never be able to stay in the loop on everything that’s happening across the company – and even if we did, we’re not going to be able to catch every single bug. It follows that one of our top priorities for the team should be making sure that bugs don’t happen very often; a scalable way of getting there is equipping engineers with intuitive and usable tools that make it easy to perform common tasks without having to worry about security at all. Examples include standardized, managed containers for production jobs; safe-by-default APIs, such as strict contextual autoescaping for XSS or type safety for SQL; security-conscious style guidelines; or plug-and-play libraries that take care of common crypto or ACL enforcement tasks.
Of course, not all problems can be addressed on framework level, and not every engineer will always reach for the right tools. Because of this, the next principle that I found to be worth focusing on is containment and mitigation: making sure that bugs are difficult to exploit when they happen, or that the damage is kept in check. The solutions in this space can range from low-level enhancements (say, hardened allocators or seccomp-bpf sandboxes) to client-facing features such as browser origin isolation or Content Security Policy.
The usual consulting, review, and outreach tasks are an important facet of a product security program, but probably shouldn’t be the sole focus of your team. It’s also best to avoid undue emphasis on vulnerability showmanship: while valuable in some contexts, it creates a hypercompetitive environment that may be hostile to less experienced team members – not to mention, squashing individual bugs offers very limited value if the same issue is likely to be reintroduced into the codebase the next day. I like to think of security reviews as a teaching opportunity instead: it’s a way to raise awareness, form partnerships with engineers, and help them develop lasting habits that reduce the incidence of bugs. Metrics to understand the impact of your work are important, too; if your engagements are seen mostly as a yet another layer of red tape, product teams will stop reaching out to you for advice.
The other tenet of a healthy product security effort requires us to recognize at a scale and given enough time, every defense mechanism is bound to fail – and so, we need ways to prevent bugs from turning into incidents. The efforts in this space may range from developing product-specific signals for the incident response and monitoring teams; to offering meaningful vulnerability reward programs and nourishing a healthy and respectful relationship with the research community; to organizing regular offensive exercises in hopes of spotting bugs before anybody else does.
Oh, one final note: an important feature of a healthy security program is the existence of multiple feedback loops that help you spot problems without the need to micromanage the organization and without being deathly afraid of taking chances. For example, the data coming from bug bounty programs, if analyzed correctly, offers a wonderful way to alert you to systemic problems in your codebase – and later on, to measure the impact of any remediation and hardening work.
Norbert Preining reports the sad news that Staszek Wawrykiewicz has died. “Staszek was an active member of the Polish TeX community, and an incredibly valuable TeX Live Team member. His insistence and perseverance have saved TeX Live from many disasters and bugs. Although I have been in contact with Staszek over the TeX Live mailing lists since some years, I met him in person for the first time on my first ever BachoTeX, the EuroBachoTeX 2007. His friendliness, openness to all new things, his inquisitiveness, all took a great place in my heart.” (Thanks to Paul Wise)
KDE has released Plasma 5.12.0. “Plasma 5.12 LTS is the second long-term support release from the Plasma 5 team. We have been working hard, focusing on speed and stability for this release. Boot time to desktop has been improved by reviewing the code for anything which blocks execution. The team has been triaging and fixing bugs in every aspect of the codebase, tidying up artwork, removing corner cases, and ensuring cross-desktop integration. For the first time, we offer our Wayland integration on long-term support, so you can be sure we will continue to provide bug fixes and improvements to the Wayland experience.”
Here’s a look at what’s coming on the desktop in Libre Graphics World. “After almost 6 years of work, the GIMP team is finalizing the next big update. The plan is to cut a beta of v2.10 once the amount of critical bugs falls further down: it’s currently stuck at 20, as new bugs get promoted to blockers, while old blockers get fixed. It’s a bit of an uphill battle.”
Kuhu Shukla (bottom center) and team at the 2017 DataWorks Summit
By Kuhu Shukla
This post first appeared here on the Apache Software Foundation blog as part of ASF’s “Success at Apache” monthly blog series.
As I sit at my desk on a rather frosty morning with my coffee, looking up new JIRAs from the previous day in the Apache Tez project, I feel rather pleased. The latest community release vote is complete, the bug fixes that we so badly needed are in and the new release that we tested out internally on our many thousand strong cluster is looking good. Today I am looking at a new stack trace from a different Apache project process and it is hard to miss how much of the exceptional code I get to look at every day comes from people all around the globe. A contributor leaves a JIRA comment before he goes on to pick up his kid from soccer practice while someone else wakes up to find that her effort on a bug fix for the past two months has finally come to fruition through a binding +1.
Yahoo – which joined AOL, HuffPost, Tumblr, Engadget, and many more brands to form the Verizon subsidiary Oath last year – has been at the frontier of open source adoption and contribution since before I was in high school. So while I have no historical trajectories to share, I do have a story on how I found myself in an epic journey of migrating all of Yahoo jobs from Apache MapReduce to Apache Tez, a then-new DAG based execution engine.
Oath grid infrastructure is through and through driven by Apache technologies be it storage through HDFS, resource management through YARN, job execution frameworks with Tez and user interface engines such as Hive, Hue, Pig, Sqoop, Spark, Storm. Our grid solution is specifically tailored to Oath’s business-critical data pipeline needs using the polymorphic technologies hosted, developed and maintained by the Apache community.
On the third day of my job at Yahoo in 2015, I received a YouTube link on An Introduction to Apache Tez. I watched it carefully trying to keep up with all the questions I had and recognized a few names from my academic readings of Yarn ACM papers. I continued to ramp up on YARN and HDFS, the foundational Apache technologies Oath heavily contributes to even today. For the first few weeks I spent time picking out my favorite (necessary) mailing lists to subscribe to and getting started on setting up on a pseudo-distributed Hadoop cluster. I continued to find my footing with newbie contributions and being ever more careful with whitespaces in my patches. One thing was clear – Tez was the next big thing for us. By the time I could truly call myself a contributor in the Hadoop community nearly 80-90% of the Yahoo jobs were now running with Tez. But just like hiking up the Grand Canyon, the last 20% is where all the pain was. Being a part of the solution to this challenge was a happy prospect and thankfully contributing to Tez became a goal in my next quarter.
The next sprint planning meeting ended with me getting my first major Tez assignment – progress reporting. The progress reporting in Tez was non-existent – “Just needs an API fix,” I thought. Like almost all bugs in this ecosystem, it was not easy. How do you define progress? How is it different for different kinds of outputs in a graph? The questions were many.
I, however, did not have to go far to get answers. The Tez community actively came to a newbie’s rescue, finding answers and posing important questions. I started attending the bi-weekly Tez community sync up calls and asking existing contributors and committers for course correction. Suddenly the team was much bigger, the goals much more chiseled. This was new to anyone like me who came from the networking industry, where the most open part of the code are the RFCs and the implementation details are often hidden. These meetings served as a clean room for our coding ideas and experiments. Ideas were shared, to the extent of which data structure we should pick and what a future user of Tez would take from it. In between the usual status updates and extensive knowledge transfers were made.
Oath uses Apache Pig and Apache Hive extensively and most of the urgent requirements and requests came from Pig and Hive developers and users. Each issue led to a community JIRA and as we started running Tez at Oath scale, new feature ideas and bugs around performance and resource utilization materialized. Every year most of the Hadoop team at Oath travels to the Hadoop Summit where we meet our cohorts from the Apache community and we stand for hours discussing the state of the art and what is next for the project. One such discussion set the course for the next year and a half for me.
We needed an innovative way to shuffle data. Frameworks like MapReduce and Tez have a shuffle phase in their processing lifecycle wherein the data from upstream producers is made available to downstream consumers. Even though Apache Tez was designed with a feature set corresponding to optimization requirements in Pig and Hive, the Shuffle Handler Service was retrofitted from MapReduce at the time of the project’s inception. With several thousands of jobs on our clusters leveraging these features in Tez, the Shuffle Handler Service became a clear performance bottleneck. So as we stood talking about our experience with Tez with our friends from the community, we decided to implement a new Shuffle Handler for Tez. All the conversation points were tracked now through an umbrella JIRA TEZ-3334 and the to-do list was long. I picked a few JIRAs and as I started reading through I realized, this is all new code I get to contribute to and review. There might be a better way to put this, but to be honest it was just a lot of fun! All the whiteboards were full, the team took walks post lunch and discussed how to go about defining the API. Countless hours were spent debugging hangs while fetching data and looking at stack traces and Wireshark captures from our test runs. Six months in and we had the feature on our sandbox clusters. There were moments ranging from sheer frustration to absolute exhilaration with high fives as we continued to address review comments and fixing big and small issues with this evolving feature.
As much as owning your code is valued everywhere in the software community, I would never go on to say “I did this!” In fact, “we did!” It is this strong sense of shared ownership and fluid team structure that makes the open source experience at Apache truly rewarding. This is just one example. A lot of the work that was done in Tez was leveraged by the Hive and Pig community and cross Apache product community interaction made the work ever more interesting and challenging. Triaging and fixing issues with the Tez rollout led us to hit a 100% migration score last year and we also rolled the Tez Shuffle Handler Service out to our research clusters. As of last year we have run around 100 million Tez DAGs with a total of 50 billion tasks over almost 38,000 nodes.
In 2018 as I move on to explore Hadoop 3.0 as our future release, I hope that if someone outside the Apache community is reading this, it will inspire and intrigue them to contribute to a project of their choice. As an astronomy aficionado, going from a newbie Apache contributor to a newbie Apache committer was very much like looking through my telescope － it has endless possibilities and challenges you to be your best.
About the Author:
Kuhu Shukla is a software engineer at Oath and did her Masters in Computer Science at North Carolina State University. She works on the Big Data Platforms team on Apache Tez, YARN and HDFS with a lot of talented Apache PMCs and Committers in Champaign, Illinois. A recent Apache Tez Committer herself she continues to contribute to YARN and HDFS and spoke at the 2017 Dataworks Hadoop Summit on “Tez Shuffle Handler: Shuffling At Scale With Apache Hadoop”. Prior to that she worked on Juniper Networks’ router and switch configuration APIs. She likes to participate in open source conferences and women in tech events. In her spare time she loves singing Indian classical and jazz, laughing, whale watching, hiking and peering through her Dobsonian telescope.
Here’s a blog post from “bunnie” Huang on the tension between transparency and product liability around hardware flaws. “The open source community could use the Spectre/Meltdown crisis as an opportunity to reform the status quo. Instead of suing Intel for money, what if we sue Intel for documentation? If documentation and transparency have real value, then this is a chance to finally put that value in economic terms that Intel shareholders can understand. I propose a bargain somewhere along these lines: if Intel releases comprehensive microarchitectural hardware design specifications, microcode, firmware, and all software source code (e.g. for AMT/ME) so that the community can band together to hammer out any other security bugs hiding in their hardware, then Intel is absolved of any payouts related to the Spectre/Meltdown exploits.”
Here’s a 34c3 conference report in CSO suggesting that the BSDs are losing developers. “von Sprundel says he easily found around 115 kernel bugs across the three BSDs, including 30 for FreeBSD, 25 for OpenBSD, and 60 for NetBSD. Many of these bugs he called ‘low-hanging fruit.’ He promptly reported all the bugs, but six months later, at the time of his talk, many remained unpatched.
‘By and large, most security flaws in the Linux kernel don’t have a long lifetime. They get found pretty fast,’ von Sprundel says. ‘On the BSD side, that isn’t always true. I found a bunch of bugs that have been around a very long time.’ Many of them have been present in code for a decade or more.”
The collective thoughts of the interwebz
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.