Here’s the press release announcing Microsoft’s agreement to acquire GitHub for a mere $7.5 billion. “GitHub will retain its developer-first ethos and will operate independently to provide an open platform for all developers in all industries. Developers will continue to be able to use the programming languages, tools and operating systems of their choice for their projects — and will still be able to deploy their code to any operating system, any cloud and any device.”
Last year, we released Amazon Connect, a cloud-based contact center service that enables any business to deliver better customer service at low cost. This service is built based on the same technology that empowers Amazon customer service associates. Using this system, associates have millions of conversations with customers when they inquire about their shipping or order information. Because we made it available as an AWS service, you can now enable your contact center agents to make or receive calls in a matter of minutes. You can do this without having to provision any kind of hardware. 2
There are several advantages of building your contact center in the AWS Cloud, as described in our documentation. In addition, customers can extend Amazon Connect capabilities by using AWS products and the breadth of AWS services. In this blog post, we focus on how to get analytics out of the rich set of data published by Amazon Connect. We make use of an Amazon Connect data stream and create an end-to-end workflow to offer an analytical solution that can be customized based on need.
Solution overview
The following diagram illustrates the solution.
In this solution, Amazon Connect exports its contact trace records (CTRs) using Amazon Kinesis. CTRs are data streams in JSON format, and each has information about individual contacts. For example, this information might include the start and end time of a call, which agent handled the call, which queue the user chose, queue wait times, number of holds, and so on. You can enable this feature by reviewing our documentation.
In this architecture, we use Kinesis Firehose to capture Amazon Connect CTRs as raw data in an Amazon S3 bucket. We don’t use the recent feature added by Kinesis Firehose to save the data in S3 as Apache Parquet format. We use AWS Glue functionality to automatically detect the schema on the fly from an Amazon Connect data stream.
The primary reason for this approach is that it allows us to use attributes and enables an Amazon Connect administrator to dynamically add more fields as needed. Also by converting data to parquet in batch (every couple of hours) compression can be higher. However, if your requirement is to ingest the data in Parquet format on realtime, we recoment using Kinesis Firehose recently launched feature. You can review this blog post for further information.
By default, Firehose puts these records in time-series format. To make it easy for AWS Glue crawlers to capture information from new records, we use AWS Lambda to move all new records to a single S3 prefix called flatfiles. Our Lambda function is configured using S3 event notification. To comply with AWS Glue and Athena best practices, the Lambda function also converts all column names to lowercase. Finally, we also use the Lambda function to start AWS Glue crawlers. AWS Glue crawlers identify the data schema and update the AWS Glue Data Catalog, which is used by extract, transform, load (ETL) jobs in AWS Glue in the latter half of the workflow.
You can see our approach in the Lambda code following.
from __future__ import print_function
import json
import urllib
import boto3
import os
import re
s3 = boto3.resource('s3')
client = boto3.client('s3')
def convertColumntoLowwerCaps(obj):
for key in obj.keys():
new_key = re.sub(r'[\W]+', '', key.lower())
v = obj[key]
if isinstance(v, dict):
if len(v) > 0:
convertColumntoLowwerCaps(v)
if new_key != key:
obj[new_key] = obj[key]
del obj[key]
return obj
def lambda_handler(event, context):
bucket = event['Records'][0]['s3']['bucket']['name']
key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key'].encode('utf8'))
try:
client.download_file(bucket, key, '/tmp/file.json')
with open('/tmp/out.json', 'w') as output, open('/tmp/file.json', 'rb') as file:
i = 0
for line in file:
for object in line.replace("}{","}\n{").split("\n"):
record = json.loads(object,object_hook=convertColumntoLowwerCaps)
if i != 0:
output.write("\n")
output.write(json.dumps(record))
i += 1
newkey = 'flatfiles/' + key.replace("/", "")
client.upload_file('/tmp/out.json', bucket,newkey)
s3.Object(bucket,key).delete()
return "success"
except Exception as e:
print(e)
print('Error coping object {} from bucket {}'.format(key, bucket))
raise e
We trigger AWS Glue crawlers based on events because this approach lets us capture any new data frame that we want to be dynamic in nature. CTR attributes are designed to offer multiple custom options based on a particular call flow. Attributes are essentially key-value pairs in nested JSON format. With the help of event-based AWS Glue crawlers, you can easily identify newer attributes automatically.
We recommend setting up an S3 lifecycle policy on the flatfiles folder that keeps records only for 24 hours. Doing this optimizes AWS Glue ETL jobs to process a subset of files rather than the entire set of records.
After we have data in the flatfiles folder, we use AWS Glue to catalog the data and transform it into Parquet format inside a folder called parquet/ctr/. The AWS Glue job performs the ETL that transforms the data from JSON to Parquet format. We use AWS Glue crawlers to capture any new data frame inside the JSON code that we want to be dynamic in nature. What this means is that when you add new attributes to an Amazon Connect instance, the solution automatically recognizes them and incorporates them in the schema of the results.
After AWS Glue stores the results in Parquet format, you can perform analytics using Amazon Redshift Spectrum, Amazon Athena, or any third-party data warehouse platform. To keep this solution simple, we have used Amazon Athena for analytics. Amazon Athena allows us to query data without having to set up and manage any servers or data warehouse platforms. Additionally, we only pay for the queries that are executed.
Try it out!
You can get started with our sample AWS CloudFormation template. This template creates the components starting from the Kinesis stream and finishes up with S3 buckets, the AWS Glue job, and crawlers. To deploy the template, open the AWS Management Console by clicking the following link.
In the console, specify the following parameters:
BucketName: The name for the bucket to store all the solution files. This name must be unique; if it’s not, template creation fails.
etlJobSchedule: The schedule in cron format indicating how often the AWS Glue job runs. The default value is every hour.
KinesisStreamName: The name of the Kinesis stream to receive data from Amazon Connect. This name must be different from any other Kinesis stream created in your AWS account.
s3interval: The interval in seconds for Kinesis Firehose to save data inside the flatfiles folder on S3. The value must between 60 and 900 seconds.
sampledata: When this parameter is set to true, sample CTR records are used. Doing this lets you try this solution without setting up an Amazon Connect instance. All examples in this walkthrough use this sample data.
Select the “I acknowledge that AWS CloudFormation might create IAM resources.” check box, and then choose Create. After the template finishes creating resources, you can see the stream name on the stack Outputs tab.
If you haven’t created your Amazon Connect instance, you can do so by following the Getting Started Guide. When you are done creating, choose your Amazon Connect instance in the console, which takes you to instance settings. Choose Data streaming to enable streaming for CTR records. Here, you can choose the Kinesis stream (defined in the KinesisStreamName parameter) that was created by the CloudFormation template.
Now it’s time to generate the data by making or receiving calls by using Amazon Connect. You can go to Amazon Connect Cloud Control Panel (CCP) to make or receive calls using a software phone or desktop phone. After a few minutes, we should see data inside the flatfiles folder. To make it easier to try this solution, we provide sample data that you can enable by setting the sampledata parameter to true in your CloudFormation template.
You can navigate to the AWS Glue console by choosing Jobs on the left navigation pane of the console. We can select our job here. In my case, the job created by CloudFormation is called glueJob-i3TULzVtP1W0; yours should be similar. You run the job by choosing Run job for Action.
After that, we wait for the AWS Glue job to run and to finish successfully. We can track the status of the job by checking the History tab.
When the job finishes running, we can check the Database section. There should be a new table created called ctr in Parquet format.
To query the data with Athena, we can select the ctr table, and for Action choose View data.
Doing this takes us to the Athena console. If you run a query, Athena shows a preview of the data.
When we can query the data using Athena, we can visualize it using Amazon QuickSight. Before connecting Amazon QuickSight to Athena, we must make sure to grant Amazon QuickSight access to Athena and the associated S3 buckets in the account. For more information on doing this, see Managing Amazon QuickSight Permissions to AWS Resources in the Amazon QuickSight User Guide. We can then create a new data set in Amazon QuickSight based on the Athena table that was created.
After setting up permissions, we can create a new analysis in Amazon QuickSight by choosing New analysis.
Then we add a new data set.
We choose Athena as the source and give the data source a name (in this case, I named it connectctr).
Choose the name of the database and the table referencing the Parquet results.
Then choose Visualize.
After that, we should see the following screen.
Now we can create some visualizations. First, search for the agent.username column, and drag it to the AutoGraph section.
We can see the agents and the number of calls for each, so we can easily see which agents have taken the largest amount of calls. If we want to see from what queues the calls came for each agent, we can add the queue.arn column to the visual.
After following all these steps, you can use Amazon QuickSight to add different columns from the call records and perform different types of visualizations. You can build dashboards that continuously monitor your connect instance. You can share those dashboards with others in your organization who might need to see this data.
Conclusion
In this post, you see how you can use services like AWS Lambda, AWS Glue, and Amazon Athena to process Amazon Connect call records. The post also demonstrates how to use AWS Lambda to preprocess files in Amazon S3 and transform them into a format that recognized by AWS Glue crawlers. Finally, the post shows how to used Amazon QuickSight to perform visualizations.
You can use the provided template to analyze your own contact center instance. Or you can take the CloudFormation template and modify it to process other data streams that can be ingested using Amazon Kinesis or stored on Amazon S3.
Luis Caro is a Big Data Consultant for AWS Professional Services. He works with our customers to provide guidance and technical assistance on big data projects, helping them improving the value of their solutions when using AWS.
Peter Dalbhanjan is a Solutions Architect for AWS based in Herndon, VA. Peter has a keen interest in evangelizing AWS solutions and has written multiple blog posts that focus on simplifying complex use cases. At AWS, Peter helps with designing and architecting variety of customer workloads.
Backblaze is hiring a Director of Sales. This is a critical role for Backblaze as we continue to grow the team. We need a strong leader who has experience in scaling a sales team and who has an excellent track record for exceeding goals by selling Software as a Service (SaaS) solutions. In addition, this leader will need to be highly motivated, as well as able to create and develop a highly-motivated, success oriented sales team that has fun and enjoys what they do.
The History of Backblaze from our CEO In 2007, after a friend’s computer crash caused her some suffering, we realized that with every photo, video, song, and document going digital, everyone would eventually lose all of their information. Five of us quit our jobs to start a company with the goal of making it easy for people to back up their data.
Like many startups, for a while we worked out of a co-founder’s one-bedroom apartment. Unlike most startups, we made an explicit agreement not to raise funding during the first year. We would then touch base every six months and decide whether to raise or not. We wanted to focus on building the company and the product, not on pitching and slide decks. And critically, we wanted to build a culture that understood money comes from customers, not the magical VC giving tree. Over the course of 5 years we built a profitable, multi-million dollar revenue business — and only then did we raise a VC round.
Fast forward 10 years later and our world looks quite different. You’ll have some fantastic assets to work with:
A brand millions recognize for openness, ease-of-use, and affordability.
A computer backup service that stores over 500 petabytes of data, has recovered over 30 billion files for hundreds of thousands of paying customers — most of whom self-identify as being the people that find and recommend technology products to their friends.
Our B2 service that provides the lowest cost cloud storage on the planet at 1/4th the price Amazon, Google or Microsoft charges. While being a newer product on the market, it already has over 100,000 IT and developers signed up as well as an ecosystem building up around it.
A growing, profitable and cash-flow positive company.
And last, but most definitely not least: a great sales team.
You might be saying, “sounds like you’ve got this under control — why do you need me?” Don’t be misled. We need you. Here’s why:
We have a great team, but we are in the process of expanding and we need to develop a structure that will easily scale and provide the most success to drive revenue.
We just launched our outbound sales efforts and we need someone to help develop that into a fully successful program that’s building a strong pipeline and closing business.
We need someone to work with the marketing department and figure out how to generate more inbound opportunities that the sales team can follow up on and close.
We need someone who will work closely in developing the skills of our current sales team and build a path for career growth and advancement.
We want someone to manage our Customer Success program.
So that’s a bit about us. What are we looking for in you?
Experience: As a sales leader, you will strategically build and drive the territory’s sales pipeline by assembling and leading a skilled team of sales professionals. This leader should be familiar with generating, developing and closing software subscription (SaaS) opportunities. We are looking for a self-starter who can manage a team and make an immediate impact of selling our Backup and Cloud Storage solutions. In this role, the sales leader will work closely with the VP of Sales, marketing staff, and service staff to develop and implement specific strategic plans to achieve and exceed revenue targets, including new business acquisition as well as build out our customer success program.
Leadership: We have an experienced team who’s brought us to where we are today. You need to have the people and management skills to get them excited about working with you. You need to be a strong leader and compassionate about developing and supporting your team.
Data driven and creative: The data has to show something makes sense before we scale it up. However, without creativity, it’s easy to say “the data shows it’s impossible” or to find a local maximum. Whether it’s deciding how to scale the team, figuring out what our outbound sales efforts should look like or putting a plan in place to develop the team for career growth, we’ve seen a bit of creativity get us places a few extra dollars couldn’t.
Jive with our culture: Strong leaders affect culture and the person we hire for this role may well shape, not only fit into, ours. But to shape the culture you have to be accepted by the organism, which means a certain set of shared values. We default to openness with our team, our customers, and everyone if possible. We love initiative — without arrogance or dictatorship. We work to create a place people enjoy showing up to work. That doesn’t mean ping pong tables and foosball (though we do try to have perks & fun), but it means people are friendly, non-political, working to build a good service but also a good place to work.
Do the work: Ideas and strategy are critical, but good execution makes them happen. We’re looking for someone who can help the team execute both from the perspective of being capable of guiding and organizing, but also someone who is hands-on themselves.
Additional Responsibilities needed for this role:
Recruit, coach, mentor, manage and lead a team of sales professionals to achieve yearly sales targets. This includes closing new business and expanding upon existing clientele.
Expand the customer success program to provide the best customer experience possible resulting in upsell opportunities and a high retention rate.
Develop effective sales strategies and deliver compelling product demonstrations and sales pitches.
Acquire and develop the appropriate sales tools to make the team efficient in their daily work flow.
Apply a thorough understanding of the marketplace, industry trends, funding developments, and products to all management activities and strategic sales decisions.
Ensure that sales department operations function smoothly, with the goal of facilitating sales and/or closings; operational responsibilities include accurate pipeline reporting and sales forecasts.
This position will report directly to the VP of Sales and will be staffed in our headquarters in San Mateo, CA.
Requirements:
7 – 10+ years of successful sales leadership experience as measured by sales performance against goals. Experience in developing skill sets and providing career growth and opportunities through advancement of team members.
Background in selling SaaS technologies with a strong track record of success.
Strong presentation and communication skills.
Must be able to travel occasionally nationwide.
BA/BS degree required
Think you want to join us on this adventure? Send an email to jobscontact@backblaze.com with the subject “Director of Sales.” (Recruiters and agencies, please don’t email us.) Include a resume and answer these two questions:
How would you approach evaluating the current sales team and what is your process for developing a growth strategy to scale the team?
What are the goals you would set for yourself in the 3 month and 1-year timeframes?
Thank you for taking the time to read this and I hope that this sounds like the opportunity for which you’ve been waiting.
Amazon Neptune is now Generally Available in US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland). Amazon Neptune is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets. At the core of Neptune is a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with millisecond latencies. Neptune supports two popular graph models, Property Graph and RDF, through Apache TinkerPop Gremlin and SPARQL, allowing you to easily build queries that efficiently navigate highly connected datasets. Neptune can be used to power everything from recommendation engines and knowledge graphs to drug discovery and network security. Neptune is fully-managed with automatic minor version upgrades, backups, encryption, and fail-over. I wrote about Neptune in detail for AWS re:Invent last year and customers have been using the preview and providing great feedback that the team has used to prepare the service for GA.
Now that Amazon Neptune is generally available there are a few changes from the preview:
A large number of performance enhancements and updates
Launching a Neptune cluster is as easy as navigating to the AWS Management Console and clicking create cluster. Of course you can also launch with CloudFormation, the CLI, or the SDKs.
You can monitor your cluster health and the health of individual instances through Amazon CloudWatch and the console.
Additional Resources
We’ve created two repos with some additional tools and examples here. You can expect continuous development on these repos as we add additional tools and examples.
Amazon Neptune Tools Repo This repo has a useful tool for converting GraphML files into Neptune compatible CSVs for bulk loading from S3.
Amazon Neptune Samples Repo This repo has a really cool example of building a collaborative filtering recommendation engine for video game preferences.
Purpose Built Databases
There’s an industry trend where we’re moving more and more onto purpose-built databases. Developers and businesses want to access their data in the format that makes the most sense for their applications. As cloud resources make transforming large datasets easier with tools like AWS Glue, we have a lot more options than we used to for accessing our data. With tools like Amazon Redshift, Amazon Athena, Amazon Aurora, Amazon DynamoDB, and more we get to choose the best database for the job or even enable entirely new use-cases. Amazon Neptune is perfect for workloads where the data is highly connected across data rich edges.
I’m really excited about graph databases and I see a huge number of applications. Looking for ideas of cool things to build? I’d love to build a web crawler in AWS Lambda that uses Neptune as the backing store. You could further enrich it by running Amazon Comprehend or Amazon Rekognition on the text and images found and creating a search engine on top of Neptune.
As always, feel free to reach out in the comments or on twitter to provide any feedback!
“Most commonly we have unsolicited calls to potential victims in Australia, purporting to represent the people in authority in China and suggesting to intending victims here they have been involved in some sort of offence in China or elsewhere, for which they’re being held responsible,” Commander McLean said.
The scammers threaten the students with deportation from Australia or some kind of criminal punishment.
The victims are then coerced into providing their identification details or money to get out of the supposed trouble they’re in.
Commander McLean said there are also cases where the student is told they have to hide in a hotel room, provide compromising photos of themselves and cut off all contact.
This simulates a kidnapping.
“So having tricked the victims in Australia into providing the photographs, and money and documents and other things, they then present the information back to the unknowing families in China to suggest that their children who are abroad are in trouble,” Commander McLean said.
“So quite circular in a sense…very skilled, very cunning.”
This post courtesy of Thiago Morais, AWS Solutions Architect
When you build web applications or expose any data externally, you probably look for a platform where you can build highly scalable, secure, and robust REST APIs. As APIs are publicly exposed, there are a number of best practices for providing a secure mechanism to consumers using your API.
Amazon API Gateway handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, authorization and access control, monitoring, and API version management.
In this post, I show you how to take advantage of the regional API endpoint feature in API Gateway, so that you can create your own Amazon CloudFront distribution and secure your API using AWS WAF.
AWS WAF is a web application firewall that helps protect your web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources.
As you make your APIs publicly available, you are exposed to attackers trying to exploit your services in several ways. The AWS security team published a whitepaper solution using AWS WAF, How to Mitigate OWASP’s Top 10 Web Application Vulnerabilities.
Regional API endpoints
Edge-optimized APIs are endpoints that are accessed through a CloudFront distribution created and managed by API Gateway. Before the launch of regional API endpoints, this was the default option when creating APIs using API Gateway. It primarily helped to reduce latency for API consumers that were located in different geographical locations than your API.
When API requests predominantly originate from an Amazon EC2 instance or other services within the same AWS Region as the API is deployed, a regional API endpoint typically lowers the latency of connections. It is recommended for such scenarios.
For better control around caching strategies, customers can use their own CloudFront distribution for regional APIs. They also have the ability to use AWS WAF protection, as I describe in this post.
Edge-optimized API endpoint
The following diagram is an illustrated example of the edge-optimized API endpoint where your API clients access your API through a CloudFront distribution created and managed by API Gateway.
Regional API endpoint
For the regional API endpoint, your customers access your API from the same Region in which your REST API is deployed. This helps you to reduce request latency and particularly allows you to add your own content delivery network, as needed.
Walkthrough
In this section, you implement the following steps:
Attach the web ACL to the CloudFront distribution.
Test AWS WAF protection.
Create the regional API
For this walkthrough, use an existing PetStore API. All new APIs launch by default as the regional endpoint type. To change the endpoint type for your existing API, choose the cog icon on the top right corner:
After you have created the PetStore API on your account, deploy a stage called “prod” for the PetStore API.
On the API Gateway console, select the PetStore API and choose Actions, Deploy API.
For Stage name, type prod and add a stage description.
Choose Deploy and the new API stage is created.
Use the following AWS CLI command to update your API from edge-optimized to regional:
{
"description": "Your first API with Amazon API Gateway. This is a sample API that integrates via HTTP with your demo Pet Store endpoints",
"createdDate": 1511525626,
"endpointConfiguration": {
"types": [
"REGIONAL"
]
},
"id": "{api-id}",
"name": "PetStore"
}
After you change your API endpoint to regional, you can now assign your own CloudFront distribution to this API.
Create a CloudFront distribution
To make things easier, I have provided an AWS CloudFormation template to deploy a CloudFront distribution pointing to the API that you just created. Click the button to deploy the template in the us-east-1 Region.
For Stack name, enter RegionalAPI. For APIGWEndpoint, enter your API FQDN in the following format:
{api-id}.execute-api.us-east-1.amazonaws.com
After you fill out the parameters, choose Next to continue the stack deployment. It takes a couple of minutes to finish the deployment. After it finishes, the Output tab lists the following items:
A CloudFront domain URL
An S3 bucket for CloudFront access logs
Output from CloudFormation
Test the CloudFront distribution
To see if the CloudFront distribution was configured correctly, use a web browser and enter the URL from your distribution, with the following parameters:
With the new CloudFront distribution in place, you can now start setting up AWS WAF to protect your API.
For this demo, you deploy the AWS WAF Security Automations solution, which provides fine-grained control over the requests attempting to access your API.
For more information about deployment, see Automated Deployment. If you prefer, you can launch the solution directly into your account using the following button.
For CloudFront Access Log Bucket Name, add the name of the bucket created during the deployment of the CloudFormation stack for your CloudFront distribution.
The solution allows you to adjust thresholds and also choose which automations to enable to protect your API. After you finish configuring these settings, choose Next.
To start the deployment process in your account, follow the creation wizard and choose Create. It takes a few minutes do finish the deployment. You can follow the creation process through the CloudFormation console.
After the deployment finishes, you can see the new web ACL deployed on the AWS WAF console, AWSWAFSecurityAutomations.
Attach the AWS WAF web ACL to the CloudFront distribution
With the solution deployed, you can now attach the AWS WAF web ACL to the CloudFront distribution that you created earlier.
To assign the newly created AWS WAF web ACL, go back to your CloudFront distribution. After you open your distribution for editing, choose General, Edit.
Select the new AWS WAF web ACL that you created earlier, AWSWAFSecurityAutomations.
Save the changes to your CloudFront distribution and wait for the deployment to finish.
Test AWS WAF protection
To validate the AWS WAF Web ACL setup, use Artillery to load test your API and see AWS WAF in action.
To install Artillery on your machine, run the following command:
$ npm install -g artillery
After the installation completes, you can check if Artillery installed successfully by running the following command:
$ artillery -V
$ 1.6.0-12
As the time of publication, Artillery is on version 1.6.0-12.
One of the WAF web ACL rules that you have set up is a rate-based rule. By default, it is set up to block any requesters that exceed 2000 requests under 5 minutes. Try this out.
First, use cURL to query your distribution and see the API output:
What you are doing is firing 2000 requests to your API from 10 concurrent users. For brevity, I am not posting the Artillery output here.
After Artillery finishes its execution, try to run the cURL request again and see what happens:
$ curl -s https://{distribution-name}.cloudfront.net/prod/pets
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>ERROR: The request could not be satisfied</TITLE>
</HEAD><BODY>
<H1>ERROR</H1>
<H2>The request could not be satisfied.</H2>
<HR noshade size="1px">
Request blocked.
<BR clear="all">
<HR noshade size="1px">
<PRE>
Generated by cloudfront (CloudFront)
Request ID: [removed]
</PRE>
<ADDRESS>
</ADDRESS>
</BODY></HTML>
As you can see from the output above, the request was blocked by AWS WAF. Your IP address is removed from the blocked list after it falls below the request limit rate.
Conclusion
In this first part, you saw how to use the new API Gateway regional API endpoint together with Amazon CloudFront and AWS WAF to secure your API from a series of attacks.
In the second part, I will demonstrate some other techniques to protect your API using API keys and Amazon CloudFront custom headers.
We’re usually averse to buzzwords at HackSpace magazine, but not this month: in issue 7, we’re taking a deep dive into the Internet of Things.
Internet of Things (IoT)
To many people, IoT is a shady term used by companies to sell you something you already own, but this time with WiFi; to us, it’s a way to make our builds smarter, more useful, and more connected. In HackSpace magazine #7, you can join us on a tour of the boards that power IoT projects, marvel at the ways in which other makers are using IoT, and get started with your first IoT project!
Awesome projects
DIY retro computing: this issue, we’re taking our collective hat off to Spencer Owen. He stuck his home-brew computer on Tindie thinking he might make a bit of beer money — now he’s paying the mortgage with his making skills and inviting others to build modules for his machine. And if that tickles your fancy, why not take a crack at our Z80 tutorial? Get out your breadboard, assemble your jumper wires, and prepare to build a real-life computer!
Shameless patriotism: combine Lego, Arduino, and the car of choice for 1960 gold bullion thieves, and you’ve got yourself a groovy weekend project. We proudly present to you one man’s epic quest to add LED lights (controllable via a smartphone!) to his daughter’s LEGO Mini Cooper.
Makerspaces
Patriotism intensifies: for the last 200-odd years, the Black Country has been a hotbed of making. Urban Hax, based in Walsall, is the latest makerspace to show off its riches in the coveted Space of the Month pages. Every space has its own way of doing things, but not every space has a portrait of Rob Halford on the wall. All hail!
Diversity: advice on diversity often boils down to ‘Be nice to people’, which might feel more vague than actionable. This is where we come in to help: it is truly worth making the effort to give people of all backgrounds access to your makerspace, so we take a look at why it’s nice to be nice, and at the ways in which one makerspace has put niceness into practice — with great results.
And there’s more!
We also show you how to easily calculate the size and radius of laser-cut gears, use a bank of LEDs to etch PCBs in your own mini factory, and use chemistry to mess with your lunch menu.
All this plus much, much more waits for you in HackSpace magazine issue 7!
Get your copy of HackSpace magazine
If you like the sound of that, you can find HackSpace magazine in WHSmith, Tesco, Sainsbury’s, and independent newsagents in the UK. If you live in the US, check out your local Barnes & Noble, Fry’s, or Micro Center next week. We’re also shipping to stores in Australia, Hong Kong, Canada, Singapore, Belgium, and Brazil, so be sure to ask your local newsagent whether they’ll be getting HackSpace magazine.
Classic Bond villain, Elon Musk, has a new plan to create a website dedicated to measuring the credibility and adherence to “core truth” of journalists. He is, without any sense of irony, going to call this “Pravda”. This is not simply wrong but evil.
Musk has a point. Journalists do suck, and many suck consistently. I see this in my own industry, cybersecurity, and I frequently criticize them for their suckage.
But what he’s doing here is not correcting them when they make mistakes (or what Musk sees as mistakes), but questioning their legitimacy. This legitimacy isn’t measured by whether they follow established journalism ethics, but whether their “core truths” agree with Musk’s “core truths”.
An example of the problem is how the press fixates on Tesla car crashes due to its “autopilot” feature. Pretty much every autopilot crash makes national headlines, while the press ignores the other 40,000 car crashes that happen in the United States each year. Musk spies on Tesla drivers (hello, classic Bond villain everyone) so he can see the dip in autopilot usage every time such a news story breaks. He’s got good reason to be concerned about this.
He argues that autopilot is safer than humans driving, and he’s got the statistics and government studies to back this up. Therefore, the press’s fixation on Tesla crashes is illegitimate “fake news”, titillating the audience with distorted truth.
But here’s the thing: that’s still only Musk’s version of the truth. Yes, on a mile-per-mile basis, autopilot is safer, but there’s nuance here. Autopilot is used primarily on freeways, which already have a low mile-per-mile accident rate. People choose autopilot only when conditions are incredibly safe and drivers are unlikely to have an accident anyway. Musk is therefore being intentionally deceptive comparing apples to oranges. Autopilot may still be safer, it’s just that the numbers Musk uses don’t demonstrate this.
And then there is the truth calling it “autopilot” to begin with, because it isn’t. The public is overrating the capabilities of the feature. It’s little different than “lane keeping” and “adaptive cruise control” you can now find in other cars. In many ways, the technology is behind — my Tesla doesn’t beep at me when a pedestrian walks behind my car while backing up, but virtually every new car on the market does.
Yes, the press unduly covers Tesla autopilot crashes, but Musk has only himself to blame by unduly exaggerating his car’s capabilities by calling it “autopilot”.
What’s “core truth” is thus rather difficult to obtain. What the press satisfies itself with instead is smaller truths, what they can document. The facts are in such cases that the accident happened, and they try to get Tesla or Musk to comment on it.
What you can criticize a journalist for is therefore not “core truth” but whether they did journalism correctly. When such stories criticize “autopilot”, but don’t do their diligence in getting Tesla’s side of the story, then that’s a violation of journalistic practice. When I criticize journalists for their poor handling of stories in my industry, I try to focus on which journalistic principles they get wrong. For example, the NYTimes reporters do a lot of stories quoting anonymous government sources in clear violation of journalistic principles.
If “credibility” is the concern, then it’s the classic Bond villain here that’s the problem: Musk himself. His track record on business statements is abysmal. For example, when he announced the Model 3 he claimed production targets that every Wall Street analyst claimed were absurd. He didn’t make those targets, he didn’t come close. Model 3 production is still lagging behind Musk’s twice adjusted targets.
So who has a credibility gap here, the press, or Musk himself?
Not only is Musk’s credibility problem ironic, so is the name he chose, “Pravada”, the Russian word for truth that was the name of the Soviet Union Communist Party’s official newspaper. This is so absurd this has to be a joke, yet Musk claims to be serious about all this.
Yes, the press has a lot of problems, and if Musk were some journalism professor concerned about journalists meeting the objective standards of their industry (e.g. abusing anonymous sources), then this would be a fine thing. But it’s not. It’s Musk who is upset the press’s version of “core truth” does not agree with his version — a version that he’s proven time and time again differs from “real truth”.
Just in case Musk is serious, I’ve already registered “www.antipravda.com” to start measuring the credibility of statements by billionaire playboy CEOs. Let’s see who blinks first.
I stole the title, with permission, from this tweet:
Regardless of your career path, there’s no denying that attending industry events can provide helpful career development opportunities — not only for improving and expanding your skill sets, but for networking as well. According to this article from PayScale.com, experts estimate that somewhere between 70-85% of new positions are landed through networking.
Narrowing our focus to networking opportunities with cloud computing professionals who’re working on tackling some of today’s most innovative and exciting big data solutions, attending big data-focused sessions at an AWS Global Summit is a great place to start.
AWS Global Summits are free events that bring the cloud computing community together to connect, collaborate, and learn about AWS. As the name suggests, these summits are held in major cities around the world, and attract technologists from all industries and skill levels who’re interested in hearing from AWS leaders, experts, partners, and customers.
In addition to networking opportunities with top cloud technology providers, consultants and your peers in our Partner and Solutions Expo, you’ll also hone your AWS skills by attending and participating in a multitude of education and training opportunities.
Here’s a brief sampling of some of the upcoming sessions relevant to big data professionals:
Be sure to check out the main page for AWS Global Summits, where you can see which cities have AWS Summits planned for 2018, register to attend an upcoming event, or provide your information to be notified when registration opens for a future event.
We announced a preview of AWS IoT 1-Click at AWS re:Invent 2017 and have been refining it ever since, focusing on simplicity and a clean out-of-box experience. Designed to make IoT available and accessible to a broad audience, AWS IoT 1-Click is now generally available, along with new IoT buttons from AWS and AT&T.
I sat down with the dev team a month or two ago to learn about the service so that I could start thinking about my blog post. During the meeting they gave me a pair of IoT buttons and I started to think about some creative ways to put them to use. Here are a few that I came up with:
Help Request – Earlier this month I spent a very pleasant weekend at the HackTillDawn hackathon in Los Angeles. As the participants were hacking away, they occasionally had questions about AWS, machine learning, Amazon SageMaker, and AWS DeepLens. While we had plenty of AWS Solution Architects on hand (decked out in fashionable & distinctive AWS shirts for easy identification), I imagined an IoT button for each team. Pressing the button would alert the SA crew via SMS and direct them to the proper table.
Camera Control – Tim Bray and I were in the AWS video studio, prepping for the first episode of Tim’s series on AWS Messaging. Minutes before we opened the Twitch stream I realized that we did not have a clean, unobtrusive way to ask the camera operator to switch to a closeup view. Again, I imagined that a couple of IoT buttons would allow us to make the request.
Remote Dog Treat Dispenser – My dog barks every time a stranger opens the gate in front of our house. While it is great to have confirmation that my Ring doorbell is working, I would like to be able to press a button and dispense a treat so that Luna stops barking!
Homes, offices, factories, schools, vehicles, and health care facilities can all benefit from IoT buttons and other simple IoT devices, all managed using AWS IoT 1-Click.
All About AWS IoT 1-Click As I said earlier, we have been focusing on simplicity and a clean out-of-box experience. Here’s what that means:
Architects can dream up applications for inexpensive, low-powered devices.
Developers don’t need to write any device-level code. They can make use of pre-built actions, which send email or SMS messages, or write their own custom actions using AWS Lambda functions.
Installers don’t have to install certificates or configure cloud endpoints on newly acquired devices, and don’t have to worry about firmware updates.
Administrators can monitor the overall status and health of each device, and can arrange to receive alerts when a device nears the end of its useful life and needs to be replaced, using a single interface that spans device types and manufacturers.
I’ll show you how easy this is in just a moment. But first, let’s talk about the current set of devices that are supported by AWS IoT 1-Click.
Who’s Got the Button? We’re launching with support for two types of buttons (both pictured above). Both types of buttons are pre-configured with X.509 certificates, communicate to the cloud over secure connections, and are ready to use.
The AWS IoT Enterprise Button communicates via Wi-Fi. It has a 2000-click lifetime, encrypts outbound data using TLS, and can be configured using BLE and our mobile app. It retails for $19.99 (shipping and handling not included) and can be used in the United States, Europe, and Japan.
The AT&T LTE-M Button communicates via the LTE-M cellular network. It has a 1500-click lifetime, and also encrypts outbound data using TLS. The device and the bundled data plan is available an an introductory price of $29.99 (shipping and handling not included), and can be used in the United States.
We are very interested in working with device manufacturers in order to make even more shapes, sizes, and types of devices (badge readers, asset trackers, motion detectors, and industrial sensors, to name a few) available to our customers. Our team will be happy to tell you about our provisioning tools and our facility for pushing OTA (over the air) updates to large fleets of devices; you can contact them at [email protected].
AWS IoT 1-Click Concepts I’m eager to show you how to use AWS IoT 1-Click and the buttons, but need to introduce a few concepts first.
Device – A button or other item that can send messages. Each device is uniquely identified by a serial number.
Placement Template – Describes a like-minded collection of devices to be deployed. Specifies the action to be performed and lists the names of custom attributes for each device.
Placement – A device that has been deployed. Referring to placements instead of devices gives you the freedom to replace and upgrade devices with minimal disruption. Each placement can include values for custom attributes such as a location (“Building 8, 3rd Floor, Room 1337”) or a purpose (“Coffee Request Button”).
Action – The AWS Lambda function to invoke when the button is pressed. You can write a function from scratch, or you can make use of a pair of predefined functions that send an email or an SMS message. The actions have access to the attributes; you can, for example, send an SMS message with the text “Urgent need for coffee in Building 8, 3rd Floor, Room 1337.”
Getting Started with AWS IoT 1-Click Let’s set up an IoT button using the AWS IoT 1-Click Console:
If I didn’t have any buttons I could click Buy devices to get some. But, I do have some, so I click Claim devices to move ahead. I enter the device ID or claim code for my AT&T button and click Claim (I can enter multiple claim codes or device IDs if I want):
The AWS buttons can be claimed using the console or the mobile app; the first step is to use the mobile app to configure the button to use my Wi-Fi:
Then I scan the barcode on the box and click the button to complete the process of claiming the device. Both of my buttons are now visible in the console:
I am now ready to put them to use. I click on Projects, and then Create a project:
I name and describe my project, and click Next to proceed:
Now I define a device template, along with names and default values for the placement attributes. Here’s how I set up a device template (projects can contain several, but I just need one):
The action has two mandatory parameters (phone number and SMS message) built in; I add three more (Building, Room, and Floor) and click Create project:
I’m almost ready to ask for some coffee! The next step is to associate my buttons with this project by creating a placement for each one. I click Create placements to proceed. I name each placement, select the device to associate with it, and then enter values for the attributes that I established for the project. I can also add additional attributes that are peculiar to this placement:
I can inspect my project and see that everything looks good:
I click on the buttons and the SMS messages appear:
I can monitor device activity in the AWS IoT 1-Click Console:
And also in the Lambda Console:
The Lambda function itself is also accessible, and can be used as-is or customized:
As you can see, this is the code that lets me use {{*}}include all of the placement attributes in the message and {{Building}} (for example) to include a specific placement attribute.
Now Available I’ve barely scratched the surface of this cool new service and I encourage you to give it a try (or a click) yourself. Buy a button or two, build something cool, and let me know all about it!
Pricing is based on the number of enabled devices in your account, measured monthly and pro-rated for partial months. Devices can be enabled or disabled at any time. See the AWS IoT 1-Click Pricing page for more info.
This post courtesy of Jeff Levine Solutions Architect for Amazon Web Services
Amazon Linux 2 is the next generation of Amazon Linux, a Linux server operating system from Amazon Web Services (AWS). Amazon Linux 2 offers a high-performance Linux environment suitable for organizations of all sizes. It supports applications ranging from small websites to enterprise-class, mission-critical platforms.
Amazon Linux 2 includes support for the LAMP (Linux/Apache/MariaDB/PHP) stack, one of the most popular platforms for deploying websites. To secure the transmission of data-in-transit to such websites and prevent eavesdropping, organizations commonly leverage Secure Sockets Layer/Transport Layer Security (SSL/TLS) services which leverage certificates to provide encryption. The LAMP stack provided by Amazon Linux 2 includes a self-signed SSL/TLS certificate. Such certificates may be fine for internal usage but are not acceptable when attestation by a certificate authority is required.
In this post, I discuss how to extend the capabilities of Amazon Linux 2 by installing Let’s Encrypt, a certificate authority provided by the Internet Security Research Group. Let’s Encrypt offers basic SSL/TLS certificates for DNS hosts at no charge that you can use to add encryption-in-transit to a single web server. For commercial or multi-server configurations, you should consider AWS Certificate Manager and Elastic Load Balancing.
Let’s Encrypt also requires the certbot package, which you install from EPEL, the Extra Packaged for Enterprise Linux collection. Although EPEL is not included with Amazon Linux 2, I show how you can install it from the Fedora Project.
Walkthrough
At a high level, you perform the following tasks for this walkthrough:
Provision a VPC, Amazon Linux 2 instance, and LAMP stack.
Install and enable the EPEL repository.
Install and configure Let’s Encrypt.
Validate the installation.
Clean up.
Prerequisites and costs
To follow along with this walkthrough, you need the following:
Accept all other default values including with regard to storage.
Create a new security group and accept the default rule that allows TCP port 22 (SSH) from everywhere (0.0.0.0/0 in IPv4). For the purposes of this walkthrough, permitting access from all IP addresses is reasonable. In a production environment, you may restrict access to different addresses.
Allocate and associate an Elastic IP address to the server when it enters the running state.
Respond “Y” to all requests for approval to install the software.
Step 3: Install and configure Let’s Encrypt
If you are no longer connected to the Amazon Linux 2 instance, connect to it at the Elastic IP address that you just created.
Install certbot, the Let’s Encrypt client to be used to obtain an SSL/TLS certificate and install it into Apache.
sudo yum install python2-certbot-apache.noarch
Respond “Y” to all requests for approval to install the software. If you see a message appear about SELinux, you can safely ignore it. This is a known issue with the latest version of certbot.
Create a DNS “A record” that maps a host name to the Elastic IP address. For this post, assume that the name of the host is lamp.example.com. If you are hosting your DNS in Amazon Route 53, do this by creating the appropriate record set.
After the “A record” has propagated, browse to lamp.example.com. The Apache test page should appear. If the page does not appear, use a tool such as nslookup on your workstation to confirm that the DNS record has been properly configured.
You are now ready to install Let’s Encrypt. Let’s Encrypt does the following:
Confirms that you have control over the DNS domain being used, by having you create a DNS TXT record using the value that it provides.
Obtains an SSL/TLS certificate.
Modifies the Apache-related scripts to use the SSL/TLS certificate and redirects users browsing the site in HTTP mode to HTTPS mode.
Use the following command to install certbot:
sudo certbot -i apache -a manual \
--preferred-challenges dns -d lamp.example.com
The options have the following meanings:
-i apache Use the Apache installer.
-a manual Authenticate domain ownership manually.
--preferred-challenges dns Use DNS TXT records for authentication challenge.
-d lamp.example.com Specify the domain for the SSL/TLS certificate.
You are prompted for the following information: E-mail address for renewals? Enter an email address for certificate renewals. Accept the terms of services? Respond as appropriate. Send your e-mail address to the EFF? Respond as appropriate. Log your current IP address? Respond as appropriate.
You are prompted to deploy a DNS TXT record with the name “_acme-challenge.lamp.example.com” with the supplied value, as shown below.
After you enter the record, wait until the TXT record propagates. To look up the TXT record to confirm the deployment, use the nslookup command in a separate command window, as shown below. Remember to use the set ty=txt command before entering the TXT record. You are prompted to select a virtual host. There is only one, so choose 1. The final prompt asks whether to redirect HTTP traffic to HTTPS. To perform the redirection, choose 2. That completes the configuration of Let’s Encrypt.
Browse to the http:// lamp.example.com site. You are redirected to the SSL/TLS page https://lamp.example.com.
To look at the encryption information, use the appropriate actions within your browser. For example, in Firefox, you can open the padlock and traverse the menus. In the encryption technical details, you can see from the “Connection Encrypted” line that traffic to the website is now encrypted using TLS 1.2.
Security note: As of the time of publication, this website also supports TLS 1.0. I recommend that you disable this protocol because of some known vulnerabilities associated with it. To do this:
Edit the file /etc/letsencrypt/options-ssl-apache.conf.
Look for the line beginning with SSLProtocol and change it to the following:
SSLProtocol all -SSLv2 -SSLv3 -TLSv1
Save the file. After you make changes to this file, Let’s Encrypt no longer automatically updates it. Periodically check your log files for recommended updates to this file.
Restart the httpd server with the following command:
sudo service httpd restart
Step 5: Cleanup
Use the following steps to avoid incurring any further costs.
Terminate the Amazon Linux 2 instance that you created.
Release the Elastic IP address that you allocated.
Revert any DNS changes that you made, including the A and TXT records.
Conclusion
Amazon Linux 2 is an excellent option for hosting websites through the LAMP stack provided by the Amazon-Linux-Extras feature. You can then enhance the security of the Apache web server by installing EPEL and Let’s Encrypt. Let’s Encrypt provisions an SSL/TLS certificate, optionally installs it for you on the Apache server, and enables data-in-transit encryption. You can get started with Amazon Linux 2 in just a few clicks.
The EU’s General Data Protection Regulation (GDPR) describes data processor and data controller roles, and some customers and AWS Partner Network (APN) partners are asking how this affects the long-established AWS Shared Responsibility Model. I wanted to take some time to help folks understand shared responsibilities for us and for our customers in context of the GDPR.
How does the AWS Shared Responsibility Model change under GDPR? The short answer – it doesn’t. AWS is responsible for securing the underlying infrastructure that supports the cloud and the services provided; while customers and APN partners, acting either as data controllers or data processors, are responsible for any personal data they put in the cloud. The shared responsibility model illustrates the various responsibilities of AWS and our customers and APN partners, and the same separation of responsibility applies under the GDPR.
AWS responsibilities as a data processor
The GDPR does introduce specific regulation and responsibilities regarding data controllers and processors. When any AWS customer uses our services to process personal data, the controller is usually the AWS customer (and sometimes it is the AWS customer’s customer). However, in all of these cases, AWS is always the data processor in relation to this activity. This is because the customer is directing the processing of data through its interaction with the AWS service controls, and AWS is only executing customer directions. As a data processor, AWS is responsible for protecting the global infrastructure that runs all of our services. Controllers using AWS maintain control over data hosted on this infrastructure, including the security configuration controls for handling end-user content and personal data. Protecting this infrastructure, is our number one priority, and we invest heavily in third-party auditors to test our security controls and make any issues they find available to our customer base through AWS Artifact. Our ISO 27018 report is a good example, as it tests security controls that focus on protection of personal data in particular.
AWS has an increased responsibility for our managed services. Examples of managed services include Amazon DynamoDB, Amazon RDS, Amazon Redshift, Amazon Elastic MapReduce, and Amazon WorkSpaces. These services provide the scalability and flexibility of cloud-based resources with less operational overhead because we handle basic security tasks like guest operating system (OS) and database patching, firewall configuration, and disaster recovery. For most managed services, you only configure logical access controls and protect account credentials, while maintaining control and responsibility of any personal data.
Customer and APN partner responsibilities as data controllers — and how AWS Services can help
Our customers can act as data controllers or data processors within their AWS environment. As a data controller, the services you use may determine how you configure those services to help meet your GDPR compliance needs. For example, AWS Services that are classified as Infrastructure as a Service (IaaS), such as Amazon EC2, Amazon VPC, and Amazon S3, are under your control and require you to perform all routine security configuration and management that would be necessary no matter where the servers were located. With Amazon EC2 instances, you are responsible for managing: guest OS (including updates and security patches), application software or utilities installed on the instances, and the configuration of the AWS-provided firewall (called a security group).
To help you realize data protection by design principles under the GDPR when using our infrastructure, we recommend you protect AWS account credentials and set up individual user accounts with Amazon Identity and Access Management (IAM) so that each user is only given the permissions necessary to fulfill their job duties. We also recommend using multi-factor authentication (MFA) with each account, requiring the use of SSL/TLS to communicate with AWS resources, setting up API/user activity logging with AWS CloudTrail, and using AWS encryption solutions, along with all default security controls within AWS Services. You can also use advanced managed security services, such as Amazon Macie, which assists in discovering and securing personal data stored in Amazon S3.
For more information, you can download the AWS Security Best Practices whitepaper or visit the AWS Security Resources or GDPR Center webpages. In addition to our solutions and services, AWS APN partners can provide hundreds of tools and features to help you meet your security objectives, ranging from network security and configuration management to access control and data encryption.
Информация от Австрия за начина, по който обществената телевизия е под надзора на местния регулатор за изпълнение на обществената мисия.
Регулаторът не е одобрил две предложения на ORF – за собствен канал в YouTube и за нова услуга – предавания/филми от програмите на ORF срещу абонамент.
Видимо в Австрия регулаторът има роля ex ante в по-широк обхват – което е в защита на аудиторията. БНТ е в YouYube – или поне в интернет се вижда “Официален канал на БНТ” в YouTube. Но това не е единствената илюстрация за ограничената роля на българския регулатор – не е ясно например какво става с политематичната БНТ2, нито може да се прочете мониторингов доклад как БНТ изпълнява лицензиите си.
ORF кандидатства за разрешение да добави канал в YouTube към своите медийни дейности. Този канал трябваше да предложи главно предавания на ORF, които поради правни ограничения понастоящем не могат да бъдат предоставени за повече от 7 дни чрез catch-up услугата ORF TVthek.
KommAustria твърди, че изключителното сътрудничество между ORF и YouTube би дискриминирало други, сравними компании.
Регулаторът също така взема предвид съществуващите услуги, когато одобрява нови услуги на ORF. KommAustria предполага намаляване на интереса към ORF TVthek, ако се създаде канал на ORF в YouTube. Освен това регулаторът твърди, че е възможно да се удължи общият период за предоставяне на програми на ORF TVthek (повече от 7 дни) чрез преразглеждане на правната рамка.
Според KommAustria по принцип не е забранено на ORF да предлага абонаментна услуга. В конкретния случай обаче нито икономически, нито правно искането не е обосновано, напр. остава “напълно неясно” колко голям ще бъде делът от таксата за тази услуга – доколкото разбирам, не е изяснена пропорцията между абонамент и обществено финансиране.
I’ve been busy trying to replicate the “eFail” PGP/SMIME bug. I thought I’d write up some notes.
PGP and S/MIME encrypt emails, so that eavesdroppers can’t read them. The bugs potentially allow eavesdroppers to take the encrypted emails they’ve captured and resend them to you, reformatted in a way that allows them to decrypt the messages.
Disable remote/external content in email
The most important defense is to disable “external” or “remote” content from being automatically loaded. This is when HTML-formatted emails attempt to load images from remote websites. This happens legitimately when they want to display images, but not fill up the email with them. But most of the time this is illegitimate, they hide images on the webpage in order to track you with unique IDs and cookies. For example, this is the code at the end of an email from politician Bernie Sanders to his supporters. Notice the long random number assigned to track me, and the width/height of this image is set to one pixel, so you don’t even see it:
Such trackers are so pernicious they are disabled by default in most email clients. This is an example of the settings in Thunderbird:
The problem is that as you read email messages, you often get frustrated by the fact the error messages and missing content, so you keep adding exceptions:
The correct defense against this eFail bug is to make sure such remote content is disabled and that you have no exceptions, or at least, no HTTP exceptions. HTTPS exceptions (those using SSL) are okay as long as they aren’t to a website the attacker controls. Unencrypted exceptions, though, the hacker can eavesdrop on, so it doesn’t matter if they control the website the requests go to. If the attacker can eavesdrop on your emails, they can probably eavesdrop on your HTTP sessions as well.
Some have recommended disabling PGP and S/MIME completely. That’s probably overkill. As long as the attacker can’t use the “remote content” in emails, you are fine. Likewise, some have recommend disabling HTML completely. That’s not even an option in any email client I’ve used — you can disable sending HTML emails, but not receiving them. It’s sufficient to just disable grabbing remote content, not the rest of HTML email rendering.
I couldn’t replicate the direct exfiltration
There rare two related bugs. One allows direct exfiltration, which appends the decrypted PGP email onto the end of an IMG tag (like one of those tracking tags), allowing the entire message to be decrypted.
An example of this is the following email. This is a standard HTML email message consisting of multiple parts. The trick is that the IMG tag in the first part starts the URL (blog.robertgraham.com/…) but doesn’t end it. It has the starting quotes in front of the URL but no ending quotes. The ending will in the next chunk.
The next chunk isn’t HTML, though, it’s PGP. The PGP extension (in my case, Enignmail) will detect this and automatically decrypt it. In this case, it’s some previous email message I’ve received the attacker captured by eavesdropping, who then pastes the contents into this email message in order to get it decrypted.
What should happen at this point is that Thunderbird will generate a request (if “remote content” is enabled) to the blog.robertgraham.com server with the decrypted contents of the PGP email appended to it. But that’s not what happens. Instead, I get this:
I am indeed getting weird stuff in the URL (the bit after the GET /), but it’s not the PGP decrypted message. Instead what’s going on is that when Thunderbird puts together a “multipart/mixed” message, it adds it’s own HTML tags consisting of lines between each part. In the email client it looks like this:
The HTML code it adds looks like:
That’s what you see in the above URL, all this code up to the first quotes. Those quotes terminate the quotes in the URL from the first multipart section, causing the rest of the content to be ignored (as far as being sent as part of the URL).
So at least for the latest version of Thunderbird, you are accidentally safe, even if you have “remote content” enabled. Though, this is only according to my tests, there may be a work around to this that hackers could exploit.
STARTTLS
In the old days, email was sent plaintext over the wire so that it could be passively eavesdropped on. Nowadays, most providers send it via “STARTTLS”, which sorta encrypts it. Attackers can still intercept such email, but they have to do so actively, using man-in-the-middle. Such active techniques can be detected if you are careful and look for them.
Some organizations don’t care. Apparently, some nation states are just blocking all STARTTLS and forcing email to be sent unencrypted. Others do care. The NSA will passively sniff all the email they can in nations like Iraq, but they won’t actively intercept STARTTLS messages, for fear of getting caught.
The consequence is that it’s much less likely that somebody has been eavesdropping on you, passively grabbing all your PGP/SMIME emails. If you fear they have been, you should look (e.g. send emails from GMail and see if they are intercepted by sniffing the wire).
You’ll know if you are getting hacked
If somebody attacks you using eFail, you’ll know. You’ll get an email message formatted this way, with multipart/mixed components, some with corrupt HTML, some encrypted via PGP. This means that for the most part, your risk is that you’ll be attacked only once — the hacker will only be able to get one message through and decrypt it before you notice that something is amiss. Though to be fair, they can probably include all the emails they want decrypted as attachments to the single email they sent you, so the risk isn’t necessarily that you’ll only get one decrypted.
As mentioned above, a lot of attackers (e.g. the NSA) won’t attack you if its so easy to get caught. Other attackers, though, like anonymous hackers, don’t care.
Somebody ought to write a plugin to Thunderbird to detect this.
Summary
It only works if attackers have already captured your emails (though, that’s why you use PGP/SMIME in the first place, to guard against that).
It only works if you’ve enabled your email client to automatically grab external/remote content.
It seems to not be easily reproducible in all cases.
Instead of disabling PGP/SMIME, you should make sure your email client hast remote/external content disabled — that’s a huge privacy violation even without this bug.
Notes: The default email client on the Mac enables remote content by default, which is bad:
If you’ve used Amazon CloudWatch Events to schedule the invocation of a Lambda function at regular intervals, you may have noticed that the highest frequency possible is one invocation per minute. However, in some cases, you may need to invoke Lambda more often than that. In this blog post, I’ll cover invoking a Lambda function every 10 seconds, but with some simple math you can change to whatever interval you like.
To achieve this, I’ll show you how to leverage Step Functions and Amazon Kinesis Data Streams.
The Solution
For this example, I’ve created a Step Functions State Machine that invokes our Lambda function 6 times, 10 seconds apart. Such State Machine is then executed once per minute by a CloudWatch Events Rule. This state machine is then executed once per minute by an Amazon CloudWatch Events rule. Finally, the Kinesis Data Stream triggers our Lambda function for each record inserted. The result is our Lambda function being invoked every 10 seconds, indefinitely.
Below is a diagram illustrating how the various services work together.
Step 1: My sampleLambda function doesn’t actually do anything, it just simulates an execution for a few seconds. This is the (Python) code of my dummy function:
import time
import random
def lambda_handler(event, context):
rand = random.randint(1, 3)
print('Running for {} seconds'.format(rand))
time.sleep(rand)
return True
Step 2:
The next step is to create a second Lambda function, that I called Iterator, which has two duties:
It keeps track of the current number of iterations, since Step Function doesn’t natively have a state we can use for this purpose.
It asynchronously invokes our Lambda function at every loops.
This is the code of the Iterator, adapted from here.
The state machine starts and sets the index at 0 and the count at 6.
Iterator function is invoked.
If the iterator function reached the end of the loop, the IsCountReached state terminates the execution, otherwise the machine waits for 10 seconds.
The machine loops back to the iterator.
Step 3: Create an Amazon CloudWatch Events rule scheduled to trigger every minute and add the state machine as its target. I’ve actually prepared an Amazon CloudFormation template that creates the whole stack and starts the Lambda invocations, you can find it here.
Performance
Let’s have a look at a sample series of invocations and analyse how precise the timing is. In the following chart I reported the delay (in excess of the expected 10-second-wait) of 30 consecutive invocations of my dummy function, when the Iterator is configured with a memory size of 1024MB.
Invocations Delay
Notice the delay increases by a few hundred milliseconds at every invocation. The good news is it accrues only within the same loop, 6 times; after that, a new CloudWatch Events kicks in and it resets.
This delay is due to the work that AWS Step Function does outside of the Wait state, the main component of which is the Iterator function itself, that runs synchronously in the state machine and therefore adds up its duration to the 10-second-wait.
As we can easily imagine, the memory size of the Iterator Lambda function does make a difference. Here are the Average and Maximum duration of the function with 256MB, 512MB, 1GB and 2GB of memory.
Average Duration
Maximum Duration
Given those results, I’d say that a memory of 1024MB is a good compromise between costs and performance.
Caveats
As mentioned, in our Amazon CloudWatch Events documentation, in rare cases a rule can be triggered twice, causing two parallel executions of the state machine. If that is a concern, we can add a task state at the beginning of the state machine that checks if any other executions are currently running. If the outcome is positive, then a choice state can immediately terminate the flow. Since the state machine is invoked every 60 seconds and runs for about 50, it is safe to assume that executions should all be sequential and any parallel executions should be treated as duplicates. The task state that checks for current running executions can be a Lambda function similar to the following:
Abstract: Every day, hundreds of people fly on airline tickets that have been obtained fraudulently. This crime script analysis provides an overview of the trade in these tickets, drawing on interviews with industry and law enforcement, and an analysis of an online blackmarket. Tickets are purchased by complicit travellers or resellers from the online blackmarket. Victim travellers obtain tickets from fake travel agencies or malicious insiders. Compromised credit cards used to be the main method to purchase tickets illegitimately. However, as fraud detection systems improved, offenders displaced to other methods, including compromised loyalty point accounts, phishing, and compromised business accounts. In addition to complicit and victim travellers, fraudulently obtained tickets are used for transporting mules, and for trafficking and smuggling. This research details current prevention approaches, and identifies additional interventions, aimed at the act, the actor, and the marketplace.
If your day has been a little fraught so far, watch this video. It opens with a tableau of methodically laid-out components and then shows them soldered, screwed, and slotted neatly into place. Everything fits perfectly; nothing needs percussive adjustment. Then it shows us glimpses of an AR future just like the one promised in the less dystopian comics and TV programmes of my 1980s childhood. It is all very soothing, and exactly what I needed.
Transform any surface into mixed-reality using Raspberry Pi, a laser projector, and Android Things. Android Experiments – http://experiments.withgoogle.com/android/lantern Lantern project site – http://nordprojects.co/lantern check below to make your own ↓↓↓ Get the code – https://github.com/nordprojects/lantern Build the lamp – https://www.hackster.io/nord-projects/lantern-9f0c28
Creating augmented reality with projection
We’ve seen plenty of Raspberry Pi IoT builds that are smart devices for the home; they add computing power to things like lights, door locks, or toasters to make these objects interact with humans and with their environment in new ways. Nord Projects‘ Lantern takes a different approach. In their words, it:
imagines a future where projections are used to present ambient information, and relevant UI within everyday objects. Point it at a clock to show your appointments, or point to speaker to display the currently playing song. Unlike a screen, when Lantern’s projections are no longer needed, they simply fade away.
Lantern is set up so that you can connect your wireless device to it using Google Nearby. This means there’s no need to create an account before you can dive into augmented reality.
Your own open-source AR lamp
Nord Projects collaborated on Lantern with Google’s Android Things team. They’ve made it fully open-source, so you can find the code on GitHub and also download their parts list, which includes a Pi, an IKEA lamp, an accelerometer, and a laser projector. Build instructions are at hackster.io and on GitHub.
This is a particularly clear tutorial, very well illustrated with photos and GIFs, and once you’ve sourced and 3D-printed all of the components, you shouldn’t need a whole lot of experience to put everything together successfully. Since everything is open-source, though, if you want to adapt it — for example, if you’d like to source a less costly projector than the snazzy one used here — you can do that too.
The instructions walk you through the mechanical build and the wiring, as well as installing Android Things and Nord Projects’ custom software on the Raspberry Pi. Once you’ve set everything up, an accelerometer connected to the Pi’s GPIO pins lets the lamp know which surface it is pointing at. A companion app on your mobile device lets you choose from the mini apps that work on that surface to select the projection you want.
The designers are making several mini apps available for Lantern, including the charmingly named Space Porthole: this uses Processing and your local longitude and latitude to project onto your ceiling the stars you’d see if you punched a hole through to the sky, if it were night time, and clear weather. Wouldn’t you rather look at that than deal with the ant problem in your kitchen or tackle your GitHub notifications?
What would you like to project onto your living environment? Let us know in the comments!
Join us this month to learn about some of the exciting new services and solution best practices at AWS. We also have our first re:Invent 2018 webinar series, “How to re:Invent”. Sign up now to learn more, we look forward to seeing you.
Note – All sessions are free and in Pacific Time.
Tech talks featured this month:
Analytics & Big Data
May 21, 2018 | 11:00 AM – 11:45 AM PT – Integrating Amazon Elasticsearch with your DevOps Tooling – Learn how you can easily integrate Amazon Elasticsearch Service into your DevOps tooling and gain valuable insight from your log data.
May 24, 2018 | 11:00 AM – 11:45 AM PT – Data Transformation Patterns in AWS – Discover how to perform common data transformations on the AWS Data Lake.
May 30, 2018 | 01:00 PM – 01:45 PM PT – Accelerating Life Sciences with HPC on AWS – Learn how you can accelerate your Life Sciences research workloads by harnessing the power of high performance computing on AWS.
Containers
May 24, 2018 | 01:00 PM – 01:45 PM PT –Building Microservices with the 12 Factor App Pattern on AWS – Learn best practices for building containerized microservices on AWS, and how traditional software design patterns evolve in the context of containers.
Databases
May 21, 2018 | 01:00 PM – 01:45 PM PT – How to Migrate from Cassandra to Amazon DynamoDB – Get the benefits, best practices and guides on how to migrate your Cassandra databases to Amazon DynamoDB.
May 23, 2018 | 01:00 PM – 01:45 PM PT – 5 Hacks for Optimizing MySQL in the Cloud – Learn how to optimize your MySQL databases for high availability, performance, and disaster resilience using RDS.
DevOps
May 23, 2018 | 09:00 AM – 09:45 AM PT – .NET Serverless Development on AWS – Learn how to build a modern serverless application in .NET Core 2.0.
Enterprise & Hybrid
May 22, 2018 | 11:00 AM – 11:45 AM PT – Hybrid Cloud Customer Use Cases on AWS – Learn how customers are leveraging AWS hybrid cloud capabilities to easily extend their datacenter capacity, deliver new services and applications, and ensure business continuity and disaster recovery.
IoT
May 31, 2018 | 11:00 AM – 11:45 AM PT – Using AWS IoT for Industrial Applications – Discover how you can quickly onboard your fleet of connected devices, keep them secure, and build predictive analytics with AWS IoT.
Machine Learning
May 22, 2018 | 09:00 AM – 09:45 AM PT – Using Apache Spark with Amazon SageMaker – Discover how to use Apache Spark with Amazon SageMaker for training jobs and application integration.
May 24, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS DeepLens – Learn how AWS DeepLens provides a new way for developers to learn machine learning by pairing the physical device with a broad set of tutorials, examples, source code, and integration with familiar AWS services.
May 30, 2018 | 09:00 AM – 09:45 AM PT– Introducing AWS Certificate Manager Private Certificate Authority (CA) – Learn how AWS Certificate Manager (ACM) Private Certificate Authority (CA), a managed private CA service, helps you easily and securely manage the lifecycle of your private certificates.
June 1, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS Firewall Manager – Centrally configure and manage AWS WAF rules across your accounts and applications.
May 30, 2018 | 11:00 AM – 11:45 AM PT – Accelerate Productivity by Computing at the Edge – Learn how AWS Snowball Edge support for compute instances helps accelerate data transfers, execute custom applications, and reduce overall storage costs.
Many of my colleagues are fortunate to be able to spend a good part of their day sitting down with and listening to our customers, doing their best to understand ways that we can better meet their business and technology needs. This information is treated with extreme care and is used to drive the roadmap for new services and new features.
AWS customers in the financial services industry (often abbreviated as FSI) are looking ahead to the Fundamental Review of Trading Book (FRTB) regulations that will come in to effect between 2019 and 2021. Among other things, these regulations mandate a new approach to the “value at risk” calculations that each financial institution must perform in the four hour time window after trading ends in New York and begins in Tokyo. Today, our customers report this mission-critical calculation consumes on the order of 200,000 vCPUs, growing to between 400K and 800K vCPUs in order to meet the FRTB regulations. While there’s still some debate about the magnitude and frequency with which they’ll need to run this expanded calculation, the overall direction is clear.
Building a Big Grid In order to make sure that we are ready to help our FSI customers meet these new regulations, we worked with TIBCO to set up and run a proof of concept grid in the AWS Cloud. The periodic nature of the calculation, along with the amount of processing power and storage needed to run it to completion within four hours, make it a great fit for an environment where a vast amount of cost-effective compute power is available on an on-demand basis.
Our customers are already using the TIBCO GridServer on-premises and want to use it in the cloud. This product is designed to run grids at enterprise scale. It runs apps in a virtualized fashion, and accepts requests for resources, dynamically provisioning them on an as-needed basis. The cloud version supports Amazon Linux as well as the PostgreSQL-compatible edition of Amazon Aurora.
Working together with TIBCO, we set out to create a grid that was substantially larger than the current high-end prediction of 800K vCPUs, adding a 50% safety factor and then rounding up to reach 1.3 million vCPUs (5x the size of the largest on-premises grid). With that target in mind, the account limits were raised as follows:
Spot Instance Limit – 120,000
EBS Volume Limit – 120,000
EBS Capacity Limit – 2 PB
If you plan to create a grid of this size, you should also bring your friendly local AWS Solutions Architect into the loop as early as possible. They will review your plans, provide you with architecture guidance, and help you to schedule your run.
Running the Grid We hit the Go button and launched the grid, watching as it bid for and obtained Spot Instances, each of which booted, initialized, and joined the grid within two minutes. The test workload used the Strata open source analytics & market risk library from OpenGamma and was set up with their assistance.
The grid grew to 61,299 Spot Instances (1.3 million vCPUs drawn from 34 instance types spanning 3 generations of EC2 hardware) as planned, with just 1,937 instances reclaimed and automatically replaced during the run, and cost $30,000 per hour to run, at an average hourly cost of $0.078 per vCPU. If the same instances had been used in On-Demand form, the hourly cost to run the grid would have been approximately $93,000.
Despite the scale of the grid, prices for the EC2 instances did not move during the bidding process. This is due to the overall size of the AWS Cloud and the smooth price change model that we launched late last year.
To give you a sense of the compute power, we computed that this grid would have taken the #1 position on the TOP 500 supercomputer list in November 2007 by a considerable margin, and the #2 position in June 2008. Today, it would occupy position #360 on the list.
I hope that you enjoyed this AWS success story, and that it gives you an idea of the scale that you can achieve in the cloud!
Bad software is everywhere. One can even claim that every software is bad. Cool companies, tech giants, established companies, all produce bad software. And no, yours is not an exception.
Who’s to blame for bad software? It’s all complicated and many factors are intertwined – there’s business requirements, there’s organizational context, there’s lack of sufficient skilled developers, there’s the inherent complexity of software development, there’s leaky abstractions, reliance on 3rd party software, consequences of wrong business and purchase decisions, time limitations, flawed business analysis, etc. So yes, despite the catchy title, I’m aware it’s actually complicated.
But in every “it’s complicated” scenario, there’s always one or two factors that are decisive. All of them contribute somehow, but the major drivers are usually a handful of things. And in the case of base software, I think it’s the fault of technical people. Developers, architects, ops.
We don’t seem to care about best practices. And I’ll do some nasty generalizations here, but bear with me. We can spend hours arguing about tabs vs spaces, curly bracket on new line, git merge vs rebase, which IDE is better, which framework is better and other largely irrelevant stuff. But we tend to ignore the important aspects that span beyond the code itself. The context in which the code lives, the non-functional requirements – robustness, security, resilience, etc.
We don’t seem to get security. Even trivial stuff such as user authentication is almost always implemented wrong. These days Twitter and GitHub realized they have been logging plain-text passwords, for example, but that’s just the tip of the iceberg. Too often we ignore the security implications.
“But the business didn’t request the security features”, one may say. The business never requested 2-factor authentication, encryption at rest, PKI, secure (or any) audit trail, log masking, crypto shredding, etc., etc. Because the business doesn’t know these things – we do and we have to put them on the backlog and fight for them to be implemented. Each organization has its specifics and tech people can influence the backlog in different ways, but almost everywhere we can put things there and prioritize them.
The other aspect is testing. We should all be well aware by now that automated testing is mandatory. We have all the tools in the world for unit, functional, integration, performance and whatnot testing, and yet many software projects lack the necessary test coverage to be able to change stuff without accidentally breaking things. “But testing takes time, we don’t have it”. We are perfectly aware that testing saves time, as we’ve all had those “not again!” recurring bugs. And yet we think of all sorts of excuses – “let the QAs test it”, we have to ship that now, we’ll test it later”, “this is too trivial to be tested”, etc.
And you may say it’s not our job. We don’t define what has do be done, we just do it. We don’t define the budget, the scope, the features. We just write whatever has been decided. And that’s plain wrong. It’s not our job to make money out of our code, and it’s not our job to define what customers need, but apart from that everything is our job. The way the software is structured, the security aspects and security features, the stability of the code base, the way the software behaves in different environments. The non-functional requirements are our job, and putting them on the backlog is our job.
You’ve probably heard that every software becomes “legacy” after 6 months. And that’s because of us, our sloppiness, our inability to mitigate external factors and constraints. Too often we create a mess through “just doing our job”.
And of course that’s a generalization. I happen to know a lot of great professionals who don’t make these mistakes, who strive for excellence and implement things the right way. But our industry as a whole doesn’t. Our industry as a whole produces bad software. And it’s our fault, as developers – as the only people who know why a certain piece of software is bad.
In a talk of his, Bob Martin warns us of the risks of our sloppiness. We have been building websites so far, but we are more and more building stuff that interacts with the real world, directly and indirectly. Ultimately, lives may depend on our software (like the recent unfortunate death caused by a self-driving car). And I’ll agree with Uncle Bob that it’s high time we self-regulate as an industry, before some technically incompetent politician decides to do that.
How, I don’t know. We’ll have to think more about it. But I’m pretty sure it’s our fault that software is bad, and no amount of blaming the management, the budget, the timing, the tools or the process can eliminate our responsibility.
Why do I insist on bashing my fellow software engineers? Because if we start looking at software development with more responsibility; with the fact that if it fails, it’s our fault, then we’re more likely to get out of our current bug-ridden, security-flawed, fragile software hole and really become the experts of the future.
By continuing to use the site, you agree to the use of cookies. more information
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.