Tag Archives: Compute

AWS Online Tech Talks – July 2018

Post Syndicated from Sara Rodas original https://aws.amazon.com/blogs/aws/aws-online-tech-talks-july-2018/

Join us this month to learn about AWS services and solutions featuring topics on Amazon EMR, Amazon SageMaker, AWS Lambda, Amazon S3, Amazon WorkSpaces, Amazon EC2 Fleet and more! We also have our third episode of the “How to re:Invent” where we’ll dive deep with the AWS Training and Certification team on Bootcamps, Hands-on Labs, and how to get AWS Certified at re:Invent. Register now! We look forward to seeing you. Please note – all sessions are free and in Pacific Time.

 

Tech talks featured this month:

 

Analytics & Big Data

July 23, 2018 | 11:00 AM – 12:00 PM PT – Large Scale Machine Learning with Spark on EMR – Learn how to do large scale machine learning on Amazon EMR.

July 25, 2018 | 01:00 PM – 02:00 PM PT – Introduction to Amazon QuickSight: Business Analytics for Everyone – Get an introduction to Amazon Quicksight, Amazon’s BI service.

July 26, 2018 | 11:00 AM – 12:00 PM PT – Multi-Tenant Analytics on Amazon EMR – Discover how to make an Amazon EMR cluster multi-tenant to have different processing activities on the same data lake.

 

Compute

July 31, 2018 | 11:00 AM – 12:00 PM PT – Accelerate Machine Learning Workloads Using Amazon EC2 P3 Instances – Learn how to use Amazon EC2 P3 instances, the most powerful, cost-effective and versatile GPU compute instances available in the cloud.

August 1, 2018 | 09:00 AM – 10:00 AM PT – Technical Deep Dive on Amazon EC2 Fleet – Learn how to launch workloads across instance types, purchase models, and AZs with EC2 Fleet to achieve the desired scale, performance and cost.

 

Containers

July 25, 2018 | 11:00 AM – 11:45 AM PT – How Harry’s Shaved Off Their Operational Overhead by Moving to AWS Fargate – Learn how Harry’s migrated their messaging workload to Fargate and reduced message processing time by more than 75%.

 

Databases

July 23, 2018 | 01:00 PM – 01:45 PM PT – Purpose-Built Databases: Choose the Right Tool for Each Job – Learn about purpose-built databases and when to use which database for your application.

July 24, 2018 | 11:00 AM – 11:45 AM PT – Migrating IBM Db2 Databases to AWS – Learn how to migrate your IBM Db2 database to the cloud database of your choice.

 

DevOps

July 25, 2018 | 09:00 AM – 09:45 AM PT – Optimize Your Jenkins Build Farm – Learn how to optimize your Jenkins build farm using the plug-in for AWS CodeBuild.

 

Enterprise & Hybrid

July 31, 2018 | 09:00 AM – 09:45 AM PT – Enable Developer Productivity with Amazon WorkSpaces – Learn how your development teams can be more productive with Amazon WorkSpaces.

August 1, 2018 | 11:00 AM – 11:45 AM PT – Enterprise DevOps: Applying ITIL to Rapid Innovation – Innovation doesn’t have to equate to more risk for your organization. Learn how Enterprise DevOps delivers agility while maintaining governance, security and compliance.

 

IoT

July 30, 2018 | 01:00 PM – 01:45 PM PT – Using AWS IoT & Alexa Skills Kit to Voice-Control Connected Home Devices – Hands-on workshop that covers how to build a simple backend service using AWS IoT to support an Alexa Smart Home skill.

 

Machine Learning

July 23, 2018 | 09:00 AM – 09:45 AM PT – Leveraging ML Services to Enhance Content Discovery and Recommendations – See how customers are using computer vision and language AI services to enhance content discovery & recommendations.

July 24, 2018 | 09:00 AM – 09:45 AM PT – Hyperparameter Tuning with Amazon SageMaker’s Automatic Model Tuning – Learn how to use Automatic Model Tuning with Amazon SageMaker to get the best machine learning model for your datasets, to tune hyperparameters.

July 26, 2018 | 09:00 AM – 10:00 AM PT – Build Intelligent Applications with Machine Learning on AWS – Learn how to accelerate development of AI applications using machine learning on AWS.

 

re:Invent

July 18, 2018 | 08:00 AM – 08:30 AM PT – Episode 3: Training & Certification Round-Up – Join us as we dive deep with the AWS Training and Certification team on Bootcamps, Hands-on Labs, and how to get AWS Certified at re:Invent.

 

Security, Identity, & Compliance

July 30, 2018 | 11:00 AM – 11:45 AM PT – Get Started with Well-Architected Security Best Practices – Discover and walk through essential best practices for securing your workloads using a number of AWS services.

 

Serverless

July 24, 2018 | 01:00 PM – 02:00 PM PT – Getting Started with Serverless Computing Using AWS Lambda – Get an introduction to serverless and how to start building applications with no server management.

 

Storage

July 30, 2018 | 09:00 AM – 09:45 AM PT – Best Practices for Security in Amazon S3 – Learn about Amazon S3 security fundamentals and lots of new features that help make security simple.

Deploying a 4K, GPU-backed Linux desktop instance on AWS

Post Syndicated from Roshni Pary original https://aws.amazon.com/blogs/compute/deploying-4k-gpu-backed-linux-desktop-instance-on-aws/

Contributed by Amr Ragab, HPC Application Consultant, AWS Professional Services

AWS currently supports many managed des­ktop delivery mechanisms. Amazon WorkSpaces and Amazon AppStream 2.0 both deliver managed Windows-based machine images with GPU-backed instances. However, many desktop services and applications are better served through a Linux backed instance. Given the variety of Linux distributions as well as desktop managers, it can be valuable to have a generic solution for provisioning a Linux desktop on Amazon EC2.

A GPU-backed instance reduces the computational requirements from the client (local) machine, eliminating the need for a local discrete GPU to run graphical workloads. The framebuffer objects generated by the GPU are compressed when sent over the network, and decompressed by the local CPU resources. This allows clients to take advantage of the server GPU and display the high-resolution content on local thin clients, mobile devices, and low-powered desktops and laptops. Such GPU-backed Linux instances have been used for VFX rendering, computational drug discovery, and computational fluid dynamics (CFD) simulation use cases. An upcoming followup post details enabling this technology on the Windows platform.

Configuration

In this configuration, a client machine connects to the provisioned desktop (server) in the cloud. The server captures the framebuffer, which is sent in real time to the client machine over the network. Thus latency is an important metric to consider when provisioning this solution. I recommend choosing the nearest AWS Region (under 100 ms). Some customers may even prefer to install AWS Direct Connect.

Region Latency
US-East (Virginia) 18 ms
US East (Ohio) 31 ms
US-West (California) 77 ms
US-West (Oregon) 97 ms
Canada (Central) 29 ms
Europe (Ireland) 89 ms
Europe (London) 90 ms
Europe (Frankfurt) 108 ms
Asia Pacific (Mumbai) 197 ms
Asia Pacific (Seoul) 198 ms
Asia Pacific (Singapore) 288 ms
Asia Pacific (Sydney) 218 ms
Asia Pacific (Tokyo) 188 ms
South America (São Paulo) 138 ms
China (Beijing) 267 ms
AWS GovCloud (US) 97 ms

Source: http://www.cloudping.info/ from the Amazon offices located in Herndon, VA

Bandwidth requirements depend on the quality of the desktop experience as well as the desired resolution. Provision the backend Linux desktop instance with a 4096×2160 (4K) resolution. Depending on the specific G3 instance type selected, multi-GPU managed desktops give additional performance benefits. Each instance can also host multiple users, either in collaborative sessions, or with up to four independent 4K monitors. The GPU framebuffer memory used per session generally limits the number of sessions per managed desktop.

A smooth reliable experience depends on a low latency and high-bandwidth connection to the EC2 instance hosting the desktop. One of the benefits of using a multithreaded framebuffer reader is that only the defined block of the rendered desktop that is changing needs to be sent over the network. Full-screen redraws may be necessary only in rare cases. The minimum requirements for this 4K (3840×2160) configuration are as follows:

  • Bandwidth: 50 Mbps
  • Latency: < 30 ms
  • Jitter: < 5 ms

Deployment

Use RHEL/CentOS for the deployment. Except for DCV, this stack is compatible with Debian/Ubuntu distributions. Use the CentOS 7.5 Server AMI and install the NVIDIA/Xorg/KDE stack  to create a fully functioning desktop environment with a max resolution of 16384 x 8640 (that is, 4x4K) at 60 Hz.

This stack contains the following software:

  • CentOS 7.5 Base
  • Xorg 1.19
  • NVIDIA Grid Driver 6.1 (for the G3 instance family)
  • KDE Desktop environment
  • VirtualGL
  • TurboVNC
  • NICE DCV

To make the most efficient use of the NVIDIA Tesla M60 framebuffer memory, disable the compositing features of the desktop manager. Other non-compositing desktop managers (such as XFCE, MATE, etc.) are supported as well. This ensures that the GPU is reserved for specific OpenGL API tasks for the application, and that the performance is not impacted by the desktop environment decorations.

Start up a CentOS 7.5 server desktop based on the latest AMI available in the closest Region:

Distributor ID:    CentOS
Description:       CentOS Linux release 7.5.1804 (Core)
Release:           7.5.1804
Codename:          Core

Now install the Xorg stack with the KDE desktop manager:

sudo yum install epel-release
sudo yum update
sudo yum groupinstall "Development Tools"
sudo yum install xorg-* kernel-devel dkms python-pip lsb
sudo pip install awscli
sudo yum groupinstall "KDE Plasma Workspaces"
sudo systemctl disable firewalld #AWS security groups will provide our firewall rules
# if there is a kernel update
sudo reboot

Download the NVIDIA Grid driver (6.1). For more information, see Installing the NVIDIA Driver on Linux Instances.

aws s3 cp --recursive s3://ec2-linux-nvidia-drivers/ .
chmod +x latest/NVIDIA-Linux-x86_64-390.57-grid.run
sudo .latest/NVIDIA-Linux-x86_64-390.57-grid.run
# register the driver with dkms, ignore errors associated with 32bit compatible libraries

Deposit the xorg.conf file in /etc/X11/xorg.conf:

Section "ServerLayout"
        Identifier     "X.org Configured"
        Screen      0  "Screen0" 0 0
        InputDevice    "Mouse0" "CorePointer"
        InputDevice    "Keyboard0" "CoreKeyboard"
EndSection
 
Section "Files"
        ModulePath   "/usr/lib64/xorg/modules"
        FontPath     "catalogue:/etc/X11/fontpath.d"
        FontPath     "built-ins"
EndSection
 
Section "Module"
        Load  "glx"
EndSection
 
Section "InputDevice"
        Identifier  "Keyboard0"
        Driver      "kbd"
EndSection
 
Section "InputDevice"
        Identifier  "Mouse0"
        Driver      "mouse"
        Option      "Protocol" "auto"
        Option      "Device" "/dev/input/mice"
        Option      "ZAxisMapping" "4 5 6 7"
EndSection
 
Section "Monitor"
        Identifier   "Monitor0"
        VendorName   "Monitor Vendor"
        ModelName    "Monitor Model"
        Modeline "3840x2160_60.00"  712.34  3840 4152 4576 5312  2160 2161 2164 2235  -HSync +Vsync
EndSection

 
Section "Device"
        Identifier  "Card0"
        Driver      "nvidia"
        Option "ConnectToAcpid" "0"
        BusID       "PCI:0:30:0"
EndSection
 
Section "Screen"
        Identifier "Screen0"
        Device     "Card0"
        Monitor    "Monitor0"
        SubSection "Display"
                Viewport   0 0
                Depth     24
        Modes    "4096x2160" "3840x2160"
        EndSubSection
EndSection

Reboot again and check that the nvidia-gridd service is running. You may notice errors. They can be safely ignored after the nvidia-gridd service successfully acquires a license.

[[email protected] ~]# systemctl status nvidia-gridd.service
● nvidia-gridd.service - NVIDIA Grid Daemon
   Loaded: loaded (/usr/lib/systemd/system/nvidia-gridd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-05-29 18:37:35 UTC; 39s ago
  Process: 863 ExecStart=/usr/bin/nvidia-gridd (code=exited, status=0/SUCCESS)
 Main PID: 881 (nvidia-gridd)
   CGroup: /system.slice/nvidia-gridd.service
           └─881 /usr/bin/nvidia-gridd
May 29 18:37:35 ip-10-0-125-164.ec2.internal systemd[1]: Starting NVIDIA Grid Daemon...
May 29 18:37:35 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: Started (881)
May 29 18:37:35 ip-10-0-125-164.ec2.internal systemd[1]: Started NVIDIA Grid Daemon.
May 29 18:37:36 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: Configuration parameter ( ServerAddress  FeatureType) not set
May 29 18:37:40 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: Calling load_byte_array(tra)
May 29 18:37:41 ip-10-0-125-164.ec2.internal nvidia-gridd[881]: License acquired successfully (2)

You can confirm that 4K resolution is enabled by running the following command:

DISPLAY=:0 xrandr -q
Screen 0: minimum 8 x 8, current 4096 x 2160, maximum 16384 x 8640
DVI-D-0 connected primary 4096x2160+0+0 (normal left inverted right x axis y axis) 641mm x 400mm
2560x1600 59.86+
4096x2160 60.03*
3840x2160 60.00 

Finally, check that your underlying GL renderer is using the NVIDIA driver by querying glxinfo

DISPLAY=:0 glxinfo

OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: Quadro FX Tesla M60/PCIe/SSE2
OpenGL core profile version string: 4.5.0 NVIDIA 390.57
OpenGL core profile shading language version string: 4.50 NVIDIA
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.6.0 NVIDIA 390.57
OpenGL shading language version string: 4.60 NVIDIA

At the time of publication, OpenGL 4.5 is enabled. Your applications can take advantage of that API for rendering.

To interact with the instance, install server-side desktop remote display software that can specifically take advantage of the 3D hardware acceleration. For example, AWS provides the NICE DCV platform.

DCV is an accelerated remote desktop framework that provides in-web browser desktop connections. DCV is supported in both Windows and Linux (RHEL/CentOS). In the Windows platform, OpenGL and DirectX are fully supported. DCV entitlement is free when provisioning on AWS. NICE DCV is also provided as a component to the AWS EnginFrame and myHPC solutions.

To install DCV, download the NICE DCV 2017 EL7 archive and Administrative Guide. After you extract the archive in the instance, you see a list of nice-* RPMS. You don’t have to worry about licensing, as the installer captures that the instance is running in AWS.

sudo yum localinstall nice-*
sudo systemctl enable dcvserver
sudo systemctl start dcvserver

When the DCV server starts, you have the option to create a single console session or multiple virtual sessions. You must assign a password for the CentOS user issued, by running the following command:

sudo passwd centos

Start the console session:

sudo dcv create-session --type=console --owner centos session1
sudo dcv list-sessions

The AWS security groups are enabled to allow TCP 8443 traffic to the instance. You see the DCV login portal and can interact with the instance. Other popular frameworks include the following:

You can also find plug and play images for managed desktops in the AWS Marketplace.

Optimization

Implement the changes outlined in the Optimizing GPU Settings (P2, P3, and G3 Instances) topic. You can turn off the autoboost feature and set the maximum graphics and memory clocks manually.

sudo nvidia-smi --auto-boost-default=0
sudo nvidia-smi -ac 2505,1177

Application testing

For testing, look at PyMOL (PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC.). PyMOL is a standard commercial drug discovery application that is used for processing, and visualizing biochemical structures.  I used the opensource fork.

With the NVIDIA GRID licensing enabled earlier, PyMOL can take advantage of the Quadro features supplied by the Tesla M60. After it’s installed and loaded, you can confirm the functionality of the entire G3 instance software stack installed earlier:

PyMOL(TM) Molecular Graphics System, Version 2.1.0.
 Copyright (c) Schrodinger, LLC.
 All Rights Reserved.
 
    Created by Warren L. DeLano, Ph.D. 
 
    PyMOL is user-supported open-source software.  Although some versions
    are freely available, PyMOL is not in the public domain.
 
    If PyMOL is helpful in your work or study, then please volunteer 
    support for our ongoing efforts to create open and affordable scientific
    software by purchasing a PyMOL Maintenance and/or Support subscription.

    More information can be found at "http://www.pymol.org".
 
    Enter "help" for a list of commands.
    Enter "help <command-name>" for information on a specific command.

 Hit ESC anytime to toggle between text and graphics.

 Detected OpenGL version 2.0 or greater. Shaders available.
 Detected GLSL version 4.60.
 OpenGL graphics engine:
  GL_VENDOR:   NVIDIA Corporation
  GL_RENDERER: Quadro FX Tesla M60/PCIe/SSE2
  GL_VERSION:  4.6.0 NVIDIA 390.57
 Adapting to Quadro hardware.
 Detected 16 CPU cores.  Enabled multithreaded rendering.

In the PyMOL window, run “fetch 5ta3”, which is a 39k amino acid protein, under the 4K desktop environment. Rotating and translating the protein should be smooth and respond quickly to pointer events.

The PyMOL Gallery contains other representative examples that take advantage of various visualization and processing workflows. Also, you can find many demos (choose Wizard, Demo).

Under the Sculpting demo, you can show the pointer latency between the client and server.

Finally, look at ray tracing. From the PyMOL wiki, take a chemical structure and render each frame with ray tracing to produce a video. On the Tesla M60 with Quadro features enabled, the total render time was approximately 1 minute.

Scalability

As I mentioned previously, the framebuffer redirection protocols have a feature set to create multiple virtual sessions per node. A virtual session is not necessarily tied to a single user either. In other words, the number of independent virtual sessions is limited by the total amount of GPU frame buffer memory used in all sessions per GPU. Thus, it’s possible to scale horizontally by increasing the number of G3 instances, or vertically by using larger instance types in the G3 family.

Summary

The G3 instance type is purpose-built to provide a managed, high-end professional graphics infrastructure for visual computing needs. With NICE DCV, you can take advantage of NVIDIA Quadro software features for a range of applications including drug discovery and VFX rendering. Connected with the AWS high-performance network backbone, the instance can become an integral part of your graphics workload pipeline. Now, you can power up and deliver your applications to teams working anywhere in the world.

EC2 Instance Update – M5 Instances with Local NVMe Storage (M5d)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/ec2-instance-update-m5-instances-with-local-nvme-storage-m5d/

Earlier this month we launched the C5 Instances with Local NVMe Storage and I told you that we would be doing the same for additional instance types in the near future!

Today we are introducing M5 instances equipped with local NVMe storage. Available for immediate use in 5 regions, these instances are a great fit for workloads that require a balance of compute and memory resources. Here are the specs:

Instance Name vCPUs RAM Local Storage EBS-Optimized Bandwidth Network Bandwidth
m5d.large 2 8 GiB 1 x 75 GB NVMe SSD Up to 2.120 Gbps Up to 10 Gbps
m5d.xlarge 4 16 GiB 1 x 150 GB NVMe SSD Up to 2.120 Gbps Up to 10 Gbps
m5d.2xlarge 8 32 GiB 1 x 300 GB NVMe SSD Up to 2.120 Gbps Up to 10 Gbps
m5d.4xlarge 16 64 GiB 1 x 600 GB NVMe SSD 2.210 Gbps Up to 10 Gbps
m5d.12xlarge 48 192 GiB 2 x 900 GB NVMe SSD 5.0 Gbps 10 Gbps
m5d.24xlarge 96 384 GiB 4 x 900 GB NVMe SSD 10.0 Gbps 25 Gbps

The M5d instances are powered by Custom Intel® Xeon® Platinum 8175M series processors running at 2.5 GHz, including support for AVX-512.

You can use any AMI that includes drivers for the Elastic Network Adapter (ENA) and NVMe; this includes the latest Amazon Linux, Microsoft Windows (Server 2008 R2, Server 2012, Server 2012 R2 and Server 2016), Ubuntu, RHEL, SUSE, and CentOS AMIs.

Here are a couple of things to keep in mind about the local NVMe storage on the M5d instances:

Naming – You don’t have to specify a block device mapping in your AMI or during the instance launch; the local storage will show up as one or more devices (/dev/nvme*1 on Linux) after the guest operating system has booted.

Encryption – Each local NVMe device is hardware encrypted using the XTS-AES-256 block cipher and a unique key. Each key is destroyed when the instance is stopped or terminated.

Lifetime – Local NVMe devices have the same lifetime as the instance they are attached to, and do not stick around after the instance has been stopped or terminated.

Available Now
M5d instances are available in On-Demand, Reserved Instance, and Spot form in the US East (N. Virginia), US West (Oregon), EU (Ireland), US East (Ohio), and Canada (Central) Regions. Prices vary by Region, and are just a bit higher than for the equivalent M5 instances.

Jeff;

 

AWS Online Tech Talks – June 2018

Post Syndicated from Devin Watson original https://aws.amazon.com/blogs/aws/aws-online-tech-talks-june-2018/

AWS Online Tech Talks – June 2018

Join us this month to learn about AWS services and solutions. New this month, we have a fireside chat with the GM of Amazon WorkSpaces and our 2nd episode of the “How to re:Invent” series. We’ll also cover best practices, deep dives, use cases and more! Join us and register today!

Note – All sessions are free and in Pacific Time.

Tech talks featured this month:

 

Analytics & Big Data

June 18, 2018 | 11:00 AM – 11:45 AM PTGet Started with Real-Time Streaming Data in Under 5 Minutes – Learn how to use Amazon Kinesis to capture, store, and analyze streaming data in real-time including IoT device data, VPC flow logs, and clickstream data.
June 20, 2018 | 11:00 AM – 11:45 AM PT – Insights For Everyone – Deploying Data across your Organization – Learn how to deploy data at scale using AWS Analytics and QuickSight’s new reader role and usage based pricing.

 

AWS re:Invent
June 13, 2018 | 05:00 PM – 05:30 PM PTEpisode 2: AWS re:Invent Breakout Content Secret Sauce – Hear from one of our own AWS content experts as we dive deep into the re:Invent content strategy and how we maintain a high bar.
Compute

June 25, 2018 | 01:00 PM – 01:45 PM PTAccelerating Containerized Workloads with Amazon EC2 Spot Instances – Learn how to efficiently deploy containerized workloads and easily manage clusters at any scale at a fraction of the cost with Spot Instances.

June 26, 2018 | 01:00 PM – 01:45 PM PTEnsuring Your Windows Server Workloads Are Well-Architected – Get the benefits, best practices and tools on running your Microsoft Workloads on AWS leveraging a well-architected approach.

 

Containers
June 25, 2018 | 09:00 AM – 09:45 AM PTRunning Kubernetes on AWS – Learn about the basics of running Kubernetes on AWS including how setup masters, networking, security, and add auto-scaling to your cluster.

 

Databases

June 18, 2018 | 01:00 PM – 01:45 PM PTOracle to Amazon Aurora Migration, Step by Step – Learn how to migrate your Oracle database to Amazon Aurora.
DevOps

June 20, 2018 | 09:00 AM – 09:45 AM PTSet Up a CI/CD Pipeline for Deploying Containers Using the AWS Developer Tools – Learn how to set up a CI/CD pipeline for deploying containers using the AWS Developer Tools.

 

Enterprise & Hybrid
June 18, 2018 | 09:00 AM – 09:45 AM PTDe-risking Enterprise Migration with AWS Managed Services – Learn how enterprise customers are de-risking cloud adoption with AWS Managed Services.

June 19, 2018 | 11:00 AM – 11:45 AM PTLaunch AWS Faster using Automated Landing Zones – Learn how the AWS Landing Zone can automate the set up of best practice baselines when setting up new

 

AWS Environments

June 21, 2018 | 11:00 AM – 11:45 AM PTLeading Your Team Through a Cloud Transformation – Learn how you can help lead your organization through a cloud transformation.

June 21, 2018 | 01:00 PM – 01:45 PM PTEnabling New Retail Customer Experiences with Big Data – Learn how AWS can help retailers realize actual value from their big data and deliver on differentiated retail customer experiences.

June 28, 2018 | 01:00 PM – 01:45 PM PTFireside Chat: End User Collaboration on AWS – Learn how End User Compute services can help you deliver access to desktops and applications anywhere, anytime, using any device.
IoT

June 27, 2018 | 11:00 AM – 11:45 AM PTAWS IoT in the Connected Home – Learn how to use AWS IoT to build innovative Connected Home products.

 

Machine Learning

June 19, 2018 | 09:00 AM – 09:45 AM PTIntegrating Amazon SageMaker into your Enterprise – Learn how to integrate Amazon SageMaker and other AWS Services within an Enterprise environment.

June 21, 2018 | 09:00 AM – 09:45 AM PTBuilding Text Analytics Applications on AWS using Amazon Comprehend – Learn how you can unlock the value of your unstructured data with NLP-based text analytics.

 

Management Tools

June 20, 2018 | 01:00 PM – 01:45 PM PTOptimizing Application Performance and Costs with Auto Scaling – Learn how selecting the right scaling option can help optimize application performance and costs.

 

Mobile
June 25, 2018 | 11:00 AM – 11:45 AM PTDrive User Engagement with Amazon Pinpoint – Learn how Amazon Pinpoint simplifies and streamlines effective user engagement.

 

Security, Identity & Compliance

June 26, 2018 | 09:00 AM – 09:45 AM PTUnderstanding AWS Secrets Manager – Learn how AWS Secrets Manager helps you rotate and manage access to secrets centrally.
June 28, 2018 | 09:00 AM – 09:45 AM PTUsing Amazon Inspector to Discover Potential Security Issues – See how Amazon Inspector can be used to discover security issues of your instances.

 

Serverless

June 19, 2018 | 01:00 PM – 01:45 PM PTProductionize Serverless Application Building and Deployments with AWS SAM – Learn expert tips and techniques for building and deploying serverless applications at scale with AWS SAM.

 

Storage

June 26, 2018 | 11:00 AM – 11:45 AM PTDeep Dive: Hybrid Cloud Storage with AWS Storage Gateway – Learn how you can reduce your on-premises infrastructure by using the AWS Storage Gateway to connecting your applications to the scalable and reliable AWS storage services.
June 27, 2018 | 01:00 PM – 01:45 PM PTChanging the Game: Extending Compute Capabilities to the Edge – Discover how to change the game for IIoT and edge analytics applications with AWS Snowball Edge plus enhanced Compute instances.
June 28, 2018 | 11:00 AM – 11:45 AM PTBig Data and Analytics Workloads on Amazon EFS – Get best practices and deployment advice for running big data and analytics workloads on Amazon EFS.

Storing Encrypted Credentials In Git

Post Syndicated from Bozho original https://techblog.bozho.net/storing-encrypted-credentials-in-git/

We all know that we should not commit any passwords or keys to the repo with our code (no matter if public or private). Yet, thousands of production passwords can be found on GitHub (and probably thousands more in internal company repositories). Some have tried to fix that by removing the passwords (once they learned it’s not a good idea to store them publicly), but passwords have remained in the git history.

Knowing what not to do is the first and very important step. But how do we store production credentials. Database credentials, system secrets (e.g. for HMACs), access keys for 3rd party services like payment providers or social networks. There doesn’t seem to be an agreed upon solution.

I’ve previously argued with the 12-factor app recommendation to use environment variables – if you have a few that might be okay, but when the number of variables grow (as in any real application), it becomes impractical. And you can set environment variables via a bash script, but you’d have to store it somewhere. And in fact, even separate environment variables should be stored somewhere.

This somewhere could be a local directory (risky), a shared storage, e.g. FTP or S3 bucket with limited access, or a separate git repository. I think I prefer the git repository as it allows versioning (Note: S3 also does, but is provider-specific). So you can store all your environment-specific properties files with all their credentials and environment-specific configurations in a git repo with limited access (only Ops people). And that’s not bad, as long as it’s not the same repo as the source code.

Such a repo would look like this:

project
└─── production
|   |   application.properites
|   |   keystore.jks
└─── staging
|   |   application.properites
|   |   keystore.jks
└─── on-premise-client1
|   |   application.properites
|   |   keystore.jks
└─── on-premise-client2
|   |   application.properites
|   |   keystore.jks

Since many companies are using GitHub or BitBucket for their repositories, storing production credentials on a public provider may still be risky. That’s why it’s a good idea to encrypt the files in the repository. A good way to do it is via git-crypt. It is “transparent” encryption because it supports diff and encryption and decryption on the fly. Once you set it up, you continue working with the repo as if it’s not encrypted. There’s even a fork that works on Windows.

You simply run git-crypt init (after you’ve put the git-crypt binary on your OS Path), which generates a key. Then you specify your .gitattributes, e.g. like that:

secretfile filter=git-crypt diff=git-crypt
*.key filter=git-crypt diff=git-crypt
*.properties filter=git-crypt diff=git-crypt
*.jks filter=git-crypt diff=git-crypt

And you’re done. Well, almost. If this is a fresh repo, everything is good. If it is an existing repo, you’d have to clean up your history which contains the unencrypted files. Following these steps will get you there, with one addition – before calling git commit, you should call git-crypt status -f so that the existing files are actually encrypted.

You’re almost done. We should somehow share and backup the keys. For the sharing part, it’s not a big issue to have a team of 2-3 Ops people share the same key, but you could also use the GPG option of git-crypt (as documented in the README). What’s left is to backup your secret key (that’s generated in the .git/git-crypt directory). You can store it (password-protected) in some other storage, be it a company shared folder, Dropbox/Google Drive, or even your email. Just make sure your computer is not the only place where it’s present and that it’s protected. I don’t think key rotation is necessary, but you can devise some rotation procedure.

git-crypt authors claim to shine when it comes to encrypting just a few files in an otherwise public repo. And recommend looking at git-remote-gcrypt. But as often there are non-sensitive parts of environment-specific configurations, you may not want to encrypt everything. And I think it’s perfectly fine to use git-crypt even in a separate repo scenario. And even though encryption is an okay approach to protect credentials in your source code repo, it’s still not necessarily a good idea to have the environment configurations in the same repo. Especially given that different people/teams manage these credentials. Even in small companies, maybe not all members have production access.

The outstanding questions in this case is – how do you sync the properties with code changes. Sometimes the code adds new properties that should be reflected in the environment configurations. There are two scenarios here – first, properties that could vary across environments, but can have default values (e.g. scheduled job periods), and second, properties that require explicit configuration (e.g. database credentials). The former can have the default values bundled in the code repo and therefore in the release artifact, allowing external files to override them. The latter should be announced to the people who do the deployment so that they can set the proper values.

The whole process of having versioned environment-speific configurations is actually quite simple and logical, even with the encryption added to the picture. And I think it’s a good security practice we should try to follow.

The post Storing Encrypted Credentials In Git appeared first on Bozho's tech blog.

Some quick thoughts on the public discussion regarding facial recognition and Amazon Rekognition this past week

Post Syndicated from Dr. Matt Wood original https://aws.amazon.com/blogs/aws/some-quick-thoughts-on-the-public-discussion-regarding-facial-recognition-and-amazon-rekognition-this-past-week/

We have seen a lot of discussion this past week about the role of Amazon Rekognition in facial recognition, surveillance, and civil liberties, and we wanted to share some thoughts.

Amazon Rekognition is a service we announced in 2016. It makes use of new technologies – such as deep learning – and puts them in the hands of developers in an easy-to-use, low-cost way. Since then, we have seen customers use the image and video analysis capabilities of Amazon Rekognition in ways that materially benefit both society (e.g. preventing human trafficking, inhibiting child exploitation, reuniting missing children with their families, and building educational apps for children), and organizations (enhancing security through multi-factor authentication, finding images more easily, or preventing package theft). Amazon Web Services (AWS) is not the only provider of services like these, and we remain excited about how image and video analysis can be a driver for good in the world, including in the public sector and law enforcement.

There have always been and will always be risks with new technology capabilities. Each organization choosing to employ technology must act responsibly or risk legal penalties and public condemnation. AWS takes its responsibilities seriously. But we believe it is the wrong approach to impose a ban on promising new technologies because they might be used by bad actors for nefarious purposes in the future. The world would be a very different place if we had restricted people from buying computers because it was possible to use that computer to do harm. The same can be said of thousands of technologies upon which we all rely each day. Through responsible use, the benefits have far outweighed the risks.

Customers are off to a great start with Amazon Rekognition; the evidence of the positive impact this new technology can provide is strong (and growing by the week), and we’re excited to continue to support our customers in its responsible use.

-Dr. Matt Wood, general manager of artificial intelligence at AWS

Majority of Canadians Consume Online Content Legally, Survey Finds

Post Syndicated from Andy original https://torrentfreak.com/majority-of-canadians-consume-online-content-legally-survey-finds-180531/

Back in January, a coalition of companies and organizations with ties to the entertainment industries called on local telecoms regulator CRTC to implement a national website blocking regime.

Under the banner of Fairplay Canada, members including Bell, Cineplex, Directors Guild of Canada, Maple Leaf Sports and Entertainment, Movie Theatre Association of Canada, and Rogers Media, spoke of an industry under threat from marauding pirates. But just how serious is this threat?

The results of a new survey commissioned by Innovation Science and Economic Development Canada (ISED) in collaboration with the Department of Canadian Heritage (PCH) aims to shine light on the problem by revealing the online content consumption habits of citizens in the Great White North.

While there are interesting findings for those on both sides of the site-blocking debate, the situation seems somewhat removed from the Armageddon scenario predicted by the entertainment industries.

Carried out among 3,301 Canadians aged 12 years and over, the Kantar TNS study aims to cover copyright infringement in six key content areas – music, movies, TV shows, video games, computer software, and eBooks. Attitudes and behaviors are also touched upon while measuring the effectiveness of Canada’s copyright measures.

General Digital Content Consumption

In its introduction, the report notes that 28 million Canadians used the Internet in the three-month study period to November 27, 2017. Of those, 22 million (80%) consumed digital content. Around 20 million (73%) streamed or accessed content, 16 million (59%) downloaded content, while 8 million (28%) shared content.

Music, TV shows and movies all battled for first place in the consumption ranks, with 48%, 48%, and 46% respectively.

Copyright Infringement

According to the study, the majority of Canadians do things completely by the book. An impressive 74% of media-consuming respondents said that they’d only accessed material from legal sources in the preceding three months.

The remaining 26% admitted to accessing at least one illegal file in the same period. Of those, just 5% said that all of their consumption was from illegal sources, with movies (36%), software (36%), TV shows (34%) and video games (33%) the most likely content to be consumed illegally.

Interestingly, the study found that few demographic factors – such as gender, region, rural and urban, income, employment status and language – play a role in illegal content consumption.

“We found that only age and income varied significantly between consumers who infringed by downloading or streaming/accessing content online illegally and consumers who did not consume infringing content online,” the report reads.

“More specifically, the profile of consumers who downloaded or streamed/accessed infringing content skewed slightly younger and towards individuals with household incomes of $100K+.”

Licensed services much more popular than pirate haunts

It will come as no surprise that Netflix was the most popular service with consumers, with 64% having used it in the past three months. Sites like YouTube and Facebook were a big hit too, visited by 36% and 28% of content consumers respectively.

Overall, 74% of online content consumers use licensed services for content while 42% use social networks. Under a third (31%) use a combination of peer-to-peer (BitTorrent), cyberlocker platforms, or linking sites. Stream-ripping services are used by 9% of content consumers.

“Consumers who reported downloading or streaming/accessing infringing content only are less likely to use licensed services and more likely to use peer-to-peer/cyberlocker/linking sites than other consumers of online content,” the report notes.

Attitudes towards legal consumption & infringing content

In common with similar surveys over the years, the Kantar research looked at the reasons why people consume content from various sources, both legal and otherwise.

Convenience (48%), speed (36%) and quality (34%) were the most-cited reasons for using legal sources. An interesting 33% of respondents said they use legal sites to avoid using illegal sources.

On the illicit front, 54% of those who obtained unauthorized content in the previous three months said they did so due to it being free, with 40% citing convenience and 34% mentioning speed.

Almost six out of ten (58%) said lower costs would encourage them to switch to official sources, with 47% saying they’d move if legal availability was improved.

Canada’s ‘Notice-and-Notice’ warning system

People in Canada who share content on peer-to-peer systems like BitTorrent without permission run the risk of receiving an infringement notice warning them to stop. These are sent by copyright holders via users’ ISPs and the hope is that the shock of receiving a warning will turn consumers back to the straight and narrow.

The study reveals that 10% of online content consumers over the age of 12 have received one of these notices but what kind of effect have they had?

“Respondents reported that receiving such a notice resulted in the following: increased awareness of copyright infringement (38%), taking steps to ensure password protected home networks (27%), a household discussion about copyright infringement (27%), and discontinuing illegal downloading or streaming (24%),” the report notes.

While these are all positives for the entertainment industries, Kantar reports that almost a quarter (24%) of people who receive a notice simply ignore them.

Stream-ripping

Once upon a time, people obtaining music via P2P networks was cited as the music industry’s greatest threat but, with the advent of sites like YouTube, so-called stream-ripping is the latest bogeyman.

According to the study, 11% of Internet users say they’ve used a stream-ripping service. They are most likely to be male (62%) and predominantly 18 to 34 (52%) years of age.

“Among Canadians who have used a service to stream-rip music or entertainment, nearly half (48%) have used stream-ripping sites, one-third have used downloader apps (38%), one-in-seven (14%) have used a stream-ripping plug-in, and one-in-ten (10%) have used stream-ripping software,” the report adds.

Set-Top Boxes and VPNs

Few general piracy studies would be complete in 2018 without touching on set-top devices and Virtual Private Networks and this report doesn’t disappoint.

More than one in five (21%) respondents aged 12+ reported using a VPN, with the main purpose of securing communications and Internet browsing (57%).

A relatively modest 36% said they use a VPN to access free content while 32% said the aim was to access geo-blocked content unavailable in Canada. Just over a quarter (27%) said that accessing content from overseas at a reasonable price was the main motivator.

One in ten (10%) of respondents reported using a set-top box, with 78% stating they use them to access paid-for content. Interestingly, only a small number say they use the devices to infringe.

“A minority use set-top boxes to access other content that is not legal or they are unsure if it is legal (16%), or to access live sports that are not legal or they are unsure if it is legal (11%),” the report notes.

“Individuals who consumed a mix of legal and illegal content online are more likely to use VPN services (42%) or TV set-top boxes (21%) than consumers who only downloaded or streamed/accessed legal content.”

Kantar says that the findings of the report will be used to help policymakers evaluate how Canada’s Copyright Act is coping with a changing market and technological developments.

“This research will provide the necessary information required to further develop copyright policy in Canada, as well as to provide a foundation to assess the effectiveness of the measures to address copyright infringement, should future analysis be undertaken,” it concludes.

The full report can be found here (pdf)

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

MagPi 70: Home automation with Raspberry Pi

Post Syndicated from Rob Zwetsloot original https://www.raspberrypi.org/blog/magpi-70-home-automation/

Hey folks, Rob here! It’s the last Thursday of the month, and that means it’s time for a brand-new The MagPi. Issue 70 is all about home automation using your favourite microcomputer, the Raspberry Pi.

Cover of The MagPi 70 — Raspberry Pi home automation and tech upcycling

Home automation in this month’s The MagPi!

Raspberry Pi home automation

We think home automation is an excellent use of the Raspberry Pi, hiding it around your house and letting it power your lights and doorbells and…fish tanks? We show you how to do all of that, and give you some excellent tips on how to add even more automation to your home in our ten-page cover feature.

Upcycle your life

Our other big feature this issue covers upcycling, the hot trend of taking old electronics and making them better than new with some custom code and a tactically placed Raspberry Pi. For this feature, we had a chat with Martin Mander, upcycler extraordinaire, to find out his top tips for hacking your old hardware.

Article on upcycling in The MagPi 70 — Raspberry Pi home automation and tech upcycling

Upcycling is a lot of fun

But wait, there’s more!

If for some reason you want even more content, you’re in luck! We have some fun tutorials for you to try, like creating a theremin and turning a Babbage into an IoT nanny cam. We also continue our quest to make a video game in C++. Our project showcase is headlined by the Teslonda on page 28, a Honda/Tesla car hybrid that is just wonderful.

Diddyborg V2 review in The MagPi 70 — Raspberry Pi home automation and tech upcycling

We review PiBorg’s latest robot

All this comes with our definitive reviews and the community section where we celebrate you, our amazing community! You’re all good beans

Teslonda article in The MagPi 70 — Raspberry Pi home automation and tech upcycling

An amazing, and practical, Raspberry Pi project

Get The MagPi 70

Issue 70 is available today from WHSmith, Tesco, Sainsbury’s, and Asda. If you live in the US, head over to your local Barnes & Noble or Micro Center in the next few days for a print copy. You can also get the new issue online from our store, or digitally via our Android and iOS apps. And don’t forget, there’s always the free PDF as well.

New subscription offer!

Want to support the Raspberry Pi Foundation and the magazine? We’ve launched a new way to subscribe to the print version of The MagPi: you can now take out a monthly £4 subscription to the magazine, effectively creating a rolling pre-order system that saves you money on each issue.

The MagPi subscription offer — Raspberry Pi home automation and tech upcycling

You can also take out a twelve-month print subscription and get a Pi Zero W plus case and adapter cables absolutely free! This offer does not currently have an end date.

That’s it for today! See you next month.

Animated GIF: a door slides open and Captain Picard emerges hesitantly

The post MagPi 70: Home automation with Raspberry Pi appeared first on Raspberry Pi.

Hiring a Director of Sales

Post Syndicated from Yev original https://www.backblaze.com/blog/hiring-a-director-of-sales/

Backblaze is hiring a Director of Sales. This is a critical role for Backblaze as we continue to grow the team. We need a strong leader who has experience in scaling a sales team and who has an excellent track record for exceeding goals by selling Software as a Service (SaaS) solutions. In addition, this leader will need to be highly motivated, as well as able to create and develop a highly-motivated, success oriented sales team that has fun and enjoys what they do.

The History of Backblaze from our CEO
In 2007, after a friend’s computer crash caused her some suffering, we realized that with every photo, video, song, and document going digital, everyone would eventually lose all of their information. Five of us quit our jobs to start a company with the goal of making it easy for people to back up their data.

Like many startups, for a while we worked out of a co-founder’s one-bedroom apartment. Unlike most startups, we made an explicit agreement not to raise funding during the first year. We would then touch base every six months and decide whether to raise or not. We wanted to focus on building the company and the product, not on pitching and slide decks. And critically, we wanted to build a culture that understood money comes from customers, not the magical VC giving tree. Over the course of 5 years we built a profitable, multi-million dollar revenue business — and only then did we raise a VC round.

Fast forward 10 years later and our world looks quite different. You’ll have some fantastic assets to work with:

  • A brand millions recognize for openness, ease-of-use, and affordability.
  • A computer backup service that stores over 500 petabytes of data, has recovered over 30 billion files for hundreds of thousands of paying customers — most of whom self-identify as being the people that find and recommend technology products to their friends.
  • Our B2 service that provides the lowest cost cloud storage on the planet at 1/4th the price Amazon, Google or Microsoft charges. While being a newer product on the market, it already has over 100,000 IT and developers signed up as well as an ecosystem building up around it.
  • A growing, profitable and cash-flow positive company.
  • And last, but most definitely not least: a great sales team.

You might be saying, “sounds like you’ve got this under control — why do you need me?” Don’t be misled. We need you. Here’s why:

  • We have a great team, but we are in the process of expanding and we need to develop a structure that will easily scale and provide the most success to drive revenue.
  • We just launched our outbound sales efforts and we need someone to help develop that into a fully successful program that’s building a strong pipeline and closing business.
  • We need someone to work with the marketing department and figure out how to generate more inbound opportunities that the sales team can follow up on and close.
  • We need someone who will work closely in developing the skills of our current sales team and build a path for career growth and advancement.
  • We want someone to manage our Customer Success program.

So that’s a bit about us. What are we looking for in you?

Experience: As a sales leader, you will strategically build and drive the territory’s sales pipeline by assembling and leading a skilled team of sales professionals. This leader should be familiar with generating, developing and closing software subscription (SaaS) opportunities. We are looking for a self-starter who can manage a team and make an immediate impact of selling our Backup and Cloud Storage solutions. In this role, the sales leader will work closely with the VP of Sales, marketing staff, and service staff to develop and implement specific strategic plans to achieve and exceed revenue targets, including new business acquisition as well as build out our customer success program.

Leadership: We have an experienced team who’s brought us to where we are today. You need to have the people and management skills to get them excited about working with you. You need to be a strong leader and compassionate about developing and supporting your team.

Data driven and creative: The data has to show something makes sense before we scale it up. However, without creativity, it’s easy to say “the data shows it’s impossible” or to find a local maximum. Whether it’s deciding how to scale the team, figuring out what our outbound sales efforts should look like or putting a plan in place to develop the team for career growth, we’ve seen a bit of creativity get us places a few extra dollars couldn’t.

Jive with our culture: Strong leaders affect culture and the person we hire for this role may well shape, not only fit into, ours. But to shape the culture you have to be accepted by the organism, which means a certain set of shared values. We default to openness with our team, our customers, and everyone if possible. We love initiative — without arrogance or dictatorship. We work to create a place people enjoy showing up to work. That doesn’t mean ping pong tables and foosball (though we do try to have perks & fun), but it means people are friendly, non-political, working to build a good service but also a good place to work.

Do the work: Ideas and strategy are critical, but good execution makes them happen. We’re looking for someone who can help the team execute both from the perspective of being capable of guiding and organizing, but also someone who is hands-on themselves.

Additional Responsibilities needed for this role:

  • Recruit, coach, mentor, manage and lead a team of sales professionals to achieve yearly sales targets. This includes closing new business and expanding upon existing clientele.
  • Expand the customer success program to provide the best customer experience possible resulting in upsell opportunities and a high retention rate.
  • Develop effective sales strategies and deliver compelling product demonstrations and sales pitches.
  • Acquire and develop the appropriate sales tools to make the team efficient in their daily work flow.
  • Apply a thorough understanding of the marketplace, industry trends, funding developments, and products to all management activities and strategic sales decisions.
  • Ensure that sales department operations function smoothly, with the goal of facilitating sales and/or closings; operational responsibilities include accurate pipeline reporting and sales forecasts.
  • This position will report directly to the VP of Sales and will be staffed in our headquarters in San Mateo, CA.

Requirements:

  • 7 – 10+ years of successful sales leadership experience as measured by sales performance against goals.
    Experience in developing skill sets and providing career growth and opportunities through advancement of team members.
  • Background in selling SaaS technologies with a strong track record of success.
  • Strong presentation and communication skills.
  • Must be able to travel occasionally nationwide.
  • BA/BS degree required

Think you want to join us on this adventure?
Send an email to jobscontact@backblaze.com with the subject “Director of Sales.” (Recruiters and agencies, please don’t email us.) Include a resume and answer these two questions:

  1. How would you approach evaluating the current sales team and what is your process for developing a growth strategy to scale the team?
  2. What are the goals you would set for yourself in the 3 month and 1-year timeframes?

Thank you for taking the time to read this and I hope that this sounds like the opportunity for which you’ve been waiting.

Backblaze is an Equal Opportunity Employer.

The post Hiring a Director of Sales appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Getting Rid of Your Mac? Here’s How to Securely Erase a Hard Drive or SSD

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/how-to-wipe-a-mac-hard-drive/

erasing a hard drive and a solid state drive

What do I do with a Mac that still has personal data on it? Do I take out the disk drive and smash it? Do I sweep it with a really strong magnet? Is there a difference in how I handle a hard drive (HDD) versus a solid-state drive (SSD)? Well, taking a sledgehammer or projectile weapon to your old machine is certainly one way to make the data irretrievable, and it can be enormously cathartic as long as you follow appropriate safety and disposal protocols. But there are far less destructive ways to make sure your data is gone for good. Let me introduce you to secure erasing.

Which Type of Drive Do You Have?

Before we start, you need to know whether you have a HDD or a SSD. To find out, or at least to make sure, you click on the Apple menu and select “About this Mac.” Once there, select the “Storage” tab to see which type of drive is in your system.

The first example, below, shows a SATA Disk (HDD) in the system.

SATA HDD

In the next case, we see we have a Solid State SATA Drive (SSD), plus a Mac SuperDrive.

Mac storage dialog showing SSD

The third screen shot shows an SSD, as well. In this case it’s called “Flash Storage.”

Flash Storage

Make Sure You Have a Backup

Before you get started, you’ll want to make sure that any important data on your hard drive has moved somewhere else. OS X’s built-in Time Machine backup software is a good start, especially when paired with Backblaze. You can learn more about using Time Machine in our Mac Backup Guide.

With a local backup copy in hand and secure cloud storage, you know your data is always safe no matter what happens.

Once you’ve verified your data is backed up, roll up your sleeves and get to work. The key is OS X Recovery — a special part of the Mac operating system since OS X 10.7 “Lion.”

How to Wipe a Mac Hard Disk Drive (HDD)

NOTE: If you’re interested in wiping an SSD, see below.

    1. Make sure your Mac is turned off.
    2. Press the power button.
    3. Immediately hold down the command and R keys.
    4. Wait until the Apple logo appears.
    5. Select “Disk Utility” from the OS X Utilities list. Click Continue.
    6. Select the disk you’d like to erase by clicking on it in the sidebar.
    7. Click the Erase button.
    8. Click the Security Options button.
    9. The Security Options window includes a slider that enables you to determine how thoroughly you want to erase your hard drive.

There are four notches to that Security Options slider. “Fastest” is quick but insecure — data could potentially be rebuilt using a file recovery app. Moving that slider to the right introduces progressively more secure erasing. Disk Utility’s most secure level erases the information used to access the files on your disk, then writes zeroes across the disk surface seven times to help remove any trace of what was there. This setting conforms to the DoD 5220.22-M specification.

  1. Once you’ve selected the level of secure erasing you’re comfortable with, click the OK button.
  2. Click the Erase button to begin. Bear in mind that the more secure method you select, the longer it will take. The most secure methods can add hours to the process.

Once it’s done, the Mac’s hard drive will be clean as a whistle and ready for its next adventure: a fresh installation of OS X, being donated to a relative or a local charity, or just sent to an e-waste facility. Of course you can still drill a hole in your disk or smash it with a sledgehammer if it makes you happy, but now you know how to wipe the data from your old computer with much less ruckus.

The above instructions apply to older Macintoshes with HDDs. What do you do if you have an SSD?

Securely Erasing SSDs, and Why Not To

Most new Macs ship with solid state drives (SSDs). Only the iMac and Mac mini ship with regular hard drives anymore, and even those are available in pure SSD variants if you want.

If your Mac comes equipped with an SSD, Apple’s Disk Utility software won’t actually let you zero the hard drive.

Wait, what?

In a tech note posted to Apple’s own online knowledgebase, Apple explains that you don’t need to securely erase your Mac’s SSD:

With an SSD drive, Secure Erase and Erasing Free Space are not available in Disk Utility. These options are not needed for an SSD drive because a standard erase makes it difficult to recover data from an SSD.

In fact, some folks will tell you not to zero out the data on an SSD, since it can cause wear and tear on the memory cells that, over time, can affect its reliability. I don’t think that’s nearly as big an issue as it used to be — SSD reliability and longevity has improved.

If “Standard Erase” doesn’t quite make you feel comfortable that your data can’t be recovered, there are a couple of options.

FileVault Keeps Your Data Safe

One way to make sure that your SSD’s data remains secure is to use FileVault. FileVault is whole-disk encryption for the Mac. With FileVault engaged, you need a password to access the information on your hard drive. Without it, that data is encrypted.

There’s one potential downside of FileVault — if you lose your password or the encryption key, you’re screwed: You’re not getting your data back any time soon. Based on my experience working at a Mac repair shop, losing a FileVault key happens more frequently than it should.

When you first set up a new Mac, you’re given the option of turning FileVault on. If you don’t do it then, you can turn on FileVault at any time by clicking on your Mac’s System Preferences, clicking on Security & Privacy, and clicking on the FileVault tab. Be warned, however, that the initial encryption process can take hours, as will decryption if you ever need to turn FileVault off.

With FileVault turned on, you can restart your Mac into its Recovery System (by restarting the Mac while holding down the command and R keys) and erase the hard drive using Disk Utility, once you’ve unlocked it (by selecting the disk, clicking the File menu, and clicking Unlock). That deletes the FileVault key, which means any data on the drive is useless.

FileVault doesn’t impact the performance of most modern Macs, though I’d suggest only using it if your Mac has an SSD, not a conventional hard disk drive.

Securely Erasing Free Space on Your SSD

If you don’t want to take Apple’s word for it, if you’re not using FileVault, or if you just want to, there is a way to securely erase free space on your SSD. It’s a little more involved but it works.

Before we get into the nitty-gritty, let me state for the record that this really isn’t necessary to do, which is why Apple’s made it so hard to do. But if you’re set on it, you’ll need to use Apple’s Terminal app. Terminal provides you with command line interface access to the OS X operating system. Terminal lives in the Utilities folder, but you can access Terminal from the Mac’s Recovery System, as well. Once your Mac has booted into the Recovery partition, click the Utilities menu and select Terminal to launch it.

From a Terminal command line, type:

diskutil secureErase freespace VALUE /Volumes/DRIVE

That tells your Mac to securely erase the free space on your SSD. You’ll need to change VALUE to a number between 0 and 4. 0 is a single-pass run of zeroes; 1 is a single-pass run of random numbers; 2 is a 7-pass erase; 3 is a 35-pass erase; and 4 is a 3-pass erase. DRIVE should be changed to the name of your hard drive. To run a 7-pass erase of your SSD drive in “JohnB-Macbook”, you would enter the following:

diskutil secureErase freespace 2 /Volumes/JohnB-Macbook

And remember, if you used a space in the name of your Mac’s hard drive, you need to insert a leading backslash before the space. For example, to run a 35-pass erase on a hard drive called “Macintosh HD” you enter the following:

diskutil secureErase freespace 3 /Volumes/Macintosh\ HD

Something to remember is that the more extensive the erase procedure, the longer it will take.

When Erasing is Not Enough — How to Destroy a Drive

If you absolutely, positively need to be sure that all the data on a drive is irretrievable, see this Scientific American article (with contributions by Gleb Budman, Backblaze CEO), How to Destroy a Hard Drive — Permanently.

The post Getting Rid of Your Mac? Here’s How to Securely Erase a Hard Drive or SSD appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Measuring the throughput for Amazon MQ using the JMS Benchmark

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/measuring-the-throughput-for-amazon-mq-using-the-jms-benchmark/

This post is courtesy of Alan Protasio, Software Development Engineer, Amazon Web Services

Just like compute and storage, messaging is a fundamental building block of enterprise applications. Message brokers (aka “message-oriented middleware”) enable different software systems, often written in different languages, on different platforms, running in different locations, to communicate and exchange information. Mission-critical applications, such as CRM and ERP, rely on message brokers to work.

A common performance consideration for customers deploying a message broker in a production environment is the throughput of the system, measured as messages per second. This is important to know so that application environments (hosts, threads, memory, etc.) can be configured correctly.

In this post, we demonstrate how to measure the throughput for Amazon MQ, a new managed message broker service for ActiveMQ, using JMS Benchmark. It should take between 15–20 minutes to set up the environment and an hour to run the benchmark. We also provide some tips on how to configure Amazon MQ for optimal throughput.

Benchmarking throughput for Amazon MQ

ActiveMQ can be used for a number of use cases. These use cases can range from simple fire and forget tasks (that is, asynchronous processing), low-latency request-reply patterns, to buffering requests before they are persisted to a database.

The throughput of Amazon MQ is largely dependent on the use case. For example, if you have non-critical workloads such as gathering click events for a non-business-critical portal, you can use ActiveMQ in a non-persistent mode and get extremely high throughput with Amazon MQ.

On the flip side, if you have a critical workload where durability is extremely important (meaning that you can’t lose a message), then you are bound by the I/O capacity of your underlying persistence store. We recommend using mq.m4.large for the best results. The mq.t2.micro instance type is intended for product evaluation. Performance is limited, due to the lower memory and burstable CPU performance.

Tip: To improve your throughput with Amazon MQ, make sure that you have consumers processing messaging as fast as (or faster than) your producers are pushing messages.

Because it’s impossible to talk about how the broker (ActiveMQ) behaves for each and every use case, we walk through how to set up your own benchmark for Amazon MQ using our favorite open-source benchmarking tool: JMS Benchmark. We are fans of the JMS Benchmark suite because it’s easy to set up and deploy, and comes with a built-in visualizer of the results.

Non-Persistent Scenarios – Queue latency as you scale producer throughput

JMS Benchmark nonpersistent scenarios

Getting started

At the time of publication, you can create an mq.m4.large single-instance broker for testing for $0.30 per hour (US pricing).

This walkthrough covers the following tasks:

  1.  Create and configure the broker.
  2. Create an EC2 instance to run your benchmark
  3. Configure the security groups
  4.  Run the benchmark.

Step 1 – Create and configure the broker
Create and configure the broker using Tutorial: Creating and Configuring an Amazon MQ Broker.

Step 2 – Create an EC2 instance to run your benchmark
Launch the EC2 instance using Step 1: Launch an Instance. We recommend choosing the m5.large instance type.

Step 3 – Configure the security groups
Make sure that all the security groups are correctly configured to let the traffic flow between the EC2 instance and your broker.

  1. Sign in to the Amazon MQ console.
  2. From the broker list, choose the name of your broker (for example, MyBroker)
  3. In the Details section, under Security and network, choose the name of your security group or choose the expand icon ( ).
  4. From the security group list, choose your security group.
  5. At the bottom of the page, choose Inbound, Edit.
  6. In the Edit inbound rules dialog box, add a role to allow traffic between your instance and the broker:
    • Choose Add Rule.
    • For Type, choose Custom TCP.
    • For Port Range, type the ActiveMQ SSL port (61617).
    • For Source, leave Custom selected and then type the security group of your EC2 instance.
    • Choose Save.

Your broker can now accept the connection from your EC2 instance.

Step 4 – Run the benchmark
Connect to your EC2 instance using SSH and run the following commands:

$ cd ~
$ curl -L https://github.com/alanprot/jms-benchmark/archive/master.zip -o master.zip
$ unzip master.zip
$ cd jms-benchmark-master
$ chmod a+x bin/*
$ env \
  SERVER_SETUP=false \
  SERVER_ADDRESS={activemq-endpoint} \
  ACTIVEMQ_TRANSPORT=ssl\
  ACTIVEMQ_PORT=61617 \
  ACTIVEMQ_USERNAME={activemq-user} \
  ACTIVEMQ_PASSWORD={activemq-password} \
  ./bin/benchmark-activemq

After the benchmark finishes, you can find the results in the ~/reports directory. As you may notice, the performance of ActiveMQ varies based on the number of consumers, producers, destinations, and message size.

Amazon MQ architecture

The last bit that’s important to know so that you can better understand the results of the benchmark is how Amazon MQ is architected.

Amazon MQ is architected to be highly available (HA) and durable. For HA, we recommend using the multi-AZ option. After a message is sent to Amazon MQ in persistent mode, the message is written to the highly durable message store that replicates the data across multiple nodes in multiple Availability Zones. Because of this replication, for some use cases you may see a reduction in throughput as you migrate to Amazon MQ. Customers have told us they appreciate the benefits of message replication as it helps protect durability even in the face of the loss of an Availability Zone.

Conclusion

We hope this gives you an idea of how Amazon MQ performs. We encourage you to run tests to simulate your own use cases.

To learn more, see the Amazon MQ website. You can try Amazon MQ for free with the AWS Free Tier, which includes up to 750 hours of a single-instance mq.t2.micro broker and up to 1 GB of storage per month for one year.

Security and Human Behavior (SHB 2018)

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/05/security_and_hu_7.html

I’m at Carnegie Mellon University, at the eleventh Workshop on Security and Human Behavior.

SHB is a small invitational gathering of people studying various aspects of the human side of security, organized each year by Alessandro Acquisti, Ross Anderson, and myself. The 50 or so people in the room include psychologists, economists, computer security researchers, sociologists, political scientists, neuroscientists, designers, lawyers, philosophers, anthropologists, business school professors, and a smattering of others. It’s not just an interdisciplinary event; most of the people here are individually interdisciplinary.

The goal is to maximize discussion and interaction. We do that by putting everyone on panels, and limiting talks to 7-10 minutes. The rest of the time is left to open discussion. Four hour-and-a-half panels per day over two days equals eight panels; six people per panel means that 48 people get to speak. We also have lunches, dinners, and receptions — all designed so people from different disciplines talk to each other.

I invariably find this to be the most intellectually stimulating conference of my year. It influences my thinking in many different, and sometimes surprising, ways.

This year’s program is here. This page lists the participants and includes links to some of their work. As he does every year, Ross Anderson is liveblogging the talks. (Ross also maintains a good webpage of psychology and security resources.)

Here are my posts on the first, second, third, fourth, fifth, sixth, seventh, eighth, ninth, and tenth SHB workshops. Follow those links to find summaries, papers, and occasionally audio recordings of the various workshops.

Next year, I’ll be hosting the event at Harvard.

Replacing macOS Server with Synology NAS

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/replacing-macos-server-with-synology-nas/

Synology NAS boxes backed up to the cloud

Businesses and organizations that rely on macOS server for essential office and data services are facing some decisions about the future of their IT services.

Apple recently announced that it is deprecating a significant portion of essential network services in macOS Server, as they described in a support statement posted on April 24, 2018, “Prepare for changes to macOS Server.” Apple’s note includes:

macOS Server is changing to focus more on management of computers, devices, and storage on your network. As a result, some changes are coming in how Server works. A number of services will be deprecated, and will be hidden on new installations of an update to macOS Server coming in spring 2018.

The note lists the services that will be removed in a future release of macOS Server, including calendar and contact support, Dynamic Host Configuration Protocol (DHCP), Domain Name Services (DNS), mail, instant messages, virtual private networking (VPN), NetInstall, Web server, and the Wiki.

Apple assures users who have already configured any of the listed services that they will be able to use them in the spring 2018 macOS Server update, but the statement ends with links to a number of alternative services, including hosted services, that macOS Server users should consider as viable replacements to the features it is removing. These alternative services are all FOSS (Free and Open-Source Software).

As difficult as this could be for organizations that use macOS server, this is not unexpected. Apple left the server hardware space back in 2010, when Steve Jobs announced the company was ending its line of Xserve rackmount servers, which were introduced in May, 2002. Since then, macOS Server has hardly been a prominent part of Apple’s product lineup. It’s not just the product itself that has lost some luster, but the entire category of SMB office and business servers, which has been undergoing a gradual change in recent years.

Some might wonder how important the news about macOS Server is, given that macOS Server represents a pretty small share of the server market. macOS Server has been important to design shops, agencies, education users, and small businesses that likely have been on Macs for ages, but it’s not a significant part of the IT infrastructure of larger organizations and businesses.

What Comes After macOS Server?

Lovers of macOS Server don’t have to fear having their Mac minis pried from their cold, dead hands quite yet. Installed services will continue to be available. In the fall of 2018, new installations and upgrades of macOS Server will require users to migrate most services to other software. Since many of the services of macOS Server were already open-source, this means that a change in software might not be required. It does mean more configuration and management required from those who continue with macOS Server, however.

Users can continue with macOS Server if they wish, but many will see the writing on the wall and look for a suitable substitute.

The Times They Are A-Changin’

For many people working in organizations, what is significant about this announcement is how it reflects the move away from the once ubiquitous server-based IT infrastructure. Services that used to be centrally managed and office-based, such as storage, file sharing, communications, and computing, have moved to the cloud.

In selecting the next office IT platforms, there’s an opportunity to move to solutions that reflect and support how people are working and the applications they are using both in the office and remotely. For many, this means including cloud-based services in office automation, backup, and business continuity/disaster recovery planning. This includes Software as a Service, Platform as a Service, and Infrastructure as a Service (Saas, PaaS, IaaS) options.

IT solutions that integrate well with the cloud are worth strong consideration for what comes after a macOS Server-based environment.

Synology NAS as a macOS Server Alternative

One solution that is becoming popular is to replace macOS Server with a device that has the ability to provide important office services, but also bridges the office and cloud environments. Using Network-Attached Storage (NAS) to take up the server slack makes a lot of sense. Many customers are already using NAS for file sharing, local data backup, automatic cloud backup, and other uses. In the case of Synology, their operating system, Synology DiskStation Manager (DSM), is Linux based, and integrates the basic functions of file sharing, centralized backup, RAID storage, multimedia streaming, virtual storage, and other common functions.

Synology NAS box

Synology NAS

Since DSM is based on Linux, there are numerous server applications available, including many of the same ones that are available for macOS Server, which shares conceptual roots with Linux as it comes from BSD Unix.

Synology DiskStation Manager Package Center screenshot

Synology DiskStation Manager Package Center

According to Ed Lukacs, COO at 2FIFTEEN Systems Management in Salt Lake City, their customers have found the move from macOS Server to Synology NAS not only painless, but positive. DSM works seamlessly with macOS and has been faster for their customers, as well. Many of their customers are running Adobe Creative Suite and Google G Suite applications, so a workflow that combines local storage, remote access, and the cloud, is already well known to them. Remote users are supported by Synology’s QuickConnect or VPN.

Business continuity and backup are simplified by the flexible storage capacity of the NAS. Synology has built-in backup to Backblaze B2 Cloud Storage with Synology’s Cloud Sync, as well as a choice of a number of other B2-compatible applications, such as Cloudberry, Comet, and Arq.

Customers have been able to get up and running quickly, with only initial data transfers requiring some time to complete. After that, management of the NAS can be handled in-house or with the support of a Managed Service Provider (MSP).

Are You Sticking with macOS Server or Moving to Another Platform?

If you’re affected by this change in macOS Server, please let us know in the comments how you’re planning to cope. Are you using Synology NAS for server services? Please tell us how that’s working for you.

The post Replacing macOS Server with Synology NAS appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

HackSpace magazine 7: Internet of Everything

Post Syndicated from Andrew Gregory original https://www.raspberrypi.org/blog/hackspace-magazine-7-internet-of-everything/

We’re usually averse to buzzwords at HackSpace magazine, but not this month: in issue 7, we’re taking a deep dive into the Internet of Things.HackSpace magazine issue 7 cover

Internet of Things (IoT)

To many people, IoT is a shady term used by companies to sell you something you already own, but this time with WiFi; to us, it’s a way to make our builds smarter, more useful, and more connected. In HackSpace magazine #7, you can join us on a tour of the boards that power IoT projects, marvel at the ways in which other makers are using IoT, and get started with your first IoT project!

Awesome projects

DIY retro computing: this issue, we’re taking our collective hat off to Spencer Owen. He stuck his home-brew computer on Tindie thinking he might make a bit of beer money — now he’s paying the mortgage with his making skills and inviting others to build modules for his machine. And if that tickles your fancy, why not take a crack at our Z80 tutorial? Get out your breadboard, assemble your jumper wires, and prepare to build a real-life computer!

Inside HackSpace magazine issue 7

Shameless patriotism: combine Lego, Arduino, and the car of choice for 1960 gold bullion thieves, and you’ve got yourself a groovy weekend project. We proudly present to you one man’s epic quest to add LED lights (controllable via a smartphone!) to his daughter’s LEGO Mini Cooper.

Makerspaces

Patriotism intensifies: for the last 200-odd years, the Black Country has been a hotbed of making. Urban Hax, based in Walsall, is the latest makerspace to show off its riches in the coveted Space of the Month pages. Every space has its own way of doing things, but not every space has a portrait of Rob Halford on the wall. All hail!

Inside HackSpace magazine issue 7

Diversity: advice on diversity often boils down to ‘Be nice to people’, which might feel more vague than actionable. This is where we come in to help: it is truly worth making the effort to give people of all backgrounds access to your makerspace, so we take a look at why it’s nice to be nice, and at the ways in which one makerspace has put niceness into practice — with great results.

And there’s more!

We also show you how to easily calculate the size and radius of laser-cut gears, use a bank of LEDs to etch PCBs in your own mini factory, and use chemistry to mess with your lunch menu.

Inside HackSpace magazine issue 7
Helen Steer inside HackSpace magazine issue 7
Inside HackSpace magazine issue 7

All this plus much, much more waits for you in HackSpace magazine issue 7!

Get your copy of HackSpace magazine

If you like the sound of that, you can find HackSpace magazine in WHSmith, Tesco, Sainsbury’s, and independent newsagents in the UK. If you live in the US, check out your local Barnes & Noble, Fry’s, or Micro Center next week. We’re also shipping to stores in Australia, Hong Kong, Canada, Singapore, Belgium, and Brazil, so be sure to ask your local newsagent whether they’ll be getting HackSpace magazine.

And if you can’t get to the shops, fear not: you can subscribe from £4 an issue from our online shop. And if you’d rather try before you buy, you can always download the free PDF. Happy reading, and happy making!

The post HackSpace magazine 7: Internet of Everything appeared first on Raspberry Pi.

C is to low level

Post Syndicated from Robert Graham original https://blog.erratasec.com/2018/05/c-is-too-low-level.html

I’m in danger of contradicting myself, after previously pointing out that x86 machine code is a high-level language, but this article claiming C is a not a low level language is bunk. C certainly has some problems, but it’s still the closest language to assembly. This is obvious by the fact it’s still the fastest compiled language. What we see is a typical academic out of touch with the real world.

The author makes the (wrong) observation that we’ve been stuck emulating the PDP-11 for the past 40 years. C was written for the PDP-11, and since then CPUs have been designed to make C run faster. The author imagines a different world, such as where CPU designers instead target something like LISP as their preferred language, or Erlang. This misunderstands the state of the market. CPUs do indeed supports lots of different abstractions, and C has evolved to accommodate this.


The author criticizes things like “out-of-order” execution which has lead to the Spectre sidechannel vulnerabilities. Out-of-order execution is necessary to make C run faster. The author claims instead that those resources should be spent on having more slower CPUs, with more threads. This sacrifices single-threaded performance in exchange for a lot more threads executing in parallel. The author cites Sparc Tx CPUs as his ideal processor.

But here’s the thing, the Sparc Tx was a failure. To be fair, it’s mostly a failure because most of the time, people wanted to run old C code instead of new Erlang code. But it was still a failure at running Erlang.

Time after time, engineers keep finding that “out-of-order”, single-threaded performance is still the winner. A good example is ARM processors for both mobile phones and servers. All the theory points to in-order CPUs as being better, but all the products are out-of-order, because this theory is wrong. The custom ARM cores from Apple and Qualcomm used in most high-end phones are so deeply out-of-order they give Intel CPUs competition. The same is true on the server front with the latest Qualcomm Centriq and Cavium ThunderX2 processors, deeply out of order supporting more than 100 instructions in flight.

The Cavium is especially telling. Its ThunderX CPU had 48 simple cores which was replaced with the ThunderX2 having 32 complex, deeply out-of-order cores. The performance increase was massive, even on multithread-friendly workloads. Every competitor to Intel’s dominance in the server space has learned the lesson from Sparc Tx: many wimpy cores is a failure, you need fewer beefy cores. Yes, they don’t need to be as beefy as Intel’s processors, but they need to be close.

Even Intel’s “Xeon Phi” custom chip learned this lesson. This is their GPU-like chip, running 60 cores with 512-bit wide “vector” (sic) instructions, designed for supercomputer applications. Its first version was purely in-order. Its current version is slightly out-of-order. It supports four threads and focuses on basic number crunching, so in-order cores seems to be the right approach, but Intel found in this case that out-of-order processing still provided a benefit. Practice is different than theory.

As an academic, the author of the above article focuses on abstractions. The criticism of C is that it has the wrong abstractions which are hard to optimize, and that if we instead expressed things in the right abstractions, it would be easier to optimize.

This is an intellectually compelling argument, but so far bunk.

The reason is that while the theoretical base language has issues, everyone programs using extensions to the language, like “intrinsics” (C ‘functions’ that map to assembly instructions). Programmers write libraries using these intrinsics, which then the rest of the normal programmers use. In other words, if your criticism is that C is not itself low level enough, it still provides the best access to low level capabilities.

Given that C can access new functionality in CPUs, CPU designers add new paradigms, from SIMD to transaction processing. In other words, while in the 1980s CPUs were designed to optimize C (stacks, scaled pointers), these days CPUs are designed to optimize tasks regardless of language.

The author of that article criticizes the memory/cache hierarchy, claiming it has problems. Yes, it has problems, but only compared to how well it normally works. The author praises the many simple cores/threads idea as hiding memory latency with little caching, but misses the point that caches also dramatically increase memory bandwidth. Intel processors are optimized to read a whopping 256 bits every clock cycle from L1 cache. Main memory bandwidth is orders of magnitude slower.

The author goes onto criticize cache coherency as a problem. C uses it, but other languages like Erlang don’t need it. But that’s largely due to the problems each languages solves. Erlang solves the problem where a large number of threads work on largely independent tasks, needing to send only small messages to each other across threads. The problems C solves is when you need many threads working on a huge, common set of data.

For example, consider the “intrusion prevention system”. Any thread can process any incoming packet that corresponds to any region of memory. There’s no practical way of solving this problem without a huge coherent cache. It doesn’t matter which language or abstractions you use, it’s the fundamental constraint of the problem being solved. RDMA is an important concept that’s moved from supercomputer applications to the data center, such as with memcached. Again, we have the problem of huge quantities (terabytes worth) shared among threads rather than small quantities (kilobytes).

The fundamental issue the author of the the paper is ignoring is decreasing marginal returns. Moore’s Law has gifted us more transistors than we can usefully use. We can’t apply those additional registers to just one thing, because the useful returns we get diminish.

For example, Intel CPUs have two hardware threads per core. That’s because there are good returns by adding a single additional thread. However, the usefulness of adding a third or fourth thread decreases. That’s why many CPUs have only two threads, or sometimes four threads, but no CPU has 16 threads per core.

You can apply the same discussion to any aspect of the CPU, from register count, to SIMD width, to cache size, to out-of-order depth, and so on. Rather than focusing on one of these things and increasing it to the extreme, CPU designers make each a bit larger every process tick that adds more transistors to the chip.

The same applies to cores. It’s why the “more simpler cores” strategy fails, because more cores have their own decreasing marginal returns. Instead of adding cores tied to limited memory bandwidth, it’s better to add more cache. Such cache already increases the size of the cores, so at some point it’s more effective to add a few out-of-order features to each core rather than more cores. And so on.

The question isn’t whether we can change this paradigm and radically redesign CPUs to match some academic’s view of the perfect abstraction. Instead, the goal is to find new uses for those additional transistors. For example, “message passing” is a useful abstraction in languages like Go and Erlang that’s often more useful than sharing memory. It’s implemented with shared memory and atomic instructions, but I can’t help but think it couldn’t better be done with direct hardware support.

Of course, as soon as they do that, it’ll become an intrinsic in C, then added to languages like Go and Erlang.

Summary

Academics live in an ideal world of abstractions, the rest of us live in practical reality. The reality is that vast majority of programmers work with the C family of languages (JavaScript, Go, etc.), whereas academics love the epiphanies they learned using other languages, especially function languages. CPUs are only superficially designed to run C and “PDP-11 compatibility”. Instead, they keep adding features to support other abstractions, abstractions available to C. They are driven by decreasing marginal returns — they would love to add new abstractions to the hardware because it’s a cheap way to make use of additional transitions. Academics are wrong believing that the entire system needs to be redesigned from scratch. Instead, they just need to come up with new abstractions CPU designers can add.

The Benefits of Side Projects

Post Syndicated from Bozho original https://techblog.bozho.net/the-benefits-of-side-projects/

Side projects are the things you do at home, after work, for your own “entertainment”, or to satisfy your desire to learn new stuff, in case your workplace doesn’t give you that opportunity (or at least not enough of it). Side projects are also a way to build stuff that you think is valuable but not necessarily “commercialisable”. Many side projects are open-sourced sooner or later and some of them contribute to the pool of tools at other people’s disposal.

I’ve outlined one recommendation about side projects before – do them with technologies that are new to you, so that you learn important things that will keep you better positioned in the software world.

But there are more benefits than that – serendipitous benefits, for example. And I’d like to tell some personal stories about that. I’ll focus on a few examples from my list of side projects to show how, through a sort-of butterfly effect, they helped shape my career.

The computoser project, no matter how cool algorithmic music composition, didn’t manage to have much of a long term impact. But it did teach me something apart from niche musical theory – how to read a bulk of scientific papers (mostly computer science) and understand them without being formally trained in the particular field. We’ll see how that was useful later.

Then there was the “State alerts” project – a website that scraped content from public institutions in my country (legislation, legislation proposals, decisions by regulators, new tenders, etc.), made them searchable, and “subscribable” – so that you get notified when a keyword of interest is mentioned in newly proposed legislation, for example. (I obviously subscribed for “information technologies” and “electronic”).

And that project turned out to have a significant impact on the following years. First, I chose a new technology to write it with – Scala. Which turned out to be of great use when I started working at TomTom, and on the 3rd day I was transferred to a Scala project, which was way cooler and much more complex than the original one I was hired for. It was a bit ironic, as my colleagues had just read that “I don’t like Scala” a few weeks earlier, but nevertheless, that was one of the most interesting projects I’ve worked on, and it went on for two years. Had I not known Scala, I’d probably be gone from TomTom much earlier (as the other project was restructured a few times), and I would not have learned many of the scalability, architecture and AWS lessons that I did learn there.

But the very same project had an even more important follow-up. Because if its “civic hacking” flavour, I was invited to join an informal group of developers (later officiated as an NGO) who create tools that are useful for society (something like MySociety.org). That group gathered regularly, discussed both tools and policies, and at some point we put up a list of policy priorities that we wanted to lobby policy makers. One of them was open source for the government, the other one was open data. As a result of our interaction with an interim government, we donated the official open data portal of my country, functioning to this day.

As a result of that, a few months later we got a proposal from the deputy prime minister’s office to “elect” one of the group for an advisor to the cabinet. And we decided that could be me. So I went for it and became advisor to the deputy prime minister. The job has nothing to do with anything one could imagine, and it was challenging and fascinating. We managed to pass legislation, including one that requires open source for custom projects, eID and open data. And all of that would not have been possible without my little side project.

As for my latest side project, LogSentinel – it became my current startup company. And not without help from the previous two mentioned above – the computer science paper reading was of great use when I was navigating the crypto papers landscape, and from the government job I not only gained invaluable legal knowledge, but I also “got” a co-founder.

Some other side projects died without much fanfare, and that’s fine. But the ones above shaped my “story” in a way that would not have been possible otherwise.

And I agree that such serendipitous chain of events could have happened without side projects – I could’ve gotten these opportunities by meeting someone at a bar (unlikely, but who knows). But we, as software engineers, are capable of tilting chance towards us by utilizing our skills. Side projects are our “extracurricular activities”, and they often lead to unpredictable, but rather positive chains of events. They would rarely be the only factor, but they are certainly great at unlocking potential.

The post The Benefits of Side Projects appeared first on Bozho's tech blog.

The Practical Effects of GDPR at Backblaze

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/the-practical-effects-of-gdpr-at-backblaze/


GDPR day, May 25, 2018, is nearly here. On that day, will your inbox explode with update notices, opt-in agreements, and offers from lawyers searching for GDPR violators? Perhaps all the companies on earth that are not GDPR ready will just dissolve into dust. More likely, there will be some changes, but business as usual will continue and we’ll all be more aware of data privacy. Let’s go with the last one.

What’s Different With GDPR at Backblaze

The biggest difference you’ll notice is a completely updated Privacy Policy. Last week we sent out a service email announcing the new Privacy Policy. Some people asked what was different. Basically everything. About 95% of the agreement was rewritten. In the agreement, we added in the appropriate provisions required by GDPR, and hopefully did a better job specifying the data we collect from you, why we collect it, and what we are going to do with it.

As a reminder, at Backblaze your data falls into two catagories. The first type of data is the data you store with us — stored data. These are the files and objects you upload and store, and as needed, restore. We do not share this data. We do not process this data, except as requested by you to store and restore the data. We do not analyze this data looking for keywords, tags, images, etc. No one outside of Backblaze has access to this data unless you explicitly shared the data by providing that person access to one or more files.

The second type of data is your account data. Some of your account data is considered personal data. This is the information we collect from you to provide our Personal Backup, Business Backup and B2 Cloud Storage services. Examples include your email address to provide access to your account, or the name of your computer so we can organize your files like they are arranged on your computer to make restoration easier. We have written a number of Help Articles covering the different ways this information is collected and processed. In addition, these help articles outline the various “rights” granted via GDPR. We will continue to add help articles over the coming weeks to assist in making it easy to work with us to understand and exercise your rights.

What’s New With GDPR at Backblaze

The most obvious addition is the Data Processing Addendum (DPA). This covers how we protect the data you store with us, i.e. stored data. As noted above, we don’t do anything with your data, except store it and keep it safe until you need it. Now we have a separate document saying that.

It is important to note the new Data Processing Addendum is now incorporated by reference into our Terms of Service, which everyone agrees to when they sign up for any of our services. Now all of our customers have a shiny new Data Processing Agreement to go along with the updated Privacy Policy. We promise they are not long or complicated, and we encourage you to read them. If you have any questions, stop by our GDPR help section on our website.

Patience, Please

Every company we have dealt with over the last few months is working hard to comply with GDPR. It has been a tough road whether you tried to do it yourself or like Backblaze, hired an EU-based law firm for advice. Over the coming weeks and months as you reach out to discover and assert your rights, please have a little patience. We are all going through a steep learning curve as GDPR gets put into practice. Along the way there are certain to be some growing pains — give us a chance, we all want to get it right.

Regardless, at Backblaze we’ve been diligently protecting our customers’ data for over 11 years and nothing that will happen on May 25th will change that.

The post The Practical Effects of GDPR at Backblaze appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.