Tag Archives: Events

Amazon Redshift at re:Invent 2019

Post Syndicated from Corina Radovanovich original https://aws.amazon.com/blogs/big-data/amazon-redshift-at-reinvent-2019/

The annual AWS re:Invent learning conference is an exciting time full of new product and program launches. At the first re:Invent conference in 2012, AWS announced Amazon Redshift. Since then, tens of thousands of customers have started using Amazon Redshift as their cloud data warehouse. In 2019, AWS shared several significant launches and dozens of sessions. This post presents highlights of what happened with Amazon Redshift at re:Invent 2019.

Andy Jassy’s AWS re:Invent 2019 keynote

When Andy Jassy takes the stage to talk about what’s new at AWS, he launches the new Amazon Redshift node type, RA3 with managed storage; the new Federated Query (preview) and Export to Data Lake features; and Advanced Query Accelerator (AQUA) for Amazon Redshift (preview). Watch AWS re:Invent 2019 – Keynote with Andy Jassy on YouTube, or jump ahead to the Amazon Redshift announcements.

Deep dive and best practices for Amazon Redshift

Every year the Amazon Redshift deep dive session rates highly, and people continue to watch and re-watch it after the event. This year was no different. Specialist Solution Architects Harshida Patel and Tony Gibbs take an in-depth look at best practices for data warehousing with Amazon Redshift. It’s a must-see for existing Amazon Redshift users. Watch AWS re:Invent 2019: Deep dive and best practices for Amazon Redshift (ANT418) on YouTube.

What’s new with Amazon Redshift, featuring Yelp and Workday

With over 200 new features and capabilities launched in the last 18 months, there’s a lot to cover in a session about what’s new with Amazon Redshift. Join one of the Product Managers driving RA3 and managed storage, Himanshu Raja, to catch up on the recent performance, concurrency, elasticity, and manageability enhancements behind Amazon Redshift’s record price-to-performance ratio. You also get more insight into the architectural evolution of Amazon Redshift with RA3 and managed storage, and how it uses machine learning to create a self-optimizing data warehouse. In the second half of the session, you hear from Steven Moy, a software engineer at Yelp, about how Amazon Redshift’s latest features have helped Yelp achieve optimization and scale for an organization with an enormous amount of data and sophisticated analytics. Watch AWS re:Invent 2019: What’s new with Amazon Redshift, featuring Yelp (ANT320-R1) on YouTube.

The session repeated with Michalis Petropoulos, Director of Engineering, and Erol Guney of Workday. Watch the full session to get a slightly different take on what’s new, or jump to the customer presentation to hear from Erol Guney, Architect, Data Platform, at Workday about how Amazon Redshift empowers their Data as a Service product team to focus on architecture goals and business logic.

Migrate your data warehouse to the cloud in record time, featuring Nielsen and Fannie Mae

In this session, you learn about important concepts and tips for migrating your legacy on-premises data warehouse to the cloud. You hear from Tejas Desai, VP of Technology at Nielsen, about their migration journey and benefits. Watch AWS re:Invent 2019: Migrate your data warehouse to the cloud, featuring Nielsen (ANT334-R1) on YouTube.

The repeat of this session features Amy Tseng from Fannie Mae. If you don’t want to listen to Tony’s overview again, skip ahead to learn how Fannie Mae embraced a data lake architecture with Amazon Redshift for analytics to save costs, maximize performance, and scale. Amy’s presentation was a crowd favorite, with some of the most positive customer feedback and a wealth of great information about how Fannie Mae managed their migration.

How to scale data analytics with Amazon Redshift, featuring Duolingo and Warner Brothers Interactive Entertainment

Data is growing fast, and so is the value that business users need to gain from business data. When AWS first announced Amazon Redshift in 2012, it could handle up to 640 TB of compressed data. It can now scale to 8 PB of compressed data. Learn more about Amazon Redshift’s unique ability to deliver top performance at the lowest and most predictable cost from Vinay Shukla, Principal Product Manager. This is an especially important session if you want to learn more about the newest Amazon Redshift node type RA3. You also hear from Jonathan Burket of Duolingo about their experience in the preview of RA3 nodes and how Duolingo uses Amazon Redshift. Duolingo is a wildly popular language-learning platform and the most downloaded education app in the world, with over 300 million users. Enabling data-driven decisions with A/B tests and ad hoc analysis has been a driver of their success. Watch AWS re:Invent 2019: How to scale data analytics with Amazon Redshift (ANT335-R2) on YouTube.

The repeat session features Redshift Product Manager Maor Kleider with an in-depth case study from Matt Howell, Executive Director, Analytics, and Kurt Larson, Technical Director, Analytics, at Warner Brothers Interactive Entertainment. Watch the full session for another perspective about how to scale with the latest Amazon Redshift features, with unique insights about analytics across Amazon Redshift and your data lake. You can also jump to the customer presentation. Not only is this session packed with interesting insights about how data analytics drives the success of games like Batman and Mortal Kombat, it also has an action-packed trailer about all the awesome Warner Brothers games.

If you prefer to see a session without the announcements from the keynote and with demos, watch Debu Panda showcase the new Amazon Redshift console and share practical tips about using Amazon Redshift.

Amazon Redshift reimagined: RA3 and AQUA

This embargoed session is the first opportunity to learn more about AQUA for Amazon Redshift, and how it makes queries up to 10 times faster. Britt Johnston, Director of Product Management, kicks off with an intro to the next generation of Amazon Redshift, and Senior Principal Engineer Andy Caldwell jumps in to share the origin and vision of the exciting new technology. The enthusiasm Andy feels about sharing AQUA with customers for the first time is palpable. Watch AWS re:Invent 2019: [NEW LAUNCH!] Amazon Redshift reimagined: RA3 and AQUA (ANT230) on YouTube.

State-of-the-art cloud data warehousing, featuring Asurion and Comcast

This session serves as a great introduction to cloud data warehousing at AWS, with insightful presentations from a different customer in each delivery. You can hear from Asurion about how they use data analytics to serve over 300 million people with excellent customer satisfaction scores. You learn about how to use AWS services with Amazon Redshift and why Asurion believes in their data lake-based architecture. Watch AWS re:Invent 2019: State-of-the-art cloud data warehousing, featuring Asurion (ANT213-R1) on YouTube.

In the repeat session, Rajat Garg, Senior Principal Architect from Comcast, talks about moving to Amazon Redshift from a legacy on-premises Oracle Exadata environment. He shares their strategy, approach, and performance improvements.

What’s next and more information

In addition to these sessions at re:Invent, there are also hands-on workshops, intimate builder roundtables, and interactive chalk talks that weren’t recorded.

Keep exploring the following links for more information about new releases:

We hope to see you in Las Vegas for re:Invent 2020, or at one of the hundreds of other AWS virtual and in-person events running around the world. For more information, see AWS Events and Webinars.


About the authors

Corina Radovanovich leads product marketing for cloud data warehousing at AWS.


Come to our free educator sessions next to Bett 2020

Post Syndicated from Tom Evans original https://www.raspberrypi.org/blog/free-educator-sessions-bett-show-2020/

Are you attending Bett Show this year? Then come to our free educator sessions on Friday 24 January right next to Bett to take a break from the hustle and bustle of the show floor and learn something new!

Our team will be in a private room below the Fox@ExCel pub, next door to Bett, all day on Friday 24 January. We’ll be offering free physical computing sessions for primary and secondary educators during the day. Then from 17:30, you can drop in to chat to us about computing in your classroom, and to connect with like-minded educators.

A teacher attending a physical computing session laughs as she works through an activity

Our schedule for you on 24 January

11:00–12:30: Physical computing session for primary teachers (limited spaces, please register to attend)

12:45–13:30: Panel and Q&A for primary teachers: Code Club and the National Centre for Computing Education (drop in without registering)

14:30–16:00: Physical computing session for secondary teachers (limited spaces, please register to attend)

16:15–17:00: Panel and Q&A for secondary teachers: Code Club and the National Centre for Computing Education (drop in without registering)

17:30–21:00: Informal meet and greet with the Raspberry Pi team for everyone (drop in without registering)

  • Snacks and refreshments will be provided at all the sessions
  • Directions to the Fox@ExCel pub, where you’ll find us, are below
  • You don’t need to have a pass to Bett Show to attend any of our sessions

What are these physical computing sessions?

In these free, registration-only, practical sessions (tailored to primary and secondary educators, respectively), we’ll highlight the value of delivering curriculum objectives through physical computing activities.

You’ll learn about:

  • Setting up a Raspberry Pi computer
  • Controlling LEDs using Scratch, Python, and Raspberry Pi
  • Pedagogical approaches such as pair programming and Parson’s Puzzles

Women using Raspberry Pi and Trinket

The sessions are perfect for you if you’d like an introduction to how to bring physical computing to your classroom, because no experience of physical computing is needed.

Both sessions are free and open to all teachers and educators working with learners in the relevant Key Stages.

Spaces are limited for both sessions, so make sure you register to reserve your space:

Find out how to bring more computing opportunities to your school

Following each of the physical computing sessions, you’ll have the chance to find out how else we can help you bring computing to your school! During a 45-minute panel and Q&A, our team will introduce you to all things Code Club, including how to set up an engaging coding club in your school, and to the comprehensive, free support we offer through the National Centre for Computing Education. You’ll also be able to ask us any questions you have about the programmes and resources we offer to you.

There is no need to register for this ‘panel and Q&A’ part of the day — just drop in when it suits you.

Network with us and other educators

Your evening at the Fox@ExCel, from 17:30 onwards, will be an informal meet and greet with the Raspberry Pi team. Snacks and refreshments will be provided, and you can drop in whenever you like.

This is your time to chat to us, discover more about the other educational activities we run, and network with other primary and secondary educators who want to encourage children and young adults to get hands-on with computing.

Code Club

We hope to see many of you there, and we’re looking forward to chatting with you!

If you have any questions about this event, or want to find out more, please contact [email protected] and we will get back to you!

How to find us

The Fox@ExCel is a pub located in Warehouse K next to the ExCel Center, easily accessed from the footpath between the ExCel West Entrance and Custom House DLR Station.

Map of where the Fox@ExCel London is

You will find us in a private area below the main floor of the Fox@ExCel. There should be a sign directing you to the location, and you can also ask the pub staff to point the way.

From Custom House DLR Station:

Follow the signs along the footbridge towards the ExCel main entrance, enter the door labelled ‘Fox@ExCel’ on the first building to your right, and head down the stairs.

From the ExCel West Entrance:

Turn right out of the main entrance and follow the footbridge towards the ExCel. You will find the entrance to the Fox@ExCel in the second pair of doorways on your left. Enter the building and go down the stairs.


Amazon SageMaker Debugger – Debug Your Machine Learning Models

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-debugger-debug-your-machine-learning-models/

Today, we’re extremely happy to announce Amazon SageMaker Debugger, a new capability of Amazon SageMaker that automatically identifies complex issues developing in machine learning (ML) training jobs.

Building and training ML models is a mix of science and craft (some would even say witchcraft). From collecting and preparing data sets to experimenting with different algorithms to figuring out optimal training parameters (the dreaded hyperparameters), ML practitioners need to clear quite a few hurdles to deliver high-performance models. This is the very reason why we built Amazon SageMaker: a modular, fully managed service that simplifies and speeds up ML workflows.

As I keep finding out, ML seems to be one of Mr. Murphy’s favorite hangouts, and everything that may possibly go wrong often does! In particular, many obscure issues can happen during the training process, preventing your model from correctly extracting and learning patterns present in your data set. I’m not talking about software bugs in ML libraries (although they do happen too): most failed training jobs are caused by an inappropriate initialization of parameters, a poor combination of hyperparameters, a design issue in your own code, etc.

To make things worse, these issues are rarely visible immediately: they grow over time, slowly but surely ruining your training process, and yielding low accuracy models. Let’s face it, even if you’re a bona fide expert, it’s devilishly difficult and time-consuming to identify them and hunt them down, which is why we built Amazon SageMaker Debugger.

Let me tell you more.

Introducing Amazon SageMaker Debugger
In your existing training code for TensorFlow, Keras, Apache MXNet, PyTorch and XGBoost, you can use the new SageMaker Debugger SDK to save internal model state at periodic intervals; as you can guess, it will be stored in Amazon Simple Storage Service (S3).

This state is composed of:

  • The parameters being learned by the model, e.g. weights and biases for neural networks,
  • The changes applied to these parameters by the optimizer, aka gradients,
  • The optimization parameters themselves,
  • Scalar values, e.g. accuracies and losses,
  • The output of each layer,
  • Etc.

Each specific set of values – say, the sequence of gradients flowing over time through a specific neural network layer – is saved independently, and referred to as a tensor. Tensors are organized in collections (weights, gradients, etc.), and you can decide which ones you want to save during training. Then, using the SageMaker SDK and its estimators, you configure your training job as usual, passing additional parameters defining the rules you want SageMaker Debugger to apply.

A rule is a piece of Python code that analyses tensors for the model in training, looking for specific unwanted conditions. Pre-defined rules are available for common problems such as exploding/vanishing tensors (parameters reaching NaN or zero values), exploding/vanishing gradients, loss not changing, and more. Of course, you can also write your own rules.
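For illustration, here’s how a few of these built-in rules could be attached to a single training job. This is only a sketch following the same import pattern used later in this post, so check the SageMaker Debugger documentation for the full list of rules and their parameters.

from sagemaker.debugger import Rule, rule_configs

# Each rule spawns its own debug job alongside the training job.
rules = [
    Rule.sagemaker(rule_configs.exploding_tensor()),    # tensors reaching NaN or infinite values
    Rule.sagemaker(rule_configs.vanishing_gradient()),  # gradients shrinking towards zero
    Rule.sagemaker(rule_configs.loss_not_decreasing())  # training loss staying flat
]
# The list is then passed to the estimator through its 'rules' parameter,
# as shown in the full example below.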

Once the SageMaker estimator is configured, you can launch the training job. Immediately, it fires up a debug job for each rule that you configured, and they start inspecting available tensors. If a debug job detects a problem, it stops and logs additional information. A CloudWatch Events event is also sent, should you want to trigger additional automated steps.
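If you’d rather not wire up a CloudWatch Events rule right away, a simple alternative is to poll the training job and react when a rule reports issues. Here’s a minimal sketch using boto3; the status field names match the describe_training_job output shown later in this post.

import time
import boto3

sm = boto3.client('sagemaker')

def stop_if_issues_found(job_name, poll_seconds=60):
    # Poll the training job and stop it as soon as any Debugger rule finds issues.
    while True:
        desc = sm.describe_training_job(TrainingJobName=job_name)
        statuses = desc.get('DebugRuleEvaluationStatuses', [])
        if any(s.get('RuleEvaluationStatus') == 'IssuesFound' for s in statuses):
            print('A Debugger rule found issues, stopping the training job')
            sm.stop_training_job(TrainingJobName=job_name)
            return
        if desc['TrainingJobStatus'] in ('Completed', 'Failed', 'Stopped'):
            return
        time.sleep(poll_seconds)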

So now you know that your deep learning job suffers from, say, vanishing gradients. With a little brainstorming and experience, you’ll know where to look: maybe the neural network is too deep? Maybe your learning rate is too small? As the internal state has been saved to S3, you can now use the SageMaker Debugger SDK to explore the evolution of tensors over time, confirm your hypothesis and fix the root cause.

Let’s see SageMaker Debugger in action with a quick demo.

Debugging Machine Learning Models with Amazon SageMaker Debugger
At the core of SageMaker Debugger is the ability to capture tensors during training. This requires a little bit of instrumentation in your training code, in order to select the tensor collections you want to save, the frequency at which you want to save them, and whether you want to save the values themselves or a reduction (mean, max, etc.).

For this purpose, the SageMaker Debugger SDK provides simple APIs for each framework that it supports. Let me show you how this works with a simple TensorFlow script, trying to fit a 2-dimensional linear regression model. Of course, you’ll find more examples in this GitHub repository.

Let’s take a look at the initial code:

import argparse
import numpy as np
import tensorflow as tf
import random

parser = argparse.ArgumentParser()
parser.add_argument('--model_dir', type=str, help="S3 path for the model")
parser.add_argument('--lr', type=float, help="Learning Rate", default=0.001)
parser.add_argument('--steps', type=int, help="Number of steps to run", default=100)
parser.add_argument('--scale', type=float, help="Scaling factor for inputs", default=1.0)

args = parser.parse_args()

with tf.name_scope('initialize'):
    # 2-dimensional input sample
    x = tf.placeholder(shape=(None, 2), dtype=tf.float32)
    # Initial weights: [10, 10]
    w = tf.Variable(initial_value=[[10.], [10.]], name='weight1')
    # True weights, i.e. the ones we're trying to learn
    w0 = [[1], [1.]]
with tf.name_scope('multiply'):
    # Compute true label
    y = tf.matmul(x, w0)
    # Compute "predicted" label
    y_hat = tf.matmul(x, w)
with tf.name_scope('loss'):
    # Compute loss
    loss = tf.reduce_mean((y_hat - y) ** 2, name="loss")

optimizer = tf.train.AdamOptimizer(args.lr)
optimizer_op = optimizer.minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(args.steps):
        x_ = np.random.random((10, 2)) * args.scale
        _loss, opt = sess.run([loss, optimizer_op], {x: x_})
        print (f'Step={i}, Loss={_loss}')

Let’s train this script using the TensorFlow Estimator. I’m using SageMaker local mode, which is a great way to quickly iterate on experimental code.

bad_hyperparameters = {'steps': 10, 'lr': 100, 'scale': 100000000000}

estimator = TensorFlow(
    role=sagemaker.get_execution_role(),
    base_job_name='debugger-simple-demo',
    train_instance_count=1,
    train_instance_type='local',
    entry_point='script-v1.py',
    framework_version='1.13.1',
    py_version='py3',
    script_mode=True,
    hyperparameters=bad_hyperparameters)

estimator.fit()

Looking at the training log, things did not go well.

Step=0, Loss=7.883463958023267e+23
algo-1-hrvqg_1 | Step=1, Loss=9.502028841062608e+23
algo-1-hrvqg_1 | Step=2, Loss=nan
algo-1-hrvqg_1 | Step=3, Loss=nan
algo-1-hrvqg_1 | Step=4, Loss=nan
algo-1-hrvqg_1 | Step=5, Loss=nan
algo-1-hrvqg_1 | Step=6, Loss=nan
algo-1-hrvqg_1 | Step=7, Loss=nan
algo-1-hrvqg_1 | Step=8, Loss=nan
algo-1-hrvqg_1 | Step=9, Loss=nan

Loss does not decrease at all, and quickly becomes NaN… This looks like an exploding tensor problem, which is one of the built-in rules defined in SageMaker Debugger. Let’s get to work.

Using the Amazon SageMaker Debugger SDK
In order to capture tensors, I need to instrument the training script with:

  • A SaveConfig object specifying the frequency at which tensors should be saved,
  • A SessionHook object attached to the TensorFlow session, putting everything together and saving required tensors during training,
  • An (optional) ReductionConfig object, listing tensor reductions that should be saved instead of full tensors,
  • An (optional) optimizer wrapper to capture gradients.

Here’s the updated code, with extra command line arguments for SageMaker Debugger parameters.

import argparse
import numpy as np
import tensorflow as tf
import random
import smdebug.tensorflow as smd

parser = argparse.ArgumentParser()
parser.add_argument('--model_dir', type=str, help="S3 path for the model")
parser.add_argument('--lr', type=float, help="Learning Rate", default=0.001)
parser.add_argument('--steps', type=int, help="Number of steps to run", default=100)
parser.add_argument('--scale', type=float, help="Scaling factor for inputs", default=1.0)
parser.add_argument('--debug_path', type=str, default='/opt/ml/output/tensors')
parser.add_argument('--debug_frequency', type=int, help="How often to save tensor data", default=10)
feature_parser = parser.add_mutually_exclusive_group(required=False)
feature_parser.add_argument('--reductions', dest='reductions', action='store_true', help="save reductions of tensors instead of saving full tensors")
feature_parser.add_argument('--no_reductions', dest='reductions', action='store_false', help="save full tensors")
args = parser.parse_args()

reduc = smd.ReductionConfig(reductions=['mean'], abs_reductions=['max'], norms=['l1']) if args.reductions else None

hook = smd.SessionHook(out_dir=args.debug_path,
                       include_collections=['weights', 'gradients', 'losses'],
                       save_config=smd.SaveConfig(save_interval=args.debug_frequency),
                       reduction_config=reduc)

with tf.name_scope('initialize'):
    # 2-dimensional input sample
    x = tf.placeholder(shape=(None, 2), dtype=tf.float32)
    # Initial weights: [10, 10]
    w = tf.Variable(initial_value=[[10.], [10.]], name='weight1')
    # True weights, i.e. the ones we're trying to learn
    w0 = [[1], [1.]]
with tf.name_scope('multiply'):
    # Compute true label
    y = tf.matmul(x, w0)
    # Compute "predicted" label
    y_hat = tf.matmul(x, w)
with tf.name_scope('loss'):
    # Compute loss
    loss = tf.reduce_mean((y_hat - y) ** 2, name="loss")
    hook.add_to_collection('losses', loss)

optimizer = tf.train.AdamOptimizer(args.lr)
optimizer = hook.wrap_optimizer(optimizer)
optimizer_op = optimizer.minimize(loss)

hook.set_mode(smd.modes.TRAIN)

with tf.train.MonitoredSession(hooks=[hook]) as sess:
    for i in range(args.steps):
        x_ = np.random.random((10, 2)) * args.scale
        _loss, opt = sess.run([loss, optimizer_op], {x: x_})
        print (f'Step={i}, Loss={_loss}')

I also need to modify the TensorFlow Estimator, to use the SageMaker Debugger-enabled training container and to pass additional parameters.

bad_hyperparameters = {'steps': 10, 'lr': 100, 'scale': 100000000000, 'debug_frequency': 1}

from sagemaker.debugger import Rule, rule_configs
estimator = TensorFlow(
    role=sagemaker.get_execution_role(),
    base_job_name='debugger-simple-demo',
    train_instance_count=1,
    train_instance_type='ml.c5.2xlarge',
    image_name=cpu_docker_image_name,
    entry_point='script-v2.py',
    framework_version='1.15',
    py_version='py3',
    script_mode=True,
    hyperparameters=bad_hyperparameters,
    rules = [Rule.sagemaker(rule_configs.exploding_tensor())]
)

estimator.fit()
2019-11-27 10:42:02 Starting - Starting the training job...
2019-11-27 10:42:25 Starting - Launching requested ML instances
********* Debugger Rule Status *********
*
* ExplodingTensor: InProgress 
*
****************************************

Two jobs are running: the actual training job, and a debug job checking for the rule defined in the Estimator. Quickly, the debug job fails!

Describing the training job, I can get more information on what happened.

import boto3

client = boto3.client('sagemaker')
# job_name is the name of the training job started above
description = client.describe_training_job(TrainingJobName=job_name)
print(description['DebugRuleEvaluationStatuses'][0]['RuleConfigurationName'])
print(description['DebugRuleEvaluationStatuses'][0]['RuleEvaluationStatus'])

ExplodingTensor
IssuesFound

Let’s take a look at the saved tensors.

Exploring Tensors
I can easily grab the tensors saved in S3 during the training process.

from smdebug.trials import create_trial

s3_output_path = description["DebugConfig"]["DebugHookConfig"]["S3OutputPath"]
trial = create_trial(s3_output_path)

Let’s list available tensors.

trial.tensors()

['loss/loss:0',
'gradients/multiply/MatMul_1_grad/tuple/control_dependency_1:0',
'initialize/weight1:0']

All values are numpy arrays, and I can easily iterate over them.

tensor = 'gradients/multiply/MatMul_1_grad/tuple/control_dependency_1:0'
for s in list(trial.tensor(tensor).steps()):
    print("Value: ", trial.tensor(tensor).step(s).value)

Value:  [[1.1508383e+23] [1.0809098e+23]]
Value:  [[1.0278440e+23] [1.1347468e+23]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]
Value:  [[nan] [nan]]

As tensor names include the TensorFlow scope defined in the training code, I can easily see that something is wrong with my matrix multiplication.

# Compute true label
y = tf.matmul(x, w0)
# Compute "predicted" label
y_hat = tf.matmul(x, w)

Digging a little deeper, the x input is modified by a scaling parameter, which I set to 100000000000 in the Estimator. The learning rate doesn’t look sane either. Bingo!

x_ = np.random.random((10, 2)) * args.scale

bad_hyperparameters = {'steps': 10, 'lr': 100, 'scale': 100000000000, 'debug_frequency': 1}

As you probably knew all along, setting these hyperparameters to more reasonable values will fix the training issue.

Now Available!
We believe Amazon SageMaker Debugger will help you find and solve training issues quicker, so it’s now your turn to go bug hunting.

Amazon SageMaker Debugger is available today in all commercial regions where Amazon SageMaker is available. Give it a try and please send us feedback, either on the AWS forum for Amazon SageMaker, or through your usual AWS support contacts.

– Julien


Amazon SageMaker Processing – Fully Managed Data Processing and Model Evaluation

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-sagemaker-processing-fully-managed-data-processing-and-model-evaluation/

Today, we’re extremely happy to launch Amazon SageMaker Processing, a new capability of Amazon SageMaker that lets you easily run your preprocessing, postprocessing and model evaluation workloads on fully managed infrastructure.

Training an accurate machine learning (ML) model requires many different steps, but none is potentially more important than preprocessing your data set, e.g.:

  • Converting the data set to the input format expected by the ML algorithm you’re using,
  • Transforming existing features to a more expressive representation, such as one-hot encoding categorical features (see the short sketch after this list),
  • Rescaling or normalizing numerical features,
  • Engineering high level features, e.g. replacing mailing addresses with GPS coordinates,
  • Cleaning and tokenizing text for natural language processing applications,
  • And more!
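To make a couple of these bullets concrete, here is a minimal scikit-learn sketch that one-hot encodes a categorical feature and standardizes numerical ones; the column names are made up for the example.

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Illustrative data set with one categorical and two numerical features.
df = pd.DataFrame({
    'color': ['red', 'blue', 'red'],
    'price': [9.99, 24.50, 3.75],
    'quantity': [3, 1, 7]})

preprocessor = ColumnTransformer(transformers=[
    ('categorical', OneHotEncoder(handle_unknown='ignore'), ['color']),
    ('numerical', StandardScaler(), ['price', 'quantity'])])

features = preprocessor.fit_transform(df)  # ready to be saved for a training job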

These tasks involve running bespoke scripts on your data set (beneath a moonless sky, I’m told) and saving the processed version for later use by your training jobs. As you can guess, running them manually or having to build and scale automation tools is not an exciting prospect for ML teams. The same could be said about postprocessing jobs (filtering, collating, etc.) and model evaluation jobs (scoring models against different test sets).

Solving this problem is why we built Amazon SageMaker Processing. Let me tell you more.

Introducing Amazon SageMaker Processing
Amazon SageMaker Processing introduces a new Python SDK that lets data scientists and ML engineers easily run preprocessing, postprocessing and model evaluation workloads on Amazon SageMaker.

This SDK uses SageMaker’s built-in container for scikit-learn, possibly the most popular library for data set transformation.

If you need something else, you also have the ability to use your own Docker images without having to conform to any Docker image specification: this gives you maximum flexibility in running any code you want, whether on SageMaker Processing, on AWS container services like Amazon ECS and Amazon Elastic Kubernetes Service, or even on premise.

How about a quick demo with scikit-learn? Then, I’ll briefly discuss using your own container. Of course, you’ll find complete examples on GitHub.

Preprocessing Data With The Built-In Scikit-Learn Container
Here’s how to use the SageMaker Processing SDK to run your scikit-learn jobs.

First, let’s create an SKLearnProcessor object, passing the scikit-learn version we want to use, as well as our managed infrastructure requirements.

from sagemaker.sklearn.processing import SKLearnProcessor
sklearn_processor = SKLearnProcessor(framework_version='0.20.0',
                                     role=role,
                                     instance_count=1,
                                     instance_type='ml.m5.xlarge')

Then, we can run our preprocessing script (more on this fellow in a minute) like so:

  • The data set (dataset.csv) is automatically copied inside the container under the destination directory (/opt/ml/processing/input). We could add additional inputs if needed.
  • This is where the Python script (preprocessing.py) reads it. Optionally, we could pass command line arguments to the script.
  • It preprocesses it, splits it three ways, and saves the files inside the container under /opt/ml/processing/output/train, /opt/ml/processing/output/validation, and /opt/ml/processing/output/test.
  • Once the job completes, all outputs are automatically copied to your default SageMaker bucket in S3.

from sagemaker.processing import ProcessingInput, ProcessingOutput
sklearn_processor.run(
    code='preprocessing.py',
    # arguments = ['arg1', 'arg2'],
    inputs=[ProcessingInput(
        source='dataset.csv',
        destination='/opt/ml/processing/input')],
    outputs=[ProcessingOutput(source='/opt/ml/processing/output/train'),
        ProcessingOutput(source='/opt/ml/processing/output/validation'),
        ProcessingOutput(source='/opt/ml/processing/output/test')]
)

That’s it! Let’s put everything together by looking at the skeleton of the preprocessing script.

import os
import pandas as pd
from sklearn.model_selection import train_test_split
# Read data locally 
df = pd.read_csv('/opt/ml/processing/input/dataset.csv')
# Preprocess the data set
downsampled = apply_mad_data_science_skills(df)
# Split data set into training, validation, and test
train, test = train_test_split(downsampled, test_size=0.2)
train, validation = train_test_split(train, test_size=0.2)
# Create local output directories
try:
    os.makedirs('/opt/ml/processing/output/train')
    os.makedirs('/opt/ml/processing/output/validation')
    os.makedirs('/opt/ml/processing/output/test')
except FileExistsError:
    pass
# Save data locally
train.to_csv("/opt/ml/processing/output/train/train.csv")
validation.to_csv("/opt/ml/processing/output/validation/validation.csv")
test.to_csv("/opt/ml/processing/output/test/test.csv")
print('Finished running processing job')

A quick look at the S3 bucket confirms that the files have been successfully processed and saved. Now I could use them directly as input for a SageMaker training job.

$ aws s3 ls --recursive s3://sagemaker-us-west-2-123456789012/sagemaker-scikit-learn-2019-11-20-13-57-17-805/output
2019-11-20 15:03:22 19967 sagemaker-scikit-learn-2019-11-20-13-57-17-805/output/test.csv
2019-11-20 15:03:22 64998 sagemaker-scikit-learn-2019-11-20-13-57-17-805/output/train.csv
2019-11-20 15:03:22 18058 sagemaker-scikit-learn-2019-11-20-13-57-17-805/output/validation.csv
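As a side note (this step isn’t part of the original walkthrough), you could look up the S3 locations recorded in the processing job description and pass them straight to a training estimator; the training script name and channel names below are hypothetical.

from sagemaker.sklearn.estimator import SKLearn

job_description = sklearn_processor.jobs[-1].describe()
outputs = job_description['ProcessingOutputConfig']['Outputs']
# Assumption: outputs are listed in the order they were declared (train, validation, test).
train_s3_uri, validation_s3_uri, test_s3_uri = [o['S3Output']['S3Uri'] for o in outputs]

sklearn_estimator = SKLearn(entry_point='train.py',  # hypothetical training script
                            role=role,
                            train_instance_count=1,
                            train_instance_type='ml.m5.xlarge',
                            framework_version='0.20.0')
sklearn_estimator.fit({'train': train_s3_uri, 'validation': validation_s3_uri})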

Now what about using your own container?

Processing Data With Your Own Container
Let’s say you’d like to preprocess text data with the popular spaCy library. Here’s how you could define a vanilla Docker container for it.

FROM python:3.7-slim-buster
# Install spaCy, pandas, and an english language model for spaCy.
RUN pip3 install spacy==2.2.2 && pip3 install pandas==0.25.3
RUN python3 -m spacy download en_core_web_md
# Make sure python doesn't buffer stdout so we get logs ASAP.
ENV PYTHONUNBUFFERED=TRUE
ENTRYPOINT ["python3"]

Then, you would build the Docker container, test it locally, and push it to Amazon Elastic Container Registry, our managed Docker registry service.

The next step would be to configure a processing job using the ScriptProcessor object, passing the name of the container you built and pushed.

from sagemaker.processing import ScriptProcessor
script_processor = ScriptProcessor(image_uri='123456789012.dkr.ecr.us-west-2.amazonaws.com/sagemaker-spacy-container:latest',
                role=role,
                instance_count=1,
                instance_type='ml.m5.xlarge')

Finally, you would run the job just like in the previous example.

script_processor.run(code='spacy_script.py',
    inputs=[ProcessingInput(
        source='dataset.csv',
        destination='/opt/ml/processing/input_data')],
    outputs=[ProcessingOutput(source='/opt/ml/processing/processed_data')],
    arguments=['tokenizer', 'lemmatizer', 'pos-tagger']
)

The rest of the process is exactly the same as above: copy the input(s) inside the container, copy the output(s) from the container to S3.

Pretty simple, don’t you think? Again, I focused on preprocessing, but you can run similar jobs for postprocessing and model evaluation. Don’t forget to check out the examples on GitHub.

Now Available!
Amazon SageMaker Processing is available today in all commercial regions where Amazon SageMaker is available.

Give it a try and please send us feedback, either on the AWS forum for Amazon SageMaker, or through your usual AWS support contacts.

Julien

Now Available on Amazon SageMaker: The Deep Graph Library

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/now-available-on-amazon-sagemaker-the-deep-graph-library/

Today, we’re happy to announce that the Deep Graph Library, an open source library built for easy implementation of graph neural networks, is now available on Amazon SageMaker.

In recent years, deep learning has taken the world by storm thanks to its uncanny ability to extract elaborate patterns from complex data, such as free-form text, images, or videos. However, lots of datasets don’t fit these categories and are better expressed with graphs. Intuitively, we can feel that traditional neural network architectures like convolutional neural networks or recurrent neural networks are not a good fit for such datasets, and a new approach is required.

A Primer On Graph Neural Networks
Graph neural networks (GNN) are one of the most exciting developments in machine learning today, and these reference papers will get you started.

GNNs are used to train predictive models on datasets such as:

  • Social networks, where graphs show connections between related people,
  • Recommender systems, where graphs show interactions between customers and items,
  • Chemical analysis, where compounds are modeled as graphs of atoms and bonds,
  • Cybersecurity, where graphs describe connections between source and destination IP addresses,
  • And more!

Most of the time, these datasets are extremely large and only partially labeled. Consider a fraud detection scenario where we would try to predict the likelihood that an individual is a fraudulent actor by analyzing their connections to known fraudsters. This problem could be defined as a semi-supervised learning task, where only a fraction of graph nodes would be labeled (‘fraudster’ or ‘legitimate’). This should be a better solution than trying to build a large hand-labeled dataset, and “linearizing” it to apply traditional machine learning algorithms.

Working on these problems requires domain knowledge (retail, finance, chemistry, etc.), computer science knowledge (Python, deep learning, open source tools), and infrastructure knowledge (training, deploying, and scaling models). Very few people master all these skills, which is why tools like the Deep Graph Library and Amazon SageMaker are needed.

Introducing The Deep Graph Library
First released on Github in December 2018, the Deep Graph Library (DGL) is a Python open source library that helps researchers and scientists quickly build, train, and evaluate GNNs on their datasets.

DGL is built on top of popular deep learning frameworks like PyTorch and Apache MXNet. If you know either one of these, you’ll find yourself quite at home. No matter which framework you use, you can get started easily thanks to these beginner-friendly examples. I also found the slides and code for the GTC 2019 workshop very useful.

Once you’re done with toy examples, you can start exploring the collection of cutting edge models already implemented in DGL. For example, you can train a document classification model using a Graph Convolution Network (GCN) and the CORA dataset by simply running:

$ python3 train.py --dataset cora --gpu 0 --self-loop

The code for all models is available for inspection and tweaking. These implementations have been carefully validated by AWS teams, who verified performance claims and made sure results could be reproduced.

DGL also includes a collection of graph datasets that you can easily download and experiment with.

Of course, you can install and run DGL locally, but to make your life simpler, we added it to the Deep Learning Containers for PyTorch and Apache MXNet. This makes it easy to use DGL on Amazon SageMaker, in order to train and deploy models at any scale, without having to manage a single server. Let me show you how.

Using DGL On Amazon SageMaker
We added complete examples in the Github repository for SageMaker examples: one of them trains a simple GNN for molecular toxicity prediction using the Tox21 dataset.

The problem we’re trying to solve is figuring out the potential toxicity of new chemical compounds with respect to 12 different targets (receptors inside biological cells, etc.). As you can imagine, this type of analysis is crucial when designing new drugs, and being able to quickly predict results without having to run in vitro experiments helps researchers focus their efforts on the most promising drug candidates.

The dataset contains a little over 8,000 compounds: each one is modeled as a graph (atoms are vertices, atomic bonds are edges), and labeled 12 times (one label per target). Using a GNN, we’re going to build a multi-label binary classification model, allowing us to predict the potential toxicity of candidate molecules.

In the training script, we can easily download the dataset from the DGL collection.

from dgl.data.chem import Tox21
dataset = Tox21()

Similarly, we can easily build a GNN classifier using the DGL model zoo.

from dgl import model_zoo
model = model_zoo.chem.GCNClassifier(
    in_feats=args['n_input'],
    gcn_hidden_feats=[args['n_hidden'] for _ in range(args['n_layers'])],
    n_tasks=dataset.n_tasks,
    classifier_hidden_feats=args['n_hidden']).to(args['device'])

The rest of the code is mostly vanilla PyTorch, and you should be able to find your bearings if you’re familiar with this library.
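To give a flavor of that vanilla PyTorch code, here is a small, self-contained sketch of a masked multi-label loss of the kind such a model typically uses; the tensors and shapes below are illustrative and not taken from the actual example.

import torch
import torch.nn as nn

logits = torch.randn(4, 12, requires_grad=True)  # stand-in for model outputs: 4 molecules x 12 targets
labels = torch.randint(0, 2, (4, 12)).float()    # 0/1 label per (molecule, target) pair
mask = torch.randint(0, 2, (4, 12)).float()      # 1 where a label exists, 0 where it's missing

criterion = nn.BCEWithLogitsLoss(reduction='none')
# Average the per-entry loss over labeled entries only.
loss = (criterion(logits, labels) * mask).sum() / mask.sum().clamp(min=1)
loss.backward()  # gradients flow only through the labeled entries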

When it comes to running this code on Amazon SageMaker, all we have to do is use a SageMaker Estimator, passing the full name of our DGL container, and the name of the training script as a hyperparameter.

estimator = sagemaker.estimator.Estimator(container,
    role,
    train_instance_count=1,
    train_instance_type='ml.p3.2xlarge',
    hyperparameters={'entrypoint': 'main.py'},
    sagemaker_session=sess)
code_location = sess.upload_data(CODE_PATH,
                                 bucket=bucket,
                                 key_prefix=custom_code_upload_location)
estimator.fit({'training-code': code_location})

<output removed>
epoch 23/100, batch 48/49, loss 0.4684

epoch 23/100, batch 49/49, loss 0.5389
epoch 23/100, training roc-auc 0.9451
EarlyStopping counter: 10 out of 10
epoch 23/100, validation roc-auc 0.8375, best validation roc-auc 0.8495
Best validation score 0.8495
Test score 0.8273
2019-11-21 14:11:03 Uploading - Uploading generated training model
2019-11-21 14:11:03 Completed - Training job completed
Training seconds: 209
Billable seconds: 209

Now, we could grab the trained model in S3, and use it to predict toxicity for a large number of compounds, without having to run actual experiments. Fascinating stuff!

Now Available!
You can start using DGL on Amazon SageMaker today.

Give it a try, and please send us feedback in the DGL forum, in the AWS forum for Amazon SageMaker, or through your usual AWS support contacts.

Julien


Amazon Transcribe Medical – Real-Time Automatic Speech Recognition for Healthcare Customers

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-transcribe-medical-real-time-automatic-speech-recognition-for-healthcare-customers/

In 2017, we launched Amazon Transcribe, an automatic speech recognition service that makes it easy for developers to add speech-to-text capability to their applications: today, we’re extremely happy to extend it to medical speech with Amazon Transcribe Medical.

When I was a child, my parents – both medical doctors – often spent evenings recording letters and exam reports with a microcassette recorder, so that their secretary could later type them and archive them. That was a long time ago, but according to a 2017 study by the University of Wisconsin and the American Medical Association, primary care physicians in the US spend a staggering 6 hours per day entering their medical reports in electronic health record (EHR) systems, now a standard requirement at healthcare providers.

I don’t think that anyone would argue that doctors should go back to paper reports: working with digital data is so much more efficient. Still, could they be spared these long hours of administrative work? Surely, that time would be better spent engaging with patients, and getting a little extra rest after a busy day at the hospital?

Introducing Amazon Transcribe Medical
Thanks to Amazon Transcribe Medical, physicians will now be able to easily and quickly dictate their clinical notes and see their speech converted to accurate text in real-time, without any human intervention. Clinicians can use natural speech and do not have to explicitly call out punctuation like “comma” or “full stop”. This text can then be automatically fed to downstream applications such as EHR systems, or to AWS language services such as Amazon Comprehend Medical for entity extraction.

In the spirit of fully managed services, Transcribe Medical frees you from any infrastructure work, and lets you scale effortlessly while only paying for what you actually use: no upfront fees for costly licenses! As you would expect, Transcribe Medical is also HIPAA compliant.

From a technical perspective, all you have to do is capture audio using your device’s microphone, and send PCM audio to a streaming API based on the popular WebSocket protocol. This API will respond with a series of JSON blobs with the transcribed text, as well as word-level time stamps, punctuation, etc. Optionally, you can save this data to an Amazon Simple Storage Service (S3) bucket.
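For illustration, here’s a minimal sketch of how you might pull text and word-level timestamps out of one of those JSON blobs once it has been received and decoded. The field names assume the result shape of the standard Amazon Transcribe streaming API, so check the API reference before relying on them.

import json

def extract_transcripts(event_json):
    # Yield (transcript, [(word, start_time, end_time), ...]) for each final result.
    payload = json.loads(event_json)
    for result in payload.get('Transcript', {}).get('Results', []):
        if result.get('IsPartial'):
            continue  # skip partial hypotheses, keep only final results
        for alternative in result.get('Alternatives', []):
            words = [(item['Content'], item.get('StartTime'), item.get('EndTime'))
                     for item in alternative.get('Items', [])
                     if item.get('Type') == 'pronunciation']
            yield alternative.get('Transcript', ''), words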

Amazon Transcribe Medical In Action
Let’s do a quick demo with medical text from MT Samples, a great collection of real-life anonymized medical transcripts that are free to use and distribute.

I’m using a streaming application modified for Transcribe Medical, and you’ll be able to do the same in the AWS console. You can view a video recording of this demo here.

Now Available!
You can start using Amazon Transcribe Medical today in the US East (N. Virginia) and US West (Oregon) regions.

Give it a try, and please share your feedback in the AWS forum for Amazon Transcribe, or with your usual AWS support contacts.

– Julien

AWS DeepComposer – Compose Music with Generative Machine Learning Models

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/aws-deepcomposer-compose-music-with-generative-machine-learning-models/

Today, we’re extremely happy to announce AWS DeepComposer, the world’s first machine learning-enabled musical keyboard. Yes, you read that right.

Machine learning (ML) requires quite a bit of math, computer science, code, and infrastructure. These topics are exceedingly important but to a lot of aspiring ML developers, they look overwhelming and sometimes, dare I say it, boring.

To help everyone learn about practical ML and have fun doing it, we introduced several ML-powered devices. At AWS re:Invent 2017, we introduced AWS DeepLens, the world’s first deep learning-enabled camera, to help developers learn about ML for computer vision. Last year, we launched AWS DeepRacer, a fully autonomous 1/18th scale race car driven by reinforcement learning. This year, we’re raising the bar (pardon the pun).

Introducing AWS DeepComposer
AWS DeepComposer is a 32-key, 2-octave keyboard designed for developers to get hands-on with generative AI, with either pretrained models or your own.

You can request to get emailed when the device becomes available, or you can use a virtual keyboard in the AWS console.

Here’s the high-level view:

  1. Log into the DeepComposer console,
  2. Record a short musical tune, or use a prerecorded one,
  3. Select a generative model for your favorite genre, either pretrained or your own,
  4. Use this model to generate a new polyphonic composition,
  5. Play the composition in the console,
  6. Export the composition or share it on SoundCloud.

Let me show you how to quickly generate your first composition with a pretrained model. Then, I’ll discuss training your own model, and I’ll close the post with a primer on the underlying technology powering DeepComposer: Generative Adversarial Networks (GAN).

Using a Pretrained Model
Opening the console, I go to the Music Studio, where I can either select a prerecorded tune, or record one myself.

I go with the former, selecting Beethoven’s “Ode to Joy”.

I also select the pretrained model I want to use: classical, jazz, rock, or pop. These models have been trained on large music data sets for their respective genres, and I can use them directly. In the absence of ‘metal’ (watch out for that feature request, team), I pick ‘rock’ and generate the composition.

A few seconds later, I see the additional accompaniments generated by the model. I assign them different instruments: drums, overdriven guitar, electric guitar (clean), and electric bass (finger).

Here’s the result. What do you think?

Finally, I can export the composition to a MIDI or MP3 file, and share it on my SoundCloud account. Fame awaits!

Training Your Own Model
I can also train my own model on a data set for my favorite genre. I need to select:

  • Architecture parameters for the Generator and the Discriminator (more on what these are in the next section),
  • The loss function used during training to measure the difference between the output of the algorithm and the expected value,
  • Hyperparameters,
  • A validation sample that I’ll be able to listen to while the model is trained.

During training, I can see quality metrics, and I can listen to the validation sample selected above. Once the model has been fully trained, I can use it to generate compositions, just like for pretrained models.

A Primer on Generative Adversarial Networks
GANs saw the light of day in 2014, with the publication of “Generative Adversarial Networks” by Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville and Yoshua Bengio.

In the authors’ words:

In the proposed adversarial nets framework, the generative model is pitted against an adversary: a discriminative model that learns to determine whether a sample is from the model distribution or the data distribution. The generative model can be thought of as analogous to a team of counterfeiters, trying to produce fake currency and use it without detection, while the discriminative model is analogous to the police, trying to detect the counterfeit currency. Competition in this game drives both teams to improve their methods until the counterfeits are indistinguishable from the genuine articles.

Let me expand on this a bit:

  • The Generator has no access to the data set. Using random data, it creates samples that are forwarded through the Discriminator model.
  • The Discriminator is a binary classification model, learning how to recognize genuine data samples (included in the training set) from fake samples (made up by the Generator). The training process uses traditional techniques like gradient descent, back propagation, etc.
  • As the Discriminator learns, its weights are updated.
  • The same updates are applied to the Generator. This is the key to understanding GANs: by applying these updates, the Generator progressively learns how to generate samples that are closer and closer to the ones that the Discriminator considers as genuine.

To sum things up, you have to train as a counterfeiting expert in order to become a great counterfeiter… but don’t take this as career advice! If you’re curious to learn more, you may like this post from my own blog, explaining how to generate MNIST samples with an Apache MXNet GAN.
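If you’d like to see the adversarial training loop in code, here is a minimal, self-contained PyTorch sketch, deliberately unrelated to DeepComposer’s own models, in which a Generator learns to imitate a simple 2-D Gaussian distribution.

import torch
import torch.nn as nn

real_dim, noise_dim, batch_size = 2, 8, 64

generator = nn.Sequential(nn.Linear(noise_dim, 32), nn.ReLU(), nn.Linear(32, real_dim))
discriminator = nn.Sequential(nn.Linear(real_dim, 32), nn.ReLU(), nn.Linear(32, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    # "Real" samples: points drawn from a Gaussian centered at (2, 2).
    real = torch.randn(batch_size, real_dim) + 2.0
    fake = generator(torch.randn(batch_size, noise_dim))

    # 1) Update the Discriminator: label real samples 1, generated samples 0.
    d_loss = bce(discriminator(real), torch.ones(batch_size, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(batch_size, 1))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # 2) Update the Generator: try to make the Discriminator output 1 on generated samples.
    g_loss = bce(discriminator(generator(torch.randn(batch_size, noise_dim))), torch.ones(batch_size, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()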

If you just want to play music and have fun like this little fellow, that’s fine too!

Coming Soon!
AWS DeepComposer absolutely rocks. You can sign up for the preview today, and get notified when the keyboard is released.

– Julien

AWS Security Profiles: Avni Rambhia, Senior Product Manager, CloudHSM

Post Syndicated from Becca Crockett original https://aws.amazon.com/blogs/security/aws-security-profiles-avni-rambhia-senior-product-manager-cloudhsm/


In the weeks leading up to re:Invent 2019, we’ll share conversations we’ve had with people at AWS who will be presenting at the event so you can learn more about them and some of the interesting work that they’re doing.


How long have you been at AWS, and what do you enjoy most in your current role?

It’s been two and a half years already! Time has flown. I’m the product manager for AWS CloudHSM. As with most product managers at AWS, I’m the CEO of my product. I spend a lot of my time talking to customers who are looking to use CloudHSM, to understand the problems they are looking to solve. My goal is to make sure they are looking at their problems correctly. Often, my role as a product manager is to coach. I ask a lot of why’s. I learned this approach after I came to AWS—before that I had the more traditional product management approach of listening to customers to take requirements, prioritize them, do the marketing, all of that. This notion of deeply understanding what customers are trying to do and then helping them find the right path forward—which might not be what they were thinking of originally—is something I’ve found unique to AWS. And I really enjoy that piece of my work.

What are you currently working on that you’re excited about?

CloudHSM is a hardware security module (HSM) that lets you generate and use your own encryption keys on AWS. However, CloudHSM is weird in that, by design, you’re explicitly outside the security boundary of AWS managed services when you use it: You don’t use AWS IAM roles, and HSM transactions aren’t captured in AWS CloudTrail. You transact with your HSM over an end-to-end encrypted channel between your application and your HSM. It’s more similar to having to operate a 3rd party application in Amazon Elastic Compute Cloud (EC2) than it is to using an AWS managed service. My job, without breaking the security and control the service offers, is to continue to make customers’ lives better through more elastic, user-friendly, and reliable HSM experiences.

We’re currently working on simplifying cross-region synchronization of CloudHSM clusters. We’re also working on simplifying management operations, like adjusting key attributes or rotating user passwords.

Another really exciting thing that we’re working on is auto-scaling for HSM clusters based on load metrics, to make CloudHSM even more elastic. CloudHSM already broke the mold of traditional HSMs with zero-config cluster scaling. Now, we’re looking to expand how customers can leverage this capability to control costs without sacrificing availability.

What’s the most challenging part of your job?

For one, time management. AWS is so big, and our influence is so vast, that there’s no end to how much you can do. As Amazonians, we want to take ownership of our work, and our bias for action pushes us to accomplish everything quickly. Still, you have to live to fight another day, so prioritizing and saying no is necessary. It’s hard!

I also challenge myself to continue to cultivate the patience and collaboration that gets a customer on a good security path. It’s very easy to say, This is what they’re asking for, so let’s build it—it’s easy, it’s fast, let’s do it. But that’s not the customer-obsessed solution. It’s important to push for the correct, long-term outcome for our customers, and that often means training, and bringing in Solutions Architects and Support. It means being willing to schedule the meetings and take the calls and go out to the conferences. It’s hard, but it’s the right thing to do.

What’s your favorite part of your job?

Shipping products. It’s fun to announce something new, and then watch people jump on it and get really excited.

I still really enjoy demonstrating the elastic nature of CloudHSM. It sounds silly, but you can delete a CloudHSM instance and then create a new HSM with a simple API call or console button click. We save your state, so it picks up right where you left off. When you demo that to customers who are used to the traditional way of using on-premises HSMs, their eyes will light up—it’s like being a kid in the candy store. They see a meaningful improvement to the experience of managing HSM they never thought was possible. It’s so much fun to see their reaction.

What does cloud security mean to you, personally?

At the risk of hubris, I believe that to some extent, cloud security is about the survival of the human race. 15-20 years ago, we didn’t have smart phones, and the internet was barely alive. What happened on one side of the planet didn’t immediately and irrevocably affect what happened on the opposite side of the planet. Now, in this connected world, my children’s classrooms are online, my assets, our family videos, our security system—they are all online. With all the flexibility of digital systems comes an enormous amount of responsibility on the service and solution providers. Entire governments, populations, and countries depend on cloud-based systems. It’s vital that we stay ten steps ahead of any potential risk. I think cloud security functions similarly to the way that antibiotics and vaccinations function—it allows us to prevent, detect and treat issues before they become serious threats. I am very, very proud to be part of a team that is constantly looking ahead and raising the bar in this area.

What’s the most common misperception you encounter with customers about cloud security?

That you have to directly configure and use your HSMs to be secure in the cloud. In other words, I’m constantly telling people they do not need to use my product.

To some extent, when customers adopt CloudHSM, it means that we at AWS have not succeeded at giving them an easier to use, lower cost, fully managed option. CloudHSM is expensive. As easy as we’ve made it to use, customers still have to manage their own availability, their own throttling, their own users, their own IT monitoring.

We want customers to be able to use fully managed security services like AWS KMS, ACM Private CA, AWS Code Signing, AWS Secrets Manager and similar services instead of rolling their own solution using CloudHSM. We’re constantly working to pull common CloudHSM use cases into other managed services. In fact, the main talk that I’m doing at re:Invent will put all of our security services into this context. I’m trying to make the point that traditional wisdom says that you have to use a dedicated cryptographic module via CloudHSM to be secure. However, practical wisdom, with all of the advances that we’ve made in all of the other services, almost always indicates that KMS or one of the other managed services is the better option.

In your opinion, what’s the biggest challenge facing cloud security right now?

From my vantage point, I think the challenge is the disconnect between compliance and security officers and DevOps teams.

DevOps people want to know things like, “Can you rotate your keys? Can you detect breaches? Can you be agile with your encryption?” But I think that security and compliance folks still tend to gravitate toward a focus on creating and tracking keys and cryptographic material. When you try to adapt those older, more established methodologies, I think you give away a lot of the power and flexibility that would give you better resilience.

Five or more years from now, what changes do you think we’ll see across the security landscape?

I think what’s coming is a fundamental shift in the roots of trust. Right now, the prevailing notion is that the roots of trust are physically, logically, and administratively separate from your day-to-day compute. With Nitro and Firecracker and more modern, scalable ways of establishing local roots of trust, I look forward to a day, maybe ten years from now, when HSMs are obsolete altogether, and customers can take their key security wherever they go.

I also think there is a lot of work being done, and to be done, in encrypted search. If at the end of the day you can’t search data, it’s hard to get the full value out of it. At the same time, you can’t have it in clear text. Searchable encryption currently has and will likely always have limitations, but we’re optimistic that encrypted search for meaningful use cases can be delivered at scale.

You’re involved with two sessions at re:Invent. One is Achieving security goals with AWS CloudHSM. How did you choose this particular topic?

I talk to customers at networking conferences run by AWS—and also recently at Grace Hopper—about what content they’d like from us. A recurring request is guidance on navigating the many options for security and cryptography on AWS. They’re not sure where to start, what they should use, or the right way to think about all these security services.

So the genesis of this talk was basically, “Hey, let’s provide some kind of decision tree to give customers context for the different use cases they’re trying to solve and the services that AWS provides for those use cases!” For each use case, we’ll show the recommended managed service, the alternative service, and the pros and cons of both. We want the customer’s decision process to go beyond just considerations of cost and day-one complexity.

What are you hoping that your audience will do differently as a result of attending this session?

I’d like DevOps attendees to be able to articulate their operational needs to their security planning teams more succinctly and with greater precision. I’d like auditors and security planners to have a wider, more realistic view of AWS services and capabilities. I’d like customers as a whole to make the right choice for their business and their own customers. It’s really important for teams as a whole to understand the problem they’re trying to solve. If they can go into their planning and Ops meetings armed with a clear, comprehensive view of the capabilities that AWS offers, and if they can make their decisions from the position of rational information, not preconceived notions, then I think I’ll have achieved the goals of this session.

You’re also co-presenting a deep-dive session along with Rohit Mathur on CloudHSM. What can you tell us about the session that’s not described in the re:Invent catalog?

So, what the session actually should be called is: If you must use CloudHSM, here’s how you don’t shoot yourself in the foot.

In the first half of the deep dive, we explain how CloudHSM is different from traditional HSMs. When we made it agile, elastic, and durable, we changed a lot of the traditional paradigms of how HSMs are set up and operated. So we’ll spend a good bit of time explaining how things are different. While there are many things you don’t have to worry about, there are some things that you really have to get right in order for your CloudHSM cluster to work for you as you expect it to.

We’ll talk about how to get maximum power, flexibility, and economy out of the CloudHSM clusters that you’re setting up. It’s somewhat different from the traditional model, where the HSM was just one appliance owned by one customer, and the hardware, software, and support all came from a single vendor. CloudHSM is AWS native, so you still have the single-tenant, third-party, FIPS 140-2 validated hardware, but your software and support come from AWS. A lot of the integrations and operational aspects of it are very “cloudy” in nature now. Getting customers comfortable with how to program, monitor, and scale is a lot of what we’ll talk about in this session.

We’ll also cover some other big topics. I’m very excited that we’ll talk about trusted key wrapping. It’s a new feature that allows you to mark certain keys as trusted and then control the attributes of keys that are wrapped and unwrapped with those trusted keys. It’s going to open up a lot of flexibility for customers as they implement their workloads. We’ll include cross-region disaster recovery, which tends to be one of the more gnarly problems that customers are trying to solve. You have several different options to solve it depending on your workloads, so we’ll walk you through those options. Finally, we’ll definitely go through performance because that’s where we see a lot of customer concerns, and we really want our users to get the maximum throughput for their HSM investments.

Any advice for first-time attendees coming to re:Invent?

Wear comfortable shoes … and bring Chapstick. If you’ve never been to re:Invent before, prepare to be overwhelmed!

Also, come prepared with your hard questions and seek out AWS experts to answer them. You’ll find resources at the Security booth, you can DM us on Twitter, catch us before or after talks, or just reach out to your account manager to set up a meeting. We want to meet customers while we’re there, and solve problems for you, so seek us out!

You like philosophy. Who’s your favorite philosopher and why?

Rabindranath Tagore. He was an Indian poet who wrote with deep insight about homeland, faith, change, and humanity. I spent my early childhood in the US, then grew up in Bombay and have lived across the Pacific Northwest, the East Coast, the Midwest, and down south in Louisiana in equal measure. When someone asks me where I’m from, I have a hard time answering honestly because I’m never really sure. I like Tagore’s poems because he frames that ambiguity in a way that makes sense. If you abstract the notion of home to the notion of what makes you feel at home, then answers are easier to find!
 
Want more AWS Security news? Follow us on Twitter.

The AWS Security team is hiring! Want to find out more? Check out our career page.

Avni Rambhia

Avni is the Senior Product Manager for AWS CloudHSM. At work, she’s passionate about enabling customers to meet their security goals in the AWS Cloud. At leisure, she enjoys the casual outdoors and good coffee.

Learn about AWS Services & Solutions – October AWS Online Tech Talks

Post Syndicated from Jenny Hang original https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-october-aws-online-tech-talks/

Learn about AWS Services & Solutions – October AWS Online Tech Talks

AWS Tech Talks

Join us this October to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

AR/VR: 

October 30, 2019 | 9:00 AM – 10:00 AM PT – Using Physics in Your 3D Applications with Amazon Sumerian – Learn how to simulate real-world environments in 3D using Amazon Sumerian’s new robust physics system.

Compute: 

October 24, 2019 | 9:00 AM – 10:00 AM PT – Computational Fluid Dynamics on AWS – Learn best practices to run Computational Fluid Dynamics (CFD) workloads on AWS.

October 28, 2019 | 1:00 PM – 2:00 PM PT – Monitoring Your .NET and SQL Server Applications on Amazon EC2 – Learn how to manage your application logs through AWS services to improve performance and resolve issues for your .NET and SQL Server applications.

October 31, 2019 | 9:00 AM – 10:00 AM PT – Optimize Your Costs with AWS Compute Pricing Options – Learn which pricing models work best for your workloads and how to combine different purchase options to optimize cost, scale, and performance.

Data Lakes & Analytics: 

October 23, 2019 | 9:00 AM – 10:00 AM PT – Practical Tips for Migrating Your IBM Netezza Data Warehouse to the Cloud – Learn how to migrate your IBM Netezza Data Warehouse to the cloud to save costs and improve performance.

October 31, 2019 | 11:00 AM – 12:00 PM PT – Alert on Your Log Data with Amazon Elasticsearch Service – Learn how to receive alerts on your data to monitor your application and infrastructure using Amazon Elasticsearch Service.

Databases:

October 22, 2019 | 1:00 PM – 2:00 PM PT – How to Build Highly Scalable Serverless Applications with Amazon Aurora Serverless – Get an overview of Amazon Aurora Serverless, an on-demand, auto-scaling configuration for Amazon Aurora, and learn how you can use it to build serverless applications.

DevOps:

October 21, 2019 | 11:00 AM – 12:00 PM PT – Migrate Your Ruby on Rails App to AWS Fargate in One Step Using AWS Rails Provisioner – Learn how to define and deploy containerized Ruby on Rails Applications on AWS with a few commands.

End-User Computing: 

October 24, 2019 | 11:00 AM – 12:00 PM PT – Why Software Vendors Are Choosing Application Streaming Instead of Rewriting Their Desktop Apps – Walk through common customer use cases of how Amazon AppStream 2.0 lets software vendors deliver instant demos, trials, and training of desktop applications.

October 29, 2019 | 11:00 AM – 12:00 PM PT – Move Your Desktops and Apps to AWS End-User Computing – Get an overview of AWS End-User Computing services and then dive deep into best practices for implementation.

Enterprise & Hybrid: 

October 29, 2019 | 1:00 PM – 2:00 PM PT – Leverage Compute Pricing Models and Rightsizing to Maximize Savings on AWS – Get tips on building a cost-management strategy, incorporating pricing models and resource rightsizing.

IoT:

October 30, 2019 | 1:00 PM – 2:00 PM PT – Connected Devices at Scale: A Deep Dive into the AWS Smart Product Solution – Learn how to jump-start the development of innovative connected products with the new AWS Smart Product Solution.

Machine Learning:

October 23, 2019 | 1:00 PM – 2:00 PM PT – Analyzing Text with Amazon Elasticsearch Service and Amazon Comprehend – Learn how to deploy a cost-effective, end-to-end solution for extracting meaningful insights from unstructured text data like customer calls, support tickets, or online customer feedback.

October 28, 2019 | 11:00 AM – 12:00 PM PT – AI-Powered Health Data Masking – Learn how to use the AI-Powered Health Data Masking solution for use cases like clinical decision support, revenue cycle management, and clinical trial management.

Migration:

October 22, 2019 | 11:00 AM – 12:00 PM PT – Deep Dive: How to Rapidly Migrate Your Data Online with AWS DataSync – Learn how AWS DataSync makes it easy to rapidly move large datasets into Amazon S3 and Amazon EFS for your applications.

Mobile:

October 21, 2019 | 1:00 PM – 2:00 PM PT – Mocking and Testing Serverless APIs with AWS Amplify – Learn how to mock and test GraphQL APIs in your local environment with AWS Amplify.

Robotics:

October 22, 2019 | 9:00 AM – 10:00 AM PT – The Future of Smart Robots Has Arrived – Learn how and why you should build smarter robots with AWS.

Security, Identity and Compliance: 

October 29, 2019 | 9:00 AM – 10:00 AM PT – Using AWS Firewall Manager to Simplify Firewall Management Across Your Organization – Learn how AWS Firewall Manager simplifies rule management across your organization.

Serverless:

October 21, 2019 | 9:00 AM – 10:00 AM PT – Advanced Serverless Orchestration with AWS Step Functions – Go beyond the basics and explore the best practices of Step Functions, including development and deployment of workflows and how you can track the work being done.

October 30, 2019 | 11:00 AM – 12:00 PM PT – Managing Serverless Applications with SAM Templates – Learn how to reduce code and increase efficiency by managing your serverless apps with AWS Serverless Application Model (SAM) templates.

Storage:

October 23, 2019 | 11:00 AM – 12:00 PM PT – Reduce File Storage TCO with Amazon EFS and Amazon FSx for Windows File Server – Learn how to optimize file storage costs with AWS storage solutions.

Talk Transcript: How Cloudflare Thinks About Security

Post Syndicated from John Graham-Cumming original https://blog.cloudflare.com/talk-transcript-how-cloudflare-thinks-about-security/

Image courtesy of Unbabel

This is the text I used for a talk at Unbabel, an artificial intelligence-powered translation platform, in Lisbon on September 25, 2019.

Bom dia. Eu sou John Graham-Cumming o CTO do Cloudflare. E agora eu vou falar em inglês. (Good morning. I’m John Graham-Cumming, the CTO of Cloudflare. And now I’m going to speak in English.)

Thanks for inviting me to talk about Cloudflare and how we think about security. I’m about to move to Portugal permanently so I hope I’ll be able to do this talk in Portuguese in a few months.

I know that most of you don’t have English as a first language so I’m going to speak a little more deliberately than usual. And I’ll make the text of this talk available for you to read.

But there are no slides today.

I’m going to talk about how Cloudflare thinks about internal security, how we protect ourselves and how we secure our day to day work. This isn’t a talk about Cloudflare’s products.

Culture

Let’s begin with culture.

Many companies have culture statements. I think almost 100% of these are pure nonsense. Culture is how you act every day, not words written on the wall.

One significant piece of company culture is the internal Security Incident mailing list which anyone in the company can send a message to. And they do! So far this month there have been 55 separate emails to that list reporting a security problem.

These mails come from all over the company, from every department. Two to three per day. And each mail is investigated by the internal security team. Each mail is assigned a Security Incident issue in our internal Atlassian Jira instance.

People send: reports that their laptop or phone has been stolen (their credentials get immediately invalidated), suspicions about a weird email that they’ve received (it might be phishing or malware in an attachment), a concern about physical security (for example, someone wanders into the office and starts asking odd questions), that they clicked on a bad link, that they lost their access card, and, occasionally, a security concern about our product.

Things like stolen or lost laptops and phones happen way more often than you’d imagine. We seem to lose about two per month. For that reason and many others we use full disk encryption on devices, complex passwords and two-factor auth on every service employees need to access. And we discourage anyone from storing anything on their laptop and ask them to primarily use cloud apps for work. Plus we centrally manage machines and can remote wipe.

We have a 100% blame free culture. You clicked on a weird link? We’ll help you. Lost your phone? We’ll help you. Think you might have been phished? We’ll help you.

This has led to a culture of reporting problems, however minor, when they occur. It’s our first line of internal defense.

Just this month I clicked on a link that sent my web browser hopping crazily through redirects until I ended up at a bad place. I reported that to the mailing list.

I’ve never worked anywhere with such a strong culture of reporting security problems big and small.

Hackers

We also use HackerOne to let people report security problems from the outside. This month we’ve received 14 reports of security problems. To be honest, most of what we receive through HackerOne is very low priority. People run automated scanning tools and report the smallest of configuration problems, or, quite often, things that they don’t understand but that look like security problems to them. But we triage and handle them all.

And people do on occasion report things that we need to fix.

We also have a private paid bug bounty program where we work with a group of individual hackers (around 150 right now) who get paid for the vulnerabilities that they’ve found.

We’ve found that this combination of a public responsible disclosure program and then a private paid program is working well. We invite the best hackers who come in through the public program to work with us closely in the private program.

Identity

So, that’s all about people, internal and external, reporting problems, vulnerabilities, or attacks. A very short step from that is knowing who the people are.

And that’s where identity and authentication become critical. In fact, as an industry trend identity management and authentication are one of the biggest areas of spending by CSOs and CISOs. And Cloudflare is no different.

OK, well it is different: instead of spending a lot on identity and authentication, we’ve built our own solutions.

We did not always have good identity practices. In fact, for many years our systems had different logins and passwords and it was a complete mess. When a new employee started, accounts had to be made on Google for email and calendar, on Atlassian for Jira and Wiki, on the VPN, on the WiFi network and then on a myriad of other systems for the blog, HR, SSH, build systems, etc. etc.

And when someone left all that had to be undone. And frequently this was done incorrectly. People would leave and accounts would still be left running for a period of time. This was a huge headache for us and is a huge headache for literally every company.

If I could tell companies one thing they can do to improve their security it would be: sort out identity and authentication. We did and it made things so much better.

This makes the process of bringing someone on board much smoother and the same when they leave. We can control who accesses what systems from a single control panel.

I have one login via a product we built called Cloudflare Access and I can get access to pretty much everything. I looked in my LastPass Vault while writing this talk and there are a total of just five username and password combinations, and two of those needed deleting because we’ve migrated those systems to Access.

So, yes, we use password managers. And we lock down everything with high quality passwords and two factor authentication. Everyone at Cloudflare has a Yubikey and access to TOTP (such as Google Authenticator). There are three golden rules: all passwords should be created by the password manager, all authentication has to have a second factor and the second factor cannot be SMS.

We had great fun rolling out Yubikeys to the company because we did it during our annual retreat in a single company wide sitting. Each year Cloudflare gets the entire company together (now over 1,000 people) in a hotel for two to three days of working together, learning from outside experts and physical and cultural activities.

Last year the security team gave everyone a pair of physical security tokens (a Yubikey and a Titan Key from Google for Bluetooth) and in an epic session configured everyone’s accounts to use them.

Note: do not attempt to get 500 people to sync Bluetooth devices in the same room at the same time. Bluetooth cannot cope.

Another important thing we implemented is automatic timeout of access to a system. If you don’t use access to a system you lose it. That way we don’t have accounts that might have access to sensitive systems that could potentially be exploited.

Openness

To return to the subject of Culture for a moment an important Cloudflare trait is openness.

Some of you may know that back in 2017 Cloudflare had a horrible bug in our software that became called Cloudbleed. This bug leaked memory from inside our servers into people’s web browsing. Some of that web browsing was being done by search engine crawlers and ended up in the caches of search engines like Google.

We had to do two things: stop the actual bug (this was relatively easy and was done in under an hour) and then clean up the equivalent of an oil spill of data. That took longer (about a week to ten days) and was very complicated.

But from the very first night when we were informed of the problem we began documenting what had happened and what we were doing. I opened an EMACS buffer in the dead of night and started keeping a record.

That record turned into a giant disclosure blog post that contained the gory details of the error we made, its consequences and how we reacted once the error was known.

We followed up a few days later with a further long blog post assessing the impact and risk associated with the problem.

This approach to being totally open ended up being a huge success for us. It increased trust in our product and made people want to work with us more.

I was on my way to Berlin to give a talk to a large retailer about Cloudbleed when I suddenly realized that the company I was giving the talk at was NOT a customer. And I asked the salesperson I was with what I was doing.

I walked into their 1,000-person engineering team, all assembled to hear my talk. Afterwards the VP of Engineering thanked me, saying that our transparency had made them want to work with us rather than their current vendor. My talk was really a sales pitch.

Similarly, at RSA last year I gave a talk about Cloudbleed and a very large company’s CSO came up and asked to use my talk internally to try to encourage their company to be so open.

When on July 2 this year we had an outage, which wasn’t security related, we once again blogged in incredible detail about what happened. And once again we heard from people about how our transparency mattered to them.

The lesson is that being open about mistakes increases trust. And if people trust you then they’ll tend to tell you when there are problems. I get a ton of reports of potential security problems via Twitter or email.

Change

After Cloudbleed we started changing how we write software. Cloudbleed was caused, in part, by the use of memory-unsafe languages. In that case it was C code that could run past the end of a buffer.

We didn’t want that to happen again, so we’ve prioritized languages where that simply cannot happen, such as Go and Rust. We’re very well known for using Go. If you’ve ever visited a Cloudflare website, or used an app that uses us for its API (and you have, because of our scale), then you’ve first done a DNS query to one of our servers.

That DNS query will have been responded to by a Go program called RRDNS.

There’s also a lot of Rust being written at Cloudflare and some of our newer products are being created using it. For example, Firewall Rules which do arbitrary filtering of requests to our customers are handled by a Rust program that needs to be low latency, stable and secure.

Security is a company wide commitment

The other post-Cloudbleed change was that any crashes on our machines came under the spotlight from the very top. If a process crashes I personally get emailed about it. And if the team doesn’t take those crashes seriously they get me poking at them until they do.

We missed the fact that Cloudbleed was crashing our machines and we won’t let that happen again. We use Sentry to correlate information about crashes and the Sentry output is one of the first things I look at in the morning.

Which, I think, brings up an important point. I spoke earlier about our culture of “If you see something weird, say something” but it’s equally important that security comes from the top down.

Our CSO, Joe Sullivan, doesn’t report to me, he reports to the CEO. That sends a clear message about where security sits in the company. But, also, the security team itself isn’t sitting quietly in the corner securing everything.

They are setting standards, acting as trusted advisors, and helping deal with incidents. But their biggest role is to be a source of knowledge for the rest of the company. Everyone at Cloudflare plays a role in keeping us secure.

You might expect me to have access to all our systems, a passcard that gets me into any room, a login for any service. But the opposite is true: I don’t have access to most things. I don’t need it to get my job done and so I don’t have it.

This makes me a less attractive target for hackers, and we apply the same rule to everyone. If you don’t need access for your job you don’t get it. That’s made a lot easier by the identity and authentication systems and by our rule about timing out access if you don’t use a service. You probably didn’t need it in the first place.

The flip side of all of us owning security is that deliberately doing the wrong thing has severe consequences.

Making a mistake is just fine. The person who wrote the bad line of code that caused Cloudbleed didn’t get fired, the person who wrote the bad regex that brought our service to a halt on July 2 is still with us.‌‌

Detection and Response‌‌

Naturally, things do go wrong internally. Things that didn’t get reported. To do with them we need to detect problems quickly. This is an area where the security team does have real expertise and data.‌‌

We do this by collecting data about how our endpoints (my laptop, a company phone, servers on the edge of our network) are behaving. And this is fed into a homebuilt data platform that allows the security team to alert on anomalies.‌‌

It also allows them to look at historical data in case of a problem that occurred in the past, or to understand when a problem started. ‌‌

Initially the team was going to use a commercial data platform or SIEM but they quickly realized that these platforms are incredibly expensive and they could build their own at a considerably lower price.‌‌

Also, Cloudflare handles a huge amount of data. When you’re looking at operating system level events on machines in 194 cities plus every employee you’re dealing with a huge stream. And the commercial data platforms love to charge by the size of that stream.‌‌

We are integrating internal DNS data, activity on individual machines, network netflow information, badge reader logs and operating system level events to get a complete picture of what’s happening on any machine we own.‌‌

When someone joins Cloudflare they travel to our head office in San Francisco for a week of training. Part of that training involves getting their laptop and setting it up and getting familiar with our internal systems and security.‌‌

During one of these orientation weeks a new employee managed to download malware while setting up their laptop. Our internal detection systems spotted this happening and the security team popped over to the orientation room and helped the employee get a fresh laptop.‌‌

The time between the malware being downloaded and detected was about 40 minutes.‌‌

If you don’t want to build something like this yourself, take a look at Google’s Chronicle product. It’s very cool. ‌‌

One really rich source of data about your organization is DNS. For example, you can often spot malware just by the DNS queries it makes from a machine. If you do one thing then make sure all your machines use a single DNS resolver and get its logs.‌‌‌‌
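To make that concrete, here is a minimal Python sketch of what mining resolver logs for suspicious queries can look like. This is illustrative only, not Cloudflare’s actual tooling: the log format, the allowlist, and the entropy threshold (a crude heuristic for random-looking DGA domains) are all assumptions.

```python
import math
from collections import Counter

# Hypothetical allowlist and threshold -- tune these to your environment.
ALLOWED_SUFFIXES = (".cloudflare.com", ".amazonaws.com", ".google.com")
ENTROPY_THRESHOLD = 3.5  # random-looking labels (common in DGA malware) score high

def shannon_entropy(label: str) -> float:
    """Character entropy of a DNS label."""
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def suspicious(domain: str) -> bool:
    """Flag domains that are off the allowlist and look machine-generated."""
    if not domain or domain.endswith(ALLOWED_SUFFIXES):
        return False
    first_label = domain.split(".")[0]
    return bool(first_label) and shannon_entropy(first_label) > ENTROPY_THRESHOLD

def scan(log_path: str) -> None:
    """Assumes one query per line in the form '<client_ip> <queried_domain>'."""
    with open(log_path) as log:
        for line in log:
            client, _, domain = line.strip().partition(" ")
            if suspicious(domain):
                print(f"ALERT: {client} queried {domain}")

if __name__ == "__main__":
    scan("resolver-queries.log")
```

In practice you would feed these alerts into the same pipeline as your other endpoint signals rather than printing them, but the underlying idea is the same: one resolver, one log, one place to look.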

Edge Security‌‌

In some ways the most interesting part of Cloudflare is the least interesting from a security perspective. Not because there aren’t great technical challenges to securing machines in 194 cities, but because some of the apparently more mundane things I’ve talked about have such a huge impact.

Identity, Authentication, Culture, Detection and Response.‌‌

But, of course, the edge needs securing. And it’s a combination of physical data center security and software. ‌‌

To give you one example let’s talk about SSL private keys. Those keys need to be distributed to our machines so that when an SSL connection is made to one of our servers we can respond. But SSL private keys are… private!‌‌

And we have a lot of them. So we have to distribute private key material securely. This is a hard problem. We encrypt the private keys while at rest and in transport with a separate key that is distributed to our edge machines securely. ‌‌

Access to that key is tightly controlled so that no one can start decrypting keys in our database. And if our database leaked then the keys couldn’t be decrypted since the key needed is stored separately.‌‌

And that key is itself GPG encrypted.‌‌
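As a rough illustration of that layered approach (this is a sketch, not Cloudflare’s actual implementation), here is what envelope encryption looks like in Python with the cryptography package: each private key is encrypted with its own data key, and the data key is in turn encrypted with a wrapping key that is stored and distributed separately.

```python
from cryptography.fernet import Fernet

# The wrapping key is generated and guarded separately from the key database
# (Cloudflare additionally GPG-encrypts theirs); leaking the database alone
# is not enough to recover any private key.
wrapping_key = Fernet.generate_key()

def encrypt_private_key(pem_bytes: bytes) -> tuple[bytes, bytes]:
    """Envelope-encrypt one SSL private key with a fresh data key."""
    data_key = Fernet.generate_key()
    encrypted_key_material = Fernet(data_key).encrypt(pem_bytes)
    wrapped_data_key = Fernet(wrapping_key).encrypt(data_key)
    return encrypted_key_material, wrapped_data_key

def decrypt_private_key(encrypted_key_material: bytes, wrapped_data_key: bytes) -> bytes:
    """Only a machine holding the wrapping key can recover the private key."""
    data_key = Fernet(wrapping_key).decrypt(wrapped_data_key)
    return Fernet(data_key).decrypt(encrypted_key_material)
```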

But wait… there’s more!‌‌

We don’t actually want to have decrypted keys stored in any process that is accessible from the Internet. So we use a technology called Keyless SSL where the keys are kept by a separate process and accessed only when needed to perform operations.

And Keyless SSL can run anywhere. For example, it doesn’t have to be on the same machine as the machine handling an SSL connection. It doesn’t even have to be in the same country. Some of our customers make use of that to specify where their keys are distributed to.

Use Cloudflare to secure Cloudflare

One key strategy of Cloudflare is to eat our own dogfood. If you’ve not heard that term before it’s quite common in the US. The idea is that if you’re making food for dogs you should be so confident in its quality that you’d eat it yourself.

Cloudflare does the same for security. We use our own products to secure ourselves. But more than that if we see that there’s a product we don’t currently have in our security toolkit then we’ll go and build it.

Since Cloudflare is a cybersecurity company we face the same challenges as our customers, but we can also build our way out of those challenges. In this way, our internal security team is also a product team. They help to build or influence the direction of our own products.

The team is also a Cloudflare customer using our products to secure us and we get feedback internally on how well our products work. That makes us more secure and our products better.

Our customers’ data is more precious than ours

The data that passes through Cloudflare’s network is private and often very personal. Just think of your web browsing or app use. So we take great care of it.‌‌

We’re handling that data on behalf of our customers. They are trusting us to handle it with care and so we think of it as more precious than our own internal data.‌‌

Of course, we secure both because the security of one is related to the security of the other. But it’s worth thinking about the data you have that, in a way, belongs to your customer and is only in your care.‌‌‌‌

Finally‌‌

I hope this talk has been useful. I’ve tried to give you a sense of how Cloudflare thinks about security and operates. We don’t claim to be the ultimate geniuses of security and would love to hear your thoughts, ideas and experiences so we can improve.‌‌

Security is not static and requires constant attention and part of that attention is listening to what’s worked for others.‌‌

Thank you.

Re:Inforce 2019 wrap-up and session links

Post Syndicated from Becca Crockett original https://aws.amazon.com/blogs/security/reinforce-2019-wrap-up-and-session-links/

re:Inforce conference

A big thank you to the attendees of the inaugural AWS re:Inforce conference for two successful days of cloud security learning. As you head home and look toward next steps for your organization (or if you weren’t able to attend and want to know what all the fuss was about), check out some of the session videos. You can watch the keynote to hear from our AWS CISO Steve Schmidt, view the full list of recorded conference sessions on the AWS YouTube channel, or check out popular sessions by track below.

Re:Inforce leadership sessions

Listen to cloud security leaders talk about key concepts from each track:

Popular sessions by track

View sessions that you might have missed or want to re-watch. (“Popular” determined by number of video views at the time this post was published.)

Security Deep Dive

View the full list of Security Deep Dive break-out sessions.

The Foundation

View the full list of The Foundation break-out sessions.

Governance, Risk & Compliance

View the full list of Governance, Risk & Compliance break-out sessions.

Security Pioneers

View the full list of Security Pioneers break-out sessions.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Deeper Connection with the Local Tech Community in India

Post Syndicated from Tingting (Teresa) Huang original https://blog.cloudflare.com/deeper-connection-with-the-local-tech-community-in-india/

Deeper Connection with the Local Tech Community in India

On June 6th 2019, Cloudflare hosted the first ever customer event in a beautiful and green district of Bangalore, India. More than 60 people, including executives, developers, engineers, and even university students, have attended the half day forum.

The forum kicked off with a series of presentations on the current DDoS landscape, cyber security trends, serverless computing, and Cloudflare Workers. Trey Quinn, Cloudflare Global Head of Solution Engineering, gave a brief introduction on the evolution of edge computing.

We also invited business and thought leaders across various industries to share their insights and best practices on cyber security and performance strategy. Some of the keynote and panel sessions included live demos from our customers.

At this event, guests gained first-hand knowledge of the latest technology. They also learned some insider tactics that will help them protect their business, accelerate performance, and identify quick wins in a complex internet environment.

To conclude the event, we arranged a dinner for the guests to network and enjoy a cool summer night.

Through this event, Cloudflare has strengthened its connection with the local tech community. The event’s success owes much to Cloudflare’s constant improvement and the continuous support of our customers in India.

As the old saying goes, भारत महान है (India is great). India is such an important market in the region. Cloudflare will continue to increase its investment and engagement to provide better services and a better user experience for customers in India.

Join Cloudflare & Moz at our next meetup, Serverless in Seattle!

Post Syndicated from Giuliana DeAngelis original https://blog.cloudflare.com/join-cloudflare-moz-at-our-next-meetup-serverless-in-seattle/

Photo by oakie / Unsplash

Cloudflare is organizing a meetup in Seattle on Tuesday, June 25th and we hope you can join. We’ll be bringing together members of the developers community and Cloudflare users for an evening of discussion about serverless compute and the infinite number of use cases for deploying code at the edge.

To kick things off, our guest speaker Devin Ellis will share how Moz uses Cloudflare Workers to reduce time to first byte 30-70% by caching dynamic content at the edge. Kirk Schwenkler, Solutions Engineering Lead at Cloudflare, will facilitate this discussion and share his perspective on how to grow and secure businesses at scale.

Next up, Developer Advocate Kristian Freeman will take you through a live demo of Workers and highlight new features of the platform. This will be an interactive session where you can try out Workers for free and develop your own applications using our new command-line tool.

Food and drinks will be served til close so grab your laptop and a friend and come on by!

View Event Details & Register Here

Agenda:

  • 5:00 pm Doors open, food and drinks
  • 5:30 pm Customer use case by Devin and Kirk
  • 6:00 pm Workers deep dive with Kristian
  • 6:30 – 8:30 pm Networking, food and drinks

How to sign up for a Leadership Session at re:Inforce 2019

Post Syndicated from Ashley Nelson original https://aws.amazon.com/blogs/security/how-to-sign-up-for-a-leadership-session-at-reinforce-2019/

The first annual re:Inforce conference is one week away and with two full days of security, identity, and compliance learning ahead, I’m looking forward to the community building opportunities (such as Capture the Flag) and the hundreds of sessions that dive deep into how AWS services can help keep businesses secure in the cloud. The track offerings are built around four main topics (Governance, Risk & Compliance; Security Deep Dive; Security Pioneers; and The Foundation) and to help highlight each track, AWS security experts will headline four Leadership Sessions that cover the overall track structure and key takeaways from the conference.

Join one—or all—of these Leadership Sessions to hear AWS security experts discuss top cloud security trends. But I recommend reserving your spot now – seating is limited for these sessions. (See below for instructions on how to reserve a seat.)

Leadership Sessions at re:Inforce 2019

When you attend a Leadership Session, you’ll learn about AWS services and solutions from the folks who are responsible for them end-to-end. These hour-long sessions are presented by AWS security leads who are experts in their fields. The sessions also provide overall strategy and best practices for safeguarding your environments. See below for the list of Leadership Sessions offered at re:Inforce 2019.

Leadership Session: Security Deep Dive

Tuesday, Jun 25, 12:00 PM – 1:00 PM
Speakers: Bill Reid (Sr Mgr, Security and Platform – AWS); Bill Shinn (Sr Principal, Office of the CISO – AWS)

In this session, Bill Reid, Senior Manager of Security Solutions Architects, and Bill Shinn, Senior Principal in the Office of the CISO, walk attendees through the ways in which security leadership and security best practices have evolved, with an emphasis on advanced tooling and features. Both speakers have provided frontline support on complex security and compliance questions posed by AWS customers; join them in this master class in cloud strategy and tactics.

Leadership Session: Foundational Security

Tuesday, Jun 25, 3:15 PM – 4:15 PM
Speakers: Don “Beetle” Bailey (Sr Principal Security Engineer – AWS); Rohit Gupta (Global Segment Leader, Security – AWS); Philip “Fitz” Fitzsimons (Lead, Well-Architected – AWS); Corey Quinn (Cloud Economist – The Duckbill Group)

Senior Principal Security Engineer Don “Beetle” Bailey and Corey Quinn from the highly acclaimed “Last Week in AWS” newsletter present best practices, features, and security updates you may have missed in the AWS Cloud. With more than 1,000 service updates per year being released, having expert distillation of what’s relevant to your environment can accelerate your adoption of the cloud. As techniques for operationalizing cloud security, compliance, and identity remain a critical business need, this leadership session considers a strategic path forward for all levels of enterprises and users, from beginner to advanced.

Leadership Session: Aspirational Security

Wednesday, Jun 26, 11:45 AM – 12:45 PM
Speaker: Eric Brandwine (VP/Distinguished Engineer – AWS)

How does the cloud foster innovation? Join Vice President and Distinguished Engineer Eric Brandwine as he details why there is no better time than now to be a pioneer in the AWS Cloud, discussing the changes that next-gen technologies such as quantum computing, machine learning, serverless, and IoT are expected to make to the digital and physical spaces over the next decade. Organizations within the large AWS customer base can take advantage of security features that would have been inaccessible even five years ago; Eric discusses customer use cases along with simple ways in which customers can realize tangible benefits around topics previously considered mere buzzwords.

Leadership Session: Governance, Risk, and Compliance

Wednesday, Jun 26, 2:45 PM – 3:45 PM
Speakers: Chad Woolf (VP of Security – AWS); Rima Tanash (Security Engineer – AWS); Hart Rossman (Dir, Global Security Practice – AWS)

Vice President of Security Chad Woolf, Director of Global Security Practice Hart Rossman, and Security Engineer Rima Tanash explain how governance functionality can help ensure consistency in your compliance program. Some specific services covered are Amazon GuardDuty, AWS Config, AWS CloudTrail, Amazon CloudWatch, Amazon Macie, and AWS Security Hub. The speakers also discuss how customers leverage these services in conjunction with each other. Additional attention is paid to the concept of “elevated assurance,” including how it may transform the audit industry going forward. Finally, the speakers discuss how AWS secures its own environment, as well as talk about the control frameworks of specific compliance regulations.

How to reserve a seat

Unlike the Keynote session delivered by AWS CISO Steve Schmidt, Leadership Sessions require a reserved seat to guarantee entrance. Seats are limited, so put down that coffee, pause your podcast, and follow these steps to secure your spot.

  1. Log into the re:Inforce Session Catalog with your registration credentials. (Not registered yet? Head to the Registration page and sign up.)
  2. Select Event Catalog from the Dashboard.
  3. Enter “Leadership Session” in the Keyword Search box and check the “Exact Match” box to filter your results.
  4. Select the Scheduling Options dropdown to view the date and location of the session.
  5. Select the plus mark to add it to your schedule.

And that’s it! Your seat is now reserved. While you’re at it, check out the other available sessions, chalk talks, workshops, builders sessions, and security jams taking place during the event. You can customize your schedule to focus on security topics most relevant to your role, or take the opportunity to explore something new. The session catalog is subject to change, so be sure to check back to see what’s been added. And if you have any questions, email the re:Inforce team at [email protected].

Hope to see you there!

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Ashley Nelson

Ashley is a Content Manager within AWS Security. Ashley oversees both print and digital content, and has over six years of experience in editorial and project management roles. Originally from Boston, Ashley attended Lesley University where she earned her degree in English Literature with a minor in Psychology. Ashley is passionate about books, food, video games, and Oxford Commas.

Definitely not an AWS Security Profile: Corey Quinn, a “Cloud Economist” who doesn’t work here

Post Syndicated from Becca Crockett original https://aws.amazon.com/blogs/security/definitely-not-an-aws-security-profile-corey-quinn-a-cloud-economist-who-doesnt-work-here/

platypus scowling beside cloud

In the weeks leading up to re:Inforce, we’ll share conversations we’ve had with people who will be presenting at the event so you can learn more about them and some of the interesting work that they’re doing.


You don’t work at AWS, but you do have deep experience with AWS Services. Can you talk about how you developed that experience and the work that you do as a “Cloud Economist?”

I see those sarcastic scare-quotes!

I’ve been using AWS for about a decade in a variety of environments. It sounds facile, but it turns out that being kinda good at something starts with being abjectly awful at it first. Once you break things enough times, you start to learn how to wield them in more constructive ways.

I have a background in SRE-style work and finance. Blending those together into a made-up thing called “Cloud Economics” made sense and focused on a business problem that I can help solve. It starts with finding low-effort cost savings opportunities in customer accounts and quickly transitions into building out costing predictions, allocating spend—and (aligned with security!) building out workable models of cloud governance that don’t get in an engineer’s way.

This all required me to be both broad and deep across AWS’s offerings. Somewhere along the way, I became something of a go-to resource for the community. I don’t pretend to understand how it happened, but I’m incredibly grateful for the faith the broader community has placed in me.

You’re known for your snarky newsletter. When you meet AWS employees, how do they tend to react to you?

This may surprise you, but the most common answer by far is that they have no idea who I am.

It turns out AWS employs an awful lot of people, most of whom have better things to do than suffer my weekly snarky slings and arrows.

Among folks who do know who I am, the response has been nearly universal appreciation. It seems that the newsletter is received in the spirit in which I intend it—namely, that 90–95% of what AWS does is awesome. The gap between that and perfection offers boundless opportunities for constructive feedback—and also hilarity.

The funniest reaction I ever got was when someone at a Summit registration booth saw “Last Week in AWS” on my badge and assumed I was an employee serving out the end of his notice period.

“Senior RageQuit Engineer” at your service, I suppose.

You’ve been invited to present during the Leadership Session for the re:Inforce Foundation Track with Beetle. What have you got planned?

Ideally not leaving folks asking incredibly pointed questions about how the speaker selection process was mismanaged! If all goes well, I plan on being able to finish my talk without being dragged off the stage by AWS security!

I kid. But my theory of adult education revolves around needing to grab people’s attention before you can teach them something. For better or worse, my method for doing that has always been humor. While I’m cognizant that messaging to a large audience of security folks requires a delicate touch, I don’t subscribe to the idea that you can’t have fun with it as well.

In short: if nothing else, it’ll be entertaining!

What’s one thing that everyone should stop reading and go do RIGHT NOW to improve their security posture?

Easy. Log into the console of your organization’s master account and enable AWS CloudTrail for all regions and all accounts in your organization. Direct that trail to a locked-down S3 bucket in a completely separate, highly restricted account, and you’ve got a forensic log of all management operations across your estate.

Worst case, you’ll thank me later. Best case, you’ll never need it.
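For readers who would rather script this than click through the console, here is a hedged boto3 sketch run from the organization’s master (management) account. The trail and bucket names are placeholders, and it assumes the destination bucket already exists in a separate, locked-down account with a bucket policy that allows CloudTrail to write to it.

```python
import boto3

cloudtrail = boto3.client("cloudtrail")

# Placeholder names -- the bucket should live in a separate, highly
# restricted account, not in the account being logged.
TRAIL_NAME = "org-forensic-trail"
LOG_BUCKET = "example-restricted-audit-logs"

cloudtrail.create_trail(
    Name=TRAIL_NAME,
    S3BucketName=LOG_BUCKET,
    IsMultiRegionTrail=True,       # all regions
    IsOrganizationTrail=True,      # all accounts in the organization
    EnableLogFileValidation=True,  # tamper-evident digest files
)

# create_trail doesn't start recording on its own.
cloudtrail.start_logging(Name=TRAIL_NAME)
print(f"Management events are now logging to s3://{LOG_BUCKET}")
```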

It’s important, so what’s another security thing everyone should do?

Log in to your AWS accounts right now and update your security contact to your ops folks. It’s not used for marketing; it’s a point of contact for important announcements.

If you’re like many rapid-growth startups, your account is probably pointing to your founder’s personal email address—which means critical account notices are getting lost among Amazon.com sock purchase receipts.

That is not what being “SOC-compliant” means.

From a security perspective, what recent AWS release are you most excited about?

It was largely unheralded, but I was thrilled to see AWS Systems Manager Parameter Store (it’s a great service, though the name could use some work) receive higher API rate limits; it went from 40 to 1,000 requests per second.

This is great for concurrent workloads and makes it likelier that people will manage secrets properly without having to roll their own.

Yes, I know that AWS Secrets Manager is designed around secrets, but KMS-encrypted parameters in Parameter Store also get the job done. If you keep pushing I’ll go back to using Amazon Route 53 TXT records as my secrets database… (Just kidding. Please don’t do this.)
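To show what that looks like in practice, here is a minimal boto3 sketch of writing and reading a KMS-encrypted parameter. The parameter name and value are made up, and without an explicit KeyId it falls back to the account’s default SSM key.

```python
import boto3

ssm = boto3.client("ssm")

# Hypothetical secret -- SecureString values are encrypted with AWS KMS.
ssm.put_parameter(
    Name="/example/app/db-password",
    Value="correct-horse-battery-staple",
    Type="SecureString",
    Overwrite=True,
)

# WithDecryption=True asks Parameter Store to decrypt via KMS on the way out.
response = ssm.get_parameter(
    Name="/example/app/db-password",
    WithDecryption=True,
)
secret = response["Parameter"]["Value"]
```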

In your opinion, what’s the biggest challenge facing cloud security right now?

The same thing that’s always been the biggest challenge in security: getting people to care before a disaster happens.

We see the same thing in cloud economics. People care about monitoring and controlling cloud spend right after they weren’t being diligent and wound up with an unpleasant surprise.

Thankfully, with an unexpectedly large bill, you have a number of options. But you don’t get a do-over with a data breach.

The time to care is now—particularly if you don’t think it’s a focus area for you. One thing that excites me about re:Inforce is that it gives an opportunity to reinforce that viewpoint.

Five years from now, what changes do you think we’ll see across the cloud security landscape?

I think we’re already seeing it now. With the advent of things like AWS Security Hub and AWS Control Tower (both currently in preview), security is moving up the stack.

Instead of having to keep track of implementing a bunch of seemingly unrelated tooling and rulesets, higher-level offerings are taking a lot of the error-prone guesswork out of maintaining an effective security posture.

Customers aren’t going to magically reprioritize security on their own. So it’s imperative that AWS continue to strive to meet them where they are.

What are the comparative advantages of being a cloud economist vs. a platypus keeper?

They’re more alike than you might expect. The cloud has sharp edges, but platypodes are venomous.

Of course, large bills are a given in either space.

You sometimes rename or reimagine AWS services. How should the Security Blog rebrand itself?

I think the Security Blog suffers from a common challenge in this space.

It talks about AWS’s security features, releases, and enhancements—that’s great! But who actually identifies as its target market?

Ideally, everyone should; security is everyone’s job, after all.

Unfortunately, no matter what user persona you envision, a majority of the content on the blog isn’t written for that user. This potentially makes it less likely that folks read the important posts that apply to their use cases, which, in turn, reinforces the false narrative that cloud security is both impossibly hard and should be someone else’s job entirely.

Ultimately, I’d like to see it split into different blogs that emphasize CISOs, engineers, and business tracks. It could possibly include an emergency “this is freaking important” feed.

And as to renaming it, here you go: you’d be doing a great disservice to your customers should you name it anything other than “AWS Klaxon.”

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Corey Quinn

Corey is the Cloud Economist at the Duckbill Group. Corey specializes in helping companies fix their AWS bills by making them smaller and less horrifying. He also hosts the AWS Morning Brief and Screaming in the Cloud podcasts and curates Last Week in AWS, a weekly newsletter summarizing the latest in AWS news, blogs, and tools, sprinkled with snark.

AWS Security Profiles: Fritz Kunstler, Principal Consultant, Global Financial Services

Post Syndicated from Becca Crockett original https://aws.amazon.com/blogs/security/aws-security-profiles-fritz-kunstler-principal-consultant-global-financial-services/


In the weeks leading up to re:Inforce, we’ll share conversations we’ve had with people at AWS who will be presenting at the event so you can learn more about them and some of the interesting work that they’re doing.


How long have you been at AWS, and what do you do in your current role?

I’ve been here for three years. My job is Security Transformation, which is a technical role in AWS Professional Services. It’s a fancy way of saying that I help customers build the confidence and technical capability to run their most sensitive workloads in the AWS Cloud. Much of my work lives at the intersection of DevOps and information security.

Broadly, how does the role of Consultant differ from positions like “Solutions Architect”?

Depth of engagement is one of the main differences. On many customer engagements, I’m involved for three months, or six months, or nine months. I have one customer now that I’ve been working with for more than a year. Consultants are also more integrated—I’m often embedded in the customer’s team, working side-by-side with their employees, which helps me learn about their culture and needs.

What’s your favorite part of your job?

There’s a lot I like about working at Amazon, but a couple of things stand out. First, the people I work with. Amazon culture—and the people who comprise that culture—are amazing. I’m constantly interacting with really smart people who are willing to go out of their way to make good things happen for customers. At companies I’ve worked for in the past, I’ve encountered individuals like this. But being surrounded by so many people who behave like this day in and day out is something special.

The customers that we have the privilege of working with at AWS also represent some very large brands. They serve many, many consumers all over the world. When I help these customers achieve their security and privacy goals, I’m doing something that has an impact on the world at large. I’ve worked in tech my entire career, in roles ranging from executive to coder, but I’ve never had a job that lets me make such a broad impact before. It’s really cool.

What does cloud security mean to you, personally?

I work in Global Financial Services, so my customers are the world’s biggest banks, investment firms, and independent software vendors. These are companies that we all rely on every day, and they put enormous effort into protecting their customers’ data and finances. As I work to support their efforts, I think about it in terms of my wife, kids, parents, siblings—really, my entire extended family. I’m working to protect us, to ensure that the online world we live in is a safer one.

In your opinion, what’s the biggest cloud security challenge facing the Financial Services industry right now?

How to transform the way they do security. It’s not only a technical challenge—it’s a human challenge. For financial services customers to get the most value out of the cloud, a lot of people need to be willing to change their minds.

Highly regulated customers like financial services firms tend to have sophisticated security organizations already in place. They’ve been doing things effectively in a particular way for quite a while. It takes a lot of evidence to convince them to change their processes—and to convince them that those changes can drive increased value and performance while reducing risk. Security leaders tend to be a skeptical lot, and that has its place, but I think that we should strive to always be the most optimistic people in the room. The cloud lets people experiment with big ideas that may lead to big innovation, and security needs to enable that. If the security leader in the room is always saying no, then who’s going to say yes? That’s the essence of security transformation – developing capabilities that enable your organization to say yes.

What’s a trend you see currently happening in the Financial Services space that you’re excited about?

AWS has been working hard alongside some of our financial services customers for several years. Moving to the cloud is a big transition, and there’s been some FUD—some fear, uncertainty, and doubt—to work through, so not everyone has been able to adopt the cloud as quickly as they might’ve liked. But I feel we’re approaching an inflection point. I’m seeing increasing comfort, increasing awareness, and an increasingly trained workforce among my customers.

These changes, in conjunction with executive recognition that “the cloud” is not only worthwhile, but strategically significant to the business, may signal that we’re close to a breakthrough. These are firms that have the resources to make things happen when they’re ready. I’m optimistic that even the more conservative of our financial services customers will soon be taking advantage of AWS in a big way.

Five years from now, what changes do you think we’ll see across the Financial Services/Cloud Security landscape?

I think cloud adoption will continue to accelerate on the business side. I also expect to see the security orgs within these firms leverage the cloud more for their own workloads – in particular, to integrate AI and machine learning into security operations, and to push security further left in the systems development lifecycle. Security teams still do a lot of manual work to analyze code, policies, logs, and so on. This is critical stuff, but it’s also very time-consuming and much of it is ripe for automation. Skilled security practitioners are in high demand. They should be focused on high-value tasks that enable the business. Amazon GuardDuty is just one example of how security teams can use the cloud toward that end.

What’s one thing that people outside of Financial Services can learn from what’s happening in this industry?

As more and more Financial Services customers adopt AWS, I think that it becomes increasingly hard for leaders in other sectors to suggest that the cloud isn’t secure, reliable, or capable enough for any given use case. I love the quote from Capital One’s CIO about why they chose AWS.

You’re leading a re:Inforce session that focuses on “IAM strategy for financial services.” What are some of the unique considerations that the financial services industry faces when it comes to IAM?

Financial services firms and other highly regulated customers tend to invest much more into tools and processes to enforce least privilege and separation of duties, due to regulatory and compliance requirements. Traditional, centralized approaches to implementing those two principles don’t always work well in the cloud, where resources can be ephemeral. If your goal is to enable builders to experiment and fail fast, then it shouldn’t take weeks to get the approvals and access required for a proof of concept that can be built in two days.

AWS Identity and Access Management (IAM) capabilities have changed significantly in the past year. Those changes make it easier and safer than ever to do things like delegate administrative access to developers. But they aren’t the sort of high-profile announcement that you’d hear a keynote speaker talk about at re:Invent. So I think a lot of customers aren’t fully aware of them, or of what you can accomplish by combining them with automation and CI/CD techniques.

My talk will offer a strategy and examples for using those capabilities to provide the same level of security—if not a better level of security—without so many of the human reviews and approvals that often become bottlenecks.

What are you hoping that your audience will do differently as a result of attending your session?

I’d like them to investigate and holistically implement the handful of IAM capabilities that we’ll discuss during the session. I also hope that they’ll start working to delegate IAM responsibilities to developers and automate low-value human reviews of policy code. Finally, I think it’s critical to have CI/CD or other capabilities that enable rapid, reliable delivery of updates to IAM policies across many AWS accounts.

Can you talk about some of the recent enhancements to IAM that you’re excited about?

Permissions boundaries and IAM resource tagging are two features that are really powerful and that I don’t see widely used today. In some cases, customers may not even be aware of them. Another powerful and even more recent development is the introduction of conditional support to the service control policy mechanism provided by AWS Organizations.
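
Not from the interview itself, but as a minimal sketch of how a couple of those features can fit together (a permissions boundary plus resource tags, applied with boto3 so the whole flow can run from a CI/CD pipeline), the example below creates a developer workload role whose effective permissions can never exceed the boundary. All names and the specific permissions are hypothetical.

```python
# Hypothetical sketch: delegate role creation to developers while a permissions
# boundary caps what any role they create can ever do. Names are illustrative.
import json

import boto3

iam = boto3.client("iam")

# The boundary: the ceiling for developer-created roles in this account.
boundary_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:*", "dynamodb:*", "logs:*"],
            "Resource": "*",
        }
    ],
}

boundary = iam.create_policy(
    PolicyName="DeveloperPermissionsBoundary",  # hypothetical name
    PolicyDocument=json.dumps(boundary_document),
)

# Trust policy for a workload role a developer might create for a Lambda function.
trust_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# The role's effective permissions are the intersection of whatever policies are
# later attached to it and the boundary above, so delegation stays inside guardrails.
iam.create_role(
    RoleName="payments-poc-role",  # hypothetical name
    AssumeRolePolicyDocument=json.dumps(trust_document),
    PermissionsBoundary=boundary["Policy"]["Arn"],
    Tags=[{"Key": "team", "Value": "payments"}],  # tags can feed ABAC or SCP conditions
)
```

Because everything here is plain API calls and policy documents, the same steps can be code-reviewed and rolled out across many accounts by a pipeline instead of waiting in a manual approval queue.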

You’re an avid photographer: What’s appealing to you about photography? What’s your favorite photo you’ve ever taken?

I’ve always struggled to express myself artistically. I take a very technical, analytical approach to life. I started programming computers when I was six. That’s how I think. Photography is sufficiently technical for me to wrap my brain around, which is how I got started. It took me a long time to begin to get comfortable with the creative aspects. But it fits well with my personality, while enabling expression that I’d never be able to find, say, as a painter.

I won’t claim to be an amazing photographer, but I’ve managed a few really good shots. The photo that comes to mind is one I captured in Bora Bora. There was a guy swimming through a picturesque, sheltered part of the ocean, where a reef stopped the big waves from coming in. This swimmer was towing a surfboard with his dog standing on it, and the sun was going down in the background. The colors were so vibrant it felt like a Disneyland attraction, and from a distance, you could just see a dog on a surfboard. Everything about that moment – where I was, how I was feeling, how surreal it all was, and the fact that I was on a honeymoon with my wife – made for a poignant photo.

The AWS Security team is hiring! Want to find out more? Check out our career page.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Fritz Kunstler

Fritz is a Principal Consultant in AWS Professional Services, specializing in security. His first computer was a Commodore 64, which he learned to program in BASIC from the back of a magazine. Fritz has spent more than 20 years working in tech and has been an AWS customer since 2008. He is an avid photographer and is always one batch away from baking the perfect chocolate chip cookie.

AWS Security Profiles: Matthew Campagna, Sr. Principal Security Engineer, Cryptography

Post Syndicated from Becca Crockett original https://aws.amazon.com/blogs/security/aws-security-profiles-matthew-campagna-sr-principal-security-engineer-cryptography/

In the weeks leading up to re:Inforce, we’ll share conversations we’ve had with people at AWS who will be presenting at the event so you can learn more about them and some of the interesting work that they’re doing.


How long have you been at AWS, and what do you do in your current role?

I’ve been with AWS for almost 6 years. I joined as a Principal Security Engineer, but my focus has always been cryptography. I’m a cryptographer. At the start of my Amazon career, I worked on designing our AWS Key Management Service (KMS). Since then, I’ve gotten involved in other projects—working alongside a group of volunteers in the AWS Cryptography Bar Raisers group.

Today, the Crypto Bar Raisers are a dedicated portion of my team that work with any AWS team who’s designed a novel application of cryptography. The underlying cryptographic mechanisms aren’t novel, but the engineer has figured out how to apply them in a non-standard way, often to solve a specific problem for a customer. We provide these AWS employees with a deep analysis of their applications to ensure that the applications meet our high cryptographic security bar.

How do you explain your job to non-tech friends?

I usually tell people that I’m a mathematician. Sometimes I’ll explain that I’m a cryptographer. If anyone wants detail beyond that, I say I design security protocols or application uses of cryptography.

What’s the most challenging part of your job?

I’m convinced the most challenging part of any job is managing email.

Apart from that, within AWS there’s lots of demand for making sure we’re doing security right. The people who want us to review their projects come to us via many channels. They might already be aware of the Crypto Bar Raisers, and they want our advice. Or, one of our internal AWS teams—often, one of the teams who perform security reviews of our services—will alert the project owner that they’ve deviated from the normal crypto engineering path, and the team will wind up working with us. Our requests tend to come from smart, enthusiastic engineers who are trying to deliver customer value as fast as possible. Our ability to attract smart, enthusiastic engineers has served us quite well as a company. Our engineering strength lies in our ability to rapidly design, develop, and deploy features for our customers.

The challenge of this approach is that it’s not the fastest way to achieve a secure system. That is, you might end up designing things before you can demonstrate that they’re secure. Cryptographers design in the opposite way: We consider “ability to demonstrate security” in advance, as a design consideration. This approach can seem unusual to a team that has already designed something—they’re eager to build the thing and get it out the door. There’s a healthy tension between the need to deliver the right level of security and the need to deliver solutions as quickly as possible. It can make our day-to-day work challenging, but the end result tends to be better for customers.

Amazon’s s2n implementation of the Transport Layer Security protocol was a pretty big deal when it was announced in 2015. Can you summarize why it was a big deal, and how you were involved?

It was a big deal, and it was a big decision for AWS to take ownership of the TLS libraries that we use. The decision was predicated on the belief we could do a better job than other open source TLS packages by providing a smaller, simpler—and inherently more secure—version of TLS that would raise the security bar for us and for our customers.

To do this, the Automated Reasoning Group demonstrated the formal correctness of the code to meet the TLS specification. For the most part, my involvement in the initial release was limited to scenarios where the Amazon contributors did their own cryptographic implementations within TLS (that is, within the existing s2n library), which was essentially like any other Crypto Bar Raiser review for me.

Currently, my team and I are working on additional developments to s2n—we’re deploying something called “quantum-safe cryptography” into it.

You’re leading a session at re:Inforce that provides “an introduction to post-quantum cryptography.” How do you explain post-quantum cryptography to a beginner?

Post-quantum cryptography, or quantum-safe cryptography, refers to cryptographic techniques that remain secure even against the power of a large-scale quantum computer.

A quantum computer would be fundamentally different from the computers we use today. Today, we build cryptographic systems based on certain mathematical assumptions—that certain ciphers cannot be cracked without an immense, almost impossible amount of computing power. In particular, a basic assumption that cryptographers build upon today is that the discrete log problem, or integer factorization, is hard. We take it for granted that this type of problem is fundamentally difficult to solve. It’s not a task that can be completed quickly or easily.

Well, it turns out that if you had the computing power of a large-scale quantum computer, those assumptions would be incorrect. If you could figure out how to build a quantum computer, it could unravel the security aspects of the TLS sessions we create today, which are built upon those assumptions.

The reason that we take this “if” so seriously is that, as a company, we have data that we know we want to keep secure. The probability of such a quantum computer coming into existence continues to rise. Eventually, the probability that a quantum computer exists during the lifetime of the sensitivity of the data we are protecting will rise above the risk threshold that we’re willing to accept.

It can take 10 to 15 years for the cryptographic community to study new algorithms well enough to have faith in the core assumptions about how they work. Additionally, it takes time to establish new standards and build high quality and certified implementations of these algorithms, so we’re investing now.

I research post-quantum cryptographic techniques, which means that I’m basically looking for quantum-safe techniques that can be designed to run on the classical computers that we use now. Identifying these techniques lets us implement quantum-safe security well in advance of a quantum computer. We’ll remain secure even if someone figures out how to create one.

We aren’t doing this alone. We’re working within the larger cryptographic community and participating in the NIST Post-Quantum Cryptography Standardization process.

What do you hope that people will do differently as a result of attending your re:Inforce session?

First, I hope people download and use s2n in any form. s2n is a nice, simple Transport Layer Security (TLS) implementation that reduces overall risk for people who are currently using TLS.

In addition, I’d encourage engineers to try the post-quantum version of s2n and see how their applications work with it. Post-quantum cryptographic schemes are different. They have a slightly different “shape,” or usage. They either take up more bandwidth, which will change your application’s latency and bandwidth use, or they require more computational power, which will affect battery life and latency.

It’s good to understand how this increase in bandwidth, latency, and power consumption will impact your application and your user experience. This lets you make proactive choices, like reducing the frequency of full TLS handshakes that your application has to complete, or whatever the equivalent would be for the security protocol that you’re currently using.
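
As a small, hypothetical illustration of establishing that kind of baseline, the sketch below times full TLS handshakes using Python’s standard ssl module rather than s2n itself; the endpoint name and trial count are illustrative. Running the same measurement against an endpoint that negotiates a post-quantum (hybrid) key exchange would show how much latency the larger handshake messages add.

```python
# Hypothetical baseline measurement: how long does a full TLS handshake take today?
import socket
import ssl
import statistics
import time


def tls_handshake_times(host, port=443, trials=20):
    """Return a list of TLS handshake durations in seconds (TCP connect excluded)."""
    ctx = ssl.create_default_context()
    samples = []
    for _ in range(trials):
        with socket.create_connection((host, port), timeout=5) as sock:
            tls = ctx.wrap_socket(sock, server_hostname=host,
                                  do_handshake_on_connect=False)
            start = time.perf_counter()
            tls.do_handshake()  # the full handshake happens here
            samples.append(time.perf_counter() - start)
            tls.close()
    return samples


if __name__ == "__main__":
    times = tls_handshake_times("example.com")  # illustrative endpoint
    print(f"median TLS handshake: {statistics.median(times) * 1000:.1f} ms")
```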

What implications do post-quantum s2n developments have for the field of cloud security as a whole?

My team is working in the public domain as much as possible. We want to raise the cryptography bar not just for AWS, but for everyone. In addition to the post-quantum extension to s2n that we’re writing, we’re writing specifications. This means that any interested party can inspect and analyze precisely how we’re doing things. If they want to understand nuances of TLS 1.2 or 1.3, they can look at those specifications, and see how these post-quantum extensions apply to those standards.

We hope that documenting our work in the public space, where others can build interoperable systems, will raise the bar for all cloud providers, so that everyone is building upon a more secure foundation.

What resources would you recommend to someone interested in learning more about s2n or post-quantum cryptography?

For s2n, we do a lot of our communication through Security Blog posts. There’s also the AWS GitHub repository, which houses our source code. It’s available to anyone who wants to look at it, use it, or become a contributor. Any issues that arise are captured in issue pages there.

For quantum-safe crypto, a fairly influential paper was released in 2015. It’s the European Telecommunications Standards Institute’s Quantum-Safe Whitepaper (PDF file). It provides a gentle introduction to quantum computing and the impact it has on information systems that we’re using today to secure our information. It sets forth all of the reasons we need to invest now. It helped spur a shift in thinking about post-quantum encryption, from “research project” to “business need.”

There are certainly resources that allow you to go a lot deeper. There’s a highly technical conference called PQ Crypto that’s geared toward cryptographers and focuses on post-quantum crypto. For resources ranging from executive to developer level, there’s a quantum-safe cryptography workshop organized every year by the Institute for Quantum Computing at the University of Waterloo (IQC) and the European Telecommunications Standards Institute (ETSI). AWS is partnering with ETSI/IQC to host the 2019 workshop in Seattle.

What’s one fact about cryptography that you think everyone—even laypeople—should be aware of?

People sometimes speak about cryptography like it’s a fact or a mathematical science. And it’s not, precisely. Cryptography doesn’t guarantee outcomes. It deals with probabilities based upon core assumptions. Cryptographic engineering requires you to understand what those assumptions are and closely monitor any challenges to them.

In the business world, if you want to keep something secret or confidential, you need to be able to express the probability that the cryptographic method fails to provide the desired security property. Understanding this probability is how businesses evaluate risk when they’re building out a new capability. Cryptography can enable new capabilities that might otherwise represent too high a risk. For instance, public-key cryptography and certificate authorities enabled the development of the Secure Socket Layer (SSL) protocol, and this unlocked e-Commerce, making it possible for companies to authenticate to end users, and for end users to engage in a confidential session to conduct business transactions with very little risk. So at the end of the day, I think of cryptography as essentially a tool to reduce the risk of creating new capabilities, especially for business.

Anything else?

Don’t think of cryptography as a guarantee. Think about it as a probability that’s tied to how often you use the cryptographic method.

You have confidentiality if you use the system based on an assumption that you can understand, like “this cryptographic primitive (or block cipher) is a pseudo-random permutation.” Then, if you encrypt 2^32 messages, the probability that all your data stays secure (confidential or authentic) is, let’s say, 1 - 2^-72 (that is, a failure probability of about 2^-72). Those numbers are where people’s eyes may start to gloss over when they hear them, but most engineers can process that information if it’s written down. And people should be expecting that from their solutions.
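
As a rough illustration of how such numbers compose, a simple union bound turns a per-message failure bound into an aggregate one; the per-message figure below is hypothetical, chosen only so the arithmetic matches the numbers above.

```latex
% Union bound over q encrypted messages, each failing with probability at most p:
%   Pr[any message loses confidentiality or authenticity] <= q * p
% With q = 2^{32} messages and a hypothetical per-message bound p = 2^{-104}:
\Pr[\text{any failure}] \;\le\; q \cdot p \;=\; 2^{32} \cdot 2^{-104} \;=\; 2^{-72}
```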

Once you express it like that, I think it’s clear why we want to move to quantum-safe crypto. The probabilities we tolerate for cryptographic security are very small, typically smaller than 2^-32, around the order of one in four billion. We’re not willing to take much risk, and we don’t typically have to from our cryptographic constructions.

That’s especially true for a company like Amazon. We process billions of objects a day. Even if there’s a one-in-2^32 chance that some information is going to spill over, we can’t tolerate such a probability.

Most of cryptography wasn’t built with the cloud in mind. We’re seeing that type of cryptography develop now—for example, cryptographic computing models where you encrypt the data before you store it in the cloud, and you maintain the ability to do some computation on its encrypted form, and the plaintext never exists within the cloud provider’s systems. We’re also seeing core crypto primitives, like the Advanced Encryption Standard, which wasn’t designed for the cloud, begin to show some age. The massive use cases and sheer volume of things that we’re encrypting require us to develop new techniques, like the derived-key mode of AES-GCM that we use in AWS KMS.

What does cloud security mean to you, personally?

I’ll give you a roundabout answer. Before I joined Amazon, I’d been working on quantum-safe cryptography, and I’d been thinking about how to securely distribute an alternative cryptographic solution to the community. I was focused on whether this could be done by tying distribution into a user’s identity provider.

Now, we all have a trust relationship with some entity. For example, you have a trust relationship between yourself and your mobile phone company that creates a private, encrypted tunnel between the phone and your local carrier. You have a similar relationship with your cable or internet provider—a private connection between the modem and the internet provider.

When I looked around and asked myself who’d make a good identity provider, I found a lot of entities with conflicting interests. I saw few companies positioned to really deliver on the promise of next-generation cryptographic solutions, but Amazon was one of them, and that’s why I came to Amazon.

I don’t think I will provide the ultimate identity provider to the world. Instead, I’ve stayed to focus on providing Amazon customers the security they need, and I’m thrilled to be here because of the sheer volume of great cryptographic engineering problems that I get to see on a regular basis. More and more people have their data in a cloud. I have data in the cloud. I’m very motivated to continue my work in an environment where the security and privacy of customer data is taken so seriously.

You live in the Seattle area: When friends from out of town visit, what hidden gem do you take them to?

When friends visit, I bring them to the Amazon Spheres, which are really neat, and the MoPOP museum. For younger people, children, I take them on the Seattle Underground Tour. It has a little bit of a Harry Potter-like feel. Otherwise, the great outdoors! We spend a lot of time outside, hiking or biking.

The AWS Security team is hiring! Want to find out more? Check out our career page.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Matthew Campagna

Matthew is a Sr. Principal Engineer for Amazon Web Services’ Cryptography Group. He manages the design and review of cryptographic solutions across AWS. He is an affiliate of the Institute for Quantum Computing at the University of Waterloo, a member of the ETSI Security Algorithms Group Experts (SAGE), and a member of ETSI TC CYBER’s Quantum Safe Cryptography group. Previously, Matthew led the Certicom Research group at BlackBerry, managing cryptographic research, standards, and IP, and participated in various standards organizations, including ANSI, ZigBee, SECG, ETSI’s SAGE, and the 3GPP-SA3 working group. He holds a Ph.D. in mathematics (group theory) from Wesleyan University and a bachelor’s degree in mathematics from Fordham University.

Technology’s Promise – Highlights from DEF CON China 1.0

Post Syndicated from Claire Tsai original https://blog.cloudflare.com/technologys-promise-def-con-china-1-0-highlights/

DEF CON is one of the largest and oldest security conferences in the world. Last year, it launched a beta event in China in hopes of bringing the local security communities closer together. This year, the organizer made things official by introducing DEF CON China 1.0 with a promise to build a forum for China where everyone can gather, connect, and grow together.

Themed “Technology’s Promise”, DEF CON China kicked off on 5/30 in Beijing and attracted participants of all ages. Watching young participants test, play and tinker with new technologies with such curiosity and excitement absolutely warmed our hearts!

It was a pleasure to participate in DEF CON China 1.0 this year and connect with local communities. There was great synergy as we exchanged ideas and learnings on cybersecurity topics. Did I mention we also spoiled ourselves with the warm hospitality, wonderful food, live music, and amazing crowd while in Beijing?

Event Highlights: Cloudflare Team Meets with DEF CON China Visitors and Organizers (DEF CON Founder Jeff Moss and Baidu Security General Manager Jefferey Ma)


Youngest DEF CON China Participant Explores New Technologies on the Eve of International Children’s Day. (Source: Abhinav SP | #BugZee, DEFCON China)


The Iconic DEF CON Badge, Designed by Joe Grand, is a Flexible Printed Circuit Board that Lights up the Interactive “Tree of Promise”.


The Capture The Flag (CTF) Contest is a Continuation of One of the Oldest Contests at DEF CON Dating Back to DEF CON 4 in 1996.


Cloudflare’s Mission is to Help Build a Better Internet

Founded in 2009, Cloudflare is a global company with 180 data centers across 80 countries. Our Performance and Security Services work in conjunction to reduce the latency of websites, mobile applications, and APIs end-to-end, while protecting against DDoS attacks, abusive bots, and data breaches.

We are looking forward to growing our presence in the region and continuing to serve our customers, partners, and prospects. Sign up for a free account now for a faster and safer Internet experience: cloudflare.com/sign-up.

We’re Hiring

We are a team with global vision and local insight committed to building a better Internet. We are hiring in Beijing and globally. Check out the opportunities here: cloudflare.com/careers and join us at Cloudflare today!

The Cloudflare Team from Beijing, Singapore, and San Francisco