Tag Archives: SANS

Say Hello to the New Atlassian

Post Syndicated from Chris De Santis original https://www.anchor.com.au/blog/2017/09/hello-new-atlassian/

Who is Atlassian?

Atlassian is an Australian IT company that develops enterprise software, with its best-known products being its issue-tracking app, Jira, and team collaboration and wiki product, Confluence.

In December 2015, Atlassian went public, listing under the symbol TEAM in an initial public offering (IPO) that valued the company at $4.37 billion. In summary, they're big.

What happened?

A facelift

It’s a nice sunny day in Sydney in mid-September of 2017, and Atlassian, after 15 years of consistency, has rebranded, swapping its dreary previous look and feel for a brighter, more fun one.

New Atlassian Branding Video

It’s a hell of a lot simpler and, as they show in the above video, it’s going to be used with a lot more creativity and flair in mind: it’s flexible in the sense that they can use it in a lot more ways than before, with a lot more colours than before.

Atlassian Logo Comparison

The blues they’re using now work super-well with the logos on a white background, whereas the white logos on their new champion brand colour, blue, can go both ways: some will see it as a bold, daring step that is quite attractive, while others will find it off-putting and not very user-friendly.

New Atlassian Logo Versions

What’s it all mean?


In his announcement blog post, Atlassian Co-Founder & Co-CEO Mike Cannon-Brookes mentions that the branding change reflects their newly shifted focus on the concept of teamwork. He goes on to explain that their previous logo depicted the sky-holding Greek titan Atlas and symbolised legendary service and support. But, while that logo has become renowned, they’re shifting their focus to the concept of teamwork. Why focus on something you’ve already done right, right?

Atlassian Logo Evolution

The new logo contains more symbolism than meets the eye, as it can be interpreted as:

  • Two people high-fiving
  • A mountain to scale
  • The letter “A” (seen as two pillars reinforcing each other)

Product logos

Atlassian has created and acquired many products in their adventure so far, and they all seemed to have a similar art style, but something always felt off about their consistency. Well, needless to say, this was addressed with Atlassian’s very own “identity system”, which is a pretty cool term for a consistent logo-look for 14+ products, to fit them under one brand.

New Atlassian Product Logos

The result is a set of unique marks that “still feel very related to each other”. I, on the other hand, also see a new set of “unknown” Pokémon.


New Atlassian Typeface

To add a cherry on top, Atlassian will be using their own custom-made typeface called Charlie Sans, specifically designed to balance legibility with personality; that’s probably the best way to describe it. Otherwise, I’d say, purely as constructive criticism, that there isn’t much difference between it and any of the other staple fonts, e.g. Arial or Verdana. Then again, I’m not a professional designer.

It doesn’t look as distinct as their previous typeface, but, to be fair, it does look very slick next to the new product logos.


What do you think about it all?


Image credits: Atlassian

The post Say Hello to the New Atlassian appeared first on AWS Managed Services by Anchor.

A kindly lesson for you non-techies about encryption

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/06/a-kindly-lesson-for-you-non-techies.html

The following tweets need to be debunked:

The answer to John Schindler’s question is:

every expert in cryptography doesn’t know this

Oh, sure, you can find a fringe wacko who also knows crypto and agrees with you, but all the sane members of the security community will not.

Telegram is not trustworthy because it’s partially closed-source. We can’t see how it works. We don’t know if they’ve made accidental mistakes that can be hacked. We don’t know if they’ve been bribed by the NSA or Russia to put backdoors in their program. In contrast, PGP and Signal are open-source. We can read exactly what the software does. Indeed, thousands of people have been reviewing their software looking for mistakes and backdoors. Being open-source doesn’t automatically make software better, but it does make hiding secret backdoors much harder.

Telegram is not trustworthy because we aren’t certain the crypto is done properly. Signal, and especially PGP, are done properly.

The thing about encryption is that when done properly, it works. Neither the NSA nor the Russians can break properly encrypted content. There’s no such thing as “military grade” encryption that is better than consumer grade. There’s only encryption that nobody can hack vs. encryption that your neighbor’s teenage kid can easily hack. Those scenes in TV/movies about breaking encryption are as realistic as sound in space: good for dramatic presentation, but not how things work in the real world.

In particular, end-to-end encryption works. Sure, in the past, such apps only encrypted as far as the server, so whoever ran the server could read your messages. Modern chat apps, though, are end-to-end: the servers have absolutely no ability to decrypt what’s on them, unless they can get the decryption keys from the phones. But some tasks, like encrypted messages to a group of people, can be hard to do properly.

Thus, in contrast to what John Schindler says, while we techies have doubts about Telegram, we don’t have doubts about Russian authorities having access to Signal and PGP messages.

Snowden hatred has become the anti-vax of crypto. Sure, there’s no particular reason to trust Snowden — people should really stop treating him as some sort of privacy-Jesus. But there’s no particular reason to distrust him, either. His bland statements on crypto are indistinguishable from any other crypto-enthusiast statements. If he’s a Russian pawn, then so too is the bulk of the crypto community.

With all this said, using Signal doesn’t make you perfectly safe. The person you are chatting with could be a secret agent — especially in group chat. There could be cameras/microphones in the room where you are using the app. The Russians can also hack into your phone, and likewise eavesdrop on everything you do with the phone, regardless of which app you use. And they probably have hacked specific people’s phones. On the other hand, if the NSA or Russians were widely hacking phones, we’d detect that this was happening. We haven’t.

Signal is therefore not a guarantee of safety, because nothing is, and if your life depends on it, you can’t trust any simple advice like “use Signal”. But, for the bulk of us, it’s pretty damn secure, and I trust neither the Russians nor the NSA are reading my Signal or PGP messages.

At first blush, this @20committee tweet appears to be non-experts opining on things outside their expertise. But in reality, it’s just obtuse partisanship, where truth and expertise don’t matter. Nothing you or I say can change some people’s minds on this matter, no matter how much our expertise gives weight to our words. This post is instead for bystanders, who don’t know enough to judge whether these crazy statements have merit.


So let’s talk about “every crypto expert”. It’s, of course, impossible to speak for every crypto expert. It’s like saying how the consensus among climate scientists is that mankind is warming the globe, while at the same time ignoring the widespread disagreement on how much warming that is.

The same is true here. You’ll get a widely varying set of responses from experts about the above tweet. Some, for example, will stress my point at the bottom that hacking the endpoint (the phone) breaks all the apps, and thus justify the above tweet from that point of view. Others will point out that all software has bugs, and it’s quite possible that Signal has some unknown bug that the Russians are exploiting.

So I’m not attempting to speak for what all experts might say here in the general case and what long lecture they can opine about. I am, though, pointing out the basics that virtually everyone agrees on, the consensus of open-source and working crypto.

BackMap, the haptic navigation system

Post Syndicated from Janina Ander original https://www.raspberrypi.org/blog/backmap-haptic/

At this year’s TechCrunch Disrupt NY hackathon, one team presented BackMap, a haptic feedback system which helps visually impaired people to navigate cities and venues. It is assisted by a Raspberry Pi and integrated into a backpack.

Good vibrations with BackMap

The team, including Shashank Sharma, wrote an iOS phone app in Swift, Apple’s open-source programming language. To convert between addresses and geolocations, they used the Esri APIs offered by PubNub. So far, so standard. However, they then configured their BackMap setup so that the user can input their destination via the app, and then follow the route without having to look at a screen or listen to directions. Instead, vibrating motors have been integrated into the straps of a backpack and hooked up to a Raspberry Pi. Whenever the user needs to turn left or right, the Pi makes the respective motor vibrate.
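The post does not include the team's code, so the following is only a minimal Python sketch of the idea described above: a turn instruction selects which strap motor to vibrate. The `FakeMotor` class, the motor names, and the half-second pulse are all assumptions made for illustration; on a real Pi, a GPIO-driven motor object would take the place of the stand-in.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeMotor:
    """Hypothetical stand-in for a vibration motor; records pulses instead of driving GPIO."""
    name: str
    pulses: List[float] = field(default_factory=list)

    def pulse(self, seconds: float) -> None:
        # A real implementation would switch a GPIO pin on for `seconds`.
        self.pulses.append(seconds)


def signal_turn(direction: str, left: FakeMotor, right: FakeMotor,
                seconds: float = 0.5) -> str:
    """Vibrate the strap motor matching the turn direction and return its name."""
    motors = {"left": left, "right": right}
    if direction not in motors:
        raise ValueError(f"unsupported direction: {direction!r}")
    motor = motors[direction]
    motor.pulse(seconds)
    return motor.name


left_motor = FakeMotor("left-strap")
right_motor = FakeMotor("right-strap")
print(signal_turn("left", left_motor, right_motor))
```

Keeping the motor behind a small object like this means the routing logic can be exercised on a laptop before any hardware is attached.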

Disrupt NY 2017 Hackathon | Part 1

Disrupt NY 2017 Hackathon presentations filmed live on May 15th, 2017. Preceding the Disrupt Conference is Hackathon weekend on May 13-14, where developers and engineers descend from all over the world to take part in a 24-hour hacking endurance test.

BackMap can also be adapted for indoor navigation by receiving signals from beacons. This could be used to direct users to toilet facilities or exhibition booths at conferences. The team hopes to upgrade the BackMap device to use a wristband format in the future.

Accessible Pi

Here at Pi Towers, we are always glad to see Pi builds for people with disabilities: we’ve seen Sanskriti and Aman’s Braille teacher Mudra, the audio e-reader Valdema by Finnish non-profit Kolibre, and Myrijam and Paul’s award-winning, eye-movement-controlled wheelchair, to name but a few.

Our mission is to bring the power of coding and digital making to everyone, and we are lucky to be part of a diverse community of makers and educators who have often worked proactively to make events and resources accessible to as many people as possible. There is, for example, the autism- and Tourette’s syndrome-friendly South London Raspberry Jam, organised by Femi Owolade-Coombes and his mum Grace. The Raspberry VI website is a portal to all things Pi for visually impaired and blind people. Deaf digital makers may find Jim Roberts’ video tutorials, which are signed in ASL, useful. And anyone can contribute subtitles in any language to our YouTube channel.

If you create or use accessible tutorials, or run a Jam, Code Club, or CoderDojo that is designed to be friendly to people who are neuroatypical or have a disability, let us know how to find your resource or event in the comments!

The post BackMap, the haptic navigation system appeared first on Raspberry Pi.

A note about "false flag" operations

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/03/a-note-about-false-flag-operations.html

There’s nothing in the CIA #Vault7 leaks that calls into question strong attribution, like Russia being responsible for the DNC hacks. On the other hand, it does call into question weak attribution, like North Korea being responsible for the Sony hacks.

There are really two types of attribution. Strong attribution is a preponderance of evidence that would convince an unbiased, skeptical expert. Weak attribution is flimsy evidence that confirms what people are predisposed to believe.

The DNC hacks have strong evidence pointing to Russia. Not only does all the malware check out, but also other, harder to “false flag” bits, like active command-and-control servers. A serious operator could still false-flag this in theory, if only by bribing people in Russia, but nothing in the CIA dump hints at this.

The Sony hacks have weak evidence pointing to North Korea. One of the items was the use of the RawDisk driver, used both in malware attributed to North Korea and the Sony attacks. This was described as “flimsy” at the time [*]. The CIA dump [*] demonstrates that indeed it’s flimsy — as apparently CIA malware also uses the RawDisk code.

In the coming days, biased partisans are going to seize on the CIA leaks as proof of “false flag” operations, calling into question Russian hacks. No, this isn’t valid. We experts in the industry criticized “malware techniques” as flimsy attribution, long before the Sony attack, and long before the DNC hacks. All the CIA leaks do is prove we were right. On the other hand, the DNC hack attribution is based on more than just this, so nothing in the CIA leaks calls into question that attribution.

Managing Secrets for Amazon ECS Applications Using Parameter Store and IAM Roles for Tasks

Post Syndicated from Chris Barclay original https://aws.amazon.com/blogs/compute/managing-secrets-for-amazon-ecs-applications-using-parameter-store-and-iam-roles-for-tasks/

Thanks to my colleague Stas Vonholsky for a great blog on managing secrets with Amazon ECS applications.


As containerized applications and microservice-oriented architectures become more popular, managing secrets, such as a password to access an application database, becomes more challenging and critical.

Some examples of the challenges include:

  • Support for various access patterns across container environments such as dev, test, and prod
  • Isolated access to secrets on a container/application level rather than at the host level
  • Multiple decoupled services with their own needs for access, both as services and as clients of other services

This post focuses on newly released features that support further improvements to secret management for containerized applications running on Amazon ECS. My colleague, Matthew McClean, also published an excellent post on the AWS Security Blog, How to Manage Secrets for Amazon EC2 Container Service–Based Applications by Using Amazon S3 and Docker, which discusses some of the limitations of passing and storing secrets with container parameter variables.

Most secret management tools provide the following functionality:

  • Highly secured storage system
  • Central management capabilities
  • Secure authorization and authentication mechanisms
  • Integration with key management and encryption providers
  • Secure introduction mechanisms for access
  • Auditing
  • Secret rotation and revocation

Amazon EC2 Systems Manager Parameter Store

Parameter Store is a feature of Amazon EC2 Systems Manager. It provides a centralized, encrypted store for sensitive information and has many advantages when combined with other capabilities of Systems Manager, such as Run Command and State Manager. The service is fully managed, highly available, and highly secured.

Because Parameter Store is accessible using the Systems Manager API, AWS CLI, and AWS SDKs, you can also use it as a generic secret management store. Secrets can be easily rotated and revoked. Parameter Store is integrated with AWS KMS so that specific parameters can be encrypted at rest with the default or custom KMS key. Importing KMS keys enables you to use your own keys to encrypt sensitive data.

Access to Parameter Store is enabled by IAM policies and supports resource level permissions for access. An IAM policy that grants permissions to specific parameters or a namespace can be used to limit access to these parameters. CloudTrail logs, if enabled for the service, record any attempt to access a parameter.

While Amazon S3 has many of the above features and can also be used to implement a central secret store, Parameter Store has the following added advantages:

  • Easy creation of namespaces to support different stages of the application lifecycle.
  • KMS integration that abstracts parameter encryption from the application while requiring the instance or container to have access to the KMS key and for the decryption to take place locally in memory.
  • Stored history about parameter changes.
  • A service that can be controlled separately from S3, which is likely used for many other applications.
  • A configuration data store, reducing overhead from implementing multiple systems.
  • No usage costs.

Note: At the time of publication, Systems Manager doesn’t support VPC private endpoint functionality. To enforce stricter access to a Parameter Store endpoint from a private VPC, use a NAT gateway with a set Elastic IP address together with IAM policy conditions that restrict parameter access to a limited set of IP addresses.

IAM roles for tasks

With IAM roles for Amazon ECS tasks, you can specify an IAM role to be used by the containers in a task. Applications interacting with AWS services must sign their API requests with AWS credentials. This feature provides a strategy for managing credentials for your applications to use, similar to the way that Amazon EC2 instance profiles provide credentials to EC2 instances.

Instead of creating and distributing your AWS credentials to the containers or using the EC2 instance role, you can associate an IAM role with an ECS task definition or the RunTask API operation. For more information, see IAM Roles for Tasks.

You can use IAM roles for tasks to securely introduce and authenticate the application or container with the centralized Parameter Store. Access to the secret manager should include features such as:

  • Limited TTL for credentials used
  • Granular authorization policies
  • An ID to track the requests in the logs of the central secret manager
  • Integration support with the scheduler that could map between the container or task deployed and the relevant access privileges

IAM roles for tasks support this use case well, as the role credentials can be accessed only from within the container for which the role is defined. The role exposes temporary credentials and these are rotated automatically. Granular IAM policies are supported with optional conditions about source instances, source IP addresses, time of day, and other options.

The source IAM role can be identified in the CloudTrail logs based on a unique Amazon Resource Name and the access permissions can be revoked immediately at any time with the IAM API or console. As Parameter Store supports resource level permissions, a policy can be created to restrict access to specific keys and namespaces.
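As a rough illustration of such a policy, the Python sketch below builds an IAM policy document that allows reads of one parameter namespace and, optionally, only from given source IP ranges. The function name, the account ID `123456789012`, and the CIDR `203.0.113.10/32` are placeholders chosen for this example, not values from the walkthrough.

```python
import json
from typing import Dict, List, Optional


def parameter_read_policy(
    region: str,
    account_id: str,
    parameter_patterns: List[str],
    allowed_cidrs: Optional[List[str]] = None,
) -> Dict:
    """Build an IAM policy document allowing reads of the selected parameters."""
    statement = {
        "Effect": "Allow",
        "Action": ["ssm:GetParameters"],
        # One ARN per parameter name or namespace pattern, e.g. prod.app1.*
        "Resource": [
            f"arn:aws:ssm:{region}:{account_id}:parameter/{pattern}"
            for pattern in parameter_patterns
        ],
    }
    if allowed_cidrs:
        # Optional condition restricting requests to known source addresses.
        statement["Condition"] = {"IpAddress": {"aws:SourceIp": allowed_cidrs}}
    return {"Version": "2012-10-17", "Statement": [statement]}


policy = parameter_read_policy(
    region="us-east-1",
    account_id="123456789012",            # placeholder account ID
    parameter_patterns=["prod.app1.*"],
    allowed_cidrs=["203.0.113.10/32"],    # placeholder address range
)
print(json.dumps(policy, indent=2))
```

The printed JSON could then be saved to a file and passed to `aws iam create-policy`, as in the walkthrough that follows.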

Dynamic environment association

In many cases, the container image does not change when moving between environments, which supports immutable deployments and ensures that the results are reproducible. What does change is the configuration: in this context, specifically the secrets. For example, a database and its password might be different in the staging and production environments. There’s still the question of how do you point the application to retrieve the correct secret? Should it retrieve prod.app1.secret, test.app1.secret or something else?

One option can be to pass the environment type as an environment variable to the container. The application then concatenates the environment type (prod, test, etc.) with the relative key path and retrieves the relevant secret. In most cases, this leads to a number of separate ECS task definitions.
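A minimal sketch of that concatenation in Python might look like the following. The environment variable name `ENVIRONMENT_TYPE` and the fallback value `test` are assumptions for illustration; the post does not prescribe either.

```python
import os


def secret_name(environment: str, relative_key: str) -> str:
    """Join the environment type and the relative key path into a parameter name."""
    env = environment.strip().lower()
    if not env:
        raise ValueError("environment type must not be empty")
    return f"{env}.{relative_key}"


# ENVIRONMENT_TYPE is a hypothetical variable set in the container definition.
env_type = os.environ.get("ENVIRONMENT_TYPE", "test")
print(secret_name(env_type, "app1.secret"))
```

The application code stays identical across environments; only the variable's value differs between task definitions.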

When you describe the task definition in a CloudFormation template, you could base the entry in the IAM role that provides access to Parameter Store, KMS key, and environment property on a single CloudFormation parameter, such as “environment type.” This approach could support a single task definition type that is based on a generic CloudFormation template.

Walkthrough: Securely access Parameter Store resources with IAM roles for tasks

This walkthrough is configured for the North Virginia region (us-east-1). I recommend using the same region.

Step 1: Create the keys and parameters

First, create the following KMS keys with the default security policy to be used to encrypt various parameters:

  • prod-app1 – used to encrypt any secrets for app1.
  • license-key – used to encrypt license-related secrets.

aws kms create-key --description prod-app1 --region us-east-1
aws kms create-key --description license-code --region us-east-1

Note the KeyId property in the output of both commands. You use it throughout the walkthrough to identify the KMS keys.
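If you script the walkthrough, the key ID can be pulled out of the command's JSON output rather than copied by hand. The sketch below assumes the output has been captured as a string; the sample document is a shortened, made-up example that follows the `KeyMetadata.KeyId` layout of the `create-key` response, and the helper name is hypothetical.

```python
import json


def key_id_from_create_key_output(cli_output: str) -> str:
    """Extract KeyMetadata.KeyId from the JSON printed by `aws kms create-key`."""
    return json.loads(cli_output)["KeyMetadata"]["KeyId"]


# Shortened, made-up sample; real output contains more fields.
sample_output = """
{
  "KeyMetadata": {
    "KeyId": "1234abcd-12ab-34cd-56ef-1234567890ab",
    "Description": "prod-app1"
  }
}
"""
print(key_id_from_create_key_output(sample_output))
```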

The following commands create three parameters in Parameter Store:

  • prod.app1.db-pass (encrypted with the prod-app1 KMS key)
  • general.license-code (encrypted with the license-key KMS key)
  • prod.app2.user-name (stored as a standard string without encryption)

aws ssm put-parameter --name prod.app1.db-pass --value "AAAAAAAAAAA" --type SecureString --key-id "<key-id-for-prod-app1-key>" --region us-east-1
aws ssm put-parameter --name general.license-code --value "CCCCCCCCCCC" --type SecureString --key-id "<key-id-for-license-code-key>" --region us-east-1
aws ssm put-parameter --name prod.app2.user-name --value "BBBBBBBBBBB" --type String --region us-east-1

Step 2: Create the IAM role and policies

Now, create a role and an IAM policy to associate with the ECS task that you create later on.
The trust policy for the IAM role needs to allow the ecs-tasks entity to assume the role.

{
   "Version": "2012-10-17",
   "Statement": [
     {
       "Sid": "",
       "Effect": "Allow",
       "Principal": {
         "Service": "ecs-tasks.amazonaws.com"
       },
       "Action": "sts:AssumeRole"
     }
   ]
}

Save the above policy as a file in the local directory with the name ecs-tasks-trust-policy.json.

aws iam create-role --role-name prod-app1 --assume-role-policy-document file://ecs-tasks-trust-policy.json

The following policy is attached to the role and later associated with the app1 container. Access is granted to the prod.app1.* namespace parameters, the encryption key required to decrypt the prod.app1.db-pass parameter and the license code parameter. The namespace resource permission structure is useful for building various hierarchies (based on environments, applications, etc.).

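Make sure to replace <key-id-for-prod-app1-key> with the key ID for the relevant KMS key and <account-id> with your account ID in the following policy.

{
     "Version": "2012-10-17",
     "Statement": [
         {
             "Effect": "Allow",
             "Action": ["ssm:DescribeParameters"],
             "Resource": "*"
         },
         {
             "Sid": "Stmt1482841904000",
             "Effect": "Allow",
             "Action": ["ssm:GetParameters"],
             "Resource": [
                 "arn:aws:ssm:us-east-1:<account-id>:parameter/prod.app1.*",
                 "arn:aws:ssm:us-east-1:<account-id>:parameter/general.license-code"
             ]
         },
         {
             "Sid": "Stmt1482841948000",
             "Effect": "Allow",
             "Action": ["kms:Decrypt"],
             "Resource": [
                 "arn:aws:kms:us-east-1:<account-id>:key/<key-id-for-prod-app1-key>"
             ]
         }
     ]
}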

Save the above policy as a file in the local directory with the name app1-secret-access.json:

aws iam create-policy --policy-name prod-app1 --policy-document file://app1-secret-access.json

Replace <account-id> with your account ID in the following command:

aws iam attach-role-policy --role-name prod-app1 --policy-arn "arn:aws:iam::<account-id>:policy/prod-app1"

Step 3: Add the testing script to an S3 bucket

Create a file with the script below, name it access-test.sh and add it to an S3 bucket in your account. Make sure the object is publicly accessible and note down the object link, for example https://s3-eu-west-1.amazonaws.com/my-new-blog-bucket/access-test.sh

#!/bin/bash
# This is a simple bash script that is used to test access to the EC2 Parameter Store.
# Install the AWS CLI
apt-get -y install python2.7 curl
curl -O https://bootstrap.pypa.io/get-pip.py
python2.7 get-pip.py
pip install awscli
# Getting region from the EC2 instance metadata service
EC2_AVAIL_ZONE=`curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone`
EC2_REGION="`echo \"$EC2_AVAIL_ZONE\" | sed -e 's:\([0-9][0-9]*\)[a-z]*\$:\\1:'`"
# Trying to retrieve parameters from the EC2 Parameter Store
APP1_WITH_ENCRYPTION=`aws ssm get-parameters --names prod.app1.db-pass --with-decryption --region $EC2_REGION --output text 2>&1`
APP1_WITHOUT_ENCRYPTION=`aws ssm get-parameters --names prod.app1.db-pass --no-with-decryption --region $EC2_REGION --output text 2>&1`
LICENSE_WITH_ENCRYPTION=`aws ssm get-parameters --names general.license-code --with-decryption --region $EC2_REGION --output text 2>&1`
LICENSE_WITHOUT_ENCRYPTION=`aws ssm get-parameters --names general.license-code --no-with-decryption --region $EC2_REGION --output text 2>&1`
APP2_WITHOUT_ENCRYPTION=`aws ssm get-parameters --names prod.app2.user-name --no-with-decryption --region $EC2_REGION --output text 2>&1`
# The nginx server is started after the script is invoked, preparing folder for HTML.
if [ ! -d /usr/share/nginx/html/ ]; then
  mkdir -p /usr/share/nginx/html/;
  chmod 755 /usr/share/nginx/html/
fi

# Creating an HTML file to be accessed at http://<public-instance-DNS-name>/ecs.html
cat > /usr/share/nginx/html/ecs.html <<EOF
<!DOCTYPE html>
<html>
<head>
<style>
body {padding: 20px;margin: 0 auto;font-family: Tahoma, Verdana, Arial, sans-serif;}
code {white-space: pre-wrap;}
result {background: hsl(220, 80%, 90%);}
</style>
</head>
<body>
<h1>Hi there!</h1>
<p style="padding-bottom: 0.8cm;">Following are the results of different access attempts as experienced by "App1".</p>

<p><b>Access to prod.app1.db-pass:</b></p>
<pre><code>aws ssm get-parameters --names prod.app1.db-pass --with-decryption</code><br/>
<code><result>$APP1_WITH_ENCRYPTION</result></code><br/>
<code>aws ssm get-parameters --names prod.app1.db-pass --no-with-decryption</code><br/>
<code><result>$APP1_WITHOUT_ENCRYPTION</result></code></pre>

<p><b>Access to general.license-code:</b></p>
<pre><code>aws ssm get-parameters --names general.license-code --with-decryption</code><br/>
<code><result>$LICENSE_WITH_ENCRYPTION</result></code><br/>
<code>aws ssm get-parameters --names general.license-code --no-with-decryption</code><br/>
<code><result>$LICENSE_WITHOUT_ENCRYPTION</result></code></pre>

<p><b>Access to prod.app2.user-name:</b></p>
<pre><code>aws ssm get-parameters --names prod.app2.user-name --no-with-decryption</code><br/>
<code><result>$APP2_WITHOUT_ENCRYPTION</result></code></pre>

<p><em>Thanks for visiting</em></p>
</body>
</html>
EOF

Step 4: Create a test cluster

I recommend creating a new ECS test cluster with the latest ECS AMI and ECS agent on the instance. Use the following field values:

  • Cluster name: access-test
  • EC2 instance type: t2.micro
  • Number of instances: 1
  • Key pair: No EC2 key pair is required, unless you’d like to SSH to the instance and explore the running container.
  • VPC: Choose the default VPC. If unsure, you can find the VPC ID with the IP range in the Amazon VPC console.
  • Subnets: Pick a subnet in the default VPC.
  • Security group: Create a new security group with CIDR block and port 80 for inbound access.

Leave other fields with the default settings.

Create a simple task definition that relies on the public NGINX container and the role that you created for app1. Specify the properties such as the available container resources and port mappings. Note the command option is used to download and invoke a test script that installs the AWS CLI on the container, runs a number of get-parameter commands, and creates an HTML file with the results.

Replace <account-id> with your account ID, <your-S3-URI> with a link to the S3 object created in step 3 in the following commands:

aws ecs register-task-definition --family access-test --task-role-arn "arn:aws:iam::<account-id>:role/prod-app1" --container-definitions name="access-test",image="nginx",portMappings="[{containerPort=80,hostPort=80,protocol=tcp}]",readonlyRootFilesystem=false,cpu=512,memory=490,essential=true,entryPoint="sh,-c",command="\"/bin/sh -c \\\"apt-get update ; apt-get -y install curl ; curl -O <your-S3-URI> ; chmod +x access-test.sh ; ./access-test.sh ; nginx -g 'daemon off;'\\\"\"" --region us-east-1

aws ecs run-task --cluster access-test --task-definition access-test --count 1 --region us-east-1

Verifying access

After the task is in a running state, check the public DNS name of the instance and navigate to the following page:

http://<public-instance-DNS-name>/ecs.html

You should see the results of running different access tests from the container after a short duration.

If the test results don’t appear immediately, wait a few seconds and refresh the page.
Make sure that inbound traffic for port 80 is allowed on the security group attached to the instance.

The results you see in the static results HTML page should be the same as running the following commands from the container.


aws ssm get-parameters --names prod.app1.db-pass --with-decryption --region us-east-1
aws ssm get-parameters --names prod.app1.db-pass --no-with-decryption --region us-east-1

Both commands should work, as the policy provides access to both the parameter and the required KMS key.


aws ssm get-parameters --names general.license-code --no-with-decryption --region us-east-1
aws ssm get-parameters --names general.license-code --with-decryption --region us-east-1

Only the first command with the “no-with-decryption” parameter should work. The policy allows access to the parameter in Parameter Store but there’s no access to the KMS key. The second command should fail with an access denied error.


aws ssm get-parameters --names prod.app2.user-name --no-with-decryption --region us-east-1

The command should fail with an access denied error, as there are no permissions associated with the namespace for prod.app2.
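An application running in the container would typically make the same lookups through an SDK rather than the CLI. The Python sketch below assumes a client object that exposes `get_parameters(Names=..., WithDecryption=...)` in the way boto3's SSM client does; `FakeSsmClient` is a hypothetical in-memory stand-in so the example runs without AWS credentials, and `fetch_parameters` is a name invented here.

```python
from typing import Dict, List


def fetch_parameters(ssm_client, names: List[str], decrypt: bool = True) -> Dict[str, str]:
    """Return a name-to-value mapping for the requested parameters."""
    response = ssm_client.get_parameters(Names=names, WithDecryption=decrypt)
    missing = response.get("InvalidParameters", [])
    if missing:
        raise KeyError(f"parameters not found: {missing}")
    return {p["Name"]: p["Value"] for p in response["Parameters"]}


class FakeSsmClient:
    """Hypothetical in-memory stand-in for an SSM client, for local runs only."""

    def __init__(self, store: Dict[str, str]):
        self._store = store

    def get_parameters(self, Names, WithDecryption=False):
        found = [
            {
                "Name": name,
                "Type": "SecureString",
                # Without decryption, the caller only sees an opaque value.
                "Value": self._store[name] if WithDecryption else "<ciphertext>",
            }
            for name in Names
            if name in self._store
        ]
        invalid = [name for name in Names if name not in self._store]
        return {"Parameters": found, "InvalidParameters": invalid}


client = FakeSsmClient({"prod.app1.db-pass": "AAAAAAAAAAA"})
print(fetch_parameters(client, ["prod.app1.db-pass"]))
```

In a real task, `boto3.client("ssm", region_name="us-east-1")` could be passed in place of the fake; a denied request would then surface as an exception from the SDK rather than as a missing parameter.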

Finishing up

Remember to delete all resources (such as the KMS keys and EC2 instance), so that you don’t incur charges.


Central secret management is an important aspect of securing containerized environments. By using Parameter Store and task IAM roles, customers can create a central secret management store and a well-integrated access layer that allows applications to access only the keys they need, to restrict access on a container basis, and to further encrypt secrets with custom keys with KMS.

Whether the secret management layer is implemented with Parameter Store, Amazon S3, Amazon DynamoDB, or a solution such as Vault or KeyWhiz, it’s a vital part of the process of managing and accessing secrets.

AWS Hot Startups – August 2016

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-hot-startups-august-2016/

Back with her second guest post, Tina Barr talks about four more hot startups!


This month we are featuring four hot AWS-powered startups:

  • Craftsvilla – Offering a platform to purchase ethnic goods.
  • SendBird – Helping developers build 1-on-1 messaging and group chat quickly.
  • Teletext.io – A solution for content management, without the system.
  • Wavefront – A cloud-based analytics platform.

Craftsvilla was born in 2011 out of sheer love and appreciation for the crafts, arts, and culture of India. On a road trip through the Gujarat region of western India, Monica and Manoj Gupta were mesmerized by the beautiful creations crafted by local artisans. However, they were equally dismayed that these artisans were struggling to make ends meet. Monica and Manoj set out to create a platform where these highly skilled workers could connect directly with their consumers and reach a much broader audience. The demand for authentic ethnic products is huge across the globe, but consumers are often unable to find the right place to buy them. Craftsvilla helps to solve this issue.

The culture of India is so rich and diverse that no one had attempted to capture it on a single platform. Using technological innovations, Craftsvilla combines apparel, accessories, health and beauty products, food items and home décor all in one easily accessible space. For instance, they not only offer a variety of clothing (Salwar suits, sarees, lehengas, and casual wear), but each of those categories is further broken down into subcategories. Consumers can find anything that fits their needs – they can filter products by fabric, style, occasion, and even by the type of work (embroidered, beads, crystal work, handcrafted, etc.). If you are interested in trying new cuisine, Craftsvilla can help. They offer hundreds of interesting products from masalas to traditional sweets to delicious tea blends. They even give you the option to filter through India’s many diverse regions to discover new foods.

Becoming a seller on Craftsvilla is simple. Shop owners just need to create a free account and they’re able to start selling their unique products and services. Craftsvilla’s ultimate vision is to become the ‘one-stop destination’ for all things ethnic. They look to be well on their way!

AWS itself is an engineer on Craftsvilla’s team. Customer experience is highly important to the people behind the company, and an integral aspect of their business is to attain scalability with efficiency. They automate their infrastructure at a large scale, which wouldn’t be possible at the current pace without AWS. Currently, they utilize over 20 AWS services – Amazon Elastic Compute Cloud (EC2), Elastic Load Balancing, Amazon Kinesis, AWS Lambda, Amazon Relational Database Service (RDS), Amazon Redshift, and Amazon Virtual Private Cloud to name a few. Their app QA process will move to AWS Device Farm, completely automated in the cloud, on 250+ services thanks to Lambda. Craftsvilla relies completely on AWS for all of their infrastructure needs, from web serving to analytics.

Check out Craftsvilla’s blog for more information!

After successfully exiting their first startup, SendBird founders John S. Kim, Brandon Jeon, Harry Kim, and Forest Lee saw a great market opportunity for a consumer app developer. Today, over 2,000 global companies such as eBay, Nexon, Beat, Malang Studio, and SK Telecom are using SendBird to implement chat and messaging capabilities on their mobile apps and websites. A few ways companies are using SendBird:

  • 1-on-1 messaging for private messaging and conversational commerce.
  • Group chat for friends and interest groups.
  • Massive scale chat rooms for live-video streams and game communities.

As they watched messaging become a global phenomenon, the SendBird founders realized that it no longer made sense for app developers to build their entire tech stack from scratch. Research from the Localytics Data Team actually shows that in-app messaging can increase app launches by 27% and engagement by 3 times. By simply downloading the SendBird SDK (available for iOS, Android, Unity, .NET Xamarin, and JavaScript), app and web developers can implement real-time messaging features in just minutes. SendBird also provides a full chat history and allows users to send chat messages in addition to complete file and data transfers. Moreover, developers can integrate innovative features such as smart-throttling to control the speed of messages being displayed to the mobile devices during live broadcasting.

After graduating from accelerator Y Combinator W16 Batch, the company grew from 1,000,000 monthly chat users to 5,000,000 monthly chat users within months while handling millions of new messages daily across live-video streaming, games, ecommerce, and consumer apps. Customers found value in having a cross-platform, full-featured, and whole-stack approach to a real-time chat API and SDK which can be deployed in a short period of time.

SendBird chose AWS to build a robust and scalable infrastructure to handle a massive concurrent user base scattered across the globe. It uses EC2 with Elastic Load Balancing and Auto Scaling, Route 53, S3, ElastiCache, Amazon Aurora, CloudFront, CloudWatch, and SNS. The company expects to continue partnering with AWS to scale efficiently and reliably.

Check out SendBird and their blog to follow their journey!

Marcel Panse and Sander Nagtegaal, co-founders of Teletext.io, had worked together at several startups and experienced the same problem at each one: within the scope of custom software development, content management is a big pain. Even the smallest correction, such as a typo, typically requires a developer, which can become very expensive over time. Unable to find a proper solution that was readily available, Marcel and Sander decided to create their own service to finally solve the issue. Leveraging only the API Gateway, Lambda functions, Amazon DynamoDB, S3, and CloudFront, they built a drop-in content management service (CMS). Their serverless approach for a CMS alternative quickly attracted other companies, and despite intending to use it only for their own needs, the pair decided to professionally market their idea and Teletext.io was born.

Today, Teletext.io is called a solution for content management, without the system. Content distributors are able to edit text and images through a WYSIWYG editor without the help of a programmer and directly from their own website or user interface. There are just three easy steps to get started:

  1. Include Teletext.io script.
  2. Add data attributes.
  3. Login and start typing.

That’s it! There is no system that needs to be installed or maintained by developers – Teletext.io works directly out of the box. In addition to recurring content updates, the data attribution technique can also be used for localization purposes. Making a website multilingual through a CMS can take days or weeks, but Teletext.io can accomplish this task in mere minutes. The time-saving factor is the main benefit for developers and editors alike.

Teletext.io uses AWS in a variety of ways. Since the company is responsible for the website content of others, they must have an extremely fast and reliable system that keeps website visitors from noticing external content being loaded. In addition, this critical infrastructure service should never go down. Both of these requirements call for a robust architecture with as few moving parts as possible. For these reasons, Teletext.io runs a serverless architecture that really makes it stand out. For loading draft content, storing edits and images, and publishing the result, the Amazon API Gateway gets called, triggering AWS Lambda functions. The Lambda functions store their data in Amazon DynamoDB.
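
The request flow described above (API Gateway calls a Lambda function, which reads or writes stored content) can be sketched roughly as follows. This is a minimal illustration and not Teletext.io's actual code: the route shape, the `key` path parameter, and the `text` field are assumptions, and a plain in-memory dictionary stands in for Amazon DynamoDB so the example runs anywhere.

```python
import json

# In-memory stand-in for a DynamoDB table (assumption: the real service
# would call DynamoDB here instead of touching a dictionary).
CONTENT = {}

def handler(event, context):
    """Lambda-style handler for an API Gateway proxy event.

    The 'key' path parameter and the 'text' body field are hypothetical.
    """
    key = event["pathParameters"]["key"]
    if event["httpMethod"] == "PUT":
        # Store an edit made in the editor.
        CONTENT[key] = json.loads(event["body"])["text"]
        return {"statusCode": 200, "body": json.dumps({"saved": key})}
    if key in CONTENT:
        # Serve previously stored content.
        return {"statusCode": 200, "body": json.dumps({"text": CONTENT[key]})}
    return {"statusCode": 404, "body": json.dumps({"error": "not found"})}

# Simulate an editor saving a headline, then a visitor loading it.
put_response = handler(
    {"httpMethod": "PUT",
     "pathParameters": {"key": "homepage-title"},
     "body": json.dumps({"text": "Hello, world"})},
    None,
)
get_response = handler(
    {"httpMethod": "GET", "pathParameters": {"key": "homepage-title"}, "body": None},
    None,
)
missing_response = handler(
    {"httpMethod": "GET", "pathParameters": {"key": "no-such-key"}, "body": None},
    None,
)
```

With no servers to manage, the only moving parts are the handler and the table behind it, which is the property the paragraph above emphasizes.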

Read more about Teletext.io’s unique serverless approach in their blog post.

Founded in 2013 and based in Palo Alto, Wavefront is a cloud-based analytics platform that stores time series data at millions of points per second. They are able to detect any divergence from “normal” in hybrid and cloud infrastructures before anomalies ever happen. This is a critical service that companies like Lyft, Okta, Yammer, and Box are using to keep running smoothly. From data scientists to product managers, from startups to Fortune 500 companies, Wavefront offers a powerful query engine and a language designed for everyone.

With a pay-as-you-go model, Wavefront gives customers the flexibility to start with the necessary application size and scale up/down as needed. They also include enterprise-class support as part of their pricing at no extra cost. Take a look at their product demos to learn more about how Wavefront is helping their customers.

The Wavefront Application is hosted entirely on AWS, and runs its single-tenant instances and multi-tenant instances in the virtual private cloud (VPC) clusters within AWS. The application has deep, native integrations with CloudWatch and CloudTrail, which benefits many of its larger customers also using AWS. Wavefront uses AWS to create a “software problem”, to operate, automate and monitor clouds using its own application. Most importantly, AWS allows Wavefront to focus on its core business – to build the best enterprise cloud monitoring system in the world.

To learn more about Wavefront, check out their blog post, How Does Wavefront Work!

Tina Barr

Defining "Gray Hat"

Post Syndicated from Robert Graham original http://blog.erratasec.com/2016/04/defining-gray-hat.html

WIRED has written an article defining “White Hat”, “Black Hat”, and “Grey Hat”. It’s incomplete and partisan.

Black Hats are the bad guys: cybercriminals (like Russian cybercrime gangs), cyberspies (like the Chinese state-sponsored hackers that broke into OPM), or cyberterrorists (ISIS hackers who want to crash the power grid). They may or may not include cybervandals (like some Anonymous activity) that simply defaces websites. Black Hats are those who want to cause damage or profit at the expense of others.

White Hats do the same thing as Black Hats, but are the good guys. They break into networks (as pentesters), but only with permission, when a company/organization hires them to break into their own network. They research the security art, such as vulnerabilities, exploits, and viruses. When they find vulnerabilities, they typically work to fix/patch them. (That you frequently have to apply security updates to your computers/devices is primarily due to White Hats). They develop products and tools for use by good guys (even though they sometimes can be used by the bad guys). The movie “Sneakers” refers to a team of White Hat hackers.

Grey Hat is anything that doesn’t fit nicely within these two categories. There are many objective meanings. It can sometimes refer to those who break the law, but who don’t have criminal intent. It can sometimes include the cybervandals, whose activities are more of a prank rather than a serious enterprise. It can refer to “Search Engine Optimizers” who use unsavory methods to trick search engines like Google to rank certain pages higher in search results, to generate advertising profits.

But, it’s also used subjectively, to simply refer to activities the speaker disagrees with. Our community has many debates over proper behavior. Those on one side of a debate frequently use Gray Hat to refer to those on the other side of the debate.

The biggest recent debate is “0day sales to the NSA”, which blew up after Stuxnet, and in particular, after Snowden. This is when experts look for bugs/vulnerabilities, but instead of reporting them to the vendor to be fixed (as White Hats typically do), they sell the bugs to the NSA, so the vulnerabilities (called “0days” in this context) can be used to hack computers in intelligence and military operations. Partisans who don’t like the NSA use “Grey Hat” to refer to those who sell 0days to the NSA.

WIRED’s definition is this partisan definition. Kim Zetter has done more to report on Stuxnet than any other journalist, which is why her definition is so narrow.

But Google is your friend. If you search for “Gray Hat” on Google and set the time range to pre-Stuxnet, then you’ll find no use of the term that corresponds to Kim’s definition, despite the term being in widespread use for more than a decade by that point. Instead, you’ll find things like this EFF “Gray Hat Guide”. You’ll also find how L0pht used the term to describe themselves when selling their password cracking tool called “L0phtcrack”, from back in 1998.

Fast forward to today, activists from the EFF and ACLU call 0day sellers “merchants of death”. But those on the other side of the debate point out how the 0days in Stuxnet saved thousands of lives. The US government had decided to stop Iran’s nuclear program, and 0days gave them a way to do that without bombs, assassinations, or a shooting war. Those who engage in 0day sales do so with the highest professional ethics. If that WaPo article about Gray Hats unlocking the iPhone is true, then it’s almost certain it’s the FBI side of things who leaked the information, because 0day sellers don’t. It’s the government who is full of people who forswear their oaths for petty reasons, not those who do 0day research.

The point is, the ethics of 0day sales are a hot debate. Using either White Hat or Gray Hat to refer to 0day sellers prejudices that debate. It reflects your own opinion, not that of the listener, who might choose a different word. The definition by WIRED, or the use of “Gray Hat” in the WaPo article, are obviously biased and partisan.

Apple did not invent emoji

Post Syndicated from Eevee original https://eev.ee/blog/2016/04/12/apple-did-not-invent-emoji/

I love emoji. I love Unicode in general. I love seeing plain text become more expressive and more universal.

But, Internet, I’ve noticed a worrying trend. Both popular media and a lot of tech circles tend to assume that “emoji” de facto means Apple’s particular font.

I have some objections.

A not so brief history of emoji

The Unicode Technical Report on emoji also goes over some of this.

Emoji are generally traced back to the Japanese mobile carrier NTT DoCoMo, which in February 1999 released a service called i-mode which powered a line of wildly popular early smartphones. Its messenger included some 180 small pixel-art images you could type as though they were text, because they were text, encoded using unused space in Shift JIS.

(Quick background, because I’d like this to be understandable by a general audience: computers only understand numbers, not text, so we need a “character set” that lists all the characters you can type and what numbers represent them. So in ASCII, for example, a capital “A” is passed around as the number 65. Computers always deal with bytes, which can go up to 255, but ASCII only lists characters up to 127 — so everything from 128 to 255 is just unused space. Shift JIS is Japan’s equivalent to ASCII, and had a lot more unused space, and that’s where early emoji were put.)
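
For readers who want to poke at this themselves, here is a small Python sketch of the same idea, using only built-in functions. It shows the number behind a capital “A” and what happens when a byte falls outside the range ASCII defines.

```python
# Text is stored as numbers; a character set says which number means which character.
code = ord("A")               # the number behind capital "A"
print(code)                   # 65

# Encoding to ASCII yields one byte per character.
encoded = "Hi".encode("ascii")
print(list(encoded))          # [72, 105]

# ASCII only defines 0 through 127, so a byte like 200 means nothing in it.
try:
    bytes([200]).decode("ascii")
    decodable = True
except UnicodeDecodeError:
    decodable = False
print(decodable)              # False
```

That undefined upper range is the “unused space” mentioned above; different character sets fill it in different, incompatible ways.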

Naturally, other carriers added their own variations. Naturally, they used different sets of images, but often in a different order, so the same character might be an apple on one phone and a banana on another. They came up with tables for translating between carriers, but that wouldn’t help if your friend tried to send you an image that your phone just didn’t have. And when these characters started to leak outside of Japan, they had no hope whatsoever of displaying as anything other than garbage.

This is kind of like how Shift JIS is mostly compatible with ASCII, except that for some reason it has the yen sign ¥ in place of the ASCII backslash \, producing hilarious results. Also, this is precisely the problem that Unicode was invented to solve.

I’ll get back to all this in a minute, but something that’s left out of emoji discussions is that the English-speaking world was developing a similar idea. As far as I can tell, we got our first major exposure to graphical emoticons with the release of AIM 4.0 circa May 2000 and these infamous “smileys”:

Pixellated, 4-bit smileys from 2000

Even though AIM was a closed network where there was little risk of having private characters escape, these were all encoded as ASCII emoticons. That simple smiley on the very left would be sent as :-) and turned into an image on your friend’s computer, which meant that if you literally typed :-) in a message, it would still render graphically. Rather than being an extension to regular text, these images were an enhancement of regular text, showing a graphical version of something the text already spelled out. A very fancy ligature.

Little ink has been spilled over this, but those humble 4-bit graphics became a staple of instant messaging, by which I mean everyone immediately ripped them off. ICQ, MSN Messenger, Yahoo! Messenger, Pidgin (then Gaim), Miranda, Trillian… I can’t name a messenger since 2003 that didn’t have smileys included. All of them still relied on the same approach of substituting graphics for regular ASCII sequences. That had made sense for AIM’s limited palette of faces, but during its heyday MSN Messenger included 67 graphics, most of them not faces. If you sent a smiling crescent moon to someone who had the graphics disabled (or used an alternative client), all they’d see was a mysterious (S).
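
The substitution approach can be sketched in a few lines of Python. This is only an illustration of the general technique, not any messenger's real code: the table contents and image file names are made up.

```python
# Hypothetical table mapping ASCII sequences to image names (names are made up).
SMILEYS = {
    ":-)": "smile.png",
    "(S)": "moon.png",
}

def render(message, graphics_enabled=True):
    """Replace known ASCII sequences with image placeholders, if enabled."""
    if not graphics_enabled:
        return message
    for sequence, image in SMILEYS.items():
        message = message.replace(sequence, "[" + image + "]")
    return message

sent = "good night (S) :-)"
with_graphics = render(sent)
without_graphics = render(sent, graphics_enabled=False)

print(with_graphics)      # good night [moon.png] [smile.png]
print(without_graphics)   # good night (S) :-)
```

The message on the wire never changes; only the receiving client decides whether `(S)` becomes a picture or stays a puzzling pair of parentheses.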

So while Japan is generally credited as the source of emoji, the US was quite busy making its own mess of things.

Anyway, Japan had this mess of several different sets of emoji in common use, being encoded in several different incompatible ways. That’s exactly the sort of mess Unicode exists to sort out, so in mid-2007, several Google employees (one of whom was the co-founder of the Unicode Consortium, which surely helped) put together a draft proposal for adding the emoji to Unicode. The idea was to combine all the sets, drop any duplicates, and add to Unicode whatever wasn’t already there.

(Unicode is intended as a unification of all character sets. ASCII has \, Shift JIS has ¥, but Unicode has both, so an English speaker and a Japanese speaker can both use both characters without getting confused, as long as they’re using Unicode. And so on, for thousands of characters in dozens of character sets. Part of the problem with sending the carriers’ emoji to American computers was that the US was pretty far along in shifting everything to use Unicode, but the emoji simply didn’t exist in Unicode. Obvious solution: add them!)
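
A quick way to see that Unicode keeps the backslash and the yen sign as two separate characters is to ask Python's standard `unicodedata` module about them. The sketch below assumes nothing beyond the standard library.

```python
import unicodedata

backslash = "\\"      # U+005C
yen = "\u00a5"        # U+00A5

# Each character has its own number and its own official name.
print(hex(ord(backslash)), unicodedata.name(backslash))   # 0x5c REVERSE SOLIDUS
print(hex(ord(yen)), unicodedata.name(yen))               # 0xa5 YEN SIGN

# Both can live in the same string and survive a round trip through UTF-8.
both = backslash + yen
round_trip = both.encode("utf-8").decode("utf-8")
print(round_trip == both)                                 # True
```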

Meanwhile, the iPhone launched in Japan in 2008. iOS 2.2, released in November, added the first implementation of emoji — but using SoftBank’s invented encoding, since they were only on one carrier and the characters weren’t yet in Unicode. A couple Apple employees jumped on the bandwagon around that time and coauthored the first official proposal, published in January 2009. Unicode 6.0, the first version to include emoji, was released in October 2010.

iPhones worldwide gained the ability to use its emoji (now mapped to Unicode) with the release of iOS 5.0 in October 2011.

Android didn’t get an emoji font at all until version 4.3, in July 2013. I’m at a loss for why, given that Google had proposed emoji in the first place, and Android had been in Japan since the HTC Magic in May 2009. It was even on NTT DoCoMo, the carrier that first introduced emoji! What the heck, Google.

The state of things

Consider this travesty of an article from last week. This Genius Theory Will Change the Way You Use the “Pink Lady” Emoji:

Unicode, creators of the emoji app, call her the “Information Desk Person.”

Oh, dear. Emoji aren’t an “app”, Unicode didn’t create them, and the person isn’t necessarily female. But the character is named “Information Desk Person”, so at least that part is correct.

It’s non-technical clickbait, sure. But notice that neither “Apple” nor the names of any of its platforms appear in the text. As far as this article and author are concerned, emoji are Apple’s presentation of them.

I see also that fileformat.info is now previewing emoji using Apple’s font. Again, there’s no mention of Apple that I can find here; even the page that credits the data and name sources doesn’t mention Apple. The font is even called “Apple Color Emoji”, so you’d think that might show up somewhere.

Telegram and WhatsApp both use Apple’s font for emoji on every platform; you cannot use your system font. Slack lets you choose, but defaults to Apple’s font. (I objected to Android Telegram’s jarring use of a non-native font; the sole developer explained simply that they like Apple’s font more, and eventually shut down the issue tracker to stop people from discussing it further.)

The latest revision of Google’s emoji font even made some questionable changes, seemingly just for the sake of more closely resembling Apple’s font. I’ll get into that a bit later, but suffice to say, even Google is quietly treating Apple’s images as a de facto standard.

The Unicode Consortium will now let you “adopt” a character. If you adopt an emoji, the certificate they print out for you uses Apple’s font.

It’s a little unusual that this would happen when Android has been more popular than the iPhone almost everywhere, even since iOS first exposed its emoji keyboard worldwide. Also given that Apple’s font is not freely-licensed (so you’re not actually allowed to use it in your project), whereas Google’s whole font family is. And — full disclosure here — quite a few of them look to me like they came from a disquieting uncanny valley populated by plastic people.


Granted, the iPhone did have a 20-month head start at exposing the English-speaking world to emoji. Plus there’s that whole thing where Apple features are mysteriously assumed to be the first of their kind. I’m not entirely surprised that Apple’s font is treated as canonical; I just have some objections.

Some objections

I’m writing this in a terminal that uses Source Code Pro. You’re (probably) reading it on the web in Merriweather. Miraculously, you still understand what all the letters mean, even though they appear fairly differently.

Emoji are text, just like the text you’re reading now, not too different from those goofy :-) smileys in AIM. They’re often displayed with colorful graphics, but they’re just ideograms, similar to Egyptian hieroglyphs (which are also in Unicode). It’s totally okay to write them a little differently sometimes.

This is the only reason emoji are in Unicode at all, and the only reason we have a universal set of little pictures. If they’d been true embedded images, there never would have been any reason to turn them into characters.

Having them as text means we can use them anywhere we can use text — there’s no need to hunt down a graphic and figure out how to embed it. You want to put emoji in filenames, in source code, in the titlebar of a window? Sure thing — they’re just text.
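
As a small demonstration that an emoji behaves like any other character, here is a Python sketch using U+1F4A9 PILE OF POO. The file name in it is invented for the example.

```python
poo = "\U0001F4A9"               # U+1F4A9 PILE OF POO

# It is one character, and it encodes to ordinary bytes like any other text.
print(len(poo))                  # 1
utf8 = poo.encode("utf-8")
print(utf8)                      # b'\xf0\x9f\x92\xa9'

# Normal string operations work on it (the file name is made up).
filename = "notes" + poo + ".txt"
parts = filename.split(poo)
print(parts)                     # ['notes', '.txt']
```

Nothing here knows or cares what the character looks like; the picture only appears when a font draws it.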

Treating emoji as though they are a particular set of graphics rather defeats the point. At best, it confuses people’s understanding of what the heck is going on here, and I don’t much like that.

I’ve encountered people who genuinely believed that Apple’s emoji were some kind of official standard, and anyone deviating from them was somehow wrong. I wouldn’t be surprised if a lot of lay people believed Apple invented emoji. I can hardly blame them, when we have things like World Emoji Day, based on the date on Apple’s calendar glyph. This is not a good state of affairs.

Along the same lines, nothing defines an emoji, as I’ve mentioned before. Whether a particular character appears as a colored graphic is purely a property of the fonts you have installed. You could have a font that rendered all English text in sparkly purple letters, if you really wanted to. Or you could have a font that rendered emoji as simple black-and-white outlines like other characters — which is in fact what I have.

Well… that was true, but mere weeks before that post was published, the Unicode Consortium published a list of characters with a genuine “Emoji” property.

But, hang on. That list isn’t part of the actual Unicode database; it’s part of a “technical report”, which is informative only. In fact, if you look over the Unicode Technical Report on emoji, you may notice that the bulk of it is merely summarizing what’s being done in the wild. It’s not saying what you must do, only what’s already been done. The very first sentence even says that it’s about interoperability.

If that doesn’t convince you, consider that the list of “emoji” characters includes # and *. Yes, the ASCII characters on a regular qwerty keyboard. I don’t think this is a particularly good authoritative reference.

Speaking of which, the same list also contains ©, ®, and ™, and Twitter’s font has glyphs for all three of them: ©, ®, ™. They aren’t used on web Twitter, but if you naïvely dropped twemoji into your own project, you’d see these little superscript characters suddenly grow to fit large full-width squares. (Worse, all three of them are a single solid color, so they’ll be unreadable on a dark background.) There’s an excellent reason for this, believe it or not: Shift JIS doesn’t contain any of these characters, so the Japanese carriers faked it by including them as emoji.

Anyway, the technical report proper is a little more nuanced, breaking emoji into a few coarse groups based on who implements them. (Observe that it uses Apple’s font for all 1282 example emoji.)

I care about all this because I see an awful lot of tech people link this document as though it were a formal specification, which leads to a curious cycle.

  1. Apple does a thing with emoji.
  2. Because Apple is a major vendor, the thing it did is added to the technical report.
  3. Other people look at the report, believe it to be normative, and also do Apple’s thing because it’s “part of Unicode”.
  4. (Wow, Apple did this first again! They’re so ahead of the curve!)

After I wrote the above list, I accidentally bumbled upon this page from emojipedia, which states:

In addition to emojis approved in Unicode 8.0 (mid-2015), iOS 9.1 also includes emoji versions of characters all the way back to Unicode 1.1 (1993) that have retroactively been deemed worthy of emoji presentation by the Unicode Consortium.

That’s flat-out wrong. The Unicode Consortium has never deemed characters worthy of “emoji presentation” — it’s written reports about the characters that vendors like Apple have given colored glyphs. This paragraph congratulates Apple for having an emoji font that covers every single character Apple decided to put in their emoji font!

This is a great segue into what happened with Google’s recent update to its own emoji font.

Google’s emoji font changes

Android 6.0.1 was released in December 2015, and contained a long-overdue update to its emoji font, Noto Color Emoji. It added newly-defined emoji like 🌭 U+1F32D HOT DOG and 🦄 U+1F984 UNICORN FACE, so, that was pretty good.

ZWJ sequences

How is this a segue, you ask? Well, see, there are these curious chimeras called ZWJ sequences — effectively new emoji created by mashing multiple emoji together with a special “glue” character in the middle. Apple used (possibly invented?) this mechanism to create “diverse” versions of several emoji like 💏 U+1F48F KISS. The emoji for two women kissing looks like a single image, but it’s actually written as seven characters: woman + heart + kiss + woman with some glue between them. It’s a lot like those AIM smileys, only not ASCII under the hood.
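
To make the “seven characters” concrete, here is a Python sketch that builds the sequence by hand. One assumption to flag: sequences seen in the wild often also carry a variation selector (U+FE0F) after the heart, which would make eight code points; it is left out here to match the count given above.

```python
ZWJ = "\u200d"               # ZERO WIDTH JOINER, the "glue"
WOMAN = "\U0001F469"         # WOMAN
HEART = "\u2764"             # HEAVY BLACK HEART
KISS_MARK = "\U0001F48B"     # KISS MARK

# woman + heart + kiss + woman, with glue between each pair
two_women_kissing = ZWJ.join([WOMAN, HEART, KISS_MARK, WOMAN])

print(len(two_women_kissing))    # 7
code_points = ["U+%04X" % ord(c) for c in two_women_kissing]
print(code_points)
# ['U+1F469', 'U+200D', 'U+2764', 'U+200D', 'U+1F48B', 'U+200D', 'U+1F469']
```

A font that knows the sequence draws one image; a font that doesn't simply shows the individual emoji in a row.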

So, that’s fine, it makes sense, I guess. But then Apple added a new chimera emoji: a speech bubble with an eyeball in it, written as eye + speech bubble. It turned out to be some kind of symbol related to an anti-bullying campaign, dreamed up in conjunction with the Ad Council (?!). I’ve never seen it used and never heard about this campaign outside of being a huge Unicode nerd.

Lo and behold, it appeared in the updated font. And Twitter’s font. And Emoji One.

Is this how we want it to work? Apple is free to invent whatever it wants by mashing emoji together, and everyone else treats it as canonical, with no resistance whatsoever? Apple gets to deliberately circumvent the Unicode character process?

Apple appreciated the symbol, too. “When we first asked about bringing this emoji to the official Apple keyboard, they told us it would take at least a year or two to get it through and approved under Unicode,” says Wittmark. The company found a way to fast-track it, she says, by combining two existing emoji.

Maybe this is truly a worthy cause. I don’t know. All I know is that Apple added a character (designed by an ad agency) basically on a whim, and now it’s enshrined forever in Unicode documents. There doesn’t seem to be any real incentive for them to not do this again. I can’t wait for apple + laptop to become the MacBook Pro™ emoji.

(On the other hand, I can absolutely get behind ninja cat.)

Gender diversity

I take issue with using this mechanism for some of the “diverse” emoji as well. I didn’t even realize the problem until Google copied Apple’s implementation.

The basic emoji in question are 💏 U+1F48F KISS and 💑 U+1F491 COUPLE WITH HEART. The emoji technical report contains the following advice, emphasis mine:

Some multi-person groupings explicitly indicate gender: MAN AND WOMAN HOLDING HANDS, TWO MEN HOLDING HANDS, TWO WOMEN HOLDING HANDS. Others do not: KISS, COUPLE WITH HEART, FAMILY (the latter is also non-specific as to the number of adult and child members). While the default representation for the characters in the latter group should be gender-neutral, implementations may desire to provide (and users may desire to have available) multiple representations of each of these with a variety of more-specific gender combinations.

This reinforces the document’s general advice about gender which comes down to: if the name doesn’t explicitly reference gender, the image should be gender-neutral. Makes sense.

Here’s how 💏 U+1F48F KISS and 💑 U+1F491 COUPLE WITH HEART look, before and after the font update.

Pictured: straight people, ruining everything

Before, both images were gender-agnostic blobs. Now, with the increased “diversity”, you can choose from various combinations of genders… but the genderless version is gone. The default — what you get from the single characters on their own, without any chimera gluing stuff — is heteromance.

In fact, almost every major font does this for both KISS and COUPLE WITH HEART, save for Microsoft’s. (HTC’s KISS doesn’t, but only because it doesn’t show people at all.)

Google’s font has changed from “here are two people” to “heterosexuals are the default, but you can use some other particular combinations too”. This isn’t a step towards diversity; this is a step backwards. It also violates the advice in the very document that’s largely based on “whatever Apple and Google are doing”, which is confounding.

Sometimes, Apple is wrong

It also highlights another problem with treating Apple’s font as canonical, which is that Apple is occasionally wrong. I concede that “wrong” is a fuzzy concept here, but I think “surprising, given the name of the character” is a reasonable definition.

In that sense, everyone but Microsoft is wrong about 💏 U+1F48F KISS and 💑 U+1F491 COUPLE WITH HEART, since neither character mentions gender.

You might expect 🙌 U+1F64C PERSON RAISING BOTH HANDS IN CELEBRATION and 🙏 U+1F64F PERSON WITH FOLDED HANDS to depict people, but Apple only shows a pair of hands for both of them. This is particularly bad with PERSON WITH FOLDED HANDS, which just looks like a high five. Almost every other font has followed suit (CELEBRATION, FOLDED HANDS). Google used to get this right, but changed it with the update.

Celebration changed to pat-a-cake, for some reason

👿 U+1F47F IMP suggests, er, an imp, especially since it’s right next to other “monster” characters like 👾 U+1F47E ALIEN MONSTER and 👹 U+1F479 JAPANESE OGRE. Apple appears to have copied its own 😈 U+1F608 SMILING FACE WITH HORNS from the emoticons block and changed the smile to a frown, producing something I would never guess is meant to be an imp. Google followed suit, just like most other fonts, resulting in the tragic loss of one of my favorite Noto glyphs and the only generic representation of a demon.

This is going to wreak havoc on all my tweets about Doom

👯 U+1F46F WOMAN WITH BUNNY EARS suggests a woman. Apple has two, for some reason, though that hasn’t been copied quite as much.

⬜ U+2B1C WHITE LARGE SQUARE needs a little explanation. Before Unicode contained any emoji (several of which are named with explicit colors), quite a few character names used “black” to mean “filled” and “white” to mean “empty”, referring to how the character would look when printed in black ink on white paper. “White large square” really means the outline of a square, in contrast to ⬛ U+2B1B BLACK LARGE SQUARE, which is solid. Unfortunately, both of these characters somehow ended up in virtually every emoji font, despite not being in the original lists of Japanese carriers’ emoji… and everyone gets it wrong, save for Microsoft. Every single font shows a solid square colored white. Except Google, who colors it blue. And Facebook, who has some kind of window frame, which it colors black for the BLACK glyph.

When Apple screws up and doesn’t fix it, everyone else copies their screw-up for the sake of compatibility — and as far as I can tell, the only time Apple has ever changed emoji is for the addition of skin tones and when updating images of their own products. We’re letting Apple set a de facto standard for the appearance of text, even when they’re incorrect, because… well, I’m not even sure why.
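
The character names referred to throughout this section are part of the Unicode character database, so they can be looked up directly. A minimal Python sketch with the standard `unicodedata` module, assuming a Python version recent enough to know these characters:

```python
import unicodedata

# Code points discussed in this section.
chars = ["\U0001F48F", "\U0001F491", "\U0001F64F", "\U0001F47F", "\u2B1C", "\u2B1B"]

names = {}
for ch in chars:
    names[ch] = unicodedata.name(ch)
    print("U+%04X" % ord(ch), names[ch])

# A crude check for "does the name mention gender?"
gendered = [n for n in names.values() if "MAN" in n.split() or "WOMAN" in n.split()]
print(gendered)   # []
```

None of these names says anything about what the glyph must look like beyond the words themselves, which is the yardstick for “surprising, given the name of the character”.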

Hand gestures

Returning briefly to the idea of diversity, Google also updated the glyphs for its dozen or so “hand gesture” emoji:

Hmm I wonder where they got the inspiration for these

They used to be pink outlines with a flat white fill, but now are a more realistic flat style with the same yellow as the blob faces and shading. This is almost certainly for the sake of supporting the skin tone modifiers later, though Noto doesn’t actually support them yet.

The problem is, the new ones are much harder to tell apart at a glance! The shadows are very subtle, especially at small sizes, so they might as well all be yellow splats.

I always saw the old glyphs as abstract symbols, rather than a crop of a person, even a cartoony person. That might be because I’m white as hell, though. I don’t know. If people of color generally saw them the same way, it seems a shame to have made them all less distinct.

It’s not like the pink and white style would’ve prevented Noto from supporting skin tones in the future, either. Nothing says an emoji with a skin tone has to look exactly like the same emoji without one. The font could easily use the more abstract symbols by default, and switch to this more realistic style when combined with a skin tone.


And finally, some kind of tragic accident has made 💩 U+1F4A9 PILE OF POO turn super goofy and grow a face.

What even IS that now?

Why? Well, you see, Apple’s has a face. And so does almost everyone else’s, now.

I looked at the original draft proposal for this one, and SoftBank (the network the iPhone first launched on in Japan) also had a face for this character, whereas KDDI did not. So the true origin is probably just that one particular carrier happened to strike a deal to carry the iPhone first.

Interop and confusion

I’m sure the rationale for many of these changes was to reduce confusion when Android and iOS devices communicate. I’m sure plenty of people celebrated the changes on those grounds.

I was subscribed to several Android Telegram issues about emoji before the issue tracker was shut down, so I got a glimpse into how people feel about this. One person was particularly adamant that in general, the recipient should always see exactly the same image that the sender chose. Which sounds… like it’s asking for embedded images. Which Telegram supports. So maybe use those instead?

I grew up on the Internet, in a time when ^_^ looked terrible in mIRC’s default font of Fixedsys but just fine in PIRCH98. Some people used MS Comic Chat, which would try to encode actions in a way that looked like annoying noise to everyone else. Abbreviations were still a novelty, so you might not know what “ttfn” means.

Somehow, we all survived. We caught on, we asked for clarification, we learned the rules, and life went on. All human communication is ambiguous, so it baffles me when people bring up “there’s more than one emoji font” as though it spelled the end of civilization. Someone might read what you wrote and interpret it differently than you intended? Damn, that is definitely a new and serious problem that we have no idea how to handle.

To me, it sounds the way this would’ve sounded in 1998:

A: ^_^
B: Wow, that looks totally goofy over here. I’m using mIRC.
A: Oh, I see the problem. Every IRC client should use Arial, like PIRCH does.

That is, after all, the usual subtext: every font should just copy whatever Apple does. Let’s not.

Look, science!

Conveniently for me, someone just did a study on this. Here’s what I found most interesting:

Overall, we found that if you send an emoji across platform boundaries (e.g., an iPhone to a Nexus), the sender and the receiver will differ by about 2.04 points on average on our -5 to 5 sentiment scale. However, even within platforms, the average difference is 1.88 points.

In other words, people still interpret the same exact glyph differently — just like people sometimes interpret the same words differently.

The gap between same-glyph and different-glyph is a mere 0.16 points out of a 10-point scale, a mere 1.6%. The paper still concludes that the designs should move closer together, and sure, they totally should — towards what the characters describe.

To underscore that idea, note the summary page discusses U+1F601 😁 GRINNING FACE WITH SMILING EYES across five different fonts. Surely this should express something positive, right? Grins are positive, smiling eyes are positive; this might be the most positive face in Unicode. Indeed, every font was measured as expressing a very positive emotion, except Apple’s, which was apparently controversial but averaged out to slightly negative. Looking at the various renderings, I can totally see how Apple’s might be construed as a grimace.

So in the name of interoperability, what should font vendors do here? Push Apple (and Twitter and Facebook, by the look of it) to change their glyph? Or should everyone else change, so we end up in a world where two-thirds of people think “grinning face with smiling eyes” is expressing negativity?

A diversion: fonts

Perhaps the real problem here is font support itself.

You can’t install fonts or change default fonts on either iOS or Android (sans root). That Telegram developer who loves Apple’s emoji should absolutely be able to switch their Android devices to use Apple’s font… but that’s impossible.

It’s doubly impossible because of a teensy technical snag. You see,

  • Apple added support for embedding PNG images in an OpenType font to OS X and iOS.

  • Google added support for embedding PNG images in an OpenType font to FreeType, the font rendering library used on Linux and Android. But they did it differently from Apple.

  • Microsoft added support for color layers in OpenType, so all of its emoji are basically several different monochrome vector images colored and stacked together. It’s actually an interesting approach — it makes the font smaller, it allows pieces to be reused between characters, and it allows the same emoji to be rendered in different palettes on different background colors almost for free.

  • Mozilla went way out into the weeds and added support for embedding SVG in OpenType. If you’re using Firefox, please enjoy these animated emoji. Those are just the letter “o” in plain text — try highlighting or copy/pasting it. The animation is part of the font. (I don’t know whether this mechanism can adapt to the current font color, but these particular soccer balls do not.)

We have four separate ways to create an emoji font, all of them incompatible, none of them standard (yet? I think?). You can’t even make one set of images and save it as four separate fonts, because they’re all designed very differently: Apple and Google only support regular PNG images, Microsoft only supports stacked layers of solid colors, and Mozilla is ridiculously flexible but still prefers vectors. Apple and Google control the mobile market, so they’re likely to win in the end, which seems a shame since their approaches are the least flexible in terms of size and color and other text properties.

I don’t think most people have noticed this, partly because even desktop operating systems don’t have an obvious way to change the emoji font (so who would think to try?), and partly because emoji mostly crop up on desktops via web sites which can quietly substitute images (like Twitter and Slack do). It’s not a situation I’d like to see become permanent, though.

Consider, if you will, that making an emoji font is really hard — there are over 1200 high-resolution images to create, if you want to match Apple’s font. If you used any web forums or IM clients ten years ago, you’re probably also aware that most smiley packs are pretty bad. If you’re stuck on a platform where the default emoji font just horrifies you (for example), surely you’d like to be able to change the font system-wide.

Disconnecting the fonts from the platforms would actually make it easier to create a new emoji font, because the ability to install more than one side-by-side means that no one font would need to cover everything. You could make a font that provides all the facial expressions, and let someone else worry about the animals. Or you could make a font that provides ZWJ sequences for every combination of an animal face and a facial expression. (Yes, please.) Or you could make a font that turns names of Pokémon into ligatures, so e-e-v-e-e displays as (eevee icon), similar to how Sans Bullshit Sans works.

But no one can do any of this, so long as there’s no single extension that works everywhere.

(Also, for some reason, I’ve yet to get Google’s font to work anywhere in Linux. I’m sure there are some fascinating technical reasons, but the upshot is that Google’s browser doesn’t support Google’s emoji font using Google’s FreeType patch that implements Google’s own font extension. It’s been like this for years, and there’s been barely any movement on it, leaving Linux as the only remotely-major platform that can’t seem to natively render color emoji glyphs — even though Android can.)


Some miscellaneous thoughts:

  • I’m really glad that emoji have forced more developers to actually handle Unicode correctly. Having to deal with commonly-used characters outside of ASCII is a pretty big kick in the pants already, but most emoji are also in Plane 1, which means they don’t fit in a single JavaScript “character” — an issue that would otherwise be really easy to overlook. 💩 is

  • On the other hand, it’s a shame that the rise of emoji keyboards hasn’t necessarily made the rest of Unicode accessible. There are still plenty of common symbols, like ♫, that I can only type on my phone using the Japanese keyboard. I do finally have an input method on my desktop that lets me enter characters by name, which is nice. We’ve certainly improved since the olden days, when you just had to memorize that Alt0233 produced an é… or, wait, maybe English Windows users still have to do that.

  • Breadth of font support is still a problem outside of emoji, and in a plaintext environment there’s just no way to provide any fallback. Google’s Noto font family aspires to have full coverage — it’s named for “no tofu”, referring to the small boxes that often appear for undisplayable characters — but there are still quite a few gaps. Also, on Android, a character that you don’t have a font for just doesn’t appear at all, with no indication you’re missing anything. That’s one way to get no tofu, I guess.

  • Brands™ running ad campaigns revolving around emoji are probably the worst thing. Hey, if we had a standard way to make colored fonts, then Guinness could’ve just released a font with a darker 🍺 U+1F37A BEER MUG and 🍻 U+1F37B CLINKING BEER MUGS, rather than running a ridiculous ad campaign asking Unicode to add a stout emoji.

  • If you’re on a platform that doesn’t ship with an emoji font, you should really really get Symbola. It covers a vast swath of Unicode with regular old black-and-white vector glyphs, usually using the example glyphs from Unicode’s own documents.

  • The plural is “emoji”, dangit. ∎
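The Plane 1 point in the first bullet above is easy to check by hand. Here is a minimal sketch in Python (not JavaScript) that counts UTF-16 code units, which is roughly what a JavaScript string length reports. The helper names `utf16_code_units` and `surrogate_pair` are made up for illustration.

```python
def utf16_code_units(text):
    """Number of 16-bit code units needed to encode text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def surrogate_pair(char):
    """Split one character above U+FFFF into its high and low surrogates."""
    offset = ord(char) - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


if __name__ == "__main__":
    poo = "\U0001F4A9"  # PILE OF POO, a Plane 1 character
    print(utf16_code_units(poo))                 # two code units, not one
    print([hex(u) for u in surrogate_pair(poo)])
```

Anything in the Basic Multilingual Plane, like é, still fits in one code unit, which is why code that only ever sees such text can get away with treating a code unit as a character.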

Benchmarking Streaming Computation Engines at Yahoo!

Post Syndicated from revans2 original https://yahooeng.tumblr.com/post/135321837876

(Yahoo Storm Team in alphabetical order) Sanket Chintapalli, Derek Dagit, Bobby Evans, Reza Farivar, Tom Graves, Mark Holderbaugh, Zhuo Liu, Kyle Nusbaum, Kishorkumar Patil, Boyang Jerry Peng and Paul Poulosky.

DISCLAIMER: Dec 17th 2015 data-artisans has pointed out to us that we accidentally left on some debugging in the flink benchmark. So the flink numbers should not be directly compared to the storm and spark numbers. We will rerun and repost the numbers when we have fixed this.

UPDATE: Dec 18, 2015 there was a miscommunication and the code that was checked in was not the exact code we ran with for flink. The real code had the debugging removed. Data-Artisans has looked at the code and confirmed it and the current numbers are good. We will still rerun at some point soon.

Executive Summary – Due to a lack of real-world streaming benchmarks, we developed one to compare Apache Flink, Apache Storm and Apache Spark Streaming. Storm 0.10.0, 0.11.0-SNAPSHOT and Flink 0.10.1 show sub-second latencies at relatively high throughputs with Storm having the lowest 99th percentile latency. Spark streaming 1.5.1 supports high throughputs, but at a relatively higher latency.

At Yahoo, we have invested heavily in a number of open source big data platforms that we use daily to support our business. For streaming workloads, our platform of choice has been Apache Storm, which replaced our internally developed S4 platform. We have been using Storm extensively, and the number of nodes running Storm at Yahoo has now reached about 2,300 (and is still growing).

Since our initial decision to use Storm in 2012, the streaming landscape has changed drastically. There are now several other noteworthy competitors including Apache Flink, Apache Spark (Spark Streaming), Apache Samza, Apache Apex and Google Cloud Dataflow.
There is increasing confusion over which package offers the best set of features and which one performs better under which conditions (for instance see here, here, here, and here).

To provide the best streaming tools to our internal customers, we wanted to know what Storm is good at and where it needs to be improved compared to other systems. To do this we started to look for stream processing benchmarks that we could use to do this evaluation, but all of them were lacking in several fundamental areas. Primarily, they did not test with anything close to a real world use case. So we decided to write one and released it as open source https://github.com/yahoo/streaming-benchmarks. In our initial evaluation we decided to limit our test to three of the most popular and promising platforms (Storm, Flink and Spark), but welcome contributions for other systems, and to expand the scope of the benchmark.

Benchmark Design

The benchmark is a simple advertisement application. There are a number of advertising campaigns, and a number of advertisements for each campaign. The job of the benchmark is to read various JSON events from Kafka, identify the relevant events, and store a windowed count of relevant events per campaign into Redis. These steps attempt to probe some common operations performed on data streams.

The flow of operations is as follows (and shown in the following figure):

  • Read an event from Kafka.

  • Deserialize the JSON string.

  • Filter out irrelevant events (based on event_type field)

  • Take a projection of the relevant fields (ad_id and event_time)

  • Join each event by ad_id with its associated campaign_id. This information is stored in Redis.

  • Take a windowed count of events per campaign and store each window in Redis along with a timestamp of the time the window was last updated in Redis. This step must be able to handle late events.
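To make the six steps concrete, here is a minimal single-process sketch in Python. It is not the benchmark's code (that lives in the repository linked above). Several things here are assumptions for illustration: the function name `process`, the plain dict standing in for the Redis ad_id to campaign_id lookup, "view" as the relevant event type, and event_time as an integer number of milliseconds.

```python
import json
from collections import Counter

WINDOW_MS = 10_000  # assumed 10 second windows, expressed in milliseconds


def process(raw_events, ad_to_campaign, relevant_type="view"):
    """Run the six steps over an iterable of JSON strings.

    ad_to_campaign is a plain dict standing in for the Redis lookup.
    Returns a Counter keyed by (campaign_id, window_start_ms).
    """
    counts = Counter()
    for raw in raw_events:                        # 1. read an event
        event = json.loads(raw)                   # 2. deserialize the JSON string
        if event["event_type"] != relevant_type:  # 3. filter on event_type
            continue
        ad_id = event["ad_id"]                    # 4. project ad_id and event_time
        event_time = int(event["event_time"])
        campaign_id = ad_to_campaign.get(ad_id)   # 5. join ad_id -> campaign_id
        if campaign_id is None:
            continue
        # 6. windowed count: bucket by the start of the event's window
        window_start = event_time - event_time % WINDOW_MS
        counts[(campaign_id, window_start)] += 1
    return counts
```

Because the count is keyed by the window an event's own timestamp falls in, a late event simply increments an older window. A real engine would additionally write each window and its last-updated time to Redis.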
The input data has the following schema:

  • user_id: UUID

  • page_id: UUID

  • ad_id: UUID

  • ad_type: String in {banner, modal, sponsored-search, mail, mobile}

  • event_type: String in {view, click, purchase}

  • event_time: Timestamp

  • ip_address: String
Producers create events with timestamps marking creation time. Truncating this timestamp to a particular digit gives the begin-time of the time window the event belongs in. In Storm and Flink, updates to Redis are written periodically, but frequently enough to meet a chosen SLA. Our SLA was 1 second, so once per second we wrote updated windows to Redis. Spark operated slightly differently due to great differences in its design. There are more details on that in the Spark section. Along with the data, we record the time at which each window in Redis was last updated.

After each run, a utility reads windows from Redis and compares the windows’ times to their last_updated_at times, yielding a latency data point. Because the last event for a window cannot have been emitted after the window closed but will be very shortly before, the difference between a window’s time and its last_updated_at time minus its duration represents the time it took for the final tuple in a window to go from Kafka to Redis through the application.

window.final_event_latency = (window.last_updated_at - window.timestamp) - window.duration

This is a bit rough, but this benchmark was not intended to get fine-grained numbers on these engines, but to provide a more high-level view of their behavior.

Benchmark setup

  • 10 second windows

  • 1 second SLA

  • 100 campaigns

  • 10 ads per campaign

  • 5 Kafka nodes with 5 partitions

  • 1 Redis node

  • 10 worker nodes (not including coordination nodes like Storm’s Nimbus)

  • 5-10 Kafka producer nodes

  • 3 ZooKeeper nodes
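The window truncation and the final_event_latency formula given before the setup list can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and timestamps are assumed to be integer milliseconds.

```python
WINDOW_MS = 10_000  # 10 second windows from the setup list, in milliseconds


def window_start(event_time_ms, duration_ms=WINDOW_MS):
    """Truncate an event timestamp to the begin-time of its window."""
    return event_time_ms - event_time_ms % duration_ms


def final_event_latency(last_updated_at_ms, window_timestamp_ms,
                        duration_ms=WINDOW_MS):
    """(last_updated_at - timestamp) - duration, following the formula."""
    return (last_updated_at_ms - window_timestamp_ms) - duration_ms


if __name__ == "__main__":
    start = window_start(1_450_000_009_990)  # event created late in its window
    print(start)
    print(final_event_latency(1_450_000_010_400, start))
```

In this made-up example the window begins at 1,450,000,000,000 ms and was last updated 10,400 ms later, so the estimated latency of its final event is 400 ms.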
Since the Redis node in our architecture only performs in-memory lookups using a well-optimized hashing scheme, it did not become a bottleneck. The nodes are homogeneously configured, each with two Intel E5530 processors running at 2.4GHz, with a total of 16 cores (8 physical, 16 hyperthreading) per node. Each node has 24GiB of memory, and the machines are all located within the same rack, connected through a gigabit Ethernet switch. The cluster has a total of 40 nodes available.

We ran multiple instances of the Kafka producers to create the required load since individual producers begin to fall behind at around 17,000 events per second. In total, we use anywhere between 20 and 25 nodes in this benchmark.

The use of 10 workers for a topology is near the average number we see being used by topologies internal to Yahoo. Of course, our Storm clusters are larger in size, but they are multi-tenant and run many topologies.

To begin the benchmarks Kafka is cleared, Redis is populated with initial data (ad_id to campaign_id mapping), the streaming job is started, and then after a bit of time to let the job finish launching, the producers are started with instructions to produce events at a particular rate, giving the desired aggregate throughput. The system was left to run for 30 minutes before the producers were shut down. A few seconds were allowed for all events to be processed before the streaming job itself was stopped. The benchmark utility was then run to generate a file containing a list of window.last_updated_at - window.timestamp numbers. These files were saved for each throughput we tested and then were used to generate the charts in this document.

Flink

The benchmark for Flink was implemented in Java by using Flink’s DataStream API. The Flink DataStream API has many similarities to Storm’s streaming API. For both Flink and Storm, the dataflow can be represented as a directed graph. Each vertex is a user defined operator and each directed edge represents a flow of data. Storm’s API uses spouts and bolts as its operators while Flink uses map, flatMap, as well as many pre-built operators such as filter, project, and reduce. Flink uses a mechanism called checkpointing to guarantee processing which offers similar guarantees to Storm’s acking. Flink has checkpointing off by default and that is how we ran this benchmark. Notable configs we used in Flink are listed below:

  • taskmanager.heap.mb: 15360

  • taskmanager.numberOfTaskSlots: 16
The Flink version of the benchmark uses the FlinkKafkaConsumer to read data in from Kafka. The data read in from Kafka, which is a JSON-formatted string, is then deserialized and parsed by a custom defined flatMap operator. Once deserialized, the data is filtered via a custom defined filter operator. Afterwards, the filtered data is projected by using the project operator. From there, the data is joined with data in Redis by a custom defined flatMap function. Lastly, the final results are calculated from the data and written to Redis.

The rate at which Kafka emitted data events into the Flink benchmark is varied from 50,000 events/sec to 170,000 events/sec. For each Kafka emit rate, the percentile latency for a tuple to be completely processed in the Flink benchmark is illustrated in the graph below.

The percentile latency for all Kafka emit rates is roughly the same. The percentile latency rises linearly until around the 99th percentile, where the latency appears to increase exponentially.

Spark

For the Spark benchmark, the code was written in Scala. Since the micro-batching methodology of Spark is different than the pure streaming nature of Storm, we needed to rethink parts of the benchmark. Storm and Flink benchmarks would update the Redis database once a second to try and meet our SLA, keeping the intermediate update values in a local cache. As a result, the batch duration in the Spark streaming version was set to 1 second, at least for smaller amounts of traffic. We had to increase the batch duration for larger throughputs.

The benchmark is written in a typical Spark style using DStreams. DStreams are the streaming equivalent of regular RDDs, and create a separate RDD for every micro batch. Note that in the subsequent discussion, we use the term “RDD” instead of “DStream” to refer to the RDD representation of the DStream in the currently active microbatch. Processing begins with the direct Kafka consumer included with Spark 1.5. Since the Kafka input data in our benchmark is stored in 5 partitions, this Kafka consumer creates a DStream with 5 partitions as well. After that, a number of transformations are applied on the DStreams, including maps and filters. The transformation involving joining data with Redis is a special case. Since we do not want to create a separate connection to Redis for each record, we use a mapPartitions operation that can give control of a whole RDD partition to our code. This way, we create one connection to Redis and use this single connection to query information from Redis for all the events in that RDD partition.
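The one-connection-per-partition idea can be illustrated without a cluster. The sketch below is plain Python, not the benchmark's Scala code: `FakeLookupConnection` is a hypothetical stand-in for a Redis client, and `map_partitions` only loosely mimics what Spark's mapPartitions does by handing a whole partition to a function.

```python
class FakeLookupConnection:
    """Hypothetical stand-in for a Redis client; counts how often it is opened."""

    opened = 0

    def __init__(self, table):
        FakeLookupConnection.opened += 1
        self._table = table

    def get(self, key):
        return self._table.get(key)


def join_partition(events, table):
    """Join every (ad_id, event_time) in one partition over a single connection."""
    conn = FakeLookupConnection(table)  # opened once per partition, not per record
    for ad_id, event_time in events:
        campaign_id = conn.get(ad_id)
        if campaign_id is not None:
            yield campaign_id, event_time


def map_partitions(partitions, fn, *args):
    """Apply fn to each partition as a whole (a toy imitation of mapPartitions)."""
    return [list(fn(part, *args)) for part in partitions]


if __name__ == "__main__":
    parts = [[("a1", 1), ("a2", 2)], [("a1", 3), ("zz", 4)]]
    print(map_partitions(parts, join_partition, {"a1": "c1", "a2": "c2"}))
    print(FakeLookupConnection.opened)  # one connection per partition
```

With a per-record map, the connection count would grow with the number of events. Here it grows only with the number of partitions.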
The same approach is used later when we update the final results in Redis.

It should be noted that our writes to Redis were implemented as a side-effect of the execution of the RDD transformation in order to keep the benchmark simple, so this would not be compatible with exactly-once semantics.

We found that with high enough throughput, Spark was not able to keep up. At 100,000 messages per second the latency greatly increased. We considered adjustments along two control dimensions to help Spark cope with increasing throughput.

The first is the microbatch duration. This is a control dimension that is not present in a pure streaming system like Storm. Increasing the duration increases latency while reducing overhead and therefore increasing maximum throughput. The challenge is that finding the optimal batch duration, one that minimizes latency while allowing Spark to handle the throughput, is a time consuming process. Essentially, we have to set a batch duration, run the benchmark for 30 minutes, check the results and decrease/increase the duration.

The second dimension is parallelism. However, increasing parallelism is easier said than done in the case of Spark. For a true streaming system like Storm, one bolt instance can send its results to any number of subsequent bolt instances by using a random shuffle. To scale, one can increase the parallelism of the second bolt. In the case of a micro batch system like Spark, we need to perform a reshuffle operation similar to how intermediate data in a Hadoop MapReduce program are shuffled and merged across the cluster. But the reshuffling itself introduces considerable overhead. Initially, we thought our operations were CPU-bound, and so the benefits of reshuffling to a higher number of partitions would outweigh the cost of reshuffling. Instead, we found the bottleneck to be scheduling, and so reshuffling only added overhead. We suspect that at higher throughput rates or with operations that are CPU-bound, the reverse would be true.

The final results are interesting. There are essentially three behaviors for a Spark workload depending on the window duration. First, if the batch duration is set sufficiently large, the majority of the events will be handled within the current micro batch. The following figure shows the resulting percentile processing graph for this case (100K events, 10 seconds batch duration).
But whenever 90% of events are processed in the first batch, there is a possibility of improving latency. By reducing the batch duration sufficiently, we get into a region where the incoming events are processed within 3 or 4 subsequent batches. This is the second behavior, in which the batch duration puts the system on the verge of falling behind, but is still manageable, and results in better latency. This situation is shown in the following figure for a sample throughput rate (100K events, 3 seconds batch duration).

Finally, the third behavior is when Spark streaming falls behind. In this case, the benchmark takes a few minutes after the input data finishes to process all of the events. This situation is shown in the following figure. Under this undesirable operating region, Spark spills lots of data onto disks, and in extreme cases we could end up running out of disk space.

One final note is that we tried the new back pressure feature introduced in Spark 1.5. If the system is in the first operating region, enabling back pressure does nothing. In the second operating region, enabling back pressure results in longer latencies. The third operating region is where back pressure shows the most negative impact. It changes the batch length, but Spark still cannot cope with the throughput and falls behind. This is shown in the next figures. Our experiments showed that the current back pressure implementation did not help our benchmark, and as a result we disabled it.
Performance without back pressure (top), and with back pressure enabled (bottom). The latencies with the back pressure enabled are worse (70 seconds vs 120 seconds). Note that both of these results are unacceptable for a streaming system as both fall behind the incoming data. Batch duration was set to 2 seconds for each run, at a throughput of 130,000 events per second.

Storm

Storm’s benchmark was written using the Java API. We tested both Apache Storm 0.10.0 release and a 0.11.0 snapshot. The snapshot’s commit hash was a8d253a. One worker process per host was used, and each worker was given 16 tasks to run in 16 executors – one for each core.

Storm 0.10.0:

Storm 0.11.0:

Storm compared favorably to both Flink and Spark Streaming. Storm 0.11.0 beat Storm 0.10.0, showing the optimizations that have gone in since the 0.10.0 release. However, at high-throughput both versions of Storm struggled. Storm 0.10.0 was not able to handle throughputs above 135,000 events per second.

Storm 0.11.0 performed similarly until we disabled acking. In the benchmarking topology, acking was used for flow control but not for processing guarantees. In 0.11.0, Storm added a simple back pressure controller, allowing us to avoid the overhead of acking. With acking enabled, 0.11.0 performed terribly at 150,000/s: slightly better than 0.10.0, but still far worse than anything else. With acking disabled, Storm even beat Flink for latency at high throughput. However, with acking disabled, the ability to report and handle tuple failures is disabled also.

Conclusions and Future Work

It is interesting to compare the behavior of these three systems. Looking at the following figure, we can see that Storm and Flink both respond quite linearly. This is because these two systems try to process an incoming event as it becomes available. On the other hand, the Spark Streaming system behaves in a stepwise function, a direct result of its micro-batching design.

The throughput vs latency graph for the various systems is maybe the most revealing, as it summarizes our findings with this benchmark. Flink and Storm have very similar performance, and Spark Streaming, while it has much higher latency, is expected to be able to handle much higher throughput.

We did not include the results for Storm 0.10.0 and 0.11.0 with acking enabled beyond 135,000 events per second, because they could not keep up with the throughput. The resulting graph had the final point for Storm 0.10.0 in the 45,000 ms range, dwarfing every other line on the graph. The longer the topology ran, the higher the latencies got, indicating that it was losing ground.

All of these benchmarks except where otherwise noted were performed using default settings for Storm, Spark, and Flink, and we focused on writing correct, easy to understand programs without optimizing each to its full potential. Because of this, each of the six steps was a separate bolt or spout. Flink and Spark both do operator combining automatically, but Storm (without Trident) does not. What this means for Storm is that events go through many more steps and have a higher overhead compared to the other systems.

In addition to further optimizations to Storm, we would like to expand the benchmark in terms of functionality, and to include other stream processing systems like Samza and Apex. We would also like to take into account fault tolerance, processing guarantees, and resource utilization.

The bottom line for us is Storm did great. Writing topologies is simple, and it’s easy to get low latency comparable to or better than Flink up to fairly high throughputs. Without acking, Storm even beat Flink at very high throughput, and we expect that with further optimizations like combining bolts, more intelligent routing of tuples, and improved acking, Storm with acking enabled would compete with Flink at very high throughput too.

The competition between near real time streaming systems is heating up, and there is no clear winner at this point. Each of the platforms studied here has its advantages and disadvantages. Performance is but one factor among others, such as security or integration with tools and libraries. Active communities for these and other big data processing projects continue to innovate and benefit from each other’s advancements. We look forward to expanding this benchmark and testing newer releases of these systems as they come out.

How to handle "I want to be a security guy" with an easy assignment

Post Syndicated from David original http://feedproxy.google.com/~r/DevilsAdvocateSecurity/~3/6gQIEf5LNoM/how-to-handle-i-want-to-be-security-guy.html

As the manager of a security team I’m often approached by technologists who are interested in information security. Their reasons range from a long-term interest in the subject to simply wanting a change of pace, or thinking that the grass may just be greener in infosec.

Over the years I’ve developed a simple list of things that I tell people who express an interest:

  • Get a copy of Hacking Exposed. Anything recent will do, and a good alternative is Counter Hack Reloaded.

  • Skim the book, and read anything that catches your eye. Don’t try to read it cover to cover, unless you really find that you want to.

  • Come back and talk to me once you’ve done that, and we’ll talk about what you found interesting.

It’s a very simple process – but I’ve found it immensely valuable. Those who are really interested, and who will put the time into the effort will buy the book, and will come back with questions and comments. A certain percentage will get the book and will realize that information security isn’t really what they want to do, or they will realize that they need or want to know more before they tackle a career in security. A final group are interested, but not enough to take the step to follow up.

Once you have an interested candidate, the conversation or conversations that you can have next are far more interesting. Hopefully, you’ve read the book yourself, as you’ll be answering questions, and often providing references to deeper resources on the topics that interest them.

Favorite resources for follow-up activities include:

  • OWASP – particularly WebGoat and Mutillidae

  • Investigation of vulnerability scanners like Nikto and Nessus

  • Exploration of tools like Metasploit and the BeEF browser exploitation framework using DVL or a similar vulnerable OS

  • SANS courses like SANS 401 and 501

A whole range of options exists once you start to have the conversation – but you’re certain you’re having the conversation with someone who is interested enough to follow up, and who has helped you identify what they’ll have some passion for.

_uacct = “UA-1423386-1”;

How to handle "I want to be a security guy" with an easy assignment

Post Syndicated from David original http://feedproxy.google.com/~r/DevilsAdvocateSecurity/~3/6gQIEf5LNoM/how-to-handle-i-want-to-be-security-guy.html

As the manager of a security team, I'm often approached by technologists who are interested in information security. Their reasons range from a long-term interest in the subject to simply wanting a change of pace, or thinking that the grass may be greener in infosec.

Over the years I've developed a simple list of things that I tell people who express an interest:

  • Get a copy of Hacking Exposed. Anything recent will do, and a good alternative is Counter Hack Reloaded.
  • Skim the book, and read anything that catches your eye. Don't try to read it cover to cover unless you really find that you want to.
  • Come back and talk to me once you've done that, and we'll talk about what you found interesting.

It's a very simple process, but I've found it immensely valuable. Those who are really interested, and who will put in the time and effort, will buy the book and come back with questions and comments. A certain percentage will get the book and realize that information security isn't really what they want to do, or that they need or want to know more before they tackle a career in security. A final group is interested, but not enough to take the step of following up.

Once you have an interested candidate, the conversations that you can have next are far more interesting. Hopefully you've read the book yourself, as you'll be answering questions and often providing references to deeper resources on the topics that interest them.

Favorite resources for follow-up activities include:

  • OWASP, particularly WebGoat and Mutillidae
  • Investigation of vulnerability scanners like Nikto and Nessus
  • Exploration of tools like Metasploit and the BeEF browser exploitation framework, using DVL or a similar vulnerable OS
  • SANS courses like SANS 401 and 501

A whole range of options exists once you start to have the conversation, but by then you know you're having it with someone who is interested enough to follow up, and who has helped you identify what they'll have some passion for.
