Tag Archives: erts

How SmartNews Built a Lambda Architecture on AWS to Analyze Customer Behavior and Recommend Content

Post Syndicated from SmartNews original https://blogs.aws.amazon.com/bigdata/post/Tx2V1BSKGITCMTU/How-SmartNews-Built-a-Lambda-Architecture-on-AWS-to-Analyze-Customer-Behavior-an

This is a guest post by Takumi Sakamoto, a software engineer at SmartNews. SmartNews in their own words: "SmartNews is a machine learning-based news discovery app that delivers the very best stories on the Web for more than 18 million users worldwide."

Data processing is one of the key technologies for SmartNews. Every team’s workload involves data processing for various purposes. The news team at SmartNews uses data as input to their machine learning algorithm for delivering the very best stories on the Web. The product team relies on data to run various A/B tests, to learn about how our customers consume news articles, and to make product decisions.

To meet the goals of both teams, we built a sustainable data platform based on the lambda architecture, which is a data-processing framework that handles a massive amount of data and integrates batch and real-time processing within a single framework.

Thanks to AWS services and OSS technologies, our data platform is highly scalable and reliable, and is flexible enough to satisfy various requirements with minimum cost and effort.

Our current system generates tens of GBs of data from multiple data sources, and runs daily aggregation queries or machine learning algorithms on datasets with hundreds of GBs. Some outputs by machine learning algorithms are joined on data streams for gathering user feedback in near real-time (e.g. the last 5 minutes). It lets us adapt our product for users with minimum latency. In this post, I’ll show you how we built a SmartNews data platform on AWS.

The image below depicts the platform. Please scroll to see the full architecture.

Design principles

Before I dive into how we built our data platform, it’s important to know the design principles behind the architecture.

When we started to discuss the data platform, most data was stored in a document database. Although it was a good fit at product launch, it became painful with growth. For data platform maintainers, it was very expensive to store and serve data at scale. At that time, our system generated more than 10 GB of user activity records every day and processing time increased linearly. For data platform users, it was hard to try something new for data processing because of the database’s insufficient scalability and limited integration with the big data ecosystem. Obviously, it wasn’t sustainable for either group.

To make our data platform sustainable, we decided to completely separate the compute and storage layers. We adopted Amazon S3 for file storage and Amazon Kinesis Streams for stream storage. Both services replicate data into multiple Availability Zones and keep it available without high operation costs. We don’t have to pay much attention to the storage layer and we can focus on the computation layer that transforms raw data to a valuable output.

In addition, Amazon S3 and Amazon Kinesis Streams let us run multiple compute layers without complex negotiations. After data is stored, everyone can consume it in their own way. For example, if a team wants to try a new version of Spark, they can launch a new cluster and start to evaluate it immediately. That means every engineer in SmartNews can craft any solutions using whatever tools they feel are best suited to the task.

Input data

The first step is dispatching raw data to both the batch layer and the speed layer for processing. There are two types of data sources at SmartNews:

  • Groups of user activity logs generated from our mobile app
  • Various tables on Amazon RDS

User activity logs include more than 60 types of activities to understand user behavior such as which news articles are read. After we receive logs from the mobile app, all logs are passed to Fluentd, an OSS log collector, and forwarded to Amazon S3 and Amazon Kinesis Streams. If you are not familiar with Fluentd, see Store Apache Logs into Amazon S3 and Collect Log Files into Kinesis Stream in Real-Time to understand how Fluentd works.

Our recommended practice is adding the flush_at_shutdown parameter. If set to true, Fluentd waits for the buffer to flush at shutdown. Because our instances are scaled automatically, it’s important to store log files on Amazon S3 before terminating instances.

In addition, monitoring Fluentd status is important so that you know when bad things happen. We use Datadog and some Fluentd plugins. Because fluent-plugin-flowcounter counts incoming messages and bytes per second, we post these metrics to DogStatsD via fluent-plugin-dogstatsd. An example configuration is available in a GitHub Gist post.

After metrics are sent to Datadog, we can visualize aggregated metrics across any level that we choose. The following graph aggregates the number of records per data source.

Also, Datadog notifies us when things go wrong. The alerts in the figure below let us know that there have been no incoming records on an instance for the last 1 hour. We also monitor Fluentd’s buffer status by using Datadog’s Fluentd integration.
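The "no incoming records for the last hour" rule can be sketched in plain Python. This is only an illustration of the check itself, not Datadog's API or our actual monitor definition; the function name, the instance IDs, and the `last_record_at` mapping are hypothetical.

```python
from datetime import datetime, timedelta


def silent_instances(last_record_at, now, window=timedelta(hours=1)):
    """Return the instances whose most recent record is at least `window` old.

    last_record_at: dict mapping an instance id to the datetime of the last
    record received from it (hypothetical input, e.g. derived from metrics).
    """
    return sorted(
        instance
        for instance, seen in last_record_at.items()
        if now - seen >= window
    )


if __name__ == "__main__":
    now = datetime(2016, 1, 1, 12, 0)
    last_seen = {
        "i-aaaa": datetime(2016, 1, 1, 11, 30),  # active 30 minutes ago
        "i-bbbb": datetime(2016, 1, 1, 10, 45),  # silent for 75 minutes
    }
    print(silent_instances(last_seen, now))
```

In practice the monitoring service evaluates this kind of rule for you; the sketch only shows what the alert condition means.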

Various tables on Amazon RDS are dumped by Embulk, an OSS bulk data loader, and exported to Amazon S3. Its pluggable architecture lets us mask some fields that we don’t want to export to the data platform.

Batch layer

This layer is responsible for various ETL tasks such as transforming text files into columnar files (RCFile or ORCFile) for downstream consumers, generating machine learning features, and pre-computing the batch views.

We run multiple Amazon EMR clusters for each task. Amazon EMR lets us run multiple heterogeneous Hive and Spark clusters with a few clicks. Because all data is stored on Amazon S3, we can use Spot Instances for most tasks and adjust cluster capacity dynamically. It significantly reduces the cost of running our data processing system.

In addition to data processing itself, task management is very important for this layer. Although a cron scheduler is a good first solution, it becomes hard to maintain as the number of ETL tasks grows.

When using a cron scheduler, a developer needs to write additional code to handle dependencies such as waiting until the previous task is done, or failure handling such as retrying failed tasks or specifying timeouts for long-running tasks. We use Airflow, an open source task scheduler, to manage our ETL tasks. We can define ETL tasks and dependencies with Python scripts.

Because every task is described as code, we can introduce pull request–based review flows for modifying ETL tasks.
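To give a feel for the dependency and retry handling that a scheduler takes off your hands, here is a minimal, self-contained sketch in plain Python. It is not Airflow's API: the function names, the task names (`dump_rds`, `convert_to_orc`, `daily_aggregation`), and the retry count are all hypothetical. It assumes every task appears as a key in the dependency mapping.

```python
from collections import deque


def topological_order(dependencies):
    """Order tasks so that each one comes after all of its upstream tasks.

    dependencies: dict mapping a task name to the list of tasks it waits for.
    Every task, including ones with no upstream tasks, must be a key.
    """
    remaining = {task: set(ups) for task, ups in dependencies.items()}
    downstream = {task: [] for task in dependencies}
    for task, ups in dependencies.items():
        for up in ups:
            downstream[up].append(task)

    ready = deque(sorted(t for t, ups in remaining.items() if not ups))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in sorted(downstream[task]):
            remaining[child].discard(task)
            if not remaining[child]:
                ready.append(child)

    if len(order) != len(dependencies):
        raise ValueError("dependency cycle detected")
    return order


def run_with_retries(func, max_retries):
    """Call func, retrying up to max_retries times if it raises."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise


def run_pipeline(tasks, dependencies, max_retries=2):
    """Run every task in dependency order and return the order used."""
    order = topological_order(dependencies)
    for name in order:
        run_with_retries(tasks[name], max_retries)
    return order
```

With cron, each of these pieces ends up scattered across shell scripts. A scheduler lets you declare the tasks and their dependencies once and get ordering, retries, and timeouts from the tool.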

Serving layer

The serving layer indexes and exposes the views so that they can be queried.

We use Presto for this layer. Presto is an open source, distributed SQL query engine for running interactive queries against various data sources such as Hive tables on S3, MySQL on Amazon RDS, Amazon Redshift, and Amazon Kinesis Streams. Presto converts a SQL query into a series of task stages and processes each stage in parallel. Because all processing occurs in memory to reduce disk I/O, end-to-end latency is very low: ~30 seconds to scan billions of records.

With Presto, we can analyze the data from various perspectives. The following simplified query shows the result of A/B testing by user clusters.

```sql
-- Suppose that this table exists
DESC hive.default.user_activities;
user_id bigint
action  varchar
abtest  array<map<varchar,bigint>>
url     varchar

-- Summarize page view per A/B Test identifier
--   for comparing two algorithms v1 & v2
SELECT
  dt,
  t['behaviorId'],
  count(*) as pv
FROM hive.default.user_activities CROSS JOIN UNNEST(abtest) AS t (t)
WHERE dt like '2016-01-%' AND action = 'viewArticle'
  AND t['definitionId'] = 163
GROUP BY dt, t['behaviorId'] ORDER BY dt
;

-- Output:
-- 2016-01-01 | algorithm_v1 | 40000
-- 2016-01-01 | algorithm_v2 | 62000
```
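If `CROSS JOIN UNNEST` is unfamiliar, the query above can be read as "emit one row per element of the `abtest` array, then filter and count." The following plain-Python sketch mirrors that logic on a handful of made-up rows. The rows, the function name, and the string values used for `behaviorId` are illustrative assumptions, not real data or Presto code.

```python
from collections import Counter


def pageviews_per_behavior(rows, definition_id, dt_prefix):
    """Count article views per (dt, behaviorId) for one A/B test definition."""
    counts = Counter()
    for row in rows:
        # WHERE dt LIKE '<prefix>%' AND action = 'viewArticle'
        if row["action"] != "viewArticle" or not row["dt"].startswith(dt_prefix):
            continue
        # CROSS JOIN UNNEST(abtest): one output row per array element
        for t in row["abtest"]:
            if t["definitionId"] == definition_id:
                counts[(row["dt"], t["behaviorId"])] += 1
    # GROUP BY dt, behaviorId; sorted here by (dt, behaviorId)
    return sorted(counts.items())


if __name__ == "__main__":
    sample_rows = [
        {"dt": "2016-01-01", "action": "viewArticle",
         "abtest": [{"definitionId": 163, "behaviorId": "algorithm_v1"},
                    {"definitionId": 200, "behaviorId": "other_test"}]},
        {"dt": "2016-01-01", "action": "viewArticle",
         "abtest": [{"definitionId": 163, "behaviorId": "algorithm_v2"}]},
        {"dt": "2016-01-01", "action": "openApp",
         "abtest": [{"definitionId": 163, "behaviorId": "algorithm_v1"}]},
    ]
    for key, pv in pageviews_per_behavior(sample_rows, 163, "2016-01-"):
        print(key, pv)
```

A user enrolled in several tests contributes one row per test after unnesting, which is why the filter on `definitionId` has to come after the unnest step.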

Speed layer

Like the batch layer, the speed layer computes views from the data it receives. The difference is latency. Sometimes, the low latency adds valuable outputs for the product.

For example, we need to detect current trending news by interest-based clusters to deliver the best stories for each user. For this purpose, we run Spark Streaming.

User feedback in Amazon Kinesis Streams is joined with the interest-based user cluster data calculated by offline machine learning, and the result is output as metrics for each news article. These metrics are used to rank news articles in a later phase. What Spark Streaming does in the above figure looks something like the following:

```scala
def main(args: Array[String]): Unit = {
  // ..... (prepare SparkContext)

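  // Load user clusters that are generated by offline machine learning.
  // userClusterRDD is assumed to be a `var userClusterRDD: RDD[(Long, Int)]`
  // declared in the enclosing scope (elided here) so the join below can see it.
  if (needToUpdate) {
    userClusterRDD = sqlContext.sql(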
      "SELECT user_id, cluster_id FROM user_cluster"
    ).map( row => {
      (row.getLong(0), row.getInt(1))
    })
  }

  // Fetch and parse JSON records in Amazon Kinesis Streams
  val userPageviewStream: DStream[(Long, String)] = ssc.union(kinesisStreams)
    .map( byteArray => {
      val json = new String(byteArray)
      val userActivity = parse(json)
      (userActivity.user_id, userActivity.url)
    })

  // Join stream records with pre-calculated user clusters
  val clusterPageviewStream: DStream[(Int, String)] = userPageviewStream
    .transform( userPageviewStreamRDD => {
      userPageviewStreamRDD.join(userClusterRDD).map( data => {
        val (userId, (url, clusterId) ) = data
        (clusterId, url)
      })
    })

  // ..... (aggregates pageview by clusters and store to DynamoDB)
}
```

Because every EMR cluster uses the shared Hive metastore, Spark Streaming applications can load any table created on the batch layer by using SQLContext. After a table is loaded as an RDD (Resilient Distributed Dataset), we can join it to a Kinesis stream.
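Stripped of the Spark machinery, the join described above amounts to a lookup of each streamed page view in the pre-computed user-to-cluster table, followed by a count per cluster and URL. Here is a minimal sketch of one micro-batch in plain Python. It is an illustration under assumed inputs, not our Spark Streaming job, and the function names are hypothetical.

```python
from collections import Counter


def join_with_clusters(pageviews, user_clusters):
    """Inner-join (user_id, url) records with a user_id -> cluster_id mapping.

    Users without a known cluster are dropped, as in an inner join.
    """
    return [
        (user_clusters[user_id], url)
        for user_id, url in pageviews
        if user_id in user_clusters
    ]


def count_by_cluster(cluster_pageviews):
    """Aggregate page views per (cluster_id, url) pair."""
    return Counter(cluster_pageviews)


if __name__ == "__main__":
    # One micro-batch of stream records and the batch-layer cluster table
    batch = [(1, "/a"), (2, "/a"), (3, "/b"), (9, "/a")]
    clusters = {1: 10, 2: 10, 3: 20}
    print(count_by_cluster(join_with_clusters(batch, clusters)))
```

The real job does the same thing per batch interval, with the cluster table distributed as an RDD instead of held in a dictionary.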

Spark Streaming is a great tool for empowering your machine learning–based application, but it can be overkill for simpler use cases such as monitoring. For these cases, we use AWS Lambda and PipelineDB (not covered here in detail).

Output data

Chartio is a commercial business intelligence (BI) service. Chartio enables every member (including non-engineers!) in the company to create, edit, and refine beautiful dashboards with minimal effort. This has saved us hours each week so we can spend our time improving our product, not reporting on it. Because Chartio supports various data sources such as Amazon RDS (MySQL, PostgreSQL), Presto, PipelineDB, Amazon Redshift, and Amazon Elasticsearch, you can start using it easily.

Summary

In this post, I’ve shown you how SmartNews uses AWS services and OSS technologies to create a data platform that is highly scalable and reliable, and is flexible enough to satisfy various requirements with minimum cost and effort. If you’re interested in our data platform, check out these two slide decks on our SlideShare: Building a Sustainable Data Platform on AWS and Stream Processing in SmartNews.

If you have questions or suggestions, please leave a comment below.

Takumi Sakamoto is not an Amazon employee and does not represent Amazon.

———————————

Related

Building a Near Real-Time Discovery Platform with AWS

Want to learn more about Big Data or Streaming Data? Check out our Big Data and Streaming data educational pages.

 

 

Nine-year-old inventor’s award-winning asthma monitor

Post Syndicated from Liz Upton original https://www.raspberrypi.org/blog/nine-year-old-inventors-asthma-monitor/

We keep a very close eye on the annual Tech4Good competition, and especially the children who are nominated for their BT Young Pioneer award; there are some fiercely smart kids there doing some hugely impressive work. This year’s was a very close field (I would not like to have been judging – there were some extraordinary projects presented).

Tech4Good award winners 2016

Arnav Sharma, nine years old, was the Winner of Winners as well as the winner of the Young Pioneer section with this asthma monitor, which runs on Raspberry Pi. Arnav started by learning about the causes and effects of asthma, and thought about ways to help patients. He discovered that asthma is hard to diagnose, but can be fatal if left undetected. This leads to many children being over-diagnosed and over-medicated; inhalers are often given as treatment to reduce the symptoms of asthma, but come with side-effects like reduced growth and immunity. Arnav discovered that the best way to manage asthma is to prevent attacks by understanding what triggers asthma attacks and following a treatment plan.

AsthmaPi

Arnav’s AsthmaPi uses a Raspberry Pi, a Sense HAT, an MQ-135 Gas Sensor, a Sharp Optical Dust Sensor and an Arduino Uno. The sensors on the Sense HAT are used to measure temperature and humidity, while the MQ gas sensor detects nitrogen compounds, carbon dioxide, cigarette smoke, smog, ammonia and alcohol, all known asthma triggers. The dust sensor measures the size of dust particles and their density. The AsthmaPi is programmed in Python and C++, and triggers email and SMS text message alerts to remind the owner to take medication and to go for review visits.

Here’s Arnav’s very impressive project video, which will walk you through what he’s put together, and how it all works.

AsthmaPi Asthma Management Kit Arnav, Asthma, Allergy, Raspberry Pi, Dust Sensor, Gas Sensor

This is the video demo for the AsthmaPi: An affordable asthma management kit made by Arnav Sharma, aged 9, finalist of Tech For Good competition. Please tweet him at #T4GArnavSharma or visit his page here http://www.tech4goodawards.com/finalist/arnav-sharma/ or vote for him at http://www.tech4goodawards.com/peoples-award/ Thank you.

Well done Arnav!

The post Nine-year-old inventor’s award-winning asthma monitor appeared first on Raspberry Pi.

ERTS – Exploit Reliability Testing System

Post Syndicated from Darknet original http://feedproxy.google.com/~r/darknethackers/~3/heOaYUkdEdU/

ERTS or Exploit Reliability Testing System is a Python-based tool to calculate the reliability of an exploit based on the number of times the exploit is able to control the EIP register with the desired address/value. It’s created to help you code reliable exploits and take the manual parts out of running and re-running exploits […]

The post…

Read the full post at darknet.org.uk

Hey, Mac and iOS users: Make sure to back up before you upgrade!

Post Syndicated from Peter Cohen original https://www.backblaze.com/blog/os-upgrade-backup-plan/

Editor’s note: This article was originally published over the summer as a guideline for those dabbling with Apple’s public betas. It’s been updated to reflect the upcoming release of the new Mac and iOS versions.

New versions of Apple’s operating systems are coming to your iPhone and Mac later this month! iOS 10 will be released on September 13th, with macOS 10.12 “Sierra” coming a week later. If you’re planning to upgrade your Mac or iOS device with Apple’s newest software, you should make it a point to back up before you install anything new.

The new releases were announced in June at Apple’s annual Worldwide Developer Conference (WWDC) in San Francisco, which gathers thousands of Apple developers from around the world each year. It’s a familiar annual processional: Apple introduces new versions of both the Mac and iOS operating systems. They’re tested by developers and the public throughout the summer. At its recent iPhone 7 event, Apple announced the release dates for iOS 10 and macOS 10.12.

Here’s a rundown of some of the cool stuff coming with these new releases.

macOS

With this release Apple is rebranding the OS X operating system as macOS to keep it consistent with iOS, tvOS and watchOS. macOS’s new tentpole feature is Siri support, so you can talk to your computer the same way you talk to your phone. A lot of other new features have been added, too, including Apple Pay support and the ability to unlock your Mac using your Apple Watch. Some exciting under-the-hood changes to the operating system provide more optimized storage and seamless cloud transfer, which we’ll have more to say about later.

macOS Sierra

Up until a few years ago only registered developers could gain access to the new operating system software ahead of everyone else. Apple has loosened up the reins by expanding its Apple Beta Software Program to regular civilians, not just Apple experts and pros.

That means a lot more people than ever are using pre-release versions of iOS and macOS. Apple makes you wade through pages of legalese jargon and it’s easy to get glassy-eyed at all the stuff they throw at you. So if this is your first time, please keep some things in mind before you get rolling with the new software.

Back up early and often

Changing your Mac or iPhone’s operating system isn’t like installing a new version of an app, even though Apple has tried to make it a relatively simple process. Operating system software is essential software for these devices, and how it works has a cascading effect on all the other apps and services you depend on.

Sometimes features and services you find absolutely necessary are left out by omission, sometimes by accident, sometimes by circumstance. And that can (and does) change from pre-release build to pre-release build. The bottom line is you want to be prepared if something goes drastically wrong, and you want to be inconvenienced as little as possible when something does.

One way you can do that is to make sure you have a restore point you can recover from before upgrading your system with new, unproven software. That way, if things go awry – and in pre-release days they often do – you can get your system back to working order and be none the worse for wear.

If you’re not currently backing up, it’s easy to get started using our 3-2-1 Backup Strategy. The idea behind the 3-2-1 Backup Strategy is that there should be three copies of your data: The main one you use, a local backup copy, and a remote copy, stored at a secure offsite data center like Backblaze. It’s served us and thousands of our customers very well over the years, so we recommend it unabashedly. Also check out our Mac Backup Guide.
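If you like to see rules written down as code, the 3-2-1 idea as described above (the copy you use, a local backup, and a remote backup) can be expressed as a tiny check. This is just an illustrative sketch in Python; the function name and the `kind` labels are made up for the example and are not part of any Backblaze tool.

```python
REQUIRED_KINDS = {"primary", "local_backup", "remote_backup"}


def satisfies_3_2_1(copies):
    """Return True if there are at least three copies of the data and they
    include the working copy, a local backup, and a remote backup.

    copies: list of dicts, each with a 'kind' key (hypothetical labels).
    """
    kinds = {copy["kind"] for copy in copies}
    return len(copies) >= 3 and REQUIRED_KINDS <= kinds


if __name__ == "__main__":
    my_copies = [
        {"kind": "primary", "where": "Mac internal drive"},
        {"kind": "local_backup", "where": "external hard drive"},
    ]
    # False until a remote copy is added
    print(satisfies_3_2_1(my_copies))
```

If the check comes back false before an upgrade, that is the gap to close first.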

Don’t use your only hardware

It’s a really bad idea to install early release software on any computer or device you absolutely need. If you only have one Mac or one iPhone, I’d seriously reconsider installing any kind of pre-release software on it, unless you know you can live without it for however many hours you’ll need to restore it to working condition.

It’s a good idea to use beta operating system software only on a spare machine you can afford to lose for a while if you need to reset or reinstall. Especially in the early days, you can never count on things working quite as they should.

At the very least, in the case of your Mac, I’d strongly consider having a spare external hard drive to use. You can set it up as a bootable drive with the new operating system on it – simply attach the drive, turn on your Mac and hold down the option key on the keyboard to select the external drive.

Some users repartition their Mac startup disks with a second partition that they use for pre-release software. I stay away from this method as recovering from it sometimes requires resorting to command line work in macOS to restore things to where they should be.

If you plan to use a pre-release version of iOS, tvOS or watchOS on any of your devices, it’d be wise to limit your use to spare devices only. Older iPads and iPhones will work (within the limit of what iOS 10 supports – Apple has posted system requirements), or you can pick up an iPod touch for $200 and have a very nice little testbed for iOS 10.

Your patience will pay off

This week saw the release of the first public beta release of the new operating systems. These are intended to give early adopters first crack at the new software, and for Apple to shake down the changes. While the new features are cool, just remember that this is still very much a work in progress. Expect problems if you decide to install the software.

If you have only a single device or computer with which to use this stuff, the general release, or even some of the later public betas, will be a better time to upgrade than right now.

Even then though, the same rules apply – please make sure to back up all of your systems before installing operating system software, even release software. Better safe than sorry, especially where the safety and security of your data is concerned.

The post Hey, Mac and iOS users: Make sure to back up before you upgrade! appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Case 225: The Three Most Terrifying Words

Post Syndicated from The Codeless Code original http://thecodelesscode.com/case/225

The nun Hwídah was eating lunch with her clan when a
senior monk approached, seeking her aid with a production issue.
Not wishing to disturb the others, the senior monk bent down
to whisper in Hwídah’s ear.

“Ah,” said Hwídah, rising from the table.
“The three most terrifying words.”
Immediately she departed with the senior monk.

A novice who witnessed this exchange happened upon
the senior monk that evening. The novice asked,
“What were those ‘three most terrifying words’?”

The senior monk replied, “Possible race condition.”

The novice thought a moment and said brightly,
“Tell Hwídah that those cannot be
the ‘three most terrifying words’,
for the words ‘Definite race condition’
would be even more terrifying!”

The senior monk laughed and continued on his way.

- - -

That night the novice fell into long, terrible nightmares
from which he was unable to wake. After what seemed like an
eternity he came to his senses, twisted up inside his own
bedsheets.

Tossing off the mangled covers, the novice found himself
alone in the middle of a featureless desert.
An empty sedative bottle lay on the sand nearby.
Tied to it was a tightly-folded map of the whole world:
all its continents and its mountains and its many, many deserts.

Inside one desert was a tiny red dot, pointed to by a tiny
red arrow, next to which was some tiny red text in Hwídah’s
handwriting which read, “Possibly your location”.

On journeys

Post Syndicated from Michal Zalewski original http://lcamtuf.blogspot.com/2015/03/on-journeys.html

– 1 –

Poland is an ancient country whose history is deeply intertwined with that of Western civilization. In its glory days, the Polish-Lithuanian Commonwealth sprawled across vast expanses of land in central Europe, from the Black Sea to the Baltic Sea. But over the past two centuries, it suffered a series of military defeats and political partitions at the hands of its closest neighbors: Russia, Austria, Prussia, and – later – Germany.

After more than a hundred years of foreign rule, Poland re-emerged as an independent state in 1918, only to face the armies of Nazi Germany at the onset of World War II. With Poland’s European allies reneging on their earlier military guarantees, the fierce fighting left the country in ruins. Some six million people died within its borders – more than ten times the death toll in France or in the UK. Warsaw was reduced to a sea of rubble, with perhaps one in ten buildings still standing by the end of the war.

With the collapse of the Third Reich, Franklin D. Roosevelt, Winston Churchill, and Joseph Stalin held a meeting in Yalta to decide the new order for war-torn Europe. At Stalin’s behest, Poland and its neighboring countries were placed under Soviet political and military control, forming what has become known as the Eastern Bloc.

Over the next several decades, the Soviet satellite states experienced widespread repression and economic decline. But weakened by the expense of the Cold War, the communist chokehold on the region eventually began to wane. In Poland, even the introduction of martial law in 1981 could not put an end to sweeping labor unrest. Narrowly dodging the specter of Soviet intervention, the country regained its independence in 1989 and elected its first democratic government; many other Eastern Bloc countries soon followed suit.

Ever since then, Poland has enjoyed a period of unprecedented growth and has emerged as one of the more robust capitalist democracies in the region. In just two decades, it shed many of its backward, state-run heavy industries and adopted a modern, service-oriented economy. But the effects of the devastating war and the lost decades under communist rule still linger on – whether you look at the country’s infrastructure, at its socrealist cityscapes, at its political traditions, or at the depressingly low median wage.

When thinking about the American involvement in the Cold War, people around the world may recall Vietnam, Bay of Pigs, or the proxy wars fought in the Middle East. But in Poland and many of its neighboring states, the picture you remember the most is the fall of the Berlin Wall.

– 2 –

I was born in Warsaw in the winter of 1981, at the onset of martial law, with armored vehicles rolling onto Polish streets. My mother, like many of her generation, moved to the capital in the sixties as a part of an effort to rebuild and repopulate the war-torn city. My grandma would tell eerie stories of Germans and Soviets marching through their home village somewhere in the west. I liked listening to the stories; almost every family in Poland had some to tell.

I did not get to know my father. I knew his name; he was a noted cinematographer who worked on big-ticket productions back in the day. He left my mother when I was very young and never showed interest in staying in touch. He had a wife and other children, so it might have been that.

Compared to him, mom hasn’t done well for herself. We ended up in social housing in one of the worst parts of the city, on the right bank of the Vistula river. My early memories from school are that of classmates sniffing glue from crumpled grocery bags. I remember my family waiting in lines for rationed toilet paper and meat. As a kid, you don’t think about it much.

The fall of communism came suddenly. I have a memory of grandma listening to broadcasts from Radio Free Europe, but I did not understand what they were all about. I remember my family cheering one afternoon, transfixed to a black-and-white TV screen. I recall my Russian language class morphing into English; I had my first taste of bananas and grapefruits. There is the image of the monument of Feliks Dzierżyński coming down. I remember being able to go to a better school on the other side of Warsaw – and getting mugged many times on the way.

The transformation brought great wealth to some, but many others have struggled to find their place in the fledgling and sometimes ruthless capitalist economy. Well-educated and well read, my mom ended up in the latter pack, at times barely making ends meet. I think she was in part a victim of circumstance, and in part a slave to a way of thinking that did not permit the possibility of taking chances or pursuing happiness.

– 3 –

Mother always frowned upon popular culture, seeing it as unworthy of an educated mind. For a time, she insisted that I only listen to classical music. She angrily shunned video games, comic books, and cartoons. I think she perceived technology as trivia; the only field of science she held in high regard was abstract mathematics, perhaps for its detachment from the mundane world. She hoped that I would learn Latin, a language she could read and write; that I would practice drawing and painting; or that I would read more of the classics of modernist literature.

Of course, I did almost none of that. I hid my grunge rock tapes between Tchaikovsky, listened to the radio under the sheets, and watched the reruns of The A-Team while waiting for her to come back from work. I liked electronics and chemistry a lot more than math. And when I laid my hands on my first computer – an 8-bit relic of British engineering from 1982 – I soon knew that these machines, in their incredible complexity and flexibility, were what I wanted to spend my time on.

I suspected I could become a competent programmer, but never had enough faith in my skill. Yet, in learning about computers, I realized that I had a knack for understanding complex systems and poking holes in how they work. With a couple of friends, we joined the nascent information security community in Europe, comparing notes on mailing lists. Before long, we were taking on serious consulting projects for banks and the government – usually on weekends and after school, but sometimes skipping a class or two. Well, sometimes more than that.

All of a sudden, I was facing an odd choice. I could stop, stay in school and try to get a degree – going back every night to a cramped apartment, my mom sleeping on a folding bed in the kitchen, my personal space limited to a bare futon and a tiny desk. Or, I could seize the moment and try to make it on my own, without hoping that one day, my family would be able to give me a head start.

I moved out, dropped out of school, and took on a full-time job. It paid somewhere around $12,000 a year – a pittance anywhere west of the border, but a solid wage in Poland even today. Not much later, I was making two times as much, about the upper end of what one could hope for in this line of work. I promised myself to keep taking courses after hours, but I wasn’t good at sticking to the plan. I moved in with my girlfriend, and at the age of 19, I felt for the first time that things were going to be all right.

– 4 –

Growing up in Europe, you get used to the barrage of low-brow swipes taken at the United States. Your local news will never pass up the opportunity to snicker about the advances of creationism somewhere in Kentucky. You can stay tuned for a panel of experts telling you about the vastly inferior schools, the medieval justice system, and the striking social inequality on the other side of the pond. You don’t doubt their words – but deep down inside, no matter how smug the critics are, or how seemingly convincing their arguments, the American culture still draws you in.

My moment of truth came in the summer of 2000. A company from Boston asked me if I’d like to talk about a position on their research team; I looked at the five-digit figure and could not believe my luck. Moving to the US was an unreasonable risk for a kid who could barely speak English and had no safety net to fall back to. But that did not matter: I knew I had no prospects of financial independence in Poland – and besides, I simply needed to experience the New World through my own eyes.

Of course, even with a job offer in hand, getting into the United States is not an easy task. An engineering degree and a willing employer open up a straightforward path; it is simple enough that some companies would abuse the process to source cheap labor for menial, low-level jobs. With a visa tied to the petitioning company, such captive employees could not seek better wages or more rewarding work.

But without a degree, the options shrink drastically. For me, the only route would be a seldom-granted visa reserved for extraordinary skill – meant for the recipients of the Nobel Prize and other folks who truly stand out in their field of expertise. The attorneys looked over my publication record, citations, and the supporting letters from other well-known people in the field. Especially given my age, they thought we had a good shot. A few stressful months later, it turned out that they were right.

On the week of my twentieth birthday, I packed two suitcases and boarded a plane to Boston. My girlfriend joined me, miraculously securing a scholarship at a local university to continue her physics degree; her father helped her with some of the costs. We had no idea what we were doing; we had perhaps a few hundred bucks on us, enough to get us through the first couple of days. Four thousand miles away from our place of birth, we were starting a brand new life.

– 5 –

The cultural shock gets you, but not in the sense you imagine. You expect big contrasts, a single eye-opening day to remember for the rest of your life. But driving down a highway in the middle of a New England winter, I couldn’t believe how ordinary the world looked: just trees, boxy buildings, and pavements blanketed with dirty snow.

Instead of a moment of awe, you drown in a sea of small, inconsequential things, draining your energy and making you feel helpless and lost. It’s how you turn on the shower; it’s where you can find a grocery store; it’s what they meant by that incessant “paper or plastic” question at the checkout line. It’s how you get a mailbox key, how you make international calls, it’s how you pay your bills with a check. It’s the rules at the roundabout, it’s your social security number, it’s picking the right toll lane, it’s getting your laundry done. It’s setting up a dial-up account and finding the food you like in the sea of unfamiliar brands. It’s doing all this without Google Maps or a Facebook group to connect with other expats nearby.

The other thing you don’t expect is losing touch with your old friends; you can call or e-mail them every day, but your social frames of reference begin to drift apart, leaving less and less to talk about. The acquaintances you make in the office will probably never replace the folks you grew up with. We managed, but we weren’t prepared for that.

– 6 –

In the summer, we had friends from Poland staying over for a couple of weeks. By the end of their trip, they asked to visit New York City one more time; we liked the Big Apple, so we took them on a familiar ride down I-95. One of them went to see the top of the World Trade Center; the rest of us just walked around, grabbing something to eat before we all headed back. A few days later, we were all standing in front of a TV, watching September 11 unfold in real time.

We felt horror and outrage. But when we roamed the unsettlingly quiet streets of Boston, greeted by flags and cardboard signs urging American drivers to honk, we understood that we were strangers a long way from home – and that our future in this country hung in the balance more than we would have thought.

Permanent residency is a status that gives a foreigner the right to live in the US and do almost anything they please – change jobs, start a business, or live off one’s savings all the same. For many immigrants, the pursuit of this privilege can take a decade or more; for some others, it stays forever out of reach, forcing them to abandon the country in a matter of days as their visas expire or companies fold. With my O-1 visa, I always counted myself among the lucky ones. Sure, it tied me to an employer, but I figured that sorting it out wouldn’t be a big deal.

That proved to be a mistake. In the wake of 9/11, an agency known as the Immigration and Naturalization Service was being dismantled and replaced by a division within the Department of Homeland Security. My own seemingly straightforward immigration petition ended up somewhere in the bureaucratic vacuum that formed in between the two administrative bodies. I waited patiently, watching the deepening market slump, and seeing my employer’s prospects get dimmer and dimmer every month. I was ready for the inevitable, with other offers in hand, prepared to make my move perhaps the very first moment I could. But the paperwork just would not come through. With the Boston office finally shutting down, we packed our bags and booked flights. We faced the painful admission that for three years, we chased nothing but a pipe dream. The only things we had to show for it were two adopted cats, now sitting frightened somewhere in the cargo hold.

The now-worthless approval came through two months later; the lawyers, cheerful as ever, were happy to send me a scan. The hollowed-out remnants of my former employer were eventually bought by Symantec – the very company whose backup offer I had in hand.

– 7 –

In a way, Europe’s obsession with America’s flaws made it easier to come home without ever explaining how the adventure really played out. When asked, I could just wing it: a mention of the death penalty or permissive gun laws would always get you a knowing nod, allowing the conversation to move on.

Playing to other people’s preconceptions takes little effort; lying to yourself calls for more skill. It doesn’t help that when you come back after three years away from home, you notice all the small annoyances that you used to simply tune out. Back then, Warsaw still had a run-down vibe: the dilapidated road from the airport; the drab buildings on the other side of the river; the uneven pavements littered with dog poop; the dirty walls at my mother’s place, with barely any space to turn. You can live with it, of course – but it’s a reminder that you settled for less, and it’s a sensation that follows you every step of the way.

But more than the sights, I couldn’t forgive myself something else: that I was coming back home with just loose change in my pocket. There are some things that a failed communist state won’t teach you, and personal finance is one of them; I always looked at money just as a reward for work, something you get to spend to brighten your day. The indulgences were never extravagant: perhaps I would take the cab more often, or have take-out every day. But no matter how much I made, I kept living paycheck-to-paycheck – the only way I knew, the way our family always did.

– 8 –

With a three-year stint in the US on your resume, you don’t have a hard time finding a job in Poland. You face the music in a different way. I ended up with a salary around a fourth of what I used to make in Massachusetts, but I simply decided not to think about it much. I wanted to settle down, work on interesting projects, marry my girlfriend, have a child. I started doing consulting work whenever I could, setting almost all the proceeds aside.

After four years with T-Mobile in Poland, I had enough saved to get us through a year or so – and in a way, it changed the way I looked at my work. Being able to take on ambitious challenges and learn new things started to matter more than jumping ships for a modest salary bump. Burned by the folly of pursuing riches in a foreign land, I put a premium on boring professional growth.

Comically, all this introspection made me realize that from where I stood, I had almost nowhere left to go. Sure, Poland had telcos, refineries, banks – but they all consumed the technologies developed elsewhere, shipped here in a shrink-wrapped box; as far as their IT went, you could hardly tell the companies apart. To be a part of the cutting edge, you had to pack your bags, book a flight, and take a jump into the unknown. I sure as heck wasn’t ready for that again.

And then, out of the blue, Google swooped in with an offer to work for them from the comfort of my home, dialing in for a videoconference every now and then. The starting pay was about the same, but I had no second thoughts. I didn’t say it out loud, but deep down inside, I already knew what needed to happen next.

– 9 –

We moved back to the US in 2009, two years after taking the job, already on the hook for a good chunk of Google’s product security and with the comfort of knowing where we stood. In a sense, my motive was petty: you could call it a desire to vindicate a failed adolescent dream. But in many other ways, I have grown fond of the country that shunned us once before; and I wanted our children to grow up without ever having to face the tough choices and the uncertain prospects I had to deal with in my earlier years.

This time, we knew exactly what to do: a quick stop at a grocery store on the way from the airport, followed by an e-mail to our immigration folks to get the green card paperwork out the door. A bit more than half a decade later, we were standing in a theater in Campbell, reciting the Oath of Allegiance and clinging on to our new certificates of US citizenship.

The ceremony closed a long and interesting chapter in my life. But more importantly, standing in that hall with people from all over the globe made me realize that my story is not extraordinary; many of them had lived through experiences far more harrowing and captivating than mine. If anything, my tale is hard to tell apart from that of countless other immigrants from the former Eastern Bloc. By some estimates, in the US alone, the Polish diaspora is about 9 million strong.

I know that the Poland of today is not the Poland I grew up in. It’s not even the Poland I came back to in 2003; the gap to Western Europe is shrinking every single year. But I am grateful to now live in a country that welcomes more immigrants than any other place on Earth – and at the end of their journey, makes many of them feel at home. It also makes me realize how small and misguided must be the conversations we are having about immigration – not just here, but all over the developed world.


systemd for Administrators, Part XII

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/security.html

Here’s the twelfth installment of my ongoing series on systemd for Administrators:

Securing Your Services

One of the core features of Unix systems is the idea of privilege separation
between the different components of the OS. Many system services run under
their own user IDs thus limiting what they can do, and hence the impact they
may have on the OS in case they get exploited.

This kind of privilege separation only provides very basic protection
however, since in general system services run this way can still do at least as
much as a normal local user, though not as much as root. For security purposes
it is however very interesting to limit even further what services can do, and
shut them off from a couple of things that normal users are allowed to do.

A great way to limit the impact of services is by employing MAC technologies
such as SELinux. If you are interested in locking down your server, running
SELinux is a very good idea. systemd enables developers and administrators to
apply additional restrictions to local services independently of a MAC. Thus,
regardless of whether you are able to make use of SELinux, you may still enforce
certain security limits on your services.

In this iteration of the series we want to focus on a couple of these
security features of systemd and how to make use of them in your services.
These features take advantage of a couple of Linux-specific technologies that have
been available in the kernel for a long time, but never have been exposed in a
widely usable fashion. These systemd features have been designed to be as easy to use
as possible, in order to make them attractive to administrators and upstream
developers:

  • Isolating services from the network
  • Service-private /tmp
  • Making directories appear read-only or inaccessible to services
  • Taking away capabilities from services
  • Disallowing forking, limiting file creation for services
  • Controlling device node access of services

All options described here are documented in systemd’s man pages, notably systemd.exec(5).
Please consult these man pages for further details.

All these options are available on all systemd systems, regardless of whether
SELinux or any other MAC is enabled.

All these options are relatively cheap, so if in doubt use them. Even if you
might think that your service doesn’t write to /tmp and hence enabling
PrivateTmp=yes (as described below) might not be necessary, due to
today’s complex software it’s still beneficial to enable this feature, simply
because libraries you link to (and plug-ins to those libraries) which you do
not control might need temporary files after all. Example: you never know what
kind of NSS module your local installation has enabled, and what that NSS module
does with /tmp.

These options are hopefully interesting both for administrators to secure
their local systems, and for upstream developers to ship their services secure
by default. We strongly encourage upstream developers to consider using these
options by default in their upstream service units. They are very easy to make
use of and have major benefits for security.

Isolating Services from the Network

A very simple but powerful configuration option you may use in systemd
service definitions is PrivateNetwork=:

...
[Service]
ExecStart=...
PrivateNetwork=yes
...

With this simple switch a service and all the processes it consists of are
entirely disconnected from any kind of networking. Network interfaces become
unavailable to the processes, the only one they’ll see is the loopback device
“lo”, but it is isolated from the real host loopback. This is a very powerful
protection from network attacks.

Caveat: Some services require the network to be operational. Of
course, nobody would consider using PrivateNetwork=yes on a
network-facing service such as Apache. However even for non-network-facing
services network support might be necessary and not always obvious. Example: if
the local system is configured for an LDAP-based user database doing glibc name
lookups with calls such as getpwnam() might end up resulting in network access.
That said, even in those cases it is more often than not OK to use
PrivateNetwork=yes since user IDs of system service users are required to
be resolvable even without any network around. That means as long as the only
user IDs your service needs to resolve are below the magic 1000 boundary using
PrivateNetwork=yes should be OK.

Internally, this feature makes use of network namespaces of the kernel. If
enabled a new network namespace is opened and only the loopback device
configured in it.

Service-Private /tmp

Another very simple but powerful configuration switch is
PrivateTmp=:

...
[Service]
ExecStart=...
PrivateTmp=yes
...

If enabled this option will ensure that the /tmp directory the
service will see is private and isolated from the host system’s /tmp.
/tmp traditionally has been a shared space for all local services and
users. Over the years it has been a major source of security problems for a
multitude of services. Symlink attacks and DoS vulnerabilities due to guessable
/tmp temporary files are common. By isolating the service’s
/tmp from the rest of the host, such vulnerabilities become moot.

For Fedora 17 a feature has been accepted in order to enable this option
across a large number of services.

Caveat: Some services actually misuse /tmp as a location
for IPC sockets and other communication primitives, even though this is almost
always a vulnerability (simply because if you use it for communication you need
guessable names, and guessable names make your code vulnerable to DoS and symlink
attacks) and /run is the much safer replacement for this, simply
because it is not a location writable to unprivileged processes. For example,
X11 places its communication sockets below /tmp (which is actually
secure — though still not ideal — in this exception since it does so in a
safe subdirectory which is created at early boot.) Services which need to
communicate via such communication primitives in /tmp are no
candidates for PrivateTmp=. Thankfully these days only very few
services misusing /tmp like this remain.

Internally, this feature makes use of file system namespaces of the kernel.
If enabled a new file system namespace is opened inheriting most of the host
hierarchy with the exception of /tmp.

Making Directories Appear Read-Only or Inaccessible to Services

With the ReadOnlyDirectories= and InaccessibleDirectories=
options it is possible to make the specified directories read-only or
entirely inaccessible (for both reading and writing) to the service:

...
[Service]
ExecStart=...
InaccessibleDirectories=/home
ReadOnlyDirectories=/var
...

With these two configuration lines the whole tree below /home
becomes inaccessible to the service (i.e. the directory will appear empty and
with 000 access mode), and the tree below /var becomes read-only.

Caveat: Note that ReadOnlyDirectories= currently is not
recursively applied to submounts of the specified directories (i.e. mounts below
/var in the example above stay writable). This is likely to get fixed
soon.

Internally, this is also implemented based on file system namespaces.

Taking Away Capabilities From Services

Another very powerful security option in systemd is
CapabilityBoundingSet=, which allows you to limit, in a relatively
fine-grained fashion, which kernel capabilities a started service retains:

...
[Service]
ExecStart=...
CapabilityBoundingSet=CAP_CHOWN CAP_KILL
...

In the example above only the CAP_CHOWN and CAP_KILL capabilities are
retained by the service, and the service and any processes it might create have
no chance to ever acquire any other capabilities again, not even via setuid
binaries. The list of currently defined capabilities is available in capabilities(7).
Unfortunately some of the defined capabilities are overly generic (such as
CAP_SYS_ADMIN), however they are still a very useful tool, in particular for
services that otherwise run with full root privileges.

To identify precisely which capabilities are necessary for a service to run
cleanly is not always easy and requires a bit of testing. To simplify this
process a bit, it is possible to blacklist certain capabilities that are
definitely not needed instead of whitelisting all that might be needed. Example: the
CAP_SYS_PTRACE is a particularly powerful and security relevant capability
needed for the implementation of debuggers, since it allows introspecting and
manipulating any local process on the system. A service like Apache obviously
has no business in being a debugger for other processes, hence it is safe to
remove the capability from it:

...
[Service]
ExecStart=...
CapabilityBoundingSet=~CAP_SYS_PTRACE
...

The ~ character the value assignment here is prefixed with inverts
the meaning of the option: instead of listing all capabilities the service
will retain you may list the ones it will not retain.

Caveat: Some services might get confused if certain capabilities are
made unavailable to them. Thus, when determining the right set of capabilities
to keep around, you need to do this carefully, and it might be a good idea to talk
to the upstream maintainers since they should know best which operations a
service might need to run successfully.

Caveat 2: Capabilities are not a magic wand. You probably want to
combine them and use them in conjunction with other security options in
order to make them truly useful.
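
As a rough sketch of such a combination, the fragment below pairs a blacklisted capability with the network and /tmp isolation described earlier. The elided ExecStart= line follows the other examples in this article. Whether a given service tolerates all three settings is an assumption you would have to test.

```ini
[Service]
ExecStart=...
# Remove the debugger capability from the bounding set
CapabilityBoundingSet=~CAP_SYS_PTRACE
# Assumption: this service needs neither the network nor a shared /tmp
PrivateNetwork=yes
PrivateTmp=yes
```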

To easily check which processes on your system retain which capabilities use
the pscap tool from the libcap-ng-utils package.

Making use of systemd’s CapabilityBoundingSet= option is often a
simple, discoverable and cheap replacement for patching all system daemons
individually to control the capability bounding set on their own.

Disallowing Forking, Limiting File Creation for Services

Resource Limits may be used to apply certain security limits on services
being run. Primarily, resource limits are useful for resource control (as the
name suggests…) not so much access control. However, two of them can be
useful to disable certain OS features: RLIMIT_NPROC and RLIMIT_FSIZE may be
used to disable forking and disable writing of any files with a size > 0:

...
[Service]
ExecStart=...
LimitNPROC=1
LimitFSIZE=0
...

Note that this will work only if the service in question drops privileges
and runs under a (non-root) user ID of its own or drops the CAP_SYS_RESOURCE
capability, for example via CapabilityBoundingSet= as discussed above.
Without that a process could simply increase the resource limit again thus
voiding any effect.
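
A minimal sketch of the first variant, assuming a dedicated unprivileged account exists for the service. The account name exampled is hypothetical and not taken from this article. Since RLIMIT_NPROC is counted per user ID, sharing the account with other processes would change the effect of the limit.

```ini
[Service]
ExecStart=...
# Hypothetical dedicated account; running as a non-root user keeps the
# service from raising the limits again
User=exampled
LimitNPROC=1
LimitFSIZE=0
```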

Caveat: LimitFSIZE= is pretty brutal. If the service
attempts to write a file with a size > 0, it will immediately be killed with
the SIGXFSZ signal, which terminates the process unless caught. Also, creating
files with size 0 is still allowed, even if this option is used.

For more information on these and other resource limits, see setrlimit(2).

Controlling Device Node Access of Services

Device nodes are an important interface to the kernel and its drivers.
Since drivers tend to get much less testing and security checking than the core
kernel they often are a major entry point for security hacks. systemd allows
you to control access to devices individually for each service:

...
[Service]
ExecStart=...
DeviceAllow=/dev/null rw
...

This will limit access to /dev/null and only this device node,
disallowing access to any other device nodes.

The feature is implemented on top of the devices cgroup controller.
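
If a service needs more than one device node, the option can be repeated, one node per line. The sketch below is an assumption about what a typical service might need rather than a recommendation from this article: it allows read/write access to /dev/null and read-only access to /dev/urandom.

```ini
[Service]
ExecStart=...
# Each DeviceAllow= line adds one device node to the allowed set
DeviceAllow=/dev/null rw
DeviceAllow=/dev/urandom r
```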

Other Options

Besides the easy to use options above there are a number of other security
relevant options available. However they usually require a bit of preparation
in the service itself and hence are probably primarily useful for upstream
developers. These options are RootDirectory= (to set up
chroot() environments for a service) as well as User= and
Group= to drop privileges to the specified user and group. These
options are particularly useful to greatly simplify writing daemons, where all
the complexities of securely dropping privileges can be left to systemd, and
kept out of the daemons themselves.
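
As a hedged illustration of these options, the fragment below drops privileges and confines a service to a chroot tree. The account name, group name and directory are hypothetical and do not come from this article. The tree under RootDirectory= is assumed to have been populated beforehand with the service binary and every file it needs.

```ini
[Service]
# The binary named here must be available inside the chroot tree
ExecStart=...
# Hypothetical chroot tree, prepared in advance
RootDirectory=/srv/exampled-root
# Hypothetical unprivileged account and group
User=exampled
Group=exampled
```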

If you are wondering why these options are not enabled by default: some of
them simply break semantics of traditional Unix, and to maintain compatibility
we cannot enable them by default. For example, since traditional Unix enforced
that /tmp was a shared namespace, and processes could use it for IPC, we
cannot just go and turn that off globally, just because /tmp’s role in
IPC is now replaced by /run.

And that’s it for now. If you are working on unit files for upstream or in
your distribution, please consider using one or more of the options listed
above. If your service is secure by default by taking advantage of these options
this will help not only your users but also make the Internet a safer
place.

India, 360 Degrees at a Time, Part Four

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/photos/india-360-at-a-time-4.html

Here’s the fourth part of my ongoing series.

After Hampi we went to Bangalore to attend foss.in. (Fantastic conference, btw. The concerts at
the venue are unparalleled.) From there we flew up to Udaipur, in Rajasthan. Udaipur
is (among other things) famous for being the place where the central scenes of Octopussy were filmed.
Octopussy’s famous white palace is on Jagniwas Island in Lake Pichola:

Udaipur

This panorama was taken from another island in the lake, Jagmandir Island, which is visible in the following shot on the left:

Udaipur

Udaipur’s scenery, seen from the Maharaja’s City Palace down onto Pichola Lake:

Udaipur

That’s all for Udaipur, tomorrow I’ll post more panoramas, from other stops of our trip.