Tag Archives: cpd

Notes on setting up Raspberry Pi 3 as WiFi hotspot

Post Syndicated from Robert Graham original https://blog.erratasec.com/2018/04/notes-on-setting-up-raspberry-pi-3-as.html

I want to sniff the packets for IoT devices. There are a number of ways of doing this, but one straightforward mechanism is configuring a “Raspberry Pi 3 B” as a WiFi hotspot, then running tcpdump on it to record all the packets that pass through it. Google gives lots of results on how to do this, but they all demand that you have the precise hardware, WiFi hardware, and software that the authors do, so that’s a pain.

I got it working using the instructions here. There are a few additional notes, which is why I’m writing this blogpost, so I remember them.
https://www.raspberrypi.org/documentation/configuration/wireless/access-point.md

I’m using the RPi-3-B and not the RPi-3-B+, and the latest version of Raspbian at the time of this writing, “Raspbian Stretch Lite 2018-3-13”.

Some things didn’t work as described. The first is that it couldn’t find the package “hostapd”. The solution was to run “apt-get update” a second time.

The second problem was an error message about the NAT not working when trying to set the masquerade rule. That’s because the ‘upgrade’ updates the kernel, making the running system out-of-date with the files on the disk. The solution is to make sure you reboot after upgrading.

Thus, what you do at the start is:

apt-get update
apt-get upgrade
apt-get update
shutdown -r now

Then it’s just “apt-get install tcpdump” and start capturing on wlan0. This will get the non-monitor-mode Ethernet frames, which is what I want.

Some notes on memcached DDoS

Post Syndicated from Robert Graham original http://blog.erratasec.com/2018/03/some-notes-on-memcached-ddos.html

I thought I’d write up some notes on the memcached DDoS. Specifically, I describe how many I found scanning the Internet with masscan, and how to use masscan as a killswitch to neuter the worst of the attacks.

Test your servers

I added code to my port scanner for this, then scanned the Internet:
masscan 0.0.0.0/0 -pU:11211 --banners | grep memcached
This example scans the entire Internet (/0). Replace 0.0.0.0/0 with your address range (or ranges).
This produces output that looks like this:
Banner on port 11211/udp on 172.246.132.226: [memcached] uptime=230130 time=1520485357 version=1.4.13
Banner on port 11211/udp on 89.110.149.218: [memcached] uptime=3935192 time=1520485363 version=1.4.17
Banner on port 11211/udp on 172.246.132.226: [memcached] uptime=230130 time=1520485357 version=1.4.13
Banner on port 11211/udp on 84.200.45.2: [memcached] uptime=399858 time=1520485362 version=1.4.20
Banner on port 11211/udp on 5.1.66.2: [memcached] uptime=29429482 time=1520485363 version=1.4.20
Banner on port 11211/udp on 103.248.253.112: [memcached] uptime=2879363 time=1520485366 version=1.2.6
Banner on port 11211/udp on 193.240.236.171: [memcached] uptime=42083736 time=1520485365 version=1.4.13
The “banners” check keeps only those hosts that return valid memcached responses, so you don’t get other stuff that isn’t memcached. To filter this output further, use ‘cut’ to grab just column 6:
… | cut -d ' ' -f 6 | cut -d: -f1
You often get multiple responses to just one query, so you’ll want to sort/uniq the list:
… | sort | uniq
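
For readers who would rather post-process the scan output in a script, here is a minimal Python sketch of the same grep/cut/sort/uniq pipeline. The function name `extract_ips` is made up for illustration, and the sample lines are the banner lines shown above; it assumes the output format stays exactly as shown.

```python
def extract_ips(lines):
    """Return the sorted, de-duplicated IP addresses found in masscan banner lines."""
    ips = set()
    for line in lines:
        if "[memcached]" not in line:
            continue  # plays the role of `grep memcached`
        fields = line.split(" ")
        if len(fields) < 6:
            continue  # not a banner line in the expected layout
        # Column 6 looks like "172.246.132.226:"; keep the part before the colon.
        ips.add(fields[5].split(":")[0])
    # A set removes duplicates (uniq); sorted() orders the strings like `sort`.
    return sorted(ips)


sample = [
    "Banner on port 11211/udp on 172.246.132.226: [memcached] uptime=230130 time=1520485357 version=1.4.13",
    "Banner on port 11211/udp on 89.110.149.218: [memcached] uptime=3935192 time=1520485363 version=1.4.17",
    "Banner on port 11211/udp on 172.246.132.226: [memcached] uptime=230130 time=1520485357 version=1.4.13",
]

for ip in extract_ips(sample):
    print(ip)
```

The duplicate line for 172.246.132.226 collapses to a single entry, which is the point of the sort/uniq step.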

My results from an Internet wide scan

I got 15181 results (or roughly 15,000).
People are using Shodan to find a list of memcached servers. They might be getting a lot of results back that respond to TCP instead of UDP. Only UDP can be used for the attack.

Other researchers scanned the Internet a few days ago and found ~31k. I don’t know if this means people have been removing these from the Internet.

Masscan as exploit script

BTW, you can not only use masscan to find amplifiers, you can also use it to carry out the DDoS. Simply import the list of amplifier IP addresses, then spoof the source address as that of the target. All the responses will go back to the source address.
masscan -iL amplifiers.txt -pU:11211 --spoof-ip --rate 100000
I point this out to show how there’s no magic in exploiting this. Numerous exploit scripts have been released, because it’s so easy.

Why memcached servers are vulnerable

Like many servers, memcached listens to local IP address 127.0.0.1 for local administration. By listening only on the local IP address, remote people cannot talk to the server.
However, this process is often buggy, and you end up listening on either 0.0.0.0 (all interfaces) or on one of the external interfaces. There’s a common Linux network stack issue where this keeps happening, like trying to get VMs connected to the network. I forget the exact details, but the point is that lots of servers that intend to listen only on 127.0.0.1 end up listening on external interfaces instead. It’s not a good security barrier.
Thus, there are lots of memcached servers listening on their control port (11211) on external interfaces.

How the protocol works

The protocol is documented here. It’s pretty straightforward.
The easiest amplification attack is to send the “stats” command. This is a 15-byte UDP packet that causes the server to send back a large response full of useful statistics about the server. You often see around 10 kilobytes of response across several packets.
A harder, but more effective, attack uses a two-step process. You first use the “add” or “set” commands to put chunks of data into the server, then send a “get” command to retrieve it. You can easily put 100 megabytes of data into the server this way, and cause it all to be retrieved with a single “get” command.
That’s why this has been the largest amplification ever, because a single 100-byte packet can in theory cause a 100-megabyte response.
Doing the math, the 1.3 terabit/second DDoS divided across the 15,000 servers I found vulnerable on the Internet leads to an average of 100-megabits/second per server. This is fairly minor, and is indeed something even small servers (like Raspberry Pis) can generate.
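
To make the arithmetic concrete, here is a rough Python sketch using only the figures quoted above. The variable names are mine, and “around 10 kilobytes” is taken as exactly 10,000 bytes, so treat the results as ballpark numbers rather than measurements.

```python
def amplification(request_bytes, response_bytes):
    """Ratio of bytes the server sends back to bytes the attacker sends."""
    return response_bytes / request_bytes


# "stats": a 15-byte request yielding roughly 10 kilobytes (assumed 10,000 bytes).
stats_factor = amplification(15, 10_000)

# "get" after "set": a 100-byte request yielding, in theory, 100 megabytes.
get_factor = amplification(100, 100_000_000)

# 1.3 terabit/second spread evenly over 15,000 servers, in megabits/second.
per_server_mbps = 1_300_000_000_000 / 15_000 / 1_000_000

print(f"stats amplification: about {stats_factor:.0f}x")
print(f"get amplification:   about {get_factor:,.0f}x")
print(f"per-server traffic:  about {per_server_mbps:.1f} Mbit/s")
```

The per-server figure comes out a little under the 100 megabits/second quoted above, which is the same order of magnitude.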

Neutering the attack (“kill switch”)

If they are using the more powerful attack against you, you can neuter it: you can send a “flush_all” command back at the servers who are flooding you, causing them to drop all those large chunks of data from the cache.
I’m going to describe how I would do this.
First, get a list of attackers, meaning, the amplifiers that are flooding you. The way to do this is grab a packet sniffer and capture all packets with a source port of 11211. Here is an example using tcpdump.
tcpdump -i <interface> -w attackers.pcap src port 11211
Let that run for a while, then hit [ctrl-c] to stop, then extract the list of IP addresses in the capture file. The way I do this is with tshark (comes with Wireshark):
tshark -r attackers.pcap -Tfields -eip.src | sort | uniq > amplifiers.txt
Now, craft a flush_all payload. There are many ways of doing this. For example, if you are using nmap or masscan, you can add the bytes to the nmap-payloads.txt file. Also, masscan can read this directly from a packet capture file. To do this, first craft a packet, such as with the following command line foo:
echo -en "\x00\x00\x00\x00\x00\x01\x00\x00flush_all\r\n" | nc -q1 -u <server> 11211
Capture this packet using tcpdump or something, and save into a file “flush_all.pcap”. If you want to skip this step, I’ve already done this for you, go grab the file from GitHub:
Now that we have our list of attackers (amplifiers.txt) and a payload to blast at them (flush_all.pcap), use masscan to send it:
masscan -iL amplifiers.txt -pU:11211 --pcap-payload flush_all.pcap
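
The eight bytes in front of “flush_all” in the echo command are the memcached UDP frame header: four 16-bit big-endian fields holding the request ID, the sequence number, the total number of datagrams, and a reserved value. As a sketch of how those bytes are put together, the snippet below builds the same payload in Python. The helper name `memcached_udp_frame` is made up for illustration, and the code only constructs the bytes; it does not send anything.

```python
import struct


def memcached_udp_frame(command, request_id=0):
    """Prefix an ASCII memcached command with the 8-byte UDP frame header."""
    # Header fields: request ID, sequence number (0), total datagrams (1), reserved (0).
    header = struct.pack("!HHHH", request_id, 0, 1, 0)
    return header + command.encode("ascii") + b"\r\n"


flush_all = memcached_udp_frame("flush_all")
stats = memcached_udp_frame("stats")

print(flush_all)
print(len(stats), "bytes in a stats request")
```

The same helper applied to “stats” gives the 15-byte request mentioned in the protocol section.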

Reportedly, “shutdown” may also work to completely shutdown the amplifiers. I’ll leave that as an exercise for the reader, since of course you’ll be adversely affecting the servers.

Some notes

Here is some good reading on this attack:

2017-12-18 ARP in Linux

Post Syndicated from Vasil Kolev original https://vasil.ludost.net/blog/?p=3371

I’ve started collecting a list of “things I rely on that don’t work”. Here is one of them, which I ran into a little while ago: the Linux kernel’s ARP.

(since this protocol and its support have been around for years and everyone uses it, I somehow expect not to get kicked in the ankles)

A few days ago I had a complaint that marla could not be reached from certain places. After some testing the thing started working on its own and we could not catch it. Tonight the problem appeared again, and the interesting part was that there was connectivity to other machines on the same network, just not to marla.

The standard steps followed: one mtr to marla, one to one of the addresses that is not in our network, and nothing. Listening on the interfaces, I could see traffic coming in, but saw nothing going out.

An ip r get said the following:

77.246.xxx.xxx via 193.169.198.179 dev eth3.1030 src 193.169.198.230

193.169.198.179 is inetbg.bix.bg, which is the person’s provider. There was no ping to that IP and no ARP entry for it either, and my first thought was “what have they messed up now”. Then I ran a tcpdump and saw the following:

22:06:48.470979 ARP, Request who-has 193.169.198.179 tell 185.117.82.66, length 28

If something looks wrong to you, you are right. I shouldn’t be asking in this segment with an address that I’ve pulled from a completely different place, and it is quite expected that someone won’t want to answer me. A brief search and some remembering led me to /proc/sys/net/ipv4/conf/*/arp_announce, which you can read about in ip-sysctl.txt in the kernel documentation.

For those who don’t feel like reading, the parameter defaults to 0, which means “put whatever address you like as the source IP”, 1 means “at least try to make it from the same network”, and 2 means “choose carefully”. Why 2 is not the default I cannot explain (but a moment ago it was configured that way on both of our routers).

In addition, anyone looking for some entertainment can read what is written about the other ARP options and how the kernel behaves by default, for example that it can answer ARP for one interface from another without caring at all (which, by some claims, conforms to the RFCs, though I could not find where). For everyone who wants sensible behaviour from the Linux kernel’s ARP, I recommend the following sysctls:

net.ipv4.conf.all.arp_filter=1
net.ipv4.conf.all.arp_announce=2
net.ipv4.conf.all.arp_ignore=2

(these are especially needed if you have a segment with two networks in it and two or more physical interfaces, and you want some control over where and how your traffic flows)
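
As a quick way to see whether a machine already has these values, here is a small Python sketch that reads the three files under /proc/sys/net/ipv4/conf/all and reports any that differ from the recommendation above. The function name `check_arp_settings` and the `RECOMMENDED` table are mine, not part of any tool, and the script only reads the values; it changes nothing.

```python
from pathlib import Path

# The values recommended above for net.ipv4.conf.all.*
RECOMMENDED = {"arp_filter": 1, "arp_announce": 2, "arp_ignore": 2}


def check_arp_settings(conf_dir="/proc/sys/net/ipv4/conf/all"):
    """Return {name: (current, recommended)} for settings that differ."""
    mismatches = {}
    for name, wanted in RECOMMENDED.items():
        try:
            current = int((Path(conf_dir) / name).read_text().strip())
        except (OSError, ValueError):
            current = None  # file missing or unreadable, e.g. not on Linux
        if current != wanted:
            mismatches[name] = (current, wanted)
    return mismatches


if __name__ == "__main__":
    for name, (current, wanted) in check_arp_settings().items():
        print(f"{name}: current={current}, recommended={wanted}")
```

An empty result means all three settings already match.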

Security updates for Monday

Post Syndicated from jake original https://lwn.net/Articles/737764/rss

Security updates have been issued by Arch Linux (apr, apr-util, chromium, and wget), CentOS (tomcat and tomcat6), Debian (curl, git-annex, golang, shadowsocks-libev, and wget), Fedora (libextractor and sssd), Gentoo (apache, asterisk, jython, oracle-jdk-bin, and xorg-server), openSUSE (chromium, curl, gcc48, GraphicsMagick, hostapd, kernel, libjpeg-turbo, libvirt, mysql-community-server, openvpn, SDL2, tcpdump, and wget), Oracle (tomcat and tomcat6), Red Hat (chromium-browser, tomcat, and tomcat6), Scientific Linux (tomcat and tomcat6), Slackware (php and wget), SUSE (firefox, mozilla-nss, kernel, wget, and xen), and Ubuntu (mysql-5.5, poppler, and wget).

Security updates for Monday

Post Syndicated from ris original https://lwn.net/Articles/734761/rss

Security updates have been issued by Debian (bzr, clamav, libgd2, libraw, samba, and tomcat7), Fedora (drupal7-views, gnome-shell, httpd, krb5, libmspack, LibRaw, mingw-LibRaw, mpg123, pkgconf, python-jwt, and samba), Gentoo (adobe-flash, chromium, cvs, exim, mercurial, oracle-jdk-bin, php, postfix, and tcpdump), openSUSE (Chromium and libraw), Red Hat (chromium-browser), and Slackware (libxml2 and python).

Security updates for Friday

Post Syndicated from ris original https://lwn.net/Articles/733829/rss

Security updates have been issued by Arch Linux (flashplugin, kernel, lib32-flashplugin, and linux-lts), CentOS (postgresql), Debian (tcpdump and wordpress-shibboleth), Fedora (lightdm, python-django, and tomcat), Mageia (flash-player-plugin and libsndfile), openSUSE (chromium, cvs, kernel, and libreoffice), Oracle (postgresql), and Ubuntu (libgcrypt20 and thunderbird).

Security updates for Thursday

Post Syndicated from ris original https://lwn.net/Articles/733699/rss

Security updates have been issued by Arch Linux (tcpdump), CentOS (bluez and kernel), Debian (wordpress-shibboleth), Fedora (augeas, bluez, emacs, and libwmf), Oracle (kernel), Red Hat (instack-undercloud, kernel, openvswitch, and postgresql), Scientific Linux (postgresql), SUSE (kernel and xen), and Ubuntu (tcpdump).

Security updates for Wednesday

Post Syndicated from ris original https://lwn.net/Articles/733583/rss

Security updates have been issued by Arch Linux (bluez and linux-hardened), CentOS (bluez and kernel), Debian (bluez, emacs24, tcpdump, and xen), Fedora (kernel and mimedefang), Oracle (bluez and kernel), Red Hat (bluez, flash-plugin, instack-undercloud, kernel, kernel-rt, and openvswitch), Scientific Linux (bluez and kernel), Slackware (emacs and libzip), SUSE (xen), and Ubuntu (bluez and qemu).

Security updates for Monday

Post Syndicated from ris original https://lwn.net/Articles/733389/rss

Security updates have been issued by Debian (freerdp, mbedtls, tiff, and tiff3), Fedora (chromium, krb5, libstaroffice, mbedtls, mingw-libidn2, mingw-openjpeg2, openjpeg2, and rubygems), Mageia (bzr, libarchive, libgcrypt, and tcpdump), openSUSE (gdk-pixbuf, libidn2, mpg123, postgresql94, postgresql96, and xen), Slackware (bash, mariadb, and tcpdump), and SUSE (evince and kernel).

Security updates for Wednesday

Post Syndicated from ris original https://lwn.net/Articles/733040/rss

Security updates have been issued by Debian (file, icedove, irssi, ruby2.3, and tcpdump), Fedora (libzip and openjpeg2), openSUSE (clamav-database, icu, libzypp, zypper, and php5), Oracle (389-ds-base), Red Hat (rh-maven33-groovy), SUSE (postgresql94, postgresql96, and python-pycrypto), and Ubuntu (bzr and libgd2).

Create a text-based adventure game with FutureLearn

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/text-based-futurelearn/

Learning with Raspberry Pi has never been so easy! We’re adding a new course to FutureLearn today, and you can take part anywhere in the world.

FutureLearn: the story so far…

In February 2017, we were delighted to launch two free online CPD training courses on the FutureLearn platform, available anywhere in the world. Since the launch, more than 30,000 educators have joined these courses: Teaching Programming in Primary Schools, and Teaching Physical Computing with Raspberry Pi and Python.

Thousands of educators have been building their skills – completing tasks such as writing a program in Python to make an LED blink, or building a voting app in Scratch. The two courses are scaffolded to build skills, week by week. Learners are supported by videos, screencasts, and articles, and they have the chance to apply what they have learned in as many different practical projects as possible.

We have had some excellent feedback from learners on the courses, such as Kyle Wilke who commented: “Fantastic course. Nice integration of text-based and video instruction. Was very impressed how much support was provided by fellow students, kudos to us. Can’t wait to share this with fellow educators.”

Brand new course

We are launching a new course this autumn. You can join lead educator Laura Sachs to learn object-oriented programming principles by creating your own text-based adventure game in Python. The course is aimed at educators who have programming experience, but have never programmed in the object-oriented style.

The course will introduce you to the principles of object-oriented programming in Python, showing you how to create objects, functions, methods, and classes. You’ll use what you learn to create your own text-based adventure game. You will have the chance to share your code with other learners, and to see theirs. If you’re an educator, you’ll also be able to develop ideas for using object-oriented programming in your classroom.

Take part

Sign up now to join us on the course, starting today, September 4. Our courses are free to join online – so you can learn wherever you are, and whenever you want.

The post Create a text-based adventure game with FutureLearn appeared first on Raspberry Pi.

Security updates for Tuesday

Post Syndicated from corbet original https://lwn.net/Articles/731678/rss

Security updates have been issued by Debian (extplorer and libraw), Fedora (mingw-libsoup, python-tablib, ruby, and subversion), Mageia (avidemux, clamav, nasm, php-pear-CAS, and shutter), Oracle (xmlsec1), Red Hat (openssl tomcat), Scientific Linux (authconfig, bash, curl, evince, firefox, freeradius, gdm gnome-session, ghostscript, git, glibc, gnutls, groovy, GStreamer, gtk-vnc, httpd, java-1.7.0-openjdk, kernel, libreoffice, libsoup, libtasn1, log4j, mariadb, mercurial, NetworkManager, openldap, openssh, pidgin, pki-core, postgresql, python, qemu-kvm, samba, spice, subversion, tcpdump, tigervnc fltk, tomcat, X.org, and xmlsec1), SUSE (git), and Ubuntu (augeas, cvs, and texlive-base).

Security updates for Wednesday

Post Syndicated from ris original https://lwn.net/Articles/730338/rss

Security updates have been issued by Mageia (atril, mpg123, perl-SOAP-Lite, and virtualbox), openSUSE (kernel and libzypp, zypper), Oracle (authconfig, bash, curl, gdm and gnome-session, ghostscript, git, glibc, gnutls, gtk-vnc, kernel, libreoffice, libtasn1, mariadb, openldap, openssh, pidgin, postgresql, python, qemu-kvm, samba, tcpdump, tigervnc and fltk, and tomcat), Red Hat (kernel, kernel-rt, openstack-neutron, and qemu-kvm), and SUSE (puppet and tcmu-runner).

Security updates for Tuesday

Post Syndicated from ris original https://lwn.net/Articles/729456/rss

Security updates have been issued by Debian (freerdp and ghostscript), Fedora (freerdp, jackson-databind, moodle, remmina, and runc), Red Hat (authconfig, devtoolset-4-jackson-databind, gnutls, libreoffice, NetworkManager and libnl3, pki-core, rh-eclipse46-jackson-databind, samba, and tcpdump), and Ubuntu (apache2, bash, imagemagick, openjdk-8, and rabbitmq-server).

Security updates for Tuesday

Post Syndicated from ris original https://lwn.net/Articles/728776/rss

Security updates have been issued by Debian (catdoc, gsoap, and libtasn1-3), Fedora (GraphicsMagick, java-1.8.0-openjdk, krb5, librsvg2, nodejs, phpldapadmin, rubygem-rack-cors, and yara), Mageia (irssi), openSUSE (rubygem-puppet), Red Hat (kernel), Slackware (tcpdump), and Ubuntu (imagemagick, linux, linux-raspi2, linux-snapdragon, linux-lts-xenial, mysql-5.5, samba, and xorg-server, xorg-server-hwe-16.04, xorg-server-lts-xenial).

Hello World issue 2: celebrating ten years of Scratch

Post Syndicated from Carrie Anne Philbin original https://www.raspberrypi.org/blog/hello-world-issue-2/

We are very excited to announce that issue 2 of Hello World is out today! Hello World is our magazine about computing and digital making, written by educators, for educators. It is a collaboration between the Raspberry Pi Foundation and Computing at School, part of the British Computer Society.

We’ve been extremely fortunate to be granted an exclusive interview with Mitch Resnick, Leader of the Scratch Team at MIT, and it’s in the latest issue. All around the world, educators and enthusiasts are celebrating ten years of Scratch, MIT’s block-based programming language. Scratch has helped millions of people to learn the building blocks of computer programming through play, and is our go-to tool at Code Clubs everywhere.

This packed edition of Hello World also includes news, features, lesson activities, research and opinions from Computing At School Master Teachers, Raspberry Pi Certified Educators, academics, informal learning leaders and brilliant classroom teachers. Highlights (for me) include:

  • A round-up of digital making research from Oliver Quinlan
  • Safeguarding children online by Penny Patterson
  • Embracing chaos inside and outside the classroom with Code Club’s Rik Cross, Raspberry Jam-maker-in-chief Ben Nuttall, Raspberry Pi Certified Educator Sway Grantham, and CPD trainer Alan O’Donohoe
  • How MicroPython on the Micro:bit is inspiring a generation, by Nicholas Tollervey
  • Incredibly useful lesson activities on programming graphical user interfaces (GUI) with guizero, simulating logic gates in Minecraft, and introducing variables through storytelling
  • Exploring computing and gender through Girls Who Code, Cyber First Girls, the BCS Lovelace Colloquium, and Computing At School’s #include initiative
  • A review of browser based IDEs

Get your copy

Hello World is available as a free Creative Commons download for anyone around the world who is interested in Computer Science and digital making education. Grab the latest issue straight from the Hello World website.

Thanks to the very generous support of our sponsors BT, we are able to offer a free printed version of the magazine to serving educators in the UK. It’s for teachers, Code Club volunteers, teaching assistants, teacher trainers, and others who help children and young people learn about computing and digital making. Remember to subscribe to receive your free copy, posted directly to your home.

Get involved

Are you an educator? Then Hello World needs you! As a magazine for educators by educators, we want to hear about your experiences in teaching technology. If you hear a little niggling voice in your head say “I’m just a teacher, why would my contributions be useful to anyone else?” stop immediately. We want to hear from you, because you are amazing!

Get in touch: [email protected] with your ideas, and we can help get them published.

 

The post Hello World issue 2: celebrating ten years of Scratch appeared first on Raspberry Pi.

Security updates for Tuesday

Post Syndicated from ris original https://lwn.net/Articles/722246/rss

Security updates have been issued by Debian (libtirpc and libytnef), Fedora (python-fedora, roundcubemail, and tnef), Mageia (ntp and virtualbox), openSUSE (dpkg, ghostscript, kernel, libressl, mysql-community-server, quagga, tcpdump, libpcap, xen, and zziplib), Red Hat (java-1.7.0-openjdk), Scientific Linux (java-1.7.0-openjdk), and SUSE (samba).

Operating OpenStack at Scale

Post Syndicated from mikesefanov original https://yahooeng.tumblr.com/post/159795571841

By James Penick, Cloud Architect & Gurpreet Kaur, Product Manager

A version of this byline was originally written for and appears in CIO Review.

A successful private cloud presents a consistent and reliable facade over the complexities of hyperscale infrastructure. It must simultaneously handle constant organic traffic growth, unanticipated spikes, a multitude of hardware vendors, and discordant customer demands. The depth of this complexity only increases with the age of the business, leaving a private cloud operator saddled with legacy hardware, old network infrastructure, customers dependent on legacy operating systems, and the list goes on. These are the foundations of the horror stories told by grizzled operators around the campfire.

Providing a plethora of services globally for over a billion active users requires a hyperscale infrastructure. Yahoo’s on-premises infrastructure is comprised of datacenters housing hundreds of thousands of physical and virtual compute resources globally, connected via a multi-terabit network backbone. Because Yahoo was one of the very first hyperscale internet companies in the world, its infrastructure had grown organically – things were built, and rebuilt, as the company learned and grew. The resulting web of modern and legacy infrastructure became progressively more difficult to manage. Initial attempts to manage this via IaaS (Infrastructure-as-a-Service) taught some hard lessons. However, those lessons, some of which are shared below, served us well when OpenStack was selected to manage Yahoo’s datacenters.

Centralized team offering Infrastructure-as-a-Service

Chief amongst the lessons learned prior to OpenStack was that IaaS must be presented as a core service to the whole organization by a dedicated team. An a-la-carte-IaaS, where each user is expected to manage their own control plane and inventory, just isn’t sustainable at scale. Multiple teams tackling the same challenges involved in the curation of software, deployment, upkeep, and security within an organization is not just a duplication of effort; it removes the opportunity for improved synergy with all levels of the business. The first OpenStack cluster, with a centralized dedicated developer and service engineering team, went live in June 2012.  This model has served us well and has been a crucial piece of making OpenStack succeed at Yahoo. One of the biggest advantages to a centralized, core team is the ability to collaborate with the foundational teams upon which any business is built: Supply chain, Datacenter Site-Operations, Finance, and finally our customers, the engineering teams. Building a close relationship with these vital parts of the business provides the ability to streamline the process of scaling inventory and presenting on-demand infrastructure to the company.

Developers love instant access to compute resources

Our developer productivity clusters, named “OpenHouse,” were a huge hit. Ideation and experimentation are core to developers’ DNA at Yahoo. It empowers our engineers to innovate, prototype, develop, and quickly iterate on ideas. No longer is a developer reliant on a static and costly development machine under their desk. OpenHouse enables developer agility and cost savings by obviating the desktop.

Dynamic infrastructure empowers agile products

From a humble beginning of a single, small OpenStack cluster, Yahoo’s OpenStack footprint is growing beyond 100,000 VM instances globally, with our single largest virtual machine cluster running over a thousand compute nodes, without using Nova Cells.

Until this point, Yahoo’s production footprint was nearly 100% focused on baremetal – a part of the business that one cannot simply ignore. In 2013, Yahoo OpenStack Baremetal began to manage all new compute deployments. Interestingly, after moving to a common API to provision baremetal and virtual machines, there was a marked increase in demand for virtual machines.

Developers across all major business units ranging from Yahoo Mail, Video, News, Finance, Sports and many more, were thrilled with getting instant access to compute resources to hit the ground running on their projects. Today, the OpenStack team is continuing to fully migrate the business to OpenStack-managed. Our baremetal footprint is well beyond that of our VMs, with over 100,000 baremetal instances provisioned by OpenStack Nova via Ironic.

How did Yahoo hit this scale?  

Scaling OpenStack begins with understanding how its various components work and how they communicate with one another. This topic can be very deep and for the sake of brevity, we’ll hit the high points.

1. Start at the bottom and think about the underlying hardware

Do not overlook the unique resource constraints for the services which power your cloud, nor the fashion in which those services are to be used. Leverage that understanding to drive hardware selection. For example, when one examines the role of the database server in an OpenStack cluster, and considers the multitudinous calls to the database: compute node heartbeats, instance state changes, normal user operations, and so on; they would conclude this core component is extremely busy in even a modest-sized Nova cluster, and in need of adequate computational resources to perform. Yet many deployers skimp on the hardware. The performance of the whole cluster is bottlenecked by the DB I/O. By thinking ahead you can save yourself a lot of heartburn later on.

2. Think about how things communicate

Our cluster databases are configured to be multi-master single-writer with automated failover. Control plane services have been modified to split DB reads directly to the read slaves and only write to the write-master. This distributes load across the database servers.
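
The post does not show how the read/write split is implemented, so the following is only an illustrative Python sketch of the idea: writes go to the single write-master, reads are spread round-robin over the read replicas. The class name `ReadWriteRouter` and the server names are hypothetical, and treating every SELECT as a safe read is a simplification (real services also have to consider transactions and replication lag).

```python
import itertools


class ReadWriteRouter:
    """Send writes to the single write-master and spread reads over replicas."""

    def __init__(self, write_master, read_replicas):
        self.write_master = write_master
        # Fall back to the master for reads if no replicas are configured.
        self._readers = itertools.cycle(read_replicas or [write_master])

    def route(self, statement):
        """Return the name of the server that should run the SQL statement."""
        words = statement.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._readers)  # round-robin over the read replicas
        return self.write_master  # anything else is treated as a write


router = ReadWriteRouter("db-master", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM instances"))
print(router.route("UPDATE instances SET vm_state = 'active'"))
```

Keeping the routing decision in one place is what lets the read load be distributed without every caller knowing the database topology.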

3. Scale wide

OpenStack has many small horizontally-scalable components which can peacefully cohabitate on the same machines: the Nova, Keystone, and Glance APIs, for example. Stripe these across several small or modest machines. Some services, such as the Nova scheduler, run the risk of race conditions when running multi-active. If the risk of race conditions is unacceptable, use ZooKeeper to manage leader election.

4. Remove dependencies

In a Yahoo datacenter, DHCP is only used to provision baremetal servers. By statically declaring IPs in our instances via cloud-init, our infrastructure is less prone to outage from a failure in the DHCP infrastructure.

5. Don’t be afraid to replace things

Neutron used Dnsmasq to provide DHCP services, but Dnsmasq was not designed to address the complexity or scale of a dynamic environment. For example, Dnsmasq must be restarted for any config change, such as when a new host is being provisioned. In the Yahoo OpenStack clusters this has been replaced by ISC-DHCPD, which scales far better than Dnsmasq and allows dynamic configuration updates via an API.

6. Or split them apart

Some of the core imaging services provided by Ironic, such as DHCP, TFTP, and HTTPS, communicate with a host during the provisioning process. These services are normally part of the Ironic Conductor (IC) service. In our environment we split these services into a new and physically-distinct service called the Ironic Transport Service (ITS). This brings value by:

  • Adding security: Splitting the ITS from the IC allows us to block all network traffic from production compute nodes to the IC, and other parts of our control plane. If a malicious entity attacks a node serving production traffic, they cannot escalate from it to our control plane.
  • Scale: The ITS hosts allow us to horizontally scale the core provisioning services with which nodes communicate.
  • Flexibility: ITS allows Yahoo to manage remote sites, such as peering points, without building a new cluster in that site. Resources in those sites can now be managed by the nearest Yahoo owned & operated (O&O) datacenter, without needing to build a whole cluster in each site.

Be prepared for faulty hardware!

Running IaaS reliably at hyperscale is more than just scaling the control plane. One must take a holistic look at the system and consider everything. In fact, when examining provisioning failures, our engineers determined the majority root cause was faulty hardware. For example, there are a number of machines from varying vendors whose IPMI firmware fails from time to time, leaving the host inaccessible to remote power management. Some fail within minutes or weeks of installation. These failures occur on many different models, across many generations, and across many hardware vendors. Exposing these failures to users would create a very negative experience, and the cloud must be built to tolerate this complexity.

Focus on the end state

Yahoo’s experience shows that one can run OpenStack at hyperscale, leveraging it to wrap infrastructure and remove perceived complexity. Correctly leveraged, OpenStack presents an easy, consistent, and error-free interface. Delivering this interface is core to our design philosophy as Yahoo continues to double down on our OpenStack investment. The Yahoo OpenStack team looks forward to continuing to collaborate with the OpenStack community to share feedback and code.