Tag Archives: cgi

Progressing from tech to leadership

Post Syndicated from Michal Zalewski original http://lcamtuf.blogspot.com/2018/02/on-leadership.html

I’ve been a technical person all my life. I started doing vulnerability research in the late 1990s – and even today, when I’m not fiddling with CNC-machined robots or making furniture, I’m probably cobbling together a fuzzer or writing a book about browser protocols and APIs. In other words, I’m a geek at heart.

My career is a different story. Over the past two decades and change, I went from writing CGI scripts and setting up WAN routers for a chain of shopping malls, to doing pentests for institutional customers, to designing a series of network monitoring platforms and handling incident response for a big telco, to building and running the product security org for one of the largest companies in the world. It’s been an interesting ride – and now that I’m on the hook for the well-being of about 100 folks across more than a dozen subteams around the world, I’ve been thinking a bit about the lessons learned along the way.

Of course, I’m a bit hesitant to write such a post: sometimes, your efforts pan out not because of your approach, but despite it – and it’s possible to draw precisely the wrong conclusions from such anecdotes. Still, I’m very proud of the culture we’ve created and the caliber of folks working on our team. It happened through the work of quite a few talented tech leads and managers even before my time, but it did not happen by accident – so I figured that my observations may be useful for some, as long as they are taken with a grain of salt.

But first, let me start on a somewhat somber note: what nobody tells you is that one’s level on the leadership ladder tends to be inversely correlated with several measures of happiness. The reason is fairly simple: as you get more senior, a growing number of people will come to you expecting you to solve increasingly fuzzy and challenging problems – and you will no longer be patted on the back for doing so. This should not scare you away from such opportunities, but it definitely calls for a particular mindset: your motivation must come from within. Look beyond the fight-of-the-day; find satisfaction in seeing how far your teams have come over the years.

With that out of the way, here’s a collection of notes, loosely organized into three major themes.

The curse of a techie leader

Perhaps the most interesting observation I have is that for a person coming from a technical background, building a healthy team is first and foremost about the subtle art of letting go.

There is a natural urge to stay involved in any project you’ve started or helped improve; after all, it’s your baby: you’re familiar with all the nuts and bolts, and nobody else can do this job as well as you. But as your sphere of influence grows, this becomes a choke point: there are only so many things you could be doing at once. Just as importantly, the project-hoarding behavior robs more junior folks of the ability to take on new responsibilities and bring their own ideas to life. In other words, when done properly, delegation is not just about freeing up your plate; it’s also about empowerment and about signalling trust.

Of course, when you hand your project over to somebody else, the new owner will initially be slower and more clumsy than you; but if you pick the new leads wisely, give them the right tools and the right incentives, and don’t make them deathly afraid of messing up, they will soon excel at their new jobs – and be grateful for the opportunity.

A related affliction of many accomplished techies is the conviction that they know the answers to every question even tangentially related to their domain of expertise; that belief is coupled with a burning desire to have the last word in every debate. When practiced in moderation, this behavior is fine among peers – but for a leader, one of the most important skills to learn is knowing when to keep your mouth shut: people learn a lot better by experimenting and making small mistakes than by being schooled by their boss, and they often try to read into your passing remarks. Don’t run an authoritarian camp focused on total risk aversion or perfectly efficient resource management; just set reasonable boundaries and exit conditions for experiments so that they don’t spiral out of control – and be amazed by the results every now and then.

Death by planning

When nothing is on fire, it’s easy to get preoccupied with maintaining the status quo. If your current headcount or budget request lists all the same projects as last year’s, or if you ever find yourself ending an argument by deferring to a policy or a process document, it’s probably a sign that you’re getting complacent. In security, complacency usually ends in tears – and when it doesn’t, it leads to burnout or boredom.

In my experience, your goal should be to develop a cadre of managers or tech leads capable of coming up with clever ideas, prioritizing them among themselves, and seeing them to completion without your day-to-day involvement. In your spare time, make it your mission to challenge them to stay ahead of the curve. Ask your vendor security lead how they’d streamline their work if they had a 40% jump in the number of vendors but no extra headcount; ask your product security folks what’s the second line of defense or containment should your primary defenses fail. Help them get good ideas off the ground; set some mental success and failure criteria to be able to cut your losses if something does not pan out.

Of course, malfunctions happen even in the best-run teams; to spot trouble early on, instead of overzealous project tracking, I found it useful to encourage folks to run a data-driven org. I’d usually ask them to imagine that a brand new VP shows up in our office and, as his first order of business, asks “why do you have so many people here and how do I know they are doing the right things?”. Not everything in security can be quantified, but hard data can validate many of your assumptions – and will alert you to unseen issues early on.

When focusing on data, it’s important not to treat pie charts and spreadsheets as an art unto itself; if you run a security review process for your company, your CSAT scores are going to reach 100% if you just rubberstamp every launch request within ten minutes of receiving it. Make sure you’re asking the right questions; instead of “how satisfied are you with our process”, try “is your product better as a consequence of talking to us?”

Whenever things are not progressing as expected, it is a natural instinct to fall back to micromanagement, but it seldom truly cures the ill. It’s probable that your team disagrees with your vision or its feasibility – and that you’re either not listening to their feedback, or they don’t think you’d care. It’s good to assume that most of your employees are as smart as you or smarter; barking your orders at them more loudly or more frequently does not lead anyplace good. Instead, listen to them and either present new facts or work with them on a plan you can all get behind.

In some circumstances, all that’s needed is honesty about the business trade-offs, so that your team feels like your “partner in crime”, not a victim of circumstance. For example, we’d tell our folks that by not falling behind on basic, unglamorous work, we earn the trust of our VPs and SVPs – and that this translates into the independence and the resources we need to pursue more ambitious ideas without being told what to do; it’s how we game the system, so to speak. Oh: leading by example is a pretty powerful tool at your disposal, too.

The human factor

I’ve come to appreciate that hiring decent folks who can get along with others is far more important than trying to recruit conference-circuit superstars. In fact, hiring superstars is a decidedly hit-and-miss affair: while certainly not a rule, there is a proportion of folks who put the maintenance of their celebrity status ahead of job responsibilities or the well-being of their peers.

For teams, one of the most powerful demotivators is a sense of unfairness and disempowerment. This is where tech-originating leaders can shine, because their teams usually feel that their bosses understand and can evaluate the merits of the work. But it also means you need to be decisive and actually solve problems for them, rather than just letting them vent. You will need to make unpopular decisions every now and then; in such cases, I think it’s important to move quickly, rather than prolonging the uncertainty – but it’s also important to sincerely listen to concerns, explain your reasoning, and be frank about the risks and trade-offs.

Whenever you see a clash of personalities on your team, you probably need to respond swiftly and decisively; being right should not justify being a bully. If you don’t react to repeated scuffles, your best people will probably start looking for other opportunities: it’s draining to put up with constant pie fights, no matter if the pies are thrown straight at you or if you just need to duck one every now and then.

More broadly, personality differences seem to be a much better predictor of conflict than any technical aspects underpinning a debate. As a boss, you need to identify such differences early on and come up with creative solutions. Sometimes, all you need is taking some badly-delivered but valid feedback and having a conversation with the other person, asking some questions that can help them reach the same conclusions without feeling that their worldview is under attack. Other times, the only path forward is making sure that some folks simply don’t run into each other for a while.

Finally, dealing with low performers is a notoriously hard but important part of the game. Especially within large companies, there is always the temptation to just let it slide: sideline a struggling person and wait for them to either get over their issues or leave. But this sends an awful message to the rest of the team; for better or worse, fairness is important to most. Simply firing the low performers is seldom the best solution, though; successful recovery cases are what sets great managers apart from the average ones.

Oh, one more thought: people in leadership roles have their allegiance divided between the company and the people who depend on them. The obligation to the company is more formal, but the impact you have on your team is longer-lasting and more intimate. When the obligations to the employer and to your team collide in some way, make sure you can make the right call; it might be one of the most consequential decisions you’ll ever make.

Wikto Scanner Download – Web Server Security Tool

Post Syndicated from Darknet original https://www.darknet.org.uk/2017/09/wikto-scanner-download-web-server-security-tool/?utm_source=rss&utm_medium=social&utm_campaign=darknetfeed

Wikto is an Open Source (GPL) web server scanner which performs comprehensive tests against web servers for multiple items, including over 3500 potentially dangerous files/CGIs, versions on over 900 servers, and version specific problems on over 250 servers.

It is essentially Nikto for Windows with some extra features; it is written in C# and requires the .NET framework.

What is Wikto

Wikto is not a web application scanner. It is totally unaware of the application (if any) that’s running on the web site.


Top 10 Most Obvious Hacks of All Time (v0.9)

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/07/top-10-most-obvious-hacks-of-all-time.html

For teaching hacking/cybersecurity, I thought I’d create a list of the most obvious hacks of all time. Not the best hacks, the most sophisticated hacks, or the hacks with the biggest impact, but the most obvious hacks: ones that even the least knowledgeable among us should be able to understand. Below I propose some hacks that fit this bill, though in no particular order.

The reason I’m writing this is that my niece wants me to teach her some hacking. I thought I’d start with the obvious stuff first.

Shared Passwords

If you use the same password for every website, and one of those websites gets hacked, then the hacker has your password for all your websites. The reason your Facebook account got hacked wasn’t because of anything Facebook did, but because you used the same email-address and password when creating an account on “beagleforums.com”, which got hacked last year.

I’ve heard people say “I’m secure, because I choose a complex password and use it everywhere”. No, this is the very worst thing you can do. Sure, you can use the same password on all sites you don’t care much about, but for Facebook, your email account, and your bank, you should have a unique password for each, so that when other sites get hacked, your important sites are secure.

And yes, it’s okay to write down your passwords on paper.

Tools: HaveIBeenPwned.com

PIN encrypted PDFs

My accountant emails PDF statements encrypted with the last 4 digits of my Social Security Number. This is not encryption — a 4 digit number has only 10,000 combinations, and a hacker can guess all of them in seconds.
PIN numbers for ATM cards work because ATM machines are online, and the machine can reject your card after four guesses. PIN numbers don’t work for documents, because they are offline — the hacker has a copy of the document on their own machine, disconnected from the Internet, and can continue making bad guesses with no restrictions.
Passwords protecting documents must be long enough that even trillions upon trillions of guesses are not enough to find them.
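The arithmetic can be sketched in a few lines of Python. This is only an illustration: `make_check` is a hypothetical stand-in for "try this PIN on my offline copy of the document" (here it just compares SHA-256 hashes), and the PIN `7351` is made up. A real attack would use a tool like Hashcat against the actual file format.

```python
import hashlib
import itertools


def make_check(secret_pin: str):
    """Hypothetical stand-in for 'try to open the offline document'."""
    digest = hashlib.sha256(secret_pin.encode()).hexdigest()

    def check(guess: str) -> bool:
        return hashlib.sha256(guess.encode()).hexdigest() == digest

    return check


def brute_force_pin(check, digits: int = 4):
    """Try every PIN of the given length; return (pin, attempts made)."""
    attempts = 0
    for combo in itertools.product("0123456789", repeat=digits):
        attempts += 1
        guess = "".join(combo)
        if check(guess):
            return guess, attempts
    return None, attempts


# Nothing limits the number of guesses, because the check runs offline.
pin, attempts = brute_force_pin(make_check("7351"))
print(f"found {pin} after {attempts} of {10 ** 4} possible guesses")
```

Because the whole search space is only 10,000 values, the loop always finishes almost immediately, whatever the PIN is.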

Tools: Hashcat, John the Ripper

SQL and other injection

The lazy way of combining websites with databases is to combine user input with an SQL statement. This combines code with data, so the obvious consequence is that hackers can craft data to mess with the code.
No, this isn’t obvious to the general public, but it should be obvious to programmers. The moment you write code that adds unfiltered user-input to an SQL statement, the consequence should be obvious. Yet, “SQL injection” has remained one of the most effective hacks for the last 15 years because somehow programmers don’t understand the consequence.
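As a minimal sketch of what "adding unfiltered user input to an SQL statement" looks like, here is a Python example using the standard `sqlite3` module. The table, the names, and the malicious string are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def lookup_unsafe(name):
    # BAD: user input is pasted into the SQL text, so data can become code.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(name):
    # Better: a placeholder keeps the input as data, never as SQL.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "nobody' OR '1'='1"
unsafe_rows = lookup_unsafe(malicious)  # the OR clause matches every row
safe_rows = lookup_safe(malicious)      # no user has that literal name
print(len(unsafe_rows), len(safe_rows))
```

The unsafe version returns every secret in the table, while the parameterized version returns nothing for the same input.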
CGI shell injection is a similar issue. Back in the early days, when “CGI scripts” were a thing, it was really important, but these days, not so much, so I just included it with SQL. The consequence of executing shell code should’ve been obvious, but weirdly, it wasn’t. The IT guy at the company I worked for back in the late 1990s came to me and asked “this guy says we have a vulnerability, is he full of shit?”, and I had to answer “no, he’s right, obviously so”.

XSS (“Cross Site Scripting”) [*] is another injection issue, but this time at somebody’s web browser rather than a server. It works because websites will echo back what is sent to them. For example, if you search for Cross Site Scripting with the URL https://www.google.com/search?q=cross+site+scripting, then you’ll get a page back from the server that contains that string. If the string is JavaScript code rather than text, then some servers (though not Google) send back the code in the page in a way that it’ll be executed. This is most often used to hack somebody’s account: you send them an email or tweet a link, and when they click on it, the JavaScript gives control of the account to the hacker.

Cross site injection issues like this should probably be their own category, but I’m including it here for now.
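The "echo back" problem can be sketched in Python. The two render functions below are hypothetical, not taken from any real site; the point is only the difference between echoing input verbatim and escaping it first with the standard library's `html.escape`.

```python
import html


def render_unsafe(query: str) -> str:
    # BAD: the search string is echoed into the page verbatim.
    return "<p>Results for: " + query + "</p>"


def render_safe(query: str) -> str:
    # Better: escape the string so the browser treats it as text.
    return "<p>Results for: " + html.escape(query) + "</p>"


payload = "<script>alert(1)</script>"
print(render_unsafe(payload))  # the script tag survives and would execute
print(render_safe(payload))    # the tag becomes harmless &lt;script&gt; text
```

Real applications usually rely on a templating system that escapes by default rather than calling an escape function by hand.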

More: Wikipedia on SQL injection, Wikipedia on cross site scripting.
Tools: Burpsuite, SQLmap

Buffer overflows

In the C programming language, programmers first create a buffer, then read input into it. If the input is longer than the buffer, then it overflows. The extra bytes overwrite other parts of the program, letting the hacker run code.
Again, it’s not a thing the general public is expected to know about, but is instead something C programmers should be expected to understand. They should know that it’s up to them to check the length and stop reading input before it overflows the buffer, that there’s no language feature that takes care of this for them.
We are three decades after the first major buffer overflow exploits, so there is no excuse for C programmers not to understand this issue.
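Python checks lengths for you, so it cannot overflow a buffer the way C does. The sketch below therefore only simulates the idea: a `bytearray` stands in for memory, with an 8-byte input buffer followed by a pretend saved return address. The layout and the names are assumptions made for illustration.

```python
BUF_SIZE = 8


def new_memory() -> bytearray:
    # Bytes 0-7: the input buffer. Bytes 8-15: a pretend return address.
    return bytearray(BUF_SIZE) + bytearray(b"RETADDR!")


def unchecked_copy(memory: bytearray, data: bytes) -> None:
    # Mimics an unchecked C copy: every input byte is written,
    # whether or not it still fits in the buffer.
    memory[0:len(data)] = data


def checked_copy(memory: bytearray, data: bytes) -> None:
    # What the programmer is supposed to do: check the length first.
    if len(data) > BUF_SIZE:
        raise ValueError("input longer than buffer")
    memory[0:len(data)] = data


mem = new_memory()
unchecked_copy(mem, b"A" * 8 + b"EVIL")  # 12 bytes into an 8-byte buffer
print(bytes(mem[BUF_SIZE:]))             # the "return address" was clobbered
```

In a real C program, the overwritten bytes might be an actual return address, which is what lets an attacker redirect execution.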

What makes buffer overflows particularly obvious is the way they are wrapped in exploits, like in Metasploit. While it is obvious that the bug is a bug, actually exploiting it can take some very non-obvious skill. However, once that exploit is written, any trained monkey can press a button and run the exploit. That’s where we get the insult “script kiddie” from: it refers to wannabe-hackers who never learn enough to write their own exploits, but who spend a lot of time running the exploit scripts written by better hackers than they.

More: Wikipedia on buffer overflow, Wikipedia on script kiddie,  “Smashing The Stack For Fun And Profit” — Phrack (1996)
Tools: bash, Metasploit

SendMail DEBUG command (historical)

The first popular email server in the 1980s was called “SendMail”. It had a feature whereby if you sent a “DEBUG” command to it, it would execute any code following the command. The consequence of this was obvious: hackers could (and did) upload code to take control of the server. This was used in the Morris Worm of 1988. Most Internet machines of the day ran SendMail, so the worm spread fast, infecting a large number of them.
This bug was mostly ignored at the time. It was thought of as a theoretical problem that might only rarely be used to hack a system. Part of the motivation of the Morris Worm was to demonstrate the consequences of such problems, consequences that should’ve been obvious but somehow were rejected by everyone.

More: Wikipedia on Morris Worm

Email Attachments/Links

I’m conflicted whether I should add this or not, because here’s the deal: you are supposed to click on attachments and links within emails. That’s what they are there for. The difference between good and bad attachments/links is not obvious. Indeed, easy-to-use email systems make detecting the difference harder.
On the other hand, the consequences of bad attachments/links are obvious. Worms like ILOVEYOU spread so easily because people trusted attachments coming from their friends, and ran them.
We have no solution to the problem of bad email attachments and links. Viruses and phishing are pervasive problems. Yet, we know why they exist.

Default and backdoor passwords

The Mirai botnet was caused by surveillance-cameras having default and backdoor passwords, and being exposed to the Internet without a firewall. The consequence should be obvious: people will discover the passwords and use them to take control of the bots.
Surveillance-cameras have the problem that they are usually exposed to the public, and can’t be reached without a ladder (often a really tall ladder). Therefore, you don’t want a button consumers can press to reset to factory defaults. You want a remote way to reset them. Therefore, manufacturers put in backdoor passwords to do the reset. Such passwords are easy for hackers to reverse-engineer, which lets them take control of millions of cameras across the Internet.
The same reasoning applies to “default” passwords. Many users will not change the defaults, leaving a ton of devices hackers can hack.

Masscan and background radiation of the Internet

I’ve written a tool that can easily scan the entire Internet in a short period of time. It surprises people that this is possible, but it is obvious from the numbers. Internet addresses are only 32 bits long, or roughly 4 billion combinations. A fast Internet link can easily handle 1 million packets-per-second, so the entire Internet can be scanned in roughly 4000 seconds, a little more than an hour. It’s basic math.
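The basic math, spelled out as a small Python calculation. It assumes one probe packet per address and the 1 million packets-per-second rate quoted above; real scans also spend time on retries and on the ports being probed.

```python
ADDRESS_BITS = 32               # size of an IPv4 address
PACKETS_PER_SECOND = 1_000_000  # the rate assumed in the text


def scan_time_seconds(bits: int = ADDRESS_BITS,
                      pps: int = PACKETS_PER_SECOND) -> float:
    """Seconds needed to send one packet to every possible address."""
    return (2 ** bits) / pps


seconds = scan_time_seconds()
print(f"{2 ** ADDRESS_BITS:,} addresses")
print(f"{seconds:.0f} seconds, or {seconds / 60:.1f} minutes")
```

The exact figure is a bit over the rounded 4000 seconds, but it is still only slightly more than an hour.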
Because it’s so easy, many people do it. If you monitor your Internet link, you’ll see a steady trickle of packets coming in from all over the Internet, especially Russia and China, from hackers scanning the Internet for things they can hack.
People’s reaction to this scanning is weirdly emotional, taking it personally, such as:
  1. Why are they hacking me? What did I do to them?
  2. Great! They are hacking me! That must mean I’m important!
  3. Grrr! How dare they?! How can I hack them back for some retribution!?

I find this odd, because obviously such scanning isn’t personal, the hackers have no idea who you are.

Tools: masscan, firewalls

Packet-sniffing, sidejacking

If you connect to the Starbucks WiFi, a hacker nearby can easily eavesdrop on your network traffic, because it’s not encrypted. Windows even warns you about this, in case you weren’t sure.

At DefCon, they have a “Wall of Sheep”, where they show passwords from people who logged onto stuff using the insecure “DefCon-Open” network. They call them “sheep” for not grasping the basic fact that unencrypted traffic is unencrypted.

To be fair, it’s actually non-obvious to many people. Even if the WiFi itself is not encrypted, SSL traffic is. They expect their services to be encrypted, without them having to worry about it. And in fact, most are, especially Google, Facebook, Twitter, Apple, and other major services that won’t allow you to log in anymore without encryption.

But many services (especially old ones) may not be encrypted. Unless users check and verify them carefully, they’ll happily expose passwords.

What’s interesting is that 10 years ago, most services used SSL only to encrypt the passwords, then used unencrypted connections after that, relying on “cookies”. This allowed the cookies to be sniffed and stolen, allowing other people to share the login session. I used this on stage at BlackHat to connect to somebody’s GMail session. Google, and other major websites, fixed this soon after. But it should never have been a problem, because the sidejacking of cookies should have been obvious.
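One common mitigation, which is not necessarily what any particular site did, is to serve everything over HTTPS and mark the session cookie `Secure`, so the browser never sends it over an unencrypted connection. A minimal sketch using Python's standard `http.cookies` module, with a made-up cookie name and session ID:

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str, secure: bool) -> str:
    """Build a Set-Cookie header for a hypothetical 'session' cookie."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["httponly"] = True  # hide the cookie from page scripts
    if secure:
        cookie["session"]["secure"] = True  # send only over HTTPS
    return cookie.output()


# Without Secure, the cookie also travels over plain HTTP, where it can be sniffed.
print(session_cookie_header("abc123", secure=False))
# With Secure, the browser withholds it from unencrypted requests.
print(session_cookie_header("abc123", secure=True))
```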

Tools: Wireshark, dsniff

Stuxnet LNK vulnerability

Again, this issue isn’t obvious to the public, but it should’ve been obvious to anybody who knew how Windows works.
When Windows loads a .dll, it first calls the function DllMain(). A Windows link file (.lnk) can load icons/graphics from the resources in a .dll file. It does this by loading the .dll file, thus calling DllMain. Thus, a hacker could put on a USB drive a .lnk file pointing to a .dll file, and thus, cause arbitrary code execution as soon as a user inserted a drive.
I say this is obvious because I did this, created .lnks that pointed to .dlls, but without hostile DllMain code. The consequence should’ve been obvious to me, but I totally missed the connection. We all missed the connection, for decades.

Social Engineering and Tech Support [* * *]

After posting this, many people have pointed out “social engineering”, especially of “tech support”. This probably should be up near #1 in terms of obviousness.

The classic example of social engineering is when you call tech support and tell them you’ve lost your password, and they reset it for you with a minimum of questions proving who you are. For example, you set the volume on your computer really loud and play the sound of a crying baby in the background and appear to be a bit frazzled and incoherent, which explains why you aren’t answering the questions they are asking. They, understanding your predicament as a new parent, will go the extra mile in helping you, resetting “your” password.

One of the interesting consequences is how it affects domain names (DNS). It’s quite easy in many cases to call up the registrar and convince them to transfer a domain name. This has been used in lots of hacks. It’s really hard to defend against. If a registrar charges only $9/year for a domain name, then it really can’t afford to provide very good tech support — or very secure tech support — to prevent this sort of hack.

Social engineering is such a huge problem, and such an obvious problem, that it’s outside the scope of this document. Just google it to find example after example.

A related issue that perhaps deserves its own section is OSINT [*], or “open-source intelligence”, where you gather public information about a target. For example, on the day the bank manager is out on vacation (which you got from their Facebook post) you show up and claim to be a bank auditor, and are shown into their office where you grab their backup tapes. (We’ve actually done this).

More: Wikipedia on Social Engineering, Wikipedia on OSINT, “How I Won the Defcon Social Engineering CTF” — blogpost (2011), “Questioning 42: Where’s the Engineering in Social Engineering of Namespace Compromises” — BSidesLV talk (2016)

Blue-boxes (historical) [*]

Telephones historically used what we call “in-band signaling”. That’s why when you dial on an old phone, it makes sounds — those sounds are sent no differently than the way your voice is sent. Thus, it was possible to make tone generators to do things other than simply dial calls. Early hackers (in the 1970s) would make tone-generators called “blue-boxes” and “black-boxes” to make free long distance calls, for example.
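To illustrate why in-band signaling is so easy to imitate, here is a Python sketch that generates the audio samples for a touch-tone (DTMF) key. Each key is just the sum of two sine waves, so anything that can produce sound can produce the signal. The sample rate and duration are arbitrary choices, and historical blue boxes used a different set of tones than the keypad ones shown here.

```python
import math

# Standard DTMF keypad: each key combines one row tone and one column tone.
ROWS = (697, 770, 852, 941)  # Hz
COLS = (1209, 1336, 1477)    # Hz
KEYS = ("123", "456", "789", "*0#")
DTMF = {
    key: (ROWS[r], COLS[c])
    for r, row in enumerate(KEYS)
    for c, key in enumerate(row)
}


def tone_samples(key: str, duration: float = 0.1, rate: int = 8000):
    """Return audio samples in [-1, 1] for one keypress."""
    low, high = DTMF[key]
    count = round(duration * rate)
    return [
        0.5 * math.sin(2 * math.pi * low * t / rate)
        + 0.5 * math.sin(2 * math.pi * high * t / rate)
        for t in range(count)
    ]


samples = tone_samples("5")
print(DTMF["5"], len(samples))
```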

These days, “signaling” and “voice” are digitized, then sent as separate channels or “bands”. This is called “out-of-band signaling”. You can’t trick the phone system by generating tones. When your iPhone makes sounds when you dial, it’s entirely for your benefit and has nothing to do with how it signals the cell tower to make a call.

Early hackers, like the founders of Apple, are famous for having started their careers making such “boxes” for tricking the phone system. The problem was obvious back in the day, which is why it was fixed as the phone system moved from analog to digital.

More: Wikipedia on blue box, Wikipedia article on Steve Wozniak.

Thumb drives in parking lots [*]

A simple trick is to put a virus on a USB flash drive, and drop it in a parking lot. Somebody is bound to notice it, stick it in their computer, and open the file.

This can be extended with tricks. For example, you can put a file labeled “third-quarter-salaries.xlsx” on the drive that requires macros to be run in order to open. It’s irresistible to other employees who want to know what their peers are being paid, so they’ll bypass any warning prompts in order to see the data.

Another example is to go online and get custom USB sticks printed with the logo of the target company, making them seem more trustworthy.

We also did a trick of taking an Adobe Flash game “Punch the Monkey” and replacing the monkey with a logo of a competitor of our target. They not only played the game (infecting themselves with our virus), but gave it to others inside the company to play, infecting others, including the CEO.

Thumb drives like this have been used in many incidents, such as Russians hacking military headquarters in Afghanistan. It’s really hard to defend against.

More: “Computer Virus Hits U.S. Military Base in Afghanistan” — USNews (2008), “The Return of the Worm That Ate The Pentagon” — Wired (2011), DoD Bans Flash Drives — Stripes (2008)

Googling [*]

Search engines like Google will index your website — your entire website. Frequently companies put things on their website without much protection because they are nearly impossible for users to find. But Google finds them, then indexes them, causing them to pop up with innocent searches.
There are books written on “Google hacking” explaining what search terms to look for, like “not for public release”, in order to find such documents.

More: Wikipedia entry on Google Hacking, “Google Hacking” book.

URL editing [*]

At the top of every browser is what’s called the “URL”. You can change it. Thus, if you see a URL that looks like this:

http://www.example.com/documents?id=138493

Then you can edit it to see the next document on the server:

http://www.example.com/documents?id=138494

The owner of the website may think they are secure, because nothing points to this document, so the Google search won’t find it. But that doesn’t stop a user from manually editing the URL.
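The edit is trivial to automate. A minimal Python sketch using the standard `urllib.parse` module; the function name and the `id` parameter are illustrative and simply follow the example URL above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def next_document_url(url: str, param: str = "id") -> str:
    """Return the same URL with the numeric query parameter increased by one."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    query[param] = [str(int(query[param][0]) + 1)]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


print(next_document_url("http://www.example.com/documents?id=138493"))
```

Hiding a document behind an unlinked URL is therefore not protection; the server has to check whether the requester is allowed to see it.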
An example of this is a big Fortune 500 company that posts the quarterly results to the website an hour before the official announcement. Simply editing the URL from previous financial announcements allows hackers to find the document, then buy/sell the stock as appropriate in order to make a lot of money.
Another example is the classic case of Andrew “Weev” Auernheimer who did this trick in order to download the account email addresses of early owners of the iPad, including movie stars and members of the Obama administration. It’s an interesting legal case because on one hand, techies consider this so obvious as to not be “hacking”. On the other hand, non-techies, especially judges and prosecutors, believe this to be obviously “hacking”.

DDoS, spoofing, and amplification [*]

For decades now, online gamers have figured out an easy way to win: just flood the opponent with Internet traffic, slowing their network connection. This is called a DoS, which stands for “Denial of Service”. DoSing game competitors is often a teenager’s first foray into hacking.
A variant of this is when you hack a bunch of other machines on the Internet, then command them to flood your target. (The hacked machines are often called a “botnet”, a network of robot computers). This is called DDoS, or “Distributed DoS”. At this point, it gets quite serious, as hackers can take down entire businesses rather than just competing gamers. Extortion scams, in which hackers DDoS a website and then demand payment to stop, are a common way hackers earn money.
Another form of DDoS is “amplification”. Sometimes when you send a packet to a machine on the Internet it’ll respond with a much larger response, either a very large packet or many packets. The hacker can then send a packet to many of these sites, “spoofing” or forging the IP address of the victim. This causes all those sites to then flood the victim with traffic. Thus, with a small amount of outbound traffic, the hacker can flood the inbound traffic of the victim.
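The leverage can be put into numbers with a short Python calculation. The packet sizes and bandwidth below are made up purely for illustration; real amplification factors vary widely by protocol.

```python
def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """How many bytes reach the victim per byte the attacker sends."""
    return response_bytes / request_bytes


def victim_inbound_bps(attacker_outbound_bps: float, factor: float) -> float:
    """Traffic arriving at the spoofed address, given the attacker's output."""
    return attacker_outbound_bps * factor


# Hypothetical numbers: a 60-byte spoofed request triggers a 3,000-byte reply.
factor = amplification_factor(request_bytes=60, response_bytes=3000)
flood = victim_inbound_bps(attacker_outbound_bps=10_000_000, factor=factor)
print(f"{factor:.0f}x amplification: 10 Mbps out becomes {flood / 1e6:.0f} Mbps in")
```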
This is one of those things that has worked for 20 years, because it’s so obvious teenagers can do it, yet there is no obvious solution. President Trump’s executive order on cybersecurity specifically demanded that his government come up with a report on how to address this, but it’s unlikely that they’ll come up with any useful strategy.

More: Wikipedia on DDoS, Wikipedia on Spoofing

Conclusion

Tweet me (@ErrataRob) your obvious hacks, so I can add them to the list.

Security updates for Thursday

Post Syndicated from jake original https://lwn.net/Articles/727815/rss

Security updates have been issued by Arch Linux (irssi), CentOS (httpd and kernel), Debian (nginx), Fedora (perl-DBD-MySQL and qt5-qtwebengine), Mageia (apache-mod_fcgid, cairo, jbig2dec, nodejs, and sudo), openSUSE (libreoffice, spice, and systemd), Red Hat (python-django-horizon), and SUSE (kernel and xorg-x11-server).

The Future of Forgeries

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/07/the_future_of_f_1.html

This article argues that AI technologies will make image, audio, and video forgeries much easier in the future.

Combined, the trajectory of cheap, high-quality media forgeries is worrying. At the current pace of progress, it may be as little as two or three years before realistic audio forgeries are good enough to fool the untrained ear, and only five or 10 years before forgeries can fool at least some types of forensic analysis. When tools for producing fake video perform at higher quality than today’s CGI and are simultaneously available to untrained amateurs, these forgeries might comprise a large part of the information ecosystem. The growth in this technology will transform the meaning of evidence and truth in domains across journalism, government communications, testimony in criminal justice, and, of course, national security.

I am not worried about fooling the “untrained ear,” and more worried about fooling forensic analysis. But there’s an arms race here. Recording technologies will get more sophisticated, too, making their outputs harder to forge. Still, I agree that the advantage will go to the forgers and not the forgery detectors.

Security updates for Friday

Post Syndicated from jake original https://lwn.net/Articles/723927/rss

Security updates have been issued by CentOS (kernel), Debian (graphicsmagick, imagemagick, kde4libs, and puppet), Fedora (FlightCrew, kernel, libvncserver, and wordpress), Gentoo (adobe-flash, smb4k, teeworlds, and xen), Mageia (kernel, kernel-linus, kernel-tmb, and perl-CGI-Emulate-PSGI), openSUSE (GraphicsMagick and rpcbind), Oracle (kernel), Red Hat (kernel and kernel-rt), and Scientific Linux (kernel).

Encrypt and Decrypt Amazon Kinesis Records Using AWS KMS

Post Syndicated from Temitayo Olajide original https://aws.amazon.com/blogs/big-data/encrypt-and-decrypt-amazon-kinesis-records-using-aws-kms/

Customers with strict compliance or data security requirements often require data to be encrypted at all times, including at rest or in transit within the AWS cloud. This post shows you how to build a real-time streaming application using Kinesis in which your records are encrypted while at rest or in transit.

Amazon Kinesis overview

The Amazon Kinesis platform enables you to build custom applications that analyze or process streaming data for specialized needs. Amazon Kinesis can continuously capture and store terabytes of data per hour from hundreds of thousands of sources such as website clickstreams, financial transactions, social media feeds, IT logs, and transaction tracking events.

By using HTTPS, Amazon Kinesis Streams encrypts data in flight between clients and the service, which protects against someone eavesdropping on records as they are transferred. However, records encrypted by HTTPS are decrypted once the data enters the service. This data is stored at rest for 24 hours (configurable up to 168 hours) to ensure that your applications have enough headroom to process, replay, or catch up if they fall behind.

Walkthrough

In this post you build encryption and decryption into sample Kinesis producer and consumer applications using the Amazon Kinesis Producer Library (KPL), the Amazon Kinesis Client Library (KCL), AWS KMS, and the aws-encryption-sdk. The methods and techniques used in this post to encrypt and decrypt Kinesis records can be easily replicated in your architecture. Some constraints:

  • AWS charges for KMS API requests for encryption and decryption. For more information, see AWS KMS Pricing.
  • You cannot use Amazon Kinesis Analytics to query Amazon Kinesis Streams with records encrypted by clients in this sample application.
  • If your application requires low latency processing, note that there will be a slight hit in latency.

The following diagram shows the architecture of the solution.

Encrypting the records at the producer

Before you call the PutRecord or PutRecords API, you will encrypt the string record by calling KinesisEncryptionUtils.toEncryptedString.

In this example, we used a sample stock sales ticker object:

{"tickerSymbol": "AMZN", "salesPrice": "900", "orderId": "300", "timestamp": "2017-01-30 02:41:38"}

The KinesisEncryptionUtils.toEncryptedString method takes four parameters:

  • com.amazonaws.encryptionsdk.AwsCrypto
  • stock sales ticker object
  • com.amazonaws.encryptionsdk.kms.KmsMasterKeyProvider
  • java.util.Map of an encryption context

The ciphertext is returned to the caller, which then checks its size by calling KinesisEncryptionUtils.calculateSizeOfObject. Encryption increases the size of an object. To prevent the record from being throttled, the size of the payload (one or more records) is validated to ensure that it is not greater than 1 MB. In this example, encrypted records whose payload exceeds 1 MB are logged as a warning. Records within the limit are then sent by calling addUserRecord if you are using the KPL, or PutRecord or PutRecords if you are using the Kinesis Streams API.

Example: Encrypting records with KPL

//Encrypting the records
String encryptedString = KinesisEncryptionUtils.toEncryptedString(crypto, ticker, prov, context);
log.info("Size of encrypted object is : " + KinesisEncryptionUtils.calculateSizeOfObject(encryptedString));
//check if size of record is greater than 1MB
if (KinesisEncryptionUtils.calculateSizeOfObject(encryptedString) > 1024000)
    log.warn("Record added is greater than 1MB and may be throttled");
//UTF-8 encoding of encrypted record
ByteBuffer data = KinesisEncryptionUtils.toEncryptedByteStream(encryptedString);
//Adding the encrypted record to stream
ListenableFuture<UserRecordResult> f = producer.addUserRecord(streamName, randomPartitionKey(), data);
Futures.addCallback(f, callback);

In the above code, the example sales ticker record is passed to KinesisEncryptionUtils.toEncryptedString and an encrypted string is returned. The encryptedString value is also passed to KinesisEncryptionUtils.calculateSizeOfObject, and a warning is logged if the encrypted payload is larger than 1 MB. The payload is then UTF-8 encoded (KinesisEncryptionUtils.toEncryptedByteStream) and sent to the stream for processing.
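The size check matters because the encoded ciphertext is larger than the plaintext it came from. The Python sketch below illustrates only that effect and is not part of the sample application. It assumes the ciphertext string is base64 text (the printed records in this post look that way) and uses random bytes as a stand-in, so nothing here is actually encrypted. The names encoded_size and exceeds_limit are hypothetical helpers; the threshold is the same 1024000 used in the Java snippets.

```python
import base64
import os

MAX_RECORD_BYTES = 1024000  # same threshold the Java sample compares against


def encoded_size(text):
    """Return the number of bytes the string occupies once UTF-8 encoded."""
    return len(text.encode("utf-8"))


def exceeds_limit(text, limit=MAX_RECORD_BYTES):
    """True when the encoded record is larger than the limit."""
    return encoded_size(text) > limit


# Stand-in for a ciphertext: random bytes, base64-encoded into a string.
# This is NOT encryption; it only mimics how encoding inflates the payload.
raw = os.urandom(900000)
encoded = base64.b64encode(raw).decode("ascii")

print(len(raw), encoded_size(encoded), exceeds_limit(encoded))
```

A 900,000-byte payload that would fit under the threshold on its own grows by a third once base64-encoded and no longer fits, which is why the check is made on the encrypted string rather than on the original record.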

Example: Encrypting the records with Streams PutRecord

//Encrypting the records
String encryptedString = KinesisEncryptionUtils.toEncryptedString(crypto, ticker, prov, context);
log.info("Size of encrypted object is : " + KinesisEncryptionUtils.calculateSizeOfObject(encryptedString));
//check if size of record is greater than 1MB
if (KinesisEncryptionUtils.calculateSizeOfObject(encryptedString) > 1024000)
    log.warn("Record added is greater than 1MB and may be throttled");
//UTF-8 encoding of encrypted record
ByteBuffer data = KinesisEncryptionUtils.toEncryptedByteStream(encryptedString);
putRecordRequest.setData(data);
putRecordRequest.setPartitionKey(randomPartitionKey());
//putting the record into the stream
kinesis.putRecord(putRecordRequest);

Verifying that records are encrypted

After the call to KinesisEncryptionUtils.toEncryptedString, you can print out the encrypted string record just before UTF-8 encoding. An example of what is printed to standard output when running this sample application is shown below.

[main] INFO kinesisencryption.streams.EncryptedProducerWithStreams - String Record is TickerSalesObject{tickerSymbol='FB', salesPrice='184.285409142', orderId='2a0358f1-9f8a-4bbe-86b3-c2929047e15d', timeStamp='2017-01-30 02:41:38'} and Encrypted Record String is AYADeMf6zmVg9JvIkGNv5M39rhUAbgACAAdLaW5lc2lzAARjYXJzABVhd3MtY3J5cHRvLXB1YmxpYy1rZXkAREFpUkpCaG1UOFQ3UTZQZ253dm9FSU9iUDZPdE1xTHdBZ1JjNlZxN2doMDZ3QlBEWUZndWdJSEFKaXNvT0ZPUGsrdz09AAEAB2F3cy1rbXMAS2Fybjphd3M6a21zOnVzLWVhc3QtMTo1NzM5MDY1ODEwMDI6a2V5LzM3ZGM5MGRjLTNmMWMtNGE3Ny1hNTFkLWE2NTNiMTczZmNkYgCnAQEBAHgbPoaYTiF/oIMp49yPBkZmVVotylZpUqwkkzJJicLjLQAAAH4wfAYJKoZIhvcNAQcGoG8wbQIBADBoBgkqhkiG9w0BBwEwHgYJYIZIAWUDBAEuMBEEDCCYBk+hfB3tOGVx7QIBEIA7FqaEcOWpic+gKNeT+dUe4yttB9dsZSFPAUTlz2L2zlyLXSLMh1otRH24SO485ov+TCTtRCgiA8a9rYQCAAAAAAwAABAArlGWPO8BavNSJIpJOtJekRUhOwbM+WM1NBVXB/////8AAAABXNZnRND3J7u8EZx3AAAAkfSxVPMYUv0Ovrd4AIUTmMcaiR0Z+IcJNAXqAhvMmDKpsJaQG76Q6pYExarolwT+6i87UOi6TGvAiPnH74GbkEniWe66rAF6mOra2JkffK6pBdhh95mEOGLaVPBqs2jswUTfdcBJQl9NEb7wx9XpFX8fNDF56Vly7u6f8OQ7lY6fNrOupe5QBFnLvwehhtogd72NTQ/yEbDDoPKUZN3IlWIEAGYwZAIwISFw+zdghALtarsHSIgPMs7By7/Yuda2r3hqSmqlCyCXy7HMFIQxHcEILjiLp76NAjB1D8r8TC1Zdzsfiypi5X8FvnK/6EpUyFoOOp3y4nEuLo8M2V/dsW5nh4u2/m1oMbw=

You can also verify that the record stayed encrypted in Streams by printing out the UTF-8 decoded received record immediately after the getRecords API call. An example of the print output when running the sample application is shown below.

[Thread-2] INFO kinesisencryption.utils.KinesisEncryptionUtils - Verifying object received from stream is encrypted. -Encrypted UTF-8 decoded : AYADeBJz/kt7Fm3L1lvS8Wy8jhAAbgACAAdLaW5lc2lzAARjYXJzABVhd3MtY3J5cHRvLXB1YmxpYy1rZXkAREFrM2N4K2s1ODJuOGVlNWF3TVJ1dk1UUHZQc2FHeGoxQisxb09kNWtDUExHYjJDS0lMZW5LSnlYakRmdFR4dzQyUT09AAEAB2F3cy1rbXMAS2Fybjphd3M6a21zOnVzLWVhc3QtMTo1NzM5MDY1ODEwMDI6a2V5LzM3ZGM5MGRjLTNmMWMtNGE3Ny1hNTFkLWE2NTNiMTczZmNkYgCnAQEBAHgbPoaYTiF/oIMp49yPBkZmVVotylZpUqwkkzJJicLjLQAAAH4wfAYJKoZIhvcNAQcGoG8wbQIBADBoBgkqhkiG9w0BBwEwHgYJYIZIAWUDBAEuMBEEDAGI3oWLlIJ2p6kffQIBEIA7JVUOTsLtEyNK8vS4GIS9iyTejuB2xhIpRXfG8o0lUfHawcrCbNbNH8XLm/8RW5JbgXo10EpOs8dSjkICAAAAAAwAABAAy64r24sGVKWN4C1gXCwJYHvZkLpJJj16SZlhpv////8AAAABg2pPFchIiaM7D9VuAAAAkwh10ul5sZQ08KsgkFszOOvFoQu95CiY7cK8H+tBloVOZglMqhhhvoIIZLr9hmI8/lQvRXzGDdo7Xkp0FAT5Jpztt8Hq/ZuLfZtNYIWOw594jShqpZt6uXMdMnpb/38R3e5zLK5vrYkM6NS4WPMFrHsOKN5tn0CDForgojRcdpmCJ8+cWLNltb2S+EJiWiyWS+ibw2vJ/RFm6WZO6nD+MXn3vyMAZzBlAjAuIUTYL1cbQ3ENxDIeXHJAWQguNPqxq4HgaCmCEI9/rn/GAKSc2nT9ln3UsVq/2dgCMQC7yNJ3DCTnppavfxTbcVS+rXaDDpZZx/ZsluMqXAFM5/FFvKRqr0dVML28tGunxmU=

Decrypting the records at the consumer

After your consumer receives the records as a list, you can get the data of each record as a ByteBuffer by calling record.getData. You then decode and decrypt the ByteBuffer by calling KinesisEncryptionUtils.decryptByteStream. This method takes five parameters:

  • com.amazonaws.encryptionsdk.AwsCrypto
  • record ByteBuffer
  • com.amazonaws.encryptionsdk.kms.KmsMasterKeyProvider
  • key arn string
  • java.util.Map of your encryption context

A string representation of the ticker sales object is returned back to the caller for further processing. In this example, this representation is just printed to standard output.

[Thread-2] INFO kinesisencryption.streams.DecryptShardConsumerThread - Decrypted Text Result is TickerSalesObject{tickerSymbol='AMZN', salesPrice='304.958313333', orderId='50defaf0-1c37-4e84-85d7-bc15597355eb', timeStamp='2017-01-30 02:41:38'}

Example: Decrypting records with the KCL and Streams API

ByteBuffer buffer = record.getData();
//Decrypting the encrypted record data
String decryptedResult = KinesisEncryptionUtils.decryptByteStream(crypto, buffer, prov, this.getKeyArn(), this.getContext());
log.info("Decrypted Text Result is " + decryptedResult);

With the above code, records in the Kinesis stream are decrypted using the same key ARN and encryption context that were used to encrypt them on the producer side.

Maven dependencies

To use the implementation I’ve outlined in this post, add the Maven dependencies below to your pom.xml, including the Bouncy Castle libraries. Bouncy Castle provides a cryptography API for Java.

<dependency>
    <groupId>org.bouncycastle</groupId>
    <artifactId>bcprov-ext-jdk15on</artifactId>
    <version>1.54</version>
</dependency>
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-encryption-sdk-java</artifactId>
    <version>0.0.1</version>
</dependency>

Summary

You can incorporate the sample code snippets above, or use them as a guide, to start encrypting and decrypting records to and from an Amazon Kinesis stream in your application code.

A complete producer and consumer example application, along with a more detailed step-by-step example of developing an Amazon Kinesis producer and consumer application on AWS with encrypted records, is available in the kinesisencryption GitHub repository.

If you have questions or suggestions, please comment below.


About the Author

Temitayo Olajide is a Cloud Support Engineer with Amazon Web Services. He works with customers to provide architectural solutions, support, and guidance on implementing high-velocity streaming data applications in the cloud. In his spare time, he plays ping-pong and hangs out with family and friends.

 

 



Security updates for Friday

Post Syndicated from ris original https://lwn.net/Articles/718027/rss

Security updates have been issued by Arch Linux (libpurple), Debian (audiofile, cgiemail, and imagemagick), Fedora (cloud-init, empathy, and mupdf), Mageia (firefox and thunderbird), Scientific Linux (icoutils and openjpeg), Slackware (mcabber and samba), and Ubuntu (eglibc).

Monday’s security advisories

Post Syndicated from jake original http://lwn.net/Articles/697941/rss

Arch Linux has updated linux-lts (connection hijacking).

CentOS has updated kernel (C7: connection hijacking).

Debian-LTS has updated cracklib2 (code execution) and suckless-tools (screen lock bypass).

Fedora has updated firewalld (F24: authentication bypass), glibc (F24: denial of service on armhfp), knot (F24; F23: denial of service), libgcrypt (F24: bad random number generation), and perl (F23: privilege escalation).

openSUSE has updated apache2-mod_fcgid (42.1, 13.2: proxy injection), gd (13.2: multiple vulnerabilities), iperf (SPHfSLE12; 42.1, 13.2: denial of service), pdns (42.1, 13.2: denial of service), python3 (42.1, 13.2: multiple vulnerabilities), roundcubemail (42.1; 13.2; 13.1: multiple vulnerabilities, two from 2015), and typo3-cms-4_7 (42.1, 13.2: three vulnerabilities from 2013 and 2014).

Scientific Linux has updated kernel (SL7: connection hijacking) and python (SL6&7: three vulnerabilities).

Astro Pi: Mission Update 9 – Science Results

Post Syndicated from David Honess original https://www.raspberrypi.org/blog/astro-pi-mission-update-9-science-results/

Liz: Before we get down to business, we’ve a notice to share. Laura Clay, who is behind the scenes editing this blog, The MagPi and much more, is also a fiction writer; and she’s been chosen as one of 17 Emerging Writers by the Edinburgh UNESCO City of Literature Trust. Each writer will be reading a short story at the Edinburgh International Book Festival, and it’s a great way to discover writers living and working in the city at the start of their careers. Laura will be reading her story Loch na Bèiste on Friday 26 August at 3pm in the Spiegeltent, and entry is free, so why not come along and support her? Warning: story may contain murderous kelpies.

Now that British ESA Astronaut Tim Peake is back on the ground it’s time for the final Astro Pi mission update: the summary of the experiment results from the International Space Station (ISS). We’ve been holding this back to give the winners some time to publish the results of their experiments themselves.

Back in 2015 we ran a competition where students could design and program computer science experiments, to be run by Tim Peake on specially cased Raspberry Pis called Astro Pis. Here’s the original competition video, voiced by Tim himself:

[Video: “Astro Pi” by raspberrypi on Vimeo]

The competition ran from January to July 2015 and produced seven winning experiments, which were launched into space a few days before Tim started his mission. Between February and April 2016, these experiments were run on board the ISS under Tim Peake’s supervision. They’re mostly based around the sensors found on the Sense HAT, but a few also employ the Raspberry Pi Camera Module. Head over to the Astro Pi website now to check out the results, released today!

You might also know that we ran an extension to this competition involving a couple of music-based challenges. These challenges have no scientific output to discuss, because they were part of a crew care package for Tim’s enjoyment, but you can get your hands on the winning code to turn the Astro Pis into MP3 players and Sonic Pi tunes.

One of the main things we’ve learnt from running Astro Pi is that the biggest motivational factor for young people is the very tangible goal of having their code run in space. This eclipses any physical prize we could offer. Many people see space as quite distant and abstract, but with Astro Pi you can actually get your hands on space-qualified hardware, create something that would work up in space, and become an active participant in the European space programme.

Many of the Astro Pi winners now express an interest in studying aerospace and computer science. They’ve gained exposure to the real-life process of scientific endeavour, and faced industrial software development challenges along the way. We hope that everyone who participated in Astro Pi has been positively influenced by the programme. The results also demonstrate that the payload works reliably in space. This has been noticed by ESA, who are now planning to use it during upcoming missions. It’s really important for us that the payload continues to be used to run your code in space, so we’re working hard with ESA to make sure that we can do Astro Pi all over again.

This project has been a huge collaborative effort from the start and the Raspberry Pi Foundation would like to thank everyone who has participated in the competitions, and the following companies who have contributed staff time, facilities, and funding to make it all happen: UK Space Agency, European Space Agency, BIOTESC, TLOGOS, Surrey Satellite Technology, Airbus Defence and Space, CGI Group, QinetiQ Space, UK Space Trade Association, ESERO UK, KTN Space, and Nesta. Of course, Tim Peake himself has been hugely supportive and enthusiastic about the project from the start.

British ESA Astronaut Tim Peake with the prototype Astro Pi. Image credit ESA.

We would also like to thank Libby Jackson, who is the Astronaut Flight Education Programme Manager at the UK Space Agency and a former flight director at ESA. She oversees all of the Principia educational activities, including Astro Pi.

Libby Jackson, UK Space Agency. Image credit Imperial College London.

During the interview for her job at the UK Space Agency a few years ago, she pitched an idea for running a project on the ISS involving Raspberry Pi computers. Instead of launching traditional physical equipment, the experiments would be in the form of computer software, meaning that many more experiments could be accommodated. That kernel of an idea is what eventually became Astro Pi.

Izzy deployed on the Nadir Hatch window of Node 2. Image credit ESA.

The post Astro Pi: Mission Update 9 – Science Results appeared first on Raspberry Pi.

Security advisories for Tuesday

Post Syndicated from ris original http://lwn.net/Articles/696802/rss

Arch Linux has updated curl (three vulnerabilities).

Debian has updated chromium-browser (multiple vulnerabilities) and fontconfig (privilege escalation).

Debian-LTS has updated libreoffice (code execution) and python-django (rebase to 1.4.x).

Fedora has updated bind99 (F23: denial of service), ca-certificates (F23: certificate update), dhcp (F23: denial of service), dnsmasq (F23: denial of service), flex (F24: buffer overflow), fontconfig (F24: privilege escalation), kernel (F24; F23: two vulnerabilities), libidn (F23: multiple vulnerabilities), libreswan (F23: unspecified), nodejs-tough-cookie (F24: denial of service), pdns (F24: denial of service), perl-CGI-Emulate-PSGI (F24; F23: HTTP redirect), perl-Module-Load-Conditional (F24; F23: privilege escalation), v8 (F24; F23: denial of service), and xen (F23: multiple vulnerabilities).

Mageia has updated chromium-browser-stable (multiple vulnerabilities), firefox (multiple vulnerabilities), and openntpd/busybox (denial of service).

Red Hat has updated chromium-browser (RHEL6: multiple vulnerabilities), kernel (RHEL6.4: privilege escalation), nodejs010-nodejs-minimatch (RHSCL: denial of service), and rh-nodejs4-nodejs-minimatch (RHSCL: denial of service).

SUSE has updated kernel (SLE11-SP4: multiple vulnerabilities).

Ubuntu has updated curl (three vulnerabilities).

Python FAQ: How do I port to Python 3?

Post Syndicated from Eevee original https://eev.ee/blog/2016/07/31/python-faq-how-do-i-port-to-python-3/

Part of my Python FAQ, which is doomed to never be finished.

Maybe you have a Python 2 codebase. Maybe you’d like to make it work with Python 3. Maybe you really wish someone would write a comically long article on how to make that happen.

I have good news! You’re already reading one.

(And if you’re not sure why you’d want to use Python 3 in the first place, perhaps you’d be interested in the companion article which delves into exactly that question?)

Don’t be intimidated

This article is quite long, but don’t take that as a sign that this is necessarily a Herculean task. I’m trying to cover every issue I can ever recall running across, which means a lot of small gotchas.

I’ve ported several codebases from Python 2 to Python 2+3, and most of them have gone pretty smoothly. If you have modern Python 2 code that handles Unicode responsibly, you’re already halfway there.

However… if you still haven’t ported by now, almost eight years after Python 3.0 was first released, chances are you have either a lumbering giant of an app or ancient and weird 2.2-era code. Or, perish the thought, a lumbering giant consisting largely of weird 2.2-era code. In that case, you’ll want to clean up the more obvious issues one at a time, then go back and start worrying about actually running parts of your code on Python 3.

On the other hand, if your Python 2 code is pretty small and you’ve just never gotten around to porting, good news! It’s not that bad, and much of the work can be done automatically. Python 3 is ultimately the same language as Python 2, just with some sharp bits filed off.

Making some tough decisions

We say “porting from 2 to 3”, but what we usually mean is “porting code from 2 to both 2 and 3”. That ends up being more difficult (and ugly), since rather than writing either 2 or 3, you have to write the common subset of 2 and 3. As nifty as some of the features in 3 are, you can’t actually use any of them if you have to remain compatible with Python 2.

The first thing you need to do, then, is decide exactly which versions of Python you’re targeting. For 2, your options are:

  • Python 2.5+ is possible, but very difficult, and this post doesn’t really discuss it. Even something as simple as exception handling becomes painful, because the only syntax that works in Python 3 was first introduced in Python 2.6. I wouldn’t recommend doing this.

  • Python 2.6+ used to be fairly common, and is well-trodden ground. However, Python 2.6 reached end-of-life in 2013, and some common libraries have been dropping support for it. If you want to preserve Python 2.6 compatibility for the sake of making a library more widely-available, well, I’d urge you to reconsider. If you want to preserve Python 2.6 compatibility because you’re running a proprietary app on it, you should stop reading this right now and go upgrade to 2.7 already.

  • Python 2.7 is the last release of the Python 2 series, but is guaranteed to be supported until at least 2020. The major focus of the release was backporting a lot of minor Python 3 features, making it the best possible target for code that’s meant to run on both 2 and 3.

  • There is, of course, also the choice of dropping Python 2 support, in which case this process will be much easier. Python 2 is still very widely-used, though, so library authors probably won’t want to do this. App authors do have the option, but unless your app is trivial, it’s much easier to maintain Python 2 support during the port — that way you can port iteratively, and the app will still function on Python 2 in the interim, rather than being a 2/3 hybrid that can’t run on either.

Most of this post assumes you’re targeting Python 2.7, though there are mentions of 2.6 as well.

You also have to decide which version of Python 3 to target.

  • Python 3.0 and 3.1 are forgettable. Python 3 was still stabilizing for its first couple minor versions, and from what I hear, compatibility with both 2.7 and 3.0 is a huge pain. Both versions are also past end-of-life.

  • Python 3.2 and 3.3 are a common minimum version to target. Python 3.3 reinstated support for u'...' literals (redundant in Python 3, where normal strings are already Unicode), which makes supporting both 2 and 3 much easier. I bundle it with Python 3.2 because the latest version that stable PyPy supports is 3.2, but it also supports u'...' literals. You’ll support the biggest surface area by targeting that, a sort of 3.2½. (There’s an alpha PyPy supporting 3.3, but as of this writing it’s not released as stable yet.)

  • Python 3.4 and 3.5 add shiny new features, but you can only really use them if you’re dropping support for Python 2. Again, I’d suggest targeting Python 2.7 + Python 3.2½ first, then dropping the Python 2 support and adding whatever later Python 3 trinkets you want.

Another consideration is what attitude you want your final code to take. Do you want Python 2 code with enough band-aids that it also works on Python 3, or Python 3 code that’s carefully written so it still works on Python 2? The differences are subtle! Consider code like x = map(a, b). map returns a list in Python 2, but a lazy iterable in Python 3. Which way do you want to port this code?

# Python 2 style: force eager evaluation, even on Python 3
x = list(map(a, b))

# Python 3 style: use lazy evaluation, even on Python 2
try:
    from future_builtins import map
except ImportError:
    pass
x = map(a, b)

The answer may depend on which Python you primarily use for development, your target audience, or even case-by-case based on how x is used.
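To see why “how x is used” matters, here is a minimal sketch of the observable difference, written for Python 3. The double function and numbers list are made up for illustration.

```python
# Runs on Python 3, where map() returns a lazy iterator.
def double(n):
    return n * 2


numbers = [1, 2, 3]

eager = list(map(double, numbers))   # "Python 2 style": a real list
lazy = map(double, numbers)          # "Python 3 style": a one-shot iterator

eager_first = list(eager)
eager_second = list(eager)           # a list can be walked again

lazy_first = list(lazy)
lazy_second = list(lazy)             # the iterator is already exhausted

print(eager_first, eager_second)
print(lazy_first, lazy_second)
```

If x is only ever looped over once, either port works. If anything indexes it, takes its length, or iterates it twice, the lazy version will quietly misbehave.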

Personally, I’d err on the side of preserving Python 3 semantics and porting them to Python 2 when possible. I’m pretty used to Python 3, though, and you or your team might be thrown for a loop by changing Python 2’s behavior.

At the very least, prefer if PY2 to if not PY3. The former stresses that Python 2 is the special case, which is increasingly true going forward. Eventually there’ll be a Python 4, and perhaps even a Python 5, and those future versions will want the “Python 3” behavior.
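As a rough sketch of that pattern: the PY2 flag below is defined by hand (six exposes a similar constant as six.PY2), and text_type and is_text are hypothetical names chosen for this example.

```python
import sys

# Hand-rolled flag for illustration; six provides a similar six.PY2 constant.
PY2 = sys.version_info[0] == 2

if PY2:
    # Special case: this name only exists on Python 2.
    text_type = unicode  # noqa: F821
else:
    # Default branch: Python 3 and whatever comes after it.
    text_type = str


def is_text(value):
    """True for Unicode text on either major version."""
    return isinstance(value, text_type)


print(is_text(u"caf\xe9"), is_text(b"bytes"))
```

Because the else branch is the default, a hypothetical future major version falls through to the Python 3 behavior without anyone touching this code.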

Some helpful tools

The good news is that you don’t have to do all of this manually.

2to3 is a standard library module (since 2.6) that automatically modifies Python 2 source code to change some common Python 2 constructs to the Python 3 equivalent. (It also doubles as a framework for making arbitrary changes to Python code.)

Unfortunately, it ports 2 to 3, not 2 to 2+3. For libraries, it’s possible to rig 2to3 to run automatically on your code just before it’s installed on Python 3, so you can keep writing Python 2 code — but 2to3 isn’t perfect, and this makes it impossible to develop with your library on Python 3, so Python 3 ends up as a second-class citizen. I wouldn’t recommend it.

The more common approach is to use something like six, a library that wraps many of the runtime differences between 2 and 3, so you can run the same codebase on both 2 and 3.

Of course, that still leaves you making the changes yourself. A more recent innovation is the python-future project, which combines both of the above. It has a future library of renames and backports of Python 3 functionality that goes further than six and is designed to let you write Python 3-esque code that still runs on Python 2. It also includes a futurize script, based on the 2to3 plumbing, that rewrites your code to target 2+3 (using python-future’s library) rather than just 3.

The nice thing about python-future is that it explicitly takes the stance of writing code against Python 3 semantics and backporting them to Python 2. It’s very dedicated to this: it has a future.builtins module that includes not only easy cases like map, but also entire pure-Python reimplementations of types like bytes. (Naturally, this adds some significant overhead as well.) I do like the overall attitude, but I’m not totally sold on all the changes, and you might want to leaf through them to see which ones you like.

futurize isn’t perfect, but it’s probably the best starting point. The 2to3 design splits the various edits into a variety of “fixers” that each make a single style of change, and futurize works the same way, inheriting many of the fixers from 2to3. The nice thing about futurize is that it groups the fixers into “stages”, where stage 1 (futurize --stage1) only makes fairly straightforward changes, like fixing the except syntax. More importantly, it doesn’t add any dependencies on the future library, so it’s useful for making the easy changes even if you’d prefer to use six. You’re also free to choose individual fixes to apply, if you discover that some particular change breaks your code.

Another advantage of this approach is that you can tackle the porting piecemeal, which is great for very large projects. Run one fixer at a time, starting with the very simple ones like updating to except ... as ... syntax, and convince yourself that everything is fine before you do the next one. You can make some serious strides towards 3 compatibility just by eliminating behavior that already has cromulent alternatives in Python 2.

If you expect your Python 3 port to take a very long time — say, if you have a large project with numerous developers and a frantic release schedule — then you might want to prevent older syntax from creeping in with a tool like autopep8, which can automatically fix some deprecated features with a much lighter touch. If you’d like to automatically enforce that, say, from __future__ import absolute_import is at the top of every Python file, that’s a bit beyond the scope of this article, but I’ve had pre-commit + reorder_python_imports thrust upon me in the past to fairly good effect.

Anyway! For each of the issues below, I’ll mention whether futurize can fix it, the name of the responsible fixer, and whether six has anything relevant. If the name of the fixer begins with lib2to3, that means it’s part of the standard library, and you can use it with 2to3 without installing python-future.

Here we go!

Things you shouldn’t even be doing

These are ancient, ancient practices, and even Python 2 programmers may be surprised by them. Some of them are arguably outright bugs in the language; others are just old and forgotten. They generally have equivalents that work even in older versions of Python 2.

Old-style classes

class Foo:
    ...

In Python 3, this code creates a class that inherits from object. In Python 2, it creates a completely different kind of thing entirely: an “old-style” class, which worked a little differently from built-in types. The differences are generally subtle:

  • Old-style classes don’t support __getattribute__ or __slots__.

  • Old-style classes don’t correctly support data descriptors, i.e. the assignment behavior of @property.

  • Old-style classes had a __coerce__ method, which would attempt to turn a value into a built-in numeric type before performing a math operation.

  • Old-style classes didn’t use the C3 MRO, so in the case of diamond inheritance, a class could be skipped entirely by super().

  • Old-style instances check the instance for a special method name; new-style instances check the type. Additionally, if a special method isn’t found on an old-style instance, the lookup falls back to __getattr__; this is not the case for new-style classes (which makes proxying more complicated).

That last one is the only thing old-style classes can do that new-style classes cannot, and if you’re relying on it, you have a bit of refactoring to do. (The really curious thing is that there doesn’t seem to be a particularly good reason for the limitation on new-style classes, and it doesn’t even make things faster. Maybe that’ll be fixed in Python 4?)
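Here is a small sketch of that limitation as it shows up with new-style classes (so on Python 3, or on Python 2 with an explicit object base). Proxy is a made-up class for illustration.

```python
class Proxy(object):
    """Forwards unknown attribute lookups to a wrapped object."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        return getattr(self._target, name)


proxy = Proxy([1, 2, 3])

# Explicit attribute access goes through __getattr__ and reaches the list.
explicit = proxy.__len__()

# len() looks for __len__ on type(proxy), never consults __getattr__,
# and so fails for a new-style class.
try:
    implicit = len(proxy)
except TypeError:
    implicit = None

print(explicit, implicit)
```

A proxy like this has to define each special method it wants to forward explicitly on the class, which is the extra refactoring work mentioned above.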

If you have no idea what any of that means or why you should care, chances are you’re either not using old-style classes at all, or you’re only using them because you forgot to write (object) somewhere. In that case, futurize --stage2 will happily change class Foo: to class Foo(object): for you, using the libpasteurize.fixes.fix_newstyle fixer. (Strictly speaking, this is a Python 2 compatibility issue, since the old syntax still works fine in Python 3 — it just means something else now.)

cmp

Python 2 originally used the C approach for sorting. Given two things A and B, a comparison would produce a negative number if A < B, zero if A == B, and a positive number if A > B. This was the only way to customize sorting; there’s a cmp() built-in function, a __cmp__ special method, and cmp arguments to list.sort() and sorted().

This is a little cumbersome, as you may have noticed if you’ve ever tried to do custom sorting in Perl or JavaScript. Even a case-insensitive sort involves repeating yourself. Most custom sorts will have the same basic structure of cmp(op(a), op(b)), when the only thing you really care about is op.

names.sort(cmp=lambda a, b: cmp(a.lower(), b.lower()))

But more importantly, the C approach is flat-out wrong for some types. Consider sets, which use comparison to indicate subsets versus supersets:

{1, 2} < {1, 2, 3}  # True
{1, 2, 3} > {1, 2}  # True
{1, 2} < {1, 2}  # False
{1, 2} <= {1, 2}  # True

So what to do with {1, 2} < {3, 4}, where none of the three possible answers is correct?

Early versions of Python 2 added “rich comparisons”, which introduced methods for all six possible comparisons: __eq__, __ne__, __lt__, __le__, __gt__, and __ge__. You’re free to return False for all six, or even True for all six, or return NotImplemented to allow deferring to the other operand. The cmp argument became key instead, which allows mapping the original values to a different item to use for comparison:

names.sort(key=lambda a: a.lower())

(This is faster, too, since there are fewer calls to the lambda, fewer calls to .lower(), and no calls to cmp.)


So, fixing all this. Luckily, Python 2 supports all of the new stuff, so you don’t need compatibility hacks.

To replace simple implementations of __cmp__, you need only write the appropriate rich comparison methods. You could even do this the obvious way:

class Foo(object):
    def __cmp__(self, other):
        return cmp(self.prop, other.prop)

    def __eq__(self, other):
        return self.__cmp__(other) == 0

    def __ne__(self, other):
        return self.__cmp__(other) != 0

    def __lt__(self, other):
        return self.__cmp__(other) < 0

    ...

You would also have to change the use of cmp to a manual if tree, since cmp is gone in Python 3. I don’t recommend this.

A lazier alternative would be to use functools.total_ordering (added in Python 2.7 and 3.2), which generates the three missing ordering methods, given a class that implements __eq__ and one of __lt__, __le__, __gt__, or __ge__:

@functools.total_ordering
class Foo(object):
    def __eq__(self, other):
        return self.prop == other.prop

    def __lt__(self, other):
        return self.prop < other.prop

There are a couple problems with this code. For one, it’s still pretty repetitive, accessing .prop four times (and imagine if you wanted to compare several properties). For another, it’ll either cause an error or do entirely the wrong thing if you happen to compare with an object of a different type. You should return NotImplemented in this case, but total_ordering doesn’t handle that correctly until Python 3.4. If those bother you, you might enjoy my own classtools.keyed_ordering, which uses a __key__ method (much like the key argument) to generate all six methods:

@classtools.keyed_ordering
class Foo(object):
    def __key__(self):
        return self.prop
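If you would rather stay with functools.total_ordering, a sketch of the NotImplemented handling might look like the following. Version is a hypothetical class, and as noted above, the generated methods only pass NotImplemented along correctly on Python 3.4 and later.

```python
import functools


@functools.total_ordering
class Version(object):
    """Hypothetical example class ordered by a single attribute."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number == other.number

    def __ne__(self, other):
        # Python 3 derives this from __eq__; Python 2 needs it spelled out.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number < other.number


older, newer = Version(1), Version(2)
```

Returning NotImplemented lets Python try the other operand's methods before giving up, rather than crashing on a missing attribute.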

Replacing uses of cmp arguments should be straightforward: a cmp argument of cmp(op(a), op(b)) becomes a key argument of op. If you’re doing something more elaborate, there’s a functools.cmp_to_key function (also added in Python 2.7 and 3.2), which converts a cmp function to one usable as a key. (The implementation is much like the first Foo example above: it involves a class that calls the wrapped function from its comparison methods, and returns True or False depending on the return value.)
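As a rough illustration, here is a cmp-style function wrapped with functools.cmp_to_key, next to the plain key function that expresses the same ordering. The function name and sample data are invented for the example.

```python
import functools


def compare_length_then_alpha(a, b):
    """Old-style comparison: returns a negative number, zero, or a positive number."""
    if len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    if a == b:
        return 0
    return -1 if a < b else 1


words = ["pear", "fig", "apple", "kiwi"]

# Wrap the legacy comparison function so sorted() accepts it as a key.
by_cmp = sorted(words, key=functools.cmp_to_key(compare_length_then_alpha))

# The same ordering as a key function: sort on (length, word).
by_key = sorted(words, key=lambda word: (len(word), word))
```

When a comparison can be written as a tuple of sort keys like this, the key form is usually shorter and makes fewer calls.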

Finally, if you’re using cmp directly, don’t do that. If you really, really need it for something other than Python’s own sorting, just use an if.

The only help futurize offers is in futurize --stage2, via libfuturize.fixes.fix_cmp, which adds an import of past.builtins.cmp if it detects you’re using the cmp function anywhere.

Comparing incompatible types

Python 2’s use of C-style ordering also means that any two objects, of any types, must be either equal or occur in some defined order. Python’s answer to this problem is to sort on the names of the types. So None < 3 < "1", because "NoneType" < "int" < "str".

Python 3 removes this fallback rule; if two values don’t know how to compare against each other (i.e. both return NotImplemented), you just get a TypeError.

This might affect you in subtle ways, such as if you’re sorting a list of objects that may contain Nones and expecting it to silently work. The fix depends entirely on the type of data you have, and no automated tool can handle that for you. Most likely, you didn’t mean to be sorting a heterogeneous list in the first place.

Of course, you could always sort on type(x).__name__, but I don’t know why you would do that.
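If you really do need Nones to sort alongside real values, one possible approach (an assumption about what you want, not a universal fix) is a key that decides where the Nones go before any values are compared:

```python
values = [3, None, 1, None, 2]

# Sort on (is-not-None, value). Every None gets a key starting with False,
# so Nones sort first and are never ordered against an integer.
nones_first = sorted(values, key=lambda v: (v is not None, v))
```

This behaves the same on Python 2 and Python 3, and makes the placement of None an explicit choice.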

The sets module

Python 2.3 introduced its set types as Set and ImmutableSet in the sets module. Since Python 2.4, they’ve been built-in types, set and frozenset. The sets module is gone in Python 3, so just use the built-in names.

Creating exceptions

Python 2 allows you to do this:

raise RuntimeError, "an error happened at runtime!!"

There’s not really any good reason to do this, since you can just as well do:

raise RuntimeError("an error happened at runtime!!")

futurize --stage1 will rewrite the two-arg form to a regular object creation via the libfuturize.fixes.fix_raise fixer. It’ll also fix this alternative way of specifying an exception type, which is so bizarre and obscure that I did not know about it until I read the fixer’s source code:

raise (((A, B), C), ...)  # equivalent to `raise A` (?!)

Additionally, exceptions act like sequences in Python 2, but not in Python 3. You can just operate on the .args sequence directly, in either version. Alas, there’s no automated way to fix this.
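A minimal sketch of the .args approach, which should behave the same in both versions:

```python
try:
    raise ValueError("bad value", 42)
except ValueError as exc:
    # Python 2 also allowed exc[0] and `message, code = exc`;
    # going through .args works everywhere.
    message, code = exc.args
    first = exc.args[0]
```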

Backticks

Did you know that `x` is equivalent to repr(x) in Python 2? Yeah, most people don’t. It’s super weird. futurize --stage1 will fix this with the lib2to3.fixes.fix_repr fixer.

has_key

Very old code may still be using somedict.has_key("foo"). "foo" in somedict has worked since Python 2.2. What are you doing. futurize --stage1 will fix this with the lib2to3.fixes.fix_has_key fixer.

<>

<> is equivalent to != in Python 2! This is an ancient, ancient holdover, and there’s no reason to still be using it. futurize --stage1 will fix this with the lib2to3.fixes.fix_ne fixer.

(You could also use from __future__ import barry_as_FLUFL, which restores <> in Python 3. It’s an easter egg. I’m joking. Please don’t actually do this.)

Things with easy Python 2 equivalents

These aren’t necessarily ancient, but they have an alternative you can just as well express in Python 2, so there’s no need to juggle 2 and 3.

Other ancient builtins

apply() is gone. Use the built-in syntax, f(*args, **kwargs).

callable() was briefly gone, but then came back in Python 3.2.

coerce() is gone; it was only used for old-style classes.

execfile() is gone. Read the file and pass its contents to exec() instead.

file() is gone; Python 3 has multiple file types, and a hierarchy of interfaces defined in the io module. Occasionally, code uses this as a synonym for open(), but you should really be using open() anyway.

intern() has been moved into the sys module, though I have no earthly idea why you’d be using it.

raw_input() has been renamed to input(), and the old ludicrous input() is gone. If you really need input(), please stop.

reduce() has been moved into the functools module, but it’s there in Python 2.6 as well.

reload() has been moved into the imp module, and since Python 3.4 it lives in importlib as importlib.reload(). It’s unreliable garbage and you shouldn’t be using it anyway.

futurize --stage1 can fix several of these:

  • apply, via lib2to3.fixes.fix_apply
  • intern, via lib2to3.fixes.fix_intern
  • reduce, via lib2to3.fixes.fix_reduce

futurize --stage2 can also fix execfile via the libfuturize.fixes.fix_execfile fixer, which imports past.builtins.execfile. The 2to3 fixer uses an open() call, but the true correct fix is to use a with block.

futurize --stage2 has a couple of fixers for raw_input, but you can just as well import future.builtins.input or six.moves.input.

Nothing can fix coerce, which has no equivalent. Curiously, I don’t see a fixer for file, which is trivially fixed by replacing it with open. Nothing for reload, either.
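For execfile, a hand-rolled replacement using a with block might look something like this. run_file is a hypothetical helper name, and the temporary script exists only to make the sketch self-contained.

```python
import os
import tempfile


def run_file(path, namespace):
    """Rough stand-in for execfile(path, namespace)."""
    with open(path) as f:
        source = f.read()
    # Compiling with the real filename keeps tracebacks pointing at the file.
    exec(compile(source, path, "exec"), namespace)


# Demonstration with a throwaway script.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as script:
    script.write("answer = 6 * 7\n")

namespace = {}
try:
    run_file(script.name, namespace)
finally:
    os.remove(script.name)
```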

Catching exceptions

Historically, the way to say “if there’s a ValueError, store it in e and run some code” was:

try:
    ...
except ValueError, e:
    ...

Unfortunately, that’s very easy to confuse with the syntax for catching two different types of exception:

except (ValueError, TypeError):
    ...

If you forget the parentheses, you’ll only catch ValueError, and the exception will be assigned to a variable called, er, TypeError. Whoops!

Python 3.0 introduced clearer syntax, which was also backported to Python 2.6:

except ValueError as e:
    ...

Python 3.0 finally removed the old syntax, so you must use the as form. futurize --stage1 will fix this with the lib2to3.fixes.fix_except fixer.

As an additional wrinkle, the extra variable e is deleted at the end of the block in Python 3, but not in Python 2. If you really need to refer to it after the block, just assign it to a different name.

(The reason for this is that captured exceptions contain a traceback in Python 3, and tracebacks contain the locals for the current frame, and those locals will contain the captured exception. The resulting cycle would keep all local variables alive until the cycle detector dealt with it, at least in CPython. Scrapping the exception as soon as it’s been dealt with was a simple way to keep this from accidentally happening all over the place. It usually doesn’t make sense to refer to a captured exception after the except block, anyway, since the variable may or may not even exist, and that’s generally weird and bad in Python.)
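Here is a small sketch of that scoping difference, as seen on Python 3. The variable names are arbitrary.

```python
saved = None
try:
    int("not a number")
except ValueError as e:
    # Copy the exception to another name before the block ends.
    saved = e

# On Python 3, `e` is unbound again at this point; `saved` still works.
try:
    e
    e_still_bound = True
except NameError:
    e_still_bound = False
```

On Python 2 the second try block would find e still bound, which is exactly the behavior you shouldn't rely on.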

Octals

It’s not uncommon for a new programmer to try to zero-pad a set of numbers:

a = 07
b = 08
c = 09
d = 10

Of course, this will have the rather bizarre result that 08 is a SyntaxError, even though 07 works fine — because numbers starting with a 0 are parsed as octal.

This is a holdover from C, and it’s fairly surprising, since there’s virtually no reason to ever use octal. The only time I can ever remember using it is for passing file modes to chmod.

Python 3.0 requires octal literals to be prefixed with 0o, in line with 0x for hex and 0b for binary; literal integers starting with only a 0 are a syntax error. Python 2.6 supports both forms.

futurize --stage1 will fix this with the lib2to3.fixes.fix_numliterals fixer.
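For reference, a quick sketch of the portable spellings:

```python
# 0o755 parses on Python 2.6+ and Python 3; the old spelling 0755 is
# a syntax error on Python 3.
mode = 0o755

# Zero-padding belongs in string formatting, not in the literal.
padded = "{0:02d}".format(7)

# Octal text is parsed by passing an explicit base to int().
parsed = int("755", 8)
```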

pickle

If you’re using the pickle module (which you shouldn’t be), and you intend to pass pickles back and forth between Python 2 and Python 3, there’s a small issue to be aware of. pickle has several different “protocol” versions, and the default version used in Python 3 is protocol 3, which Python 2 cannot read.

The fix is simple: just find where you’re calling pickle.dump() or pickle.dumps(), and pass a protocol argument of 2. Protocol 2 is the highest version supported by Python 2, and you probably want to be using it anyway, since it’s much more compact and faster to read/write than Python 2’s default, protocol 0.

You may be already using HIGHEST_PROTOCOL, but you’ll have the same problem: the highest protocol supported in any version of Python 3 is unreadable by Python 2.
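Pinning the protocol is a one-argument change. A minimal sketch, with made-up sample data:

```python
import pickle

data = {"name": "example", "values": [1, 2, 3]}

# Protocol 2 is the newest protocol Python 2 can read, so pin it explicitly
# instead of relying on the default or on pickle.HIGHEST_PROTOCOL.
payload = pickle.dumps(data, protocol=2)

# A protocol 2 pickle starts with the PROTO opcode followed by the version.
header = bytearray(payload[:2])
restored = pickle.loads(payload)
```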


A somewhat bigger problem is that if you pickle an instance of a user-defined class on Python 2, the pickle will record all its attributes as bytestrings, because that’s what they are in Python 2. Python 3 will then dutifully load the pickle and populate your object’s __dict__ with keys like b'foo'. obj.foo will then not actually exist, because obj.foo looks for the string 'foo', and 'foo' != b'foo' in Python 3.

Don’t use pickle, kids.

It’s possible to fix this, but also a huge pain in the ass. If you don’t know how, you definitely shouldn’t be using pickle.

Things that have a __future__ import

Occasionally, the syntax changed in an incompatible way, but the new syntax was still backported and hidden behind a __future__ import — Python’s mechanism for opting into syntax changes. You have to put such an import at the top of the file, optionally after a docstring, like this:

"""My super important module."""
from __future__ import with_statement

print is now a function

Ugh! Parentheses! Why, Guido, why?

The reason is that the print statement has incredibly goofy syntax, unlike anything else in the language:

print >>a, b, c,

You might not even recognize the >> bit, but it lets you print to a file other than sys.stdout. It’s baked specifically into the print syntax. Python 3 replaces this with a straightforward built-in function with a couple extra bells and whistles. The above would be written:

print(b, c, end='', file=a)

It’s slightly more verbose, but it’s also easier to tell what’s going on, and that teeny little comma at the end is now a more obvious keyword argument.

from __future__ import print_function will forget about the print statement for the rest of the file, and make the builtin print function available instead. futurize --stage1 will fix all uses of print and add the __future__ import, with the libfuturize.fixes.fix_print_with_import fixer. (There’s also a 2to3 fixer, but it doesn’t add the __future__ import, since it’s unnecessary in Python 3.)

A word of warning: do not just use print with parentheses without adding the __future__ import. This may appear to work in stock Python 2:

print("See, what's the problem?  This works fine!")

However, that’s parsed as the print statement followed by an expression in parentheses. It becomes more obvious if you try to print two values:

print("The answer is:", 3)
# ("The answer is:", 3)

Now you have a comma inside parentheses, which is a tuple, so the old print statement prints its repr.
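For comparison, here is roughly how the function form behaves, written for Python 3 and capturing the output in an io.StringIO so it can be inspected. On Python 2 the same calls need the __future__ import as the first statement in the file.

```python
# Python 2 would need this as the first statement in the file:
#     from __future__ import print_function
import io

buffer = io.StringIO()

# Python 2 statement form: print >>buffer, "a", "b",
print("a", "b", end="", file=buffer)

# With the function, extra arguments are separate values, not a tuple.
print("The answer is:", 3, file=buffer)

output = buffer.getvalue()
```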

Division always produces a float

Quick, what’s the answer here?

5 / 2

If you’re a normal human being, you’ll say 2.5 or 2½. Unfortunately, if you’re like Python and have been afflicted by C, you might say the answer is 2, because this is “integer division” — a bizarre and alien concept probably invented because CPUs didn’t have FPUs when C was first invented.

Python 3.0 decided that maybe contorting fundamental arithmetic to match the inadequacies of 1970s hardware is not the best idea, and so it changed division to always produce a float.

Since Python 2.6, from __future__ import division will alter the division operator to always do true division. If you want to do floor division, there’s a separate // operator, which has existed for ages; you can use it in Python 2 with or without the __future__ import.

Note that true division always produces a float, even if the result is integral: 6 / 3 is 2.0. On the other hand, floor division uses the same typing rules as C-style division: 5 // 2 is 2, but 5 // 2.0 is 2.0.

futurize --stage2 will “fix” this with the libfuturize.fixes.fix_division fixer, but unfortunately that just adds the __future__ import. With the --conservative option, it uses the libfuturize.fixes.fix_division_safe fixer instead, which imports past.utils.old_div, a forward-port of Python 2’s division operator.

The trouble here is that the new / always produces a float, and the new // always floors, but the old / sometimes did one and sometimes did the other. futurize can’t just replace all uses of / with //, because 5/2.0 is 2.5 but 5//2.0 is 2.0, and it can’t generally know what types the operands are.

You might be best off fixing this one manually — perhaps using fix_division_safe to find all the places you do division, then changing them to use the right operator.

Of course, the __div__ magic method is gone in Python 3, replaced explicitly by __floordiv__ (//) and __truediv__ (/). Both of those methods already exist in Python 2, and __truediv__ is even called when you use / in the presence of the future import, so being compatible is a simple matter of implementing all three and deferring to one of the others from __div__.
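A sketch of that arrangement follows. Meters is a hypothetical class, and pointing __div__ at __truediv__ is just one of the two possible choices.

```python
class Meters(object):
    """Hypothetical numeric wrapper supporting both division operators."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, divisor):
        # Used by / on Python 3, and on Python 2 under the __future__ import.
        return Meters(self.value / float(divisor))

    def __floordiv__(self, divisor):
        # Used by // everywhere.
        return Meters(self.value // divisor)

    # Used by / on Python 2 without the __future__ import; ignored on Python 3.
    __div__ = __truediv__


half = Meters(5) / 2
floored = Meters(5) // 2
```

If existing Python 2 callers depend on the old truncating behavior, aliasing __div__ to __floordiv__ instead would preserve it.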

Relative imports

In Python 2, if you’re in the module foo.bar and say import quux, Python will look for a foo.quux before it looks for a top-level quux. The former behavior is called a relative import, though it might be more clearly called a sibling import. It’s troublesome for several reasons.

  • If you have a sibling called quux, and there’s also a top-level or standard library module called quux, you can’t import the latter. (There used to be a py.std module for providing indirect access to the standard library, for this very reason!)

  • If you import the top-level quux module, and then later add a foo.quux module, you’ll suddenly be importing a different module.

  • When reading the source code, it’s not clear which imports are siblings and which are top-level. In fact, the modules you get depend on the module you’re in, so moving or renaming a file may change its imports in non-obvious ways.

Python 3 eliminates this behavior: import quux always means the top-level module. It also adds syntax for “explicit relative” or “absolute relative” (yikes) imports: from . import quux or from .quux import somefunc explicitly means to look for a sibling named quux. (You can also use ..quux to look in the parent package, three dots to look in the grandparent, etc.)

The explicit syntax is supported since Python 2.5. The old sibling behavior can be disabled since Python 2.5 with from __future__ import absolute_import.

futurize --stage1 has a libfuturize.fixes.fix_absolute_import fixer, which attempts to detect sibling imports and convert them to explicit relative imports. If it finds any sibling imports, it’ll also add the __future__ line, though honestly you should make an effort to put that line in all of your Python 2 code.

It’s possible for the futurize fixer to guess wrong about a sibling import, but in general it works pretty well.

(There is one case I’ve run across where simply replacing import sibling with from . import sibling didn’t work. Unfortunately, it was Yelp code that I no longer have access to, and I can’t remember the precise details. It involved having several sibling imports inside a __init__.py, where the siblings also imported from each other in complex ways. The sibling imports worked, but the explicit relative imports failed, for some really obscure timing reason. It’s even possible this was a 2.6 bug that’s been fixed in 2.7. If you see it, please let me know!)

Things that require some effort

These problems are a little more obscure, but many of them are also more difficult to fix automatically. If you have a massive codebase, these are where the problems start to appear.

The grand module shuffle

A whole bunch of modules were renamed, merged, or removed. A full list is in PEP 3108, but you’ll never have heard of most of them. Here are the ones that might affect you.

  • __builtin__ has been renamed to builtins. Note that this is a module, not the __builtins__ attribute of modules, which is exactly why it was renamed. Incidentally, you should be using the builtins module rather than __builtins__ anyway. Or, wait, no, just don’t use either, please don’t mess with the built-in scope.

  • ConfigParser has been renamed to configparser.

  • Queue has been renamed to queue.

  • SocketServer has been renamed to socketserver.

  • cStringIO and StringIO are gone; instead, use StringIO or BytesIO from the io module. Note that these also exist in Python 2, but are pure-Python rather than the C versions in current Python 3.

  • cPickle is gone. Importing pickle in Python 3 now gives you the C implementation automatically.

  • cProfile is not actually gone: it still exists in Python 3 alongside the pure-Python profile module, so imports of it need no changes.

  • copy_reg has been renamed to copyreg.

  • anydbm, dbhash, dbm, dumbdbm, gdbm, and whichdb have all been merged into a dbm package.

  • dummy_thread has become _dummy_thread. It’s an implementation of the _thread module that doesn’t actually do any threading. You should be using dummy_threading instead, I guess?

  • httplib has become http.client. BaseHTTPServer, CGIHTTPServer, and SimpleHTTPServer have been merged into a single http.server module. Cookie has become http.cookies. cookielib has become http.cookiejar.

  • repr has been renamed to reprlib. (The module, not the built-in function.)

  • thread has been renamed to _thread, and you should really be using the threading module instead.

  • A whole mess of top-level Tk modules have been combined into a tkinter package.

  • The contents of urllib, urllib2, and urlparse have been consolidated and then split into urllib.error, urllib.parse, and urllib.request.

  • xmlrpclib has become xmlrpc.client. DocXMLRPCServer and SimpleXMLRPCServer have been merged into xmlrpc.server.

futurize --stage2 will fix this with the somewhat invasive libfuturize.fixes.fix_future_standard_library fixer, which uses a mechanism from future that adds aliases to Python 2 to make all the Python 3 standard library names work. It’s an interesting idea, but it didn’t actually work for all cases when I tried it (though now I can’t recall what was broken), so YMMV.

Alternatively, you could manually replace any affected imports with imports from six.moves, which provides aliases that work on either version.

Or as a last resort, you can just sprinkle try ... except ImportError around.
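The try ... except ImportError approach might look like this for a couple of the renamed modules; the URL is just sample input.

```python
try:
    # Python 3 names.
    import configparser
    from urllib.parse import urlparse
except ImportError:
    # Python 2 fallbacks, bound to the same local names.
    import ConfigParser as configparser
    from urlparse import urlparse

parts = urlparse("http://example.com/path?q=1")
parser = configparser.RawConfigParser()
```

Trying the Python 3 name first means the fallback branch can simply be deleted once Python 2 support is dropped.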

Built-in iterators are now lazy

filter, map, range, and zip are all lazy in Python 3. You can still iterate over their return values (once), but if you have code that expects to be able to index them or traverse them more than once, it’ll break in Python 3. (Well, not range, that’s fine.) The lazy equivalents — xrange and the functions in itertools — are of course gone in Python 3.

In either case, the easiest thing to do is force eager evaluation by wrapping the call in list() or tuple(), which you’ll occasionally need to do in Python 3 regardless.

For the sake of consistency, you may want to import the lazy versions from the standard library future_builtins module. It only exists in Python 2, so be sure to wrap the import in a try.

futurize --stage2 tries to address this with several of lib2to3’s fixers, but the results aren’t particularly pleasing: calls to all four are unconditionally wrapped in list(), even in an obviously safe case like a for block. I’d just look through your uses of them manually.

A more subtle point: if you pass a string or tuple to Python 2’s filter, the return value will be the same type. Blindly wrapping the call in list() will of course change the behavior. Filtering a string is not a particularly common thing to do, but I’ve seen someone complain about it before, so take note.

Also, Python 3’s map stops at the shortest input sequence, whereas Python 2 extends shorter sequences with Nones. You can fix this with itertools.zip_longest (which in Python 2 is izip_longest!), but honestly, I’ve never even seen anyone pass multiple sequences to map.

Relatedly, dict.iteritems (plus its friends, iterkeys and itervalues) is gone in Python 3, as the plain items (plus keys and values) is already lazy. The dict.view* methods are also gone, as they were only backports of Python 3’s normal behavior.

Both six and future.utils contain functions called iteritems, etc., which provide a lazy iterator in both Python 2 and 3. They also offer view* functions, which are closer to the Python 3 behavior, though I can’t say I’ve ever seen anyone actually use dict.viewitems in real code.

Of course, if you explicitly want a list of dictionary keys (or items or values), list(d) and list(d.items()) do the same thing in both versions.
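To tie these together, here is a short sketch of the behaviors described in this section, as observed on Python 3. The sample values are arbitrary.

```python
try:
    from itertools import zip_longest  # Python 3
except ImportError:
    from itertools import izip_longest as zip_longest  # Python 2

squares = map(lambda n: n * n, [1, 2, 3])
first_pass = list(squares)
# On Python 3 the map object is now exhausted; on Python 2 it was a list.
second_pass = list(squares)

# Force eager evaluation when the result is indexed or reused.
eager = list(map(lambda n: n * n, [1, 2, 3]))

# Padding uneven inputs with None, spelled explicitly.
padded = list(zip_longest([1, 2, 3], ["x"]))

d = {"a": 1}
keys = list(d)
items = list(d.items())
```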

buffer is gone

The buffer type has been replaced by memoryview (also in Python 2.7), which is similar but not identical. If you’ve even heard of either of these types, you probably know more about the subtleties involved than I do. There’s a lib2to3.fixes.fix_buffer fixer that blindly replaces buffer with memoryview, but futurize doesn’t use it in either stage.

Several special methods were renamed

Where Python 2 has __str__ and __unicode__, Python 3 has __bytes__ and __str__. The trick is that __str__ should return the native str type for each version: a bytestring for Python 2, but a Unicode string for Python 3. Also, you almost certainly don’t want a __bytes__ method in Python 3, where bytes is no longer used for text.

Both six and python-future have a python_2_unicode_compatible class decorator that tries to do the right thing. You write only a single __str__ method that returns a Unicode string. In Python 3, that’s all you need, so the decorator does nothing; in Python 2, the decorator will rename your method to __unicode__ and add a __str__ that returns the same value encoded as UTF-8. If you need different behavior, you’ll have to roll it yourself with if PY2.


Python 2’s next method is more appropriately __next__ in Python 3. The easy way to address this is to call your method __next__, then alias it with next = __next__. Be sure you never call it directly as a method, only with the built-in next() function.

Alternatively, future.builtins provides a replacement next() that always calls __next__ but, on Python 2, falls back to trying next if __next__ doesn’t exist.

futurize --stage1 changes all use of obj.next() to next(obj) via the libfuturize.fixes.fix_next_call fixer. futurize --stage2 renames next methods to __next__ via the lib2to3.fixes.fix_next fixer (which also fixes calls). Note that there’s a remote chance of false positives, if for some reason you happened to use next as a regular method name.
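The manual aliasing approach, sketched with a hypothetical Countdown iterator:

```python
class Countdown(object):
    """Hypothetical iterator that works on Python 2 and Python 3."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

    # Python 2 looks for `next`; Python 3 ignores the alias.
    next = __next__


counter = Countdown(3)
first = next(counter)  # the built-in next(), never counter.next()
rest = list(counter)
```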


Python 2’s __nonzero__ is Python 3’s __bool__. Again, you can just alias it manually. Or futurize --stage2 will rename it with the lib2to3.fixes.fix_nonzero fixer.

Renaming it will of course break it in Python 2, but futurize --stage2 also has a libfuturize.fixes.fix_object fixer that imports python-future’s own builtins.object. The replacement object class has a few methods for making Python 3’s __str__, __next__, and __bool__ work on Python 2.

This is one of the mildly invasive things python-future does, and it may or may not sit well. Up to you.
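If you would rather not pull in the replacement object class, the manual alias for truthiness is tiny. Basket is a made-up example class.

```python
class Basket(object):
    """Hypothetical container that is falsy when empty."""

    def __init__(self, items=()):
        self.items = list(items)

    def __bool__(self):
        return len(self.items) > 0

    # Python 2 calls __nonzero__; Python 3 ignores the alias.
    __nonzero__ = __bool__


empty_is_truthy = bool(Basket())
full_is_truthy = bool(Basket(["apple"]))
```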


__long__ is completely gone, as there is no long type in Python 3.

__getslice__, __setslice__, and __delslice__ are gone. Instead, slice objects are passed to __getitem__ and friends. On the off chance you use these, you’ll have to do something clever in the item methods to defer to your slice logic on Python 3.
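One way to arrange that deferral is sketched below, with a hypothetical Window class. The slice logic lives in __getitem__, and the legacy method merely forwards to it.

```python
class Window(object):
    """Hypothetical sequence wrapper that handles indexes and slices."""

    def __init__(self, data):
        self._data = list(data)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Python 3 always arrives here for obj[a:b]; return the same
            # wrapper type so slices of a Window are Windows too.
            return Window(self._data[key])
        return self._data[key]

    def __getslice__(self, start, stop):
        # Only called by Python 2 for simple obj[a:b]; defer to __getitem__.
        return self.__getitem__(slice(start, stop))

    def __len__(self):
        return len(self._data)


w = Window([10, 20, 30, 40])
single = w[1]
part = w[1:3]
```

Python 2 only calls __getslice__ when it is defined, so if nothing else requires it, deleting the method and keeping the slice branch in __getitem__ works on both versions.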

__oct__ and __hex__ are gone; oct() and hex() now consult __index__. I seriously doubt this will impact anyone.

__div__ is gone, as mentioned previously.

Unbound methods are gone; function attributes renamed

Say you have this useless class.

class Foo(object):
    def bar(self):
        pass

In Python 2, Foo.bar is an “unbound method”, a type that’s generally unseen and unexposed other than as types.MethodType. In Python 3, Foo.bar is just a regular function.

Offhand, I can only think of one time this would matter: if you want to get at attributes on the function, perhaps for the sake of a method decorator. In Python 2, you have to go through the unbound method’s .im_func attribute to get the original function, but in Python 3, you already have the original function and can get the attributes directly.

If you’re doing this anywhere, an easy way to make it work in both versions is:

method = Foo.bar
method = getattr(method, 'im_func', method)

As for bound methods (the objects you get from accessing methods but not calling them, like [].append), the im_self and im_func attributes have been renamed to __self__ and __func__. Happily, these names also work in Python 2.6, so no compatibility hacks are necessary.

im_class is completely gone in Python 3. Methods have no interest in which class they’re attached to. They can’t, since the same function could easily be attached to more than one class. If you’re relying on im_class somehow, for some reason… well, don’t do that, maybe.

Relatedly, the func_* function attributes have been renamed to dunder names in Python 3, since assigning function attributes is a fairly common practice and Python doesn’t like to clog namespaces with its own builtin names. func_closure, func_code, func_defaults, func_dict, func_doc, func_globals, and func_name are now __closure__, __code__, etc. (Note that func_doc and func_name were already aliases for __doc__ and __name__, and func_defaults is much more easily inspected with the inspect module.) The new names __closure__, __code__, __defaults__, and __globals__ were added to Python 2.6 as aliases, so on 2.6 and later you can use them directly; for anything older, you’ll need to do a getattr dance, or use the get_function_* functions from six.

Metaclass syntax has changed

In Python 2, a metaclass is declared by assigning to a special name in the class body:

class Foo(object):
    __metaclass__ = FooMeta
    ...

Admittedly, this doesn’t make a lot of sense. The metaclass affects how a class is created, and the class body is evaluated as part of that creation, so this is sort of a goofy hack.

Python 3 changed this, opening the door to a few new neat tricks in the process, which you can find out about in the companion article.

class Foo(object, metaclass=FooMeta):
    ...

The catch is finding a way to express this idea in both Python 2 and Python 3 — the old syntax is ignored in Python 3, and the new syntax is a syntax error in Python 2.

It’s a bit of a pain, but the class statement is really just a lot of sugar for calling the type() constructor; after all, Python classes are just instances of type. All you have to do is manually create an instance of your metaclass, rather than of type.
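As a sketch of what that manual creation looks like (FooMeta and its registry are invented for illustration, and this is not how any particular compatibility library implements it):

```python
class FooMeta(type):
    """Hypothetical metaclass that records every class it creates."""

    registry = []

    def __new__(mcls, name, bases, namespace):
        cls = super(FooMeta, mcls).__new__(mcls, name, bases, namespace)
        mcls.registry.append(name)
        return cls


def greet(self):
    return "hello"


# Equivalent to a class statement with FooMeta as the metaclass,
# written as a plain call so it parses on both Python 2 and Python 3.
Foo = FooMeta("Foo", (object,), {"greet": greet})


# Subclasses inherit the metaclass, so ordinary class statements work again.
class Bar(Foo):
    pass
```

The awkward part is that the class body has to be written as a dict, which is why nobody wants to do this by hand.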

Fortunately, other people have already made this work for you. futurize --stage2 will fix this using the libfuturize.fixes.fix_metaclass fixer, which imports future.utils.with_metaclass and produces the following:

from future.utils import with_metaclass

class Foo(with_metaclass(FooMeta, object)):
    ...

This creates an intermediate dummy class with the right metaclass, which you then inherit from. Classes use the same metaclass as their parents, so this works fine in any Python.

If you don’t want to depend on python-future, the same function exists in the six module.

Re-raising exceptions has different syntax

raise with no arguments does the same thing in Python 2 and Python 3: it re-raises the exception currently being handled, preserving the original traceback.

The problem comes in with the three-argument form of raise, which is for preserving the traceback while raising a different exception. It might look like this:

try:
    some_fragile_function()
except Exception as e:
    raise MyLibraryError, MyLibraryError("Failed to do a thing: " + str(e)), sys.exc_info()[2]

sys.exc_info()[2] is, of course, the only way to get the current traceback in Python 2. You may have noticed that the three arguments to raise are the same three things that sys.exc_info() returns: the type, the value, and the traceback.

Python 3 introduces exception chaining. If something raises an exception from within an except block, Python will remember the original exception, attach it to the new one, and show both exceptions when printing a traceback — including both exceptions’ types, messages, and where they happened. So to wrap and rethrow an exception, you don’t need to do anything special at all.

try:
    some_fragile_function()
except Exception:
    raise MyLibraryError("Failed to do a thing")

For more complicated handling, you can also explicitly say raise new_exception from old_exception. Exceptions contain their associated tracebacks as a __traceback__ attribute in Python 3, so there’s no need to muck around getting the traceback manually. If you really want to give an explicit traceback, you can use the .with_traceback() method, which just assigns to __traceback__ and then returns self.

raise MyLibraryError("Failed to do a thing").with_traceback(some_traceback)
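
Here's a small Python 3-only sketch of where the original exception ends up in each case. The function names (implicit, explicit, catch) are made up for the example; __context__, __cause__, and __traceback__ are the real attributes.

```python
class MyLibraryError(Exception):
    pass


def some_fragile_function():
    raise ValueError("bad input")


def implicit():
    # Raising inside an except block chains automatically.
    try:
        some_fragile_function()
    except Exception:
        raise MyLibraryError("Failed to do a thing")


def explicit():
    # "raise ... from ..." records the original as the direct cause.
    try:
        some_fragile_function()
    except Exception as exc:
        raise MyLibraryError("Failed to do a thing") from exc


def catch(func):
    try:
        func()
    except MyLibraryError as err:
        return err


implicit_error = catch(implicit)
explicit_error = catch(explicit)

print(type(implicit_error.__context__).__name__)  # the original ValueError
print(implicit_error.__cause__)                   # nothing was given explicitly
print(type(explicit_error.__cause__).__name__)    # set by "from exc"
print(explicit_error.__traceback__ is not None)
```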

It’s hard to say what it even means to write code that works “equivalently” in both versions, because Python 3 handles this problem largely automatically, and Python 2 code tends to have a variety of ad-hoc solutions. Note that you cannot simply do this:

if PY3:
    raise MyLibraryError("Beep boop") from exc
else:
    raise MyLibraryError, MyLibraryError("Beep boop"), sys.exc_info()[2]

The first raise is a syntax error in Python 2, and the second is a syntax error in Python 3. if won’t protect you from parse errors. (On the other hand, you can hide .with_traceback() behind an if, since that’s just a regular method call and will parse with no issues.)

six has a reraise function that will smooth out the differences for you (probably by using exec). The drawback is that it’s of course Python 2-oriented syntax, and on Python 3 the final traceback will include more context than expected.
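
For a sense of what such a helper has to do, here's a rough sketch in the same spirit. I haven't copied this from six, so treat it as an assumption about the general shape rather than its actual source.

```python
import sys

if sys.version_info[0] >= 3:
    def reraise(exc_type, value, tb=None):
        # A plain method call, so Python 2 can still parse this branch.
        if value is None:
            value = exc_type()
        raise value.with_traceback(tb)
else:
    # The three-argument raise is a syntax error on Python 3, so it is
    # hidden inside a string that only Python 2 ever compiles.
    exec("def reraise(exc_type, value, tb=None):\n"
         "    raise exc_type, value, tb\n")


class MyLibraryError(Exception):
    pass


def wrapped():
    try:
        {}["missing"]
    except KeyError as e:
        reraise(MyLibraryError,
                MyLibraryError("Failed to do a thing: " + str(e)),
                sys.exc_info()[2])


try:
    wrapped()
except MyLibraryError as err:
    message = str(err)

print(message)
```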

Alternatively, there’s a six.raise_from, which is designed around the raise X from Y syntax of Python 3. The drawback is that Python 2 has no obvious equivalent, so you just get raise X, losing the old exception and its traceback.

There’s no clear right approach here; it depends on how you’re handling re-raising. Code that just blindly raises new exceptions doesn’t need any changes, and will get exception chaining for free on Python 3. Code that does more elaborate things, like implementing its own form of chaining or storing exc_info tuples to be re-raised later, may need a little more care.

Bytestrings are sequences of integers

In Python 2, bytes is a synonym for str, the default string type. Iterating or indexing a bytes/str produces 1-character strs.

list(b'hello')  # ['h', 'e', 'l', 'l', 'o']
b'hello'[0:4]  # 'hell'
b'hello'[0]  # 'h'
b'hello'[0][0][0][0][0]  # 'h' -- it's turtles all the way down

In Python 3, bytes is a specialized type for handling binary data, not text. As such, iterating or indexing a bytes produces integers.

list(b'hello')  # [104, 101, 108, 108, 111]
b'hello'[0:4]  # b'hell'
b'hello'[0]  # 104
b'hello'[0][0][0][0]  # TypeError, since you can't index 104

If you have explicitly binary data that you want to be bytes in Python 3, this may pose a bit of a problem. Aside from just checking the version explicitly and making heavy use of chr/ord, there are two approaches.

One is to use bytearray instead. This is like bytes, but mutable. More importantly, since it was designed for Python 3.0 and backported to Python 2.6 as a new type, it has the same iterating and indexing behavior as Python 3’s bytes, even in Python 2.

bytearray(b'hello')[0]  # 104, on either Python 2 or 3

The other is to slice rather than index, since slicing always produces a new iterable of the same type. If you want to extract a single character from a bytes, just take a one-element slice.

b'hello'[0]  # 104
b'hello'[0:1]  # b'h'
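
Both approaches wrap up neatly into helpers that behave the same on either version. The function names here are made up for the example.

```python
def single_bytes(data):
    """Split a bytestring into one-byte bytestrings (slicing approach)."""
    return [data[i:i + 1] for i in range(len(data))]


def byte_values(data):
    """Return each byte as an integer (bytearray approach)."""
    return list(bytearray(data))


print(single_bytes(b"hi"))
print(byte_values(b"hi"))  # [104, 105]
```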

Things that are just a royal pain in the ass

Unicode

Saving the best for last, almost!

Honestly, if your Python 2 code is already careful with Unicode — working with unicode internally, and encoding/decoding only at the “boundaries” of your code — then you shouldn’t have too many problems. If your code is not so careful, you should really try to make it a little more careful before you worry about Python 3, since Python 3’s whole jam is to force you to be careful.

See, in Python 2, you can combine bytestrings (str) and text strings (unicode) more or less freely. Python will automatically try to convert between the two using the “default encoding”, which is generally ascii. Python 3 makes text strings the default string type, demotes bytestrings, and forbids ever implicitly converting between them.

Most obviously, Python 2’s str and unicode have been renamed to bytes and str in Python 3. If you happen to be using the names anywhere, you’ll probably need to change them! six offers text_type and binary_type, though you can just use bytes to mean the same thing in either version. python-future also has backports for both Python 3’s bytes and str types, which seems like an extreme approach to me. Changing str to mean a text type even in Python 2 might be a good idea, though.

b'' and u'' work the same way in either Python 2 or 3, but unadorned strings like '' are always the str type, which has different behavior. There is a from __future__ import unicode_literals, which will cause unadorned strings to be unicode in Python 2, and this might work for you. However, this prevents you from writing literal “native” strings — strings of the same type Python uses for names, keyword arguments, etc. Usually this won’t matter, since Python 2 will silently convert between bytes and text, but it’s caused me the occasional problem.

The right thing to do is just explicitly mark every single string with either a b or u sigil as necessary. That just, you know, sucks. But you should be doing it even if you’re not porting to Python 3.

basestring is completely gone in Python 3. str and bytes have no common base type, and their semantics are different enough that it rarely makes sense to treat them the same way. If you’re using basestring in Python 2, it’s probably to allow code to work on either form of “text”, and you’ll only want to use str in Python 3 (where bytes are completely unsuitable for text). six.string_types provides exactly this. futurize --stage2 also runs the lib2to3.fixes.fix_basestring fixer, but this replaces basestring with str, which will almost certainly break your code in Python 2. If you intend to use stage 2, definitely audit your uses of basestring first.

As mentioned above, bytestrings are sequences of integers, which may affect code trying to work with explicitly binary data.

Python 2 has both .decode() and .encode() on both bytes and text; if you try to encode bytes or decode text, Python will try to implicitly convert to the right type first. In Python 3, only text has an .encode() and only bytes have a .decode().

Relatedly, Python 2 allows you to do some cute tricks with “encodings” that aren’t really encodings; for example, "hi".encode('hex') produces '6869'. In Python 3, encoding must produce bytes, and decoding must produce text, so these sorts of text-to-text or bytes-to-bytes translations aren’t allowed. You can still do them explicitly with the codecs module, e.g. codecs.encode(b'hi', 'hex'), which also works in Python 2, despite being undocumented. (Note that Python 3 specifically requires bytes for the hex codec, alas. If it’s any consolation, there’s a bytes.hex() method to do this directly, which you can’t use anyway if you’re targeting Python 2.)
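
A quick sketch of the explicit form, using bytes throughout so it behaves the same on both versions. binascii.hexlify is my own addition here as another portable option; the text above only mentions codecs.

```python
import binascii
import codecs

encoded = codecs.encode(b"hi", "hex")    # bytes in, bytes out
decoded = codecs.decode(encoded, "hex")  # and back again
also_encoded = binascii.hexlify(b"hi")   # same result via binascii

print(encoded, decoded, also_encoded)
```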

Python 3’s open decodes as UTF-8 by default (a vast oversimplification, but usually), so if you’re manually decoding after reading, you’ll get an error in Python 3. You could explicitly open the file in binary mode (preserving the Python 2 behavior), or you could use codecs.open to decode transparently on read (preserving the Python 3 behavior). The same goes for writing.
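
As a sketch of the two options side by side: I'm using io.open here rather than codecs.open, which is a swap on my part. io.open is the same function as Python 3's built-in open and is also available in Python 2.6+, so the code reads identically on both. The file name and contents are made up.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Text mode with an explicit encoding: you write and read text.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"caf\u00e9")

with io.open(path, "r", encoding="utf-8") as f:
    text = f.read()

# Binary mode: you get the raw bytes and decode them yourself.
with io.open(path, "rb") as f:
    raw = f.read()

print(text == raw.decode("utf-8"))
```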

sys.stdin, sys.stdout, and sys.stderr are all text streams in Python 3, so they have the same caveats as above, with the additional wrinkle that you didn’t actually open them yourself. Their .buffer attribute gives a handle opened in binary mode (Python 2 behavior), or you can adapt them to transcode transparently (Python 3 behavior):

if six.PY2:
    sys.stdin = codecs.getreader('utf-8')(sys.stdin)
    sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
    sys.stderr = codecs.getwriter('utf-8')(sys.stderr)

A text-mode file’s .tell() in Python 3 still returns a number that can be passed back to .seek(), but the number is not necessarily meaningful, and in particular can’t be used to estimate progress through a file. (Python uses a few very high bits as flags to indicate the state of the decoder; if you mask them off, what’s left is probably the byte position in the file as you’d expect, but this is pretty definitively a hack.)

Python 3 likes to treat filenames as text, but most of the functions in os and os.path will accept either text or bytes as their arguments (and return a value of the same type), so you should be okay there.

os.environ has text keys and values in Python 3. If you direly need bytes, you can use os.environb (and os.getenvb()).

I think that covers most of the obvious basics. This is a whole sprawling topic that I can’t hope to cover off the top of my head. I’ve seen it be both fairly painful and completely pain-free, depending entirely on the state of the Python 2 codebase.

Oh, one final note: there’s a module for Python 2 called unicode-nazi (sorry, I didn’t name it) that will produce a warning anytime a bytestring is implicitly converted to a text string, or vice versa. It might help you root out places you’re accidentally slopping types back and forth, which will certainly break in Python 3. I’ve only tried it on a comically large project where it found thousands of violations, including plenty in surprising places in the standard library, so it may or may not be of any practical help.

Things that are not actually gone

String formatting with %

There’s a widespread belief that str % ... is deprecated, since there’s a newer and shinier str.format() method.

Well, it’s not. It’s not gone; it’s not deprecated; it still works just fine. I don’t like to use it, myself, since it’s easy to make accidentally ambiguous — "%s" % foo can crash if foo is a tuple! — but it’s not going anywhere. In fact, as of Python 3.5, bytes and bytearray support % but not .format.
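
The tuple ambiguity is easy to demonstrate, and easy to avoid by always wrapping the argument in a one-element tuple. The function names are made up for the example.

```python
def describe_unsafe(value):
    return "%s" % value      # a tuple here is treated as the argument list


def describe_safe(value):
    return "%s" % (value,)   # always exactly one argument


try:
    describe_unsafe((1, 2))
    crashed = False
except TypeError:
    crashed = True

print(crashed)                # True: two arguments for one %s
print(describe_safe((1, 2)))  # (1, 2)
print(b"%d bytes" % 5)        # works on Python 2 and on Python 3.5+
```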

optparse

argparse is certainly better, but the optparse module still exists in Python 3. It has been deprecated since Python 3.2, though.

Things that are preposterously obscure but that I have seen cause problems nonetheless

Tuple unpacking

A little-used feature of Python 2 is tuple unpacking in function arguments:

def foo(a, (b, c)):
    print a, b, c

x = (2, 3)
foo(1, x)

This syntax is gone in Python 3. I’ve rarely seen anyone use it, except in two cases. One was a parsing library that relied pretty critically on using it in every parsing function you wrote; whoops.

The other is when sorting a dict’s items:

sorted(d.items(), key=lambda (k, v): k + v)

In Python 3, you have to write that as lambda kv: kv[0] + kv[1]. Boo.
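
Here's what the portable versions might look like, with the unpacking moved into the function body. The sample dict is made up.

```python
def foo(a, bc):
    b, c = bc  # unpack on the first line instead of in the signature
    return a, b, c


def key_sum(item):
    # A named function can unpack, which reads better than kv[0] + kv[1].
    k, v = item
    return k + v


d = {1: 10, 2: 3, 3: 5}

print(foo(1, (2, 3)))
print(sorted(d.items(), key=lambda kv: kv[0] + kv[1]))
print(sorted(d.items(), key=key_sum))
```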

long is gone

Python 3 merged its long type with int, so now there’s only one integral type, called int.

Python 2 promotes int to long pretty much transparently, and longs aren’t very common in the first place, so it’s fairly unlikely that this will make a difference. On the off chance you’re type-checking for integers with isinstance(x, (int, long)) (and really, why are you doing that), you can just use six.integer_types instead.

Note that futurize --stage2 applies the lib2to3.fixes.fix_long fixer, which blindly renames long to int, leaving you with inappropriate code like isinstance(x, (int, int)).
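
If you'd rather not pull in six for this, a hand-rolled equivalent is only a few lines. This is a sketch of the idea, not six's actual source.

```python
try:
    integer_types = (int, long)  # Python 2: both types exist
except NameError:
    integer_types = (int,)       # Python 3: the name long is gone


def is_integer(x):
    return isinstance(x, integer_types)


print(is_integer(2 ** 100))
print(is_integer(1.0))
```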

However…

I have seen some very obscure cases where a hand-rolled binary protocol would encode ints and longs differently. My advice would be to not do that.

Oh, and a little-known feature of Python 2’s syntax is that you can have long literals by suffixing them with an L:

123  # int
123L  # long

You can write 1267650600228229401496703205376 directly in Python 2 code, and it’ll automatically create a long, so the only reason to do this is if you explicitly need a long with a small value like 1. If that’s the case, something has gone catastrophically wrong.

repr changes

These should really only affect you if you’re using reprs as expected test output (or, god forbid, as cache keys or something). Some notable changes:

  • Unicode strings have a u prefix in Python 2. In Python 3, of course, Unicode strings are just strings, so there’s no prefix.

  • Conversely, bytestrings have a b prefix in Python 3, but not in Python 2 (though the b prefix is allowed in source code).

  • Python 2 escapes all non-ASCII characters, even in the repr of a Unicode string. Python 3 only escapes control characters and codepoints considered non-printing.

  • Large integers and explicit longs have an L suffix in Python 2, but not in Python 3, where there is no separate long type.

  • A set becomes set([1, 2, 3]) in Python 2, but {1, 2, 3} in Python 3. The set literal syntax is allowed in source code in Python 2.7, but the repr wasn’t changed until 3.0.

  • The repr of a float is now the shortest possible representation that has the same underlying value; e.g., repr(1.1) is '1.1' rather than '1.1000000000000001'. This change was backported to Python 2.7 as well, but I have seen it break tests.

Hash randomization

Python has traditionally had a predictable hashing mechanism: repr(dict(a=1, b=2, c=3)) will always produce the same string. (On the same platform with the same Python version, at least.) Unfortunately this opens the door to an obscure DoS exploit that was known to Perl long ago: if you know a web application is written in Python, you can construct a query string that will become a dict whose keys all go in the same hash bucket. If your query string is long enough and you send enough requests, you can tie up all the Python processes in dealing with hash collisions.

The fix is hash randomization, which seeds the hashing algorithm in such a way that items are bucketed differently every time Python runs. It’s available in Python 2.7 via an environment variable or the -R argument, but it wasn’t turned on by default until Python 3.3.

The fear was that it might break things. Naturally, it has broken things. Mostly, reprs in tests. But it also changes the iteration order of dicts between Python runs. I have seen code using dicts whose keys happened to always be sorted in alphabetical or insertion order before, but with hash randomization, the keys were of course in a different order every time the code ran. The author assumed that Python had somehow broken dict sorting (which it has never had).
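
If you have tests or output that depend on dict order, the usual way out is to impose an order yourself. A minimal sketch, with a made-up helper name:

```python
def stable_repr(mapping):
    """Render a dict with its keys sorted, independent of hash order."""
    items = sorted(mapping.items())
    return "{" + ", ".join("%r: %r" % (k, v) for k, v in items) + "}"


print(stable_repr(dict(c=3, a=1, b=2)))
```

This only works when the keys can be compared with each other; comparing the dicts directly rather than their reprs sidesteps the problem entirely.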

nonlocal

Python 3 introduces the nonlocal keyword, which is like global except it looks through all outer scopes in the expected order. It fixes this mild annoyance:

def make_function():
    counter = 0
    def function():
        nonlocal counter
        counter += 1  # without 'nonlocal', this declares a new local!
        print("I've been called", counter, "times!")
    return function

The problem is that any use of assignment within a function automatically creates a new local, and locals are known statically for the entire body of the function. (They actually affect how functions are compiled, in CPython.) So without nonlocal, the above code would see counter += 1, but counter is a new local that has never been assigned a value, so Python cannot possibly add 1 to it, and you get an UnboundLocalError.

nonlocal tells Python that when it sees an assignment of a name that exists in some outer scope, it should reuse that outer variable rather than shadowing it. Great, right? Purely a new feature. No problem.

Unfortunately, I’ve worked on a codebase that needed this feature in Python 2, and decided to fake it with a class… named nonlocal.

def make_function():
    class nonlocal:
        counter = 0
    def function():
        nonlocal.counter += 1  # this alters an outer value in-place, so it's fine
        print("I've been called", nonlocal.counter, "times!")
    return function

The class here is used purely as a dummy container. Assigning to an attribute doesn’t create any locals, because it’s equivalent to a method call, so the operand must already exist. This is a slightly quirky approach, but it works fine.

Except that, of course, nonlocal is a keyword in Python 3, so this becomes complete gibberish. It’s such gibberish that (if I remember correctly) 2to3 actually cannot parse it, even though it’s perfectly valid Python 2 code.

I don’t have a magical fix for this one. Just, uh, don’t name things nonlocal.
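
If you need the same effect on both versions, one option (my suggestion, not what that codebase did) is a mutable container with an innocuous name. The name state is arbitrary.

```python
def make_function():
    state = {"counter": 0}  # mutated in place, never rebound

    def function():
        state["counter"] += 1  # item assignment doesn't create a local
        return state["counter"]

    return function


counter = make_function()
print(counter())
print(counter())
```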

List comprehensions no longer leak

Python 2 has the slightly inconsistent behavior that loop variables in a generator expression ((...)) are scoped to the generator expression, but loop variables in a list comprehension ([...]) belong to the enclosing scope.

The only reason is in implementation details: a list comprehension acts like a for loop, which has the same behavior, whereas a generator expression actually creates a generator internally.

Python 3 brings these cases into line: loop variables in list comprehensions (or dict or set comprehensions) are also scoped to the comprehension itself.

I cannot imagine any possible reason why this would affect you negatively, and yet, I can swear I’ve seen it happen. I wish I could remember where, because I’m sure it’s an exciting story.
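
For reference, the difference is easy to observe. The function name is made up; it returns True on Python 2 and False on Python 3.

```python
def comprehension_leaks():
    x = "outer"
    [x for x in range(3)]  # on Python 2 this rebinds x to 2
    return x != "outer"


print(comprehension_leaks())
```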

cStringIO.h is gone

cStringIO.h is a private and undocumented C interface to Python 2’s cStringIO.StringIO type. It was removed in Python 3, or at least is somewhere I can’t find it.

This was one of the reasons Thrift’s Python 3 port took almost 3 years: Thrift has a “fast” C module that makes use of this private interface, and it’s not obvious how to replace it. I think they ended up just having the module not exist on Python 3, so Python 3 will just be mysteriously slower.

Some troublesome libraries

MySQLdb is some ancient, clunky, noncompliant, underdocumented trash, much like the database it connects to. It’s nigh abandoned, though it still promises Python 3 support in the MySQLdb 2.0 vaporware. I would suggest not using MySQL, but barring that, try mysqlclient, a fork of MySQLdb that continues development and adds Python 3 support. (The same people also maintain an earlier project, pymysql, which strives to be a pure-Python drop-in replacement for MySQLdb — it’s not quite perfect, but its existence is interesting and it’s sure easier to read than MySQLdb.)

At a glance, Thrift still hasn’t had a release since it merged Python 3 support, eight months ago. It’s some enterprise nightmare, anyway, and bizarrely does code generation for a bunch of dynamic languages. Might I suggest just using the pure-Python thriftpy, which parses Thrift definitions on the fly?

Twisted is, ah, large and complex. Parts of it now support Python 3; parts of it do not. If you need the parts that don’t, well, maybe you could give them a hand?

M2Crypto is working on it, though I’m pretty sure most Python crypto nerds would advise you to use cryptography instead.

And so on

You may find any number of other obscure compatibility problems, just as you might when upgrading from 2.6 to 2.7. The Python community has a lot of clever people willing to help you out, though, and they’ve probably even seen your super duper niche problem before.

Don’t let that, or this list of gotchas in general, dissuade you! Better to start now than later; even fixing an integer division gets you one step closer to having your code run on Python 3 as well.

I’ve bought some more awful IoT stuff

Post Syndicated from Matthew Garrett original https://mjg59.dreamwidth.org/43486.html

I bought some awful WiFi lightbulbs a few months ago. The short version: they introduced terrible vulnerabilities on your network, they violated the GPL and they were also just bad at being lightbulbs. Since then I’ve bought some other Internet of Things devices, and since people seem to have a bizarre level of fascination with figuring out just what kind of fractal of poor design choices these things frequently embody, I thought I’d oblige.

Today we’re going to be talking about the KanKun SP3, a plug that’s been around for a while. The idea here is pretty simple – there’s lots of devices that you’d like to be able to turn on and off in a programmatic way, and rather than rewiring them the simplest thing to do is just to insert a control device in between the wall and the device and now you can turn your foot bath on and off from your phone. Most vendors go further and also allow you to program timers and even provide some sort of remote tunneling protocol so you can turn off your lights from the comfort of somebody else’s home.

The KanKun has all of these features and a bunch more, although when I say “features” I kind of mean the opposite. I plugged mine in and followed the install instructions. As is pretty typical, this took the form of the plug bringing up its own Wifi access point, the app on the phone connecting to it and sending configuration data, and the plug then using that data to join your network. Except it didn’t work. I connected to the plug’s network, gave it my SSID and password and waited. Nothing happened. No useful diagnostic data. Eventually I plugged my phone into my laptop and ran adb logcat, and the Android debug logs told me that the app was trying to modify a network that it hadn’t created. Apparently this isn’t permitted as of Android 6, but the app was handling this denial by just trying again. I deleted the network from the system settings, restarted the app, and this time the app created the network record and could modify it. It still didn’t work, but that’s because it let me give it a 5GHz network and it only has a 2.4GHz radio, so one reset later and I finally had it online.

The first thing I normally do to one of these things is run nmap with the -O argument, which gives you an indication of what OS it’s running. I didn’t really need to in this case, because if I just telnetted to port 22 I got a dropbear ssh banner. Googling turned up the root password (“p9z34c”) and I was logged into a lightly hacked (and fairly obsolete) OpenWRT environment.

It turns out that there’s a whole community of people playing with these plugs, and it’s common for people to install CGI scripts on them so they can turn them on and off via an API. At first this sounds somewhat confusing, because if the phone app can control the plug then there clearly is some kind of API, right? Well ha yeah ok that’s a great question and oh good lord do things start getting bad quickly at this point.

I’d grabbed the apk for the app and a copy of jadx, an incredibly useful piece of code that’s surprisingly good at turning compiled Android apps into something resembling Java source. I dug through that for a while before figuring out that before packets were being sent, they were being handed off to some sort of encryption code. I couldn’t find that in the app, but there was a native ARM library shipped with it. Running strings on that showed functions with names matching the calls in the Java code, so that made sense. There were also references to AES, which explained why when I ran tcpdump I only saw bizarre garbage packets.

But what was surprising was that most of these packets were substantially similar. There were a load that were identical other than a 16-byte chunk in the middle. That plus the fact that every payload length was a multiple of 16 bytes strongly indicated that AES was being used in ECB mode. In ECB mode each plaintext is split up into 16-byte chunks and encrypted with the same key. The same plaintext will always result in the same encrypted output. This implied that the packets were substantially similar and that the encryption key was static.
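
The check described above can be sketched in a few lines of Python. The two payloads below are made-up stand-ins, not real captures from the plug, and the function names are mine.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def split_blocks(payload):
    if len(payload) % BLOCK_SIZE != 0:
        raise ValueError("payload length is not a multiple of 16 bytes")
    return [payload[i:i + BLOCK_SIZE]
            for i in range(0, len(payload), BLOCK_SIZE)]


def differing_blocks(first, second):
    """Return the indexes of the 16-byte blocks that differ."""
    a, b = split_blocks(first), split_blocks(second)
    if len(a) != len(b):
        raise ValueError("payloads have different lengths")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Placeholder data standing in for two captured payloads.
packet_one = b"A" * 16 + b"B" * 16 + b"C" * 16
packet_two = b"A" * 16 + b"X" * 16 + b"C" * 16

print(differing_blocks(packet_one, packet_two))  # [1]
```

Payloads that differ in only one aligned block, as here, are what you'd expect from ECB with a static key; a chained mode with a fresh IV would change every block.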

Some more digging showed that someone had figured out the encryption key last year, and that someone else had written some tools to control the plug without needing to modify it. The protocol is basically ascii and consists mostly of the MAC address of the target device, a password and a command. This is then encrypted and sent to the device’s IP address. The device then sends a challenge packet containing a random number. The app has to decrypt this, obtain the random number, create a response, encrypt that and send it before the command takes effect. This avoids the most obvious weakness around using ECB – since the same plaintext always encrypts to the same ciphertext, you could just watch encrypted packets go past and replay them to get the same effect, even if you didn’t have the encryption key. Using a random number in a challenge forces you to prove that you actually have the key.

At least, it would do if the numbers were actually random. It turns out that the plug is just calling rand(). Further, it turns out that it never calls srand(). This means that the plug will always generate the same sequence of challenges after a reboot, which means you can still carry out replay attacks if you can reboot the plug. Strong work.
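
As an analogy only (this is Python's generator, not the plug's C library), a generator that starts from the same seed on every boot produces the same "random" challenges every time. In C, calling rand() without srand() behaves as if srand(1) had been called, which is why the seed below is fixed at 1. The function name is made up.

```python
import random


def challenges_after_boot(count, seed=1):
    # A fresh generator with a fixed seed stands in for a freshly booted
    # device that never seeds its random number generator.
    rng = random.Random(seed)
    return [rng.randrange(2 ** 31) for _ in range(count)]


first_boot = challenges_after_boot(5)
second_boot = challenges_after_boot(5)

print(first_boot == second_boot)  # True
```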

But there was still the question of how the remote control works, since the code on github only worked locally. tcpdumping the traffic from the server and trying to decrypt it in the same way as local packets worked fine, and showed that the only difference was that the packet started “wan” rather than “lan”. The server decrypts the packet, looks at the MAC address, re-encrypts it and sends it over the tunnel to the plug that registered with that address.

That’s not really a great deal of authentication. The protocol permits a password, but the app doesn’t insist on it – some quick playing suggests that about 90% of these devices still use the default password. And the devices are all based on the same wifi module, so the MAC addresses are all in the same range. The process of sending status check packets to the server with every MAC address wouldn’t take that long and would tell you how many of these devices are out there. If they’re using the default password, that’s enough to have full control over them.

There’s some other failings. The github repo mentioned earlier includes a script that allows arbitrary command execution – the wifi configuration information is passed to the system() command, so leaving a semicolon in the middle of it will result in your own commands being executed. Thankfully this doesn’t seem to be true of the daemon that’s listening for the remote control packets, which seems to restrict its use of system() to data entirely under its control. But even if you change the default root password, anyone on your local network can get root on the plug. So that’s a thing. It also downloads firmware updates over http and doesn’t appear to check signatures on them, so there’s the potential for MITM attacks on the plug itself. The remote control server is on AWS unless your timezone is GMT+8, in which case it’s in China. Sorry, Western Australia.
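
The command injection mentioned above is the classic result of pasting untrusted input into a shell command line. Here's a sketch of the difference quoting makes; configure_wifi is a hypothetical command name, not something from the plug or the github repo.

```python
import shlex


def naive_command(ssid):
    # The shell would treat the semicolon as a command separator.
    return "configure_wifi " + ssid


def quoted_command(ssid):
    # shlex.quote turns the whole value into one literal shell word.
    return "configure_wifi " + shlex.quote(ssid)


hostile_ssid = "home; reboot"

print(naive_command(hostile_ssid))
print(quoted_command(hostile_ssid))
```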

It’s running Linux and includes Busybox and dnsmasq, so plenty of GPLed code. I emailed the manufacturer asking for a copy and got told that they wouldn’t give it to me, which is unsurprising but still disappointing.

The use of AES is still somewhat confusing, given the relatively small amount of security it provides. One thing I’ve wondered is whether it’s not actually intended to provide security at all. The remote servers need to accept connections from anywhere and funnel decent amounts of traffic around from phones to switches. If that weren’t restricted in any way, competitors would be able to use existing servers rather than setting up their own. Using AES at least provides a minor obstacle that might encourage them to set up their own server.

Overall: the hardware seems fine, the software is shoddy and the security is terrible. If you have one of these, set a strong password. There’s no rate-limiting on the server, so a weak password will be broken pretty quickly. It’s also infringing my copyright, so I’d recommend against it on that point alone.

comment count unavailable comments

I’ve bought some more awful IoT stuff

Post Syndicated from Matthew Garrett original http://mjg59.dreamwidth.org/43486.html

I bought some awful WiFi lightbulbs a few months ago. The short version: they introduced terrible vulnerabilities on your network, they violated the GPL and they were also just bad at being lightbulbs. Since then I’ve bought some other Internet of Things devices, and since people seem to have a bizarre level of fascination with figuring out just what kind of fractal of poor design choices these things frequently embody, I thought I’d oblige.

Today we’re going to be talking about the KanKun SP3, a plug that’s been around for a while. The idea here is pretty simple – there’s lots of devices that you’d like to be able to turn on and off in a programmatic way, and rather than rewiring them the simplest thing to do is just to insert a control device in between the wall and the device andn ow you can turn your foot bath on and off from your phone. Most vendors go further and also allow you to program timers and even provide some sort of remote tunneling protocol so you can turn off your lights from the comfort of somebody else’s home.

The KanKun has all of these features and a bunch more, although when I say “features” I kind of mean the opposite. I plugged mine in and followed the install instructions. As is pretty typical, this took the form of the plug bringing up its own Wifi access point, the app on the phone connecting to it and sending configuration data, and the plug then using that data to join your network. Except it didn’t work. I connected to the plug’s network, gave it my SSID and password and waited. Nothing happened. No useful diagnostic data. Eventually I plugged my phone into my laptop and ran adb logcat, and the Android debug logs told me that the app was trying to modify a network that it hadn’t created. Apparently this isn’t permitted as of Android 6, but the app was handling this denial by just trying again. I deleted the network from the system settings, restarted the app, and this time the app created the network record and could modify it. It still didn’t work, but that’s because it let me give it a 5GHz network and it only has a 2.4GHz radio, so one reset later and I finally had it online.

The first thing I normally do to one of these things is run nmap with the -O argument, which gives you an indication of what OS it’s running. I didn’t really need to in this case, because if I just telnetted to port 22 I got a dropbear ssh banner. Googling turned up the root password (“p9z34c”) and I was logged into a lightly hacked (and fairly obsolete) OpenWRT environment.

It turns out that here’s a whole community of people playing with these plugs, and it’s common for people to install CGI scripts on them so they can turn them on and off via an API. At first this sounds somewhat confusing, because if the phone app can control the plug then there clearly is some kind of API, right? Well ha yeah ok that’s a great question and oh good lord do things start getting bad quickly at this point.

I’d grabbed the apk for the app and a copy of jadx, an incredibly useful piece of code that’s surprisingly good at turning compiled Android apps into something resembling Java source. I dug through that for a while before figuring out that before packets were being sent, they were being handed off to some sort of encryption code. I couldn’t find that in the app, but there was a native ARM library shipped with it. Running strings on that showed functions with names matching the calls in the Java code, so that made sense. There were also references to AES, which explained why when I ran tcpdump I only saw bizarre garbage packets.

But what was surprising was that most of these packets were substantially similar. There were a load that were identical other than a 16-byte chunk in the middle. That plus the fact that every payload length was a multiple of 16 bytes strongly indicated that AES was being used in ECB mode. In ECB mode each plaintext is split up into 16-byte chunks and encrypted with the same key. The same plaintext will always result in the same encrypted output. This implied that the packets were substantially similar and that the encryption key was static.

Some more digging showed that someone had figured out the encryption key last year, and that someone else had written some tools to control the plug without needing to modify it. The protocol is basically ascii and consists mostly of the MAC address of the target device, a password and a command. This is then encrypted and sent to the device’s IP address. The device then sends a challenge packet containing a random number. The app has to decrypt this, obtain the random number, create a response, encrypt that and send it before the command takes effect. This avoids the most obvious weakness around using ECB – since the same plaintext always encrypts to the same ciphertext, you could just watch encrypted packets go past and replay them to get the same effect, even if you didn’t have the encryption key. Using a random number in a challenge forces you to prove that you actually have the key.

At least, it would do if the numbers were actually random. It turns out that the plug is just calling rand(). Further, it turns out that it never calls srand(). This means that the plug will always generate the same sequence of challenges after a reboot, which means you can still carry out replay attacks if you can reboot the plug. Strong work.
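The C standard says that calling rand() without ever calling srand() behaves as if srand(1) had been called, which is why the sequence repeats. Here is a small sketch of the consequence, using Python’s random.Random with a fixed seed as a stand-in for the plug’s generator; the seed value and the challenge range are illustrative assumptions, not values taken from the firmware.

```python
import random

def challenges_after_boot(count: int, seed: int = 1) -> list[int]:
    """Model a device whose generator starts from the same fixed seed on every boot."""
    rng = random.Random(seed)  # stand-in for C's rand() with no srand() call
    return [rng.randrange(2 ** 31) for _ in range(count)]

first_boot = challenges_after_boot(5)
second_boot = challenges_after_boot(5)
print(first_boot == second_boot)  # True
```

Anyone who has recorded the exchanges after one reboot therefore already holds valid responses for the challenges issued after the next one.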

But there was still the question of how the remote control works, since the code on GitHub only worked locally. tcpdumping the traffic from the server and trying to decrypt it in the same way as local packets worked fine, and showed that the only difference was that the packet started with “wan” rather than “lan”. The server decrypts the packet, looks at the MAC address, re-encrypts it and sends it over the tunnel to the plug that registered with that address.

That’s not really a great deal of authentication. The protocol permits a password, but the app doesn’t insist on it – some quick playing suggests that about 90% of these devices still use the default password. And the devices are all based on the same wifi module, so the MAC addresses are all in the same range. The process of sending status check packets to the server with every MAC address wouldn’t take that long and would tell you how many of these devices are out there. If they’re using the default password, that’s enough to have full control over them.
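To put a rough number on “wouldn’t take that long”: a back-of-the-envelope sketch, assuming the devices share a single 24-bit vendor prefix (so 24 bits of the MAC address vary) and assuming a probe rate of 1,000 status checks per second. Both figures are assumptions for illustration; the post gives neither.

```python
def sweep_hours(free_bits: int, probes_per_second: float) -> float:
    """Hours needed to try every address when `free_bits` bits of the MAC vary."""
    return (2 ** free_bits) / probes_per_second / 3600

addresses = 2 ** 24  # addresses available under one 24-bit vendor prefix
print(addresses)                         # 16777216
print(round(sweep_hours(24, 1000), 2))   # 4.66
```

If the real range is narrower than a full vendor prefix, the sweep is proportionally shorter.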

There are some other failings. The GitHub repo mentioned earlier includes a script that allows arbitrary command execution: the wifi configuration information is passed to the system() command, so leaving a semicolon in the middle of it will result in your own commands being executed. Thankfully this doesn’t seem to be true of the daemon that’s listening for the remote control packets, which seems to restrict its use of system() to data entirely under its control. But even if you change the default root password, anyone on your local network can get root on the plug. So that’s a thing. It also downloads firmware updates over HTTP and doesn’t appear to check signatures on them, so there’s the potential for MITM attacks on the plug itself. The remote control server is on AWS unless your timezone is GMT+8, in which case it’s in China. Sorry, Western Australia.
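The semicolon problem is the classic shell injection pattern. The sketch below only builds command strings and never executes them; the command name configure_wifi is hypothetical and stands in for whatever the firmware actually runs. It contrasts plain concatenation with quoting the untrusted value via shlex.quote.

```python
import shlex

def unsafe_command(ssid: str) -> str:
    """Build a shell command by plain string concatenation (the vulnerable pattern)."""
    return "configure_wifi --ssid " + ssid

def quoted_command(ssid: str) -> str:
    """Quote the untrusted value so the shell treats it as a single argument."""
    return "configure_wifi --ssid " + shlex.quote(ssid)

ssid = "home; reboot"  # attacker-controlled "network name"
print(unsafe_command(ssid))  # configure_wifi --ssid home; reboot
print(quoted_command(ssid))  # configure_wifi --ssid 'home; reboot'
```

In the first form a shell would run reboot as a second command; in the second the whole string stays one argument. Avoiding the shell altogether, by passing an argument list to the process-spawning call, is generally the more robust fix.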

It’s running Linux and includes Busybox and dnsmasq, so plenty of GPLed code. I emailed the manufacturer asking for a copy and got told that they wouldn’t give it to me, which is unsurprising but still disappointing.

The use of AES is still somewhat confusing, given the relatively small amount of security it provides. One thing I’ve wondered is whether it’s not actually intended to provide security at all. The remote servers need to accept connections from anywhere and funnel decent amounts of traffic around from phones to switches. If that weren’t restricted in any way, competitors would be able to use existing servers rather than setting up their own. Using AES at least provides a minor obstacle that might encourage them to set up their own server.

Overall: the hardware seems fine, the software is shoddy and the security is terrible. If you have one of these, set a strong password. There’s no rate-limiting on the server, so a weak password will be broken pretty quickly. It’s also infringing my copyright, so I’d recommend against it on that point alone.


Satoshi: how Craig Wright’s deception worked

Post Syndicated from Robert Graham original http://blog.erratasec.com/2016/05/satoshi-how-craig-wrights-deception.html

My previous post shows how anybody can verify Satoshi using a GUI. In this post, I’ll do the same, with command-line tools (openssl). It’s just a simple application of crypto (hashes, public-keys) to the problem.

I go through this step-by-step discussion in order to demonstrate Craig Wright’s scam. Dan Kaminsky’s post and the redditors come to the same point through a different sequence, but I think my way is clearer.

Step #1: the Bitcoin address

We know certain Bitcoin addresses correspond to Satoshi Nakamoto him/herself. For the sake of discussion, we’ll use the address 15fszyyM95UANiEeVa4H5L6va7Z7UFZCYP. It’s actually my address, but we’ll pretend it’s Satoshi’s. In this post, I’m going to prove that this address belongs to me.

The address isn’t the public-key, as you’d expect, but the hash of the public-key. Hashes are a lot shorter, and easier to pass around. We only pull out the public-key when we need to do a transaction. The hashing algorithm is explained on this website [http://gobittest.appspot.com/Address]. It’s basically base58(ripemd(sha256(public-key))).
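In slightly more detail, a classic Bitcoin address is the Base58 encoding of a version byte, the 20-byte RIPEMD-160 hash of the SHA-256 hash of the public-key, and a 4-byte checksum (the first four bytes of a double SHA-256 over the preceding bytes). The sketch below goes in the decoding direction, which needs only SHA-256 from the standard library: it unpacks the address from this post into those three parts. The function names are my own, not from any Bitcoin library.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58decode(text: str) -> bytes:
    """Decode Base58 text; each leading '1' stands for one leading zero byte."""
    number = 0
    for ch in text:
        number = number * 58 + ALPHABET.index(ch)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body

def b58encode(data: bytes) -> str:
    """Encode bytes as Base58 text (inverse of b58decode)."""
    number = int.from_bytes(data, "big")
    digits = ""
    while number:
        number, rem = divmod(number, 58)
        digits = ALPHABET[rem] + digits
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + digits

def checksum_ok(raw: bytes) -> bool:
    """Check that the last 4 bytes match the double SHA-256 of the rest."""
    payload, checksum = raw[:-4], raw[-4:]
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4] == checksum

address = "15fszyyM95UANiEeVa4H5L6va7Z7UFZCYP"
raw = b58decode(address)
print(len(raw), raw[0])   # 25 0  (version byte 0, 20-byte hash, 4-byte checksum)
print(raw[1:21].hex())    # the ripemd(sha256(public-key)) part
print(checksum_ok(raw))   # whether the embedded checksum matches
```

Going the other way, hashing the public-key with SHA-256 and then RIPEMD-160 should reproduce the 20-byte middle section printed above.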

Step #2: You get the public-key

Hashes are one-way, so given a Bitcoin address, we can’t immediately convert it into a public-key. Instead, we have to look it up in the blockchain, the vast public ledger that is at the heart of Bitcoin. The blockchain records every transaction, and is approaching 70-gigabytes in size.

To find an address’s matching public-key, we have to search for a transaction where the bitcoin is spent. If an address has only received Bitcoins, then its matching public-key won’t appear in the Blockchain. In that case, a person trying to prove their identity will have to tell you the public-key, which is fine, of course, since the keys are designed to be public.

Luckily, there are lots of websites that store the blockchain in a database and make it easy for us to browse. I use Blockchain.info. The URL to my address is:

https://blockchain.info/address/15fszyyM95UANiEeVa4H5L6va7Z7UFZCYP

There is a list of transactions here where I spend coin. Let’s pick the top one, at this URL:

https://blockchain.info/tx/8c4263d864d4f36e4eb4065a877e3e9a68cbe1de63a7b1fda70096e1e209cbbb

Toward the bottom are the “scripts”. Bitcoin has a small scripting language, allowing complex transactions to be created, but most transactions are simple. There are two common formats for these scripts, an old format and a new format. In the old format, you’ll find the public-key in the Output Script. In the new format, you’ll find the public-key in the Input Scripts. It’ll be a long, long number starting with “04”.

In this case, my public-key is:

04b19ffb77b602e4ad3294f770130c7677374b84a7a164fe6a80c81f13833a673dbcdb15c29857ce1a23fca1c808b9c29404b84b986924e6ff08fb3517f38bc099

You can verify that this hashes to my Bitcoin address using the website I mention above.

Step #3: You format the key according to OpenSSL

OpenSSL wants the public-key in its own format (wrapped in ASN.1 DER, then encoded in BASE64). I should just insert the JavaScript form to do it directly in this post, but I’m lazy. Instead, use the following code in the file “foo.js”:

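// foo.js: wrap a raw hex public-key in OpenSSL's PEM format
var KeyEncoder = require('key-encoder');
var sec = new KeyEncoder('secp256k1');
var args = process.argv.slice(2);   // args[0] is the hex public-key
var pemKey = sec.encodePublic(args[0], 'raw', 'pem');
console.log(pemKey);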
Then run:
npm install key-encoder

node foo.js 04b19ffb77b602e4ad3294f770130c7677374b84a7a164fe6a80c81f13833a673dbcdb15c29857ce1a23fca1c808b9c29404b84b986924e6ff08fb3517f38bc099
This will print the following, which you save as the file pub.pem:
-----BEGIN PUBLIC KEY-----
MFYwEAYHKoZIzj0CAQYFK4EEAAoDQgAEsZ/7d7YC5K0ylPdwEwx2dzdLhKehZP5q
gMgfE4M6Zz282xXCmFfOGiP8ocgIucKUBLhLmGkk5v8I+zUX84vAmQ==
-----END PUBLIC KEY-----
To verify that we have a correctly formatted OpenSSL public-key, we run the following command. As you can see, the hex of the OpenSSL public-key agrees with the original hex above (04b19ffb…) that I got from the Blockchain:
$ openssl ec -in pub.pem -pubin -text -noout
read EC key
Private-Key: (256 bit)
pub:
    04:b1:9f:fb:77:b6:02:e4:ad:32:94:f7:70:13:0c:
    76:77:37:4b:84:a7:a1:64:fe:6a:80:c8:1f:13:83:
    3a:67:3d:bc:db:15:c2:98:57:ce:1a:23:fc:a1:c8:
    08:b9:c2:94:04:b8:4b:98:69:24:e6:ff:08:fb:35:
    17:f3:8b:c0:99
ASN1 OID: secp256k1
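The same check can be done without OpenSSL, since the PEM body is just BASE64 around the DER bytes. A minimal sketch, assuming the key is an uncompressed secp256k1 point, in which case the raw public-key is the final 65 bytes of the structure:

```python
import base64

PEM_BODY = (
    "MFYwEAYHKoZIzj0CAQYFK4EEAAoDQgAEsZ/7d7YC5K0ylPdwEwx2dzdLhKehZP5q"
    "gMgfE4M6Zz282xXCmFfOGiP8ocgIucKUBLhLmGkk5v8I+zUX84vAmQ=="
)
RAW_KEY_HEX = (
    "04b19ffb77b602e4ad3294f770130c7677374b84a7a164fe6a80c81f13833a67"
    "3dbcdb15c29857ce1a23fca1c808b9c29404b84b986924e6ff08fb3517f38bc099"
)

der = base64.b64decode(PEM_BODY)
point = der[-65:]  # uncompressed point: 0x04, then 32-byte X, then 32-byte Y

print(len(der))                              # 88
print(bytes.fromhex("2b8104000a") in der)    # True: the encoded OID for secp256k1
print(point.hex() == RAW_KEY_HEX)            # compare against the hex from the Blockchain
```

The bytes before the point are the ASN.1 wrapper: a header naming the key type and the curve, then a BIT STRING holding the point itself.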

Step #4: I create a message file

What we are going to do is sign a message. That could be a message you create, to test whether I can sign it. Or I can simply create my own message file.
In this example, I’m going to use the file message.txt:
Robert Graham is Satoshi Nakamoto

Obviously, if I can sign this file with Satoshi’s key, then I’m the real Satoshi.

There’s a problem here, though. The message I choose can be too long (such as when choosing a large work of Sartre). Or, in this case, depending on how you copy/paste the text into a file, it may end with varying “line-feeds” and “carriage-returns”. 

Therefore, at this stage, I may instead just choose to hash the message file into something smaller and more consistent. I’m not going to in my example, but that’s what Craig Wright does in his fraudulent example. And it’s important.

BTW, if you just echo from the command-line, or use ‘vi’ to create a file, it’ll automatically append a single line-feed. That’s what I assume for my message. In hex you should get:

$ xxd -i message.txt
unsigned char message_txt[] = {
  0x52, 0x6f, 0x62, 0x65, 0x72, 0x74, 0x20, 0x47, 0x72, 0x61, 0x68, 0x61,
  0x6d, 0x20, 0x69, 0x73, 0x20, 0x53, 0x61, 0x74, 0x6f, 0x73, 0x68, 0x69,
  0x20, 0x4e, 0x61, 0x6b, 0x61, 0x6d, 0x6f, 0x74, 0x6f, 0x0a
};
unsigned int message_txt_len = 34;
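To see why the line ending matters, here is a small sketch comparing the message as assumed above (a single trailing line-feed) with the same text ending in carriage-return plus line-feed. The second variant is just an illustration of what a different editor might save.

```python
import hashlib

message_lf = b"Robert Graham is Satoshi Nakamoto\n"      # as assumed in this post
message_crlf = b"Robert Graham is Satoshi Nakamoto\r\n"  # e.g. saved with CRLF endings

print(len(message_lf))        # 34
print(message_lf[-1:].hex())  # 0a

same = hashlib.sha256(message_lf).digest() == hashlib.sha256(message_crlf).digest()
print(same)                   # False
```

Since the signature covers the hash of the exact bytes, a file that differs by even one invisible byte will fail verification.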

Step #5: I grab my private-key from my wallet

To prove my identity, I extract my private-key from my wallet file, and convert it into an OpenSSL file in a method similar to that above, creating the file priv.pem (the sister of the pub.pem that you create). I’m skipping the steps, because I’m not actually going to show you my private key, but they are roughly the same as above. Bitcoin-qt has a little “dumpprivkey” command that’ll dump the private key, which I then wrap in OpenSSL ASN.1. If you want to do this, I used the following node.js code, with the “base-58” and “key-encoder” dependencies.
var Base58 = require('base-58');
var KeyEncoder = require('key-encoder');
var sec = new KeyEncoder('secp256k1');
var args = process.argv.slice(2);
var x = Base58.decode(args[0]);   // args[0] is the key as printed by dumpprivkey
x = x.slice(1);                   // drop the leading version byte
if (x.length == 36)
    x = x.slice(0, 32);           // drop the trailing 4-byte checksum
var pemPrivateKey = sec.encodePrivate(x, 'raw', 'pem');
console.log(pemPrivateKey);

Step #6: I sign the message.txt with priv.pem

I then sign the file message.txt with my private-key priv.pem, and save the base64 encoded results in sig.b64.
openssl dgst -sign priv.pem message.txt | base64 >sig.b64
This produces the following file sig.b64 that has the following contents:
MEUCIQDoy6K0xQ1cAPg7fXbQcmfbtK4VJ5wlMTzG4DaUV3zF9gIgLNbJw0oqj3lQf7lhe7TtPzse
PXf8GB3q4IhCiWVxTJ8=
How signing works is that it first creates a SHA256 hash of the file message.txt, then it signs that hash with my private-key using the secp256k1 public-key algorithm (loosely speaking, it “encrypts” the hash). It wraps the result in an ASN.1 DER binary file. Sadly, there’s no native BASE64 file format, so I have to encode it in BASE64 myself in order to post on this page, and you’ll have to BASE64 decode it before you use it.
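The ASN.1 DER wrapper is small enough to take apart by hand: it is a SEQUENCE holding two INTEGERs, usually called r and s. The sketch below is a deliberately minimal parser that only handles short-form lengths (enough for a signature of this size), applied to the sig.b64 contents above. It does not verify anything; it only shows what is inside the file.

```python
import base64

def parse_der_signature(der: bytes) -> tuple[int, int]:
    """Parse SEQUENCE { INTEGER r, INTEGER s }, supporting short-form lengths only."""
    if der[0] != 0x30 or der[1] != len(der) - 2:
        raise ValueError("not a short-form DER SEQUENCE")
    pos = 2
    values = []
    for _ in range(2):
        if der[pos] != 0x02:
            raise ValueError("expected an INTEGER")
        length = der[pos + 1]
        values.append(int.from_bytes(der[pos + 2:pos + 2 + length], "big"))
        pos += 2 + length
    if pos != len(der):
        raise ValueError("unexpected trailing bytes")
    return values[0], values[1]

SIG_B64 = (
    "MEUCIQDoy6K0xQ1cAPg7fXbQcmfbtK4VJ5wlMTzG4DaUV3zF9gIgLNbJw0oqj3lQf7lhe7TtPzse"
    "PXf8GB3q4IhCiWVxTJ8="
)
der = base64.b64decode(SIG_B64)
r, s = parse_der_signature(der)
print(len(der))  # 71
print(hex(r))
print(hex(s))
```

The long hex string beginning 3045022100 that appears later in this post has the same layout, so the same parser applies to it after converting the hex to bytes.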

Step #7: You verify the signature

Okay, at this point you have three files. You have my public-key pub.pem, my message message.txt, and the signature sig.b64.
First, you need to convert the signature back into binary:
base64 -d sig.b64 > sig.der
Now you run the verify command:
openssl dgst -verify pub.pem -signature sig.der message.txt
If I’m really who I say I am, then you’ll see the result:
Verified OK
If something has gone wrong, you’ll get the error:
Verification Failure

How we know the Craig Wright post was a scam

This post is similarly structured to Craig Wright’s post, and in the differences we’ll figure out how he did his scam.
As I point out in Step #4 above, a large file (like a work from Sartre) would be difficult to work with, so I could just hash it, and put the binary hash into a file. It’s really all the same, because I’m creating some arbitrary un-signed bytes, then signing them.
But here’s the clever bit. If you’ve been paying attention, you’ll notice that the Sartre file has been hashed twice by SHA256, before the hash has been encrypted. In other words, it looks like the function:
secp256k1(sha256(sha256(message)))
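To make the double hashing concrete, here is a sketch of just the hashing half, with the elliptic-curve step left out. It assumes, as described above, that the signing tool applies SHA256 once to whatever file it is given; the message is a placeholder.

```python
import hashlib

def digest_that_gets_signed(file_bytes: bytes) -> bytes:
    """The signing tool hashes its input file once with SHA256 before signing."""
    return hashlib.sha256(file_bytes).digest()

message = b"any message at all"                     # placeholder input
prehashed_file = hashlib.sha256(message).digest()   # a file holding only the first hash

double_hash = hashlib.sha256(hashlib.sha256(message).digest()).digest()
print(digest_that_gets_signed(prehashed_file) == double_hash)  # True
```

So handing the tool a file that already contains a hash means the value that actually gets signed is sha256(sha256(message)).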
Now let’s go back to Bitcoin transactions. Transactions are signed by first hashing twice:
secp256k1(sha256(sha256(transaction)))

Notice that the algorithms are the same. That’s how Craig Wright tried to fool us. Unknown to us, he grabbed a transaction from the real Satoshi, and grabbed the initial hash (see Update below for contents). He then claimed that his “Sartre” file had that same hash:

479f9dff0155c045da78402177855fdb4f0f396dc0d2c24f7376dd56e2e68b05

Which signed (hashed again, then encrypted), becomes:

3045022100c12a7d54972f26d14cb311339b5122f8c187417dde1e8efb6841f55c34220ae0022066632c5cd4161efa3a2837764eee9eb84975dd54c2de2865e9752585c53e7cce
That’s a lie. How are we supposed to know? After all, we aren’t going to type in a bunch of hex digits then go search the blockchain for those bytes. We didn’t have a copy of the Sartre file to calculate the hash ourselves.
Now, when hashed and signed, the results from openssl exactly match the results from that old Bitcoin transaction. Craig Wright magically appears to have proven he knows Satoshi’s private-key, when in fact he’s copied the inputs/outputs and made us think we calculated them.
It would’ve worked, too, but there’s too many damn experts in the blockchain who immediately pick up on the subtle details. There’s too many people willing to type in all those characters. Once typed in, it’s a simple matter of googling them to find them in the blockchain.
Also, it looks as suspicious as all hell. He explains the trivial bits, like “what is hashing”, with odd references to old publications, but then leaves out important bits. I had to write code in order to extract my own private-key from my wallet in order to make it into something that OpenSSL would accept. That’s a step he didn’t actually have to go through, and thus didn’t have to document.

Conclusion

Both Bitcoin and OpenSSL are just straightforward applications of basic crypto. It’s the fact that they share the basics that made this crossover work. It’s applying our basic crypto knowledge to the problem that catches him in the lie.
I write this post not really to catch Craig Wright in a scam, but to help teach basic crypto. Working backwards from this blogpost, learning the bits you didn’t understand, will teach you the important basics of crypto.

Appendix

To verify that I have that Bitcoin address, you’ll need the three files:
pub.pem

-----BEGIN PUBLIC KEY-----
MFYwEAYHKoZIzj0CAQYFK4EEAAoDQgAEsZ/7d7YC5K0ylPdwEwx2dzdLhKehZP5q
gMgfE4M6Zz282xXCmFfOGiP8ocgIucKUBLhLmGkk5v8I+zUX84vAmQ==
-----END PUBLIC KEY-----

message.txt

Robert Graham is Satoshi Nakamoto

sig.b64

MEUCIQDoy6K0xQ1cAPg7fXbQcmfbtK4VJ5wlMTzG4DaUV3zF9gIgLNbJw0oqj3lQf7lhe7TtPzsePXf8GB3q4IhCiWVxTJ8=

Now run the following command, and verify it matches the hex value for the public-key that you found in the transaction in the blockchain:


openssl ec -in pub.pem -pubin -text -noout

Now verify the message:
base64 -d sig.b64 > sig.der
openssl dgst -verify pub.pem -signature sig.der message.txt

Update:

The lie can be condensed into two images. In the first are excerpts from his post, where he claims the file “Sartre” has the specific sha256sum and contains the shown text:
But, we know that this checksum matches instead an intermediate step in the 2009 Bitcoin transaction, which if put in a file, would have the following contents:
The sha256sum result is the same in both cases, so either I’m lying or Craig Wright is. You can verify for yourself which one is lying by creating your own Sartre file from this base64 encoded data (copy/paste into file, then base64 -d > Sartre to create binary file).

AQAAAAG6kcHV5VqeL6tOQfVbhipzskcZqtE6Un0WnB+tO2O1EgEAAABDQQQR25Ph3NuKAWtJhA+MU7wetoo4LpexSC7K17FIppCaXLLg6t37hMz5dERk+C4WC/qbi2T51MA/mZuGQ/ZWtBKjrP////8CAMqaOwAAAABDQQS+2CfTdHS+/7N+/lM3AawffGAJV6RIe+izcTRvAWgm7m9XujDYikcqDk7NLwdZmnlfHwHeeNeRs4LmXuHFi0UIrADSSWsAAAAAQ0EEEduT4dzbigFrSYQPjFO8HraKOC6XsUguytexSKaQmlyy4Ord+4TM+XREZPguFgv6m4tk+dTAP5mbhkP2VrQSo6wAAAAAAQAAAA==
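If you would rather not touch the shell, the same check can be sketched in a few lines: decode the data above, hash it once with SHA256, and compare against the hash Craig Wright claimed for his “Sartre” file. The comparison is printed rather than asserted, so you can see the result for yourself.

```python
import base64
import hashlib

SARTRE_B64 = (
    "AQAAAAG6kcHV5VqeL6tOQfVbhipzskcZqtE6Un0WnB+tO2O1EgEAAABDQQQR25Ph3NuKAWtJhA+M"
    "U7wetoo4LpexSC7K17FIppCaXLLg6t37hMz5dERk+C4WC/qbi2T51MA/mZuGQ/ZWtBKjrP////8C"
    "AMqaOwAAAABDQQS+2CfTdHS+/7N+/lM3AawffGAJV6RIe+izcTRvAWgm7m9XujDYikcqDk7NLwdZ"
    "mnlfHwHeeNeRs4LmXuHFi0UIrADSSWsAAAAAQ0EEEduT4dzbigFrSYQPjFO8HraKOC6XsUguytex"
    "SKaQmlyy4Ord+4TM+XREZPguFgv6m4tk+dTAP5mbhkP2VrQSo6wAAAAAAQAAAA=="
)
CLAIMED_SHA256 = "479f9dff0155c045da78402177855fdb4f0f396dc0d2c24f7376dd56e2e68b05"

sartre = base64.b64decode(SARTRE_B64)
print(sartre[:4].hex())   # 01000000: these bytes start like a Bitcoin transaction
print(hashlib.sha256(sartre).hexdigest())
print(hashlib.sha256(sartre).hexdigest() == CLAIMED_SHA256)
```

The leading 01000000 is already a hint that this is transaction data rather than prose by Sartre.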

I got this file from https://rya.nc/sartre.html, after spending an hour looking for the right tool. Transactions are verified using a script within the transaction itself. At some intermediate step, it transmogrifies the transaction into something else, then verifies it. It’s this transmogrified form of the transaction that we need to grab for the contents of the “Sartre” file.

Astro Pi: Coding Challenges Results!

Post Syndicated from David Honess original https://www.raspberrypi.org/blog/astro-pi-coding-challenges-results/

Back in early February we announced a new opportunity for young programmers to send their code up to the International Space Station to be used by British ESA Astronaut Tim Peake.

Two challenges were on offer. The first required you to write Python Sense HAT code to turn Ed and Izzy (the Astro Pi computers) into an MP3 player, so that Tim can plug in his headphones and listen to music. The second required you to code Sonic Pi music for Tim to listen to via the MP3 player.

The competition closed on March 31st and the judging took place at Pi Towers in Cambridge last week. With the assistance of Flat Tim!

The judges were selected from companies who have contributed to the Astro Pi mission so far. These were;

Orchestral Manoeuvres In the Dark (Andy McCluskey and Paul Humphreys)

We also wanted to have some judges to provide musical talent to balance the science and technology expertise from the aerospace people. Thanks to Carl Walker at ESA we were able to connect with synthpop giants OMD (Enola Gay, Electricity, Maid of Orleans) and British/French film composer Ilan Eshkeri (Stardust, Layer Cake, Shaun the Sheep).

Ilan Eshkeri working on the Stardust soundtrack

We also secured Sam Aaron, the author of Sonic Pi and Overtone, a live coder who regularly performs in clubs across the UK.

Sam Aaron at TEDx Newcastle

Entries were received from all over the UK and were judged across four age categories: 11 and under, 11 to 13, 14 to 16 and 17 to 18. So the outcome is that four MP3 players and four songs will be going up to the ISS for Tim to use. Note that the Sonic Pi tunes will be converted to MP3 so that the MP3 player programs can load and play the audio to Tim.

The judging took two days to complete: one full day for the MP3 players and one day for the Sonic Pi tunes. So without further ado, let’s see who the winners are!

MP3 Player Winners

11 and under

11 to 13

14 to 16

  • Winner: Joe Speers
  • School: n/a (Independent entry)
  • Teacher/Adult: Craig Speers
  • Code on Github

17 to 18

Sonic Pi Winners

11 and under

11 to 13

  • Winner: Isaac Ingram
  • School: Knox Academy
  • Teacher/Adult: Karl Ingram

14 to 16

17 to 18

Congratulations to you all. The judges had a lot of fun with your entries and they will very soon be uploaded to the International Space Station for Tim Peake. The Astro Pi Twitter account will post a tweet to indicate when Tim is listening to the music.

The Raspberry Pi Foundation would like to thank all the judges who contributed to this competition, and especially our special judges: Andy McCluskey and Paul Humphreys from OMD, Ilan Eshkeri and Sam Aaron.

The post Astro Pi: Coding Challenges Results! appeared first on Raspberry Pi.

Friday’s security advisories

Post Syndicated from ris original http://lwn.net/Articles/683139/rss

Debian has updated cgit (three
vulnerabilities), optipng (code execution),
and python-django (two vulnerabilities).

Fedora has updated libmaxminddb (F23; F22:
multiple vulnerabilities), mercurial (F23; F22:
three vulnerabilities), and python-rsa
(F22: unspecified).

Mageia has updated flash-player-plugin (multiple vulnerabilities).

openSUSE has updated clamav-database (Leap42.1: database refresh),
flash-player (13.2: code execution), and java-1_8_0-openjdk (13.2: sandbox bypass).

Red Hat has updated flash-plugin
(RHEL5,6: multiple vulnerabilities).

SUSE has updated flash-player
(SLE12-SP1: code execution).

Ubuntu has updated firefox
(regression in previous update).