Tag Archives: Frozen

Friday Squid Blogging: Eating Firefly Squid

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/04/friday_squid_bl_620.html

In Toyama, Japan, you can watch the firefly squid catch and eat the squid in various ways:

“It’s great to eat hotaruika around when the seasons change, which is when people tend to get sick,” said Ryoji Tanaka, an executive at the Toyama prefectural federation of fishing cooperatives. “In addition to popular cooking methods, such as boiling them in salted water, you can also add them to pasta or pizza.”

Now there is a new addition: eating hotaruika raw as sashimi. However, due to reports that parasites have been found in their internal organs, the Health, Labor and Welfare Ministry recommends eating the squid after its internal organs have been removed, or after it has been frozen for at least four days at minus 30 C or lower.

As usual, you can also use this squid post to talk about the security stories in the news that I haven’t covered.

Read my blog posting guidelines here.

What is HAMR and How Does It Enable the High-Capacity Needs of the Future?

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/hamr-hard-drives/

HAMR drive illustration

During Q4, Backblaze deployed 100 petabytes worth of Seagate hard drives to our data centers. The newly deployed Seagate 10 and 12 TB drives are doing well and will help us meet our near term storage needs, but we know we’re going to need more drives — with higher capacities. That’s why the success of new hard drive technologies like Heat-Assisted Magnetic Recording (HAMR) from Seagate is very relevant to us here at Backblaze and to the storage industry in general. In today’s guest post we are pleased to have Mark Re, CTO at Seagate, give us an insider’s look behind the hard drive curtain to tell us how Seagate engineers are developing the HAMR technology and making it market ready starting in late 2018.

Guest Blog Post by Mark Re, Seagate Senior Vice President and Chief Technology Officer

Earlier this year Seagate announced plans to make the first hard drives using Heat-Assisted Magnetic Recording, or HAMR, available by the end of 2018 in pilot volumes. Even as today’s market has embraced 10TB+ drives, the need for 20TB+ drives remains imperative in the relative near term. HAMR is the Seagate research team’s next major advance in hard drive technology.

HAMR is a technology that over time will enable a big increase in the amount of data that can be stored on a disk. A small laser is attached to a recording head, designed to heat a tiny spot on the disk where the data will be written. This allows a smaller bit cell to be written as either a 0 or a 1. The smaller bit cell size enables more bits to be crammed into a given surface area — increasing the areal density of data, and increasing drive capacity.

It sounds almost simple, but the science and engineering expertise, research, experimentation, lab development and product development required to perfect this technology have been enormous. Below is an overview of the HAMR technology; you can dig into the details in our technical brief, which provides a point-by-point rundown of several key advances enabling the HAMR design.

As much time and resources as have been committed to developing HAMR, the need for its increased data density is indisputable. Demand for data storage keeps increasing. Businesses’ ability to manage and leverage more capacity is a competitive necessity, and IT spending on capacity continues to increase.

History of Increasing Storage Capacity

For the last 50 years areal density in the hard disk drive has been growing faster than Moore’s law, which is a very good thing. After all, customers from data centers and cloud service providers to creative professionals and game enthusiasts rarely go shopping looking for a hard drive just like the one they bought two years ago. The demands that growing data places on storage capacity inevitably increase, so the technology constantly evolves.

According to the Advanced Storage Technology Consortium, HAMR will be the next significant storage technology innovation to increase the amount of storage in the area available to store data, also called the disk’s “areal density.” We believe this boost in areal density will help fuel hard drive product development and growth through the next decade.

Why do we Need to Develop Higher-Capacity Hard Drives? Can’t Current Technologies do the Job?

Why is HAMR’s increased data density so important?

Data has become critical to all aspects of human life, changing how we’re educated and entertained. It affects and informs the ways we experience each other and interact with businesses and the wider world. IDC research shows the datasphere — all the data generated by the world’s businesses and billions of consumer endpoints — will continue to double in size every two years. IDC forecasts that by 2025 the global datasphere will grow to 163 zettabytes (a zettabyte is a trillion gigabytes). That’s ten times the 16.1 ZB of data generated in 2016. IDC cites five key trends intensifying the role of data in changing our world: embedded systems and the Internet of Things (IoT), instantly available mobile and real-time data, cognitive artificial intelligence (AI) systems, increased security data requirements, and critically, the evolution of data from playing a business background role to playing a life-critical one.
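As a quick check of the figures quoted above, the arithmetic can be sketched in a few lines of Python. Only the 16.1 ZB and 163 ZB endpoints come from the text; the compound growth rate and doubling time are derived here on the assumption of smooth exponential growth between 2016 and 2025, which is a simplification and not an IDC figure.

```python
import math

# Decimal SI prefixes: 1 ZB = 10^21 bytes, 1 GB = 10^9 bytes.
GB_PER_ZB = 10**21 // 10**9          # a trillion gigabytes per zettabyte

datasphere_2016_zb = 16.1            # figure quoted in the text
datasphere_2025_zb = 163.0           # IDC forecast quoted in the text

# Overall growth between the two endpoints (about tenfold).
growth_factor = datasphere_2025_zb / datasphere_2016_zb

# Assumption: smooth exponential growth over the nine years in between.
years = 2025 - 2016
annual_rate = growth_factor ** (1 / years) - 1
doubling_years = math.log(2) / math.log(1 + annual_rate)

print(f"1 ZB = {GB_PER_ZB:,} GB")
print(f"growth 2016 -> 2025: {growth_factor:.1f}x")
print(f"implied annual growth: {annual_rate:.1%}")
print(f"implied doubling time: {doubling_years:.1f} years")
```

Under that assumption the two endpoints imply a doubling time somewhat longer than two years, so "doubling every two years" is best read as a rule of thumb.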

Consumers use the cloud to manage everything from family photos and videos to data about their health and exercise routines. Real-time data created by connected devices — everything from Fitbit, Alexa and smart phones to home security systems, solar systems and autonomous cars — are fueling the emerging Data Age. On top of the obvious business and consumer data growth, our critical infrastructure like power grids, water systems, hospitals, road infrastructure and public transportation all demand and add to the growth of real-time data. Data is now a vital element in the smooth operation of all aspects of daily life.

All of this entails a significant infrastructure cost behind the scenes with the insatiable, global appetite for data storage. While a variety of storage technologies will continue to advance in data density (Seagate announced the first 60TB 3.5-inch SSD unit for example), high-capacity hard drives serve as the primary foundational core of our interconnected, cloud and IoT-based dependence on data.

HAMR Hard Drive Technology

Seagate has been working on heat assisted magnetic recording (HAMR) in one form or another since the late 1990s. During this time we’ve made many breakthroughs in making reliable near field transducers, special high capacity HAMR media, and figuring out a way to put a laser on each and every head that is no larger than a grain of salt.

The development of HAMR has required Seagate to consider and overcome a myriad of scientific and technical challenges including new kinds of magnetic media, nano-plasmonic device design and fabrication, laser integration, high-temperature head-disk interactions, and thermal regulation.

A typical hard drive inside any computer or server contains one or more rigid disks coated with a magnetically sensitive film consisting of tiny magnetic grains. Data is recorded when a magnetic write-head flies just above the spinning disk; the write head rapidly flips the magnetization of one magnetic region of grains so that its magnetic pole points up or down, to encode a 1 or a 0 in binary code.

Increasing the amount of data you can store on a disk requires cramming magnetic regions closer together, which means the grains need to be smaller so they won’t interfere with each other.

Heat Assisted Magnetic Recording (HAMR) is the next step to enable us to increase the density of grains — or bit density. Current projections are that HAMR can achieve 5 Tbpsi (Terabits per square inch) on conventional HAMR media, and in the future will be able to achieve 10 Tbpsi or higher with bit patterned media (in which discrete dots are predefined on the media in regular, efficient, very dense patterns). These technologies will enable hard drives with capacities higher than 100 TB before 2030.
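To put the areal density figures in more familiar units, here is a minimal sketch that converts terabits per square inch into bytes per square inch, and estimates how much recording surface a given raw capacity would need. It is pure unit arithmetic: it ignores formatting, error correction and any other overhead, and the function names are made up for this illustration.

```python
BITS_PER_BYTE = 8
TERA = 10**12

def bytes_per_square_inch(tbpsi):
    """Raw bytes stored per square inch at a given areal density (Tbpsi)."""
    return tbpsi * TERA / BITS_PER_BYTE

def surface_needed_sq_in(capacity_tb, tbpsi):
    """Recording surface (square inches) needed for a raw capacity in TB."""
    return capacity_tb * TERA * BITS_PER_BYTE / (tbpsi * TERA)

# 5 Tbpsi expressed as gigabytes per square inch.
gb_per_sq_in_at_5 = bytes_per_square_inch(5) / 10**9

# Total surface a 100 TB drive would need at 10 Tbpsi, across all platter sides.
area_100tb_at_10 = surface_needed_sq_in(100, 10)

print(gb_per_sq_in_at_5)   # 625.0
print(area_100tb_at_10)    # 80.0
```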

The major problem with packing bits so closely together is that if you do that on conventional magnetic media, the bits (and the data they represent) become thermally unstable, and may flip. So, to make the grains maintain their stability — their ability to store bits over a long period of time — we need to develop a recording media that has higher coercivity. That means it’s magnetically more stable during storage, but it is more difficult to change the magnetic characteristics of the media when writing (harder to flip a grain from a 0 to a 1 or vice versa).

That’s why HAMR’s first key hardware advance required developing a new recording media that keeps bits stable — using high anisotropy (or “hard”) magnetic materials such as iron-platinum alloy (FePt), which resist magnetic change at normal temperatures. Over years of HAMR development, Seagate researchers have tested and proven out a variety of FePt granular media films, with varying alloy composition and chemical ordering.

In fact the new media is so “hard” that conventional recording heads won’t be able to flip the bits, or write new data, under normal temperatures. If you add heat to the tiny spot on which you want to write data, you can make the media’s coercive field lower than the magnetic field provided by the recording head — in other words, enable the write head to flip that bit.

So, a challenge with HAMR has been to replace conventional perpendicular magnetic recording (PMR), in which the write head operates at room temperature, with a write technology that heats the thin film recording medium on the disk platter to temperatures above 400 °C. The basic principle is to heat a tiny region of several magnetic grains for a very short time (~1 nanosecond) to a temperature high enough to make the media’s coercive field lower than the write head’s magnetic field. Immediately after the heat pulse, the region quickly cools down and the bit’s magnetic orientation is frozen in place.
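The write condition described above (the media's coercive field must drop below the write head's field) can be illustrated with a toy model. Everything numeric here is an assumption chosen for illustration: the linear fall-off of coercivity with temperature, the 50 kOe room-temperature coercivity, the 750 K Curie temperature and the 10 kOe head field are placeholders, not Seagate data.

```python
def coercive_field_koe(temp_k, hc0_koe=50.0, curie_k=750.0):
    """Toy linear model: coercivity falls to zero at the Curie temperature.

    hc0_koe and curie_k are illustrative placeholders, not measured values.
    """
    if temp_k >= curie_k:
        return 0.0
    return hc0_koe * (1.0 - temp_k / curie_k)

def can_write(temp_k, head_field_koe=10.0):
    """A bit can be flipped only if the head field exceeds the coercive field."""
    return coercive_field_koe(temp_k) < head_field_koe

ROOM_K = 20.0 + 273.15    # roughly room temperature
HOT_K = 400.0 + 273.15    # the 400 °C figure mentioned in the text

print(can_write(ROOM_K))  # False: the cold media is too "hard" to write
print(can_write(HOT_K))   # True: heating lowers coercivity below the head field
```

Once the spot cools, the coercive field rises back above the head field, which is the sense in which the bit is frozen in place.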

Applying this dynamic nano-heating is where HAMR’s famous “laser” comes in. A plasmonic near-field transducer (NFT) has been integrated into the recording head, to heat the media and enable magnetic change at a specific point. Plasmonic NFTs are used to focus and confine light energy to regions smaller than the wavelength of light. This enables us to heat an extremely small region, measured in nanometers, on the disk media to reduce its magnetic coercivity.

Moving HAMR Forward

HAMR write head

As always in advanced engineering, the devil — or many devils — is in the details. As noted earlier, our technical brief provides a point-by-point short illustrated summary of HAMR’s key changes.

Although hard work remains, we believe this technology is nearly ready for commercialization. Seagate has the best engineers in the world working towards a goal of a 20 Terabyte drive by 2019. We hope we’ve given you a glimpse into the amount of engineering that goes into a hard drive. Keeping up with the world’s insatiable appetite to create, capture, store, secure, manage, analyze, rapidly access and share data is a challenge we work on every day.

With thousands of HAMR drives already being made in our manufacturing facilities, our internal and external supply chain is solidly in place, and volume manufacturing tools are online. This year we began shipping initial units for customer tests, and production units will ship to key customers by the end of 2018. Prepare for breakthrough capacities.

The post What is HAMR and How Does It Enable the High-Capacity Needs of the Future? appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Vulnerability in Amazon Key

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/11/vulnerability_i.html

Amazon Key is an IoT door lock that can enable one-time access codes for delivery people. To further secure that system, Amazon sells Cloud Cam, a camera that watches the door to ensure that delivery people don’t abuse their one-time access privilege.

Cloud Cam has been hacked:

But now security researchers have demonstrated that with a simple program run from any computer in Wi-Fi range, that camera can be not only disabled but frozen. A viewer watching its live or recorded stream sees only a closed door, even as their actual door is opened and someone slips inside. That attack would potentially enable rogue delivery people to stealthily steal from Amazon customers, or otherwise invade their inner sanctum.

And while the threat of a camera-hacking courier seems an unlikely way for your house to be burgled, the researchers argue it potentially strips away a key safeguard in Amazon’s security system.

Amazon is patching the system.

Some memorable levels

Post Syndicated from Eevee original https://eev.ee/blog/2017/07/01/some-memorable-levels/

Another Patreon request from Nova Dasterin:

Maybe something about level design. In relation to a vertical shmup since I’m working on one of those.

I’ve been thinking about level design a lot lately, seeing as how I’ve started… designing levels. Shmups are probably the genre I’m the worst at, but perhaps some general principles will apply universally.

And speaking of general principles, that’s something I’ve been thinking about too.

I’ve been struggling to create a more expansive tileset for a platformer, due to two general problems: figuring out what I want to show, and figuring out how to show it with a limited size and palette. I’ve been browsing through a lot of pixel art from games I remember fondly in the hopes of finding some inspiration, but so far all I’ve done is very nearly copy a dirt tile someone submitted to my potluck project.

Recently I realized that I might have been going about looking for inspiration all wrong. I’ve been sifting through stuff in the hopes of finding something that would create some flash of enlightenment, but so far that aimless tourism has only found me a thing or two to copy.

I don’t want to copy a small chunk of the final product; I want to understand the underlying ideas that led the artist to create what they did in the first place. Or, no, that’s not quite right either. I don’t want someone else’s ideas; I want to identify what I like, figure out why I like it, and turn that into some kind of general design idea. Find the underlying themes that appeal to me and figure out some principles that I could apply. You know, examine stuff critically.

I haven’t had time to take a deeper look at pixel art this way, so I’ll try it right now with level design. Here, then, are some levels from various games that stand out to me for whatever reason; the feelings they evoke when I think about them; and my best effort at unearthing some design principles from those feelings.

Doom II: MAP10, Refueling Base

Opening view of Refueling Base, showing a descent down some stairs into a room not yet visible

screenshots mine — map via doom wiki — see also textured perspective map (warning: large!) via ian albert — pistol start playthrough

I’m surprising myself by picking Refueling Base. I would’ve expected myself to pick MAP08, Tricks and Traps, for its collection of uniquely bizarre puzzles and mechanisms. Or MAP13, Downtown, the map that had me convinced (erroneously) that Doom levels supported multi-story structures. Or at least MAP09, The Pit, which stands out for the unique way it feels like a plunge into enemy territory.

(Curiously, those other three maps are all Sandy Petersen’s sole work. Refueling Base was started by Tom Hall in the original Doom days, then finished by Sandy for Doom II.)

But Refueling Base is the level I have the most visceral reaction to: it terrifies me.

See, I got into Doom II through my dad, who played it on and off sometimes. My dad wasn’t an expert gamer or anything, but as a ten-year-old, I assumed he was. I watched him play Refueling Base one night. He died. Again, and again, over and over. I don’t even have very strong memories of his particular attempts, but watching my parent be swiftly and repeatedly defeated — at a time when I still somewhat revered parents — left enough of an impression that hearing the level music still makes my skin crawl.

This may seem strange to bring up as a first example in a post about level design, but I don’t think it would have impressed on me quite so much if the level weren’t designed the way it is. (It’s just a video game, of course, and since then I’ve successfully beaten it from a pistol start myself. But wow, little kid fears sure do linger.)

Map of Refueling Base, showing multiple large rooms and numerous connections between them

The one thing that most defines the map has to be its interconnected layout. Almost every major area (of which there are at least half a dozen) has at least three exits. Not only are you rarely faced with a dead end, but you’ll almost always have a choice of where to go next, and that choice will lead into more choices.

This hugely informs the early combat. Many areas near the beginning are simply adjacent with no doors between them, so it’s easy for monsters to start swarming in from all directions. It’s very easy to feel overwhelmed by an endless horde; no matter where you run, they just seem to keep coming. (In fact, Refueling Base has the most monsters of any map in the game by far: 279. The runner-up is the preceding map at 238.) Compounding this effect is the relatively scant ammo and health in the early parts of the map; getting very far from a pistol start is an uphill battle.

The connections between rooms also yield numerous possible routes through the map, as well as several possible ways to approach any given room. Some of the connections are secrets, which usually connect the “backs” of two rooms. Clearing out one room thus rewards you with a sneaky way into another room that puts you behind all the monsters.

Outdoor area shown from the back; a large number of monsters are lying in wait

In fact, the map rewards you for exploring it in general.

Well, okay. It might be more accurate to say that the map punishes you for not exploring it. From a pistol start, the map is surprisingly difficult — the early areas offer rather little health and ammo, and your best chance of success is a very specific route that collects weapons as quickly as possible. Many of the most precious items are squirrelled away in (numerous!) secrets, and you’ll have an especially tough time if you don’t find any of them — though they tend to be telegraphed.

One particularly nasty surprise is in the area shown above, which has three small exits at the back. Entering or leaving via any of those exits will open one of the capsule-shaped pillars, revealing even more monsters. A couple of those are pain elementals, monsters which attack by spawning another monster and shooting it at you — not something you want to be facing with the starting pistol.

But nothing about the level indicates this, so you have to make the association the hard way, probably after making several mad dashes looking for cover. My successful attempt avoided this whole area entirely until I’d found some more impressive firepower. It’s fascinating to me, because it’s a fairly unique effect that doesn’t make any kind of realistic sense, yet it’s still built out of familiar level mechanics: walk through an area and something opens up. Almost like 2D sidescroller design logic applied to a 3D space. I really like it, and wish I saw more of it. So maybe that’s a more interesting design idea: don’t be afraid to do something weird only once, as long as it’s built out of familiar pieces so the player has a chance to make sense of it.

A similarly oddball effect is hidden in a “barracks” area, visible on the far right of the map. A secret door leads to a short U-shaped hallway to a marble skull door, which is themed nothing like the rest of the room. Opening it seems to lead back into the room you were just in, but walking through the doorway teleports you to a back entrance to the boss fight at the end of the level.

It sounds so bizarre, but the telegraphing makes it seem very natural; if anything, the “oh, I get it!” moment overrides the weirdness. It stops being something random and becomes something consciously designed. I believe that this might have been built by someone, even if there’s no sensible reason to have built it.

In fact, that single weird teleporter is exactly the kind of thing I’d like to be better at building. It could’ve been just a plain teleporter pad, but instead it’s a strange thing that adds a lot of texture to the level and makes it much more memorable. I don’t know how to even begin to have ideas like that. Maybe it’s as simple as looking at mundane parts of a level and wondering: what could I do with this instead?

I think a big problem I have is limiting myself to the expected and sensible, to the point that I don’t even consider more outlandish ideas. I can’t shake that habit simply by bolding some text in a blog post, but maybe it would help to keep this in mind: you can probably get away with anything, as long as you justify it somehow. Even “justify” here is too strong a word; it takes only the slightest nod to make an arbitrary behavior feel like part of a world. Why does picking up a tiny glowing knight helmet give you 1% armor in Doom? Does anyone care? Have you even thought about it before? It’s green and looks like armor; the bigger armor pickup is also green; yep, checks out.

A dark and dingy concrete room full of monsters; a couple are standing under light fixtures

On the other hand, the map as a whole ends up feeling very disorienting. There’s no shortage of landmarks, but every space is distinct in both texture and shape, so everything feels like a landmark. No one part of the map feels particularly central; there are a few candidates, but they neighbor other equally grand areas with just as many exits. It’s hard to get truly lost, but it’s also hard to feel like you have a solid grasp of where everything is. The space itself doesn’t make much sense, even though small chunks of it do. Of course, given that the Hellish parts of Doom were all just very weird overall, this is pretty fitting.

This sort of design fascinates me, because the way it feels to play is so different from the way it looks as a mapper with God Vision. Looking at the overhead map, I can identify all the familiar places easily enough, but I don’t know how to feel the way the map feels to play; it just looks like some rooms with doors between them. Yet I can see screenshots and have a sense of how “deep” in the level they are, how difficult they are to reach, whether I want to visit or avoid them. The lesson here might be that most of the interesting flavor of the map isn’t actually contained within the overhead view; it’s in the use of height and texture and interaction.

Dark room with numerous alcoves in the walls, all of them containing a hitscan monster

I realize as I describe all of this that I’m really just describing different kinds of contrast. If I know one thing about creative work (and I do, I only know one thing), it’s that effectively managing contrast is super duper important.

And it appears here in spades! A brightly-lit, outdoor, wide-open round room is only a short jog away from a dark, cramped room full of right angles and alcoves. A wide straight hallway near the beginning is directly across from a short, curvy, organic hallway. Most of the monsters in the map are small fry, but a couple stronger critters are sprinkled here and there, and then the exit is guarded by the toughest monster in the game. Some of the connections between rooms are simple doors; others are bizarre secret corridors or unnatural twisty passages.

You could even argue that the map has too much contrast, that it starts to lose cohesion. But if anything, I think this is one of the more cohesive maps in the first third of the game; many of the earlier maps aren’t so much places as they are concepts. This one feels distinctly like it could be something. The theming is all over the place, but enough of the parts seem deliberate.

I hadn’t even thought about it until I sat down to write this post, but since this is a “refueling base”, I suppose those outdoor capsules (which contain green slime, inset into the floor) could be the fuel tanks! I already referred to that dark techy area as “barracks”. Elsewhere is a rather large barren room, which might be where the vehicles in need of refueling are parked? Or is this just my imagination, and none of it was intended this way?

It doesn’t really matter either way, because even in this abstract world of ambiguity and vague hints, all of those rooms still feel like a place. I don’t have to know what the place is for it to look internally consistent.

I’m hesitant to say every game should have the loose design sense of Doom II, but it might be worth keeping in mind that anything can be a believable world as long as it looks consciously designed. And I’d say this applies even for natural spaces — we frequently treat real-world nature as though it were “designed”, just with a different aesthetic sense.

Okay, okay. I’m sure I could clumsily ramble about Doom forever, but I do that enough as it is. Other people have plenty to say if you’re interested.

I do want to stick in one final comment about MAP13, Downtown, while I’m talking about theming. I’ve seen a few people rag on it for being “just a box” with a lot of ideas sprinkled around — the map is basically a grid of skyscrapers, where each building has a different little mini encounter inside. And I think that’s really cool, because those encounters are arranged in a way that very strongly reinforces the theme of the level, of what this place is supposed to be. It doesn’t play quite like anything else in the game, simply because it was designed around a shape for flavor reasons. Weird physical constraints can do interesting things to level design.

Braid: World 4-7, Fickle Companion

Simple-looking platformer level with a few ladders, a switch, and a locked door

screenshots via StrategyWiki — playthrough — playthrough of secret area

I love Braid. If you’re not familiar (!), it’s a platformer where you have the ability to rewind time — whenever you want, for as long as you want, all the way back to when you entered the level.

The game starts in world 2, where you do fairly standard platforming and use the rewind ability to do some finicky jumps with minimal frustration. It gets more interesting in world 3 with the addition of glowing green objects, which aren’t affected by the reversal of time.

And then there’s world 4, “Time and Place”. I love world 4, so much. It’s unlike anything I’ve ever seen in any other game, and it’s so simple yet so clever.

The premise is this: for everything except you, time moves forwards as you move right, and backwards as you move left.

This has some weird implications, which all come together in the final level of the world, Fickle Companion. It’s so named because you have to use one (single-use) key to open three doors, but that key is very easy to lose.

Say you pick up the key and walk to the right with it. Time continues forwards for the key, so it stays with you as expected. Now you climb a ladder. Time is frozen since you aren’t moving horizontally, but the key stays with you anyway. Now you walk to the left. Oops — the key follows its own path backwards in time, going down the ladder and back along the path you carried it in the first place. You can’t fix this by walking to the right again, because that will simply advance time normally for the key; since you’re no longer holding it, it will simply fall to the ground and stay there.
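The key-losing scenario above is easier to follow as a tiny simulation. This is only a sketch of the rule as I've described it, not how Braid actually implements it: the class name, the grid coordinates, and the decision to index the key's history by the player's x position are all assumptions made for illustration.

```python
class TimeAndPlace:
    """Toy model of Braid's world 4: world time is tied to the player's x."""

    def __init__(self):
        self.px, self.py = 0, 0        # player position on a simple grid
        self.holding = True            # the player starts out carrying the key
        self.key = (0, 0)              # current key position
        self.history = {0: (0, 0)}     # world time (== player x) -> key position

    def _update_key(self):
        if self.holding:
            self.key = (self.px, self.py)   # a held key travels with the player
        else:
            self.key = (self.key[0], 0)     # an unheld key rests on the ground
        self.history[self.px] = self.key

    def move_right(self):
        self.px += 1                   # moving right: time runs forwards
        self._update_key()

    def climb(self, dy):
        self.py += dy                  # no horizontal motion: time is frozen
        self._update_key()

    def move_left(self):
        if self.px == 0:
            return                     # left edge of this toy level
        self.px -= 1                   # moving left: time runs backwards
        self.key = self.history[self.px]    # the key replays its own past
        if self.key != (self.px, self.py):
            self.holding = False       # the key has left the player's hands


world = TimeAndPlace()
for _ in range(3):
    world.move_right()                 # carry the key to the ladder
world.climb(2)                         # climb; the key comes along
key_on_ladder = world.key
world.move_left()                      # oops: the key rewinds down to the ground
key_after_left = world.key
world.move_right()                     # time advances, but the key stays put
key_after_right = world.key

print(key_on_ladder, key_after_left, key_after_right, world.holding)
```

Walking right, climbing, then stepping left leaves the key back on the ground where it was one step earlier, and walking right again doesn't bring it back.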

You can see how this might be a problem in the screenshot above (where you get the key earlier in the level, to the left). You can climb the first ladder, but to get to the door, you have to walk left to get to the second ladder, which will reverse the key back down to the ground.

The solution is in the cannon in the upper right, which spits out a Goomba-like critter. It has the timeproof green glow, so the critters it spits out have the same green glow — making them immune to both your time reversal power and to the effect your movement has on time. What you have to do is get one of the critters to pick up the key and carry it leftwards for you. Once you have the puzzle piece, you have to rewind time and do it again elsewhere. (Or, more likely, the other way around; this next section acts as a decent hint for how to do the earlier section.)

A puzzle piece trapped behind two doors, in a level containing only one key

It’s hard to convey how bizarre this is in just text. If you haven’t played Braid, it’s absolutely worth it just for this one world, this one level.

And it gets even better, slash more ridiculous: there’s a super duper secret hidden very cleverly in this level. Reaching it involves bouncing twice off of critters; solving the puzzle hidden there involves bouncing the critters off of you. It’s ludicrous and perhaps a bit too tricky, but very clever. Best of all, it’s something that an enterprising player might just think to do on a whim — hey, this is possible here, I wonder what happens if I try it. And the game rewards the player for trying something creative! (Ironically, it’s most rewarding to have a clever idea when it turns out the designer already had the same idea.)

What can I take away from this? Hm.

Well, the underlying idea of linking time with position is pretty novel, but getting to it may not be all that hard: just combine different concepts and see what happens.

A similar principle is to apply a general concept to everything and see what happens. This is the first sighting of a timeproof wandering critter; previously timeproofing had only been seen on keys, doors, puzzle pieces, and stationary monsters. Later it even applies to Tim himself in special circumstances.

The use of timeproofing on puzzle pieces is especially interesting, because the puzzle pieces — despite being collectibles that animate moving into the UI when you get them — are also affected by time. If the pieces in this level weren’t timeproof, then as soon as you collected one and moved left to leave its alcove, time would move backwards and the puzzle piece would reverse out of the UI and right back into the world.

Along similar lines, the music and animated background are also subject to the flow of time. It’s obvious enough that the music plays backwards when you rewind time, but in world 4, the music only plays at all while you’re moving. It’s a fantastic effect that makes the whole world feel as weird and jerky as it really is under these rules. It drives the concept home instantly, and it makes your weird influence over time feel all the more significant and far-reaching. I love when games weave all the elements of the game into the gameplay like this, even (especially?) for the sake of a single oddball level.

Admittedly, this is all about gameplay or puzzle mechanics, not so much level design. What I like about the level itself is how simple and straightforward it is: it contains exactly as much as it needs to, yet still invites trying the wrong thing first, which immediately teaches the player why it won’t work. And it’s something that feels like it ought to work, except that the rules of the game get in the way just enough. This makes for my favorite kind of puzzle, the type where you feel like you’ve tried everything and it must be impossible — until you realize the creative combination of things you haven’t tried yet. I’m talking about puzzles again, oops; I guess the general level design equivalent of this is that players tend to try the first thing they see first, so if you put required parts later, players will be more likely to see optional parts.

I think that’s all I’ve got for this one puzzle room. I do want to say (again) that I love both endings of Braid. The normal ending weaves together the game mechanics and (admittedly loose) plot in a way that gave me chills when I first saw it; the secret ending completely changes both how the ending plays and how you might interpret the finale, all by making only the slightest changes to the level.

Portal: Testchamber 18 (advanced)

View into a Portal test chamber; the ceiling and most of the walls are covered in metal

screenshot mine; playthrough of normal map, playthrough of advanced map

I love Portal. I blazed through the game in a couple hours the night it came out. I’d seen the trailer and instantly grasped the concept, so the very slow and gentle learning curve was actually a bit frustrating for me; I just wanted to portal around a big playground, and I finally got to do that in the six “serious” tests towards the end, 13 through 18.

Valve threw an interesting curveball with these six maps. As well as being more complete puzzles by themselves, Valve added “challenges” requiring that they be done with as few portals, time, or steps as possible. I only bothered with the portal challenges — time and steps seemed less about puzzle-solving and more about twitchy reflexes — and within them I found buried an extra layer of puzzles. All of the minimum portal requirements were only possible if you found an alternative solution to the map: skipping part of it, making do with only one cube instead of two, etc. But Valve offered no hints, only a target number. It was a clever way to make me think harder about familiar areas.

Alongside the challenges were “advanced” maps, and these blew me away. They were six maps identical in layout to the last six test chambers, but with a simple added twist that completely changed how you had to approach them. Test 13 has two buttons with two boxes to place on them; the advanced version removes a box and also changes the floor to lava. Test 14 is a live fire course with turrets you have to knock over; the advanced version puts them all in impenetrable cages. Test 17 is based around making extensive use of a single cube; the advanced version changes it to a ball.

But the one that sticks out the most to me is test 18, a potpourri of everything you’ve learned so far. The beginning part has you cross several large pits of toxic sludge by portaling from the ceilings; the advanced version simply changes the ceilings to unportalable metal. It seems you’re completely stuck after only the first jump, unless you happen to catch a glimpse of the portalable floor you pass over in mid-flight. Or you might remember from the regular version of the map that the floor was portalable there, since you used it to progress further. Either way, you have to fire a portal in midair in a way you’ve never had to do before, and the result feels very cool, like you’ve defeated a puzzle that was intended to be unsolvable. All in a level that was fairly easy the first time around, and has been modified only slightly.

I’m not sure where I’m going with this. I could say it’s good to make the player feel clever, but that feels wishy-washy. What I really appreciated about the advanced tests is that they exploited inklings of ideas I’d started to have when playing through the regular game; they encouraged me to take the spark of inspiration this game mechanic gave me and run with it.

So I suppose the better underlying principle here — the most important principle in level design, in any creative work — is to latch onto what gets you fired up and run with it. I am absolutely certain that the level designers for this game loved the portal concept as much as I do, they explored it thoroughly, and they felt compelled to fit their wilder puzzle ideas in somehow.

More of that. Find the stuff that feels like it’s going to burst out of your head, and let it burst.

Chip’s Challenge: Level 122, Totally Fair and Level 131, Totally Unfair

A small maze containing a couple monsters and ending at a brown button

screenshots mine; full maps of both levels, playthrough of Totally Fair, playthrough of Totally Unfair

I mention this because Portal reminded me of it. The regular and advanced maps in Portal are reminiscent of parallel worlds or duality or whatever you want to call the theme. I extremely dig that theme, and it shows up in Chip’s Challenge in an unexpected way.

Totally Fair is a wide open level with a little maze walled off in one corner. The maze contains a monster called a “teeth”, which follows Chip at a slightly slower speed. (The second teeth, here shown facing upwards, starts outside the maze but followed me into it when I took this screenshot.)

The goal is to lure the teeth into standing on the brown button on the right side. If anything moves into a “trap” tile (the larger brown recesses at the bottom), it cannot move out of that tile until/unless something steps on the corresponding brown button. So there’s not much room for error in maneuvering the teeth; if it falls in the water up top, it’ll die, and if it touches the traps at the bottom, it’ll be stuck permanently.

The reason you need the brown button pressed is to acquire the chips on the far right edge of the level.

Several chips that cannot be obtained without stepping on a trap

The gray recesses turn into walls after being stepped on, so once you grab a chip, the only way out is through the force floors and ice that will send you onto the trap. If you haven’t maneuvered the teeth onto the button beforehand, you’ll be trapped there.

Doesn’t seem like a huge deal, since you can go see exactly how the maze is shaped and move the teeth into position fairly easily. But you see, here is the beginning of Totally Fair.

A wall with a single recessed gray space in it

The gray recess leads up into the maze area, so you can only enter it once. A force floor in the upper right lets you exit it.

Totally Unfair is exactly identical, except the second teeth has been removed, and the entrance to the maze looks like this.

The same wall is now completely solid, and the recess has been replaced with a hint

You can’t get into the maze area. You can’t even see the maze; it’s too far away from the wall. You have to position the teeth completely blind. In fact, if you take a single step to the left from here, you’ll have already dumped the teeth into the water and rendered the level impossible.

The hint tile will tell you to “Remember sjum”, where SJUM is the password to get back to Totally Fair. So you have to learn that level well enough to recreate the same effect without being able to see your progress.

It’s not impossible, and it’s not a “make a map” faux puzzle. A few scattered wall blocks near the chips, outside the maze area, are arranged exactly where the edges of the maze are. Once you notice that, all you have to do is walk up and down a few times, waiting a moment each time to make sure the teeth has caught up with you.

So in a sense, Totally Unfair is the advanced chamber version of Totally Fair. It makes a very minor change that forces you to approach the whole level completely differently, using knowledge gleaned from your first attempt.

And crucially, it’s an actual puzzle! A lot of later Chip’s Challenge levels rely heavily on map-drawing, timing, tedium, or outright luck. (Consider, if you will, Blobdance.) The Totally Fair + Totally Unfair pairing requires a little ingenuity unlike anything else in the game, and the solution is something more than just combinations of existing game mechanics. There’s something very interesting about that hint in the walls, a hint you’d have no reason to pick up on when playing through the first level. I wish I knew how to verbalize it better.

Anyway, enough puzzle games; let’s get back to regular ol’ level design.

Link’s Awakening: Eagle’s Tower

A 4×4 arrangement of rooms with a conspicuous void in the middle

maps via vgmaps and TCRF; playthrough with commentary

Link’s Awakening was my first Zelda (and only Zelda for a long time), which made for a slightly confusing introduction to the series — what on earth is a Zelda and why doesn’t it appear in the game?

The whole game is a blur of curiosities and interesting little special cases. It’s fabulously well put together, especially for a Game Boy game, and the dungeons in particular are fascinating microcosms of design. I never really appreciated it before, but looking at the full maps, I’m struck by how each dungeon has several large areas neatly sliced into individual screens.

Much like with Doom II, I surprise myself by picking Eagle’s Tower as the most notable part of the game. The dungeon isn’t that interesting within the overall context of the game; it gives you only the mirror shield, possibly the least interesting item in the game, second only to the power bracelet upgrade from the previous dungeon. The dungeon itself is fairly long, full of traps, and overflowing with crystal switches and toggle blocks, making it possibly the most frustrating of the set. Getting to it involves spending some excellent quality time with a flying rooster, but you don’t really do anything — mostly you just make your way through nondescript caves and mountaintops.

Having now thoroughly dunked on it, I’ll tell you what makes it stand out: the player changes the shape of the dungeon.

That’s something I like a lot about Doom, as well, but it’s much more dramatic in Eagle’s Tower. As you might expect, the dungeon is shaped like a tower, where each floor is on a 4×4 grid. The top floor, 4F, is a small 2×2 block of rooms in the middle — but one of those rooms is the boss door, and there’s no way to get to that floor.

(Well, sort of. The “down” stairs in the upper-right of 3F actually lead up to 4F, but the connection is bogus and puts you in a wall, and both of the upper middle rooms are unreachable during normal gameplay.)

The primary objective of the dungeon is to smash four support columns on 2F by throwing a huge iron ball at them, which causes 4F to crash down into the middle of 3F.

The same arrangement of rooms, but the four in the middle have changed

Even the map on the pause screen updates to reflect this. In every meaningful sense, you, the player, have fundamentally reconfigured the shape of this dungeon.

I love this. It feels like I have some impact on the world, that I came along and did something much more significant than mere game mechanics ought to allow. I saw that the tower was unsolvable as designed, so I fixed it.

It’s clear that the game engine supports rearranging screens arbitrarily (consider the Wind Fish’s Egg), but this is a wonderfully clever and subtle use of that. Let the player feel like they have an impact on the world.

The cutting room floor

This is getting excessively long so I’m gonna cut it here. Some other things I thought of but don’t know how to say more than a paragraph about:

  • Super Mario Land 2: Six Golden Coins has a lot of levels with completely unique themes, backed by very simple tilesets but enhanced by interesting one-off obstacles and enemies. I don’t even know how to pick a most interesting one. Maybe just play the game, or at least peruse the maps.

  • This post about density of detail in Team Fortress 2 is really good so just read that I guess. It’s really about careful balance of contrast again, but through the lens of using contrasting amounts of detail to draw the player’s attention, while still carrying a simple theme through less detailed areas.

  • Metroid Prime is pretty interesting in a lot of ways, but I mostly laugh at how they spaced rooms out with long twisty hallways to improve load times — yet I never really thought about it because they all feel like they belong in the game.

One thing I really appreciate is level design that hints at a story, that shows me a world that exists persistently, that convinces me this space exists for some reason other than as a gauntlet for me as a player. But it seems what comes first to my mind is level design that’s clever or quirky, which probably says a lot about me. Maybe the original Fallouts are a good place to look for that sort of detail.

Conversely, it sticks out like a sore thumb when a game tries to railroad me into experiencing the game As The Designer Intended. Games are interactive, so the more input the player can give, the better — and this can be as simple as deciding to avoid rather than confront enemies, or deciding to run rather than walk.

I think that’s all I’ve got in me at the moment. Clearly I need to meditate on this a lot more, but I hope some of this was inspiring in some way!

mkosi — A Tool for Generating OS Images

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/mkosi-a-tool-for-generating-os-images.html

Introducing mkosi

After blogging about casync I realized I never blogged about the
mkosi tool that combines nicely with it. mkosi has been around for a
while already, and it’s time to make it a bit better known. mkosi
stands for Make Operating System Image, and is a tool for precisely
that: generating an OS tree or image that can be booted.

Yes, there are many tools like mkosi, and a number of them are quite
well known and popular. But mkosi has a number of features that I
think make it interesting for a variety of use-cases that other tools
don’t cover that well.

What is mkosi?

What are those use-cases, and what precisely sets mkosi apart?
mkosi is definitely a tool with a focus on developers’ needs for
building OS images, for testing and debugging, but also for generating
production images with cryptographic protection. A typical use-case
would be to add a mkosi.default file to an existing project (for
example, one written in C or Python), thus making it easy to
generate an OS image for it. mkosi will put together the image with
development headers and tools, compile your code in it, run your test
suite, then throw away the image again, and build a new one, this time
without development headers and tools, and install your build
artifacts in it. This final image is then “production-ready”, and only
contains your built program and the minimal set of packages you
configured otherwise. Such an image could then be deployed with
casync (or any other tool of course) to be delivered to your set of
servers, or IoT devices or whatever you are building.

mkosi is supposed to be legacy-free: the focus is clearly on
today’s technology, not yesteryear’s. Specifically this means that
we’ll generate GPT partition tables, not MBR/DOS ones. When you tell
mkosi to generate a bootable image for you, it will make it bootable
on EFI, not on legacy BIOS. The GPT images generated follow
specifications such as the Discoverable Partitions Specification,
so that /etc/fstab can remain unpopulated and tools such as
systemd-nspawn can automatically dissect the image and boot from
it.

So, let’s have a look at the specific images it can generate:

  1. Raw GPT disk image, with ext4 as root
  2. Raw GPT disk image, with btrfs as root
  3. Raw GPT disk image, with a read-only squashfs as root
  4. A plain directory on disk containing the OS tree directly (this is useful for creating generic container images)
  5. A btrfs subvolume on disk, similar to the plain directory
  6. A tarball of a plain directory

When any of the GPT choices above are selected, a couple of additional
options are available:

  1. A swap partition may be added in
  2. The system may be made bootable on EFI systems
  3. Separate partitions for /home and /srv may be added in
  4. The root, /home and /srv partitions may be optionally encrypted with LUKS
  5. The root partition may be protected using dm-verity, thus making offline attacks on the generated system hard
  6. If the image is made bootable, the dm-verity root hash is automatically added to the kernel command line, and the kernel together with its initial RAM disk and the kernel command line is optionally cryptographically signed for UEFI SecureBoot

Note that mkosi is distribution-agnostic. It currently can build
images based on the following Linux distributions:

  1. Fedora
  2. Debian
  3. Ubuntu
  4. ArchLinux
  5. openSUSE

Note though that not all distributions are supported at the same
feature level currently. Also, as mkosi is based on dnf --installroot,
debootstrap, pacstrap and zypper, and those
packages are not packaged universally on all distributions, you might
not be able to build images for all those distributions on arbitrary
host distributions.

The GPT images are put together in a way that they aren’t just
compatible with UEFI systems, but also with VM and container managers
(that is, at least the smart ones, i.e. VM managers that know UEFI,
and container managers that grok GPT disk images) to a large
degree. In fact, the idea is that you can use mkosi to build a
single GPT image that may be used to:

  1. Boot on bare-metal boxes
  2. Boot in a VM
  3. Boot in a systemd-nspawn container
  4. Run a systemd service directly off the image, using systemd’s RootImage= unit file setting

Note that in all four cases the dm-verity data is automatically used
if available to ensure the image is not tampered with (yes, you read
that right, systemd-nspawn and systemd’s RootImage= setting
automatically do dm-verity these days if the image has it.)
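
As a rough sketch of the fourth case, a unit file along the following lines could run a service directly from such an image. RootImage= is the systemd setting named above; the unit name foobar.service, the image location and the binary path are made up for illustration, and the file is only written to the current directory, not installed.

```shell
# Write an illustrative unit file into the current directory.
# foobar.service, the image path and the ExecStart= binary are hypothetical.
cat > foobar.service <<'EOF'
[Unit]
Description=Example service running from a mkosi-built image

[Service]
# The service's root file system is taken from this disk image
RootImage=/var/lib/machines/foobar.raw
# This path is looked up inside the image, not on the host
ExecStart=/usr/bin/foobar
EOF
```

To try it for real, the file would have to be copied to /etc/systemd/system/ and started with systemctl, after adjusting both paths to your own image.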

Mode of Operation

The simplest usage of mkosi is by simply invoking it without
parameters (as root):

# mkosi

Without any configuration this will create a GPT disk image for you,
will call it image.raw and drop it in the current directory. The
distribution used will be the same one as your host runs.

Of course in most cases you want more control over how the image is
put together, i.e. select package sets, select the distribution, size
partitions and so on. Most of that you can actually specify on the
command line, but it is recommended to instead create a couple of
mkosi.$SOMETHING files and directories in some directory. Then,
simply change to that directory and run mkosi without any further
arguments. The tool will then look in the current working directory
for these files and directories and make use of them (similar to how
make looks for a Makefile…). Every single file/directory is
optional, but if they exist they are honored. Here’s a list of the
files/directories mkosi currently looks for:

  1. mkosi.default — This is the main configuration file, here you
    can configure what kind of image you want, which distribution, which
    packages and so on.

  2. mkosi.extra/ — If this directory exists, then mkosi will copy
    everything inside it into the images built. You can place arbitrary
    directory hierarchies in here, and they’ll be copied over whatever is
    already in the image, after it was put together by the distribution’s
    package manager. This is the best way to drop additional static files
    into the image, or override distribution-supplied ones.

  3. mkosi.build — This executable file is supposed to be a build
    script. When it exists, mkosi will build two images, one after the
    other in the mode already mentioned above: the first version is the
    build image, and may include various build-time dependencies such as
    a compiler or development headers. The build script is also copied
    into it, and then run inside it. The script should then build
    whatever shall be built and place the result in $DESTDIR (don’t
    worry, popular build tools such as Automake or Meson all honor
    $DESTDIR anyway, so there’s not much to do here explicitly). It may
    also run a test suite, or anything else you like. After the script
    has finished, the build image is removed again, and a second image (the
    final image) is built. This time, no development packages are
    included, and the build script is not copied into the image again —
    however, the build artifacts from the first run (i.e. those placed in
    $DESTDIR) are copied into the image.

  4. mkosi.postinst — If this executable script exists, it is invoked
    inside the image (inside a systemd-nspawn invocation) and can
    adjust the image as it likes at a very late point in the image
    preparation. If mkosi.build exists, i.e. the dual-phased
    development build process is used, then this script will be invoked
    twice: once inside the build image and once inside the final
    image. The first parameter passed to the script clarifies which phase
    it is run in.

  5. mkosi.nspawn — If this file exists, it should contain a
    container configuration file for systemd-nspawn (see
    systemd.nspawn(5)
    for details), which shall be shipped along with the final image and
    shall be included in the check-sum calculations (see below).

  6. mkosi.cache/ — If this directory exists, it is used as package
    cache directory for the builds. This directory is effectively bind
    mounted into the image at build time, in order to speed up building
    images. The package installers of the various distributions will
    place their package files here, so that subsequent runs can reuse
    them.

  7. mkosi.passphrase — If this file exists, it should contain a
    pass-phrase to use for the LUKS encryption (if that’s enabled for the
    image built). This file should not be readable to other users.

  8. mkosi.secure-boot.crt and mkosi.secure-boot.key should be an
    X.509 key pair to use for signing the kernel and initrd for UEFI
    SecureBoot, if that’s enabled.
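
To make the mkosi.postinst description a bit more concrete, here is a minimal sketch of such a script that branches on its first parameter. The phase names "build" and "final" are my assumption of what gets passed and should be checked against the mkosi version in use; the script only prints a message where a real one would adjust the image.

```shell
# Create a post-install hook that reports which phase it runs in.
# The phase names "build" and "final" are assumptions, not taken from the text.
cat > mkosi.postinst <<'EOF'
#!/bin/sh
set -e
phase="${1:-unknown}"
case "$phase" in
    build) echo "postinst: adjusting the build image" ;;
    final) echo "postinst: adjusting the final image" ;;
    *)     echo "postinst: unrecognized phase '$phase', doing nothing" ;;
esac
EOF
# mkosi expects the script to be executable
chmod +x mkosi.postinst
```

Falling through to a harmless default for unknown phase names keeps the script from breaking a build if the parameter turns out to differ.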

How to use it

So, let’s come back to our most trivial example, without any of the
mkosi.$SOMETHING files around:

# mkosi

As mentioned, this will create a build file image.raw in the current
directory. How do we use it? Of course, we could dd it onto some USB
stick and boot it on a bare-metal device. However, it’s much simpler
to first run it in a container for testing:

# systemd-nspawn -bi image.raw

And there you go: the image should boot up, and just work for you.

Now, let’s make things more interesting. Let’s still not use any of
the mkosi.$SOMETHING files around:

# mkosi -t raw_btrfs --bootable -o foobar.raw
# systemd-nspawn -bi foobar.raw

This is similar to the above, but we made three changes: it’s no
longer GPT + ext4, but GPT + btrfs. Moreover, the system is made
bootable on UEFI systems, and finally, the output is now called
foobar.raw.

Because this system is bootable on UEFI systems, we can run it in KVM:

qemu-kvm -m 512 -smp 2 -bios /usr/share/edk2/ovmf/OVMF_CODE.fd -drive format=raw,file=foobar.raw

This will look very similar to the systemd-nspawn invocation, except
that this uses full VM virtualization rather than container
virtualization. (Note that the way to run a UEFI qemu/kvm instance
appears to change all the time and is different on the various
distributions. It’s quite annoying, and I can’t really tell you what
the right qemu command line is to make this work on your system.)
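
Since the firmware location is the part that tends to differ, one way to cope is to probe a few candidate paths and build the command from whichever exists. This is only a sketch: the first path is the one used above, the other two are guesses for other distributions, and the helper name first_existing is invented here. The command is printed rather than executed.

```shell
# Print the first argument that names an existing, readable file.
first_existing() {
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidate firmware locations; only the first is taken from the text above,
# the others are guesses and may not match your distribution.
if ovmf=$(first_existing \
        /usr/share/edk2/ovmf/OVMF_CODE.fd \
        /usr/share/OVMF/OVMF_CODE.fd \
        /usr/share/ovmf/OVMF.fd); then
    echo "qemu-kvm -m 512 -smp 2 -bios $ovmf -drive format=raw,file=foobar.raw"
else
    echo "no OVMF firmware found in the candidate locations" >&2
fi
```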

Of course, it’s not all raw GPT disk images with mkosi. Let’s try
a plain directory image:

# mkosi -d fedora -t directory -o quux
# systemd-nspawn -bD quux

Of course, if you generate the image as plain directory you can’t boot
it on bare-metal just like that, nor run it in a VM.

A more complex command line is the following:

# mkosi -d fedora -t raw_squashfs --checksum --xz --package=openssh-clients --package=emacs

In this mode we explicitly pick Fedora as the distribution to use, ask
mkosi to generate a compressed GPT image with a root squashfs,
compress the result with xz, and generate a SHA256SUMS file with
the hashes of the generated artifacts. The image will contain the
SSH client as well as everybody’s favorite editor.
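
A SHA256SUMS file of this kind can be checked with the sha256sum tool from coreutils. The sketch below has to run without mkosi, so it fabricates a small stand-in file and its checksum list in a scratch directory; the file name image.raw.xz is an assumption, not something mkosi is confirmed to produce. With real output only the final check would be needed.

```shell
# Work in a scratch directory so no real build output is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Stand-ins for what mkosi would write; file name and contents are invented.
printf 'placeholder image data\n' > image.raw.xz
sha256sum image.raw.xz > SHA256SUMS

# The actual check: every file listed in SHA256SUMS is re-hashed and compared.
result=$(sha256sum --check SHA256SUMS)
echo "$result"
```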

Now, let’s make use of the various mkosi.$SOMETHING files. Let’s
say we are working on some Automake-based project and want to make it
easy to generate a disk image off the development tree with the
version you are hacking on. Create a configuration file:

# cat > mkosi.default <<EOF
[Distribution]
Distribution=fedora
Release=24

[Output]
Format=raw_btrfs
Bootable=yes

[Packages]
# The packages to appear in both the build and the final image
Packages=openssh-clients httpd
# The packages to appear in the build image, but absent from the final image
BuildPackages=make gcc libcurl-devel
EOF

And let’s add a build script:

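# cat > mkosi.build <<'EOF'
#!/bin/sh
# Stop at the first failing step
set -e
./autogen.sh
./configure --prefix=/usr
# The quoted 'EOF' keeps nproc from being evaluated while writing this file
make -j "$(nproc)"
# Automake's install target honors $DESTDIR
make install
EOF
# chmod +x mkosi.build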

And with all that in place we can now build our project into a disk image, simply by typing:

# mkosi

Let’s try it out:

# systemd-nspawn -bi image.raw

Of course, if you do this you’ll notice that building an image like
this can be quite slow. And slow build times are actively hurtful to
your productivity as a developer. Hence let’s make things a bit
faster. First, let’s make use of a package cache shared between runs:

# mkdir mkosi.cache

Building images now should already be substantially faster (and
generate less network traffic) as the packages will now be downloaded
only once and reused. However, you’ll notice that unpacking all those
packages and the rest of the work is still quite slow. But mkosi can
help you with that. Simply use mkosi’s incremental build feature. In
this mode mkosi will make a copy of the build and final images
immediately before dropping in your build sources or artifacts, so
that building an image becomes a lot quicker: instead of always
starting totally from scratch a build will now reuse everything it can
reuse from a previous run, and immediately begin with building your
sources rather than the build image to build your sources in. To
enable the incremental build feature use -i:

# mkosi -i

Note that if you use this option, the package list is not updated
anymore from your distribution’s servers, as the cached copy is made
after all packages are installed, and hence until you actually delete
the cached copy the distribution’s network servers aren’t contacted
again and no RPMs or DEBs are downloaded. This means the distribution
you use becomes “frozen in time” this way. (Which might be a bad
thing, but also a good thing, as it makes things kinda reproducible.)

Of course, if you run mkosi a couple of times you’ll notice that it
won’t overwrite the generated image when it already exists. You can
either delete the file yourself first (rm image.raw) or let mkosi
do it for you right before building a new image, with mkosi -f. You
can also tell mkosi to not only remove any such pre-existing images,
but also remove any cached copies of the incremental feature, by using
-f twice.

I wrote mkosi originally in order to test systemd, and quickly
generate a disk image of various distributions with the most current
systemd version from git, without all that affecting my host system. I
regularly use mkosi for that today, in incremental mode. The two
commands I use most in that context are:

# mkosi -if && systemd-nspawn -bi image.raw

And sometimes:

# mkosi -iff && systemd-nspawn -bi image.raw

The latter I use only if I want to regenerate everything based on the
very newest set of RPMs provided by Fedora, instead of a cached
snapshot of it.

BTW, the mkosi files for systemd are included in the systemd git
tree: mkosi.default and mkosi.build. This way, any developer who wants
to quickly test something with current systemd git, or wants to
prepare a patch based on it and test it can check out the systemd
repository and simply run mkosi in it and a few minutes later he has
a bootable image he can test in systemd-nspawn or KVM. casync has
similar files: mkosi.default, mkosi.build.

Random Interesting Features

  1. As mentioned already, mkosi will generate dm-verity enabled
    disk images if you ask for it. For that use the --verity switch on
    the command line or Verity= setting in mkosi.default. Of course,
    dm-verity implies that the root volume is read-only. In this mode
    the top-level dm-verity hash will be placed along-side the output
    disk image in a file named the same way, but with the .roothash
    suffix. If the image is to be created bootable, the root hash is also
    included on the kernel command line in the roothash= parameter,
    which current systemd versions can use to both find and activate the
    root partition in a dm-verity protected way. BTW: it’s a good idea
    to combine this dm-verity mode with the raw_squashfs image mode,
    to generate a genuinely protected, compressed image suitable for
    running in your IoT device.

  2. As indicated above, mkosi can automatically create a check-sum
    file SHA256SUMS for you (--checksum) covering all the files it
    outputs (which could be the image file itself, a matching .nspawn
    file using the mkosi.nspawn file mentioned above, as well as the
    .roothash file for the dm-verity root hash.) It can then
    optionally sign this with gpg (--sign). Note that systemd’s
    machinectl pull-tar and machinectl pull-raw commands can download
    these files and the SHA256SUMS file automatically and verify things
    on download. In other words: what mkosi outputs is perfectly
    ready for downloads using these two systemd commands.

  3. As mentioned, mkosi is big on supporting UEFI SecureBoot. To
    make use of that, place your X.509 key pair in two files
    mkosi.secure-boot.crt and mkosi.secure-boot.key, and set
    SecureBoot= or --secure-boot. If so, mkosi will sign the
    kernel/initrd/kernel command line combination during the build. Of
    course, if you use this mode, you should also use
    Verity=/--verity=, otherwise the setup makes only partial
    sense. Note that mkosi will not help you with actually enrolling
    the keys you use in your UEFI BIOS.

  4. mkosi has minimal support for git checkouts: when it recognizes
    it is run in a git checkout and you use the mkosi.build script
    stuff, the source tree will be copied into the build image, but with
    all files excluded by .gitignore removed.

  5. There’s support for encryption in place. Use --encrypt= or
    Encrypt=. Note that the UEFI ESP is never encrypted though, and the
    root partition only if explicitly requested. The /home and /srv
    partitions are unconditionally encrypted if that’s enabled.

  6. Images may be built with all documentation removed.

  7. The password for the root user and additional kernel command line
    arguments may be configured for the image to generate.
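
For the encryption case, the mkosi.passphrase file described earlier has to stay unreadable to other users. A cautious way to create it might look like this; the passphrase is a placeholder, and since the text does not say whether a trailing newline would be stripped, it is written without one.

```shell
# Tighten the umask first, so a newly created file is never readable by others.
umask 077
# Placeholder passphrase; substitute your own. Written without a trailing newline.
printf '%s' 'replace-this-passphrase' > mkosi.passphrase
# Also fix the mode in case the file already existed with looser permissions.
chmod 600 mkosi.passphrase
ls -l mkosi.passphrase
```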

Minimum Requirements

Current mkosi requires Python 3.5, and has a number of dependencies,
listed in the README. Most notably you need a somewhat recent systemd
version to make use of its full feature set: systemd 233. Older
versions of mkosi are already packaged for various distributions, but
much of what I describe above is only available in the most recent
release mkosi 3.
The UEFI SecureBoot support requires sbsign which currently isn’t
available in Fedora, but there’s a COPR.

Future

It is my intention to continue turning mkosi into a tool suitable
for:

  1. Testing and debugging projects
  2. Building images for secure devices
  3. Building portable service images
  4. Building images for secure VMs and containers

One of the biggest goals I have for the future is to teach mkosi and
systemd/sd-boot native support for A/B IoT style partition
setups. The idea is that the combination of systemd, casync and
mkosi provides generic building blocks for building secure,
auto-updating devices in a generic way, even though all pieces
may be used individually, too.

FAQ

  1. Why are you reinventing the wheel again? This is exactly like
    $SOMEOTHERPROJECT!
    — Well, to my knowledge there’s no tool that
    integrates this nicely with your project’s development tree, and can
    do dm-verity and UEFI SecureBoot and all that stuff for you. So
    nope, I don’t think this is exactly like $SOMEOTHERPROJECT, thank you
    very much.

  2. What about creating MBR/DOS partition images? — That’s really
    out of focus for me. This is an exercise in figuring out how generic
    OSes and devices in the future should be built and an attempt to
    commoditize OS image building. And no, the future doesn’t speak MBR,
    sorry. That said, I’d be quite interested in adding support for
    booting on Raspberry Pi, possibly using a hybrid approach, i.e. using
    a GPT disk label, but arranging things in a way that the Raspberry Pi
    boot protocol (which is built around DOS partition tables), can still
    work.

  3. Is this portable? — Well, depends what you mean by
    portable. No, this tool runs on Linux only, and as it uses
    systemd-nspawn during the build process it doesn’t run on
    non-systemd systems either. But then again, you should be able to
    create images for any architecture you like with it, but of course if
    you want the image bootable on bare-metal systems only systems doing
    UEFI are supported (but systemd-nspawn should still work fine on
    them).

  4. Where can I get this stuff? — Try
    GitHub. And some distributions
    carry packaged versions, but I think none of them carries the current v3
    yet.

  5. Is this a systemd project? — Yes, it’s hosted under the
    systemd GitHub umbrella. And yes,
    during run-time systemd-nspawn in a current version is required. But
    no, the code-bases are separate otherwise, already because systemd
    is a C project, and mkosi Python.

  6. Requiring systemd 233 is a pretty steep requirement, no?
    Yes, but the feature we need kind of matters (systemd-nspawn’s
    --overlay= switch), and again, this isn’t supposed to be a tool for
    legacy systems.

  7. Can I run the resulting images in LXC or Docker? — Humm, I am
    not an LXC nor Docker guy. If you select directory or subvolume
    as image type, LXC should be able to boot the generated images just
    fine, but I didn’t try. Last time I looked, Docker doesn’t permit
    running proper init systems as PID 1 inside the container, as they
    define their own run-time without intention to emulate a proper
    system. Hence, no I don’t think it will work, at least not with an
    unpatched Docker version. That said, again, don’t ask me questions
    about Docker, it’s not precisely my area of expertise, and quite
    frankly I am not a fan. To my knowledge neither LXC nor Docker are
    able to run containers directly off GPT disk images, hence the
    various raw_xyz image types are definitely not compatible with
    either. That means if you want to generate a single raw disk image
    that can be booted unmodified both in a container and on bare-metal,
    then systemd-nspawn is the container manager to go for
    (specifically, its -i/--image= switch).

Should you care? Is this a tool for you?

Well, that’s up to you really.

If you hack on some complex project and need a quick way to compile
and run your project on a specific current Linux distribution, then
mkosi is an excellent way to do that. Simply drop the mkosi.default
and mkosi.build files in your git tree and everything will be
easy. (And of course, as indicated above: if the project you are
hacking on happens to be called systemd or casync be aware that
those files are already part of the git tree — you can just use them.)

If you hack on some embedded or IoT device, then mkosi is a great
choice too, as it will make it reasonably easy to generate secure
images that are protected against offline modification, by using
dm-verity and UEFI SecureBoot.

If you are an administrator and need a nice way to build images for a
VM or systemd-nspawn container, or a portable service then mkosi
is an excellent choice too.

If you care about legacy computers, old distributions, non-systemd
init systems, old VM managers, Docker, … then no, mkosi is not for
you, but there are plenty of well-established alternatives around that
cover that nicely.

And never forget: mkosi is an Open Source project. We are happy to
accept your patches and other contributions.

Oh, and one unrelated last thing: don’t forget to submit your talk
proposal and/or buy a ticket for
All Systems Go! 2017 in Berlin — the
conference where things like systemd, casync and mkosi are
discussed, along with a variety of other Linux userspace projects used
for building systems.

mkosi — A Tool for Generating OS Images

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/mkosi-a-tool-for-generating-os-images.html

Introducing mkosi

After blogging about
casync
I realized I never blogged about the
mkosi tool that combines nicely
with it. mkosi has been around for a while already, and it’s time to
make it a bit better known. mkosi stands for Make Operating System
Image, and is a tool for precisely that: generating an OS tree or
image that can be booted.

Yes, there are many tools like mkosi, and a number of them are quite
well known and popular. But mkosi has a number of features that I
think make it interesting for a variety of use-cases that other tools
don’t cover that well.

What is mkosi?

What are those use-cases, and what precisely sets mkosi apart?
mkosi is definitely a tool with a focus on developers’ needs for
building OS images, for testing and debugging, but also for generating
production images with cryptographic protection. A typical use-case
would be to add a mkosi.default file to an existing project (for
example, one written in C or Python), thus making it easy to
generate an OS image for it. mkosi will put together the image with
development headers and tools, compile your code in it, run your test
suite, then throw away the image again, and build a new one, this time
without development headers and tools, and install your build
artifacts in it. This final image is then “production-ready”, and only
contains your built program and the minimal set of packages you
configured otherwise. Such an image could then be deployed with
casync (or any other tool of course) to be delivered to your set of
servers, or IoT devices or whatever you are building.

mkosi is supposed to be legacy-free: the focus is clearly on
today’s technology, not yesteryear’s. Specifically this means that
we’ll generate GPT partition tables, not MBR/DOS ones. When you tell
mkosi to generate a bootable image for you, it will make it bootable
on EFI, not on legacy BIOS. The GPT images generated follow
specifications such as the Discoverable Partitions Specification,
so that /etc/fstab can remain unpopulated and tools such as
systemd-nspawn can automatically dissect the images and boot from
them.

So, let’s have a look at the specific images it can generate:

  1. Raw GPT disk image, with ext4 as root
  2. Raw GPT disk image, with btrfs as root
  3. Raw GPT disk image, with a read-only squashfs as root
  4. A plain directory on disk containing the OS tree directly (this is useful for creating generic container images)
  5. A btrfs subvolume on disk, similar to the plain directory
  6. A tarball of a plain directory

When any of the GPT choices above are selected, a couple of additional
options are available:

  1. A swap partition may be added in
  2. The system may be made bootable on EFI systems
  3. Separate partitions for /home and /srv may be added in
  4. The root, /home and /srv partitions may be optionally encrypted with LUKS
  5. The root partition may be protected using dm-verity, thus making offline attacks on the generated system hard
  6. If the image is made bootable, the dm-verity root hash is automatically added to the kernel command line, and the kernel together with its initial RAM disk and the kernel command line is optionally cryptographically signed for UEFI SecureBoot

Note that mkosi is distribution-agnostic. It currently can build
images based on the following Linux distributions:

  1. Fedora
  2. Debian
  3. Ubuntu
  4. ArchLinux
  5. openSUSE

Note though that not all distributions are supported at the same
feature level currently. Also, as mkosi is based on dnf --installroot,
debootstrap, pacstrap and zypper, and those
tools are not packaged universally on all distributions, you might
not be able to build images for all those distributions on arbitrary
host distributions. For example, Fedora doesn’t package zypper,
hence you cannot build an openSUSE image easily on Fedora, but you can
still build Fedora (obviously…), Debian, Ubuntu and ArchLinux images
on it just fine.

The GPT images are put together in a way that they aren’t just
compatible with UEFI systems, but also with VM and container managers
(that is, at least the smart ones, i.e. VM managers that know UEFI,
and container managers that grok GPT disk images) to a large
degree. In fact, the idea is that you can use mkosi to build a
single GPT image that may be used to:

  1. Boot on bare-metal boxes
  2. Boot in a VM
  3. Boot in a systemd-nspawn container
  4. Directly run a systemd service off, using systemd’s RootImage= unit file setting

Note that in all four cases the dm-verity data is automatically used
if available to ensure the image is not tampered with (yes, you read
that right, systemd-nspawn and systemd’s RootImage= setting
automatically do dm-verity these days if the image has it.)
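
As a rough illustration of the fourth case, a unit file along the following lines would run a service straight off a mkosi-built image. Only the RootImage= setting comes from the text above; the unit name, the image path and the binary are hypothetical. The sketch writes the unit into a scratch directory rather than /etc/systemd/system, so it is harmless to run:

```shell
#!/bin/sh
set -e

# Scratch directory standing in for /etc/systemd/system
unitdir=$(mktemp -d)

cat > "$unitdir/hello.service" <<'EOF'
[Unit]
Description=Hypothetical service running off a mkosi image

[Service]
# Path to the GPT image on the host (assumed location)
RootImage=/var/lib/machines/image.raw
# Resolved inside the image, not on the host (hypothetical binary)
ExecStart=/usr/bin/hello
EOF

cat "$unitdir/hello.service"
```

On a real host the next step would be to copy the file into /etc/systemd/system and start it with systemctl.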

Mode of Operation

The simplest usage of mkosi is by simply invoking it without
parameters (as root):

# mkosi

Without any configuration this will create a GPT disk image for you,
will call it image.raw and drop it in the current directory. The
distribution used will be the same one as your host runs.

Of course in most cases you want more control over how the image is
put together, i.e. select package sets, select the distribution, size
partitions and so on. Most of that you can actually specify on the
command line, but it is recommended to instead create a couple of
mkosi.$SOMETHING files and directories in some directory. Then,
simply change to that directory and run mkosi without any further
arguments. The tool will then look in the current working directory
for these files and directories and make use of them (similar to how
make looks for a Makefile…). Every single file/directory is
optional, but if they exist they are honored. Here’s a list of the
files/directories mkosi currently looks for:

  1. mkosi.default — This is the main configuration file, here you
    can configure what kind of image you want, which distribution, which
    packages and so on.

  2. mkosi.extra/ — If this directory exists, then mkosi will copy
    everything inside it into the images built. You can place arbitrary
    directory hierarchies in here, and they’ll be copied over whatever is
    already in the image, after it was put together by the distribution’s
    package manager. This is the best way to drop additional static files
    into the image, or override distribution-supplied ones.

  3. mkosi.build — This executable file is supposed to be a build
    script. When it exists, mkosi will build two images, one after the
    other in the mode already mentioned above: the first version is the
    build image, and may include various build-time dependencies such as
    a compiler or development headers. The build script is also copied
    into it, and then run inside it. The script should then build
    whatever shall be built and place the result in $DESTDIR (don’t
    worry, popular build tools such as Automake or Meson all honor
    $DESTDIR anyway, so there’s not much to do here explicitly). It may
    also run a test suite, or anything else you like. After the script
    finished, the build image is removed again, and a second image (the
    final image) is built. This time, no development packages are
    included, and the build script is not copied into the image again —
    however, the build artifacts from the first run (i.e. those placed in
    $DESTDIR) are copied into the image.

  4. mkosi.postinst — If this executable script exists, it is invoked
    inside the image (inside a systemd-nspawn invocation) and can
    adjust the image as it likes at a very late point in the image
    preparation. If mkosi.build exists, i.e. the dual-phased
    development build process is used, then this script will be invoked
    twice: once inside the build image and once inside the final
    image. The first parameter passed to the script clarifies which phase
    it is run in.

  5. mkosi.nspawn — If this file exists, it should contain a
    container configuration file for systemd-nspawn (see
    systemd.nspawn(5)
    for details), which shall be shipped along with the final image and
    shall be included in the check-sum calculations (see below).

  6. mkosi.cache/ — If this directory exists, it is used as package
    cache directory for the builds. This directory is effectively bind
    mounted into the image at build time, in order to speed up building
    images. The package installers of the various distributions will
    place their package files here, so that subsequent runs can reuse
    them.

  7. mkosi.passphrase — If this file exists, it should contain a
    pass-phrase to use for the LUKS encryption (if that’s enabled for the
    image built). This file should not be readable to other users.

  8. mkosi.secure-boot.crt and mkosi.secure-boot.key should be an
    X.509 key pair to use for signing the kernel and initrd for UEFI
    SecureBoot, if that’s enabled.
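
To make the list above a bit more tangible, here is a minimal sketch that scaffolds a few of these optional files in a scratch directory. Everything in it is a placeholder: the motd text, the pass-phrase, and in particular the phase names "build" and "final" that the mkosi.postinst skeleton switches on, which are an assumption about what mkosi passes as the first parameter.

```shell
#!/bin/sh
set -e

# Stand-in for your real source tree
project=$(mktemp -d)
cd "$project"

# Static files that get copied over the package-manager-installed tree
mkdir -p mkosi.extra/etc
echo "Built with mkosi" > mkosi.extra/etc/motd

# Package cache shared between runs
mkdir -p mkosi.cache

# LUKS pass-phrase, created unreadable to other users (placeholder value)
( umask 077; printf '%s' 'example-passphrase' > mkosi.passphrase )

# Post-install hook; the phase names "build" and "final" are assumed
cat > mkosi.postinst <<'EOF'
#!/bin/sh
case "$1" in
    build) echo "postinst: adjusting build image" ;;
    final) echo "postinst: adjusting final image" ;;
    *)     echo "postinst: unknown phase '$1'" >&2; exit 1 ;;
esac
EOF
chmod +x mkosi.postinst

ls -A
```

Since every file is optional, you can start with just one of them and add the others as the need arises.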

How to use it

So, let’s come back to our most trivial example, without any of the
mkosi.$SOMETHING files around:

# mkosi

As mentioned, this will create an image file image.raw in the current
directory. How do we use it? Of course, we could dd it onto some USB
stick and boot it on a bare-metal device. However, it’s much simpler
to first run it in a container for testing:

# systemd-nspawn -bi image.raw

And there you go: the image should boot up, and just work for you.

Now, let’s make things more interesting, still without any of
the mkosi.$SOMETHING files around:

# mkosi -t raw_btrfs --bootable -o foobar.raw
# systemd-nspawn -bi foobar.raw

This is similar to the above, but we made three changes: it’s no
longer GPT + ext4, but GPT + btrfs. Moreover, the system is made
bootable on UEFI systems, and finally, the output is now called
foobar.raw.

Because this system is bootable on UEFI systems, we can run it in KVM:

qemu-kvm -m 512 -smp 2 -bios /usr/share/edk2/ovmf/OVMF_CODE.fd -drive format=raw,file=foobar.raw

This will look very similar to the systemd-nspawn invocation, except
that this uses full VM virtualization rather than container
virtualization. (Note that the way to run a UEFI qemu/kvm instance
appears to change all the time and is different on the various
distributions. It’s quite annoying, and I can’t really tell you what
the right qemu command line is to make this work on your system.)

Of course, it’s not all raw GPT disk images with mkosi. Let’s try
a plain directory image:

# mkosi -d fedora -t directory -o quux
# systemd-nspawn -bD quux

Of course, if you generate the image as plain directory you can’t boot
it on bare-metal just like that, nor run it in a VM.

A more complex command line is the following:

# mkosi -d fedora -t raw_squashfs --checksum --xz --package=openssh-clients --package=emacs

In this mode we explicitly pick Fedora as the distribution to use, ask
mkosi to generate a compressed GPT image with a root squashfs,
compress the result with xz, and generate a SHA256SUMS file with
the hashes of the generated artifacts. The image will contain the
SSH client as well as everybody’s favorite editor.
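
A SHA256SUMS file of this kind can be checked with plain coreutils. The following sketch does not call mkosi at all: it fakes an artifact (the file name image.raw.xz is an assumption about what the command above produces), records its hash and verifies it again.

```shell
#!/bin/sh
set -e

workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for a generated artifact
printf 'not a real image\n' > image.raw.xz

# Record the hash, one "<hash>  <file name>" line per artifact
sha256sum image.raw.xz > SHA256SUMS

# Verify; exits non-zero if any listed file does not match its hash
sha256sum -c SHA256SUMS
```

If the image is modified after the fact, the same sha256sum -c call fails, which is the whole point of shipping the file alongside the image.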

Now, let’s make use of the various mkosi.$SOMETHING files. Let’s
say we are working on some Automake-based project and want to make it
easy to generate a disk image off the development tree with the
version you are hacking on. Create a configuration file:

# cat > mkosi.default <<EOF
[Distribution]
Distribution=fedora
Release=24

[Output]
Format=raw_btrfs
Bootable=yes

[Packages]
# The packages to appear in both the build and the final image
Packages=openssh-clients httpd
# The packages to appear in the build image, but absent from the final image
BuildPackages=make gcc libcurl-devel
EOF

And let’s add a build script:

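# cat > mkosi.build <<'EOF'
#!/bin/sh
# Runs inside the build image; "make install" installs into $DESTDIR.
# The quoted 'EOF' above keeps $SRCDIR and nproc from being expanded
# on the host while this file is written.
set -e
cd "$SRCDIR"
./autogen.sh
./configure --prefix=/usr
make -j "$(nproc)"
make install
EOF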
# chmod +x mkosi.build
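
If $DESTDIR is new to you, the following sketch mimics what a DESTDIR-aware install step does, entirely outside of mkosi. The two temporary directories stand in for the source tree and the staging directory, and the hello script is a made-up build artifact:

```shell
#!/bin/sh
set -e

# Stand-ins for the directories the build script would be handed
SRCDIR=$(mktemp -d)
DESTDIR=$(mktemp -d)
prefix=/usr

# A trivial "build artifact"
printf '#!/bin/sh\necho hello\n' > "$SRCDIR/hello"

# Staged install: the final path is $prefix/bin/hello, but the file is
# written below $DESTDIR instead of into the running system
install -D -m 0755 "$SRCDIR/hello" "$DESTDIR$prefix/bin/hello"

# Show what ended up in the staging directory
find "$DESTDIR" -type f
```

Whatever lands below the staging directory this way is what gets copied into the final image.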

And with all that in place we can now build our project into a disk image, simply by typing:

# mkosi

Let’s try it out:

# systemd-nspawn -bi image.raw

Of course, if you do this you’ll notice that building an image like
this can be quite slow. And slow build times are actively hurtful to
your productivity as a developer. Hence let’s make things a bit
faster. First, let’s make use of a package cache shared between runs:

# mkdir mkosi.cache

Building images now should already be substantially faster (and
generate less network traffic) as the packages will now be downloaded
only once and reused. However, you’ll notice that unpacking all those
packages and the rest of the work is still quite slow. But mkosi can
help you with that. Simply use mkosi’s incremental build feature. In
this mode mkosi will make a copy of the build and final images
immediately before dropping in your build sources or artifacts, so
that building an image becomes a lot quicker: instead of always
starting totally from scratch a build will now reuse everything it can
reuse from a previous run, and immediately begin with building your
sources rather than the build image to build your sources in. To
enable the incremental build feature use -i:

# mkosi -i

Note that if you use this option, the package list is not updated
anymore from your distribution’s servers, as the cached copy is made
after all packages are installed, and hence until you actually delete
the cached copy the distribution’s network servers aren’t contacted
again and no RPMs or DEBs are downloaded. This means the distribution
you use becomes “frozen in time” this way. (Which might be a bad
thing, but also a good thing, as it makes things kinda reproducible.)

Of course, if you run mkosi a couple of times you’ll notice that it
won’t overwrite the generated image when it already exists. You can
either delete the file yourself first (rm image.raw) or let mkosi
do it for you right before building a new image, with mkosi -f. You
can also tell mkosi to not only remove any such pre-existing images,
but also remove any cached copies of the incremental feature, by using
-f twice.

I wrote mkosi originally in order to test systemd, and quickly
generate a disk image of various distributions with the most current
systemd version from git, without all that affecting my host system. I
regularly use mkosi for that today, in incremental mode. The two
commands I use most in that context are:

# mkosi -if && systemd-nspawn -bi image.raw

And sometimes:

# mkosi -iff && systemd-nspawn -bi image.raw

The latter I use only if I want to regenerate everything based on the
very newest set of RPMs provided by Fedora, instead of a cached
snapshot of it.

BTW, the mkosi files for systemd are included in the systemd git
tree:
mkosi.default
and
mkosi.build. This
way, any developer who wants to quickly test something with current
systemd git, or wants to prepare a patch based on it and test it can
check out the systemd repository and simply run mkosi in it and a
few minutes later he has a bootable image he can test in
systemd-nspawn or KVM. casync has similar files:
mkosi.default,
mkosi.build.

Random Interesting Features

  1. As mentioned already, mkosi will generate dm-verity enabled
    disk images if you ask for it. For that use the --verity switch on
    the command line or Verity= setting in mkosi.default. Of course,
    dm-verity implies that the root volume is read-only. In this mode
    the top-level dm-verity hash will be placed along-side the output
    disk image in a file named the same way, but with the .roothash
    suffix. If the image is to be created bootable, the root hash is also
    included on the kernel command line in the roothash= parameter,
    which current systemd versions can use to both find and activate the
    root partition in a dm-verity protected way. BTW: it’s a good idea
    to combine this dm-verity mode with the raw_squashfs image mode,
    to generate a genuinely protected, compressed image suitable for
    running in your IoT device.

  2. As indicated above, mkosi can automatically create a check-sum
    file SHA256SUMS for you (--checksum) covering all the files it
    outputs (which could be the image file itself, a matching .nspawn
    file using the mkosi.nspawn file mentioned above, as well as the
    .roothash file for the dm-verity root hash.) It can then
    optionally sign this with gpg (--sign). Note that systemd’s
    machinectl pull-tar and machinectl pull-raw commands can download
    these files and the SHA256SUMS file automatically and verify things
    on download. In other words: what mkosi outputs is perfectly
    ready for downloads using these two systemd commands.

  3. As mentioned, mkosi is big on supporting UEFI SecureBoot. To
    make use of that, place your X.509 key pair in two files
    mkosi.secure-boot.crt and mkosi.secure-boot.key, and set
    SecureBoot= or --secure-boot. If so, mkosi will sign the
    kernel/initrd/kernel command line combination during the build. Of
    course, if you use this mode, you should also use
    Verity=/--verity, otherwise the setup makes only partial
    sense. Note that mkosi will not help you with actually enrolling
    the keys you use in your UEFI BIOS.

  4. mkosi has minimal support for git checkouts: when it recognizes
    it is run in a git checkout and you use the mkosi.build script
    stuff, the source tree will be copied into the build image, but with
    all files excluded by .gitignore removed.

  5. There’s support for encryption in place. Use --encrypt= or
    Encrypt=. Note that the UEFI ESP is never encrypted though, and the
    root partition only if explicitly requested. The /home and /srv
    partitions are unconditionally encrypted if that’s enabled.

  6. Images may be built with all documentation removed.

  7. The password for the root user and additional kernel command line
    arguments may be configured for the image to generate.
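
Putting the first and third item together, a configuration for a verity-protected, signed squashfs image might look roughly like the sketch below. The setting names Format=, Bootable=, Verity= and SecureBoot= are taken from the text; that Verity= and SecureBoot= belong in the [Output] section and accept "yes" is an assumption, so check the README of your mkosi version. The file is written to a scratch directory.

```shell
#!/bin/sh
set -e

workdir=$(mktemp -d)
cd "$workdir"

cat > mkosi.default <<'EOF'
[Distribution]
Distribution=fedora

[Output]
Format=raw_squashfs
Bootable=yes
# Assumed to live in [Output] and to take yes/no values
Verity=yes
SecureBoot=yes
EOF

cat mkosi.default
```

With SecureBoot enabled, the key pair files described earlier need to sit next to this file before mkosi is run.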

Minimum Requirements

Current mkosi requires Python 3.5, and has a number of dependencies,
listed in the
README. Most
notably you need a somewhat recent systemd version to make use of its
full feature set: systemd 233. Older versions of mkosi are already packaged for
various distributions, but much of what I describe above is only
available in the most recent release, mkosi 3.

The UEFI SecureBoot support requires sbsign which currently isn’t
available in Fedora, but there’s a COPR.

Future

It is my intention to continue turning mkosi into a tool suitable
for:

  1. Testing and debugging projects
  2. Building images for secure devices
  3. Building portable service images
  4. Building images for secure VMs and containers

One of the biggest goals I have for the future is to teach mkosi and
systemd/sd-boot native support for A/B IoT style partition
setups. The idea is that the combination of systemd, casync and
mkosi provides generic building blocks for building secure,
auto-updating devices in a generic way, even though all pieces
may be used individually, too.

FAQ

  1. Why are you reinventing the wheel again? This is exactly like
    $SOMEOTHERPROJECT!
    — Well, to my knowledge there’s no tool that
    integrates this nicely with your project’s development tree, and can
    do dm-verity and UEFI SecureBoot and all that stuff for you. So
    nope, I don’t think this is exactly like $SOMEOTHERPROJECT, thank you
    very much.

  2. What about creating MBR/DOS partition images? — That’s really
    out of focus to me. This is an exercise in figuring out how generic
    OSes and devices in the future should be built and an attempt to
    commoditize OS image building. And no, the future doesn’t speak MBR,
    sorry. That said, I’d be quite interested in adding support for
    booting on Raspberry Pi, possibly using a hybrid approach, i.e. using
    a GPT disk label, but arranging things in a way that the Raspberry Pi
    boot protocol (which is built around DOS partition tables), can still
    work.

  3. Is this portable? — Well, depends what you mean by
    portable. No, this tool runs on Linux only, and as it uses
    systemd-nspawn during the build process it doesn’t run on
    non-systemd systems either. But then again, you should be able to
    create images for any architecture you like with it, but of course if
    you want the image bootable on bare-metal systems only systems doing
    UEFI are supported (but systemd-nspawn should still work fine on
    them).

  4. Where can I get this stuff? — Try
    GitHub. And some distributions
    carry packaged versions, but I think none of them carries the current v3
    yet.

  5. Is this a systemd project? — Yes, it’s hosted under the
    systemd GitHub umbrella. And yes,
    during run-time systemd-nspawn in a current version is required. But
    no, the code-bases are separate otherwise, already because systemd
    is a C project, and mkosi Python.

  6. Requiring systemd 233 is a pretty steep requirement, no?
    Yes, but the feature we need kind of matters (systemd-nspawn’s
    --overlay= switch), and again, this isn’t supposed to be a tool for
    legacy systems.

  7. Can I run the resulting images in LXC or Docker? — Humm, I am
    not an LXC nor Docker guy. If you select directory or subvolume
    as image type, LXC should be able to boot the generated images just
    fine, but I didn’t try. Last time I looked, Docker doesn’t permit
    running proper init systems as PID 1 inside the container, as they
    define their own run-time without intention to emulate a proper
    system. Hence, no I don’t think it will work, at least not with an
    unpatched Docker version. That said, again, don’t ask me questions
    about Docker, it’s not precisely my area of expertise, and quite
    frankly I am not a fan. To my knowledge neither LXC nor Docker are
    able to run containers directly off GPT disk images, hence the
    various raw_xyz image types are definitely not compatible with
    either. That means if you want to generate a single raw disk image
    that can be booted unmodified both in a container and on bare-metal,
    then systemd-nspawn is the container manager to go for
    (specifically, its -i/--image= switch).

Should you care? Is this a tool for you?

Well, that’s up to you really.

If you hack on some complex project and need a quick way to compile
and run your project on a specific current Linux distribution, then
mkosi is an excellent way to do that. Simply drop the mkosi.default
and mkosi.build files in your git tree and everything will be
easy. (And of course, as indicated above: if the project you are
hacking on happens to be called systemd or casync be aware that
those files are already part of the git tree — you can just use them.)

If you hack on some embedded or IoT device, then mkosi is a great
choice too, as it will make it reasonably easy to generate secure
images that are protected against offline modification, by using
dm-verity and UEFI SecureBoot.

If you are an administrator and need a nice way to build images for a
VM or systemd-nspawn container, or a portable service then mkosi
is an excellent choice too.

If you care about legacy computers, old distributions, non-systemd
init systems, old VM managers, Docker, … then no, mkosi is not for
you, but there are plenty of well-established alternatives around that
cover that nicely.

And never forget: mkosi is an Open Source project. We are happy to
accept your patches and other contributions.

Oh, and one unrelated last thing: don’t forget to submit your talk
proposal and/or buy a ticket for
All Systems Go! 2017 in Berlin — the
conference where things like systemd, casync and mkosi are
discussed, along with a variety of other Linux userspace projects used
for building systems.

OpsWorks September Enhancements Blog Post

Post Syndicated from Daniel Huesch original http://blogs.aws.amazon.com/application-management/post/Tx38UMHBUUN5DX2/OpsWorks-September-Enhancements-Blog-Post

Over the past few months, the AWS OpsWorks team has introduced several enhancements to existing features and added support for new ones. Let’s discuss some of these new capabilities.

·       Chef client 12.13.37 – Released a new AWS OpsWorks agent version for Chef 12 for Linux, enabling the latest enhancements from Chef. The OpsWorks console now shows the full history of enhancements to its agent software. Here’s an example of what the change log looks like:

·       Node.js 0.12.15 – Provided support for a new version of Node.js, in Chef 11.

–        Fixes a bug in the read/write locks implementation for the Windows operating system.
–        Fixes a potential buffer overflow vulnerability.

·       Ruby 2.3.1 – The built-in Chef 11 Ruby layer now supports Ruby 2.3.1, which includes these Ruby enhancements:

–        Introduced a frozen string literal pragma.
–        Introduced a safe navigation operator (lonely operator).
–        Numerous performance improvements.

·       Larger EBS volumes – Following the recent announcement from Amazon EBS, you can now use OpsWorks to create provisioned IOPS volumes that store up to 16 TB and process up to 20,000 IOPS, with a maximum throughput of 320 MBps. You can also create general purpose volumes that store up to 16 TB and process up to 10,000 IOPS, with a maximum throughput of 160 MBps.

·       New Linux operating systems – OpsWorks continues to enhance its operating system support and now offers:

–        Amazon Linux 2016.03 (Amazon Linux 2016.09 support will be available soon)
–        Ubuntu 16.04
–        CentOS 7

·       Instance tenancy – You can provision dedicated instances through OpsWorks. Dedicated instances are Amazon EC2 instances that run in a virtual private cloud (VPC) on hardware that’s dedicated to a single customer. Your dedicated instances are physically isolated at the host hardware level from instances that belong to other AWS accounts.

·       Define root volumes – You can define the size of the root volume of your EBS-backed instances directly from the OpsWorks console. Choose from a variety of volume types: General Purpose (SSD), Provisioned IOPS (SSD), and Magnetic.

·       Instance page – The OpsWorks instance page now displays a summary bar that indicates the aggregated state of all the instances in a selected stack. Summary fields include total instance count, online instances, instances that are in the setting-up stage, instances that are in the shutting-down stage, stopped instances, and instances in an error state.

·       Service role regeneration – You can now use the OpsWorks console to recreate your IAM service role if it was deleted.

Recreate IAM service role

Confirmation of IAM service role creation

As always, we welcome your feedback about features you’re using in OpsWorks. Be sure to visit the OpsWorks user forums, and check out the documentation.

OpsWorks September 2016 Updates

Post Syndicated from Daniel Huesch original https://aws.amazon.com/blogs/devops/opsworks-september-2016-updates/

Over the past few months, the AWS OpsWorks team has introduced several enhancements to existing features and added support for new ones. Let’s discuss some of these new capabilities.

·       Chef client 12.13.37 – Released a new AWS OpsWorks agent version for Chef 12 for Linux, enabling the latest enhancements from Chef. The OpsWorks console now shows the full history of enhancements to its agent software. Here’s an example of what the change log looks like:

·       Node.js 0.12.15 – Provided support for a new version of Node.js, in Chef 11.

–        Fixes a bug in the read/write locks implementation for the Windows operating system.
–        Fixes a potential buffer overflow vulnerability.

·       Ruby 2.3.1 – The built-in Chef 11 Ruby layer now supports Ruby 2.3.1, which includes these Ruby enhancements:

–        Introduced a frozen string literal pragma.
–        Introduced a safe navigation operator (lonely operator).
–        Numerous performance improvements.

·       Larger EBS volumes – Following the recent announcement from Amazon EBS, you can now use OpsWorks to create provisioned IOPS volumes that store up to 16 TB and process up to 20,000 IOPS, with a maximum throughput of 320 MBps. You can also create general purpose volumes that store up to 16 TB and process up to 10,000 IOPS, with a maximum throughput of 160 MBps.

·       New Linux operating systems – OpsWorks continues to enhance its operating system support and now offers:

–        Amazon Linux 2016.03 (Amazon Linux 2016.09 support will be available soon)
–        Ubuntu 16.04
–        CentOS 7

·       Instance tenancy – You can provision dedicated instances through OpsWorks. Dedicated instances are Amazon EC2 instances that run in a virtual private cloud (VPC) on hardware that’s dedicated to a single customer. Your dedicated instances are physically isolated at the host hardware level from instances that belong to other AWS accounts.

·       Define root volumes – You can define the size of the root volume of your EBS-backed instances directly from the OpsWorks console. Choose from a variety of volume types: General Purpose (SSD), Provisioned IOPS (SSD), and Magnetic.

·       Instance page – The OpsWorks instance page now displays a summary bar that indicates the aggregated state of all the instances in a selected stack. Summary fields include total instance count, online instances, instances that are in the setting-up stage, instances that are in the shutting-down stage, stopped instances, and instances in an error state.

·       Service role regeneration – You can now use the OpsWorks console to recreate your IAM service role if it was deleted.

Recreate IAM service role

Confirmation of IAM service role creation

As always, we welcome your feedback about features you’re using in OpsWorks. Be sure to visit the OpsWorks user forums, and check out the documentation.

 

 

I wish I enjoyed Pokémon Go

Post Syndicated from Eevee original https://eev.ee/blog/2016/07/31/i-wish-i-enjoyed-pok%C3%A9mon-go/

I’ve been trying really hard not to be a sourpuss about this, because everyone seems to enjoy it a lot and I don’t want to be the jerk pissing in their cornflakes.

And yet!

Despite all the potential of the game, despite all the fervor all across the world, it doesn’t tickle my fancy.

It seems like the sort of thing I ought to enjoy. Pokémon is kind of my jam, if you hadn’t noticed. When I don’t enjoy a Pokémon thing, something is wrong with at least one of us.

The app is broken

I’m not talking about the recent update that everyone’s mad about and that I haven’t even tried. They removed pawprints, which didn’t work anyway? That sucks, yeah, but I think it’s more significant that the thing is barely usable.

I’ve gone out hunting Pokémon several times with my partner and their husband. We wandered around for about an hour each time, and like clockwork, the game would just stop working for me every fifteen minutes. It would still run, and the screen would still update, but it would completely ignore all taps or swipes. The only fix seems to be killing it and restarting it, which takes like a week, and meanwhile the rest of my party has already caught the Zubat or whatever and is moving on.

For the brief moments when it works, it seems to be constantly confused about exactly where I am and which way I’m facing. Pokéstops (Poké Stops?) have massive icons when they’re nearby, and more than once I’ve had to mess around with the camera angle to be able to tap a nearby Pokémon, because a cluster of several already-visited Pokéstops is in the way. There’s also a strip along the bottom of the screen, surrounding the menu buttons, where tapping just does nothing at all.

I’ve had the AR Pokémon catching screen — the entire conceit of the game — lag so badly on multiple occasions that a Pokéball just stayed frozen in midair, and I couldn’t tell if I’d hit the Pokémon or not. There was also the time the Pokéball hit the Pokémon, landed on the ground, and… slowly rolled into the distance. For at least five minutes. I’m not exaggerating this time.

The game is much more responsive with AR disabled, so the Pokémon appear on a bland and generic background, which… seems to defeat the purpose of the game.

(Catching Pokémon doesn’t seem to have any real skill to it, either? Maybe I’m missing something, but I don’t understand how I’m supposed to gauge distance to an isolated 3D model and somehow connect this to how fast I flick my finger. I don’t really like “squishy” physics games like Angry Birds, and this is notably worse. It might as well be random.)

I had a better time just enjoying my party’s company and looking at actual wildlife, which in this case consists of cicadas and a few semi-wild rabbits that inexplicably live in a nearby park. I feel that something has gone wrong with your augmented reality game when it is worse than reality.

It’s not about Pokémon

Let’s see if my reasoning is sound, here.

In the mainline Pokémon games, you play as a human, but many of your important interactions are with Pokémon. You carry a number of Pokémon with you. When you encounter a Pokémon, you immediately send out your own. All the NPCs talk about how much they love Pokémon. There are overworld Pokémon hanging out. It’s pretty clear what the focus is. It’s right there on the title screen, even: both the word itself and an actual Pokémon.

Contrast this with Pokémon Go.

Most of the time, the only thing of interest on the screen is your avatar, a human. Once you encounter a Pokémon, you don’t send out your own; it’s just you, and it. In fact, once you catch a Pokémon, you hardly ever interact with it again. You can go look at its stats, assuming you can find it in your party of, what, 250?

The best things I’ve seen done with the app are AR screenshots of Pokémon in funny or interesting real-world places. It didn’t even occur to me that you can only do this with wild Pokémon until I played it. You can’t use the AR feature — again, the main conceit of the game — with your own Pokémon. How obvious is this? How can it not be possible? (If it is possible, it’s so well-hidden that several rounds of poking through the app haven’t revealed how to do it, which is still a knock for hiding the most obvious thing to want to do.)

So you are a human, and you wander around hoping you see Pokémon, and then you catch them, and then they are effectively just a sprite in a list until you feed them to your other Pokémon. And feed them you must, because the only way to level up a Pokémon is to feed them the corpses — sorry, “candies” — of their brethren. The Pokémon themselves aren’t involved in this process; they are passive consumers you fatten up.

If you’re familiar with Nuzlocke runs, you might be aware of just how attached players — or even passive audiences — can get to their Pokémon in mainline games. Yet in Pokémon Go, the critters themselves are just something to collect, just something to have, just something to sacrifice. No other form of interaction is offered.

In Pokémon X and Y, you can pet your Pokémon and feed them cakes, then go solve puzzles with them. They will love you in return. In Pokémon Go, you can swipe to make the model rotate.

There is some kind of battle system in here somewhere, but as far as I can tell, you only ever battle against gym leaders, who are jerks who’ve been playing the damn thing since it came out and have Pokémon whose CP have more digits than you even knew were possible. Also the battling is real-time with some kind of weird gestural interface, so it’s kind of a crapshoot whether you even do the thing you want, a far cry from the ostensibly strategic theme of the mainline games.

If I didn’t know any better, I’d think some no-name third-party company just took an existing product and poorly plastered Pokémon onto it.

There are very few Pokémon per given area

The game is limited to generation 1, the Red/Blue/Yellow series. And that’s fine.

I’ve seen about six of them.

Rumor has it that they are arranged very cleverly, with fire Pokémon appearing in deserts and water Pokémon appearing in waterfronts. That sounds really cool, except that I don’t live at the intersection of fifteen different ecosystems. How do you get ice Pokémon? Visit my freezer?

I freely admit, I’m probably not the target audience here; I don’t have a commute at all, and on an average day I have no reason to leave the house at all. I can understand that I might not see a huge variety, sure. But I’ve seen several friends lamenting that they don’t see much variety on their own commutes, or around the points of interest near where they live.

If you spend most of your time downtown in a major city, the game is probably great; if you live out in the sticks, it sounds a bit barren. It might be a little better if you could actually tell how to find Pokémon that are more than a few feet away — there used to be a distance indicator for nearby Pokémon, which I’m told even worked at one point, but it’s never worked since I first tried the game and it’s gone now.

Ah, of course, there’s always Pokévision, a live map of what Pokémon are where… which Niantic just politely asked to cease and desist.

It’s full of obvious “free-to-play” nudges

I put “free-to-play” in quotes because it’s a big ol’ marketing lie and I don’t know why the gaming community even tolerates the phrase. The game is obviously designed to be significantly worse if you don’t give them money, and there are little reminders of this everywhere.

The most obvious example: eggs rain from the sky, and are the only way to get Pokémon that don’t appear naturally nearby. You have to walk a certain number of kilometers to hatch an egg, much like the mainline games, which is cute.

Ah, but you also have to put an egg in an incubator for the steps to count. And you only start with one. And they’re given to you very rarely, and any beyond the one you start with only have limited uses at a time. And you can carry 9 eggs at a time.

Never fear! You can buy an extra (limited use) incubator for the low low price of $1.48. Or maybe $1.03. It’s hard to tell, since (following the usual pattern of flagrant dishonesty) you first have to turn real money into game-specific trinkets at one of several carefully obscured exchange rates.

The thing is, you could just sell a Pokémon game. Nintendo has done so quite a few times, in fact. But who would pay for Pokémon Go, in the state it’s in?

In conclusion

This game is bad and I wish it weren’t bad. If you enjoy it, that’s awesome, and I’m not trying to rain on your parade, really. I just wish I enjoyed it too.

Python FAQ: How do I port to Python 3?

Post Syndicated from Eevee original https://eev.ee/blog/2016/07/31/python-faq-how-do-i-port-to-python-3/

Part of my Python FAQ, which is doomed to never be finished.

Maybe you have a Python 2 codebase. Maybe you’d like to make it work with Python 3. Maybe you really wish someone would write a comically long article on how to make that happen.

I have good news! You’re already reading one.

(And if you’re not sure why you’d want to use Python 3 in the first place, perhaps you’d be interested in the companion article which delves into exactly that question?)

Don’t be intimidated

This article is quite long, but don’t take that as a sign that this is necessarily a Herculean task. I’m trying to cover every issue I can ever recall running across, which means a lot of small gotchas.

I’ve ported several codebases from Python 2 to Python 2+3, and most of them have gone pretty smoothly. If you have modern Python 2 code that handles Unicode responsibly, you’re already halfway there.

However… if you still haven’t ported by now, almost eight years after Python 3.0 was first released, chances are you have either a lumbering giant of an app or ancient and weird 2.2-era code. Or, perish the thought, a lumbering giant consisting largely of weird 2.2-era code. In that case, you’ll want to clean up the more obvious issues one at a time, then go back and start worrying about actually running parts of your code on Python 3.

On the other hand, if your Python 2 code is pretty small and you’ve just never gotten around to porting, good news! It’s not that bad, and much of the work can be done automatically. Python 3 is ultimately the same language as Python 2, just with some sharp bits filed off.

Making some tough decisions

We say “porting from 2 to 3”, but what we usually mean is “porting code from 2 to both 2 and 3”. That ends up being more difficult (and ugly), since rather than writing either 2 or 3, you have to write the common subset of 2 and 3. As nifty as some of the features in 3 are, you can’t actually use any of them if you have to remain compatible with Python 2.

The first thing you need to do, then, is decide exactly which versions of Python you’re targeting. For 2, your options are:

  • Python 2.5+ is possible, but very difficult, and this post doesn’t really discuss it. Even something as simple as exception handling becomes painful, because the only syntax that works in Python 3 was first introduced in Python 2.6. I wouldn’t recommend doing this.

  • Python 2.6+ used to be fairly common, and is well-trodden ground. However, Python 2.6 reached end-of-life in 2013, and some common libraries have been dropping support for it. If you want to preserve Python 2.6 compatibility for the sake of making a library more widely available, well, I’d urge you to reconsider. If you want to preserve Python 2.6 compatibility because you’re running a proprietary app on it, you should stop reading this right now and go upgrade to 2.7 already.

  • Python 2.7 is the last release of the Python 2 series, but is guaranteed to be supported until at least 2020. The major focus of the release was backporting a lot of minor Python 3 features, making it the best possible target for code that’s meant to run on both 2 and 3.

  • There is, of course, also the choice of dropping Python 2 support, in which case this process will be much easier. Python 2 is still very widely-used, though, so library authors probably won’t want to do this. App authors do have the option, but unless your app is trivial, it’s much easier to maintain Python 2 support during the port — that way you can port iteratively, and the app will still function on Python 2 in the interim, rather than being a 2/3 hybrid that can’t run on either.

Most of this post assumes you’re targeting Python 2.7, though there are mentions of 2.6 as well.

You also have to decide which version of Python 3 to target.

  • Python 3.0 and 3.1 are forgettable. Python 3 was still stabilizing for its first couple minor versions, and from what I hear, compatibility with both 2.7 and 3.0 is a huge pain. Both versions are also past end-of-life.

  • Python 3.2 and 3.3 are a common minimum version to target. Python 3.3 reinstated support for u'...' literals (redundant in Python 3, where normal strings are already Unicode), which makes supporting both 2 and 3 much easier. I bundle it with Python 3.2 because the latest version that stable PyPy supports is 3.2, but it also supports u'...' literals. You’ll support the biggest surface area by targeting that, a sort of 3.2½. (There’s an alpha PyPy supporting 3.3, but as of this writing it’s not released as stable yet.)

  • Python 3.4 and 3.5 add shiny new features, but you can only really use them if you’re dropping support for Python 2. Again, I’d suggest targeting Python 2.7 + Python 3.2½ first, then dropping the Python 2 support and adding whatever later Python 3 trinkets you want.

Another consideration is what attitude you want your final code to take. Do you want Python 2 code with enough band-aids that it also works on Python 3, or Python 3 code that’s carefully written so it still works on Python 2? The differences are subtle! Consider code like x = map(a, b). map returns a list in Python 2, but a lazy iterable in Python 3. Which way do you want to port this code?

# Python 2 style: force eager evaluation, even on Python 3
x = list(map(a, b))

# Python 3 style: use lazy evaluation, even on Python 2
try:
    from future_builtins import map
except ImportError:
    pass
x = map(a, b)

The answer may depend on which Python you primarily use for development, your target audience, or even case-by-case based on how x is used.

Personally, I’d err on the side of preserving Python 3 semantics and porting them to Python 2 when possible. I’m pretty used to Python 3, though, and you or your team might be thrown for a loop by changing Python 2’s behavior.

At the very least, prefer if PY2 to if not PY3. The former stresses that Python 2 is the special case, which is increasingly true going forward. Eventually there’ll be a Python 4, and perhaps even a Python 5, and those future versions will want the “Python 3” behavior.
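As a rough sketch of what that looks like in practice, assuming you define the flag yourself rather than importing it from a compatibility library (the names PY2, text_type, and ensure_text here are purely illustrative):

```python
import sys

# True only on Python 2; every later major version takes the default branch.
PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)
else:
    text_type = str


def ensure_text(value, encoding="utf-8"):
    """Return value as a Unicode string, decoding byte strings if needed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)
```

Python 2 is the branch that gets spelled out, and everything else (3, and whatever comes after) falls through to the else.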

Some helpful tools

The good news is that you don’t have to do all of this manually.

2to3 is a standard library module (since 2.6) that automatically modifies Python 2 source code to change some common Python 2 constructs to the Python 3 equivalent. (It also doubles as a framework for making arbitrary changes to Python code.)

Unfortunately, it ports 2 to 3, not 2 to 2+3. For libraries, it’s possible to rig 2to3 to run automatically on your code just before it’s installed on Python 3, so you can keep writing Python 2 code — but 2to3 isn’t perfect, and this makes it impossible to develop with your library on Python 3, so Python 3 ends up as a second-class citizen. I wouldn’t recommend it.

The more common approach is to use something like six, a library that wraps many of the runtime differences between 2 and 3, so you can run the same codebase on both 2 and 3.

Of course, that still leaves you making the changes yourself. A more recent innovation is the python-future project, which combines both of the above. It has a future library of renames and backports of Python 3 functionality that goes further than six and is designed to let you write Python 3-esque code that still runs on Python 2. It also includes a futurize script, based on the 2to3 plumbing, that rewrites your code to target 2+3 (using python-future’s library) rather than just 3.

The nice thing about python-future is that it explicitly takes the stance of writing code against Python 3 semantics and backporting them to Python 2. It’s very dedicated to this: it has a future.builtins module that includes not only easy cases like map, but also entire pure-Python reimplementations of types like bytes. (Naturally, this adds some significant overhead as well.) I do like the overall attitude, but I’m not totally sold on all the changes, and you might want to leaf through them to see which ones you like.

futurize isn’t perfect, but it’s probably the best starting point. The 2to3 design splits the various edits into a variety of “fixers” that each make a single style of change, and futurize works the same way, inheriting many of the fixers from 2to3. The nice thing about futurize is that it groups the fixers into “stages”, where stage 1 (futurize --stage1) only makes fairly straightforward changes, like fixing the except syntax. More importantly, it doesn’t add any dependencies on the future library, so it’s useful for making the easy changes even if you’d prefer to use six. You’re also free to choose individual fixes to apply, if you discover that some particular change breaks your code.

Another advantage of this approach is that you can tackle the porting piecemeal, which is great for very large projects. Run one fixer at a time, starting with the very simple ones like updating to except ... as ... syntax, and convince yourself that everything is fine before you do the next one. You can make some serious strides towards 3 compatibility just by eliminating behavior that already has cromulent alternatives in Python 2.

If you expect your Python 3 port to take a very long time — say, if you have a large project with numerous developers and a frantic release schedule — then you might want to prevent older syntax from creeping in with a tool like autopep8, which can automatically fix some deprecated features with a much lighter touch. If you’d like to automatically enforce that, say, from __future__ import absolute_import is at the top of every Python file, that’s a bit beyond the scope of this article, but I’ve had pre-commit + reorder_python_imports thrust upon me in the past to fairly good effect.

Anyway! For each of the issues below, I’ll mention whether futurize can fix it, the name of the responsible fixer, and whether six has anything relevant. If the name of the fixer begins with lib2to3, that means it’s part of the standard library, and you can use it with 2to3 without installing python-future.

Here we go!

Things you shouldn’t even be doing

These are ancient, ancient practices, and even Python 2 programmers may be surprised by them. Some of them are arguably outright bugs in the language; others are just old and forgotten. They generally have equivalents that work even in older versions of Python 2.

Old-style classes

class Foo:
    ...

In Python 3, this code creates a class that inherits from object. In Python 2, it creates a completely different kind of thing entirely: an “old-style” class, which worked a little differently from built-in types. The differences are generally subtle:

  • Old-style classes don’t support __getattribute__ or __slots__.

  • Old-style classes don’t correctly support data descriptors, i.e. the assignment behavior of @property.

  • Old-style classes had a __coerce__ method, which would attempt to turn a value into a built-in numeric type before performing a math operation.

  • Old-style classes didn’t use the C3 MRO, so in the case of diamond inheritance, a class could be skipped entirely by super().

  • Old-style instances check the instance for a special method name; new-style instances check the type. Additionally, if a special method isn’t found on an old-style instance, the lookup falls back to __getattr__; this is not the case for new-style classes (which makes proxying more complicated).

That last one is the only thing old-style classes can do that new-style classes cannot, and if you’re relying on it, you have a bit of refactoring to do. (The really curious thing is that there doesn’t seem to be a particularly good reason for the limitation on new-style classes, and it doesn’t even make things faster. Maybe that’ll be fixed in Python 4?)

If you have no idea what any of that means or why you should care, chances are you’re either not using old-style classes at all, or you’re only using them because you forgot to write (object) somewhere. In that case, futurize --stage2 will happily change class Foo: to class Foo(object): for you, using the libpasteurize.fixes.fix_newstyle fixer. (Strictly speaking, this is a Python 2 compatibility issue, since the old syntax still works fine in Python 3 — it just means something else now.)
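To make the special-method point above a bit more concrete, here is a small sketch of a proxy written as a new-style class. Target and Proxy are made-up names for illustration; the behavior shown is what I’d expect from new-style classes on Python 2 and from every class on Python 3.

```python
class Target(object):
    def __len__(self):
        return 3

    def describe(self):
        return "target"


class Proxy(object):
    """Forwards attribute lookups to a wrapped object via __getattr__."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        return getattr(self._wrapped, name)


proxy = Proxy(Target())

# Ordinary attributes are forwarded as expected.
description = proxy.describe()

# len() looks for __len__ on type(proxy), not on the instance, so
# __getattr__ is never consulted and the call fails.
try:
    len(proxy)
except TypeError:
    forwarded = False
else:
    forwarded = True
```

If the proxy needs to support len(), it has to define __len__ itself and delegate by hand.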

cmp

Python 2 originally used the C approach for sorting. Given two things A and B, a comparison would produce a negative number if A < B, zero if A == B, and a positive number if A > B. This was the only way to customize sorting; there’s a cmp() built-in function, a __cmp__ special method, and cmp arguments to list.sort() and sorted().

This is a little cumbersome, as you may have noticed if you’ve ever tried to do custom sorting in Perl or JavaScript. Even a case-insensitive sort involves repeating yourself. Most custom sorts will have the same basic structure of cmp(op(a), op(b)), when the only thing you really care about is op.

names.sort(cmp=lambda a, b: cmp(a.lower(), b.lower()))

But more importantly, the C approach is flat-out wrong for some types. Consider sets, which use comparison to indicate subsets versus supersets:

{1, 2} < {1, 2, 3}  # True
{1, 2, 3} > {1, 2}  # True
{1, 2} < {1, 2}  # False
{1, 2} <= {1, 2}  # True

So what to do with {1, 2} < {3, 4}, where none of the three possible answers is correct?

Early versions of Python 2 added “rich comparisons”, which introduced methods for all six possible comparisons: __eq__, __ne__, __lt__, __le__, __gt__, and __ge__. You’re free to return False for all six, or even True for all six, or return NotImplemented to allow deferring to the other operand. The cmp argument became key instead, which allows mapping the original values to a different item to use for comparison:

names.sort(key=lambda a: a.lower())

(This is faster, too, since there are fewer calls to the lambda, fewer calls to .lower(), and no calls to cmp.)
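Rich comparisons are also what make a partial ordering like the set example expressible at all. Here’s a minimal sketch using a made-up Interval class (not from any library), where “less than” means “lies entirely to the left of”:

```python
class Interval(object):
    """A closed interval [low, high]."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def __eq__(self, other):
        if not isinstance(other, Interval):
            return NotImplemented
        return (self.low, self.high) == (other.low, other.high)

    def __ne__(self, other):
        # Python 2 does not derive __ne__ from __eq__, so spell it out.
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result

    def __hash__(self):
        return hash((self.low, self.high))

    def __lt__(self, other):
        if not isinstance(other, Interval):
            return NotImplemented
        return self.high < other.low

    def __gt__(self, other):
        if not isinstance(other, Interval):
            return NotImplemented
        return self.low > other.high


# Overlapping intervals are neither less than, equal to, nor greater than
# each other, which a single cmp() return value could never express.
a = Interval(1, 5)
b = Interval(3, 8)
overlap_results = (a < b, a == b, a > b)
```

All three comparisons come back False for the overlapping pair, while Interval(1, 2) < Interval(3, 4) is still True.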


So, fixing all this. Luckily, Python 2 supports all of the new stuff, so you don’t need compatibility hacks.

To replace simple implementations of __cmp__, you need only write the appropriate rich comparison methods. You could even do this the obvious way:

class Foo(object):
    def __cmp__(self, other):
        return cmp(self.prop, other.prop)

    def __eq__(self, other):
        return self.__cmp__(other) == 0

    def __ne__(self, other):
        return self.__cmp__(other) != 0

    def __lt__(self, other):
        return self.__cmp__(other) < 0

    ...

You would also have to change the use of cmp to a manual if tree, since cmp is gone in Python 3. I don’t recommend this.

A lazier alternative would be to use functools.total_ordering (in the standard library since Python 2.7 and 3.2), which generates the remaining three ordering methods, given a class that implements __eq__ and one of __lt__, __le__, __gt__, or __ge__:

import functools


@functools.total_ordering
class Foo(object):
    def __eq__(self, other):
        return self.prop == other.prop

    def __lt__(self, other):
        return self.prop < other.prop

There are a couple problems with this code. For one, it’s still pretty repetitive, accessing .prop four times (and imagine if you wanted to compare several properties). For another, it’ll either cause an error or do entirely the wrong thing if you happen to compare with an object of a different type. You should return NotImplemented in this case, but total_ordering doesn’t handle that correctly until Python 3.4. If those bother you, you might enjoy my own classtools.keyed_ordering, which uses a __key__ method (much like the key argument) to generate all six methods:

@classtools.keyed_ordering
class Foo(object):
    def __key__(self):
        return self.prop
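If you’d rather stay inside the standard library, one possible shape is sketched below. The Version class and its _key helper are invented for illustration, and the mixed-type behavior relies on the Python 3.4+ fix to total_ordering mentioned above; on older versions, comparing against a foreign type may misbehave.

```python
import functools


@functools.total_ordering
class Version(object):
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        # Single place that decides what gets compared.
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    # On Python 2 you would also want an explicit __ne__ here.

    def __hash__(self):
        return hash(self._key())

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()


older = Version(1, 2)
newer = Version(1, 10)
is_older = older < newer          # uses __lt__ directly
is_not_older = newer >= older     # generated by total_ordering
```

Returning NotImplemented lets Python try the other operand, and raise TypeError if neither side knows what to do.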

Replacing uses of cmp arguments should be straightforward: a cmp argument of cmp(op(a), op(b)) becomes a key argument of op. If you’re doing something more elaborate, there’s a functools.cmp_to_key function (in the standard library since Python 2.7 and 3.2), which converts a cmp function to one usable as a key. (The implementation is much like the first Foo example above: it involves a class that calls the wrapped function from its comparison methods, and returns True or False depending on the return value.)
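Here’s a short sketch of both routes, using a made-up comparison function that sorts by length and then alphabetically. The function and variable names are only for illustration.

```python
import functools


def compare_length_then_alpha(a, b):
    """Old-style comparison: returns a negative, zero, or positive number."""
    if len(a) != len(b):
        return len(a) - len(b)
    if a < b:
        return -1
    if a > b:
        return 1
    return 0


words = ["pear", "fig", "apple", "kiwi"]

# Python 2 only: sorted(words, cmp=compare_length_then_alpha)
by_cmp = sorted(words, key=functools.cmp_to_key(compare_length_then_alpha))

# The same ordering expressed directly as a key function.
by_key = sorted(words, key=lambda w: (len(w), w))
```

Both produce the same list; the key version is shorter and skips the wrapper class that cmp_to_key has to create for every item.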

Finally, if you’re using cmp directly, don’t do that. If you really, really need it for something other than Python’s own sorting, just use an if.

The only help futurize offers is in futurize --stage2, via libfuturize.fixes.fix_cmp, which adds an import of past.builtins.cmp if it detects you’re using the cmp function anywhere.

Comparing incompatible types

Python 2’s use of C-style ordering also means that any two objects, of any types, must be either equal or occur in some defined order. Python’s answer to this problem is to sort on the names of the types. So None < 3 < "1", because "NoneType" < "int" < "str".

Python 3 removes this fallback rule; if two values don’t know how to compare against each other (i.e. both return NotImplemented), you just get a TypeError.

This might affect you in subtle ways, such as if you’re sorting a list of objects that may contain Nones and expecting it to silently work. The fix depends entirely on the type of data you have, and no automated tool can handle that for you. Most likely, you didn’t mean to be sorting a heterogeneous list in the first place.

Of course, you could always sort on type(x).__name__, but I don’t know why you would do that.
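If you really do have a list of numbers with the occasional None mixed in, one option is a key function that says explicitly where None goes. This is only a sketch; none_first is a hypothetical helper, and it assumes the non-None values can all be compared with each other.

```python
def none_first(value):
    """Sort key that places None ahead of every real value."""
    # The first element settles None versus not-None; the second is only
    # ever compared between two real values (or two placeholder zeros).
    return (value is not None, 0 if value is None else value)


scores = [3, None, 1, None, 2]
ordered = sorted(scores, key=none_first)
```

This behaves the same on Python 2 and 3, and it documents the intended position of None instead of relying on type names.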

The sets module

Python 2.3 introduced its set types as Set and ImmutableSet in the sets module. Since Python 2.4, they’ve been built-in types, set and frozenset. The sets module is gone in Python 3, so just use the built-in names.

Creating exceptions

Python 2 allows you to do this:

raise RuntimeError, "an error happened at runtime!!"

There’s not really any good reason to do this, since you can just as well do:

raise RuntimeError("an error happened at runtime!!")

futurize --stage1 will rewrite the two-arg form to a regular object creation via the libfuturize.fixes.fix_raise fixer. It’ll also fix this alternative way of specifying an exception type, which is so bizarre and obscure that I did not know about it until I read the fixer’s source code:

raise (((A, B), C), ...)  # equivalent to `raise A` (?!)

Additionally, exceptions act like sequences in Python 2, but not in Python 3. You can just operate on the .args sequence directly, in either version. Alas, there’s no automated way to fix this.
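A tiny sketch of the .args approach, with a made-up error message and code:

```python
try:
    raise ValueError("bad value", 42)
except ValueError as exc:
    # Python 2 only: message, code = exc   (or exc[0] and exc[1])
    # Portable: go through .args, which is a plain tuple on both versions.
    message, code = exc.args
```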

Backticks

Did you know that `x` is equivalent to repr(x) in Python 2? Yeah, most people don’t. It’s super weird. futurize --stage1 will fix this with the lib2to3.fixes.fix_repr fixer.

has_key

Very old code may still be using somedict.has_key("foo"). "foo" in somedict has worked since Python 2.2. What are you doing. futurize --stage1 will fix this with the lib2to3.fixes.fix_has_key fixer.

<>

<> is equivalent to != in Python 2! This is an ancient, ancient holdover, and there’s no reason to still be using it. futurize --stage1 will fix this with the lib2to3.fixes.fix_ne fixer.

(You could also use from __future__ import barry_as_FLUFL, which restores <> in Python 3. It’s an easter egg. I’m joking. Please don’t actually do this.)

Things with easy Python 2 equivalents

These aren’t necessarily ancient, but they have an alternative you can just as well express in Python 2, so there’s no need to juggle 2 and 3.

Other ancient builtins

apply() is gone. Use the built-in syntax, f(*args, **kwargs).

callable() was briefly gone, but then came back in Python 3.2.

coerce() is gone; it was only used for old-style classes.

execfile() is gone. Read the file and pass its contents to exec() instead.

file() is gone; Python 3 has multiple file types, and a hierarchy of interfaces defined in the io module. Occasionally, code uses this as a synonym for open(), but you should really be using open() anyway.

intern() has been moved into the sys module, though I have no earthly idea why you’d be using it.

raw_input() has been renamed to input(), and the old ludicrous input() is gone. If you really need input(), please stop.

reduce() has been moved into the functools module, but it’s there in Python 2.6 as well.

reload() has been moved into the imp module (and, as of Python 3.4, into importlib). It’s unreliable garbage and you shouldn’t be using it anyway.

futurize --stage1 can fix several of these:

  • apply, via lib2to3.fixes.fix_apply
  • intern, via lib2to3.fixes.fix_intern
  • reduce, via lib2to3.fixes.fix_reduce

futurize --stage2 can also fix execfile via the libfuturize.fixes.fix_execfile fixer, which imports past.builtins.execfile. The 2to3 fixer uses an open() call, but the true correct fix is to use a with block.
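A hand-rolled replacement using a with block might look something like this. The exec_file name is hypothetical, and this sketch only covers the common case of running a file in a fresh or supplied globals dict, not every argument combination the old execfile() accepted.

```python
import os
import tempfile


def exec_file(path, globals_dict=None):
    """Run the Python source in `path`, roughly like Python 2's execfile()."""
    if globals_dict is None:
        globals_dict = {}
    # The with block guarantees the file is closed even if reading fails.
    with open(path, "rb") as f:
        source = f.read()
    # Compiling with the real filename keeps tracebacks readable.
    code = compile(source, path, "exec")
    exec(code, globals_dict)
    return globals_dict


# Small demonstration with a throwaway file.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
    tmp.write("answer = 6 * 7\n")
namespace = exec_file(tmp.name)
os.remove(tmp.name)
```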

futurize --stage2 has a couple of fixers for raw_input, but you can just as well import future.builtins.input or six.moves.input.

Nothing can fix coerce, which has no equivalent. Curiously, I don’t see a fixer for file, which is trivially fixed by replacing it with open. Nothing for reload, either.
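For the builtins that simply moved, a few lines at the top of a module are usually enough. This is a sketch of the manual approach, with a toy add function for illustration; the fixers above produce something along the same lines.

```python
import sys
from functools import reduce  # a builtin on Python 2, but also importable since 2.6

try:
    intern = sys.intern  # Python 3
except AttributeError:
    pass  # Python 2: intern() is still a builtin


def add(a, b):
    return a + b


args = (1, 2)

# Python 2 only: apply(add, args)
total = add(*args)

# Works unchanged on both versions thanks to the functools import.
product = reduce(lambda x, y: x * y, [1, 2, 3, 4])

# Interned strings with equal contents are the same object.
same_object = intern("some_key") is intern("some_key")
```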

Catching exceptions

Historically, the way to say “if there’s a ValueError, store it in e and run some code” was:

try:
    ...
except ValueError, e:
    ...

Unfortunately, that’s very easy to confuse with the syntax for catching two different types of exception:

except (ValueError, TypeError):
    ...

If you forget the parentheses, you’ll only catch ValueError, and the exception will be assigned to a variable called, er, TypeError. Whoops!

Python 3.0 introduced clearer syntax, which was also backported to Python 2.6:

except ValueError as e:
    ...

Python 3.0 finally removed the old syntax, so you must use the as form. futurize --stage1 will fix this with the lib2to3.fixes.fix_except fixer.

As an additional wrinkle, the extra variable e is deleted at the end of the block in Python 3, but not in Python 2. If you really need to refer to it after the block, just assign it to a different name.

(The reason for this is that captured exceptions contain a traceback in Python 3, and tracebacks contain the locals for the current frame, and those locals will contain the captured exception. The resulting cycle would keep all local variables alive until the cycle detector dealt with it, at least in CPython. Scrapping the exception as soon as it’s been dealt with was a simple way to keep this from accidentally happening all over the place. It usually doesn’t make sense to refer to a captured exception after the except block, anyway, since the variable may or may not even exist, and that’s generally weird and bad in Python.)
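A small sketch of the “assign it to a different name” workaround, which behaves the same on Python 2.6+ and Python 3. The parse_port helper is invented for the example.

```python
def parse_port(text):
    """Hypothetical helper: return (port, error), keeping the exception around."""
    error = None
    try:
        return int(text), None
    except ValueError as e:
        error = e  # `e` is unbound after this block on Python 3; `error` survives
    return None, error


port, problem = parse_port('http')
```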

Octals

It’s not uncommon for a new programmer to try to zero-pad a set of numbers:

a = 07
b = 08
c = 09
d = 10

Of course, this will have the rather bizarre result that 08 is a SyntaxError, even though 07 works fine — because numbers starting with a 0 are parsed as octal.

This is a holdover from C, and it’s fairly surprising, since there’s virtually no reason to ever use octal. The only time I can ever remember using it is for passing file modes to chmod.

Python 3.0 requires octal literals to be prefixed with 0o, in line with 0x for hex and 0b for binary; literal integers starting with only a 0 are a syntax error. Python 2.6 supports both forms.

futurize --stage1 will fix this with the lib2to3.fixes.fix_numliterals fixer.
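For reference, here is what the portable spelling looks like. The variable names are arbitrary; the point is only that 0o parses on both Python 2.6+ and Python 3, and that zero-padding belongs in strings rather than literals.

```python
# 0o works on Python 2.6+ and Python 3; a bare leading 0 only parses on Python 2.
mode = 0o644  # rw-r--r--, the kind of value you would hand to os.chmod

# Zero-padded *strings* are fine; int() reads them as decimal.
padded = [int(s) for s in ('07', '08', '09', '10')]

# Format back to octal digits, without the 0o prefix.
as_text = '%o' % mode
```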

pickle

If you’re using the pickle module (which you shouldn’t be), and you intend to pass pickles back and forth between Python 2 and Python 3, there’s a small issue to be aware of. pickle has several different “protocol” versions, and the default version used in Python 3 is protocol 3, which Python 2 cannot read.

The fix is simple: just find where you’re calling pickle.dump() or pickle.dumps(), and pass a protocol argument of 2. Protocol 2 is the highest version supported by Python 2, and you probably want to be using it anyway, since it’s much more compact and faster to read/write than Python 2’s default, protocol 0.

You may be already using HIGHEST_PROTOCOL, but you’ll have the same problem: the highest protocol supported in any version of Python 3 is unreadable by Python 2.
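In code, that amounts to pinning the protocol explicitly. A minimal sketch, with a made-up constant name and payload:

```python
import pickle

# Protocol 2 is the newest protocol Python 2 can read.
CROSS_VERSION_PROTOCOL = 2

payload = {'name': u'squid', 'count': 8}
data = pickle.dumps(payload, protocol=CROSS_VERSION_PROTOCOL)
restored = pickle.loads(data)
```

The same protocol argument works for pickle.dump().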


A somewhat bigger problem is that if you pickle an instance of a user-defined class on Python 2, the pickle will record all its attributes as bytestrings, because that’s what they are in Python 2. Python 3 will then dutifully load the pickle and populate your object’s __dict__ with keys like b'foo'. obj.foo will then not actually exist, because obj.foo looks for the string 'foo', and 'foo' != b'foo' in Python 3.

Don’t use pickle, kids.

It’s possible to fix this, but also a huge pain in the ass. If you don’t know how, you definitely shouldn’t be using pickle.

Things that have a __future__ import

Occasionally, the syntax changed in an incompatible way, but the new syntax was still backported and hidden behind a __future__ import — Python’s mechanism for opting into syntax changes. You have to put such an import at the top of the file, optionally after a docstring, like this:

"""My super important module."""
from __future__ import with_statement

print is now a function

Ugh! Parentheses! Why, Guido, why?

The reason is that the print statement has incredibly goofy syntax, unlike anything else in the language:

print >>a, b, c,

You might not even recognize the >> bit, but it lets you print to a file other than sys.stdout. It’s baked specifically into the print syntax. Python 3 replaces this with a straightforward built-in function with a couple extra bells and whistles. The above would be written:

print(b, c, end='', file=a)

It’s slightly more verbose, but it’s also easier to tell what’s going on, and that teeny little comma at the end is now a more obvious keyword argument.

from __future__ import print_function will forget about the print statement for the rest of the file, and make the builtin print function available instead. futurize --stage1 will fix all uses of print and add the __future__ import, with the libfuturize.fixes.fix_print_with_import fixer. (There’s also a 2to3 fixer, but it doesn’t add the __future__ import, since it’s unnecessary in Python 3.)

A word of warning: do not just use print with parentheses without adding the __future__ import. This may appear to work in stock Python 2:

print("See, what's the problem?  This works fine!")

However, that’s parsed as the print statement followed by an expression in parentheses. It becomes more obvious if you try to print two values:

print("The answer is:", 3)
# ("The answer is:", 3)

Now you have a comma inside parentheses, which is a tuple, so the old print statement prints its repr.
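With the __future__ import in place, the same calls behave identically on Python 2.6+ and Python 3. The sketch below prints into a tiny hypothetical Collector object rather than a real file, just so the result is easy to inspect.

```python
from __future__ import print_function  # must come before any other statement


class Collector(object):
    """Hypothetical file-like object: anything with a write() method will do."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)


log = Collector()
print("The answer is:", 3, file=log)  # two arguments, not a tuple
print("b", "c", end="", file=log)     # end="" replaces the old trailing comma
text = "".join(log.chunks)
```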

Division always produces a float

Quick, what’s the answer here?

5 / 2

If you’re a normal human being, you’ll say 2.5 or 2½. Unfortunately, if you’re like Python and have been afflicted by C, you might say the answer is 2, because this is “integer division” — a bizarre and alien concept probably invented because CPUs didn’t have FPUs when C was first invented.

Python 3.0 decided that maybe contorting fundamental arithmetic to match the inadequacies of 1970s hardware is not the best idea, and so it changed division to always produce a float.

Since Python 2.2, from __future__ import division will alter the division operator to always do true division. If you want to do floor division, there’s a separate // operator, which has existed for just as long; you can use it in Python 2 with or without the __future__ import.

Note that true division always produces a float, even if the result is integral: 6 / 3 is 2.0. On the other hand, floor division uses the same typing rules as C-style division: 5 // 2 is 2, but 5 // 2.0 is 2.0.

futurize --stage2 will “fix” this with the libfuturize.fixes.fix_division fixer, but unfortunately that just adds the __future__ import. With the --conservative option, it uses the libfuturize.fixes.fix_division_safe fixer instead, which imports past.utils.old_div, a forward-port of Python 2’s division operator.

The trouble here is that the new / always produces a float, and the new // always floors, but the old / sometimes did one and sometimes did the other. futurize can’t just replace all uses of / with //, because 5/2.0 is 2.5 but 5//2.0 is 2.0, and it can’t generally know what types the operands are.

You might be best off fixing this one manually — perhaps using fix_division_safe to find all the places you do division, then changing them to use the right operator.

Of course, the __div__ magic method is gone in Python 3, replaced explicitly by __floordiv__ (//) and __truediv__ (/). Both of those methods already exist in Python 2, and __truediv__ is even called when you use / in the presence of the future import, so being compatible is a simple matter of implementing all three and deferring to one of the others from __div__.
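Here’s roughly what “implement all three” can look like. Meters is a hypothetical class, and pointing __div__ at __truediv__ is a choice made for this sketch; if your existing Python 2 callers rely on flooring, you’d defer to __floordiv__ instead.

```python
class Meters(object):
    """Hypothetical numeric wrapper supporting / and // on Python 2 and 3."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):
        # Go through float so that 5 / 2 is 2.5 even for integer operands.
        return Meters(float(self.value) / other.value)

    def __floordiv__(self, other):
        return Meters(self.value // other.value)

    # Python 2 without the __future__ import looks for __div__ instead.
    __div__ = __truediv__


half = Meters(5) / Meters(2)
floored = Meters(5) // Meters(2)
```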

Relative imports

In Python 2, if you’re in the module foo.bar and say import quux, Python will look for a foo.quux before it looks for a top-level quux. The former behavior is called a relative import, though it might be more clearly called a sibling import. It’s troublesome for several reasons.

  • If you have a sibling called quux, and there’s also a top-level or standard library module called quux, you can’t import the latter. (There used to be a py.std module for providing indirect access to the standard library, for this very reason!)

  • If you import the top-level quux module, and then later add a foo.quux module, you’ll suddenly be importing a different module.

  • When reading the source code, it’s not clear which imports are siblings and which are top-level. In fact, the modules you get depend on the module you’re in, so moving or renaming a file may change its imports in non-obvious ways.

Python 3 eliminates this behavior: import quux always means the top-level module. It also adds syntax for “explicit relative” or “absolute relative” (yikes) imports: from . import quux or from .quux import somefunc explicitly means to look for a sibling named quux. (You can also use ..quux to look in the parent package, three dots to look in the grandparent, etc.)

The explicit syntax is supported since Python 2.5. The old sibling behavior can be disabled since Python 2.5 with from __future__ import absolute_import.

futurize --stage1 has a libfuturize.fixes.fix_absolute_import fixer, which attempts to detect sibling imports and convert them to explicit relative imports. If it finds any sibling imports, it’ll also add the __future__ line, though honestly you should make an effort to put that line in all of your Python 2 code.

It’s possible for the futurize fixer to guess wrong about a sibling import, but in general it works pretty well.

(There is one case I’ve run across where simply replacing import sibling with from . import sibling didn’t work. Unfortunately, it was Yelp code that I no longer have access to, and I can’t remember the precise details. It involved having several sibling imports inside a __init__.py, where the siblings also imported from each other in complex ways. The sibling imports worked, but the explicit relative imports failed, for some really obscure timing reason. It’s even possible this was a 2.6 bug that’s been fixed in 2.7. If you see it, please let me know!)

Things that require some effort

These problems are a little more obscure, but many of them are also more difficult to fix automatically. If you have a massive codebase, these are where the problems start to appear.

The grand module shuffle

A whole bunch of modules were deleted, merged, or renamed. A full list is in PEP 3108, but you’ll never have heard of most of them. Here are the ones that might affect you.

  • __builtin__ has been renamed to builtins. Note that this is a module, not the __builtins__ attribute of modules, which is exactly why it was renamed. Incidentally, you should be using the builtins module rather than __builtins__ anyway. Or, wait, no, just don’t use either, please don’t mess with the built-in scope.

  • ConfigParser has been renamed to configparser.

  • Queue has been renamed to queue.

  • SocketServer has been renamed to socketserver.

  • cStringIO and StringIO are gone; instead, use StringIO or BytesIO from the io module. Note that these also exist in Python 2, though in Python 2.6 they are pure-Python rather than the C versions found in Python 2.7 and current Python 3.

  • cPickle is gone. Importing pickle in Python 3 now gives you the C implementation automatically.

  • cProfile is not actually gone: it still exists in Python 3, and importing profile still gives you the slower pure-Python implementation, so keep importing cProfile by name.

  • copy_reg has been renamed to copyreg.

  • anydbm, dbhash, dbm, dumbdbm, gdbm, and whichdb have all been merged into a dbm package.

  • dummy_thread has become _dummy_thread. It’s an implementation of the _thread module that doesn’t actually do any threading. You should be using dummy_threading instead, I guess?

  • httplib has become http.client. BaseHTTPServer, CGIHTTPServer, and SimpleHTTPServer have been merged into a single http.server module. Cookie has become http.cookies. cookielib has become http.cookiejar.

  • repr has been renamed to reprlib. (The module, not the built-in function.)

  • thread has been renamed to _thread, and you should really be using the threading module instead.

  • A whole mess of top-level Tk modules have been combined into a tkinter package.

  • The contents of urllib, urllib2, and urlparse have been consolidated and then split into urllib.error, urllib.parse, and urllib.request.

  • xmlrpclib has become xmlrpc.client. DocXMLRPCServer and SimpleXMLRPCServer have been merged into xmlrpc.server.

futurize --stage2 will fix this with the somewhat invasive libfuturize.fixes.fix_future_standard_library fixer, which uses a mechanism from future that adds aliases to Python 2 to make all the Python 3 standard library names work. It’s an interesting idea, but it didn’t actually work for all cases when I tried it (though now I can’t recall what was broken), so YMMV.

Alternatively, you could manually replace any affected imports with imports from six.moves, which provides aliases that work on either version.

Or as a last resort, you can just sprinkle try ... except ImportError around.
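The last-resort version looks something like this. It’s a sketch covering three of the renames listed above; the URL is just a placeholder.

```python
try:
    import configparser  # Python 3 name
except ImportError:
    import ConfigParser as configparser  # Python 2 name

try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse  # Python 2 location

try:
    import queue  # Python 3 name
except ImportError:
    import Queue as queue  # Python 2 name

parts = urlparse('https://www.example.com/blog?page=2')
jobs = queue.Queue()
jobs.put('first')
```

Trying the Python 3 name first means the fallback branch is the one you eventually delete.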

Built-in iterators are now lazy

filter, map, range, and zip are all lazy in Python 3. You can still iterate over their return values (once), but if you have code that expects to be able to index them or traverse them more than once, it’ll break in Python 3. (Well, not range, that’s fine.) The lazy equivalents — xrange and the functions in itertools — are of course gone in Python 3.

In either case, the easiest thing to do is force eager evaluation by wrapping the call in list() or tuple(), which you’ll occasionally need to do in Python 3 regardless.

For the sake of consistency, you may want to import the lazy versions from the standard library future_builtins module. It only exists in Python 2, so be sure to wrap the import in a try.

futurize --stage2 tries to address this with several of lib2to3’s fixers, but the results aren’t particularly pleasing: calls to all four are unconditionally wrapped in list(), even in an obviously safe case like a for block. I’d just look through your uses of them manually.

A more subtle point: if you pass a string or tuple to Python 2’s filter, the return value will be the same type. Blindly wrapping the call in list() will of course change the behavior. Filtering a string is not a particularly common thing to do, but I’ve seen someone complain about it before, so take note.

Also, Python 3’s map stops at the shortest input sequence, whereas Python 2 extends shorter sequences with Nones. You can fix this with itertools.zip_longest (which in Python 2 is izip_longest!), but honestly, I’ve never even seen anyone pass multiple sequences to map.
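A few of these workarounds side by side, as a sketch that should run on both Python 2.6+ and Python 3. The sample data is invented.

```python
try:
    from itertools import zip_longest  # Python 3
except ImportError:
    from itertools import izip_longest as zip_longest  # Python 2

# Force eager evaluation when you need to index or iterate twice.
squares = list(map(lambda n: n * n, range(4)))

# Keep a filtered string a string, as Python 2's filter did.
vowels = ''.join(c for c in 'firefly squid' if c in 'aeiou')

# Python 2's multi-sequence map padded short inputs with None; spell it out.
padded = list(zip_longest([1, 2, 3], ['x', 'y']))
```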

Relatedly, dict.iteritems (plus its friends, iterkeys and itervalues) is gone in Python 3, as the plain items (plus keys and values) is already lazy. The dict.view* methods are also gone, as they were only backports of Python 3’s normal behavior.

Both six and future.utils contain functions called iteritems, etc., which provide a lazy iterator in both Python 2 and 3. They also offer view* functions, which are closer to the Python 3 behavior, though I can’t say I’ve ever seen anyone actually use dict.viewitems in real code.

Of course, if you explicitly want a list of dictionary keys (or items or values), list(d) and list(d.items()) do the same thing in both versions.
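If you don’t want a dependency just for this, the helper is only a couple of lines. This iteritems is a hypothetical stand-in written for this example, not the six or future.utils implementation.

```python
def iteritems(d):
    """Hypothetical helper: lazy (key, value) iteration on Python 2 and 3."""
    # Python 2 dicts have iteritems(); Python 3 dicts only have the lazy items().
    method = getattr(d, 'iteritems', None) or d.items
    return iter(method())


counts = {'a': 1, 'b': 2}
total = sum(count for _, count in iteritems(counts))
names = sorted(list(counts))          # list(d) gives the keys on both versions
pairs = sorted(list(counts.items()))  # an honest list on both versions
```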

buffer is gone

The buffer type has been replaced by memoryview (also in Python 2.7), which is similar but not identical. If you’ve even heard of either of these types, you probably know more about the subtleties involved than I do. There’s a lib2to3.fixes.fix_buffer fixer that blindly replaces buffer with memoryview, but futurize doesn’t use it in either stage.

Several special methods were renamed

Where Python 2 has __str__ and __unicode__, Python 3 has __bytes__ and __str__. The trick is that __str__ should return the native str type for each version: a bytestring for Python 2, but a Unicode string for Python 3. Also, you almost certainly don’t want a __bytes__ method in Python 3, where bytes is no longer used for text.

Both six and python-future have a python_2_unicode_compatible class decorator that tries to do the right thing. You write only a single __str__ method that returns a Unicode string. In Python 3, that’s all you need, so the decorator does nothing; in Python 2, the decorator will rename your method to __unicode__ and add a __str__ that returns the same value encoded as UTF-8. If you need different behavior, you’ll have to roll it yourself with if PY2.
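Rolling it yourself might look like the following. This is a simplified sketch of the idea described above, with hypothetical names (unicode_compatible, Dish); the real decorators may differ in details.

```python
import sys

PY2 = sys.version_info[0] == 2


def unicode_compatible(cls):
    """Hypothetical stand-in for python_2_unicode_compatible."""
    if PY2:
        # Move the text-returning method aside, then add a bytes-returning __str__.
        cls.__unicode__ = cls.__str__
        cls.__str__ = lambda self: self.__unicode__().encode('utf-8')
    return cls


@unicode_compatible
class Dish(object):
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return u'Dish: ' + self.name


text = str(Dish(u'hotaruika'))
```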


Python 2’s next method is more appropriately __next__ in Python 3. The easy way to address this is to call your method __next__, then alias it with next = __next__. Be sure you never call it directly as a method, only with the built-in next() function.

Alternatively, future.builtins contains an alternative next which always calls __next__, but on Python 2, it falls back to trying next if __next__ doesn’t exist.

futurize --stage1 changes all uses of obj.next() to next(obj) via the libfuturize.fixes.fix_next_call fixer. futurize --stage2 renames next methods to __next__ via the lib2to3.fixes.fix_next fixer (which also fixes calls). Note that there’s a remote chance of false positives, if for some reason you happened to use next as a regular method name.
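The manual aliasing approach, sketched with a made-up Countdown iterator:

```python
class Countdown(object):
    """Hypothetical iterator that works with next() on Python 2 and 3."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

    next = __next__  # Python 2 looks for this name


first = next(Countdown(2))     # always the built-in, never .next()
remaining = list(Countdown(3))
```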


Python 2’s __nonzero__ is Python 3’s __bool__. Again, you can just alias it manually. Or futurize --stage2 will rename it with the lib2to3.fixes.fix_nonzero fixer.

Renaming it will of course break it in Python 2, but futurize --stage2 also has a libfuturize.fixes.fix_object fixer that imports python-future’s own builtins.object. The replacement object class has a few methods for making Python 3’s __str__, __next__, and __bool__ work on Python 2.

This is one of the mildly invasive things python-future does, and it may or may not sit well. Up to you.


__long__ is completely gone, as there is no long type in Python 3.

__getslice__, __setslice__, and __delslice__ are gone. Instead, slice objects are passed to __getitem__ and friends. On the off chance you use these, you’ll have to do something clever in the item methods to defer to your slice logic on Python 3.
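The usual shape of the item-method approach is an isinstance check for slice. A minimal sketch with a hypothetical Playlist class; it only covers reading, and __setitem__ and __delitem__ would follow the same pattern.

```python
class Playlist(object):
    """Hypothetical container whose __getitem__ handles both ints and slices."""

    def __init__(self, items):
        self.items = list(items)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # seq[a:b] arrives here as a slice object when there is no __getslice__
            return Playlist(self.items[index])
        return self.items[index]


tracks = Playlist('abcde')
middle = tracks[1:3]
first = tracks[0]
```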

__oct__ and __hex__ are gone; oct() and hex() now consult __index__. I seriously doubt this will impact anyone.

__div__ is gone, as mentioned previously.

Unbound methods are gone; function attributes renamed

Say you have this useless class.

class Foo(object):
    def bar(self):
        pass

In Python 2, Foo.bar is an “unbound method”, a type that’s generally unseen and unexposed other than as types.MethodType. In Python 3, Foo.bar is just a regular function.

Offhand, I can only think of one time this would matter: if you want to get at attributes on the function, perhaps for the sake of a method decorator. In Python 2, you have to go through the unbound method’s .im_func attribute to get the original function, but in Python 3, you already have the original function and can get the attributes directly.

If you’re doing this anywhere, an easy way to make it work in both versions is:

method = Foo.bar
method = getattr(method, 'im_func', method)

As for bound methods (the objects you get from accessing methods but not calling them, like [].append), the im_self and im_func attributes have been renamed to __self__ and __func__. Happily, these names also work in Python 2.6, so no compatibility hacks are necessary.

im_class is completely gone in Python 3. Methods have no interest in which class they’re attached to. They can’t, since the same function could easily be attached to more than one class. If you’re relying on im_class somehow, for some reason… well, don’t do that, maybe.

Relatedly, the func_* function attributes have been renamed to dunder names in Python 3, since assigning function attributes is a fairly common practice and Python doesn’t like to clog namespaces with its own builtin names. func_closure, func_code, func_defaults, func_dict, func_doc, func_globals, and func_name are now __closure__, __code__, etc. (Note that func_doc and func_name were already aliases for __doc__ and __name__, and func_defaults is much more easily inspected with the inspect module.) The new names also work as aliases in Python 2.6 and later; if you have to support anything older, you’ll need to do a getattr dance, or use the get_function_* functions from six.
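For illustration, here is one way the getattr dance could go, next to the bound-method attributes that need no dance at all. function_attr, greet, and Foo are hypothetical names for this sketch.

```python
def function_attr(func, name):
    """Hypothetical helper: read e.g. 'defaults' under either naming scheme."""
    dunder = '__%s__' % name
    if hasattr(func, dunder):
        return getattr(func, dunder)
    return getattr(func, 'func_' + name)  # old Python 2 spelling


def greet(name, greeting='hello'):
    return greeting + ', ' + name


class Foo(object):
    def bar(self):
        pass


defaults = function_attr(greet, 'defaults')
foo = Foo()
bound = foo.bar
plain = getattr(Foo.bar, 'im_func', Foo.bar)  # unwrap Python 2's unbound method
```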

Metaclass syntax has changed

In Python 2, a metaclass is declared by assigning to a special name in the class body:

class Foo(object):
    __metaclass__ = FooMeta
    ...

Admittedly, this doesn’t make a lot of sense. The metaclass affects how a class is created, and the class body is evaluated as part of that creation, so this is sort of a goofy hack.

Python 3 changed this, opening the door to a few new neat tricks in the process, which you can find out about in the companion article.

class Foo(object, metaclass=FooMeta):
    ...

The catch is finding a way to express this idea in both Python 2 and Python 3 — the old syntax is ignored in Python 3, and the new syntax is a syntax error in Python 2.

It’s a bit of a pain, but the class statement is really just a lot of sugar for calling the type() constructor; after all, Python classes are just instances of type. All you have to do is manually create an instance of your metaclass, rather than of type.
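Done by hand, that looks something like the sketch below. FooMeta here is a toy metaclass invented for the example (it just stamps a registry_name on each class), and Base is a throwaway intermediate class created by calling the metaclass directly.

```python
class FooMeta(type):
    """Toy metaclass: records a lowercase name on every class it creates."""

    def __new__(mcls, name, bases, namespace):
        namespace.setdefault('registry_name', name.lower())
        return super(FooMeta, mcls).__new__(mcls, name, bases, namespace)


# Call the metaclass the same way you would call type(name, bases, dict).
Base = FooMeta('Base', (object,), {})


class Foo(Base):  # inherits the metaclass from Base on Python 2 and 3
    pass
```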

Fortunately, other people have already made this work for you. futurize --stage2 will fix this using the libfuturize.fixes.fix_metaclass fixer, which imports future.utils.with_metaclass and produces the following:

from future.utils import with_metaclass

class Foo(with_metaclass(FooMeta, object)):
    ...

This creates an intermediate dummy class with the right metaclass, which you then inherit from. Classes use the same metaclass as their parents, so this works fine in any Python.

If you don’t want to depend on python-future, the same function exists in the six module.

Re-raising exceptions has different syntax

raise with no arguments does the same thing in Python 2 and Python 3: it re-raises the exception currently being handled, preserving the original traceback.

The problem comes in with the three-argument form of raise, which is for preserving the traceback while raising a different exception. It might look like this:

try:
    some_fragile_function()
except Exception as e:
    raise MyLibraryError, MyLibraryError("Failed to do a thing: " + str(e)), sys.exc_info()[2]

sys.exc_info()[2] is, of course, the only way to get the current traceback in Python 2. You may have noticed that the three arguments to raise are the same three things that sys.exc_info() returns: the type, the value, and the traceback.

Python 3 introduces exception chaining. If something raises an exception from within an except block, Python will remember the original exception, attach it to the new one, and show both exceptions when printing a traceback — including both exceptions’ types, messages, and where they happened. So to wrap and rethrow an exception, you don’t need to do anything special at all.

try:
    some_fragile_function()
except Exception:
    raise MyLibraryError("Failed to do a thing")

For more complicated handling, you can also explicitly say raise new_exception from old_exception. Exceptions contain their associated tracebacks as a __traceback__ attribute in Python 3, so there’s no need to muck around getting the traceback manually. If you really want to give an explicit traceback, you can use the .with_traceback() method, which just assigns to __traceback__ and then returns self.

raise MyLibraryError("Failed to do a thing").with_traceback(some_traceback)

It’s hard to say what it even means to write code that works “equivalently” in both versions, because Python 3 handles this problem largely automatically, and Python 2 code tends to have a variety of ad-hoc solutions. Note that you cannot simply do this:

if PY3:
    raise MyLibraryError("Beep boop") from exc
else:
    raise MyLibraryError, MyLibraryError("Beep boop"), sys.exc_info()[2]

The first raise is a syntax error in Python 2, and the second is a syntax error in Python 3. if won’t protect you from parse errors. (On the other hand, you can hide .with_traceback() behind an if, since that’s just a regular method call and will parse with no issues.)

six has a reraise function that will smooth out the differences for you (probably by using exec). The drawback is that it’s of course Python 2-oriented syntax, and on Python 3 the final traceback will include more context than expected.

Alternatively, there’s a six.raise_from, which is designed around the raise X from Y syntax of Python 3. The drawback is that Python 2 has no obvious equivalent, so you just get raise X, losing the old exception and its traceback.

There’s no clear right approach here; it depends on how you’re handling re-raising. Code that just blindly raises new exceptions doesn’t need any changes, and will get exception chaining for free on Python 3. Code that does more elaborate things, like implementing its own form of chaining or storing exc_info tuples to be re-raised later, may need a little more care.
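As one possible middle ground, here is a sketch that parses on both versions by keeping everything to ordinary method calls. It assumes you’re willing to lose the original traceback on Python 2; some_fragile_function is a stand-in that simply fails.

```python
import sys


class MyLibraryError(Exception):
    pass


def some_fragile_function():
    return 1 / 0  # stand-in for something that can fail


def call_wrapped():
    try:
        return some_fragile_function()
    except Exception as e:
        new_error = MyLibraryError("Failed to do a thing: " + str(e))
        if hasattr(new_error, 'with_traceback'):
            # Python 3: a plain method call, so Python 2 can still parse this file
            raise new_error.with_traceback(sys.exc_info()[2])
        # Python 2: no three-argument raise here, so the old traceback is lost
        raise new_error


try:
    call_wrapped()
except MyLibraryError as caught:
    wrapped = caught  # keep a reference; `caught` goes away on Python 3
    message = str(caught)
```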

Bytestrings are sequences of integers

In Python 2, bytes is a synonym for str, the default string type. Iterating or indexing a bytes/str produces 1-character strs.

list(b'hello')  # ['h', 'e', 'l', 'l', 'o']
b'hello'[0:4]  # 'hell'
b'hello'[0]  # 'h'
b'hello'[0][0][0][0][0]  # 'h' -- it's turtles all the way down

In Python 3, bytes is a specialized type for handling binary data, not text. As such, iterating or indexing a bytes produces integers.

list(b'hello')  # [104, 101, 108, 108, 111]
b'hello'[0:4]  # b'hell'
b'hello'[0]  # 104
b'hello'[0][0][0][0]  # TypeError, since you can't index 104

If you have explicitly binary data that you want to be bytes in Python 3, this may pose a bit of a problem. Aside from just checking the version explicitly and making heavy use of chr/ord, there are two approaches.

One is to use bytearray instead. This is like bytes, but mutable. More importantly, since it arrived in Python 2.6 as a backport of a type designed for Python 3.0, it has the same iterating and indexing behavior as Python 3’s bytes, even in Python 2.

bytearray(b'hello')[0]  # 104, on either Python 2 or 3

The other is to slice rather than index, since slicing always produces a new iterable of the same type. If you want to extract a single character from a bytes, just take a one-element slice.

b'hello'[0]  # 104 on Python 3, but 'h' on Python 2
b'hello'[0:1]  # b'h' on either version
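Both approaches wrapped up as tiny helpers, purely as a sketch; byte_values and first_byte are hypothetical names.

```python
def byte_values(data):
    """Hypothetical helper: integers for each byte, on Python 2 and 3."""
    return list(bytearray(data))


def first_byte(data):
    """Hypothetical helper: a length-1 bytestring rather than an int or str."""
    return data[0:1]


values = byte_values(b'hello')
head = first_byte(b'hello')
```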

Things that are just a royal pain in the ass

Unicode

Saving the best for last, almost!

Honestly, if your Python 2 code is already careful with Unicode — working with unicode internally, and encoding/decoding only at the “boundaries” of your code — then you shouldn’t have too many problems. If your code is not so careful, you should really try to make it a little more careful before you worry about Python 3, since Python 3’s whole jam is to force you to be careful.

See, in Python 2, you can combine bytestrings (str) and text strings (unicode) more or less freely. Python will automatically try to convert between the two using the “default encoding”, which is generally ascii. Python 3 makes text strings the default string type, demotes bytestrings, and forbids ever implicitly converting between them.

Most obviously, Python 2’s str and unicode have been renamed to bytes and str in Python 3. If you happen to be using the names anywhere, you’ll probably need to change them! six offers text_type and binary_type, though you can just use bytes to mean the same thing in either version. python-future also has backports for both Python 3’s bytes and str types, which seems like an extreme approach to me. Changing str to mean a text type even in Python 2 might be a good idea, though.

b'' and u'' work the same way in either Python 2 or 3, but unadorned strings like '' are always the str type, which has different behavior. There is a from __future__ import unicode_literals, which will cause unadorned strings to be unicode in Python 2, and this might work for you. However, this prevents you from writing literal “native” strings — strings of the same type Python uses for names, keyword arguments, etc. Usually this won’t matter, since Python 2 will silently convert between bytes and text, but it’s caused me the occasional problem.

The right thing to do is just explicitly mark every single string with either a b or u sigil as necessary. That just, you know, sucks. But you should be doing it even if you’re not porting to Python 3.

basestring is completely gone in Python 3. str and bytes have no common base type, and their semantics are different enough that it rarely makes sense to treat them the same way. If you’re using basestring in Python 2, it’s probably to allow code to work on either form of “text”, and you’ll only want to use str in Python 3 (where bytes are completely unsuitable for text). six.string_types provides exactly this. futurize --stage2 also runs the lib2to3.fixes.fix_basestring fixer, but this replaces basestring with str, which will almost certainly break your code in Python 2. If you intend to use stage 2, definitely audit your uses of basestring first.
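Without six, the equivalent check is a short try block. A sketch, with is_text as a hypothetical helper name:

```python
try:
    string_types = (basestring,)  # Python 2: covers both str and unicode
except NameError:
    string_types = (str,)  # Python 3: text only


def is_text(value):
    """Hypothetical helper: True for anything this version treats as text."""
    return isinstance(value, string_types)
```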

As mentioned above, bytestrings are sequences of integers, which may affect code trying to work with explicitly binary data.

Python 2 has both .decode() and .encode() on both bytes and text; if you try to encode bytes or decode text, Python will try to implicitly convert to the right type first. In Python 3, only text has an .encode() and only bytes have a .decode().

Relatedly, Python 2 allows you to do some cute tricks with “encodings” that aren’t really encodings; for example, "hi".encode('hex') produces '6869'. In Python 3, encoding must produce bytes, and decoding must produce text, so these sorts of text-to-text or bytes-to-bytes translations aren’t allowed. You can still do them explicitly with the codecs module, e.g. codecs.encode(b'hi', 'hex'), which also works in Python 2, despite being undocumented. (Note that Python 3 specifically requires bytes for the hex codec, alas. If it’s any consolation, there’s a bytes.hex() method to do this directly, which you can’t use anyway if you’re targeting Python 2.)
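A quick sketch of the bytes-to-bytes route. The binascii call is an extra option not mentioned above; it has been in the standard library on both versions for a long time and does the same job for hex specifically.

```python
import binascii
import codecs

hexed = codecs.encode(b'hi', 'hex')     # bytes in, bytes out
original = codecs.decode(hexed, 'hex')  # and back again
same = binascii.hexlify(b'hi')          # equivalent, without the codec machinery
```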

Python 3’s open decodes as UTF-8 by default (a vast oversimplification, but usually), so if you’re manually decoding after reading, you’ll get an error in Python 3. You could explicitly open the file in binary mode (preserving the Python 2 behavior), or you could use codecs.open to decode transparently on read (preserving the Python 3 behavior). The same goes for writing.

sys.stdin, sys.stdout, and sys.stderr are all text streams in Python 3, so they have the same caveats as above, with the additional wrinkle that you didn’t actually open them yourself. Their .buffer attribute gives a handle opened in binary mode (Python 2 behavior), or you can adapt them to transcode transparently (Python 3 behavior):

import codecs
import sys
import six

if six.PY2:
    sys.stdin = codecs.getreader('utf-8')(sys.stdin)
    sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
    sys.stderr = codecs.getwriter('utf-8')(sys.stderr)

A text-mode file’s .tell() in Python 3 still returns a number that can be passed back to .seek(), but the number is not necessarily meaningful, and in particular can’t be used to estimate progress through a file. (Python uses a few very high bits as flags to indicate the state of the decoder; if you mask them off, what’s left is probably the byte position in the file as you’d expect, but this is pretty definitively a hack.)

Python 3 likes to treat filenames as text, but most of the functions in os and os.path will accept either text or bytes as their arguments (and return a value of the same type), so you should be okay there.

os.environ has text keys and values in Python 3. If you direly need bytes, you can use os.environb (and os.getenvb()).

I think that covers most of the obvious basics. This is a whole sprawling topic that I can’t hope to cover off the top of my head. I’ve seen it be both fairly painful and completely pain-free, depending entirely on the state of the Python 2 codebase.

Oh, one final note: there’s a module for Python 2 called unicode-nazi (sorry, I didn’t name it) that will produce a warning anytime a bytestring is implicitly converted to a text string, or vice versa. It might help you root out places you’re accidentally slopping types back and forth, which will certainly break in Python 3. I’ve only tried it on a comically large project where it found thousands of violations, including plenty in surprising places in the standard library, so it may or may not be of any practical help.

Things that are not actually gone

String formatting with %

There’s a widespread belief that str % ... is deprecated, since there’s a newer and shinier str.format() method.

Well, it’s not. It’s not gone; it’s not deprecated; it still works just fine. I don’t like to use it, myself, since it’s easy to make accidentally ambiguous — "%s" % foo can crash if foo is a tuple! — but it’s not going anywhere. In fact, as of Python 3.5, bytes and bytearray support % but not .format.
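To make the tuple problem concrete, here's a small sketch (foo is just a stand-in value):

```python
foo = (1, 2)

# A bare tuple on the right is treated as the whole argument list, so a
# two-item tuple with a single %s placeholder fails.
try:
    "%s" % foo
    crashed = False
except TypeError:
    crashed = True

# Wrapping the value in a one-item tuple makes the intent unambiguous.
safe = "%s" % (foo,)

assert crashed
assert safe == "(1, 2)"
```

The one-item tuple is ugly, but it's the only spelling that works no matter what foo turns out to be.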

optparse

argparse is certainly better, but the optparse module still exists in Python 3. It has been deprecated since Python 3.2, though.

Things that are preposterously obscure but that I have seen cause problems nonetheless

Tuple unpacking

A little-used feature of Python 2 is tuple unpacking in function arguments:

def foo(a, (b, c)):
    print a, b, c

x = (2, 3)
foo(1, x)

This syntax is gone in Python 3. I’ve rarely seen anyone use it, except in two cases. One was a parsing library that relied pretty critically on using it in every parsing function you wrote; whoops.

The other is when sorting a dict’s items:

sorted(d.items(), key=lambda (k, v): k + v)

In Python 3, you have to write that as lambda kv: kv[0] + kv[1]. Boo.
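If the indexing bothers you, a named helper can do the unpacking instead. A quick sketch of both spellings, with a made-up dict and a helper name of my own:

```python
d = {1: 10, 2: 5, 3: 1}

# Index into the (key, value) pair, as described above.
by_index = sorted(d.items(), key=lambda kv: kv[0] + kv[1])

# Or unpack inside a small named function, which reads closer to the
# Python 2 original.
def key_plus_value(item):
    k, v = item
    return k + v

by_function = sorted(d.items(), key=key_plus_value)

assert by_index == by_function == [(3, 1), (2, 5), (1, 10)]
```

Both forms run unchanged on Python 2 and Python 3.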

long is gone

Python 3 merged its long type with int, so now there’s only one integral type, called int.

Python 2 promotes int to long pretty much transparently, and longs aren’t very common in the first place, so it’s fairly unlikely that this will make a difference. On the off chance you’re type-checking for integers with isinstance(x, (int, long)) (and really, why are you doing that), you can just use six.integer_types instead.
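If you'd rather not pull in six for this one check, the standard library's numbers module is another option. This is a sketch of an alternative, not what six does internally, and is_integer is a name I made up:

```python
import numbers

def is_integer(x):
    # numbers.Integral covers int on Python 3, and both int and long on
    # Python 2. Note that bool counts too, since it subclasses int.
    return isinstance(x, numbers.Integral)

assert is_integer(123)
assert is_integer(2 ** 100)
assert not is_integer(1.5)
```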

Note that futurize --stage2 applies the lib2to3.fixes.fix_long fixer, which blindly renames long to int, leaving you with inappropriate code like isinstance(x, (int, int)).

However…

I have seen some very obscure cases where a hand-rolled binary protocol would encode ints and longs differently. My advice would be to not do that.

Oh, and a little-known feature of Python 2’s syntax is that you can have long literals by suffixing them with an L:

123  # int
123L  # long

You can write 1267650600228229401496703205376 directly in Python 2 code, and it’ll automatically create a long, so the only reason to do this is if you explicitly need a long with a small value like 1. If that’s the case, something has gone catastrophically wrong.

repr changes

These should really only affect you if you’re using reprs as expected test output (or, god forbid, as cache keys or something). Some notable changes:

  • Unicode strings have a u prefix in Python 2. In Python 3, of course, Unicode strings are just strings, so there’s no prefix.

  • Conversely, bytestrings have a b prefix in Python 3, but not in Python 2 (though the b prefix is allowed in source code).

  • Python 2 escapes all non-ASCII characters, even in the repr of a Unicode string. Python 3 only escapes control characters and codepoints considered non-printing.

  • Large integers and explicit longs have an L suffix in Python 2, but not in Python 3, where there is no separate long type.

  • A set becomes set([1, 2, 3]) in Python 2, but {1, 2, 3} in Python 3. The set literal syntax is allowed in source code in Python 2.7, but the repr wasn’t changed until 3.0.

  • The repr of a float is now the shortest representation that maps back to the same underlying value; e.g., repr(1.1) is '1.1' rather than '1.1000000000000001'. This change was backported to Python 2.7 as well, but I have seen it break tests.
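For reference, here's roughly what those reprs look like on Python 3. These assertions are a sketch to run under Python 3 only; several of them would fail on Python 2, which is rather the point:

```python
assert repr(u'hi') == "'hi'"          # no u prefix
assert repr(b'hi') == "b'hi'"         # b prefix
assert repr(u'\u00e9') == "'\u00e9'"  # printable non-ASCII is left alone
assert repr(u'\x00') == "'\\x00'"     # control characters are still escaped
assert repr(2 ** 64) == '18446744073709551616'  # no L suffix
assert repr({1}) == '{1}'             # set literal syntax
assert repr(1.1) == '1.1'             # shortest float representation

all_checked = True
```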

Hash randomization

Python has traditionally had a predictable hashing mechanism: repr(dict(a=1, b=2, c=3)) will always produce the same string. (On the same platform with the same Python version, at least.) Unfortunately this opens the door to an obscure DoS exploit that was known to Perl long ago: if you know a web application is written in Python, you can construct a query string that will become a dict whose keys all go in the same hash bucket. If your query string is long enough and you send enough requests, you can tie up all the Python processes in dealing with hash collisions.

The fix is hash randomization, which seeds the hashing algorithm in such a way that items are bucketed differently every time Python runs. It’s available in Python 2.7 via the PYTHONHASHSEED environment variable or the -R argument, but it wasn’t turned on by default until Python 3.3.

The fear was that it might break things. Naturally, it has broken things. Mostly, reprs in tests. But it also changes the iteration order of dicts between Python runs. I have seen code using dicts whose keys happened to always be sorted in alphabetical or insertion order before, but with hash randomization, the keys were of course in a different order every time the code ran. The author assumed that Python had somehow broken dict sorting (which it has never had).
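If you do depend on order somewhere, the usual fix is to ask for it explicitly. A minimal sketch, using json only as one example of a stable serialization:

```python
import json

d = dict(c=3, a=1, b=2)

# Sort explicitly wherever order matters, instead of leaning on however
# the dict happens to iterate.
ordered_items = sorted(d.items())
assert ordered_items == [('a', 1), ('b', 2), ('c', 3)]

# For expected test output or cache keys, a serialization with sorted
# keys is stable across runs, which a dict's repr was not under
# hash randomization.
stable = json.dumps(d, sort_keys=True)
assert stable == '{"a": 1, "b": 2, "c": 3}'
```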

nonlocal

Python 3 introduces the nonlocal keyword, which is like global except it looks through all outer scopes in the expected order. It fixes this mild annoyance:

def make_function():
    counter = 0
    def function():
        nonlocal counter
        counter += 1  # without 'nonlocal', this declares a new local!
        print("I've been called", counter, "times!")
    return function

The problem is that any use of assignment within a function automatically creates a new local, and locals are known statically for the entire body of the function. (They actually affect how functions are compiled, in CPython.) So without nonlocal, the above code would see counter += 1, but counter is a new local that has never been assigned a value, so Python cannot possibly add 1 to it, and you get an UnboundLocalError.

nonlocal tells Python that when it sees an assignment of a name that exists in some outer scope, it should reuse that outer variable rather than shadowing it. Great, right? Purely a new feature. No problem.

Unfortunately, I’ve worked on a codebase that needed this feature in Python 2, and decided to fake it with a class… named nonlocal.

def make_function():
    class nonlocal:
        counter = 0
    def function():
        nonlocal.counter += 1  # this alters an outer value in-place, so it's fine
        print("I've been called", nonlocal.counter, "times!")
    return function

The class here is used purely as a dummy container. Assigning to an attribute doesn’t create any locals, because it’s equivalent to a method call, so the operand must already exist. This is a slightly quirky approach, but it works fine.

Except that, of course, nonlocal is a keyword in Python 3, so this becomes complete gibberish. It’s such gibberish that (if I remember correctly) 2to3 actually cannot parse it, even though it’s perfectly valid Python 2 code.

I don’t have a magical fix for this one. Just, uh, don’t name things nonlocal.
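If you need the same trick to run on both Pythons, one sketch is to keep the container and give it a boring name. Here I've swapped the dummy class for a one-item list; that's my substitution, not what that codebase did, and I return the count instead of printing it so it's easier to check:

```python
def make_function():
    # A one-item list is mutated in place, so no outer name is ever
    # reassigned and no keyword is needed.
    counter = [0]

    def function():
        counter[0] += 1
        return counter[0]

    return function

f = make_function()
assert f() == 1
assert f() == 2
```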

List comprehensions no longer leak

Python 2 has the slightly inconsistent behavior that loop variables in a generator expression ((...)) are scoped to the generator expression, but loop variables in a list comprehension ([...]) belong to the enclosing scope.

The only reason is in implementation details: a list comprehension acts like a for loop, which has the same behavior, whereas a generator expression actually creates a generator internally.

Python 3 brings these cases into line: loop variables in list comprehensions (or dict or set comprehensions) are also scoped to the comprehension itself.
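A tiny sketch of the difference, meant to be run under Python 3 (the variable names are arbitrary):

```python
x = 'outer'

squares = [x * x for x in range(3)]

# On Python 3 the comprehension's x is private, so the outer x is untouched.
# On Python 2 this would have left x == 2.
assert squares == [0, 1, 4]
assert x == 'outer'
```

So the code to look for is anything that reads a loop variable after a list comprehension has finished.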

I cannot imagine any possible reason why this would affect you negatively, and yet, I can swear I’ve seen it happen. I wish I could remember where, because I’m sure it’s an exciting story.

cStringIO.h is gone

cStringIO.h is a private and undocumented C interface to Python 2’s cStringIO.StringIO type. It was removed in Python 3, or at least is somewhere I can’t find it.

This was one of the reasons Thrift’s Python 3 port took almost 3 years: Thrift has a “fast” C module that makes use of this private interface, and it’s not obvious how to replace it. I think they ended up just having the module not exist on Python 3, so Python 3 will just be mysteriously slower.

Some troublesome libraries

MySQLdb is some ancient, clunky, noncompliant, underdocumented trash, much like the database it connects to. It’s nigh abandoned, though it still promises Python 3 support in the MySQLdb 2.0 vaporware. I would suggest not using MySQL, but barring that, try mysqlclient, a fork of MySQLdb that continues development and adds Python 3 support. (The same people also maintain an earlier project, pymysql, which strives to be a pure-Python drop-in replacement for MySQLdb — it’s not quite perfect, but its existence is interesting and it’s sure easier to read than MySQLdb.)

At a glance, Thrift still hasn’t had a release since it merged Python 3 support, eight months ago. It’s some enterprise nightmare, anyway, and bizarrely does code generation for a bunch of dynamic languages. Might I suggest just using the pure-Python thriftpy, which parses Thrift definitions on the fly?

Twisted is, ah, large and complex. Parts of it now support Python 3; parts of it do not. If you need the parts that don’t, well, maybe you could give them a hand?

M2Crypto is working on it, though I’m pretty sure most Python crypto nerds would advise you to use cryptography instead.

And so on

You may find any number of other obscure compatibility problems, just as you might when upgrading from 2.6 to 2.7. The Python community has a lot of clever people willing to help you out, though, and they’ve probably even seen your super duper niche problem before.

Don’t let that, or this list of gotchas in general, dissuade you! Better to start now than later; even fixing an integer division gets you one step closer to having your code run on Python 3 as well.

Real-World Security and the Internet of Things

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2016/07/real-world_secu.html

Disaster stories involving the Internet of Things are all the rage. They feature cars (both driven and driverless), the power grid, dams, and tunnel ventilation systems. A particularly vivid and realistic one, near-future fiction published last month in New York Magazine, described a cyberattack on New York that involved hacking of cars, the water system, hospitals, elevators, and the power grid. In these stories, thousands of people die. Chaos ensues. While some of these scenarios overhype the mass destruction, the individual risks are all real. And traditional computer and network security isn’t prepared to deal with them.

Classic information security is a triad: confidentiality, integrity, and availability. You’ll see it called “CIA,” which admittedly is confusing in the context of national security. But basically, the three things I can do with your data are steal it (confidentiality), modify it (integrity), or prevent you from getting it (availability).

So far, Internet threats have largely been about confidentiality. These can be expensive; one survey estimated that data breaches cost an average of $3.8 million each. They can be embarrassing, as in the theft of celebrity photos from Apple’s iCloud in 2014 or the Ashley Madison breach in 2015. They can be damaging, as when the government of North Korea stole tens of thousands of internal documents from Sony or when hackers stole data about 83 million customer accounts from JPMorgan Chase, both in 2014. They can even affect national security, as in the case of the Office of Personnel Management data breach by — presumptively — China in 2015.

On the Internet of Things, integrity and availability threats are much worse than confidentiality threats. It’s one thing if your smart door lock can be eavesdropped upon to know who is home. It’s another thing entirely if it can be hacked to allow a burglar to open the door — or prevent you from opening your door. A hacker who can deny you control of your car, or take over control, is much more dangerous than one who can eavesdrop on your conversations or track your car’s location.

With the advent of the Internet of Things and cyber-physical systems in general, we’ve given the Internet hands and feet: the ability to directly affect the physical world. What used to be attacks against data and information have become attacks against flesh, steel, and concrete.

Today’s threats include hackers crashing airplanes by hacking into computer networks, and remotely disabling cars, either when they’re turned off and parked or while they’re speeding down the highway. We’re worried about manipulated counts from electronic voting machines, frozen water pipes through hacked thermostats, and remote murder through hacked medical devices. The possibilities are pretty literally endless. The Internet of Things will allow for attacks we can’t even imagine.

The increased risks come from three things: software control of systems, interconnections between systems, and automatic or autonomous systems. Let’s look at them in turn:

Software Control. The Internet of Things is a result of everything turning into a computer. This gives us enormous power and flexibility, but it brings insecurities with it as well. As more things come under software control, they become vulnerable to all the attacks we’ve seen against computers. But because many of these things are both inexpensive and long-lasting, many of the patch and update systems that work with computers and smartphones won’t work. Right now, the only way to patch most home routers is to throw them away and buy new ones. And the security that comes from replacing your computer and phone every few years won’t work with your refrigerator and thermostat: on the average, you replace the former every 15 years, and the latter approximately never. A recent Princeton survey found 500,000 insecure devices on the Internet. That number is about to explode.

Interconnections. As these systems become interconnected, vulnerabilities in one lead to attacks against others. Already we’ve seen Gmail accounts compromised through vulnerabilities in Samsung smart refrigerators, hospital IT networks compromised through vulnerabilities in medical devices, and Target Corporation hacked through a vulnerability in its HVAC system. Systems are filled with externalities that affect other systems in unforeseen and potentially harmful ways. What might seem benign to the designers of a particular system becomes harmful when it’s combined with some other system. Vulnerabilities on one system cascade into other systems, and the result is a vulnerability that no one saw coming and no one bears responsibility for fixing. The Internet of Things will make exploitable vulnerabilities much more common. It’s simple mathematics. If 100 systems are all interacting with each other, that’s about 5,000 interactions and 5,000 potential vulnerabilities resulting from those interactions. If 300 systems are all interacting with each other, that’s about 45,000 interactions. 1,000 systems: about 500,000 interactions. Most of them will be benign or uninteresting, but some of them will be very damaging.
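The arithmetic behind those figures can be sketched in a few lines, assuming each interaction is an unordered pair of systems counted once (the function name is illustrative, not from any library):

```python
def pairwise_interactions(n):
    # Number of distinct pairs among n systems: n * (n - 1) / 2.
    return n * (n - 1) // 2

assert pairwise_interactions(100) == 4950    # roughly 5,000
assert pairwise_interactions(300) == 44850   # roughly 45,000
```

The count grows with the square of the number of systems, which is why adding devices makes the problem worse so quickly.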

Autonomy. Increasingly, our computer systems are autonomous. They buy and sell stocks, turn the furnace on and off, regulate electricity flow through the grid, and, in the case of driverless cars, automatically pilot multi-ton vehicles to their destinations. Autonomy is great for all sorts of reasons, but from a security perspective it means that attacks can take effect immediately, automatically, and ubiquitously. The more we remove humans from the loop, the faster attacks can do their damage and the more we lose our ability to rely on actual smarts to notice something is wrong before it’s too late.

We’re building systems that are increasingly powerful, and increasingly useful. The necessary side effect is that they are increasingly dangerous. A single vulnerability forced Chrysler to recall 1.4 million vehicles in 2015. We’re used to computers being attacked at scale — think of the large-scale virus infections from the last decade — but we’re not prepared for this happening to everything else in our world.

Governments are taking notice. Last year, both Director of National Intelligence James Clapper and NSA Director Mike Rogers testified before Congress, warning of these threats. They both believe we’re vulnerable.

This is how it was phrased in the DNI’s 2015 Worldwide Threat Assessment: “Most of the public discussion regarding cyber threats has focused on the confidentiality and availability of information; cyber espionage undermines confidentiality, whereas denial-of-service operations and data-deletion attacks undermine availability. In the future, however, we might also see more cyber operations that will change or manipulate electronic information in order to compromise its integrity (i.e. accuracy and reliability) instead of deleting it or disrupting access to it. Decision-making by senior government officials (civilian and military), corporate executives, investors, or others will be impaired if they cannot trust the information they are receiving.”

The DNI 2016 threat assessment included something similar: “Future cyber operations will almost certainly include an increased emphasis on changing or manipulating data to compromise its integrity (i.e., accuracy and reliability) to affect decision making, reduce trust in systems, or cause adverse physical effects. Broader adoption of IoT devices and AI — in settings such as public utilities and healthcare — will only exacerbate these potential effects.”

Security engineers are working on technologies that can mitigate much of this risk, but many solutions won’t be deployed without government involvement. This is not something that the market can solve. Like data privacy, the risks and solutions are too technical for most people and organizations to understand; companies are motivated to hide the insecurity of their own systems from their customers, their users, and the public; the interconnections can make it impossible to connect data breaches with resultant harms; and the interests of the companies often don’t match the interests of the people.

Governments need to play a larger role: setting standards, policing compliance, and implementing solutions across companies and networks. And while the White House Cybersecurity National Action Plan says some of the right things, it doesn’t go nearly far enough, because so many of us are phobic of any government-led solution to anything.

The next president will probably be forced to deal with a large-scale Internet disaster that kills multiple people. I hope he or she responds with both the recognition of what government can do that industry can’t, and the political will to make it happen.

This essay previously appeared on Vice Motherboard.

BoingBoing post.

Elegance

Post Syndicated from Eevee original https://eev.ee/blog/2016/04/21/elegance/

Programmers sometimes like to compliment code as elegant, yet I can’t recall ever seeing a satisfying explanation of what “elegant code” is. Perhaps it’s telling that I see “elegant” used much less often by more experienced programmers, who opt for more concrete commentary.

Surely elegance is a quality to strive for, but how are we to strive for something we can’t define? “I know it when I see it” isn’t good enough.

I think about this from time to time. Here’s what I’ve come up with.

Some definitions

I get a gut feeling when something is elegant, and a different gut feeling altogether when something is hacky; I suspect most programmers experience the same. The strongest pattern I’ve found is this:

Elegance is about expressing exactly what you mean — no more, no less.

Conversely, I could define a hack as something that doesn’t remotely express what you mean, but happens to have a close-enough effect.

That’s not to say all code lies on a linear spectrum between two extremes. There’s some complexity here, because “what you mean” is less concrete than the shape of your code or what happens when it executes.

Consider my recent example of links recreated in JavaScript. You might implement such a faux link with some jQuery.

$('#link').click(function() {
    window.location = 'http://www.google.com/';
});

Isn’t that elegant? It’s short, it’s sweet, and it does exactly as it says: when the link element is clicked, navigate to Google.

No, of course not. jQuery is elegant, perhaps, for some set of simple operations. This code is a hack, but in a way that only a human could reckon. What the author actually meant was a link — not “an element that navigates when clicked upon”, but a link, the fundamental concept that makes the Web what it is. The concrete impact of this is that a bunch of stuff humans expect from links, like the ability to middle click, is conspicuously missing.

Okay, what if you could reproduce all of that extra functionality? What if you painstakingly researched how links behave on every operating system and in every browser, and recreated that with JavaScript? No one would look at the resulting pile of special cases and call it elegant. And yet it wouldn’t really be a hack, either. The code would express “simulate everything a link does”, and while that’s not the same as having a link, it’s at least fairly close. It’d fall into a third unnamed category where a problem is solved fairly rigorously, but the outcome isn’t pretty.

The trick here is, again, all about meaning. We like to pretend that programming is a purely abstract thing, and perhaps some of the ideas are, but the languages and tools are all ultimately designed for humans. They’re designed to make sense to humans (as much as possible within given constraints, anyway), and they’re designed to solve problems humans have.

Elegance is what happens when we find a way to express what we mean with the units of meaning that our tools provide.

Sometimes that’s not possible, so the rest of this post — spoilers! — will be some concrete examples that have crossed my path recently. Maybe they’ll give you a better idea of when and why I frown at computers.

ZDoom and PickActor

Every live object in ZDoom — monsters, weapons, pickups, etc. — is called an “actor” (or sometimes a “thing”). ZDoom’s scripting language has a PickActor function for, essentially, finding what actor another actor is looking at. You might use this to find what monster is under the player’s crosshair and show a health bar over it, say.

There’s a catch. ZDoom’s scripting language cannot manipulate actors directly; there is no actor type. Instead, an actor can have a “TID” (“thing ID”). Most actor-related functions thus accept a TID and affect every actor with that TID simultaneously. Appropriately, PickActor doesn’t (can’t!) return the actor it finds, but assigns that actor a TID so you can manipulate it indirectly.

By default, PickActor will refuse to overwrite an existing TID, and returns 1 to indicate it found a target or 0 to indicate it didn’t. It has two flags that can change its behavior: one to force overwriting an existing TID, and one to change the return value to the found actor’s TID.

This is everything you need to know to understand the problem, which is: how do you use PickActor to pick an actor?

The actor might already have a TID, intended for some other purpose. If you use no flags, the function will return 1 to indicate success, but the target won’t have your chosen TID, so any attempts to manipulate the target will silently do nothing. If you use the flag that forces changing a TID, you’ll almost certainly break some other effect that needed to be able to identify that actor. If you use the flag that returns an existing TID, you might end up manipulating multiple actors, because actors can share a TID.

It seems that there’s no way at all to use this function correctly!

In truth, there is, but it relies on a little outside knowledge: scripts block the entire simulation. Unless you explicitly sleep (or run for far too long and get forcibly killed), the game state cannot change out from under you.

With that in mind, the solution is:

    // Get the existing TID, if there is one.
    // A TID of 0 is the same as not having a TID, so "setting" it to 0 will not
    // change anything's TID.
    int old_tid = PickActor(..., tid=0, flags=PICKAF_RETURNTID);

    // Find a TID that's not currently in use.
    int new_tid = UniqueTID();

    // Do the "real" call, which both forcibly sets the TID and returns true if
    // something was actually found.
    if (PickActor(..., tid=new_tid, flags=PICKAF_FORCETID)) {
        do_some_stuff_with(new_tid);
        do_some_other_stuff_with(new_tid);

        // Restore the actor's TID to its original value.
        Thing_ChangeTID(new_tid, old_tid);
    }

This relies on calling PickActor twice: once to get the target’s old TID, and once to change it. As long as both calls have the same arguments, the result must be the same, because the game state is frozen for the duration of this code. If you need to operate on the target for more than one frame, well… you have some more work to do.

The workaround is certainly not elegant. “Look for an actor in this direction, twice” is not what I wanted to express. And yet it’s not a hack, either. The code above demonstrably does the correct thing in all cases, and is suitable as a permanent solution. It occupies that nebulous third category of “complete, but not pretty”.

PickActor, on the other hand, is a shambles. I don’t know if you can really call an API a “hack”, but if you can, I would definitely like to do so right here. The function alone does the wrong thing, the “force” flag does the wrong thing, and the correct solution of calling it twice is not remotely obvious.


I care about this because modding an ancient game is largely a hobbyist affair, and there are plenty of people in the Doom community for whom tinkering with ZDoom is their first exposure to programming. They aren’t going to realize the caveats of this function, let alone know how to fix them.

I don’t want to rag on anyone in particular who’s just trying to make a fun game mod, but there is a whole lot of bad ZDoom code floating around. People try stuff until it looks like it works, and then they leave it be. I don’t blame non-professionals for that. I blame tools that don’t offer the right building blocks for modders to express what they mean.

Tool design is important! It’s why I pick on programming languages. If the fundamental pieces at your disposal are awkwardly-shaped, you’ll have a much harder time expressing what you actually intended.

Inform 7

More spoilers: these are all going to be from video games.

Video games are surprisingly difficult to express. Programming languages are languages, so they usually inherit some of the properties of human languages: code is read (executed) in sequence, and large amounts of code can be broken into a hierarchy for easier understanding. For a great many applications, that works great.

For simulations (a superset of video games), that all goes to hell. You have a great many things all acting independently and simultaneously, each with slightly different concerns and behavior. Expressing that elegantly in a linear, hierarchical language is a lost cause right from the start. Operating systems could be described in much the same way, and I’ve never heard anyone call OS code particularly elegant either. We’ve barely even figured out how to expose threads in a language, without the associated mountain of gotchas and pitfalls.

Inform 7 is interesting because it’s explicitly designed for games, and even a particular kind of game. (Contrast with the more popular style of “something like C, but made to fit my engine”.) I should do a full post on it sometime, but here’s the tl;dr.

Inform 7 is intended for writing interactive fiction games (like Zork). The syntax resembles written English, roughly mirroring how the resulting game is played. For example:

The mysterious cavern is a room.
"Numerous stalactites cast eerie shadows on the walls."

A golden crown is a wearable thing in the mysterious cavern.
"A golden glint pierces the darkness."
The description is "It's a golden crown."
Understand "glint" as the golden crown.

After wearing the golden crown:
    end the story finally saying "All hail the new queen!"

This is a full and complete game, which you can win by wearing the crown. There’s a lot of stuff crammed in here, and hopefully you get the general idea: objects are defined with English statements of fact, not by fenced-off blocks.

Inform 7 is particularly interesting because it attempts to address an ancient question of game design: how do you define the interaction between two mechanics?

This game’s win condition is wearing the crown; i.e., wearing the crown triggers the effect of winning the game. That code has to go somewhere, but does it go in the implementation of wearing or the implementation of the crown? Probably in the crown, since it’s a one-off effect.

Consider, then, if I had a whole set of armor items with different defense ratings. Is that the concern of the action or the individual objects? It depends. If the vast majority of wearable objects are armor, then surely the act of wearing is concerned with handling armor. If there are only a few wearables that act as armor, they’re outliers, and should implement their own handling. Between these two extremes lies an ambiguous middle ground where it may not be clear who should really be responsible.

Every interaction in a simulated world has at least three possible “parents”: the action, who’s doing the action, and what the action is being done to. Expressing this in a language designed around hierarchy can feel clumsy when the “parent” is ambiguous, because a hierarchy necessarily requires that code only have one parent. (I’m not talking just about inheritance hierarchies, either; your actual source code can only exist in one place, nested inside one class, which is nested inside a namespace, which is nested inside a file, and so on.)

Inform 7 mostly throws the problem out the window. We can’t escape hierarchy entirely, because it’s how we as humans tend to make sense of the world. But instead of forcing the design of your game into a series of nested blocks, Inform 7 more or less lets you define anything anywhere. You can write “After” rules that apply to particular actions with particular objects, and you can put those rules anywhere in your source code: alongside the definition of wearing (well, not in this case, because that’s in the standard library), alongside the definition of the crown, or even somewhere else entirely that deals with all the ways to end the game. It’s completely up to you, and the compiler largely doesn’t care. You can even put the rule before you announce that the golden crown is an object at all, and the compiler will do a pretty good job of figuring it out.

You can still have a hierarchy, but it’s imposed entirely by you. Inform 7 lets you split your source text into a tree of volumes, books, parts, chapters, and sections, named and arranged however you like. You can see this in the source text for the example game, Bronze by Emily Short.


I tell you all this so I can compare it to the other extreme: NetHack. NetHack, written in C, is what you might call “data-oriented”. Consider for example peffects(), a 500-line function that single-handedly implements every possible effect a potion can have. polymon(), the function responsible for polymorphing the player into a monster, contains all manner of little details like halting the process of stoning if you polymorph into a stoning-resistant monster.

I once tried writing a roguelike that used an entity-component design to avoid this kind of mess, which was no small feat, since it involves a whole lot of fighting back against the language design. Having beaten on that for a while, I can safely say I vastly prefer the Inform 7 model. It’s flexible enough for most purposes, and it has multiple layers of overriding, so you can make an object that prevents the golden crown from winning the game without having to touch the existing rule!

Given the constraints of how we read and write text (and code), and given that the intended usage is still fairly broad, I daresay Inform 7’s approach is fairly elegant. It allows me to clearly say what I mean in most cases, and that’s often enough.

Of course, I wouldn’t have brought it up without some more specific anecdotes.

Locksmith by Emily Short

That’s the name of an extension bundled with Inform 7. Extension names always include the name of the author, which is charming.

Locks and keys are a staple of interactive fiction, and Inform 7 has some built-in support for them. However, it requires you to explicitly name which key you’re using to unlock something. If this is part of a puzzle, it might be appropriate; in the common case of “iron key” unlocking “iron door”, it’s a little tedious.

Locksmith by Emily Short addresses this problem by splitting the “LOCK” and “UNLOCK” actions into two parts. If you say “UNLOCK DOOR WITH KEY”, that’s regular old unlocking. If you only say “UNLOCK DOOR”, that’s a new action called unlocking keylessly, and it tries to find a matching key in your inventory. “LOCK” works similarly.
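
As a rough model of that split (in Python rather than Inform 7, and with every class, field, and function name invented for illustration rather than taken from the extension), keyless unlocking amounts to searching the inventory for a key that fits and then falling through to the ordinary action:

```python
from dataclasses import dataclass, field


@dataclass
class Lock:
    name: str
    matching_key: str          # hypothetical field: name of the key that fits
    locked: bool = True


@dataclass
class Player:
    inventory: list = field(default_factory=list)  # names of carried items


def unlock_with(lock, key):
    """Regular unlocking: the player names the key explicitly."""
    if key != lock.matching_key:
        return f"The {key} doesn't fit the {lock.name}."
    lock.locked = False
    return f"You unlock the {lock.name} with the {key}."


def unlock_keylessly(player, lock):
    """Keyless unlocking: look for a fitting key, then unlock as usual."""
    for item in player.inventory:
        if item == lock.matching_key:
            return unlock_with(lock, item)
    return f"You have nothing that unlocks the {lock.name}."
```

The point of the sketch is only that the keyless action is a thin front end: once a key is found, everything else is the regular action.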

Now, I want to make a padlock. You don’t need a key to close a padlock, so I want to hijack “locking keylessly” to work without needing a key at all. Both “CLOSE PADLOCK” and “LOCK PADLOCK” should do the same thing, so I use an instead rule to redirect both actions. I also want to make the padlock support the standard “open” property, but I define it in terms of whether it’s currently locked, since that’s how a padlock works.

Definition: the padlock is open rather than closed when it is not locked.
Instead of closing the padlock:
    try locking keylessly the padlock.
Instead of locking keylessly the padlock:
    now the padlock is locked;
    say "You snap the padlock shut.  It makes a chunky, satisfying click."

I compile this and type “CLOSE PADLOCK” and… the game freezes, trapped forever in an infinite loop. Or maybe the stack would overflow eventually.

What’s the problem? It’s not quite in my code; it’s an interaction with the extension, which has some convenience features. If you try to keylessly lock something that’s open (usually a door!), a before rule kicks in and makes you automatically close it first. Before rules happen before instead rules, so we have the following sequence of events.

  1. The player tries to close the (open) padlock.
  2. My code redirects this into having the player lock the padlock instead.
  3. The extension sees that the player is trying to lock something that’s open, and has the player try to close it first.
  4. GOTO 1

This is the downside of having a lot of override mechanisms. Arguably, the extension’s convenience behavior should be in the actual implementation of “locking keylessly” rather than in before rules, but there are decent arguments to be made both ways.

The solution is remarkably simple: just swap the actions.

Definition: the padlock is open rather than closed when it is not locked.
Instead of locking keylessly the padlock:
    try closing the padlock.
Instead of closing the padlock:
    now the padlock is locked;
    say "You snap the padlock shut.  It makes a chunky, satisfying click."

And now I get exactly what I want:

> close padlock

You snap the padlock shut. It makes a chunky, satisfying click.
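
Both orderings can be modelled in a few lines of Python. This is a toy dispatcher, not how Inform 7 processes actions internally; the names, the dictionary of instead rules, and the depth cap of 20 are all assumptions made for the sketch. The one thing it preserves is the ordering that matters here: the extension's before rule runs ahead of any instead rule.

```python
class RuleLoop(Exception):
    """Raised when redirected actions nest too deeply to ever finish."""


class Padlock:
    locked = False

    @property
    def is_open(self):
        return not self.locked  # "open" just means "not locked"


def make_game(instead_rules):
    """Build a try_action function from a dict of instead rules."""
    lock, log = Padlock(), []

    def try_action(action, depth=0):
        if depth > 20:  # arbitrary cap; the real game simply hangs
            raise RuleLoop(action)
        log.append(action)

        def nested(next_action):
            try_action(next_action, depth + 1)

        # Before rule, modelled on the extension: close open things first.
        if action == "lock keylessly" and lock.is_open:
            nested("close")
        # Instead rules run only after the before rules have had their turn.
        instead_rules[action](lock, nested)

    return lock, log, try_action


def snap_shut(lock, nested):
    lock.locked = True


# First attempt: closing redirects to locking keylessly.
broken = {"close": lambda lock, nested: nested("lock keylessly"),
          "lock keylessly": snap_shut}
# Swapped version: locking keylessly redirects to closing.
fixed = {"lock keylessly": lambda lock, nested: nested("close"),
         "close": snap_shut}


def closing_loops(instead_rules):
    """Return True if CLOSE PADLOCK never settles under these rules."""
    lock, log, try_action = make_game(instead_rules)
    try:
        try_action("close")
    except RuleLoop:
        return True
    return False
```

With the first set of rules, closing bounces to locking keylessly, whose before rule bounces straight back to closing. With the swapped set, closing is the action that does the work, so nothing is left to redirect.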

Simple, but perhaps not obvious. Something here is inelegant, yet everyone involved was pretty clearly expressing what they meant. I wonder what the lesson is.

Doors

Ah, good, more Inform 7 stuff, and more door stuff as well!

Doors are pretty straightforward in Inform 7: they have two sides, and each side goes in a room. The player can go through the door to get from one side to the other. Easy.

I want to have a door that can lead to several different places. In other words, the front side is fixed in place, but the back side can move around.

Inform 7 won’t let you do this, full stop. That means it’s time for an adventure.


The obvious solution is to fake it.

  1. Make a bunch of identical fake-door objects (which aren’t actually doors) and put one in each place the door can appear.
  2. Hide them from everywhere the door isn’t supposed to be.
  3. Override all the actions that could conceivably take the player through the door to just teleport them instead.

This is pretty clearly a hack. What I mean is a door, and this is as far away from a door as it is possible to get. As is common with hacks, there are also thorny downsides to every step.

  1. Identical objects can be slightly awkward. In particular, if you try to use a custom command that works on objects you have seen but can’t currently see (like “remember”, perhaps), you may get a rather awful disambiguation prompt like: “Which do you mean? The green door, the green door, the green door, or the green door?”

    Also, since they aren’t doors, no door-specific behavior will work on them. Pathfinding won’t work, because the spaces aren’t actually connected. I want to implement looking through a door, and that won’t work. Built-in mechanisms like “Instead of going through the green door” won’t work.

    You might have slightly more luck if you used real doors that all lead nowhere, and very carefully hooked only the handling for where a door leads. But that has problems with part 2.

  2. Hiding objects is more complicated than you’d think! The most common suggestion I’ve seen is to do this:

    Instead of doing something with the hidden item:
        say "You can't see any such thing."
    

    That is, of course, the default response when you try to refer to an object that doesn’t exist. And this is a terrible hack! If you change that default response, this hack will get out of sync. If you use one of the available extensions to improve the error handling, this will stick out like a sore thumb. And just like with the padlock, a before rule will still trigger first.

    An easier option is to actually move the dummy objects around. Of course, if you decide to use real single-sided doors, you’re back to the original problem: you can’t move a door.

  3. Inform 7’s movement is remarkably complex. You can push objects from room to room and ride in vehicles. Merely teleporting the player would break these; you would need to reimplement them in some fashion.

None of these problems are insurmountable, but certainly this approach is difficult to implement correctly.


The solution I went for was to make it work by force.

You see, Inform 7 actually compiles to a very different language called Inform 6, which is more traditional and C-like (though with some pretty funky syntax in places). Parts of the standard library are implemented in Inform 6, and the detailed comments in those files reveal some interesting details.

One, Inform 6 models the entire world as… a hierarchy. Everything is one big tree. The room contains the player, and the player contains their inventory — the only thing distinguishing an inventory is that the parent is a person. Even worn objects are children like any other, except that they have an invisible “worn” flag.

This model presents a problem with doors, which exist in two places at once. The secret is… they actually don’t. The physical door object moves to follow the player! There’s some special consideration for doors in places to make them appear to be in both places from other characters’ points of view, so you’d never notice the difference.

Two, Inform 6 actually had an entire feature for objects that faked being in more than one room, called “floating”. These were apparently such a huge and buggy pain in the ass that Inform 7 drastically limited floating objects to only two specific categories: doors, and a certain kind of non-interactive decorations.

Three, every Inform 7 door ends up with a found_in property in Inform 6, which is an array of the two places the door should exist.
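
To make the tree model and the door trick concrete, here is a small Python sketch. It is a loose model under stated assumptions, not the Inform 6 library: the `Obj` class, `move_player`, and the way `found_in` is consulted are all invented for illustration.

```python
class Obj:
    """A node in the world tree: every object has at most one parent."""

    def __init__(self, name, parent=None, worn=False):
        self.name = name
        self.parent = None
        self.children = []
        self.worn = worn        # a worn thing is still an ordinary child
        self.found_in = []      # rooms a floating object claims to be in
        if parent is not None:
            self.move_to(parent)

    def move_to(self, new_parent):
        """Re-parent this object, detaching it from wherever it was."""
        if self.parent is not None:
            self.parent.children.remove(self)
        self.parent = new_parent
        if new_parent is not None:
            new_parent.children.append(self)


def inventory(person):
    """An inventory is nothing more than the children of a person."""
    return [child.name for child in person.children]


def move_player(player, room, floating):
    """Move the player, then drag along any floating object found there."""
    player.move_to(room)
    for obj in floating:
        # The door only ever exists in one place: wherever the player is,
        # provided that room is one of its two sides.
        obj.move_to(room if room in obj.found_in else None)
```

In this model the door is never in two rooms at once. It is re-parented every time the player moves, which is enough to look like a two-sided object from the player's point of view.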

Well, hey, that sounds easy. You can embed chunks of Inform 6 code in Inform 7, so I wrote some code that would edit that array directly.

Foiled! It didn’t work. The problem is that going through a door takes you to the “other side”. “Other side” is a real physical property — an Inform 6 function that picks a side based on where the player currently is. Alas, rather than consulting the array, this function has the two sides hardcoded. It looks like this:

with door_to [ 
    loc ! room of actor
    ;
    loc = location;
    if (loc == thedark) loc = real_location;
    if (loc == I483_red_room) return I384_green_room; return I483_red_room;],

door_to is just the Inform 6 name of the “other side” property. The square brackets delimit a function, and loc is a local variable declared in its header, which is also where Inform 6 declares arguments. Like I said, funky syntax.

I did try changing “the other side of the green door” from Inform 7 land, but it seems the compiler special-cases this one particular property name and intercepts any attempt to change it dynamically. Foiled again!

The solution I settled on was to rewrite the “other side” function for this particular door entirely. I ended up with:

Include (-
[ HackedOtherSide loc;
    loc = location;
    if (loc == thedark) loc = real_location;
    if (loc == (+ the green door +).&found_in-->0) return (+ the green door +).&found_in-->1; return (+ the green door +).&found_in-->0;
];
[ HackedDirection loc;
    loc = location;
    if (loc == thedark) loc = real_location;
    if (loc == (+ the green door +).&found_in-->0) return (+ southwest +); return (+ northeast +);
];
-).

To hackily override the door props:
    (-
    (+ the green door +).door_dir = HackedDirection;
    (+ the green door +).door_to = HackedOtherSide;
    -).

When play begins:
    hackily override the door props.

To hackily force the back side of (portal - a door) to (where - an object):
    (- ({portal}.&found_in)-->1 = {where}; -).

(- ... -) denotes embedded Inform 6 code; (+ ... +) denotes embedded Inform 7 object names; foo-->0 is an array subscript. door_dir is another property that tells you what direction the door is in, based on what room you’re in.
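
The same idea, stripped of Inform syntax, looks roughly like this in Python. The class and function names are made up for the sketch, and rooms are plain strings; the only thing carried over from the real code is the contrast between a function with both sides baked in and one that reads the array each time it is called.

```python
class Door:
    def __init__(self, front, back):
        self.found_in = [front, back]
        # Compiled-in behaviour: both sides are fixed when the door is built.
        self.door_to = self._make_hardcoded(front, back)

    @staticmethod
    def _make_hardcoded(front, back):
        def door_to(location):
            return back if location == front else front
        return door_to


def hacked_other_side(door):
    """Replacement that consults found_in at call time."""
    def door_to(location):
        front, back = door.found_in
        return back if location == front else front
    return door_to


def force_back_side(door, where):
    """Point the back of the door somewhere else by editing the array."""
    door.found_in[1] = where
```

Editing the array alone changes nothing, because the original function never looks at it. Only after the function is swapped out does moving the back side take effect.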

Right. So.

This is definitely… uglier, at a glance, than faking the door. On the other hand, it’s pretty much transparent to the rest of the language and standard library. The door is a real door that genuinely moves from place to place. Anything that operates on a door goes through the code I’ve overridden here, so it’s fairly bulletproof.

The one issue not addressed here is route-finding — the standard library can cache information about connections between rooms to make route-finding faster, and after moving the door, that cache is no longer valid. A quick read of the comments reveals a trivial fix: call SignalMapChange() after moving the door around.
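
The general shape of that problem is ordinary cache invalidation. The sketch below is a generic Python model, not the standard library's actual route-finding code: the `Map` class, its breadth-first search, and the `signal_map_change` method are stand-ins named after the real SignalMapChange() call only by analogy.

```python
from collections import deque


class Map:
    def __init__(self):
        self.exits = {}          # room -> set of directly connected rooms
        self._route_cache = {}   # (start, goal) -> number of moves, or None

    def connect(self, a, b):
        self.exits.setdefault(a, set()).add(b)
        self.exits.setdefault(b, set()).add(a)

    def disconnect(self, a, b):
        self.exits[a].discard(b)
        self.exits[b].discard(a)

    def signal_map_change(self):
        """Throw away cached routes after the map has been edited."""
        self._route_cache.clear()

    def distance(self, start, goal):
        """Breadth-first search for the number of moves, cached per pair."""
        key = (start, goal)
        if key not in self._route_cache:
            self._route_cache[key] = self._search(start, goal)
        return self._route_cache[key]

    def _search(self, start, goal):
        seen, queue = {start}, deque([(start, 0)])
        while queue:
            room, steps = queue.popleft()
            if room == goal:
                return steps
            for nxt in self.exits.get(room, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + 1))
        return None
```

If the door is moved and the cache is left alone, `distance` keeps returning the answer for the old layout; one call to `signal_map_change` after each move is all it takes to keep the two in sync.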

So. Is this elegant?

It sure doesn’t look pretty, and it requires meddling with guts that I’m not even supposed to know about. That’s not particularly elegant.

And yet! I only did this in the first place because it lets all the rest of my code express exactly what I want. I don’t have to keep remembering that my doors are fake and sticking special cases everywhere. This code is self-contained, robust, and forgettable. It sacrifices elegance so that other code doesn’t have to.

I think that’s something to value.

Starbound versus Aseprite

My door hack is not so much about code elegance as about interface elegance. It does some strange things in order to preserve all the guarantees and semantics of a fundamental existing interface: the door.

I’d like to close with a more literal example of that.

Hardware-accelerated programs generally don’t use native GUI controls. This generally means: video games implement their own buttons, textboxes, etc. I’ve heard rumors that it’s possible to mash native GUI controls into an accelerated canvas, but I can’t recall ever having seen it done.

I got into Starbound for a while last year. I even ran a little private server for some friends. And one constant aggravation I had was that Tab doesn’t move the focus between textboxes in dialogs. The “connect to server” dialog is just three boxes: hostname, username, password. It seems so obvious that you should be able to tab between these, but you cannot. Whoever built the GUI didn’t implement tabbing.

Some other common keyboard shortcuts are wrong, which trips me up from time to time. Ctrl+arrow moves the cursor only one character at a time, rather than one word at a time. Shift+arrow moves one word at a time, rather than selecting a character. Ctrl+Backspace appears to clear an entire textbox, rather than deleting the previous word.
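
For a sense of how much behavior hides behind those few keys, here is a minimal Python sketch of what a custom textbox has to reimplement. It assumes the simplest possible convention, where a word is any run of non-whitespace characters; real toolkits differ on details such as punctuation and where the cursor lands, and none of these function names come from any actual GUI library.

```python
def word_left(text, cursor):
    """Ctrl+Left: jump to the start of the word before the cursor."""
    i = cursor
    while i > 0 and text[i - 1].isspace():      # skip spaces left of the cursor
        i -= 1
    while i > 0 and not text[i - 1].isspace():  # then skip the word itself
        i -= 1
    return i


def word_right(text, cursor):
    """Ctrl+Right: jump past the end of the next word."""
    i = cursor
    while i < len(text) and text[i].isspace():
        i += 1
    while i < len(text) and not text[i].isspace():
        i += 1
    return i


def delete_word_left(text, cursor):
    """Ctrl+Backspace: delete back to the start of the previous word."""
    start = word_left(text, cursor)
    return text[:start] + text[cursor:], start


def next_focus(fields, current, backwards=False):
    """Tab / Shift+Tab: cycle focus through the dialog's fields."""
    step = -1 if backwards else 1
    return fields[(fields.index(current) + step) % len(fields)]
```

Each function is only a few lines, but a hand-rolled GUI needs all of them, plus selection, clipboard, and mouse handling, before a textbox stops feeling wrong.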

It’s not surprising that a few things would slip through the cracks. Game GUI libraries have to be reimplemented from scratch, and that’s an awful lot of work just to get the buttons and textboxes that other developers get to take for granted.

A few days ago, I tried Aseprite, a pixel art editor. The entire interface is custom-drawn (in a pixellated style, no less), which had me worried that I’d encounter the same problems as in Starbound.

To my great surprise, I didn’t! All the keys I’m used to work just fine. I can even double-click a word to select it. I don’t know if this is a third-party library or something the Aseprite developer built themselves, but someone put a lot of work into making this feel natural.

I don’t usually like custom UIs at all. Between the very appropriate design and the attention to detail, I’m pretty sold on Aseprite’s.

I might even call it… elegant.

It doesn’t matter that I haven’t seen the code. Maybe the code is a mess. In fact it almost certainly is. The end result is what’s elegant here, because it completely disappears. I can forget I’m not using native controls.

Maybe that’s a better definition of elegance, then: elegance is transparent. It’s so natural that it disappears entirely. A hack stands out because it’s unusual and jarring; an elegant solution (or interface, or whatever) fits together so smoothly that you forget the problem ever existed.

Goodbye to our $25.00 On/Off Switch

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/goodbye-25-00-onoff-switch/

Backblaze Power Supply Button
Looking at the image above you can almost hear Hal saying “I’m sorry, Dave. I’m afraid I can’t do that”. By coincidence, we heard “I’m afraid I can’t do that” any time we suggested to our Storage Pod designers that we replace the on-off switch in our Storage Pods. You know the one: the Frozen CPU model ele-302 on/off switch and cable we paid $24.95 or more for over these past seven years. The switch that helped garner 118 comments from the folks on the Sysadmin sub-reddit a couple of years ago. The switch whose blue LEDs burn brightly on over 1,200 Backblaze Storage Pods. That switch. I have some news…

Ding dong the switch is dead

To be clear, you can still buy the ele-302 on/off switch, but we are no longer using this switch in our Storage Pods. We have a new switch, but more on that later. For the moment let’s pay homage to the ele-302.

The ele-302 has been with us since the beginning. In the picture to the right are a couple of early, circa 2009, prototype Storage Pods. The hole for the power button in the upper right corner of each pod was drilled by hand because we forgot to include the hole in the original case design. Filling those hand-crafted holes are ele-302 power buttons.

Pod with no switches

If you go all the way back to our Storage Pod 1.0 blog post and scroll down to the bottom, you’ll see the ele-302 proudly listed there and in every Storage Pod bill-of-materials since. While you’re staring at the Storage Pod 1.0 parts list you’ll notice that every other part listed has been replaced by other parts and/or newer models in subsequent Storage Pod versions. That’s right, the ele-302 was the last holdout and now it’s gone.

Cue Hal

The image at the top of this blog post is not Hal, but instead it is our new on/off switch. Ladies and gentlemen let me introduce the Primochill mod/smart Silver Aluminum Momentary Vandal Switch – 22mm – Ring Illumination – Red LED, model PSW-SMV22-R-R. Ta-da. OK, so it doesn’t quite roll off the tongue like the ele-302, but let’s give it a chance. First, it has a cool red ring and we here at Backblaze like red. Second, the list price is only $14.95 and that includes the cable. That’s a $10 savings for our next generation Storage Pod. If we continue to save that kind of money, we’ll get the price to build one of our Storage Pods under four cents a GB in no time. And it’s all part of our ongoing efforts to build reliable, cost efficient cloud storage and pass the savings on to our customers.

Goodbye ele-302

As we said, the ele-302 is still available if you’d like to buy one. We have nothing but nice things to say about it, but from now on we’ll be using the Primochill switch, which I’m going to call the “Chill-22”. Maybe the folks at Primochill will read this and change the name; probably not. Join us as we welcome the Chill-22 to the Backblaze family.

The post Goodbye to our $25.00 On/Off Switch appeared first on Backblaze Blog | The Life of a Cloud Backup Company.

Avahi 0.6 in Beta

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/avahi-0.6-pre.html

Unless we find any major bugs, Avahi 0.6 will be released on Friday. We ask everyone to do some testing for us:

  • Current Avahi SVN snapshot
  • Current libdaemon SVN snapshot

There have been a bunch of API changes. However, the API is now frozen, so feel free to start porting your application to the new API now.

A rough overview of the many improvements in Avahi 0.6:

  • Read-only wide-area support (i.e. DNS-SD over unicast DNS)
  • Ported to FreeBSD, NetBSD, Darwin/MacOSX and to some extent OpenBSD
  • Compatibility layers for HOWL and Bonjour
  • Support for registering/browsing arbitrary records
  • Proper support for DNS-SD service subtypes
  • Native C implementations of the client utilities
  • Now passes the Bonjour conformance test suite without any exceptions
  • “Passive observation of failures”
  • chroot() support
  • Many traffic reduction improvements
  • Bugfixes, cleanups
