
Let’s stop copying C

Post Syndicated from Eevee original https://eev.ee/blog/2016/12/01/lets-stop-copying-c/

Ah, C. The best lingua franca language we have… because we have no other lingua franca languages.

C is fairly old — 44 years, now! — and comes from a time when there were possibly more architectures than programming languages. It works well for what it is, and what it is is a relatively simple layer of indirection atop assembly.

Alas, the popularity of C has led to a number of programming languages’ taking significant cues from its design, and parts of its design are… slightly questionable. I’ve gone through some common features that probably should’ve stayed in C and my justification for saying so. The features are listed in rough order from (I hope) least to most controversial. The idea is that C fans will give up when I complain about argument order and not even get to the part where I rag on braces. Wait, crap, I gave it away.

I’ve listed some languages that do or don’t take the same approach as C. Plenty of the listed languages have no relation to C, and some even predate it — this is meant as a cross-reference of the landscape (and perhaps a list of prior art), not a genealogy. The language selections are arbitrary and based on what I could cobble together from documentation, experiments, Wikipedia, and attempts to make sense of Rosetta Code. I don’t know everything about all of them, so I might be missing some interesting quirks. Things are especially complicated for very old languages like COBOL or Fortran, which by now have numerous different versions and variants and de facto standard extensions.

“Bash” generally means zsh and ksh and other derivatives as well, and when referring to expressions, means the $(( ... )) syntax; “Unix shells” means Bourne and thus almost certainly everything else as well. I didn’t look too closely into, say, fish. Unqualified “Python” means both 2 and 3; likewise, unqualified “Perl” means both 5 and 6. Also some of the puns are perhaps a little too obtuse, but the first group listed is always C-like.

Textual inclusion

#include is not a great basis for a module system. It’s not even a module system. You can’t ever quite tell what symbols came from which files, or indeed whether particular files are necessary at all. And in languages with C-like header files, most headers include other headers include more headers, so who knows how any particular declaration is actually ending up in your code? Oh, and there’s the whole include guards thing.

It’s a little tricky to pick on individual languages here, because ultimately even the greatest module system in the world boils down to “execute this other file, and maybe do some other stuff”. I think the true differentiating feature is whether including/importing/whatevering a file creates a new namespace. If a file gets dumped into the caller’s namespace, that looks an awful lot like textual inclusion; if a file gets its own namespace, that’s a good sign of something more structured happening behind the scenes.

This tends to go hand-in-hand with how much the language relies on a global namespace. One surprising exception is Lua, which can compartmentalize required files quite well, but dumps everything into a single global namespace by default.

Quick test: if you create a new namespace and import another file within that namespace, do its contents end up in that namespace?
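Here is a sketch of that quick test in Python, which passes it: importing a file creates a new namespace rather than dumping names into the importer's. The module name mymod and its contents are invented for the demo.

```python
# Importing a file in Python gives it its own namespace; nothing leaks
# into the caller's. "mymod" is a throwaway module written for the demo.
import importlib
import os
import sys
import tempfile

tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "mymod.py"), "w") as f:
    f.write("secret = 42\n")

sys.path.insert(0, tmpdir)
mymod = importlib.import_module("mymod")

print(mymod.secret)           # 42, reachable only through the module
print("secret" in globals())  # False: our namespace is untouched
```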

Included: ACS, awk, COBOL, Erlang, Forth, Fortran, most older Lisps, Perl 5 (even though required files must return true), PHP, Ruby, Unix shells.

Excluded: Ada, Clojure, D, Haskell, Julia, Lua (the file’s return value is returned from require), Nim, Node (similar to Lua), Perl 6, Python, Rust.

Special mention: ALGOL appears to have been designed with the assumption that you could include other code by adding its punch cards to your stack. C#, Java, OCaml, and Swift all have some concept of “all possible code that will be in this program”, sort of like C with inferred headers, so imports are largely unnecessary; Java’s import really just does aliasing. Inform 7 has no namespacing; it does have a first-class concept of external libraries, but no way to split a single project up between multiple files.

Optional block delimiters

Old and busted and responsible for gotofail:

if (condition)
    thing;

New hotness, which reduces the amount of punctuation overall and eliminates this easy kind of error:

if condition {
    thing;
}

To be fair, and unlike most of these complaints, the original idea was a sort of clever consistency: the actual syntax was merely if (expr) stmt, and also, a single statement could always be replaced by a block of statements. Unfortunately, the cuteness doesn’t make up for the ease with which errors sneak in. If you’re stuck with a language like this, I advise you always use braces, possibly excepting the most trivial cases like immediately returning if some argument is NULL. Definitely do not do this nonsense, which I saw in actual code not 24 hours ago.

for (x = ...)
    for (y = ...) {
        ...
    }

    // more code

    for (x = ...)
        for (y = ...)
            buffer[y][x] = ...

The only real argument for omitting the braces is that the braces take up a lot of vertical space, but that’s mostly a problem if you put each { on its own line, and you could just not do that.

Some languages use keywords instead of braces, and in such cases it’s vanishingly rare to make the keywords optional.

Blockheads: ACS, awk, C#, D, Erlang (kinda?), Java, JavaScript.

New kids on the block: Go, Perl 6, Rust, Swift.

Had their braces removed: Ada, ALGOL, BASIC, COBOL, CoffeeScript, Forth, Fortran (but still requires parens), Haskell, Lua, Ruby.

Special mention: Inform 7 has several ways to delimit blocks, none of them vulnerable to this problem. Perl 5 requires both the parentheses and the braces… but it lets you leave off the semicolon on the last statement. Python just uses indentation to delimit blocks in the first place, so you can’t have a block look wrong. Lisps exist on a higher plane of existence where the very question makes no sense.

Bitwise operator precedence

For ease of transition from B, in C, the bitwise operators & | ^ have lower precedence than the comparison operators == and friends. That means they happen later. For binary math operators, this is nonsense.

1 + 2 == 3  // (1 + 2) == 3
1 * 2 == 3  // (1 * 2) == 3
1 | 2 == 3  // 1 | (2 == 3)

Many other languages have copied C’s entire set of operators and their precedence, including this. Because a new language is easier to learn if its rules are familiar, you see. Which is why we still, today, have extremely popular languages maintaining compatibility with a language from 1969 — so old that it probably couldn’t get a programming job.

Honestly, if your language is any higher-level than C, I’m not sure bit operators deserve to be operators at all. Free those characters up to do something else. Consider having a first-class bitfield type; then 99% of the use of bit operations would go away.

Quick test: 1 & 2 == 2 evaluates to 1 with C precedence, false otherwise. Or just look at a precedence table: if equality appears between bitwise ops and other math ops, that’s C style.
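Running the quick test in a language from the “wised up” camp makes the difference concrete; here it is in Python:

```python
# Python gives bitwise operators tighter binding than comparisons, so the
# quick test comes out False rather than 1:
print(1 & 2 == 2)    # False: parsed as (1 & 2) == 2, i.e. 0 == 2
# C's grouping, written out explicitly:
print(1 & (2 == 2))  # 1: the comparison happens first, then 1 & True
```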

A bit wrong: C#, D, expr, JavaScript, Perl 5, PHP.

Wised up: Bash, F# (ops are &&& ||| ^^^), Go, Julia, Lua (bitwise ops are new in 5.3), Perl 6 (ops are ?& ?| ?^), Python, Ruby, SQL, Swift.

Special mention: Java has C’s precedence, but forbids using bitwise operators on booleans, so the quick test is a compile-time error. Lisp-likes have no operator precedence.

Negative modulo

The modulo operator, %, finds the remainder after division. Thus you might think that this always holds:

0 <= a % b < abs(b)

But no — if a is negative, C will produce a negative value. This is so a / b * b + a % b is always equal to a. Truncating integer division rounds towards zero, so a % b always takes the sign of a.

I’ve never found this behavior (or the above equivalence) useful. An easy example is that checking for odd numbers with x % 2 == 1 will fail for negative numbers, which produce -1. But the opposite behavior can be pretty handy.

Consider the problem of having n items that you want to arrange into rows with c columns. A calendar, say; you want to include enough empty cells to fill out the last row. n % c gives you the number of items on the last row, so c - n % c seems like it will give you the number of empty spaces. But if the last row is already full, then n % c is zero, and c - n % c equals c! You’ll have either a double-width row or a spare row of empty cells. Fixing this requires treating n % c == 0 as a special case, which is unsatisfying.

Ah, but if we have positive %, the answer is simply… -n % c! Consider this number line for n = 5 and c = 3:

-6      -3       0       3       6
 | - x x | x x x | x x x | x x - |

a % b tells you how far to count down to find a multiple of b. For positive a, that means “backtracking” over a itself and finding a smaller number. For negative a, that means continuing further away from zero. If you look at negative numbers as the mirror image of positive numbers, then % on a positive number tells you how much to file off to get a multiple, whereas % on a negative number tells you how much further to go to get a multiple. 5 % 3 is 2, but -5 % 3 is 1. And of course, -6 % 3 is still zero, so that’s not a special case.

Positive % effectively lets you choose whether to round up or down. It doesn’t come up often, but when it’s handy, it’s really handy.
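The whole calendar computation collapses to one expression in a language with positive %; here is a sketch in Python, with a padding helper invented for the demo:

```python
# Number of empty cells needed to pad n items into rows of c columns.
# With positive %, -n % c covers the full-last-row case with no special
# casing at all.
def padding(n, c):
    return -n % c

print(padding(5, 3))  # 1 empty cell on the last row
print(padding(6, 3))  # 0: a full last row just works
print(padding(7, 3))  # 2
```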

(I have no strong opinion on what 5 % -3 should be; I don’t think I’ve ever tried to use % with a negative divisor. Python makes it negative; Pascal makes it positive. Wikipedia has a whole big chart.)

Quick test: -5 % 3 is -2 with C semantics, 1 with “positive” semantics.
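Both behaviors are reachable from Python, which makes the quick test easy to run side by side:

```python
import math

print(-5 % 3)            # 1: Python's % uses "positive" (floored) semantics
print(math.fmod(-5, 3))  # -2.0: C-style truncating remainder, for contrast
print(-6 % 3)            # 0 either way
```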

Leftovers: Bash, C#, D, expr, Go, Java, JavaScript, OCaml, PowerShell, PHP, Rust, Scala, SQL, Swift, VimL, Visual Basic. Notably, some of these languages don’t even have integer division.

Paying dividends: Dart, MUMPS (#), Perl, Python, R (%%), Ruby, Smalltalk (\\), Standard ML, Tcl.

Special mention: Ada, Haskell, Julia, many Lisps, MATLAB, VHDL, and others have separate mod (Python-like) and rem (C-like) operators. CoffeeScript has separate % (C-like) and %% (Python-like) operators.

Leading zero for octal

Octal notation like 0777 has three uses.

One: to make a file mask to pass to chmod().

Two: to confuse people when they write 013 and it comes out as 11.

Three: to confuse people when they write 018 and get a syntax error.

If you absolutely must have octal (?!) in your language, it’s fine to use 0o777. Really. No one will mind. Or you can go the whole distance and allow literals written in any base, as several languages do.
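Python 3 is one of the languages that went this route, and int() covers the “any base” option too:

```python
# Python 3 keeps octal but requires the unambiguous 0o prefix, and int()
# parses literals in any base from 2 to 36:
print(0o777)          # 511
print(int("777", 8))  # 511
print(int("zz", 36))  # 1295
try:
    compile("013", "<demo>", "eval")
except SyntaxError:
    print("leading-zero octal is a syntax error")
```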

Gets a zero: awk (gawk only), Bash, Clojure, Go, Groovy, Java, JavaScript, m4, Perl 5, PHP, Python 2, Scala.

G0od: ECMAScript 6, Eiffel (0c — cute!), F#, Haskell, Julia, Nemerle, Nim, OCaml, Perl 6, Python 3, Racket (#o), Ruby, Scheme (#o), Swift, Tcl.

Based literals: Ada (8#777#), Bash (8#777), Erlang (8#777), Icon (8r777), J (8b777), Perl 6 (:8<777>), PostScript (8#777), Smalltalk (8r777).

Special mention: BASIC uses &O and &H prefixes for octal and hex. bc and Forth allow the base used to interpret literals to be changed on the fly, via ibase and BASE respectively. C#, D, expr, Lua, and Standard ML have no octal literals at all. Some COBOL extensions use O# and H#/X# prefixes for octal and hex. Fortran uses the slightly odd O'777' syntax.

No power operator

Perhaps this makes sense in C, since it doesn’t correspond to an actual instruction on most CPUs, but in JavaScript? If you can make + work for strings, I think you can add a **.

If you’re willing to ditch the bitwise operators (or lessen their importance a bit), you can even use ^, as most people would write in regular ASCII text.

Powerless: ACS, C#, Eiffel, Erlang, expr, Forth, Go.

Two out of two stars: Ada, ALGOL (↑ works too), Bash, COBOL, CoffeeScript, Fortran, F#, Groovy, OCaml, Perl, PHP, Python, Ruby.

I tip my hat: awk, BASIC, bc, COBOL, fish, Lua.

Otherwise powerful: APL (*), D (^^).

Special mention: Lisps tend to have a named function rather than a dedicated operator (e.g. Math/pow in Clojure, expt in Common Lisp), but since operators are regular functions, this doesn’t stand out nearly so much. Haskell uses all three of ^, ^^, and ** for typing reasons.

C-style for loops

This construct is bad. It very rarely matches what a human actually wants to do, which 90% of the time is “go through this list of stuff” or “count from 1 to 10”. A C-style for obscures those wishes. The syntax is downright goofy, too: nothing else in the language uses ; as a delimiter and repeatedly executes only part of a line. It’s like a tuple of statements.

I said in my previous post about iteration that having an iteration protocol requires either objects or closures, but I realize that’s not true. I even disproved it in the same post. Lua’s own iteration protocol can be implemented without closures — the semantics of for involve keeping a persistent state value and passing it to the iterator function every time. It could even be implemented in C! Awkwardly. And with a bunch of macros. Which aren’t hygienic in C. Hmm, well.
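Here is a sketch of that protocol transplanted into Python; the names inext and lua_for are invented for the demo:

```python
# Lua's generic for keeps three values: an iterator function, a fixed
# state, and a control value. Each pass it calls iter_fn(state, control);
# a None result ends the loop. No closures or objects required.
def inext(state, control):
    i = control + 1
    if i < len(state):
        return i, state[i]
    return None

def lua_for(iter_fn, state, control):
    out = []
    while True:
        r = iter_fn(state, control)
        if r is None:
            break
        control = r[0]   # first result becomes next pass's control value
        out.append(r)
    return out

print(lua_for(inext, ["a", "b", "c"], -1))
# [(0, 'a'), (1, 'b'), (2, 'c')]
```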

Loopy: ACS, bc, Fortran.

Cool and collected: C#, Clojure, D, Delphi (recent), Eiffel (recent), Go, Groovy, Icon, Inform 7, Java, Julia, Logo, Lua, Nemerle, Nim, Objective-C, Perl, PHP, PostScript, Prolog, Python, R, Rust, Scala, Smalltalk, Swift, Tcl, Unix shells, Visual Basic.

Special mention: Functional languages and Lisps are laughing at the rest of us here. awk has for...in, but it doesn’t iterate arrays in order, which makes it rather less useful. JavaScript has both for...in and for...of, but both are differently broken, so you usually end up using C-style for or external iteration. BASIC has an ergonomic numeric loop, but no iteration loop. Ruby mostly uses external iteration, and its for block is actually expressed in those terms.

Switch with default fallthrough

We’ve been through this before. Wanting completely separate code per case is, by far, the most common thing to want to do. It makes no sense to have to explicitly opt out of the more obvious behavior.

Breaks my heart: C#, Java, JavaScript.

Follows through: Ada, BASIC, CoffeeScript, Go (has a fallthrough statement), Lisps, Swift (has a fallthrough statement), Unix shells.

Special mention: D requires break, but requires something one way or the other — implicit fallthrough is disallowed except for empty cases. Perl 5 historically had no switch block built in, but it comes with a Switch module, and the last seven releases have had an experimental given block which I stress is still experimental. Python has no switch block. Erlang, Haskell, and Rust have pattern-matching instead (which doesn’t allow fallthrough at all).

Type first

int foo;

In C, this isn’t too bad. You get into problems when you remember that it’s common for type names to be all lowercase.

foo * bar;

Is that a useless expression, or a declaration? It depends entirely on whether foo is a variable or a type.

It gets a little weirder when you consider that there are type names with spaces in them. And storage classes. And qualifiers. And sometimes part of the type comes after the name.

extern const volatile _Atomic unsigned long long int * restrict foo[];

That’s not even getting into the syntax for types of function pointers, which might have arbitrary amounts of stuff after the variable name.

And then C++ came along with generics, which means a type name might also have other type names nested arbitrarily deep.

extern const volatile std::unordered_map<unsigned long long int, std::unordered_map<const long double * const, const std::vector<std::basic_string<char>>::const_iterator>> foo;

And that’s just a declaration! Imagine if there were an assignment in there too.

The great thing about static typing is that I know the types of all the variables, but that advantage is somewhat lessened if I can’t tell what the variables are.

Between type-first, function pointer syntax, Turing-complete duck-typed templates, and C++’s initialization syntax, there are several ways where parsing C++ is ambiguous or even undecidable! “Undecidable” here means that there exist C++ programs which cannot even be parsed into a syntax tree, because the same syntax means two different things depending on whether some expression is a value or a type, and that question can depend on an endlessly recursive template instantiation. (This is also a great example of ambiguity, where x * y(z) could be either an expression or a declaration.)

Contrast with, say, Rust:

let x: ... = ...;

This is easy to parse, both for a human and a computer. The thing before the colon must be a variable name, and it stands out immediately; the thing after the colon must be a type name. Even better, Rust has pretty good type inference, so the type is probably unnecessary anyway.

Of course, languages with no type declarations whatsoever are immune to this problem.
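Python 3’s optional annotations split the difference: no declarations required, but when you do annotate, the name still comes first. A quick sketch:

```python
from typing import Dict, List

# Name first, colon, then the type: the variable never gets lost, no
# matter how gnarly the annotation grows.
count: int = 3
index: Dict[str, List[int]] = {"a": [1, 2]}
print(count, index["a"])
```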

Most vexing: Java, Perl 6.

Looks lovely: Python 3 (annotation syntax and the typing module), Rust, Swift, TypeScript.

Weak typing

Please note: this is not the opposite of static typing. Weak typing is more about the runtime behavior of values — if I try to use a value of type T as though it were of type U, will it be implicitly converted?

C lets you assign pointers to int variables and then take square roots of them, which seems like a bad idea to me. C++ agreed and nixed this, but also introduced the ability to make your own custom types implicitly convertible to as many other types as you want.

This one is pretty clearly a spectrum, and I don’t have a clear line. For example, I don’t fault Python for implicitly converting between int and float, because int is infinite-precision and float is 64-bit, so it’s usually fine. But I’m a lot more suspicious of C, which lets you assign an int to a char without complaint. (Well, okay. Literal integers in C are ints, which poses a slight problem.)
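A minimal sketch of where Python draws that line:

```python
# Python converts int to float implicitly (harmless, since int is
# arbitrary-precision) but refuses cross-kind coercions outright:
print(1 + 2.5)  # 3.5
try:
    "1" + 2
except TypeError as e:
    print("refused:", type(e).__name__)
```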

I do count a combined addition/concatenation operator that accepts different types of arguments as a form of weak typing.

Weak: JavaScript (+), Unix shells (everything’s a string, but even arrays/scalars are somewhat interchangeable)

Strong: Rust (even numeric upcasts must be explicit).

Special mention: Perl 5 is weak, but it avoids most of the ambiguity by having entirely separate sets of operators for string vs numeric operations. Python 2 is mostly strong, but that whole interchangeable bytes/text thing sure caused some ruckus.

Integer division

“Hey, new programmers!” you may find yourself saying. “Don’t worry, it’s just like math, see? Here’s how to use $LANGUAGE as a calculator.”

“Oh boy!” says your protégé. “Let’s see what 7 ÷ 2 is! Oh, it’s 3. I think the computer is broken.”

They’re right! It is broken. I have genuinely seen a non-trivial number of people come into #python thinking division is “broken” because of this.

To be fair, C is pretty consistent about making math operations always produce a value whose type matches one of the arguments. It’s also unclear whether such division should produce a float or a double. Inferring from context would make sense, but that’s not something C is really big on.

Quick test: 7 / 2 is 3½, not 3.
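Python 3 is the canonical reformed offender here — it changed the meaning of / outright and gave integer division its own spelling:

```python
print(7 / 2)    # 3.5: Python 3 made / true division
print(7 // 2)   # 3: integer division is still there, spelled //
print(-7 // 2)  # -4: and it floors rather than truncating
```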

Integrous: Bash, bc, C#, D, expr, F#, Fortran, Go, OCaml, Python 2, Ruby, Rust (hard to avoid).

Afloat: awk (no integers), Clojure (produces a rational!), Groovy, JavaScript (no integers), Lua (no integers until 5.3), Nim, Perl 5 (no integers), Perl 6, PHP, Python 3.

Special mention: Haskell disallows / on integers. Nim, Perl 6, Python, and probably others have separate integral division operators: div, div, and //, respectively.

Bytestrings

“Strings” in C are arrays of 8-bit characters. They aren’t really strings at all, since they can’t hold the vast majority of characters without some further form of encoding. Exactly what the encoding is and how to handle it is left entirely up to the programmer. This is a pain in the ass.

Some languages caught wind of this Unicode thing in the 90s and decided to solve this problem once and for all by making “wide” strings with 16-bit characters. (Even C95 has this, in the form of wchar_t* and L"..." literals.) Unicode, you see, would never have more than 65,536 characters.

Whoops, so much for that. Now we have strings encoded as UTF-16 rather than UTF-8, so we’re paying extra storage cost and we still need to write extra code to do basic operations right. Or we forget, and then later we have to track down a bunch of wonky bugs because someone typed a 💩.

Note that handling characters/codepoints is very different from handling glyphs, i.e. the distinct shapes you see on screen. Handling glyphs doesn’t even really make sense outside the context of a font, because fonts are free to make up whatever ligatures they want. Remember “diverse” emoji? Those are ligatures of three to seven characters, completely invented by a font vendor. A programming language can’t reliably count the display length of that, especially when new combining behaviors could be introduced at any time.

Also, it doesn’t matter how you solve this problem, as long as it appears to be solved. I believe Ruby uses bytestrings, for example, but they know their own encoding, so they can be correctly handled as sequences of codepoints. Having a separate non-default type or methods does not count, because everyone will still use the wrong thing first — sorry, Python 2.

Quick test: what’s the length of “💩”? If 1, you have real unencoded strings. If 2, you have UTF-16 strings. If 4, you have UTF-8 strings. If something else, I don’t know what the heck is going on.
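All three answers are visible from Python 3, which counts codepoints but can encode to anything:

```python
s = "\N{PILE OF POO}"
print(len(s))                           # 1: real codepoint strings
print(len(s.encode("utf-8")))           # 4: bytes in UTF-8
print(len(s.encode("utf-16-le")) // 2)  # 2: UTF-16 code units
```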

Totally bytes: Lua, Python 2 (separate unicode type).

Comes up short: Java, JavaScript.

One hundred emoji: Python 3, Ruby, Rust.

Special mention: Perl 5 gets the quick test right if you put use utf8; at the top of the file, but Perl 5’s Unicode support is such a confusing clusterfuck that I can’t really give it a 💯.

Autoincrement and autodecrement

I don’t think there are too many compelling reasons to have ++. It means the same as += 1, which is still nice and short. The only difference is that people can do stupid unreadable tricks with ++.
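Python’s choice here comes with a small trap worth knowing: ++ parses, but it means something else entirely. A quick sketch:

```python
x = 5
x += 1       # the explicit spelling
print(x)     # 6
# ++x is legal Python, but it's two unary plus operators, not an
# increment, so it quietly does nothing:
print(++x)   # still 6
```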

One exception: it is possible to overload ++ in ways that don’t make sense as += 1 — for example, C++ uses ++ to advance iterators, which may do any arbitrary work under the hood.

Double plus ungood:

Double plus good: Python

Special mention: Perl 5 and PHP both allow ++ on strings, in which case it increments letters or something, but I don’t know if much real code has ever used this.

!

A pet peeve. Spot the difference:

if (looks_like_rain()) {
    ...
}
if (!looks_like_rain()) {
    ...
}

That single ! is ridiculously subtle, which seems wrong to me when it makes an expression mean its polar opposite. Surely it should stick out like a sore thumb. The left parenthesis makes it worse, too; it blends in slightly as just noise.

It helps a bit to space after the ! in cases like this:

if (! looks_like_rain()) {
    ...
}

But this seems to be curiously rare. The easy solution is to just spell the operator not. At which point the other two might as well be and and or.

Interestingly enough, C95 specifies and, or, not, and some others as standard alternative spellings, though I’ve never seen them in any C code and I suspect existing projects would prefer I not use them.
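For comparison, the spelled-out versions in Python, where the negation is a word and words are hard to miss:

```python
looks_like_rain = False
if not looks_like_rain:
    print("leaving the umbrella")
# and/or/not compose the usual way:
print(True and not False or False)  # True
```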

Not right: ACS, awk, C#, D, Go, Groovy, Java, JavaScript, Nemerle, PHP, R, Rust, Scala, Swift, Tcl, Vala.

Spelled out: Ada, ALGOL, BASIC, COBOL, Erlang, F#, Fortran, Haskell, Lisps, Lua, Nim, OCaml, Pascal, PostScript, Python, Smalltalk, Standard ML.

Special mention: APL and Julia both use ~, which is at least easier to pick out, which is more than I can say for most of APL. bc and expr, which are really calculators, have no concept of Boolean operations. Forth and Icon, which are not calculators, don’t seem to either. Perl and Ruby have both symbolic and named Boolean operators (Perl 6 has even more), with different precedence (which inside if won’t matter), but I believe the named forms are preferred.

Single return and out parameters

Because C can only return a single value, and that value is often an indication of failure for the sake of an if, “out” parameters are somewhat common.

double x, y;
get_point(&x, &y);

It’s not immediately clear whether x and y are input or output. Sometimes they might function as both. (And of course, in this silly example, you’d be better off returning a single point struct. Or would you use a point out parameter because returning structs is potentially expensive?)

Some languages have doubled down on this by adding syntax to declare “out” parameters, which removes the ambiguity in the function definition, but makes it worse in function calls. In the above example, using & on an argument is at least a decent hint that the function wants to write to those values. If you have implicit out parameters or pass-by-reference or whatever, that would just be get_point(x, y) and you’d have no indication that those arguments are special in any way.

The vast majority of the time, this can be expressed in a more straightforward way by returning multiple values:

x, y = get_point()

That was intended as Python, but technically, Python doesn’t have multiple returns! It seems to, but it’s really a combination of several factors: a tuple type, the ability to make a tuple literal with just commas, and the ability to unpack a tuple via multiple assignment. In the end it works just as well. Also this is a way better use of the comma operator than in C.
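Spelled out with the function attached, as a runnable sketch:

```python
def get_point():
    return 1, 2        # the commas build a tuple; no parentheses needed

x, y = get_point()     # unpacked by multiple assignment
print(x, y)            # 1 2
```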

But the exact same code could appear in Lua, which has multiple return/assignment as an explicit feature… and no tuples. The difference becomes obvious if you try to assign the return value to a single variable instead:

point = get_point()

In Python, point would be a tuple containing both return values. In Lua, point would be the x value, and y would be silently discarded. I don’t tend to be a fan of silently throwing data away, but I have to admit that Lua makes pretty good use of this in several places for “optional” return values that the caller can completely ignore if desired. An existing function can even be extended to return more values than before — that would break callers in Python, but work just fine in Lua.

(Also, to briefly play devil’s advocate: I once saw Python code that returned 14 values all with very complicated values, types, and semantics. Maybe don’t do that. I think I cleaned it up to return an object, which simplified the calling code considerably too.)

It’s also possible to half-ass this. ECMAScript 6:

function get_point() {
    return [1, 2];
}

var [x, y] = get_point();

It works, but it doesn’t actually look like multiple return. The trouble is that JavaScript has C’s comma operator and C’s variable declaration syntax, so neither of the above constructs could’ve left off the brackets without significantly changing the syntax:

function get_point() {
    // Whoops!  This uses the comma operator, which evaluates to its last
    // operand, so it just returns 2
    return 1, 2;
}

// Whoops!  This is multiple declaration, where each variable gets its own "=",
// so it assigns nothing to x and the return value to y
var x, y = get_point();
// Now x is undefined and y is 2

This is still better than either out parameters or returning an explicit struct that needs manual unpacking, but it’s not as good as comma-delimited tuples. Note that some languages require parentheses around tuples (and also call them tuples), and I’m arbitrarily counting that as better than brackets.

Single return: Ada, ALGOL, BASIC, C#, COBOL, Fortran, Groovy, Java, Smalltalk.

Half-assed multiple return: C++11, D, ECMAScript 6, Erlang, PHP.

Multiple return via tuples: F#, Go, Haskell, Julia, Nemerle, Nim, OCaml, Perl (just lists really), Python, Ruby, Rust, Scala, Standard ML, Swift, Tcl.

Native multiple return: Common Lisp, Lua.

Special mention: Forth is stack-based, and all return values are simply placed on the stack, so multiple return isn’t a special case. Unix shell functions don’t return values. Visual Basic sets a return value by assigning to the function’s name (?!), so good luck fitting multiple return in there.

Silent errors

Most runtime errors in C are indicated by one of two mechanisms: returning an error code, or segfaulting. Segfaulting is pretty noisy, so that’s okay, except for the exploit potential and all.

Returning an error code kinda sucks. Those tend to be important, but nothing in the language actually reminds you to check them, and of course we silly squishy humans have the habit of assuming everything will succeed at all times. Which is how I segfaulted git two days ago: I found a spot where it didn’t check for a NULL returned as an error.

There are several alternatives here: exceptions, statically forcing the developer to check for an error code, or using something monad-like to statically force the developer to distinguish between an error and a valid return value. Probably some others. In the end I was surprised by how many languages went the exception route.
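A sketch of the exception route in Python — the config-file path here is made up:

```python
def read_config(path):
    # open() raises on failure instead of returning a code the caller
    # might forget to check; ignoring the error takes the program down.
    with open(path) as f:
        return f.read()

try:
    read_config("/no/such/file")
except OSError as e:
    print("caught:", type(e).__name__)
```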

Quietly wrong: Unix shells. Wow, yeah, I’m having a hard time naming anything else. Good job, us!

Exceptional: Ada, C++, C#, D, Erlang, Forth, Java (exceptions are even part of function signature), JavaScript, Nemerle, Nim, Objective-C, OCaml, Perl 6, Python, Ruby, Smalltalk, Standard ML, Visual Basic.

Monadic: Haskell (Either), Rust (Result).

Special mention: ACS doesn’t really have many operations that can error, and those that do simply halt the script. ALGOL apparently has something called “mending” that I don’t understand. Go tends to use secondary return values, which calling code has to unpack, making them slightly harder to forget about. Lisps have conditions and call/cc, which are different things entirely. Lua and Perl 5 handle errors by taking down the whole program, but offer a construct that can catch that further up the stack, which is clumsy but enough to emulate try..catch. PHP has exceptions, and errors (which are totally different), and a lot of builtin functions that return error codes. Swift has something that looks like exceptions, but it doesn’t involve stack unwinding and does require some light annotation, so I think it’s all sugar for a monadic return value. Visual Basic, and I believe some other BASICs, decided C wasn’t bad enough and introduced the bizarre On Error Resume Next construct which does exactly what it sounds like.

Nulls

The billion dollar mistake.

I think it’s considerably worse in a statically typed language like C, because the whole point is that you can rely on the types. But a double* might be NULL, which is not actually a pointer to a double; it’s a pointer to a segfault. Other kinds of bad pointers are possible, of course, but those are more an issue of memory safety; allowing any reference to be null violates type safety. The root of the problem is treating null as a possible value of any type, when really it’s its own type entirely.

The alternatives tend to be either opt-in nullability or an “optional” generic type (a monad!) which eliminates null as its own value entirely. Notably, Swift does it both ways: optional types are indicated by a trailing ?, but that’s just syntactic sugar for Option<T>.

On the other hand, while it’s annoying to get a None where I didn’t expect one in Python, it’s not like I’m surprised. I occasionally get a string where I expected a number, too. The language explicitly leaves type concerns in my hands. My real objection is to having a static type system that lies. So I’m not going to list every single dynamic language here, because not only is it consistent with the rest of the type system, but they don’t really have any machinery to prevent this anyway.
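The monadic alternative can be imitated, though not enforced, with Python’s Optional annotation; the find function here is invented for illustration:

```python
from typing import List, Optional

def find(haystack: List[int], needle: int) -> Optional[int]:
    # The annotation at least warns the reader that "no result" is a
    # possible answer, even if nothing makes them check for it.
    for i, x in enumerate(haystack):
        if x == needle:
            return i
    return None

print(find([3, 1, 4], 4))  # 2
print(find([3, 1, 4], 9))  # None
```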

Nothing doing: C#, D, Go, Java, Nim (non-nullable types are opt in), R.

Nullable types: Swift.

Monads: F# (Option — though technically F# also inherits null from .NET), Haskell (Maybe), Rust (Option), Swift (Optional).

Special mention: awk, Tcl, and Unix shells only have strings, so in a surprising twist, they have no concept of null whatsoever. Java recently introduced an Optional<T> type which explicitly may or may not contain a value, but since it’s still a non-primitive, it could also be null. C++17 doesn’t quite have the same problem with std::optional<T>, since non-reference values can’t be null. Inform 7’s nothing value is an object (the root of half of its type system), which means any object variable might be nothing, but any value of a more specific type cannot be nothing. JavaScript has two null values, null and undefined. Perl 6 is really big on static types, but claims its Nil object doesn’t exist, and I don’t know how to even begin to unpack that.

Assignment as expression

How common a mistake is this:

if (x = 3) {
    ...
}

Well, I don’t know, actually. Maybe not that common, save for among beginners. But I sort of wonder whether allowing this buys us anything. I can only think of two cases where it does. One is with something like iteration:

// Typical linked list
while (p = p->next) {
    ...
}

But this is only necessary in C in the first place because it has no first-class notion of iteration. The other is shorthand for checking that a function returned a useful value:

if (ptr = get_pointer()) {
    ...
}

But if a function returns NULL, that’s really an error condition, and presumably you have some other way to handle that too.

What does that leave? The only time I remotely miss this in Python (where it’s illegal) is when testing a regex. You tend to see this a lot instead.

m = re.match('x+y+z+', some_string)
if m:
    ...

re treats failure as an acceptable possibility and returns None, rather than raising an exception. I’m not sure whether this was the right thing to do or not, but off the top of my head I can’t think of too many other Python interfaces that sometimes return None.
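And to be clear about “illegal”: in Python the C mistake from earlier isn’t a lurking bug, it’s a syntax error, which you can check for yourself:

```python
# Assignment is a statement in Python, so "if x = 3" is rejected at
# compile time rather than silently assigning.
rejected = False
try:
    compile("if x = 3: pass", "<demo>", "exec")
except SyntaxError:
    rejected = True
assert rejected
```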

Some languages go entirely the opposite direction and make everything an expression, including block constructs like if. In those languages, it makes sense for assignment to be an expression, for consistency with everything else.

Assignment’s an expression: ACS, C#, D, Java, JavaScript, Perl, PHP, Swift.

Everything’s an expression: Ruby, Rust.

Assignment’s a statement: Inform 7, Lua, Python, Unix shells.

Special mention: BASIC uses = for both assignment and equality testing — the meaning is determined from context. Functional languages generally don’t have an assignment operator. Rust has a special if let block that explicitly combines assignment with pattern matching, which is way nicer than the C approach.

No hyphens in identifiers

snake_case requires dancing on the shift key (unless you rearrange your keyboard, which is perfectly reasonable). It slows you down slightly and leads to occasional mistakes like snake-Case. The alternative is dromedaryCase, which is objectively wrong and doesn’t actually solve this problem anyway.

Why not just allow hyphens in identifiers, so we can avoid this argument and use kebab-case?

Ah, but then it’s ambiguous whether you mean an identifier or the subtraction operator. No problem: require spaces for subtraction. I don’t think a tiny way you’re allowed to make your code harder to read is really worth this clear advantage.

Low score: ACS, C#, D, Java, JavaScript, OCaml, Pascal, Perl 5, PHP, Python, Ruby, Rust, Swift, Unix shells.

Nicely-named: COBOL, CSS (and thus Sass), Forth, Inform 7, Lisps, Perl 6, XML.

Special mention: Perl has a built-in variable called $-, and Ruby has a few called $-n for various values of “n”, but these are very special cases.

Braces and semicolons

Okay. Hang on. Bear with me.

C code looks like this.

some block header {
    line 1;
    line 2;
    line 3;
}

The block is indicated two different ways here. The braces are for the compiler; the indentation is for humans.

Having two different ways to say the same thing means they can get out of sync. They can disagree. And that can be, as previously mentioned, really bad. This is really just a more general form of the problem of optional block delimiters.

The only solution is to eliminate one of the two. Programming languages exist for the benefit of humans, so we obviously can’t get rid of the indentation. Thus, we should get rid of the braces. QED.

As an added advantage, we reclaim all the vertical space wasted on lines containing only a }, and we can stop squabbling about where to put the {.

If you accept this, you might start to notice that there are also two different ways of indicating where a line ends: with semicolons for the compiler, and with vertical whitespace for humans. So, by the same reasoning, we should lose the semicolons.

Right? Awesome. Glad we’re all on the same page.

Some languages use keywords instead of braces, but the effect is the same. I’m not aware of any languages that use keywords instead of semicolons.

Bracing myself: C#, D, Erlang, Java, Perl, Rust.

Braces, but no semicolons: JavaScript (kinda — see below), Lua, Ruby, Swift.

Free and clear: CoffeeScript, Haskell, Python.

Special mention: Lisp, just, in general. Inform 7 has an indented style, but it still requires semicolons.

Here’s some interesting trivia. JavaScript, Lua, and Python all optionally allow semicolons at the end of a statement, but the way each language determines line continuation is very different.

JavaScript takes an “opt-out” approach: it continues reading lines until it hits a semicolon, or until reading the next line would cause a syntax error. That leaves a few corner cases like starting a new line with a (, which could look like the last thing on the previous line is a function you’re trying to call. Or you could have -foo on its own line, and it would parse as subtraction rather than unary negation. You might wonder why anyone would do that, but using unary + is one way to make function parse as an expression rather than a statement! I’m not so opposed to semicolons that I want to be debugging where the language thinks my lines end, so I just always use semicolons in JavaScript.

Python takes an “opt-in” approach: it assumes, by default, that a statement ends at the end of a line. However, newlines inside parentheses or brackets are ignored, which takes care of 99% of cases — long lines are most frequently caused by function calls (which have parentheses!) with a lot of arguments. If you really need it, you can explicitly escape a newline with \, but this is widely regarded as incredibly ugly.
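For instance (plain Python, nothing exotic):

```python
# Newlines inside brackets are ignored, so long expressions wrap
# naturally; the backslash continuation works but is frowned upon.
total = sum([
    1,
    2,
    3,
])
also_total = 1 + \
    2 + \
    3
assert total == 6
assert also_total == 6
```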

Lua avoids the problem almost entirely. I believe Lua’s grammar is designed such that it’s almost always unambiguous where a statement ends, even if you have no newlines at all. This has a few weird side effects: void expressions are syntactically forbidden in Lua, for example, so you just can’t have -foo as its own statement. Also, you can’t have code immediately following a return, because it’ll be interpreted as a return value. The upside is that Lua can treat newlines just like any other whitespace, but still not need semicolons. In fact, semicolons aren’t statement terminators in Lua at all — they’re their own statement, which does nothing. Alas, not for lack of trying, Lua does have the same ( ambiguity as JavaScript (and parses it the same way), but I don’t think any of the others exist.

Oh, and the colons that Python has at the end of its block headers, like if foo:? As far as I can tell, they serve no syntactic purpose whatsoever. Purely aesthetic.

Blaming the programmer

Perhaps one of the worst misfeatures of C is the ease with which responsibility for problems can be shifted to the person who wrote the code. “Oh, you segfaulted? I guess you forgot to check for NULL.” If only I had a computer to take care of such tedium for me!

Clearly, computers can’t be expected to do everything for us. But they can be expected to do quite a bit. Programming languages are built for humans, and they ought to eliminate the sorts of rote work humans are bad at whenever possible. A programmer is already busy thinking about the actual problem they want to solve; it’s no surprise that they’ll sometimes forget some tedious detail the language forces them to worry about.

So if you’re designing a language, don’t just copy C. Don’t just copy C++ or Java. Hell, don’t even just copy Python or Ruby. Consider your target audience, consider the problems they’re trying to solve, and try to get as much else out of the way as possible. If the same “mistake” tends to crop up over and over, look for a way to modify the language to reduce or eliminate it. And be sure to look at a lot of languages for inspiration — even ones you hate, even weird ones no one uses! A lot of clever people have had a lot of other ideas in the last 44 years.


I hope you enjoyed this accidental cross-reference of several dozen languages! I enjoyed looking through them all, though it was incredibly time-consuming. Some of them look pretty interesting; maybe give them a whirl.

Also, dammit, now I’m thinking about language design again.


Textual inclusion

#include is not a great basis for a module system. It’s not even a module system. You can’t ever quite tell what symbols came from which files, or indeed whether particular files are necessary at all. And in languages with C-like header files, most headers include other headers include more headers, so who knows how any particular declaration is actually ending up in your code? Oh, and there’s the whole include guards thing.

It’s a little tricky to pick on individual languages here, because ultimately even the greatest module system in the world boils down to “execute this other file, and maybe do some other stuff”. I think the true differentiating feature is whether including/importing/whatevering a file creates a new namespace. If a file gets dumped into the caller’s namespace, that looks an awful lot like textual inclusion; if a file gets its own namespace, that’s a good sign of something more structured happening behind the scenes.

This tends to go hand-in-hand with how much the language relies on a global namespace. One surprising exception is Lua, which can compartmentalize required files quite well, but dumps everything into a single global namespace by default.

Quick test: if you create a new namespace and import another file within that namespace, do its contents end up in that namespace?

Included: ACS, awk, COBOL, Erlang, Forth, Fortran, most older Lisps, Perl 5 (despite that required files must return true), PHP, Unix shells.

Excluded: Ada, Clojure, D, Haskell, Julia, Lua (the file’s return value is returned from require), Nim, Node (similar to Lua), Perl 6, Python, Rust.

Special mention: ALGOL appears to have been designed with the assumption that you could include other code by adding its punch cards to your stack. C#, Java, OCaml, and Swift all have some concept of “all possible code that will be in this program”, sort of like C with inferred headers, so imports are largely unnecessary; Java’s import really just does aliasing. Inform 7 has no namespacing, but does have a first-class concept of external libraries, but doesn’t have a way to split a single project up between multiple files. Ruby doesn’t automatically give required files their own namespace, but doesn’t evaluate them in the caller’s namespace either.
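Python’s side of the quick test can be demonstrated without even touching the filesystem, by building a module object by hand — types.ModuleType is real; the module name fake is mine:

```python
# Executing code inside a hand-made module namespace, Python-style.
import types

mod = types.ModuleType("fake")   # a fresh, empty namespace
exec("x = 42", mod.__dict__)     # "run the file" inside it
assert mod.x == 42               # contents land in the module...
assert "x" not in globals()      # ...not in the caller's namespace
```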

Optional block delimiters

Old and busted and responsible for gotofail:

if (condition)
    thing;

New hotness, which reduces the amount of punctuation overall and eliminates this easy kind of error:

if condition {
    thing;
}

To be fair, and unlike most of these complaints, the original idea was a sort of clever consistency: the actual syntax was merely if (expr) stmt, and also, a single statement could always be replaced by a block of statements. Unfortunately, the cuteness doesn’t make up for the ease with which errors sneak in. If you’re stuck with a language like this, I advise you always use braces, possibly excepting the most trivial cases like immediately returning if some argument is NULL. Definitely do not do this nonsense, which I saw in actual code not 24 hours ago.

for (x = ...)
    for (y = ...) {
        ...
    }

    // more code

    for (x = ...)
        for (y = ...)
            buffer[y][x] = ...

The only real argument for omitting the braces is that the braces take up a lot of vertical space, but that’s mostly a problem if you put each { on its own line, and you could just not do that.

Some languages use keywords instead of braces, and in such cases it’s vanishingly rare to make the keywords optional.

Blockheads: ACS, awk, C#, D, Erlang (kinda?), Java, JavaScript.

New kids on the block: Go, Perl 6, Rust, Swift.

Had their braces removed: Ada, ALGOL, BASIC, COBOL, CoffeeScript, Forth, Fortran (but still requires parens), Haskell, Lua, Ruby.

Special mention: Inform 7 has several ways to delimit blocks, none of them vulnerable to this problem. Perl 5 requires both the parentheses and the braces… but it lets you leave off the semicolon on the last statement. Python just uses indentation to delimit blocks in the first place, so you can’t have a block look wrong. Lisps exist on a higher plane of existence where the very question makes no sense.

Bitwise operator precedence

For ease of transition from B, in C, the bitwise operators & | ^ have lower precedence than the comparison operators == and friends. That means they happen later. For binary math operators, this is nonsense.

1 + 2 == 3  // (1 + 2) == 3
1 * 2 == 3  // (1 * 2) == 3
1 | 2 == 3  // 1 | (2 == 3)

Many other languages have copied C’s entire set of operators and their precedence, including this. Because a new language is easier to learn if its rules are familiar, you see. Which is why we still, today, have extremely popular languages maintaining compatibility with a language from 1969 — so old that it probably couldn’t get a programming job.

Honestly, if your language is any higher-level than C, I’m not sure bit operators deserve to be operators at all. Free those characters up to do something else. Consider having a first-class bitfield type; then 99% of the use of bit operations would go away.

Quick test: 1 & 2 == 2 evaluates to 1 with C precedence, false otherwise. Or just look at a precedence table: if equality appears between bitwise ops and other math ops, that’s C style.
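Here’s the quick test in Python, where bitwise operators bind tighter than comparisons:

```python
# Python parses this as (1 & 2) == 2, which is False.
assert (1 & 2 == 2) is False
# The C-style grouping, written explicitly, gives the surprising 1:
assert (1 & (2 == 2)) == 1
```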

A bit wrong: D, expr, JavaScript, Perl 5, PHP.

Wisened up: F# (ops are &&& ||| ^^^), Go, Julia, Lua (bitwise ops are new in 5.3), Perl 6 (ops are +& +| +^), Python, Ruby, Rust, SQL, Swift, Unix shells.

Special mention: C# and Java have C’s precedence, but forbid using bitwise operators on booleans, so the quick test is a compile-time error. Lisp-likes have no operator precedence.

Negative modulo

The modulo operator, %, finds the remainder after division. Thus you might think that this always holds:

0 <= a % b < abs(b)

But no — if a is negative, C will produce a negative value. (Well, since C99; before that it was unspecified, which is probably worse.) This is so a / b * b + a % b is always equal to a. Truncating integer division rounds towards zero, so the sign of a % b always needs to match the sign of a.

I’ve never found this behavior (or the above equivalence) useful. An easy example is that checking for odd numbers with x % 2 == 1 will fail for negative numbers, which produce -1. But the opposite behavior can be pretty handy.

Consider the problem of having n items that you want to arrange into rows with c columns. A calendar, say; you want to include enough empty cells to fill out the last row. n % c gives you the number of items on the last row, so c - n % c seems like it will give you the number of empty spaces. But if the last row is already full, then n % c is zero, and c - n % c equals c! You’ll have either a double-width row or a spare row of empty cells. Fixing this requires treating n % c == 0 as a special case, which is unsatisfying.

Ah, but if we have positive %, the answer is simply… -n % c! Consider this number line for n = 5 and c = 3:

-6      -3       0       3       6
 | - x x | x x x | x x x | x x - |

a % b tells you how far to count down to find a multiple of b. For positive a, that means “backtracking” over a itself and finding a smaller number. For negative a, that means continuing further away from zero. If you look at negative numbers as the mirror image of positive numbers, then % on a positive number tells you how much to file off to get a multiple, whereas % on a negative number tells you how much further to go to get a multiple. 5 % 3 is 2, but -5 % 3 is 1. And of course, -6 % 3 is still zero, so that’s not a special case.

Positive % effectively lets you choose whether to round up or down. It doesn’t come up often, but when it’s handy, it’s really handy.

(I have no strong opinion on what 5 % -3 should be; I don’t think I’ve ever tried to use % with a negative divisor. Python makes it negative; Pascal makes it positive. Wikipedia has a whole big chart.)

Quick test: -5 % 3 is -2 with C semantics, 1 with “positive” semantics.
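Python has the “positive” semantics, which makes the calendar padding above a one-liner:

```python
# Python's % takes the sign of the divisor, so -n % c is never negative.
assert -5 % 3 == 1
assert 5 % 3 == 2
assert -6 % 3 == 0

# Empty cells needed to pad n items into rows of c columns:
c = 3
for n, expected in [(5, 1), (6, 0), (7, 2)]:
    assert -n % c == expected
```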

Leftovers: C#, D, expr, Go, Java, JavaScript, OCaml, PowerShell, PHP, Rust, Scala, SQL, Swift, Unix shells, VimL, Visual Basic. Notably, some of these languages don’t even have integer division.

Paying dividends: Dart, MUMPS (#), Perl, Python, R (%%), Ruby, Smalltalk (\\), Standard ML, Tcl.

Special mention: Ada, Haskell, Julia, many Lisps, MATLAB, VHDL, and others have separate mod (Python-like) and rem (C-like) operators. CoffeeScript has separate % (C-like) and %% (Python-like) operators.

Leading zero for octal

Octal notation like 0777 has three uses.

One: to make a file mask to pass to chmod().

Two: to confuse people when they write 013 and it comes out as 11.

Three: to confuse people when they write 018 and get a syntax error.

If you absolutely must have octal (?!) in your language, it’s fine to use 0o777. Really. No one will mind. Or you can go the whole distance and allow literals written in any base, as several languages do.
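Python 3 is a handy demo of both halves — 0o works, and a bare leading zero is rejected outright:

```python
# 0o777 is unambiguous; a stray leading zero is a syntax error
# instead of a silent base change.
assert 0o777 == 511
rejected = False
try:
    compile("013", "<demo>", "eval")
except SyntaxError:
    rejected = True
assert rejected
# And int() with an explicit base covers arbitrary bases:
assert int("777", 8) == 511
assert int("zz", 36) == 1295
```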

Gets a zero: awk (gawk only), Clojure, Go, Groovy, Java, JavaScript, m4, Perl 5, PHP, Python 2, Unix shells.

G0od: ECMAScript 6, Eiffel (0c — cute!), F#, Haskell, Julia, Nemerle, Nim, OCaml, Perl 6, Python 3, Ruby, Rust, Scheme (#o), Swift, Tcl.

Based literals: Ada (8#777#), Bash (8#777), Erlang (8#777), Icon (8r777), J (8b777), Perl 6 (:8<777>), PostScript (8#777), Smalltalk (8r777).

Special mention: BASIC uses &O and &H prefixes for octal and hex. bc and Forth allow the base used to interpret literals to be changed on the fly, via ibase and BASE respectively. C#, D, expr, Lua, Scala, and Standard ML have no octal literals at all. Some COBOL extensions use O# and H#/X# prefixes for octal and hex. Fortran uses the slightly odd O'777' syntax.

No power operator

Perhaps this makes sense in C, since it doesn’t correspond to an actual instruction on most CPUs, but in JavaScript? If you can make + work for strings, I think you can add a **.

If you’re willing to ditch the bitwise operators (or lessen their importance a bit), you can even use ^, as most people would write in regular ASCII text.
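For the record, here’s the tradeoff as Python made it — ** for power, ^ left as XOR:

```python
# ** is exponentiation; ^ stays bitwise XOR, hence the classic trap.
assert 2 ** 10 == 1024
assert 2 ^ 10 == 8            # XOR, not 1024!
assert round(2 ** 0.5, 3) == 1.414  # works on floats too
```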

Powerless: ACS, C#, Eiffel, Erlang, expr, Forth, Go.

Two out of two stars: Ada, ALGOL (↑ works too), COBOL, CoffeeScript, ECMAScript 7, Fortran, F#, Groovy, OCaml, Perl, PHP, Python, Ruby, Unix shells.

I tip my hat: awk, BASIC, bc, COBOL, fish, Lua.

Otherwise powerful: APL (*), D (^^).

Special mention: Lisps tend to have a named function rather than a dedicated operator (e.g. Math/pow in Clojure, expt in Common Lisp), but since operators are regular functions, this doesn’t stand out nearly so much. Haskell uses all three of ^, ^^, and ** for typing reasons.

C-style for loops

This construct is bad. It very rarely matches what a human actually wants to do, which 90% of the time is “go through this list of stuff” or “count from 1 to 10”. A C-style for obscures those wishes. The syntax is downright goofy, too: nothing else in the language uses ; as a delimiter and repeatedly executes only part of a line. It’s like a tuple of statements.

I said in my previous post about iteration that having an iteration protocol requires either objects or closures, but I realize that’s not true. I even disproved it in the same post. Lua’s own iteration protocol can be implemented without closures — the semantics of for involve keeping a persistent state value and passing it to the iterator function every time. It could even be implemented in C! Awkwardly. And with a bunch of macros. Which aren’t hygienic in C. Hmm, well.
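Here’s roughly what I mean, sketched in Python rather than C. The function names are mine; the shape — an iterator function plus an invariant state value plus a control value — is Lua’s:

```python
# A Lua-style stateless iteration protocol: the "for" machinery calls
# iterator(state, control) repeatedly; no closures or objects required.
def iter_list(state, control):
    # state is the sequence being iterated; control is the last index
    i = control + 1
    if i < len(state):
        return i, state[i]
    return None, None

def lua_for(iterator, state, control, body):
    # What Lua's generic "for" does under the hood, approximately.
    while True:
        control, value = iterator(state, control)
        if control is None:
            break
        body(value)

seen = []
lua_for(iter_list, ["a", "b", "c"], -1, seen.append)
assert seen == ["a", "b", "c"]
```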

Loopy: ACS, bc, Fortran.

Cool and collected: C#, Clojure, D, Delphi (recent), ECMAScript 6, Eiffel (recent), Go, Groovy, Icon, Inform 7, Java, Julia, Logo, Lua, Nemerle, Nim, Objective-C, Perl, PHP, PostScript, Prolog, Python, R, Rust, Scala, Smalltalk, Swift, Tcl, Unix shells, Visual Basic.

Special mention: Functional languages and Lisps are laughing at the rest of us here. awk has for...in, but it doesn’t iterate arrays in order, which makes it rather less useful. JavaScript (pre ES6) has both for...in and for each...in, but both are differently broken, so you usually end up using C-style for or external iteration. BASIC has an ergonomic numeric loop, but no iteration loop. Ruby mostly uses external iteration, and its for block is actually expressed in those terms.

Switch with default fallthrough

We’ve been through this before. Wanting completely separate code per case is, by far, the most common thing to want to do. It makes no sense to have to explicitly opt out of the more obvious behavior.

Breaks my heart: Java, JavaScript.

Follows through: Ada, BASIC, CoffeeScript, Go (has a fallthrough statement), Lisps, Ruby, Swift (has a fallthrough statement), Unix shells.

Special mention: C# and D require break, but require something one way or the other — implicit fallthrough is disallowed except for empty cases. Perl 5 historically had no switch block built in, but it comes with a Switch module, and the last seven releases have had an experimental given block which I stress is still experimental. Python has no switch block. Erlang, Haskell, and Rust have pattern-matching instead (which doesn’t allow fallthrough at all).
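Since Python has no switch at all, the usual stand-in is a dict of functions, which can’t fall through by construction. The handler names here are invented for illustration:

```python
# Dispatch on a value via a dict of functions; each case is fully
# independent, so fallthrough isn't even expressible.
def handle_get():   return "fetching"
def handle_put():   return "storing"
def handle_other(): return "unknown"

handlers = {"GET": handle_get, "PUT": handle_put}

def dispatch(verb):
    return handlers.get(verb, handle_other)()

assert dispatch("GET") == "fetching"
assert dispatch("DELETE") == "unknown"
```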

Type first

int foo;

In C, this isn’t too bad. You get into problems when you remember that it’s common for type names to be all lowercase.

foo * bar;

Is that a useless expression, or a declaration? It depends entirely on whether foo is a variable or a type.

It gets a little weirder when you consider that there are type names with spaces in them. And storage classes. And qualifiers. And sometimes part of the type comes after the name.

extern const volatile _Atomic unsigned long long int * restrict foo[];

That’s not even getting into the syntax for types of function pointers, which might have arbitrary amounts of stuff after the variable name.

And then C++ came along with generics, which means a type name might also have other type names nested arbitrarily deep.

extern const volatile std::unordered_map<unsigned long long int, std::unordered_map<const long double * const, const std::vector<std::basic_string<char>>::const_iterator>> foo;

And that’s just a declaration! Imagine if there were an assignment in there too.

The great thing about static typing is that I know the types of all the variables, but that advantage is somewhat lessened if I can’t tell what the variables are.

Between type-first, function pointer syntax, Turing-complete duck-typed templates, and C++’s initialization syntax, there are several ways where parsing C++ is ambiguous or even undecidable! “Undecidable” here means that there exist C++ programs which cannot even be parsed into a syntax tree, because the same syntax means two different things depending on whether some expression is a value or a type, and that question can depend on an endlessly recursive template instantiation. (This is also a great example of ambiguity, where x * y(z) could be either an expression or a declaration.)

Contrast with, say, Rust:

let x: ... = ...;

This is easy to parse, both for a human and a computer. The thing before the colon must be a variable name, and it stands out immediately; the thing after the colon must be a type name. Even better, Rust has pretty good type inference, so the type is probably unnecessary anyway.

Of course, languages with no type declarations whatsoever are immune to this problem.

Most vexing: ACS, ALGOL, C#, D (though [] goes on the type), Fortran, Java, Perl 6.

Looks Lovely: Ada, Boo, F#, Go, Python 3 (via annotation syntax and the typing module), Rust, Swift, TypeScript.

Special mention: BASIC uses trailing type sigils to indicate scalar types.
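Python 3’s annotations are name-first in the same spirit — however hairy the type gets, the variable name is the first thing on the line (registry is an arbitrary example name):

```python
# Name first, colon, then the type; the typing module supplies the
# generic type names.
from typing import Dict, List

registry: Dict[int, List[str]] = {}
registry[1] = ["one"]
assert registry == {1: ["one"]}
```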

Weak typing

Please note: this is not the opposite of static typing. Weak typing is more about the runtime behavior of values — if I try to use a value of type T as though it were of type U, will it be implicitly converted?

C lets you assign pointers to int variables and then take square roots of them, which seems like a bad idea to me. C++ agreed and nixed this, but also introduced the ability to make your own custom types implicitly convertible to as many other types as you want.

This one is pretty clearly a spectrum, and I don’t have a clear line. For example, I don’t fault Python for implicitly converting between int and float, because int is infinite-precision and float is 64-bit, so it’s usually fine. But I’m a lot more suspicious of C, which lets you assign an int to a char without complaint. (Well, okay. Literal integers in C are ints, which poses a slight problem.)

I do count a combined addition/concatenation operator that accepts different types of arguments as a form of weak typing.

Weak: JavaScript (+), PHP, Unix shells (almost everything’s a string, but even arrays/scalars are somewhat interchangeable).

Strong: F#, Go (explicit numeric casts), Haskell, Python, Rust (explicit numeric casts).

Special mention: ACS only has integers; even fixed-point values are stored in integers, and the compiler has no notion of a fixed-point type, making it the weakest language imaginable. C++ and Scala both allow defining implicit conversions, for better or worse. Perl 5 is weak, but it avoids most of the ambiguity by having entirely separate sets of operators for string vs numeric operations. Python 2 is mostly strong, but that whole interchangeable bytes/text thing sure caused some ruckus. Tcl only has strings.
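The Python flavor of “strong”, for comparison:

```python
# int/float interconvert (deliberately), but str + int is an error
# rather than a guess.
assert 1 + 2.5 == 3.5
refused = False
try:
    "1" + 2
except TypeError:
    refused = True
assert refused
```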

Integer division

“Hey, new programmers!” you may find yourself saying. “Don’t worry, it’s just like math, see? Here’s how to use $LANGUAGE as a calculator.”

“Oh boy!” says your protégé. “Let’s see what 7 ÷ 2 is! Oh, it’s 3. I think the computer is broken.”

They’re right! It is broken. I have genuinely seen a non-trivial number of people come into #python thinking division is “broken” because of this.

To be fair, C is pretty consistent about making math operations always produce a value whose type matches one of the arguments. It’s also unclear whether such division should produce a float or a double. Inferring from context would make sense, but that’s not something C is really big on.

Quick test: 7 / 2 is 3½, not 3.

Integrous: bc, C#, D, expr, F#, Fortran, Go, OCaml, Python 2, Ruby, Rust (hard to avoid), Unix shells.

Afloat: awk (no integers), Clojure (produces a rational!), Groovy, JavaScript (no integers), Lua (no integers until 5.3), Nim, Perl 5 (no integers), Perl 6, PHP, Python 3.

Special mention: Haskell disallows / on integers. Nim, Haskell, Perl 6, Python, and probably others have separate integral division operators: div, div, div, and //, respectively.
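Python 3’s split, spelled out:

```python
# / is true division; // is explicit floor division.
assert 7 / 2 == 3.5
assert 7 // 2 == 3
assert -7 // 2 == -4   # floors, i.e. rounds toward negative infinity
```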

Bytestrings

“Strings” in C are arrays of 8-bit characters. They aren’t really strings at all, since they can’t hold the vast majority of characters without some further form of encoding. Exactly what the encoding is and how to handle it is left entirely up to the programmer. This is a pain in the ass.

Some languages caught wind of this Unicode thing in the 90s and decided to solve this problem once and for all by making “wide” strings with 16-bit characters. (Even C95 has this, in the form of wchar_t* and L"..." literals.) Unicode, you see, would never have more than 65,536 characters.

Whoops, so much for that. Now we have strings encoded as UTF-16 rather than UTF-8, so we’re paying extra storage cost and we still need to write extra code to do basic operations right. Or we forget, and then later we have to track down a bunch of wonky bugs because someone typed a 💩.

Note that handling characters/codepoints is very different from handling glyphs, i.e. the distinct shapes you see on screen. Handling glyphs doesn’t even really make sense outside the context of a font, because fonts are free to make up whatever ligatures they want. Remember “diverse” emoji? Those are ligatures of three to seven characters, completely invented by a font vendor. A programming language can’t reliably count the display length of that, especially when new combining behaviors could be introduced at any time.

Also, it doesn’t matter how you solve this problem, as long as it appears to be solved. I believe Ruby uses bytestrings, for example, but they know their own encoding, so they can be correctly handled as sequences of codepoints. Having a separate non-default type or methods does not count, because everyone will still use the wrong thing first — sorry, Python 2.

Quick test: what’s the length of “💩”? If 1, you have real unencoded strings. If 2, you have UTF-16 strings. If 4, you have UTF-8 strings. If something else, I don’t know what the heck is going on.
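Running the quick test in Python 3, which stores real codepoint strings:

```python
# U+1F4A9 PILE OF POO is outside the BMP, so the encodings disagree.
s = "\N{PILE OF POO}"
assert len(s) == 1                           # real codepoint strings
assert len(s.encode("utf-8")) == 4           # 4 bytes in UTF-8
assert len(s.encode("utf-16-le")) // 2 == 2  # a surrogate pair in UTF-16
```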

Totally bytes: Go, Lua, Python 2 (separate unicode type).

Comes up short: Java, JavaScript.

One hundred emoji: Python 3, Ruby, Rust, Swift (even gets combining characters right!).

Special mention: Go’s strings are explicitly arbitrary byte sequences, but iterating over a string with for..range decodes UTF-8 code points. Perl 5 gets the quick test right if you put use utf8; at the top of the file, but Perl 5’s Unicode support is such a confusing clusterfuck that I can’t really give it a 💯.

Hmm. This one is kind of hard to track down for sure without either knowing a lot about internals or installing fifty different interpreters/compilers.

Increment and decrement

I don’t think there are too many compelling reasons to have ++. It means the same as += 1, which is still nice and short. The only difference is that people can do stupid unreadable tricks with ++.

One exception: it is possible to overload ++ in ways that don’t make sense as += 1 — for example, C++ uses ++ to advance iterators, which may do any arbitrary work under the hood.

Double plus ungood: ACS, awk, C#, D, Go, Java, JavaScript, Perl, Unix shells, Vala.

Double plus good: Lua (which doesn’t have += either), Python, Ruby, Rust, Swift (removed in v3).

Special mention: Perl 5 and PHP both allow ++ on strings, in which case it increments letters or something, but I don’t know if much real code has ever used this.

!

A pet peeve. Spot the difference:

if (looks_like_rain()) {
    ...
}
if (!looks_like_rain()) {
    ...
}

That single ! is ridiculously subtle, which seems wrong to me when it makes an expression mean its polar opposite. Surely it should stick out like a sore thumb. The left parenthesis makes it worse, too; it blends in slightly as just noise.

It helps a bit to space after the ! in cases like this:

if (! looks_like_rain()) {
    ...
}

But this seems to be curiously rare. The easy solution is to just spell the operator not. At which point the other two might as well be and and or.
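Python is one of the languages that takes this approach, and the difference in visibility is easy to see (a trivial sketch):

```python
def looks_like_rain():
    return False

# `not` stands out where a lone `!` would blend into the parenthesis:
if not looks_like_rain():
    print("leave the umbrella at home")

# `and` and `or` read just as plainly:
if looks_like_rain() or not looks_like_rain():
    print("tautology")
```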

Interestingly enough, C95 specifies and, or, not, and some others as standard alternative spellings (as macros in the iso646.h header), though I’ve never seen them in any C code and I suspect existing projects would prefer I not use them.

Not right: ACS, awk, C#, D, Go, Groovy, Java, JavaScript, Nemerle, PHP, R, Rust, Scala, Swift, Tcl, Vala.

Spelled out: Ada, ALGOL, BASIC, COBOL, Erlang, F#, Fortran, Haskell, Inform 7, Lisps, Lua, Nim, OCaml, Pascal, PostScript, Python, Smalltalk, Standard ML.

Special mention: APL and Julia both use ~, which is at least easier to pick out, which is more than I can say for most of APL. bc and expr, which are really calculators, have no concept of Boolean operations. Forth and Icon, which are not calculators, don’t seem to either. Inform 7 often blends the negation into the verb, e.g. if the player does not have.... Perl and Ruby have both symbolic and named Boolean operators (Perl 6 has even more), with different precedence (which inside if won’t matter); I believe Perl 5 prefers the words and Ruby prefers the symbols. Perl and Ruby also both have a separate unless block, with the opposite meaning to if. Python has is not and not in operators.

Single return and out parameters

Because C can only return a single value, and that value is often an indication of failure for the sake of an if, “out” parameters are somewhat common.

double x, y;
get_point(&x, &y);

It’s not immediately clear whether x and y are input or output. Sometimes they might function as both. (And of course, in this silly example, you’d be better off returning a single point struct. Or would you use a point out parameter because returning structs is potentially expensive?)

Some languages have doubled down on this by adding syntax to declare “out” parameters, which removes the ambiguity in the function definition, but makes it worse in function calls. In the above example, using & on an argument is at least a decent hint that the function wants to write to those values. If you have implicit out parameters or pass-by-reference or whatever, that would just be get_point(x, y) and you’d have no indication that those arguments are special in any way.

The vast majority of the time, this can be expressed in a more straightforward way by returning multiple values:

x, y = get_point()

That was intended as Python, but technically, Python doesn’t have multiple returns! It seems to, but it’s really a combination of several factors: a tuple type, the ability to make a tuple literal with just commas, and the ability to unpack a tuple via multiple assignment. In the end it works just as well. Also this is a way better use of the comma operator than in C.

But the exact same code could appear in Lua, which has multiple return/assignment as an explicit feature… and no tuples. The difference becomes obvious if you try to assign the return value to a single variable instead:

point = get_point()

In Python, point would be a tuple containing both return values. In Lua, point would be the x value, and y would be silently discarded. I don’t tend to be a fan of silently throwing data away, but I have to admit that Lua makes pretty good use of this in several places for “optional” return values that the caller can completely ignore if desired. An existing function can even be extended to return more values than before — that would break callers in Python, but work just fine in Lua.
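A quick Python sketch of its half of this contrast (the Lua behavior obviously can’t be reproduced here, but the strictness of Python’s unpacking can):

```python
def get_point():
    return 1, 2            # the commas build a tuple; no parentheses needed

x, y = get_point()         # tuple unpacking: x == 1, y == 2
point = get_point()        # no unpacking: point is the whole tuple (1, 2)

# Unpacking is strict, which is why adding a third return value
# breaks Python callers where Lua callers would carry on:
try:
    a, b = 1, 2, 3
except ValueError:
    print("too many values to unpack")
```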

(Also, to briefly play devil’s advocate: I once saw Python code that returned 14 values all with very complicated values, types, and semantics. Maybe don’t do that. I think I cleaned it up to return an object, which simplified the calling code considerably too.)

It’s also possible to half-ass this. ECMAScript 6:

function get_point() {
    return [1, 2];
}

var [x, y] = get_point();

It works, but it doesn’t actually look like multiple return. The trouble is that JavaScript has C’s comma operator and C’s variable declaration syntax, so neither of the above constructs could’ve left off the brackets without significantly changing the syntax:

function get_point() {
    // Whoops!  This uses the comma operator, which evaluates to its last
    // operand, so it just returns 2
    return 1, 2;
}

// Whoops!  This is multiple declaration, where each variable gets its own "=",
// so it assigns nothing to x and the return value to y
var x, y = get_point();
// Now x is undefined and y is 2

This is still better than either out parameters or returning an explicit struct that needs manual unpacking, but it’s not as good as comma-delimited tuples. Note that some languages require parentheses around tuples (and also call them tuples), and I’m arbitrarily counting that as better than brackets.

Single return: Ada, ALGOL, BASIC, C#, COBOL, Fortran, Groovy, Java, Smalltalk.

Half-assed multiple return: C++11, D, ECMAScript 6, Erlang, PHP.

Multiple return via tuples: F#, Haskell, Julia, Nemerle, Nim, OCaml, Perl (just lists really), Python, Ruby, Rust, Scala, Standard ML, Swift, Tcl.

Native multiple return: Common Lisp, Go, Lua.

Special mention: C# has explicit syntax for out parameters, but it’s a compile-time error to not assign to all of them, which is slightly better than C. Forth is stack-based, and all return values are simply placed on the stack, so multiple return isn’t a special case. Unix shell functions don’t return values. Visual Basic sets a return value by assigning to the function’s name (?!), so good luck fitting multiple return in there.

Silent errors

Most runtime errors in C are indicated by one of two mechanisms: returning an error code, or segfaulting. Segfaulting is pretty noisy, so that’s okay, except for the exploit potential and all.

Returning an error code kinda sucks. Those tend to be important, but nothing in the language actually reminds you to check them, and of course we silly squishy humans have the habit of assuming everything will succeed at all times. Which is how I segfaulted git two days ago: I found a spot where it didn’t check for a NULL returned as an error.

There are several alternatives here: exceptions, statically forcing the developer to check for an error code, or using something monad-like to statically force the developer to distinguish between an error and a valid return value. Probably some others. In the end I was surprised by how many languages went the exception route.
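A sketch of the exception route in Python, where a failing open() can’t be silently ignored; failure interrupts execution loudly unless the caller explicitly opts in to handling it, which is the opposite of a forgettable error code:

```python
def read_config(path):
    # open() raises OSError on failure; there is no status code to forget
    with open(path) as f:
        return f.read()

try:
    config = read_config("/no/such/file")
except OSError:
    config = ""            # the fallback is an explicit, visible decision
```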

Quietly wrong: Unix shells. Wow, yeah, I’m having a hard time naming anything else. Good job, us! And even Unix shells have set -e; it’s just opt-in.

Exceptional: Ada, C++, C#, D, Erlang, Forth, Java (exceptions are even part of function signature), JavaScript, Nemerle, Nim, Objective-C, OCaml, Perl 6, Python, Ruby, Smalltalk, Standard ML, Visual Basic.

Monadic: Haskell (Either), Rust (Result).

Special mention: ACS doesn’t really have many operations that can error, and those that do simply halt the script. ALGOL apparently has something called “mending” that I don’t understand. Go tends to use secondary return values, which calling code has to unpack, making them slightly harder to forget about; it also allows both the assignment and the error check together in the header of an if. Lisps have conditions and call/cc, which are different things entirely. Lua and Perl 5 handle errors by taking down the whole program, but offer a construct that can catch that further up the stack, which is clumsy but enough to emulate try..catch. PHP has exceptions, and errors (which are totally different), and a lot of builtin functions that return error codes. Swift has something that looks like exceptions, but it doesn’t involve stack unwinding and does require some light annotation — apparently sugar for an “out” parameter holding an error. Visual Basic, and I believe some other BASICs, decided C wasn’t bad enough and introduced the bizarre On Error Resume Next construct which does exactly what it sounds like.

Nulls

The billion dollar mistake.

I think it’s considerably worse in a statically typed language like C, because the whole point is that you can rely on the types. But a double* might be NULL, which is not actually a pointer to a double; it’s a pointer to a segfault. Other kinds of bad pointers are possible, of course, but those are more an issue of memory safety; allowing any reference to be null violates type safety. The root of the problem is treating null as a possible value of any type, when really it’s its own type entirely.

The alternatives tend to be either opt-in nullability or an “optional” generic type (a monad!) which eliminates null as its own value entirely. Notably, Swift does it both ways: optional types are indicated by a trailing ?, but that’s just syntactic sugar for Option<T>.
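For a flavor of the opt-in approach, Python’s type annotations sketch it (enforced by external checkers like mypy, not the interpreter), even though Python’s runtime None remains a value that can appear anywhere:

```python
from typing import Optional

users = {1: "eevee"}

def find_user(uid: int) -> Optional[str]:
    # The annotation admits None explicitly; with a plain `str` return
    # type, a checker such as mypy would reject `return users.get(uid)`.
    return users.get(uid)

name = find_user(2)
if name is not None:       # the checker insists on this guard before .upper()
    print(name.upper())
```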

On the other hand, while it’s annoying to get a None where I didn’t expect one in Python, it’s not like I’m surprised. I occasionally get a string where I expected a number, too. The language explicitly leaves type concerns in my hands. My real objection is to having a static type system that lies. So I’m not going to list every single dynamic language here, because not only is it consistent with the rest of the type system, but they don’t really have any machinery to prevent this anyway.

Nothing doing: C#, D, Go, Java, Nim (non-nullable types are opt in).

Nullable types: Swift (sugar for a monad).

Monads: F# (Option — though technically F# also inherits null from .NET), Haskell (Maybe), Rust (Option), Swift (Optional).

Special mention: awk, Tcl, and Unix shells only have strings, so in a surprising twist, they have no concept of null whatsoever. Java recently introduced an Optional<T> type which explicitly may or may not contain a value, but since it’s still a non-primitive, it could also be null. C++17 doesn’t quite have the same problem with std::optional<T>, since non-reference values can’t be null. Inform 7’s nothing value is an object (the root of half of its type system), which means any object variable might be nothing, but any value of a more specific type cannot be nothing. JavaScript has two null values, null and undefined. Perl 6 is really big on static types, but claims its Nil object doesn’t exist, and I don’t know how to even begin to unpack that. R and SQL have a more mathematical kind of NULL, which tends to e.g. vanish from lists.

Assignment as expression

How common a mistake is this:

if (x = 3) {
    ...
}

Well, I don’t know, actually. Maybe not that common, save for among beginners. But I sort of wonder whether allowing this buys us anything. I can only think of two cases where it does. One is with something like iteration:

// Typical linked list
while (p = p->next) {
    ...
}

But this is only necessary in C in the first place because it has no first-class notion of iteration. The other is shorthand for checking that a function returned a useful value:

if (ptr = get_pointer()) {
    ...
}

But if a function returns NULL, that’s really an error condition, and presumably you have some other way to handle that too.

What does that leave? The only time I remotely miss this in Python (where it’s illegal) is when testing a regex. You tend to see this a lot instead.

m = re.match('x+y+z+', some_string)
if m:
    ...

re treats failure as an acceptable possibility and returns None, rather than raising an exception. I’m not sure whether this was the right thing to do or not, but off the top of my head I can’t think of too many other Python interfaces that sometimes return None.
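Worth noting: Python 3.8 (released well after this post) added a narrow carve-out for exactly this pattern, the separate := operator, which can bind a value inside an expression while plain = remains a statement and a syntax error inside if:

```python
import re

# Python 3.8+: `:=` binds inside the condition, while `=` there is
# still a syntax error, so the classic `if (x = 3)` typo stays impossible.
if m := re.match(r'x+y+z+', 'xxyyzz'):
    print(m.group())       # xxyyzz
```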

Freedom of expression: ACS, C#, Java, JavaScript, Perl, PHP, Swift.

Makes a statement: Inform 7, Lua, Python, Unix shells.

Special mention: BASIC uses = for both assignment and equality testing — the meaning is determined from context. D allows variable declaration as an expression, so if (int x = 3) is allowed, but regular assignment is not. Functional languages generally don’t have an assignment operator. Go disallows assignment as an expression, but assignment and a test can appear together in an if condition, and this is an idiomatic way to check success. Ruby makes everything an expression, so assignment might as well be too. Rust makes everything an expression, but assignment evaluates to the useless () value (due to ownership rules), so it’s not actually useful. Rust and Swift both have a special if let block that explicitly combines assignment with pattern matching, which is way nicer than the C approach.

No hyphens in identifiers

snake_case requires dancing on the shift key (unless you rearrange your keyboard, which is perfectly reasonable). It slows you down slightly and leads to occasional mistakes like snake-Case. The alternative is dromedaryCase, which is objectively wrong and doesn’t actually solve this problem anyway.

Why not just allow hyphens in identifiers, so we can avoid this argument and use kebab-case?

Ah, but then it’s ambiguous whether you mean an identifier or the subtraction operator. No problem: require spaces for subtraction. I don’t think preserving a tiny way to make your code harder to read is really worth giving up this clear advantage.

Low scoring: ACS, C#, D, Java, JavaScript, OCaml, Pascal, Perl 5, PHP, Python, Ruby, Rust, Swift, Unix shells.

Nicely-designed: COBOL, CSS (and thus Sass), Forth, Inform 7, Lisps, Perl 6, XML.

Special mention: Perl has a built-in variable called $-, and Ruby has a few called $-n for various values of “n”, but these are very special cases.

Braces and semicolons

Okay. Hang on. Bear with me.

C code looks like this.

some block header {
    line 1;
    line 2;
    line 3;
}

The block is indicated two different ways here. The braces are for the compiler; the indentation is for humans.

Having two different ways to say the same thing means they can get out of sync. They can disagree. And that can be, as previously mentioned, really bad. This is really just a more general form of the problem of optional block delimiters.

The only solution is to eliminate one of the two. Programming languages exist for the benefit of humans, so we obviously can’t get rid of the indentation. Thus, we should get rid of the braces. QED.

As an added advantage, we reclaim all the vertical space wasted on lines containing only a }, and we can stop squabbling about where to put the {.

If you accept this, you might start to notice that there are also two different ways of indicating where a line ends: with semicolons for the compiler, and with vertical whitespace for humans. So, by the same reasoning, we should lose the semicolons.

Right? Awesome. Glad we’re all on the same page.

Some languages use keywords instead of braces, but the effect is the same. I’m not aware of any languages that use keywords instead of semicolons.

Bracing myself: C#, D, Erlang, Java, Perl, Rust.

Braces, but no semicolons: Go (ASI), JavaScript (ASI — see below), Lua, Ruby, Swift.

Free and clear: CoffeeScript, Haskell, Python.

Special mention: Lisp, just, in general. Inform 7 has an indented style, but it still requires semicolons. MUMPS doesn’t support nesting at all, but I believe there are extensions that use dots to indicate it.

Here’s some interesting trivia. JavaScript, Lua, and Python all optionally allow semicolons at the end of a statement, but the way each language determines line continuation is very different.

JavaScript takes an “opt-out” approach: it continues reading lines until it hits a semicolon, or until reading the next line would cause a syntax error. (This approach is called automatic semicolon insertion.) That leaves a few corner cases like starting a new line with a (, which could look like the last thing on the previous line is a function you’re trying to call. Or you could have -foo on its own line, and it would parse as subtraction rather than unary negation. You might wonder why anyone would do that, but using unary + is one way to make function parse as an expression rather than a statement! I’m not so opposed to semicolons that I want to be debugging where the language thinks my lines end, so I just always use semicolons in JavaScript.

Python takes an “opt-in” approach: it assumes, by default, that a statement ends at the end of a line. However, newlines inside parentheses or brackets are ignored, which takes care of 99% of cases — long lines are most frequently caused by function calls (which have parentheses!) with a lot of arguments. If you really need it, you can explicitly escape a newline with \\, but this is widely regarded as incredibly ugly.
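Both rules in one Python sketch:

```python
# Newlines inside brackets are ignored, so this is one statement:
total = (1 + 2 +
         3 + 4)

# Outside brackets, a trailing backslash continues the line,
# but it's widely considered ugly:
also_total = 1 + 2 + \
             3 + 4

assert total == also_total == 10
```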

Lua avoids the problem almost entirely. I believe Lua’s grammar is designed such that it’s almost always unambiguous where a statement ends, even if you have no newlines at all. This has a few weird side effects: void expressions are syntactically forbidden in Lua, for example, so you just can’t have -foo as its own statement. Also, you can’t have code immediately following a return, because it’ll be interpreted as a return value. The upside is that Lua can treat newlines just like any other whitespace, but still not need semicolons. In fact, semicolons aren’t statement terminators in Lua at all — they’re their own statement, which does nothing. Alas, not for lack of trying, Lua does have the same ( ambiguity as JavaScript (and parses it the same way), but I don’t think any of the others exist.

Oh, and the colons that Python has at the end of its block headers, like if foo:? As far as I can tell, they serve no syntactic purpose whatsoever. Purely aesthetic.

Blaming the programmer

Perhaps one of the worst misfeatures of C is the ease with which responsibility for problems can be shifted to the person who wrote the code. “Oh, you segfaulted? I guess you forgot to check for NULL.” If only I had a computer to take care of such tedium for me!

Clearly, computers can’t be expected to do everything for us. But they can be expected to do quite a bit. Programming languages are built for humans, and they ought to eliminate the sorts of rote work humans are bad at whenever possible. A programmer is already busy thinking about the actual problem they want to solve; it’s no surprise that they’ll sometimes forget some tedious detail the language forces them to worry about.

So if you’re designing a language, don’t just copy C. Don’t just copy C++ or Java. Hell, don’t even just copy Python or Ruby. Consider your target audience, consider the problems they’re trying to solve, and try to get as much else out of the way as possible. If the same “mistake” tends to crop up over and over, look for a way to modify the language to reduce or eliminate it. And be sure to look at a lot of languages for inspiration — even ones you hate, even weird ones no one uses! A lot of clever people have had a lot of other ideas in the last 44 years.


I hope you enjoyed this accidental cross-reference of several dozen languages! I enjoyed looking through them all, though it was incredibly time-consuming. Some of them look pretty interesting; maybe give them a whirl.

Also, dammit, now I’m thinking about language design again.

AWS Greengrass – Ubiquitous, Real-World Computing

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-greengrass-ubiquitous-real-world-computing/

Computing and data processing within the confines of a data center or office is easy. You can generally count on good connectivity and a steady supply of electricity, and you have access to as much on-premises or cloud-based storage and compute power as you need. The situation is much different out in the real world. Connectivity can be intermittent, unreliable, and limited in speed and scale. Power is at a premium, putting limits on how much storage and compute power can be brought in to play.

Lots of interesting and potentially valuable data is present out in the field, if only it could be collected, processed, and turned into actionable intelligence. This data could be located miles below the surface of the Earth in a mine or an oil well, in a sensitive & safety-critical location like a hospital or a factory, or even on another planet (hello, Curiosity).

Our customers are asking for a way to use the scale and power of the AWS Cloud to help them to do local processing under these trying conditions. First, they want to build systems that measure, sense, and act upon the data locally. Then they want to bring cloud-like, local intelligence to bear on the data, implementing local actions that are interdependent and coordinated. To make this even more challenging, they want to make use of any available local processing and storage resources, while also connecting to specialized sensors and peripherals.

Introducing AWS Greengrass
I’d like to tell you about AWS Greengrass. This new service is designed to allow you to address the challenges that I outlined above by extending the AWS programming model to small, simple, field-based devices.

Greengrass builds on AWS IoT and AWS Lambda, and can also access other AWS services. It is built for offline operation and greatly simplifies the implementation of local processing. Code running in the field can collect, filter, and aggregate freshly collected data and then push it up to the cloud for long-term storage and further aggregation. Further, code running in the field can also take action very quickly, even in cases where connectivity to the cloud is temporarily unavailable.

If you are already developing embedded systems for small devices, you will now be able to make use of modern, cloud-aware development tools and workflows. You can write and test your code in the cloud and then deploy it locally. You can write Python code that responds to device events and you can make use of MQTT-based pub/sub messaging for communication.

Greengrass has two constituent parts, the Greengrass Core (GGC) and the IoT Device SDK. Both of these components run on your own hardware, out in the field.

Greengrass Core is designed to run on devices that have at least 128 MB of memory and an x86 or ARM CPU running at 1 GHz or better, and can take advantage of additional resources if available. It runs Lambda functions locally, interacts with the AWS Cloud, manages security & authentication, and communicates with the other devices under its purview.

The IoT Device SDK is used to build the applications that run on the devices that connect to the device that hosts the core (generally via a LAN or other local connection). These applications will capture data from sensors, subscribe to MQTT topics, and use AWS IoT device shadows to store and retrieve state information.

Using AWS Greengrass
You will be able to set up and manage many aspects of Greengrass through the AWS Management Console, the AWS APIs, and the AWS Command Line Interface (CLI).

You will be able to register new hub devices, configure the desired set of Lambda functions, and create a deployment package for delivery to the device. From there, you will associate the lightweight devices with the hub.

Now in Preview
We are launching a limited preview of AWS Greengrass today and you can sign up now if you would like to participate.

Each AWS customer will be able to use up to 3 devices for one year at no charge. Beyond that, the monthly cost for each active Greengrass Core is $0.16 ($1.49 per year) for up to 10,000 devices.

Jeff;

 

Amazon Rekognition – Image Detection and Recognition Powered by Deep Learning

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-rekognition-image-detection-and-recognition-powered-by-deep-learning/

What do you see when you look at this picture?

You might simply see an animal. Maybe you see a pet, a dog, or a Golden Retriever. The association between the image and these labels is not hard-wired into your brain. Instead, you learned the labels after seeing hundreds or thousands of examples. Operating on a number of different levels, you learned to distinguish an animal from a plant, a dog from a cat, and a Golden Retriever from other dog breeds.

Deep Learning for Image Detection
Giving computers the same level of comprehension has proven to be a very difficult task. Over the course of decades, computer scientists have taken many different approaches to the problem. Today, a broad consensus has emerged that the best way to tackle this problem is via deep learning. Deep learning uses a combination of feature abstraction and neural networks to produce results that can be (as Arthur C. Clarke once said) indistinguishable from magic. However, it comes at a considerable cost. First, you need to put a lot of work into the training phase. In essence, you present the learning network with a broad spectrum of labeled examples (“this is a dog”, “this is a pet”, and so forth) so that it can correlate features in the image with the labels. This phase is computationally expensive due to the size and the multi-layered nature of the neural networks. After the training phase is complete, evaluating new images against the trained network is far easier. The results are traditionally expressed in confidence levels (0 to 100%) rather than as cold, hard facts. This allows you to decide just how much precision is appropriate for your applications.

Introducing Amazon Rekognition
Today I would like to tell you about Amazon Rekognition. Powered by deep learning and built by our Computer Vision team over the course of many years, this fully-managed service already analyzes billions of images daily. It has been trained on thousands of objects and scenes, and is now available for you to use in your own applications. You can use the Rekognition Demos to put the service through its paces before diving in and writing code that uses the Rekognition API.

Rekognition was designed from the get-go to run at scale. It comprehends scenes, objects, and faces. Given an image, it will return a list of labels. Given an image with one or more faces, it will return bounding boxes for each face, along with attributes. Let’s see what it has to say about the picture of my dog (her name is Luna, by the way):

As you can see, Rekognition labeled Luna as an animal, a dog, a pet, and as a golden retriever with a high degree of confidence. It is important to note that these labels are independent, in the sense that the deep learning model does not explicitly understand the relationship between, for example, dogs and animals. It just so happens that both of these labels were simultaneously present on the dog-centric training material presented to Rekognition.

Let’s see how it does with a picture of my wife and me:

Amazon Rekognition found our faces, set up bounding boxes, and let me know that my wife was happy (the picture was taken on her birthday, so I certainly hope she was).

You can also use Rekognition to compare faces and to see if a given image contains any one of a number of faces that you have asked it to recognize.

All of this power is accessible from a set of API functions (the console is great for quick demos). For example, you can call DetectLabels to programmatically reproduce my first example, or DetectFaces to reproduce my second one. You can make multiple calls to IndexFaces to prepare Rekognition to recognize some faces. Each time you do this, Rekognition extracts some features (known as face vectors) from the image, stores the vectors, and discards the image. You can create one or more Rekognition collections and store related groups of face vectors in each one.

Rekognition can directly process images stored in Amazon Simple Storage Service (S3). In fact, you can use AWS Lambda functions to process newly uploaded photos at any desired scale. You can use AWS Identity and Access Management (IAM) to control access to the Rekognition APIs.

Applications for Rekognition
So, what can you use this for? I’ve got plenty of ideas to get you started!

If you have a large collection of photos, you can tag and index them using Amazon Rekognition. Because Rekognition is a service, you can process millions of photos per day without having to worry about setting up, running, or scaling any infrastructure. You can implement visual search, tag-based browsing, and all sorts of interactive discovery models.

You can use Rekognition in several different authentication and security contexts. You can compare a face on a webcam to a badge photo before allowing an employee to enter a secure zone. You can perform visual surveillance, inspecting photos for objects or people of interest or concern.

You can build “smart” marketing billboards that collect demographic data about viewers.

Now Available
Rekognition is now available in the US East (Northern Virginia), US West (Oregon), and EU (Ireland) Regions and you can start using it today. As part of the AWS Free Tier, you can analyze up to 5,000 images per month and store up to 1,000 face vectors each month for an entire year. After that (and at higher volume), you will pay tiered pricing based on the number of images that you analyze and the number of face vectors that you store.

Jeff;

 

Weekly roundup: Arguing on the internet

Post Syndicated from Eevee original https://eev.ee/dev/2016/11/28/weekly-roundup-arguing-on-the-internet/

November is improving.

  • zdoom: On a whim I went back to playing with my experiment of embedding Lua in ZDoom. I tracked down a couple extremely subtle and frustrating bugs with the Lua binding library I was using, and I made it possible to define new actor classes from Lua and delay a script executed from a switch. Those were two huge requirements, so that’s pretty okay progress.

  • blog: Zed Shaw said some appalling nonsense and I felt compelled to correct it. I also wrote about some of the interesting issues with that whole Lua-in-ZDoom thing. I have a final post half-done as well.

  • art: I did a few pixels, which you may or may not be seeing in the near future.

Mostly writing and Pokémon, then. But I’m clearing my plate behind the scenes. Stuff’s a-brewin’.

Embedding Lua in ZDoom

Post Syndicated from Eevee original https://eev.ee/blog/2016/11/26/embedding-lua-in-zdoom/

I’ve spent a little time trying to embed a Lua interpreter in ZDoom. I didn’t get too far yet; it’s just an experimental thing I poke at every once in a while. The existing pile of constraints makes it an interesting problem, though.

Background

ZDoom is a “source port” (read: fork) of the Doom engine, with all the changes from the commercial forks merged in (mostly Heretic, Hexen, Strife), and a lot of internal twiddles exposed. It has a variety of mechanisms for customizing game behavior; two are major standouts.

One is ACS, a vaguely C-ish language inherited from Hexen. It’s mostly used to automate level behavior — at the simplest, by having a single switch perform multiple actions. It supports the usual loops and conditionals, it can store data persistently, and ZDoom exposes a number of functions to it for inspecting and altering the state of the world, so it can do some neat tricks. Here’s an arbitrary script from my DUMP2 map.

script "open_church_door" (int tag)
{
    // Open the door more quickly on easier skill levels, so running from the
    // arch-vile is a more viable option
    int skill = GameSkill();
    int speed;
    if (skill < SKILL_NORMAL)
        speed = 64;  // blazing door speed
    else if (skill == SKILL_NORMAL)
        speed = 16;  // normal door speed
    else
        speed = 8;  // very dramatic door speed

    Door_Raise(tag, speed, 68);  // double usual delay
}

However, ZDoom doesn’t actually understand the language itself; ACS is compiled to bytecode. There’s even at least one alternative language that compiles to the same bytecode, which is interesting.

The other big feature is DECORATE, a mostly-declarative mostly-interpreted language for defining new kinds of objects. It’s a fairly direct reflection of how Doom actors are implemented, which is in terms of states. In Doom and the other commercial games, actor behavior was built into the engine, but this language has allowed almost all actors to be extracted as text files instead. For example, the imp is implemented partly as follows:

  States
  {
  Spawn:
    TROO AB 10 A_Look
    Loop
  See:
    TROO AABBCCDD 3 A_Chase
    Loop
  Melee:
  Missile:
    TROO EF 8 A_FaceTarget
    TROO G 6 A_TroopAttack
    Goto See
  ...
  }

TROO is the name of the imp’s sprite “family”. A, B, and so on are individual frames. The numbers are durations in tics (35 per second). All of the A_* things (which are optional) are action functions, behavioral functions (built into the engine) that run when the actor switches to that frame. An actor starts out at its Spawn state, so an imp behaves as follows:

  • Spawn. Render as TROO frame A. (By default, action functions don’t run on the very first frame they’re spawned.)
  • Wait 10 tics.
  • Change to TROO frame B. Run A_Look, which checks to see if a player is within line of sight, and if so jumps to the See state.
  • Wait 10 tics.
  • Repeat. (This time, frame A will also run A_Look, since the imp was no longer just spawned.)

All monster and item behavior is one big state table. Even the player’s own weapons work this way, which becomes very confusing — at some points a weapon can be running two states simultaneously. Oh, and there’s A_CustomMissile for monster attacks but A_FireCustomMissile for weapon attacks, and the arguments are different, and if you mix them up you’ll get extremely confusing parse errors.

It’s a little bit of a mess. It’s fairly flexible for what it is, and has come a long way — for example, even original Doom couldn’t pass arguments to action functions (since they were just function pointers), so it had separate functions like A_TroopAttack for every monster; now that same function can be written generically. People have done some very clever things with zero-delay frames (to run multiple action functions in a row) and storing state with dummy inventory items, too. Still, it’s not quite a programming language, and it’s easy to run into walls and bizarre quirks.

When DECORATE lets you down, you have one interesting recourse: to call an ACS script!

Unfortunately, ACS also has some old limitations. The only type it truly understands is int, so you can’t manipulate an actor directly or even store one in a variable. Instead, you have to work with TIDs (“thing IDs”). Every actor has a TID (zero is special-cased to mean “no TID”), and most ACS actor-related functions are expressed in terms of TIDs. For level automation, this is fine, and probably even what you want — you can dump a group of monsters in a map, give them all a TID, and then control them as a group fairly easily.

But if you want to use ACS to enhance DECORATE, you have a bit of a problem. DECORATE defines individual actor behavior. Also, many DECORATE actors are designed independently of a map and intended to be reusable anywhere. DECORATE should thus not touch TIDs at all, because they’re really the map‘s concern, and mucking with TIDs might break map behavior… but ACS can’t refer to actors any other way. A number of action functions can, but you can’t call action functions from ACS, only DECORATE. The workarounds for this are not pretty, especially for beginners, and they’re very easy to silently get wrong.

Also, ultimately, some parts of the engine are just not accessible to either ACS or DECORATE, and neither language is particularly amenable to having them exposed. Adding more native types to ACS is rather difficult without making significant changes to both the language and bytecode, and DECORATE is barely a language at all.

Some long-awaited work is finally being done on a “ZScript”, which purports to solve all of these problems by expanding DECORATE into an entire interpreted-C++-ish scripting language with access to tons of internals. I don’t know what I think of it, and it only seems to half-solve the problem, since it doesn’t replace ACS.

Trying out Lua

Lua is supposed to be easy to embed, right? That’s the one thing it’s famous for. Before ZScript actually started to materialize, I thought I’d take a little crack at embedding a Lua interpreter and exposing some API stuff to it.

It’s not very far along yet, but it can do one thing that’s always been completely impossible in both ACS and DECORATE: print out the player’s entire inventory. You can check how many of a given item the player has in either language, but neither has a way to iterate over a collection. In Lua, it’s pretty easy.

function lua_test_script(activator, ...)
    for item, amount in pairs(activator.inventory) do
        -- This is Lua's builtin print(), so it goes to stdout
        print(item.class.name, amount)
    end
end

I made a tiny test map with a switch that tries to run the ACS script named lua_test_script. I hacked the name lookup to first look for the name in Lua’s global scope; if the function exists, it’s called immediately, and ACS isn’t consulted at all. The code above is just a regular (global) function in a regular Lua file, embedded as a lump in the map. So that was a good start, and was pretty neat to see work.

Writing the bindings

I used the bare Lua API at first. While its API is definitely very simple, actually using it to define and expose a large API in practice is kind of repetitive and error-prone, and I was never confident I was doing it quite right. It’s plain C and it works entirely through stack manipulation and it relies on a lot of casting to/from void*, so virtually anything might go wrong at any time.

I was on the cusp of writing a bunch of gross macros to automate the boring parts, and then I found sol2, which is pretty great. It makes heavy use of basically every single C++11 feature, so it’s a nightmare when it breaks (and I’ve had to track down a few bugs), but it’s expressive as hell when it works:

lua.new_usertype<AActor>("zdoom.AActor",
    "__tostring", [](AActor& actor) { return "<actor>"; },
    // Pointer to an unbound method.  Sol automatically makes this an attribute
    // rather than a method because it takes no arguments, then wraps its
    // return value to pass it back to Lua, no manual wrapper code required.
    "class", &AActor::GetClass,
    "inventory", sol::property([](AActor& actor) -> ZLuaInventory { return ZLuaInventory(actor); }),
    // Pointers to unbound attributes.  Sol turns these into writable
    // attributes on the Lua side.
    "health", &AActor::health,
    "floorclip", &AActor::Floorclip,
    "weave_index_xy", &AActor::WeaveIndexXY,
    "weave_index_z", &AActor::WeaveIndexZ);

This is the type of the activator argument from the script above. It works via template shenanigans, so most of the work is done at compile time. AActor has a lot of properties of various types; wrapping them with the bare Lua API would’ve been awful, but wrapping them with Sol is fairly straightforward.

Lifetime

activator.inventory is a wrapper around a ZLuaInventory object, which I made up. It’s just a tiny proxy struct that tries to represent the inventory of a particular actor, because the engine itself doesn’t quite have such a concept — an actor’s “inventory” is a single item (itself an actor), and each item has a pointer to the next item in the inventory. Creating an intermediate type lets me hide that detail from Lua and pretend the inventory is a real container.

The inventory is thus not a real table; pairs() works on it because it provides the __pairs metamethod. It calls an iter method returning a closure, per Lua’s iteration API, which Sol makes just work:

struct ZLuaInventory {
    ...
    std::function<AInventory* ()> iter()
    {
        TObjPtr<AInventory> item = this->actor->Inventory;
        return [item]() mutable {
            AInventory* ret = item;
            if (ret)
                item = ret->NextInv();
            return ret;
        };
    }
};

C++’s closures are slightly goofy and it took me a few tries to land on this, but it works.
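In pure Lua terms (assuming Lua 5.2+, where the __pairs metamethod exists), the same trick looks something like this hypothetical sketch — a proxy whose iteration walks a linked list instead of a real table:

```lua
-- Hypothetical model of the inventory proxy: pairs() on the proxy is driven
-- by a next-pointer chain rather than actual table keys.
local function make_inventory(first_item)
    local inv = {}
    setmetatable(inv, {
        __pairs = function()
            local item = first_item
            -- Return an iterator closure, per Lua's iteration protocol
            return function()
                local ret = item
                if ret then
                    item = ret.next
                    return ret.name, ret.amount
                end
            end
        end,
    })
    return inv
end

-- Example chain: Medikit -> Shell (names made up for illustration)
local shells = { name = "Shell", amount = 20 }
local medikit = { name = "Medikit", amount = 1, next = shells }
for name, amount in pairs(make_inventory(medikit)) do
    print(name, amount)  -- Medikit 1, then Shell 20
end
```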

Well, sort of.

I don’t know how I got this idea in my head, but I was pretty sure that ZDoom’s TObjPtr did reference counting and would automatically handle the lifetime problems in the above code. Eventually Lua reaps the closure, then C++ reaps the closure, then the wrapped AInventorys refcount drops, and all is well.

Turns out TObjPtr doesn’t do reference counting. Rather, all the game objects participate in tracing garbage collection. The basic idea is to start from some root object and recursively traverse all the objects reachable from that root; whatever isn’t reached is garbage and can be deleted.

Unfortunately, the Lua interpreter is not reachable from ZDoom’s own object tree. If an object ends up only being held by Lua, ZDoom will think it’s garbage and delete it prematurely, leaving a dangling reference. Those are bad.

I think I can fix without too much trouble. Sol allows customizing how it injects particular types, so I can use that for the type tree that participates in this GC scheme and keep an unordered_set of all objects that are alive in Lua. The Lua interpreter itself is already wrapped in an object that participates in the GC, so when the GC descends to the wrapper, it’s easy to tell it that that set of objects is alive. I’ll probably need to figure out read/write barriers, too, but I haven’t looked too closely at how ZDoom uses those yet. I don’t know whether it’s possible for an object to be “dead” (as in no longer usable, not just 0 health) before being reaped, but if so, I’ll need to figure out something there too.

It’s a little ironic that I have to do this weird workaround when ZDoom’s tracing garbage collector is based on… Lua’s.

ZDoom does have types I want to expose that aren’t garbage collected, but those are all map structures like sectors, which are never created or destroyed at runtime. I will have to be careful with the Lua interpreter itself to make sure those can’t live beyond the current map, but I haven’t really dealt with map changes at all yet. The ACS approach is that everything is map-local, and there’s some limited storage for preserving values across maps; I could do something similar, perhaps only allowing primitive scalars.

Asynchronicity

Another critical property of ACS scripts is that they can pause themselves. They can either wait for a set number of tics with delay(), or wait for map geometry to stop being busy with something like tagwait(). So you can raise up some stairs, wait for the stairs to finish appearing, and then open the door they lead to. Or you can simulate game rules by running a script in an infinite loop that waits for a few tics between iterations. It’s pretty handy. It’s incredibly handy. It’s non-negotiable.

Luckily, Lua can emulate this using coroutines. I implemented the delay case yesterday:

function lua_test_script(activator, ...)
    zprint("hey it's me what's up", ...)
    coroutine.yield("delay", 70)
    zprint("i'm back again")
end

When I press the switch, I see the first message, then there’s a two-second pause (Doom runs at 35 tics per second), then I see the second message.

A lot more details need to be hammered out before this is really equivalent to what ACS can do, but the basic functionality is there. And since these are full-stack coroutines, I can trivially wrap that yield gunk in a delay(70) function, so you never have to know the difference.
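For illustration, the host side of such a scheme can be sketched in plain Lua. Everything here — the task list, the delay() wrapper, the run_tic() driver — is hypothetical and far simpler than what the engine would actually need, but the coroutine mechanics are standard:

```lua
-- Hypothetical tic-based scheduler: scripts yield ("delay", n) and the host
-- resumes them n tics later.
local tasks = {}        -- each entry: { co = coroutine, wake = tic }
local current_tic = 0

-- Wrap the yield gunk so scripts just call delay(70)
function delay(tics)
    coroutine.yield("delay", tics)
end

local function queue(co, verb, arg)
    if verb == "delay" then
        tasks[#tasks + 1] = { co = co, wake = current_tic + arg }
    end
end

function start_script(fn, ...)
    local co = coroutine.create(fn)
    local ok, verb, arg = coroutine.resume(co, ...)
    if ok then queue(co, verb, arg) end
end

-- Called by the engine once per game tic
function run_tic()
    current_tic = current_tic + 1
    for i = #tasks, 1, -1 do
        local task = tasks[i]
        if current_tic >= task.wake then
            table.remove(tasks, i)
            local ok, verb, arg = coroutine.resume(task.co)
            if ok then queue(task.co, verb, arg) end
        end
    end
end
```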

Determinism

ZDoom has demos and peer-to-peer multiplayer. Both features rely critically on the game state’s unfolding exactly the same way, given the same seed and sequence of inputs.

ACS goes to great lengths to preserve this. It executes deterministically. It has very, very few ways to make decisions based on anything but the current state of the game. Netplay and demos just work; modders and map authors never have to think about it.

I don’t know if I can guarantee the same about Lua. I’d think so, but I don’t know so. Will the order of keys in a table be exactly the same on every system, for example? That’s important! Even the ACS random-number generator is deterministic.

I hope this is the case. I know some games, like Starbound, implicitly assume for multiplayer purposes that scripts will execute the same way on every system. So it’s probably fine. I do wish Lua made some sort of guarantee here, though, especially since it’s such an obvious and popular candidate for game scripting.
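The specific worry is easy to demonstrate in stock Lua: iteration order over hash keys is unspecified and may differ between builds or runs, so a script that needs determinism has to impose an order itself. One conventional workaround:

```lua
-- pairs() order over hash keys is unspecified; sorting the keys first gives
-- every machine the same iteration order.  (Monster names are just examples.)
local t = { imp = 1, demon = 2, cacodemon = 3 }

local keys = {}
for k in pairs(t) do
    keys[#keys + 1] = k
end
table.sort(keys)

for _, k in ipairs(keys) do
    print(k, t[k])  -- always cacodemon, demon, imp, on every system
end
```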

Savegames

ZDoom allows you to quicksave at any time.

Any time.

Not while a script is running, mind you. Script execution blocks the gameplay thread, so only one thing can actually be happening at a time. But what happens if you save while a script is in the middle of a tagwait?

The coroutine needs to be persisted, somehow. More importantly, when the game is loaded, the coroutine needs to be restored to the same state: paused in the same place, with locals set to the same values. Even if those locals were wrapped pointers to C++ objects, which now have different addresses.

Vanilla Lua has no way to do this. Vanilla Lua has a pretty poor serialization story overall — nothing is built in — which is honestly kind of shocking. People use Lua for games, right? Like, a lot? How is this not an extremely common problem?

A potential solution exists in the form of Eris, a modified Lua that does all kinds of invasive things to allow absolutely anything to be serialized. Including coroutines!

So Eris makes this at least possible. I haven’t made even the slightest attempt at using it yet, but a few gotchas already stand out to me.

For one, Eris serializes everything. Even regular ol’ functions are serialized as Lua bytecode. A naïve approach would thus end up storing a copy of the entire game script in the save file.

Eris has a thing called the “permanent object table”, which allows giving names to specific Lua values. Those values are then serialized by name instead, and the names are looked up in the same table to deserialize. So I could walk the Lua namespace myself after the initial script load and stick all reachable functions in this table to avoid having them persisted. (That won’t catch if someone loads new code during play, but that sounds like a really bad idea anyway, and I’d like to prevent it if possible.) I have to do this to some extent anyway, since Eris can’t persist the wrapped C++ functions I’m exposing to Lua. Even if a script does some incredibly fancy dynamic stuff to replace global functions with closures at runtime, that’s okay; they’ll be different functions, so Eris will fall back to serializing them.
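As a sketch of that idea (the function names and table layout here are hypothetical, not Eris API), walking the global namespace and assigning each reachable function a stable name might look like:

```lua
-- Hypothetical: after the initial script load, map every function reachable
-- from _G to a dotted path, so a persistence layer can store names instead of
-- serializing bytecode for the whole game script.
local function collect_permanents(root, prefix, perms, seen)
    perms = perms or {}
    seen = seen or { [root] = true }
    for name, value in pairs(root) do
        if type(name) == "string" then
            local path = prefix and (prefix .. "." .. name) or name
            if type(value) == "function" then
                perms[value] = path
            elseif type(value) == "table" and not seen[value] then
                seen[value] = true
                collect_permanents(value, path, perms, seen)
            end
        end
    end
    return perms
end

local perms = collect_permanents(_G)
-- perms now maps e.g. the print function to the string "print"
```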

Then when the save is reloaded, Eris will replace any captured references to a global function with the copy that already exists in the map script. ZDoom doesn’t let you load saves across different mods, so the functions should be the same. I think. Hmm, maybe I should check on exactly what the load rules are. If you can load a save against a more recent copy of a map, you’ll want to get its updated scripts, but stored closures and coroutines might be old versions, and that is probably bad. I don’t know if there’s much I can do about that, though, unless Eris can somehow save the underlying code from closures/coros as named references too.

Eris also has a mechanism for storing wrapped native objects, so all I have to worry about is translating pointers, and that’s a problem Doom has already solved (somehow). Alas, that mechanism is also accessible to pure Lua code, and the docs warn that it’s possible to get into an infinite loop when loading. I’d rather not give modders the power to fuck up a save file, so I’ll have to disable that somehow.

Finally, since Eris loads bytecode, it’s possible to do nefarious things with a specially-crafted save file. But a save file is already full of a web of pointers, so I suspect it’s not too hard to segfault the game with a crafted save regardless. I’ll need to look into this. Or maybe I won’t, since I don’t seriously expect this to be merged in.

Runaway scripts

Speaking of which, ACS currently has detection for “runaway scripts”, i.e. those that look like they might be stuck in an infinite loop (or are just doing a ludicrous amount of work). Since scripts are blocking, the game does not actually progress while a script is running, and a very long script would appear to freeze the game.

I think ACS does this by counting instructions. I see Lua has its own mechanism for doing that, so limiting script execution “time” shouldn’t be too hard.
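A minimal sketch of that mechanism, using Lua’s standard debug.sethook with an instruction-count hook (the limit and the error handling are made up; counts are only approximate):

```lua
-- Abort a script coroutine that executes too many VM instructions, roughly
-- like ACS's runaway-script detection.
local INSTRUCTION_LIMIT = 1000000  -- arbitrary example limit

local function run_limited(fn, ...)
    local co = coroutine.create(fn)
    -- Empty event mask, but a count: the hook fires every N instructions
    debug.sethook(co, function()
        error("script exceeded instruction limit", 2)
    end, "", INSTRUCTION_LIMIT)
    -- resume() returns false plus the error message if the hook fires
    return coroutine.resume(co, ...)
end
```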

Defining new actors

I want to be able to use Lua with (or instead of) DECORATE, too, but I’m a little hung up on syntax.

I do have something slightly working — I was able to create a variant imp class with a bunch more health from Lua, then spawn it and fight it. Also, I did it at runtime, which is probably bad — I don’t know that there’s any way to destroy an actor class, so having them be map-scoped makes no sense.

That could actually pose a bit of a problem. The Lua interpreter should be scoped to a single map, but actor classes are game-global. Do they live in separate interpreters? That seems inconvenient. I could load the game-global stuff, take an internal-only snapshot of the interpreter with Lua (bytecode and all), and then restore it at the beginning of each level? Hm, then what happens if you capture a reference to an actor method in a save file…? Christ.

I could consider making the interpreter global and doing black magic to replace all map objects with nil when changing maps, but I don’t think that can possibly work either. ZDoom has hubs — levels that can be left and later revisited, preserving their state just like with a save — and that seems at odds with having a single global interpreter whose state persists throughout the game.

Er, anyway. So, the problem with syntax is that DECORATE’s own syntax is extremely compact and designed for its very specific goal of state tables. Even ZScript appears to preserve the state table syntax, though it lets you write your own action functions or just provide a block of arbitrary code. Here’s a short chunk of the imp implementation again, for reference.

  States
  {
  Spawn:
    TROO AB 10 A_Look
    Loop
  See:
    TROO AABBCCDD 3 A_Chase
    Loop
  ...
  }

Some tricky parts that stand out to me:

  • Labels are important, since these are state tables, and jumping to a particular state is very common. It’s tempting to use Lua coroutines here somehow, but short of using a lot of goto in Lua code (yikes!), jumping around arbitrarily doesn’t work. Also, it needs to be possible to tell an actor to jump to a particular state from outside — that’s how A_Look works, and there’s even an ACS function to do it manually.

  • Aside from being shorthand, frames are fine. Though I do note that hacks like AABBCCDD 3 are relatively common. The actual animation that’s wanted here is ABCD 6, but because animation and behavior are intertwined, the frames need to be repeated to run the action function more often. I wonder if it’s desirable to be able to separate display and behavior?

  • The durations seem straightforward, but they can actually be a restricted kind of expression as well. So just defining them as data in a table doesn’t quite work.

  • This example doesn’t have any, but states can also have a number of flags, indicated by keywords after the duration. (Slightly ambiguous, since there’s nothing strictly distinguishing them from action functions.) Bright, for example, is a common flag on projectiles, weapons, and important pickups; it causes the sprite to be drawn fullbright during that frame.

  • Obviously, actor behavior is a big part of the game sim, so ideally it should require dipping into Lua-land as little as possible.

Ideas I’ve had include the following.

Emulate state tables with arguments? A very straightforward way to do the above would be to just, well, cram it into one big table.

define_actor{
    ...
    states = {
        'Spawn:',
        'TROO', 'AB', 10, A_Look,
        'loop',
        'See:',
        'TROO', 'AABBCCDD', 3, A_Chase,
        'loop',
        ...
    },
}

It would work, technically, I guess, except for non-literal durations, but I’d basically just be exposing the DECORATE parser from Lua and it would be pretty ridiculous.

Keep the syntax, but allow calling Lua from it? DECORATE is okay, for the most part. For simple cases, it’s great, even. Would it be good enough to be able to write new action functions in Lua? Maybe. Your behavior would be awkwardly split between Lua and DECORATE, though, which doesn’t seem ideal. But it would be the most straightforward approach, and it would completely avoid questions of how to emulate labels and state counts.

As an added benefit, this would keep DECORATE almost-purely declarative — which means editor tools could still reliably parse it and show you previews of custom objects.

Split animation from behavior? This could go several ways, but the most obvious to me is something like:

define_actor{
    ...
    states = {
        spawn = function(self)
            self:set_animation('AB', 10)
            while true do
                A_Look(self)
                delay(10)
            end
        end,
        see = function(self)
            self:set_animation('ABCD', 6)
            while true do
                A_Chase(self)
                delay(3)
            end
        end,
    },
}

This raises plenty of other API questions, like how to wait until an animation has finished or how to still do work on a specific frame, but I think those are fairly solvable. The big problems are that it’s very much not declarative, and it ends up being rather wordier. It’s not all boilerplate, though; it’s fairly straightforward. I see some value in having state delays and level script delays work the same way, too. And in some cases, you have only an animation with no code at all, so the heavier use of Lua should balance out. I don’t know.

A more practical problem is that, currently, it’s possible to jump to an arbitrary number of states past a given label, and that would obviously make no sense with this approach. It’s pretty rare and pretty unreadable, so maybe that’s okay. Also, labels aren’t blocks, so it’s entirely possible to have labels that don’t end with a keyword like loop and instead carry straight on into the next label — but those are usually used for logic more naturally expressed as for or while, so again, maybe losing that ability is okay.

Or… perhaps it makes sense to do both of these last two approaches? Built-in classes should stay as DECORATE anyway, so that existing code can still inherit from them and perform jumps with offsets, but new code could go entirely Lua for very complex actors.

Alas, this is probably one of those questions that won’t have an obvious answer unless I just build several approaches and port some non-trivial stuff to them to see how they feel.

And further

An enduring desire among ZDoom nerds has been the ability to write custom “thinkers”. Thinkers are really anything that gets to act each tic, but the word also specifically refers to the logic responsible for moving floors, opening doors, changing light levels, and so on. Exposing those more directly to Lua, and letting you write your own, would be pretty interesting.

Anyway

I don’t know if I’ll do all of this. I somewhat doubt it, in fact. I pick it up for half a day every few weeks to see what more I can make it do, just because it’s interesting. It has virtually no chance of being upstreamed anyway (the only active maintainer hates Lua, and thinks poorly of dynamic languages in general; plus, it’s redundant with ZScript) and I don’t really want to maintain my own yet another Doom fork, so I don’t expect it to ever be a serious project.

The source code for what I’ve done so far is available, but it’s brittle and undocumented, so I’m not going to tell you where to find it. If it gets far enough along to be useful as more than a toy, I’ll make a slightly bigger deal about it.

Embedding Lua in ZDoom

Post Syndicated from Eevee original https://eev.ee/blog/2016/11/26/embedding-lua-in-zdoom/

I’ve spent a little time trying to embed a Lua interpreter in ZDoom. I didn’t get too far yet; it’s just an experimental thing I poke at every once and a while. The existing pile of constraints makes it an interesting problem, though.

Background

ZDoom is a “source port” (read: fork) of the Doom engine, with all the changes from the commercial forks merged in (mostly Heretic, Hexen, Strife), and a lot of internal twiddles exposed. It has a variety of mechanisms for customizing game behavior; two are major standouts.

One is ACS, a vaguely C-ish language inherited from Hexen. It’s mostly used to automate level behavior — at the simplest, by having a single switch perform multiple actions. It supports the usual loops and conditionals, it can store data persistently, and ZDoom exposes a number of functions to it for inspecting and altering the state of the world, so it can do some neat tricks. Here’s an arbitrary script from my DUMP2 map.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
script "open_church_door" (int tag)
{
    // Open the door more quickly on easier skill levels, so running from the
    // arch-vile is a more viable option
    int skill = GameSkill();
    int speed;
    if (skill < SKILL_NORMAL)
        speed = 64;  // blazing door speed
    else if (skill == SKILL_NORMAL)
        speed = 16;  // normal door speed
    else
        speed = 8;  // very dramatic door speed

    Door_Raise(tag, speed, 68);  // double usual delay
}

However, ZDoom doesn’t actually understand the language itself; ACS is compiled to bytecode. There’s even at least one alternative language that compiles to the same bytecode, which is interesting.

The other big feature is DECORATE, a mostly-declarative mostly-interpreted language for defining new kinds of objects. It’s a fairly direct reflection of how Doom actors are implemented, which is in terms of states. In Doom and the other commercial games, actor behavior was built into the engine, but this language has allowed almost all actors to be extracted as text files instead. For example, the imp is implemented partly as follows:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
  States
  {
  Spawn:
    TROO AB 10 A_Look
    Loop
  See:
    TROO AABBCCDD 3 A_Chase
    Loop
  Melee:
  Missile:
    TROO EF 8 A_FaceTarget
    TROO G 6 A_TroopAttack
    Goto See
  ...
  }

TROO is the name of the imp’s sprite “family”. A, B, and so on are individual frames. The numbers are durations in tics (35 per second). All of the A_* things (which are optional) are action functions, behavioral functions (built into the engine) that run when the actor switches to that frame. An actor starts out at its Spawn state, so an imp behaves as follows:

  • Spawn. Render as TROO frame A. (By default, action functions don’t run on the very first frame they’re spawned.)
  • Wait 10 tics.
  • Change to TROO frame B. Run A_Look, which checks to see if a player is within line of sight, and if so jumps to the See state.
  • Wait 10 tics.
  • Repeat. (This time, frame A will also run A_Look, since the imp was no longer just spawned.)

All monster and item behavior is one big state table. Even the player’s own weapons work this way, which becomes very confusing — at some points a weapon can be running two states simultaneously. Oh, and there’s A_CustomMissile for monster attacks but A_FireCustomMissile for weapon attacks, and the arguments are different, and if you mix them up you’ll get extremely confusing parse errors.

It’s a little bit of a mess. It’s fairly flexible for what it is, and has come a long way — for example, even original Doom couldn’t pass arguments to action functions (since they were just function pointers), so it had separate functions like A_TroopAttack for every monster; now that same function can be written generically. People have done some very clever things with zero-delay frames (to run multiple action functions in a row) and storing state with dummy inventory items, too. Still, it’s not quite a programming language, and it’s easy to run into walls and bizarre quirks.

When DECORATE lets you down, you have one interesting recourse: to call an ACS script!

Unfortunately, ACS also has some old limitations. The only type it truly understands is int, so you can’t manipulate an actor directly or even store one in a variable. Instead, you have to work with TIDs (“thing IDs”). Every actor has a TID (zero is special-cased to mean “no TID”), and most ACS actor-related functions are expressed in terms of TIDs. For level automation, this is fine, and probably even what you want — you can dump a group of monsters in a map, give them all a TID, and then control them as a group fairly easily.

But if you want to use ACS to enhance DECORATE, you have a bit of a problem. DECORATE defines individual actor behavior. Also, many DECORATE actors are designed independently of a map and intended to be reusable anywhere. DECORATE should thus not touch TIDs at all, because they’re really the map’s concern, and mucking with TIDs might break map behavior… but ACS can’t refer to actors any other way. A number of action functions can, but you can’t call action functions from ACS, only DECORATE. The workarounds for this are not pretty, especially for beginners, and they’re very easy to silently get wrong.

Also, ultimately, some parts of the engine are just not accessible to either ACS or DECORATE, and neither language is particularly amenable to having them exposed. Adding more native types to ACS is rather difficult without making significant changes to both the language and bytecode, and DECORATE is barely a language at all.

Some long-awaited work is finally being done on a “ZScript”, which purports to solve all of these problems by expanding DECORATE into an entire interpreted-C++-ish scripting language with access to tons of internals. I don’t know what I think of it, and it only seems to half-solve the problem, since it doesn’t replace ACS.

Trying out Lua

Lua is supposed to be easy to embed, right? That’s the one thing it’s famous for. Before ZScript actually started to materialize, I thought I’d take a little crack at embedding a Lua interpreter and exposing some API stuff to it.

It’s not very far along yet, but it can do one thing that’s always been completely impossible in both ACS and DECORATE: print out the player’s entire inventory. You can check how many of a given item the player has in either language, but neither has a way to iterate over a collection. In Lua, it’s pretty easy.

function lua_test_script(activator, ...)
    for item, amount in pairs(activator.inventory) do
        -- This is Lua's builtin print(), so it goes to stdout
        print(item.class.name, amount)
    end
end

I made a tiny test map with a switch that tries to run the ACS script named lua_test_script. I hacked the name lookup to first look for the name in Lua’s global scope; if the function exists, it’s called immediately, and ACS isn’t consulted at all. The code above is just a regular (global) function in a regular Lua file, embedded as a lump in the map. So that was a good start, and was pretty neat to see work.

Writing the bindings

I used the bare Lua API at first. While its API is definitely very simple, actually using it to define and expose a large API in practice is kind of repetitive and error-prone, and I was never confident I was doing it quite right. It’s plain C and it works entirely through stack manipulation and it relies on a lot of casting to/from void*, so virtually anything might go wrong at any time.

I was on the cusp of writing a bunch of gross macros to automate the boring parts, and then I found sol2, which is pretty great. It makes heavy use of basically every single C++11 feature, so it’s a nightmare when it breaks (and I’ve had to track down a few bugs), but it’s expressive as hell when it works:

lua.new_usertype<AActor>("zdoom.AActor",
    "__tostring", [](AActor& actor) { return "<actor>"; },
    // Pointer to an unbound method.  Sol automatically makes this an attribute
    // rather than a method because it takes no arguments, then wraps its
    // return value to pass it back to Lua, no manual wrapper code required.
    "class", &AActor::GetClass,
    "inventory", sol::property([](AActor& actor) -> ZLuaInventory { return ZLuaInventory(actor); }),
    // Pointers to unbound attributes.  Sol turns these into writable
    // attributes on the Lua side.
    "health", &AActor::health,
    "floorclip", &AActor::Floorclip,
    "weave_index_xy", &AActor::WeaveIndexXY,
    "weave_index_z", &AActor::WeaveIndexZ);

This is the type of the activator argument from the script above. It works via template shenanigans, so most of the work is done at compile time. AActor has a lot of properties of various types; wrapping them with the bare Lua API would’ve been awful, but wrapping them with Sol is fairly straightforward.

Lifetime

activator.inventory is a wrapper around a ZLuaInventory object, which I made up. It’s just a tiny proxy struct that tries to represent the inventory of a particular actor, because the engine itself doesn’t quite have such a concept — an actor’s “inventory” is a single item (itself an actor), and each item has a pointer to the next item in the inventory. Creating an intermediate type lets me hide that detail from Lua and pretend the inventory is a real container.
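In pure Lua terms, the proxy idea looks something like this. This is an illustrative sketch (the names and the plain-table linked list are mine, standing in for the engine’s actor→Inventory chain), not the actual binding code; the `__pairs` metamethod it relies on requires Lua 5.2+:

```lua
-- A proxy that isn't a real table, but supports pairs() via __pairs.
local function inventory_proxy(first_item)
    return setmetatable({}, {
        __pairs = function()
            local node = first_item
            -- pairs() expects an iterator function; the state and
            -- control values can be nil when a closure carries the state.
            return function()
                if node == nil then return nil end
                local item = node
                node = node.next
                return item.name, item.amount
            end
        end,
    })
end

-- Usage: iterate the fake "inventory" as though it were a table.
local inv = inventory_proxy({ name = "Clip", amount = 20,
                              next = { name = "Shell", amount = 4 } })
for name, amount in pairs(inv) do
    print(name, amount)
end
```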

The inventory is thus not a real table; pairs() works on it because it provides the __pairs metamethod. It calls an iter method returning a closure, per Lua’s iteration API, which Sol makes just work:

struct ZLuaInventory {
    ...
    std::function<AInventory* ()> iter()
    {
        TObjPtr<AInventory> item = this->actor->Inventory;
        return [item]() mutable {
            AInventory* ret = item;
            if (ret)
                item = ret->NextInv();
            return ret;
        };
    }
};

C++’s closures are slightly goofy and it took me a few tries to land on this, but it works.

Well, sort of.

I don’t know how I got this idea in my head, but I was pretty sure that ZDoom’s TObjPtr did reference counting and would automatically handle the lifetime problems in the above code. Eventually Lua reaps the closure, then C++ reaps the closure, then the wrapped AInventorys refcount drops, and all is well.

Turns out TObjPtr doesn’t do reference counting. Rather, all the game objects participate in tracing garbage collection. The basic idea is to start from some root object and recursively traverse all the objects reachable from that root; whatever isn’t reached is garbage and can be deleted.

Unfortunately, the Lua interpreter is not reachable from ZDoom’s own object tree. If an object ends up only being held by Lua, ZDoom will think it’s garbage and delete it prematurely, leaving a dangling reference. Those are bad.

I think I can fix this without too much trouble. Sol allows customizing how it injects particular types, so I can use that for the type tree that participates in this GC scheme and keep an unordered_set of all objects that are alive in Lua. The Lua interpreter itself is already wrapped in an object that participates in the GC, so when the GC descends to the wrapper, it’s easy to tell it that that set of objects is alive. I’ll probably need to figure out read/write barriers, too, but I haven’t looked too closely at how ZDoom uses those yet. I don’t know whether it’s possible for an object to be “dead” (as in no longer usable, not just 0 health) before being reaped, but if so, I’ll need to figure out something there too.

It’s a little ironic that I have to do this weird workaround when ZDoom’s tracing garbage collector is based on… Lua’s.

ZDoom does have types I want to expose that aren’t garbage collected, but those are all map structures like sectors, which are never created or destroyed at runtime. I will have to be careful with the Lua interpreter itself to make sure those can’t live beyond the current map, but I haven’t really dealt with map changes at all yet. The ACS approach is that everything is map-local, and there’s some limited storage for preserving values across maps; I could do something similar, perhaps only allowing primitive scalars.

Asynchronicity

Another critical property of ACS scripts is that they can pause themselves. They can either wait for a set number of tics with delay(), or wait for map geometry to stop being busy with something like tagwait(). So you can raise up some stairs, wait for the stairs to finish appearing, and then open the door they lead to. Or you can simulate game rules by running a script in an infinite loop that waits for a few tics between iterations. It’s pretty handy. It’s incredibly handy. It’s non-negotiable.

Luckily, Lua can emulate this using coroutines. I implemented the delay case yesterday:

function lua_test_script(activator, ...)
    zprint("hey it's me what's up", ...)
    coroutine.yield("delay", 70)
    zprint("i'm back again")
end

When I press the switch, I see the first message, then there’s a two-second pause (Doom is 35fps), then I see the second message.

A lot more details need to be hammered out before this is really equivalent to what ACS can do, but the basic functionality is there. And since these are stackful coroutines, I can trivially wrap that yield gunk in a delay(70) function, so you never have to know the difference.
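A minimal sketch of how that host-side plumbing could fit together (all names here are invented; in the real thing the resume loop lives in C++, not Lua): each script is a coroutine, and a "delay" yield records the tic at which to wake it back up.

```lua
local scheduler = { tic = 0, waiting = {} }

-- The friendly wrapper scripts would actually call.
function delay(tics)
    coroutine.yield("delay", tics)
end

function scheduler:resume(co, ...)
    local ok, reason, arg = coroutine.resume(co, ...)
    if ok and coroutine.status(co) ~= "dead" and reason == "delay" then
        -- Remember when this script wants to run again.
        self.waiting[co] = self.tic + arg
    end
end

function scheduler:start(fn, ...)
    self:resume(coroutine.create(fn), ...)
end

-- Called once per game tic (35 times a second in Doom).
function scheduler:advance()
    self.tic = self.tic + 1
    local ready = {}
    for co, wake_at in pairs(self.waiting) do
        if self.tic >= wake_at then ready[#ready + 1] = co end
    end
    for _, co in ipairs(ready) do
        self.waiting[co] = nil
        self:resume(co)
    end
end

-- Usage: prints, sleeps 70 tics (two seconds), prints again.
scheduler:start(function()
    print("hey it's me what's up")
    delay(70)
    print("i'm back again")
end)
for _ = 1, 70 do scheduler:advance() end
```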

Determinism

ZDoom has demos and peer-to-peer multiplayer. Both features rely critically on the game state’s unfolding exactly the same way, given the same seed and sequence of inputs.

ACS goes to great lengths to preserve this. It executes deterministically. It has very, very few ways to make decisions based on anything but the current state of the game. Netplay and demos just work; modders and map authors never have to think about it.

I don’t know if I can guarantee the same about Lua. I’d think so, but I don’t know so. Will the order of keys in a table be exactly the same on every system, for example? That’s important! Even the ACS random-number generator is deterministic.

I hope this is the case. I know some games, like Starbound, implicitly assume for multiplayer purposes that scripts will execute the same way on every system. So it’s probably fine. I do wish Lua made some sort of guarantee here, though, especially since it’s such an obvious and popular candidate for game scripting.
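For table iteration specifically, one defensive option (if it turns out to matter) is to never rely on pairs() order at all and iterate sorted keys instead. A sketch, assuming the keys are mutually comparable (all strings, say):

```lua
-- pairs() iteration order is explicitly unspecified in Lua; this
-- helper makes iteration order deterministic by sorting the keys.
local function sorted_pairs(t)
    local keys = {}
    for k in pairs(t) do keys[#keys + 1] = k end
    table.sort(keys)
    local i = 0
    return function()
        i = i + 1
        if keys[i] ~= nil then return keys[i], t[keys[i]] end
    end
end

-- Usage: the order is now always a, b, c, whatever the hash order was.
local order = {}
for k in sorted_pairs({ b = 2, a = 1, c = 3 }) do
    order[#order + 1] = k
end
print(table.concat(order, ","))  -- a,b,c
```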

Savegames

ZDoom allows you to quicksave at any time.

Any time.

Not while a script is running, mind you. Script execution blocks the gameplay thread, so only one thing can actually be happening at a time. But what happens if you save while a script is in the middle of a tagwait?

The coroutine needs to be persisted, somehow. More importantly, when the game is loaded, the coroutine needs to be restored to the same state: paused in the same place, with locals set to the same values. Even if those locals were wrapped pointers to C++ objects, which now have different addresses.

Vanilla Lua has no way to do this. Vanilla Lua has a pretty poor serialization story overall — nothing is built in — which is honestly kind of shocking. People use Lua for games, right? Like, a lot? How is this not an extremely common problem?

A potential solution exists in the form of Eris, a modified Lua that does all kinds of invasive things to allow absolutely anything to be serialized. Including coroutines!

So Eris makes this at least possible. I haven’t made even the slightest attempt at using it yet, but a few gotchas already stand out to me.

For one, Eris serializes everything. Even regular ol’ functions are serialized as Lua bytecode. A naïve approach would thus end up storing a copy of the entire game script in the save file.

Eris has a thing called the “permanent object table”, which allows giving names to specific Lua values. Those values are then serialized by name instead, and the names are looked up in the same table to deserialize. So I could walk the Lua namespace myself after the initial script load and stick all reachable functions in this table to avoid having them persisted. (That won’t catch if someone loads new code during play, but that sounds like a really bad idea anyway, and I’d like to prevent it if possible.) I have to do this to some extent anyway, since Eris can’t persist the wrapped C++ functions I’m exposing to Lua. Even if a script does some incredibly fancy dynamic stuff to replace global functions with closures at runtime, that’s okay; they’ll be different functions, so Eris will fall back to serializing them.
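That namespace walk might look something like this. The `collect` function and the dotted-name scheme are my own invention; the only Eris-specific part is that the resulting `perms` table (object → name) is what you’d hand to Eris when persisting:

```lua
-- Walk the globals after initial script load and register every
-- reachable function under a stable dotted name, so saves store names
-- instead of serialized bytecode.
local function collect(root, prefix, perms, seen)
    for name, value in pairs(root) do
        local path = prefix .. "." .. tostring(name)
        if type(value) == "function" then
            perms[value] = path  -- persisted as "_G.whatever", not code
        elseif type(value) == "table" and not seen[value] then
            seen[value] = true  -- guard against cycles like package.loaded
            collect(value, path, perms, seen)
        end
    end
end

local perms = {}
collect(_G, "_G", perms, { [_G] = true })
-- e.g. perms[print] is now "_G.print"; a mirrored name -> object table
-- would be used on the deserializing side.
```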

Then when the save is reloaded, Eris will replace any captured references to a global function with the copy that already exists in the map script. ZDoom doesn’t let you load saves across different mods, so the functions should be the same. I think. Hmm, maybe I should check on exactly what the load rules are. If you can load a save against a more recent copy of a map, you’ll want to get its updated scripts, but stored closures and coroutines might be old versions, and that is probably bad. I don’t know if there’s much I can do about that, though, unless Eris can somehow save the underlying code from closures/coros as named references too.

Eris also has a mechanism for storing wrapped native objects, so all I have to worry about is translating pointers, and that’s a problem Doom has already solved (somehow). Alas, that mechanism is also accessible to pure Lua code, and the docs warn that it’s possible to get into an infinite loop when loading. I’d rather not give modders the power to fuck up a save file, so I’ll have to disable that somehow.

Finally, since Eris loads bytecode, it’s possible to do nefarious things with a specially-crafted save file. But since the save file is full of a web of pointers anyway, I suspect it’s not too hard to segfault the game with a specially-crafted save file anyway. I’ll need to look into this. Or maybe I won’t, since I don’t seriously expect this to be merged in.

Runaway scripts

Speaking of which, ACS currently has detection for “runaway scripts”, i.e. those that look like they might be stuck in an infinite loop (or are just doing a ludicrous amount of work). Since scripts are blocking, the game does not actually progress while a script is running, and a very long script would appear to freeze the game.

I think ACS does this by counting instructions. I see Lua has its own mechanism for doing that, so limiting script execution “time” shouldn’t be too hard.
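Sketched from the Lua side, a count hook can interrupt a script every N VM instructions and abort it by raising an error from the hook. The limit below is an arbitrary budget I made up:

```lua
local INSTRUCTION_LIMIT = 100000

-- Run a script coroutine with an instruction budget; if the hook fires,
-- the error propagates out of coroutine.resume as a failure.
local function run_limited(fn, ...)
    local co = coroutine.create(fn)
    debug.sethook(co, function()
        error("runaway script: instruction limit exceeded", 2)
    end, "", INSTRUCTION_LIMIT)
    return coroutine.resume(co, ...)
end

-- Usage: an infinite loop gets caught instead of freezing the game.
local ok, err = run_limited(function()
    while true do end
end)
print(ok, err)  -- false, plus the limit error
```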

Defining new actors

I want to be able to use Lua with (or instead of) DECORATE, too, but I’m a little hung up on syntax.

I do have something slightly working — I was able to create a variant imp class with a bunch more health from Lua, then spawn it and fight it. Also, I did it at runtime, which is probably bad — I don’t know that there’s any way to destroy an actor class, so having them be map-scoped makes no sense.

That could actually pose a bit of a problem. The Lua interpreter should be scoped to a single map, but actor classes are game-global. Do they live in separate interpreters? That seems inconvenient. I could load the game-global stuff, take an internal-only snapshot of the interpreter with Lua (bytecode and all), and then restore it at the beginning of each level? Hm, then what happens if you capture a reference to an actor method in a save file…? Christ.

I could consider making the interpreter global and doing black magic to replace all map objects with nil when changing maps, but I don’t think that can possibly work either. ZDoom has hubs — levels that can be left and later revisited, preserving their state just like with a save — and that seems at odds with having a single global interpreter whose state persists throughout the game.

Er, anyway. So, the problem with syntax is that DECORATE’s own syntax is extremely compact and designed for its very specific goal of state tables. Even ZScript appears to preserve the state table syntax, though it lets you write your own action functions or just provide a block of arbitrary code. Here’s a short chunk of the imp implementation again, for reference.

  States
  {
  Spawn:
    TROO AB 10 A_Look
    Loop
  See:
    TROO AABBCCDD 3 A_Chase
    Loop
  ...
  }

Some tricky parts that stand out to me:

  • Labels are important, since these are state tables, and jumping to a particular state is very common. It’s tempting to use Lua coroutines here somehow, but short of using a lot of goto in Lua code (yikes!), jumping around arbitrarily doesn’t work. Also, it needs to be possible to tell an actor to jump to a particular state from outside — that’s how A_Look works, and there’s even an ACS function to do it manually.

  • Aside from being shorthand, frames are fine. Though I do note that hacks like AABBCCDD 3 are relatively common. The actual animation that’s wanted here is ABCD 6, but because animation and behavior are intertwined, the frames need to be repeated to run the action function more often. I wonder if it’s desirable to be able to separate display and behavior?

  • The durations seem straightforward, but they can actually be a restricted kind of expression as well. So just defining them as data in a table doesn’t quite work.

  • This example doesn’t have any, but states can also have a number of flags, indicated by keywords after the duration. (Slightly ambiguous, since there’s nothing strictly distinguishing them from action functions.) Bright, for example, is a common flag on projectiles, weapons, and important pickups; it causes the sprite to be drawn fullbright during that frame.

  • Obviously, actor behavior is a big part of the game sim, so ideally it should require dipping into Lua-land as little as possible.

Ideas I’ve had include the following.

Emulate state tables with arguments? A very straightforward way to do the above would be to just, well, cram it into one big table.

define_actor{
    ...
    states = {
        'Spawn:',
        'TROO', 'AB', 10, A_Look,
        'loop',
        'See:',
        'TROO', 'AABBCCDD', 3, A_Chase,
        'loop',
        ...
    },
}

It would work, technically, I guess, except for non-literal durations, but I’d basically just be exposing the DECORATE parser from Lua and it would be pretty ridiculous.

Keep the syntax, but allow calling Lua from it? DECORATE is okay, for the most part. For simple cases, it’s great, even. Would it be good enough to be able to write new action functions in Lua? Maybe. Your behavior would be awkwardly split between Lua and DECORATE, though, which doesn’t seem ideal. But it would be the most straightforward approach, and it would completely avoid questions of how to emulate labels and state counts.

As an added benefit, this would keep DECORATE almost-purely declarative — which means editor tools could still reliably parse it and show you previews of custom objects.

Split animation from behavior? This could go several ways, but the most obvious to me is something like:

define_actor{
    ...
    states = {
        spawn = function(self)
            self:set_animation('AB', 10)
            while true do
                A_Look(self)
                delay(10)
            end
        end,
        see = function(self)
            self:set_animation('ABCD', 6)
            while true do
                A_Chase(self)
                delay(3)
            end
        end,
    },
}

This raises plenty of other API questions, like how to wait until an animation has finished or how to still do work on a specific frame, but I think those are fairly solvable. The big problems are that it’s very much not declarative, and it ends up being rather wordier. It’s not all boilerplate, though; it’s fairly straightforward. I see some value in having state delays and level script delays work the same way, too. And in some cases, you have only an animation with no code at all, so the heavier use of Lua should balance out. I don’t know.

A more practical problem is that, currently, it’s possible to jump to an arbitrary number of states past a given label, and that would obviously make no sense with this approach. It’s pretty rare and pretty unreadable, so maybe that’s okay. Also, labels aren’t blocks, so it’s entirely possible to have labels that don’t end with a keyword like loop and instead carry straight on into the next label — but those are usually used for logic more naturally expressed as for or while, so again, maybe losing that ability is okay.

Or… perhaps it makes sense to do both of these last two approaches? Built-in classes should stay as DECORATE anyway, so that existing code can still inherit from them and perform jumps with offsets, but new code could go entirely Lua for very complex actors.

Alas, this is probably one of those questions that won’t have an obvious answer unless I just build several approaches and port some non-trivial stuff to them to see how they feel.

And further

An enduring desire among ZDoom nerds has been the ability to write custom “thinkers”. Thinkers are really anything that gets to act each tic, but the word also specifically refers to the logic responsible for moving floors, opening doors, changing light levels, and so on. Exposing those more directly to Lua, and letting you write your own, would be pretty interesting.

Anyway

I don’t know if I’ll do all of this. I somewhat doubt it, in fact. I pick it up for half a day every few weeks to see what more I can make it do, just because it’s interesting. It has virtually no chance of being upstreamed anyway (the only active maintainer hates Lua, and thinks poorly of dynamic languages in general; plus, it’s redundant with ZScript) and I don’t really want to maintain my own yet another Doom fork, so I don’t expect it to ever be a serious project.

The source code for what I’ve done so far is available, but it’s brittle and undocumented, so I’m not going to tell you where to find it. If it gets far enough along to be useful as more than a toy, I’ll make a slightly bigger deal about it.

A Rebuttal For Python 3

Post Syndicated from Eevee original https://eev.ee/blog/2016/11/23/a-rebuttal-for-python-3/

Zed Shaw, of Learn Python the Hard Way fame, has now written The Case Against Python 3.

I’m not involved with core Python development. The only skin I have in this game is that I like Python 3. It’s a good language. And one of the big factors I’ve seen slowing its adoption is that respected people in the Python community keep grouching about it. I’ve had multiple newcomers tell me they have the impression that Python 3 is some kind of unusable disaster, though they don’t know exactly why; it’s just something they hear from people who sound like they know what they’re talking about. Then they actually use the language, and it’s fine.

I’m sad to see the Python community needlessly sabotage itself, but Zed’s contribution is beyond the pale. It’s not just making a big deal about changed details that won’t affect most beginners; it’s complete and utter nonsense, on a platform aimed at people who can’t yet recognize it as nonsense. I am so mad.

The Case Against Python 3

I give two sets of reasons as I see them now. One for total beginners, and another for people who are more knowledgeable about programming.

Just to note: the two sets of reasons are largely the same ideas presented differently, so I’ll just weave them together below.

The first section attempts to explain the case against starting with Python 3 in non-technical terms so a beginner can make up their own mind without being influenced by propaganda or social pressure.

Having already read through this once, this sentence really stands out to me. The author of a book many beginners read to learn Python in the first place is providing a number of reasons (some outright fabricated) not to use Python 3, often in terms beginners are ill-equipped to evaluate, but believes this is a defense against propaganda or social pressure.

The Most Important Reason

Before getting into the main technical reasons I would like to discuss the one most important social reason for why you should not use Python 3 as a beginner:

THERE IS A HIGH PROBABILITY THAT PYTHON 3 IS SUCH A FAILURE IT WILL KILL PYTHON.

Python 3’s adoption is really only at about 30% whenever there is an attempt to measure it.

Wait, really? Wow, that’s fantastic.

I mean, it would probably be higher if the most popular beginner resources were actually teaching Python 3, but you know.

Nobody is all that interested in finding out what the real complete adoption is, despite there being fairly simple ways to gather metrics on the adoption.

This accusatory sentence conspicuously neglects to mention what these fairly simple ways are, a pattern that repeats throughout. The trouble is that it’s hard to even define what “adoption” means — I write all my code in Python 3 now, but veekun is still Python 2 because it’s in maintenance mode, so what does that say about adoption? You could look at PyPI download stats, but those are thrown way off by caches and system package managers. You could look at downloads from the Python website, but a great deal of Python is written and used on Unix-likes, where Python itself is either bundled or installed from the package manager.

It’s as simple as that. If you learn Python 2, then you can still work with all the legacy Python 2 code in existence until Python dies or you (hopefully) move on. But if you learn Python 3 then your future is very uncertain. You could really be learning a dead language and end up having to learn Python 2 anyway.

You could use Python 2, until it dies… or you could use Python 3, which might die. What a choice.

By some definitions, Python 2 is already dead — it will not see another major release, only security fixes. Python 3 is still actively developed, and its seventh major release is next month. It even contains a new feature that Zed later mentions he prefers to Python 2’s offerings.

It may shock you to learn that I know both Python 2 and Python 3. Amazingly, two versions of the same language are much more similar than they are different. If you learned Python 3 and then a wizard cast a spell that made it vanish from the face of the earth, you’d just have to spend half an hour reading up on what had changed from Python 2.

Also, it’s been over a decade, maybe even multiple decades, and Python 3 still isn’t above about 30% in adoption. Even among the sciences where Python 3 is touted as a “success” it’s still only around 25-30% adoption. After that long it’s time to admit defeat and come up with a new plan.

Python 3.0 came out in 2008. The first couple releases ironed out some compatibility and API problems, so it didn’t start to gain much traction until Python 3.2 came out in 2011. Hell, Python 2.0 came out in 2000, so even Python 2 isn’t multiple decades old. It would be great if this trusted beginner reference could take two seconds to check details like this before using them to scaremonger.

The big early problem was library compatibility: it’s hard to justify switching to a new version of the language if none of the libraries work. Libraries could only port once their own dependencies had ported, of course, and it took a couple years to figure out the best way to maintain compatibility with both Python 2 and Python 3. I’d say we only really hit critical mass a few years ago — for instance, Django didn’t support Python 3 until 2013 — in which case that 30% is nothing to sneeze at.

There are more reasons beyond just the uncertain future of Python 3 even decades later.

In one paragraph, we’ve gone from “maybe even multiple decades” to just “decades”, which is a funny way to spell “eight years”.

Not In Your Best Interests

The Python project’s efforts to convince you to start with Python 3 are not in your best interest, but, rather, are only in the best interests of the Python project.

It’s bad, you see, for the Python project to want people to use the work it produced.

Anyway, please buy Zed Shaw’s book.

Anyway, please pledge to my Patreon.

Ultimately though, if Python 3 were good they wouldn’t need to do any convincing to get you to use it. It would just naturally work for you and you wouldn’t have any problems. Instead, there are serious issues with Python 3 for beginners, and rather than fix those issues the Python project uses propaganda, social pressure, and marketing to convince you to use it. In the world of technology using marketing and propaganda is immediately a sign that the technology is defective in some obvious way.

This use of social pressure and propaganda to convince you to use Python 3 despite its problems, in an attempt to benefit the Python project, is morally unconscionable to me.

Ten paragraphs in, Zed is telling me that I should be suspicious of anything that relies on marketing and propaganda. Meanwhile, there has yet to be a single concrete reason why Python 3 is bad for beginners — just several flat-out incorrect assertions and a lot of handwaving about how inexplicably nefarious the Python core developers are. You know, the same people who made Python 2. But they weren’t evil then, I guess.

You Should Be Able to Run 2 and 3

In the programming language theory there is this basic requirement that, given a “complete” programming language, I can run any other programming language. In the world of Java I’m able to run Ruby, Java, C++, C, and Lua all at the same time. In the world of Microsoft I can run F#, C#, C++, and Python all at the same time. This isn’t just a theoretical thing. There is solid math behind it. Math that is truly the foundation of computer science.

The fact that you can’t run Python 2 and Python 3 at the same time is purely a social and technical decision that the Python project made with no basis in mathematical reality. This means you are working with a purposefully broken platform when you use Python 3, and I personally can’t condone teaching people to use something that is fundamentally broken.

The programmer-oriented section makes clear that the solid math being referred to is Turing-completeness — the section is even titled “Python 3 Is Not Turing Complete”.

First, notice a rhetorical trick here. You can run Ruby, Java, C++, etc. at the same time, so why not Python 2 and Python 3?

But can you run Java and C# at the same time? (I’m sure someone has done this, but it’s certainly much less popular than something like Jython or IronPython.)

Can you run Ruby 1.8 and Ruby 2.3 at the same time? Ah, no, so I guess Ruby 2.3 is fundamentally and purposefully broken.

Can you run Lua 5.1 and 5.3 at the same time? Lua is a spectacular example, because Lua 5.2 made a breaking change to how the details of scope work, and it’s led to a situation where a lot of programs that embed Lua haven’t bothered upgrading from Lua 5.1. Was Lua 5.2 some kind of dark plot to deliberately break the language? No, it’s just slightly more inconvenient than expected for people to upgrade.

Anyway, as for Turing machines:

In computer science a fundamental law is that if I have one Turing Machine I can build any other Turing Machine. If I have COBOL then I can bootstrap a compiler for FORTRAN (as disgusting as that might be). If I have FORTH, then I can build an interpreter for Ruby. This also applies to bytecodes for CPUs. If I have a Turing Complete bytecode then I can create a compiler for any language. The rule then can be extended even further to say that if I cannot create another Turing Machine in your language, then your language cannot be Turing Complete. If I can’t use your language to write a compiler or interpreter for any other language then your language is not Turing Complete.

Yes, this is true.

Currently you cannot run Python 2 inside the Python 3 virtual machine. Since I cannot, that means Python 3 is not Turing Complete and should not be used by anyone.

And this is completely asinine. Worse, it’s flat-out dishonest, and relies on another rhetorical trick. You only “cannot” run Python 2 inside the Python 3 VM because no one has written a Python 2 interpreter in Python 3. The “cannot” is not a mathematical impossibility; it’s a simple matter of the code not having been written. Or perhaps it has, but no one cares anyway, because it would be comically and unusably slow.

I assume this was meant to be sarcastic on some level, since it’s followed by a big blue box that seems unsure about whether to double down or reverse course. But I can’t tell why it was even brought up, because it has absolutely nothing to do with Zed’s true complaint, which is that Python 2 and Python 3 do not coexist within a single environment. Implementing language X using language Y does not mean that X and Y can now be used together seamlessly.

The canonical Python release is written in C (just like with Ruby or Lua), but you can’t just dump a bunch of C code into a Python (or Ruby or Lua) file and expect it to work. You can talk to C from Python and vice versa, but defining how they communicate is a bit of a pain in the ass and requires some level of setup.

I’ll get into this some more shortly.

No Working Translator

Python 3 comes with a tool called 2to3 which is supposed to take Python 2 code and translate it to Python 3 code.

I should point out right off the bat that this is not actually what you want to use most of the time, because you probably want to translate your Python 2 code to Python 2/3 code. 2to3 produces code that most likely will not work on Python 2. Other tools exist to help you port more conservatively.

Translating one programming language into another is a solidly researched topic with solid math behind it. There are translators that convert any number of languages into JavaScript, C, C++, Java, and many times you have no idea the translation is being done. In addition to this, one of the first steps when implementing a new language is to convert the new language into an existing language (like C) so you don’t have to write a full compiler. Translation is a fully solved problem.

This is completely fucking ludicrous. Translating one programming language to another is a common task, though “fully solved” sounds mighty questionable. But do you know what the results look like?

I found a project called “Transcrypt”, which puts Python in the browser by “translating” it to JavaScript. I’ve never used or heard of this before; I just googled for something to convert Python to JavaScript. Here’s their first sample, a demo using jQuery:

def start ():
    def changeColors ():
        for div in S__divs:
            S (div) .css ({
                'color': 'rgb({},{},{})'.format (* [int (256 * Math.random ()) for i in range (3)]),
            })

    S__divs = S ('div')
    changeColors ()
    window.setInterval (changeColors, 500)

And here’s the JavaScript code it compiles to:

(function () {
    var start = function () {
        var changeColors = function () {
            var __iterable0__ = $divs;
            for (var __index0__ = 0; __index0__ < __iterable0__.length; __index0__++) {
                var div = __iterable0__ [__index0__];
                $ (div).css (dict ({'color': 'rgb({},{},{})'.format.apply (null, function () {
                    var __accu0__ = [];
                    for (var i = 0; i < 3; i++) {
                        __accu0__.append (int (256 * Math.random ()));
                    }
                    return __accu0__;
                } ())}));
            }
        };
        var $divs = $ ('div');
        changeColors ();
        window.setInterval (changeColors, 500);
    };
    __pragma__ ('<all>')
        __all__.start = start;
    __pragma__ ('</all>')
}) ();

Well, not quite. That’s actually just a small piece at the end of the full 1861-line file.

You may notice that the emitted JavaScript effectively has to emulate the Python for loop, because JavaScript doesn’t have anything that works exactly the same way. And this is a basic, common language feature translated between two languages in the same general family! Imagine how your code would look if you relied on gritty details of how classes are implemented.

Is this what you want 2to3 to do to your code?

Even if something has been proven to be mathematically possible, that doesn’t mean it’s easy, and it doesn’t mean the results will be pretty (or fast).

The 2to3 translator fails on about 15% of the code it attempts, and does a poor job of translating the code it can handle. The motivations for this are unclear, but keep in mind that a group of people who claim to be programming language experts can’t write a reliable translator from one version of their own language to another. This is also a cause of their porting problems, which adds up to more evidence Python 3’s future is uncertain.

Writing a translator from one language to another is a fully proven and fundamental piece of computer science. Yet, the 2to3 translator cannot translate code 100%. In my own tests it is only about 85% effective, leaving a large amount of code to translate manually. Given that translation is a solved problem this seems to be a decision bordering on malice rather than incredible incompetence.

The programmer-oriented section doubles down on this idea with a title of “Purposefully Crippled 2to3 Translator” — again, accusing the Python project of sabotaging everyone. That doesn’t even make sense; if their goal is to make everyone use Python 3 at any cost, why would they deliberately break their tool that reduces the amount of Python 2 code and increases the amount of Python 3 code?

2to3 sucks because its job is hard. Python is dynamically typed. If it sees d.iteritems(), it might want to change that to d.items(), as it’s called in Python 3 — but it can’t always be sure that d is actually a dict. If d is some user-defined type, renaming the method is wrong.

But hey, Turing-completeness, right? It must be mathematically possible. And it is! As long as you’re willing to see this:

for key, value in d.iteritems():
    ...

Get translated to this:

__d = d
for key, value in (__d.items() if isinstance(__d, dict) else __d.iteritems()):
    ...

Would Zed be happier with that, I wonder?

The JVM and CLR Prove It’s Pointless

Yet, for some reason, the Python 3 virtual machine can’t run Python 2? Despite the solidly established mathematics disproving this, the countless examples of running one crazy language inside a Russian doll cascade of other crazy languages, and huge number of languages that can coexist in nearly every other virtual machine? That makes no sense.

This, finally, is the real complaint. It’s not a bad one, and it comes up sometimes, but… it’s not this easy.

The Python 3 VM is fairly similar to the Python 2 VM. The problem isn’t the VM, but the core language constructs and standard library.

Consider: what happens when a Python 2 old-style class instance gets passed into Python 3, which has no such concept? It seems like a value would have to always have the semantics of the language version it came from — that’s how languages usually coexist on the same VM, anyway.

Now, I’m using Python 3, and I load some library written for Python 2. I call a Python 2 function that deals with bytestrings, and I pass it a Python 3 bytestring. Oh no! It breaks because Python 3 bytestrings iterate as integers, whereas the Python 2 library expects them to iterate as characters.

Okay, well, no big deal, you say. Maybe Python 2 libraries just need to be updated to work either way, before they can be used with Python 3.

But that’s exactly the situation we’re in right now. Syntax changes are trivially fixed by 2to3 and similar tools. It’s libraries that cause the subtler issues.

The same applies the other way, too. I write Python 3 code, and it gets an int from some Python 2 library. I try to use the .to_bytes method on it, but that doesn’t exist on Python 2 integers. So my Python 3 code, written and intended purely for Python 3, now has to deal with Python 2 integers as well.

Perhaps “primitive” types should convert automatically, on the boundary? Okay, sure. What about the Python 2 buffer type, which is C-backed and replaced by memoryview in Python 3?

Or how about this very fundamental problem: names of methods and other attributes are str in both versions, but that means they’re bytestrings in Python 2 and text in Python 3. If you’re in Python 3 land, and you call obj.foo() on a Python 2 object, what happens? Python 3 wants a method with the text name foo, but Python 2 wants a method with the bytes name foo. Text and bytes are not implicitly convertible in Python 3. So does it error? Somehow work anyway? What about the other way around?
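
For a taste of how strict Python 3 already is about this, attribute names given as bytes are rejected outright today. A small self-contained sketch (the Widget class is just an illustration):

```python
class Widget:
    def foo(self):
        return "hello"

w = Widget()
print(getattr(w, "foo")())   # a text name works: hello

# A bytes name is refused entirely in Python 3.
try:
    getattr(w, b"foo")
except TypeError as e:
    print(e)   # attribute name must be string, not 'bytes'
```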

What about the standard library, which has had a number of improvements in Python 3 that don’t or can’t exist in Python 2? Should Python ship two entire separate copies of its standard library? What about modules like logging, which rely on global state? Does Python 2 and Python 3 code need to set up logging separately within the same process?

There are no good solutions here. The language would double in size and complexity, and you’d still end up with a mess at least as bad as the one we have now when values leak from one version into the other.

We either have two situations here:

  1. Python 3 has been purposefully crippled to prevent Python 2’s execution alongside Python 3 for someone’s professional or ideological gain.
  2. Python 3 cannot run Python 2 due to simple incompetence on the part of the Python project.

I can think of a third.

Difficult To Use Strings

The strings in Python 3 are very difficult to use for beginners. In an attempt to make their strings more “international” they turned them into difficult to use types with poor error messages.

Why is “international” in scare quotes?

Every time you attempt to deal with characters in your programs you’ll have to understand the difference between byte sequences and Unicode strings.

Given that I’m reading part of a book teaching Python, this would be a perfect opportunity to drive this point home by saying “Look! Running exercise N in Python 3 doesn’t work.” Exercise 1, at least, works fine for me with a little extra sprinkle of parentheses:

print("Hello World!")
print("Hello Again")
print("I like typing this.")
print("This is fun.")
print('Yay! Printing.')
print("I'd much rather you 'not'.")
print('I "said" do not touch this.')

Contrast with the actual content of that exercise — at the bottom is a big red warning box telling people from “another country” (relative to where?) that if they get errors about ASCII encodings, they should put an unexplained magical incantation at the top of their scripts to fix “Unicode UTF-8”, whatever that is. I wonder if Zed has read his own book.

Don’t know what that is? Exactly.

If only there were a book that could explain it to beginners in more depth than “you have to fix this if you’re foreign”.

The Python project took a language that is very forgiving to beginners and mostly “just works” and implemented strings that require you to constantly know what type of string they are. Worst of all, when you get an error with strings (which is very often) you get an error message that doesn’t tell you what variable names you need to fix.

The complaint is that this happens in Python 3, whereas it’s accepted in Python 2:

>>> b"hello" + "hello"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't concat bytes to str

The programmer section is called “Statically Typed Strings”. But this is not static typing. That’s strong typing, a property that sets Python’s type system apart from languages like JavaScript. It’s usually considered a good thing, because the alternative is to silently produce nonsense in some cases, and then that nonsense propagates through your program and is hard to track down when it finally causes problems.

If they’re going to require beginners to struggle with the difference between bytes and Unicode the least they could do is tell people what variables are bytes and what variables are strings.

That would be nice, but it’s not like this is a new problem. Try this in Python 2.

>>> 3 + "hello"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'

How would Python even report this error when I used literals instead of variables? How could custom types hook into such a thing? Error messages are hard.

By the way, did you know that several error messages are much improved in Python 3? Python 2 is somewhat notorious for the confusing errors it produces when an argument is missing from a method call, but Python 3 is specific about the problem, which is much friendlier to beginners.
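
For instance, here is the kind of message Python 3 gives for a missing argument (a small sketch; the function is made up). Python 2 would only report an argument count mismatch, while Python 3 names the exact missing argument:

```python
def greet(name, greeting):
    return f"{greeting}, {name}!"

try:
    greet("world")
except TypeError as e:
    # Python 3 tells you which argument is missing by name:
    # greet() missing 1 required positional argument: 'greeting'
    print(e)
```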

However, when you point out that this is hard to use they try to claim it’s good for you. It is not. It’s simple blustering covering for a poor implementation.

I don’t know what about this is hard. Why do you have a text string and a bytestring in the first place? Why is it okay to refuse adding a number to a string, but not to refuse adding bytes to a string?

Imagine if one of the Python core developers were just getting into Python 2 and messing around.

# -*- coding: utf8 -*-
print "Hi, my name is Łukasz Langa."
print "Hi, my name is Łukasz Langa."[::-1]
Hi, my name is Łukasz Langa.
.agnaL zsaku�� si eman ym ,iH

Good luck figuring out how to fix that.

This isn’t blustering. Bytes are not text; they are binary data that could encode anything. They happen to look like text sometimes, and you can get away with thinking they’re text if you’re not from “another country”, but that mindset will lead you to write code that is wrong. The resulting bugs will be insidious and confusing, and you’ll have a hard time even reasoning about them because it’ll seem like “Unicode text” is somehow a different beast altogether from “ASCII text”.

Exercise 11 mentions at the end that you can use int() to convert a number to an integer. It’s no more complicated to say that you convert bytes to a string using .decode(). It shouldn’t even come up unless you’re explicitly working with binary data, and I don’t see any reading from sockets in LPTHW.
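
The conversion really is that small. A minimal sketch, which also shows why the encoding matters: the same bytes decoded under a different encoding come out as different text.

```python
data = "Łukasz".encode("utf-8")    # text -> bytes: b'\xc5\x81ukasz'
print(type(data))                  # <class 'bytes'>

text = data.decode("utf-8")        # bytes -> text
print(text)                        # Łukasz

# Decoding the same bytes with the wrong encoding doesn't error here,
# it just silently produces mojibake, which is the whole problem.
print(data.decode("latin-1"))
```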

It’s also not statically compiled as strongly as it could be, so you can’t find these kinds of type errors until you run the code.

This comes a scant few paragraphs after “Dynamic typing is what makes Python easy to use and one of the reasons I advocate it for beginners.”

You can’t find any kinds of type errors until you run the code. Welcome to dynamic typing.

Strings are also most frequently received from an external source, such as a network socket, file, or similar input. This means that Python 3’s statically typed strings and lack of static type safety will cause Python 3 applications to crash more often and have more security problems when compared with Python 2.

On the contrary — Python 3 applications should crash less often. The problem with silently converting between bytestrings and text in Python 2 is that it might fail, depending on the contents. "cafe" + u"hello" works fine, but "café" + u"hello" raises a UnicodeDecodeError. Python 2 makes it very easy to write code that appears to work when tested with ASCII data, but later breaks with anything else, even though the values are still the same types. In Python 3, you get an error the first time you try to run such code, regardless of what’s in the actual values. That’s the biggest reason for the change: it improves things from being intermittent value errors to consistent type errors.
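
A quick sketch of the difference, runnable on Python 3: the failure depends only on the types, never on what happens to be in the values, so it can't hide behind ASCII-only test data.

```python
# Python 2, conceptually:
#   "cafe" + u"hello"  -> works (implicit ASCII decode succeeds)
#   "caf\xc3\xa9" + u"hello"  -> UnicodeDecodeError, but only with non-ASCII data

# Python 3: mixing bytes and text is a TypeError every time.
for data in (b"cafe", b"caf\xc3\xa9"):
    try:
        data + "hello"
    except TypeError as e:
        print(e)   # same TypeError both times, regardless of contents
```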

More security problems? This is never substantiated, and seems to have been entirely fabricated.

Too Many Formatting Options

In addition to that you will have 3 different formatting options in Python 3.6. That means you’ll have to learn to read and use multiple ways to format strings that are all very different. Not even I, an experienced professional programmer, can easily figure out these new formatting systems or keep up with their changing features.

I don’t know what on earth “keep up with their changing features” is supposed to mean, and Zed doesn’t bother to go into details.

Python 3 has three ways to format strings: % interpolation, str.format(), and the new f"" strings in Python 3.6. The f"" strings use the same syntax as str.format(); the difference is that where str.format() uses numbers or names of keyword arguments, f"" strings just use expressions. Compare:

number = 133
print("{n:02x}".format(n=number))
print(f"{number:02x}")

This isn’t “very different”. A frequently-used method is being promoted to syntax.

I really like this new style, and I have no idea why this wasn’t the formatting for Python 3 instead of that stupid .format function. String interpolation is natural for most people and easy to explain.

The problem is that beginners will now have to know all three of these formatting styles, and that’s too many.

I could swear Zed, an experienced professional programmer, just said he couldn’t easily figure out these new formatting systems. Note also that str.format() has existed in Python 2 since Python 2.6 was released in 2008, so I don’t know why Zed said “new formatting systems”, plural.

This is a truly bizarre complaint overall, because the mechanism Zed likes best is the newest one. If Python core had agreed that three mechanisms was too many, we wouldn’t be getting f"" at all.

Even More Versions of Strings

Finally, I’m told there is a new proposal for a string type that is both bytes and Unicode at the same time? That’d be fantastic if this new type brings back the dynamic typing that makes Python easy, but I’m betting it will end up being yet another static type to learn. For that reason I also think beginners should avoid Python 3 until this new “chimera string” is implemented and works reliably in a dynamic way. Until then, you will just be dealing with difficult strings that are statically typed in a dynamically typed language.

I have absolutely no idea what this is referring to, and I can’t find anyone who does. I don’t see any recent PEPs mentioning such a thing, nor anything in the last several months on the python-dev mailing list. I don’t see it in the Python 3.6 release notes.

The closest thing I can think of is the backwards-compatibility shenanigans for PEP 528 and PEP 529 — they switch to the Windows wide-string APIs for console and filesystem encoding, but pretend under the hood that the APIs take UTF-8-encoded bytes to avoid breaking libraries like Twisted. That’s a microscopic detail that should never matter to anyone but authors of Twisted, and is nothing like a new hybrid string type, but otherwise I’m at a loss.

This paragraph really is a perfect summary of the whole article. It speaks vaguely yet authoritatively about something that doesn’t seem to exist, it doesn’t bother actually investigating the thing the entire section talks about, it conjectures that this mysterious feature will be hard just because it’s in Python 3, and it misuses terminology to complain about a fundamental property of Python that’s always existed.

Core Libraries Not Updated

Many of the core libraries included with Python 3 have been rewritten to use Python 3, but have not been updated to use its features. How could they given Python 3’s constant changing status and new features?

What “constant changing status”? The language makes new releases; is that bad? The only mention of “changing” so far was with string formatting, which makes no sense to me, because the only major change has been the addition of syntax that Zed prefers.

There are several libraries that, despite knowing the encoding of data, fail to return proper strings. The worst offender seems to be any libraries dealing with the HTTP protocol, which does indicate the encoding of the underlying byte stream in many cases.

In many cases, yes. Not in all. Some web servers don’t send back an encoding. Some files don’t have an encoding, because they’re images or other binary data. HTML allows the encoding to be given inside the document, instead. urllib has always returned bytes, so it’s not all that unreasonable to keep doing that, rather than… well, I’m not quite sure what this is proposing. Return strings sometimes?

The documentation for urllib.request and http.client both advise using the higher-level Requests library instead, in a prominent yellow box right at the top. Requests has distinct mechanisms for retrieving bytes versus text and is vastly easier to use overall, though I don’t think even it understands reading encodings from HTML. Alas, computers.

Good luck to any beginner figuring out how to install Requests on Python 2 — but thankfully, Python 3 now comes bundled with pip, which makes installing libraries much easier. Contrast with the beginning of exercise 46, which apologizes for how difficult this is to explain, lists four things to install, warns that it will be frustrating, and advises watching a video to help figure it out.

What’s even more idiotic about this is Python has a really good Chardet library for detecting the encoding of byte streams. If Python 3 is supposed to be “batteries included” then fast Chardet should be baked into the core of Python 3’s strings making it cake to translate strings to bytes even if you don’t know the underlying encoding. … Call the function whatever you want, but it’s not magic to guess at the encoding of a byte stream, it’s science. The only reason this isn’t done for you is that the Python project decided that you should be punished for not knowing about bytes vs. Unicode, and their arrogance means you have difficult to use strings.

Guessing at the encoding of a byte stream isn’t so much science as, well, guessing. Guessing means that sometimes you’re wrong. Sometimes that’s what you want, and I’m honestly ambivalent about having chardet in the standard library, but it’s hardly arrogant to not want to include a highly-fallible heuristic in your programming language.

Conclusions and Warnings

I have resisted writing about these problems with Python 3 for 5 versions because I hoped it would become usable for beginners. Each year I would attempt to convert some of my code and write a couple small tests with Python 3 and simply fail. If I couldn’t use Python 3 reliably then there’s no way a total beginner could manage it. So each year I’d attempt it, and fail, and wait until they fix it. I really liked Python and hoped the Python project would drop their stupid stances on usability.

Let us recap the usability problems seen thus far.

  • You can’t add b"hello" to "hello".
  • TypeErrors are phrased exactly the same as they were in Python 2.
  • The type system is exactly as dynamic as it was in Python 2.
  • There is a new formatting mechanism, using the same syntax as one in Python 2, that Zed prefers over the ones in Python 2.
  • urllib.request doesn’t decode for you, just like in Python 2.
  • 档牡敤㽴 isn’t built in. Oh, sorry, I meant chardet.

Currently, the state of strings is viewed as a Good Thing in the Python community. The fact that you can’t run Python 2 inside Python 3 is seen as a weird kind of tough love. The brainwashing goes so far as to outright deny the mathematics behind language translation and compilation in an attempt to motivate the Python community to brute force convert all Python 2 code.

Which is probably why the Python project focuses on convincing unsuspecting beginners to use Python 3. They don’t have a switching cost, so if you get them to fumble their way through the Python 3 usability problems then you have new converts who don’t know any better. To me this is morally wrong and is simply preying on people to prop up a project that needs a full reset to survive. It means beginners will fail at learning to code not because of their own abilities, but because of Python 3’s difficulty.

Now that we’re towards the end, it’s a good time to say this: Zed Shaw, your behavior here is fucking reprehensible.

Half of what’s written here is irrelevant nonsense backed by a vague appeal to “mathematics”. Instead of having even the shred of humility required to step back and wonder if there are complicating factors beyond whether something is theoretically possible, you have invented a variety of conflicting and malicious motivations to ascribe to the Python project.

It’s fine to criticize Python 3. The string changes force you to think about what you’re doing a little more in some cases, and occasionally that’s a pain in the ass. I absolutely get it.

But you’ve gone out of your way to invent a conspiracy out of whole cloth and promote it on your popular platform aimed at beginners, who won’t know how obviously full of it you are. And why? Because you can’t add b"hello" to "hello"? Are you kidding me? No one can even offer to help you, because instead of examples of real problems you’ve had, you gave two trivial toys and then yelled a lot about how the whole Python project is releasing mind-altering chemicals into the air.

The Python 3 migration has been hard enough. It’s taken a lot of work from a lot of people who’ve given enough of a crap to help Python evolve — to make it better to the best of their judgment and abilities. Now we’re finally, finally at the point where virtually all libraries support Python 3, a few new ones only support Python 3, and Python 3 adoption is starting to take hold among application developers.

And you show up to piss all over it, to propagate this myth that Python 3 is hamstrung to the point of unusability, because if the Great And Wise Zed Shaw can’t figure it out in ten seconds then it must just be impossible.

Fuck you.

Sadly, I doubt this will happen, and instead they’ll just rant about how I don’t know what I’m talking about and I should shut up.

This is because you don’t know what you’re talking about, and you should shut up.

Weekly roundup: Screw it

Post Syndicated from Eevee original https://eev.ee/dev/2016/11/21/weekly-roundup-screw-it/

November is a disaster and I’ve given up on it.

  • runed awakening: I found a decent DAG diagrammer and plotted out most of the puzzles and finally had visual confirmation that a few parts of the game are lacking. Haven’t yet figured out how to address that.

  • blog: I finished an article for SitePoint. Also I wrote a thing about iteration.

  • veekun: Playing Pokémon has made me interested in veekun work again, so I did some good ergonomic work on the YAML thing, which is starting to look like it might end up kind of sensible.

  • zdoom: I had a peek at my zdoom-lua branch again, dug into object lifetime, and discovered a world of pain.

Everything is still distracting, and Pokémon came out, so.

Iteration in one language, then all the others

Post Syndicated from Eevee original https://eev.ee/blog/2016/11/18/iteration-in-one-language-then-all-the-others/

You may have noticed that I like comparing features across different languages. I hope you like it too, because I’m doing it again.

Python

I’m most familiar with Python, and iteration is one of its major concepts, so it’s a good place to start and a good overview of iteration. I’ll dive into Python a little more deeply, then draw parallels to other languages.

Python only has one form of iteration loop, for. (Note that all of these examples are written for Python 3; in Python 2, some of the names are slightly different, and fewer things are lazy.)

for value in sequence:
    ...

in is also an operator, so value in sequence is also the way you test for containment. This is either very confusing or very satisfying.
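
A tiny sketch of that dual role:

```python
fruits = ["apple", "orange", "pear"]

# The same keyword drives the loop...
for fruit in fruits:
    print(fruit)

# ...and the containment test.
print("pear" in fruits)    # True
print("durian" in fruits)  # False
```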

When you need indices, or specifically a range of numbers, you can use the built-in enumerate or range functions. enumerate works with lazy iterables as well.

# This makes use of tuple unpacking to effectively return two values at a time
for index, value in enumerate(sequence):
    ...

# Note that the endpoint is exclusive, and the default start point is 0.  This
# matches how list indexing works and fits the C style of numbering.
# 0 1 2 3 4
for n in range(5):
    ...

# Start somewhere other than zero, and the endpoint is still exclusive.
# 1 2 3 4
for n in range(1, 5):
    ...

# Count by 2 instead.  Can also use a negative step to count backwards.
# 1 3 5 7 9
for n in range(1, 11, 2):
    ...

dicts (mapping types) have several methods for different kinds of iteration. Additionally, iterating over a dict directly produces its keys.

for key in mapping:
    ...

for key in mapping.keys():
    ...

for value in mapping.values():
    ...

for key, value in mapping.items():
    ...

Python distinguishes between an iterable, any value that can be iterated over, and an iterator, a value that performs the actual work of iteration. Common iterable types include list, tuple, dict, str, and set. enumerate and range are also iterable.

Since Python code rarely works with iterators directly, and many iterable types also function as their own iterators, it’s common to hear “iterator” used to mean an iterable. To avoid this ambiguity, and because the words are fairly similar already, I’ll refer to iterables as containers like the Python documentation sometimes does. Don’t be fooled — an object doesn’t actually need to contain anything to be iterable. Python’s range type is iterable, but it doesn’t physically contain all the numbers in the range; it generates them on the fly as needed.

The fundamental basics of iteration are built on these two ideas. Given a container, ask for an iterator; then repeatedly advance the iterator to get new values. When the iterator runs out of values, it raises StopIteration. That’s it. In Python, those two steps can be performed manually with the iter and next functions. A for loop is roughly equivalent to:

_iterator = iter(container)
_done = False
while not _done:
    try:
        value = next(_iterator)
    except StopIteration:
        _done = True
    else:
        ...

An iterator can only move forwards. Once a value has been produced, it’s lost, at least as far as the iterator is concerned. These restrictions are occasionally limiting, but they allow iteration to be used for some unexpected tasks. For example, iterating over an open file produces its lines — even if the “file” is actually a terminal or pipe, where data only arrives once and isn’t persistently stored anywhere.
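
You can watch an iterator run dry with the iter and next functions from above; a minimal sketch:

```python
it = iter([1, 2, 3])
print(next(it))  # 1
print(next(it))  # 2
print(next(it))  # 3

# The values are gone; the iterator can't be rewound.
try:
    next(it)
except StopIteration:
    print("exhausted")

# Iterating the spent iterator again yields nothing.
print(list(it))  # []
```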

Generators

A more common form of “only forwards, only once” in Python is the generator, a function containing a yield statement. For example:

def inclusive_range(start, stop):
    val = start
    while val <= stop:
        yield val
        val += 1

# 6 7 8 9
for n in inclusive_range(6, 9):
    ...

Calling a generator function doesn’t execute its code, but immediately creates a generator iterator. Every time the iterator is advanced, the function executes until the next yield, at which point the yielded value is returned as the next value and the function pauses. The next iteration will then resume the function. When the function returns (or falls off the end), the iterator stops.

Since the values here are produced by running code on the fly, it’s of course impossible to rewind a generator.
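
A short sketch makes the pausing visible; the prints show exactly when the function body runs:

```python
def gen():
    print("start")
    yield 1
    print("between")
    yield 2
    print("end")

g = gen()        # nothing runs yet; this only creates the iterator
print(next(g))   # prints "start", then 1
print(next(g))   # prints "between", then 2
try:
    next(g)      # prints "end", then the function returns
except StopIteration:
    print("done")
```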

The underlying protocol is straightforward. A container must have an __iter__ method that returns an iterator, corresponding to the iter function. An iterator must have a __next__ method that returns the next item, corresponding to the next function. If the iterator is exhausted, __next__ must raise StopIteration. An iterator must also have an __iter__ that returns itself — this is so an iterator can be used directly in a for loop.

The above inclusive range generator might be written out explicitly like this:

class InclusiveRange:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        return InclusiveRangeIterator(self)

class InclusiveRangeIterator:
    def __init__(self, incrange):
        self.incrange = incrange
        self.nextval = incrange.start

    def __iter__(self):
        return self

    def __next__(self):
        if self.nextval > self.incrange.stop:
            raise StopIteration

        val = self.nextval
        self.nextval += 1
        return val

This might seem like a lot of boilerplate, but note that the iterator state (here, nextval) can’t go on InclusiveRange directly, because then it’d be impossible to iterate over the same object twice at the same time. (Some types, like files, do act as their own iterators because they can’t meaningfully be iterated in parallel.)
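
Keeping the state on the iterator rather than the container is exactly what makes this work; a sketch using a generator-based version of the class (redefined here so the example stands alone):

```python
class InclusiveRange:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        val = self.start
        while val <= self.stop:
            yield val
            val += 1

r = InclusiveRange(1, 3)
a = iter(r)
b = iter(r)

# Two independent iterators over the same container don't interfere.
print(next(a), next(a))  # 1 2
print(next(b))           # 1  (b starts fresh)
print(next(a))           # 3

# Nested iteration over the same container works for the same reason.
print([(x, y) for x in r for y in r])
```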

Even Python’s internals work this way. Try iter([]) in a Python REPL; you’ll get a list_iterator object.
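Driving the protocol by hand shows all three pieces at once; here's a quick sketch using a plain list:

```python
it = iter([10, 20])

# An iterator's __iter__ returns itself, so iter() on it is a no-op
assert iter(it) is it

assert next(it) == 10
assert next(it) == 20

# Exhausted: next() now raises StopIteration
try:
    next(it)
except StopIteration:
    pass
```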

In truth, it is a lot of boilerplate. User code usually uses this trick:

class InclusiveRange:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        val = self.start
        while val <= self.stop:
            yield val
            val += 1

Nothing about this is special-cased in any way. Now __iter__ is a generator, and calling a generator function returns an iterator, so all the constraints are met. It’s a really easy way to convert a generator function into a type. If this class were named inclusive_range instead, it would even be backwards-compatible; consuming code wouldn’t even have to know it’s a class.
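Because every call to __iter__ creates a brand-new generator, the same object really can be iterated twice at the same time; a quick check against a minimal copy of the class:

```python
class InclusiveRange:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        val = self.start
        while val <= self.stop:
            yield val
            val += 1

r = InclusiveRange(1, 2)
# Nested iteration over the same object: each loop gets its own generator
pairs = [(a, b) for a in r for b in r]
assert pairs == [(1, 1), (1, 2), (2, 1), (2, 2)]
```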

Reversal

But why would you do this? One excellent reason is to add support for other sequence-like operations, like reverse iteration support. An iterator can’t be reversed, but a container might support being iterated in reverse:

fruits = ['apple', 'orange', 'pear']
# pear, orange, apple
for value in reversed(fruits):
    ...

Iterating a lazy container doesn’t always make sense, but when it does, it’s easy to implement by returning an iterator from __reversed__.

class InclusiveRange:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        val = self.start
        while val <= self.stop:
            yield val
            val += 1

    def __reversed__(self):
        val = self.stop
        while val >= self.start:
            yield val
            val -= 1

Note that Python does not have “bi-directional” iterators, which can freely switch between forwards and reverse iteration on the fly. A bidirectional iterator is useful for cases like doubly-linked lists, where it’s easy to get from one value to the next or previous value, but not as easy to start from the beginning and get the tenth item.

Iteration is often associated with sequences, though they’re not quite the same. In Python, a sequence is a value that can be indexed in order as container[0], container[1], etc. (Indexing is implemented with __getitem__.) All sequences are iterable; in fact, if a type implements indexing but not __iter__, the iter function will automatically try indexing it from zero instead. reversed does the same, though it requires that the type implement __len__ as well so it knows what the last item is.

Much of this is codified more explicitly in the abstract base classes in collections.abc, which also provide default implementations of common methods.

Not all iterables are sequences, and not every value that can be indexed is a sequence! Python’s mapping type, dict, uses indexing to fetch the value for a key; but a dict has no defined order and is not a sequence. However, a dict can still be iterated over, producing its keys (in arbitrary order). A set can be iterated over, producing its values in arbitrary order, but it cannot be indexed at all. A type could conceivably use indexing for something more unusual and not be iterable at all.

A common question

It’s not really related to iteration, but people coming to Python from Ruby often ask why len() is a built-in function, rather than a method. The same question could be asked about iter() and next() (and other Python builtins), which more or less delegate directly to a “reserved” __dunder__ method anyway.

I believe the technical reason is simply the order that features were added to the language in very early days, which is not very interesting.

The philosophical reason, imo, is that Python does not reserve method names for fundamental operations. All __dunder__ names are reserved, of course, but everything else is fair game. This makes it obvious when a method is intended to add support for some language-ish-level operation, even if you don’t know what all the method names are. Occasionally a third-party library invents its own __dunder__ name, which is a little naughty, but the same reasoning applies: “this is a completely generic interface that some external mechanism is expected to use”.

This approach also avoids a namespacing problem. In Ruby, a Rectangle class might want to have width and length attributes… but the presence of length means a Rectangle looks like it functions as a sequence! Since “interface” method names aren’t namespaced in any way, there is no way to say that you don’t mean the same thing as Array.length.

It’s a minor quibble, since everything’s dynamically typed anyway, so the real solution is “well don’t try to iterate a rectangle then”. And Python does use keys as a method name in some obscure cases. Oh, well.

Some cute tricks

The distinction between sequences and iterables can cause some subtle problems. A lot of code that only needs to loop over items can be passed, e.g., a generator. But this can take some conscious care. Compare:

# This will NOT work with generators, which don't support len() or indexing
for i in range(len(container)):
    value = container[i]
    ...

# But this will
for i, value in enumerate(container):
    ...

enumerate also has a subtle, unfortunate problem: it cannot be combined with reversed. This has bit me more than once, surprisingly.

# This produces a TypeError from reversed()
for i, value in reversed(enumerate(container)):
    ...

# This almost works, but the index goes forwards while the values go backwards
for i, value in enumerate(reversed(container)):
    ...

The problem is that enumerate can’t, in general, reverse itself. It counts up from zero as it iterates over its argument; reversing it means starting from one less than the number of items, but it doesn’t yet know how many items there are. But if you just want to run over a list or other sequence backwards, this feels very silly. A trivial helper can make it work:

def revenum(iterable, end=0):
    start = len(iterable) + end
    for value in reversed(iterable):
        start -= 1
        yield start, value

I’ve run into other odd cases where it’s frustrating that a generator doesn’t have a length or indexing. This especially comes up if you make heavy use of generator expressions, which are a very compact way to write a one-off generator. (Python also has list, set, and dict “comprehensions”, which have the same syntax but use brackets or braces instead of parentheses, and are evaluated immediately instead of lazily.)

def get_big_fruits():
    fruits = ['apple', 'orange', 'pear']
    return (fruit.upper() for fruit in fruits)

# Roughly equivalent to:
def get_big_fruits():
    fruits = ['apple', 'orange', 'pear']
    def genexp():
        for fruit in fruits:
            yield fruit.upper()
    return genexp()

If you had thousands of fruits, doing this could save a little memory. The caller is probably just going to loop over them to print them out (or whatever), so using a generator expression means that each uppercase name only exists for a short time; returning a list would mean creating a lot of values all at once.

Ah, but now the caller wants to know how many fruits there are, with minimal fuss. Generators have no length, so that won’t work. Turning this generator expression into a class that also has a __len__ would be fairly ridiculous. So you resort to some slightly ugly trickery.

# Ugh.  Obvious, but feels really silly.
count = 0
for value in container:
    count += 1

# Better, but weird if you haven't seen it before.  Creates another generator
# expression that just yields 1 for every item, then sums them up.
count = sum(1 for _ in container)

Or perhaps you want the first big fruit? Well, [0] isn’t going to help. This is one of the few cases where using iter and next directly can be handy.

# Oops!  If the container is empty, this raises StopIteration, which you
# probably don't want.
first = next(iter(container))

# Catch the StopIteration explicitly.
try:
    first = next(iter(container))
except StopIteration:
    # This code runs if there are zero items
    ...

# Regular loop that terminates immediately.
# The "else" clause only runs when the container ends naturally (i.e. NOT if
# the loop breaks), which can only happen here if there are zero items.
for value in container:
    first = value
    break
else:
    ...

# next() -- but not __next__()! -- takes a second argument indicating a
# "default" value to return when the iterator is exhausted.  This only makes
# sense if you were going to substitute a default value anyway; doing this and
# then checking for None will do the wrong thing if the container actually
# contained a None.
first = next(iter(container), None)

Other tricks with iter and next include skipping the first item (or any number of initial items, though consider itertools.islice for more complex cases):

it = iter(container)
next(it, None)  # Use second arg to ignore StopIteration
for value in it:
    # Since the first item in the iterator has already been consumed, this loop
    # will start with the second item.  If the container had only one or zero
    # items, the loop will get StopIteration and end immediately.
    ...
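The itertools.islice mentioned above generalizes this trick; a quick sketch of skipping and slicing lazily:

```python
import itertools

# Skip the first three items of a lazy iterable, then take the next four.
# islice never calls len(); it just consumes the underlying iterator.
it = (n * n for n in range(100))
assert list(itertools.islice(it, 3, 7)) == [9, 16, 25, 36]

# islice(it, 3, None) would instead skip three items and yield all the rest.
```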

Iterating two (or more) items at a time:

# Obvious way: call next() inside the loop.
it = iter(container)
for value1 in it:
    # With an odd number of items, this will raise an uncaught StopIteration!
    # Catch it or provide a default value.
    value2 = next(it)
    ...

# Moderately clever way: abuse zip().
# zip() takes some number of containers and iterates over them pairwise.  It
# stores an iterator for each container.  When it's asked for its next item, it
# in turn asks all of its iterators for their next items, and returns them as a
# tuple.  But by giving it the same exact iterator twice, it'll end up advancing
# that iterator twice and returning two consecutive items.
# Note that zip() stops early as soon as an iterator runs dry, so if the
# container has an odd number of items, this will silently skip the last one.
# If you don't want that, use itertools.zip_longest instead.
it = iter(container)
for line1, line2 in zip(it, it):
    ...

# Far too clever way: exactly the same as above, but written as a one-liner.
# zip(iter(container), iter(container)) would create two separate iterators and break the trick.
# List multiplication produces a list containing the same iterator twice.
# One advantage of this is that the 2 can be a variable.
for value1, value2 in zip(*[iter(container)] * 2):
    ...

Wow, that got pretty weird towards the end. Somehow this turned into Stupid Python Iterator Tricks. Don’t worry; I know far less about these other languages.

C

C is an extreme example with no iterator protocol whatsoever. It barely even supports sequences; arrays are just pointer math. All it has is the humble C-style for loop:

int[] container = {...};
for (int i = 0; i < container_length; i++) {
    int value = container[i];
    ...
}

Unfortunately, it’s really the best C can do. C arrays don’t know their own length, so no matter what, the developer has to provide it some other way. Even without that, a built-in iterator protocol is impossible — iterators require persistent state (the current position) to be bundled alongside code (how to get to the next position). That pretty much means one of two things: closures or objects. C has neither.

Lua

Lua has two forms of for loop. The first is a simple numeric loop.

-- 1 3 5 7 9 11
for value = 1, 11, 2 do
    ...
end

The three values after the = are the start, end, and step. They work similarly to Python’s range(), except that everything in Lua is always inclusive, so for i = 1, 5 will count from 1 to 5.

The generic form uses in.

for value in iterate(container) do
    ...
end

iterate isn’t a special name here, but most of the time a generic for will look like this.

See, Lua doesn’t have objects. It has enough tools that you can build objects fairly easily, but the core language has no explicit concept of objects or method calls. An iterator protocol needs to bundle state and behavior somehow, so Lua uses closures for that. But you still need a way to get that closure, and that means calling a function, and a plain value can’t have functions attached to it. So iterating over a table (Lua’s single data structure) looks like this:

for key, value in pairs(container) do
    ...
end

pairs is a built-in function. Lua also has an ipairs, which iterates over consecutive keys and values starting from key 1. (Lua starts at 1, not 0. Lua also represents sequences as tables with numeric keys.)

Lua does have a way to associate “methods” with values, which is how objects are made, but for loops almost certainly came first. So iteration is almost always over a function call, not a bare value.

Also, because objects are built out of tables, having a default iteration behavior for all tables would mean having the same default for all objects. Nothing’s stopping you from using pairs on an object now, but at least that looks deliberate. It’s easy enough to give objects iteration methods and iterate over obj:iter(), though it’s slightly unfortunate that every type might look slightly different. Unfortunately, Lua has no truly generic interface for “this can produce a sequence of values”.

The iteration protocol is really just calling a function repeatedly to get new values. When the function returns nil, the iteration ends. (That means nil can never be part of an iteration! You can work around this by returning two values and making sure the first one is something else that’s never nil, like an index.) The manual explains the exact semantics of the generic for with Lua code, a move I wish every language would make.

-- This:
for var_1, ···, var_n in explist do block end

-- Is equivalent to this:
do
    local _func, _state, _lastval = explist
    while true do
        local var_1, ···, var_n = _func(_state, _lastval)
        if var_1 == nil then break end
        _lastval = var_1
        block
    end
end

Important to note here is the way multiple-return works in Lua. Lua doesn’t have tuples; multiple assignment is a distinct feature of the language, and multiple return works exactly the same way as multiple assignment. If there are too few values, the extra variables become nil; if there are too many values, the extras are silently discarded.

So in the line local _func, _state, _lastval = explist, the “state” value _state and the “last loop value” _lastval are both optional. Lua doesn’t use them, except to pass them back to the iterator function _func, and they aren’t visible to the for loop body. An iterator can thus be only a function and nothing else, letting _state and _lastval be nil — but they can be a little more convenient at times. Compare:

-- Usual approach: return only a closure, completely ignoring state and lastval
local function inclusive_range(start, stop)
    local nextval = start
    return function()
        if nextval > stop then
            return
        end
        local val = nextval
        nextval = nextval + 1
        return val
    end
end

-- Alternative approach, not using closures at all.  This is the function we
-- return; each time it's called with the same "state" value and whatever it
-- returned last time it was called.
-- This function could even be written exactly like a method (a la Python's
-- __next__), where the state value is the object itself.
local function inclusive_range_iter(stop, prev)
    -- "stop" is the state value; "prev" is the last value we returned
    local val = prev + 1
    if val > stop then
        return
    end
    return val
end
local function inclusive_range(start, stop)
    -- Return the iterator function, and pass it the stop value as its state.
    -- The "last value" is a little weird here; on the first iteration, there
    -- is no last value.  Here we can fake it by subtracting 1 from the
    -- starting number, but in other cases, it might make more sense if the
    -- "state" were a table containing both the start and stop values.
    return inclusive_range_iter, stop, start - 1
end

-- 6 7 8 9 with both implementations
for n in inclusive_range(6, 9) do
    ...
end

Lua doesn’t have generators. Surprisingly, it has fully-fledged coroutines — call stacks that can be paused at any time. Lua sometimes refers to them as “threads”, but only one can be running at a time. Effectively they’re like Python generators, except you can call a function which calls a function which calls a function which eventually yields, and the entire call stack from that point up to the top of the coroutine is paused and preserved.

In Python, the mere presence of yield causes a function to become a generator. In Lua, since any function might try to yield the coroutine it’s currently in, a function has to be explicitly called as a coroutine using functions in the coroutine library.
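For contrast, a Python generator can only yield from its own body; a nested call has to delegate explicitly with yield from (a quick sketch with hypothetical `inner`/`outer` generators):

```python
def inner():
    yield 1
    yield 2

def outer():
    yield 0
    # Merely calling inner() would produce an unused generator object;
    # 'yield from' is required to pass its values through
    yield from inner()

assert list(outer()) == [0, 1, 2]
```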

But this post is about iterators, not coroutines. Coroutines don’t function as iterators, but Lua provides a coroutine.wrap() that takes a function, turns it into a coroutine, and returns a function that resumes the coroutine. That’s enough to allow a coroutine to be turned into an iterator. The Lua book even has a section about this.

local function inclusive_range(start, stop)
    local val = start
    while val <= stop do
        coroutine.yield(val)
        val = val + 1
    end
end
-- Unfortunately, coroutine.wrap() doesn't have any way to pass initial
-- arguments to the function it wraps, so we need this dinky wrapper.
-- I should clarify that the ... here is literal syntax for once.
local function iter_coro(entry_point, ...)
    local args = {...}
    return coroutine.wrap(function()
        entry_point(unpack(args))
    end)
end

-- 6 7 8 9
for n in iter_coro(inclusive_range, 6, 9) do
    ...
end

So, that’s cool. Lua doesn’t do a lot for you — unfortunately, list processing tricks can be significantly more painful in Lua — but it has some pretty interesting primitives that compose with each other remarkably well.

Perl 5

Perl has a very straightforward C-style for loop, which looks and works exactly as you might expect. my, which appears frequently in these examples, is just local variable declaration.

for (my $i = 0; $i < 10; $i++) {
    ...
}

Nobody uses it. Everyone uses the iteration-style for loop. (It’s occasionally called foreach, which is extra confusing because both for and foreach can be used for both kinds of loop. Nobody actually uses the foreach keyword.)

for my $value (@container) {
    ...
}

The iteration loop can be used for numbers, as well, since Perl has a .. inclusive range operator. For iterating over an array with indexes, Perl has the slightly odd $#array syntax, which is the index of the last item in @array. Creating something like Python’s enumerate is a little tricky in Perl, because you can’t directly return a list of lists, and the workaround doesn’t support unpacking. It’s complicated.

for my $i (1..10) {
    ...
}

for my $index (0..$#array) {
    my $value = $array[$index];
    ...
}

A hash (Perl’s mapping “shape”) can’t be iterated directly. Or, well, it can, but the loop will alternate between keys and values because Perl is weird. Instead you need the keys or values built-in functions to get the keys or values as regular lists. (These functions also work on arrays as of Perl 5.12.)

for my $key (keys %container) {
    ...
}

For iterating over both keys and values at the same time, Perl has an each function. The behavior is a little weird, since every call to the function advances an internal iterator inside the hash and returns a new pair. If a loop using each terminates early, the next use of each may silently start somewhere in the middle of the hash, skipping a bunch of its keys. This is probably why I’ve never seen each actually used.

while (my ($key, $value) = each %container) {
    ...
}

Despite being very heavily built on the concept of lists, Perl doesn’t have an explicit iterator protocol, and its support for lazy iteration in general is not great. When they’re used at all, lazy iterators tend to be implemented as ad-hoc closures or callable objects, which require a while loop:

my $iter = custom_iterator($collection);
while (my $value = $iter->()) {
    ...
}

Here be dragons

It is possible to sorta-kinda fake an iterator protocol. If you’re not familiar, Perl’s variables come in several different “shapes” — hash, array, scalar — and it’s possible to “tie” a variable to a backing object which defines the operations for a particular shape. It’s a little like operator overloading, except that Perl also has operator overloading and it’s a completely unrelated mechanism. In fact, you could use operator overloading to make your object return a tied array when dereferenced as an array. I am talking gibberish now.

Anyway, the trick is to tie an array and return a new value for each consecutive fetch of an index. Like so:

use v5.12;
package ClosureIterator;

# This is the tie "constructor" and just creates a regular object to store
# our state
sub TIEARRAY {
    my ($class, $closure) = @_;
    my $self = {
        closure => $closure,
        nextindex => 0,
    };
    return bless $self, $class;
}

# This is called to fetch the item at a particular index; for an iterator,
# only the next item is valid
sub FETCH {
    my ($self, $index) = @_;

    if ($index == 0) {
        # Always allow reading index 0, both to mean a general "get next
        # item" and so that looping over the same array twice will work as
        # expected
        $self->{nextindex} = 0;
    }
    elsif ($index != $self->{nextindex}) {
        die "ClosureIterator does not support random access";
    }

    $self->{nextindex}++;
    return $self->{closure}->();
}

# The built-in shift() function means "remove and return the first item", so
# it's a good fit for a general "advance iterator"
sub SHIFT {
    my ($self) = @_;
    $self->{nextindex} = 0;
    return $self->{closure}->();
}

# Yes, an array has to be able to report its own size...  but luckily, a for
# loop fetches the size on every iteration!  As long as this returns
# increasingly large values, such a loop will continue indefinitely
sub FETCHSIZE {
    my ($self) = @_;
    return $self->{nextindex} + 1;
}

# Most other tied array operations are for modifying the array, which makes no
# sense here.  They're deliberately omitted, so trying to use them will cause a
# "can't locate object method" error.


package main;

# Create an iterator that yields successive powers of 2
tie my @array, 'ClosureIterator', sub {
    # State variables are persistent, like C statics
    state $next = 1;
    my $ret = $next;
    $next *= 2;
    return $ret;
};

# This will print out 1, 2, 4, 8, ... 1024, at which point the loop breaks
for my $i (@array) {
    say $i;
    last if $i > 1000;
}

This transparently works like any other array… sort of. You can loop over it (forever!); you can use shift to pop off the next value; you can stop a loop and then continue reading from it later.

Unfortunately, this is just plain weird, even for Perl, and I very rarely see it used. Ultimately, Perl’s array operations come in a set, and this is an array that pretends not to be able to do half of them. Even Perl developers are likely to be surprised by an array, a fundamental “shape” of the language, with quirky behavior.

The biggest problem is that, as I said, Perl is heavily built on lists. Part of that design is that @arrays are very eager to spill their contents into a surrounding context. Naïvely passing an array to a function, for example, will expand its elements into separate arguments, losing the identity of the array itself (and losing any tied-ness). Interpolating an array into a string automatically space-separates its elements.

Unlike a for loop, these operations only ask the array for its size once — so rather than printing an infinite sequence, they’ll print a completely arbitrary prefix of it. In the case above, spilling a fresh array will read one item; spilling the array after the example loop will read eleven items. So while a tied array works nicely with a for loop, it’s at odds with the most basic rules of Perl syntax.

Also, Perl’s list-based nature means it’s attracted a lot of list-processing utilities — but these naturally expect to receive a spilled list of arguments and cannot work with a lazy iterator.

I found multiple mentions of the List::Gen module while looking into this. I’d never heard of it before and I’ve never seen it used, but it tries to fill this gap (and makes use of array tying, among other things). It’s a bit weird, and its source code is extremely weird, and it took me twenty minutes to figure out how it was using <...> as a quoting construct.

(<...> in Perl does filename globbing, so it’s usually seen as <*.txt>. The same syntax is used for reading from a filehandle, which makes this confusing and ambiguous, so it’s generally discouraged in favor of the built-in glob function which does the same thing. Well, it turns out that <...> must just call glob() at Perl-level, because List::Gen manages to co-opt this syntax simply by exporting its own glob function. Perl is magical.)

Perl 6

Perl 6, a mad experiment to put literally every conceivable feature into one programming language, naturally has a more robust concept of iteration.

At first glance, many of the constructs are similar to those of Perl 5. The C-style for loop still exists for some reason, but has been disambiguated under the loop keyword.

loop (my $i = 1; $i <= 10; $i++) {
    ...
}

# More interestingly, loop can be used completely bare for an infinite loop
loop {
    ...
}

The for block has slightly different syntax and a couple new tricks.

# Unlike in Perl 5, $value is automatically declared and scoped to the block,
# without needing an explicit 'my'
for @container -> $value {
    ...
}

for 1..10 -> $i {
    ...
}

# This doesn't iterate in pairs; it reads two items at a time from a flat list!
for 1..10 -> $a, $b {
    ...
}

Not apparent in the above code is that ranges are lazy in Perl 6, as in Python; the elements are computed on demand. In fact, Perl 6 supports a range like 1..Inf.

Loop variables are also aliases. By default they’re read-only, so this appears to work like Python… but Perl has always had a C-like language-level notion of “slots” that Python does not, and it becomes apparent if the loop variable is made read-write:

my @fruits = «apple orange pear»;
for @fruits -> $fruit is rw {
    # This is "apply method inplace", i.e. shorthand for:
    # $fruit = $fruit.uc;
    # Yes, you can do that.
    $fruit .= uc;
}
say @fruits;  # APPLE ORANGE PEAR

For iterating with indexes, there’s a curious idiom:

# ^Inf is shorthand for 0..Inf, read as "up to Inf".
# Z is the zip operator, which interleaves its arguments' elements into a
# single flat list.
# This makes use of the "two at a time" trick from above.
for ^Inf Z @array -> $index, $value {
    ...
}

Iterating hashes is somewhat simpler; hashes have methods, and the .kv method returns the keys and values. (It actually returns them in a flat list interleaved, which again uses “two at a time” syntax. If you only use a single loop variable, your loop iterations will alternate between a key and a value. Iterating a hash directly produces pairs, which are a first-class data type in Perl 6, but I can’t find any syntax for directly unpacking a pair within a loop header.)

for %container.kv -> $key, $value {
    ...
}

# No surprises here
for %container.keys -> $key {
    ...
}
for %container.values -> $value {
    ...
}

Perl 6 is very big on laziness, which is perhaps why it took fifteen years to see a release. It has the same iterable versus iterator split as Python. Given a container (iterable), ask for an iterator; given an iterator, repeatedly ask for new values. When the iterator is exhausted, it returns the IterationEnd sentinel. Exactly the same ideas. I’m not clear on the precise semantics of the for block and can’t find a simple reference, but they’re probably much like Python’s… plus a thousand special cases.

Generators, kinda

Perl 6 also has its own version of generators, though with a few extra twists. Curiously, generators are a block called gather, rather than a kind of function — this means that a one-off gather is easier to create, but a gather factory must be explicitly wrapped in a function. gather can even take a single expression rather than a block, so there’s no need for separate “generator expression” syntax as in Python.

sub inclusive-range($start, $stop) {
    return gather {
        my $val = $start;
        while $val <= $stop {
            take $val;
            $val++;
        }
    };
}

# 6 7 8 9
for inclusive-range(6, 9) -> $n {
    ...
}

Unlike Python’s yield, Perl 6’s take is dynamically scoped — i.e., take can be used anywhere in the call stack, and it will apply to the most recent gather caller. That means arbitrary-depth coroutines, which seems like a big deal to me, but the documentation mentions it almost as an afterthought.

The documentation also says “gather/take can generate values lazily, depending on context,” but neglects to clarify how context factors in. The code I wrote above turns out to be lazy, but this ambiguity inclines me to use the explicit lazy marker everywhere.

Ultimately it’s a pretty flexible feature, but has a few quirks that make it a bit clumsier to use as a straightforward generator. Given that the default behavior is an eagerly-evaluated block, I think the original intention was to avoid the slightly unsatisfying pattern of “push onto an array every iteration through a loop” — instead you can now do this:

my @results = gather {
    for @source-data -> $datum {
        next unless some-test($datum);
        take process($datum);
    }
};

Using a simple (syntax-highlighted!) take puts the focus on the value being taken, rather than the details of putting it where it wants to go and how it gets there. It’s an interesting idea and I’m surprised I’ve never seen it demonstrated this way.

With gather and some abuse of Perl’s exceptionally compactable syntax, I can write a much shorter version of the infinite Perl 5 iterator above.

my @powers-of-two = lazy gather take (state $n = 1) *= 2 for ^Inf;

# Binds to $_ by default
for @powers-of-two {
    # Method calls are on $_ by default
    .say;
    last if $_ > 1000;
}

It’s definitely shorter, I’ll give it that. Leaving off the lazy in this case causes an infinite loop as Perl tries to evaluate the entire list; using a $ instead of a @ produces a “Cannot .elems a lazy list” error; using $ without lazy prints a ...-terminated representation of the infinite list and then hangs forever. I don’t quite understand the semantics of stuffing a list into a scalar ($) variable in Perl 6, and to be honest the list/array semantics seem to be far more convoluted than Perl 5, so I have no idea what’s going on here. Perl 6 has a lot of fascinating toys that are very easy to use incorrectly.

Nuts and bolts

Iterables and iterators are encoded explicitly as the Iterable and Iterator roles. An Iterable has an .iterator method that should return an Iterator. An Iterator has a .pull-one method that returns the next value, or the IterationEnd sentinel when the iterator is exhausted. Both roles offer several other methods, but they have suitable default implementations.

inclusive-range might be transformed into a class thusly:

class InclusiveRangeIterator does Iterator {
    has $.range is required;
    has $!nextval = $!range.start;

    method pull-one() {
        if $!nextval > $!range.stop {
            return IterationEnd;
        }

        # Perl people would probably phrase this:
        # ++$!nextval
        # and they are wrong.
        my $val = $!nextval;
        $!nextval++;
        return $val;
    }
}

class InclusiveRange does Iterable {
    has $.start is required;
    has $.stop is required;

    # Don't even ask
    method new($start, $stop) {
        self.bless(:$start, :$stop);
    }

    method iterator() {
        InclusiveRangeIterator.new(range => self);
    }
}

# 6 7 8 9
for InclusiveRange.new(6, 9) -> $n {
    ...
}

Can we use gather to avoid the need for an extra class, just as in Python? We sure can! The only catch is that Perl 6 iterators don’t also pretend to be iterables (remember, in Python, iter(it) should produce it), so we need to explicitly return a gather block’s iterator.

class InclusiveRange does Iterable {
    has $.start is required;
    has $.stop is required;

    # Don't even ask
    method new($start, $stop) {
        self.bless(:$start, :$stop);
    }

    method iterator() {
        gather {
            my $val = $!start;
            while $val <= $!stop {
                take $val;
                $val++;
            }
        }.iterator;  # <- this is important
    }
}

For sequences, Perl 6 has the Seq type. Curiously, even an infinite lazy gather is still a Seq. Indexing and length are not part of Seq — both are implemented as separate methods.

Curiously, even though Perl 6 became much stricter overall, the indexing methods don’t seem to be part of a role; you need only define them, much like Python’s __dunder__ methods. In fact, in the preceding examples, does Iterator isn’t necessary at all; the for block will blindly try to call an iterator method and doesn’t much care where it came from.

I’m sure there are plenty of cute tricks possible with Perl 6, but, er, I’ll leave those as an exercise for the reader.

Ruby

Ruby is a popular and well-disguised Perl variant, if Perl just went completely all-in on Smalltalk. It has no C-style for, but it does have an infinite loop block and a very Python-esque for:

for value in sequence do
    ...
end

Nobody uses this. No, really, the core language documentation outright says:

The for loop is rarely used in modern ruby programs.

Instead, you’ll probably see this:

sequence.each do |value|
    ...
end

It doesn’t look it, but this is completely backwards from everything seen so far. All of these other languages have used external iterators, where an object is repeatedly asked to produce values and calling code can do whatever it wants with them. Here, something very different is happening. The entire do ... end block acts as a closure whose argument is value; it’s passed to the each method, which calls it once for each value in the sequence. This is an internal iterator.

“Pass a block to a function which can then call it a lot” is a built-in syntactic feature of Ruby, so these kinds of iterators are fairly common. The upside is that they look almost like a custom block, so they fit naturally with the language. The downside is that all of these block-accepting methods are implemented on Array, rather than as generic functions: bsearch, bsearch_index, collect, collect!, combination, count, cycle, delete, delete_if, drop_while, each, each_index, fetch, fill, find_index, index, keep_if, map, map!, permutation, product, reject, reject!, repeated_combination, repeated_permutation, reverse_each, rindex, select, select!, sort, sort!, sort_by!, take_while, uniq, uniq!, zip. Some of those, as well as a number of additional methods, are provided by the Enumerable mixin which can express them in terms of each. I suppose the other upside is that any given type can provide its own more efficient implementation of these methods, if it so desires.
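As a sketch of that last point (the Pair class and its values are my own invention): define each, mix in Enumerable, and a pile of those methods appear, each implemented in terms of each.

```ruby
# A minimal collection type.  Enumerable only requires `each`; in exchange
# it provides map, select, include?, sort, and many of the methods above.
class Pair
    include Enumerable

    def initialize(a, b)
        @a = a
        @b = b
    end

    def each
        yield @a
        yield @b
    end
end

pair = Pair.new(3, 7)
puts pair.map { |n| n * 10 }.inspect  # [30, 70]
puts pair.include?(7)                 # true
```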

I guess that huge list of methods answers most questions about how to iterate over indices or in reverse. The only bit missing is that .. range syntax exists in Ruby as well, and it produces Range objects which also have an each method. If you don’t care about each index, you can also use the cute 3.times method.
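For instance (values mine):

```ruby
# Ranges are plain objects with their own each method...
(6..9).each { |n| print n, " " }   # 6 7 8 9
puts

# ...and Integer#times covers the "just do this n times" case.
3.times { |i| print i, " " }       # 0 1 2
puts
```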

Ruby blocks are a fundamental part of the language and built right into the method-calling syntax. Even break is defined in terms of blocks, and it works with an argument!

# This just doesn't feel like it should work, but it does.  Prints 17.
# Braces are conventionally used for inline blocks, but do/end would work too.
primes = [2, 3, 5, 7, 11, 13, 17, 19]
puts primes.each { |p| break p if p > 16 }

each() doesn’t need to do anything special here; break will just cause its return value to be 17. Somehow. (Honestly, this is the sort of thing that makes me wary of Ruby; it seems so ad-hoc and raises so many questions. A language keyword that changes the return value of a different function? Does the inside of each() know about this or have any control over it? How does it actually work? Is there any opportunity for cleanup? I have no idea, and the documentation doesn’t seem to think this is worth commenting on.)

Using blocks

Anyway, with block-passing as a language feature, the “iterator protocol” is pretty straightforward: just write a method that takes a block.

def each
    for value in self do
        yield value
    end
end

Be careful! Though it’s handy for iteration, that yield is not the same as Python’s yield. Ruby’s yield calls the passed-in block — yields control to the caller — with the given value(s).
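A freestanding sketch of the difference (method name mine): yield invokes the block attached to the current method call, and block_given? reports whether there is one.

```ruby
def twice
    return "no block!" unless block_given?
    yield 1   # calls the attached block with 1...
    yield 2   # ...and again with 2; execution is never paused
end

twice { |n| puts n }   # prints 1, then 2
puts twice             # no block attached: prints "no block!"
```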

I pulled a dirty trick there, because I expressed each in terms of for. So how does for work? Well, ah, it just delegates to each. Oops!

How, then, do you write an iterator completely from scratch? The obvious way is to use yield repeatedly. That gives you something that looks rather a lot like Python, though it doesn’t actually pause execution.

class InclusiveRange
    # This gets you a variety of other iteration methods, all defined in
    # terms of each()
    include Enumerable

    def initialize(start, stop)
        @start = start
        @stop = stop
    end
    def each
        val = @start
        while val <= @stop do
            yield val
            val += 1
        end
    end
end

# 6 7 8 9
# A `for` loop would also work here
InclusiveRange.new(6, 9).each do |n|
    ...
end

Enumerators

Well, that’s nice for creating a whole collection type, but what if I want an ad-hoc custom iterator? Enter the Enumerator class, which allows you to create… ah, enumerators.

Note that the relationship between Enumerable and Enumerator is not the same as the relationship between “iterable” and “iterator”. Most importantly, neither is really an interface. Enumerable is a set of common iteration methods that any collection type may want to have, and it expects an each to exist. Enumerator is a generic collection type, and in fact mixes in Enumerable. Maybe I should just show you some code.

def inclusive_range(start, stop)
    Enumerator.new do |y|
        val = start
        while val <= stop do
            y.yield val
            val += 1
        end
    end
end

# 6 7 8 9
inclusive_range(6, 9).each do |n|
    puts n
end

Enumerator turns a block into a fully-fledged data stream. The block is free to do whatever it wants, and whenever it wants to emit a value, it calls y.yield value. The y argument is a “yielder” object, an opaque magic type; y.yield is a regular method call, unrelated to the yield keyword. (y << value is equivalent; << is Ruby’s “append” operator. And also, yes, bit shift.)

The amazing bit is that you can do this:

# 6
puts inclusive_range(6, 9).first

Enumerator has all of the Enumerable methods, one of which is first. So, that’s nice.

The really amazing bit is that if you stick some debugging code into the block passed to Enumerator.new, you’ll find that… the values are produced lazily. That call to first() doesn’t generate the full sequence and then discard everything after the first item; it only generates the first item, then stops.
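A sketch of that observation, using a side-effect log in place of debugging prints (names mine):

```ruby
produced = []

lazy_stream = Enumerator.new do |y|
    (1..1000).each do |i|
        produced << i   # record what actually gets generated
        y.yield i
    end
end

puts lazy_stream.first    # 1
puts produced.inspect     # [1] -- nothing past the first item was generated
```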

(Beware! The values are produced lazily, but many Enumerable methods are eager. I’ll get back to this in a moment.)

Hang on, didn’t I say yield doesn’t pause execution? Didn’t I also say the above yield is just a method call, not the keyword?

I did! And I wasn’t lying. The really truly amazing bit, which I’ve seen shockingly little excitement about while researching this, is that under the hood, this is all using Fibers. Coroutines.

Enumerator.new takes a block and turns it into a coroutine. Every time something wants a value from the enumerator, it resumes the coroutine. The yielder object’s yield method then calls Fiber.yield() to pause the coroutine. It works just like Lua, but it’s designed to work with existing Ruby conventions, like the piles of internal iteration methods developers expect to find.

So Enumerator.new can produce Python-style generators, albeit in a slightly un-native-looking way. There’s also one other significant difference: an Enumerator can restart itself for each method called on it, simply by calling the block again. This code will print 6 three times:

ir = inclusive_range(6, 9)
puts ir.first
puts ir.first
puts ir.first

For something like an inclusive range object, that’s pretty nice. For something like a file, maybe not so nice. It also means you need to be sure to put your setup code inside the block passed to Enumerator.new, or funny things will happen when the block is restarted.
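A toy illustration of that gotcha (names mine): state created outside the block survives restarts, while state created inside does not.

```ruby
n = 5
shared_state = Enumerator.new do |y|
    loop { n += 1; y.yield n }   # n outlives each restart of the block
end

fresh_state = Enumerator.new do |y|
    m = 5                        # recreated every time the block restarts
    loop { m += 1; y.yield m }
end

puts shared_state.first   # 6
puts shared_state.first   # 7 -- the outer counter survived the restart!
puts fresh_state.first    # 6
puts fresh_state.first    # 6 -- restarts cleanly
```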

Something like generators

But wait, there’s more. Specifically, this common pattern, which pretty much lets you ignore Enumerator.new entirely.

def some_iterator_method
    # __method__ is the current method name.  block_given? is straightforward.
    return enum_for(__method__) unless block_given?

    # An extremely accurate simulation of a large list.
    (1..1000).each do |item|
        puts "having a look at #{item}"
        # Blocks are invisible to `yield`; this will yield to the block passed
        # to some_iterator_method.
        yield item if item.even?
    end
end

# having a look at 1
# having a look at 2
# 2
puts some_iterator_method.first

Okay, bear with me.

First, some_iterator_method() is called. It doesn’t have a block attached, so block_given? is false, and it returns enum_for(...), whatever that does. Then first() is called on the result, and that produces a single element and stops.

The above code has no magic yielder object. It uses the straightforward yield keyword. Why doesn’t it loop over the entire range from 1 to 1000?

Remember, Enumerator uses coroutines under the hood. One neat thing coroutines can do is pause code that doesn’t know it’s in a coroutine. Python’s generators pause themselves with yield, and the mere presence of yield turns a function into a generator; but in Lua or Ruby or any other language with coroutines, any function can pause at any time. You can even make a closure that pauses, then pass that closure to another function which calls it, without that function ever knowing anything happened.
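Here’s a minimal sketch of that with Ruby’s core Fiber class (the method name is mine): the method just calls its block and never learns that the block paused the whole call stack.

```ruby
# This method knows nothing about fibers; it merely calls its block.
def call_it_three_times
    3.times { yield }
end

fiber = Fiber.new do
    call_it_three_times { Fiber.yield "paused" }
    "done"
end

puts fiber.resume   # paused
puts fiber.resume   # paused
puts fiber.resume   # paused
puts fiber.resume   # done
```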

(This arguably has some considerable downsides as well — it becomes difficult to know when or where your code might pause, which makes reasoning about the order of operations much harder. That’s why Python and some other languages opted to implement async IO with an await keyword — anyone reading the code knows that it can only pause where an await appears.)

(Also, I’m saying “pause” here instead of “yield” because Ruby has really complicated the hell out of this by already having a yield keyword that does something totally different, and naming its coroutine pause function yield.)

Anyway, that’s exactly what’s happening here. enum_for returns an Enumerator that wraps the whole method. (It doesn’t need to know self, because enum_for is actually a method inherited from Object, goodness gracious.) When the Enumerator needs some items, it calls the method a second time with its own block, running in a coroutine, just like a block passed to Enumerator.new. Eventually the method emits a value using the regular old yield keyword, and that value reaches the block created by Enumerator, and that block pauses the call stack. It doesn’t matter that Range.each is eager, because its iteration is still happening in code somewhere, and that code is part of a call stack in a coroutine, so it can be paused. Eventually the coroutine is no longer useful and gets thrown away, so the eager each call simply stops midway through its work, unaware that anything unusual ever happened.

In fact, despite being an Object method, enum_for isn’t special at all. It can be expressed in pure Ruby very easily:

def my_enum_for(receiver, method)
    # Enumerator.new creates a coroutine-as-iteration-source, as above.
    Enumerator.new do |y|
        # All it does is call the named method with a trivial block.  Every
        # time the method produces a value with the `yield` keyword, we pass it
        # along to the yielder object, which pauses the coroutine.
        # This is nothing more than a bridge between "yield" in the Ruby block
        # sense, and "yield" in the coroutine sense.
        receiver.send method do |value|
            y.yield value
        end
    end
end

So, that’s pretty neat. Incidentally, several built-in methods like Array.each and Enumerable.collect act like this, returning an Enumerator if called with no arguments.
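For example (values mine):

```ruby
e = [10, 20, 30].each   # no block attached, so this returns an Enumerator
puts e.class            # Enumerator
puts e.next             # 10
puts e.next             # 20

# Which makes chaining easy:
p ["a", "b"].each.with_index(1).to_a   # [["a", 1], ["b", 2]]
```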

Full laziness

I mentioned above that while an Enumerator fetches items lazily, many of the methods are eager. To clarify what I mean by that, consider:

inclusive_range(6, 9000).collect {
    |n|
    puts "considering #{n}"
    "a" * n
}.first(3)

collect() is one of those common Enumerable methods. You might know it by its other name, map(). Ruby is big on multiple names for the same thing: one that everyone uses in practice, and another that people who don’t use Ruby will actually recognize.

Even though this code ultimately only needs three items, and even though there’s all this coroutine machinery happening under the hood, this still evaluates the entire range. Why?

The problem is that collect() has always returned an array, and is generally expected to continue doing so. It has no way of knowing that it’s about to be fed into first. Rather than violate this API, Ruby added a new method, Enumerable.lazy. This stops after three items:

inclusive_range(6, 9000).lazy.collect {
    |n|
    puts "considering #{n}"
    "a" * n
}.first(3)

All this does is return an Enumerator::Lazy object, which has lazy implementations of various methods that would usually do a full iteration. Methods like first(3) are still “eager” (in the sense that they just return an array), since their results have a fixed finite size.

This seems a little clunky to me, since the end result is still an object with a collect method that doesn’t return an array. I suspect the real reason is just that Enumerator was added first; even though the coroutine support was already there, Enumerator::Lazy only came along later. Changing existing eager methods to be lazy can, ah, cause problems.

The only built-in type that seems to have interesting lazy behavior is Range, which can be infinite.

# Whoops, infinite loop.
(1..Float::INFINITY).select { |n| n.even? }.first(5)
# 2 4 6 8 10
(1..Float::INFINITY).lazy.select { |n| n.even? }.first(5)

A loose end

I think the only remaining piece of this puzzle is something I stumbled upon but can’t explain. Enumerator has a next method, which returns the next value or raises StopIteration.

Wow, that sounds awfully familiar.

But I can’t find anything in the language or standard library that uses this, with one single and boring exception: the loop construct. It catches StopIteration and exits the block.

enumerator = [1, 2, 3].each
loop do
    while true do
        puts enumerator.next
    end
end

On the fourth call, next() will be out of items, so it raises StopIteration. Removing the loop block makes this quite obvious.

That’s it. That’s the only use of it in the language, as far as I can tell. It seems almost… vestigial. It’s also a little weird, since it keeps the current iteration state inside the Enumerator, unlike any of its other methods. But it’s also the only form of external iteration that I know of in Ruby, and that’s handy to have sometimes.
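For completeness, what loop does implicitly can be spelled out with an explicit rescue (a sketch, not idiomatic Ruby):

```ruby
enumerator = [1, 2, 3].each
results = []
begin
    while true
        results << enumerator.next
    end
rescue StopIteration
    # next() ran out of items; this is exactly what `loop` catches for us
end
puts results.inspect   # [1, 2, 3]
```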

And, uh, so on

I intended to foray into a few more languages, including some recent lower-level friends like C++/Rust/Swift, but this post somehow spiraled out of control and hit nine thousand words. No one has read this far.

Handily, it turns out that the above languages pretty much cover the basic ways of approaching iteration; if any of this made sense, other languages will probably seem pretty familiar.

  • C++’s iteration protocol(s) has existed for a long time in the form of ++it to advance an iterator and *it to read the current item, though this was usually written manually in a C-style for loop, and loops were generally terminated with an explicit endpoint.

    C++11 added the range-based for, which does basically the same stuff under the hood. Idiomatic C++ is inscrutable, but maybe you can make sense of this project which provides optionally-infinite iterable ranges.

  • Rust has an entire (extremely well-documented) iter module with numerous iterators and examples of how to create your own. The core of the Iterator trait is just a next method which returns None when exhausted. It also has a lot of handy Ruby-like chainable methods, so working directly with iterators is more common in Rust than in Python.

  • Swift also has (well-documented) simple next-based iterators, which return nil when exhausted, effectively the same API as Rust.

I could probably keep finding more subsequent languages indefinitely, so I’m gonna take a break from this now.

Weekly roundup: National Novelty Writing Month

Post Syndicated from Eevee original https://eev.ee/dev/2016/11/07/weekly-roundup-national-novelty-writing-month/

Inktober is a distant memory.

Now it’s time for NaNoWriMo! Almost. I don’t have any immediate interest in writing a novel, but I do have plenty of other stuff that needs writing — blog posts, my book, Runed Awakening, etc. So I’m going to try to write 100,000 words this month, spread across whatever.

Rules:

  1. I’m only measuring, like, works. I’ll count this page, as short as it is, because it’s still a single self-contained thing that took some writing effort. But no tweets or IRC or the like.

  2. I’m counting with vim’s g C-g or wc -w, whichever is more convenient. The former is easier for single files I edit in vim; the latter is easier for multiple files or stuff I edit outside of vim.

  3. I’m making absolutely zero effort to distinguish between English text, code, comments, etc.; whatever the word count is, that’s what it is. So code snippets in the book will count, as will markup in blog posts. Runed Awakening is a weird case, but I’m choosing to count it because it’s inherently a text-based game, plus it’s written in a prosaic language. On the other hand, dialogue for Isaac HD does not count, because it’s a few bits of text in what is otherwise just a Lua codebase.

  4. Only daily net change counts. This rule punishes me for editing, but that’s the entire point of NaNoWriMo’s focus on word count: to get something written rather than linger on a section forever and edit it to death. I tend to do far too much of the latter.

    This rule already bit me on day one, where I made some significant progress on Runed Awakening but ended up with a net word count of -762 because it involved some serious refactoring. Oops. Turns out word-counting code is an even worse measure of productivity than line-counting code.

These rules are specifically crafted to nudge me into working a lot more on my book and Runed Awakening, those two things I’d hoped to get a lot further on in the last three months. And unlike Inktober, blog posts contribute towards my preposterous goal rather than being at odds with it.

With one week down, so far I’m at +8077 words. I got off to a pretty slow (negative, in fact) start, and then spent a day out of action from an ear infection, so I’m a bit behind. Hoping I can still catch up as I get used to this whole “don’t rewrite the same paragraph over and over for hours” approach.

  • art: Last couple ink drawings of Pokémon, hallelujah. I made a montage of them all, too.

    I drew Momo (the cat from Google’s Halloween doodle game) alongside Isaac and it came out spectacularly well.

    I finally posted the loophole commission.

    I posted a little “what type am I” meme on Twitter and drew some of the interesting responses. I intended to draw a couple more, but then I got knocked on my ass and my brain stopped working. I still might get back to them later.

  • blog: I posted an extremely thorough teardown of JavaScript. That might be cheating, but it’s okay, because I love cheating.

    Wrote a whole lot about Java.

  • doom: I did another speedmap. I haven’t released the last two yet; I want to do a couple more and release them as a set.

  • blog: I wrote about game accessibility, which touched on those speedmaps.

  • runed awakening: I realized I didn’t need all the complexity of (and fallout caused by) the dialogue extension I was using, so I ditched it in favor of something much simpler. I cleaned up some stuff, fixed some stuff, improved some stuff, and started on some stuff. You know.

  • book: I’m working on the PICO-8 chapter, since I’ve actually finished the games it describes. I’m having to speedily reconstruct the story of how I wrote Under Construction, which is interesting. I hope it still comes out like a story and not a tutorial.

As for the three big things, well, they sort of went down the drain. I thought they might; I don’t tend to be very good at sticking with the same thing for a long and contiguous block of time. I’m still making steady progress on all of them, though, and I did some other interesting stuff in the last three months, so I’m satisfied regardless.

With November devoted almost exclusively to writing, I’m really hoping I can finally have a draft chapter of the book ready for Patreon by the end of the month. That $4 tier has kinda been languishing, sorry.

Organizing Software Deployments to Match Failure Conditions

Post Syndicated from Nick Trebon original https://aws.amazon.com/blogs/architecture/organizing-software-deployments-to-match-failure-conditions/

Deploying new software into production will always carry some amount of risk, and failed deployments (e.g., software bugs, misconfigurations, etc.) will occasionally occur. As a service owner, the goal is to try and reduce the number of these incidents and to limit customer impact when they do occur. One method to reduce potential impact is to shape your deployment strategies around the failure conditions of your service. Thus, when a deployment fails, the service owner has more control over the blast radius as well as the scope of the impact. These strategies require an understanding of how the various components of your system interact, how those components can fail and how those failures impact your customers. This blog post discusses some of the deployment strategies that we’ve adopted on the Route 53 team and how these choices affect the availability of our service.

To begin, I’ll briefly describe some of the deployment procedures and the Route 53 architecture in order to provide some context for the deployment strategies that we have chosen. Hopefully, these examples will reveal strategies that could benefit your own service’s availability. Like many services, Route 53 consists of multiple environments or stages: one for active development, one for staging changes to production and the production stage itself. The natural tension with trying to reduce the number of failed deployments in production is to add more rigidity and processes that slow down the release of new code. At Route 53, we do not enforce a strict release or deployment schedule; individual developers are responsible for verifying their changes in the staging environment and pushing their changes into production.

Typically, our deployments proceed in a pipelined fashion. Each step of the pipeline is referred to as a “wave” and consists of some portion of our fleet. A pipeline is a good abstraction as each wave can be thought of as an independent and separate step. After each wave of the pipeline, the change can be verified — this can include automatic, scheduled and manual testing as well as the verification of service metrics. Furthermore, we typically space out the earlier waves of production deployment at least 24 hours apart, in order to allow the changes to “bake.” Letting our software bake refers to rolling out software changes slowly to allow us to validate those changes and verify service metrics with production traffic before pushing the deployment to the next wave. The clear advantage of deploying new code to only a portion of your fleet is that it reduces the impact of a failed deployment to just the portion of the fleet containing the new code. Another benefit of our deployment infrastructure is that it provides us a mechanism to quickly “roll back” a deployment to a previous software version if any problems are detected, which, in many cases, enables us to quickly mitigate a failed deployment.

Based on our experiences, we have further organized our deployments to try and match our failure conditions to further reduce impact. First, our deployment strategies are tailored to the part of the system that is the target of our deployment. We commonly refer to two main components of Route 53: the control plane and the data plane (pictured below). The control plane consists primarily of our API and DNS change propagation system. Essentially, this is the part of our system that accepts a customer request to create or delete a DNS record and then the transmission of that update to all of our DNS servers distributed across the world. The data plane consists of our fleet of DNS servers that are responsible for answering DNS queries on behalf of our customers. These servers currently reside in more than 50 locations around the world. Both of these components have their own set of failure conditions and differ in how a failed deployment will impact customers. Further, a failure of one component may not impact the other. For example, an API outage where customers are unable to create new hosted zones or records has no impact on our data plane continuing to answer queries for all records created prior to the outage. Given their distinct set of failure conditions, the control plane and data plane have their own deployment strategies, which are each discussed in more detail below.

Control Plane Deployments

The bulk of the control plane actually consists of two APIs. The first is our external API that is reachable from the Internet and is the entry point for customers to create, delete and view their DNS records. This external API performs authentication and authorization checks on customer requests before forwarding them to our internal API. The second, internal API supports a much larger set of operations than just the ones needed by the external API; it also includes operations required to monitor and propagate DNS changes to our DNS servers as well as other operations needed to operate and monitor the service. Failed deployments to the external API typically impact a customer’s ability to view or modify their DNS records. The availability of this API is critical as our customers may rely on the ability to update their DNS records quickly and reliably during an operational event for their own service or site.

Deployments to the external API are fairly straightforward. For increased availability, we host the external API in multiple availability zones. Each wave of deployment consists of the hosts within a single availability zone, and each host in that availability zone is deployed to individually. If any single host deployment fails, the deployment to the entire availability zone is halted automatically. Some host failures may be quickly caught and mitigated by the load balancer for our hosts in that particular availability zone, which is responsible for health checking the hosts. Hosts that fail these load balancer health checks are automatically removed from service by the load balancer. Thus, a failed deployment to just a single host would result in it being removed from service automatically and the deployment halted without any operator intervention. For other types of failed deployments that may not cause the load balancer health checks to fail, restricting waves to a single availability zone allows us to easily flip away from that availability zone as soon as the failure is detected.

A similar approach could be applied to services that utilize Route 53 plus ELB in multiple regions and availability zones for their services. ELBs automatically health check their back-end instances and remove unhealthy instances from service. By creating Route 53 alias records marked to evaluate target health (see ELB documentation for how to set this up), if all instances behind an ELB are unhealthy, Route 53 will fail away from this alias and attempt to find an alternate healthy record to serve. This configuration will enable automatic failover at the DNS-level for an unhealthy region or availability zone. To enable manual failover, simply convert the alias resource record set for your ELB to either a weighted alias or associate it with a health check whose health you control. To initiate a failover, simply set the weight to 0 or fail the health check. A weighted alias also allows you the ability to slowly increase the traffic to that ELB, which can be useful for verifying your own software deployments to the back-end instances.

For our internal API, the deployment strategy is more complicated (pictured below). Here, our fleet is partitioned by the type of traffic it handles. We classify traffic into three types: (1) low-priority, long-running operations used to monitor the service (batch fleet), (2) all other operations used to operate and monitor the service (operations fleet) and (3) all customer operations (customer fleet). Deployments to the production internal API are then organized by how critical their traffic is to the service as a whole. For instance, the batch fleet is deployed to first because their operations are not critical to the running of the service and we can tolerate long outages of this fleet. Similarly, we prioritize the operations fleet below that of customer traffic as we would rather continue accepting and processing customer traffic after a failed deployment to the operations fleet. For the internal API, we have also organized our staging waves differently from our production waves. In the staging waves, all three fleets are split across two waves. This is done intentionally to allow us to verify that the code changes work in a split-world where multiple versions of the software are running simultaneously. We have found this to be useful in catching incompatibilities between software versions. Since we never deploy software in production to 100% of our fleet at the same time, our software updates must be designed to be compatible with the previous version. Finally, as with the external API, all wave deployments proceed with a single host at a time. For this API, we also include a deep application health check as part of the deployment. Similar to the load balancer health checks for the external API, if this health check fails, the entire deployment is immediately halted.

Data Plane Deployments

As mentioned earlier, our data plane consists of Route 53’s DNS servers, which are distributed across the world in more than 50 distinct locations (we refer to each location as an ‘edge location’). An important consideration with our deployment strategy is how we stripe our anycast IP space across locations. In summary, each hosted zone is assigned four delegation name servers, each of which belongs to a “stripe” (i.e., one quarter of our anycast range). Generally speaking, each edge location announces only a single stripe, so each stripe is therefore announced by roughly 1/4 of our edge locations worldwide. Thus, when a resolver issues a query against each of the four delegation name servers, those queries are directed via BGP to the closest (in a network sense) edge location from each stripe.

While the availability and correctness of our API are important, the availability and correctness of our data plane are even more critical. In this case, an outage directly results in an outage for our customers. Furthermore, the impact of serving even a single wrong answer on behalf of a customer is magnified by that answer being cached by both intermediate resolvers and end clients alike. Thus, deployments to our data plane are organized even more carefully, both to prevent failed deployments and to reduce potential impact.

The safest way to deploy and minimize impact would be to deploy to a single edge location at a time. However, with manual deployments that are overseen by a developer, this approach is just not scalable given how frequently we deploy new software to over 50 locations (with more added each year). Thus, most of our production deployment waves consist of multiple locations; the one exception is our first wave, which includes just a single location. Furthermore, this location is specifically chosen because it runs our oldest hardware, which provides us a quick notification for any unintended performance degradation.

It is important to note that while the caching behavior of resolvers can cause issues if we serve an incorrect answer, they handle other types of failures well. When a recursive resolver receives a query for a record that is not cached, it will typically issue queries to at least three of the four delegation name servers in parallel and it will use the first response it receives. Thus, in the event that one of our locations is black-holing customer queries (i.e., not replying to DNS queries), the resolver should receive a response from one of the other delegation name servers. In this case, the only impact is to resolvers for which the edge location that is not answering would have been the fastest responder. Now, that resolver will effectively be waiting for the response from the second-fastest stripe. To take advantage of this resiliency, our other waves are organized such that they include edge locations that are geographically diverse, with the intent that for any single resolver, there will be nearby locations that are not included in the current deployment wave. Furthermore, to guarantee that at most a single name server for all customers is affected, waves are actually organized by stripe. Finally, each stripe is spread across multiple waves so that failures impact only a single name server for a portion of our customers.
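The resolver behavior described above can be modeled in a few lines (a simplified sketch, not Route 53 code): queries go out in parallel to one name server from each stripe, the first response wins, and a black-holed stripe simply never answers:

```python
def effective_latency(stripe_latencies_ms):
    """Each entry is the response time of the closest edge location in
    one stripe; None models a location that black-holes queries.
    The resolver uses whichever answer arrives first."""
    answered = [ms for ms in stripe_latencies_ms if ms is not None]
    return min(answered) if answered else None

# Normally the fastest stripe wins the race...
assert effective_latency([12, 30, 45, 80]) == 12
# ...and if that stripe's edge location stops answering during a
# deployment, the resolver falls back to the second-fastest stripe.
assert effective_latency([None, 30, 45, 80]) == 30
```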
An example of this strategy is depicted below. A few notes: first, our staging environment consists of a much smaller number of edge locations than production, so single-location waves are possible. Second, each stripe is denoted by color; in this example, we see deployments spread across a blue and an orange stripe. You, too, can think about organizing your deployment strategy around your failure conditions. For example, if you have a database schema used by both your production system and a warehousing system, deploy the change to the warehousing system first to ensure you haven’t broken any compatibility. You might catch problems with the warehousing system before they affect customer traffic.

Conclusions

Our team’s experience with operating Route 53 over the last 3+ years has highlighted the importance of reducing the impact from failed deployments. Over the years, we have been able to identify some of the common failure conditions and to organize our software deployments in such a way that we increase the ease of mitigation while decreasing the potential impact to our customers.

– Nick Trebon

systemd for Developers III

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/journal-submit.html

Here’s the third episode of my systemd for Developers series.

Logging to the Journal

In a recent blog
story
intended for administrators I shed some light on how to use
the journalctl(1)
tool to browse and search the systemd journal. In this blog story for developers
I want to explain a little how to get log data into the systemd
Journal in the first place.

The good thing is that getting log data into the Journal is not
particularly hard, since there’s a good chance the Journal already
collects it anyway and writes it to disk. The journal collects:

  1. All data logged via libc syslog()
  2. The data from the kernel logged with printk()
  3. Everything written to STDOUT/STDERR of any system service

This covers pretty much all of the traditional log output of a
Linux system, including messages from the kernel initialization phase,
the initial RAM disk, the early boot logic, and the main system
runtime.

syslog()

Let’s have a quick look at how syslog() is used again. Let’s
write a journal message using this call:

#include <syslog.h>

int main(int argc, char *argv[]) {
        syslog(LOG_NOTICE, "Hello World!");
        return 0;
}

This is C code, of course. Many higher level languages provide APIs
that allow writing local syslog messages. Regardless of which language
you choose, all data written like this ends up in the Journal.
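For instance, Python’s standard-library syslog module wraps the very same libc call, so this snippet (an illustrative sketch; it lands in the Journal only when run on a systemd system) submits an equivalent message:

```python
import syslog

# Same message as the C example, submitted through libc's syslog().
syslog.openlog("test-journal-submit")
syslog.syslog(syslog.LOG_NOTICE, "Hello World!")
syslog.closelog()
```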

Let’s have a look at how this appears after it has been written into the
journal (this is the JSON output that
journalctl -o json-pretty generates):

{
        "_BOOT_ID" : "5335e9cf5d954633bb99aefc0ec38c25",
        "_TRANSPORT" : "syslog",
        "PRIORITY" : "5",
        "_UID" : "500",
        "_GID" : "500",
        "_AUDIT_SESSION" : "2",
        "_AUDIT_LOGINUID" : "500",
        "_SYSTEMD_CGROUP" : "/user/lennart/2",
        "_SYSTEMD_SESSION" : "2",
        "_SELINUX_CONTEXT" : "unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023",
        "_MACHINE_ID" : "a91663387a90b89f185d4e860000001a",
        "_HOSTNAME" : "epsilon",
        "_COMM" : "test-journal-su",
        "_CMDLINE" : "./test-journal-submit",
        "SYSLOG_FACILITY" : "1",
        "_EXE" : "/home/lennart/projects/systemd/test-journal-submit",
        "_PID" : "3068",
        "SYSLOG_IDENTIFIER" : "test-journal-submit",
        "MESSAGE" : "Hello World!",
        "_SOURCE_REALTIME_TIMESTAMP" : "1351126905014938"
}

This nicely shows how the Journal implicitly augmented our little
log message with various meta data fields which describe in more
detail the context in which our message was generated. For an explanation
of the various fields, please refer to systemd.journal-fields(7).

printf()

If you are writing code that is run as a systemd service, generating journal
messages is even easier:

#include <stdio.h>

int main(int argc, char *argv[]) {
        printf("Hello World\n");
        return 0;
}

Yupp, that’s easy, indeed.

The printed string in this example is logged at a default log
priority of LOG_INFO[1]. Sometimes it is useful to change
the log priority for such a printed string. When systemd parses
STDOUT/STDERR of a service it will look for priority values enclosed
in < > at the beginning of each line[2], following the scheme
used by the kernel’s printk() which in turn took
inspiration from the BSD syslog network serialization of messages. We
can make use of this systemd feature like this:

#include <stdio.h>

#define PREFIX_NOTICE "<5>"

int main(int argc, char *argv[]) {
        printf(PREFIX_NOTICE "Hello World\n");
        return 0;
}

Nice! Logging with nothing but printf() but we still get
log priorities!
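To illustrate what happens on the receiving end, here is a hypothetical re-implementation (for illustration only, not systemd’s actual code) of the prefix parsing: the first three characters are checked for a &lt;N&gt; pattern and, if present, stripped off:

```python
def split_priority(line, default=6):
    """Parse a printk-style "<N>" prefix from a log line, the way
    systemd interprets service STDOUT/STDERR. Returns (priority,
    message); 6 is LOG_INFO, the default for lines without a prefix."""
    if len(line) >= 3 and line[0] == "<" and line[1].isdigit() and line[2] == ">":
        return int(line[1]), line[3:]
    return default, line

assert split_priority("<5>Hello World") == (5, "Hello World")
assert split_priority("plain line") == (6, "plain line")
```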

This scheme works with any programming language, including, of course, shell:

#!/bin/bash

echo "<5>Hello world"

Native Messages

Now, what I explained above is not particularly exciting: the
take-away is pretty much only that things end up in the journal if
they are output using the traditional message printing APIs. Yaaawn!

Let’s make this more interesting, let’s look at what the Journal
provides as native APIs for logging, and let’s see what its benefits
are. Let’s translate our little example into the 1:1 counterpart
using the Journal’s logging API sd_journal_print(3):

#include <systemd/sd-journal.h>

int main(int argc, char *argv[]) {
        sd_journal_print(LOG_NOTICE, "Hello World");
        return 0;
}

This doesn’t look much more interesting than the two examples
above, right? After compiling this with `pkg-config --cflags
--libs libsystemd-journal`
appended to the compiler parameters,
let’s have a closer look at the JSON representation of the journal
entry this generates:

 {
        "_BOOT_ID" : "5335e9cf5d954633bb99aefc0ec38c25",
        "PRIORITY" : "5",
        "_UID" : "500",
        "_GID" : "500",
        "_AUDIT_SESSION" : "2",
        "_AUDIT_LOGINUID" : "500",
        "_SYSTEMD_CGROUP" : "/user/lennart/2",
        "_SYSTEMD_SESSION" : "2",
        "_SELINUX_CONTEXT" : "unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023",
        "_MACHINE_ID" : "a91663387a90b89f185d4e860000001a",
        "_HOSTNAME" : "epsilon",
        "CODE_FUNC" : "main",
        "_TRANSPORT" : "journal",
        "_COMM" : "test-journal-su",
        "_CMDLINE" : "./test-journal-submit",
        "CODE_FILE" : "src/journal/test-journal-submit.c",
        "_EXE" : "/home/lennart/projects/systemd/test-journal-submit",
        "MESSAGE" : "Hello World",
        "CODE_LINE" : "4",
        "_PID" : "3516",
        "_SOURCE_REALTIME_TIMESTAMP" : "1351128226954170"
}

This looks pretty much the same, right? Almost! There are three new
fields compared to the earlier output: CODE_FUNC, CODE_FILE and
CODE_LINE. Yes, you guessed it, by using
sd_journal_print() meta information about the generating
source code location is implicitly appended to each
message[3], which is helpful for a developer to identify
the source of a problem.

The primary reason for using the Journal’s native logging APIs is
not just the source code location, however: it is to allow
passing additional structured log messages from the program into the
journal. This additional log data may then be used to search the
journal, is available for consumption by other programs, and
might help the administrator to track down issues beyond what is
expressed in the human-readable message text. Here’s an example of how
to do that with sd_journal_send():

#include <systemd/sd-journal.h>
#include <unistd.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
        sd_journal_send("MESSAGE=Hello World!",
                        "MESSAGE_ID=52fb62f99e2c49d89cfbf9d6de5e3555",
                        "PRIORITY=5",
                        "HOME=%s", getenv("HOME"),
                        "TERM=%s", getenv("TERM"),
                        "PAGE_SIZE=%li", sysconf(_SC_PAGESIZE),
                        "N_CPUS=%li", sysconf(_SC_NPROCESSORS_ONLN),
                        NULL);
        return 0;
}

This will write a log message to the journal much like the earlier
examples. However, this time a few additional structured fields are
attached:

{
        "__CURSOR" : "s=ac9e9c423355411d87bf0ba1a9b424e8;i=5930;b=5335e9cf5d954633bb99aefc0ec38c25;m=16544f875b;t=4ccd863cdc4f0;x=896defe53cc1a96a",
        "__REALTIME_TIMESTAMP" : "1351129666274544",
        "__MONOTONIC_TIMESTAMP" : "95903778651",
        "_BOOT_ID" : "5335e9cf5d954633bb99aefc0ec38c25",
        "PRIORITY" : "5",
        "_UID" : "500",
        "_GID" : "500",
        "_AUDIT_SESSION" : "2",
        "_AUDIT_LOGINUID" : "500",
        "_SYSTEMD_CGROUP" : "/user/lennart/2",
        "_SYSTEMD_SESSION" : "2",
        "_SELINUX_CONTEXT" : "unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023",
        "_MACHINE_ID" : "a91663387a90b89f185d4e860000001a",
        "_HOSTNAME" : "epsilon",
        "CODE_FUNC" : "main",
        "_TRANSPORT" : "journal",
        "_COMM" : "test-journal-su",
        "_CMDLINE" : "./test-journal-submit",
        "CODE_FILE" : "src/journal/test-journal-submit.c",
        "_EXE" : "/home/lennart/projects/systemd/test-journal-submit",
        "MESSAGE" : "Hello World!",
        "_PID" : "4049",
        "CODE_LINE" : "6",
        "MESSAGE_ID" : "52fb62f99e2c49d89cfbf9d6de5e3555",
        "HOME" : "/home/lennart",
        "TERM" : "xterm-256color",
        "PAGE_SIZE" : "4096",
        "N_CPUS" : "4",
        "_SOURCE_REALTIME_TIMESTAMP" : "1351129666241467"
}

Awesome! Our simple example worked! The seven fields we
attached to our message appeared in the journal. We used sd_journal_send()
for this, which works much like sd_journal_print() but takes a
NULL-terminated list of format strings, each followed by its
arguments. The format strings must include the field name and a ‘=’
before the values.
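Conceptually, the resulting entry is just a list of such FIELD=value strings. A tiny sketch of that serialization (a hypothetical helper, not the real sd_journal_send() implementation):

```python
def serialize_fields(**fields):
    """Serialize fields into the "FIELD=value" strings the journal
    stores, one string per field. Journal field names are
    conventionally upper case."""
    return ["%s=%s" % (name.upper(), value) for name, value in fields.items()]

entry = serialize_fields(message="Hello World!", priority=5, home="/home/lennart")
assert "MESSAGE=Hello World!" in entry
assert "PRIORITY=5" in entry
assert "HOME=/home/lennart" in entry
```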

Our little structured message included seven fields. The first three we passed are well-known fields:

  1. MESSAGE= is the actual human readable message part of the structured message.
  2. PRIORITY= is the numeric message priority value as known from BSD syslog formatted as an integer string.
  3. MESSAGE_ID= is a 128-bit ID that identifies our specific
    message call, formatted as a hexadecimal string. We randomly generated
    this string with journalctl --new-id128. This can be used by
    applications to track down all occurrences of this specific
    message. The 128-bit ID can be a UUID, but this is not a requirement, nor is it enforced.

Applications may relatively freely define additional fields as they
see fit (we defined four pretty arbitrary ones in our example). A
complete list of the currently well-known fields is available in systemd.journal-fields(7).

Let’s see how the message ID helps us find this message and all
its occurrences in the journal:

$ journalctl MESSAGE_ID=52fb62f99e2c49d89cfbf9d6de5e3555
-- Logs begin at Thu, 2012-10-18 04:07:03 CEST, end at Thu, 2012-10-25 04:48:21 CEST. --
Oct 25 03:47:46 epsilon test-journal-se[4049]: Hello World!
Oct 25 04:40:36 epsilon test-journal-se[4480]: Hello World!

Seems I already invoked this example tool twice!

Many messages systemd itself generates have message
IDs. This is useful, for example, to find all occurrences
where a program dumped core (journalctl
MESSAGE_ID=fc2e22bc6ee647b6b90729ab34a250b1), or where a user
logged in (journalctl
MESSAGE_ID=8d45620c1a4348dbb17410da57c60c66). If your application
generates a message that might be interesting to recognize in the
journal stream later on, we recommend attaching such a message ID to
it. You can easily allocate a new one for your message with journalctl
--new-id128.
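If journalctl is not at hand, any source of 128 random bits formatted as 32 hexadecimal characters will do. A sketch in Python (illustrative; journalctl --new-id128 remains the canonical tool):

```python
import re
import uuid

def new_message_id():
    """Allocate a random 128-bit message ID formatted as a 32-character
    lowercase hex string, comparable to `journalctl --new-id128` output."""
    return uuid.uuid4().hex

mid = new_message_id()
assert re.fullmatch(r"[0-9a-f]{32}", mid)
```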

This example shows how we can use the Journal’s native APIs to
generate structured, recognizable messages. You can do much more than
this with the C API. For example, you may store binary data in journal
fields as well, which is useful to attach coredumps or hard disk SMART
states to events where this applies. To keep this blog story from
getting longer than it already is, we’ll not go into detail about how
to do this; instead, I ask you to check out sd_journal_send(3)
for further information.

Python

The examples above focus on C. Structured logging to the Journal is
also available from other languages. Along with systemd itself we ship
bindings for Python. Here’s an example of how to use them:

from systemd import journal
journal.send('Hello world')
journal.send('Hello, again, world', FIELD2='Greetings!', FIELD3='Guten tag')

Other bindings exist for Node.js, PHP and Lua.

Portability

Generating structured data is a very useful feature for services to
make their logs more accessible both for administrators and other
programs. In addition to the implicit structure the Journal
adds to all logged messages it is highly beneficial if the various
components of our stack also provide explicit structure
in their messages, coming from within the processes themselves.

Porting an existing program to the Journal’s logging APIs comes
with one pitfall though: the Journal is Linux-only. If non-Linux
portability matters for your project it’s a good idea to provide an
alternative log output, and make it selectable at compile-time.

Regardless of which way to log you choose, in all cases we’ll forward
the message to a classic syslog daemon running side-by-side with the
Journal, if there is one. However, much of the structured meta data of
the message is not forwarded, since the classic syslog protocol simply
has no generally accepted way to encode this, and we shouldn’t attempt
to serialize meta data into classic syslog messages, which might turn
/var/log/messages into an unreadable dump of machine
data. Anyway, to summarize: regardless of whether you log with
syslog(), printf(), sd_journal_print() or
sd_journal_send(), the message will be stored and indexed by
the journal and it will also be forwarded to classic syslog.

And that’s it for today. In a follow-up episode we’ll focus on
retrieving messages from the Journal using the C API, possibly
filtering for a specific subset of messages. Later on, I hope to give
a real-life example how to port an existing service to the Journal’s
logging APIs. Stay tuned!

Footnotes

[1] This can be changed with the SyslogLevel= service
setting. See systemd.exec(5)
for details.

[2] Interpretation of the < > prefixes of logged lines
may be disabled with the SyslogLevelPrefix= service setting. See systemd.exec(5)
for details.

[3] Appending the code location to the log messages can be
turned off at compile time by defining
-DSD_JOURNAL_SUPPRESS_LOCATION.

systemd for Administrators, Part XIII

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/systemctl-journal.html

Here’s the thirteenth installment of my ongoing series on systemd for Administrators:

Log and Service Status

This one is a short episode. One of the most commonly used commands
on a systemd
system is systemctl status which may be used to determine the
status of a service (or other unit). It always has been a valuable
tool to figure out the processes, runtime information and other meta
data of a daemon running on the system.

With Fedora 17 we introduced the journal, our new logging scheme
that provides structured, indexed and reliable logging on systemd
systems, while providing a certain degree of compatibility with
classic syslog implementations. The
original reason we started to work on the journal was one specific
feature idea, that to the outsider might appear simple but without the
journal is difficult and inefficient to implement: along with the
output of systemctl status we wanted to show the last 10 log
messages of the daemon. Log data is some of the most essential bits of
information we have on the status of a service. Hence it is an
obvious choice to show next to the general status of the
service.

And now to make it short: at the same time as we integrated the
journal into systemd and Fedora we also hooked up
systemctl with it. Here’s an example output:

$ systemctl status avahi-daemon.service
avahi-daemon.service - Avahi mDNS/DNS-SD Stack
	  Loaded: loaded (/usr/lib/systemd/system/avahi-daemon.service; enabled)
	  Active: active (running) since Fri, 18 May 2012 12:27:37 +0200; 14s ago
	Main PID: 8216 (avahi-daemon)
	  Status: "avahi-daemon 0.6.30 starting up."
	  CGroup: name=systemd:/system/avahi-daemon.service
		  ├ 8216 avahi-daemon: running [omega.local]
		  └ 8217 avahi-daemon: chroot helper

May 18 12:27:37 omega avahi-daemon[8216]: Joining mDNS multicast group on interface eth1.IPv4 with address 172.31.0.52.
May 18 12:27:37 omega avahi-daemon[8216]: New relevant interface eth1.IPv4 for mDNS.
May 18 12:27:37 omega avahi-daemon[8216]: Network interface enumeration completed.
May 18 12:27:37 omega avahi-daemon[8216]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
May 18 12:27:37 omega avahi-daemon[8216]: Registering new address record for fd00::e269:95ff:fe87:e282 on eth1.*.
May 18 12:27:37 omega avahi-daemon[8216]: Registering new address record for 172.31.0.52 on eth1.IPv4.
May 18 12:27:37 omega avahi-daemon[8216]: Registering HINFO record with values 'X86_64'/'LINUX'.
May 18 12:27:38 omega avahi-daemon[8216]: Server startup complete. Host name is omega.local. Local service cookie is 3555095952.
May 18 12:27:38 omega avahi-daemon[8216]: Service "omega" (/services/ssh.service) successfully established.
May 18 12:27:38 omega avahi-daemon[8216]: Service "omega" (/services/sftp-ssh.service) successfully established.

This, of course, shows the status of everybody’s favourite
mDNS/DNS-SD daemon with a list of its processes, along with — as
promised — the 10 most recent log lines. Mission accomplished!

There are a couple of switches available to alter the output
slightly and adjust it to your needs. The two most interesting
switches are -f to enable follow mode (as in tail -f)
and -n to change the number of lines to show (you
guessed it, as in tail -n).

The log data shown comes from three sources: everything any of the
daemon’s processes logged with libc’s syslog() call,
everything submitted using the native Journal API, plus everything any
of the daemon’s processes logged to STDOUT or STDERR. In short:
everything the daemon generates as log data is collected, properly
interleaved and shown in the same format.

And that’s it already for today. It’s a very simple feature, but an
immensely useful one for every administrator. The kind that makes you
ask: “Why didn’t we already do this 15 years ago?”

Stay tuned for the next installment!

systemd for Administrators, Part III

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/systemd-for-admins-3.html

Here’s the third installment of my ongoing series about systemd for administrators.

How Do I Convert A SysV Init Script Into A systemd Service File?

Traditionally, Unix and Linux services (daemons) are started
via SysV init scripts. These are Bourne Shell scripts, usually
residing in a directory such as /etc/rc.d/init.d/ which when
called with one of a few standardized arguments (verbs) such as
start, stop or restart controls,
i.e. starts, stops or restarts the service in question. For starts
this usually involves invoking the daemon binary, which then forks a
background process (more precisely daemonizes). Shell scripts
tend to be slow, needlessly hard to read, very verbose and
fragile. Although they are immensly flexible (after all, they are just
code) some things are very hard to do properly with shell scripts,
such as ordering parallized execution, correctly supervising processes
or just configuring execution contexts in all detail. systemd provides
compatibility with these shell scripts, but due to the shortcomings
pointed out it is recommended to install native systemd service files
for all daemons installed. Also, in contrast to SysV init scripts
which have to be adjusted to the distribution systemd service files
are compatible with any kind of distribution running systemd (which
become more and more these days…). What follows is a terse guide how
to take a SysV init script and translate it into a native systemd
service file. Ideally, upstream projects should ship and install
systemd service files in their tarballs. If you have successfully
converted a SysV script according to the guidelines it might hence be
a good idea to submit the file as a patch upstream. How to prepare a
patch like that will be discussed in a later installment, suffice to
say at this point that the daemon(7)
manual page shipping with systemd contains a lot of useful information
regarding this.

So, let’s jump right in. As an example we’ll convert the init
script of the ABRT daemon into a systemd service file. ABRT is a
standard component of every Fedora install, and is an acronym for
Automatic Bug Reporting Tool, which pretty much describes what it
does, i.e. it is a service for collecting crash dumps. Its SysV script I have uploaded
here.

The first step when converting such a script is to read it
(surprise surprise!) and distill the useful information from the
usually pretty long script. In almost all cases the script consists of
mostly boilerplate code that is identical or at least very similar in
all init scripts, and usually copied and pasted from one to the
other. So, let’s extract the interesting information from the script
linked above:

  • A description string for the service is “Daemon to detect
    crashing apps
    “. As it turns out, the header comments include a
    redundant number of description strings, some of them describing less
    the actual service but the init script to start it. systemd services
    include a description too, and it should describe the service and not
    the service file.
  • The LSB header[1] contains dependency
    information. systemd due to its design around socket-based activation
    usually needs no (or very little) manually configured
    dependencies. (For details regarding socket activation see the original
    announcement blog post.
    ) In this case the dependency on
    $syslog (which encodes that abrtd requires a syslog daemon),
    is the only valuable information. While the header lists another
    dependency ($local_fs) this one is redundant with systemd as
    normal system services are always started with all local file systems
    available.
  • The LSB header suggests that this service should be started in
    runlevels 3 (multi-user) and 5 (graphical).
  • The daemon binary is /usr/sbin/abrtd

And that’s already it. The entire remaining content of this
115-line shell script is simply boilerplate or otherwise redundant
code: code that deals with synchronizing and serializing startup
(i.e. the code regarding lock files) or that outputs status messages
(i.e. the code calling echo), or simply parsing of the verbs (i.e. the
big case block).

From the information extracted above we can now write our systemd service file:

[Unit]
Description=Daemon to detect crashing apps
After=syslog.target

[Service]
ExecStart=/usr/sbin/abrtd
Type=forking

[Install]
WantedBy=multi-user.target

A little explanation of the contents of this file: The
[Unit] section contains generic information about the
service. systemd not only manages system services, but also devices,
mount points, timers, and other components of the system. The generic
term for all these objects in systemd is a unit, and the
[Unit] section encodes information about it that might be
applicable not only to services but also to the other unit types
systemd maintains. In this case we set the following unit settings: we
set the description string and configure that the daemon shall be
started after Syslog[2], similar to what is encoded in the
LSB header of the original init script. For this Syslog dependency we
create a dependency of type After= on a systemd unit
syslog.target. The latter is a special target unit in systemd
and is the standardized name to pull in a syslog implementation. For
more information about these standardized names see the systemd.special(7). Note
that a dependency of type After= only encodes the suggested
ordering, but does not actually cause syslog to be started when abrtd
is — and this is exactly what we want, since abrtd actually works
fine even without syslog being around. However, if both are started
(and usually they are) then the order in which they are started is
controlled with this dependency.

The next section is [Service] which encodes information
about the service itself. It contains all those settings that apply
only to services, and not the other kinds of units systemd maintains
(mount points, devices, timers, …). Two settings are used here:
ExecStart= takes the path to the binary to execute when the
service shall be started up. And with Type= we configure how
the service notifies the init system that it finished starting up. Since
traditional Unix daemons do this by returning to the parent process
after having forked off and initialized the background daemon we set
the type to forking here. That tells systemd to wait until
the start-up binary returns and then consider the processes still
running afterwards to be the daemon processes.

The final section is [Install]. It encodes information
about what the suggested installation should look like, i.e. under
which circumstances and by which triggers the service shall be
started. In this case we simply say that this service shall be started
when the multi-user.target unit is activated. This is a
special unit (see above) that basically takes the role of the classic
SysV Runlevel 3[3]. The setting WantedBy= has
little effect on the daemon during runtime. It is only read by the
systemctl enable command, which is the recommended way to
enable a service in systemd. This command will simply ensure that our
little service gets automatically activated as soon as
multi-user.target is requested, which it is on all normal
boots[4].
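Under the hood, enabling a unit this way comes down to a symlink: the unit file is linked into the target’s .wants/ directory. The following sketch models just that one step (a temporary directory stands in for /etc/systemd/system; systemctl of course handles much more):

```python
import os
import tempfile

# Simplified model of what `systemctl enable abrtd.service` does for a
# unit with WantedBy=multi-user.target: link the unit file into the
# target's .wants/ directory so it is pulled in on boot.
etc = tempfile.mkdtemp()
unit = os.path.join(etc, "abrtd.service")
with open(unit, "w") as f:
    f.write("[Unit]\nDescription=Daemon to detect crashing apps\n")

wants = os.path.join(etc, "multi-user.target.wants")
os.makedirs(wants)
os.symlink(unit, os.path.join(wants, "abrtd.service"))

assert os.path.islink(os.path.join(wants, "abrtd.service"))
```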

And that’s it. Now we already have a minimal working systemd
service file. To test it we copy it to
/etc/systemd/system/abrtd.service and invoke systemctl
daemon-reload
. This will make systemd take notice of it, and now
we can start the service with it: systemctl start
abrtd.service
. We can verify the status via systemctl status
abrtd.service
. And we can stop it again via systemctl stop
abrtd.service
. Finally, we can enable it, so that it is activated
by default on future boots with systemctl enable
abrtd.service
.

The service file above, while sufficient and basically a 1:1
translation (feature- and otherwise) of the SysV init script still has room for
improvement. Here it is a little bit updated:

[Unit]
Description=ABRT Automated Bug Reporting Tool
After=syslog.target

[Service]
Type=dbus
BusName=com.redhat.abrt
ExecStart=/usr/sbin/abrtd -d -s

[Install]
WantedBy=multi-user.target

So, what did we change? Two things: we improved the description
string a bit. More importantly however, we changed the type of the
service to dbus and configured the D-Bus bus name of the
service. Why did we do this? As mentioned, classic SysV services
daemonize after startup, which usually involves double forking
and detaching from any terminal. While this is useful and necessary
when daemons are invoked via a script, this is unnecessary (and slow)
as well as counterproductive when a proper process babysitter such as
systemd is used. The reason for that is that the forked off daemon
process usually has little relation to the original process started by
systemd (after all the daemonizing scheme’s whole idea is to remove
this relation), and hence it is difficult for systemd to figure out
after the fork is finished which process belonging to the service is
actually the main process and which processes might just be
auxiliary. But that information is crucial to implement advanced
babysitting, i.e. supervising the process, automatic respawning on
abnormal termination, collecting crash and exit code information and
suchlike. In order to make it easier for systemd to figure out the
main process of the daemon we changed the service type to
dbus. The semantics of this service type are appropriate for
all services that take a name on the D-Bus system bus as last step of
their initialization[5]. ABRT is one of those. With this setting systemd
will spawn the ABRT process, which will no longer fork (this is
configured via the -d -s switches to the daemon), and systemd
will consider the service fully started up as soon as
com.redhat.abrt appears on the bus. This way the process
spawned by systemd is the main process of the daemon, systemd has a
reliable way to figure out when the daemon is fully started up and
systemd can easily supervise it.
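To make the contrast concrete, here is a sketch of how the same service would have to be declared if it kept daemonizing, next to the bus-based approach used above. (The PIDFile path is hypothetical, chosen for illustration only.)

```ini
# Variant A (sketch): the daemon keeps forking. systemd can only
# identify the main process reliably if the daemon maintains a PID
# file; the path below is hypothetical.
[Service]
Type=forking
PIDFile=/var/run/abrtd.pid
ExecStart=/usr/sbin/abrtd

# Variant B (as used above): no forking. Startup is considered
# complete as soon as the configured name appears on the system bus.
[Service]
Type=dbus
BusName=com.redhat.abrt
ExecStart=/usr/sbin/abrtd -d -s
```

With Variant A, systemd has to trust the PID file; with Variant B, the bus name doubles as a readiness notification and unambiguously identifies the service's main process.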

And that’s all there is to it. We have a simple systemd service
file now that encodes in 10 lines more information than the original
SysV init script encoded in 115. And even now there’s a lot of room
left for further improvement utilizing more features systemd
offers. For example, we could set Restart=always to
tell systemd to automatically restart this service when it dies. Or,
we could use OOMScoreAdjust=-500 to ask the kernel to please
leave this process around when the OOM killer wreaks havoc. Or, we
could use CPUSchedulingPolicy=idle to ensure that abrtd
processes crash dumps in background only, always allowing the kernel
to give preference to whatever else might be running and needing CPU
time.
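Put together, a unit file using all three of these optional settings might look like this (a sketch extending the service file shown above):

```ini
[Unit]
Description=ABRT Automated Bug Reporting Tool
After=syslog.target

[Service]
Type=dbus
BusName=com.redhat.abrt
ExecStart=/usr/sbin/abrtd -d -s
# Respawn the daemon whenever it dies.
Restart=always
# Ask the kernel's OOM killer to prefer other processes over this one.
OOMScoreAdjust=-500
# Process crash dumps only when the CPU is otherwise idle.
CPUSchedulingPolicy=idle

[Install]
WantedBy=multi-user.target
```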

For more information about the configuration options mentioned
here, see the respective man pages: systemd.unit(5),
systemd.service(5), and systemd.exec(5). Or browse all of
systemd’s man pages.

Of course, not all SysV scripts are as easy to convert as this
one. But thankfully, as it turns out, the vast majority actually are.

That’s it for today, come back soon for the next installment in our series.

Footnotes

[1] The LSB header of init scripts is a convention of
including metadata about the service in comment blocks at the top of
SysV init scripts, and is
defined by the Linux Standard Base. It was intended to
standardize init scripts between distributions. While most
distributions have adopted this scheme, the handling of the headers
varies greatly between the distributions, and in fact it is still
necessary to adjust init scripts for every distribution. As such, the LSB spec
never kept the promise it made.
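For illustration, a typical LSB header looks roughly like this. The exact fields vary by distribution; the values below are a plausible sketch, not taken from Fedora's actual script:

```sh
### BEGIN INIT INFO
# Provides:          abrtd
# Required-Start:    $syslog $local_fs
# Required-Stop:     $syslog $local_fs
# Default-Start:     3 5
# Default-Stop:      0 1 2 6
# Short-Description: Automated Bug Reporting Tool
### END INIT INFO
```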

[2] Strictly speaking, this dependency does not even have to
be encoded here, as it is redundant on a system where the syslog
daemon is socket-activatable. Modern syslog implementations (for example
rsyslog v5) have been patched upstream to be socket-activatable. If
such a syslog daemon is used, configuring the
After=syslog.target dependency is redundant and
implicit. However, to maintain compatibility with syslog services that
have not been updated, we include this dependency here.
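The idea can be sketched with a minimal socket unit (hypothetical; actual distribution units differ in detail): systemd itself listens on /dev/log and starts the syslog daemon on first use, buffering early messages, so explicit ordering dependencies become unnecessary.

```ini
# syslog.socket (sketch): systemd owns /dev/log and queues log
# messages until the syslog daemon has been activated on demand.
[Socket]
ListenDatagram=/dev/log

[Install]
WantedBy=sockets.target
```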

[3] At least how it used to be defined on Fedora.

[4] Note that in systemd the graphical bootup
(graphical.target, taking the role of SysV runlevel 5) is an
implicit superset of the console-only bootup
(multi-user.target, i.e. like runlevel 3). That means hooking
a service into the latter will also hook it into the
former.
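This relationship is expressed in the target unit itself; simplified, graphical.target looks roughly like this:

```ini
# graphical.target (simplified sketch): it pulls in multi-user.target,
# which is why services hooked into multi-user.target also start
# during a graphical bootup.
[Unit]
Description=Graphical Interface
Requires=multi-user.target
After=multi-user.target
AllowIsolate=yes
```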

[5] Actually the majority of services of the default Fedora
install now take a name on the bus after startup.

The next step

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/pa-097.html

A few minutes ago, I finally released PulseAudio 0.9.7. Changes are numerous,
especially internally, where the core is now threaded and mostly lock-free.
Check the rough list on the milestone page and the announcement email. As many of you
know, we are shipping a pre-release of 0.9.7 in Fedora 8, enabled by default. The final release
offers quite a few additions over that prerelease. To show off a couple of nice features, here’s a screencast, showing hotplug, simultaneous playback (what Apple calls aggregation) and zeroconf-ish network support:

screencast

Please excuse the typos. Yes, I still use XMMS; don’t ask [1]. Yes, you
need a bit of imagination to fully appreciate a screencast of audio software that lacks an audio track.

So, what’s coming next? Earcandy, timer-based scheduling/”glitch-free” audio, scriptability through Lua; the todo list is huge. My unofficial, scratchy, partly German TODO list for PulseAudio is available online.

As it appears, all relevant distros will now move to PA by default. So,
hopefully, PA is coming to a desktop near you pretty soon. Oh, you are one
of those who still don’t see the benefit of a desktop sound server? Then
please reread this too-long email of mine, or maybe this Ars Technica article.

OTOH, if you happen to like this release, then consider giving me a kudo on ohloh.net, my ego wants a golden 10. 😉

logo

Footnotes:

[1] Music players that categorize audio by ID3 tags just don’t
work for me, because most of my music files are very badly named. My
directory structure, on the other hand, is very well organized, but the newer players don’t
seem to care about directory structures. XMMS doesn’t really either, but
xmms . does the job from the terminal.

Flameeyes, thanks for hosting this clip.