# The Digital Security Exchange Is Live

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/04/the_digital_sec.html

Last year I wrote about the Digital Security Exchange. The project is live:

The DSX works to strengthen the digital resilience of U.S. civil society groups by improving their understanding and mitigation of online threats.

We do this by pairing civil society and social sector organizations with credible and trustworthy digital security experts and trainers who can help them keep their data and networks safe from exposure, exploitation, and attack. We are committed to working with community-based organizations, legal and journalistic organizations, civil rights advocates, local and national organizers, and public and high-profile figures who are working to advance social, racial, political, and economic justice in our communities and our world.

If you are either an organization that needs help, or an expert who can provide help, visit their website.

Note: I am on their advisory committee.

# CJEU: on the competence of private companies operating platforms to delete content

Post Syndicated from nellyo original https://nellyo.wordpress.com/2018/01/26/fb-5/

A very important question has been referred to the Court of Justice of the EU. It has not yet been published on the Court's website; I will update this post when it is.

The growing power of internet companies to make decisions about content must not stay out of our sight.

# Random with care

Post Syndicated from Eevee original https://eev.ee/blog/2018/01/02/random-with-care/

Hi! Here are a few loose thoughts about picking random numbers.

This is all aimed at frivolous pursuits like video games. Hell, even video games where money is at stake should be deferring to someone who knows way more than I do. Otherwise you might find out that your deck shuffles in your poker game are woefully inadequate and some smartass is cheating you out of millions. (If your random number generator has fewer than 226 bits of state, it can’t even generate every possible shuffling of a deck of cards!)
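That 226-bit figure is easy to check. A generator with b bits of state can produce at most 2^b distinct shuffles, so covering every ordering of a 52-card deck needs 2^b to be at least 52!. A minimal sketch (the variable names are mine):

```python
import math

# A generator with b bits of state can emit at most 2**b distinct shuffles,
# so covering every ordering of a 52-card deck needs 2**b >= 52!.
orderings = math.factorial(52)
bits_needed = math.ceil(math.log2(orderings))
print(bits_needed)
```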

## Use the right distribution

Most languages have a random number primitive that spits out a number uniformly in the range [0, 1), and you can go pretty far with just that. But beware a few traps!

### Random pitches

Say you want to pitch up a sound by a random amount, perhaps up to an octave. Your audio API probably has a way to do this that takes a pitch multiplier, where I say “probably” because that’s how the only audio API I’ve used works.

Easy peasy. If 1 is unchanged and 2 is pitched up by an octave, then all you need is rand() + 1. Right?

No! Pitch is exponential — within the same octave, the “gap” between C and C♯ is about half as big as the gap between B and the following C. If you pick a pitch multiplier uniformly, you’ll have a noticeable bias towards the higher pitches.

One octave corresponds to a doubling of pitch, so if you want to pick a random note, you want 2 ** rand().
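As a rough sketch of the difference, here is the exponential version wrapped in a function, plus a variant that snaps to semitones. Both function names are made up for illustration, and the semitone variant assumes twelve-tone equal temperament.

```python
import random

def random_pitch_multiplier(octaves=1.0):
    """Return a multiplier from 1 up to 2**octaves, uniform in perceived pitch."""
    return 2 ** (random.random() * octaves)

def random_semitone_multiplier():
    """Pick one of the 12 equal-tempered semitone steps within one octave."""
    return 2 ** (random.randrange(12) / 12)

print(random_pitch_multiplier(), random_semitone_multiplier())
```

Compare with `random.random() + 1`, which spends half of its results on the top five semitones or so.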

### Random directions

For two dimensions, you can just pick a random angle with rand() * TAU.

If you want a vector rather than an angle, or if you want a random direction in three dimensions, it’s a little trickier. You might be tempted to just pick a random point where each component is rand() * 2 - 1 (ranging from −1 to 1), but that’s not quite right. A direction is a point on the surface (or, equivalently, within the volume) of a sphere, and picking each component independently produces a point within the volume of a cube; the result will be a bias towards the corners of the cube, where there’s much more extra volume beyond the sphere.

No? Well, just trust me. I don’t know how to make a diagram for this.

Anyway, you could use the Pythagorean theorem a few times and make a huge mess of things, or it turns out there’s a really easy way that even works for two or four or any number of dimensions. You pick each coordinate from a Gaussian (normal) distribution, then normalize the resulting vector. In other words, using Python’s random module:

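```python
import math
import random

def random_direction():
    # Draw each coordinate from a standard normal distribution...
    x = random.gauss(0, 1)
    y = random.gauss(0, 1)
    z = random.gauss(0, 1)
    # ...then scale the resulting vector to unit length.
    r = math.sqrt(x*x + y*y + z*z)
    return x/r, y/r, z/r
```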

Why does this work? I have no idea!

Note that it is possible to get zero (or close to it) for every component, in which case the result is nonsense. You can re-roll all the components if necessary; just check that the magnitude (or its square) is less than some epsilon, which is equivalent to throwing away a tiny sphere at the center and shouldn’t affect the distribution.
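Here is one way the re-roll might look, generalized to any number of dimensions. The `dimensions` and `epsilon` parameters are my own additions, and the epsilon value is an arbitrary guess rather than a tuned constant.

```python
import math
import random

def random_direction(dimensions=3, epsilon=1e-12):
    """Return a unit vector pointing in a uniformly random direction."""
    while True:
        components = [random.gauss(0, 1) for _ in range(dimensions)]
        magnitude_squared = sum(c * c for c in components)
        # Re-roll if the point landed too close to the origin to normalize.
        if magnitude_squared >= epsilon:
            magnitude = math.sqrt(magnitude_squared)
            return tuple(c / magnitude for c in components)

print(random_direction(2), random_direction(4))
```

Comparing the squared magnitude avoids a square root on the (astronomically rare) rolls that get thrown away.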

### Beware Gauss

Since I brought it up: the Gaussian distribution is a pretty nice one for choosing things in some range, where the middle is the common case and should appear more frequently.

That said, I never use it, because it has one annoying drawback: the Gaussian distribution has no minimum or maximum value, so you can’t really scale it down to the range you want. In theory, you might get any value out of it, with no limit on scale.

In practice, it’s astronomically rare to actually get such a value out. I did a hundred million trials just to see what would happen, and the largest value produced was 5.8.

But, still, I’d rather not knowingly put extremely rare corner cases in my code if I can at all avoid it. I could clamp the ends, but that would cause unnatural bunching at the endpoints. I could reroll if I got a value outside some desired range, but I prefer to avoid rerolling when I can, too; after all, it’s still (astronomically) possible to have to reroll for an indefinite amount of time. (Okay, it’s really not, since you’ll eventually hit the period of your PRNG. Still, though.) I don’t bend over backwards here — I did just say to reroll when picking a random direction, after all — but when there’s a nicer alternative I’ll gladly use it.

And lo, there is a nicer alternative! Enter the beta distribution. It always spits out a number in [0, 1], so you can easily swap it in for the standard normal function, but it takes two “shape” parameters α and β that alter its behavior fairly dramatically.

With α = β = 1, the beta distribution is uniform, i.e. no different from rand(). As α increases, the distribution skews towards the right, and as β increases, the distribution skews towards the left. If α = β, the whole thing is symmetric with a hump in the middle. The higher either one gets, the more extreme the hump (meaning that value is far more common than any other). With a little fiddling, you can get a number of interesting curves.

Screenshots don’t really do it justice, so the original post embeds a little Wolfram widget that lets you play with α and β live.

Note that if α = 1, then 0 is a possible value; if β = 1, then 1 is a possible value. You probably want them both greater than 1, which pins the density at both endpoints to zero.

Also, it’s possible to have either α or β or both be less than 1, but this creates very different behavior: the corresponding endpoints become poles.

Anyway, something like α = β = 3 is probably close enough to normal for most purposes but already clamped for you. And you could easily replicate something like, say, NetHack’s incredibly bizarre rnz function.
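Python's `random` module already ships a beta sampler, `random.betavariate`, so trying this out takes only a few lines. The helper name and the 10 to 20 range below are just illustrative.

```python
import random

def humped_between(low, high, alpha=3.0, beta=3.0):
    """Return a value in [low, high] that lands near the middle most often."""
    return low + (high - low) * random.betavariate(alpha, beta)

samples = [humped_between(10, 20) for _ in range(10)]
print([round(s, 2) for s in samples])
```

Because the beta distribution is bounded, there is no clamping or re-rolling to think about.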

### Random frequency

Say you want some event to have an 80% chance to happen every second. You (who am I kidding, I) might be tempted to do something like this:

```python
if random() < 0.8 * dt:
    do_thing()
```

In an ideal world, dt is always the same and is equal to 1 / f, where f is the framerate. Replace that 80% with a variable, say P, and every tic you have a P / f chance to do the… whatever it is.

Each second, f tics pass, so you’ll make this check f times. The chance that any check succeeds is the complement of the chance that every check fails, which is $$1 - \left(1 - \frac{P}{f}\right)^f$$.

For P of 80% and a framerate of 60, that’s a total probability of 55.3%. Wait, what?

Consider what happens if the framerate is 2. Each second, you roll against 0.4 twice, and the chance that at least one of those rolls succeeds is 1 − 0.6 × 0.6 = 64%, not 80%. Probabilities are combined by multiplying, and splitting work up by dt only works for additive quantities. You lose some accuracy along the way. If you’re dealing with something that multiplies, you need an exponent somewhere.
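If you want to poke at the numbers yourself, the per-second chance is a one-liner. The function name here is made up.

```python
def chance_per_second(p, framerate):
    """Chance that at least one of `framerate` rolls at p / framerate succeeds."""
    return 1 - (1 - p / framerate) ** framerate

for fps in (1, 2, 60):
    print(fps, round(chance_per_second(0.8, fps), 3))
```

Only a framerate of 1 gives back the 80% you asked for; every higher framerate drifts further below it.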

But in this case, maybe you don’t want that at all. Each separate roll you make might independently succeed, so it’s possible (but very unlikely) that the event will happen 60 times within a single second! Or 200 times, if that’s someone’s framerate.

If you explicitly want something to have a chance to happen on a specific interval, you have to check on that interval. If you don’t have a gizmo handy to run code on an interval, it’s easy to do yourself with a time buffer:

```python
timer += dt
# here, 1 is the "every 1 seconds"
while timer > 1:
    timer -= 1
    if random() < 0.8:
        do_thing()
```

Using while means rolls still happen even if you somehow skipped over an entire second.
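Wrapped up as a small self-contained object, the time buffer might look like the sketch below. `IntervalRoller` and its attributes are hypothetical names, and the `rolls` counter exists only so the behavior is easy to inspect.

```python
import random

class IntervalRoller:
    """Roll once per `interval` seconds, however unevenly time is delivered."""

    def __init__(self, chance, interval=1.0):
        self.chance = chance
        self.interval = interval
        self.timer = 0.0
        self.rolls = 0

    def update(self, dt):
        """Advance by dt seconds and return how many times the event fired."""
        fired = 0
        self.timer += dt
        while self.timer > self.interval:
            self.timer -= self.interval
            self.rolls += 1
            if random.random() < self.chance:
                fired += 1
        return fired

roller = IntervalRoller(0.8)
print(roller.update(3.5), roller.rolls)
```

A single 3.5-second hitch still produces three rolls, which is the point of using `while` rather than `if`.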

(For the curious, and the nerds who already noticed: the expression $$1 - \left(1 - \frac{P}{f}\right)^f$$ converges to a specific value! As the framerate increases, it becomes a better and better approximation for $$1 - e^{-P}$$, which for the example above is 0.551. Hey, 60 fps is pretty accurate; it’s just accurately representing something nowhere near what I wanted. Er, you wanted.)

### Rolling your own

Of course, you can fuss with the classic [0, 1] uniform value however you want. If I want a bias towards zero, I’ll often just square it, or multiply two of them together. If I want a bias towards one, I’ll take a square root. If I want something like a Gaussian/normal distribution, but with clearly-defined endpoints, I might add together n rolls and divide by n. (The normal distribution is just what you get if you roll infinite dice and divide by infinity!)
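Spelled out as code, those tricks are all one-liners. The function names are mine, and `n=3` is an arbitrary default.

```python
import random

def biased_low():
    """Square a roll: results cluster near 0."""
    return random.random() ** 2

def biased_high():
    """Square-root a roll: results cluster near 1."""
    return random.random() ** 0.5

def bell_ish(n=3):
    """Average n rolls: a hump around 0.5 with hard limits at 0 and 1."""
    return sum(random.random() for _ in range(n)) / n

print(biased_low(), biased_high(), bell_ish())
```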

It’d be nice to be able to understand exactly what this will do to the distribution. Unfortunately, that requires some calculus, which this post is too small to contain, and which I didn’t even know much about myself until I went down a deep rabbit hole while writing, and which in many cases is straight up impossible to express directly.

Here’s the non-calculus bit. A source of randomness is often graphed as a PDF — a probability density function. You’ve almost certainly seen a bell curve graphed, and that’s a PDF. They’re pretty nice, since they do exactly what they look like: they show the relative chance that any given value will pop out. On a bog standard bell curve, there’s a peak at zero, and of course zero is the most common result from a normal distribution.

(Okay, actually, since the results are continuous, it’s vanishingly unlikely that you’ll get exactly zero — but you’re much more likely to get a value near zero than near any other number.)

For the uniform distribution, which is what a classic rand() gives you, the PDF is just a straight horizontal line — every result is equally likely.

If there were a calculus bit, it would go here! Instead, we can cheat. Sometimes. Mathematica knows how to work with probability distributions in the abstract, and there’s a free web version you can use. For the example of squaring a uniform variable, try this out:

```mathematica
PDF[TransformedDistribution[u^2, u \[Distributed] UniformDistribution[{0, 1}]], u]
```

(The \[Distributed] is a funny tilde that doesn’t exist in Unicode, but which Mathematica uses as a first-class operator. Also, press Shift+Enter to evaluate the line.)

This will tell you that the distribution is… $$\frac{1}{2\sqrt{u}}$$. Weird! You can plot it:

```mathematica
Plot[%, {u, 0, 1}]
```

(The % refers to the result of the last thing you did, so if you want to try several of these, you can just do Plot[PDF[…], u] directly.)

The resulting graph shows that numbers around zero are, in fact, vastly — infinitely — more likely than anything else.

What about multiplying two together? I can’t figure out how to get Mathematica to understand this, but a great amount of digging revealed that the answer is -ln x, and from there you can plot them both on Wolfram Alpha. They’re similar, though squaring has a much better chance of giving you high numbers than multiplying two separate rolls — which makes some sense, since if either of two rolls is a low number, the product will be even lower.

What if you know the graph you want, and you want to figure out how to play with a uniform roll to get it? Good news! That’s a whole thing called inverse transform sampling. All you have to do is take an integral. Good luck!

This is all extremely ridiculous. New tactic: Just Simulate The Damn Thing. You already have the code; run it a million times, make a histogram, and tada, there’s your PDF. That’s one of the great things about computers! Brute-force numerical answers are easy to come by, so there’s no excuse for producing something like rnz. (Though, be sure your histogram has sufficiently narrow buckets — I tried plotting one for rnz once and the weird stuff on the left side didn’t show up at all!)
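For a sense of how little code that takes, here is a rough text-histogram helper. The name `text_histogram`, the bucket count, and the bar scaling are all arbitrary choices of mine, and it assumes the sampled values fall in [0, 1].

```python
import random
from collections import Counter

def text_histogram(sample, trials=100_000, buckets=10):
    """Bucket `trials` draws from sample() (values in [0, 1]) and print bars."""
    counts = Counter()
    for _ in range(trials):
        value = sample()
        # Clamp so that a value of exactly 1.0 lands in the last bucket.
        counts[min(int(value * buckets), buckets - 1)] += 1
    for bucket in range(buckets):
        share = counts[bucket] / trials
        print(f"{bucket / buckets:4.1f} | {share * 100:6.2f}% {'#' * round(share * 50)}")
    return counts

squared = text_histogram(lambda: random.random() ** 2)
```

Raising `buckets` is the cheap way to check whether something odd is hiding inside one fat bar.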

By the way, I learned something from futzing with Mathematica here! Taking the square root (to bias towards 1) gives a PDF that’s a straight diagonal line, nothing like the hyperbola you get from squaring (to bias towards 0). How do you get a straight line the other way? Surprise: $$1 - \sqrt{1 - u}$$.

### Okay, okay, here’s the actual math

I don’t claim to have a very firm grasp on this, but I had a hell of a time finding it written out clearly, so I might as well write it down as best I can. This was a great excuse to finally set up MathJax, too.

Say $$u(x)$$ is the PDF of the original distribution and $$u$$ is a representative number you plucked from that distribution. For the uniform distribution, $$u(x) = 1$$. Or, more accurately,

$$u(x) = \begin{cases} 1 & \text{ if } 0 \le x \lt 1 \\ 0 & \text{ otherwise } \end{cases}$$

Remember that $$x$$ here is a possible outcome you want to know about, and the PDF tells you the relative probability that a roll will be near it. This PDF spits out 1 for every $$x$$, meaning every number between 0 and 1 is equally likely to appear.

We want to do something to that PDF, which creates a new distribution, whose PDF we want to know. I’ll use my original example of $$f(u) = u^2$$, which creates a new PDF $$v(x)$$.

The trick is that we need to work in terms of the cumulative distribution function for $$u$$. Where the PDF gives the relative chance that a roll will be (“near”) a specific value, the CDF gives the relative chance that a roll will be less than a specific value.

The conventions for this seem to be a bit fuzzy, and nobody bothers to explain which ones they’re using, which makes this all the more confusing to read about… but let’s write the CDF with a capital letter, so we have $$U(x)$$. In this case, $$U(x) = x$$, a straight 45° line (at least between 0 and 1). With the definition I gave, this should make sense. At some arbitrary point like 0.4, the value of the PDF is 1 (0.4 is just as likely as anything else), and the value of the CDF is 0.4 (you have a 40% chance of getting a number from 0 to 0.4).

Calculus ahoy: the PDF is the derivative of the CDF, which means it measures the slope of the CDF at any point. For $$U(x) = x$$, the slope is always 1, and indeed $$u(x) = 1$$. See, calculus is easy.

Okay, so, now we’re getting somewhere. What we want is the CDF of our new distribution, $$V(x)$$. The CDF is defined as the probability that a roll $$v$$ will be less than $$x$$, so we can literally write:

$$V(x) = P(v \le x)$$

(This is why we have to work with CDFs, rather than PDFs — a PDF gives the chance that a roll will be “nearby,” whatever that means. A CDF is much more concrete.)

What is $$v$$, exactly? We defined it ourselves; it’s the *do something* applied to a roll from the original distribution, or $$f(u)$$.

$$V(x) = P\!\left(f(u) \le x\right)$$

Now the first tricky part: we have to solve that inequality for $$u$$, which means we have to apply the *do something* backwards to $$x$$. That is, we apply the inverse function $$f^{-1}$$.

$$V(x) = P\!\left(u \le f^{-1}(x)\right)$$

Almost there! We now have a probability that $$u$$ is less than some value, and that’s the definition of a CDF!

$$V(x) = U\!\left(f^{-1}(x)\right)$$

Hooray! Now to turn these CDFs back into PDFs, all we need to do is differentiate both sides and use the chain rule. If you never took calculus, don’t worry too much about what that means!

$$v(x) = u\!\left(f^{-1}(x)\right)\left|\frac{d}{dx}f^{-1}(x)\right|$$

Wait! Where did that absolute value come from? It takes care of whether $$f(x)$$ increases or decreases. It’s the least interesting part here by far, so, whatever.

There’s one more magical part here when using the uniform distribution — $$u(\dots)$$ is always equal to 1, so that entire term disappears! (Note that this only works for a uniform distribution with a width of 1; PDFs are scaled so the entire area under them sums to 1, so if you had a rand() that could spit out a number between 0 and 2, the PDF would be $$u(x) = \frac{1}{2}$$.)

$$v(x) = \left|\frac{d}{dx}f^{-1}(x)\right|$$

So for the specific case of modifying the output of rand(), all we have to do is invert, then differentiate. The inverse of $$f(u) = u^2$$ is $$f^{-1}(x) = \sqrt{x}$$ (no need for a ± since we’re only dealing with positive numbers), and differentiating that gives $$v(x) = \frac{1}{2\sqrt{x}}$$. Done! This is also why square root comes out nicer; inverting it gives $$x^2$$, and differentiating that gives $$2x$$, a straight line.
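One sanity check on the result: if the PDF is $$\frac{1}{2\sqrt{x}}$$, then the CDF has to be $$V(x) = \sqrt{x}$$, and a CDF is easy to measure by brute force. This sketch (helper name and sample count are arbitrary) compares the two:

```python
import math
import random

def empirical_cdf(samples, x):
    """Fraction of samples that are <= x."""
    return sum(1 for s in samples if s <= x) / len(samples)

random.seed(1)
squares = [random.random() ** 2 for _ in range(200_000)]
for x in (0.04, 0.25, 0.81):
    print(x, round(empirical_cdf(squares, x), 3), round(math.sqrt(x), 3))
```

The measured and predicted columns should agree to within sampling noise.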

Incidentally, that method for turning a uniform distribution into any distribution (inverse transform sampling) is pretty much the same thing in reverse: integrate, then invert. For example, when I saw that taking the square root gave $$v(x) = 2x$$, I naturally wondered how to get a straight line going the other way, $$v(x) = 2 - 2x$$. Integrating that gives $$2x - x^2$$, and then you can use the quadratic formula (or just ask Wolfram Alpha) to solve $$2x - x^2 = u$$ for $$x$$ and get $$f(u) = 1 - \sqrt{1 - u}$$.
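As a quick check of that result, here is the transform as code. The function name is made up; the mean of a distribution with PDF $$2 - 2x$$ on [0, 1] works out to 1/3, so the sample mean should land close to that.

```python
import random

def falling_line():
    """Sample from the PDF v(x) = 2 - 2x on [0, 1] via f(u) = 1 - sqrt(1 - u)."""
    return 1 - (1 - random.random()) ** 0.5

samples = [falling_line() for _ in range(200_000)]
print(sum(samples) / len(samples))
```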

Multiplying two rolls is a bit more complicated; you have to write out the CDF as an integral and you end up doing a double integral and wow it’s a mess. The only thing I’ve retained is that you do a division somewhere, which then gets integrated, and that’s why it ends up as $$-\ln x$$.
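The $$-\ln x$$ answer can at least be checked without redoing the double integral. Integrating $$-\ln x$$ gives the CDF $$x - x \ln x$$ (differentiate it to confirm), and a simulation should match that. The names below are mine.

```python
import math
import random

def product_cdf(x):
    """CDF whose derivative is the PDF -ln(x), valid for 0 < x <= 1."""
    return x - x * math.log(x)

random.seed(2)
products = [random.random() * random.random() for _ in range(200_000)]
for x in (0.1, 0.5, 0.9):
    observed = sum(1 for p in products if p <= x) / len(products)
    print(x, round(observed, 3), round(product_cdf(x), 3))
```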

And that’s quite enough of that! (Okay but having math in my blog is pretty cool and I will definitely be doing more of this, sorry, not sorry.)

## Random vs varied

Sometimes, random isn’t actually what you want. We tend to use the word “random” casually to mean something more like chaotic, i.e., with no discernible pattern. But that’s not really random. In fact, given how good humans can be at finding incidental patterns, they aren’t all that unlikely! Consider that when you roll two dice, they’ll come up either the same or only one apart almost half the time. Coincidence? Well, yes.

If you ask for randomness, you’re saying that any outcome — or series of outcomes — is acceptable, including five heads in a row or five tails in a row. Most of the time, that’s fine. Some of the time, it’s less fine, and what you really want is variety. Here are a couple examples and some fairly easy workarounds.

### NPC quips

The nature of games is such that NPCs will eventually run out of things to say, at which point further conversation will give the player a short brush-off quip — a slight nod from the designer to the player that, hey, you hit the end of the script.

Some NPCs have multiple possible quips and will give one at random. The trouble with this is that it’s very possible for an NPC to repeat the same quip several times in a row before abruptly switching to another one. With only a few options to choose from, getting the same option twice or thrice (especially across an entire game, which may have numerous NPCs) isn’t all that unlikely. The notion of an NPC quip isn’t very realistic to start with, but having someone repeat themselves and then abruptly switch to something else is especially jarring.

The easy fix is to show the quips in order! Paradoxically, this is more consistently varied than choosing at random — the original “order” is likely to be meaningless anyway, and it already has the property that the same quip can never appear twice in a row.

If you like, you can shuffle the list of quips every time you reach the end, but take care here — it’s possible that the last quip in the old order will be the same as the first quip in the new order, so you may still get a repeat. (Of course, you can just check for this case and swap the first quip somewhere else if it bothers you.)
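A sketch of that shuffle-and-swap approach follows. `QuipBag` is a hypothetical name, and the swap trick assumes every quip in the list is distinct.

```python
import random

class QuipBag:
    """Hand out quips in shuffled order without back-to-back repeats."""

    def __init__(self, quips):
        self.quips = list(quips)
        self.queue = []
        self.last = None

    def next(self):
        if not self.queue:
            self.queue = self.quips[:]
            random.shuffle(self.queue)
            # The queue is consumed from the end, so if the upcoming quip
            # matches the previous one, swap it to the far end of the queue.
            if len(self.queue) > 1 and self.queue[-1] == self.last:
                self.queue[0], self.queue[-1] = self.queue[-1], self.queue[0]
        self.last = self.queue.pop()
        return self.last

bag = QuipBag(["Nice weather.", "Busy day.", "Watch your step."])
print([bag.next() for _ in range(6)])
```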

That last behavior is, in fact, the canonical way that Tetris chooses pieces — the game simply shuffles a list of all 7 pieces, gives those to you in shuffled order, then shuffles them again to make a new list once it’s exhausted. There’s no avoidance of duplicates, though, so you can still get two S blocks in a row, or even two S and two Z all clumped together, but no more than that. Some Tetris variants take other approaches, such as actively avoiding repeats even several pieces apart or deliberately giving you the worst piece possible.
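The "no more than two in a row" claim is easy to confirm with a generator. This is a minimal sketch of the bag scheme as described above, not the code of any particular Tetris implementation, and the helper names are mine.

```python
import random

PIECES = "IJLOSTZ"

def seven_bag():
    """Yield pieces forever, one freshly shuffled bag of seven at a time."""
    while True:
        bag = list(PIECES)
        random.shuffle(bag)
        yield from bag

def longest_run(sequence):
    """Length of the longest streak of identical consecutive items."""
    longest = current = 0
    previous = object()  # sentinel that never equals a real item
    for item in sequence:
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
        previous = item
    return longest

generator = seven_bag()
print("".join(next(generator) for _ in range(21)))
```

A repeat can only happen across the boundary between two bags, which is why a streak can never exceed two.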

### Random drops

Random drops are often implemented as a flat chance each time. Maybe enemies have a 5% chance to drop health when they die. Legally speaking, over the long term, a player will see health drops for about 5% of enemy kills.

Over the short term, they may be desperate for health and not survive to see the long term. So you may want to put a thumb on the scale sometimes. Games in the Metroid series, for example, have a somewhat infamous bias towards whatever kind of drop they think you need — health if your health is low, missiles if your missiles are low.

I can’t give you an exact approach to use, since it depends on the game and the feeling you’re going for and the variables at your disposal. In extreme cases, you might want to guarantee a health drop from a tough enemy when the player is critically low on health. (Or if you’re feeling particularly evil, you could go the other way and deny the player health when they most need it…)
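Purely as an illustration of putting a thumb on the scale, here is one possible shape: a drop chance that rises linearly as health falls, plus a guaranteed drop when the player is critically low. Every name and number below is invented for the sketch and is not taken from any actual game.

```python
import random

def health_drop_chance(health_fraction, base=0.05, maximum=0.5):
    """Scale the drop chance from `base` at full health to `maximum` at zero."""
    health_fraction = min(1.0, max(0.0, health_fraction))
    return base + (maximum - base) * (1.0 - health_fraction)

def should_drop_health(health_fraction, guaranteed_below=0.1):
    """Guarantee a drop when critically low; otherwise roll the scaled chance."""
    if health_fraction < guaranteed_below:
        return True
    return random.random() < health_drop_chance(health_fraction)

print(health_drop_chance(1.0), health_drop_chance(0.25), should_drop_health(0.05))
```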

The problem becomes a little different, and worse, when the event that triggers the drop is relatively rare. The pathological case here would be something like a raid boss in World of Warcraft, which requires hours of effort from a coordinated group of people to defeat, and which has some tiny chance of dropping a good item that will go to only one of those people. This is why I stopped playing World of Warcraft at 60.

Dialing it back a little bit gives us Enter the Gungeon, a roguelike where each room is a set of encounters and each floor only has a dozen or so rooms. Initially, you have a 1% chance of getting a reward after completing a room, but every time you complete a room and don’t get a reward, the chance increases by 9 percentage points, up to a cap of 80%. Once you get a reward, the chance resets to 1%.

The natural question is: how frequently, exactly, can a player expect to get a reward? We could do math, or we could Just Simulate The Damn Thing.

```python
from collections import Counter
import random

histogram = Counter()

TRIALS = 1000000
chance = 1
rooms_cleared = 0
rewards_found = 0
while rewards_found < TRIALS:
    rooms_cleared += 1
    if random.random() * 100 < chance:
        # Reward!
        rewards_found += 1
        histogram[rooms_cleared] += 1
        rooms_cleared = 0
        chance = 1
    else:
        chance = min(80, chance + 9)

for gaps, count in sorted(histogram.items()):
    print(f"{gaps:3d} | {count / TRIALS * 100:6.2f}%", '#' * (count // (TRIALS // 100)))
```

```
  1 |   0.98%
  2 |   9.91% #########
  3 |  17.00% ################
  4 |  20.23% ####################
  5 |  19.21% ###################
  6 |  15.05% ###############
  7 |   9.69% #########
  8 |   5.07% #####
  9 |   2.09% ##
 10 |   0.63%
 11 |   0.12%
 12 |   0.03%
 13 |   0.00%
 14 |   0.00%
 15 |   0.00%
```

We’ve got kind of a hilly distribution, skewed to the left, which is up in this histogram. Most of the time, a player should see a reward every three to six rooms, which is maybe twice per floor. It’s vanishingly unlikely to go through a dozen rooms without ever seeing a reward, so a player should see at least one per floor.
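For what it's worth, the math route is not too bad in this particular case: the chance that the next reward shows up after exactly k rooms is the chance of k − 1 misses in a row times the chance of a hit on room k. A sketch, using the same 1% / +9 / 80% numbers as the simulation (the function and variable names are mine):

```python
def reward_gap_distribution(start=1, step=9, cap=80):
    """Exact probability that the next reward arrives after exactly k rooms."""
    distribution = {}
    still_waiting = 1.0  # probability that no reward has appeared yet
    chance = start
    rooms = 0
    # Once the chance is capped, still_waiting shrinks every room, so this ends.
    while still_waiting > 1e-12:
        rooms += 1
        p = chance / 100
        distribution[rooms] = still_waiting * p
        still_waiting *= 1 - p
        chance = min(cap, chance + step)
    return distribution

gaps = reward_gap_distribution()
average_gap = sum(k * p for k, p in gaps.items())
for k in range(1, 7):
    print(f"{k:3d} | {gaps[k] * 100:6.2f}%")
print("average gap:", average_gap)
```

The first few rows should line up with the simulated histogram to within sampling noise.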

Of course, this simulated a single continuous playthrough; when starting the game from scratch, your chance at a reward always starts fresh at 1%, the worst it can be. If you want to know about how many rewards a player will get on the first floor, hey, Just Simulate The Damn Thing.

```
  0 |   0.01%
  1 |  13.01% #############
  2 |  56.28% ########################################################
  3 |  27.49% ###########################
  4 |   3.10% ###
  5 |   0.11%
  6 |   0.00%
```

Cool. Though, that’s assuming exactly 12 rooms; it might be worth changing that to pick at random in a way that matches the level generator.
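The code behind that first-floor table isn't shown above, so here is a reconstruction under the same assumptions: exactly 12 rooms, and the same 1% / +9 / 80% rule. The function name and trial count are my own choices.

```python
import random
from collections import Counter

def rewards_on_first_floor(rooms=12):
    """Count rewards over one floor, starting from the fresh 1% chance."""
    chance = 1
    rewards = 0
    for _ in range(rooms):
        if random.random() * 100 < chance:
            rewards += 1
            chance = 1
        else:
            chance = min(80, chance + 9)
    return rewards

TRIALS = 100_000
histogram = Counter(rewards_on_first_floor() for _ in range(TRIALS))
for rewards, count in sorted(histogram.items()):
    print(f"{rewards:3d} | {count / TRIALS * 100:6.2f}%", "#" * (count * 100 // TRIALS))
```

To match a real level generator, replace the fixed `rooms=12` with a draw from whatever distribution of room counts the generator produces.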

(Enter the Gungeon does some other things to skew probability, which is very nice in a roguelike where blind luck can make or break you. For example, if you kill a boss without having gotten a new gun anywhere else on the floor, the boss is guaranteed to drop a gun.)

### Critical hits

I suppose this is the same problem as random drops, but backwards.

Say you have a battle sim where every attack has a 6% chance to land a devastating critical hit. Presumably the same rules apply to both the player and the AI opponents.

Consider, then, that the AI opponents have exactly the same 6% chance to ruin the player’s day. Consider also that this gives them a roughly 0.4% chance (0.06 × 0.06 = 0.0036) to critical hit twice in a row. 0.4% doesn’t sound like much, but across an entire playthrough, it’s not unlikely that a player might see it happen and find it incredibly annoying.

Perhaps it would be worthwhile to explicitly forbid AI opponents from getting consecutive critical hits.
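One way to sketch that rule is to remember whether the previous roll was a critical hit and skip the roll if so. `CritRoller` is a hypothetical name, not something from a real engine.

```python
import random

class CritRoller:
    """Roll critical hits, never allowing two in a row."""

    def __init__(self, chance=0.06):
        self.chance = chance
        self.last_was_crit = False

    def roll(self):
        """Return True on a critical hit; always False right after one."""
        crit = (not self.last_was_crit) and random.random() < self.chance
        self.last_was_crit = crit
        return crit

enemy = CritRoller()
print(sum(enemy.roll() for _ in range(1000)), "crits in 1000 attacks")
```

Note that this nudges the overall critical hit rate slightly below 6%, since every attack that follows a critical hit is a guaranteed miss on the crit roll.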

## In conclusion

An emerging theme here has been to Just Simulate The Damn Thing. So consider Just Simulating The Damn Thing. Even a simple change to a random value can do surprising things to the resulting distribution, so unless you feel like differentiating the inverse function of your code, maybe test out any non-trivial behavior and make sure it’s what you wanted. Probability is hard to reason about.

# When should behaviour outside a community have consequences inside it?

Post Syndicated from Matthew Garrett original https://mjg59.dreamwidth.org/50099.html

Free software communities don’t exist in a vacuum. They’re made up of people who are also members of other communities, people who have other interests and engage in other activities. Sometimes these people engage in behaviour outside the community that may be perceived as negatively impacting communities that they’re a part of, but most communities have no guidelines for determining whether behaviour outside the community should have any consequences within the community. This post isn’t an attempt to provide those guidelines, but aims to provide some things that community leaders should think about when the issue is raised.

## Some things to consider

### Did the behaviour violate the law?

This seems like an obvious bar, but it turns out to be a pretty bad one. For a start, many things that are commonly accepted behaviour in various communities may be illegal (eg, reverse engineering work may contravene a strict reading of US copyright law), and taking this to an extreme would result in expelling anyone who’s ever broken a speed limit. On the flipside, refusing to act unless someone broke the law is also a bad threshold – much behaviour that communities consider unacceptable may be entirely legal.

There’s also the problem of determining whether a law was actually broken. The criminal justice system is (correctly) biased to an extent in favour of the defendant – removing someone’s rights in society should require meeting a high burden of proof. However, this is not the threshold that most communities hold themselves to in determining whether to continue permitting an individual to associate with them. An incident that does not result in a finding of criminal guilt (either through an explicit finding or a failure to prosecute the case in the first place) should not be ignored by communities for that reason.

### Did the behaviour violate your community norms?

There’s plenty of behaviour that may be acceptable within other segments of society but unacceptable within your community (eg, lobbying for the use of proprietary software is considered entirely reasonable in most places, but rather less so at an FSF event). If someone can be trusted to segregate their behaviour appropriately then this may not be a problem, but that’s probably not sufficient in all cases. For instance, if someone acts entirely reasonably within your community but engages in lengthy anti-semitic screeds on 4chan, it’s legitimate to question whether permitting them to continue being part of your community serves your community’s best interests.

### Did the behaviour violate the norms of the community in which it occurred?

Of course, the converse is also true – there’s behaviour that may be acceptable within your community but unacceptable in another community. It’s easy to write off someone acting in a way that contravenes the standards of another community but wouldn’t violate your expected behavioural standards – after all, if it wouldn’t breach your standards, what grounds do you have for taking action?

But you need to consider that if someone consciously contravenes the behavioural standards of a community they’ve chosen to participate in, they may be willing to do the same in your community. If pushing boundaries is a frequent trait then it may not be too long until you discover that they’re also pushing your boundaries.

## Why do you care?

A community’s code of conduct can be looked at in two ways – as a list of behaviours that will be punished if they occur, or as a list of behaviours that are unlikely to occur within that community. The former is probably the primary consideration when a community adopts a CoC, but the latter is how many people considering joining a community will think about it.

If your community includes individuals that are known to have engaged in behaviour that would violate your community standards, potential members or contributors may not trust that your CoC will function as adequate protection. A community that contains people known to have engaged in sexual harassment in other settings is unlikely to be seen as hugely welcoming, even if they haven’t (as far as you know!) done so within your community. The way your members behave outside your community is going to be seen as saying something about your community, and that needs to be taken into account.

A second (and perhaps less obvious) aspect is that membership of some higher profile communities may be seen as lending general legitimacy to someone, and they may play off that to legitimise behaviour or views that would be seen as abhorrent by the community as a whole. If someone’s anti-semitic views (for example) are seen as having more relevance because of their membership of your community, it’s reasonable to think about whether keeping them in your community serves the best interests of your community.

## Conclusion

I’ve said things like “considered” or “taken into account” a bunch here, and that’s for a good reason – I don’t know what the thresholds should be for any of these things, and there doesn’t seem to be even a rough consensus in the wider community. We’ve seen cases in which communities have acted based on behaviour outside their community (eg, Debian removing Jacob Appelbaum after it was revealed that he’d sexually assaulted multiple people), but there’s been no real effort to build a meaningful decision making framework around that.

As a result, communities struggle to make consistent decisions. It’s unreasonable to expect individual communities to solve these problems on their own, but that doesn’t mean we can ignore them. It’s time to start coming up with a real set of best practices.

# Now Available: A New AWS Quick Start Reference Deployment for CJIS

As part of the AWS Compliance Quick Start program, AWS has published a new Quick Start reference deployment for customers who need to align with Criminal Justice Information Services (CJIS) Security Policy 5.6 and process Criminal Justice Information (CJI) in accordance with this policy. The new Quick Start is AWS Enterprise Accelerator – Compliance: CJIS, and it makes it easier for you to address the list of supported controls you will find in the security controls matrix that accompanies the Quick Start.

As all AWS Quick Starts do, this Quick Start helps you automate the building of a recommended architecture that, when deployed as a package, provides a baseline AWS configuration. The Quick Start uses sets of nested AWS CloudFormation templates and user data scripts to create an example environment with a two-VPC, multi-tiered web service.

The new Quick Start also includes:

The recommended architecture built by the Quick Start supports a wide variety of AWS best practices (all of which are detailed in the Quick Start), including the use of multiple Availability Zones, isolation using public and private subnets, load balancing, and Auto Scaling.

The Quick Start package also includes a deployment guide with detailed instructions and a security controls matrix that describes how the deployment addresses CJIS Security Policy 5.6 controls. You should have your IT security assessors and risk decision makers review the security controls matrix so that they can understand the extent of the implementation of the controls within the architecture. The matrix also identifies the specific resources in the CloudFormation templates that affect each control, and contains cross-references to the CJIS Security Policy 5.6 security controls.

– Emil

# Object models

Post Syndicated from Eevee original https://eev.ee/blog/2017/11/28/object-models/

Well then!

I’ve written before about what I think objects are: state and behavior, which in practice mostly means method calls.

I suspect that the popular impression of what objects are, and also how they should work, comes from whatever C++ and Java happen to do. From that point of view, the whole post above is probably nonsense. If the baseline notion of “object” is a rigid definition woven tightly into the design of two massively popular languages, then it doesn’t even make sense to talk about what “object” should mean — it does mean the features of those languages, and cannot possibly mean anything else.

I think that’s a shame! It piles a lot of baggage onto a fairly simple idea. Polymorphism, for example, has nothing to do with objects — it’s an escape hatch for static type systems. Inheritance isn’t the only way to reuse code between objects, but it’s the easiest and fastest one, so it’s what we get. Frankly, it’s much closer to a speed tradeoff than a fundamental part of the concept.

We could do with more experimentation around how objects work, but that’s impossible in the languages most commonly thought of as object-oriented.

Here, then, is a (very) brief run through the inner workings of objects in four very dynamic languages. I don’t think I really appreciated objects until I’d spent some time with Python, and I hope this can help someone else whet their own appetite.

## Python 3

Of the four languages I’m going to touch on, Python will look the most familiar to the Java and C++ crowd. For starters, it actually has a class construct.

```python
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __neg__(self):
        return Vector(-self.x, -self.y)

    # Python 3 spells the / operator __truediv__ (__div__ was Python 2 only)
    def __truediv__(self, denom):
        return Vector(self.x / denom, self.y / denom)

    @property
    def magnitude(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

    def normalized(self):
        return self / self.magnitude
```

The __init__ method is an initializer, which is like a constructor but named differently (because the object already exists in a usable form by the time the initializer is called). Operator overloading is done by implementing methods with other special __dunder__ names. Properties can be created with @property, where the @ is syntax for applying a wrapper function to a function as it’s defined. You can do inheritance, even multiply:

```python
class Foo(A, B, C):
    def bar(self, x, y, z):
        # do some stuff
        super().bar(x, y, z)
```

Cool, a very traditional object model.

Except… for some details.

### Some details

For one, Python objects don’t have a fixed layout. Code both inside and outside the class can add or remove whatever attributes they want from whatever object they want. The underlying storage is just a dict, Python’s mapping type. (Or, rather, something like one. Also, it’s possible to change, which will probably be the case for everything I say here.)
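Here is a minimal sketch of that flexibility, using a hypothetical `Point` class that is not from the original post. It assumes an ordinary class (no `__slots__`), so every object carries its own dict.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
p.label = "origin-ish"      # add an attribute the class never mentioned
del p.y                     # remove one the initializer created

storage = vars(p)           # the per-object dict (the same object as p.__dict__)
print(sorted(storage))      # ['label', 'x']

storage["z"] = 3            # writing to the dict creates an attribute too
print(p.z)                  # 3
```

Nothing here needed the cooperation of `Point`; the class only supplied the initializer.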

If you create some attributes at the class level, you’ll start to get a peek behind the curtains:

```python
class Foo:
    values = []

    def add_value(self, value):
        self.values.append(value)

a = Foo()
b = Foo()
a.add_value('a')
print(a.values)  # ['a']
b.add_value('b')
print(b.values)  # ['a', 'b']
```

The [] assigned to values isn’t a default assigned to each object. In fact, the individual objects don’t know about it at all! You can use vars(a) to get at the underlying storage dict, and you won’t see a values entry in there anywhere.

Instead, values lives on the class, which is a value (and thus an object) in its own right. When Python is asked for self.values, it checks to see if self has a values attribute; in this case, it doesn’t, so Python keeps going and asks the class for one.
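The fallback can be observed directly. This sketch repeats the `Foo` class so it runs on its own, then shows that assigning to the attribute (rather than mutating the shared list) creates a per-object entry that shadows the class-level one.

```python
class Foo:
    values = []

    def add_value(self, value):
        self.values.append(value)

a = Foo()
b = Foo()
a.add_value('a')

print(vars(a))                    # {}  (nothing is stored on the object itself)
print('values' in vars(Foo))      # True  (the list lives on the class)
print(a.values is b.values)       # True  (both lookups fall back to Foo.values)

# Assignment stores a new attribute on b only; a still sees the class's list.
b.values = ['only b']
print(vars(b))                    # {'values': ['only b']}
print(a.values)                   # ['a']
```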

Python’s object model is secretly prototypical — a class acts as a prototype, as a shared set of fallback values, for its objects.

In fact, this is also how method calls work! They aren’t syntactically special at all, which you can see by separating the attribute lookup from the call.

```python
print("abc".startswith("a"))  # True
meth = "abc".startswith
print(meth("a"))  # True
```

Reading obj.method looks for a method attribute; if there isn’t one on obj, Python checks the class. Here, it finds one: it’s a function from the class body.

Ah, but wait! In the code I just showed, meth seems to “know” the object it came from, so it can’t just be a plain function. If you inspect the resulting value, it claims to be a “bound method” or “built-in method” rather than a function, too. Something funny is going on here, and that funny something is the descriptor protocol.
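To see what that inspection turns up, here is a small sketch with a hypothetical `Greeter` class. The type names in the comments are what CPython 3 reports; other implementations may name them differently.

```python
class Greeter:
    def greet(self):
        return "hello from " + type(self).__name__

g = Greeter()

plain = Greeter.greet        # looked up on the class: a plain function
bound = g.greet              # looked up via an instance: a bound method

print(type(plain).__name__)          # function
print(type(bound).__name__)          # method
print(bound.__self__ is g)           # True
print(bound.__func__ is plain)       # True
print(bound() == plain(g))           # True

builtin = "abc".startswith
print(type(builtin).__name__)        # builtin_function_or_method
print(builtin.__self__)              # abc
```

The bound method is just a small object holding the function and the instance it was fetched from.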

### Descriptors

Python allows attributes to implement their own custom behavior when read from or written to. Such an attribute is called a descriptor. I’ve written about them before, but here’s a quick overview.

If Python looks up an attribute, finds it in a class, and the value it gets has a __get__ method… then instead of using that value, Python will use the return value of its __get__ method.

The @property decorator works this way. The magnitude property in my original example was shorthand for doing this:

```python
class MagnitudeDescriptor:
    def __get__(self, instance, owner):
        if instance is None:
            return self
        return (instance.x ** 2 + instance.y ** 2) ** 0.5

class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    magnitude = MagnitudeDescriptor()
```

When you ask for somevec.magnitude, Python checks somevec but doesn’t find magnitude, so it consults the class instead. The class does have a magnitude, and it’s a value with a __get__ method, so Python calls that method and somevec.magnitude evaluates to its return value. (The instance is None check is because __get__ is called even if you get the descriptor directly from the class via Vector.magnitude. A descriptor intended to work on instances can’t do anything useful in that case, so the convention is to return the descriptor itself.)

You can also intercept attempts to write to or delete an attribute, and do absolutely whatever you want instead. But note that, similar to operator overloading in Python, the descriptor must be on a class; you can’t just slap one on an arbitrary object and have it work.
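As a sketch of intercepting writes, here is a hypothetical `Positive` descriptor (the name and the validation rule are made up for illustration). It implements `__set__` alongside `__get__`, and the last few lines show that the same descriptor does nothing special when stored on an instance instead of a class.

```python
class Positive:
    """Hypothetical descriptor that rejects non-positive values."""

    def __set_name__(self, owner, name):
        # Called once when the owning class is created (Python 3.6+).
        self.storage_name = "_" + name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("must be positive")
        setattr(instance, self.storage_name, value)


class Circle:
    radius = Positive()

    def __init__(self, radius):
        self.radius = radius     # goes through Positive.__set__


c = Circle(2)
print(c.radius)                  # 2

try:
    c.radius = -1
except ValueError as exc:
    print("rejected:", exc)      # rejected: must be positive

# Stored on an instance, a descriptor is just an ordinary value.
class Plain:
    pass

obj = Plain()
obj.attr = Positive()
print(type(obj.attr).__name__)   # Positive
```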

This brings me right around to how “bound methods” actually work. Functions are descriptors! The function type implements __get__, and when a function is retrieved from a class via an instance, that __get__ bundles the function and the instance together into a tiny bound method object. It’s essentially:

```python
import functools

class FunctionType:
    def __get__(self, instance, owner):
        if instance is None:
            return self
        return functools.partial(self, instance)
```

The self passed as the first argument to methods is not special or magical in any way. It’s built out of a few simple pieces that are also readily accessible to Python code.

Note also that because obj.method() is just an attribute lookup and a call, Python doesn’t actually care whether method is a method on the class or just some callable thing on the object. You won’t get the auto-self behavior if it’s on the object, but otherwise there’s no difference.
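A short sketch of that difference, with a hypothetical `Button` class and a free function named `standalone`:

```python
class Button:
    def click(self):
        return "class-level click on " + type(self).__name__

def standalone(*args):
    return "standalone called with %d argument(s)" % len(args)

b = Button()
print(b.click())          # class-level click on Button

b.click = standalone      # a plain function stored directly on the object
print(b.click())          # standalone called with 0 argument(s)
```

The call syntax is identical both times; only the lookup result changes, and with it whether the instance is passed along.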

### More attribute access, and the interesting part

Descriptors are one of several ways to customize attribute access. Classes can implement __getattr__ to intervene when an attribute isn’t found on an object; __setattr__ and __delattr__ to intervene when any attribute is set or deleted; and __getattribute__ to implement unconditional attribute access. (That last one is a fantastic way to create accidental recursion, since any attribute access you do within __getattribute__ will of course call __getattribute__ again.)
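Two of those hooks can be sketched as follows. Both classes are hypothetical: `LoggingProxy` uses `__getattr__` as a fallback, and `Shouty` uses `__getattribute__` while avoiding the recursion trap by delegating to `object.__getattribute__`.

```python
class LoggingProxy:
    """Hypothetical wrapper: forwards unknown attributes to a target."""

    def __init__(self, target):
        self._target = target
        self.seen = []

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so _target and seen
        # (which live in the instance dict) never come through here.
        self.seen.append(name)
        return getattr(self._target, name)


class Shouty:
    """Hypothetical class that uppercases every string attribute."""

    def __init__(self):
        self.word = "quiet"

    def __getattribute__(self, name):
        # Delegate to object's implementation; reading self.<anything>
        # here would re-enter __getattribute__ and recurse forever.
        value = object.__getattribute__(self, name)
        return value.upper() if isinstance(value, str) else value


proxy = LoggingProxy([3, 1, 2])
print(proxy.count(1))     # 1
print(proxy.seen)         # ['count']
print(Shouty().word)      # QUIET
```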

Here’s what I really love about Python. It might seem like a magical special case that descriptors only work on classes, but it really isn’t. You could implement exactly the same behavior yourself, in pure Python, using only the things I’ve just told you about. Classes are themselves objects, remember, and they are instances of type, so the reason descriptors only work on classes is that type effectively does this:

```python
class type:
    def __getattribute__(self, name):
        value = super().__getattribute__(name)
        # like all op overloads, __get__ must be on the type, not the instance
        ty = type(value)
        if hasattr(ty, '__get__'):
            # it's a descriptor!  this is a class access so there is no instance
            return ty.__get__(value, None, self)
        else:
            return value
```

You can even trivially prove to yourself that this is what’s going on by skipping over type’s behavior:

```python
class Descriptor:
    def __get__(self, instance, owner):
        print('called!')

class Foo:
    bar = Descriptor()

Foo.bar  # called!
type.__getattribute__(Foo, 'bar')  # called!
object.__getattribute__(Foo, 'bar')  # ...
```

And that’s not all! The mysterious super function, used to exhaustively traverse superclass method calls even in the face of diamond inheritance, can also be expressed in pure Python using these primitives. You could write your own superclass calling convention and use it exactly the same way as super.
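As a rough illustration of that claim, here is a deliberately simplified stand-in for `super`, called `MySuper` (a hypothetical name). It is a sketch under stated assumptions, not how the built-in is implemented: it walks the instance's MRO past a given class, only handles instance methods, and binds what it finds through the descriptor protocol.

```python
class MySuper:
    """Simplified, hypothetical stand-in for super(cls, obj)."""

    def __init__(self, cls, obj):
        self._cls = cls
        self._obj = obj

    def __getattr__(self, name):
        # Search the classes that come after cls in the instance's MRO.
        mro = type(self._obj).__mro__
        for klass in mro[mro.index(self._cls) + 1:]:
            if name in vars(klass):
                value = vars(klass)[name]
                getter = getattr(type(value), '__get__', None)
                if getter is not None:
                    # Bind via the descriptor protocol, as attribute lookup would.
                    return getter(value, self._obj, type(self._obj))
                return value
        raise AttributeError(name)


class A:
    def who(self):
        return ['A']

class B(A):
    def who(self):
        return ['B'] + MySuper(B, self).who()

class C(A):
    def who(self):
        return ['C'] + MySuper(C, self).who()

class D(B, C):
    def who(self):
        return ['D'] + MySuper(D, self).who()

print(D().who())    # ['D', 'B', 'C', 'A']
```

Even in the diamond, every class is visited exactly once, because the walk follows the MRO of the instance rather than each class's own bases.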

This is one of the things I really like about Python. Very little of it is truly magical; virtually everything about the object model exists in the types rather than the language, which means virtually everything can be customized in pure Python.

### Class creation and metaclasses

A very brief word on all of this stuff, since I could talk forever about Python and I have three other languages to get to.

The class block itself is fairly interesting. It looks like this:

```python
class Name(*bases, **kwargs):
    # code
```

I’ve said several times that classes are objects, and in fact the class block is one big pile of syntactic sugar for calling type(...) with some arguments to create a new type object.

The Python documentation has a remarkably detailed description of this process, but the gist is:

• Python determines the type of the new class — the metaclass — by looking for a metaclass keyword argument. If there isn’t one, Python uses the “lowest” type among the provided base classes. (If you’re not doing anything special, that’ll just be type, since every class inherits from object and object is an instance of type.)

• Python executes the class body. It gets its own local scope, and any assignments or method definitions go into that scope.

• Python now calls type(name, bases, attrs, **kwargs). The name is whatever was right after class; the bases are positional arguments; and attrs is the class body’s local scope. (This is how methods and other class attributes end up on the class.) The brand new type is then assigned to Name.
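The last step can be done by hand. This sketch builds a class with the three-argument form of `type` and compares it with the equivalent class block; the names `Manual`, `Sugared`, and `describe` are made up for illustration.

```python
def describe(self):
    return "%s(%r)" % (type(self).__name__, self.value)

# Roughly what a class block boils down to: a name, a tuple of bases,
# and a dict standing in for the class body's local scope.
attrs = {
    "value": 0,
    "describe": describe,
}
Manual = type("Manual", (object,), attrs)

# The equivalent class block, for comparison.
class Sugared:
    value = 0

    def describe(self):
        return "%s(%r)" % (type(self).__name__, self.value)

m = Manual()
m.value = 5
print(m.describe())                    # Manual(5)
print(Sugared().describe())            # Sugared(0)
print(type(Manual) is type(Sugared))   # True
```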

Of course, you can mess with most of this. You can implement __prepare__ on a metaclass, for example, to use a custom mapping as storage for the local scope — including any reads, which allows for some interesting shenanigans. The only part you can’t really implement in pure Python is the scoping bit, which has a couple extra rules that make sense for classes. (In particular, functions defined within a class block don’t close over the class body; that would be nonsense.)

### Object creation

Finally, there’s what actually happens when you create an object — including a class, which remember is just an invocation of type(...).

Calling Foo(...) is implemented as, well, a call. Any type can implement calls with the __call__ special method, and you’ll find that type itself does so. It looks something like this:

```python
# oh, a fun wrinkle that's hard to express in pure python: type is a class, so
# it's an instance of itself
class type:
    def __call__(self, *args, **kwargs):
        # remember, here 'self' is a CLASS, an instance of type.
        # __new__ is a true constructor: object.__new__ allocates storage
        # for a new blank object
        instance = self.__new__(self, *args, **kwargs)
        # you can return whatever you want from __new__ (!), and __init__
        # is only called on it if it's of the right type
        if isinstance(instance, self):
            instance.__init__(*args, **kwargs)
        return instance
```

Again, you can trivially confirm this by asking any type for its __call__ method. Assuming that type doesn’t implement __call__ itself, you’ll get back a bound version of type’s implementation.

```python
>>> list.__call__
<method-wrapper '__call__' of type object at 0x...>
```

You can thus implement __call__ in your own metaclass to completely change how instances of its classes are created, including skipping the creation altogether, if you like.
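One way this might look is a metaclass that hands back a cached object instead of creating a new one. `SingletonMeta` and `Config` are hypothetical names, and this is only a sketch of the idea, not a recommendation.

```python
class SingletonMeta(type):
    """Hypothetical metaclass: each class gets at most one instance."""

    def __call__(cls, *args, **kwargs):
        if "_instance" not in vars(cls):
            # Fall back to type.__call__, which runs __new__ and __init__.
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance


class Config(metaclass=SingletonMeta):
    def __init__(self):
        self.loaded = True


first = Config()
second = Config()
print(first is second)    # True
```

The second `Config()` never reaches `__new__` or `__init__` at all; the metaclass answers the call itself.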

And… there’s a bunch of stuff I haven’t even touched on.

### The Python philosophy

Python offers something that, on the surface, looks like a “traditional” class/object model. Under the hood, it acts more like a prototypical system, where failed attribute lookups simply defer to a superclass or metaclass.

The language also goes to almost superhuman lengths to expose all of its moving parts. Even the prototypical behavior is an implementation of __getattribute__ somewhere, which you are free to completely replace in your own types. Proxying and delegation are easy.

Also very nice is that these features “bundle” well, by which I mean a library author can do all manner of convoluted hijinks, and a consumer of that library doesn’t have to see any of it or understand how it works. You only need to inherit from a particular class (which has a metaclass) or use some descriptor as a decorator; you don’t even need to learn any new syntax.

This meshes well with Python culture, which is pretty big on the principle of least surprise. These super-advanced features tend to be tightly confined to single simple features (like “makes a weak attribute”) or cordoned off with DSLs (e.g., defining a form/struct/database table with a class body). In particular, I’ve never seen a metaclass in the wild implement its own __call__.

I have mixed feelings about that. It’s probably a good thing overall that the Python world shows such restraint, but I wonder if there are some very interesting possibilities we’re missing out on. I implemented a metaclass __call__ myself, just once, in an entity/component system that strove to minimize fuss when communicating between components. It never saw the light of day, but I enjoyed seeing some new things Python could do with the same relatively simple syntax. I wouldn’t mind seeing, say, an object model based on composition (with no inheritance) built atop Python’s primitives.

## Lua

Lua doesn’t have an object model. Instead, it gives you a handful of very small primitives for building your own object model. This is pretty typical of Lua — it’s a very powerful language, but has been carefully constructed to be very small at the same time. I’ve never encountered anything else quite like it, and “but it starts indexing at 1!” really doesn’t do it justice.

The best way to demonstrate how objects work in Lua is to build some from scratch. We need two key features. The first is metatables, which bear a passing resemblance to Python’s metaclasses.

### Tables and metatables

The table is Lua’s mapping type and its primary data structure. Keys can be any value other than nil. Lists are implemented as tables whose keys are consecutive integers starting from 1. Nothing terribly surprising. The dot operator is sugar for indexing with a string key.

```lua
local t = { a = 1, b = 2 }
print(t['a'])  -- 1
print(t.b)  -- 2
t.c = 3
print(t['c'])  -- 3
```

A metatable is a table that can be associated with another value (usually another table) to change its behavior. For example, operator overloading is implemented by assigning a function to a special key in a metatable.

```lua
local t = { a = 1, b = 2 }
--print(t + 0)  -- error: attempt to perform arithmetic on a table value

local mt = {
    __add = function(left, right)
        return 12
    end,
}
setmetatable(t, mt)
print(t + 0)  -- 12
```

Now, the interesting part: one of the special keys is __index, which is consulted when the base table is indexed by a key it doesn’t contain. Here’s a table that claims every key maps to itself.

```lua
local t = {}
local mt = {
    __index = function(table, key)
        return key
    end,
}
setmetatable(t, mt)
print(t.foo)  -- foo
print(t.bar)  -- bar
print(t[3])  -- 3
```

__index doesn’t have to be a function, either. It can be yet another table, in which case that table is simply indexed with the key. If the key still doesn’t exist and that table has a metatable with an __index, the process repeats.

With this, it’s easy to have several unrelated tables that act as a single table. Call the base table an object, fill the __index table with functions and call it a class, and you have half of an object system. You can even get prototypical inheritance by chaining __indexes together.
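The lookup rule is small enough to model outside Lua. What follows is a rough Python model of it, not Lua code and not how the Lua VM works: `Table` and `index` are hypothetical stand-ins, with a dict playing the role of a table and `None` playing the role of nil.

```python
class Table(dict):
    """Hypothetical stand-in for a Lua table: a dict plus a metatable slot."""
    metatable = None


def index(table, key):
    """Look up key, consulting __index the way the text describes."""
    while True:
        if key in table:
            return table[key]
        mt = table.metatable
        handler = mt.get("__index") if mt is not None else None
        if handler is None:
            return None               # Lua would yield nil here
        if callable(handler):
            return handler(table, key)
        table = handler               # __index is another table: repeat there


animal = Table(legs=4)
cat = Table(sound="meow")
cat.metatable = Table(__index=animal)     # "class" falls back to "superclass"
tom = Table(name="Tom")
tom.metatable = Table(__index=cat)        # "object" falls back to "class"

print(index(tom, "name"))     # Tom
print(index(tom, "sound"))    # meow
print(index(tom, "legs"))     # 4
print(index(tom, "wings"))    # None
```

Chaining two `__index` tables is all it takes to get the object, class, and superclass arrangement.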

At this point things are a little confusing, since we have at least three tables going on, so here’s a diagram. Keep in mind that Lua doesn’t actually have anything called an “object”, “class”, or “method” — those are just convenient nicknames for a particular structure we might build with Lua’s primitives.

```text
                    ╔═══════════╗        ...
                    ║ metatable ║         ║
                    ╟───────────╢   ┌─────╨───────────────────────┐
                    ║ __index   ╫───┤ lookup table ("superclass") │
                    ╚═══╦═══════╝   ├─────────────────────────────┤
  ╔═══════════╗         ║           │ some other method           ┼─── function() ... end
  ║ metatable ║         ║           └─────────────────────────────┘
  ╟───────────╢   ┌─────╨──────────────────┐
  ║ __index   ╫───┤ lookup table ("class") │
  ╚═══╦═══════╝   ├────────────────────────┤
      ║           │ some method            ┼─── function() ... end
      ║           └────────────────────────┘
┌─────╨─────────────────┐
│ base table ("object") │
└───────────────────────┘
```

Note that a metatable is not the same as a class; it defines behavior, not methods. Conversely, if you try to use a class directly as a metatable, it will probably not do much. (This is pretty different from e.g. Python, where operator overloads are just methods with funny names. One nice thing about the Lua approach is that you can keep interface-like functionality separate from methods, and avoid clogging up arbitrary objects’ namespaces. You could even use a dummy table as a key and completely avoid name collisions.)

Anyway, code!

```lua
local class = {
    foo = function(a)
        print("foo got", a)
    end,
}
local mt = { __index = class }
-- setmetatable returns its first argument, so this is nice shorthand
local obj1 = setmetatable({}, mt)
local obj2 = setmetatable({}, mt)
obj1.foo(7)  -- foo got 7
obj2.foo(9)  -- foo got 9
```

Wait, wait, hang on. Didn’t I call these methods? How do they get at the object? Maybe Lua has a magical this variable?

### Methods, sort of

Not quite, but this is where the other key feature comes in: method-call syntax. It’s the lightest touch of sugar, just enough to have method invocation.

```lua
-- note the colon!
a:b(c, d, ...)

-- exactly equivalent to this
-- (except that a is only evaluated once)
a.b(a, c, d, ...)

-- which of course is really this
a["b"](a, c, d, ...)
```

Now we can write methods that actually do something.

```lua
local class = {
    bar = function(self)
        print("our score is", self.score)
    end,
}
local mt = { __index = class }
local obj1 = setmetatable({ score = 13 }, mt)
local obj2 = setmetatable({ score = 25 }, mt)
obj1:bar()  -- our score is 13
obj2:bar()  -- our score is 25
```

And that’s all you need. Much like Python, methods and data live in the same namespace, and Lua doesn’t care whether obj:method() finds a function on obj or gets one from the metatable’s __index. Unlike Python, the function will be passed self either way, because self comes from the use of : rather than from the lookup behavior.

(Aside: strictly speaking, any Lua value can have a metatable, and if you try to index a non-table, Lua will always consult the metatable’s __index. All strings share a metatable whose __index is the string library, so you can call methods on them: try ("%s %s"):format(1, 2). I don’t think Lua lets user code set the metatable for non-tables, so this isn’t that interesting, but if you’re writing Lua bindings from C then you can wrap your pointers in metatables to give them methods implemented in C.)

### Bringing it all together

Of course, writing all this stuff every time is a little tedious and error-prone, so instead you might want to wrap it all up inside a little function. No problem.

```lua
local function make_object(body)
    -- create a metatable
    local mt = { __index = body }
    -- create a base table to serve as the object itself
    local obj = setmetatable({}, mt)
    -- and, done
    return obj
end

-- you can leave off parens if you're only passing in
local Dog = {
    -- this acts as a "default" value; if obj.barks is missing, __index will
    -- kick in and find this value on the class.  but if obj.barks is assigned
    -- to, it'll go in the object and shadow the value here.
    barks = 0,

    bark = function(self)
        self.barks = self.barks + 1
        print("woof!")
    end,
}

local mydog = make_object(Dog)
mydog:bark()  -- woof!
mydog:bark()  -- woof!
mydog:bark()  -- woof!
print(mydog.barks)  -- 3
print(Dog.barks)  -- 0
```

It works, but it’s fairly barebones. The nice thing is that you can extend it pretty much however you want. I won’t reproduce an entire serious object system here — lord knows there are enough of them floating around — but the implementation I have for my LÖVE games lets me do this:

```lua
local Animal = Object:extend{
    cries = 0,
}

-- called automatically by Object
function Animal:init()
    print("whoops i couldn't think of anything interesting to put here")
end

-- this is just nice syntax for adding a first argument called 'self', then
-- assigning this function to Animal.cry
function Animal:cry()
    self.cries = self.cries + 1
end

local Cat = Animal:extend{}

function Cat:cry()
    print("meow!")
    Cat.__super.cry(self)
end

local cat = Cat()
cat:cry()  -- meow!
cat:cry()  -- meow!
print(cat.cries)  -- 2
```

When I say you can extend it however you want, I mean that. I could’ve implemented Python (2)-style super(Cat, self):cry() syntax; I just never got around to it. I could even make it work with multiple inheritance if I really wanted to — or I could go the complete opposite direction and only implement composition. I could implement descriptors, customizing the behavior of individual table keys. I could add pretty decent syntax for composition/proxying. I am trying very hard to end this section now.

### The Lua philosophy

Lua’s philosophy is to… not have a philosophy? It gives you the bare minimum to make objects work, and you can do absolutely whatever you want from there. Lua does have something resembling prototypical inheritance, but it’s not so much a first-class feature as an emergent property of some very simple tools. And since you can make __index be a function, you could avoid the prototypical behavior and do something different entirely.

The very severe downside, of course, is that you have to find or build your own object system — which can get pretty confusing very quickly, what with the multiple small moving parts. Third-party code may also have its own object system with subtly different behavior. (Though, in my experience, third-party code tries very hard to avoid needing an object system at all.)

It’s hard to say what the Lua “culture” is like, since Lua is an embedded language that’s often a little different in each environment. I imagine it has a thousand millicultures, instead. I can say that the tedium of building my own object model has led me into something very “traditional”, with prototypical inheritance and whatnot. It’s partly what I’m used to, but it’s also just really dang easy to get working.

Likewise, while I love properties in Python and use them all the dang time, I’ve yet to use a single one in Lua. They wouldn’t be particularly hard to add to my object model, but having to add them myself (or shop around for an object model with them and also port all my code to use it) adds a huge amount of friction. I’ve thought about designing an interesting ECS with custom object behavior, too, but… is it really worth the effort? For all the power and flexibility Lua offers, the cost is that by the time I have something working at all, I’m too exhausted to actually use any of it.

## JavaScript

JavaScript is notable for being preposterously heavily used, yet not having a class block.

Well. Okay. Yes. It has one now. It didn’t for a very long time, and even the one it has now is sugar.

Here’s a vector class again:

```javascript
class Vector {
    constructor(x, y) {
        this.x = x;
        this.y = y;
    }

    get magnitude() {
        return Math.sqrt(this.x * this.x + this.y * this.y);
    }

    dot(other) {
        return this.x * other.x + this.y * other.y;
    }
}
```

In “classic” JavaScript, this would be written as:

```javascript
function Vector(x, y) {
    this.x = x;
    this.y = y;
}

Object.defineProperty(Vector.prototype, 'magnitude', {
    configurable: true,
    enumerable: true,
    get: function() {
        return Math.sqrt(this.x * this.x + this.y * this.y);
    },
});

Vector.prototype.dot = function(other) {
    return this.x * other.x + this.y * other.y;
};
```

Hm, yes. I can see why they added class.

### The JavaScript model

In JavaScript, a new type is defined in terms of a function, which is its constructor.

Right away we get into trouble here. There is a very big difference between these two invocations, which I actually completely forgot about just now after spending four hours writing about Python and Lua:

```javascript
let vec = Vector(3, 4);
let vec = new Vector(3, 4);
```

The first calls the function Vector. It assigns some properties to this, which here is going to be window, so now you have a global x and y. It then returns nothing, so vec is undefined.

The second calls Vector with this set to a new empty object, then evaluates to that object. The result is what you’d actually expect.

(You can detect this situation with the strange new.target expression, but I have never once remembered to do so.)

From here, we have true, honest-to-god, first-class prototypical inheritance. The word “prototype” is even right there. When you write this:

```javascript
vec.dot(vec2)
```

JavaScript will look for dot on vec and (presumably) not find it. It then consults vec’s prototype, an object you can see for yourself by using Object.getPrototypeOf(). Since vec is a Vector, its prototype is Vector.prototype.

I stress that Vector.prototype is not the prototype for Vector. It’s the prototype for instances of Vector.

(I say “instance”, but the true type of vec here is still just object. If you want to find Vector, it’s automatically assigned to the constructor property of its own prototype, so it’s available as vec.constructor.)

Of course, Vector.prototype can itself have a prototype, in which case the process would continue if dot were not found. A common (and, arguably, very bad) way to simulate single inheritance is to set Class.prototype to an instance of a superclass to get the prototype right, then tack on the methods for Class. Nowadays we can do Object.create(Superclass.prototype).

Now that I’ve been through Python and Lua, though, this isn’t particularly surprising. I kinda spoiled it.

I suppose one difference in JavaScript is that you can tack arbitrary attributes directly onto Vector all you like, and they will remain invisible to instances since they aren’t in the prototype chain. This is kind of backwards from Lua, where you can squirrel stuff away in the metatable.

Another difference is that every single object in JavaScript has a bunch of properties already tacked on — the ones in Object.prototype. Every object (and by “object” I mean any mapping) has a prototype, and that prototype defaults to Object.prototype, and it has a bunch of ancient junk like isPrototypeOf.

(Nit: it’s possible to explicitly create an object with no prototype via Object.create(null).)

Like Lua, and unlike Python, JavaScript doesn’t distinguish between keys found on an object and keys found via a prototype. Properties can be defined on prototypes with Object.defineProperty(), but that works just as well directly on an object, too. JavaScript doesn’t have a lot of operator overloading, but some things like Symbol.iterator also work on both objects and prototypes.

You may, at this point, be wondering what this is. Unlike Lua and Python (and the last language below), this is a special built-in value — a context value, invisibly passed for every function call.

It’s determined by where the function came from. If the function was the result of an attribute lookup, then this is set to the object containing that attribute. Otherwise, this is set to the global object, window. (You can also set this to whatever you want via the call method on functions.)

This decision is made lexically, i.e. from the literal source code as written. There are no Python-style bound methods. In other words:

```javascript
// this = obj
obj.method()
// this = window
let meth = obj.method
meth()
```

Also, because this is reassigned on every function call, it cannot be meaningfully closed over, which makes using closures within methods incredibly annoying. The old approach was to assign this to some other regular name like self (which got syntax highlighting since it’s also a built-in name in browsers); then we got Function.bind, which produced a callable thing with a fixed context value, which was kind of nice; and now finally we have arrow functions, which explicitly close over the current this when they’re defined and don’t change it when called. Phew.

### Class syntax

I already showed class syntax, and it’s really just one big macro for doing all the prototype stuff The Right Way. It even prevents you from calling the type without new. The underlying model is exactly the same, and you can inspect all the parts.

```javascript
class Vector { ... }

console.log(Vector.prototype);  // { dot: ..., magnitude: ..., ... }
let vec = new Vector(3, 4);
console.log(Object.getPrototypeOf(vec));  // same as Vector.prototype

// i don't know why you would subclass vector but let's roll with it
class Vectest extends Vector { ... }

console.log(Vectest.prototype);  // { ... }
console.log(Object.getPrototypeOf(Vectest.prototype))  // same as Vector.prototype
```

Alas, class syntax has a couple shortcomings. You can’t use the class block to assign arbitrary data to either the type object or the prototype — apparently it was deemed too confusing that mutations would be shared among instances. Which… is… how prototypes work. How Python works. How JavaScript itself, one of the most popular languages of all time, has worked for twenty-two years. Argh.

You can still do whatever assignment you want outside of the class block, of course. It’s just a little ugly, and not something I’d think to look for with a sugary class.
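As a rough sketch of what that assignment looks like (the `Vector` body, `tags`, and `ZERO` are invented here for illustration), data can be hung on the prototype or on the type object after the class block closes, and mutations to prototype data really are shared:

```javascript
class Vector {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
}

// Data on the prototype: every instance sees it through the prototype chain.
Vector.prototype.tags = [];
// Data on the type object itself.
Vector.ZERO = new Vector(0, 0);

const a = new Vector(1, 2);
const b = new Vector(3, 4);

a.tags.push("shared"); // mutates the single array living on the prototype

console.log(b.tags); // [ 'shared' ]
console.log(Object.prototype.hasOwnProperty.call(a, "tags")); // false
```

Neither instance owns `tags`; both find it by walking up to `Vector.prototype`, which is exactly the sharing behavior the class block declines to expose.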

A more subtle result of this behavior is that a class block isn’t quite the same syntax as an object literal. The check for data isn’t a runtime thing; class Foo { x: 3 } fails to parse. So JavaScript now has two largely but not entirely identical styles of key/value block.

### Attribute access

Here’s where things start to come apart at the seams, just a little bit.

JavaScript doesn’t really have an attribute protocol. Instead, it has two… extension points, I suppose.

One is Object.defineProperty, seen above. For common cases, there’s also the get syntax inside a property literal, which does the same thing. But unlike Python’s @property, these aren’t wrappers around some simple primitives; they are the primitives. JavaScript is the only language of these four to have “property that runs code on access” as a completely separate first-class concept.
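A minimal sketch of both spellings, using made-up `vec` and `vec2` objects. The getter syntax and `Object.defineProperty` should end up creating the same kind of accessor property, which the descriptor at the end makes visible.

```javascript
// Getter syntax inside an object literal.
const vec = {
  x: 3,
  y: 4,
  get magnitude() {
    return Math.sqrt(this.x ** 2 + this.y ** 2);
  },
};

// The same thing spelled out with Object.defineProperty.
const vec2 = { x: 6, y: 8 };
Object.defineProperty(vec2, "magnitude", {
  get() {
    return Math.sqrt(this.x ** 2 + this.y ** 2);
  },
  enumerable: true,
  configurable: true,
});

console.log(vec.magnitude, vec2.magnitude); // 5 10

// The getter is stored in the property descriptor itself; there is no
// wrapper object sitting in the slot the way a Python property does.
const descriptor = Object.getOwnPropertyDescriptor(vec, "magnitude");
console.log(typeof descriptor.get); // function
```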

If you want to intercept arbitrary attribute access (and some kinds of operators), there’s a completely different primitive: the Proxy type. It doesn’t let an existing object intercept its own attribute access or operators; instead, it produces a wrapper object that supports interception and defers to the wrapped object by default.

It’s cool to see composition used in this way, but also, extremely weird. If you want to make your own type that overloads in or calling, you have to return a Proxy that wraps your own type, rather than actually returning your own type. And (unlike the other three languages in this post) you can’t return a different type from a constructor, so you have to throw that away and produce objects only from a factory. And instanceof would be broken, but you can at least fix that with Symbol.hasInstance, which is really operator overloading, implemented in yet another completely different way.
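To make the factory idea concrete, here's a small sketch that overloads `in` through a Proxy's `has` trap. `makeRange` is a hypothetical helper, not anything built in, and the half-open range semantics are just an assumption picked for the example.

```javascript
// Hypothetical factory: wraps a plain object in a Proxy whose `has` trap
// decides what the `in` operator reports.
function makeRange(start, end) {
  const range = { start, end };
  return new Proxy(range, {
    has(target, key) {
      // Keys arrive as strings or symbols, never as numbers.
      if (typeof key === "symbol") return Reflect.has(target, key);
      const n = Number(key);
      // Non-numeric keys fall back to the wrapped object's own answer.
      if (Number.isNaN(n)) return Reflect.has(target, key);
      return n >= target.start && n < target.end;
    },
  });
}

const r = makeRange(1, 10);

console.log(5 in r);       // true
console.log(10 in r);      // false
console.log("start" in r); // true
console.log(r.start);      // 1
```

Only `in` is intercepted; with no `get` trap defined, ordinary attribute reads like `r.start` pass straight through to the wrapped object.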

I know the design here is a result of legacy and speed — if any object could intercept all attribute access, then all attribute access would be slowed down everywhere. Fair enough. It still leaves the surface area of the language a bit… bumpy?

### The JavaScript philosophy

It’s a little hard to tell. The original idea of prototypes was interesting, but it was hidden behind some very awkward syntax. Since then, we’ve gotten a bunch of extra features awkwardly bolted on to reflect the wildly varied things the built-in types and DOM API were already doing. We have class syntax, but it’s been explicitly designed to avoid exposing the prototype parts of the model.

I admit I don’t do a lot of heavy JavaScript, so I might just be overlooking it, but I’ve seen virtually no code that makes use of any of the recent advances in object capabilities. Forget about custom iterators or overloading call; I can’t remember seeing any JavaScript in the wild that even uses properties yet. I don’t know if everyone’s waiting for sufficient browser support, nobody knows about them, or nobody cares.

The model has advanced recently, but I suspect JavaScript is still shackled to its legacy of “something about prototypes, I don’t really get it, just copy the other code that’s there” as an object model. Alas! Prototypes are so good. Hopefully class syntax will make it a bit more accessible, as it has in Python.

## Perl 5

Perl 5 also doesn’t have an object system and expects you to build your own. But where Lua gives you two simple, powerful tools for building one, Perl 5 feels more like a puzzle with half the pieces missing. Clearly they were going for something, but they only gave you half of it.

In brief, a Perl object is a reference that has been blessed with a package.

I need to explain a few things. Honestly, one of the biggest problems with the original Perl object setup was how many strange corners and unique jargon you had to understand just to get off the ground.

(If you want to try running any of this code, you should stick a use v5.26; as the first line. Perl is very big on backwards compatibility, so you need to opt into breaking changes, and even the mundane say builtin is behind a feature gate.)

A few things of note here.

First, $self->[0] has nothing to do with objects; it’s normal syntax for getting the value at index 0 out of an array reference called $self. (Most classes are based on hashrefs and would use $self->{value} instead.) A blessed reference is still a reference and can be treated like one.

In general, -> is Perl’s dereferencey operator, but its exact behavior depends on what follows. If it’s followed by brackets, then it’ll apply the brackets to the thing in the reference: ->{} to index a hash reference, ->[] to index an array reference, and ->() to call a function reference.

But if -> is followed by an identifier, then it’s a method call. For packages, that means calling a function in the package and passing the package name as the first argument. For objects (that is, blessed references), that means calling a function in the associated package and passing the object as the first argument.

This is a little weird! A blessed reference is a superposition of two things: its normal reference behavior, and some completely orthogonal object behavior. Also, object behavior has no notion of methods vs data; it only knows about methods. Perl lets you omit parentheses in a lot of places, including when calling a method with no arguments, so $vec->magnitude is really $vec->magnitude().

Perl’s blessing bears some similarities to Lua’s metatables, but ultimately Perl is much closer to Ruby’s “message passing” approach than the above three languages’ approaches of “get me something and maybe it’ll be callable”. (But this is no surprise: Ruby is a spiritual successor to Perl 5.)

All of this leads to one little wrinkle: how do you actually expose data? Above, I had to write x and y methods. Am I supposed to do that for every single attribute on my type? Yes! But don’t worry, there are third-party modules to help with this incredibly fundamental task. Take Class::Accessor::Fast, so named because it’s faster than Class::Accessor:

```perl
package Foo;
use base qw(Class::Accessor::Fast);
__PACKAGE__->mk_accessors(qw(fred wilma barney));
```

(__PACKAGE__ is the lexical name of the current package; qw(...) is a list literal that splits its contents on whitespace.)

This assumes you’re using a hashref with keys of the same names as the attributes. $obj->fred will return the fred key from your hashref, and $obj->fred(4) will change it to 4.

You also, somewhat bizarrely, have to inherit from Class::Accessor::Fast. Speaking of which,

### Inheritance

Inheritance is done by populating the package-global @ISA array with some number of (string) names of parent packages. Most code instead opts to write use base ...;, which does the same thing. Or, more commonly, use parent ...;, which… also… does the same thing.

Every package implicitly inherits from UNIVERSAL, which can be freely modified by Perl code.

A method can call its superclass method with the SUPER:: pseudo-package:

```perl
sub foo {
    my ($self) = @_;
    $self->SUPER::foo;
}
```

However, this does a depth-first search, which means it almost certainly does the wrong thing when faced with multiple inheritance. For a while the accepted solution involved a third-party module, but Perl eventually grew an alternative you have to opt into: C3, which may be more familiar to you as the order Python uses.

```perl
use mro 'c3';

sub foo {
    my ($self) = @_;
    $self->next::method;
}
```

Offhand, I’m not actually sure how next::method works, seeing as it was originally implemented in pure Perl code. I suspect it involves peeking at the caller’s stack frame. If so, then this is a very different style of customizability from e.g. Python: the MRO was never intended to be pluggable, and the use of a special pseudo-package means it isn’t really, but someone was determined enough to make it happen anyway.

### Operator overloading and whatnot

Operator overloading looks a little weird, though really it’s pretty standard Perl.

```perl
package MyClass;

use overload '+' => \&_add;

sub _add {
    my ($self, $other, $swap) = @_;
    ...
}
```

use overload here is a pragma, where “pragma” means “regular-ass module that does some wizardry when imported”.

\&_add is how you get a reference to the _add sub so you can pass it to the overload module. If you just said &_add or _add, that would call it.

And that’s it; you just pass a map of operators to functions to this built-in module. No worry about name clashes or pollution, which is pretty nice. You don’t even have to give references to functions that live in the package, if you don’t want them to clog your namespace; you could put them in another package, or even inline them anonymously.

One especially interesting thing is that Perl lets you overload every operator. Perl has a lot of operators. It considers some math builtins like sqrt and trig functions to be operators, or at least operator-y enough that you can overload them. You can also overload the “file test” operators, such as -e $path to test whether a file exists. You can overload conversions, including implicit conversion to a regex. And most fascinating to me, you can overload dereferencing, that is, the thing Perl does when you say $hashref->{key} to get at the underlying hash. So a single object could pretend to be references of multiple different types, including a subref to implement callability. Neat.

Somewhat related: you can overload basic operators (indexing, etc.) on basic types (not references!) with the tie function, which is designed completely differently and looks for methods with fixed names. Go figure.

You can intercept calls to nonexistent methods by implementing a function called AUTOLOAD, within which the $AUTOLOAD global will contain the name of the method being called. Originally this feature was, I think, intended for loading binary components or large libraries on-the-fly only when needed, hence the name. Offhand I’m not sure I ever saw it used the way __getattr__ is used in Python.

Is there a way to intercept all method calls? I don’t think so, but it is Perl, so I must be forgetting something.

### Actually no one does this any more

Like a decade ago, a council of elder sages sat down and put together a whole whizbang system that covers all of it: Moose.

```perl
package Vector;
use Moose;

has x => (is => 'rw', isa => 'Int');
has y => (is => 'rw', isa => 'Int');

sub magnitude {
    my ($self) = @_;
    return sqrt($self->x ** 2 + $self->y ** 2);
}
```

Moose has its own way to do pretty much everything, and it’s all built on the same primitives. Moose also adds metaclasses, somehow, despite that the underlying model doesn’t actually support them? I’m not entirely sure how they managed that, but I do remember doing some class introspection with Moose and it was much nicer than the built-in way.

(If you’re wondering, the built-in way begins with looking at the hash called %Vector::. No, that’s not a typo.)

I really cannot stress enough just how much stuff Moose does, but I don’t want to delve into it here since Moose itself is not actually the language model.

### The Perl philosophy

I hope you can see what I meant with what I first said about Perl, now. It has multiple inheritance with an MRO, but uses the wrong one by default. It has extensive operator overloading, which looks nothing like how inheritance works, and also some of it uses a totally different mechanism with special method names instead. It only understands methods, not data, leaving you to figure out accessors by hand.

There’s 70% of an object system here with a clear general design it was gunning for, but none of the pieces really look anything like each other. It’s weird, in a distinctly Perl way.

The result is certainly flexible, at least! It’s especially cool that you can use whatever kind of reference you want for storage, though even as I say that, I acknowledge it’s no different from simply subclassing list or something in Python. It feels different in Perl, but maybe only because it looks so different.

I haven’t written much Perl in a long time, so I don’t know what the community is like any more. Moose was already ubiquitous when I left, which you’d think would let me say “the community mostly focuses on the stuff Moose can do” — but even a decade ago, Moose could already do far more than I had ever seen done by hand in Perl. It’s always made a big deal out of roles (read: interfaces), for instance, despite that I’d never seen anyone care about them in Perl before Moose came along. Maybe their presence in Moose has made them more popular? Who knows.

Also, I wrote Perl seriously, but in the intervening years I’ve only encountered people who only ever used Perl for one-offs. Maybe it’ll come as a surprise to a lot of readers that Perl has an object model at all.

## End

Well, that was fun! I hope any of that made sense.

Special mention goes to Rust, which doesn’t have an object model you can fiddle with at runtime, but does do things a little differently.

It’s been really interesting thinking about how tiny differences make a huge impact on what people do in practice. Take the choice of storage in Perl versus Python. Perl’s massively common URI class uses a string as the storage, nothing else; I haven’t seen anything like that in Python aside from markupsafe, which is specifically designed as a string type. I would guess this is partly because Perl makes you choose — using a hashref is an obvious default, but you have to make that choice one way or the other. In Python (especially 3), inheriting from object and getting dict-based storage is the obvious thing to do; the ability to use another type isn’t quite so obvious, and doing it “right” involves a tiny bit of extra work.

Or, consider that Lua could have descriptors, but the extra bit of work (especially design work) has been enough of an impediment that I’ve never implemented them. I don’t think the object implementations I’ve looked at have included them, either. Super weird!

In that light, it’s only natural that objects would be so strongly associated with the features Java and C++ attach to them. I think that makes it all the more important to play around! Look at what Moose has done. No, really, you should bear in mind my description of how Perl does stuff and flip through the Moose documentation. It’s amazing what they’ve built.

# AWS EU (London) Region Selected to Provide Services to Support UK Law Enforcement Customers

The AWS EU (London) Region has been selected to provide services to support UK law enforcement customers. This decision followed an assessment by Home Office Digital, Data and Technology supported by their colleagues in the National Policing Information Risk Management Team (NPIRMT) to determine the region’s suitability for addressing their specific needs.

The security, privacy, and protection of AWS customers are AWS’s first priority. We are committed to supporting Public Sector, Blue Light, Justice, and Public Safety organizations. We hope that other organizations in these sectors will now be encouraged to consider AWS services when addressing their own requirements, including the challenge of providing modern, scalable technologies that can meet their ever-evolving business demands.

– Oliver

# The UK Law Enforcement Community Can Now Use the AWS Cloud

Post Syndicated from Oliver Bell original https://aws.amazon.com/blogs/security/the-uk-law-enforcement-community-can-now-use-the-aws-cloud/

The AWS EU (London) Region has been Police Assured Secure Facility (PASF) assessed, offering additional support for UK law enforcement customers. This assessment means The National Policing Information Risk Management Team (NPIRMT) has completed a comprehensive physical security assessment of the AWS UK infrastructure and has reviewed the integral practices and processes of how AWS manages data center operations. UK Policing organizations can now leverage this assessment (available to those organizations from NPIRMT) as part of their own risk management approach to systems development and design with the confidence their data is stored in highly secure and compliant facilities. Note that the NPIRMT does not offer any warranty of physical security of the AWS data center.

The security, privacy, and protection of AWS customers are our first priority, and we are committed to supporting Public Sector and Blue Light organizations. This assessment further demonstrates AWS’s commitment to deliver secure and compliant services to the UK law enforcement community. We have built technology services suitable for use by Justice, Blue Light, and Public Safety organizations, and whether in law enforcement, emergency management, or criminal justice, AWS has the capability and resources to support this community’s unique IT needs. From Public Services Network–compliant solutions to architecting a UK OFFICIAL secure environment, AWS can help tackle public safety data needs. By combining the secure and flexible AWS infrastructure with the breadth of our specialized APN Partner solutions, we are confident we can help our customers across the industry succeed in their missions.

– Oliver

# Growing up alongside tech

Post Syndicated from Eevee original https://eev.ee/blog/2017/08/09/growing-up-alongside-tech/

industrialrobot: How has your views on tech changed as you’ve got older?

This is so open-ended that it’s actually stumped me for a solid month. I’ve had a surprisingly hard time figuring out where to even start.

It’s not that my views of tech have changed too much — it’s that they’ve changed very gradually. Teasing out and explaining any one particular change is tricky when it happened invisibly over the course of 10+ years.

I think a better framework for this is to consider how my relationship to tech has changed. It’s gone through three pretty distinct phases, each of which has strongly colored how I feel and talk about technology.

## Act I

In which I start from nothing.

Nothing is an interesting starting point. You only really get to start there once.

Learning something on my own as a kid was something of a magical experience, in a way that I don’t think I could replicate as an adult. I liked computers; I liked toying with computers; so I did that.

I don’t know how universal this is, but when I was a kid, I couldn’t even conceive of how incredible things were made. Buildings? Cars? Paintings? Operating systems? Where does any of that come from? Obviously someone made them, but it’s not the sort of philosophical point I lingered on when I was 10, so in the back of my head they basically just appeared fully-formed from the æther.

That meant that when I started trying out programming, I had no aspirations. I couldn’t imagine how far I would go, because all the examples of how far I would go were completely disconnected from any idea of human achievement. I started out with BASIC on a toy computer; how could I possibly envision a connection between that and something like a mainstream video game? Every new thing felt like a new form of magic, so I couldn’t conceive that I was even in the same ballpark as whatever process produced real software. (Even seeing the source code for GORILLAS.BAS, it didn’t quite click. I didn’t think to try reading any of it until years after I’d first encountered the game.)

This isn’t to say I didn’t have goals. I invented goals constantly, as I’ve always done; as soon as I learned about a new thing, I’d imagine some ways to use it, then try to build them. I produced a lot of little weird goofy toys, some of which entertained my tiny friend group for a couple days, some of which never saw the light of day. But none of it felt like steps along the way to some mountain peak of mastery, because I didn’t realize the mountain peak was even a place that could be gone to. It was pure, unadulterated (!) playing.

I contrast this to my art career, which started only a couple years ago. I was already in my late 20s, so I’d already spent decades seeing a very broad spectrum of art: everything from quick sketches up to painted masterpieces. And I’d seen the people who create that art, sometimes seen them create it in real-time. I’m even in a relationship with one of them! And of course I’d already had the experience of advancing through tech stuff and discovering first-hand that even the most amazing software is still just code someone wrote.

So from the very beginning, from the moment I touched pencil to paper, I knew the possibilities. I knew that the goddamn Sistine Chapel was something I could learn to do, if I were willing to put enough time in — and I knew that I’m not, so I’d have to settle somewhere a ways before that. I knew that I’d have to put an awful lot of work in before I’d be producing anything very impressive.

I did it anyway (though perhaps waited longer than necessary to start), but those aren’t things I can un-know, and so I can never truly explore art from a place of pure ignorance. On the other hand, I’ve probably learned to draw much more quickly and efficiently than if I’d done it as a kid, precisely because I know those things. Now I can decide I want to do something far beyond my current abilities, then go figure out how to do it. When I was just playing, that kind of ambition was impossible.

So, I played.

How did this affect my views on tech? Well, I didn’t… have any. Learning by playing tends to teach you things in an outward sprawl without many abrupt jumps to new areas, so you don’t tend to run up against conflicting information. The whole point of opinions is that they’re your own resolution to a conflict; without conflict, I can’t meaningfully say I had any opinions. I just accepted whatever I encountered at face value, because I didn’t even know enough to suspect there could be alternatives yet.

## Act II

That started to seriously change around, I suppose, the end of high school and beginning of college. I was becoming aware of this whole “open source” concept. I took classes that used languages I wouldn’t otherwise have given a second thought. (One of them was Python!) I started to contribute to other people’s projects. Eventually I even got a job, where I had to work with other people. It probably also helped that I’d had to maintain my own old code a few times.

Now I was faced with conflicting subjective ideas, and I had to form opinions about them! And so I did. With gusto. Over time, I developed an idea of what was Right based on experience I’d accrued. And then I set out to always do things Right.

That’s served me decently well with some individual problems, but it also led me to inflict a lot of unnecessary pain on myself. Several endeavors languished for no other reason than my dissatisfaction with the architecture, long before the basic functionality was done. I started a number of “pure” projects around this time, generic tools like imaging libraries that I had no direct need for. I built them for the sake of them, I guess because I felt like I was improving some niche… but of course I never finished any. It was always in areas I didn’t know that well in the first place, which is a fine way to learn if you have a specific concrete goal in mind — but it turns out that building a generic library for editing images means you have to know everything about images. Perhaps that ambition went a little haywire.

I’ve said before that this sort of (self-inflicted!) work was unfulfilling, in part because the best outcome would be that a few distant programmers’ lives are slightly easier. I do still think that, but I think there’s a deeper point here too.

In forgetting how to play, I’d stopped putting any of myself in most of the work I was doing. Yes, building an imaging library is kind of a slog that someone has to do, but… I assume the people who work on software like PIL and ImageMagick are actually interested in it. The few domains I tried to enter and revolutionize weren’t passions of mine; I just happened to walk through the neighborhood one day and decided I could obviously do it better.

Not coincidentally, this was the same era of my life that led me to write stuff like that PHP post, which you may notice I am conspicuously not even linking to. I don’t think I would write anything like it nowadays. I could see myself approaching the same subject, but purely from the point of view of language design, with more contrasts and tradeoffs and less going for volume. I certainly wouldn’t lead off with inflammatory puffery like “PHP is a community of amateurs”.

## Act III

I think I’ve mellowed out a good bit in the last few years.

It turns out that being Right is much less important than being Not Wrong — i.e., rather than trying to make something perfect that can be adapted to any future case, just avoid as many pitfalls as possible. Code that does something useful has much more practical value than unfinished code with some pristine architecture.

Nowhere is this more apparent than in game development, where all code is doomed to be crap and the best you can hope for is to stem the tide. But there’s also a fixed goal that’s completely unrelated to how the code looks: does the game work, and is it fun to play? Yes? Ship the damn thing and forget about it.

Games are also nice because it’s very easy to pour my own feelings into them and evoke feelings in the people who play them. They’re mine, something with my fingerprints on them — even the games I’ve built with glip have plenty of my own hallmarks, little touches I added on a whim or attention to specific details that I care about.

Maybe a better example is the Doom map parser I started writing. It sounds like a “pure” problem again, except that I actually know an awful lot about the subject already! I also cleverly (accidentally) released some useful results of the work I’ve done thus far, like statistics about Doom II maps and a few screenshots of flipped stock maps, even though I don’t think the parser itself is far enough along to release yet. The tool has served a purpose, one with my fingerprints on it, even without being released publicly. That keeps it fresh in my mind as something interesting I’d like to keep working on, eventually. (When I run into an architecture question, I step back for a while, or I do other work in the hopes that the solution will reveal itself.)

I also made two simple Pokémon ROM hacks this year, despite knowing nothing about Game Boy internals or assembly when I started. I just decided I wanted to do an open-ended thing beyond my reach, and I went to do it, not worrying about cleanliness and willing to accept a bumpy ride to get there. I played, but in a more experienced way, invoking the stuff I know (and the people I’ve met!) to help me get a running start in completely unfamiliar territory.

This feels like a really fine distinction that I’m not sure I’m doing justice. I don’t know if I could’ve appreciated it three or four years ago. But I missed making toys, and I’m glad I’m doing it again.

In short, I forgot how to have fun with programming for a little while, and I’ve finally started to figure it out again. And that’s far more important than whether you use PHP or not.

# The Future of Forgeries

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/07/the_future_of_f_1.html

This article argues that AI technologies will make image, audio, and video forgeries much easier in the future.

Combined, the trajectory of cheap, high-quality media forgeries is worrying. At the current pace of progress, it may be as little as two or three years before realistic audio forgeries are good enough to fool the untrained ear, and only five or 10 years before forgeries can fool at least some types of forensic analysis. When tools for producing fake video perform at higher quality than today’s CGI and are simultaneously available to untrained amateurs, these forgeries might comprise a large part of the information ecosystem. The growth in this technology will transform the meaning of evidence and truth in domains across journalism, government communications, testimony in criminal justice, and, of course, national security.

I am not worried about fooling the “untrained ear,” and more worried about fooling forensic analysis. But there’s an arms race here. Recording technologies will get more sophisticated, too, making their outputs harder to forge. Still, I agree that the advantage will go to the forgers and not the forgery detectors.

# Commentary on US Election Security

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/07/commentary_on_u.html

Good commentaries from Ed Felten and Matt Blaze.

Both make a point that I have also been saying: hacks can undermine the legitimacy of an election, even if there is no actual voter or vote manipulation.

Felten:

The second lesson is that we should be paying more attention to attacks that aim to undermine the legitimacy of an election rather than changing the election’s result. Election-stealing attacks have gotten most of the attention up to now (and we are still vulnerable to them in some places), but it appears that external threat actors may be more interested in attacking legitimacy.

Attacks on legitimacy could take several forms. An attacker could disrupt the operation of the election, for example, by corrupting voter registration databases so there is uncertainty about whether the correct people were allowed to vote. They could interfere with post-election tallying processes, so that incorrect results were reported, an attack that might have the intended effect even if the results were eventually corrected. Or the attacker might fabricate evidence of an attack, and release the false evidence after the election.

Legitimacy attacks could be easier to carry out than election-stealing attacks, as well. For one thing, a legitimacy attacker will typically want the attack to be discovered, although they might want to avoid having the culprit identified. By contrast, an election-stealing attack must avoid detection in order to succeed. (If detected, it might function as a legitimacy attack.)

Blaze:

A hostile state actor who can compromise a handful of county networks might not even need to alter any actual votes to create considerable uncertainty about an election’s legitimacy. It may be sufficient to simply plant some suspicious software on back end networks, create some suspicious audit files, or add some obviously bogus names to the voter rolls. If the preferred candidate wins, they can quietly do nothing (or, ideally, restore the compromised networks to their original states). If the “wrong” candidate wins, however, they could covertly reveal evidence that county election systems had been compromised, creating public doubt about whether the election had been “rigged”. This could easily impair the ability of the true winner to effectively govern, at least for a while.

In other words, a hostile state actor interested in disruption may actually have an easier task than someone who wants to undetectably steal even a small local office. And a simple phishing and trojan horse email campaign like the one in the NSA report is potentially all that would be needed to carry this out.

Me:

Democratic elections serve two purposes. The first is to elect the winner. But the second is to convince the loser. After the votes are all counted, everyone needs to trust that the election was fair and the results accurate. Attacks against our election system, even if they are ultimately ineffective, undermine that trust and, by extension, our democracy.

And, finally, a report from the Brennan Center for Justice on how to secure elections.

# Developers and Ethics

Post Syndicated from Bozho original https://techblog.bozho.net/developers-and-ethics/

“What are some areas you are particularly interested in” – recruiters (head-hunters) tend to ask that question a lot. I don’t have a good answer for that – I’ll know it when I see it. But I have a list of areas that I wouldn’t like to work in. And one of them is gambling.

Several years ago I got a very lucrative offer from a gambling company, both well paid and technically challenging. But I rejected it, because I didn’t want to contribute to abusing people’s weaknesses for the sake of getting their money. And no, I’m not a raging Marxist, but gambling is bad. You may argue that it’s a necessary vice and people need it to suppress other internal struggles, but I’m not buying that as a motivator.

I felt it’s unethical to write code that does that. Like I feel it’s unethical to profile users’ behaviours and “read” their emails in order to target ads, or to write bots to disseminate fake news.

A few months ago I was part of the campaign HQ for a party in a parliamentary election. Cambridge Analytica had already become popular after “delivering Brexit and Trump’s victory”, so using voters’ data to target messages at them sounded like the new cool thing. As head of IT & data, I rejected this approach, because it would be unethical to bait unsuspecting users into taking dumb tests in order to provide us with Facebook tokens. Yes, we didn’t have any money to hire Cambridge Analytica-like companies, but even if we had, would “outsourcing” the dubious practice change anything? If you pay someone to trick users into unknowingly giving up their personal data, it’s as if you did it yourself.

This could be a very long post about technology and ethics. But it won’t be, as this is a technical blog, not a philosophical one. It won’t be about philosophy – for interesting takes on the matter you can listen to Damon Horowitz’s TED talk or even go through all of Michael Sandel’s Justice lectures at Harvard. Nor will it be about how companies should be ethical (e.g. by following the ethical design manifesto).

Instead, it will be a short post focusing on developers and their ethical choices.

I think we have the freedom to be ethical – there’s so much demand on the job market that rejecting an offer, refusing to do something, or leaving a company for ethical reasons is something we have the luxury to do without compromising our well-being. When asked to do something unethical, we can refuse (several years ago I was asked to take part in some shady interactions related to a potential future government contract, which I refused to do). When offered jobs that are slightly better paid but would have us build abusive technology, we can turn the offer down. When a new feature requires us to breach people’s privacy, we can argue it, and ultimately not do it.

But in order to start making these ethical choices, we have to start thinking about ethics. To put ourselves in context. We, developers, are building the world of tomorrow (it sounds grandiose, but we know it’s way more mundane than that). We are the “tools” with which future products will be shaped. And yes, that’s true even for the average back-office system of an insurance company (which allows for raising the insurance for pre-existing conditions), and true for boring banking software (which allows mortgages way beyond the actual coverage the bank has), and so on.

Are these decisions ours to make? Isn’t it legislators who should define what’s allowed and what isn’t? We are just building whatever they tell us to build. Forgive me the far-fetched analogy, but Nazi Germany was an anti-humanity machine based on people who “just followed orders”. Yes, if we refuse, someone else may come and do it, but collective ethics gets built over time.

As Hannah Arendt put it: “The sad truth is that most evil is done by people who never make up their minds to be good or evil.” We may think that as developers we don’t have a say. But without us, no software can be built. So with our individual ethical stance, a certain piece of unethical software may not be built or be successful, and that’s a stance worth considering, especially when it costs us next to nothing.

The post Developers and Ethics appeared first on Bozho's tech blog.

# AWS GovCloud (US) and Amazon Rekognition – A Powerful Public Safety Tool

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-govcloud-us-and-amazon-rekognition-a-powerful-public-safety-tool/

I’ve already told you about Amazon Rekognition and described how it uses deep neural network models to analyze images by detecting objects, scenes, and faces.

Today I am happy to tell you that Rekognition is now available in the AWS GovCloud (US) Region. To learn more, read the Amazon Rekognition FAQ, and the Amazon Rekognition Product Details, review the Amazon Rekognition Customer Use Cases, and then build your app using the information on the Amazon Rekognition for Developers page.

Motorola Solutions for Public Safety
While I have your attention, I would love to tell you how Motorola Solutions is exploring how Rekognition can enhance real-time intelligence for public safety personnel in the field and at the command center.

Motorola Solutions provides over 100,000 public safety and commercial customers in more than 100 countries with software, services, and tools for mobile intelligence and digital evidence management, many powered by images captured using body, dashboard, and stationary cameras. Due to the exceptionally sensitive nature of these images, they must be stored in an environment that meets stringent CJIS (Criminal Justice Information Services) security standards defined by the FBI.

For several years, researchers at Motorola Solutions have been exploring the use of artificial intelligence. For example, they have built prototype applications that use Rekognition, Lex, and Polly in conjunction with their own software to scan images from a body-worn camera for missing persons and to raise alerts without requiring continuous human attention or interaction. With approximately 100,000 missing people in the US alone, law enforcement agencies need to bring powerful tools to bear. At re:Invent 2016, Dan Law (Chief Data Scientist for Motorola Solutions) described how they use AWS to aid in this effort. Here’s the video (Dan’s section is titled AI for Public Safety):

AWS and CJIS
The applications that Dan described can run in AWS GovCloud (US). This is an isolated cloud built to protect and preserve sensitive IT data while meeting the FBI’s CJIS requirements (and many others). AWS GovCloud (US) resides on US soil and is managed exclusively by US citizens. AWS routinely signs CJIS security agreements with our customers and can either perform or allow background checks on our employees, as needed.

Jeff;

# Weekly roundup: Breath of the Tired

Post Syndicated from Eevee original https://eev.ee/dev/2017/06/25/weekly-roundup-breath-of-the-tired/

I may have spoken too soon; I had some pretty sleepy nights this week. Oh, well. The slow march of progress continued nonetheless. Also I played Zelda a lot.

• potluck: I built a few little mechanisms: platforms, keys, switches, etc. I don’t have much game yet, but I’m putting off the bulk of it until GDQ week. Hope I can actually do this game justice in just a week! It’ll be a different kind of experience, since the art is set in stone and I already have an engine that can do most of what I want; I just have to build levels and story.

• book: I churned out a good few thousand words, rewrote the introduction, and got rid of a ton of stuff from the old book concept. It’s actually presentable as a work in progress now! Nice.

• veekun: I struggled with form ordering for quite a long time, but finally got it figured out, which is useful and important. Getting there. Also I had to yakshave my self-hosted git (which I use for ripped sprites), after an upgrade caused it to bitrot.

I did less than I would’ve liked, but I’ve still got some decent momentum on these three big things. Still feeling pretty good, and eagerly looking forward to having time free in July to mess around with art and work on fox flux.

# CoderDojo Coolest Projects 2017

Post Syndicated from Ben Nuttall original https://www.raspberrypi.org/blog/coderdojo-coolest-projects-2017/

When I heard we were merging with CoderDojo, I was delighted. CoderDojo is a wonderful organisation with a spectacular community, and it’s going to be great to join forces with the team and work towards our common goal: making a difference to the lives of young people by making technology accessible to them.

You may remember that last year Philip and I went along to Coolest Projects, CoderDojo’s annual event at which their global community showcase their best makes. It was awesome! This year a whole bunch of us from the Raspberry Pi Foundation attended Coolest Projects with our new Irish colleagues, and as expected, the projects on show were as cool as can be.

## This year’s coolest projects!

Young maker Benjamin demoed his brilliant RGB LED table tennis ball display for us, and showed off his brilliant project tutorial website codemakerbuddy.com, which he built with Python and Flask.

Next up, Aimee showed us a recipes app she’d made with the MIT App Inventor. It was a really impressive and well thought-out project.

This very successful OpenCV face detection program with hardware installed in a teddy bear was great as well:

Helen’s and Oly’s favourite project involved…live bees!

BEEEEEEEEEEES!

Its creator, 12-year-old Amy, said she wanted to do something to help the Earth. Her project uses various sensors to record data on the bee population in the hive. An adjacent monitor displays the data in a web interface:

## Coolest robots

I enjoyed seeing lots of GPIO Zero projects out in the wild, including this robotic lawnmower made by Kevin and Zach:

#### Raspberry Pi Lawnmower

Kevin and Zach’s Raspberry Pi lawnmower project with Python and GPIO Zero, shown at CoderDojo Coolest Projects 2017

Philip’s favourite make was a Pi-powered robot you can control with your mind! According to the maker, Laura, it worked really well with Philip because he has no hair.

This is extraordinary. Laura from @CoderDojo Romania has programmed a mind controlled robot using @Raspberry_Pi @coolestprojects

And here are some pictures of even more cool robots we saw:

## Games, toys, activities

Oly and I were massively impressed with the work of Mogamad, Daniel, and Basheerah, who programmed a (borrowed) Amazon Echo to make a voice-controlled text-adventure game using Java and the Alexa API. They’ve inspired me to try something similar using the AIY projects kit and adventurelib!

Christopher Hill did a brilliant job with his Home Alone LEGO house. He used sensors to trigger lights and sounds to make it look like someone’s at home, like in the film. I should have taken a video – seeing it in action was great!

Meanwhile, the Northern Ireland Raspberry Jam group ran a DOTS board activity, which turned their area into a conductive paint hazard zone.

## Creativity and ingenuity

We really enjoyed seeing so many young people collaborating, experimenting, and taking full advantage of the opportunity to make real projects. And we loved how huge the range of technologies in use was: people employed all manner of hardware and software to bring their ideas to life.

Wow! Look at that room full of awesome young people. @coolestprojects #coolestprojects @CoderDojo

Congratulations to the Coolest Projects 2017 prize winners, and to all participants. Here are some of the teams that won in the different categories:

Take a look at the gallery of all winners over on Flickr.

## The wow factor

Raspberry Pi co-founder and Foundation trustee Pete Lomas came along to the event as well. Here’s what he had to say:

It’s hard to describe the scale of the event, and photos just don’t do it justice. The first thing that hit me was the sheer excitement of the CoderDojo ninjas [the children attending Dojos]. Everyone was setting up for their time with the project judges, and their pure delight at being able to show off their creations was evident in both halls. Time and time again I saw the ninjas apply their creativity to help save the planet or make someone’s life better, and it’s truly exciting that we are going to help that continue and expand.

Even after 8 hours, enthusiasm wasn’t flagging – the awards ceremony was just brilliant, with ninjas high-fiving the winners on the way to the stage. This speaks volumes about the ethos and vision of the CoderDojo founders, where everyone is a winner just by being part of a community of worldwide friends. It was a brilliant introduction, and if this weekend was anything to go by, our merger certainly is a marriage made in Heaven.

## Join this awesome community!

If all this inspires you as much as it did us, consider looking for a CoderDojo near you – and sign up as a volunteer! There’s plenty of time for young people to build up skills and start working on a project for next year’s event. Check out coolestprojects.com for more information.

The post CoderDojo Coolest Projects 2017 appeared first on Raspberry Pi.

# The Dangers of Secret Law

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/06/the_dangers_of_.html

Last week, the Department of Justice released 18 new FISC opinions related to Section 702 as part of an EFF FOIA lawsuit. (Of course, they don’t mention EFF or the lawsuit. They make it sound as if it was their idea.)

There’s probably a lot in these opinions. In one Kafkaesque ruling, a defendant was denied access to the previous court rulings that were used by the court to decide against it:

…in 2014, the Foreign Intelligence Surveillance Court (FISC) rejected a service provider’s request to obtain other FISC opinions that government attorneys had cited and relied on in court filings seeking to compel the provider’s cooperation.

[…]

The provider’s request came up amid legal briefing by both it and the DOJ concerning its challenge to a 702 order. After the DOJ cited two earlier FISC opinions that were not public at the time – one from 2014 and another from 2008 – the provider asked the court for access to those rulings.

The provider argued that without being able to review the previous FISC rulings, it could not fully understand the court’s earlier decisions, much less effectively respond to DOJ’s argument. The provider also argued that because attorneys with Top Secret security clearances represented it, they could review the rulings without posing a risk to national security.

The court disagreed in several respects. It found that the court’s rules and Section 702 prohibited the documents’ release. It also rejected the provider’s claim that the Constitution’s Due Process Clause entitled it to the documents.

This kind of government secrecy is toxic to democracy. National security is important, but we will not survive if we become a country of secret court orders based on secret interpretations of secret law.

# AWS GovCloud (US) Heads East – New Region in the Works for 2018

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-govcloud-us-heads-east-new-region-in-the-works-for-2018/

AWS GovCloud (US) gives AWS customers a place to host sensitive data and regulated workloads in the AWS Cloud. The first AWS GovCloud (US) Region was launched in 2011 and is located on the west coast of the US.

I’m happy to announce that we are working on a second Region that we expect to open in 2018. The upcoming AWS GovCloud (US-East) Region will provide customers with added redundancy, data durability, and resiliency, and will also provide additional options for disaster recovery.

Like the existing region, which we now call AWS GovCloud (US-West), the new region will be isolated and meet top US government compliance requirements including International Traffic in Arms Regulations (ITAR), NIST standards, Federal Risk and Authorization Management Program (FedRAMP) Moderate and High, Department of Defense Impact Levels 2-4, DFARS, IRS 1075, and Criminal Justice Information Services (CJIS) requirements. Visit the GovCloud (US) page to learn more about the compliance regimes that we support.

Government agencies and the IT contractors that serve them were early adopters of AWS GovCloud (US), as were companies in regulated industries. These organizations are able to enjoy the flexibility and cost-effectiveness of public cloud while benefiting from the isolation and data protection offered by a region designed and built to meet their regulatory needs and to help them to meet their compliance requirements. Here’s a small sample from our customer base:

Federal (US) Government: Department of Veterans Affairs, General Services Administration 18F (Digital Services Delivery), NASA JPL, Defense Digital Service, United States Air Force, United States Department of Justice.

Regulated Industries: CSRA, Talen Energy, Cobham Electronics.

SaaS and Solution Providers: FIGmd, Blackboard, Splunk, GitHub, Motorola.

Federal, state, and local agencies that want to move their existing applications to the AWS Cloud can take advantage of the AWS Cloud Adoption Framework (CAF) offered by AWS Professional Services.

Jeff;

# NSA Document Outlining Russian Attempts to Hack Voter Rolls

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/06/nsa_document_ou.html

This week brought new public evidence about Russian interference in the 2016 election. On Monday, the Intercept published a top-secret National Security Agency document describing Russian hacking attempts against the US election system. While the attacks seem more exploratory than operational – and there’s no evidence that they had any actual effect – they further illustrate the real threats and vulnerabilities facing our elections, and they point to solutions.

The document describes how the GRU, Russia’s military intelligence agency, attacked a company called VR Systems that, according to its website, provides software to manage voter rolls in eight states. The August 2016 attack was successful, and the attackers used the information they stole from the company’s network to launch targeted attacks against 122 local election officials on October 27, 12 days before the election.

That is where the NSA’s analysis ends. We don’t know whether those 122 targeted attacks were successful, or what their effects were if so. We don’t know whether other election software companies besides VR Systems were targeted, or what the GRU’s overall plan was – if it had one. Certainly, there are ways to disrupt voting by interfering with the voter registration process or voter rolls. But there was no indication on Election Day that people found their names removed from the system, or their address changed, or anything else that would have had an effect – anywhere in the country, let alone in the eight states where VR Systems is deployed. (There were Election Day problems with the voting rolls in Durham, NC – one of the states that VR Systems supports – but they seem like conventional errors and not malicious action.)

And 12 days before the election (with early voting already well underway in many jurisdictions) seems far too late to start an operation like that. That is why these attacks feel exploratory to me, rather than part of an operational attack. The Russians were seeing how far they could get, and keeping those accesses in their pocket for potential future use.

Presumably, this document was intended for the Justice Department, including the FBI, which would be the proper agency to continue looking into these hacks. We don’t know what happened next, if anything. VR Systems isn’t commenting, and the names of the local election officials targeted did not appear in the NSA document.

So while this document isn’t much of a smoking gun, it’s yet more evidence of widespread Russian attempts to interfere last year.

The document was, allegedly, sent to the Intercept anonymously. An NSA contractor, Reality Leigh Winner, was arrested Saturday and charged with mishandling classified information. The speed with which the government identified her serves as a caution to anyone wanting to leak official US secrets.

The Intercept sent a scan of the document to another source during its reporting. That scan showed a crease in the original document, which implied that someone had printed the document and then carried it out of some secure location. The second source, according to the FBI’s affidavit against Winner, passed it on to the NSA. From there, NSA investigators were able to look at their records and determine that only six people had printed out the document. (The government may also have been able to track the printout through secret dots that identified the printer.) Winner was the only one of those six who had been in e-mail contact with the Intercept. It is unclear whether the e-mail evidence was from Winner’s NSA account or her personal account, but in either case, it’s incredibly sloppy tradecraft.

With President Trump’s election, the issue of Russian interference in last year’s campaign has become highly politicized. Reports like the one from the Office of the Director of National Intelligence in January have been criticized by partisan supporters of the White House. It’s interesting that this document was reported by the Intercept, which has been historically skeptical about claims of Russian interference. (I was quoted in their story, and they showed me a copy of the NSA document before it was published.) The leaker was even praised by WikiLeaks founder Julian Assange, who up until now has been traditionally critical of allegations of Russian election interference.

This demonstrates the power of source documents. It’s easy to discount a Justice Department official or a summary report. A detailed NSA document is much more convincing. Right now, there’s a federal suit to force the ODNI to release the entire January report, not just the unclassified summary. These efforts are vital.

This hack will certainly come up at the Senate hearing where former FBI director James B. Comey is scheduled to testify Thursday. Last year, there were several stories about voter databases being targeted by Russia. Last August, the FBI confirmed that the Russians successfully hacked voter databases in Illinois and Arizona. And a month later, an unnamed Department of Homeland Security official said that the Russians targeted voter databases in 20 states. Again, we don’t know of anything that came of these hacks, but expect Comey to be asked about them. Unfortunately, any details he does know are almost certainly classified, and won’t be revealed in open testimony.

But more important than any of this, we need to better secure our election systems going forward. We have significant vulnerabilities in our voting machines, our voter rolls and registration process, and the vote tabulation systems after the polls close. In January, DHS designated our voting systems as critical national infrastructure, but so far that has been entirely for show. In the United States, we don’t have a single integrated election. We have 50-plus individual elections, each with its own rules and its own regulatory authorities. Federal standards that mandate voter-verified paper ballots and post-election auditing would go a long way to secure our voting system. These attacks demonstrate that we need to secure the voter rolls, as well.

Democratic elections serve two purposes. The first is to elect the winner. But the second is to convince the loser. After the votes are all counted, everyone needs to trust that the election was fair and the results accurate. Attacks against our election system, even if they are ultimately ineffective, undermine that trust and – by extension – our democracy. Yes, fixing this will be expensive. Yes, it will require federal action in what’s historically been state-run systems. But as a country, we have no other option.

This essay previously appeared in the Washington Post.

# New – USASpending.gov on an Amazon RDS Snapshot

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-usaspending-gov-on-an-amazon-rds-snapshot/

My colleague Jed Sundwall runs the AWS Public Datasets program. He wrote the guest post below to tell you about an important new dataset that is available as an Amazon RDS Snapshot. In the post, Jed introduces the dataset and shows you how to create an Amazon RDS DB Instance from the snapshot.

Jeff;

I am very excited to announce that, starting today, the entire public USAspending.gov database is available for anyone to copy via Amazon Relational Database Service (RDS). USAspending.gov data includes data on all spending by the federal government, including contracts, grants, loans, employee salaries, and more. The data is available via a PostgreSQL snapshot, which provides bulk access to the entire USAspending.gov database, and is updated nightly. At this time, the database includes all USAspending.gov data for the second quarter of fiscal year 2017, and data going back to the year 2000 will be added over the summer. You can learn more about the database and how to access it on its AWS Public Dataset landing page.

Through the AWS Public Datasets program, we work with AWS customers to experiment with ways that the cloud can make data more accessible to more people. Most of our AWS Public Datasets are made available through Amazon S3 because of its tremendous flexibility and ability to scale to serve any volume of any kind of data files. What’s exciting about the USAspending.gov database is that it provides a great example of how Amazon RDS can be used to share an entire relational database quickly and easily. Typically, sharing a relational database requires extract, transfer, and load (ETL) processes that require redundant storage capacity, time for data transfer, and often scripts to migrate your database schema from one database engine to another. ETL processes can be so intimidating and cumbersome that they’re effectively impossible for many people to carry out.

By making their data available as a public Amazon RDS snapshot, the team at USAspending.gov has made it easy for anyone to get a copy of their entire production database for their own use within minutes. This will be useful for researchers and businesses who want to work with real data about all US Government spending and quickly combine it with their own data or other data resources.

Deploying the USASpending.gov Database Using the AWS Management Console
Let’s go through the steps involved in deploying the database in your AWS account using the AWS Management Console.

1. Sign in to the AWS Management Console and select the US East (N. Virginia) region in the menu bar.
2. Open the Amazon RDS Console and choose Snapshots in the navigation pane.
3. In the filter for the search bar, select All Public Snapshots and search for 515495268755.
4. Select the snapshot named arn:aws:rds:us-east-1:515495268755:snapshot:usaspending-db.
5. Select Snapshot Actions -> Restore Snapshot. Select an instance size, and enter the other details, then click on Restore DB Instance.
6. You will see that a DB Instance is being created from the snapshot, within your AWS account.
7. After a few minutes, the status of the instance will change to Available.
8. You can see the endpoint for your database on the main page along with other useful info.

Deploying the USASpending.gov Database Using the AWS CLI
You can also install the AWS Command Line Interface (CLI) and use it to create a DB Instance from the snapshot. Here’s a sample command:

$ aws rds restore-db-instance-from-db-snapshot --db-instance-identifier my-test-db-cli \
--db-snapshot-identifier arn:aws:rds:us-east-1:515495268755:snapshot:usaspending-db \
--region us-east-1

This will give you an ARN (Amazon Resource Name) that you can use to reference the DB Instance. For example:

$ aws rds describe-db-instances \
--db-instance-identifier arn:aws:rds:us-east-1:917192695859:db:my-test-db-cli

This command will display the Endpoint.Address that you use to connect to the database.
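If you would rather script these two steps, the following Python sketch shows one way to do it using only the standard library. It builds the argument list for the restore command above and pulls the status and Endpoint.Address out of the JSON that describe-db-instances prints. The function names and the sample JSON are my own illustration, not part of the AWS tooling, and the sample is hand-written rather than captured from a real run.

```python
import json

# Public snapshot ARN taken from the walkthrough above.
SNAPSHOT_ARN = "arn:aws:rds:us-east-1:515495268755:snapshot:usaspending-db"


def restore_command(instance_id, region="us-east-1"):
    """Build the argument list for the restore command shown above.

    Hypothetical helper: it only assembles the arguments and does not
    run anything.
    """
    return [
        "aws", "rds", "restore-db-instance-from-db-snapshot",
        "--db-instance-identifier", instance_id,
        "--db-snapshot-identifier", SNAPSHOT_ARN,
        "--region", region,
    ]


def endpoint_from_describe(output_json):
    """Return (status, address) from describe-db-instances JSON output.

    The address is None when the Endpoint field is missing, which I
    assume can happen while the instance is still being created.
    """
    instance = json.loads(output_json)["DBInstances"][0]
    endpoint = instance.get("Endpoint") or {}
    return instance["DBInstanceStatus"], endpoint.get("Address")


# Hand-written sample shaped like the CLI's JSON output (an assumption,
# trimmed to the fields used here).
sample = json.dumps({
    "DBInstances": [{
        "DBInstanceIdentifier": "my-test-db-cli",
        "DBInstanceStatus": "available",
        "Endpoint": {"Address": "my-endpoint.rds.amazonaws.com", "Port": 5432},
    }]
})

status, address = endpoint_from_describe(sample)
print(status, address)
```

To actually run the restore, you could pass restore_command("my-test-db-cli") to subprocess.run with check=True. That requires the AWS CLI to be installed and configured with credentials for your account.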

Connecting to the DB Instance
After following the AWS Management Console or AWS CLI instructions above, you will have access to the full USAspending.gov database within this Amazon RDS DB instance, and you can connect to it using any PostgreSQL client using the following credentials:

• Database: data_store_api

If you use psql, you can access the database using this command:

$ psql -h my-endpoint.rds.amazonaws.com -U root -d data_store_api

You should change the database password after you log in:

ALTER USER "root" WITH ENCRYPTED PASSWORD '{new password}';

If you can’t connect to your instance but think you should be able to, you may need to check your VPC Security Groups and make sure inbound and outbound traffic on the port (usually 5432) is allowed from your IP address.

Exploring the Data
The USAspending.gov data is very rich, so it will be hard to do it justice in this blog post, but hopefully these queries will give you an idea of what’s possible. To learn about the contents of the database, please review the USAspending.gov Data Dictionary.

The following query will return the total amount of money the government is obligated to pay for contracts awarded by NASA that include “Mars” or “Martian” in the description of the award:

select sum(total_obligation) from awards, subtier_agency
where (awards.description like '% MARTIAN %' OR awards.description like '% MARS %')
AND subtier_agency.name = 'National Aeronautics and Space Administration';

As I write this, the result I get for this query is $55,411,025.42. Note that the database is updated nightly and will include more historical data in the coming months, so you may get a different result if you run this query.

Now, here’s the same query, but looking for awards with “Jupiter” or “Jovian” in the description:

select sum(total_obligation) from awards, subtier_agency
where (awards.description like '%JUPITER%' OR awards.description like '%JOVIAN%')
AND subtier_agency.name = 'National Aeronautics and Space Administration';

The result I get is $14,766,392.96.
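One detail worth noticing in the two queries above is that the Mars patterns are padded with spaces ('% MARS %') while the Jupiter patterns are not ('%JUPITER%'). The Python sketch below illustrates the difference using an in-memory SQLite database from the standard library. The three rows and the helper name are made up for illustration and the table is a stand-in with only two columns, not the real USAspending.gov schema. Also note that SQLite's LIKE ignores case for ASCII letters by default while PostgreSQL's LIKE is case-sensitive, so the sample data is kept in upper case to avoid relying on that difference.

```python
import sqlite3


def total_for_patterns(conn, patterns):
    """Sum total_obligation over rows whose description matches any LIKE pattern."""
    if not patterns:
        raise ValueError("at least one LIKE pattern is required")
    # One placeholder per pattern, so the values are passed as bound
    # parameters rather than pasted into the SQL string.
    where = " OR ".join("description LIKE ?" for _ in patterns)
    sql = f"SELECT COALESCE(SUM(total_obligation), 0) FROM awards WHERE {where}"
    return conn.execute(sql, list(patterns)).fetchone()[0]


# Stand-in table with invented rows (not real award data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE awards (description TEXT, total_obligation REAL)")
conn.executemany(
    "INSERT INTO awards VALUES (?, ?)",
    [
        ("SUPPORT FOR MARS ROVER OPERATIONS", 100.0),  # MARS as a separate word
        ("MARSHALL CENTER FACILITIES", 40.0),          # MARS only as a substring
        ("MARS", 7.0),                                 # no surrounding spaces
    ],
)

padded = total_for_patterns(conn, ["% MARS %"])  # needs a space on both sides
loose = total_for_patterns(conn, ["%MARS%"])     # matches any substring
print(padded, loose)
```

The padded pattern skips words that merely contain the search term, but it also misses descriptions that start or end with it, so the right choice depends on which kind of error matters more for your analysis.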

I’m looking forward to seeing what people can do with this data. If you have any questions about the data, please create an issue on the USAspending.gov API’s issue tracker on GitHub.

— Jed

# In Case You Missed These: AWS Security Blog Posts from January, February, and March

In case you missed any AWS Security Blog posts published so far in 2017, they are summarized and linked to below. The posts are shown in reverse chronological order (most recent first), and the subject matter ranges from protecting dynamic web applications against DDoS attacks to monitoring AWS account configuration changes and API calls to Amazon EC2 security groups.

March

March 22: How to Help Protect Dynamic Web Applications Against DDoS Attacks by Using Amazon CloudFront and Amazon Route 53
Using a content delivery network (CDN) such as Amazon CloudFront to cache and serve static text and images or downloadable objects such as media files and documents is a common strategy to improve webpage load times, reduce network bandwidth costs, lessen the load on web servers, and mitigate distributed denial of service (DDoS) attacks. AWS WAF is a web application firewall that can be deployed on CloudFront to help protect your application against DDoS attacks by giving you control over which traffic to allow or block by defining security rules. When users access your application, the Domain Name System (DNS) translates human-readable domain names (for example, www.example.com) to machine-readable IP addresses (for example, 192.0.2.44). A DNS service, such as Amazon Route 53, can effectively connect users’ requests to a CloudFront distribution that proxies requests for dynamic content to the infrastructure hosting your application’s endpoints. In this blog post, I show you how to deploy CloudFront with AWS WAF and Route 53 to help protect dynamic web applications (with dynamic content such as a response to user input) against DDoS attacks. The steps shown in this post are key to implementing the overall approach described in AWS Best Practices for DDoS Resiliency and enable the built-in, managed DDoS protection service, AWS Shield.

March 21: New AWS Encryption SDK for Python Simplifies Multiple Master Key Encryption
The AWS Cryptography team is happy to announce a Python implementation of the AWS Encryption SDK. This new SDK helps manage data keys for you, and it simplifies the process of encrypting data under multiple master keys. As a result, this new SDK allows you to focus on the code that drives your business forward. It also provides a framework you can easily extend to ensure that you have a cryptographic library that is configured to match and enforce your standards. The SDK also includes ready-to-use examples. If you are a Java developer, you can refer to this blog post to see specific Java examples for the SDK. In this blog post, I show you how you can use the AWS Encryption SDK to simplify the process of encrypting data and how to protect your encryption keys in ways that help improve application availability by not tying you to a single region or key management solution.

March 21: Updated CJIS Workbook Now Available by Request
The need for guidance when implementing Criminal Justice Information Services (CJIS)–compliant solutions has become of paramount importance as more law enforcement customers and technology partners move to store and process criminal justice data in the cloud. AWS services allow these customers to easily and securely architect a CJIS-compliant solution when handling criminal justice data, creating a durable, cost-effective, and secure IT infrastructure that better supports local, state, and federal law enforcement in carrying out their public safety missions. AWS has created several documents (collectively referred to as the CJIS Workbook) to assist you in aligning with the FBI’s CJIS Security Policy. You can use the workbook as a framework for developing CJIS-compliant architecture in the AWS Cloud. The workbook helps you define and test the controls you operate, and document the dependence on the controls that AWS operates (compute, storage, database, networking, regions, Availability Zones, and edge locations).

March 9: New Cloud Directory API Makes It Easier to Query Data Along Multiple Dimensions
Today, we made available a new Cloud Directory API, ListObjectParentPaths, that enables you to retrieve all available parent paths for any directory object across multiple hierarchies. Use this API when you want to fetch all parent objects for a specific child object. The order of the paths and objects returned is consistent across iterative calls to the API, unless objects are moved or deleted. In case an object has multiple parents, the API allows you to control the number of paths returned by using a paginated call pattern. In this blog post, I use an example directory to demonstrate how this new API enables you to retrieve data across multiple dimensions to implement powerful applications quickly.
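The paginated call pattern mentioned above could look roughly like the following sketch. It assumes a boto3-style Cloud Directory client whose `list_object_parent_paths` method accepts `DirectoryArn`, `ObjectReference`, `MaxResults`, and `NextToken` and returns a `PathToObjectIdentifiersList`; the helper name `iter_parent_paths` and the default page size are illustrative and do not come from the original post.

```python
from typing import Any, Dict, Iterator, Optional


def iter_parent_paths(client: Any, directory_arn: str, selector: str,
                      page_size: int = 10) -> Iterator[Dict[str, Any]]:
    """Yield every parent-path entry for one object, following NextToken.

    `client` is assumed to behave like a boto3 "clouddirectory" client.
    """
    next_token: Optional[str] = None
    while True:
        kwargs: Dict[str, Any] = {
            "DirectoryArn": directory_arn,
            "ObjectReference": {"Selector": selector},
            "MaxResults": page_size,  # caps the number of paths per call
        }
        if next_token:
            kwargs["NextToken"] = next_token
        page = client.list_object_parent_paths(**kwargs)
        yield from page.get("PathToObjectIdentifiersList", [])
        next_token = page.get("NextToken")
        if not next_token:
            break  # no further pages
```

Because the function only depends on the shape of the response, it can be exercised with a stub client before pointing it at a real directory.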

March 8: How to Access the AWS Management Console Using AWS Microsoft AD and Your On-Premises Credentials

March 7: How to Protect Your Web Application Against DDoS Attacks by Using Amazon Route 53 and an External Content Delivery Network
Distributed Denial of Service (DDoS) attacks are attempts by a malicious actor to flood a network, system, or application with more traffic, connections, or requests than it is able to handle. To protect your web application against DDoS attacks, you can use AWS Shield, a DDoS protection service that AWS provides automatically to all AWS customers at no additional charge. You can use AWS Shield in conjunction with DDoS-resilient web services such as Amazon CloudFront and Amazon Route 53 to improve your ability to defend against DDoS attacks. Learn more about architecting for DDoS resiliency by reading the AWS Best Practices for DDoS Resiliency whitepaper. You also have the option of using Route 53 with an externally hosted content delivery network (CDN). In this blog post, I show how you can help protect the zone apex (also known as the root domain) of your web application by using Route 53 to perform a secure redirect to prevent discovery of your application origin.

February

February 27: Now Generally Available – AWS Organizations: Policy-Based Management for Multiple AWS Accounts
Today, AWS Organizations moves from Preview to General Availability. You can use Organizations to centrally manage multiple AWS accounts, with the ability to create a hierarchy of organizational units (OUs). You can assign each account to an OU, define policies, and then apply those policies to an entire hierarchy, specific OUs, or specific accounts. You can invite existing AWS accounts to join your organization, and you can also create new accounts. All of these functions are available from the AWS Management Console, the AWS Command Line Interface (CLI), and through the AWS Organizations API. To read the full AWS Blog post about today’s launch, see AWS Organizations – Policy-Based Management for Multiple AWS Accounts.

February 23: s2n Is Now Handling 100 Percent of SSL Traffic for Amazon S3
Today, we’ve achieved another important milestone for securing customer data: we have replaced OpenSSL with s2n for all internal and external SSL traffic in Amazon Simple Storage Service (Amazon S3) commercial regions. This was implemented with minimal impact to customers, and multiple means of error checking were used to ensure a smooth transition, including client integration tests, catching potential interoperability conflicts, and identifying memory leaks through fuzz testing.

February 22: Easily Replace or Attach an IAM Role to an Existing EC2 Instance by Using the EC2 Console
AWS Identity and Access Management (IAM) roles enable your applications running on Amazon EC2 to use temporary security credentials. IAM roles for EC2 make it easier for your applications to make API requests securely from an instance because they do not require you to manage AWS security credentials that the applications use. Recently, we enabled you to use temporary security credentials for your applications by attaching an IAM role to an existing EC2 instance by using the AWS CLI and SDK. To learn more, see New! Attach an AWS IAM Role to an Existing Amazon EC2 Instance by Using the AWS CLI. Starting today, you can attach an IAM role to an existing EC2 instance from the EC2 console. You can also use the EC2 console to replace an IAM role attached to an existing instance. In this blog post, I will show how to attach an IAM role to an existing EC2 instance from the EC2 console.

February 22: How to Audit Your AWS Resources for Security Compliance by Using Custom AWS Config Rules
AWS Config Rules enables you to implement security policies as code for your organization and evaluate configuration changes to AWS resources against these policies. You can use Config rules to audit your use of AWS resources for compliance with external compliance frameworks such as CIS AWS Foundations Benchmark and with your internal security policies related to the US Health Insurance Portability and Accountability Act (HIPAA), the Federal Risk and Authorization Management Program (FedRAMP), and other regimes. AWS provides some predefined, managed Config rules. You also can create custom Config rules based on criteria you define within an AWS Lambda function. In this post, I show how to create a custom rule that audits AWS resources for security compliance by enabling VPC Flow Logs for an Amazon Virtual Private Cloud (VPC). The custom rule meets requirement 4.3 of the CIS AWS Foundations Benchmark: “Ensure VPC flow logging is enabled in all VPCs.”
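As a rough illustration of the check behind such a rule, the sketch below decides compliance per VPC from flow-log descriptions. It assumes the flow-log records carry `ResourceId` and `FlowLogStatus` fields, as returned by the EC2 `DescribeFlowLogs` API, and it is not the code from the original post; the function name `evaluate_vpc_flow_logs` is hypothetical.

```python
from typing import Any, Dict, Iterable, List


def evaluate_vpc_flow_logs(vpc_ids: Iterable[str],
                           flow_logs: Iterable[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Return one Config-style evaluation per VPC.

    A VPC counts as compliant when at least one ACTIVE flow log targets it.
    """
    logged = {
        fl["ResourceId"]
        for fl in flow_logs
        if fl.get("FlowLogStatus") == "ACTIVE"
    }
    return [
        {
            "ComplianceResourceType": "AWS::EC2::VPC",
            "ComplianceResourceId": vpc_id,
            "ComplianceType": "COMPLIANT" if vpc_id in logged else "NON_COMPLIANT",
        }
        for vpc_id in vpc_ids
    ]
```

In a real custom rule, a Lambda function would presumably gather the VPC IDs and flow logs itself, add a timestamp to each evaluation, and report the results back to AWS Config; those steps are left out here.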

February 13: AWS Announces CISPE Membership and Compliance with First-Ever Code of Conduct for Data Protection in the Cloud
I have two exciting announcements today, both showing AWS’s continued commitment to ensuring that customers can comply with EU Data Protection requirements when using our services.

February 13: How to Enable Multi-Factor Authentication for AWS Services by Using AWS Microsoft AD and On-Premises Credentials
You can now enable multi-factor authentication (MFA) for users of AWS services such as Amazon WorkSpaces and Amazon QuickSight and their on-premises credentials by using your AWS Directory Service for Microsoft Active Directory (Enterprise Edition) directory, also known as AWS Microsoft AD. MFA adds an extra layer of protection to a user name and password (the first “factor”) by requiring users to enter an authentication code (the second factor), which has been provided by your virtual or hardware MFA solution. These factors together provide additional security by preventing access to AWS services, unless users supply a valid MFA code.

February 13: How to Create an Organizational Chart with Separate Hierarchies by Using Amazon Cloud Directory
Amazon Cloud Directory enables you to create directories for a variety of use cases, such as organizational charts, course catalogs, and device registries. Cloud Directory offers you the flexibility to create directories with hierarchies that span multiple dimensions. For example, you can create an organizational chart that you can navigate through separate hierarchies for reporting structure, location, and cost center. In this blog post, I show how to use Cloud Directory APIs to create an organizational chart with two separate hierarchies in a single directory. I also show how to navigate the hierarchies and retrieve data. I use the Java SDK for all the sample code in this post, but you can use other language SDKs or the AWS CLI.

February 10: How to Easily Log On to AWS Services by Using Your On-Premises Active Directory
AWS Directory Service for Microsoft Active Directory (Enterprise Edition), also known as Microsoft AD, now enables your users to log on with just their on-premises Active Directory (AD) user name—no domain name is required. This new domainless logon feature makes it easier to set up connections to your on-premises AD for use with applications such as Amazon WorkSpaces and Amazon QuickSight, and it keeps the user logon experience free from network naming. This new interforest trusts capability is now available when using Microsoft AD with Amazon WorkSpaces and Amazon QuickSight Enterprise Edition. In this blog post, I explain how Microsoft AD domainless logon works with AD interforest trusts, and I show an example of setting up Amazon WorkSpaces to use this capability.

February 9: New! Attach an AWS IAM Role to an Existing Amazon EC2 Instance by Using the AWS CLI
AWS Identity and Access Management (IAM) roles enable your applications running on Amazon EC2 to use temporary security credentials that AWS creates, distributes, and rotates automatically. Using temporary credentials is an IAM best practice because you do not need to maintain long-term keys on your instance. Using IAM roles for EC2 also eliminates the need to use long-term AWS access keys that you have to manage manually or programmatically. Starting today, you can enable your applications to use temporary security credentials provided by AWS by attaching an IAM role to an existing EC2 instance. You can also replace the IAM role attached to an existing EC2 instance. In this blog post, I show how you can attach an IAM role to an existing EC2 instance by using the AWS CLI.
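The summarized post uses the AWS CLI; for readers working from an SDK, the same attach-or-replace flow might be sketched as follows. This assumes a boto3-style EC2 client exposing `describe_iam_instance_profile_associations`, `associate_iam_instance_profile`, and `replace_iam_instance_profile_association`; the wrapper `attach_or_replace_role` is a hypothetical helper, not part of any AWS library.

```python
from typing import Any, Dict


def attach_or_replace_role(ec2: Any, instance_id: str, profile_name: str) -> Dict[str, Any]:
    """Attach an instance profile, or swap it if one is already associated.

    `ec2` is assumed to behave like a boto3 EC2 client.
    """
    existing = ec2.describe_iam_instance_profile_associations(
        Filters=[
            {"Name": "instance-id", "Values": [instance_id]},
            {"Name": "state", "Values": ["associated"]},
        ]
    )["IamInstanceProfileAssociations"]

    profile = {"Name": profile_name}
    if existing:
        # The instance already has a role: replace it in place.
        return ec2.replace_iam_instance_profile_association(
            IamInstanceProfile=profile,
            AssociationId=existing[0]["AssociationId"],
        )
    # No role yet: create a new association.
    return ec2.associate_iam_instance_profile(
        IamInstanceProfile=profile,
        InstanceId=instance_id,
    )
```

Note that an instance profile, not the role itself, is what gets attached to the instance, so `profile_name` must name a profile that contains the desired role.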

February 8: How to Remediate Amazon Inspector Security Findings Automatically
The Amazon Inspector security assessment service can evaluate the operating environments and applications you have deployed on AWS for common and emerging security vulnerabilities automatically. As an AWS-built service, Amazon Inspector is designed to exchange data and interact with other core AWS services not only to identify potential security findings but also to automate addressing those findings. Previous related blog posts showed how you can deliver Amazon Inspector security findings automatically to third-party ticketing systems and automate the installation of the Amazon Inspector agent on new Amazon EC2 instances. In this post, I show how you can automatically remediate findings generated by Amazon Inspector. To get started, you must first run an assessment and publish any security findings to an Amazon Simple Notification Service (SNS) topic. Then, you create an AWS Lambda function that is triggered by those notifications. Finally, the Lambda function examines the findings and then implements the appropriate remediation based on the type of issue.
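The notification-handling step described above can be sketched as plain Python. The SNS envelope (`Records[].Sns.Message`) is the standard shape Lambda receives; the inner message fields `event` and `finding`, the `FINDING_REPORTED` value, and the remediation names are assumptions made for illustration and should be checked against a real notification.

```python
import json
from typing import Any, Dict, List


def reported_findings(event: Dict[str, Any]) -> List[str]:
    """Collect finding ARNs from an SNS-triggered Lambda event.

    Assumes each SNS message is a JSON document with "event" and
    "finding" keys (an assumption about the Inspector notification format).
    """
    arns = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        if message.get("event") == "FINDING_REPORTED":
            arns.append(message["finding"])
    return arns


def choose_remediation(finding_id: str) -> str:
    """Hypothetical policy: patch for CVE findings, otherwise just notify."""
    if finding_id.startswith("CVE-"):
        return "patch-packages"
    return "notify-only"
```

A full handler would look up each finding's details by ARN and then carry out the chosen remediation; both of those steps depend on the environment and are omitted here.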

February 6: How to Simplify Security Assessment Setup Using Amazon EC2 Systems Manager and Amazon Inspector
In a July 2016 AWS Blog post, I discussed how to integrate Amazon Inspector with third-party ticketing systems by using Amazon Simple Notification Service (SNS) and AWS Lambda. This AWS Security Blog post continues in the same vein, describing how to use Amazon Inspector to automate various aspects of security management. In this post, I show you how to install the Amazon Inspector agent automatically through the Amazon EC2 Systems Manager when a new Amazon EC2 instance is launched. In a subsequent post, I will show you how to automatically update EC2 instances that run Linux when Amazon Inspector discovers a missing security patch.

January

January 30: How to Protect Data at Rest with Amazon EC2 Instance Store Encryption
Encrypting data at rest is vital for regulatory compliance to ensure that sensitive data saved on disks is not readable by any user or application without a valid key. Some compliance regulations such as PCI DSS and HIPAA require that data at rest be encrypted throughout the data lifecycle. To this end, AWS provides data-at-rest options and key management to support the encryption process. For example, you can encrypt Amazon EBS volumes and configure Amazon S3 buckets for server-side encryption (SSE) using AES-256 encryption. Additionally, Amazon RDS supports Transparent Data Encryption (TDE). Instance storage provides temporary block-level storage for Amazon EC2 instances. This storage is located on disks attached physically to a host computer. Instance storage is ideal for temporary storage of information that frequently changes, such as buffers, caches, and scratch data. By default, files stored on these disks are not encrypted. In this blog post, I show a method for encrypting data on Linux EC2 instance stores by using Linux built-in libraries. This method encrypts files transparently, which protects confidential data. As a result, applications that process the data are unaware of the disk-level encryption.

January 27: How to Detect and Automatically Remediate Unintended Permissions in Amazon S3 Object ACLs with CloudWatch Events
Amazon S3 Access Control Lists (ACLs) enable you to specify permissions that grant access to S3 buckets and objects. When S3 receives a request for an object, it verifies whether the requester has the necessary access permissions in the associated ACL. For example, you could set up an ACL for an object so that only the users in your account can access it, or you could make an object public so that it can be accessed by anyone. If the number of objects and users in your AWS account is large, ensuring that you have attached correctly configured ACLs to your objects can be a challenge. For example, what if a user were to call the PutObjectAcl API on an object that is supposed to be private and make it public? Or, what if a user were to call PutObject with the optional Acl parameter set to public-read, thereby uploading a confidential file as publicly readable? In this blog post, I show a solution that uses Amazon CloudWatch Events to detect PutObject and PutObjectAcl API calls in near-real time and helps ensure that the objects remain private by making automatic PutObjectAcl calls, when necessary.
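To make the detection step concrete, here is a minimal sketch of how a function might decide whether an object ACL is public. It assumes the ACL has the shape returned by the S3 `GetObjectAcl` API (a `Grants` list whose entries carry a `Grantee` and a `Permission`); treating the `AuthenticatedUsers` group as public is a choice made for this sketch, and the function names are hypothetical.

```python
from typing import Any, Dict, List

# Predefined S3 groups that open an object beyond the owning account.
PUBLIC_GROUP_URIS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def public_grants(acl: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the grants in an object ACL that target a public group."""
    return [
        grant
        for grant in acl.get("Grants", [])
        if grant.get("Grantee", {}).get("Type") == "Group"
        and grant["Grantee"].get("URI") in PUBLIC_GROUP_URIS
    ]


def needs_remediation(acl: Dict[str, Any]) -> bool:
    """True when the ACL exposes the object to a public group."""
    return bool(public_grants(acl))
```

When `needs_remediation` returns true, the corrective action would be a PutObjectAcl call that resets the object to a private ACL.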

January 26: Now Available: Amazon Cloud Directory—A Cloud-Native Directory for Hierarchical Data
Today we are launching Amazon Cloud Directory. This service is purpose-built for storing large amounts of strongly typed hierarchical data. With the ability to scale to hundreds of millions of objects while remaining cost-effective, Cloud Directory is a great fit for all sorts of cloud and mobile applications.

January 24: New SOC 2 Report Available: Confidentiality
As with everything at Amazon, the success of our security and compliance program is primarily measured by one thing: our customers’ success. Our customers drive our portfolio of compliance reports, attestations, and certifications that support their efforts in running a secure and compliant cloud environment. As a result of our engagement with key customers across the globe, we are happy to announce the publication of our new SOC 2 Confidentiality report. This report is available now through AWS Artifact in the AWS Management Console.

January 18: Compliance in the Cloud for New Financial Services Cybersecurity Regulations
Financial regulatory agencies are focused more than ever on ensuring responsible innovation. Consequently, if you want to achieve compliance with financial services regulations, you must be increasingly agile and employ dynamic security capabilities. AWS enables you to achieve this by providing you with the tools you need to scale your security and compliance capabilities on AWS. The following breakdown of the most recent cybersecurity regulation, NY DFS Rule 23 NYCRR 500, demonstrates how AWS continues to focus on your regulatory needs in the financial services sector.

January 9: New Amazon GameDev Blog Post: Protect Multiplayer Game Servers from DDoS Attacks by Using Amazon GameLift
In online gaming, distributed denial of service (DDoS) attacks target a game’s network layer, flooding servers with requests until performance degrades considerably. These attacks can limit a game’s availability to players and limit the player experience for those who can connect. Today’s new Amazon GameDev Blog post uses a typical game server architecture to highlight DDoS attack vulnerabilities and discusses how to stay protected by using built-in AWS Cloud security, AWS security best practices, and the security features of Amazon GameLift. Read the post to learn more.

January 6: The Top 10 Most Downloaded AWS Security and Compliance Documents in 2016
The following list includes the 10 most downloaded AWS security and compliance documents in 2016. Using this list, you can learn about what other people found most interesting about security and compliance last year.

January 6: FedRAMP Compliance Update: AWS GovCloud (US) Region Receives a JAB-Issued FedRAMP High Baseline P-ATO for Three New Services
Three new services in the AWS GovCloud (US) region have received a Provisional Authority to Operate (P-ATO) from the Joint Authorization Board (JAB) under the Federal Risk and Authorization Management Program (FedRAMP). JAB issued the authorization at the High baseline, which enables US government agencies and their service providers to use these services to process the government’s most sensitive unclassified data, including Personally Identifiable Information (PII), Protected Health Information (PHI), Controlled Unclassified Information (CUI), criminal justice information (CJI), and financial data.

January 4: The Top 20 Most Viewed AWS IAM Documentation Pages in 2016
The following 20 pages were the most viewed AWS Identity and Access Management (IAM) documentation pages in 2016. I have included a brief description with each link to give you a clearer idea of what each page covers. Use this list to see what other people have been viewing and perhaps to pique your own interest about a topic you’ve been meaning to research.

January 3: The Most Viewed AWS Security Blog Posts in 2016
The following 10 posts were the most viewed AWS Security Blog posts that we published during 2016. You can use this list as a guide to catch up on your blog reading or even read a post again that you found particularly useful.

January 3: How to Monitor AWS Account Configuration Changes and API Calls to Amazon EC2 Security Groups