Tag Archives: writing

Eevee gained 2791 experience points

Post Syndicated from Eevee original https://eev.ee/blog/2018/01/15/eevee-gained-2791-experience-points/

Eevee grew to level 31!

A year strongly defined by mixed success! Also, a lot of video games.

I ran three game jams, resulting in a total of 157 games existing that may not have otherwise, which is totally mindblowing?!

For GAMES MADE QUICK???, glip and I made NEON PHASE, a short little exploratory platformer. Honestly, I should give myself more credit for this and the rest of the LÖVE games I’ve based on the same codebase — I wove a physics engine (and everything else!) from scratch and it has held up remarkably well for a variety of different uses.

I successfully finished an HD version of Isaac’s Descent using my LÖVE engine, though it doesn’t have anything new over the original and I’ve only released it as a tech demo on Patreon.

For Strawberry Jam (NSFW!) we made fox flux (slightly NSFW!), which felt like a huge milestone: the first game where I made all the art! I mean, not counting Isaac’s Descent, which was for a very limited platform. It’s a pretty arbitrary milestone, yes, but it feels significant. I’ve been working on expanding the game into a longer and slightly less buggy experience, but the art is taking the longest by far. I must’ve spent weeks on player sprites alone.

We then set about working on Bolthaven, a sequel of sorts to NEON PHASE, and got decently far, and then abandond it. Oops.

We then started a cute little PICO-8 game, and forgot about it. Oops.

I was recruited to help with Chaos Composer, a more ambitious game glip started with someone else in Unity. I had to get used to Unity, and we squabbled a bit, but the game is finally about at the point where it’s “playable” and “maps” can be designed? It’s slightly on hold at the moment while we all finish up some other stuff, though.

We made a birthday game for two of our friends whose birthdays were very close together! Only they got to see it.

For Ludum Dare 38, we made Lunar Depot 38, a little “wave shooter” or whatever you call those? The AI is pretty rough, seeing as this was the first time I’d really made enemies and I had 72 hours to figure out how to do it, but I still think it’s pretty fun to play and I love the circular world.

I made Roguelike Simulator as an experiment with making something small and quick with a simple tool, and I had a lot of fun! I definitely want to do more stuff like this in the future.

And now we’re working on a game about Star Anise, my cat’s self-insert, which is looking to have more polish and depth than anything we’ve done so far! We’ve definitely come a long way in a year.

Somewhere along the line, I put out a call for a “potluck” project, where everyone would give me sprites of a given size without knowing what anyone else had contributed, and I would then make a game using only those sprites. Unfortunately, that stalled a few times: I tried using the Phaser JS library, but we didn’t get along; I tried LÖVE, but didn’t know where to go with the game; and then I decided to use this as an experiment with procedural generation, and didn’t get around to it. I still feel bad that everyone did work for me and I didn’t follow through, but I don’t know whether this will ever become a game.

veekun, alas, consumed months of my life. I finally got Sun and Moon loaded, but it took weeks of work since I was basically reinventing all the tooling we’d ever had from scratch, without even having most of that tooling available as a reference. It was worth it in the end, at least: Ultra Sun and Ultra Moon only took a few days to get loaded. But veekun itself is still missing some obvious Sun/Moon features, and the whole site needs an overhaul, and I just don’t know if I want to dedicate that much time to it when I have so much other stuff going on that’s much more interesting to me right now.

I finally turned my blog into more of a website, giving it a neat front page that lists a bunch of stuff I’ve done. I made a release category at last, though I’m still not quite in the habit of using it.

I wrote some blog posts, of course! I think the most interesting were JavaScript got better while I wasn’t looking and Object models. I was also asked to write a couple pieces for money for a column that then promptly shut down.

On a whim, I made a set of Eevee mugshots for Doom, which I think is a decent indication of my (pixel) art progress over the year?

I started idchoppers, a Doom parsing and manipulation library written in Rust, though it didn’t get very far and I’ve spent most of the time fighting with Rust because it won’t let me implement all my extremely bad ideas. It can do a couple things, at least, like flip maps very quickly and render maps to SVG.

I did toy around with music a little, but not a lot.

I wrote two short twines for Flora. They’re okay. I’m working on another; I think it’ll be better.

I didn’t do a lot of art overall, at least compared to the two previous years; most of my art effort over the year has gone into fox flux, which requires me to learn a whole lot of things. I did dip my toes into 3D modelling, most notably producing my current Twitter banner as well as this cool Star Anise animation. I wouldn’t mind doing more of that; maybe I’ll even try to make a low-poly pixel-textured 3D game sometime.

I restarted my book with a much better concept, though so far I’ve only written about half a chapter. Argh. I see that the vast majority of the work was done within the span of a single week, which is bad since that means I only worked on it for a week, but good since that means I can actually do a pretty good amount of work in only a week. I also did a lot of squabbling with tooling, which is hopefully mostly out of the way now.

My computer broke? That was an exciting week.

A lot of stuff, but the year as a whole still feels hit or miss. All the time I spent on veekun feels like a black void in the middle of the year, which seems like a good sign that I maybe don’t want to pour even more weeks into it in the near future.

Mostly, I want to do: more games, more art, more writing, more music.

I want to try out some tiny game making tools and make some tiny games with them — partly to get exposure to different things, partly to get more little ideas out into the world regularly, and partly to get more practice at letting myself have ideas. I have a couple tools in mind and I guess I’ll aim at a microgame every two months or so? I’d also like to finish the expanded fox flux by the end of the year, of course, though at the moment I can’t even gauge how long it might take.

I seriously lapsed on drawing last year, largely because fox flux pixel art took me so much time. So I want to draw more, and I want to get much faster at pixel art. It would probably help if I had a more concrete goal for drawing, so I might try to draw some short comics and write a little visual novel or something, which would also force me to aim for consistency.

I want to work on my book more, of course, but I also want to try my hand at a bit more fiction. I’ve had a blast writing dialogue for our games! I just shy away from longer-form writing for some reason — which seems ridiculous when a large part of my audience found me through my blog. I do think I’ve had some sort of breakthrough in the last month or two; I suddenly feel a good bit more confident about writing in general and figuring out what I want to say? One recent post I know I wrote in a single afternoon, which virtually never happens because I keep rewriting and rearranging stuff. Again, a visual novel would be a good excuse to practice writing fiction without getting too bogged down in details.

And, ah, music. I shy heavily away from music, since I have no idea what I’m doing, and also I seem to spend a lot of time fighting with tools. (Surprise.) I tried out SunVox for the first time just a few days ago and have been enjoying it quite a bit for making sound effects, so I might try it for music as well. And once again, visual novel background music is a pretty low-pressure thing to compose for. Hell, visual novels are small games, too, so that checks all the boxes. I guess I’ll go make a visual novel.

Here’s to twenty gayteen!

Now Open – Third AWS Availability Zone in London

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-open-third-aws-availability-zone-in-london/

We expand AWS by picking a geographic area (which we call a Region) and then building multiple, isolated Availability Zones in that area. Each Availability Zone (AZ) has multiple Internet connections and power connections to multiple grids.

Today I am happy to announce that we are opening our 50th AWS Availability Zone, with the addition of a third AZ to the EU (London) Region. This will give you additional flexibility to architect highly scalable, fault-tolerant applications that run across multiple AZs in the UK.

Since launching the EU (London) Region, we have seen an ever-growing set of customers, particularly in the public sector and in regulated industries, use AWS for new and innovative applications. Here are a couple of examples, courtesy of my AWS colleagues in the UK:

Enterprise – Some of the UK’s most respected enterprises are using AWS to transform their businesses, including BBC, BT, Deloitte, and Travis Perkins. Travis Perkins is one of the largest suppliers of building materials in the UK and is implementing the biggest systems and business change in its history, including an all-in migration of its data centers to AWS.

Startups – Cross-border payments company Currencycloud has migrated its entire payments production, and demo platform to AWS resulting in a 30% saving on their infrastructure costs. Clearscore, with plans to disrupting the credit score industry, has also chosen to host their entire platform on AWS. UnderwriteMe is using the EU (London) Region to offer an underwriting platform to their customers as a managed service.

Public Sector -The Met Office chose AWS to support the Met Office Weather App, available for iPhone and Android phones. Since the Met Office Weather App went live in January 2016, it has attracted more than half a million users. Using AWS, the Met Office has been able to increase agility, speed, and scalability while reducing costs. The Driver and Vehicle Licensing Agency (DVLA) is using the EU (London) Region for services such as the Strategic Card Payments platform, which helps the agency achieve PCI DSS compliance.

The AWS EU (London) Region has achieved Public Services Network (PSN) assurance, which provides UK Public Sector customers with an assured infrastructure on which to build UK Public Sector services. In conjunction with AWS’s Standardized Architecture for UK-OFFICIAL, PSN assurance enables UK Public Sector organizations to move their UK-OFFICIAL classified data to the EU (London) Region in a controlled and risk-managed manner.

For a complete list of AWS Regions and Services, visit the AWS Global Infrastructure page. As always, pricing for services in the Region can be found on the detail pages; visit our Cloud Products page to get started.


AWS Glue Now Supports Scala Scripts

Post Syndicated from Mehul Shah original https://aws.amazon.com/blogs/big-data/aws-glue-now-supports-scala-scripts/

We are excited to announce AWS Glue support for running ETL (extract, transform, and load) scripts in Scala. Scala lovers can rejoice because they now have one more powerful tool in their arsenal. Scala is the native language for Apache Spark, the underlying engine that AWS Glue offers for performing data transformations.

Beyond its elegant language features, writing Scala scripts for AWS Glue has two main advantages over writing scripts in Python. First, Scala is faster for custom transformations that do a lot of heavy lifting because there is no need to shovel data between Python and Apache Spark’s Scala runtime (that is, the Java virtual machine, or JVM). You can build your own transformations or invoke functions in third-party libraries. Second, it’s simpler to call functions in external Java class libraries from Scala because Scala is designed to be Java-compatible. It compiles to the same bytecode, and its data structures don’t need to be converted.

To illustrate these benefits, we walk through an example that analyzes a recent sample of the GitHub public timeline available from the GitHub archive. This site is an archive of public requests to the GitHub service, recording more than 35 event types ranging from commits and forks to issues and comments.

This post shows how to build an example Scala script that identifies highly negative issues in the timeline. It pulls out issue events in the timeline sample, analyzes their titles using the sentiment prediction functions from the Stanford CoreNLP libraries, and surfaces the most negative issues.

Getting started

Before we start writing scripts, we use AWS Glue crawlers to get a sense of the data—its structure and characteristics. We also set up a development endpoint and attach an Apache Zeppelin notebook, so we can interactively explore the data and author the script.

Crawl the data

The dataset used in this example was downloaded from the GitHub archive website into our sample dataset bucket in Amazon S3, and copied to the following locations:


Choose the best folder by replacing <region> with the region that you’re working in, for example, us-east-1. Crawl this folder, and put the results into a database named githubarchive in the AWS Glue Data Catalog, as described in the AWS Glue Developer Guide. This folder contains 12 hours of the timeline from January 22, 2017, and is organized hierarchically (that is, partitioned) by year, month, and day.

When finished, use the AWS Glue console to navigate to the table named data in the githubarchive database. Notice that this data has eight top-level columns, which are common to each event type, and three partition columns that correspond to year, month, and day.

Choose the payload column, and you will notice that it has a complex schema—one that reflects the union of the payloads of event types that appear in the crawled data. Also note that the schema that crawlers generate is a subset of the true schema because they sample only a subset of the data.

Set up the library, development endpoint, and notebook

Next, you need to download and set up the libraries that estimate the sentiment in a snippet of text. The Stanford CoreNLP libraries contain a number of human language processing tools, including sentiment prediction.

Download the Stanford CoreNLP libraries. Unzip the .zip file, and you’ll see a directory full of jar files. For this example, the following jars are required:

  • stanford-corenlp-3.8.0.jar
  • stanford-corenlp-3.8.0-models.jar
  • ejml-0.23.jar

Upload these files to an Amazon S3 path that is accessible to AWS Glue so that it can load these libraries when needed. For this example, they are in s3://glue-sample-other/corenlp/.

Development endpoints are static Spark-based environments that can serve as the backend for data exploration. You can attach notebooks to these endpoints to interactively send commands and explore and analyze your data. These endpoints have the same configuration as that of AWS Glue’s job execution system. So, commands and scripts that work there also work the same when registered and run as jobs in AWS Glue.

To set up an endpoint and a Zeppelin notebook to work with that endpoint, follow the instructions in the AWS Glue Developer Guide. When you are creating an endpoint, be sure to specify the locations of the previously mentioned jars in the Dependent jars path as a comma-separated list. Otherwise, the libraries will not be loaded.

After you set up the notebook server, go to the Zeppelin notebook by choosing Dev Endpoints in the left navigation pane on the AWS Glue console. Choose the endpoint that you created. Next, choose the Notebook Server URL, which takes you to the Zeppelin server. Log in using the notebook user name and password that you specified when creating the notebook. Finally, create a new note to try out this example.

Each notebook is a collection of paragraphs, and each paragraph contains a sequence of commands and the output for that command. Moreover, each notebook includes a number of interpreters. If you set up the Zeppelin server using the console, the (Python-based) pyspark and (Scala-based) spark interpreters are already connected to your new development endpoint, with pyspark as the default. Therefore, throughout this example, you need to prepend %spark at the top of your paragraphs. In this example, we omit these for brevity.

Working with the data

In this section, we use AWS Glue extensions to Spark to work with the dataset. We look at the actual schema of the data and filter out the interesting event types for our analysis.

Start with some boilerplate code to import libraries that you need:


import com.amazonaws.services.glue.DynamicRecord
import com.amazonaws.services.glue.GlueContext
import com.amazonaws.services.glue.util.GlueArgParser
import com.amazonaws.services.glue.util.Job
import com.amazonaws.services.glue.util.JsonOptions
import com.amazonaws.services.glue.types._
import org.apache.spark.SparkContext

Then, create the Spark and AWS Glue contexts needed for working with the data:

@transient val spark: SparkContext = SparkContext.getOrCreate()
val glueContext: GlueContext = new GlueContext(spark)

You need the transient decorator on the SparkContext when working in Zeppelin; otherwise, you will run into a serialization error when executing commands.

Dynamic frames

This section shows how to create a dynamic frame that contains the GitHub records in the table that you crawled earlier. A dynamic frame is the basic data structure in AWS Glue scripts. It is like an Apache Spark data frame, except that it is designed and optimized for data cleaning and transformation workloads. A dynamic frame is well-suited for representing semi-structured datasets like the GitHub timeline.

A dynamic frame is a collection of dynamic records. In Spark lingo, it is an RDD (resilient distributed dataset) of DynamicRecords. A dynamic record is a self-describing record. Each record encodes its columns and types, so every record can have a schema that is unique from all others in the dynamic frame. This is convenient and often more efficient for datasets like the GitHub timeline, where payloads can vary drastically from one event type to another.

The following creates a dynamic frame, github_events, from your table:

val github_events = glueContext
                    .getCatalogSource(database = "githubarchive", tableName = "data")

The getCatalogSource() method returns a DataSource, which represents a particular table in the Data Catalog. The getDynamicFrame() method returns a dynamic frame from the source.

Recall that the crawler created a schema from only a sample of the data. You can scan the entire dataset, count the rows, and print the complete schema as follows:


The result looks like the following:

The data has 414,826 records. As before, notice that there are eight top-level columns, and three partition columns. If you scroll down, you’ll also notice that the payload is the most complex column.

Run functions and filter records

This section describes how you can create your own functions and invoke them seamlessly to filter records. Unlike filtering with Python lambdas, Scala scripts do not need to convert records from one language representation to another, thereby reducing overhead and running much faster.

Let’s create a function that picks only the IssuesEvents from the GitHub timeline. These events are generated whenever someone posts an issue for a particular repository. Each GitHub event record has a field, “type”, that indicates the kind of event it is. The issueFilter() function returns true for records that are IssuesEvents.

def issueFilter(rec: DynamicRecord): Boolean = { 
    rec.getField("type").exists(_ == "IssuesEvent") 

Note that the getField() method returns an Option[Any] type, so you first need to check that it exists before checking the type.

You pass this function to the filter transformation, which applies the function on each record and returns a dynamic frame of those records that pass.

val issue_events =  github_events.filter(issueFilter)

Now, let’s look at the size and schema of issue_events.


It’s much smaller (14,063 records), and the payload schema is less complex, reflecting only the schema for issues. Keep a few essential columns for your analysis, and drop the rest using the ApplyMapping() transform:

val issue_titles = issue_events.applyMapping(Seq(("id", "string", "id", "string"),
                                                 ("actor.login", "string", "actor", "string"), 
                                                 ("repo.name", "string", "repo", "string"),
                                                 ("payload.action", "string", "action", "string"),
                                                 ("payload.issue.title", "string", "title", "string")))

The ApplyMapping() transform is quite handy for renaming columns, casting types, and restructuring records. The preceding code snippet tells the transform to select the fields (or columns) that are enumerated in the left half of the tuples and map them to the fields and types in the right half.

Estimating sentiment using Stanford CoreNLP

To focus on the most pressing issues, you might want to isolate the records with the most negative sentiments. The Stanford CoreNLP libraries are Java-based and offer sentiment-prediction functions. Accessing these functions through Python is possible, but quite cumbersome. It requires creating Python surrogate classes and objects for those found on the Java side. Instead, with Scala support, you can use those classes and objects directly and invoke their methods. Let’s see how.

First, import the libraries needed for the analysis:

import java.util.Properties
import edu.stanford.nlp.ling.CoreAnnotations
import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations
import edu.stanford.nlp.pipeline.{Annotation, StanfordCoreNLP}
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations
import scala.collection.convert.wrapAll._

The Stanford CoreNLP libraries have a main driver that orchestrates all of their analysis. The driver setup is heavyweight, setting up threads and data structures that are shared across analyses. Apache Spark runs on a cluster with a main driver process and a collection of backend executor processes that do most of the heavy sifting of the data.

The Stanford CoreNLP shared objects are not serializable, so they cannot be distributed easily across a cluster. Instead, you need to initialize them once for every backend executor process that might need them. Here is how to accomplish that:

val props = new Properties()
props.setProperty("annotators", "tokenize, ssplit, parse, sentiment")
props.setProperty("parse.maxlen", "70")

object myNLP {
    lazy val coreNLP = new StanfordCoreNLP(props)

The properties tell the libraries which annotators to execute and how many words to process. The preceding code creates an object, myNLP, with a field coreNLP that is lazily evaluated. This field is initialized only when it is needed, and only once. So, when the backend executors start processing the records, each executor initializes the driver for the Stanford CoreNLP libraries only one time.

Next is a function that estimates the sentiment of a text string. It first calls Stanford CoreNLP to annotate the text. Then, it pulls out the sentences and takes the average sentiment across all the sentences. The sentiment is a double, from 0.0 as the most negative to 4.0 as the most positive.

def estimatedSentiment(text: String): Double = {
    if ((text == null) || (!text.nonEmpty)) { return Double.NaN }
    val annotations = myNLP.coreNLP.process(text)
    val sentences = annotations.get(classOf[CoreAnnotations.SentencesAnnotation])
    sentences.foldLeft(0.0)( (csum, x) => { 
        csum + RNNCoreAnnotations.getPredictedClass(x.get(classOf[SentimentCoreAnnotations.SentimentAnnotatedTree])) 
    }) / sentences.length

Now, let’s estimate the sentiment of the issue titles and add that computed field as part of the records. You can accomplish this with the map() method on dynamic frames:

val issue_sentiments = issue_titles.map((rec: DynamicRecord) => { 
    val mbody = rec.getField("title")
    mbody match {
        case Some(mval: String) => { 
            rec.addField("sentiment", ScalarNode(estimatedSentiment(mval)))
            rec }
        case _ => rec

The map() method applies the user-provided function on every record. The function takes a DynamicRecord as an argument and returns a DynamicRecord. The code above computes the sentiment, adds it in a top-level field, sentiment, to the record, and returns the record.

Count the records with sentiment and show the schema. This takes a few minutes because Spark must initialize the library and run the sentiment analysis, which can be involved.


Notice that all records were processed (14,063), and the sentiment value was added to the schema.

Finally, let’s pick out the titles that have the lowest sentiment (less than 1.5). Count them and print out a sample to see what some of the titles look like.

val pressing_issues = issue_sentiments.filter(_.getField("sentiment").exists(_.asInstanceOf[Double] < 1.5))

Next, write them all to a file so that you can handle them later. (You’ll need to replace the output path with your own.)

glueContext.getSinkWithFormat(connectionType = "s3", 
                              options = JsonOptions("""{"path": "s3://<bucket>/out/path/"}"""), 
                              format = "json")

Take a look in the output path, and you can see the output files.

Putting it all together

Now, let’s create a job from the preceding interactive session. The following script combines all the commands from earlier. It processes the GitHub archive files and writes out the highly negative issues:

import com.amazonaws.services.glue.DynamicRecord
import com.amazonaws.services.glue.GlueContext
import com.amazonaws.services.glue.util.GlueArgParser
import com.amazonaws.services.glue.util.Job
import com.amazonaws.services.glue.util.JsonOptions
import com.amazonaws.services.glue.types._
import org.apache.spark.SparkContext
import java.util.Properties
import edu.stanford.nlp.ling.CoreAnnotations
import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations
import edu.stanford.nlp.pipeline.{Annotation, StanfordCoreNLP}
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations
import scala.collection.convert.wrapAll._

object GlueApp {

    object myNLP {
        val props = new Properties()
        props.setProperty("annotators", "tokenize, ssplit, parse, sentiment")
        props.setProperty("parse.maxlen", "70")

        lazy val coreNLP = new StanfordCoreNLP(props)

    def estimatedSentiment(text: String): Double = {
        if ((text == null) || (!text.nonEmpty)) { return Double.NaN }
        val annotations = myNLP.coreNLP.process(text)
        val sentences = annotations.get(classOf[CoreAnnotations.SentencesAnnotation])
        sentences.foldLeft(0.0)( (csum, x) => { 
            csum + RNNCoreAnnotations.getPredictedClass(x.get(classOf[SentimentCoreAnnotations.SentimentAnnotatedTree])) 
        }) / sentences.length

    def main(sysArgs: Array[String]) {
        val spark: SparkContext = SparkContext.getOrCreate()
        val glueContext: GlueContext = new GlueContext(spark)

        val dbname = "githubarchive"
        val tblname = "data"
        val outpath = "s3://<bucket>/out/path/"

        val github_events = glueContext
                            .getCatalogSource(database = dbname, tableName = tblname)

        val issue_events =  github_events.filter((rec: DynamicRecord) => {
            rec.getField("type").exists(_ == "IssuesEvent")

        val issue_titles = issue_events.applyMapping(Seq(("id", "string", "id", "string"),
                                                         ("actor.login", "string", "actor", "string"), 
                                                         ("repo.name", "string", "repo", "string"),
                                                         ("payload.action", "string", "action", "string"),
                                                         ("payload.issue.title", "string", "title", "string")))

        val issue_sentiments = issue_titles.map((rec: DynamicRecord) => { 
            val mbody = rec.getField("title")
            mbody match {
                case Some(mval: String) => { 
                    rec.addField("sentiment", ScalarNode(estimatedSentiment(mval)))
                    rec }
                case _ => rec

        val pressing_issues = issue_sentiments.filter(_.getField("sentiment").exists(_.asInstanceOf[Double] < 1.5))

        glueContext.getSinkWithFormat(connectionType = "s3", 
                              options = JsonOptions(s"""{"path": "$outpath"}"""), 
                              format = "json")

Notice that the script is enclosed in a top-level object called GlueApp, which serves as the script’s entry point for the job. (You’ll need to replace the output path with your own.) Upload the script to an Amazon S3 location so that AWS Glue can load it when needed.

To create the job, open the AWS Glue console. Choose Jobs in the left navigation pane, and then choose Add job. Create a name for the job, and specify a role with permissions to access the data. Choose An existing script that you provide, and choose Scala as the language.

For the Scala class name, type GlueApp to indicate the script’s entry point. Specify the Amazon S3 location of the script.

Choose Script libraries and job parameters. In the Dependent jars path field, enter the Amazon S3 locations of the Stanford CoreNLP libraries from earlier as a comma-separated list (without spaces). Then choose Next.

No connections are needed for this job, so choose Next again. Review the job properties, and choose Finish. Finally, choose Run job to execute the job.

You can simply edit the script’s input table and output path to run this job on whatever GitHub timeline datasets that you might have.


In this post, we showed how to write AWS Glue ETL scripts in Scala via notebooks and how to run them as jobs. Scala has the advantage that it is the native language for the Spark runtime. With Scala, it is easier to call Scala or Java functions and third-party libraries for analyses. Moreover, data processing is faster in Scala because there’s no need to convert records from one language runtime to another.

You can find more example of Scala scripts in our GitHub examples repository: https://github.com/awslabs/aws-glue-samples. We encourage you to experiment with Scala scripts and let us know about any interesting ETL flows that you want to share.

Happy Glue-ing!


Additional Reading

If you found this post useful, be sure to check out Simplify Querying Nested JSON with the AWS Glue Relationalize Transform and Genomic Analysis with Hail on Amazon EMR and Amazon Athena.


About the Authors

Mehul Shah is a senior software manager for AWS Glue. His passion is leveraging the cloud to build smarter, more efficient, and easier to use data systems. He has three girls, and, therefore, he has no spare time.




Ben Sowell is a software development engineer at AWS Glue.




Vinay Vivili is a software development engineer for AWS Glue.




Netflix, Amazon and Hollywood Sue Kodi-Powered Dragon Box Over Piracy

Post Syndicated from Ernesto original https://torrentfreak.com/netflix-amazon-and-hollywood-sue-kodi-powered-dragon-box-over-piracy-180111/

More and more people are starting to use Kodi-powered set-top boxes to stream video content to their TVs.

While Kodi itself is a neutral platform, sellers who ship devices with unauthorized add-ons give it a bad reputation.

In recent months these boxes have become the prime target for copyright enforcers, including the Alliance for Creativity and Entertainment (ACE), an anti-piracy partnership between Hollywood studios, Netflix, Amazon, and more than two dozen other companies.

After suing Tickbox last year a group of key ACE members have now filed a similar lawsuit against Dragon Media Inc, which sells the popular Dragon Box. The complaint, filed at a California federal court, also lists the company’s owner Paul Christoforo and reseller Jeff Williams among the defendants.

According to ACE, these type of devices are nothing more than pirate tools, allowing buyers to stream copyright infringing content. That also applies to Dragon Box, they inform the court.

“Defendants market and sell ‘Dragon Box,’ a computer hardware device that Defendants urge their customers to use as a tool for the mass infringement of the copyrighted motion pictures and television shows,” the complaint, picked up by HWR, reads.

The movie companies note that the defendants distribute and promote the Dragon Box as a pirate tool, using phrases such as “Watch your Favourites Anytime For FREE” and “stop paying for Netflix and Hulu.”

Dragon Box

When users follow the instructions Dragon provides they get free access to copyrighted movies, TV-shows and live content, ACE alleges. The complaint further points out that the device uses the open source Kodi player paired with pirate addons.

“The Dragon Media application provides Defendants’ customers with a customized configuration of the Kodi media player and a curated selection of the most popular addons for accessing infringing content,” the movie companies write.

“These addons are designed and maintained for the overarching purpose of scouring the Internet for illegal sources of copyrighted content and returning links to that content. When Dragon Box customers click those links, those customers receive unauthorized streams of popular motion pictures and television shows.”

One of the addons that are included with the download and installation of the Dragon software is Covenant.

This addon can be accessed through a preinstalled shortcut which is linked under the “Videos” menu. Users are then able to browse through a large library of curated content, including a separate category of movies that are still in theaters.

In theaters

According to a statement from Dragon owner Christoforo, business is going well. The company claims to have “over 250,000 customers in 50 states and 4 countries and growing” as well as “374 sellers” across the world.

With this lawsuit, however, the company’s future has suddenly become uncertain.

The movie companies ask the California District for an injunction to shut down the infringing service and impound all Dragon Box devices. In addition, they’re requesting statutory damages which can go up to several million dollars.

At the time of writing the Dragon Box website is still in on air and the company has yet to comment on the allegations.

A copy of the complaint is available here (pdf).

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Graphite 1.1: Teaching an Old Dog New Tricks

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2018/01/11/graphite-1.1-teaching-an-old-dog-new-tricks/

The Road to Graphite 1.1

I started working on Graphite just over a year ago, when @obfuscurity asked me to help out with some issues blocking the Graphite 1.0 release. Little did I know that a year later, that would have resulted in 262 commits (and counting), and that with the help of the other Graphite maintainers (especially @deniszh, @iksaif & @cbowman0) we would have added a huge amount of new functionality to Graphite.

There are a huge number of new additions and updates in this release, in this post I’ll give a tour of some of the highlights including tag support, syntax and function updates, custom function plugins, and python 3.x support.


The single biggest feature in this release is the addition of tag support, which brings the ability to describe metrics in a much richer way and to write more flexible and expressive queries.

Traditionally series in Graphite are identified using a hierarchical naming scheme based on dot-separated segments called nodes. This works very well and is simple to map into a hierarchical structure like the whisper filesystem tree, but it means that the user has to know what each segment represents, and makes it very difficult to modify or extend the naming scheme since everything is based on the positions of the segments within the hierarchy.

The tagging system gives users the ability to encode information about the series in a collection of tag=value pairs which are used together with the series name to uniquely identify each series, and the ability to query series by specifying tag-based matching expressions rather than constructing glob-style selectors based on the positions of specific segments within the hierarchy. This is broadly similar to the system used by Prometheus and makes it possible to use Graphite as a long-term storage backend for metrics gathered by Prometheus with full tag support.

When using tags, series names are specified using the new tagged carbon format: name;tag1=value1;tag2=value2. This format is backward compatible with most existing carbon tooling, and makes it easy to adapt existing tools to produce tagged metrics simply by changing the metric names. The OpenMetrics format is also supported for ingestion, and is normalized into the standard Graphite format internally.

At its core, the tagging system is implemented as a tag database (TagDB) alongside the metrics that allows them to be efficiently queried by individual tag values rather than having to traverse the metrics tree looking for series that match the specified query. Internally the tag index is stored in one of a number of pluggable tag databases, currently supported options are the internal graphite-web database, redis, or an external system that implements the Graphite tagging HTTP API. Carbon automatically keeps the index up to date with any tagged series seen.

The new seriesByTag function is used to query the TagDB and will return a list of all the series that match the expressions passed to it. seriesByTag supports both exact and regular expression matches, and can be used anywhere you would previously have specified a metric name or glob expression.

There are new dedicated functions for grouping and aliasing series by tag (groupByTags and aliasByTags), and you can also use tags interchangeably with node numbers in the standard Graphite functions like aliasByNode, groupByNodes, asPercent, mapSeries, etc.

Piping Syntax & Function Updates

One of the huge strengths of the Graphite render API is the ability to chain together multiple functions to process data, but until now (unless you were using a tool like Grafana) writing chained queries could be painful as each function had to be wrapped around the previous one. With this release it is now possible to “pipe” the output of one processing function into the next, and to combine piped and nested functions.

For example:


Can now be written as:




Another source of frustration with the old function API was the inconsistent implementation of aggregations, with different functions being used in different parts of the API, and some functions simply not being available. In 1.1 all functions that perform aggregation (whether across series or across time intervals) now support a consistent set of aggregations; average, median, sum, min, max, diff, stddev, count, range, multiply and last. This is part of a new approach to implementing functions that emphasises using shared building blocks to ensure consistency across the API and solve the problem of a particular function not working with the aggregation needed for a given task.

To that end a number of new functions have been added that each provide the same functionality as an entire family of “old” functions; aggregate, aggregateWithWildcards, movingWindow, filterSeries, highest, lowest and sortBy.

Each of these functions accepts an aggregation method parameter, for example aggregate(some.metric.*, 'sum') implements the same functionality as sumSeries(some.metric.*).

It can also be used with different aggregation methods to replace averageSeries, stddevSeries, multiplySeries, diffSeries, rangeOfSeries, minSeries, maxSeries and countSeries. All those functions are now implemented as aliases for aggregate, and it supports the previously-missing median and last aggregations.

The same is true for the other functions, and the summarize, smartSummarize, groupByNode, groupByNodes and the new groupByTags functions now all support the standard set of aggregations. Gone are the days of wishing that sortByMedian or highestRange were available!

For more information on the functions available check the function documentation.

Custom Functions

No matter how many functions are available there are always going to be specific use-cases where a custom function can perform analysis that wouldn’t otherwise be possible, or provide a convenient alias for a complicated function chain or specific set of parameters.

In Graphite 1.1 we added support for easily adding one-off custom functions, as well as for creating and sharing plugins that can provide one or more functions.

Each function plugin is packaged as a simple python module, and will be automatically loaded by Graphite when placed into the functions/custom folder.

An example of a simple function plugin that translates the name of every series passed to it into UPPERCASE:

from graphite.functions.params import Param, ParamTypes

def toUpperCase(requestContext, seriesList):
  """Custom function that changes series names to UPPERCASE"""
  for series in seriesList:
    series.name = series.name.upper()
  return seriesList

toUpperCase.group = 'Custom'
toUpperCase.params = [
  Param('seriesList', ParamTypes.seriesList, required=True),

SeriesFunctions = {
  'upper': toUpperCase,

Once installed the function is not only available for use within Grpahite, but is also exposed via the new Function API which allows the function definition and documentation to be automatically loaded by tools like Grafana. This means that users will be able to select and use the new function in exactly the same way as the internal functions.

More information on writing and using custom functions is available in the documentation.

Clustering Updates

One of the biggest changes from the 0.9 to 1.0 releases was the overhaul of the clustering code, and with 1.1.1 that process has been taken even further to optimize performance when using Graphite in a clustered deployment. In the past it was common for a request to require the frontend node to make multiple requests to the backend nodes to identify matching series and to fetch data, and the code for handling remote vs local series was overly complicated. In 1.1.1 we took a new approach where all render data requests pass through the same path internally, and multiple backend nodes are handled individually rather than grouped together into a single finder. This has greatly simplified the codebase, making it much easier to understand and reason about, while allowing much more flexibility in design of the finders. After these changes, render requests can now be answered with a single internal request to each backend node, and all requests for both remote and local data are executed in parallel.

To maintain the ability of graphite to scale out horizontally, the tagging system works seamlessly within a clustered environment, with each node responsible for the series stored on that node. Calls to load tagged series via seriesByTag are fanned out to the backend nodes and results are merged on the query node just like they are for non-tagged series.

Python 3 & Django 1.11 Support

Graphite 1.1 finally brings support for Python 3.x, both graphite-web and carbon are now tested against Python 2.7, 3.4, 3.5, 3.6 and PyPy. Django releases 1.8 through 1.11 are also supported. The work involved in sorting out the compatibility issues between Python 2.x and 3.x was quite involved, but it is a huge step forward for the long term support of the project! With the new Django 2.x series supporting only Python 3.x we will need to evaluate our long-term support for Python 2.x, but the Django 1.11 series is supported through 2020 so there is time to consider the options there.

Watch This Space

Efforts are underway to add support for the new functionality across the ecosystem of tools that work with Graphite, adding collectd tagging support, prometheus remote read & write with tags (and native Prometheus remote read/write support in Graphite) and last but not least Graphite tag support in Grafana.

We’re excited about the possibilities that the new capabilities in 1.1.x open up, and can’t wait to see how the community puts them to work.

Download the 1.1.1 release and check out the release notes here.

Weekly roundup: AOOOWR

Post Syndicated from Eevee original https://eev.ee/dev/2018/01/09/weekly-roundup-aooowr/

  • anise!!: Work continues! glip is busy with a big Flora update, so I’m left to just do code things in the meantime. I did some refactoring I’d been wanting to do for months (splitting apart the “map” and the “world” and the scene that draws the world), drew some final-ish menu art (it looks so good), switched to a vastly more accurate way to integrate position, added a bunch of transitions that make the game feel way more polished, and drew some pretty slick dialogue boxes. Nice!

    I’ll be continuing to work on this game during GAMES MADE QUICK??? 2.0, my jam for making games while watching AGDQ all week! Maybe join if you’re watching AGDQ all week!

  • art: I tried drawing a picture and this time I liked it. I also drew the header art for the aforementioned game jam, though I didn’t have time to finish it, but I think I pulled off a deliberate-looking scratchy sketchy style that’s appropriate for a game jam? Sure we’ll go with that.

  • blog: I finished a post about picking random numbers and a post about how game physics cheat. Which, ah, catches me up for December! Heck! I think I’ve found a slightly more casual style that feels easier to get down, though?

  • writing: I finally wrangled a sensible outline for a Twine I’ve been dragging my feet on, so now I don’t have any excuses! Oh no!

Daniel Miessler on My Writings about IoT Security

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/01/daniel_miessler.html

Daniel Miessler criticizes my writings about IoT security:

I know it’s super cool to scream about how IoT is insecure, how it’s dumb to hook up everyday objects like houses and cars and locks to the internet, how bad things can get, and I know it’s fun to be invited to talk about how everything is doom and gloom.

I absolutely respect Bruce Schneier a lot for what he’s contributed to InfoSec, which makes me that much more disappointed with this kind of position from him.

InfoSec is full of those people, and it’s beneath people like Bruce to add their voices to theirs. Everyone paying attention already knows it’s going to be a soup sandwich — a carnival of horrors — a tragedy of mistakes and abuses of trust.

It’s obvious. Not interesting. Not novel. Obvious. But obvious or not, all these things are still going to happen.

I actually agree with everything in his essay. “We should obviously try to minimize the risks, but we don’t do that by trying to shout down the entire enterprise.” Yes, definitely.

I don’t think the IoT must be stopped. I do think that the risks are considerable, and will increase as these systems become more pervasive and susceptible to class breaks. And I’m trying to write a book that will help navigate this. I don’t think I’m the prophet of doom, and don’t want to come across that way. I’ll give the manuscript another read with that in mind.

Top 10 Most Popular Torrent Sites of 2018

Post Syndicated from Ernesto original https://torrentfreak.com/top-10-most-popular-torrent-sites-of-2018-180107/

Torrent sites have come and gone over past year. Now, at the start of 2018, we take a look to see what the most-used sites are in the current landscape.

The Pirate Bay remains the undisputed number one. The site has weathered a few storms over the years, but it looks like it will be able to celebrate its 15th anniversary, which is coming up in a few months.

The list also includes various newcomers including Idope and Zooqle. While many people are happy to see new torrent sites emerge, this often means that others have called it quits.

Last year’s runner-up Extratorrent, for example, has shut down and left a gaping hole behind. And it wasn’t the only site that went away. TorrentProject also disappeared without a trace and the same was true for isohunt.to.

The unofficial Torrentz reincarnation Torrentz2.eu, the highest newcomer last year, is somewhat of an unusual entry. A few weeks ago all links to externally hosted torrents were removed, as was the list of indexed pages.

We decided to include the site nonetheless, given its history and because it’s still possible to find hashes through the site. As Torrentz2’s future is uncertain, we added an extra site (10.1) as compensation.

Finally, RuTracker also deserves a mention. The torrent site generates enough traffic to warrant a listing, but we traditionally limit the list to sites that are targeted primarily at an English or international audience.

Below is the full list of the ten most-visited torrent sites at the start of the new year. The list is based on various traffic reports and we display the Alexa rank for each. In addition, we include last year’s ranking.

Most Popular Torrent Sites

1. The Pirate Bay

The Pirate Bay is the “king of torrents” once again and also the oldest site in this list. The past year has been relatively quiet for the notorious torrent site, which is currently operating from its original .org domain name.

Alexa Rank: 104/ Last year #1


RARBG, which started out as a Bulgarian tracker, has captured the hearts and minds of many video pirates. The site was founded in 2008 and specializes in high quality video releases.

Alexa Rank: 298 / Last year #3

3. 1337x

1337x continues where it left off last year. The site gained a lot of traffic and, unlike some other sites in the list, has a dedicated group of uploaders that provide fresh content.

Alexa Rank: 321 / Last year #6

4. Torrentz2

Torrentz2 launched as a stand-in for the original Torrentz.eu site, which voluntarily closed its doors in 2016. At the time of writing, the site only lists torrent hashes and no longer any links to external torrent sites. While browser add-ons and plugins still make the site functional, its future is uncertain.

Alexa Rank: 349 / Last year #5

5. YTS.ag

YTS.ag is the unofficial successors of the defunct YTS or YIFY group. Not all other torrent sites were happy that the site hijacked the popuar brand and several are actively banning its releases.

Alexa Rank: 563 / Last year #4

6. EZTV.ag

The original TV-torrent distribution group EZTV shut down after a hostile takeover in 2015, with new owners claiming ownership of the brand. The new group currently operates from EZTV.ag and releases its own torrents. These releases are banned on some other torrent sites due to this controversial history.

Alexa Rank: 981 / Last year #7

7. LimeTorrents

Limetorrents has been an established torrent site for more than half a decade. The site’s operator also runs the torrent cache iTorrents, which is used by several other torrent search engines.

Alexa Rank: 2,433 / Last year #10

8. NYAA.si

NYAA.si is a popular resurrection of the anime torrent site NYAA, which shut down last year. Previously we left anime-oriented sites out of the list, but since we also include dedicated TV and movie sites, we decided that a mention is more than warranted.

Alexa Rank: 1,575 / Last year #NA

9. Torrents.me

Torrents.me is one of the torrent sites that enjoyed a meteoric rise in traffic this year. It’s a meta-search engine that links to torrent files and magnet links from other torrent sites.

Alexa Rank: 2,045 / Last year #NA

10. Zooqle

Zooqle, which boasts nearly three million verified torrents, has stayed under the radar for years but has still kept growing. The site made it into the top 10 for the first time this year.

Alexa Rank: 2,347 / Last year #NA

10.1 iDope

The special 10.1 mention goes to iDope. Launched in 2016, the site is a relative newcomer to the torrent scene. The torrent indexer has steadily increased its audience over the past year. With similar traffic numbers to Zooqle, a listing is therefore warranted.

Alexa Rank: 2,358 / Last year #NA

Disclaimer: Yes, we know that Alexa isn’t perfect, but it helps to compare sites that operate in a similar niche. We also used other traffic metrics to compile the top ten. Please keep in mind that many sites have mirrors or alternative domains, which are not taken into account here.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Physics cheats

Post Syndicated from Eevee original https://eev.ee/blog/2018/01/06/physics-cheats/

Anonymous asks:

something about how we tweak physics to “work” better in games?

Ho ho! Work. Get it? Like in physics…?


Hitbox” is perhaps not the most accurate term, since the shape used for colliding with the environment and the shape used for detecting damage might be totally different. They’re usually the same in simple platformers, though, and that’s what most of my games have been.

The hitbox is the biggest physics fudge by far, and it exists because of a single massive approximation that (most) games make: you’re controlling a single entity in the abstract, not a physical body in great detail.

That is: when you walk with your real-world meat shell, you perform a complex dance of putting one foot in front of the other, a motion you spent years perfecting. When you walk in a video game, you press a single “walk” button. Your avatar may play an animation that moves its legs back and forth, but since you’re not actually controlling the legs independently (and since simulating them is way harder), the game just treats you like a simple shape. Fairly often, this is a box, or something very box-like.

An Eevee sprite standing on faux ground; the size of the underlying image and the hitbox are outlined

Since the player has no direct control over the exact placement of their limbs, it would be slightly frustrating to have them collide with the world. This is especially true in cases like the above, where the tail and left ear protrude significantly out from the main body. If that Eevee wanted to stand against a real-world wall, she would simply tilt her ear or tail out of the way, so there’s no reason for the ear to block her from standing against a game wall. To compensate for this, the ear and tail are left out of the collision box entirely and will simply jut into a wall if necessary — a goofy affordance that’s so common it doesn’t even register as unusual. As a bonus (assuming this same box is used for combat), she won’t take damage from projectiles that merely graze past an ear.

(One extra consideration for sprite games in particular: the hitbox ought to be horizontally symmetric around the sprite’s pivot — i.e. the point where the entity is truly considered to be standing — so that the hitbox doesn’t abruptly move when the entity turns around!)


Treating the player (and indeed most objects) as a box has one annoying side effect: boxes have corners. Corners can catch on other corners, even by a single pixel. Real-world bodies tend to be a bit rounder and squishier and this can tolerate grazing a corner; even real-world boxes will simply rotate a bit.

Ah, but in our faux physics world, we generally don’t want conscious actors (such as the player) to rotate, even with a realistic physics simulator! Real-world bodies are made of parts that will generally try to keep you upright, after all; you don’t tilt back and forth much.

One way to handle corners is to simply remove them from conscious actors. A hitbox doesn’t have to be a literal box, after all. A popular alternative — especially in Unity where it’s a standard asset — is the pill-shaped capsule, which has semicircles/hemispheres on the top and bottom and a cylindrical body in 3D. No corners, no problem.

Of course, that introduces a new problem: now the player can’t balance precariously on edges without their rounded bottom sliding them off. Alas.

If you’re stuck with corners, then, you may want to use a corner bump, a term I just made up. If the player would collide with a corner, but the collision is only by a few pixels, just nudge them to the side a bit and carry on.

An Eevee sprite trying to move sideways into a shallow ledge; the game bumps her upwards slightly, so she steps onto it instead

When the corner is horizontal, this creates stairs! This is, more or less kinda, how steps work in Doom: when the player tries to cross from one sector into another, if the height difference is 24 units or less, the game simply bumps them upwards to the height of the new floor and lets them continue on.

Implementing this in a game without Doom’s notion of sectors is a little trickier. In fact, I still haven’t done it. Collision detection based on rejection gets it for free, kinda, but it’s not very deterministic and it breaks other things. But that’s a whole other post.


Gravity is pretty easy. Everything accelerates downwards all the time. What’s interesting are the exceptions.


Jumping is a giant hack.

Think about how actual jumping works: you tense your legs, which generally involves bending your knees first, and then spring upwards. In a platformer, you can just leap whenever you feel like it, which is nonsense. Also you go like twenty feet into the air?

Worse, most platformers allow variable-height jumping, where your jump is lower if you let go of the jump button while you’re in the air. Normally, one would expect to have to decide how much force to put into the jump beforehand.

But of course this is about convenience of controls: when jumping is your primary action, you want to be able to do it immediately, without any windup for how high you want to jump.

(And then there’s double jumping? Come on.)

Air control is a similar phenomenon: usually you’d jump in a particular direction by controlling how you push off the ground with your feet, but in a video game, you don’t have feet! You only have the box. The compromise is to let you control your horizontal movement to a limit degree in midair, even though that doesn’t make any sense. (It’s way more fun, though, and overall gives you more movement options, which are good to have in an interactive medium.)

Air control also exposes an obvious place that game physics collide with the realistic model of serious physics engines. I’ve mentioned this before, but: if you use Real Physics™ and air control yourself into a wall, you might find that you’ll simply stick to the wall until you let go of the movement buttons. Why? Remember, player movement acts as though an external force were pushing you around (and from the perspective of a Real™ physics engine, this is exactly how you’d implement it) — so air-controlling into a wall is equivalent to pushing a book against a wall with your hand, and the friction with the wall holds you in place. Oops.

Ground sticking

Another place game physics conflict with physics engines is with running to the top of a slope. On a real hill, of course, you land on top of the slope and are probably glad of it; slopes are hard to climb!

An Eevee moves to the top of a slope, and rather than step onto the flat top, she goes flying off into the air

In a video game, you go flying. Because you’re a box. With momentum. So you hit the peak and keep going in the same direction. Which is diagonally upwards.


To make them more predictable, projectiles generally aren’t subject to gravity, at least as far as I’ve seen. The real world does not have such an exemption. The real world imposes gravity even on sniper rifles, which in a video game are often implemented as an instant trace unaffected by anything in the world because the bullet never actually exists in the world.


Ah. Welcome to hell.


Water is an interesting case, and offhand I don’t know the gritty details of how games implement it. In the real world, water applies a resistant drag force to movement — and that force is proportional to the square of velocity, which I’d completely forgotten until right now. I am almost positive that no game handles that correctly. But then, in real-world water, you can push against the water itself for movement, and games don’t simulate that either. What’s the rough equivalent?

The Sonic Physics Guide suggests that Sonic handles it by basically halving everything: acceleration, max speed, friction, etc. When Sonic enters water, his speed is cut; when Sonic exits water, his speed is increased.

That last bit feels validating — I could swear Metroid Prime did the same thing, and built my own solution around it, but couldn’t remember for sure. It makes no sense, of course, for a jump to become faster just because you happened to break the surface of the water, but it feels fantastic.

The thing I did was similar, except that I didn’t want to add a multiplier in a dozen places when you happen to be underwater (and remember which ones need it to be squared, etc.). So instead, I calculate everything completely as normal, so velocity is exactly the same as it would be on dry land — but the distance you would move gets halved. The effect seems to be pretty similar to most platformers with water, at least as far as I can tell. It hasn’t shown up in a published game and I only added this fairly recently, so I might be overlooking some reason this is a bad idea.

(One reason that comes to mind is that velocity is now a little white lie while underwater, so anything relying on velocity for interesting effects might be thrown off. Or maybe that’s correct, because velocity thresholds should be halved underwater too? Hm!)

Notably, air is also a fluid, so it should behave the same way (just with different constants). I definitely don’t think any games apply air drag that’s proportional to the square of velocity.


Friction is, in my experience, a little handwaved. Probably because real-world friction is so darn complicated.

Consider that in the real world, we want very high friction on the surfaces we walk on — shoes and tires are explicitly designed to increase it, even. We move by bracing a back foot against the ground and using that to push ourselves forward, so we want the ground to resist our push as much as possible.

In a game world, we are a box. We move by being pushed by some invisible outside force, so if the friction between ourselves and the ground is too high, we won’t be able to move at all! That’s complete nonsense physically, but it turns out to be handy in some cases — for example, highish friction can simulate walking through deep mud, which should be difficult due to fluid drag and low friction.

But the best-known example of the fakeness of game friction is video game ice. Walking on real-world ice is difficult because the low friction means low grip; your feet are likely to slip out from under you, and you’ll simply fall down and have trouble moving at all. In a video game, you can’t fall down, so you have the opposite experience: you spend most of your time sliding around uncontrollably. Yet ice is so common in video games (and perhaps so uncommon in places I’ve lived) that I, at least, had never really thought about this disparity until an hour or so ago.

Game friction vs real-world friction

Real-world friction is a force. It’s the normal force (which is the force exerted by the object on the surface) times some constant that depends on how the two materials interact.

Force is mass times acceleration, and platformers often ignore mass, so friction ought to be an acceleration — applied against the object’s movement, but never enough to push it backwards.

I haven’t made any games where variable friction plays a significant role, but my gut instinct is that low friction should mean the player accelerates more slowly but has a higher max speed, and high friction should mean the opposite. I see from my own source code that I didn’t even do what I just said, so let’s defer to some better-made and well-documented games: Sonic and Doom.

In Sonic, friction is a fixed value subtracted from the player’s velocity (regardless of direction) each tic. Sonic has a fixed framerate, so the units are really pixels per tic squared (i.e. acceleration), multiplied by an implicit 1 tic per tic. So far, so good.

But Sonic’s friction only applies if the player isn’t pressing or . Hang on, that isn’t friction at all; that’s just deceleration! That’s equivalent to jogging to a stop. If friction were lower, Sonic would take longer to stop, but otherwise this is only tangentially related to friction.

(In fairness, this approach would decently emulate friction for non-conscious sliding objects, which are never going to be pressing movement buttons. Also, we don’t have the Sonic source code, and the name “friction” is a fan invention; the Sonic Physics Guide already uses “deceleration” to describe the player’s acceleration when turning around.)

Okay, let’s try Doom. In Doom, the default friction is 90.625%.

Hang on, what?

Yes, in Doom, friction is a multiplier applied every tic. Doom runs at 35 tics per second, so this is a multiplier of 0.032 per second. Yikes!

This isn’t anything remotely like real friction, but it’s much easier to implement. With friction as acceleration, the game has to know both the direction of movement (so it can apply friction in the opposite direction) and the magnitude (so it doesn’t overshoot and launch the object in the other direction). That means taking a semi-costly square root and also writing extra code to cap the amount of friction. With a multiplier, neither is necessary; just multiply the whole velocity vector and you’re done.

There are some downsides. One is that objects will never actually stop, since multiplying by 3% repeatedly will never produce a result of zero — though eventually the speed will become small enough to either slip below a “minimum speed” threshold or simply no longer fit in a float representation. Another is that the units are fairly meaningless: with Doom’s default friction of 90.625%, about how long does it take for the player to stop? I have no idea, partly because “stop” is ambiguous here! If friction were an acceleration, I could divide it into the player’s max speed to get a time.

All that aside, what are the actual effects of changing Doom’s friction? What an excellent question that’s surprisingly tricky to answer. (Note that friction can’t be changed in original Doom, only in the Boom port and its derivatives.) Here’s what I’ve pieced together.

Doom’s “friction” is really two values. “Friction” itself is a multiplier applied to moving objects on every tic, but there’s also a move factor which defaults to \(\frac{1}{32} = 0.03125\) and is derived from friction for custom values.

Every tic, the player’s velocity is multiplied by friction, and then increased by their speed times the move factor.

v(n) = v(n – 1) \times friction + speed \times move factor

Eventually, the reduction from friction will balance out the speed boost. That happens when \(v(n) = v(n – 1)\), so we can rearrange it to find the player’s effective max speed:

v = v \times friction + speed \times move factor \\
v – v \times friction = speed \times move factor \\
v = speed \times \frac{move factor}{1 – friction}

For vanilla Doom’s move factor of 0.03125 and friction of 0.90625, that becomes:

v = speed \times \frac{\frac{1}{32}}{1 – \frac{29}{32}} = speed \times \frac{\frac{1}{32}}{\frac{3}{32}} = \frac{1}{3} \times speed

Curiously, “speed” is three times the maximum speed an actor can actually move. Doomguy’s run speed is 50, so in practice he moves a third of that, or 16⅔ units per tic. (Of course, this isn’t counting SR40, a bug that lets Doomguy run ~40% faster than intended diagonally.)

So now, what if you change friction? Even more curiously, the move factor is calculated completely differently depending on whether friction is higher or lower than the default Doom amount:

move factor = \begin{cases}
\frac{133 – 128 \times friction}{544} &≈ 0.244 – 0.235 \times friction & \text{ if } friction \ge \frac{29}{32} \\
\frac{81920 \times friction – 70145}{1048576} &≈ 0.078 \times friction – 0.067 & \text{ otherwise }

That’s pretty weird? Complicating things further is that low friction (which means muddy terrain, remember) has an extra multiplier on its move factor, depending on how fast you’re already going — the idea is apparently that you have a hard time getting going, but it gets easier as you find your footing. The extra multiplier maxes out at 8, which makes the two halves of that function meet at the vanilla Doom value.

A graph of the relationship between friction and move factor

That very top point corresponds to the move factor from the original game. So no matter what you do to friction, the move factor becomes lower. At 0.85 and change, you can no longer move at all; below that, you move backwards.

From the formula above, it’s easy to see what changes to friction and move factor will do to Doomguy’s stable velocity. Move factor is in the numerator, so increasing it will increase stable velocity — but it can’t increase, so stable velocity can only ever decrease. Friction is in the denominator, but it’s subtracted from 1, so increasing friction will make the denominator a smaller value less than 1, i.e. increase stable velocity. Combined, we get this relationship between friction and stable velocity.

A graph showing stable velocity shooting up dramatically as friction increases

As friction approaches 1, stable velocity grows without bound. This makes sense, given the definition of \(v(n)\) — if friction is 1, the velocity from the previous tic isn’t reduced at all, so we just keep accelerating freely.

All of this is why I’m wary of using multipliers.

Anyway, this leaves me with one last question about the effects of Doom’s friction: how long does it take to reach stable velocity? Barring precision errors, we’ll never truly reach stable velocity, but let’s say within 5%. First we need a closed formula for the velocity after some number of tics. This is a simple recurrence relation, and you can write a few terms out yourself if you want to be sure this is right.

v(n) = v_0 \times friction^n + speed \times move factor \times \frac{friction^n – 1}{friction – 1}

Our initial velocity is zero, so the first term disappears. Set this equal to the stable formula and solve for n:

speed \times move factor \times \frac{friction^n – 1}{friction – 1} = (1 – 5\%) \times speed \times \frac{move factor}{1 – friction} \\
friction^n – 1 = -(1 – 5\%) \\
n = \frac{\ln 5\%}{\ln friction}

Speed” and move factor disappear entirely, which makes sense, and this is purely a function of friction (and how close we want to get). For vanilla Doom, that comes out to 30.4, which is a little less than a second. For other values of friction:

A graph of time to stability which leaps upwards dramatically towards the right

As friction increases (which in Doom terms means the surface is more slippery), it takes longer and longer to reach stable speed, which is in turn greater and greater. For lesser friction (i.e. mud), stable speed is lower, but reached fairly quickly. (Of course, the extra “getting going” multiplier while in mud adds some extra time here, but including that in the graph is a bit more complicated.)

I think this matches with my instincts above. How fascinating!

What’s that? This is way too much math and you hate it? Then don’t use multipliers in game physics.


That was a hell of a diversion!

I guess the goofiest stuff in basic game physics is really just about mapping player controls to in-game actions like jumping and deceleration; the rest consists of hacks to compensate for representing everything as a box.

Detecting Adblocker Blockers

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/01/detecting_adblo.html

Interesting research on the prevalence of adblock blockers: “Measuring and Disrupting Anti-Adblockers Using Differential Execution Analysis“:

Abstract: Millions of people use adblockers to remove intrusive and malicious ads as well as protect themselves against tracking and pervasive surveillance. Online publishers consider adblockers a major threat to the ad-powered “free” Web. They have started to retaliate against adblockers by employing anti-adblockers which can detect and stop adblock users. To counter this retaliation, adblockers in turn try to detect and filter anti-adblocking scripts. This back and forth has prompted an escalating arms race between adblockers and anti-adblockers.

We want to develop a comprehensive understanding of anti-adblockers, with the ultimate aim of enabling adblockers to bypass state-of-the-art anti-adblockers. In this paper, we present a differential execution analysis to automatically detect and analyze anti-adblockers. At a high level, we collect execution traces by visiting a website with and without adblockers. Through differential execution analysis, we are able to pinpoint the conditions that lead to the differences caused by anti-adblocking code. Using our system, we detect anti-adblockers on 30.5% of the Alexa top-10K websites which is 5-52 times more than reported in prior literature. Unlike prior work which is limited to detecting visible reactions (e.g., warning messages) by anti-adblockers, our system can discover attempts to detect adblockers even when there is no visible reaction. From manually checking one third of the detected websites, we find that the websites that have no visible reactions constitute over 90% of the cases, completely dominating the ones that have visible warning messages. Finally, based on our findings, we further develop JavaScript rewriting and API hooking based solutions (the latter implemented as a Chrome extension) to help adblockers bypass state-of-the-art anti-adblockers.

News article.

timeShift(GrafanaBuzz, 1w) Issue 28

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2018/01/05/timeshiftgrafanabuzz-1w-issue-28/

Happy new year! Grafana Labs is getting back in the swing of things after taking some time off to celebrate 2017, and spending time with family and friends. We’re diligently working on the new Grafana v5.0 release (planning v5.0 beta release by end of January), which includes a ton of new features, a new layout engine, and a polished UI. We’d love to hear your feedback!

Latest Stable Release

Grafana 4.6.3 is now available. Latest bugfixes include:

  • Gzip: Fixes bug Gravatar images when gzip was enabled #5952
  • Alert list: Now shows alert state changes even after adding manual annotations on dashboard #99513
  • Alerting: Fixes bug where rules evaluated as firing when all conditions was false and using OR operator. #93183
  • Cloudwatch: CloudWatch no longer display metrics’ default alias #101514, thx @mtanda

Download Grafana 4.6.3 Now

From the Blogosphere

Why Observability Matters – Now and in the Future: Our own Carl Bergquist teamed up with Neil Gehani, Director of Product at Weaveworks to discuss best practices on how to get started with monitoring your application and infrastructure. This video focuses on modern containerized applications instrumented to use Prometheus to generate metrics and Grafana to visualize them.

How to Install and Secure Grafana on Ubuntu 16.04: In this tutorial, you’ll learn how to install and secure Grafana with a SSL certificate and a Nginx reverse proxy, then you’ll modify Grafana’s default settings for even tighter security.

Monitoring Informix with Grafana: Ben walks us through how to use Grafana to visualize data from IBM Informix and offers a practical demonstration using Docker containers. He also talks about his philosophy of sharing dashboards across teams, important metrics to collect, and how he would like to improve his monitoring stack.

Monitor your hosts with Glances + InfluxDB + Grafana: Glances is a cross-platform system monitoring tool written in Python. This article takes you step by step through the pieces of the stack, installation, confirguration and provides a sample dashboard to get you up and running.

GrafanaCon Tickets are Going Fast!

Lock in your seat for GrafanaCon EU while there are still tickets avaialable! Join us March 1-2, 2018 in Amsterdam for 2 days of talks centered around Grafana and the surrounding monitoring ecosystem including Graphite, Prometheus, InfluxData, Elasticsearch, Kubernetes, and more.

We have some exciting talks lined up from Google, CERN, Bloomberg, eBay, Red Hat, Tinder, Fastly, Automattic, Prometheus, InfluxData, Percona and more! You can see the full list of speakers below, but be sure to get your ticket now.

Get Your Ticket Now

GrafanaCon EU will feature talks from:

“Google Bigtable”
Misha Brukman

“Monitoring at Bloomberg”
Stig Sorensen

“Monitoring at Bloomberg”
Sean Hanson

“Monitoring Tinder’s Billions of Swipes with Grafana”
Utkarsh Bhatnagar

“Grafana at CERN”
Borja Garrido

“Monitoring the Huge Scale at Automattic”
Abhishek Gahlot

“Real-time Engagement During the 2016 US Presidential Election”
Anna MacLachlan

“Real-time Engagement During the 2016 US Presidential Election”
Gerlando Piro

“Grafana v5 and the Future”
Torkel Odegaard

“Prometheus for Monitoring Metrics”
Brian Brazil

“What We Learned Integrating Grafana with Prometheus”
Peter Zaitsev

“The Biz of Grafana”
Raj Dutt

“What’s New In Graphite”
Dan Cech

“The Design of IFQL, the New Influx Functional Query Language”
Paul Dix

“Writing Grafana Dashboards with Jsonnet”
Julien Pivotto

“Monitoring AI Platform at eBay”
Deepak Vasthimal

“Running a Power Plant with Grafana”
Ryan McKinley

“Performance Metrics and User Experience: A “Tinder” Experience”
Susanne Greiner

“Analyzing Performance of OpenStack with Grafana Dashboards”
Alex Krzos

“Storage Monitoring at Shell Upstream”
Arie Jan Kraai

“The RED Method: How To Instrument Your Services”
Tom Wilkie

“Grafana Usage in the Quality Assurance Process”
Andrejs Kalnacs

“Using Prometheus and Grafana for Monitoring my Power Usage”
Erwin de Keijzer

“Weather, Power & Market Forecasts with Grafana”
Max von Roden

“Weather, Power & Market Forecasts with Grafana”
Steffen Knott

“Inherited Technical Debt – A Tale of Overcoming Enterprise Inertia”
Jordan J. Hamel

“Grafanalib: Dashboards as Code”
Jonathan Lange

“The Journey of Shifting the MQTT Broker HiveMQ to Kubernetes”
Arnold Bechtoldt

“Graphs Tell Stories”
Blerim Sheqa

[email protected] or How to Store Millions of Metrics per Second”
Vladimir Smirnov

Upcoming Events:

In between code pushes we like to speak at, sponsor and attend all kinds of conferences and meetups. We also like to make sure we mention other Grafana-related events happening all over the world. If you’re putting on just such an event, let us know and we’ll list it here.

FOSDEM | Brussels, Belgium – Feb 3-4, 2018: FOSDEM is a free developer conference where thousands of developers of free and open source software gather to share ideas and technology. There is no need to register; all are welcome.

Jfokus | Stockholm, Sweden – Feb 5-7, 2018:
Carl Bergquist – Quickie: Monitoring? Not OPS Problem

Why should we monitor our system? Why can’t we just rely on the operations team anymore? They use to be able to do that. What’s currently changing? Presentation content: – Why do we monitor our system – How did it use to work? – Whats changing – Why do we need to shift focus – Everyone should be on call. – Resilience is the goal (Best way of having someone care about quality is to make them responsible).

Register Now

Jfokus | Stockholm, Sweden – Feb 5-7, 2018:
Leonard Gram – Presentation: DevOps Deconstructed

What’s a Site Reliability Engineer and how’s that role different from the DevOps engineer my boss wants to hire? I really don’t want to be on call, should I? Is Docker the right place for my code or am I better of just going straight to Serverless? And why should I care about any of it? I’ll try to answer some of these questions while looking at what DevOps really is about and how commodisation of servers through “the cloud” ties into it all. This session will be an opinionated piece from a developer who’s been on-call for the past 6 years and would like to convince you to do the same, at least once.

Register Now

Tweet of the Week

We scour Twitter each week to find an interesting/beautiful dashboard and show it off! #monitoringLove

Awesome! Let us know if you have any questions – we’re happy to help out. We also have a bunch of screencasts to help you get going.

Grafana Labs is Hiring!

We are passionate about open source software and thrive on tackling complex challenges to build the future. We ship code from every corner of the globe and love working with the community. If this sounds exciting, you’re in luck – WE’RE HIRING!

Check out our Open Positions

How are we doing?

That’s a wrap! Let us know what you think about timeShift. Submit a comment on this article below, or post something at our community forum. See you next year!

Follow us on Twitter, like us on Facebook, and join the Grafana Labs community.

Spectre and Meltdown Attacks

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/01/spectre_and_mel.html

After a week or so of rumors, everyone is now reporting about the Spectre and Meltdown attacks against pretty much every modern processor out there.

These are side-channel attacks where one process can spy on other processes. They affect computers where an untrusted browser window can execute code, phones that have multiple apps running at the same time, and cloud computing networks that run lots of different processes at once. Fixing them either requires a patch that results in a major performance hit, or is impossible and requires a re-architecture of conditional execution in future CPU chips.

I’ll be writing something for publication over the next few days. This post is basically just a link repository.

EDITED TO ADD: Good technical explanation. And a Slashdot thread.

EDITED TO ADD (1/5): Another good technical description. And how the exploits work through browsers. A rundown of what vendors are doing. Nicholas Weaver on its effects on individual computers.

EDITED TO ADD (1/7): xkcd.

EDITED TO ADD (1/10): Another good technical description.

Random with care

Post Syndicated from Eevee original https://eev.ee/blog/2018/01/02/random-with-care/

Hi! Here are a few loose thoughts about picking random numbers.

A word about crypto


This is all aimed at frivolous pursuits like video games. Hell, even video games where money is at stake should be deferring to someone who knows way more than I do. Otherwise you might find out that your deck shuffles in your poker game are woefully inadequate and some smartass is cheating you out of millions. (If your random number generator has fewer than 226 bits of state, it can’t even generate every possible shuffling of a deck of cards!)

Use the right distribution

Most languages have a random number primitive that spits out a number uniformly in the range [0, 1), and you can go pretty far with just that. But beware a few traps!

Random pitches

Say you want to pitch up a sound by a random amount, perhaps up to an octave. Your audio API probably has a way to do this that takes a pitch multiplier, where I say “probably” because that’s how the only audio API I’ve used works.

Easy peasy. If 1 is unchanged and 2 is pitched up by an octave, then all you need is rand() + 1. Right?

No! Pitch is exponential — within the same octave, the “gap” between C and C♯ is about half as big as the gap between B and the following C. If you pick a pitch multiplier uniformly, you’ll have a noticeable bias towards the higher pitches.

One octave corresponds to a doubling of pitch, so if you want to pick a random note, you want 2 ** rand().

Random directions

For two dimensions, you can just pick a random angle with rand() * TAU.

If you want a vector rather than an angle, or if you want a random direction in three dimensions, it’s a little trickier. You might be tempted to just pick a random point where each component is rand() * 2 - 1 (ranging from −1 to 1), but that’s not quite right. A direction is a point on the surface (or, equivalently, within the volume) of a sphere, and picking each component independently produces a point within the volume of a cube; the result will be a bias towards the corners of the cube, where there’s much more extra volume beyond the sphere.

No? Well, just trust me. I don’t know how to make a diagram for this.

Anyway, you could use the Pythagorean theorem a few times and make a huge mess of things, or it turns out there’s a really easy way that even works for two or four or any number of dimensions. You pick each coordinate from a Gaussian (normal) distribution, then normalize the resulting vector. In other words, using Python’s random module:

def random_direction():
    x = random.gauss(0, 1)
    y = random.gauss(0, 1)
    z = random.gauss(0, 1)
    r = math.sqrt(x*x + y*y + z*z)
    return x/r, y/r, z/r

Why does this work? I have no idea!

Note that it is possible to get zero (or close to it) for every component, in which case the result is nonsense. You can re-roll all the components if necessary; just check that the magnitude (or its square) is less than some epsilon, which is equivalent to throwing away a tiny sphere at the center and shouldn’t affect the distribution.

Beware Gauss

Since I brought it up: the Gaussian distribution is a pretty nice one for choosing things in some range, where the middle is the common case and should appear more frequently.

That said, I never use it, because it has one annoying drawback: the Gaussian distribution has no minimum or maximum value, so you can’t really scale it down to the range you want. In theory, you might get any value out of it, with no limit on scale.

In practice, it’s astronomically rare to actually get such a value out. I did a hundred million trials just to see what would happen, and the largest value produced was 5.8.

But, still, I’d rather not knowingly put extremely rare corner cases in my code if I can at all avoid it. I could clamp the ends, but that would cause unnatural bunching at the endpoints. I could reroll if I got a value outside some desired range, but I prefer to avoid rerolling when I can, too; after all, it’s still (astronomically) possible to have to reroll for an indefinite amount of time. (Okay, it’s really not, since you’ll eventually hit the period of your PRNG. Still, though.) I don’t bend over backwards here — I did just say to reroll when picking a random direction, after all — but when there’s a nicer alternative I’ll gladly use it.

And lo, there is a nicer alternative! Enter the beta distribution. It always spits out a number in [0, 1], so you can easily swap it in for the standard normal function, but it takes two “shape” parameters α and β that alter its behavior fairly dramatically.

With α = β = 1, the beta distribution is uniform, i.e. no different from rand(). As α increases, the distribution skews towards the right, and as β increases, the distribution skews towards the left. If α = β, the whole thing is symmetric with a hump in the middle. The higher either one gets, the more extreme the hump (meaning that value is far more common than any other). With a little fiddling, you can get a number of interesting curves.

Screenshots don’t really do it justice, so here’s a little Wolfram widget that lets you play with α and β live:

Note that if α = 1, then 1 is a possible value; if β = 1, then 0 is a possible value. You probably want them both greater than 1, which clamps the endpoints to zero.

Also, it’s possible to have either α or β or both be less than 1, but this creates very different behavior: the corresponding endpoints become poles.

Anyway, something like α = β = 3 is probably close enough to normal for most purposes but already clamped for you. And you could easily replicate something like, say, NetHack’s incredibly bizarre rnz function.

Random frequency

Say you want some event to have an 80% chance to happen every second. You (who am I kidding, I) might be tempted to do something like this:

if random() < 0.8 * dt:

In an ideal world, dt is always the same and is equal to 1 / f, where f is the framerate. Replace that 80% with a variable, say P, and every tic you have a P / f chance to do the… whatever it is.

Each second, f tics pass, so you’ll make this check f times. The chance that any check succeeds is the inverse of the chance that every check fails, which is \(1 – \left(1 – \frac{P}{f}\right)^f\).

For P of 80% and a framerate of 60, that’s a total probability of 55.3%. Wait, what?

Consider what happens if the framerate is 2. On the first tic, you roll 0.4 twice — but probabilities are combined by multiplying, and splitting work up by dt only works for additive quantities. You lose some accuracy along the way. If you’re dealing with something that multiplies, you need an exponent somewhere.

But in this case, maybe you don’t want that at all. Each separate roll you make might independently succeed, so it’s possible (but very unlikely) that the event will happen 60 times within a single second! Or 200 times, if that’s someone’s framerate.

If you explicitly want something to have a chance to happen on a specific interval, you have to check on that interval. If you don’t have a gizmo handy to run code on an interval, it’s easy to do yourself with a time buffer:

timer += dt
# here, 1 is the "every 1 seconds"
while timer > 1:
    timer -= 1
    if random() < 0.8:

Using while means rolls still happen even if you somehow skipped over an entire second.

(For the curious, and the nerds who already noticed: the expression \(1 – \left(1 – \frac{P}{f}\right)^f\) converges to a specific value! As the framerate increases, it becomes a better and better approximation for \(1 – e^{-P}\), which for the example above is 0.551. Hey, 60 fps is pretty accurate — it’s just accurately representing something nowhere near what I wanted. Er, you wanted.)

Rolling your own

Of course, you can fuss with the classic [0, 1] uniform value however you want. If I want a bias towards zero, I’ll often just square it, or multiply two of them together. If I want a bias towards one, I’ll take a square root. If I want something like a Gaussian/normal distribution, but with clearly-defined endpoints, I might add together n rolls and divide by n. (The normal distribution is just what you get if you roll infinite dice and divide by infinity!)

It’d be nice to be able to understand exactly what this will do to the distribution. Unfortunately, that requires some calculus, which this post is too small to contain, and which I didn’t even know much about myself until I went down a deep rabbit hole while writing, and which in many cases is straight up impossible to express directly.

Here’s the non-calculus bit. A source of randomness is often graphed as a PDF — a probability density function. You’ve almost certainly seen a bell curve graphed, and that’s a PDF. They’re pretty nice, since they do exactly what they look like: they show the relative chance that any given value will pop out. On a bog standard bell curve, there’s a peak at zero, and of course zero is the most common result from a normal distribution.

(Okay, actually, since the results are continuous, it’s vanishingly unlikely that you’ll get exactly zero — but you’re much more likely to get a value near zero than near any other number.)

For the uniform distribution, which is what a classic rand() gives you, the PDF is just a straight horizontal line — every result is equally likely.

If there were a calculus bit, it would go here! Instead, we can cheat. Sometimes. Mathematica knows how to work with probability distributions in the abstract, and there’s a free web version you can use. For the example of squaring a uniform variable, try this out:

PDF[TransformedDistribution[u^2, u \[Distributed] UniformDistribution[{0, 1}]], u]

(The \[Distributed] is a funny tilde that doesn’t exist in Unicode, but which Mathematica uses as a first-class operator. Also, press shiftEnter to evaluate the line.)

This will tell you that the distribution is… \(\frac{1}{2\sqrt{u}}\). Weird! You can plot it:

Plot[%, {u, 0, 1}]

(The % refers to the result of the last thing you did, so if you want to try several of these, you can just do Plot[PDF[…], u] directly.)

The resulting graph shows that numbers around zero are, in fact, vastly — infinitely — more likely than anything else.

What about multiplying two together? I can’t figure out how to get Mathematica to understand this, but a great amount of digging revealed that the answer is -ln x, and from there you can plot them both on Wolfram Alpha. They’re similar, though squaring has a much better chance of giving you high numbers than multiplying two separate rolls — which makes some sense, since if either of two rolls is a low number, the product will be even lower.

What if you know the graph you want, and you want to figure out how to play with a uniform roll to get it? Good news! That’s a whole thing called inverse transform sampling. All you have to do is take an integral. Good luck!

This is all extremely ridiculous. New tactic: Just Simulate The Damn Thing. You already have the code; run it a million times, make a histogram, and tada, there’s your PDF. That’s one of the great things about computers! Brute-force numerical answers are easy to come by, so there’s no excuse for producing something like rnz. (Though, be sure your histogram has sufficiently narrow buckets — I tried plotting one for rnz once and the weird stuff on the left side didn’t show up at all!)

By the way, I learned something from futzing with Mathematica here! Taking the square root (to bias towards 1) gives a PDF that’s a straight diagonal line, nothing like the hyperbola you get from squaring (to bias towards 0). How do you get a straight line the other way? Surprise: \(1 – \sqrt{1 – u}\).

Okay, okay, here’s the actual math

I don’t claim to have a very firm grasp on this, but I had a hell of a time finding it written out clearly, so I might as well write it down as best I can. This was a great excuse to finally set up MathJax, too.

Say \(u(x)\) is the PDF of the original distribution and \(u\) is a representative number you plucked from that distribution. For the uniform distribution, \(u(x) = 1\). Or, more accurately,

u(x) = \begin{cases}
1 & \text{ if } 0 \le x \lt 1 \\
0 & \text{ otherwise }

Remember that \(x\) here is a possible outcome you want to know about, and the PDF tells you the relative probability that a roll will be near it. This PDF spits out 1 for every \(x\), meaning every number between 0 and 1 is equally likely to appear.

We want to do something to that PDF, which creates a new distribution, whose PDF we want to know. I’ll use my original example of \(f(u) = u^2\), which creates a new PDF \(v(x)\).

The trick is that we need to work in terms of the cumulative distribution function for \(u\). Where the PDF gives the relative chance that a roll will be (“near”) a specific value, the CDF gives the relative chance that a roll will be less than a specific value.

The conventions for this seem to be a bit fuzzy, and nobody bothers to explain which ones they’re using, which makes this all the more confusing to read about… but let’s write the CDF with a capital letter, so we have \(U(x)\). In this case, \(U(x) = x\), a straight 45° line (at least between 0 and 1). With the definition I gave, this should make sense. At some arbitrary point like 0.4, the value of the PDF is 1 (0.4 is just as likely as anything else), and the value of the CDF is 0.4 (you have a 40% chance of getting a number from 0 to 0.4).

Calculus ahoy: the PDF is the derivative of the CDF, which means it measures the slope of the CDF at any point. For \(U(x) = x\), the slope is always 1, and indeed \(u(x) = 1\). See, calculus is easy.

Okay, so, now we’re getting somewhere. What we want is the CDF of our new distribution, \(V(x)\). The CDF is defined as the probability that a roll \(v\) will be less than \(x\), so we can literally write:

$$V(x) = P(v \le x)$$

(This is why we have to work with CDFs, rather than PDFs — a PDF gives the chance that a roll will be “nearby,” whatever that means. A CDF is much more concrete.)

What is \(v\), exactly? We defined it ourselves; it’s the do something applied to a roll from the original distribution, or \(f(u)\).

$$V(x) = P\!\left(f(u) \le x\right)$$

Now the first tricky part: we have to solve that inequality for \(u\), which means we have to do something, backwards to \(x\).

$$V(x) = P\!\left(u \le f^{-1}(x)\right)$$

Almost there! We now have a probability that \(u\) is less than some value, and that’s the definition of a CDF!

$$V(x) = U\!\left(f^{-1}(x)\right)$$

Hooray! Now to turn these CDFs back into PDFs, all we need to do is differentiate both sides and use the chain rule. If you never took calculus, don’t worry too much about what that means!

$$v(x) = u\!\left(f^{-1}(x)\right)\left|\frac{d}{dx}f^{-1}(x)\right|$$

Wait! Where did that absolute value come from? It takes care of whether \(f(x)\) increases or decreases. It’s the least interesting part here by far, so, whatever.

There’s one more magical part here when using the uniform distribution — \(u(\dots)\) is always equal to 1, so that entire term disappears! (Note that this only works for a uniform distribution with a width of 1; PDFs are scaled so the entire area under them sums to 1, so if you had a rand() that could spit out a number between 0 and 2, the PDF would be \(u(x) = \frac{1}{2}\).)

$$v(x) = \left|\frac{d}{dx}f^{-1}(x)\right|$$

So for the specific case of modifying the output of rand(), all we have to do is invert, then differentiate. The inverse of \(f(u) = u^2\) is \(f^{-1}(x) = \sqrt{x}\) (no need for a ± since we’re only dealing with positive numbers), and differentiating that gives \(v(x) = \frac{1}{2\sqrt{x}}\). Done! This is also why square root comes out nicer; inverting it gives \(x^2\), and differentiating that gives \(2x\), a straight line.

Incidentally, that method for turning a uniform distribution into any distribution — inverse transform sampling — is pretty much the same thing in reverse: integrate, then invert. For example, when I saw that taking the square root gave \(v(x) = 2x\), I naturally wondered how to get a straight line going the other way, \(v(x) = 2 – 2x\). Integrating that gives \(2x – x^2\), and then you can use the quadratic formula (or just ask Wolfram Alpha) to solve \(2x – x^2 = u\) for \(x\) and get \(f(u) = 1 – \sqrt{1 – u}\).

Multiply two rolls is a bit more complicated; you have to write out the CDF as an integral and you end up doing a double integral and wow it’s a mess. The only thing I’ve retained is that you do a division somewhere, which then gets integrated, and that’s why it ends up as \(-\ln x\).

And that’s quite enough of that! (Okay but having math in my blog is pretty cool and I will definitely be doing more of this, sorry, not sorry.)

Random vs varied

Sometimes, random isn’t actually what you want. We tend to use the word “random” casually to mean something more like chaotic, i.e., with no discernible pattern. But that’s not really random. In fact, given how good humans can be at finding incidental patterns, they aren’t all that unlikely! Consider that when you roll two dice, they’ll come up either the same or only one apart almost half the time. Coincidence? Well, yes.

If you ask for randomness, you’re saying that any outcome — or series of outcomes — is acceptable, including five heads in a row or five tails in a row. Most of the time, that’s fine. Some of the time, it’s less fine, and what you really want is variety. Here are a couple examples and some fairly easy workarounds.

NPC quips

The nature of games is such that NPCs will eventually run out of things to say, at which point further conversation will give the player a short brush-off quip — a slight nod from the designer to the player that, hey, you hit the end of the script.

Some NPCs have multiple possible quips and will give one at random. The trouble with this is that it’s very possible for an NPC to repeat the same quip several times in a row before abruptly switching to another one. With only a few options to choose from, getting the same option twice or thrice (especially across an entire game, which may have numerous NPCs) isn’t all that unlikely. The notion of an NPC quip isn’t very realistic to start with, but having someone repeat themselves and then abruptly switch to something else is especially jarring.

The easy fix is to show the quips in order! Paradoxically, this is more consistently varied than choosing at random — the original “order” is likely to be meaningless anyway, and it already has the property that the same quip can never appear twice in a row.

If you like, you can shuffle the list of quips every time you reach the end, but take care here — it’s possible that the last quip in the old order will be the same as the first quip in the new order, so you may still get a repeat. (Of course, you can just check for this case and swap the first quip somewhere else if it bothers you.)

That last behavior is, in fact, the canonical way that Tetris chooses pieces — the game simply shuffles a list of all 7 pieces, gives those to you in shuffled order, then shuffles them again to make a new list once it’s exhausted. There’s no avoidance of duplicates, though, so you can still get two S blocks in a row, or even two S and two Z all clumped together, but no more than that. Some Tetris variants take other approaches, such as actively avoiding repeats even several pieces apart or deliberately giving you the worst piece possible.

Random drops

Random drops are often implemented as a flat chance each time. Maybe enemies have a 5% chance to drop health when they die. Legally speaking, over the long term, a player will see health drops for about 5% of enemy kills.

Over the short term, they may be desperate for health and not survive to see the long term. So you may want to put a thumb on the scale sometimes. Games in the Metroid series, for example, have a somewhat infamous bias towards whatever kind of drop they think you need — health if your health is low, missiles if your missiles are low.

I can’t give you an exact approach to use, since it depends on the game and the feeling you’re going for and the variables at your disposal. In extreme cases, you might want to guarantee a health drop from a tough enemy when the player is critically low on health. (Or if you’re feeling particularly evil, you could go the other way and deny the player health when they most need it…)

The problem becomes a little different, and worse, when the event that triggers the drop is relatively rare. The pathological case here would be something like a raid boss in World of Warcraft, which requires hours of effort from a coordinated group of people to defeat, and which has some tiny chance of dropping a good item that will go to only one of those people. This is why I stopped playing World of Warcraft at 60.

Dialing it back a little bit gives us Enter the Gungeon, a roguelike where each room is a set of encounters and each floor only has a dozen or so rooms. Initially, you have a 1% chance of getting a reward after completing a room — but every time you complete a room and don’t get a reward, the chance increases by 9%, up to a cap of 80%. Once you get a reward, the chance resets to 1%.

The natural question is: how frequently, exactly, can a player expect to get a reward? We could do math, or we could Just Simulate The Damn Thing.

from collections import Counter
import random

histogram = Counter()

TRIALS = 1000000
chance = 1
rooms_cleared = 0
rewards_found = 0
while rewards_found < TRIALS:
    rooms_cleared += 1
    if random.random() * 100 < chance:
        # Reward!
        rewards_found += 1
        histogram[rooms_cleared] += 1
        rooms_cleared = 0
        chance = 1
        chance = min(80, chance + 9)

for gaps, count in sorted(histogram.items()):
    print(f"{gaps:3d} | {count / TRIALS * 100:6.2f}%", '#' * (count // (TRIALS // 100)))
  1 |   0.98%
  2 |   9.91% #########
  3 |  17.00% ################
  4 |  20.23% ####################
  5 |  19.21% ###################
  6 |  15.05% ###############
  7 |   9.69% #########
  8 |   5.07% #####
  9 |   2.09% ##
 10 |   0.63%
 11 |   0.12%
 12 |   0.03%
 13 |   0.00%
 14 |   0.00%
 15 |   0.00%

We’ve got kind of a hilly distribution, skewed to the left, which is up in this histogram. Most of the time, a player should see a reward every three to six rooms, which is maybe twice per floor. It’s vanishingly unlikely to go through a dozen rooms without ever seeing a reward, so a player should see at least one per floor.

Of course, this simulated a single continuous playthrough; when starting the game from scratch, your chance at a reward always starts fresh at 1%, the worst it can be. If you want to know about how many rewards a player will get on the first floor, hey, Just Simulate The Damn Thing.

  0 |   0.01%
  1 |  13.01% #############
  2 |  56.28% ########################################################
  3 |  27.49% ###########################
  4 |   3.10% ###
  5 |   0.11%
  6 |   0.00%

Cool. Though, that’s assuming exactly 12 rooms; it might be worth changing that to pick at random in a way that matches the level generator.

(Enter the Gungeon does some other things to skew probability, which is very nice in a roguelike where blind luck can make or break you. For example, if you kill a boss without having gotten a new gun anywhere else on the floor, the boss is guaranteed to drop a gun.)

Critical hits

I suppose this is the same problem as random drops, but backwards.

Say you have a battle sim where every attack has a 6% chance to land a devastating critical hit. Presumably the same rules apply to both the player and the AI opponents.

Consider, then, that the AI opponents have exactly the same 6% chance to ruin the player’s day. Consider also that this gives them an 0.4% chance to critical hit twice in a row. 0.4% doesn’t sound like much, but across an entire playthrough, it’s not unlikely that a player might see it happen and find it incredibly annoying.

Perhaps it would be worthwhile to explicitly forbid AI opponents from getting consecutive critical hits.

In conclusion

An emerging theme here has been to Just Simulate The Damn Thing. So consider Just Simulating The Damn Thing. Even a simple change to a random value can do surprising things to the resulting distribution, so unless you feel like differentiating the inverse function of your code, maybe test out any non-trivial behavior and make sure it’s what you wanted. Probability is hard to reason about.

AWS Direct Connect Update – Ten New Locations Added in Late 2017

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-direct-connect-update-ten-new-locations-added-in-late-2017/

Happy 2018! I am looking forward to getting back to my usual routine, working with our teams to learn about their upcoming launches and then writing blog posts to bring the news to you. Right now I am still catching up on a few launches and announcements from late 2017.

First on the list for today is our most recent round of new cities for AWS Direct Connect. AWS customers all over the world use Direct Connect to create dedicated network connections from their premises to AWS in order to reduce their network costs, increase throughput, and to pursue a more consistent network experience.

We added ten new locations to our Direct Connect roster in December, all of which offer both 1 Gbps and 10 Gbps connectivity, along with partner-supplied options for speeds below 1 Gbps. Here are the newest locations, along withe the data centers and associated AWS Regions:

  • Bangalore, India – NetMagic DC2Asia Pacific (Mumbai).
  • Cape Town, South Africa – Teraco Ct1EU (Ireland).
  • Johannesburg, South Africa – Teraco JB1EU (Ireland).
  • London, UK – Telehouse North TwoEU (London).
  • Miami, Florida, US – Equinix MI1US East (Northern Virginia).
  • Minneapolis, Minnesota, US – Cologix MIN3US East (Ohio)
  • Ningxia, China – Shapotou IDC – China (Ningxia).
  • Ningxia, China – Industrial Park IDC – China (Ningxia).
  • Rio de Janeiro, Brazil – Equinix RJ2South America (São Paulo).
  • Tokyo, Japan – AT Tokyo ChuoAsia Pacific (Tokyo).

You can use these new locations in conjunction with the AWS Direct Connect Gateway to set up connectivity that spans Virtual Private Clouds (VPCs) spread across multiple AWS Regions (this does not apply to the AWS Regions in China).

If you are interested in putting Direct Connect to use, be sure to check out our ever-growing list of Direct Connect Partners.


Happy New Year- Welcome to Linux Journal 2.0

Post Syndicated from ris original https://lwn.net/Articles/742543/rss

Linux Journal is
. “Talk about a Happy New Year. The reason: it turns out we’re not dead. In fact, we’re more alive than ever, thanks to a rescue by readers—specifically, by the hackers who run Private Internet Access (PIA) VPN, a London Trust Media company. PIA are avid supporters of freenode and the larger FOSS community. They’re also all about Linux and the rest of the modern portfolio of allied concerns: privacy, crypto, freedom, personal agency, rewriting the rules of business and government around all of those, and having fun with constructive hacking of all kinds. We couldn’t have asked for a better rescue ship to come along for us.

Could you write for Hello World magazine?

Post Syndicated from Dan Fisher original https://www.raspberrypi.org/blog/could-you-write-for-hello-world-magazine/

Thinking about New Year’s resolutions? Ditch the gym and tone up your author muscles instead, by writing an article for Hello World magazine. We’ll help you, you’ll expand your knowledge of a topic you care about, and you’ll be contributing something of real value to the computing education community.

Join our pool of Hello World writers in 2018

The computing and digital making magazine for educators

Hello World is our free computing magazine for educators, published in partnership with Computing At School and kindly supported by BT. We launched at the Bett Show in January 2017, and over the past twelve months, we’ve grown to a readership of 15000 subscribers. You can get your own free copy here.

Our work is sustained by wonderful educational content from around the world in every issue. We’re hugely grateful to our current pool of authors – keep it up, veterans of 2017! – and we want to provide opportunities for new voices in the community to join them. You might be a classroom teacher sharing your scheme of work, a volunteer reflecting on running an after-school club, an industry professional sharing your STEM expertise, or an academic providing insights into new research – we’d love contributions from all kinds of people in all sorts of roles.

Your article doesn’t have to be finished and complete: if you send us an outline, we will work with you to develop it into a full piece.

Like my desk, but tidier

Five reasons to write for Hello World

Here are five reasons why writing for Hello World is a great way to start 2018:

1. You’ll learn something new

Researching an article is one of the best ways to broaden your knowledge about something that interests you.

2. You’ll think more clearly

Notes in hand, you sit at your desk and wonder how to craft all this information into a coherent piece of writing. It’s a situation we’re all familiar with. Writing an article makes you examine and clarify what you really think about a subject.

Share your expertise and make more interesting projects along the way

3. You’ll make cool projects

Testing a project for a Hello World resource is a perfect opportunity to build something amazing that’s hitherto been locked away inside your brain.

4. You’ll be doing something that matters

Sharing your knowledge and experience in Hello World helps others to teach and learn computing. It helps bring the power of digital making to more and more educators and learners.

5. You’ll share with an open and supportive community

The computing education community is full of people who lend their experience to help colleagues. Contributing to Hello World is a great way to take an active part in this supportive community, and you’ll be adding to a body of free, open source learning resources that are available for everyone to use, adapt, and share. It’s also a tremendous platform to broadcast your work: the digital version alone of Hello World has been downloaded over 50000 times.

Yes! What do I do next?

Feeling inspired? Email our editorial team with your idea.

Issue 4 of Hello World is out this month! Subscribe for free today to have it delivered to your inbox or your home.

The post Could you write for Hello World magazine? appeared first on Raspberry Pi.