Tag Archives: iq

Attend This Free December 13 Tech Talk: “Cloud-Native DDoS Mitigation with AWS Shield”

Post Syndicated from Craig Liebendorfer original https://aws.amazon.com/blogs/security/register-for-and-attend-this-december-14-aws-shield-tech-talk-cloud-native-ddos-mitigation/

AWS Online Tech Talks banner

As part of the AWS Online Tech Talks series, AWS will present Cloud-Native DDoS Mitigation with AWS Shield on Wednesday, December 13. This tech talk will start at 9:00 A.M. Pacific Time and end at 9:40 A.M. Pacific Time.

Distributed Denial of Service (DDoS) mitigation can help you maintain application availability, but traditional solutions are hard to scale and require expensive hardware. AWS Shield is a managed DDoS protection service that helps you safeguard web applications running in the AWS Cloud. In this tech talk, you will learn simple techniques for using AWS Shield to help you build scalable DDoS defenses into your applications without investing in costly infrastructure. You also will learn how AWS Shield helps you monitor your applications to detect DDoS attempts and how to respond to in-progress events.

This tech talk is free. Register today.

– Craig

Managing AWS Lambda Function Concurrency

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/managing-aws-lambda-function-concurrency/

One of the key benefits of serverless applications is the ease in which they can scale to meet traffic demands or requests, with little to no need for capacity planning. In AWS Lambda, which is the core of the serverless platform at AWS, the unit of scale is a concurrent execution. This refers to the number of executions of your function code that are happening at any given time.

Thinking about concurrent executions as a unit of scale is a fairly unique concept. In this post, I dive deeper into this and talk about how you can make use of per function concurrency limits in Lambda.

Understanding concurrency in Lambda

Instead of diving right into the guts of how Lambda works, here’s an appetizing analogy: a magical pizza.
Yes, a magical pizza!

This magical pizza has some unique properties:

  • It has a fixed maximum number of slices, such as 8.
  • Slices automatically re-appear after they are consumed.
  • When you take a slice from the pizza, it does not re-appear until it has been completely consumed.
  • One person can take multiple slices at a time.
  • You can easily ask to have the number of slices increased, but they remain fixed at any point in time otherwise.

Now that the magical pizza’s properties are defined, here’s a hypothetical situation of some friends sharing this pizza.

Shawn, Kate, Daniela, Chuck, Ian and Avleen get together every Friday to share a pizza and catch up on their week. As there is just six of them, they can easily all enjoy a slice of pizza at a time. As they finish each slice, it re-appears in the pizza pan and they can take another slice again. Given the magical properties of their pizza, they can continue to eat all they want, but with two very important constraints:

  • If any of them take too many slices at once, the others may not get as much as they want.
  • If they take too many slices, they might also eat too much and get sick.

One particular week, some of the friends are hungrier than the rest, taking two slices at a time instead of just one. If more than two of them try to take two pieces at a time, this can cause contention for pizza slices. Some of them would wait hungry for the slices to re-appear. They could ask for a pizza with more slices, but then run the same risk again later if more hungry friends join than planned for.

What can they do?

If the friends agreed to accept a limit for the maximum number of slices they each eat concurrently, both of these issues are avoided. Some could have a maximum of 2 of the 8 slices, or other concurrency limits that were more or less. Just so long as they kept it at or under eight total slices to be eaten at one time. This would keep any from going hungry or eating too much. The six friends can happily enjoy their magical pizza without worry!

Concurrency in Lambda

Concurrency in Lambda actually works similarly to the magical pizza model. Each AWS Account has an overall AccountLimit value that is fixed at any point in time, but can be easily increased as needed, just like the count of slices in the pizza. As of May 2017, the default limit is 1000 “slices” of concurrency per AWS Region.

Also like the magical pizza, each concurrency “slice” can only be consumed individually one at a time. After consumption, it becomes available to be consumed again. Services invoking Lambda functions can consume multiple slices of concurrency at the same time, just like the group of friends can take multiple slices of the pizza.

Let’s take our example of the six friends and bring it back to AWS services that commonly invoke Lambda:

  • Amazon S3
  • Amazon Kinesis
  • Amazon DynamoDB
  • Amazon Cognito

In a single account with the default concurrency limit of 1000 concurrent executions, any of these four services could invoke enough functions to consume the entire limit or some part of it. Just like with the pizza example, there is the possibility for two issues to pop up:

  • One or more of these services could invoke enough functions to consume a majority of the available concurrency capacity. This could cause others to be starved for it, causing failed invocations.
  • A service could consume too much concurrent capacity and cause a downstream service or database to be overwhelmed, which could cause failed executions.

For Lambda functions that are launched in a VPC, you have the potential to consume the available IP addresses in a subnet or the maximum number of elastic network interfaces to which your account has access. For more information, see Configuring a Lambda Function to Access Resources in an Amazon VPC. For information about elastic network interface limits, see Network Interfaces section in the Amazon VPC Limits topic.

One way to solve both of these problems is applying a concurrency limit to the Lambda functions in an account.

Configuring per function concurrency limits

You can now set a concurrency limit on individual Lambda functions in an account. The concurrency limit that you set reserves a portion of your account level concurrency for a given function. All of your functions’ concurrent executions count against this account-level limit by default.

If you set a concurrency limit for a specific function, then that function’s concurrency limit allocation is deducted from the shared pool and assigned to that specific function. AWS also reserves 100 units of concurrency for all functions that don’t have a specified concurrency limit set. This helps to make sure that future functions have capacity to be consumed.

Going back to the example of the consuming services, you could set throttles for the functions as follows:

Amazon S3 function = 350
Amazon Kinesis function = 200
Amazon DynamoDB function = 200
Amazon Cognito function = 150
Total = 900

With the 100 reserved for all non-concurrency reserved functions, this totals the account limit of 1000.

Here’s how this works. To start, create a basic Lambda function that is invoked via Amazon API Gateway. This Lambda function returns a single “Hello World” statement with an added sleep time between 2 and 5 seconds. The sleep time simulates an API providing some sort of capability that can take a varied amount of time. The goal here is to show how an API that is underloaded can reach its concurrency limit, and what happens when it does.
To create the example function

  1. Open the Lambda console.
  2. Choose Create Function.
  3. For Author from scratch, enter the following values:
    1. For Name, enter a value (such as concurrencyBlog01).
    2. For Runtime, choose Python 3.6.
    3. For Role, choose Create new role from template and enter a name aligned with this function, such as concurrencyBlogRole.
  4. Choose Create function.
  5. The function is created with some basic example code. Replace that code with the following:

import time
from random import randint
seconds = randint(2, 5)

def lambda_handler(event, context):
time.sleep(seconds)
return {"statusCode": 200,
"body": ("Hello world, slept " + str(seconds) + " seconds"),
"headers":
{
"Access-Control-Allow-Headers": "Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token",
"Access-Control-Allow-Methods": "GET,OPTIONS",
}}

  1. Under Basic settings, set Timeout to 10 seconds. While this function should only ever take up to 5-6 seconds (with the 5-second max sleep), this gives you a little bit of room if it takes longer.

  1. Choose Save at the top right.

At this point, your function is configured for this example. Test it and confirm this in the console:

  1. Choose Test.
  2. Enter a name (it doesn’t matter for this example).
  3. Choose Create.
  4. In the console, choose Test again.
  5. You should see output similar to the following:

Now configure API Gateway so that you have an HTTPS endpoint to test against.

  1. In the Lambda console, choose Configuration.
  2. Under Triggers, choose API Gateway.
  3. Open the API Gateway icon now shown as attached to your Lambda function:

  1. Under Configure triggers, leave the default values for API Name and Deployment stage. For Security, choose Open.
  2. Choose Add, Save.

API Gateway is now configured to invoke Lambda at the Invoke URL shown under its configuration. You can take this URL and test it in any browser or command line, using tools such as “curl”:


$ curl https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01
Hello world, slept 2 seconds

Throwing load at the function

Now start throwing some load against your API Gateway + Lambda function combo. Right now, your function is only limited by the total amount of concurrency available in an account. For this example account, you might have 850 unreserved concurrency out of a full account limit of 1000 due to having configured a few concurrency limits already (also the 100 concurrency saved for all functions without configured limits). You can find all of this information on the main Dashboard page of the Lambda console:

For generating load in this example, use an open source tool called “hey” (https://github.com/rakyll/hey), which works similarly to ApacheBench (ab). You test from an Amazon EC2 instance running the default Amazon Linux AMI from the EC2 console. For more help with configuring an EC2 instance, follow the steps in the Launch Instance Wizard.

After the EC2 instance is running, SSH into the host and run the following:


sudo yum install go
go get -u github.com/rakyll/hey

“hey” is easy to use. For these tests, specify a total number of tests (5,000) and a concurrency of 50 against the API Gateway URL as follows(replace the URL here with your own):


$ ./go/bin/hey -n 5000 -c 50 https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01

The output from “hey” tells you interesting bits of information:


$ ./go/bin/hey -n 5000 -c 50 https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01

Summary:
Total: 381.9978 secs
Slowest: 9.4765 secs
Fastest: 0.0438 secs
Average: 3.2153 secs
Requests/sec: 13.0891
Total data: 140024 bytes
Size/request: 28 bytes

Response time histogram:
0.044 [1] |
0.987 [2] |
1.930 [0] |
2.874 [1803] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
3.817 [1518] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
4.760 [719] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
5.703 [917] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
6.647 [13] |
7.590 [14] |
8.533 [9] |
9.477 [4] |

Latency distribution:
10% in 2.0224 secs
25% in 2.0267 secs
50% in 3.0251 secs
75% in 4.0269 secs
90% in 5.0279 secs
95% in 5.0414 secs
99% in 5.1871 secs

Details (average, fastest, slowest):
DNS+dialup: 0.0003 secs, 0.0000 secs, 0.0332 secs
DNS-lookup: 0.0000 secs, 0.0000 secs, 0.0046 secs
req write: 0.0000 secs, 0.0000 secs, 0.0005 secs
resp wait: 3.2149 secs, 0.0438 secs, 9.4472 secs
resp read: 0.0000 secs, 0.0000 secs, 0.0004 secs

Status code distribution:
[200] 4997 responses
[502] 3 responses

You can see a helpful histogram and latency distribution. Remember that this Lambda function has a random sleep period in it and so isn’t entirely representational of a real-life workload. Those three 502s warrant digging deeper, but could be due to Lambda cold-start timing and the “second” variable being the maximum of 5, causing the Lambda functions to time out. AWS X-Ray and the Amazon CloudWatch logs generated by both API Gateway and Lambda could help you troubleshoot this.

Configuring a concurrency reservation

Now that you’ve established that you can generate this load against the function, I show you how to limit it and protect a backend resource from being overloaded by all of these requests.

  1. In the console, choose Configure.
  2. Under Concurrency, for Reserve concurrency, enter 25.

  1. Click on Save in the top right corner.

You could also set this with the AWS CLI using the Lambda put-function-concurrency command or see your current concurrency configuration via Lambda get-function. Here’s an example command:


$ aws lambda get-function --function-name concurrencyBlog01 --output json --query Concurrency
{
"ReservedConcurrentExecutions": 25
}

Either way, you’ve set the Concurrency Reservation to 25 for this function. This acts as both a limit and a reservation in terms of making sure that you can execute 25 concurrent functions at all times. Going above this results in the throttling of the Lambda function. Depending on the invoking service, throttling can result in a number of different outcomes, as shown in the documentation on Throttling Behavior. This change has also reduced your unreserved account concurrency for other functions by 25.

Rerun the same load generation as before and see what happens. Previously, you tested at 50 concurrency, which worked just fine. By limiting the Lambda functions to 25 concurrency, you should see rate limiting kick in. Run the same test again:


$ ./go/bin/hey -n 5000 -c 50 https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01

While this test runs, refresh the Monitoring tab on your function detail page. You see the following warning message:

This is great! It means that your throttle is working as configured and you are now protecting your downstream resources from too much load from your Lambda function.

Here is the output from a new “hey” command:


$ ./go/bin/hey -n 5000 -c 50 https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01
Summary:
Total: 379.9922 secs
Slowest: 7.1486 secs
Fastest: 0.0102 secs
Average: 1.1897 secs
Requests/sec: 13.1582
Total data: 164608 bytes
Size/request: 32 bytes

Response time histogram:
0.010 [1] |
0.724 [3075] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
1.438 [0] |
2.152 [811] |∎∎∎∎∎∎∎∎∎∎∎
2.866 [11] |
3.579 [566] |∎∎∎∎∎∎∎
4.293 [214] |∎∎∎
5.007 [1] |
5.721 [315] |∎∎∎∎
6.435 [4] |
7.149 [2] |

Latency distribution:
10% in 0.0130 secs
25% in 0.0147 secs
50% in 0.0205 secs
75% in 2.0344 secs
90% in 4.0229 secs
95% in 5.0248 secs
99% in 5.0629 secs

Details (average, fastest, slowest):
DNS+dialup: 0.0004 secs, 0.0000 secs, 0.0537 secs
DNS-lookup: 0.0002 secs, 0.0000 secs, 0.0184 secs
req write: 0.0000 secs, 0.0000 secs, 0.0016 secs
resp wait: 1.1892 secs, 0.0101 secs, 7.1038 secs
resp read: 0.0000 secs, 0.0000 secs, 0.0005 secs

Status code distribution:
[502] 3076 responses
[200] 1924 responses

This looks fairly different from the last load test run. A large percentage of these requests failed fast due to the concurrency throttle failing them (those with the 0.724 seconds line). The timing shown here in the histogram represents the entire time it took to get a response between the EC2 instance and API Gateway calling Lambda and being rejected. It’s also important to note that this example was configured with an edge-optimized endpoint in API Gateway. You see under Status code distribution that 3076 of the 5000 requests failed with a 502, showing that the backend service from API Gateway and Lambda failed the request.

Other uses

Managing function concurrency can be useful in a few other ways beyond just limiting the impact on downstream services and providing a reservation of concurrency capacity. Here are two other uses:

  • Emergency kill switch
  • Cost controls

Emergency kill switch

On occasion, due to issues with applications I’ve managed in the past, I’ve had a need to disable a certain function or capability of an application. By setting the concurrency reservation and limit of a Lambda function to zero, you can do just that.

With the reservation set to zero every invocation of a Lambda function results in being throttled. You could then work on the related parts of the infrastructure or application that aren’t working, and then reconfigure the concurrency limit to allow invocations again.

Cost controls

While I mentioned how you might want to use concurrency limits to control the downstream impact to services or databases that your Lambda function might call, another resource that you might be cautious about is money. Setting the concurrency throttle is another way to help control costs during development and testing of your application.

You might want to prevent against a function performing a recursive action too quickly or a development workload generating too high of a concurrency. You might also want to protect development resources connected to this function from generating too much cost, such as APIs that your Lambda function calls.

Conclusion

Concurrent executions as a unit of scale are a fairly unique characteristic about Lambda functions. Placing limits on how many concurrency “slices” that your function can consume can prevent a single function from consuming all of the available concurrency in an account. Limits can also prevent a function from overwhelming a backend resource that isn’t as scalable.

Unlike monolithic applications or even microservices where there are mixed capabilities in a single service, Lambda functions encourage a sort of “nano-service” of small business logic directly related to the integration model connected to the function. I hope you’ve enjoyed this post and configure your concurrency limits today!

Resilient TVAddons Plans to Ditch Proactive ‘Piracy’ Screening

Post Syndicated from Ernesto original https://torrentfreak.com/resilient-tvaddons-plans-to-ditch-proactive-piracy-screening-171207/

After years of smooth sailing, this year TVAddons became a poster child for the entertainment industry’s war on illicit streaming devices.

The leading repository for unofficial Kodi addons was sued for copyright infringement in the US by satellite and broadcast provider Dish Network. Around the same time, a similar case was filed by Bell, TVA, Videotron, and Rogers in Canada.

The latter case has done the most damage thus far, as it caused the addon repository to lose its domain names and social media accounts. As a result, the site went dead and while many believed it would never return, it made a blazing comeback after a few weeks.

Since the original TVAddons.ag domain was seized, the site returned on TVaddons.co. And that was not the only difference. A lot of the old add-ons, for which it was unclear if they linked to licensed content, were no longer listed in the repository either.

TVAddons previously relied on the DMCA to shield it from liability but apparently, that wasn’t enough. As a result, they took the drastic decision to check all submitted add-ons carefully.

“Since complying with the law is clearly not enough to prevent frivolous legal action from being taken against you, we have been forced to implement a more drastic code vetting process,” a TVAddons representative told us previously.

Despite the absence of several of the most used add-ons, the repository has managed to regain many of its former users. Over the past month, TVAddons had over 12 million unique users. These all manually installed the new repository on their devices.

“We’re not like one of those pirate sites that are shut down and opens on a new domain the next day, getting users to actually manually install a new repo isn’t an easy feat,” a TVAddons representative informs TorrentFreak.

While it’s still far away from the 40 million unique users it had earlier this year, before the trouble began, it’s still a force to be reckoned with.

Interestingly, the vast majority of all TVAddons traffic comes from the United States. The UK is second at a respectable distance, followed by Canada, Germany, and the Netherlands.

While many former users have returned, the submission policy changes didn’t go unnoticed. The relatively small selection of add-ons is a major drawback for some, but that’s about to change as well, we are informed.

TVAddons plans to return to the old submission model where developers can upload their code more freely. Instead of proactive screening, TVAddons will rely on a standard DMCA takedown policy, relying on copyright holders to flag potentially infringing content.

“We intend on returning to a standard DMCA compliant add-on submission policy shortly, there’s no reason why we should be held to a higher standard than Facebook, Twitter, YouTube or Reddit given the fact that we don’t even host any form of streaming content in the first place.

“Our interim policy isn’t pragmatic, it’s nearly impossible for us to verify the global licensing of all forms of protected content. When you visit a website, there’s no way of verifying licensing beyond trusting them based on reputation.”

The upcoming change doesn’t mean that TVAddons will ignore its legal requirements. If they receive a legitimate takedown notice, proper action will be taken, as always. As such, they would operate in the same fashion as other user-generated sites.

“Right now our interim addon submission policy is akin to North Korea. We always followed the law and will always continue to do so. Anytime we’ve received a legitimate complaint we’ve acted upon it in an expedited manner.

“Facebook, Twitter, Reddit and other online communities would have never existed if they were required to approve the contents of each user’s submissions prior to public posting.”

The change takes place while the two court cases are still pending. TVAddons is determined to keep up this fight. Meanwhile, they are also asking the public to support the project financially.

While some copyright holders, including those who are fighting the service in court, might not like the change, TVAddons believes that this is well within their rights. And with support from groups such as the Electronic Frontier Foundation, they don’t stand alone in this.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Looking Forward to 2018

Post Syndicated from Let's Encrypt - Free SSL/TLS Certificates original https://letsencrypt.org//2017/12/07/looking-forward-to-2018.html

Let’s Encrypt had a great year in 2017. We more than doubled the number of active (unexpired) certificates we service to 46 million, we just about tripled the number of unique domains we service to 61 million, and we did it all while maintaining a stellar security and compliance track record. Most importantly though, the Web went from 46% encrypted page loads to 67% according to statistics from Mozilla – a gain of 21% in a single year – incredible. We’re proud to have contributed to that, and we’d like to thank all of the other people and organizations who also worked hard to create a more secure and privacy-respecting Web.

While we’re proud of what we accomplished in 2017, we are spending most of the final quarter of the year looking forward rather than back. As we wrap up our own planning process for 2018, I’d like to share some of our plans with you, including both the things we’re excited about and the challenges we’ll face. We’ll cover service growth, new features, infrastructure, and finances.

Service Growth

We are planning to double the number of active certificates and unique domains we service in 2018, to 90 million and 120 million, respectively. This anticipated growth is due to continuing high expectations for HTTPS growth in general in 2018.

Let’s Encrypt helps to drive HTTPS adoption by offering a free, easy to use, and globally available option for obtaining the certificates required to enable HTTPS. HTTPS adoption on the Web took off at an unprecedented rate from the day Let’s Encrypt launched to the public.

One of the reasons Let’s Encrypt is so easy to use is that our community has done great work making client software that works well for a wide variety of platforms. We’d like to thank everyone involved in the development of over 60 client software options for Let’s Encrypt. We’re particularly excited that support for the ACME protocol and Let’s Encrypt is being added to the Apache httpd server.

Other organizations and communities are also doing great work to promote HTTPS adoption, and thus stimulate demand for our services. For example, browsers are starting to make their users more aware of the risks associated with unencrypted HTTP (e.g. Firefox, Chrome). Many hosting providers and CDNs are making it easier than ever for all of their customers to use HTTPS. Government agencies are waking up to the need for stronger security to protect constituents. The media community is working to Secure the News.

New Features

We’ve got some exciting features planned for 2018.

First, we’re planning to introduce an ACME v2 protocol API endpoint and support for wildcard certificates along with it. Wildcard certificates will be free and available globally just like our other certificates. We are planning to have a public test API endpoint up by January 4, and we’ve set a date for the full launch: Tuesday, February 27.

Later in 2018 we plan to introduce ECDSA root and intermediate certificates. ECDSA is generally considered to be the future of digital signature algorithms on the Web due to the fact that it is more efficient than RSA. Let’s Encrypt will currently sign ECDSA keys from subscribers, but we sign with the RSA key from one of our intermediate certificates. Once we have an ECDSA root and intermediates, our subscribers will be able to deploy certificate chains which are entirely ECDSA.

Infrastructure

Our CA infrastructure is capable of issuing millions of certificates per day with multiple redundancy for stability and a wide variety of security safeguards, both physical and logical. Our infrastructure also generates and signs nearly 20 million OCSP responses daily, and serves those responses nearly 2 billion times per day. We expect issuance and OCSP numbers to double in 2018.

Our physical CA infrastructure currently occupies approximately 70 units of rack space, split between two datacenters, consisting primarily of compute servers, storage, HSMs, switches, and firewalls.

When we issue more certificates it puts the most stress on storage for our databases. We regularly invest in more and faster storage for our database servers, and that will continue in 2018.

We’ll need to add a few additional compute servers in 2018, and we’ll also start aging out hardware in 2018 for the first time since we launched. We’ll age out about ten 2u compute servers and replace them with new 1u servers, which will save space and be more energy efficient while providing better reliability and performance.

We’ll also add another infrastructure operations staff member, bringing that team to a total of six people. This is necessary in order to make sure we can keep up with demand while maintaining a high standard for security and compliance. Infrastructure operations staff are systems administrators responsible for building and maintaining all physical and logical CA infrastructure. The team also manages a 24/7/365 on-call schedule and they are primary participants in both security and compliance audits.

Finances

We pride ourselves on being an efficient organization. In 2018 Let’s Encrypt will secure a large portion of the Web with a budget of only $3.0M. For an overall increase in our budget of only 13%, we will be able to issue and service twice as many certificates as we did in 2017. We believe this represents an incredible value and that contributing to Let’s Encrypt is one of the most effective ways to help create a more secure and privacy-respecting Web.

Our 2018 fundraising efforts are off to a strong start with Platinum sponsorships from Mozilla, Akamai, OVH, Cisco, Google Chrome and the Electronic Frontier Foundation. The Ford Foundation has renewed their grant to Let’s Encrypt as well. We are seeking additional sponsorship and grant assistance to meet our full needs for 2018.

We had originally budgeted $2.91M for 2017 but we’ll likely come in under budget for the year at around $2.65M. The difference between our 2017 expenses of $2.65M and the 2018 budget of $3.0M consists primarily of the additional infrastructure operations costs previously mentioned.

Support Let’s Encrypt

We depend on contributions from our community of users and supporters in order to provide our services. If your company or organization would like to sponsor Let’s Encrypt please email us at [email protected]. We ask that you make an individual contribution if it is within your means.

We’re grateful for the industry and community support that we receive, and we look forward to continuing to create a more secure and privacy-respecting Web!

Libertarians are against net neutrality

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/12/libertarians-are-against-net-neutrality.html

This post claims to be by a libertarian in support of net neutrality. As a libertarian, I need to debunk this. “Net neutrality” is a case of one-hand clapping, you rarely hear the competing side, and thus, that side may sound attractive. This post is about the other side, from a libertarian point of view.

That post just repeats the common, and wrong, left-wing talking points. I mean, there might be a libertarian case for some broadband regulation, but this isn’t it.

This thing they call “net neutrality” is just left-wing politics masquerading as some sort of principle. It’s no different than how people claim to be “pro-choice”, yet demand forced vaccinations. Or, it’s no different than how people claim to believe in “traditional marriage” even while they are on their third “traditional marriage”.

Properly defined, “net neutrality” means no discrimination of network traffic. But nobody wants that. A classic example is how most internet connections have faster download speeds than uploads. This discriminates against upload traffic, harming innovation in upload-centric applications like DropBox’s cloud backup or BitTorrent’s peer-to-peer file transfer. Yet activists never mention this, or other types of network traffic discrimination, because they no more care about “net neutrality” than Trump or Gingrich care about “traditional marriage”.

Instead, when people say “net neutrality”, they mean “government regulation”. It’s the same old debate between who is the best steward of consumer interest: the free-market or government.

Specifically, in the current debate, they are referring to the Obama-era FCC “Open Internet” order and reclassification of broadband under “Title II” so they can regulate it. Trump’s FCC is putting broadband back to “Title I”, which means the FCC can’t regulate most of its “Open Internet” order.

Don’t be tricked into thinking the “Open Internet” order is anything but intensely politically. The premise behind the order is the Democrat’s firm believe that it’s government who created the Internet, and all innovation, advances, and investment ultimately come from the government. It sees ISPs as inherently deceitful entities who will only serve their own interests, at the expense of consumers, unless the FCC protects consumers.

It says so right in the order itself. It starts with the premise that broadband ISPs are evil, using illegitimate “tactics” to hurt consumers, and continues with similar language throughout the order.

A good contrast to this can be seen in Tim Wu’s non-political original paper in 2003 that coined the term “net neutrality”. Whereas the FCC sees broadband ISPs as enemies of consumers, Wu saw them as allies. His concern was not that ISPs would do evil things, but that they would do stupid things, such as favoring short-term interests over long-term innovation (such as having faster downloads than uploads).

The political depravity of the FCC’s order can be seen in this comment from one of the commissioners who voted for those rules:

FCC Commissioner Jessica Rosenworcel wants to increase the minimum broadband standards far past the new 25Mbps download threshold, up to 100Mbps. “We invented the internet. We can do audacious things if we set big goals, and I think our new threshold, frankly, should be 100Mbps. I think anything short of that shortchanges our children, our future, and our new digital economy,” Commissioner Rosenworcel said.

This is indistinguishable from communist rhetoric that credits the Party for everything, as this booklet from North Korea will explain to you.

But what about monopolies? After all, while the free-market may work when there’s competition, it breaks down where there are fewer competitors, oligopolies, and monopolies.

There is some truth to this, in individual cities, there’s often only only a single credible high-speed broadband provider. But this isn’t the issue at stake here. The FCC isn’t proposing light-handed regulation to keep monopolies in check, but heavy-handed regulation that regulates every last decision.

Advocates of FCC regulation keep pointing how broadband monopolies can exploit their renting-seeking positions in order to screw the customer. They keep coming up with ever more bizarre and unlikely scenarios what monopoly power grants the ISPs.

But the never mention the most simplest: that broadband monopolies can just charge customers more money. They imagine instead that these companies will pursue a string of outrageous, evil, and less profitable behaviors to exploit their monopoly position.

The FCC’s reclassification of broadband under Title II gives it full power to regulate ISPs as utilities, including setting prices. The FCC has stepped back from this, promising it won’t go so far as to set prices, that it’s only regulating these evil conspiracy theories. This is kind of bizarre: either broadband ISPs are evilly exploiting their monopoly power or they aren’t. Why stop at regulating only half the evil?

The answer is that the claim “monopoly” power is a deception. It starts with overstating how many monopolies there are to begin with. When it issued its 2015 “Open Internet” order the FCC simultaneously redefined what they meant by “broadband”, upping the speed from 5-mbps to 25-mbps. That’s because while most consumers have multiple choices at 5-mbps, fewer consumers have multiple choices at 25-mbps. It’s a dirty political trick to convince you there is more of a problem than there is.

In any case, their rules still apply to the slower broadband providers, and equally apply to the mobile (cell phone) providers. The US has four mobile phone providers (AT&T, Verizon, T-Mobile, and Sprint) and plenty of competition between them. That it’s monopolistic power that the FCC cares about here is a lie. As their Open Internet order clearly shows, the fundamental principle that animates the document is that all corporations, monopolies or not, are treacherous and must be regulated.

“But corporations are indeed evil”, people argue, “see here’s a list of evil things they have done in the past!”

No, those things weren’t evil. They were done because they benefited the customers, not as some sort of secret rent seeking behavior.

For example, one of the more common “net neutrality abuses” that people mention is AT&T’s blocking of FaceTime. I’ve debunked this elsewhere on this blog, but the summary is this: there was no network blocking involved (not a “net neutrality” issue), and the FCC analyzed it and decided it was in the best interests of the consumer. It’s disingenuous to claim it’s an evil that justifies FCC actions when the FCC itself declared it not evil and took no action. It’s disingenuous to cite the “net neutrality” principle that all network traffic must be treated when, in fact, the network did treat all the traffic equally.

Another frequently cited abuse is Comcast’s throttling of BitTorrent.Comcast did this because Netflix users were complaining. Like all streaming video, Netflix backs off to slower speed (and poorer quality) when it experiences congestion. BitTorrent, uniquely among applications, never backs off. As most applications become slower and slower, BitTorrent just speeds up, consuming all available bandwidth. This is especially problematic when there’s limited upload bandwidth available. Thus, Comcast throttled BitTorrent during prime time TV viewing hours when the network was already overloaded by Netflix and other streams. BitTorrent users wouldn’t mind this throttling, because it often took days to download a big file anyway.

When the FCC took action, Comcast stopped the throttling and imposed bandwidth caps instead. This was a worse solution for everyone. It penalized heavy Netflix viewers, and prevented BitTorrent users from large downloads. Even though BitTorrent users were seen as the victims of this throttling, they’d vastly prefer the throttling over the bandwidth caps.

In both the FaceTime and BitTorrent cases, the issue was “network management”. AT&T had no competing video calling service, Comcast had no competing download service. They were only reacting to the fact their networks were overloaded, and did appropriate things to solve the problem.

Mobile carriers still struggle with the “network management” issue. While their networks are fast, they are still of low capacity, and quickly degrade under heavy use. They are looking for tricks in order to reduce usage while giving consumers maximum utility.

The biggest concern is video. It’s problematic because it’s designed to consume as much bandwidth as it can, throttling itself only when it experiences congestion. This is what you probably want when watching Netflix at the highest possible quality, but it’s bad when confronted with mobile bandwidth caps.

With small mobile devices, you don’t want as much quality anyway. You want the video degraded to lower quality, and lower bandwidth, all the time.

That’s the reasoning behind T-Mobile’s offerings. They offer an unlimited video plan in conjunction with the biggest video providers (Netflix, YouTube, etc.). The catch is that when congestion occurs, they’ll throttle it to lower quality. In other words, they give their bandwidth to all the other phones in your area first, then give you as much of the leftover bandwidth as you want for video.

While it sounds like T-Mobile is doing something evil, “zero-rating” certain video providers and degrading video quality, the FCC allows this, because they recognize it’s in the customer interest.

Mobile providers especially have great interest in more innovation in this area, in order to conserve precious bandwidth, but they are finding it costly. They can’t just innovate, but must ask the FCC permission first. And with the new heavy handed FCC rules, they’ve become hostile to this innovation. This attitude is highlighted by the statement from the “Open Internet” order:

And consumers must be protected, for example from mobile commercial practices masquerading as “reasonable network management.”

This is a clear declaration that free-market doesn’t work and won’t correct abuses, and that that mobile companies are treacherous and will do evil things without FCC oversight.

Conclusion

Ignoring the rhetoric for the moment, the debate comes down to simple left-wing authoritarianism and libertarian principles. The Obama administration created a regulatory regime under clear Democrat principles, and the Trump administration is rolling it back to more free-market principles. There is no principle at stake here, certainly nothing to do with a technical definition of “net neutrality”.

The 2015 “Open Internet” order is not about “treating network traffic neutrally”, because it doesn’t do that. Instead, it’s purely a left-wing document that claims corporations cannot be trusted, must be regulated, and that innovation and prosperity comes from the regulators and not the free market.

It’s not about monopolistic power. The primary targets of regulation are the mobile broadband providers, where there is plenty of competition, and who have the most “network management” issues. Even if it were just about wired broadband (like Comcast), it’s still ignoring the primary ways monopolies profit (raising prices) and instead focuses on bizarre and unlikely ways of rent seeking.

If you are a libertarian who nonetheless believes in this “net neutrality” slogan, you’ve got to do better than mindlessly repeating the arguments of the left-wing. The term itself, “net neutrality”, is just a slogan, varying from person to person, from moment to moment. You have to be more specific. If you truly believe in the “net neutrality” technical principle that all traffic should be treated equally, then you’ll want a rewrite of the “Open Internet” order.

In the end, while libertarians may still support some form of broadband regulation, it’s impossible to reconcile libertarianism with the 2015 “Open Internet”, or the vague things people mean by the slogan “net neutrality”.

Game night 1: Lisa, Lisa, MOOP

Post Syndicated from Eevee original https://eev.ee/blog/2017/12/05/game-night-1-lisa-lisa-moop/

For the last few weeks, glip (my partner) and I have spent a couple hours most nights playing indie games together. We started out intending to play a short list of games that had been recommended to glip, but this turns out to be a nice way to wind down, so we’ve been keeping it up and clicking on whatever looks interesting in the itch app.

Most of the games are small and made by one or two people, so they tend to be pretty tightly scoped and focus on a few particular kinds of details. I’ve found myself having brain thoughts about all that, so I thought I’d write some of them down.

I also know that some people (cough) tend not to play games they’ve never heard of, even if they want something new to play. If that’s you, feel free to play some of these, now that you’ve heard of them!

Also, I’m still figuring the format out here, so let me know if this is interesting or if you hope I never do it again!

First up:

  • Lisa: The Painful
  • Lisa: The Joyful
  • MOOP

These are impressions, not reviews. I try to avoid major/ending spoilers, but big plot points do tend to leave impressions.

Lisa: The Painful

long · classic rpg · dec 2014 · lin/mac/win · $10 on itch or steam · website

(cw: basically everything??)

Lisa: The Painful is true to its name. I hesitate to describe it as fun, exactly, but I’m glad we played it.

Everything about the game is dark. It’s a (somewhat loose) sequel to another game called Lisa, whose titular character ultimately commits suicide; her body hanging from a noose is the title screen for this game.

Ah, but don’t worry, it gets worse. This game takes place in a post-apocalyptic wasteland, where every female human — women, children, babies — is dead. You play as Brad (Lisa’s brother), who has discovered the lone exception: a baby girl he names Buddy and raises like a daughter. Now, Buddy has been kidnapped, and you have to go rescue her, presumably from being raped.

Ah, but don’t worry, it gets worse.


I’ve had a hard time putting my thoughts in order here, because so much of what stuck with me is the way the game entangles the plot with the mechanics.

I love that kind of thing, but it’s so hard to do well. I can’t really explain why, but I feel like most attempts to do it fall flat — they have a glimmer of an idea, but they don’t integrate it well enough, or they don’t run nearly as far as they could have. I often get the same feeling as, say, a hyped-up big moral choice that turns out to be picking “yes” or “no” from a menu. The idea is there, but the execution is so flimsy that it leaves no impact on me at all.

An obvious recent success here is Undertale, where the entire story is about violence and whether you choose to engage or avoid it (and whether you can do that). If you choose to eschew violence, not only does the game become more difficult, it arguably becomes a different game entirely. Granted, the contrast is lost if you (like me) tried to play as a pacifist from the very beginning. I do feel that you could go further with the idea than Undertale, but Undertale itself doesn’t feel incomplete.

Christ, I’m not even talking about the right game any more.

Okay, so: this game is a “classic” RPG, by which I mean, it was made with RPG Maker. (It’s kinda funny that RPG Maker was designed to emulate a very popular battle style, and now the only games that use that style are… made with RPG Maker.) The main loop, on the surface, is standard RPG fare: you walk around various places, talk to people, solve puzzles, recruit party members, and get into turn-based fights.

Now, Brad is addicted to a drug called Joy. He will regularly go into withdrawal, which manifests in the game as a status effect that cuts his stats (even his max HP!) dramatically.

It is really, really, incredibly inconvenient. And therein lies the genius here. The game could have simply told me that Brad is an addict, and I don’t think I would’ve cared too much. An addiction to a fantasy drug in a wasteland doesn’t mean anything to me, especially about this tiny sprite man I just met, so I would’ve filed this away as a sterile fact and forgotten about it. By making his addiction affect me, I’m now invested in it. I wish Brad weren’t addicted, even if only because it’s annoying. I found a party member once who turned out to have the same addiction, and I felt dread just from seeing the icon for the status effect. I’ve been looped into the events of this story through the medium I use to interact with it: the game.

It’s a really good use of games as a medium. Even before I’m invested in the characters, I’m invested in what’s happening to them, because it impacts the game!

Incidentally, you can get Joy as an item, which will temporarily cure your withdrawal… but you mostly find it by looting the corpses of grotesque mutant flesh horrors you encounter. I don’t think the game would have the player abruptly mutate out of nowhere, but I wasn’t about to find out, either. We never took any.


Virtually every staple of the RPG genre has been played with in some way to tie it into the theme/setting. I love it, and I think it works so well precisely because it plays with expectations of how RPGs usually work.

Most obviously, the game is a sidescroller, not top-down. You can’t jump freely, but you can hop onto one-tile-high boxes and climb ropes. You can also drop off off ledges… but your entire party will take fall damage, which gets rapidly more severe the further you fall.

This wouldn’t be too much of a problem, except that healing is hard to come by for most of the game. Several hub areas have campfires you can sleep next to to restore all your health and MP, but when you wake up, something will have happened to you. Maybe just a weird cutscene, or maybe one of your party members has decided to leave permanently.

Okay, so use healing items instead? Good luck; money is also hard to come by, and honestly so are shops, and many of the healing items are woefully underpowered.

Grind for money? Good luck there, too! While the game has plenty of battles, virtually every enemy is a unique overworld human who only appears once, and then is dead, because you killed him. Only a handful of places have unlimited random encounters, and grinding is not especially pleasant.

The “best” way to get a reliable heal is to savescum — save the game, sleep by the campfire, and reload if you don’t like what you wake up to.

In a similar vein, there’s a part of the game where you’re forced to play Russian Roulette. You choose a party member; he and an opponent will take turns shooting themselves in the head until someone finds a loaded chamber. If your party member loses, he is dead. And you have to keep playing until you win three times, so there’s no upper limit on how many people you might lose. I couldn’t find any way to influence who won, so I just had to savescum for a good half hour until I made it through with minimal losses.

It was maddening, but also a really good idea. Games don’t often incorporate the existence of saves into the gameplay, and when they do, they usually break the fourth wall and get all meta about it. Saves are never acknowledged in-universe here (aside from the existence of save points), but surely these parts of the game were designed knowing that the best way through them is by reloading. It’s rarely done, it can easily feel unfair, and it drove me up the wall — but it was certainly painful, as intended, and I kinda love that.

(Naturally, I’m told there’s a hard mode, where you can only use each save point once.)

The game also drives home the finality of death much better than most. It’s not hard to overlook the death of a redshirt, a character with a bit part who simply doesn’t appear any more. This game permanently kills your party members. Russian Roulette isn’t even the only way you can lose them! Multiple cutscenes force you to choose between losing a life or some other drastic consequence. (Even better, you can try to fight the person forcing this choice on you, and he will decimate you.) As the game progresses, you start to encounter enemies who can simply one-shot murder your party members.

It’s such a great angle. Just like with Brad’s withdrawal, you don’t want to avoid their deaths because it’d be emotional — there are dozens of party members you can recruit (though we only found a fraction of them), and most of them you only know a paragraph about — but because it would inconvenience you personally. Chances are, you have your strongest dudes in your party at any given time, so losing one of them sucks. And with few random encounters, you can’t just grind someone else up to an appropriate level; it feels like there’s a finite amount of XP in the game, and if someone high-level dies, you’ve lost all the XP that went into them.


The battles themselves are fairly straightforward. You can attack normally or use a special move that costs MP. SP? Some kind of points.

Two things in particular stand out. One I mentioned above: the vast majority of the encounters are one-time affairs against distinct named NPCs, who you then never see again, because they are dead, because you killed them.

The other is the somewhat unusual set of status effects. The staples like poison and sleep are here, but don’t show up all that often; more frequent are statuses like weird, drunk, stink, or cool. If you do take Joy (which also cures depression), you become joyed for a short time.

The game plays with these in a few neat ways, besides just Brad’s withdrawal. Some party members have a status like stink or cool permanently. Some battles are against people who don’t want to fight at all — and so they’ll spend most of the battle crying, purely for flavor impact. Seeing that for the first time hit me pretty hard; until then we’d only seen crying as a mechanical side effect of having sand kicked in one’s face.


The game does drag on a bit. I think we poured 10 in-game hours into it, which doesn’t count time spent reloading. It doesn’t help that you walk not super fast.

My biggest problem was with getting my bearings; I’m sure we spent a lot of that time wandering around accomplishing nothing. Most of the world is focused around one of a few hub areas, and once you’ve completed one hub, you can move onto the next one. That’s fine. Trouble is, you can go any of a dozen different directions from each hub, and most of those directions will lead you to very similar-looking hills built out of the same tiny handful of tiles. The connections between places are mostly cave entrances, which also largely look the same. Combine that with needing to backtrack for puzzle or progression reasons, and it’s incredibly difficult to keep track of where you’ve been, what you’ve done, and where you need to go next.

I don’t know that the game is wrong here; the aesthetic and world layout are fantastic at conveying a desolate wasteland. I wouldn’t even be surprised if the navigation were deliberately designed this way. (On the other hand, assuming every annoyance in a despair-ridden game is deliberate might be giving it too much credit.) But damn it’s still frustrating.

I felt a little lost in the battle system, too. Towards the end of the game, Brad in particular had over a dozen skills he could use, but I still couldn’t confidently tell you which were the strongest. New skills sometimes appear in the middle of the list or cost less than previous skills, and the game doesn’t outright tell you how much damage any of them do. I know this is the “classic RPG” style, and I don’t think it was hugely inconvenient, but it feels weird to barely know how my own skills work. I think this puts me off getting into new RPGs, just generally; there’s a whole new set of things I have to learn about, and games in this style often won’t just tell me anything, so there’s this whole separate meta-puzzle to figure out before I can play the actual game effectively.

Also, the sound could use a little bit of… mastering? Some music and sound effects are significantly louder and screechier than others. Painful, you could say.


The world is full of side characters with their own stuff going on, which is also something I love seeing in games; too often, the whole world feels like an obstacle course specifically designed for you.

Also, many of those characters are, well, not great people. Really, most of the game is kinda fucked up. Consider: the weird status effect is most commonly inflicted by the “Grope” skill. It makes you feel weird, you see. Oh, and the currency is porn magazines.

And then there are the gangs, the various spins on sex clubs, the forceful drug kingpins, and the overall violence that permeates everything (you stumble upon an alarming number of corpses). The game neither condones nor condemns any of this; it simply offers some ideas of how people might behave at the end of the world. It’s certainly the grittiest interpretation I’ve seen.

I don’t usually like post-apocalypses, because they try to have these very hopeful stories, but then at the end the world is still a blighted hellscape so what was the point of any of that? I like this game much better for being a blighted hellscape throughout. The story is worth following to see where it goes, not just because you expect everything wrapped up neatly at the end.

…I realize I’ve made this game sound monumentally depressing throughout, but it manages to pack in a lot of funny moments as well, from the subtle to the overt. In retrospect, it’s actually really good at balancing the mood so it doesn’t get too depressing. If nothing else, it’s hilarious to watch this gruff, solemn, battle-scarred, middle-aged man pedal around on a kid’s bike he found.


An obvious theme of the game is despair, but the more I think about it, the more I wonder if ambiguity is a theme as well. It certainly fits the confusing geography.

Even the premise is a little ambiguous. Is/was Olathe a city, a country, a whole planet? Did the apocalypse affect only Olathe, or the whole world? Does it matter in an RPG, where the only world that exists is the one mapped out within the game?

Towards the end of the game, you catch up with Buddy, but she rejects you, apparently resentful that you kept her hidden away for her entire life. Brad presses on anyway, insisting on protecting her.

At that point I wasn’t sure I was still on Brad’s side. But he’s not wrong, either. Is he? Maybe it depends on how old Buddy is — but the game never tells us. Her sprite is a bit smaller than the men’s, but it’s hard to gauge much from small exaggerated sprites, and she might just be shorter. In the beginning of the game, she was doing kid-like drawings, but we don’t know how much time passed after that. Everyone seems to take for granted that she’s capable of bearing children, and she talks like an adult. So is she old enough to be making this decision, or young enough for parent figure Brad to overrule her? What is the appropriate age of agency, anyway, when you’re the last girl/woman left more than a decade after the end of the world?

Can you repopulate a species with only one woman, anyway?


Well, that went on a bit longer than I intended. This game has a lot of small touches that stood out to me, and they all wove together very well.

Should you play it? I have absolutely no idea.

FINAL SCORE: 1 out of 6 chambers

Lisa: The Joyful

fairly short · classic rpg · aug 2015 · lin/mac/win · $5 on itch or steam

Surprise! There’s a third game to round out this trilogy.

Lisa: The Joyful is much shorter, maybe three hours long — enough to be played in a night rather than over the better part of a week.

This one picks up immediately after the end of Painful, with you now playing as Buddy. It takes a drastic turn early on: Buddy decides that, rather than hide from the world, she must conquer it. She sets out to murder all the big bosses and become queen.

The battle system has been inherited from the previous game, but battles are much more straightforward this time around. You can’t recruit any party members; for much of the game, it’s just you and a sword.

There is a catch! Of course.

The catch is that you do not have enough health to survive most boss battles without healing. With no party members, you cannot heal via skills. I don’t think you could buy healing items anywhere, either. You have a few when the game begins, but once you run out, that’s it.

Except… you also have… some Joy. Which restores you to full health and also makes you crit with every hit. And drops off of several enemies.

We didn’t even recognize Joy as a healing item at first, since we never used it in Painful; it’s description simply says that it makes you feel nothing, and we’d assumed the whole point of it was to stave off withdrawal, which Buddy doesn’t experience. Luckily, the game provided a hint in the form of an NPC who offers to switch on easy mode:

What’s that? Bad guys too tough? Not enough jerky? You don’t want to take Joy!? Say no more, you’ve come to the right place!

So the game is aware that it’s unfairly difficult, and it’s deliberately forcing you to take Joy, and it is in fact entirely constructed around this concept. I guess the title is a pretty good hint, too.

I don’t feel quite as strongly about Joyful as I do about Painful. (Admittedly, I was really tired and starting to doze off towards the end of Joyful.) Once you get that the gimmick is to force you to use Joy, the game basically reduces to a moderate-difficulty boss rush. Other than that, the only thing that stood out to me mechanically was that Buddy learns a skill where she lifts her shirt to inflict flustered as a status effect — kind of a lingering echo of how outrageous the previous game could be.

You do get a healthy serving of plot, which is nice and ties a few things together. I wouldn’t say it exactly wraps up the story, but it doesn’t feel like it’s missing anything either; it’s exactly as murky as you’d expect.

I think it’s worth playing Joyful if you’ve played Painful. It just didn’t have the same impact on me. It probably doesn’t help that I don’t like Buddy as a person. She seems cold, violent, and cruel. Appropriate for the world and a product of her environment, I suppose.

FINAL SCORE: 300 Mags

MOOP

fairly short · inventory game · nov 2017 · win · free on itch

Finally, as something of a palate cleanser, we have MOOP: a delightful and charming little inventory game.

I don’t think “inventory game” is a real genre, but I mean the kind of game where you go around collecting items and using them in the right place. Puzzle-driven, but with “puzzles” that can largely be solved by simply trying everything everywhere. I’d put a lot of point and click adventures in the same category, despite having a radically different interface. Is that fair? Yes, because it’s my blog.

MOOP was almost certainly also made in RPG Maker, but it breaks the mold in a very different way by not being an RPG. There are no battles whatsoever, only interactions on the overworld; you progress solely via dialogue and puzzle-solving. Examining something gives you a short menu of verbs — use, talk, get — reminiscent of interactive fiction, or perhaps the graphical “adventure” games that took inspiration from interactive fiction. (God, “adventure game” is the worst phrase. Every game is an adventure! It doesn’t mean anything!)

Everything about the game is extremely chill. I love the monochrome aesthetic combined with a large screen resolution; it feels like I’m peeking into an alternate universe where the Game Boy got bigger but never gained color. I played halfway through the game before realizing that the protagonist (Moop) doesn’t have a walk animation; they simply slide around. Somehow, it works.

The puzzles are a little clever, yet low-pressure; the world is small enough that you can examine everything again if you get stuck, and there’s no way to lose or be set back. The music is lovely, too. It just feels good to wander around in a world that manages to make sepia look very pretty.

The story manages to pack a lot into a very short time. It’s… gosh, I don’t know. It has a very distinct texture to it that I’m not sure I’ve seen before. The plot weaves through several major events that each have very different moods, and it moves very quickly — but it’s well-written and doesn’t feel rushed or disjoint. It’s lighthearted, but takes itself seriously enough for me to get invested. It’s fucking witchcraft.

I think there was even a non-binary character! Just kinda nonchalantly in there. Awesome.

What a happy, charming game. Play if you would like to be happy and charmed.

FINAL SCORE: 1 waxing moon

Could a Single Copyright Complaint Kill Your Domain?

Post Syndicated from Andy original https://torrentfreak.com/could-a-single-copyright-complaint-kill-your-domain-171203/

It goes without saying that domain names are a crucial part of any site’s infrastructure. Without domains, sites aren’t easily findable and when things go wrong, the majority of web users could be forgiven for thinking that they no longer exist.

That was the case last week when Canada-based mashup site Sowndhaus suddenly found that its domain had been rendered completely useless. As previously reported, the site’s domain was suspended by UK-based registrar DomainBox after it received a copyright complaint from the IFPI.

There are a number of elements to this story, not least that the site’s operators believe that their project is entirely legal.

“We are a few like-minded folks from the mashup community that were tired of doing the host dance – new sites welcome us with open arms until record industry pressure becomes too much and they mass delete and ban us,” a member of the Sowndhaus team informs TF.

“After every mass deletion there are a wave of producers that just retire and their music is lost forever. We decided to make a more permanent home for ourselves and Canada’s Copyright Modernization Act gave us the opportunity to do it legally.
We just want a small quiet corner of the internet where we can make music without being criminalized. It seems insane that I even have to say that.”

But while these are all valid concerns for the Sowndhaus community, there is a bigger picture here. There is absolutely no question that sites like YouTube and Soundcloud host huge libraries of mashups, yet somehow they hang on to their domains. Why would DomainBox take such drastic action? Is the site a real menace?

“The IFPI have sent a few standard DMCA takedown notices [to Sowndhaus, indirectly], each about a specific track or tracks on our server, asking us to remove them and any infringing activity. Every track complained about has been transformative, either a mashup or a remix and in a couple of cases cover versions,” the team explains.

But in all cases, it appears that IFPI and its agents didn’t take the time to complain to the site first. They instead went for the site’s infrastructure.

“[IFPI] have never contacted us directly, even though we have a ‘report copyright abuse’ feature on our site and a dedicated copyright email address. We’ve only received forwarded emails from our host and domain registrar,” the site says.

Sowndhaus believes that the event that led to the domain suspension was caused by a support ticket raised by the “RiskIQ Incident Response Team”, who appear to have been working on behalf of IFPI.

“We were told by DomainBox…’Please remove the unlawful content from your website, or the domain will be suspended. Please reply within the next 5 working days to ensure the request was actioned’,” Sowndhaus says.

But they weren’t given five days, or even one. DomainBox chose to suspend the Sowndhaus.com domain name immediately, rendering the site inaccessible and without even giving the site a chance to respond.

“They didn’t give us an option to appeal the decision. They just took the IFPI’s word that the files were unlawful and must be removed,” the site informs us.

Intrigued at why DomainBox took the nuclear option, TorrentFreak sent several emails to the company but each time they went unanswered. We also sent emails to Mesh Digital Ltd, DomainBox’s operator, but they were given the same treatment.

We wanted to know on what grounds the registrar suspended the domain but perhaps more importantly, we wanted to know if the company is as aggressive as this with its other customers.

To that end we posed a question: If DomainBox had been entrusted with the domains of YouTube or Soundcloud, would they have acted in the same manner? We can’t put words in their mouth but it seems likely that someone in the company would step in to avoid a PR disaster on that scale.

Of course, both YouTube and Soundcloud comply with the law by taking down content when it infringes someone’s rights. It’s a position held by Sowndhaus too, even though they do not operate in the United States.

“We comply fully with the Copyright Act (Canada) and have our own policy of removing any genuinely infringing content,” the site says, adding that users who infringe are banned from the platform.

While there has never been any suggestion that IFPI or its agents asked for Sowndhaus’ domain to be suspended, it’s clear that DomainBox made a decision to do just that. In some cases that might have been warranted, but registrars should definitely aim for a clear, transparent and fair process, so that the facts can be reviewed and appropriate action taken.

It’s something for people to keep in mind when they register a domain in future.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Collect Data Statistics Up to 5x Faster by Analyzing Only Predicate Columns with Amazon Redshift

Post Syndicated from George Caragea original https://aws.amazon.com/blogs/big-data/collect-data-statistics-up-to-5x-faster-by-analyzing-only-predicate-columns-with-amazon-redshift/

Amazon Redshift is a fast, fully managed, petabyte-scale data warehousing service that makes it simple and cost-effective to analyze all of your data. Many of our customers—including Boingo Wireless, Scholastic, Finra, Pinterest, and Foursquare—migrated to Amazon Redshift and achieved agility and faster time to insight, while dramatically reducing costs.

Query optimization and the need for accurate estimates

When a SQL query is submitted to Amazon Redshift, the query optimizer is in charge of generating all the possible ways to execute that query, and picking the fastest one. This can mean evaluating the cost of thousands, if not millions, of different execution plans.

The plan cost is calculated based on estimates of the data characteristics. For example, the characteristics could include the number of rows in each base table, the average width of a variable-length column, the number of distinct values in a column, and the most common values in a column. These estimates (or “statistics”) are computed in advance by running an ANALYZE command, and stored in the system catalog.

How do the query optimizer and ANALYZE work together?

An ideal scenario is to run ANALYZE after every ETL/ingestion job. This way, when running your workload, the query optimizer can use up-to-date data statistics, and choose the most optimal execution plan, given the updates.

However, running the ANALYZE command can add significant overhead to the data ingestion scripts. This can lead to customers not running ANALYZE on their data, and using default or stale estimates. The end result is usually the optimizer choosing a suboptimal execution plan that runs for longer than needed.

Analyzing predicate columns only

When you run a SQL query, the query optimizer requests statistics only on columns used in predicates in the SQL query (join predicates, filters in the WHERE clause and GROUP BY clauses). Consider the following query:

SELECT Avg(salary), 
       Min(hiredate), 
       deptname 
FROM   emp 
WHERE  state = 'CA' 
GROUP  BY deptname; 

In the query above, the optimizer requests statistics only on columns ‘state’ and ‘deptname’, but not on ‘salary’ and ‘hiredate’. If present, statistics on columns ‘salary’ and ‘hiredate’ are ignored, as they do not impact the cost of the execution plans considered.

Based on the optimizer functionality described earlier, the Amazon Redshift ANALYZE command has been updated to optionally collect information only about columns used in previous queries as part of a filter, join condition or a GROUP BY clause, and columns that are part of distribution or sort keys (predicate columns). There’s a recently introduced option for the ANALYZE command that only analyzes predicate columns:

ANALYZE <table name> PREDICATE COLUMNS;

By having Amazon Redshift collect information about predicate columns automatically, and analyzing those columns only, you’re able to reduce the time to run ANALYZE. For example, during the execution of the 99 queries in the TPC-DS workload, only 203 out of the 424 total columns are predicate columns (approximately 48%). By analyzing only the predicate columns for such a workload, the execution time for running ANALYZE can be significantly reduced.

From my experience in the data warehousing space, I have observed that about 20% of columns in a typical use case are marked predicate. In such a case, running ANALYZE PREDICATE COLUMNS can lead to a speedup of up to 5x relative to a full ANALYZE run.

If no information on predicate columns exists in the system (for example, a new table that has not been queried yet), ANALYZE PREDICATE COLUMNS collects statistics on all the columns. When queries on the table are run, Amazon Redshift collects information about predicate column usage, and subsequent runs of ANALYZE PREDICATE COLUMNS only operates on the predicate columns.

If the workload is relatively stable, and the set of predicate columns does not expand continuously over time, I recommend replacing all occurrences of the ANALYZE command with ANALYZE PREDICATE COLUMNS commands in your application and data ingestion code.

Using the Analyze/Vacuum utility

Several AWS customers are using the Analyze/Vacuum utility from the Redshift-Utils package to manage and automate their maintenance operations. By passing the –predicate-cols option to the Analyze/Vacuum utility, you can enable it to use the ANALYZE PREDICATE COLUMNS feature, providing you with the significant changes in overhead in a completely seamless manner.

Enhancements to logging for ANALYZE operations

When running ANALYZE with the PREDICATE COLUMNS option, the type of analyze run (Full vs Predicate Column), as well as information about the predicate columns encountered, is logged in the stl_analyze view:

SELECT status, 
       starttime, 
       prevtime, 
       num_predicate_cols, 
       num_new_predicate_cols 
FROM   stl_analyze;
     status   |    starttime        |   prevtime          | pred_cols | new_pred_cols
--------------+---------------------+---------------------+-----------+---------------
 Full         | 2017-11-09 01:15:47 |                     |         0 |             0
 PredicateCol | 2017-11-09 01:16:20 | 2017-11-09 01:15:47 |         2 |             2

AWS also enhanced the pg_statistic catalog table with two new pieces of information: the time stamp at which a column was marked as “predicate”, and the time stamp at which the column was last analyzed.

The Amazon Redshift documentation provides a view that allows a user to easily see which columns are marked as predicate, when they were marked as predicate, and when a column was last analyzed. For example, for the emp table used above, the output of the view could be as follows:

 SELECT col_name, 
       is_predicate, 
       first_predicate_use, 
       last_analyze 
FROM   predicate_columns 
WHERE  table_name = 'emp';

 col_name | is_predicate | first_predicate_use  |        last_analyze
----------+--------------+----------------------+----------------------------
 id       | f            |                      | 2017-11-09 01:15:47
 name     | f            |                      | 2017-11-09 01:15:47
 deptname | t            | 2017-11-09 01:16:03  | 2017-11-09 01:16:20
 age      | f            |                      | 2017-11-09 01:15:47
 salary   | f            |                      | 2017-11-09 01:15:47
 hiredate | f            |                      | 2017-11-09 01:15:47
 state    | t            | 2017-11-09 01:16:03  | 2017-11-09 01:16:20

Conclusion

After loading new data into an Amazon Redshift cluster, statistics need to be re-computed to guarantee performant query plans. By learning which column statistics are actually being used by the customer’s workload and collecting statistics only on those columns, Amazon Redshift is able to significantly reduce the amount of time needed for table maintenance during data loading workflows.


Additional Reading

Be sure to check out the Top 10 Tuning Techniques for Amazon Redshift, and the Advanced Table Design Playbook: Distribution Styles and Distribution Keys.


About the Author

George Caragea is a Senior Software Engineer with Amazon Redshift. He has been working on MPP Databases for over 6 years and is mainly interested in designing systems at scale. In his spare time, he enjoys being outdoors and on the water in the beautiful Bay Area and finishing the day exploring the rich local restaurant scene.

 

 

Implementing Canary Deployments of AWS Lambda Functions with Alias Traffic Shifting

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/implementing-canary-deployments-of-aws-lambda-functions-with-alias-traffic-shifting/

This post courtesy of Ryan Green, Software Development Engineer, AWS Serverless

The concepts of blue/green and canary deployments have been around for a while now and have been well-established as best-practices for reducing the risk of software deployments.

In a traditional, horizontally scaled application, copies of the application code are deployed to multiple nodes (instances, containers, on-premises servers, etc.), typically behind a load balancer. In these applications, deploying new versions of software to too many nodes at the same time can impact application availability as there may not be enough healthy nodes to service requests during the deployment. This aggressive approach to deployments also drastically increases the blast radius of software bugs introduced in the new version and does not typically give adequate time to safely assess the quality of the new version against production traffic.

In such applications, one commonly accepted solution to these problems is to slowly and incrementally roll out application software across the nodes in the fleet while simultaneously verifying application health (canary deployments). Another solution is to stand up an entirely different fleet and weight (or flip) traffic over to the new fleet after verification, ideally with some production traffic (blue/green). Some teams deploy to a single host (“one box environment”), where the new release can bake for some time before promotion to the rest of the fleet. Techniques like this enable the maintainers of complex systems to safely test in production while minimizing customer impact.

Enter Serverless

There is somewhat of an impedance mismatch when mapping these concepts to a serverless world. You can’t incrementally deploy your software across a fleet of servers when there are no servers!* In fact, even the term “deployment” takes on a different meaning with functions as a service (FaaS). In AWS Lambda, a “deployment” can be roughly modeled as a call to CreateFunction, UpdateFunctionCode, or UpdateAlias (I won’t get into the semantics of whether updating configuration counts as a deployment), all of which may affect the version of code that is invoked by clients.

The abstractions provided by Lambda remove the need for developers to be concerned about servers and Availability Zones, and this provides a powerful opportunity to greatly simplify the process of deploying software.
*Of course there are servers, but they are abstracted away from the developer.

Traffic shifting with Lambda aliases

Before the release of traffic shifting for Lambda aliases, deployments of a Lambda function could only be performed in a single “flip” by updating function code for version $LATEST, or by updating an alias to target a different function version. After the update propagates, typically within a few seconds, 100% of function invocations execute the new version. Implementing canary deployments with this model required the development of an additional routing layer, further adding development time, complexity, and invocation latency.
While rolling back a bad deployment of a Lambda function is a trivial operation and takes effect near instantaneously, deployments of new versions for critical functions can still be a potentially nerve-racking experience.

With the introduction of alias traffic shifting, it is now possible to trivially implement canary deployments of Lambda functions. By updating additional version weights on an alias, invocation traffic is routed to the new function versions based on the weight specified. Detailed CloudWatch metrics for the alias and version can be analyzed during the deployment, or other health checks performed, to ensure that the new version is healthy before proceeding.

Note: Sometimes the term “canary deployments” refers to the release of software to a subset of users. In the case of alias traffic shifting, the new version is released to some percentage of all users. It’s not possible to shard based on identity without adding an additional routing layer.

Examples

The simplest possible use of a canary deployment looks like the following:

# Update $LATEST version of function
aws lambda update-function-code --function-name myfunction ….

# Publish new version of function
aws lambda publish-version --function-name myfunction

# Point alias to new version, weighted at 5% (original version at 95% of traffic)
aws lambda update-alias --function-name myfunction --name myalias --routing-config '{"AdditionalVersionWeights" : {"2" : 0.05} }'

# Verify that the new version is healthy
…
# Set the primary version on the alias to the new version and reset the additional versions (100% weighted)
aws lambda update-alias --function-name myfunction --name myalias --function-version 2 --routing-config '{}'

This is begging to be automated! Here are a few options.

Simple deployment automation

This simple Python script runs as a Lambda function and deploys another function (how meta!) by incrementally increasing the weight of the new function version over a prescribed number of steps, while checking the health of the new version. If the health check fails, the alias is rolled back to its initial version. The health check is implemented as a simple check against the existence of Errors metrics in CloudWatch for the alias and new version.

GitHub aws-lambda-deploy repo

Install:

git clone https://github.com/awslabs/aws-lambda-deploy
cd aws-lambda-deploy
export BUCKET_NAME=[YOUR_S3_BUCKET_NAME_FOR_BUILD_ARTIFACTS]
./install.sh

Run:

# Rollout version 2 incrementally over 10 steps, with 120s between each step
aws lambda invoke --function-name SimpleDeployFunction --log-type Tail --payload \
  '{"function-name": "MyFunction",
  "alias-name": "MyAlias",
  "new-version": "2",
  "steps": 10,
  "interval" : 120,
  "type": "linear"
  }' output

Description of input parameters

  • function-name: The name of the Lambda function to deploy
  • alias-name: The name of the alias used to invoke the Lambda function
  • new-version: The version identifier for the new version to deploy
  • steps: The number of times the new version weight is increased
  • interval: The amount of time (in seconds) to wait between weight updates
  • type: The function to use to generate the weights. Supported values: “linear”

Because this runs as a Lambda function, it is subject to the maximum timeout of 5 minutes. This may be acceptable for many use cases, but to achieve a slower rollout of the new version, a different solution is required.

Step Functions workflow

This state machine performs essentially the same task as the simple deployment function, but it runs as an asynchronous workflow in AWS Step Functions. A nice property of Step Functions is that the maximum deployment timeout has now increased from 5 minutes to 1 year!

The step function incrementally updates the new version weight based on the steps parameter, waiting for some time based on the interval parameter, and performing health checks between updates. If the health check fails, the alias is rolled back to the original version and the workflow fails.

For example, to execute the workflow:

export STATE_MACHINE_ARN=`aws cloudformation describe-stack-resources --stack-name aws-lambda-deploy-stack --logical-resource-id DeployStateMachine --output text | cut  -d$'\t' -f3`

aws stepfunctions start-execution --state-machine-arn $STATE_MACHINE_ARN --input '{
  "function-name": "MyFunction",
  "alias-name": "MyAlias",
  "new-version": "2",
  "steps": 10,
  "interval": 120,
  "type": "linear"}'

Getting feedback on the deployment

Because the state machine runs asynchronously, retrieving feedback on the deployment requires polling for the execution status using DescribeExecution or implementing an asynchronous notification (using SNS or email, for example) from the Rollback or Finalize functions. A CloudWatch alarm could also be created to alarm based on the “ExecutionsFailed” metric for the state machine.

A note on health checks and observability

Weighted rollouts like this are considerably more successful if the code is being exercised and monitored continuously. In this example, it would help to have some automation continuously invoking the alias and reporting metrics on these invocations, such as client-side success rates and latencies.

The absence of Lambda Errors metrics used in these examples can be misleading if the function is not getting invoked. It’s also recommended to instrument your Lambda functions with custom metrics, in addition to Lambda’s built-in metrics, that can be used to monitor health during deployments.

Extensibility

These examples could be easily extended in various ways to support different use cases. For example:

  • Health check implementations: CloudWatch alarms, automatic invocations with payload assertions, querying external systems, etc.
  • Weight increase functions: Exponential, geometric progression, single canary step, etc.
  • Custom success/failure notifications: SNS, email, CI/CD systems, service discovery systems, etc.

Traffic shifting with SAM and CodeDeploy

Using the Lambda UpdateAlias operation with additional version weights provides a powerful primitive for you to implement custom traffic shifting solutions for Lambda functions.

For those not interested in building custom deployment solutions, AWS CodeDeploy provides an intuitive turn-key implementation of this functionality integrated directly into the Serverless Application Model. Traffic-shifted deployments can be declared in a SAM template, and CodeDeploy manages the function rollout as part of the CloudFormation stack update. CloudWatch alarms can also be configured to trigger a stack rollback if something goes wrong.

i.e.

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    FunctionName: MyFunction
    AutoPublishAlias: MyFunctionInvokeAlias
    DeploymentPreference:
      Type: Linear10PercentEvery1Minute
      Role:
        Fn::GetAtt: [ DeploymentRole, Arn ]
      Alarms:
       - { Ref: MyFunctionErrorsAlarm }
...

For more information about using CodeDeploy with SAM, see Automating Updates to Serverless Apps.

Conclusion

It is often the simple features that provide the most value. As I demonstrated in this post, serverless architectures allow the complex deployment orchestration used in traditional applications to be replaced with a simple Lambda function or Step Functions workflow. By allowing invocation traffic to be easily weighted to multiple function versions, Lambda alias traffic shifting provides a simple but powerful feature that I hope empowers you to easily implement safe deployment workflows for your Lambda functions.

AWS Cloud9 – Cloud Developer Environments

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/aws-cloud9-cloud-developer-environments/

One of the first things you learn when you start programming is that, just like any craftsperson, your tools matter. Notepad.exe isn’t going to cut it. A powerful editor and testing pipeline supercharge your productivity. I still remember learning to use Vim for the first time and being able to zip around systems and complex programs. Do you remember how hard it was to setup all your compilers and dependencies on a new machine? How many cycles have you wasted matching versions, tinkering with configs, and then writing documentation to onboard a new developer to a project?

Today we’re launching AWS Cloud9, an Integrated Development Environment (IDE) for writing, running, and debugging code, all from your web browser. Cloud9 comes prepackaged with essential tools for many popular programming languages (Javascript, Python, PHP, etc.) so you don’t have to tinker with installing various compilers and toolchains. Cloud9 also provides a seamless experience for working with serverless applications allowing you to quickly switch between local and remote testing or debugging. Based on the popular open source Ace Editor and c9.io IDE (which we acquired last year), AWS Cloud9 is designed to make collaborative cloud development easy with extremely powerful pair programming features. There are more features than I could ever cover in this post but to give a quick breakdown I’ll break the IDE into 3 components: The editor, the AWS integrations, and the collaboration.

Editing


The Ace Editor at the core of Cloud9 is what lets you write code quickly, easily, and beautifully. It follows a UNIX philosophy of doing one thing and doing it well: writing code.

It has all the typical IDE features you would expect: live syntax checking, auto-indent, auto-completion, code folding, split panes, version control integration, multiple cursors and selections, and it also has a few unique features I want to highlight. First of all, it’s fast, even for large (100000+ line) files. There’s no lag or other issues while typing. It has over two dozen themes built-in (solarized!) and you can bring all of your favorite themes from Sublime Text or TextMate as well. It has built-in support for 40+ language modes and customizable run configurations for your projects. Most importantly though, it has Vim mode (or emacs if your fingers work that way). It also has a keybinding editor that allows you to bend the editor to your will.

The editor supports powerful keyboard navigation and commands (similar to Sublime Text or vim plugins like ctrlp). On a Mac, with ⌘+P you can open any file in your environment with fuzzy search. With ⌘+. you can open up the command pane which allows you to do invoke any of the editor commands by typing the name. It also helpfully displays the keybindings for a command in the pane, for instance to open to a terminal you can press ⌥+T. Oh, did I mention there’s a terminal? It ships with the AWS CLI preconfigured for access to your resources.

The environment also comes with pre-installed debugging tools for many popular languages – but you’re not limited to what’s already installed. It’s easy to add in new programs and define new run configurations.

The editor is just one, admittedly important, component in an IDE though. I want to show you some other compelling features.

AWS Integrations

The AWS Cloud9 IDE is the first IDE I’ve used that is truly “cloud native”. The service is provided at no additional charge, and you only charged for the underlying compute and storage resources. When you create an environment you’re prompted for either: an instance type and an auto-hibernate time, or SSH access to a machine of your choice.

If you’re running in AWS the auto-hibernate feature will stop your instance shortly after you stop using your IDE. This can be a huge cost savings over running a more permanent developer desktop. You can also launch it within a VPC to give it secure access to your development resources. If you want to run Cloud9 outside of AWS, or on an existing instance, you can provide SSH access to the service which it will use to create an environment on the external machine. Your environment is provisioned with automatic and secure access to your AWS account so you don’t have to worry about copying credentials around. Let me say that again: you can run this anywhere.

Serverless Development with AWS Cloud9

I spend a lot of time on Twitch developing serverless applications. I have hundreds of lambda functions and APIs deployed. Cloud9 makes working with every single one of these functions delightful. Let me show you how it works.


If you look in the top right side of the editor you’ll see an AWS Resources tab. Opening this you can see all of the lambda functions in your region (you can see functions in other regions by adjusting your region preferences in the AWS preference pane).

You can import these remote functions to your local workspace just by double-clicking them. This allows you to edit, test, and debug your serverless applications all locally. You can create new applications and functions easily as well. If you click the Lambda icon in the top right of the pane you’ll be prompted to create a new lambda function and Cloud9 will automatically create a Serverless Application Model template for you as well. The IDE ships with support for the popular SAM local tool pre-installed. This is what I use in most of my local testing and serverless development. Since you have a terminal, it’s easy to install additional tools and use other serverless frameworks.

 

Launching an Environment from AWS CodeStar

With AWS CodeStar you can easily provision an end-to-end continuous delivery toolchain for development on AWS. Codestar provides a unified experience for building, testing, deploying, and managing applications using AWS CodeCommit, CodeBuild, CodePipeline, and CodeDeploy suite of services. Now, with a few simple clicks you can provision a Cloud9 environment to develop your application. Your environment will be pre-configured with the code for your CodeStar application already checked out and git credentials already configured.

You can easily share this environment with your coworkers which leads me to another extremely useful set of features.

Collaboration

One of the many things that sets AWS Cloud9 apart from other editors are the rich collaboration tools. You can invite an IAM user to your environment with a few clicks.

You can see what files they’re working on, where their cursors are, and even share a terminal. The chat features is useful as well.

Things to Know

  • There are no additional charges for this service beyond the underlying compute and storage.
  • c9.io continues to run for existing users. You can continue to use all the features of c9.io and add new team members if you have a team account. In the future, we will provide tools for easy migration of your c9.io workspaces to AWS Cloud9.
  • AWS Cloud9 is available in the US West (Oregon), US East (Ohio), US East (N.Virginia), EU (Ireland), and Asia Pacific (Singapore) regions.

I can’t wait to see what you build with AWS Cloud9!

Randall

KDE’s Goals for 2018 and Beyond (KDE.news)

Post Syndicated from corbet original https://lwn.net/Articles/740359/rss

KDE.news covers the
goals that the KDE project has set for itself
in the coming year.
In synch with KDE’s vision, Sebastian Kugler says that ‘KDE is in a
unique position to offer users a complete software environment that helps
them to protect their privacy’. Being in that position, Sebastian explains,
KDE as a FLOSS community is morally obliged to do its utmost to provide the
most privacy-protecting environment for users. This is especially true
since KDE has been developing not only for desktop devices, but also for
mobile – an area where the respect for users’ privacy is nearly
non-existent.

GDPR – A Practical Guide For Developers

Post Syndicated from Bozho original https://techblog.bozho.net/gdpr-practical-guide-developers/

You’ve probably heard about GDPR. The new European data protection regulation that applies practically to everyone. Especially if you are working in a big company, it’s most likely that there’s already a process for gettign your systems in compliance with the regulation.

The regulation is basically a law that must be followed in all European countries (but also applies to non-EU companies that have users in the EU). In this particular case, it applies to companies that are not registered in Europe, but are having European customers. So that’s most companies. I will not go into yet another “12 facts about GDPR” or “7 myths about GDPR” posts/whitepapers, as they are often aimed at managers or legal people. Instead, I’ll focus on what GDPR means for developers.

Why am I qualified to do that? A few reasons – I was advisor to the deputy prime minister of a EU country, and because of that I’ve been both exposed and myself wrote some legislation. I’m familiar with the “legalese” and how the regulatory framework operates in general. I’m also a privacy advocate and I’ve been writing about GDPR-related stuff in the past, i.e. “before it was cool” (protecting sensitive data, the right to be forgotten). And finally, I’m currently working on a project that (among other things) aims to help with covering some GDPR aspects.

I’ll try to be a bit more comprehensive this time and cover as many aspects of the regulation that concern developers as I can. And while developers will mostly be concerned about how the systems they are working on have to change, it’s not unlikely that a less informed manager storms in in late spring, realizing GDPR is going to be in force tomorrow, asking “what should we do to get our system/website compliant”.

The rights of the user/client (referred to as “data subject” in the regulation) that I think are relevant for developers are: the right to erasure (the right to be forgotten/deleted from the system), right to restriction of processing (you still keep the data, but mark it as “restricted” and don’t touch it without further consent by the user), the right to data portability (the ability to export one’s data), the right to rectification (the ability to get personal data fixed), the right to be informed (getting human-readable information, rather than long terms and conditions), the right of access (the user should be able to see all the data you have about them), the right to data portability (the user should be able to get a machine-readable dump of their data).

Additionally, the relevant basic principles are: data minimization (one should not collect more data than necessary), integrity and confidentiality (all security measures to protect data that you can think of + measures to guarantee that the data has not been inappropriately modified).

Even further, the regulation requires certain processes to be in place within an organization (of more than 250 employees or if a significant amount of data is processed), and those include keeping a record of all types of processing activities carried out, including transfers to processors (3rd parties), which includes cloud service providers. None of the other requirements of the regulation have an exception depending on the organization size, so “I’m small, GDPR does not concern me” is a myth.

It is important to know what “personal data” is. Basically, it’s every piece of data that can be used to uniquely identify a person or data that is about an already identified person. It’s data that the user has explicitly provided, but also data that you have collected about them from either 3rd parties or based on their activities on the site (what they’ve been looking at, what they’ve purchased, etc.)

Having said that, I’ll list a number of features that will have to be implemented and some hints on how to do that, followed by some do’s and don’t’s.

  • “Forget me” – you should have a method that takes a userId and deletes all personal data about that user (in case they have been collected on the basis of consent, and not due to contract enforcement or legal obligation). It is actually useful for integration tests to have that feature (to cleanup after the test), but it may be hard to implement depending on the data model. In a regular data model, deleting a record may be easy, but some foreign keys may be violated. That means you have two options – either make sure you allow nullable foreign keys (for example an order usually has a reference to the user that made it, but when the user requests his data be deleted, you can set the userId to null), or make sure you delete all related data (e.g. via cascades). This may not be desirable, e.g. if the order is used to track available quantities or for accounting purposes. It’s a bit trickier for event-sourcing data models, or in extreme cases, ones that include some sort of blcokchain/hash chain/tamper-evident data structure. With event sourcing you should be able to remove a past event and re-generate intermediate snapshots. For blockchain-like structures – be careful what you put in there and avoid putting personal data of users. There is an option to use a chameleon hash function, but that’s suboptimal. Overall, you must constantly think of how you can delete the personal data. And “our data model doesn’t allow it” isn’t an excuse.
  • Notify 3rd parties for erasure – deleting things from your system may be one thing, but you are also obligated to inform all third parties that you have pushed that data to. So if you have sent personal data to, say, Salesforce, Hubspot, twitter, or any cloud service provider, you should call an API of theirs that allows for the deletion of personal data. If you are such a provider, obviously, your “forget me” endpoint should be exposed. Calling the 3rd party APIs to remove data is not the full story, though. You also have to make sure the information does not appear in search results. Now, that’s tricky, as Google doesn’t have an API for removal, only a manual process. Fortunately, it’s only about public profile pages that are crawlable by Google (and other search engines, okay…), but you still have to take measures. Ideally, you should make the personal data page return a 404 HTTP status, so that it can be removed.
  • Restrict processing – in your admin panel where there’s a list of users, there should be a button “restrict processing”. The user settings page should also have that button. When clicked (after reading the appropriate information), it should mark the profile as restricted. That means it should no longer be visible to the backoffice staff, or publicly. You can implement that with a simple “restricted” flag in the users table and a few if-clasues here and there.
  • Export data – there should be another button – “export data”. When clicked, the user should receive all the data that you hold about them. What exactly is that data – depends on the particular usecase. Usually it’s at least the data that you delete with the “forget me” functionality, but may include additional data (e.g. the orders the user has made may not be delete, but should be included in the dump). The structure of the dump is not strictly defined, but my recommendation would be to reuse schema.org definitions as much as possible, for either JSON or XML. If the data is simple enough, a CSV/XLS export would also be fine. Sometimes data export can take a long time, so the button can trigger a background process, which would then notify the user via email when his data is ready (twitter, for example, does that already – you can request all your tweets and you get them after a while).
  • Allow users to edit their profile – this seems an obvious rule, but it isn’t always followed. Users must be able to fix all data about them, including data that you have collected from other sources (e.g. using a “login with facebook” you may have fetched their name and address). Rule of thumb – all the fields in your “users” table should be editable via the UI. Technically, rectification can be done via a manual support process, but that’s normally more expensive for a business than just having the form to do it. There is one other scenario, however, when you’ve obtained the data from other sources (i.e. the user hasn’t provided their details to you directly). In that case there should still be a page where they can identify somehow (via email and/or sms confirmation) and get access to the data about them.
  • Consent checkboxes – this is in my opinion the biggest change that the regulation brings. “I accept the terms and conditions” would no longer be sufficient to claim that the user has given their consent for processing their data. So, for each particular processing activity there should be a separate checkbox on the registration (or user profile) screen. You should keep these consent checkboxes in separate columns in the database, and let the users withdraw their consent (by unchecking these checkboxes from their profile page – see the previous point). Ideally, these checkboxes should come directly from the register of processing activities (if you keep one). Note that the checkboxes should not be preselected, as this does not count as “consent”.
  • Re-request consent – if the consent users have given was not clear (e.g. if they simply agreed to terms & conditions), you’d have to re-obtain that consent. So prepare a functionality for mass-emailing your users to ask them to go to their profile page and check all the checkboxes for the personal data processing activities that you have.
  • “See all my data” – this is very similar to the “Export” button, except data should be displayed in the regular UI of the application rather than an XML/JSON format. For example, Google Maps shows you your location history – all the places that you’ve been to. It is a good implementation of the right to access. (Though Google is very far from perfect when privacy is concerned)
  • Age checks – you should ask for the user’s age, and if the user is a child (below 16), you should ask for parent permission. There’s no clear way how to do that, but my suggestion is to introduce a flow, where the child should specify the email of a parent, who can then confirm. Obviosuly, children will just cheat with their birthdate, or provide a fake parent email, but you will most likely have done your job according to the regulation (this is one of the “wishful thinking” aspects of the regulation).

Now some “do’s”, which are mostly about the technical measures needed to protect personal data. They may be more “ops” than “dev”, but often the application also has to be extended to support them. I’ve listed most of what I could think of in a previous post.

  • Encrypt the data in transit. That means that communication between your application layer and your database (or your message queue, or whatever component you have) should be over TLS. The certificates could be self-signed (and possibly pinned), or you could have an internal CA. Different databases have different configurations, just google “X encrypted connections. Some databases need gossiping among the nodes – that should also be configured to use encryption
  • Encrypt the data at rest – this again depends on the database (some offer table-level encryption), but can also be done on machine-level. E.g. using LUKS. The private key can be stored in your infrastructure, or in some cloud service like AWS KMS.
  • Encrypt your backups – kind of obvious
  • Implement pseudonymisation – the most obvious use-case is when you want to use production data for the test/staging servers. You should change the personal data to some “pseudonym”, so that the people cannot be identified. When you push data for machine learning purposes (to third parties or not), you can also do that. Technically, that could mean that your User object can have a “pseudonymize” method which applies hash+salt/bcrypt/PBKDF2 for some of the data that can be used to identify a person
  • Protect data integrity – this is a very broad thing, and could simply mean “have authentication mechanisms for modifying data”. But you can do something more, even as simple as a checksum, or a more complicated solution (like the one I’m working on). It depends on the stakes, on the way data is accessed, on the particular system, etc. The checksum can be in the form of a hash of all the data in a given database record, which should be updated each time the record is updated through the application. It isn’t a strong guarantee, but it is at least something.
  • Have your GDPR register of processing activities in something other than Excel – Article 30 says that you should keep a record of all the types of activities that you use personal data for. That sounds like bureaucracy, but it may be useful – you will be able to link certain aspects of your application with that register (e.g. the consent checkboxes, or your audit trail records). It wouldn’t take much time to implement a simple register, but the business requirements for that should come from whoever is responsible for the GDPR compliance. But you can advise them that having it in Excel won’t make it easy for you as a developer (imagine having to fetch the excel file internally, so that you can parse it and implement a feature). Such a register could be a microservice/small application deployed separately in your infrastructure.
  • Log access to personal data – every read operation on a personal data record should be logged, so that you know who accessed what and for what purpose
  • Register all API consumers – you shouldn’t allow anonymous API access to personal data. I’d say you should request the organization name and contact person for each API user upon registration, and add those to the data processing register. Note: some have treated article 30 as a requirement to keep an audit log. I don’t think it is saying that – instead it requires 250+ companies to keep a register of the types of processing activities (i.e. what you use the data for). There are other articles in the regulation that imply that keeping an audit log is a best practice (for protecting the integrity of the data as well as to make sure it hasn’t been processed without a valid reason)

Finally, some “don’t’s”.

  • Don’t use data for purposes that the user hasn’t agreed with – that’s supposed to be the spirit of the regulation. If you want to expose a new API to a new type of clients, or you want to use the data for some machine learning, or you decide to add ads to your site based on users’ behaviour, or sell your database to a 3rd party – think twice. I would imagine your register of processing activities could have a button to send notification emails to users to ask them for permission when a new processing activity is added (or if you use a 3rd party register, it should probably give you an API). So upon adding a new processing activity (and adding that to your register), mass email all users from whom you’d like consent.
  • Don’t log personal data – getting rid of the personal data from log files (especially if they are shipped to a 3rd party service) can be tedious or even impossible. So log just identifiers if needed. And make sure old logs files are cleaned up, just in case
  • Don’t put fields on the registration/profile form that you don’t need – it’s always tempting to just throw as many fields as the usability person/designer agrees on, but unless you absolutely need the data for delivering your service, you shouldn’t collect it. Names you should probably always collect, but unless you are delivering something, a home address or phone is unnecessary.
  • Don’t assume 3rd parties are compliant – you are responsible if there’s a data breach in one of the 3rd parties (e.g. “processors”) to which you send personal data. So before you send data via an API to another service, make sure they have at least a basic level of data protection. If they don’t, raise a flag with management.
  • Don’t assume having ISO XXX makes you compliant – information security standards and even personal data standards are a good start and they will probably 70% of what the regulation requires, but they are not sufficient – most of the things listed above are not covered in any of those standards

Overall, the purpose of the regulation is to make you take conscious decisions when processing personal data. It imposes best practices in a legal way. If you follow the above advice and design your data model, storage, data flow , API calls with data protection in mind, then you shouldn’t worry about the huge fines that the regulation prescribes – they are for extreme cases, like Equifax for example. Regulators (data protection authorities) will most likely have some checklists into which you’d have to somehow fit, but if you follow best practices, that shouldn’t be an issue.

I think all of the above features can be implemented in a few weeks by a small team. Be suspicious when a big vendor offers you a generic plug-and-play “GDPR compliance” solution. GDPR is not just about the technical aspects listed above – it does have organizational/process implications. But also be suspicious if a consultant claims GDPR is complicated. It’s not – it relies on a few basic principles that are in fact best practices anyway. Just don’t ignore them.

The post GDPR – A Practical Guide For Developers appeared first on Bozho's tech blog.

Object models

Post Syndicated from Eevee original https://eev.ee/blog/2017/11/28/object-models/

Anonymous asks, with dollars:

More about programming languages!

Well then!

I’ve written before about what I think objects are: state and behavior, which in practice mostly means method calls.

I suspect that the popular impression of what objects are, and also how they should work, comes from whatever C++ and Java happen to do. From that point of view, the whole post above is probably nonsense. If the baseline notion of “object” is a rigid definition woven tightly into the design of two massively popular languages, then it doesn’t even make sense to talk about what “object” should mean — it does mean the features of those languages, and cannot possibly mean anything else.

I think that’s a shame! It piles a lot of baggage onto a fairly simple idea. Polymorphism, for example, has nothing to do with objects — it’s an escape hatch for static type systems. Inheritance isn’t the only way to reuse code between objects, but it’s the easiest and fastest one, so it’s what we get. Frankly, it’s much closer to a speed tradeoff than a fundamental part of the concept.

We could do with more experimentation around how objects work, but that’s impossible in the languages most commonly thought of as object-oriented.

Here, then, is a (very) brief run through the inner workings of objects in four very dynamic languages. I don’t think I really appreciated objects until I’d spent some time with Python, and I hope this can help someone else whet their own appetite.

Python 3

Of the four languages I’m going to touch on, Python will look the most familiar to the Java and C++ crowd. For starters, it actually has a class construct.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __neg__(self):
        return Vector(-self.x, -self.y)

    def __div__(self, denom):
        return Vector(self.x / denom, self.y / denom)

    @property
    def magnitude(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

    def normalized(self):
        return self / self.magnitude

The __init__ method is an initializer, which is like a constructor but named differently (because the object already exists in a usable form by the time the initializer is called). Operator overloading is done by implementing methods with other special __dunder__ names. Properties can be created with @property, where the @ is syntax for applying a wrapper function to a function as it’s defined. You can do inheritance, even multiply:

1
2
3
4
class Foo(A, B, C):
    def bar(self, x, y, z):
        # do some stuff
        super().bar(x, y, z)

Cool, a very traditional object model.

Except… for some details.

Some details

For one, Python objects don’t have a fixed layout. Code both inside and outside the class can add or remove whatever attributes they want from whatever object they want. The underlying storage is just a dict, Python’s mapping type. (Or, rather, something like one. Also, it’s possible to change, which will probably be the case for everything I say here.)

If you create some attributes at the class level, you’ll start to get a peek behind the curtains:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
class Foo:
    values = []

    def add_value(self, value):
        self.values.append(value)

a = Foo()
b = Foo()
a.add_value('a')
print(a.values)  # ['a']
b.add_value('b')
print(b.values)  # ['a', 'b']

The [] assigned to values isn’t a default assigned to each object. In fact, the individual objects don’t know about it at all! You can use vars(a) to get at the underlying storage dict, and you won’t see a values entry in there anywhere.

Instead, values lives on the class, which is a value (and thus an object) in its own right. When Python is asked for self.values, it checks to see if self has a values attribute; in this case, it doesn’t, so Python keeps going and asks the class for one.

Python’s object model is secretly prototypical — a class acts as a prototype, as a shared set of fallback values, for its objects.

In fact, this is also how method calls work! They aren’t syntactically special at all, which you can see by separating the attribute lookup from the call.

1
2
3
print("abc".startswith("a"))  # True
meth = "abc".startswith
print(meth("a"))  # True

Reading obj.method looks for a method attribute; if there isn’t one on obj, Python checks the class. Here, it finds one: it’s a function from the class body.

Ah, but wait! In the code I just showed, meth seems to “know” the object it came from, so it can’t just be a plain function. If you inspect the resulting value, it claims to be a “bound method” or “built-in method” rather than a function, too. Something funny is going on here, and that funny something is the descriptor protocol.

Descriptors

Python allows attributes to implement their own custom behavior when read from or written to. Such an attribute is called a descriptor. I’ve written about them before, but here’s a quick overview.

If Python looks up an attribute, finds it in a class, and the value it gets has a __get__ method… then instead of using that value, Python will use the return value of its __get__ method.

The @property decorator works this way. The magnitude property in my original example was shorthand for doing this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
class MagnitudeDescriptor:
    def __get__(self, instance, owner):
        if instance is None:
            return self
        return (instance.x ** 2 + instance.y ** 2) ** 0.5

class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    magnitude = MagnitudeDescriptor()

When you ask for somevec.magnitude, Python checks somevec but doesn’t find magnitude, so it consults the class instead. The class does have a magnitude, and it’s a value with a __get__ method, so Python calls that method and somevec.magnitude evaluates to its return value. (The instance is None check is because __get__ is called even if you get the descriptor directly from the class via Vector.magnitude. A descriptor intended to work on instances can’t do anything useful in that case, so the convention is to return the descriptor itself.)

You can also intercept attempts to write to or delete an attribute, and do absolutely whatever you want instead. But note that, similar to operating overloading in Python, the descriptor must be on a class; you can’t just slap one on an arbitrary object and have it work.

This brings me right around to how “bound methods” actually work. Functions are descriptors! The function type implements __get__, and when a function is retrieved from a class via an instance, that __get__ bundles the function and the instance together into a tiny bound method object. It’s essentially:

1
2
3
4
5
class FunctionType:
    def __get__(self, instance, owner):
        if instance is None:
            return self
        return functools.partial(self, instance)

The self passed as the first argument to methods is not special or magical in any way. It’s built out of a few simple pieces that are also readily accessible to Python code.

Note also that because obj.method() is just an attribute lookup and a call, Python doesn’t actually care whether method is a method on the class or just some callable thing on the object. You won’t get the auto-self behavior if it’s on the object, but otherwise there’s no difference.

More attribute access, and the interesting part

Descriptors are one of several ways to customize attribute access. Classes can implement __getattr__ to intervene when an attribute isn’t found on an object; __setattr__ and __delattr__ to intervene when any attribute is set or deleted; and __getattribute__ to implement unconditional attribute access. (That last one is a fantastic way to create accidental recursion, since any attribute access you do within __getattribute__ will of course call __getattribute__ again.)

Here’s what I really love about Python. It might seem like a magical special case that descriptors only work on classes, but it really isn’t. You could implement exactly the same behavior yourself, in pure Python, using only the things I’ve just told you about. Classes are themselves objects, remember, and they are instances of type, so the reason descriptors only work on classes is that type effectively does this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
class type:
    def __getattribute__(self, name):
        value = super().__getattribute__(name)
        # like all op overloads, __get__ must be on the type, not the instance
        ty = type(value)
        if hasattr(ty, '__get__'):
            # it's a descriptor!  this is a class access so there is no instance
            return ty.__get__(value, None, self)
        else:
            return value

You can even trivially prove to yourself that this is what’s going on by skipping over types behavior:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
class Descriptor:
    def __get__(self, instance, owner):
        print('called!')

class Foo:
    bar = Descriptor()

Foo.bar  # called!
type.__getattribute__(Foo, 'bar')  # called!
object.__getattribute__(Foo, 'bar')  # ...

And that’s not all! The mysterious super function, used to exhaustively traverse superclass method calls even in the face of diamond inheritance, can also be expressed in pure Python using these primitives. You could write your own superclass calling convention and use it exactly the same way as super.

This is one of the things I really like about Python. Very little of it is truly magical; virtually everything about the object model exists in the types rather than the language, which means virtually everything can be customized in pure Python.

Class creation and metaclasses

A very brief word on all of this stuff, since I could talk forever about Python and I have three other languages to get to.

The class block itself is fairly interesting. It looks like this:

1
2
class Name(*bases, **kwargs):
    # code

I’ve said several times that classes are objects, and in fact the class block is one big pile of syntactic sugar for calling type(...) with some arguments to create a new type object.

The Python documentation has a remarkably detailed description of this process, but the gist is:

  • Python determines the type of the new class — the metaclass — by looking for a metaclass keyword argument. If there isn’t one, Python uses the “lowest” type among the provided base classes. (If you’re not doing anything special, that’ll just be type, since every class inherits from object and object is an instance of type.)

  • Python executes the class body. It gets its own local scope, and any assignments or method definitions go into that scope.

  • Python now calls type(name, bases, attrs, **kwargs). The name is whatever was right after class; the bases are position arguments; and attrs is the class body’s local scope. (This is how methods and other class attributes end up on the class.) The brand new type is then assigned to Name.

Of course, you can mess with most of this. You can implement __prepare__ on a metaclass, for example, to use a custom mapping as storage for the local scope — including any reads, which allows for some interesting shenanigans. The only part you can’t really implement in pure Python is the scoping bit, which has a couple extra rules that make sense for classes. (In particular, functions defined within a class block don’t close over the class body; that would be nonsense.)

Object creation

Finally, there’s what actually happens when you create an object — including a class, which remember is just an invocation of type(...).

Calling Foo(...) is implemented as, well, a call. Any type can implement calls with the __call__ special method, and you’ll find that type itself does so. It looks something like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
# oh, a fun wrinkle that's hard to express in pure python: type is a class, so
# it's an instance of itself
class type:
    def __call__(self, *args, **kwargs):
        # remember, here 'self' is a CLASS, an instance of type.
        # __new__ is a true constructor: object.__new__ allocates storage
        # for a new blank object
        instance = self.__new__(self, *args, **kwargs)
        # you can return whatever you want from __new__ (!), and __init__
        # is only called on it if it's of the right type
        if isinstance(instance, self):
            instance.__init__(*args, **kwargs)
        return instance

Again, you can trivially confirm this by asking any type for its __call__ method. Assuming that type doesn’t implement __call__ itself, you’ll get back a bound version of types implementation.

1
2
>>> list.__call__
<method-wrapper '__call__' of type object at 0x7fafb831a400>

You can thus implement __call__ in your own metaclass to completely change how subclasses are created — including skipping the creation altogether, if you like.

And… there’s a bunch of stuff I haven’t even touched on.

The Python philosophy

Python offers something that, on the surface, looks like a “traditional” class/object model. Under the hood, it acts more like a prototypical system, where failed attribute lookups simply defer to a superclass or metaclass.

The language also goes to almost superhuman lengths to expose all of its moving parts. Even the prototypical behavior is an implementation of __getattribute__ somewhere, which you are free to completely replace in your own types. Proxying and delegation are easy.

Also very nice is that these features “bundle” well, by which I mean a library author can do all manner of convoluted hijinks, and a consumer of that library doesn’t have to see any of it or understand how it works. You only need to inherit from a particular class (which has a metaclass), or use some descriptor as a decorator, or even learn any new syntax.

This meshes well with Python culture, which is pretty big on the principle of least surprise. These super-advanced features tend to be tightly confined to single simple features (like “makes a weak attribute“) or cordoned with DSLs (e.g., defining a form/struct/database table with a class body). In particular, I’ve never seen a metaclass in the wild implement its own __call__.

I have mixed feelings about that. It’s probably a good thing overall that the Python world shows such restraint, but I wonder if there are some very interesting possibilities we’re missing out on. I implemented a metaclass __call__ myself, just once, in an entity/component system that strove to minimize fuss when communicating between components. It never saw the light of day, but I enjoyed seeing some new things Python could do with the same relatively simple syntax. I wouldn’t mind seeing, say, an object model based on composition (with no inheritance) built atop Python’s primitives.

Lua

Lua doesn’t have an object model. Instead, it gives you a handful of very small primitives for building your own object model. This is pretty typical of Lua — it’s a very powerful language, but has been carefully constructed to be very small at the same time. I’ve never encountered anything else quite like it, and “but it starts indexing at 1!” really doesn’t do it justice.

The best way to demonstrate how objects work in Lua is to build some from scratch. We need two key features. The first is metatables, which bear a passing resemblance to Python’s metaclasses.

Tables and metatables

The table is Lua’s mapping type and its primary data structure. Keys can be any value other than nil. Lists are implemented as tables whose keys are consecutive integers starting from 1. Nothing terribly surprising. The dot operator is sugar for indexing with a string key.

1
2
3
4
5
local t = { a = 1, b = 2 }
print(t['a'])  -- 1
print(t.b)  -- 2
t.c = 3
print(t['c'])  -- 3

A metatable is a table that can be associated with another value (usually another table) to change its behavior. For example, operator overloading is implemented by assigning a function to a special key in a metatable.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
local t = { a = 1, b = 2 }
--print(t + 0)  -- error: attempt to perform arithmetic on a table value

local mt = {
    __add = function(left, right)
        return 12
    end,
}
setmetatable(t, mt)
print(t + 0)  -- 12

Now, the interesting part: one of the special keys is __index, which is consulted when the base table is indexed by a key it doesn’t contain. Here’s a table that claims every key maps to itself.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
local t = {}
local mt = {
    __index = function(table, key)
        return key
    end,
}
setmetatable(t, mt)
print(t.foo)  -- foo
print(t.bar)  -- bar
print(t[3])  -- 3

__index doesn’t have to be a function, either. It can be yet another table, in which case that table is simply indexed with the key. If the key still doesn’t exist and that table has a metatable with an __index, the process repeats.

With this, it’s easy to have several unrelated tables that act as a single table. Call the base table an object, fill the __index table with functions and call it a class, and you have half of an object system. You can even get prototypical inheritance by chaining __indexes together.

At this point things are a little confusing, since we have at least three tables going on, so here’s a diagram. Keep in mind that Lua doesn’t actually have anything called an “object”, “class”, or “method” — those are just convenient nicknames for a particular structure we might build with Lua’s primitives.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
                    ╔═══════════╗        ...
                    ║ metatable ║         ║
                    ╟───────────╢   ┌─────╨───────────────────────┐
                    ║ __index   ╫───┤ lookup table ("superclass") │
                    ╚═══╦═══════╝   ├─────────────────────────────┤
  ╔═══════════╗         ║           │ some other method           ┼─── function() ... end
  ║ metatable ║         ║           └─────────────────────────────┘
  ╟───────────╢   ┌─────╨──────────────────┐
  ║ __index   ╫───┤ lookup table ("class") │
  ╚═══╦═══════╝   ├────────────────────────┤
      ║           │ some method            ┼─── function() ... end
      ║           └────────────────────────┘
┌─────╨─────────────────┐
│ base table ("object") │
└───────────────────────┘

Note that a metatable is not the same as a class; it defines behavior, not methods. Conversely, if you try to use a class directly as a metatable, it will probably not do much. (This is pretty different from e.g. Python, where operator overloads are just methods with funny names. One nice thing about the Lua approach is that you can keep interface-like functionality separate from methods, and avoid clogging up arbitrary objects’ namespaces. You could even use a dummy table as a key and completely avoid name collisions.)

Anyway, code!

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
local class = {
    foo = function(a)
        print("foo got", a)
    end,
}
local mt = { __index = class }
-- setmetatable returns its first argument, so this is nice shorthand
local obj1 = setmetatable({}, mt)
local obj2 = setmetatable({}, mt)
obj1.foo(7)  -- foo got 7
obj2.foo(9)  -- foo got 9

Wait, wait, hang on. Didn’t I call these methods? How do they get at the object? Maybe Lua has a magical this variable?

Methods, sort of

Not quite, but this is where the other key feature comes in: method-call syntax. It’s the lightest touch of sugar, just enough to have method invocation.

1
2
3
4
5
6
7
8
9
-- note the colon!
a:b(c, d, ...)

-- exactly equivalent to this
-- (except that `a` is only evaluated once)
a.b(a, c, d, ...)

-- which of course is really this
a["b"](a, c, d, ...)

Now we can write methods that actually do something.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
local class = {
    bar = function(self)
        print("our score is", self.score)
    end,
}
local mt = { __index = class }
local obj1 = setmetatable({ score = 13 }, mt)
local obj2 = setmetatable({ score = 25 }, mt)
obj1:bar()  -- our score is 13
obj2:bar()  -- our score is 25

And that’s all you need. Much like Python, methods and data live in the same namespace, and Lua doesn’t care whether obj:method() finds a function on obj or gets one from the metatable’s __index. Unlike Python, the function will be passed self either way, because self comes from the use of : rather than from the lookup behavior.

(Aside: strictly speaking, any Lua value can have a metatable — and if you try to index a non-table, Lua will always consult the metatable’s __index. Strings all have the string library as a metatable, so you can call methods on them: try ("%s %s"):format(1, 2). I don’t think Lua lets user code set the metatable for non-tables, so this isn’t that interesting, but if you’re writing Lua bindings from C then you can wrap your pointers in metatables to give them methods implemented in C.)

Bringing it all together

Of course, writing all this stuff every time is a little tedious and error-prone, so instead you might want to wrap it all up inside a little function. No problem.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
local function make_object(body)
    -- create a metatable
    local mt = { __index = body }
    -- create a base table to serve as the object itself
    local obj = setmetatable({}, mt)
    -- and, done
    return obj
end

-- you can leave off parens if you're only passing in 
local Dog = {
    -- this acts as a "default" value; if obj.barks is missing, __index will
    -- kick in and find this value on the class.  but if obj.barks is assigned
    -- to, it'll go in the object and shadow the value here.
    barks = 0,

    bark = function(self)
        self.barks = self.barks + 1
        print("woof!")
    end,
}

local mydog = make_object(Dog)
mydog:bark()  -- woof!
mydog:bark()  -- woof!
mydog:bark()  -- woof!
print(mydog.barks)  -- 3
print(Dog.barks)  -- 0

It works, but it’s fairly barebones. The nice thing is that you can extend it pretty much however you want. I won’t reproduce an entire serious object system here — lord knows there are enough of them floating around — but the implementation I have for my LÖVE games lets me do this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
local Animal = Object:extend{
    cries = 0,
}

-- called automatically by Object
function Animal:init()
    print("whoops i couldn't think of anything interesting to put here")
end

-- this is just nice syntax for adding a first argument called 'self', then
-- assigning this function to Animal.cry
function Animal:cry()
    self.cries = self.cries + 1
end

local Cat = Animal:extend{}

function Cat:cry()
    print("meow!")
    Cat.__super.cry(self)
end

local cat = Cat()
cat:cry()  -- meow!
cat:cry()  -- meow!
print(cat.cries)  -- 2

When I say you can extend it however you want, I mean that. I could’ve implemented Python (2)-style super(Cat, self):cry() syntax; I just never got around to it. I could even make it work with multiple inheritance if I really wanted to — or I could go the complete opposite direction and only implement composition. I could implement descriptors, customizing the behavior of individual table keys. I could add pretty decent syntax for composition/proxying. I am trying very hard to end this section now.

The Lua philosophy

Lua’s philosophy is to… not have a philosophy? It gives you the bare minimum to make objects work, and you can do absolutely whatever you want from there. Lua does have something resembling prototypical inheritance, but it’s not so much a first-class feature as an emergent property of some very simple tools. And since you can make __index be a function, you could avoid the prototypical behavior and do something different entirely.

The very severe downside, of course, is that you have to find or build your own object system — which can get pretty confusing very quickly, what with the multiple small moving parts. Third-party code may also have its own object system with subtly different behavior. (Though, in my experience, third-party code tries very hard to avoid needing an object system at all.)

It’s hard to say what the Lua “culture” is like, since Lua is an embedded language that’s often a little different in each environment. I imagine it has a thousand millicultures, instead. I can say that the tedium of building my own object model has led me into something very “traditional”, with prototypical inheritance and whatnot. It’s partly what I’m used to, but it’s also just really dang easy to get working.

Likewise, while I love properties in Python and use them all the dang time, I’ve yet to use a single one in Lua. They wouldn’t be particularly hard to add to my object model, but having to add them myself (or shop around for an object model with them and also port all my code to use it) adds a huge amount of friction. I’ve thought about designing an interesting ECS with custom object behavior, too, but… is it really worth the effort? For all the power and flexibility Lua offers, the cost is that by the time I have something working at all, I’m too exhausted to actually use any of it.

JavaScript

JavaScript is notable for being preposterously heavily used, yet not having a class block.

Well. Okay. Yes. It has one now. It didn’t for a very long time, and even the one it has now is sugar.

Here’s a vector class again:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
class Vector {
    constructor(x, y) {
        this.x = x;
        this.y = y;
    }

    get magnitude() {
        return Math.sqrt(this.x * this.x + this.y * this.y);
    }

    dot(other) {
        return this.x * other.x + this.y * other.y;
    }
}

In “classic” JavaScript, this would be written as:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
function Vector(x, y) {
    this.x = x;
    this.y = y;
}

Object.defineProperty(Vector.prototype, 'magnitude', {
    configurable: true,
    enumerable: true,
    get: function() {
        return Math.sqrt(this.x * this.x + this.y * this.y);
    },
});


Vector.prototype.dot = function(other) {
    return this.x * other.x + this.y * other.y;
};

Hm, yes. I can see why they added class.

The JavaScript model

In JavaScript, a new type is defined in terms of a function, which is its constructor.

Right away we get into trouble here. There is a very big difference between these two invocations, which I actually completely forgot about just now after spending four hours writing about Python and Lua:

1
2
let vec = Vector(3, 4);
let vec = new Vector(3, 4);

The first calls the function Vector. It assigns some properties to this, which here is going to be window, so now you have a global x and y. It then returns nothing, so vec is undefined.

The second calls Vector with this set to a new empty object, then evaluates to that object. The result is what you’d actually expect.

(You can detect this situation with the strange new.target expression, but I have never once remembered to do so.)

From here, we have true, honest-to-god, first-class prototypical inheritance. The word “prototype” is even right there. When you write this:

1
vec.dot(vec2)

JavaScript will look for dot on vec and (presumably) not find it. It then consults vecs prototype, an object you can see for yourself by using Object.getPrototypeOf(). Since vec is a Vector, its prototype is Vector.prototype.

I stress that Vector.prototype is not the prototype for Vector. It’s the prototype for instances of Vector.

(I say “instance”, but the true type of vec here is still just object. If you want to find Vector, it’s automatically assigned to the constructor property of its own prototype, so it’s available as vec.constructor.)

Of course, Vector.prototype can itself have a prototype, in which case the process would continue if dot were not found. A common (and, arguably, very bad) way to simulate single inheritance is to set Class.prototype to an instance of a superclass to get the prototype right, then tack on the methods for Class. Nowadays we can do Object.create(Superclass.prototype).

Now that I’ve been through Python and Lua, though, this isn’t particularly surprising. I kinda spoiled it.

I suppose one difference in JavaScript is that you can tack arbitrary attributes directly onto Vector all you like, and they will remain invisible to instances since they aren’t in the prototype chain. This is kind of backwards from Lua, where you can squirrel stuff away in the metatable.

Another difference is that every single object in JavaScript has a bunch of properties already tacked on — the ones in Object.prototype. Every object (and by “object” I mean any mapping) has a prototype, and that prototype defaults to Object.prototype, and it has a bunch of ancient junk like isPrototypeOf.

(Nit: it’s possible to explicitly create an object with no prototype via Object.create(null).)

Like Lua, and unlike Python, JavaScript doesn’t distinguish between keys found on an object and keys found via a prototype. Properties can be defined on prototypes with Object.defineProperty(), but that works just as well directly on an object, too. JavaScript doesn’t have a lot of operator overloading, but some things like Symbol.iterator also work on both objects and prototypes.

About this

You may, at this point, be wondering what this is. Unlike Lua and Python (and the last language below), this is a special built-in value — a context value, invisibly passed for every function call.

It’s determined by where the function came from. If the function was the result of an attribute lookup, then this is set to the object containing that attribute. Otherwise, this is set to the global object, window. (You can also set this to whatever you want via the call method on functions.)

This decision is made lexically, i.e. from the literal source code as written. There are no Python-style bound methods. In other words:

1
2
3
4
5
// this = obj
obj.method()
// this = window
let meth = obj.method
meth()

Also, because this is reassigned on every function call, it cannot be meaningfully closed over, which makes using closures within methods incredibly annoying. The old approach was to assign this to some other regular name like self (which got syntax highlighting since it’s also a built-in name in browsers); then we got Function.bind, which produced a callable thing with a fixed context value, which was kind of nice; and now finally we have arrow functions, which explicitly close over the current this when they’re defined and don’t change it when called. Phew.

Class syntax

I already showed class syntax, and it’s really just one big macro for doing all the prototype stuff The Right Way. It even prevents you from calling the type without new. The underlying model is exactly the same, and you can inspect all the parts.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
class Vector { ... }

console.log(Vector.prototype);  // { dot: ..., magnitude: ..., ... }
let vec = new Vector(3, 4);
console.log(Object.getPrototypeOf(vec));  // same as Vector.prototype

// i don't know why you would subclass vector but let's roll with it
class Vectest extends Vector { ... }

console.log(Vectest.prototype);  // { ... }
console.log(Object.getPrototypeOf(Vectest.prototype))  // same as Vector.prototype

Alas, class syntax has a couple shortcomings. You can’t use the class block to assign arbitrary data to either the type object or the prototype — apparently it was deemed too confusing that mutations would be shared among instances. Which… is… how prototypes work. How Python works. How JavaScript itself, one of the most popular languages of all time, has worked for twenty-two years. Argh.

You can still do whatever assignment you want outside of the class block, of course. It’s just a little ugly, and not something I’d think to look for with a sugary class.

A more subtle result of this behavior is that a class block isn’t quite the same syntax as an object literal. The check for data isn’t a runtime thing; class Foo { x: 3 } fails to parse. So JavaScript now has two largely but not entirely identical styles of key/value block.

Attribute access

Here’s where things start to come apart at the seams, just a little bit.

JavaScript doesn’t really have an attribute protocol. Instead, it has two… extension points, I suppose.

One is Object.defineProperty, seen above. For common cases, there’s also the get syntax inside a property literal, which does the same thing. But unlike Python’s @property, these aren’t wrappers around some simple primitives; they are the primitives. JavaScript is the only language of these four to have “property that runs code on access” as a completely separate first-class concept.

If you want to intercept arbitrary attribute access (and some kinds of operators), there’s a completely different primitive: the Proxy type. It doesn’t let you intercept attribute access or operators; instead, it produces a wrapper object that supports interception and defers to the wrapped object by default.

It’s cool to see composition used in this way, but also, extremely weird. If you want to make your own type that overloads in or calling, you have to return a Proxy that wraps your own type, rather than actually returning your own type. And (unlike the other three languages in this post) you can’t return a different type from a constructor, so you have to throw that away and produce objects only from a factory. And instanceof would be broken, but you can at least fix that with Symbol.hasInstance — which is really operator overloading, implement yet another completely different way.

I know the design here is a result of legacy and speed — if any object could intercept all attribute access, then all attribute access would be slowed down everywhere. Fair enough. It still leaves the surface area of the language a bit… bumpy?

The JavaScript philosophy

It’s a little hard to tell. The original idea of prototypes was interesting, but it was hidden behind some very awkward syntax. Since then, we’ve gotten a bunch of extra features awkwardly bolted on to reflect the wildly varied things the built-in types and DOM API were already doing. We have class syntax, but it’s been explicitly designed to avoid exposing the prototype parts of the model.

I admit I don’t do a lot of heavy JavaScript, so I might just be overlooking it, but I’ve seen virtually no code that makes use of any of the recent advances in object capabilities. Forget about custom iterators or overloading call; I can’t remember seeing any JavaScript in the wild that even uses properties yet. I don’t know if everyone’s waiting for sufficient browser support, nobody knows about them, or nobody cares.

The model has advanced recently, but I suspect JavaScript is still shackled to its legacy of “something about prototypes, I don’t really get it, just copy the other code that’s there” as an object model. Alas! Prototypes are so good. Hopefully class syntax will make it a bit more accessible, as it has in Python.

Perl 5

Perl 5 also doesn’t have an object system and expects you to build your own. But where Lua gives you two simple, powerful tools for building one, Perl 5 feels more like a puzzle with half the pieces missing. Clearly they were going for something, but they only gave you half of it.

In brief, a Perl object is a reference that has been blessed with a package.

I need to explain a few things. Honestly, one of the biggest problems with the original Perl object setup was how many strange corners and unique jargon you had to understand just to get off the ground.

(If you want to try running any of this code, you should stick a use v5.26; as the first line. Perl is very big on backwards compatibility, so you need to opt into breaking changes, and even the mundane say builtin is behind a feature gate.)

References

A reference in Perl is sort of like a pointer, but its main use is very different. See, Perl has the strange property that its data structures try very hard to spill their contents all over the place. Despite having dedicated syntax for arrays — @foo is an array variable, distinct from the single scalar variable $foo — it’s actually impossible to nest arrays.

1
2
3
my @foo = (1, 2, 3, 4);
my @bar = (@foo, @foo);
# @bar is now a flat list of eight items: 1, 2, 3, 4, 1, 2, 3, 4

The idea, I guess, is that an array is not one thing. It’s not a container, which happens to hold multiple things; it is multiple things. Anywhere that expects a single value, such as an array element, cannot contain an array, because an array fundamentally is not a single value.

And so we have “references”, which are a form of indirection, but also have the nice property that they’re single values. They add containment around arrays, and in general they make working with most of Perl’s primitive types much more sensible. A reference to a variable can be taken with the \ operator, or you can use [ ... ] and { ... } to directly create references to anonymous arrays or hashes.

1
2
3
my @foo = (1, 2, 3, 4);
my @bar = (\@foo, \@foo);
# @bar is now a nested list of two items: [1, 2, 3, 4], [1, 2, 3, 4]

(Incidentally, this is the sole reason I initially abandoned Perl for Python. Non-trivial software kinda requires nesting a lot of data structures, so you end up with references everywhere, and the syntax for going back and forth between a reference and its contents is tedious and ugly.)

A Perl object must be a reference. Perl doesn’t care what kind of reference — it’s usually a hash reference, since hashes are a convenient place to store arbitrary properties, but it could just as well be a reference to an array, a scalar, or even a sub (i.e. function) or filehandle.

I’m getting a little ahead of myself. First, the other half: blessing and packages.

Packages and blessing

Perl packages are just namespaces. A package looks like this:

1
2
3
4
5
6
7
package Foo::Bar;

sub quux {
    say "hi from quux!";
}

# now Foo::Bar::quux() can be called from anywhere

Nothing shocking, right? It’s just a named container. A lot of the details are kind of weird, like how a package exists in some liminal quasi-value space, but the basic idea is a Bag Of Stuff.

The final piece is “blessing,” which is Perl’s funny name for binding a package to a reference. A very basic class might look like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
package Vector;

# the name 'new' is convention, not special
sub new {
    # perl argument passing is weird, don't ask
    my ($class, $x, $y) = @_;

    # create the object itself -- here, unusually, an array reference makes sense
    my $self = [ $x, $y ];

    # associate the package with that reference
    # note that $class here is just the regular string, 'Vector'
    bless $self, $class;

    return $self;
}

sub x {
    my ($self) = @_;
    return $self->[0];
}

sub y {
    my ($self) = @_;
    return $self->[1];
}

sub magnitude {
    my ($self) = @_;
    return sqrt($self->x ** 2 + $self->y ** 2);
}

# switch back to the "default" package
package main;

# -> is method call syntax, which passes the invocant as the first argument;
# for a package, that's just the package name
my $vec = Vector->new(3, 4);
say $vec->magnitude;  # 5

A few things of note here. First, $self->[0] has nothing to do with objects; it’s normal syntax for getting the value of a index 0 out of an array reference called $self. (Most classes are based on hashrefs and would use $self->{value} instead.) A blessed reference is still a reference and can be treated like one.

In general, -> is Perl’s dereferencey operator, but its exact behavior depends on what follows. If it’s followed by brackets, then it’ll apply the brackets to the thing in the reference: ->{} to index a hash reference, ->[] to index an array reference, and ->() to call a function reference.

But if -> is followed by an identifier, then it’s a method call. For packages, that means calling a function in the package and passing the package name as the first argument. For objects — blessed references — that means calling a function in the associated package and passing the object as the first argument.

This is a little weird! A blessed reference is a superposition of two things: its normal reference behavior, and some completely orthogonal object behavior. Also, object behavior has no notion of methods vs data; it only knows about methods. Perl lets you omit parentheses in a lot of places, including when calling a method with no arguments, so $vec->magnitude is really $vec->magnitude().

Perl’s blessing bears some similarities to Lua’s metatables, but ultimately Perl is much closer to Ruby’s “message passing” approach than the above three languages’ approaches of “get me something and maybe it’ll be callable”. (But this is no surprise — Ruby is a spiritual successor to Perl 5.)

All of this leads to one little wrinkle: how do you actually expose data? Above, I had to write x and y methods. Am I supposed to do that for every single attribute on my type?

Yes! But don’t worry, there are third-party modules to help with this incredibly fundamental task. Take Class::Accessor::Fast, so named because it’s faster than Class::Accessor:

1
2
3
package Foo;
use base qw(Class::Accessor::Fast);
__PACKAGE__->mk_accessors(qw(fred wilma barney));

(__PACKAGE__ is the lexical name of the current package; qw(...) is a list literal that splits its contents on whitespace.)

This assumes you’re using a hashref with keys of the same names as the attributes. $obj->fred will return the fred key from your hashref, and $obj->fred(4) will change it to 4.

You also, somewhat bizarrely, have to inherit from Class::Accessor::Fast. Speaking of which,

Inheritance

Inheritance is done by populating the package-global @ISA array with some number of (string) names of parent packages. Most code instead opts to write use base ...;, which does the same thing. Or, more commonly, use parent ...;, which… also… does the same thing.

Every package implicitly inherits from UNIVERSAL, which can be freely modified by Perl code.

A method can call its superclass method with the SUPER:: pseudo-package:

1
2
3
4
sub foo {
    my ($self) = @_;
    $self->SUPER::foo;
}

However, this does a depth-first search, which means it almost certainly does the wrong thing when faced with multiple inheritance. For a while the accepted solution involved a third-party module, but Perl eventually grew an alternative you have to opt into: C3, which may be more familiar to you as the order Python uses.

1
2
3
4
5
6
use mro 'c3';

sub foo {
    my ($self) = @_;
    $self->next::method;
}

Offhand, I’m not actually sure how next::method works, seeing as it was originally implemented in pure Perl code. I suspect it involves peeking at the caller’s stack frame. If so, then this is a very different style of customizability from e.g. Python — the MRO was never intended to be pluggable, and the use of a special pseudo-package means it isn’t really, but someone was determined enough to make it happen anyway.

Operator overloading and whatnot

Operator overloading looks a little weird, though really it’s pretty standard Perl.

1
2
3
4
5
6
7
8
package MyClass;

use overload '+' => \&_add;

sub _add {
    my ($self, $other, $swap) = @_;
    ...
}

use overload here is a pragma, where “pragma” means “regular-ass module that does some wizardry when imported”.

\&_add is how you get a reference to the _add sub so you can pass it to the overload module. If you just said &_add or _add, that would call it.

And that’s it; you just pass a map of operators to functions to this built-in module. No worry about name clashes or pollution, which is pretty nice. You don’t even have to give references to functions that live in the package, if you don’t want them to clog your namespace; you could put them in another package, or even inline them anonymously.

One especially interesting thing is that Perl lets you overload every operator. Perl has a lot of operators. It considers some math builtins like sqrt and trig functions to be operators, or at least operator-y enough that you can overload them. You can also overload the “file text” operators, such as -e $path to test whether a file exists. You can overload conversions, including implicit conversion to a regex. And most fascinating to me, you can overload dereferencing — that is, the thing Perl does when you say $hashref->{key} to get at the underlying hash. So a single object could pretend to be references of multiple different types, including a subref to implement callability. Neat.

Somewhat related: you can overload basic operators (indexing, etc.) on basic types (not references!) with the tie function, which is designed completely differently and looks for methods with fixed names. Go figure.

You can intercept calls to nonexistent methods by implementing a function called AUTOLOAD, within which the $AUTOLOAD global will contain the name of the method being called. Originally this feature was, I think, intended for loading binary components or large libraries on-the-fly only when needed, hence the name. Offhand I’m not sure I ever saw it used the way __getattr__ is used in Python.

Is there a way to intercept all method calls? I don’t think so, but it is Perl, so I must be forgetting something.

Actually no one does this any more

Like a decade ago, a council of elder sages sat down and put together a whole whizbang system that covers all of it: Moose.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
package Vector;
use Moose;

has x => (is => 'rw', isa => 'Int');
has y => (is => 'rw', isa => 'Int');

sub magnitude {
    my ($self) = @_;
    return sqrt($self->x ** 2 + $self->y ** 2);
}

Moose has its own way to do pretty much everything, and it’s all built on the same primitives. Moose also adds metaclasses, somehow, despite that the underlying model doesn’t actually support them? I’m not entirely sure how they managed that, but I do remember doing some class introspection with Moose and it was much nicer than the built-in way.

(If you’re wondering, the built-in way begins with looking at the hash called %Vector::. No, that’s not a typo.)

I really cannot stress enough just how much stuff Moose does, but I don’t want to delve into it here since Moose itself is not actually the language model.

The Perl philosophy

I hope you can see what I meant with what I first said about Perl, now. It has multiple inheritance with an MRO, but uses the wrong one by default. It has extensive operator overloading, which looks nothing like how inheritance works, and also some of it uses a totally different mechanism with special method names instead. It only understands methods, not data, leaving you to figure out accessors by hand.

There’s 70% of an object system here with a clear general design it was gunning for, but none of the pieces really look anything like each other. It’s weird, in a distinctly Perl way.

The result is certainly flexible, at least! It’s especially cool that you can use whatever kind of reference you want for storage, though even as I say that, I acknowledge it’s no different from simply subclassing list or something in Python. It feels different in Perl, but maybe only because it looks so different.

I haven’t written much Perl in a long time, so I don’t know what the community is like any more. Moose was already ubiquitous when I left, which you’d think would let me say “the community mostly focuses on the stuff Moose can do” — but even a decade ago, Moose could already do far more than I had ever seen done by hand in Perl. It’s always made a big deal out of roles (read: interfaces), for instance, despite that I’d never seen anyone care about them in Perl before Moose came along. Maybe their presence in Moose has made them more popular? Who knows.

Also, I wrote Perl seriously, but in the intervening years I’ve only encountered people who only ever used Perl for one-offs. Maybe it’ll come as a surprise to a lot of readers that Perl has an object model at all.

End

Well, that was fun! I hope any of that made sense.

Special mention goes to Rust, which doesn’t have an object model you can fiddle with at runtime, but does do things a little differently.

It’s been really interesting thinking about how tiny differences make a huge impact on what people do in practice. Take the choice of storage in Perl versus Python. Perl’s massively common URI class uses a string as the storage, nothing else; I haven’t seen anything like that in Python aside from markupsafe, which is specifically designed as a string type. I would guess this is partly because Perl makes you choose — using a hashref is an obvious default, but you have to make that choice one way or the other. In Python (especially 3), inheriting from object and getting dict-based storage is the obvious thing to do; the ability to use another type isn’t quite so obvious, and doing it “right” involves a tiny bit of extra work.

Or, consider that Lua could have descriptors, but the extra bit of work (especially design work) has been enough of an impediment that I’ve never implemented them. I don’t think the object implementations I’ve looked at have included them, either. Super weird!

In that light, it’s only natural that objects would be so strongly associated with the features Java and C++ attach to them. I think that makes it all the more important to play around! Look at what Moose has done. No, really, you should bear in mind my description of how Perl does stuff and flip through the Moose documentation. It’s amazing what they’ve built.

Amazon EC2 Bare Metal Instances with Direct Access to Hardware

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-amazon-ec2-bare-metal-instances-with-direct-access-to-hardware/

When customers come to us with new and unique requirements for AWS, we listen closely, ask lots of questions, and do our best to understand and address their needs. When we do this, we make the resulting service or feature generally available; we do not build one-offs or “snowflakes” for individual customers. That model is messy and hard to scale and is not the way we work.

Instead, every AWS customer has access to whatever it is that we build, and everyone benefits. VMware Cloud on AWS is a good example of this strategy in action. They told us that they wanted to run their virtualization stack directly on the hardware, within the AWS Cloud, giving their customers access to the elasticity, security, and reliability (not to mention the broad array of services) that AWS offers.

We knew that other customers also had interesting use cases for bare metal hardware and didn’t want to take the performance hit of nested virtualization. They wanted access to the physical resources for applications that take advantage of low-level hardware features such as performance counters and Intel® VT that are not always available or fully supported in virtualized environments, and also for applications intended to run directly on the hardware or licensed and supported for use in non-virtualized environments.

Our multi-year effort to move networking, storage, and other EC2 features out of our virtualization platform and into dedicated hardware was already well underway and provided the perfect foundation for a possible solution. This work, as I described in Now Available – Compute-Intensive C5 Instances for Amazon EC2, includes a set of dedicated hardware accelerators.

Now that we have provided VMware with the bare metal access that they requested, we are doing the same for all AWS customers. I’m really looking forward to seeing what you can do with them!

New Bare Metal Instances
Today we are launching a public preview the i3.metal instance, the first in a series of EC2 instances that offer the best of both worlds, allowing the operating system to run directly on the underlying hardware while still providing access to all of the benefits of the cloud. The instance gives you direct access to the processor and other hardware, and has the following specifications:

  • Processing – Two Intel Xeon E5-2686 v4 processors running at 2.3 GHz, with a total of 36 hyperthreaded cores (72 logical processors).
  • Memory – 512 GiB.
  • Storage – 15.2 terabytes of local, SSD-based NVMe storage.
  • Network – 25 Gbps of ENA-based enhanced networking.

Bare Metal instances are full-fledged members of the EC2 family and can take advantage of Elastic Load Balancing, Auto Scaling, Amazon CloudWatch, Auto Recovery, and so forth. They can also access the full suite of AWS database, IoT, mobile, analytics, artificial intelligence, and security services.

Previewing Now
We are launching a public preview of the Bare Metal instances today; please sign up now if you want to try them out.

You can now bring your specialized applications or your own stack of virtualized components to AWS and run them on Bare Metal instances. If you are using or thinking about using containers, these instances make a great host for CoreOS.

An AMI that works on one of the new C5 instances should also work on an I3 Bare Metal Instance. It must have the ENA and NVMe drivers, and must be tagged for ENA.

Jeff;

 

Raspberry Pi clusters come of age

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/raspberry-pi-clusters-come-of-age/

In today’s guest post, Bruce Tulloch, CEO and Managing Director of BitScope Designs, discusses the uses of cluster computing with the Raspberry Pi, and the recent pilot of the Los Alamos National Laboratory 3000-Pi cluster built with the BitScope Blade.

Raspberry Pi cluster

High-performance computing and Raspberry Pi are not normally uttered in the same breath, but Los Alamos National Laboratory is building a Raspberry Pi cluster with 3000 cores as a pilot before scaling up to 40 000 cores or more next year.

That’s amazing, but why?

I was asked this question more than any other at The International Conference for High-Performance Computing, Networking, Storage and Analysis in Denver last week, where one of the Los Alamos Raspberry Pi Cluster Modules was on display at the University of New Mexico’s Center for Advanced Research Computing booth.

The short answer to this question is: the Raspberry Pi cluster enables Los Alamos National Laboratory (LANL) to conduct exascale computing R&D.

The Pi cluster breadboard

Exascale refers to computing systems at least 50 times faster than the most powerful supercomputers in use today. The problem faced by LANL and similar labs building these things is one of scale. To get the required performance, you need a lot of nodes, and to make it work, you need a lot of R&D.

However, there’s a catch-22: how do you write the operating systems, networks stacks, launch and boot systems for such large computers without having one on which to test it all? Use an existing supercomputer? No — the existing large clusters are fully booked 24/7 doing science, they cost millions of dollars per year to run, and they may not have the architecture you need for your next-generation machine anyway. Older machines retired from science may be available, but at this scale they cost far too much to use and are usually very hard to maintain.

The Los Alamos solution? Build a “model supercomputer” with Raspberry Pi!

Think of it as a “cluster development breadboard”.

The idea is to design, develop, debug, and test new network architectures and systems software on the “breadboard”, but at a scale equivalent to the production machines you’re currently building. Raspberry Pi may be a small computer, but it can run most of the system software stacks that production machines use, and the ratios of its CPU speed, local memory, and network bandwidth scale proportionately to the big machines, much like an architect’s model does when building a new house. To learn more about the project, see the news conference and this interview with insideHPC at SC17.

Traditional Raspberry Pi clusters

Like most people, we love a good cluster! People have been building them with Raspberry Pi since the beginning, because it’s inexpensive, educational, and fun. They’ve been built with the original Pi, Pi 2, Pi 3, and even the Pi Zero, but none of these clusters have proven to be particularly practical.

That’s not stopped them being useful though! I saw quite a few Raspberry Pi clusters at the conference last week.

One tiny one that caught my eye was from the people at openio.io, who used a small Raspberry Pi Zero W cluster to demonstrate their scalable software-defined object storage platform, which on big machines is used to manage petabytes of data, but which is so lightweight that it runs just fine on this:

Raspberry Pi Zero cluster

There was another appealing example at the ARM booth, where the Berkeley Labs’ singularity container platform was demonstrated running very effectively on a small cluster built with Raspberry Pi 3s.

Raspberry Pi 3 cluster demo at a conference stall

My show favourite was from the Edinburgh Parallel Computing Center (EPCC): Nick Brown used a cluster of Pi 3s to explain supercomputers to kids with an engaging interactive application. The idea was that visitors to the stand design an aircraft wing, simulate it across the cluster, and work out whether an aircraft that uses the new wing could fly from Edinburgh to New York on a full tank of fuel. Mine made it, fortunately!

Raspberry Pi 3 cluster demo at a conference stall

Next-generation Raspberry Pi clusters

We’ve been building small-scale industrial-strength Raspberry Pi clusters for a while now with BitScope Blade.

When Los Alamos National Laboratory approached us via HPC provider SICORP with a request to build a cluster comprising many thousands of nodes, we considered all the options very carefully. It needed to be dense, reliable, low-power, and easy to configure and to build. It did not need to “do science”, but it did need to work in almost every other way as a full-scale HPC cluster would.

Some people argue Compute Module 3 is the ideal cluster building block. It’s very small and just as powerful as Raspberry Pi 3, so one could, in theory, pack a lot of them into a very small space. However, there are very good reasons no one has ever successfully done this. For a start, you need to build your own network fabric and I/O, and cooling the CM3s, especially when densely packed in a cluster, is tricky given their tiny size. There’s very little room for heatsinks, and the tiny PCBs dissipate very little excess heat.

Instead, we saw the potential for Raspberry Pi 3 itself to be used to build “industrial-strength clusters” with BitScope Blade. It works best when the Pis are properly mounted, powered reliably, and cooled effectively. It’s important to avoid using micro SD cards and to connect the nodes using wired networks. It has the added benefit of coming with lots of “free” USB I/O, and the Pi 3 PCB, when mounted with the correct air-flow, is a remarkably good heatsink.

When Gordon announced netboot support, we became convinced the Raspberry Pi 3 was the ideal candidate when used with standard switches. We’d been making smaller clusters for a while, but netboot made larger ones practical. Assembling them all into compact units that fit into existing racks with multiple 10 Gb uplinks is the solution that meets LANL’s needs. This is a 60-node cluster pack with a pair of managed switches by Ubiquiti in testing in the BitScope Lab:

60-node Raspberry Pi cluster pack

Two of these packs, built with Blade Quattro, and one smaller one comprising 30 nodes, built with Blade Duo, are the components of the Cluster Module we exhibited at the show. Five of these modules are going into Los Alamos National Laboratory for their pilot as I write this.

Bruce Tulloch at a conference stand with a demo of the Raspberry Pi cluster for LANL

It’s not only research clusters like this for which Raspberry Pi is well suited. You can build very reliable local cloud computing and data centre solutions for research, education, and even some industrial applications. You’re not going to get much heavy-duty science, big data analytics, AI, or serious number crunching done on one of these, but it is quite amazing to see just how useful Raspberry Pi clusters can be for other purposes, whether it’s software-defined networks, lightweight MaaS, SaaS, PaaS, or FaaS solutions, distributed storage, edge computing, industrial IoT, and of course, education in all things cluster and parallel computing. For one live example, check out Mythic Beasts’ educational compute cloud, built with Raspberry Pi 3.

For more information about Raspberry Pi clusters, drop by BitScope Clusters.

I’ll read and respond to your thoughts in the comments below this post too.

Editor’s note:

Here is a photo of Bruce wearing a jetpack. Cool, right?!

Bruce Tulloch wearing a jetpack

The post Raspberry Pi clusters come of age appeared first on Raspberry Pi.

Potential impact of the Intel ME vulnerability

Post Syndicated from Matthew Garrett original https://mjg59.dreamwidth.org/49611.html

(Note: this is my personal opinion based on public knowledge around this issue. I have no knowledge of any non-public details of these vulnerabilities, and this should not be interpreted as the position or opinion of my employer)

Intel’s Management Engine (ME) is a small coprocessor built into the majority of Intel CPUs[0]. Older versions were based on the ARC architecture[1] running an embedded realtime operating system, but from version 11 onwards they’ve been small x86 cores running Minix. The precise capabilities of the ME have not been publicly disclosed, but it is at minimum capable of interacting with the network[2], display[3], USB, input devices and system flash. In other words, software running on the ME is capable of doing a lot, without requiring any OS permission in the process.

Back in May, Intel announced a vulnerability in the Advanced Management Technology (AMT) that runs on the ME. AMT offers functionality like providing a remote console to the system (so IT support can connect to your system and interact with it as if they were physically present), remote disk support (so IT support can reinstall your machine over the network) and various other bits of system management. The vulnerability meant that it was possible to log into systems with enabled AMT with an empty authentication token, making it possible to log in without knowing the configured password.

This vulnerability was less serious than it could have been for a couple of reasons – the first is that “consumer”[4] systems don’t ship with AMT, and the second is that AMT is almost always disabled (Shodan found only a few thousand systems on the public internet with AMT enabled, out of many millions of laptops). I wrote more about it here at the time.

How does this compare to the newly announced vulnerabilities? Good question. Two of the announced vulnerabilities are in AMT. The previous AMT vulnerability allowed you to bypass authentication, but restricted you to doing what AMT was designed to let you do. While AMT gives an authenticated user a great deal of power, it’s also designed with some degree of privacy protection in mind – for instance, when the remote console is enabled, an animated warning border is drawn on the user’s screen to alert them.

This vulnerability is different in that it allows an authenticated attacker to execute arbitrary code within the AMT process. This means that the attacker shouldn’t have any capabilities that AMT doesn’t, but it’s unclear where various aspects of the privacy protection are implemented – for instance, if the warning border is implemented in AMT rather than in hardware, an attacker could duplicate that functionality without drawing the warning. If the USB storage emulation for remote booting is implemented as a generic USB passthrough, the attacker could pretend to be an arbitrary USB device and potentially exploit the operating system through bugs in USB device drivers. Unfortunately we don’t currently know.

Note that this exploit still requires two things – first, AMT has to be enabled, and second, the attacker has to be able to log into AMT. If the attacker has physical access to your system and you don’t have a BIOS password set, they will be able to enable it – however, if AMT isn’t enabled and the attacker isn’t physically present, you’re probably safe. But if AMT is enabled and you haven’t patched the previous vulnerability, the attacker will be able to access AMT over the network without a password and then proceed with the exploit. This is bad, so you should probably (1) ensure that you’ve updated your BIOS and (2) ensure that AMT is disabled unless you have a really good reason to use it.

The AMT vulnerability applies to a wide range of versions, everything from version 6 (which shipped around 2008) and later. The other vulnerability that Intel describe is restricted to version 11 of the ME, which only applies to much more recent systems. This vulnerability allows an attacker to execute arbitrary code on the ME, which means they can do literally anything the ME is able to do. This probably also means that they are able to interfere with any other code running on the ME. While AMT has been the most frequently discussed part of this, various other Intel technologies are tied to ME functionality.

Intel’s Platform Trust Technology (PTT) is a software implementation of a Trusted Platform Module (TPM) that runs on the ME. TPMs are intended to protect access to secrets and encryption keys and record the state of the system as it boots, making it possible to determine whether a system has had part of its boot process modified and denying access to the secrets as a result. The most common usage of TPMs is to protect disk encryption keys – Microsoft Bitlocker defaults to storing its encryption key in the TPM, automatically unlocking the drive if the boot process is unmodified. In addition, TPMs support something called Remote Attestation (I wrote about that here), which allows the TPM to provide a signed copy of information about what the system booted to a remote site. This can be used for various purposes, such as not allowing a compute node to join a cloud unless it’s booted the correct version of the OS and is running the latest firmware version. Remote Attestation depends on the TPM having a unique cryptographic identity that is tied to the TPM and inaccessible to the OS.

PTT allows manufacturers to simply license some additional code from Intel and run it on the ME rather than having to pay for an additional chip on the system motherboard. This seems great, but if an attacker is able to run code on the ME then they potentially have the ability to tamper with PTT, which means they can obtain access to disk encryption secrets and circumvent Bitlocker. It also means that they can tamper with Remote Attestation, “attesting” that the system booted a set of software that it didn’t or copying the keys to another system and allowing that to impersonate the first. This is, uh, bad.

Intel also recently announced Intel Online Connect, a mechanism for providing the functionality of security keys directly in the operating system. Components of this are run on the ME in order to avoid scenarios where a compromised OS could be used to steal the identity secrets – if the ME is compromised, this may make it possible for an attacker to obtain those secrets and duplicate the keys.

It’s also not entirely clear how much of Intel’s Secure Guard Extensions (SGX) functionality depends on the ME. The ME does appear to be required for SGX Remote Attestation (which allows an application using SGX to prove to a remote site that it’s the SGX app rather than something pretending to be it), and again if those secrets can be extracted from a compromised ME it may be possible to compromise some of the security assumptions around SGX. Again, it’s not clear how serious this is because it’s not publicly documented.

Various other things also run on the ME, including stuff like video DRM (ensuring that high resolution video streams can’t be intercepted by the OS). It may be possible to obtain encryption keys from a compromised ME that allow things like Netflix streams to be decoded and dumped. From a user privacy or security perspective, these things seem less serious.

The big problem at the moment is that we have no idea what the actual process of compromise is. Intel state that it requires local access, but don’t describe what kind. Local access in this case could simply require the ability to send commands to the ME (possible on any system that has the ME drivers installed), could require direct hardware access to the exposed ME (which would require either kernel access or the ability to install a custom driver) or even the ability to modify system flash (possible only if the attacker has physical access and enough time and skill to take the system apart and modify the flash contents with an SPI programmer). The other thing we don’t know is whether it’s possible for an attacker to modify the system such that the ME is persistently compromised or whether it needs to be re-compromised every time the ME reboots. Note that even the latter is more serious than you might think – the ME may only be rebooted if the system loses power completely, so even a “temporary” compromise could affect a system for a long period of time.

It’s also almost impossible to determine if a system is compromised. If the ME is compromised then it’s probably possible for it to roll back any firmware updates but still report that it’s been updated, giving admins a false sense of security. The only way to determine for sure would be to dump the system flash and compare it to a known good image. This is impractical to do at scale.

So, overall, given what we know right now it’s hard to say how serious this is in terms of real world impact. It’s unlikely that this is the kind of vulnerability that would be used to attack individual end users – anyone able to compromise a system like this could just backdoor your browser instead with much less effort, and that already gives them your banking details. The people who have the most to worry about here are potential targets of skilled attackers, which means activists, dissidents and companies with interesting personal or business data. It’s hard to make strong recommendations about what to do here without more insight into what the vulnerability actually is, and we may not know that until this presentation next month.

Summary: Worst case here is terrible, but unlikely to be relevant to the vast majority of users.

[0] Earlier versions of the ME were built into the motherboard chipset, but as portions of that were incorporated onto the CPU package the ME followed
[1] A descendent of the SuperFX chip used in Super Nintendo cartridges such as Starfox, because why not
[2] Without any OS involvement for wired ethernet and for wireless networks in the system firmware, but requires OS support for wireless access once the OS drivers have loaded
[3] Assuming you’re using integrated Intel graphics
[4] “Consumer” is a bit of a misnomer here – “enterprise” laptops like Thinkpads ship with AMT, but are often bought by consumers.

comment count unavailable comments

How to Patch, Inspect, and Protect Microsoft Windows Workloads on AWS—Part 2

Post Syndicated from Koen van Blijderveen original https://aws.amazon.com/blogs/security/how-to-patch-inspect-and-protect-microsoft-windows-workloads-on-aws-part-2/

Yesterday in Part 1 of this blog post, I showed you how to:

  1. Launch an Amazon EC2 instance with an AWS Identity and Access Management (IAM) role, an Amazon Elastic Block Store (Amazon EBS) volume, and tags that Amazon EC2 Systems Manager (Systems Manager) and Amazon Inspector use.
  2. Configure Systems Manager to install the Amazon Inspector agent and patch your EC2 instances.

Today in Steps 3 and 4, I show you how to:

  1. Take Amazon EBS snapshots using Amazon EBS Snapshot Scheduler to automate snapshots based on instance tags.
  2. Use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any common vulnerabilities and exposures (CVEs).

To catch up on Steps 1 and 2, see yesterday’s blog post.

Step 3: Take EBS snapshots using EBS Snapshot Scheduler

In this section, I show you how to use EBS Snapshot Scheduler to take snapshots of your instances at specific intervals. To do this, I will show you how to:

  • Determine the schedule for EBS Snapshot Scheduler by providing you with best practices.
  • Deploy EBS Snapshot Scheduler by using AWS CloudFormation.
  • Tag your EC2 instances so that EBS Snapshot Scheduler backs up your instances when you want them backed up.

In addition to making sure your EC2 instances have all the available operating system patches applied on a regular schedule, you should take snapshots of the EBS storage volumes attached to your EC2 instances. Taking regular snapshots allows you to restore your data to a previous state quickly and cost effectively. With Amazon EBS snapshots, you pay only for the actual data you store, and snapshots save only the data that has changed since the previous snapshot, which minimizes your cost. You will use EBS Snapshot Scheduler to make regular snapshots of your EC2 instance. EBS Snapshot Scheduler takes advantage of other AWS services including CloudFormation, Amazon DynamoDB, and AWS Lambda to make backing up your EBS volumes simple.

Determine the schedule

As a best practice, you should back up your data frequently during the hours when your data changes the most. This reduces the amount of data you lose if you have to restore from a snapshot. For the purposes of this blog post, the data for my instances changes the most between the business hours of 9:00 A.M. to 5:00 P.M. Pacific Time. During these hours, I will make snapshots hourly to minimize data loss.

In addition to backing up frequently, another best practice is to establish a strategy for retention. This will vary based on how you need to use the snapshots. If you have compliance requirements to be able to restore for auditing, your needs may be different than if you are able to detect data corruption within three hours and simply need to restore to something that limits data loss to five hours. EBS Snapshot Scheduler enables you to specify the retention period for your snapshots. For this post, I only need to keep snapshots for recent business days. To account for weekends, I will set my retention period to three days, which is down from the default of 15 days when deploying EBS Snapshot Scheduler.

Deploy EBS Snapshot Scheduler

In Step 1 of Part 1 of this post, I showed how to configure an EC2 for Windows Server 2012 R2 instance with an EBS volume. You will use EBS Snapshot Scheduler to take eight snapshots each weekday of your EC2 instance’s EBS volumes:

  1. Navigate to the EBS Snapshot Scheduler deployment page and choose Launch Solution. This takes you to the CloudFormation console in your account. The Specify an Amazon S3 template URL option is already selected and prefilled. Choose Next on the Select Template page.
  2. On the Specify Details page, retain all default parameters except for AutoSnapshotDeletion. Set AutoSnapshotDeletion to Yes to ensure that old snapshots are periodically deleted. The default retention period is 15 days (you will specify a shorter value on your instance in the next subsection).
  3. Choose Next twice to move to the Review step, and start deployment by choosing the I acknowledge that AWS CloudFormation might create IAM resources check box and then choosing Create.

Tag your EC2 instances

EBS Snapshot Scheduler takes a few minutes to deploy. While waiting for its deployment, you can start to tag your instance to define its schedule. EBS Snapshot Scheduler reads tag values and looks for four possible custom parameters in the following order:

  • <snapshot time> – Time in 24-hour format with no colon.
  • <retention days> – The number of days (a positive integer) to retain the snapshot before deletion, if set to automatically delete snapshots.
  • <time zone> – The time zone of the times specified in <snapshot time>.
  • <active day(s)>all, weekdays, or mon, tue, wed, thu, fri, sat, and/or sun.

Because you want hourly backups on weekdays between 9:00 A.M. and 5:00 P.M. Pacific Time, you need to configure eight tags—one for each hour of the day. You will add the eight tags shown in the following table to your EC2 instance.

Tag Value
scheduler:ebs-snapshot:0900 0900;3;utc;weekdays
scheduler:ebs-snapshot:1000 1000;3;utc;weekdays
scheduler:ebs-snapshot:1100 1100;3;utc;weekdays
scheduler:ebs-snapshot:1200 1200;3;utc;weekdays
scheduler:ebs-snapshot:1300 1300;3;utc;weekdays
scheduler:ebs-snapshot:1400 1400;3;utc;weekdays
scheduler:ebs-snapshot:1500 1500;3;utc;weekdays
scheduler:ebs-snapshot:1600 1600;3;utc;weekdays

Next, you will add these tags to your instance. If you want to tag multiple instances at once, you can use Tag Editor instead. To add the tags in the preceding table to your EC2 instance:

  1. Navigate to your EC2 instance in the EC2 console and choose Tags in the navigation pane.
  2. Choose Add/Edit Tags and then choose Create Tag to add all the tags specified in the preceding table.
  3. Confirm you have added the tags by choosing Save. After adding these tags, navigate to your EC2 instance in the EC2 console. Your EC2 instance should look similar to the following screenshot.
    Screenshot of how your EC2 instance should look in the console
  4. After waiting a couple of hours, you can see snapshots beginning to populate on the Snapshots page of the EC2 console.Screenshot of snapshots beginning to populate on the Snapshots page of the EC2 console
  5. To check if EBS Snapshot Scheduler is active, you can check the CloudWatch rule that runs the Lambda function. If the clock icon shown in the following screenshot is green, the scheduler is active. If the clock icon is gray, the rule is disabled and does not run. You can enable or disable the rule by selecting it, choosing Actions, and choosing Enable or Disable. This also allows you to temporarily disable EBS Snapshot Scheduler.Screenshot of checking to see if EBS Snapshot Scheduler is active
  1. You can also monitor when EBS Snapshot Scheduler has run by choosing the name of the CloudWatch rule as shown in the previous screenshot and choosing Show metrics for the rule.Screenshot of monitoring when EBS Snapshot Scheduler has run by choosing the name of the CloudWatch rule

If you want to restore and attach an EBS volume, see Restoring an Amazon EBS Volume from a Snapshot and Attaching an Amazon EBS Volume to an Instance.

Step 4: Use Amazon Inspector

In this section, I show you how to you use Amazon Inspector to scan your EC2 instance for common vulnerabilities and exposures (CVEs) and set up Amazon SNS notifications. To do this I will show you how to:

  • Install the Amazon Inspector agent by using EC2 Run Command.
  • Set up notifications using Amazon SNS to notify you of any findings.
  • Define an Amazon Inspector target and template to define what assessment to perform on your EC2 instance.
  • Schedule Amazon Inspector assessment runs to assess your EC2 instance on a regular interval.

Amazon Inspector can help you scan your EC2 instance using prebuilt rules packages, which are built and maintained by AWS. These prebuilt rules packages tell Amazon Inspector what to scan for on the EC2 instances you select. Amazon Inspector provides the following prebuilt packages for Microsoft Windows Server 2012 R2:

  • Common Vulnerabilities and Exposures
  • Center for Internet Security Benchmarks
  • Runtime Behavior Analysis

In this post, I’m focused on how to make sure you keep your EC2 instances patched, backed up, and inspected for common vulnerabilities and exposures (CVEs). As a result, I will focus on how to use the CVE rules package and use your instance tags to identify the instances on which to run the CVE rules. If your EC2 instance is fully patched using Systems Manager, as described earlier, you should not have any findings with the CVE rules package. Regardless, as a best practice I recommend that you use Amazon Inspector as an additional layer for identifying any unexpected failures. This involves using Amazon CloudWatch to set up weekly Amazon Inspector scans, and configuring Amazon Inspector to notify you of any findings through SNS topics. By acting on the notifications you receive, you can respond quickly to any CVEs on any of your EC2 instances to help ensure that malware using known CVEs does not affect your EC2 instances. In a previous blog post, Eric Fitzgerald showed how to remediate Amazon Inspector security findings automatically.

Install the Amazon Inspector agent

To install the Amazon Inspector agent, you will use EC2 Run Command, which allows you to run any command on any of your EC2 instances that have the Systems Manager agent with an attached IAM role that allows access to Systems Manager.

  1. Choose Run Command under Systems Manager Services in the navigation pane of the EC2 console. Then choose Run a command.
    Screenshot of choosing "Run a command"
  2. To install the Amazon Inspector agent, you will use an AWS managed and provided command document that downloads and installs the agent for you on the selected EC2 instance. Choose AmazonInspector-ManageAWSAgent. To choose the target EC2 instance where this command will be run, use the tag you previously assigned to your EC2 instance, Patch Group, with a value of Windows Servers. For this example, set the concurrent installations to 1 and tell Systems Manager to stop after 5 errors.
    Screenshot of installing the Amazon Inspector agent
  3. Retain the default values for all other settings on the Run a command page and choose Run. Back on the Run Command page, you can see if the command that installed the Amazon Inspector agent executed successfully on all selected EC2 instances.
    Screenshot showing that the command that installed the Amazon Inspector agent executed successfully on all selected EC2 instances

Set up notifications using Amazon SNS

Now that you have installed the Amazon Inspector agent, you will set up an SNS topic that will notify you of any findings after an Amazon Inspector run.

To set up an SNS topic:

  1. In the AWS Management Console, choose Simple Notification Service under Messaging in the Services menu.
  2. Choose Create topic, name your topic (only alphanumeric characters, hyphens, and underscores are allowed) and give it a display name to ensure you know what this topic does (I’ve named mine Inspector). Choose Create topic.
    "Create new topic" page
  3. To allow Amazon Inspector to publish messages to your new topic, choose Other topic actions and choose Edit topic policy.
  4. For Allow these users to publish messages to this topic and Allow these users to subscribe to this topic, choose Only these AWS users. Type the following ARN for the US East (N. Virginia) Region in which you are deploying the solution in this post: arn:aws:iam::316112463485:root. This is the ARN of Amazon Inspector itself. For the ARNs of Amazon Inspector in other AWS Regions, see Setting Up an SNS Topic for Amazon Inspector Notifications (Console). Amazon Resource Names (ARNs) uniquely identify AWS resources across all of AWS.
    Screenshot of editing the topic policy
  5. To receive notifications from Amazon Inspector, subscribe to your new topic by choosing Create subscription and adding your email address. After confirming your subscription by clicking the link in the email, the topic should display your email address as a subscriber. Later, you will configure the Amazon Inspector template to publish to this topic.
    Screenshot of subscribing to the new topic

Define an Amazon Inspector target and template

Now that you have set up the notification topic by which Amazon Inspector can notify you of findings, you can create an Amazon Inspector target and template. A target defines which EC2 instances are in scope for Amazon Inspector. A template defines which packages to run, for how long, and on which target.

To create an Amazon Inspector target:

  1. Navigate to the Amazon Inspector console and choose Get started. At the time of writing this blog post, Amazon Inspector is available in the US East (N. Virginia), US West (N. California), US West (Oregon), EU (Ireland), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Sydney), and Asia Pacific (Tokyo) Regions.
  2. For Amazon Inspector to be able to collect the necessary data from your EC2 instance, you must create an IAM service role for Amazon Inspector. Amazon Inspector can create this role for you if you choose Choose or create role and confirm the role creation by choosing Allow.
    Screenshot of creating an IAM service role for Amazon Inspector
  3. Amazon Inspector also asks you to tag your EC2 instance and install the Amazon Inspector agent. You already performed these steps in Part 1 of this post, so you can proceed by choosing Next. To define the Amazon Inspector target, choose the previously used Patch Group tag with a Value of Windows Servers. This is the same tag that you used to define the targets for patching. Then choose Next.
    Screenshot of defining the Amazon Inspector target
  4. Now, define your Amazon Inspector template, and choose a name and the package you want to run. For this post, use the Common Vulnerabilities and Exposures package and choose the default duration of 1 hour. As you can see, the package has a version number, so always select the latest version of the rules package if multiple versions are available.
    Screenshot of defining an assessment template
  5. Configure Amazon Inspector to publish to your SNS topic when findings are reported. You can also choose to receive a notification of a started run, a finished run, or changes in the state of a run. For this blog post, you want to receive notifications if there are any findings. To start, choose Assessment Templates from the Amazon Inspector console and choose your newly created Amazon Inspector assessment template. Choose the icon below SNS topics (see the following screenshot).
    Screenshot of choosing an assessment template
  6. A pop-up appears in which you can choose the previously created topic and the events about which you want SNS to notify you (choose Finding reported).
    Screenshot of choosing the previously created topic and the events about which you want SNS to notify you

Schedule Amazon Inspector assessment runs

The last step in using Amazon Inspector to assess for CVEs is to schedule the Amazon Inspector template to run using Amazon CloudWatch Events. This will make sure that Amazon Inspector assesses your EC2 instance on a regular basis. To do this, you need the Amazon Inspector template ARN, which you can find under Assessment templates in the Amazon Inspector console. CloudWatch Events can run your Amazon Inspector assessment at an interval you define using a Cron-based schedule. Cron is a well-known scheduling agent that is widely used on UNIX-like operating systems and uses the following syntax for CloudWatch Events.

Image of Cron schedule

All scheduled events use a UTC time zone, and the minimum precision for schedules is one minute. For more information about scheduling CloudWatch Events, see Schedule Expressions for Rules.

To create the CloudWatch Events rule:

  1. Navigate to the CloudWatch console, choose Events, and choose Create rule.
    Screenshot of starting to create a rule in the CloudWatch Events console
  2. On the next page, specify if you want to invoke your rule based on an event pattern or a schedule. For this blog post, you will select a schedule based on a Cron expression.
  3. You can schedule the Amazon Inspector assessment any time you want using the Cron expression, or you can use the Cron expression I used in the following screenshot, which will run the Amazon Inspector assessment every Sunday at 10:00 P.M. GMT.
    Screenshot of scheduling an Amazon Inspector assessment with a Cron expression
  4. Choose Add target and choose Inspector assessment template from the drop-down menu. Paste the ARN of the Amazon Inspector template you previously created in the Amazon Inspector console in the Assessment template box and choose Create a new role for this specific resource. This new role is necessary so that CloudWatch Events has the necessary permissions to start the Amazon Inspector assessment. CloudWatch Events will automatically create the new role and grant the minimum set of permissions needed to run the Amazon Inspector assessment. To proceed, choose Configure details.
    Screenshot of adding a target
  5. Next, give your rule a name and a description. I suggest using a name that describes what the rule does, as shown in the following screenshot.
  6. Finish the wizard by choosing Create rule. The rule should appear in the Events – Rules section of the CloudWatch console.
    Screenshot of completing the creation of the rule
  7. To confirm your CloudWatch Events rule works, wait for the next time your CloudWatch Events rule is scheduled to run. For testing purposes, you can choose your CloudWatch Events rule and choose Edit to change the schedule to run it sooner than scheduled.
    Screenshot of confirming the CloudWatch Events rule works
  8. Now navigate to the Amazon Inspector console to confirm the launch of your first assessment run. The Start time column shows you the time each assessment started and the Status column the status of your assessment. In the following screenshot, you can see Amazon Inspector is busy Collecting data from the selected assessment targets.
    Screenshot of confirming the launch of the first assessment run

You have concluded the last step of this blog post by setting up a regular scan of your EC2 instance with Amazon Inspector and a notification that will let you know if your EC2 instance is vulnerable to any known CVEs. In a previous Security Blog post, Eric Fitzgerald explained How to Remediate Amazon Inspector Security Findings Automatically. Although that blog post is for Linux-based EC2 instances, the post shows that you can learn about Amazon Inspector findings in other ways than email alerts.

Conclusion

In this two-part blog post, I showed how to make sure you keep your EC2 instances up to date with patching, how to back up your instances with snapshots, and how to monitor your instances for CVEs. Collectively these measures help to protect your instances against common attack vectors that attempt to exploit known vulnerabilities. In Part 1, I showed how to configure your EC2 instances to make it easy to use Systems Manager, EBS Snapshot Scheduler, and Amazon Inspector. I also showed how to use Systems Manager to schedule automatic patches to keep your instances current in a timely fashion. In Part 2, I showed you how to take regular snapshots of your data by using EBS Snapshot Scheduler and how to use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any common vulnerabilities and exposures (CVEs).

If you have comments about today’s or yesterday’s post, submit them in the “Comments” section below. If you have questions about or issues implementing any part of this solution, start a new thread on the Amazon EC2 forum or the Amazon Inspector forum, or contact AWS Support.

– Koen

What do you want your button to do?

Post Syndicated from Carrie Anne Philbin original https://www.raspberrypi.org/blog/button/

Here at Raspberry Pi, we know that getting physical with computing is often a catalyst for creativity. Building a simple circuit can open up a world of making possibilities! This ethos of tinkering and invention is also being used in the classroom to inspire a whole new generation of makers too, and here is why.

The all-important question

Physical computing provides a great opportunity for creative expression: the button press! By explaining how a button works, how to build one with a breadboard attached to computer, and how to program the button to work when it’s pressed, you can give learners young and old all the conceptual skills they need to build a thing that does something. But what do they want their button to do? Have you ever asked your students or children at home? I promise it will be one of the most mindblowing experiences you’ll have if you do.

A button. A harmless, little arcade button.

Looks harmless now, but put it into the hands of a child and see what happens!

Amy will want her button to take a photo, Charlie will want his button to play a sound, Tumi will want her button to explode TNT in Minecraft, Jack will want their button to fire confetti out of a cannon, and James Robinson will want his to trigger silly noises (doesn’t he always?)! Idea generation is the inherent gift that every child has in abundance. As educators and parents, we’re always looking to deeply engage our young people in the subject matter we’re teaching, and they are never more engaged than when they have an idea and want to implement it. Way back in 2012, I wanted my button to print geeky sayings:

Geek Gurl Diaries Raspberry Pi Thermal Printer Project Sneak Peek!

A sneak peek at the finished Geek Gurl Diaries ‘Box of Geek’. I’ve been busy making this for a few weeks with some help from friends. Tutorial to make your own box coming soon, so keep checking the Geek Gurl Diaries Twitter, facebook page and channel.

What are the challenges for this approach in education?

Allowing this kind of free-form creativity and tinkering in the classroom obviously has its challenges for teachers, especially those confined to rigid lesson structures, timings, and small classrooms. The most common worry I hear from teachers is “what if they ask a question I can’t answer?” Encouraging this sort of creative thinking makes that almost an inevitability. How can you facilitate roughly 30 different projects simultaneously? The answer is by using those other computational and transferable thinking skills:

  • Problem-solving
  • Iteration
  • Collaboration
  • Evaluation

Clearly specifying a problem, surveying the tools available to solve it (including online references and external advice), and then applying them to solve the problem is a hugely important skill, and this is a great opportunity to teach it.

A girl plays a button reaction game at a Raspberry Pi event

Press ALL the buttons!

Hands-off guidance

When we train teachers at Picademy, we group attendees around themes that have come out of the idea generation session. Together they collaborate on an achievable shared goal. One will often sketch something on a whiteboard, decomposing the problem into smaller parts; then the group will divide up the tasks. Each will look online or in books for tutorials to help them with their step. I’ve seen this behaviour in student groups too, and it’s very easy to facilitate. You don’t need to be the resident expert on every project that students want to work on.

The key is knowing where to guide students to find the answers they need. Curating online videos, blogs, tutorials, and articles in advance gives you the freedom and confidence to concentrate on what matters: the learning. We have a number of physical computing projects that use buttons, linked to our curriculum for learners to combine inputs and outputs to solve a problem. The WhooPi cushion and GPIO music box are two of my favourites.

A Raspberry Pi and button attached to a computer display

Outside of formal education, events such as Raspberry Jams, CoderDojos, CAS Hubs, and hackathons are ideal venues for seeking and receiving support and advice.

Cross-curricular participation

The rise of the global maker movement, I think, is in response to abstract concepts and disciplines. Children are taught lots of concepts in isolation that aren’t always relevant to their lives or immediate environment. Digital making provides a unique and exciting way of bridging different subject areas, allowing for cross-curricular participation. I’m not suggesting that educators should throw away all their schemes of work and leave the full direction of the computing curriculum to students. However, there’s huge value in exposing learners to the possibilities for creativity in computing. Creative freedom and expression guide learning, better preparing young people for the workplace of tomorrow.

So…what do you want your button to do?

Hello World

Learn more about today’s subject, and read further articles regarding computer science in education, in Hello World magazine issue 1.

Read Hello World issue 1 for more…

UK-based educators can subscribe to Hello World to receive a hard copy delivered for free to their doorstep, while the PDF is available for free to everyone via the Hello World website.

The post What do you want your button to do? appeared first on Raspberry Pi.

On Architecture and the State of the Art

Post Syndicated from Philip Fitzsimons original https://aws.amazon.com/blogs/architecture/on-architecture-and-the-state-of-the-art/

On the AWS Solutions Architecture team we know we’re following in the footsteps of other technical experts who pulled together the best practices of their eras. Around 22 BC the Roman Architect Vitruvius Pollio wrote On architecture (published as The Ten Books on Architecture), which became a seminal work on architectural theory. Vitruvius captured the best practices of his contemporaries and those who went before him (especially the Greek architects).

Closer to our time, in 1910, another technical expert, Henry Harrison Suplee, wrote Gas Turbine: progress in the design and construction of turbines operated by gases of combustion, from which we believe the phrase “state of the art” originates:

It has therefore been thought desirable to gather under one cover the most important papers which have appeared upon the subject of the gas turbine in England, France, Germany, and Switzerland, together with some account of the work in America, and to add to this such information upon actual experimental machines as can be secured.

In the present state of the art this is all that can be done, but it is believed that this will aid materially in the conduct of subsequent work, and place in the hands of the gas-power engineer a collection of material not generally accessible or available in convenient form.  

Source

Both authors wrote books that captured the current knowledge on design principles and best practices (in architecture and engineering) to improve awareness and adoption. Like these authors, we at AWS believe that capturing and sharing best practices leads to better outcomes. This follows a pattern we established internally, in our Principal Engineering community. In 2012 we started an initiative called “Well-Architected” to help share the best practices for architecting in the cloud with our customers.

Every year AWS Solution Architects dedicate hundreds of thousands of hours to helping customers build architectures that are cloud native. Through customer feedback, and real world experience we see what strategies, patterns, and approaches work for you.

“After our well-architected review and subsequent migration to the cloud, we saw the tremendous cost-savings potential of Amazon Web Services. By using the industry-standard service, we can invest the majority of our time and energy into enhancing our solutions. Thanks to (consulting partner) 1Strategy’s deep, technical AWS expertise and flexibility during our migration, we were able to leverage the strengths of AWS quickly.”

Paul Cooley, Chief Technology Officer for Imprev

This year we have again refreshed the AWS Well-Architected Framework, with a particular focus on Operational Excellence. Last year we announced the addition of Operational Excellence as a new pillar to AWS Well-Architected Framework. Having carried out thousands of reviews since then, we reexamined the pillar and are pleased to announce some significant changes. First, the pillar dives more deeply into people and process because this is an area where we see the most opportunities for teams to improve. Second, we’ve pivoted heavily to focusing on whether your team and your workload are ready for runtime operations. Key to this is ensuring that in the early phases of design that you think about how your architecture will be operated. Reflecting on this we realized that Operational Excellence should be the first pillar to support the “Architect for run-time operations” approach.

We’ve also added detail on how Amazon approaches technology architecture, covering topics such as our Principal Engineering Community and two-way doors and mechanisms. We refreshed the other pillars to reflect the evolution of AWS, and the best practices we are seeing in the field. We have also added detail on the review process, in the surprisingly named “The Review Process” section.

As part of refreshing the pillars are have also released a new Operational Excellence Pillar whitepaper, and have updated the whitepapers for all of the other pillars of the Framework. For example we have significantly updated the Reliability Pillar whitepaper to provide guidance on application design for high availability. New sections cover techniques for high availability including and beyond infrastructure implications, and considerations across the application lifecycle. This updated whitepaper also provides examples that show how to achieve availability goals in single and multi-region architectures.

You can find free training and all of the ”state of the art” whitepapers on the AWS Well-Architected homepage:

Philip Fitzsimons, Leader, AWS Well-Architected Team