New AWS Auto Scaling – Unified Scaling For Your Cloud Applications

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-auto-scaling-unified-scaling-for-your-cloud-applications/

I’ve been talking about scalability for servers and other cloud resources for a very long time! Back in 2006, I wrote “This is the new world of scalable, on-demand web services. Pay for what you need and use, and not a byte more.” Shortly after we launched Amazon Elastic Compute Cloud (EC2), we made it easy for you to do this with the simultaneous launch of Elastic Load Balancing, EC2 Auto Scaling, and Amazon CloudWatch. Since then we have added Auto Scaling to other AWS services including ECS, Spot Fleets, DynamoDB, Aurora, AppStream 2.0, and EMR. We have also added features such as target tracking to make it easier for you to scale based on the metric that is most appropriate for your application.

Introducing AWS Auto Scaling
Today we are making it easier for you to use the Auto Scaling features of multiple AWS services from a single user interface with the introduction of AWS Auto Scaling. This new service unifies and builds on our existing, service-specific, scaling features. It operates on any desired EC2 Auto Scaling groups, EC2 Spot Fleets, ECS tasks, DynamoDB tables, DynamoDB Global Secondary Indexes, and Aurora Replicas that are part of your application, as described by an AWS CloudFormation stack or in AWS Elastic Beanstalk (we’re also exploring some other ways to flag a set of resources as an application for use with AWS Auto Scaling).

You no longer need to set up alarms and scaling actions for each resource and each service. Instead, you simply point AWS Auto Scaling at your application and select the services and resources of interest. Then you select the desired scaling option for each one, and AWS Auto Scaling will do the rest, helping you to discover the scalable resources and then creating a scaling plan that addresses the resources of interest.

If you have tried to use any of our Auto Scaling options in the past, you undoubtedly understand the trade-offs involved in choosing scaling thresholds. AWS Auto Scaling gives you a variety of scaling options: You can optimize for availability, keeping plenty of resources in reserve in order to meet sudden spikes in demand. You can optimize for costs, running close to the line and accepting the possibility that you will tax your resources if that spike arrives. Alternatively, you can aim for the middle, with a generous but not excessive level of spare capacity. In addition to optimizing for availability, cost, or a blend of both, you can also set a custom scaling threshold. In each case, AWS Auto Scaling will create scaling policies on your behalf, including appropriate upper and lower bounds for each resource.
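
To build some intuition for what a target tracking policy is doing, here is a minimal Scala sketch of the idea (purely illustrative, not AWS’s actual algorithm): capacity is scaled in proportion to how far the tracked metric is from its target, then clamped to the bounds that the scaling plan defines.

// Conceptual sketch only — not AWS's implementation of target tracking.
def desiredCapacity(current: Int, metric: Double, target: Double,
                    minCap: Int, maxCap: Int): Int = {
    // Scale capacity in proportion to metric/target, then clamp to the plan's bounds.
    val raw = math.ceil(current * metric / target).toInt
    math.max(minCap, math.min(maxCap, raw))
}

// Example: 4 instances at 80% average CPU with a 50% target scales out to 7 instances.
desiredCapacity(current = 4, metric = 80.0, target = 50.0, minCap = 2, maxCap = 10)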

AWS Auto Scaling in Action
I will use AWS Auto Scaling on a simple CloudFormation stack consisting of an Auto Scaling group of EC2 instances and a pair of DynamoDB tables. I start by removing the existing Scaling Policies from my Auto Scaling group:

Then I open up the new Auto Scaling Console and select the stack:

Behind the scenes, Elastic Beanstalk applications are always launched via a CloudFormation stack. In the screen shot above, awseb-e-sdwttqizbp-stack is an Elastic Beanstalk application that I launched.

I can click on any stack to learn more about it before proceeding:

I select the desired stack and click on Next to proceed. Then I enter a name for my scaling plan and choose the resources that I’d like it to include:

I choose the scaling strategy for each type of resource:

After I have selected the desired strategies, I click Next to proceed. Then I review the proposed scaling plan, and click Create scaling plan to move ahead:

The scaling plan is created and in effect within a few minutes:

I can click on the plan to learn more:

I can also inspect each scaling policy:

I tested my new policy by applying a load to the initial EC2 instance, and watched the scale out activity take place:

I also took a look at the CloudWatch metrics for the EC2 Auto Scaling group:

Available Now
We are launching AWS Auto Scaling in the US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), and Asia Pacific (Singapore) Regions today, with more to follow. There’s no charge for AWS Auto Scaling; you pay only for the CloudWatch Alarms that it creates and any AWS resources that you consume.

As is often the case with our new services, this is just the first step on what we hope to be a long and interesting journey! We have a long roadmap, and we’ll be adding new features and options throughout 2018 in response to your feedback.

Jeff;

Game night 2: Detention, Viatoree, Paletta

Post Syndicated from Eevee original https://eev.ee/blog/2018/01/16/game-night-2-detention-viatoree-paletta/

Game night continues with:

  • Detention
  • Viatoree
  • Paletta

These are impressions, not reviews. I try to avoid major/ending spoilers, but big plot points do tend to leave impressions.

Detention

longish · inventory horror · jan 2017 · lin/mac/win · $12 on steam · website

“Inventory horror” is a hell of a genre.

I think this one came from a Twitter thread where glip asked for indie horror recommendations. It’s apparently well-known enough to have a Wikipedia article, but I hadn’t heard of it before.

I love love love the aesthetic here. It’s obviously 2Dish from a side view (though there’s plenty of parallax in a lot of places), and it’s all done with… papercraft? I think of it as papercraft. Everything is built out of painted chunks that look like they were cut out of paper. It’s most obvious when watching the protagonist move around; her legs and skirt swivel as she walks.

Less obvious are the occasional places where tiny details repeat in the background because a paper cutout was reused. I don’t bring that up as a dig on the art; on the contrary, I really liked noticing that once or twice. It made the world feel like it was made with a tileset (albeit with very large chunky tiles), like it’s slightly artificial. I’m used to seeing sidescrollers made from tiles, of course, but the tiles are usually colorful and cartoony pixel art; big gritty full-color tiles are unusual and eerie.

And that’s a good thing in a horror game! Detention’s setting is already slightly unreal, and it’s made all the moreso by my Western perspective: it takes place in a Taiwanese school in the 60’s, a time when Taiwan was apparently under martial law. The Steam page tells you this, but I didn’t even know that much when we started playing, so I’d effectively been dropped somewhere on the globe and left to collect the details myself. Even figuring out we were in Taiwan (rather than mainland China) felt like an insight.

Thinking back, it was kind of a breath of fresh air. Games can be pretty heavy-handed about explaining the setting, but I never got that feeling from Detention. There’s more than enough context to get what’s going on, but there are no “stop and look at the camera while monologuing some exposition” moments. The developers are based in Taiwan, so it’s possible the setting is plenty familiar to them, and my perception of it is a complete accident. Either way, it certainly made an impact. Death of the author and whatnot, I suppose.

One thing in particular that stood out: none of the Chinese text in the environment is directly translated. The protagonist’s thoughts still give away what it says — “this is the nurse’s office” and the like — but that struck me as pretty different from simply repeating the text in English as though I were reading a sign in an RPG. The text is there, perfectly legible, but I can’t read it; I can only ask the protagonist to read it and offer her thoughts. It drives home that I’m experiencing the world through the eyes of the protagonist, who is their own person with their own impression of everything. Again, this is largely an emergent property of the game’s being designed in a culture that is not mine, but I’m left wondering how much thought went into this style of localization.

The game itself sees you wandering through a dark and twisted version of the protagonist’s school, collecting items and solving puzzles with them. There’s no direct combat, though some places feature a couple varieties of spirits called lingered which you have to carefully avoid. As the game progresses, the world starts to break down, alternating between increasingly abstract and increasingly concrete as we find out who the protagonist is and why she’s here.

The payoff is very personal and left a lasting impression… though as I look at the Wikipedia page now, it looks like the ending we got was the non-canon bad ending?! Well, hell. The bad ending is still great, then.

The whole game has a huge Silent Hill vibe, only without the combat and fog. Frankly, the genre might work better without combat; personal demons are more intimidating and meaningful when you can’t literally shoot them with a gun until they’re dead.

FINAL SCORE: 拾

Viatoree

short · platformer · sep 2013 · win · free on itch

I found this because @itchio tweeted about it, and the phrase “atmospheric platform exploration game” is the second most beautiful sequence of words in the English language.

The first paragraph on the itch.io page tells you the setup. That paragraph also contains more text than the entire game. In short: there are five things, and you need to find them. You can walk, jump, and extend your arms straight up to lift yourself to the ceiling. That’s it. No enemies, no shooting, no NPCs (more or less).

The result is, indeed, an atmospheric platform exploration game. The foreground is entirely 1-bit pixel art, save for the occasional white pixel to indicate someone’s eyes, and the background is only a few shades of the same purple hue. The game becomes less about playing and more about just looking at the environmental detail, appreciating how much texture the game manages to squeeze out of chunky colorless pixels. The world is still alive, too, much moreso than most platformers; tiny critters appear here and there, doing some wandering of their own, completely oblivious to you.

The game is really short, but it… just… makes me happy. I’m happy that this can exist, that not only is it okay for someone to make a very compact and short game, but that the result can still resonate with me. Not everything needs to be a sprawling epic or ask me to dedicate hours of time. It takes a few tiny ideas, runs with them, does what it came to do, and ends there. I love games like this.

That sounds silly to write out, but it’s been hard to get into my head! I do like experimenting, but I also feel compelled to reach for the grandiose, and grandiose experiment sounds more like mad science than creative exploration. For whatever reason, Viatoree convinced me that it’s okay to do a small thing, in a way that no other jam game has. It was probably the catalyst that led me to make Roguelike Simulator, and I thank it for that.

Unfortunately, we collected four of the five macguffins before hitting upon a puzzle we couldn’t make heads or tails of. After about ten minutes of fruitless searching, I decided to abandon this one unfinished, rather than bore my couch partner to tears. Maybe I’ll go take another stab at it after I post this.

FINAL SCORE: ●●●●○

Paletta

medium · puzzle story · nov 2017 · win · free on itch

Paletta, another RPG Maker work, won second place in the month-long Indie Game Maker Contest 2017. Nice! Apparently MOOP came in fourth in the same jam; also nice! I guess that’s why both of them ended up on the itch front page.

The game is set in a world drained of color, and you have to go restore it. Each land contains one lost color, and each color gives you a corresponding spell, which is generally used for some light puzzle-solving in further lands. It’s a very cute and light-hearted game, and it actually does an impressive job of obscuring its RPG Maker roots.

The world feels a little small to me, despite having fairly spacious maps. The progression is pretty linear: you enter one land, talk to a small handful of NPCs, solve the one puzzle, get the color, and move on. I think all the areas were continuously connected, too, which may have thrown me off a bit — these areas are described as though they were vast regions, but they’re all a hundred feet wide and nestled right next to each other.

I love playing with color as a concept, and I wish the game had run further with it somehow. Rescuing a color does add some color back to the world, but at times it seemed like the color that reappeared was somewhat arbitrary? It’s not like you rescue green and now all the green is back. Thinking back on it now, I wonder if each rescued color actually changed a fixed set of sprites from gray to colorized? But it’s been a month (oops) and now I’m not sure.

I’m not trying to pick on the authors for the brevity of their jam game, which is also the first game they’ve ever finished. I enjoyed playing it and found it plenty charming! It just happens that this time, what left the biggest impression on me was a nebulous feeling that something was missing. I think that’s still plenty important to ponder.

FINAL SCORE: ❤️💛💚💙💜

Early Challenges: Managing Cash Flow

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/managing-cash-flow/

Cash flow projection charts

This post by Backblaze’s CEO and co-founder Gleb Budman is the eighth in a series about entrepreneurship. You can choose posts in the series from the list below:

  1. How Backblaze got Started: The Problem, The Solution, and the Stuff In-Between
  2. Building a Competitive Moat: Turning Challenges Into Advantages
  3. From Idea to Launch: Getting Your First Customers
  4. How to Get Your First 1,000 Customers
  5. Surviving Your First Year
  6. How to Compete with Giants
  7. The Decision on Transparency
  8. Early Challenges: Managing Cash Flow


Running out of cash is one of the quickest ways for a startup to go out of business. When you are starting a company the question of where to get cash is usually the top priority, but managing cash flow is critical for every stage in the lifecycle of a company. As a primarily bootstrapped but capital-intensive business, managing cash flow at Backblaze was and still is a key element of our success and requires continued focus. Let’s look at what we learned over the years.

Raising Your Initial Funding

When starting a tech business in Silicon Valley, the default assumption is that you will immediately try to raise venture funding. There are certainly many advantages to raising funding — not the least of which is that you don’t need to be cash-flow positive since you have cash in the bank and the expectation is that you will have a “burn rate,” i.e. you’ll be spending more than you make.

Note: While you’re not expected to be cash-flow positive, that doesn’t mean you don’t have to worry about cash. Cash-flow management will determine your burn rate. Whether you can get to cash-flow breakeven or need to raise another round of funding is a direct byproduct of your cash flow management.
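
As a back-of-the-envelope sketch (all numbers here are hypothetical), burn rate and runway fall straight out of the monthly cash flow:

// Hypothetical numbers, purely for illustration.
val monthlyExpenses = 60000.0    // salaries, servers, rent, ...
val monthlyRevenue  = 25000.0
val cashInBank      = 280000.0

val burnRate     = monthlyExpenses - monthlyRevenue   // $35,000/month of net cash out
val runwayMonths = cashInBank / burnRate              // 8 months until the bank account hits zero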

Also, raising funding takes time (most successful fundraising cycles take 3-6 months start-to-finish), and time at a startup is in short supply. Constantly trying to raise funding can take away from product development and pursuing growth opportunities. If you’re not successful in raising funding, you then have to either shut down or find an alternate method of funding the business.

Sources of Funding

Depending on the stage of the company, type of company, and other factors, you may have access to different sources of funding. Let’s list a number of them:

Customers

Sales — the best kind of funding. It is non-dilutive, doesn’t have to be paid back, and is a direct metric of the success of your company.

Pre-Sales — some customers may be willing to pay you for a product in beta, a test, or pre-pay for a product they’ll receive when finished. Pre-Sales income also is great because it shares the characteristics of cash from sales, but you get the cash early. It also can be a good sign that the product you’re building fills a market need. We started charging for Backblaze computer backup while it was still in private beta, which allowed us to not only collect cash from customers, but also test the billing experience and users’ real desire for the service.

Services — if you’re a service company and customers are paying you for that, great. You can effectively scale for the number of hours available in a day. As demand grows, you can add more employees to increase the total number of billable hours.

Note: If you’re a product company and customers are paying you to consult, that can provide much needed cash, and could provide feedback toward the right product. However, it can also distract from your core business, send you down a path where you’re building a product for a single customer, and addict you to a path that prevents you from building a scalable business.

Investors

Yourself — you likely are putting your time into the business, and deferring salary in the process. You may also put your own cash into the business either as an investment or a loan.

Angels — angels are ideal as early investors since they are used to investing in businesses with little to no traction. AngelList is a good place to find them, though finding people you’re connected with through someone who knows you well is best.

Crowdfunding — a component of the JOBS Act has permitted entrepreneurs to raise money from nearly anyone since May 2016. The SEC imposes limits on both investors and the companies. This article goes into some depth on the options and sites available.

VCs — VCs are ideal for companies that need to raise at least a few million dollars and intend to build a business that will be worth over $1 billion.

Debt

Friends & Family — F&F are often the first people to give you money because they are investing in you. It’s great to have some early supporters, but it also can be risky to take money from people who aren’t used to the risks. The key advice here is to only take money from people who won’t mind losing it. If someone is talking about using their children’s college funds or borrowing from their 401k, say ‘no thank you’ — even if they’re sure they want to loan you money.

Bank Loans — a variety of loan types exist, but most either require the company to have been operational for a couple years, be able to borrow against money the company has or is making, or be able to get a personal guarantee from the founders whereby their own credit is on the line. Fundera provides a good overview of loan options and can help secure some, but most will not be an option for a brand new startup.

Grants

Government — in some areas there is the potential for government grants to facilitate research. The SBIR program facilitates some such grants.

At Backblaze, we used a number of these options:

• Investors/Yourself
We loaned a cumulative total of a couple hundred thousand dollars to the company and invested our time by going without a salary for a year and a half.
• Customers/Pre-Sales
We started selling the Backblaze service while it was still in beta.
• Customers/Sales
We launched v1.0 and kept selling.
• Investors/Angels
After a year and a half, we raised $370k from 11 angels. All of them were either people whom we knew personally or were a strong recommendation from a mutual friend.
• Debt/Loans
After a couple years we were able to get equipment leases whereby the Storage Pods and hard drives were used as collateral to secure the lease on them.
• Investors/VCs
After five years we raised $5m from TMT Investments to add to the balance sheet and invest in growth.

The variety and quantity of sources we used are by no means uncommon.

GAAP vs. Cash

Most companies start tracking financials based on cash, and as they scale they switch to GAAP (Generally Accepted Accounting Principles). Cash is easier to track — we got paid $XXXX and spent $YYY — and as often mentioned, is required for the business to stay alive. GAAP has more subtlety and complexity, but provides a clearer picture of how the business is really doing. Backblaze was on a ‘cash’ system for the first few years, then switched to GAAP. For this post, I’m going to focus on things that help cash flow, not GAAP profitability.

Stages of Cash Flow Management

All-spend

In a pure service business (e.g., a sole-proprietor law firm), you may have no expenses other than your time, so this stage doesn’t exist. However, in a product business there is a period of time where you are building the product and have nothing to sell. You have zero cash coming in, but have cash going out. Your cash-flow is completely negative and you need funds to cover that.

Sales-generating

Starting to see cash come in from customers is thrilling. I initially had our system set up to email me with every $5 payment we received. You’re making sales, but not covering expenses.

Ramen-profitable

But it takes a lot of $5 payments to pay for servers and salaries, so for a while expenses are likely to outstrip sales. Getting to ramen-profitable is a critical stage where sales cover the business expenses and are “paying enough for the founders to eat ramen.” This extends the runway for a business, but is not completely sustainable, since presumably the founders can’t (or won’t) live forever on a subsistence salary.

Business-profitable

This is the ultimate stage whereby the business is truly profitable, including paying everyone market-rate salaries. A business at this stage is self-sustaining. (Of course, market shifts and plenty of other challenges can kill the business, but cash-flow issues alone will not.)

Note, I’m using the word ‘profitable’ here to mean this is still on a cash-basis.

Backblaze was in the all-spend stage for just over a year, during which time we built the service and hadn’t yet made the service available to customers. Backblaze was in the sales-generating stage for nearly another year before the company was barely ramen-profitable where sales were covering the company expenses and paying the founders minimum wage. (I say ‘barely’ since minimum wage in the SF Bay Area is arguably never subsistence.) It took almost three more years before the company was business-profitable, paying everyone including the founders market-rate.

Cash Flow Forecasting

When raising funding it’s helpful to think of milestones reached. You don’t necessarily need enough cash on day one to last for the next 100 years of the company. Some good milestones to consider are how much cash you need to prove there is a market need, prove you can build a product to meet that need, or get to ramen-profitable.

Two things to consider:

1) Unit Economics (COGS)

If your product is 100% software, this may not be relevant. Once software is built it costs effectively nothing to deliver the product to one customer or one million customers. However, in most businesses there is some incremental cost to provide the product. If you’re selling a hardware device, perhaps you sell it for $100 but it costs you $50 to make it. This is called “COGS” (Cost of Goods Sold).

Many products rely on cloud services where the costs scale with growth. That model works great, but it’s still important to understand what the costs are for the cloud service you use per unit of product you sell.

Support is often done by the founders early-on in a business, but that is another real cost to factor in and estimate on a per-user basis. Taking all of the per unit costs combined, you may charge $10/month/user for your service, but if it costs you $7/month/user in cloud services, you’re only netting $3/month/user.

2) Operating Expenses (OpEx)

These are expenses that don’t scale with the number of product units you sell. Typically this includes research & development, sales & marketing, and general & administrative expenses. Presumably there is a certain level of these functions required to build the product, market it, sell it, and run the organization. You can choose to invest or cut back on these, but you’ll still make the same amount per product unit.

Incremental Net Profit Per Unit

If you’ve calculated your COGS and your unit economics are “upside down,” where the amount you charge is less than what it costs you to provide your service, it’s worth thinking hard about how that’s going to change over time. If it will not change, there is no scale that will make the business work. Presuming you do make money on each unit of product you sell — what is sometimes referred to as “Contribution Margin” — consider how many of those product units you need to sell to cover your operating expenses as described above.

Calculating Your Profit

The math on getting to ramen-profitable is simple:

(Number of Product Units Sold x Contribution Margin) - Operating Expenses = Profit

If your operating expenses include subsistence salaries for the founders and profit > $0, you’re ramen-profitable.
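
Plugging hypothetical numbers into that formula (a sketch, not Backblaze’s actual figures) makes the break-even point concrete:

// Hypothetical numbers, purely for illustration.
val pricePerUnit       = 10.0     // $/user/month charged
val cogsPerUnit        = 7.0      // $/user/month in cloud services, support, etc.
val contributionMargin = pricePerUnit - cogsPerUnit    // $3/user/month

val operatingExpenses  = 9000.0   // monthly OpEx, including subsistence founder salaries
val breakEvenUnits     = math.ceil(operatingExpenses / contributionMargin).toInt   // 3,000 users

val unitsSold = 3500
val profit    = unitsSold * contributionMargin - operatingExpenses   // $1,500 > $0, so ramen-profitable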

Improving Cash Flow

Having access to sources of cash, whether from selling to customers or other methods, is excellent. But needing less cash gives you more choices and allows you to either dilute less, owe less, or invest more.

There are two ways to improve cash flow:

1) Collect More Cash

The best way to collect more cash is to provide more value to your customers and as a result have them pay you more. Additional features/products/services can allow this. However, you can also collect more cash by changing how you charge for your product. If you have a subscription, changing from charging monthly to yearly dramatically improves your cash flow. If you have a product that customers use up, selling a year’s supply instead of selling them one-by-one can help.
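
As a quick sketch of how much the billing cycle matters (hypothetical numbers again): moving the same subscribers from monthly to annual billing pulls nearly a year of revenue forward into the current month.

// Hypothetical numbers, purely for illustration.
val subscribers  = 1000
val monthlyPrice = 10.0
val annualPrice  = 100.0   // e.g., two months free as an incentive to prepay

val cashCollectedThisMonthMonthly = subscribers * monthlyPrice   // $10,000 collected now
val cashCollectedThisMonthAnnual  = subscribers * annualPrice    // $100,000 collected now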

2) Spend Less Cash

Reducing COGS is a fantastic way to spend less cash in a scalable way. If you can do this without harming the product or customer experience, you win. There are a myriad of ways to also reduce operating expenses, including taking sub-market salaries, using your home instead of renting office space, staying focused on your core product, etc.

Ultimately, collecting more and spending less cash dramatically simplifies the process of getting to ramen-profitable and later to business-profitable.

Be Careful (Why GAAP Matters)

A word of caution: while running out of cash will put you out of business immediately, overextending yourself will likely put you out of business not much later. GAAP shows how a business is really doing; cash doesn’t. If you only focus on cash, it is possible to commit yourself to both delivering products and repaying loans in the future in an unsustainable fashion. If you’re taking out loans, watch the total balance and monthly payments you’re committing to. If you’re asking customers for pre-payment, make sure you believe you can deliver on what they’ve paid for.

Summary

There are numerous challenges to building a business, and ensuring you have enough cash is amongst the most important. Having the cash to keep going lets you keep working on all of the other challenges. The frameworks above were critical for maintaining Backblaze’s cash flow and cash balance. Hopefully you can take some of the lessons we learned and apply them to your business. Let us know what works for you in the comments below.


Hollywood Wins ISP Blockade Against Popular Pirate Sites in Ireland

Post Syndicated from Ernesto original https://torrentfreak.com/hollywood-wins-isp-blockade-against-popular-pirate-sites-in-ireland-180116/

Like many other countries throughout Europe, Ireland is no stranger to pirate site blocking efforts.

The Pirate Bay was blocked back in 2009, as part of a voluntary agreement between copyright holders and local ISP Eircom. A few years later the High Court ordered other major Internet providers to follow suit.

However, The Pirate Bay is not the only ‘infringing’ site out there. The Motion Picture Association (MPA) has therefore asked the Commercial Court to expand the blockades to other sites.

On behalf of several major Hollywood studios, the group most recently targeted several of the most-used torrent and streaming sites: 1337x.io, EZTV.ag, Bmovies.is, 123movieshub.to, Putlocker.io, RARBG.to, Gowatchfreemovies.to and YTS.am.

On Monday the Commercial Court sided with the movie studios, ordering all major Irish ISPs to block the sites. The latest order applies to Eircom, Sky Ireland, Vodafone Ireland, Virgin Media Ireland, Three Ireland, Digiweb, Imagine Telecommunications and Magnet Networks.

According to Justice Brian McGovern, the movie studios had made it clear that the sites in question infringed their copyrights. As such, there are “significant public interest grounds” to have them blocked.

Irish Examiner reports that none of the ISPs opposed the blocking request. This means that new pirate site blockades are mostly a formality now.

MPA EMEA President and Managing Director Stan McCoy is happy with the outcome, which he says will help to secure jobs in the movie industry.

“As the Irish film industry is continuing to thrive, the MPA is dedicated to supporting that growth by combatting the operations of illegal sites that undermine the sustainability of the sector,” McCoy says.

“Preventing these pirate sites from freely distributing other people’s work will help us provide greater job security for the 18,000 people employed through the Irish film industry and ensure that consumers can continue to enjoy high quality content in the future.”

The MPA also obtained similar blocks against movie4k.to, primewire.ag, and onwatchseries.to last year, which remain in effect to date.

The torrent and streaming sites that were targeted most recently have millions of visitors worldwide. While the blockades will make it harder for the Irish to access them directly, history has shown that some people circumvent these measures or simply move to other sites.

Several of the targeted sites themselves are also keeping a close eye on these blocking efforts and are providing users with alternative domains to bypass the restrictions, at least temporarily.

As such, it would be no surprise if the Hollywood studios return to the Commercial Court again in a few months.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

AWS Glue Now Supports Scala Scripts

Post Syndicated from Mehul Shah original https://aws.amazon.com/blogs/big-data/aws-glue-now-supports-scala-scripts/

We are excited to announce AWS Glue support for running ETL (extract, transform, and load) scripts in Scala. Scala lovers can rejoice because they now have one more powerful tool in their arsenal. Scala is the native language for Apache Spark, the underlying engine that AWS Glue offers for performing data transformations.

Beyond its elegant language features, writing Scala scripts for AWS Glue has two main advantages over writing scripts in Python. First, Scala is faster for custom transformations that do a lot of heavy lifting because there is no need to shovel data between Python and Apache Spark’s Scala runtime (that is, the Java virtual machine, or JVM). You can build your own transformations or invoke functions in third-party libraries. Second, it’s simpler to call functions in external Java class libraries from Scala because Scala is designed to be Java-compatible. It compiles to the same bytecode, and its data structures don’t need to be converted.

To illustrate these benefits, we walk through an example that analyzes a recent sample of the GitHub public timeline available from the GitHub archive. This site is an archive of public requests to the GitHub service, recording more than 35 event types ranging from commits and forks to issues and comments.

This post shows how to build an example Scala script that identifies highly negative issues in the timeline. It pulls out issue events in the timeline sample, analyzes their titles using the sentiment prediction functions from the Stanford CoreNLP libraries, and surfaces the most negative issues.

Getting started

Before we start writing scripts, we use AWS Glue crawlers to get a sense of the data—its structure and characteristics. We also set up a development endpoint and attach an Apache Zeppelin notebook, so we can interactively explore the data and author the script.

Crawl the data

The dataset used in this example was downloaded from the GitHub archive website into our sample dataset bucket in Amazon S3, and copied to the following locations:

s3://aws-glue-datasets-<region>/examples/scala-blog/githubarchive/data/

Choose the best folder by replacing <region> with the region that you’re working in, for example, us-east-1. Crawl this folder, and put the results into a database named githubarchive in the AWS Glue Data Catalog, as described in the AWS Glue Developer Guide. This folder contains 12 hours of the timeline from January 22, 2017, and is organized hierarchically (that is, partitioned) by year, month, and day.

When finished, use the AWS Glue console to navigate to the table named data in the githubarchive database. Notice that this data has eight top-level columns, which are common to each event type, and three partition columns that correspond to year, month, and day.

Choose the payload column, and you will notice that it has a complex schema—one that reflects the union of the payloads of event types that appear in the crawled data. Also note that the schema that crawlers generate is a subset of the true schema because they sample only a subset of the data.

Set up the library, development endpoint, and notebook

Next, you need to download and set up the libraries that estimate the sentiment in a snippet of text. The Stanford CoreNLP libraries contain a number of human language processing tools, including sentiment prediction.

Download the Stanford CoreNLP libraries. Unzip the .zip file, and you’ll see a directory full of jar files. For this example, the following jars are required:

  • stanford-corenlp-3.8.0.jar
  • stanford-corenlp-3.8.0-models.jar
  • ejml-0.23.jar

Upload these files to an Amazon S3 path that is accessible to AWS Glue so that it can load these libraries when needed. For this example, they are in s3://glue-sample-other/corenlp/.

Development endpoints are static Spark-based environments that can serve as the backend for data exploration. You can attach notebooks to these endpoints to interactively send commands and explore and analyze your data. These endpoints have the same configuration as that of AWS Glue’s job execution system. So, commands and scripts that work there also work the same when registered and run as jobs in AWS Glue.

To set up an endpoint and a Zeppelin notebook to work with that endpoint, follow the instructions in the AWS Glue Developer Guide. When you are creating an endpoint, be sure to specify the locations of the previously mentioned jars in the Dependent jars path as a comma-separated list. Otherwise, the libraries will not be loaded.

After you set up the notebook server, go to the Zeppelin notebook by choosing Dev Endpoints in the left navigation pane on the AWS Glue console. Choose the endpoint that you created. Next, choose the Notebook Server URL, which takes you to the Zeppelin server. Log in using the notebook user name and password that you specified when creating the notebook. Finally, create a new note to try out this example.

Each notebook is a collection of paragraphs, and each paragraph contains a sequence of commands and the output for that command. Moreover, each notebook includes a number of interpreters. If you set up the Zeppelin server using the console, the (Python-based) pyspark and (Scala-based) spark interpreters are already connected to your new development endpoint, with pyspark as the default. Therefore, throughout this example, you need to prepend %spark at the top of your paragraphs. In this example, we omit these for brevity.

Working with the data

In this section, we use AWS Glue extensions to Spark to work with the dataset. We look at the actual schema of the data and filter out the interesting event types for our analysis.

Start with some boilerplate code to import libraries that you need:

%spark

import com.amazonaws.services.glue.DynamicRecord
import com.amazonaws.services.glue.GlueContext
import com.amazonaws.services.glue.util.GlueArgParser
import com.amazonaws.services.glue.util.Job
import com.amazonaws.services.glue.util.JsonOptions
import com.amazonaws.services.glue.types._
import org.apache.spark.SparkContext

Then, create the Spark and AWS Glue contexts needed for working with the data:

@transient val spark: SparkContext = SparkContext.getOrCreate()
val glueContext: GlueContext = new GlueContext(spark)

You need the @transient annotation on the SparkContext when working in Zeppelin; otherwise, you will run into a serialization error when executing commands.

Dynamic frames

This section shows how to create a dynamic frame that contains the GitHub records in the table that you crawled earlier. A dynamic frame is the basic data structure in AWS Glue scripts. It is like an Apache Spark data frame, except that it is designed and optimized for data cleaning and transformation workloads. A dynamic frame is well-suited for representing semi-structured datasets like the GitHub timeline.

A dynamic frame is a collection of dynamic records. In Spark lingo, it is an RDD (resilient distributed dataset) of DynamicRecords. A dynamic record is a self-describing record. Each record encodes its columns and types, so every record can have a schema that is unique from all others in the dynamic frame. This is convenient and often more efficient for datasets like the GitHub timeline, where payloads can vary drastically from one event type to another.

The following creates a dynamic frame, github_events, from your table:

val github_events = glueContext
                    .getCatalogSource(database = "githubarchive", tableName = "data")
                    .getDynamicFrame()

The getCatalogSource() method returns a DataSource, which represents a particular table in the Data Catalog. The getDynamicFrame() method returns a dynamic frame from the source.

Recall that the crawler created a schema from only a sample of the data. You can scan the entire dataset, count the rows, and print the complete schema as follows:

github_events.count
github_events.printSchema()

The result looks like the following:

The data has 414,826 records. As before, notice that there are eight top-level columns, and three partition columns. If you scroll down, you’ll also notice that the payload is the most complex column.

Run functions and filter records

This section describes how you can create your own functions and invoke them seamlessly to filter records. Unlike filtering with Python lambdas, Scala scripts do not need to convert records from one language representation to another, thereby reducing overhead and running much faster.

Let’s create a function that picks only the IssuesEvents from the GitHub timeline. These events are generated whenever someone posts an issue for a particular repository. Each GitHub event record has a field, “type”, that indicates the kind of event it is. The issueFilter() function returns true for records that are IssuesEvents.

def issueFilter(rec: DynamicRecord): Boolean = { 
    rec.getField("type").exists(_ == "IssuesEvent") 
}

Note that the getField() method returns an Option[Any] type, so you first need to check that it exists before checking the type.

You pass this function to the filter transformation, which applies the function on each record and returns a dynamic frame of those records that pass.

val issue_events =  github_events.filter(issueFilter)
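
The existence check above is all the filter needs. If you want the field’s value itself — to branch on it, say — one approach (a small illustrative helper, not part of the AWS Glue API) is to pattern match on the Option[Any] that getField() returns:

// Illustrative helper (not part of the AWS Glue API): safely extract a String field.
def stringField(rec: DynamicRecord, name: String): Option[String] =
    rec.getField(name) match {
        case Some(s: String) => Some(s)
        case _               => None
    }

// For example, stringField(rec, "type").contains("IssuesEvent") mirrors issueFilter() above.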

Now, let’s look at the size and schema of issue_events.

issue_events.count
issue_events.printSchema()

It’s much smaller (14,063 records), and the payload schema is less complex, reflecting only the schema for issues. Keep a few essential columns for your analysis, and drop the rest using the ApplyMapping() transform:

val issue_titles = issue_events.applyMapping(Seq(("id", "string", "id", "string"),
                                                 ("actor.login", "string", "actor", "string"), 
                                                 ("repo.name", "string", "repo", "string"),
                                                 ("payload.action", "string", "action", "string"),
                                                 ("payload.issue.title", "string", "title", "string")))
issue_titles.show()

The ApplyMapping() transform is quite handy for renaming columns, casting types, and restructuring records. The preceding code snippet tells the transform to select the fields (or columns) that are enumerated in the left half of the tuples and map them to the fields and types in the right half.

Estimating sentiment using Stanford CoreNLP

To focus on the most pressing issues, you might want to isolate the records with the most negative sentiments. The Stanford CoreNLP libraries are Java-based and offer sentiment-prediction functions. Accessing these functions through Python is possible, but quite cumbersome. It requires creating Python surrogate classes and objects for those found on the Java side. Instead, with Scala support, you can use those classes and objects directly and invoke their methods. Let’s see how.

First, import the libraries needed for the analysis:

import java.util.Properties
import edu.stanford.nlp.ling.CoreAnnotations
import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations
import edu.stanford.nlp.pipeline.{Annotation, StanfordCoreNLP}
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations
import scala.collection.convert.wrapAll._

The Stanford CoreNLP libraries have a main driver that orchestrates all of their analysis. The driver setup is heavyweight, setting up threads and data structures that are shared across analyses. Apache Spark runs on a cluster with a main driver process and a collection of backend executor processes that do most of the heavy sifting of the data.

The Stanford CoreNLP shared objects are not serializable, so they cannot be distributed easily across a cluster. Instead, you need to initialize them once for every backend executor process that might need them. Here is how to accomplish that:

val props = new Properties()
props.setProperty("annotators", "tokenize, ssplit, parse, sentiment")
props.setProperty("parse.maxlen", "70")

object myNLP {
    lazy val coreNLP = new StanfordCoreNLP(props)
}

The properties tell the libraries which annotators to execute and how many words to process. The preceding code creates an object, myNLP, with a field coreNLP that is lazily evaluated. This field is initialized only when it is needed, and only once. So, when the backend executors start processing the records, each executor initializes the driver for the Stanford CoreNLP libraries only one time.

Next is a function that estimates the sentiment of a text string. It first calls Stanford CoreNLP to annotate the text. Then, it pulls out the sentences and takes the average sentiment across all the sentences. The sentiment is a double, from 0.0 as the most negative to 4.0 as the most positive.

def estimatedSentiment(text: String): Double = {
    if ((text == null) || (!text.nonEmpty)) { return Double.NaN }
    val annotations = myNLP.coreNLP.process(text)
    val sentences = annotations.get(classOf[CoreAnnotations.SentencesAnnotation])
    sentences.foldLeft(0.0)( (csum, x) => { 
        csum + RNNCoreAnnotations.getPredictedClass(x.get(classOf[SentimentCoreAnnotations.SentimentAnnotatedTree])) 
    }) / sentences.length
}

Now, let’s estimate the sentiment of the issue titles and add that computed field as part of the records. You can accomplish this with the map() method on dynamic frames:

val issue_sentiments = issue_titles.map((rec: DynamicRecord) => { 
    val mbody = rec.getField("title")
    mbody match {
        case Some(mval: String) => { 
            rec.addField("sentiment", ScalarNode(estimatedSentiment(mval)))
            rec }
        case _ => rec
    }
})

The map() method applies the user-provided function on every record. The function takes a DynamicRecord as an argument and returns a DynamicRecord. The code above computes the sentiment, adds it in a top-level field, sentiment, to the record, and returns the record.

Count the records with sentiment and show the schema. This takes a few minutes because Spark must initialize the library and run the sentiment analysis, which can be involved.

issue_sentiments.count
issue_sentiments.printSchema()

Notice that all records were processed (14,063), and the sentiment value was added to the schema.

Finally, let’s pick out the titles that have the lowest sentiment (less than 1.5). Count them and print out a sample to see what some of the titles look like.

val pressing_issues = issue_sentiments.filter(_.getField("sentiment").exists(_.asInstanceOf[Double] < 1.5))
pressing_issues.count
pressing_issues.show(10)

Next, write them all to a file so that you can handle them later. (You’ll need to replace the output path with your own.)

glueContext.getSinkWithFormat(connectionType = "s3", 
                              options = JsonOptions("""{"path": "s3://<bucket>/out/path/"}"""), 
                              format = "json")
            .writeDynamicFrame(pressing_issues)

Take a look in the output path, and you can see the output files.

Putting it all together

Now, let’s create a job from the preceding interactive session. The following script combines all the commands from earlier. It processes the GitHub archive files and writes out the highly negative issues:

import com.amazonaws.services.glue.DynamicRecord
import com.amazonaws.services.glue.GlueContext
import com.amazonaws.services.glue.util.GlueArgParser
import com.amazonaws.services.glue.util.Job
import com.amazonaws.services.glue.util.JsonOptions
import com.amazonaws.services.glue.types._
import org.apache.spark.SparkContext
import java.util.Properties
import edu.stanford.nlp.ling.CoreAnnotations
import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations
import edu.stanford.nlp.pipeline.{Annotation, StanfordCoreNLP}
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations
import scala.collection.convert.wrapAll._

object GlueApp {

    object myNLP {
        val props = new Properties()
        props.setProperty("annotators", "tokenize, ssplit, parse, sentiment")
        props.setProperty("parse.maxlen", "70")

        lazy val coreNLP = new StanfordCoreNLP(props)
    }

    def estimatedSentiment(text: String): Double = {
        if ((text == null) || (!text.nonEmpty)) { return Double.NaN }
        val annotations = myNLP.coreNLP.process(text)
        val sentences = annotations.get(classOf[CoreAnnotations.SentencesAnnotation])
        sentences.foldLeft(0.0)( (csum, x) => { 
            csum + RNNCoreAnnotations.getPredictedClass(x.get(classOf[SentimentCoreAnnotations.SentimentAnnotatedTree])) 
        }) / sentences.length
    }

    def main(sysArgs: Array[String]) {
        val spark: SparkContext = SparkContext.getOrCreate()
        val glueContext: GlueContext = new GlueContext(spark)

        val dbname = "githubarchive"
        val tblname = "data"
        val outpath = "s3://<bucket>/out/path/"

        val github_events = glueContext
                            .getCatalogSource(database = dbname, tableName = tblname)
                            .getDynamicFrame()

        val issue_events =  github_events.filter((rec: DynamicRecord) => {
            rec.getField("type").exists(_ == "IssuesEvent")
        })

        val issue_titles = issue_events.applyMapping(Seq(("id", "string", "id", "string"),
                                                         ("actor.login", "string", "actor", "string"), 
                                                         ("repo.name", "string", "repo", "string"),
                                                         ("payload.action", "string", "action", "string"),
                                                         ("payload.issue.title", "string", "title", "string")))

        val issue_sentiments = issue_titles.map((rec: DynamicRecord) => { 
            val mbody = rec.getField("title")
            mbody match {
                case Some(mval: String) => { 
                    rec.addField("sentiment", ScalarNode(estimatedSentiment(mval)))
                    rec }
                case _ => rec
            }
        })

        val pressing_issues = issue_sentiments.filter(_.getField("sentiment").exists(_.asInstanceOf[Double] < 1.5))

        glueContext.getSinkWithFormat(connectionType = "s3", 
                              options = JsonOptions(s"""{"path": "$outpath"}"""), 
                              format = "json")
                    .writeDynamicFrame(pressing_issues)
    }
}

Notice that the script is enclosed in a top-level object called GlueApp, which serves as the script’s entry point for the job. (You’ll need to replace the output path with your own.) Upload the script to an Amazon S3 location so that AWS Glue can load it when needed.

To create the job, open the AWS Glue console. Choose Jobs in the left navigation pane, and then choose Add job. Create a name for the job, and specify a role with permissions to access the data. Choose An existing script that you provide, and choose Scala as the language.

For the Scala class name, type GlueApp to indicate the script’s entry point. Specify the Amazon S3 location of the script.

Choose Script libraries and job parameters. In the Dependent jars path field, enter the Amazon S3 locations of the Stanford CoreNLP libraries from earlier as a comma-separated list (without spaces). Then choose Next.

No connections are needed for this job, so choose Next again. Review the job properties, and choose Finish. Finally, choose Run job to execute the job.

You can simply edit the script’s input table and output path to run this job on whatever GitHub timeline datasets that you might have.

Conclusion

In this post, we showed how to write AWS Glue ETL scripts in Scala via notebooks and how to run them as jobs. Scala has the advantage that it is the native language for the Spark runtime. With Scala, it is easier to call Scala or Java functions and third-party libraries for analyses. Moreover, data processing is faster in Scala because there’s no need to convert records from one language runtime to another.

You can find more examples of Scala scripts in our GitHub examples repository: https://github.com/awslabs/aws-glue-samples. We encourage you to experiment with Scala scripts and let us know about any interesting ETL flows that you want to share.

Happy Glue-ing!

Additional Reading

If you found this post useful, be sure to check out Simplify Querying Nested JSON with the AWS Glue Relationalize Transform and Genomic Analysis with Hail on Amazon EMR and Amazon Athena.

About the Authors

Mehul Shah is a senior software manager for AWS Glue. His passion is leveraging the cloud to build smarter, more efficient, and easier to use data systems. He has three girls, and, therefore, he has no spare time.

Ben Sowell is a software development engineer at AWS Glue.

Vinay Vivili is a software development engineer for AWS Glue.

timeShift(GrafanaBuzz, 1w) Issue 29

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2018/01/12/timeshiftgrafanabuzz-1w-issue-29/

Welcome to TimeShift


Latest Stable Release

Grafana 4.6.3 is now available. Latest bugfixes include:

  • Gzip: Fixes bug with Gravatar images when gzip was enabled #5952
  • Alert list: Now shows alert state changes even after adding manual annotations on dashboard #99513
  • Alerting: Fixes bug where rules evaluated as firing when all conditions were false and the OR operator was used. #93183
  • Cloudwatch: CloudWatch no longer displays metrics’ default alias #101514, thx @mtanda

Download Grafana 4.6.3 Now


From the Blogosphere

Graphite 1.1: Teaching an Old Dog New Tricks: Grafana Labs’ own Dan Cech is a contributor to the Graphite project, and has been instrumental in the addition of some of the newest features. This article discusses five of the biggest additions, how they work, and what you can expect for the future of the project.

Instrument an Application Using Prometheus and Grafana: Chris walks us through how easy it is to get useful metrics from an application to understand bottlenecks and performance. In this article, he shares an application he built that indexes your Gmail account into Elasticsearch, and sends the metrics to Prometheus. Then, he shows you how to set up Grafana to get meaningful graphs and dashboards.

Visualising Serverless Metrics With Grafana Dashboards: Part 3 in this series of blog posts on “Monitoring Serverless Applications Metrics” starts with an overview of Grafana and the UI, covers queries and templating, then dives into creating some great looking dashboards. The series plans to conclude with a post about setting up alerting.

Huawei FAT WLAN Access Points in Grafana: Huawei’s FAT firmware for their WLAN Access Points lacks a central management overview. To get a sense of the performance of your APs, why not quickly create a templated dashboard in Grafana? This article quickly steps you through the process, and includes a sample dashboard.


Grafana Plugins

Lots of updated plugins this week. Plugin authors add new features and fix bugs often, to make your plugin perform better – so it’s important to keep your plugins up to date. We’ve made updating easy; for on-prem Grafana, use the Grafana-cli tool, or update with 1 click if you’re using Hosted Grafana.

UPDATED PLUGIN

Clickhouse Data Source – The Clickhouse Data Source plugin has been updated a few times with small fixes during the last few weeks.

  • Fix for quantile functions
  • Allow rounding with round option for both time filters: $from and $to

Update

UPDATED PLUGIN

Zabbix App – The Zabbix App had a release with a redesign of the Triggers panel, as well as support for multiple data sources in the triggers panel.

Update

UPDATED PLUGIN

OpenHistorian Data Source – this data source plugin received some new query builder screens and improved documentation.

Update

UPDATED PLUGIN

BT Status Dot Panel – This panel received a small bug fix.

Update

UPDATED PLUGIN

Carpet Plot Panel – A recent update for this panel fixes a D3 import bug.

Update


Upcoming Events

In between code pushes we like to speak at, sponsor and attend all kinds of conferences and meetups. We also like to make sure we mention other Grafana-related events happening all over the world. If you’re putting on just such an event, let us know and we’ll list it here.

Women Who Go Berlin: Go Workshop – Monitoring and Troubleshooting using Prometheus and Grafana | Berlin, Germany – Jan 31, 2018: In this workshop we will learn about one of the most important topics in making apps production ready: Monitoring. We will learn how to use tools you’ve probably heard a lot about – Prometheus and Grafana, and using what we learn we will troubleshoot a particularly buggy Go app.

Register Now

FOSDEM | Brussels, Belgium – Feb 3-4, 2018: FOSDEM is a free developer conference where thousands of developers of free and open source software gather to share ideas and technology. There is no need to register; all are welcome.

Jfokus | Stockholm, Sweden – Feb 5-7, 2018:
Carl Bergquist – Quickie: Monitoring? Not OPS Problem

Why should we monitor our system? Why can’t we just rely on the operations team anymore? They used to be able to do that. What’s currently changing? Presentation content:

  • Why do we monitor our system
  • How did it use to work?
  • What’s changing
  • Why do we need to shift focus
  • Everyone should be on call
  • Resilience is the goal (the best way of having someone care about quality is to make them responsible)

Register Now

Jfokus | Stockholm, Sweden – Feb 5-7, 2018:
Leonard Gram – Presentation: DevOps Deconstructed

What’s a Site Reliability Engineer and how’s that role different from the DevOps engineer my boss wants to hire? I really don’t want to be on call, should I? Is Docker the right place for my code or am I better off just going straight to Serverless? And why should I care about any of it? I’ll try to answer some of these questions while looking at what DevOps really is about and how commoditisation of servers through “the cloud” ties into it all. This session will be an opinionated piece from a developer who’s been on-call for the past 6 years and would like to convince you to do the same, at least once.

Register Now

Stockholm Metrics and Monitoring | Stockholm, Sweden – Feb 7, 2018:
Observability 3 ways – Logging, Metrics and Distributed Tracing

Let’s talk about often confused telemetry tools: Logging, Metrics and Distributed Tracing. We’ll show how you capture latency using each of the tools and how they work differently. Through examples and discussion, we’ll note edge cases where certain tools have advantages over others. By the end of this talk, we’ll better understand how each of Logging, Metrics and Distributed Tracing aids us in different ways to understand our applications.

Register Now

OpenNMS – Introduction to “Grafana” | Webinar – Feb 21, 2018:
IT monitoring helps detect emerging hardware damage and performance bottlenecks in the enterprise network before any consequential damage or disruption to business processes occurs. The powerful open-source OpenNMS software monitors a network, including all connected devices, and provides logging of a variety of data that can be used for analysis and planning purposes. In our next OpenNMS webinar on February 21, 2018, we introduce “Grafana” – a web-based tool for creating and displaying dashboards from various data sources, which can be perfectly combined with OpenNMS.

Register Now

GrafanaCon EU | Amsterdam, Netherlands – March 1-2, 2018:
Lock in your seat for GrafanaCon EU while there are still tickets available! Join us March 1-2, 2018 in Amsterdam for 2 days of talks centered around Grafana and the surrounding monitoring ecosystem, including Graphite, Prometheus, InfluxData, Elasticsearch, Kubernetes, and more.

We have some exciting talks lined up from Google, CERN, Bloomberg, eBay, Red Hat, Tinder, Automattic, Prometheus, InfluxData, Percona and more! Be sure to get your ticket before they’re sold out.

Learn More


Tweet of the Week

We scour Twitter each week to find an interesting/beautiful dashboard and show it off! #monitoringLove

Nice hack! I know I like to keep one eye on server requests when I’m dropping beats. 😉


Grafana Labs is Hiring!

We are passionate about open source software and thrive on tackling complex challenges to build the future. We ship code from every corner of the globe and love working with the community. If this sounds exciting, you’re in luck – WE’RE HIRING!

Check out our Open Positions


How are we doing?

Thanks for reading another issue of timeShift. Let us know what you think! Submit a comment on this article below, or post something at our community forum.

Follow us on Twitter, like us on Facebook, and join the Grafana Labs community.

Yet Another FBI Proposal for Insecure Communications

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/01/yet_another_fbi.html

Deputy Attorney General Rosenstein has given talks where he proposes that tech companies decrease their communications and device security for the benefit of the FBI. In a recent talk, his idea is that tech companies just save a copy of the plaintext:

Law enforcement can also partner with private industry to address a problem we call “Going Dark.” Technology increasingly frustrates traditional law enforcement efforts to collect evidence needed to protect public safety and solve crime. For example, many instant-messaging services now encrypt messages by default. They prevent the police from reading those messages, even if an impartial judge approves their interception.

The problem is especially critical because electronic evidence is necessary for both the investigation of a cyber incident and the prosecution of the perpetrator. If we cannot access data even with lawful process, we are unable to do our job. Our ability to secure systems and prosecute criminals depends on our ability to gather evidence.

I encourage you to carefully consider your company’s interests and how you can work cooperatively with us. Although encryption can help secure your data, it may also prevent law enforcement agencies from protecting your data.

Encryption serves a valuable purpose. It is a foundational element of data security and essential to safeguarding data against cyber-attacks. It is critical to the growth and flourishing of the digital economy, and we support it. I support strong and responsible encryption.

I simply maintain that companies should retain the capability to provide the government unencrypted copies of communications and data stored on devices, when a court orders them to do so.

Responsible encryption is effective secure encryption, coupled with access capabilities. We know encryption can include safeguards. For example, there are systems that include central management of security keys and operating system updates; scanning of content, like your e-mails, for advertising purposes; simulcast of messages to multiple destinations at once; and key recovery when a user forgets the password to decrypt a laptop. No one calls any of those functions a “backdoor.” In fact, those very capabilities are marketed and sought out.

I do not believe that the government should mandate a specific means of ensuring access. The government does not need to micromanage the engineering.

The question is whether to require a particular goal: When a court issues a search warrant or wiretap order to collect evidence of crime, the company should be able to help. The government does not need to hold the key.

Rosenstein is right that many services like Gmail naturally keep plaintext in the cloud. This is something we pointed out in our 2016 paper: “Don’t Panic.” But forcing companies to build an alternate means to access the plaintext that the user can’t control is an enormous vulnerability.

Graphite 1.1: Teaching an Old Dog New Tricks

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2018/01/11/graphite-1.1-teaching-an-old-dog-new-tricks/

The Road to Graphite 1.1

I started working on Graphite just over a year ago, when @obfuscurity asked me to help out with some issues blocking the Graphite 1.0 release. Little did I know that a year later, that would have resulted in 262 commits (and counting), and that with the help of the other Graphite maintainers (especially @deniszh, @iksaif & @cbowman0) we would have added a huge amount of new functionality to Graphite.

There are a huge number of new additions and updates in this release. In this post I’ll give a tour of some of the highlights, including tag support, syntax and function updates, custom function plugins, and Python 3.x support.

Tagging!

The single biggest feature in this release is the addition of tag support, which brings the ability to describe metrics in a much richer way and to write more flexible and expressive queries.

Traditionally series in Graphite are identified using a hierarchical naming scheme based on dot-separated segments called nodes. This works very well and is simple to map into a hierarchical structure like the whisper filesystem tree, but it means that the user has to know what each segment represents, and makes it very difficult to modify or extend the naming scheme since everything is based on the positions of the segments within the hierarchy.

The tagging system gives users the ability to encode information about the series in a collection of tag=value pairs which are used together with the series name to uniquely identify each series, and the ability to query series by specifying tag-based matching expressions rather than constructing glob-style selectors based on the positions of specific segments within the hierarchy. This is broadly similar to the system used by Prometheus and makes it possible to use Graphite as a long-term storage backend for metrics gathered by Prometheus with full tag support.

When using tags, series names are specified using the new tagged carbon format: name;tag1=value1;tag2=value2. This format is backward compatible with most existing carbon tooling, and makes it easy to adapt existing tools to produce tagged metrics simply by changing the metric names. The OpenMetrics format is also supported for ingestion, and is normalized into the standard Graphite format internally.
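
To get a concrete sense of how little changes on the sending side, here is a minimal sketch in Python that pushes one tagged metric to Carbon over the plaintext protocol. The host, port, metric name and tags are placeholders; the only Graphite-specific part is the name;tag=value format itself:

import socket
import time

# Placeholder endpoint: Carbon's plaintext listener defaults to port 2003.
CARBON_HOST = 'localhost'
CARBON_PORT = 2003

# A tagged series name in the new format: name;tag1=value1;tag2=value2
line = 'disk.used;datacenter=dc1;server=web01 42 %d\n' % int(time.time())

sock = socket.create_connection((CARBON_HOST, CARBON_PORT))
sock.sendall(line.encode('utf-8'))
sock.close()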

At its core, the tagging system is implemented as a tag database (TagDB) alongside the metrics that allows them to be efficiently queried by individual tag values rather than having to traverse the metrics tree looking for series that match the specified query. Internally the tag index is stored in one of a number of pluggable tag databases; currently supported options are the internal graphite-web database, Redis, or an external system that implements the Graphite tagging HTTP API. Carbon automatically keeps the index up to date with any tagged series seen.

The new seriesByTag function is used to query the TagDB and will return a list of all the series that match the expressions passed to it. seriesByTag supports both exact and regular expression matches, and can be used anywhere you would previously have specified a metric name or glob expression.

There are new dedicated functions for grouping and aliasing series by tag (groupByTags and aliasByTags), and you can also use tags interchangeably with node numbers in the standard Graphite functions like aliasByNode, groupByNodes, asPercent, mapSeries, etc.
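
As a rough illustration of what that looks like in practice (the metric and tag names below are invented), a few render targets using the new tag functions might be:

seriesByTag('name=disk.used', 'datacenter=dc1')
seriesByTag('server=~web.*') | groupByTags('sum', 'datacenter')
aliasByTags(seriesByTag('name=disk.used'), 'server')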

Piping Syntax & Function Updates

One of the huge strengths of the Graphite render API is the ability to chain together multiple functions to process data, but until now (unless you were using a tool like Grafana) writing chained queries could be painful as each function had to be wrapped around the previous one. With this release it is now possible to “pipe” the output of one processing function into the next, and to combine piped and nested functions.

For example:

alias(movingAverage(scaleToSeconds(sumSeries(stats_global.production.counters.api.requests.*.count),60),30),'api.avg')

Can now be written as:

sumSeries(stats_global.production.counters.api.requests.*.count)|scaleToSeconds(60)|movingAverage(30)|alias('api.avg')

OR

stats_global.production.counters.api.requests.*.count|sumSeries()|scaleToSeconds(60)|movingAverage(30)|alias('api.avg')

Another source of frustration with the old function API was the inconsistent implementation of aggregations, with different functions being used in different parts of the API, and some functions simply not being available. In 1.1 all functions that perform aggregation (whether across series or across time intervals) now support a consistent set of aggregations; average, median, sum, min, max, diff, stddev, count, range, multiply and last. This is part of a new approach to implementing functions that emphasises using shared building blocks to ensure consistency across the API and solve the problem of a particular function not working with the aggregation needed for a given task.

To that end a number of new functions have been added that each provide the same functionality as an entire family of “old” functions; aggregate, aggregateWithWildcards, movingWindow, filterSeries, highest, lowest and sortBy.

Each of these functions accepts an aggregation method parameter, for example aggregate(some.metric.*, 'sum') implements the same functionality as sumSeries(some.metric.*).

It can also be used with different aggregation methods to replace averageSeries, stddevSeries, multiplySeries, diffSeries, rangeOfSeries, minSeries, maxSeries and countSeries. All those functions are now implemented as aliases for aggregate, and it supports the previously-missing median and last aggregations.

The same is true for the other functions, and the summarize, smartSummarize, groupByNode, groupByNodes and the new groupByTags functions now all support the standard set of aggregations. Gone are the days of wishing that sortByMedian or highestRange were available!
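
For example (metric names invented), combinations that used to require a dedicated function, or simply didn’t exist, are now just a parameter:

aggregate(some.metric.*, 'median')
sortBy(some.metric.*, 'median')
highest(some.metric.*, 5, 'range')
smartSummarize(some.metric.*, '1hour', 'last')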

For more information on the functions available check the function documentation.

Custom Functions

No matter how many functions are available there are always going to be specific use-cases where a custom function can perform analysis that wouldn’t otherwise be possible, or provide a convenient alias for a complicated function chain or specific set of parameters.

In Graphite 1.1 we added support for easily adding one-off custom functions, as well as for creating and sharing plugins that can provide one or more functions.

Each function plugin is packaged as a simple python module, and will be automatically loaded by Graphite when placed into the functions/custom folder.

An example of a simple function plugin that translates the name of every series passed to it into UPPERCASE:

from graphite.functions.params import Param, ParamTypes

def toUpperCase(requestContext, seriesList):
  """Custom function that changes series names to UPPERCASE"""
  for series in seriesList:
    series.name = series.name.upper()
  return seriesList

toUpperCase.group = 'Custom'
toUpperCase.params = [
  Param('seriesList', ParamTypes.seriesList, required=True),
]

SeriesFunctions = {
  'upper': toUpperCase,
}

Once installed, the function is not only available for use within Graphite, but is also exposed via the new Function API, which allows the function definition and documentation to be automatically loaded by tools like Grafana. This means that users will be able to select and use the new function in exactly the same way as the internal functions.
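
For example, with the plugin above installed, the new function can be used in a render target just like any built-in, either nested or piped (the metric name is made up):

upper(some.metric.*)
some.metric.*|upper()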

More information on writing and using custom functions is available in the documentation.

Clustering Updates

One of the biggest changes from the 0.9 to 1.0 releases was the overhaul of the clustering code, and with 1.1.1 that process has been taken even further to optimize performance when using Graphite in a clustered deployment. In the past it was common for a request to require the frontend node to make multiple requests to the backend nodes to identify matching series and to fetch data, and the code for handling remote vs local series was overly complicated. In 1.1.1 we took a new approach where all render data requests pass through the same path internally, and multiple backend nodes are handled individually rather than grouped together into a single finder. This has greatly simplified the codebase, making it much easier to understand and reason about, while allowing much more flexibility in design of the finders. After these changes, render requests can now be answered with a single internal request to each backend node, and all requests for both remote and local data are executed in parallel.

To maintain Graphite’s ability to scale out horizontally, the tagging system works seamlessly within a clustered environment, with each node responsible for the series stored on that node. Calls to load tagged series via seriesByTag are fanned out to the backend nodes and results are merged on the query node, just like they are for non-tagged series.

Python 3 & Django 1.11 Support

Graphite 1.1 finally brings support for Python 3.x, both graphite-web and carbon are now tested against Python 2.7, 3.4, 3.5, 3.6 and PyPy. Django releases 1.8 through 1.11 are also supported. The work involved in sorting out the compatibility issues between Python 2.x and 3.x was quite involved, but it is a huge step forward for the long term support of the project! With the new Django 2.x series supporting only Python 3.x we will need to evaluate our long-term support for Python 2.x, but the Django 1.11 series is supported through 2020 so there is time to consider the options there.

Watch This Space

Efforts are underway to add support for the new functionality across the ecosystem of tools that work with Graphite, adding collectd tagging support, prometheus remote read & write with tags (and native Prometheus remote read/write support in Graphite) and last but not least Graphite tag support in Grafana.

We’re excited about the possibilities that the new capabilities in 1.1.x open up, and can’t wait to see how the community puts them to work.

Download the 1.1.1 release and check out the release notes here.

Turn your smartphone into a universal remote

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/zero-universal-remote/

Honolulu-based software developer bbtinkerer was tired of never being able to find the TV remote. So he made his own using a Raspberry Pi Zero, and connected it to a web app accessible on his smartphone.

bbtinkerer universal remote Raspberry Pi zero

Finding a remote alternative

“I needed one because the remote in my house tends to go missing a lot,” explains Bernard aka bbtinkerer on the Instructables page for his Raspberry Pi Zero Universal Remote. “If I want the controller, I have to hunt down three people and hope one of them remembers that they took it.”


For the build, Bernard used a Raspberry Pi Zero, an IR LED and corresponding receiver, Raspbian Lite, and a neat little 3D-printed housing.


First, he soldered a circuit for the LED and resistors on a small piece of perf board. Then he assembled the hardware components. Finally, all he needed to do was to write the code to control his devices (including a tower fan), and to set up the app.


Bernard employed the Linux Infrared Remote Control (LIRC) package to control the television with the Raspberry Pi Zero, accessing the Zero via SSH. He gives a complete rundown of the installation process on Instructables.


Setting up a remote’s buttons with LIRC is a simple case of pressing them and naming their functions one by one. You’ll need the remote to set up the system, but after that, feel free to lock it in a drawer and use your smartphone instead.
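
If you’d rather drive it from a script than from the web app, a rough sketch in Python is simply to shell out to LIRC’s irsend tool. The remote and button names here (“tv”, “KEY_POWER”) are hypothetical and depend entirely on your own LIRC configuration:

import subprocess

def press(remote, button):
  # irsend ships with LIRC; SEND_ONCE transmits a single key press.
  subprocess.run(['irsend', 'SEND_ONCE', remote, button], check=True)

press('tv', 'KEY_POWER')  # hypothetical remote/button names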



Finally, Bernard created the web interface using Node.js, and again, because he’s lovely, he published the code for anyone wanting to build their own. Thanks, Bernard!

Life hacks

If you’ve used a Raspberry Pi to build a time-saving life hack like Bernard’s, be sure to share it with us. Other favourites of ours include fridge cameras, phone app doorbell notifications, and Alan’s ocarina home automation system. I’m not sure if this last one can truly be considered a time-saving life hack. It’s still cool though!

The post Turn your smartphone into a universal remote appeared first on Raspberry Pi.

Tech Companies Meet EC to Discuss Removal of Pirate & Illegal Content

Post Syndicated from Andy original https://torrentfreak.com/tech-companies-meet-ec-to-discuss-removal-of-pirate-illegal-content-180109/

Thousands, perhaps millions, of pieces of illegal content flood onto the Internet every single day, a problem that’s only increasing with each passing year.

In the early days of the Internet very little was done to combat the problem but with the rise of social media and millions of citizens using it to publish whatever they like – not least terrorist propaganda and racist speech – governments around the world are beginning to take notice.

Of course, running parallel is the multi-billion dollar issue of intellectual property infringement. Eighteen years on from the first wave of mass online piracy and the majority of popular movies, TV shows, games, software and books are still available to download.

Over the past couple of years and increasingly in recent months, there have been clear signs that the EU in particular wishes to collectively mitigate the spread of all illegal content – from ISIS videos to pirated Hollywood movies – with assistance from major tech companies.

Google, YouTube, Facebook and Twitter are all expected to do their part, with the looming stick of legislation behind the collaborative carrots, should they fail to come up with a solution.

To that end, five EU Commissioners – Dimitris Avramopoulos, Elżbieta Bieńkowska, Věra Jourová, Julian King and Mariya Gabriel – will meet today in Brussels with representatives of several online platforms to discuss progress made in dealing with the spread of the aforementioned material.

In a joint statement together with EC Vice-President Andrus Ansip, the Commissioners describe all illegal content as a threat to security, safety, and fundamental rights, demanding a “collective response – from all actors, including the internet industry.”

They note that online platforms have committed significant resources towards removing violent and extremist content, including via automated removal, but more needs to be done to tackle the issue.

“This is starting to achieve results. However, even if tens of thousands of pieces of illegal content have been taken down, there are still hundreds of thousands more out there,” the Commissioners write.

“And removal needs to be speedy: the longer illegal material stays online, the greater its reach, the more it can spread and grow. Building on the current voluntary approach, more efforts and progress have to be made.”

The Commission says it is relying on online platforms such as Google and Facebook to “step up and speed up their efforts to tackle these threats quickly and comprehensively.” This should include closer cooperation with law enforcement, sharing of information with other online players, plus action to ensure that once taken down, illegal content does not simply reappear.

While it’s clear that the EC would prefer to work collaboratively with the platforms to find a solution to the illegal content problem, as expected there’s the veiled threat of them being compelled by law to do so, should they fall short of their responsibilities.

“We will continue to promote cooperation with social media companies to detect and remove terrorist and other illegal content online, and if necessary, propose legislation to complement the existing regulatory framework,” the EC warns.

Today’s discussions run both in parallel and in tandem with others specifically targeted at intellectual property abuses. In late November, the EC presented a set of new measures to ensure that copyright holders are well protected both online and in the physical realm.

A key aim is to focus on large-scale facilitators, such as pirate site operators, while cutting their revenue streams.

“The Commission seeks to deprive commercial-scale IP infringers of the revenue flows that make their criminal activity lucrative – this is the so-called ‘follow the money’ approach which focuses on the ‘big fish’ rather than individuals,” the Commission explained.

This presentation followed on the heels of a proposal last September which had the EC advocating the take-down-stay-down principle, with pirate content being taken down, automated filters ensuring infringement can be tackled proactively, with measures being taken against repeat infringers.

Again, the EC warned that should cooperation with Internet platforms fail to come up with results, future legislation cannot be ruled out.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Pirate Bay Founder: Netflix and Spotify Are a Threat, No Solution

Post Syndicated from Ernesto original https://torrentfreak.com/pirate-bay-founder-netflix-and-spotify-are-a-threat-no-solution-180107/

Ten years ago the Internet was an entirely different place. Piracy was rampant, as it is today, but the people behind the largest torrent sites were more vocal then.

There was a battle going on for the right to freely share content online. This was very much a necessity at the time, as legal options were scarce, but for many it was also an idealistic battle.

As the spokesperson of The Pirate Bay, Peter Sunde was one of the leading voices at the time. He believed, and still does, that people should be able to share anything without restrictions. Period.

For Peter and three others associated with The Pirate Bay, this eventually resulted in jail sentences. They were not the only ones to feel the consequences. Over the past decade, dozens of torrent sites were shut down under legal pressure, forcing those operators that remain to go into hiding.

Today, ten years after we spoke to Peter about the future of torrent sites and file-sharing, we reach out to him again. A lot has changed, but how does The Pirate Bay’s co-founder look at things now?

“On the personal side, all is great, and I’m working on a TV-series about activism that will air next year. On top of that of course working on Njalla, Ipredator and other known projects,” Peter says.

“In general, I think that projects for me are still about the same thing as a decade ago, but just trying different approaches!”

While Peter stays true to his activist roots, fighting for privacy and freedom on the Internet, his outlook is not as positive as it once was.

He is proud that The Pirate Bay never caved and that they fought their cases to the end. The moral struggle was won, but he also realizes that the greater battle was lost.

“I’m proud and happy to be able to look myself in the mirror every morning with a feeling of doing right. A lot of corrupt people involved in our cases probably feel quite shitty. Well, if they have feelings,” Peter says.

The Pirate Bay’s former spokesperson doesn’t have any regrets really. The one thing that comes to mind, when we ask about things that he would have done differently, is to tell fellow Pirate Bay founder Anakata to encrypt his hard drive.

Brokep (Peter) and Anakata (Gottfrid)

Looking at the current media climate, Peter doesn’t think we are better off. On the contrary. While it might be easier in some countries to access content legally online, this also means that control is now firmly in the hands of a few major companies.

The Pirate Bay and others always encouraged free sharing for creators and consumers. This certainly hasn’t improved. Instead, media today is contained in large centralized silos.

“I’m surprised that people are so short-sighted. The ‘solution’ to file sharing was never centralizing content control back to a few entities – that was the struggle we were fighting for.

“Netflix, Spotify etc are not a solution but a loss. And it surprises me that the pirate movement is not trying to talk more about that,” he adds.

The Netflixes and Spotifies of this world are often portrayed as a solution to piracy. However, Peter sees things differently. He believes that these services put more control in the hands of powerful companies.

“The same companies we fought own these platforms. Either they own the shares in the companies, or they have deals with them which makes it impossible for these companies to not follow their rules.

“Artists can’t choose to be or not to be on Spotify in reality, because there’s nothing else in the end. If Spotify doesn’t follow the rules from these companies, they are fucked as well. The dependence is higher than ever.”

The first wave of mass Internet piracy well over a decade ago was a wake-up call to the entertainment industry. The immense popularity of torrent sites showed that people demanded something they weren’t offering.

In a way, these early pirate sites are the reason why Netflix and Spotify were able to do what they do. Literally, in the case of Spotify, which used pirated music to get the service going.

Peter doesn’t see them as the answer though. The only solution in his book is to redefine and legalize piracy.

“The solution to piracy is to re-define piracy. Make things available to everyone, without that being a crime,” Peter says.

In this regard, not much has changed in ten years. However, having witnessed this battle closer than anyone else, he also realizes that the winners are likely on the other end.

Piracy will decrease over time, but not the way Peter hopes it will.

“I think we’ll have less piracy because of the problems we see today. With net neutrality being infringed upon and more laws against individual liberties and access to culture, instead of actually benefiting people.

“The media industry will be happy to know that their lobbying efforts and bribes are paying off,” he concludes.

This is the second and final post in our torrent pioneers series. The first interview with isoHunt founder Gary Fung is available here.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Top 10 Most Popular Torrent Sites of 2018

Post Syndicated from Ernesto original https://torrentfreak.com/top-10-most-popular-torrent-sites-of-2018-180107/

Torrent sites have come and gone over past year. Now, at the start of 2018, we take a look to see what the most-used sites are in the current landscape.

The Pirate Bay remains the undisputed number one. The site has weathered a few storms over the years, but it looks like it will be able to celebrate its 15th anniversary, which is coming up in a few months.

The list also includes various newcomers including Idope and Zooqle. While many people are happy to see new torrent sites emerge, this often means that others have called it quits.

Last year’s runner-up Extratorrent, for example, has shut down and left a gaping hole behind. And it wasn’t the only site that went away. TorrentProject also disappeared without a trace and the same was true for isohunt.to.

The unofficial Torrentz reincarnation Torrentz2.eu, the highest newcomer last year, is somewhat of an unusual entry. A few weeks ago all links to externally hosted torrents were removed, as was the list of indexed pages.

We decided to include the site nonetheless, given its history and because it’s still possible to find hashes through the site. As Torrentz2’s future is uncertain, we added an extra site (10.1) as compensation.

Finally, RuTracker also deserves a mention. The torrent site generates enough traffic to warrant a listing, but we traditionally limit the list to sites that are targeted primarily at an English or international audience.

Below is the full list of the ten most-visited torrent sites at the start of the new year. The list is based on various traffic reports and we display the Alexa rank for each. In addition, we include last year’s ranking.

Most Popular Torrent Sites

1. The Pirate Bay

The Pirate Bay is the “king of torrents” once again and also the oldest site in this list. The past year has been relatively quiet for the notorious torrent site, which is currently operating from its original .org domain name.

Alexa Rank: 104/ Last year #1

2. RARBG

RARBG, which started out as a Bulgarian tracker, has captured the hearts and minds of many video pirates. The site was founded in 2008 and specializes in high quality video releases.

Alexa Rank: 298 / Last year #3

3. 1337x

1337x continues where it left off last year. The site gained a lot of traffic and, unlike some other sites in the list, has a dedicated group of uploaders that provide fresh content.

Alexa Rank: 321 / Last year #6

4. Torrentz2

Torrentz2 launched as a stand-in for the original Torrentz.eu site, which voluntarily closed its doors in 2016. At the time of writing, the site only lists torrent hashes and no longer any links to external torrent sites. While browser add-ons and plugins still make the site functional, its future is uncertain.

Alexa Rank: 349 / Last year #5

5. YTS.ag

YTS.ag is the unofficial successor of the defunct YTS or YIFY group. Not all other torrent sites were happy that the site hijacked the popular brand, and several are actively banning its releases.

Alexa Rank: 563 / Last year #4

6. EZTV.ag

The original TV-torrent distribution group EZTV shut down after a hostile takeover in 2015, with new owners claiming ownership of the brand. The new group currently operates from EZTV.ag and releases its own torrents. These releases are banned on some other torrent sites due to this controversial history.

Alexa Rank: 981 / Last year #7

7. LimeTorrents

Limetorrents has been an established torrent site for more than half a decade. The site’s operator also runs the torrent cache iTorrents, which is used by several other torrent search engines.

Alexa Rank: 2,433 / Last year #10

8. NYAA.si

NYAA.si is a popular resurrection of the anime torrent site NYAA, which shut down last year. Previously we left anime-oriented sites out of the list, but since we also include dedicated TV and movie sites, we decided that a mention is more than warranted.

Alexa Rank: 1,575 / Last year #NA

9. Torrents.me

Torrents.me is one of the torrent sites that enjoyed a meteoric rise in traffic this year. It’s a meta-search engine that links to torrent files and magnet links from other torrent sites.

Alexa Rank: 2,045 / Last year #NA

10. Zooqle

Zooqle, which boasts nearly three million verified torrents, has stayed under the radar for years but has still kept growing. The site made it into the top 10 for the first time this year.

Alexa Rank: 2,347 / Last year #NA

10.1 iDope

The special 10.1 mention goes to iDope. Launched in 2016, the site is a relative newcomer to the torrent scene. The torrent indexer has steadily increased its audience over the past year. With similar traffic numbers to Zooqle, a listing is therefore warranted.

Alexa Rank: 2,358 / Last year #NA

Disclaimer: Yes, we know that Alexa isn’t perfect, but it helps to compare sites that operate in a similar niche. We also used other traffic metrics to compile the top ten. Please keep in mind that many sites have mirrors or alternative domains, which are not taken into account here.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

No Level of Copyright Enforcement Will Ever Be Enough For Big Media

Post Syndicated from Andy original https://torrentfreak.com/no-level-of-copyright-enforcement-will-ever-be-enough-for-big-media-180107/

For more than ten years TorrentFreak has documented a continuous stream of piracy battles so it’s natural that, every now and then, we pause to consider when this war might stop. The answer is always “no time soon” and certainly not in 2018.

When swapping files over the Internet first began it wasn’t a particularly widespread activity. A reasonable amount of content was available, but it was relatively inaccessible. Then peer-to-peer came along and it sparked a revolution.

From the beginning, copyright holders felt that the law would answer their problems, whether that was by suing Napster, Kazaa, or even end users. Some industry players genuinely believed this strategy was just a few steps away from achieving its goals. Just a little bit more pressure and all would be under control.

Then, when the landmark MGM Studios v. Grokster decision was handed down in the studios’ favor during 2005, the excitement online was palpable. As copyright holders rejoiced in this body blow for the pirating masses, file-sharing communities literally shook under the weight of the ruling. For a day, maybe two.

For the majority of file-sharers, the ruling meant absolutely nothing. So what if some company could be held responsible for other people’s infringements? Another will come along, outside of the US if need be, people said. They were right not to be concerned – that’s exactly what happened.

Ever since, this cycle has continued. Eager to stem the tide of content being shared without their permission, rightsholders have advocated stronger anti-piracy enforcement and lobbied for more restrictive interpretations of copyright law. Thus far, however, literally nothing has provided a solution.

One would have thought that given the military-style raid on Kim Dotcom’s Megaupload, a huge void would’ve appeared in the sharing landscape. Instead, the file-locker business took itself apart and reinvented itself in jurisdictions outside the United States. Meanwhile, the BitTorrent scene continued in the background, somewhat obliviously.

With the SOPA debacle still fresh in relatively recent memory, copyright holders are still doggedly pursuing their aims. Site-blocking is rampant, advertisers are being pressured into compliance, and ISPs like Cox Communications now find themselves responsible for the infringements of their users. But has any of this caused any fatal damage to the sharing landscape? Not really.

Instead, we’re seeing a rise in the use of streaming sites, each far more accessible to the newcomer than their predecessors and vastly more difficult for copyright holders to police.

Systems built into Kodi are transforming these platforms into a plug-and-play piracy playground, one in which sites skirt US law and users can consume both at will and in complete privacy. Meanwhile, commercial and unauthorized IPTV offerings are gathering momentum, even as rightsholders try to pull them back.

Faced with problems like these we are now seeing calls for even tougher legislation. While groups like the RIAA dream of filtering the Internet, over in the UK a 2017 consultation had copyright holders excited that end users could be criminalized for simply consuming infringing content, let alone distributing it.

While the introduction of both or either of these measures would cause uproar (and rightly so), history tells us that each would fail in its stated aim of stopping piracy. With that eventuality all but guaranteed, calls for even tougher legislation are being readied for later down the line.

In short, there is no law that can stop piracy and therefore no law that will stop the entertainment industries coming back for harsher measures, pursuing the dream. This much we’ve established from close to two decades of litigation and little to no progress.

But really, is anyone genuinely surprised that they’re still taking this route? Draconian efforts to maintain control over the distribution of content predate the file-sharing wars by a couple of hundred years, at the very least. Why would rightsholders stop now, when the prize is even more valuable?

No one wants a minefield of copyright law. No one wants a restricted Internet. No one wants extended liability for innovators, service providers, or the public. But this is what we’ll get if this problem isn’t solved soon. Something drastic needs to happen, but who will be brave enough to admit it, let alone do something about it?

During a discussion about piracy last year on the BBC, the interviewer challenged a caller who freely admitted to pirating sports content online. The caller’s response was clear:

For far too long, broadcasters and rightsholders have abused their monopoly position, charging ever-increasing amounts for popular content, even while making billions. Piracy is a natural response to that, and effectively a chance for the little guy to get back some control, he argued.

Exactly the same happened in the music market during the late 1990s and 2000s. In response to artificial restriction of the market and the unrealistic hiking of prices, people turned to peer-to-peer networks for their fix. Thanks to this pressure but after years of turmoil, services like Spotify emerged, converting millions of former pirates in the process. Netflix, it appears, is attempting to do the same thing with video.

When people feel that they aren’t getting ripped off and that they have no further use for sub-standard piracy services in the face of stunning legal alternatives, things will change. But be under no illusion, people won’t be bullied there.

If we end up with an Internet stifled in favor of rightsholders, one in which service providers are too scared to innovate, the next generation of consumers will never forget. This will be a major problem for two key reasons. Not only will consumers become enemies but piracy will still exist. We will have come full circle, fueled only by division and hatred.

It’s a natural response to reject monopolistic behavior and it’s a natural response, for most, to be fair when treated with fairness. Destroying freedom is far from fair and will not create a better future – for anyone.

Laws have their place, no sane person will argue against that, but when the entertainment industries are making billions yet still want more, they’ll have to decide whether this will go on forever with building resentment, or if making a bit less profit now makes more sense longer term.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Physics cheats

Post Syndicated from Eevee original https://eev.ee/blog/2018/01/06/physics-cheats/

Anonymous asks:

something about how we tweak physics to “work” better in games?

Ho ho! Work. Get it? Like in physics…?

Hitboxes

“Hitbox” is perhaps not the most accurate term, since the shape used for colliding with the environment and the shape used for detecting damage might be totally different. They’re usually the same in simple platformers, though, and that’s what most of my games have been.

The hitbox is the biggest physics fudge by far, and it exists because of a single massive approximation that (most) games make: you’re controlling a single entity in the abstract, not a physical body in great detail.

That is: when you walk with your real-world meat shell, you perform a complex dance of putting one foot in front of the other, a motion you spent years perfecting. When you walk in a video game, you press a single “walk” button. Your avatar may play an animation that moves its legs back and forth, but since you’re not actually controlling the legs independently (and since simulating them is way harder), the game just treats you like a simple shape. Fairly often, this is a box, or something very box-like.

An Eevee sprite standing on faux ground; the size of the underlying image and the hitbox are outlined

Since the player has no direct control over the exact placement of their limbs, it would be slightly frustrating to have them collide with the world. This is especially true in cases like the above, where the tail and left ear protrude significantly out from the main body. If that Eevee wanted to stand against a real-world wall, she would simply tilt her ear or tail out of the way, so there’s no reason for the ear to block her from standing against a game wall. To compensate for this, the ear and tail are left out of the collision box entirely and will simply jut into a wall if necessary — a goofy affordance that’s so common it doesn’t even register as unusual. As a bonus (assuming this same box is used for combat), she won’t take damage from projectiles that merely graze past an ear.

(One extra consideration for sprite games in particular: the hitbox ought to be horizontally symmetric around the sprite’s pivot — i.e. the point where the entity is truly considered to be standing — so that the hitbox doesn’t abruptly move when the entity turns around!)

Corners

Treating the player (and indeed most objects) as a box has one annoying side effect: boxes have corners. Corners can catch on other corners, even by a single pixel. Real-world bodies tend to be a bit rounder and squishier and this can tolerate grazing a corner; even real-world boxes will simply rotate a bit.

Ah, but in our faux physics world, we generally don’t want conscious actors (such as the player) to rotate, even with a realistic physics simulator! Real-world bodies are made of parts that will generally try to keep you upright, after all; you don’t tilt back and forth much.

One way to handle corners is to simply remove them from conscious actors. A hitbox doesn’t have to be a literal box, after all. A popular alternative — especially in Unity where it’s a standard asset — is the pill-shaped capsule, which has semicircles/hemispheres on the top and bottom and a cylindrical body in 3D. No corners, no problem.

Of course, that introduces a new problem: now the player can’t balance precariously on edges without their rounded bottom sliding them off. Alas.

If you’re stuck with corners, then, you may want to use a corner bump, a term I just made up. If the player would collide with a corner, but the collision is only by a few pixels, just nudge them to the side a bit and carry on.

An Eevee sprite trying to move sideways into a shallow ledge; the game bumps her upwards slightly, so she steps onto it instead

When the corner is horizontal, this creates stairs! This is, more or less kinda, how steps work in Doom: when the player tries to cross from one sector into another, if the height difference is 24 units or less, the game simply bumps them upwards to the height of the new floor and lets them continue on.

Implementing this in a game without Doom’s notion of sectors is a little trickier. In fact, I still haven’t done it. Collision detection based on rejection gets it for free, kinda, but it’s not very deterministic and it breaks other things. But that’s a whole other post.
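
For what it’s worth, here’s a minimal sketch of the corner-bump idea in Python, assuming axis-aligned boxes and a made-up world.sweep() query that reports the overlap rectangle a horizontal move would cause (or None if the move is clear):

import math

CORNER_BUMP = 4  # max pixels to nudge past; an arbitrary threshold

def try_move_x(player, dx, world):
  # world.sweep is hypothetical: it returns the overlapping rectangle
  # (with .width and .height) if moving the box by dx hits a solid.
  overlap = world.sweep(player.box, dx, 0)
  if overlap is None:
    player.box.x += dx
  elif overlap.height <= CORNER_BUMP:
    # Only the bottom few pixels caught on a ledge: step up and keep going.
    player.box.y -= overlap.height
    player.box.x += dx
  else:
    # A real wall: stop flush against it instead.
    player.box.x += dx - math.copysign(overlap.width, dx)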

Gravity

Gravity is pretty easy. Everything accelerates downwards all the time. What’s interesting are the exceptions.

Jumping

Jumping is a giant hack.

Think about how actual jumping works: you tense your legs, which generally involves bending your knees first, and then spring upwards. In a platformer, you can just leap whenever you feel like it, which is nonsense. Also you go like twenty feet into the air?

Worse, most platformers allow variable-height jumping, where your jump is lower if you let go of the jump button while you’re in the air. Normally, one would expect to have to decide how much force to put into the jump beforehand.

But of course this is about convenience of controls: when jumping is your primary action, you want to be able to do it immediately, without any windup for how high you want to jump.

(And then there’s double jumping? Come on.)
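
A common way to fake variable-height jumping, sketched here with invented constants rather than lifted from any particular engine, is to give the jump its full impulse immediately and then chop the upward velocity if the button is released early:

JUMP_SPEED = 300.0   # invented numbers: pixels per second
JUMP_CUTOFF = 0.5    # fraction of upward speed kept on an early release
GRAVITY = 900.0

def update_vertical(player, jump_pressed, jump_released, dt):
  # y grows downward here, so a negative vy means the player is rising.
  if jump_pressed and player.on_ground:
    player.vy = -JUMP_SPEED       # full impulse the instant the button goes down
    player.on_ground = False
  if jump_released and player.vy < 0:
    player.vy *= JUMP_CUTOFF      # let go early and the rise is cut short
  player.vy += GRAVITY * dt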

Air control is a similar phenomenon: usually you’d jump in a particular direction by controlling how you push off the ground with your feet, but in a video game, you don’t have feet! You only have the box. The compromise is to let you control your horizontal movement to a limited degree in midair, even though that doesn’t make any sense. (It’s way more fun, though, and overall gives you more movement options, which are good to have in an interactive medium.)

Air control also exposes an obvious place that game physics collide with the realistic model of serious physics engines. I’ve mentioned this before, but: if you use Real Physics™ and air control yourself into a wall, you might find that you’ll simply stick to the wall until you let go of the movement buttons. Why? Remember, player movement acts as though an external force were pushing you around (and from the perspective of a Real™ physics engine, this is exactly how you’d implement it) — so air-controlling into a wall is equivalent to pushing a book against a wall with your hand, and the friction with the wall holds you in place. Oops.

Ground sticking

Another place game physics conflict with physics engines is with running to the top of a slope. On a real hill, of course, you land on top of the slope and are probably glad of it; slopes are hard to climb!

An Eevee moves to the top of a slope, and rather than step onto the flat top, she goes flying off into the air

In a video game, you go flying. Because you’re a box. With momentum. So you hit the peak and keep going in the same direction. Which is diagonally upwards.

Projectiles

To make them more predictable, projectiles generally aren’t subject to gravity, at least as far as I’ve seen. The real world does not have such an exemption. The real world imposes gravity even on sniper rifles, which in a video game are often implemented as an instant trace unaffected by anything in the world because the bullet never actually exists in the world.

Resistance

Ah. Welcome to hell.

Water

Water is an interesting case, and offhand I don’t know the gritty details of how games implement it. In the real world, water applies a resistant drag force to movement — and that force is proportional to the square of velocity, which I’d completely forgotten until right now. I am almost positive that no game handles that correctly. But then, in real-world water, you can push against the water itself for movement, and games don’t simulate that either. What’s the rough equivalent?

The Sonic Physics Guide suggests that Sonic handles it by basically halving everything: acceleration, max speed, friction, etc. When Sonic enters water, his speed is cut; when Sonic exits water, his speed is increased.

That last bit feels validating — I could swear Metroid Prime did the same thing, and built my own solution around it, but couldn’t remember for sure. It makes no sense, of course, for a jump to become faster just because you happened to break the surface of the water, but it feels fantastic.

The thing I did was similar, except that I didn’t want to add a multiplier in a dozen places when you happen to be underwater (and remember which ones need it to be squared, etc.). So instead, I calculate everything completely as normal, so velocity is exactly the same as it would be on dry land — but the distance you would move gets halved. The effect seems to be pretty similar to most platformers with water, at least as far as I can tell. It hasn’t shown up in a published game and I only added this fairly recently, so I might be overlooking some reason this is a bad idea.

(One reason that comes to mind is that velocity is now a little white lie while underwater, so anything relying on velocity for interesting effects might be thrown off. Or maybe that’s correct, because velocity thresholds should be halved underwater too? Hm!)

Notably, air is also a fluid, so it should behave the same way (just with different constants). I definitely don’t think any games apply air drag that’s proportional to the square of velocity.
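
Roughly the approach described above, as a sketch (the constants and the underwater flag are placeholders):

WATER_SCALE = 0.5  # placeholder: halve displacement while underwater

def integrate(entity, dt):
  # Velocity is updated exactly as it would be on dry land...
  entity.vx += entity.ax * dt
  entity.vy += entity.ay * dt
  # ...but the resulting movement is scaled down while underwater, so
  # velocity itself stays an honest dry-land value.
  scale = WATER_SCALE if entity.underwater else 1.0
  entity.x += entity.vx * dt * scale
  entity.y += entity.vy * dt * scale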

Friction

Friction is, in my experience, a little handwaved. Probably because real-world friction is so darn complicated.

Consider that in the real world, we want very high friction on the surfaces we walk on — shoes and tires are explicitly designed to increase it, even. We move by bracing a back foot against the ground and using that to push ourselves forward, so we want the ground to resist our push as much as possible.

In a game world, we are a box. We move by being pushed by some invisible outside force, so if the friction between ourselves and the ground is too high, we won’t be able to move at all! That’s complete nonsense physically, but it turns out to be handy in some cases — for example, highish friction can simulate walking through deep mud, which should be difficult due to fluid drag and low friction.

But the best-known example of the fakeness of game friction is video game ice. Walking on real-world ice is difficult because the low friction means low grip; your feet are likely to slip out from under you, and you’ll simply fall down and have trouble moving at all. In a video game, you can’t fall down, so you have the opposite experience: you spend most of your time sliding around uncontrollably. Yet ice is so common in video games (and perhaps so uncommon in places I’ve lived) that I, at least, had never really thought about this disparity until an hour or so ago.

Game friction vs real-world friction

Real-world friction is a force. It’s the normal force (which is the force exerted by the object on the surface) times some constant that depends on how the two materials interact.

Force is mass times acceleration, and platformers often ignore mass, so friction ought to be an acceleration — applied against the object’s movement, but never enough to push it backwards.

I haven’t made any games where variable friction plays a significant role, but my gut instinct is that low friction should mean the player accelerates more slowly but has a higher max speed, and high friction should mean the opposite. I see from my own source code that I didn’t even do what I just said, so let’s defer to some better-made and well-documented games: Sonic and Doom.

In Sonic, friction is a fixed value subtracted from the player’s velocity (regardless of direction) each tic. Sonic has a fixed framerate, so the units are really pixels per tic squared (i.e. acceleration), multiplied by an implicit 1 tic per tic. So far, so good.

But Sonic’s friction only applies if the player isn’t pressing left or right. Hang on, that isn’t friction at all; that’s just deceleration! That’s equivalent to jogging to a stop. If friction were lower, Sonic would take longer to stop, but otherwise this is only tangentially related to friction.

(In fairness, this approach would decently emulate friction for non-conscious sliding objects, which are never going to be pressing movement buttons. Also, we don’t have the Sonic source code, and the name “friction” is a fan invention; the Sonic Physics Guide already uses “deceleration” to describe the player’s acceleration when turning around.)
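
In code, that kind of “friction” is just a per-tic deceleration toward zero, applied only while no direction is held; a sketch (the constant is a placeholder, not Sonic’s actual value):

FRICTION = 0.05  # placeholder per-tic deceleration

def apply_ground_friction(vx, pressing_left, pressing_right):
  # Only decelerate when the player isn't steering, and never let the
  # subtraction overshoot past zero into the opposite direction.
  if pressing_left or pressing_right:
    return vx
  if vx > 0:
    return max(0.0, vx - FRICTION)
  return min(0.0, vx + FRICTION)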

Okay, let’s try Doom. In Doom, the default friction is 90.625%.

Hang on, what?

Yes, in Doom, friction is a multiplier applied every tic. Doom runs at 35 tics per second, so this is a multiplier of 0.032 per second. Yikes!

This isn’t anything remotely like real friction, but it’s much easier to implement. With friction as acceleration, the game has to know both the direction of movement (so it can apply friction in the opposite direction) and the magnitude (so it doesn’t overshoot and launch the object in the other direction). That means taking a semi-costly square root and also writing extra code to cap the amount of friction. With a multiplier, neither is necessary; just multiply the whole velocity vector and you’re done.

There are some downsides. One is that objects will never actually stop, since multiplying by 3% repeatedly will never produce a result of zero — though eventually the speed will become small enough to either slip below a “minimum speed” threshold or simply no longer fit in a float representation. Another is that the units are fairly meaningless: with Doom’s default friction of 90.625%, about how long does it take for the player to stop? I have no idea, partly because “stop” is ambiguous here! If friction were an acceleration, I could divide it into the player’s max speed to get a time.
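
The difference in effort is easy to see side by side; a sketch with a 2D velocity as a plain (vx, vy) pair:

import math

def friction_as_multiplier(vx, vy, friction):
  # One multiply per component; no direction or magnitude needed.
  return vx * friction, vy * friction

def friction_as_acceleration(vx, vy, friction, dt):
  # Needs the magnitude (a square root) to find the direction, plus a
  # cap so friction never pushes the object backwards.
  speed = math.hypot(vx, vy)
  if speed == 0:
    return vx, vy
  new_speed = max(0.0, speed - friction * dt)
  return vx * new_speed / speed, vy * new_speed / speed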

All that aside, what are the actual effects of changing Doom’s friction? What an excellent question that’s surprisingly tricky to answer. (Note that friction can’t be changed in original Doom, only in the Boom port and its derivatives.) Here’s what I’ve pieced together.

Doom’s “friction” is really two values. “Friction” itself is a multiplier applied to moving objects on every tic, but there’s also a move factor which defaults to \(\frac{1}{32} = 0.03125\) and is derived from friction for custom values.

Every tic, the player’s velocity is multiplied by friction, and then increased by their speed times the move factor.

$$
v(n) = v(n - 1) \times friction + speed \times move factor
$$

Eventually, the reduction from friction will balance out the speed boost. That happens when \(v(n) = v(n - 1)\), so we can rearrange it to find the player’s effective max speed:

$$
v = v \times friction + speed \times move factor \\
v - v \times friction = speed \times move factor \\
v = speed \times \frac{move factor}{1 - friction}
$$

For vanilla Doom’s move factor of 0.03125 and friction of 0.90625, that becomes:

$$
v = speed \times \frac{\frac{1}{32}}{1 - \frac{29}{32}} = speed \times \frac{\frac{1}{32}}{\frac{3}{32}} = \frac{1}{3} \times speed
$$

Curiously, “speed” is three times the maximum speed an actor can actually move. Doomguy’s run speed is 50, so in practice he moves a third of that, or 16⅔ units per tic. (Of course, this isn’t counting SR40, a bug that lets Doomguy run ~40% faster than intended diagonally.)

So now, what if you change friction? Even more curiously, the move factor is calculated completely differently depending on whether friction is higher or lower than the default Doom amount:

$$
move factor = \begin{cases}
\frac{133 - 128 \times friction}{544} &\approx 0.244 - 0.235 \times friction & \text{ if } friction \ge \frac{29}{32} \\
\frac{81920 \times friction - 70145}{1048576} &\approx 0.078 \times friction - 0.067 & \text{ otherwise }
\end{cases}
$$

That’s pretty weird? Complicating things further is that low friction (which means muddy terrain, remember) has an extra multiplier on its move factor, depending on how fast you’re already going — the idea is apparently that you have a hard time getting going, but it gets easier as you find your footing. The extra multiplier maxes out at 8, which makes the two halves of that function meet at the vanilla Doom value.

A graph of the relationship between friction and move factor

That very top point corresponds to the move factor from the original game. So no matter what you do to friction, the move factor becomes lower. At 0.85 and change, you can no longer move at all; below that, you move backwards.

From the formula above, it’s easy to see what changes to friction and move factor will do to Doomguy’s stable velocity. Move factor is in the numerator, so increasing it will increase stable velocity — but it can’t increase, so stable velocity can only ever decrease. Friction is in the denominator, but it’s subtracted from 1, so increasing friction will make the denominator a smaller value less than 1, i.e. increase stable velocity. Combined, we get this relationship between friction and stable velocity.

A graph showing stable velocity shooting up dramatically as friction increases

As friction approaches 1, stable velocity grows without bound. This makes sense, given the definition of \(v(n)\) — if friction is 1, the velocity from the previous tic isn’t reduced at all, so we just keep accelerating freely.

All of this is why I’m wary of using multipliers.

Anyway, this leaves me with one last question about the effects of Doom’s friction: how long does it take to reach stable velocity? Barring precision errors, we’ll never truly reach stable velocity, but let’s say within 5%. First we need a closed formula for the velocity after some number of tics. This is a simple recurrence relation, and you can write a few terms out yourself if you want to be sure this is right.

$$
v(n) = v_0 \times friction^n + speed \times move factor \times \frac{friction^n - 1}{friction - 1}
$$

Our initial velocity is zero, so the first term disappears. Set this equal to the stable formula and solve for n:

$$
speed \times move factor \times \frac{friction^n - 1}{friction - 1} = (1 - 5\%) \times speed \times \frac{move factor}{1 - friction} \\
friction^n - 1 = -(1 - 5\%) \\
n = \frac{\ln 5\%}{\ln friction}
$$

“Speed” and move factor disappear entirely, which makes sense, and this is purely a function of friction (and how close we want to get). For vanilla Doom, that comes out to 30.4 tics, which is a little less than a second. For other values of friction:
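
If you’d rather check the numbers than trust the algebra, here’s a quick sanity check in plain Python (floats, not the engine’s fixed-point math):

import math

friction = 29 / 32   # vanilla Doom
move_factor = 1 / 32
speed = 50

def closed(n):
    # Closed-form velocity after n tics, starting from rest.
    return speed * move_factor * (friction ** n - 1) / (friction - 1)

v = 0.0
for n in range(1, 61):
    v = v * friction + speed * move_factor   # the per-tic recurrence
    assert abs(v - closed(n)) < 1e-9

print(speed * move_factor / (1 - friction))   # stable velocity: ~16.67
print(math.log(0.05) / math.log(friction))    # tics to reach 95%: ~30.4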

A graph of time to stability which leaps upwards dramatically towards the right

As friction increases (which in Doom terms means the surface is more slippery), it takes longer and longer to reach stable speed, which is in turn greater and greater. For lesser friction (i.e. mud), stable speed is lower, but reached fairly quickly. (Of course, the extra “getting going” multiplier while in mud adds some extra time here, but including that in the graph is a bit more complicated.)

I think this matches with my instincts above. How fascinating!

What’s that? This is way too much math and you hate it? Then don’t use multipliers in game physics.

Uh

That was a hell of a diversion!

I guess the goofiest stuff in basic game physics is really just about mapping player controls to in-game actions like jumping and deceleration; the rest consists of hacks to compensate for representing everything as a box.

Torrent Pioneers: isoHunt’s Gary Fung, Ten Years Later

Post Syndicated from Ernesto original https://torrentfreak.com/torrent-pioneers-isohunts-gary-fung-ten-years-later-180106/

Ten years ago, November 2007 to be precise, we published an article featuring the four leading torrent site admins at the time.

Niek van der Maas of Mininova, Justin Bunnell of TorrentSpy, Pirate Bay’s Peter Sunde and isoHunt’s Gary Fung were all kind enough to share their vision of BitTorrent’s future.

This future is the present today, and although the predictions were not all spot-on, there are a few interesting observations to make.

For one, these four men were all known by name, despite the uncertain legal situation they were in. How different things are today, when the operators of most of the world’s largest torrent sites are unknown to the broader public.

Another thing that stands out is that none of these pioneers are still active in the torrent space today. Niek and Justin have their own advertising businesses, Peter is a serial entrepreneur involved in various startups, while Gary works on his own projects.

While they have all moved on, they also remain a part of Internet history, which is why we decided to reach out to them ten years on.

Gary Fung was the first to reply. Those who’ve been following torrent news for a while know that isoHunt was shut down in 2013. The shutdown was the result of a lawsuit and came with a $110 million settlement with the MPAA, on paper.

Today the Canadian entrepreneur has other things on his hands, which includes “leveling up” his now one-year-old daughter. While that can be a day job by itself, he is also finalizing a mobile search app which will be released in the near future.

“The key is speed, and I can measure its speedup of the whole mobile search experience to be 10-100x that of conventional mobile web browsers,” Gary tells us, noting that after years of development, it’s almost ready.

The new search app is not one dedicated to torrents, as isoHunt once was. However, looking back, Gary is proud of what he accomplished with isoHunt, despite the bitter end.

“It was a humbling experience, in more ways than one. I’m proud that I participated and championed the rise of P2P content distribution through isoHunt as a search gateway,” Gary tells us.

“But I was also humbled by the responsibility and power at play, as seen in the lawsuits from the media industry giants, as well as the even larger picture of what P2P technologies were bringing, and still bring today.”

Decentralization has always been a key feature of BitTorrent and Gary sees this coming back in new trends. This includes the massive attention for blockchain related projects such as Bitcoin.

“2017 was the year Bitcoin became mainstream in a big way, and it’s feeling like the Internet before 2000. Decentralization is by nature disruptive, and I can’t wait to see what decentralizing money, governance, organizations and all kinds of applications will bring in the next few years.

“dApps [decentralized apps] made possible by platforms like Ethereum are like generalized BitTorrent for all kinds of applications, with ones we haven’t even thought of yet,” Gary adds.

Not everything is positive in hindsight, of course. Gary tells us that if he had to do it all over again he would take legal issues and lawyers more seriously. Not doing so led to more trouble than he imagined.

As a former torrent site admin, he has thought about the piracy issue quite a bit over the years. And unlike some sites today, he was happy to look for possible solutions to stop piracy.

One solution Gary suggested to Hollywood in the past was a hash recognition system for infringing torrents. A system to automatically filter known infringing files and remove these from cooperating torrent sites could still work today, he thinks.

“ContentID for all files shared on BitTorrent, similar to YouTube. I’ve proposed this to Hollywood studios before, as a better solution to suing their customers and potential P2P technology partners, but it obviously fell on deaf ears.”

In any case, torrent sites and similar services will continue to play an important role in how the media industry evolves. These platforms are showing Hollywood what the public wants, Gary believes.

“It has and will continue to play a role in showing the industry what consumers truly want: frictionless, convenient distribution, without borders of country or bundles. Bundles as in cable channels, but also in any way unwanted content is forced onto consumers without choice.”

While torrents were dominant in the past, the future will mostly be streaming, isoHunt’s founder says. He said the same ten years ago, and he believes that in another decade streaming will have completely replaced cable TV.

Whether piracy will still be relevant then depends on how content is offered. More fragmentation will lead to more piracy, while easier access will make it less relevant.

“The question then will be, will streaming platforms be fragmented and exclusive content bundled into a hundred pieces besides Netflix, or will consumer choice and convenience win out in a cross-platform way?

“A piracy increase or reduction will depend on how that plays out because nobody wants to worry about ten monthly subscriptions to ten different streaming services, much less a hundred,” Gary concludes.

Perhaps we should revisit this again next decade…


The second post in this series, with Peter Sunde, will be published this weekend. The other two pioneers did not respond or declined to take part.


Combine Transactional and Analytical Data Using Amazon Aurora and Amazon Redshift

Post Syndicated from Re Alvarez-Parmar original https://aws.amazon.com/blogs/big-data/combine-transactional-and-analytical-data-using-amazon-aurora-and-amazon-redshift/

A few months ago, we published a blog post about capturing data changes in an Amazon Aurora database and sending it to Amazon Athena and Amazon QuickSight for fast analysis and visualization. In this post, I want to demonstrate how easy it can be to take the data in Aurora and combine it with data in Amazon Redshift using Amazon Redshift Spectrum.

With Amazon Redshift, you can build petabyte-scale data warehouses that unify data from a variety of internal and external sources. Because Amazon Redshift is optimized for complex queries (often involving multiple joins) across large tables, it can handle large volumes of retail, inventory, and financial data without breaking a sweat.

In this post, we describe how to combine data in Aurora with data in Amazon Redshift. Here’s an overview of the solution:

  • Use AWS Lambda functions with Amazon Aurora to capture data changes in a table.
  • Save the data in an Amazon S3 bucket.
  • Query data using Amazon Redshift Spectrum.

We use the following services:

  • Amazon Aurora
  • AWS Lambda
  • Amazon Kinesis Data Firehose
  • Amazon S3
  • Amazon Redshift (with Redshift Spectrum)
  • Amazon QuickSight

Serverless architecture for capturing and analyzing Aurora data changes

Consider a scenario in which an e-commerce web application uses Amazon Aurora for a transactional database layer. The company has a sales table that captures every single sale, along with a few corresponding data items. This information is stored as immutable data in a table. Business users want to monitor the sales data and then analyze and visualize it.

In this example, you capture the data changes in an Aurora database table and save them in Amazon S3. After the data is captured in Amazon S3, you combine it with data in your existing Amazon Redshift cluster for analysis.

By the end of this post, you will understand how to capture data events in an Aurora table and push them out to other AWS services using AWS Lambda.

The following diagram shows the flow of data as it occurs in this tutorial:

The starting point in this architecture is a database insert operation in Amazon Aurora. When the insert statement is executed, a custom trigger calls a Lambda function and forwards the inserted data. Lambda writes the data that it received from Amazon Aurora to a Kinesis data delivery stream. Kinesis Data Firehose writes the data to an Amazon S3 bucket. Once the data is in an Amazon S3 bucket, it is queried in place using Amazon Redshift Spectrum.

Creating an Aurora database

First, create a database by following these steps in the Amazon RDS console:

  1. Sign in to the AWS Management Console, and open the Amazon RDS console.
  2. Choose Launch a DB instance, and choose Next.
  3. For Engine, choose Amazon Aurora.
  4. Choose a DB instance class. This example uses a small instance class, since this is not a production database.
  5. In Multi-AZ deployment, choose No.
  6. Configure DB instance identifier, Master username, and Master password.
  7. Launch the DB instance.

After you create the database, use MySQL Workbench to connect to the database using the CNAME from the console. For information about connecting to an Aurora database, see Connecting to an Amazon Aurora DB Cluster.

The following screenshot shows the MySQL Workbench configuration:

Next, create a table in the database by running the following SQL statement:

-- Create the Sales table
CREATE TABLE Sales (
InvoiceID int NOT NULL AUTO_INCREMENT,
ItemID int NOT NULL,
Category varchar(255),
Price double(10,2),
Quantity int NOT NULL,
OrderDate timestamp,
DestinationState varchar(2),
ShippingType varchar(255),
Referral varchar(255),
PRIMARY KEY (InvoiceID)
);

You can now populate the table with some sample data. To generate sample data in your table, copy and run the following script. Ensure that the connection placeholders (AURORA_CNAME, DBUSER, DBPASSWORD, and DB) are replaced with appropriate values.

#!/usr/bin/python
import MySQLdb
import random
import datetime

# Replace the placeholders with your Aurora endpoint and credentials.
db = MySQLdb.connect(host="AURORA_CNAME",
                     user="DBUSER",
                     passwd="DBPASSWORD",
                     db="DB")

states = ("AL","AK","AZ","AR","CA","CO","CT","DE","FL","GA","HI","ID","IL","IN",
"IA","KS","KY","LA","ME","MD","MA","MI","MN","MS","MO","MT","NE","NV","NH","NJ",
"NM","NY","NC","ND","OH","OK","OR","PA","RI","SC","SD","TN","TX","UT","VT","VA",
"WA","WV","WI","WY")

shipping_types = ("Free", "3-Day", "2-Day")

product_categories = ("Garden", "Kitchen", "Office", "Household")
referrals = ("Other", "Friend/Colleague", "Repeat Customer", "Online Ad")

cursor = db.cursor()

# Insert ten random sample orders.
for i in range(0, 10):
    item_id = random.randint(1, 100)
    state = random.choice(states)
    shipping_type = random.choice(shipping_types)
    product_category = random.choice(product_categories)
    quantity = random.randint(1, 4)
    referral = random.choice(referrals)
    price = random.randint(1, 100)
    # Limit the day to 28 so every month produces a valid date.
    order_date = datetime.date(2016, random.randint(1, 12), random.randint(1, 28)).isoformat()

    data_order = (item_id, product_category, price, quantity, order_date, state,
                  shipping_type, referral)

    add_order = ("INSERT INTO Sales "
                 "(ItemID, Category, Price, Quantity, OrderDate, DestinationState, "
                 "ShippingType, Referral) "
                 "VALUES (%s, %s, %s, %s, %s, %s, %s, %s)")

    cursor.execute(add_order, data_order)
    db.commit()

cursor.close()
db.close()

The following screenshot shows how the table appears with the sample data:

Sending data from Amazon Aurora to Amazon S3

There are two methods available to send data from Amazon Aurora to Amazon S3:

  • Using a Lambda function
  • Using SELECT INTO OUTFILE S3

To demonstrate the ease of setting up integration between multiple AWS services, we use a Lambda function to send data to Amazon S3 using Amazon Kinesis Data Firehose.

Alternatively, you can use a SELECT INTO OUTFILE S3 statement to query data from an Amazon Aurora DB cluster and save it directly in text files that are stored in an Amazon S3 bucket. However, with this method, there is a delay between the time that the database transaction occurs and the time that the data is exported to Amazon S3 because the default file size threshold is 6 GB.

Creating a Kinesis data delivery stream

The next step is to create a Kinesis data delivery stream, since it’s a dependency of the Lambda function.

To create a delivery stream:

  1. Open the Kinesis Data Firehose console.
  2. Choose Create delivery stream.
  3. For Delivery stream name, type AuroraChangesToS3.
  4. For Source, choose Direct PUT.
  5. For Record transformation, choose Disabled.
  6. For Destination, choose Amazon S3.
  7. In the S3 bucket drop-down list, choose an existing bucket, or create a new one.
  8. Enter a prefix if needed, and choose Next.
  9. For Data compression, choose GZIP.
  10. In IAM role, choose either an existing role that has access to write to Amazon S3, or choose to generate one automatically. Choose Next.
  11. Review all the details on the screen, and choose Create delivery stream when you’re finished.

Creating a Lambda function

Now you can create a Lambda function that is called every time there is a change that needs to be tracked in the database table. This Lambda function passes the data to the Kinesis data delivery stream that you created earlier.

To create the Lambda function:

  1. Open the AWS Lambda console.
  2. Ensure that you are in the AWS Region where your Amazon Aurora database is located.
  3. If you have no Lambda functions yet, choose Get started now. Otherwise, choose Create function.
  4. Choose Author from scratch.
  5. Give your function a name, and select Python 3.6 for Runtime.
  6. Choose an existing role or create a new one; the role needs permission to call firehose:PutRecord.
  7. Choose Next on the trigger selection screen.
  8. Paste the following code in the code window. Change the stream_name variable to the Kinesis data delivery stream that you created in the previous step.
  9. Choose File -> Save in the code editor and then choose Save.
import boto3
import json

firehose = boto3.client('firehose')
stream_name = 'AuroraChangesToS3'  # the Kinesis Data Firehose delivery stream created earlier


def Kinesis_publish_message(event, context):
    # Build a single CSV line from the fields forwarded by the Aurora trigger.
    firehose_data = ("%s,%s,%s,%s,%s,%s,%s,%s\n" % (event['ItemID'],
        event['Category'], event['Price'], event['Quantity'],
        event['OrderDate'], event['DestinationState'], event['ShippingType'],
        event['Referral']))

    firehose_data = {'Data': str(firehose_data)}
    print(firehose_data)

    # Send the record to the delivery stream; Firehose buffers it and writes it to S3.
    firehose.put_record(DeliveryStreamName=stream_name,
                        Record=firehose_data)

Note the Amazon Resource Name (ARN) of this Lambda function.

Giving Aurora permissions to invoke a Lambda function

To give Amazon Aurora permissions to invoke a Lambda function, you must attach an IAM role with appropriate permissions to the cluster. For more information, see Invoking a Lambda Function from an Amazon Aurora DB Cluster.

Once you are finished, the Amazon Aurora database has access to invoke a Lambda function.

Creating a stored procedure and a trigger in Amazon Aurora

Now, go back to MySQL Workbench, and run the following command to create a new stored procedure. When this stored procedure is called, it invokes the Lambda function you created. Change the ARN in the following code to your Lambda function’s ARN.

DROP PROCEDURE IF EXISTS CDC_TO_FIREHOSE;
DELIMITER ;;
CREATE PROCEDURE CDC_TO_FIREHOSE (IN ItemID VARCHAR(255), 
									IN Category varchar(255), 
									IN Price double(10,2),
                                    IN Quantity int(11),
                                    IN OrderDate timestamp,
                                    IN DestinationState varchar(2),
                                    IN ShippingType varchar(255),
                                    IN Referral  varchar(255)) LANGUAGE SQL 
BEGIN
  CALL mysql.lambda_async('arn:aws:lambda:us-east-1:XXXXXXXXXXXXX:function:CDCFromAuroraToKinesis', 
     CONCAT('{ "ItemID" : "', ItemID, 
            '", "Category" : "', Category,
            '", "Price" : "', Price,
            '", "Quantity" : "', Quantity, 
            '", "OrderDate" : "', OrderDate, 
            '", "DestinationState" : "', DestinationState, 
            '", "ShippingType" : "', ShippingType, 
            '", "Referral" : "', Referral, '"}')
     );
END
;;
DELIMITER ;

Create a trigger TR_Sales_CDC on the Sales table. When a new record is inserted, this trigger calls the CDC_TO_FIREHOSE stored procedure.

DROP TRIGGER IF EXISTS TR_Sales_CDC;
 
DELIMITER ;;
CREATE TRIGGER TR_Sales_CDC
  AFTER INSERT ON Sales
  FOR EACH ROW
BEGIN
  SELECT  NEW.ItemID , NEW.Category, New.Price, New.Quantity, New.OrderDate
  , New.DestinationState, New.ShippingType, New.Referral
  INTO @ItemID , @Category, @Price, @Quantity, @OrderDate
  , @DestinationState, @ShippingType, @Referral;
  CALL  CDC_TO_FIREHOSE(@ItemID , @Category, @Price, @Quantity, @OrderDate
  , @DestinationState, @ShippingType, @Referral);
END
;;
DELIMITER ;

If a new row is inserted in the Sales table, the Lambda function that is mentioned in the stored procedure is invoked.

Verify that data is being sent from the Lambda function to Kinesis Data Firehose to Amazon S3 successfully. You might have to insert a few records, depending on the size of your data, before new records appear in Amazon S3. This is due to Kinesis Data Firehose buffering. To learn more about Kinesis Data Firehose buffering, see the “Amazon S3” section in Amazon Kinesis Data Firehose Data Delivery.
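
If you want to exercise the pipeline without touching the database, you can also invoke the Lambda function directly with a hand-built event and then look for new objects under your bucket prefix. The following is only a rough sketch: the function name and CDC/ prefix come from this walkthrough, and the bucket name is a placeholder you need to replace.

import boto3
import json

lambda_client = boto3.client('lambda')
s3 = boto3.client('s3')

# An event shaped like the payload the Aurora trigger forwards.
sample_event = {
    "ItemID": "42", "Category": "Garden", "Price": "19.99", "Quantity": "2",
    "OrderDate": "2016-06-01 00:00:00", "DestinationState": "WA",
    "ShippingType": "Free", "Referral": "Online Ad"
}

# Asynchronous invocation, similar to what mysql.lambda_async does.
lambda_client.invoke(FunctionName='CDCFromAuroraToKinesis',
                     InvocationType='Event',
                     Payload=json.dumps(sample_event))

# After the Firehose buffering interval expires, objects should show up here.
response = s3.list_objects_v2(Bucket='YOUR_BUCKET_NAME', Prefix='CDC/')
for obj in response.get('Contents', []):
    print(obj['Key'], obj['Size'])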

Every time a new record is inserted in the Sales table, the trigger calls the stored procedure, which invokes the Lambda function and sends the new data to Amazon S3 by way of Kinesis Data Firehose.

Querying data in Amazon Redshift

In this section, you use the data you produced from Amazon Aurora and consume it as-is in Amazon Redshift. In order to allow you to process your data as-is, where it is, while taking advantage of the power and flexibility of Amazon Redshift, you use Amazon Redshift Spectrum. You can use Redshift Spectrum to run complex queries on data stored in Amazon S3, with no need for loading or other data prep.

Just create a data source and issue your queries to your Amazon Redshift cluster as usual. Behind the scenes, Redshift Spectrum scales to thousands of instances on a per-query basis, ensuring that you get fast, consistent performance even as your dataset grows to beyond an exabyte! Being able to query data that is stored in Amazon S3 means that you can scale your compute and your storage independently. You have the full power of the Amazon Redshift query model and all the reporting and business intelligence tools at your disposal. Your queries can reference any combination of data stored in Amazon Redshift tables and in Amazon S3.

Redshift Spectrum supports open, common data types, including CSV/TSV, Apache Parquet, SequenceFile, and RCFile. Files can be compressed using gzip or Snappy, with other data types and compression methods in the works.

First, create an Amazon Redshift cluster. Follow the steps in Launch a Sample Amazon Redshift Cluster.

Next, create an IAM role that has access to Amazon S3 and Athena. By default, Amazon Redshift Spectrum uses the Amazon Athena data catalog. Your cluster needs authorization to access your external data catalog in AWS Glue or Athena and your data files in Amazon S3.

In the demo setup, I attached AmazonS3FullAccess and AmazonAthenaFullAccess. In a production environment, the IAM roles should follow the standard security of granting least privilege. For more information, see IAM Policies for Amazon Redshift Spectrum.

Attach the newly created role to the Amazon Redshift cluster. For more information, see Associate the IAM Role with Your Cluster.

Next, connect to the Amazon Redshift cluster, and create an external schema and database:

create external schema if not exists spectrum_schema
from data catalog 
database 'spectrum_db' 
region 'us-east-1'
IAM_ROLE 'arn:aws:iam::XXXXXXXXXXXX:role/RedshiftSpectrumRole'
create external database if not exists;

Don’t forget to replace the IAM role in the statement.

Then create an external table within the database:

 CREATE EXTERNAL TABLE IF NOT EXISTS spectrum_schema.ecommerce_sales(
  ItemID int,
  Category varchar,
  Price DOUBLE PRECISION,
  Quantity int,
  OrderDate TIMESTAMP,
  DestinationState varchar,
  ShippingType varchar,
  Referral varchar)
ROW FORMAT DELIMITED
      FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION 's3://{BUCKET_NAME}/CDC/'

Query the table, and it should contain data. This is a fact table.

select top 10 * from spectrum_schema.ecommerce_sales

Next, create a dimension table. For this example, we create a date/time dimension table. Create the table:

CREATE TABLE date_dimension (
  d_datekey           integer       not null sortkey,
  d_dayofmonth        integer       not null,
  d_monthnum          integer       not null,
  d_dayofweek                varchar(10)   not null,
  d_prettydate        date       not null,
  d_quarter           integer       not null,
  d_half              integer       not null,
  d_year              integer       not null,
  d_season            varchar(10)   not null,
  d_fiscalyear        integer       not null)
diststyle all;

Populate the table with data:

copy date_dimension from 's3://reparmar-lab/2016dates' 
iam_role 'arn:aws:iam::XXXXXXXXXXXX:role/redshiftspectrum'
DELIMITER ','
dateformat 'auto';

The date dimension table should look like the following:

Querying data in local and external tables using Amazon Redshift

Now that you have the fact and dimension tables populated with data, you can combine the two and run analysis. For example, if you want to query the total sales amount by season, you can run the following:

select sum(quantity*price) as total_sales, date_dimension.d_season
from spectrum_schema.ecommerce_sales 
join date_dimension on spectrum_schema.ecommerce_sales.orderdate = date_dimension.d_prettydate 
group by date_dimension.d_season

You get the following results:

Similarly, you can replace d_season with d_dayofweek to get sales figures by weekday:

With Amazon Redshift Spectrum, you pay only for the queries you run against the data that you actually scan. We encourage you to use file partitioning, columnar data formats, and data compression to significantly minimize the amount of data scanned in Amazon S3. This is important for data warehousing because it dramatically improves query performance and reduces cost.

Partitioning your data in Amazon S3 by date, time, or any other custom keys enables Amazon Redshift Spectrum to dynamically prune nonrelevant partitions to minimize the amount of data processed. If you store data in a columnar format, such as Parquet, Amazon Redshift Spectrum scans only the columns needed by your query, rather than processing entire rows. Similarly, if you compress your data using one of the supported compression algorithms in Amazon Redshift Spectrum, less data is scanned.
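
As a rough illustration of the columnar-format point (not part of the original walkthrough), the sketch below reads an exported CSV object that has been downloaded locally and rewrites it as Parquet partitioned by DestinationState; it assumes pandas and pyarrow are installed and that the columns match the Sales table.

import pandas as pd

# Column order matches what the Lambda function writes to Firehose.
columns = ["ItemID", "Category", "Price", "Quantity",
           "OrderDate", "DestinationState", "ShippingType", "Referral"]

# A CSV object previously downloaded from the CDC/ prefix (no header row).
df = pd.read_csv("sales_export.csv", names=columns)

# Write a Parquet dataset partitioned by state; Redshift Spectrum can then
# prune partitions and read only the columns a query needs.
df.to_parquet("sales_parquet/", engine="pyarrow",
              partition_cols=["DestinationState"])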

Analyzing and visualizing Amazon Redshift data in Amazon QuickSight

Modify the Amazon Redshift security group to allow an Amazon QuickSight connection. For more information, see Authorizing Connections from Amazon QuickSight to Amazon Redshift Clusters.

After modifying the Amazon Redshift security group, go to Amazon QuickSight. Create a new analysis, and choose Amazon Redshift as the data source.

Enter the database connection details, validate the connection, and create the data source.

Choose the schema to be analyzed. In this case, choose spectrum_schema, and then choose the ecommerce_sales table.

Next, we add a custom field for Total Sales = Price*Quantity. In the drop-down list for the ecommerce_sales table, choose Edit analysis data sets.

On the next screen, choose Edit.

In the data prep screen, choose New Field. Add a new calculated field Total Sales $, which is the product of the Price*Quantity fields. Then choose Create. Save and visualize it.

Next, to visualize total sales figures by month, create a graph with Total Sales on the x-axis and Order Date formatted as month on the y-axis.

After you’ve finished, you can use Amazon QuickSight to add different columns from your Amazon Redshift tables and perform different types of visualizations. You can build operational dashboards that continuously monitor your transactional and analytical data. You can publish these dashboards and share them with others.

Final notes

Amazon QuickSight can also read data in Amazon S3 directly. However, with the method demonstrated in this post, you have the option to manipulate, filter, and combine data from multiple sources or Amazon Redshift tables before visualizing it in Amazon QuickSight.

In this example, we dealt with data being inserted, but triggers can also be activated in response to UPDATE and DELETE statements.

Keep the following in mind:

  • Be careful when invoking a Lambda function from triggers on tables that experience high write traffic. This would result in a large number of calls to your Lambda function. Although calls to the lambda_async procedure are asynchronous, triggers are synchronous.
  • A statement that results in a large number of trigger activations does not wait for the call to the AWS Lambda function to complete. But it does wait for the triggers to complete before returning control to the client.
  • Similarly, you must account for Amazon Kinesis Data Firehose limits. By default, Kinesis Data Firehose is limited to a maximum of 5,000 records/second. For more information, see Monitoring Amazon Kinesis Data Firehose. If you buffer changes and submit them in batches, you can reduce the number of API calls, as shown in the sketch after this list.
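
If you do need to buffer several changes and forward them together, put_record_batch accepts up to 500 records per call. The sketch below is just one possible mitigation, not part of the original design, and reuses the delivery stream created earlier.

import boto3

firehose = boto3.client('firehose')
stream_name = 'AuroraChangesToS3'

def send_batch(csv_lines):
    # Each record is one CSV line; the API accepts at most 500 records per call.
    records = [{'Data': line} for line in csv_lines]
    for start in range(0, len(records), 500):
        response = firehose.put_record_batch(
            DeliveryStreamName=stream_name,
            Records=records[start:start + 500])
        # Failed records are reported rather than raised; retry them if needed.
        if response.get('FailedPutCount', 0):
            print('Records to retry:', response['FailedPutCount'])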

In certain cases, it may be optimal to use AWS Database Migration Service (AWS DMS) to capture data changes in Aurora and use Amazon S3 as a target. For example, AWS DMS might be a good option if you don’t need to transform data from Amazon Aurora. The method used in this post gives you the flexibility to transform data from Aurora using Lambda before sending it to Amazon S3. Additionally, the architecture has the benefits of being serverless, whereas AWS DMS requires an Amazon EC2 instance for replication.

For design considerations while using Redshift Spectrum, see Using Amazon Redshift Spectrum to Query External Data.

If you have questions or suggestions, please comment below.


Additional Reading

If you found this post useful, be sure to check out Capturing Data Changes in Amazon Aurora Using AWS Lambda and 10 Best Practices for Amazon Redshift Spectrum.


About the Authors

Re Alvarez-Parmar is a solutions architect for Amazon Web Services. He helps enterprises achieve success through technical guidance and thought leadership. In his spare time, he enjoys spending time with his two kids and exploring outdoors.

Detecting Adblocker Blockers

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/01/detecting_adblo.html

Interesting research on the prevalence of adblock blockers: “Measuring and Disrupting Anti-Adblockers Using Differential Execution Analysis“:

Abstract: Millions of people use adblockers to remove intrusive and malicious ads as well as protect themselves against tracking and pervasive surveillance. Online publishers consider adblockers a major threat to the ad-powered “free” Web. They have started to retaliate against adblockers by employing anti-adblockers which can detect and stop adblock users. To counter this retaliation, adblockers in turn try to detect and filter anti-adblocking scripts. This back and forth has prompted an escalating arms race between adblockers and anti-adblockers.

We want to develop a comprehensive understanding of anti-adblockers, with the ultimate aim of enabling adblockers to bypass state-of-the-art anti-adblockers. In this paper, we present a differential execution analysis to automatically detect and analyze anti-adblockers. At a high level, we collect execution traces by visiting a website with and without adblockers. Through differential execution analysis, we are able to pinpoint the conditions that lead to the differences caused by anti-adblocking code. Using our system, we detect anti-adblockers on 30.5% of the Alexa top-10K websites which is 5-52 times more than reported in prior literature. Unlike prior work which is limited to detecting visible reactions (e.g., warning messages) by anti-adblockers, our system can discover attempts to detect adblockers even when there is no visible reaction. From manually checking one third of the detected websites, we find that the websites that have no visible reactions constitute over 90% of the cases, completely dominating the ones that have visible warning messages. Finally, based on our findings, we further develop JavaScript rewriting and API hooking based solutions (the latter implemented as a Chrome extension) to help adblockers bypass state-of-the-art anti-adblockers.

News article.

The Raspberry Pi PiServer tool

Post Syndicated from Gordon Hollingworth original https://www.raspberrypi.org/blog/piserver/

As Simon mentioned in his recent blog post about Raspbian Stretch, we have developed a new piece of software called PiServer. Use this tool to easily set up a network of client Raspberry Pis connected to a single x86-based server via Ethernet. With PiServer, you don’t need SD cards, you can control all clients via the server, and you can add and configure user accounts — it’s ideal for the classroom, your home, or an industrial setting.

PiServer diagram

Client? Server?

Before I go into more detail, let me quickly explain some terms.

  • Server — the server is the computer that provides the file system, boot files, and password authentication to the client(s)
  • Client — a client is a computer that retrieves boot files from the server over the network, and then uses a file system the server has shared. More than one client can connect to a server, but all clients use the same file system.
  • User — a user is a user name/password combination that allows someone to log into a client to access the file system on the server. Any user can log into any client with their credentials, and will always see the same server and share the same file system. Users do not have sudo capability on a client, meaning they cannot make significant changes to the file system and software.

I see no SD cards

Last year we described how the Raspberry Pi 3 Model B can be booted without an SD card over an Ethernet network from another computer (the server). This is called network booting or PXE (pronounced ‘pixie’) booting.

Why would you want to do this?

  • A client computer (the Raspberry Pi) doesn’t need any permanent storage (an SD card) to boot.
  • You can network a large number of clients to one server, and all clients are exactly the same. If you log into one of the clients, you will see the same file system as if you logged into any other client.
  • The server can be run on an x86 system, which means you get to take advantage of the performance, network, and disk speed on the server.

Sounds great, right? Of course, for the less technical, creating such a network is very difficult. For example, there’s setting up all the required DHCP and TFTP servers, and making sure they behave nicely with the rest of the network. If you get this wrong, you can break your entire network.

PiServer to the rescue

To make network booting easy, I thought it would be nice to develop an application which did everything for you. Let me introduce: PiServer!

PiServer has the following functionalities:

  • It automatically detects Raspberry Pis trying to network boot, so you don’t have to work out their Ethernet addresses.
  • It sets up a DHCP server — the thing inside the router that gives all network devices an IP address — either in proxy mode or in full IP mode. No matter the mode, the DHCP server will only reply to the Raspberry Pis you have specified, which is important for network safety.
  • It creates user names and passwords for the server. This is great for a classroom full of Pis: just set up all the users beforehand, and everyone gets to log in with their passwords and keep all their work in a central place. Moreover, users cannot change the software, so educators have control over which programs their learners can use.
  • It uses a slightly altered Raspbian build which allows separation of temporary spaces, doesn’t have the default ‘pi’ user, and has LDAP enabled for log-in.

What can I do with PiServer?

Serve a whole classroom of Pis

In a classroom, PiServer allows all files for lessons or projects to be stored on a central x86-based computer. Each user can have their own account, and any files they create are also stored on the server. Moreover, the networked Pis don’t need to be connected to the internet. The teacher has centralised control over all Pis, and all Pis are user-agnostic, meaning there’s no need to match a person with a computer or an SD card.

Build a home server

PiServer could be used in the home to serve file systems for all Raspberry Pis around the house — either a single common Raspbian file system for all Pis or a different operating system for each. Hopefully, our extensive OS suppliers will provide suitable build files in future.

Use it as a controller for networked Pis

In an industrial scenario, it is possible to use PiServer to develop a network of Raspberry Pis (maybe even using Power over Ethernet (PoE)) such that the control software for each Pi is stored remotely on a server. This enables easy remote control and provisioning of the Pis from a central repository.

How to use PiServer

The client machines

So that you can use a Pi as a client, you need to enable network booting on it. Power it up using an SD card with a Raspbian Lite image, and open a terminal window. Type in

echo program_usb_boot_mode=1 | sudo tee -a /boot/config.txt

and press Return. This adds the line program_usb_boot_mode=1 to the end of the config.txt file in /boot. Now power the Pi down and remove the SD card. The next time you connect the Pi to a power source, you will be able to network boot it.

The server machine

As a server, you will need an x86 computer on which you can install x86 Debian Stretch. Refer to Simon’s blog post for additional information on this. It is possible to use a Raspberry Pi as the server for the client Pis, but the file system will be slower, especially at boot time.

Make sure your server has a good amount of disk space available for the file system — in general, we recommend at least 16GB SD cards for Raspberry Pis. The whole client file system is stored locally on the server, so the disk space requirement is fairly significant.

Next, start PiServer by clicking on the start icon and then clicking Preferences > PiServer. This will open a graphical user interface — the wizard — that will walk you through setting up your network. Skip the introduction screen, and you should see a screen looking like this:

PiServer GUI screenshot

If you’ve enabled network booting on the client Pis and they are connected to a power source, their MAC addresses will automatically appear in the table shown above. When you have added all your Pis, click Next.

PiServer GUI screenshot

On the Add users screen, you can set up users on your server. These are pairs of user names and passwords that will be valid for logging into the client Raspberry Pis. Don’t worry, you can add more users at any point. Click Next again when you’re done.

PiServer GUI screenshot

The Add software screen allows you to select the operating system you want to run on the attached Pis. (You’ll have the option to assign an operating system to each client individually in the setting after the wizard has finished its job.) There are some automatically populated operating systems, such as Raspbian and Raspbian Lite. Hopefully, we’ll add more in due course. You can also provide your own operating system from a local file, or install it from a URL. For further information about how these operating system images are created, have a look at the scripts in /var/lib/piserver/scripts.

Once you’re done, click Next again. The wizard will then install the necessary components and the operating systems you’ve chosen. This will take a little time, so grab a coffee (or decaffeinated drink of your choice).

When the installation process is finished, PiServer is up and running — all you need to do is reboot the Pis to get them to run from the server.

Shooting troubles

If you have trouble getting clients connected to your network, there are a few things you can do to debug:

  1. If some clients are connecting but others are not, check whether you’ve enabled the network booting mode on the Pis that give you issues. To do that, plug an Ethernet cable into the Pi (with the SD card removed) — the LEDs on the Pi and connector should turn on. If that doesn’t happen, you’ll need to follow the instructions above to boot the Pi and edit its /boot/config.txt file.
  2. If you can’t connect to any clients, check whether your network is suitable: format an SD card, and copy bootcode.bin from /boot on a standard Raspbian image onto it. Plug the card into a client Pi, and check whether it appears as a new MAC address in the PiServer GUI. If it does, then the problem is a known issue, and you can head to our forums to ask for advice about it (the network booting code has a couple of problems which we’re already aware of). For a temporary fix, you can clone the SD card on which bootcode.bin is stored for all your clients.

If neither of these things fix your problem, our forums are the place to find help — there’s a host of people there who’ve got PiServer working. If you’re sure you have identified a problem that hasn’t been addressed on the forums, or if you have a request for a functionality, then please add it to the GitHub issues.

The post The Raspberry Pi PiServer tool appeared first on Raspberry Pi.

Some notes on Meltdown/Spectre

Post Syndicated from Robert Graham original http://blog.erratasec.com/2018/01/some-notes-on-meltdownspectre.html

I thought I’d write up some notes.

You don’t have to worry if you patch. If you download the latest update from Microsoft, Apple, or Linux, then the problem is fixed for you and you don’t have to worry. If you aren’t up to date, then there’s a lot of other nasties out there you should probably also be worrying about. I mention this because while this bug is big in the news, it’s probably not news the average consumer needs to concern themselves with.

This will force a redesign of CPUs and operating systems. While not a big news item for consumers, it’s huge in the geek world. We’ll need to redesign operating systems and how CPUs are made.

Don’t worry about the performance hit. Some, especially avid gamers, are concerned about the claims of “30%” performance reduction when applying the patch. That’s only in some rare cases, so you shouldn’t worry too much about it. As far as I can tell, 3D games aren’t likely to see more than 1% performance degradation. If you imagine your game is suddenly slower after the patch, then something else broke it.

This wasn’t foreseeable. A common cliché is that such bugs happen because people don’t take security seriously, or that they are taking “shortcuts”. That’s not the case here. Speculative execution and timing issues with caches are inherent to CPU hardware. “Fixing” this would make CPUs run ten times slower. Thus, while we can tweak hardware going forward, the larger change will be in software.

There’s no good way to disclose this. The cybersecurity industry has a process for coordinating the release of such bugs, which appears to have broken down. In truth, it didn’t. Once Linus announced a security patch that would degrade performance of the Linux kernel, we knew the coming bug was going to be Big. Looking at the Linux patch, tracking backwards to the bug was only a matter of time. Hence, the release of this information was a bit sooner than some wanted. This is to be expected, and is nothing to be upset about.

It helps to have a name. Many are offended by the crassness of naming vulnerabilities and giving them logos. On the other hand, we are going to be talking about these bugs for the next decade. Having a recognizable name, rather than a hard-to-remember number, is useful.

Should I stop buying Intel? Intel has the worst of the bugs here. On the other hand, ARM and AMD alternatives have their own problems. Many want to deploy ARM servers in their data centers, but these are likely to expose bugs you don’t see on x86 servers. The software fix, “page table isolation”, seems to work, so there might not be anything to worry about. On the other hand, holding up purchases because of “fear” of this bug is a good way to squeeze price reductions out of your vendor. Conversely, later generation CPUs, “Haswell” and even “Skylake” seem to have the least performance degradation, so it might be time to upgrade older servers to newer processors.

Intel misleads. Intel has a press release that implies they are not impacted any worse than others. This is wrong: the “Meltdown” issue appears to apply only to Intel CPUs. I don’t like such marketing crap, so I mention it.


Statements from companies: