Tag Archives: 360

Amazon Redshift – 2017 Recap

Post Syndicated from Larry Heathcote original https://aws.amazon.com/blogs/big-data/amazon-redshift-2017-recap/

We have been busy adding new features and capabilities to Amazon Redshift, and we wanted to give you a glimpse of what we’ve been doing over the past year. In this article, we recap a few of our enhancements and provide a set of resources that you can use to learn more and get the most out of your Amazon Redshift implementation.

In 2017, we made more than 30 announcements about Amazon Redshift. We listened to you, our customers, and delivered Redshift Spectrum, a feature of Amazon Redshift that gives you the ability to extend analytics to your data lake—without moving data. We launched new DC2 nodes, doubling performance at the same price. We also announced many new features that provide greater scalability, better performance, more automation, and easier ways to manage your analytics workloads.

To see a full list of our launches, visit our what’s new page—and be sure to subscribe to our RSS feed.

Major launches in 2017

Amazon Redshift Spectrum – extend analytics to your data lake, without moving data

We launched Amazon Redshift Spectrum to give you the freedom to store data in Amazon S3, in open file formats, and have it available for analytics without the need to load it into your Amazon Redshift cluster. It enables you to easily join datasets across Redshift clusters and S3 to provide unique insights that you would not be able to obtain by querying independent data silos.

With Redshift Spectrum, you can run SQL queries against data in an Amazon S3 data lake as easily as you analyze data stored in Amazon Redshift. And you can do it without loading data or resizing the Amazon Redshift cluster based on growing data volumes. Redshift Spectrum separates compute and storage to meet workload demands for data size, concurrency, and performance. Redshift Spectrum scales processing across thousands of nodes, so results are fast, even with massive datasets and complex queries. You can query open file formats that you already use—such as Apache Avro, CSV, Grok, ORC, Apache Parquet, RCFile, RegexSerDe, SequenceFile, TextFile, and TSV—directly in Amazon S3, without any data movement.

“For complex queries, Redshift Spectrum provided a 67 percent performance gain,” said Rafi Ton, CEO, NUVIAD. “Using the Parquet data format, Redshift Spectrum delivered an 80 percent performance improvement. For us, this was substantial.”

To learn more about Redshift Spectrum, watch our AWS Summit session Intro to Amazon Redshift Spectrum: Now Query Exabytes of Data in S3, and read our announcement blog post Amazon Redshift Spectrum – Exabyte-Scale In-Place Queries of S3 Data.

DC2 nodes—twice the performance of DC1 at the same price

We launched second-generation Dense Compute (DC2) nodes to provide low latency and high throughput for demanding data warehousing workloads. DC2 nodes feature powerful Intel E5-2686 v4 (Broadwell) CPUs, fast DDR4 memory, and NVMe-based solid state disks (SSDs). We’ve tuned Amazon Redshift to take advantage of the better CPU, network, and disk on DC2 nodes, providing up to twice the performance of DC1 at the same price. Our DC2.8xlarge instances now provide twice the memory per slice of data and an optimized storage layout with 30 percent better storage utilization.

“Redshift allows us to quickly spin up clusters and provide our data scientists with a fast and easy method to access data and generate insights,” said Bradley Todd, technology architect at Liberty Mutual. “We saw a 9x reduction in month-end reporting time with Redshift DC2 nodes as compared to DC1.”

Read our customer testimonials to see the performance gains our customers are experiencing with DC2 nodes. To learn more, read our blog post Amazon Redshift Dense Compute (DC2) Nodes Deliver Twice the Performance as DC1 at the Same Price.

Performance enhancements— 3x-5x faster queries

On average, our customers are seeing 3x to 5x performance gains for most of their critical workloads.

We introduced short query acceleration to speed up execution of queries such as reports, dashboards, and interactive analysis. Short query acceleration uses machine learning to predict the execution time of a query, and to move short running queries to an express short query queue for faster processing.

We launched results caching to deliver sub-second response times for queries that are repeated, such as dashboards, visualizations, and those from BI tools. Results caching has an added benefit of freeing up resources to improve the performance of all other queries.

We also introduced late materialization to reduce the amount of data scanned for queries with predicate filters by batching and factoring in the filtering of predicates before fetching data blocks in the next column. For example, if only 10 percent of the table rows satisfy the predicate filters, Amazon Redshift can potentially save 90 percent of the I/O for the remaining columns to improve query performance.

We launched query monitoring rules and pre-defined rule templates. These features make it easier for you to set metrics-based performance boundaries for workload management (WLM) queries, and specify what action to take when a query goes beyond those boundaries. For example, for a queue that’s dedicated to short-running queries, you might create a rule that aborts queries that run for more than 60 seconds. To track poorly designed queries, you might have another rule that logs queries that contain nested loops.
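
To make this concrete, here is a rough sketch of how a 60-second abort rule and a nested-loop logging rule might be expressed through the wlm_json_configuration parameter using the AWS SDK for Python (Boto3). The parameter group name and queue layout are hypothetical, and because this parameter replaces the entire WLM configuration of the parameter group, treat it as an illustration rather than a drop-in change.

import json

import boto3

# A sketch of a WLM configuration with two query monitoring rules:
# abort queries in this queue that run longer than 60 seconds, and
# log queries that contain nested loop joins.
wlm_config = [
    {
        "query_concurrency": 5,
        "rules": [
            {
                "rule_name": "abort_long_running_queries",
                "predicate": [
                    # query_execution_time is measured in seconds
                    {"metric_name": "query_execution_time", "operator": ">", "value": 60}
                ],
                "action": "abort",
            },
            {
                "rule_name": "log_nested_loop_joins",
                "predicate": [
                    {"metric_name": "nested_loop_join_row_count", "operator": ">", "value": 100}
                ],
                "action": "log",
            },
        ],
    }
]

redshift = boto3.client("redshift", region_name="us-west-2")

# "examplecorp-wlm" is a hypothetical parameter group name; use the parameter
# group attached to your cluster. This call replaces the group's existing
# WLM configuration.
redshift.modify_cluster_parameter_group(
    ParameterGroupName="examplecorp-wlm",
    Parameters=[
        {
            "ParameterName": "wlm_json_configuration",
            "ParameterValue": json.dumps(wlm_config),
        }
    ],
)

Depending on which properties change, the cluster may need a reboot before the new configuration takes effect.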

Customer insights

Amazon Redshift and Redshift Spectrum serve customers across a variety of industries and sizes, from startups to large enterprises. Visit our customer page to see the success that customers are having with our recent enhancements. Learn how companies like Liberty Mutual Insurance saw a 9x reduction in month-end reporting time using DC2 nodes. On this page, you can find case studies, videos, and other content that show how our customers are using Amazon Redshift to drive innovation and business results.

In addition, check out these resources to learn about the success our customers are having building out a data warehouse and data lake integration solution with Amazon Redshift:

Partner solutions

You can enhance your Amazon Redshift data warehouse by working with industry-leading experts. Our AWS Partner Network (APN) Partners have certified their solutions to work with Amazon Redshift. They offer software, tools, integration, and consulting services to help you at every step. Visit our Amazon Redshift Partner page and choose an APN Partner. Or, use AWS Marketplace to find and immediately start using third-party software.

To see what our Partners are saying about Amazon Redshift Spectrum and our DC2 nodes mentioned earlier, read these blog posts:

Resources

Blog posts

Visit the AWS Big Data Blog for a list of all Amazon Redshift articles.

YouTube videos

GitHub

Our community of experts contribute on GitHub to provide tips and hints that can help you get the most out of your deployment. Visit GitHub frequently to get the latest technical guidance, code samples, administrative task automation utilities, the analyze & vacuum schema utility, and more.

Customer support

If you are evaluating or considering a proof of concept with Amazon Redshift, or you need assistance migrating your on-premises or other cloud-based data warehouse to Amazon Redshift, our team of product experts and solutions architects can help you with architecting, sizing, and optimizing your data warehouse. Contact us using this support request form, and let us know how we can assist you.

If you are an Amazon Redshift customer, we offer a no-cost health check program. Our team of database engineers and solutions architects give you recommendations for optimizing Amazon Redshift and Amazon Redshift Spectrum for your specific workloads. To learn more, email us at [email protected].

If you have any questions, email us at [email protected].

 


Additional Reading

If you found this post useful, be sure to check out Amazon Redshift Spectrum – Exabyte-Scale In-Place Queries of S3 Data, Using Amazon Redshift for Fast Analytical Reports and How to Migrate Your Oracle Data Warehouse to Amazon Redshift Using AWS SCT and AWS DMS.


About the Author

Larry Heathcote is a Principal Product Marketing Manager at Amazon Web Services for data warehousing and analytics. Larry is passionate about seeing the results of data-driven insights on business outcomes. He enjoys family time, home projects, grilling out, and the taste of classic barbeque.

Security Breaches Don’t Affect Stock Price

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/01/security_breach.html

Interesting research: “Long-term market implications of data breaches, not,” by Russell Lange and Eric W. Burger.

Abstract: This report assesses the impact disclosure of data breaches has on the total returns and volatility of the affected companies’ stock, with a focus on the results relative to the performance of the firms’ peer industries, as represented through selected indices rather than the market as a whole. Financial performance is considered over a range of dates from 3 days post-breach through 6 months post-breach, in order to provide a longer-term perspective on the impact of the breach announcement.

Key findings:

  • While the difference in stock price between the sampled breached companies and their peers was negative (1.13%) in the first 3 days following announcement of a breach, by the 14th day the return difference had rebounded to +0.05%, and on average remained positive through the period assessed.
  • For the differences in the breached companies’ betas and the beta of their peer sets, the differences in the means of 8 months pre-breach versus post-breach was not meaningful at 90, 180, and 360 day post-breach periods.

  • For the differences in the breached companies’ beta correlations against the peer indices pre- and post-breach, the difference in the means of the rolling 60 day correlation 8 months pre-breach versus post-breach was not meaningful at 90, 180, and 360 day post-breach periods.

  • In regression analysis, use of the number of accessed records, date, data sensitivity, and malicious versus accidental leak as variables failed to yield an R² greater than 16.15% for response variables of 3, 14, 60, and 90 day return differential, excess beta differential, and rolling beta correlation differential, indicating that the financial impact on breached companies was highly idiosyncratic.

  • Based on returns, the most impacted industries at the 3 day post-breach date were U.S. Financial Services, Transportation, and Global Telecom. At the 90 day post-breach date, the three most impacted industries were U.S. Financial Services, U.S. Healthcare, and Global Telecom.

The market isn’t going to fix this. If we want better security, we need to regulate the market.

Note: The article is behind a paywall. An older version is here. A similar article is here.

Modding Legends Team-Xecuter Announce “Future-Proof” Nintendo Switch Hack

Post Syndicated from Andy original https://torrentfreak.com/modding-legends-team-xecuter-announce-future-proof-nintendo-switch-hack-180104/

Since the advent of the first truly mass-market videogames consoles, people have dreamed about removing the protection mechanisms that prevent users from tinkering with their machines.

These modifications – which are software, hardware, or a combination of the two – facilitate the running of third-party or “homebrew” code. On this front, a notable mention must go to XBMC (now known as Kodi), which ran on the original Xbox after its copy protection mechanisms had been removed.

However, these same modifications regularly open the door to mass-market piracy too, with mod-chips (hardware devices) or soft-mods (software solutions) opening up machines so that consumers can run games obtained from the Internet or elsewhere.

For the Nintendo Switch, that prospect edged closer at the end of December when Wololo reported that hackers Plutoo, Derrek, and Naehrwert had given a long presentation (video) at the 34C3 hacking conference in Germany, revealing their kernel hack for the Nintendo Switch.

While this in itself is an exciting development, fresh news from a veteran hacking group suggests that Nintendo could be in big trouble on the piracy front in the not-too-distant future.

“In the light of a recent presentation at the Chaos Communication Congress in Germany we’ve decided to come out of the woodwork and tease you all a bit with our latest upcoming product,” the legendary Team-Xecuter just announced.

While the hack announced in December requires Switch firmware 3.0 (and a copy of Pokken Tournament DX), Team-Xecuter say that their product will be universal, something which tends to suggest a fundamental flaw in the Switch system.

“This solution will work on ANY Nintendo Switch console regardless of the currently installed firmware, and will be completely future proof,” the team explain.

Team-Xecuter say that their solution opens up the possibility of custom firmware (CFW) on Nintendo’s console. In layman’s terms, this means that those with the technical ability will be able to dictate, at least to a point, how the console functions.

“We want to move the community forward and provide a persistent, stable and fast method of running your own code and custom firmware patches on Nintendo’s latest flagship product. And we think we’ve succeeded!” the team add.

The console-modding community thrives on rumors, with various parties claiming to have made progress here and there, on this console and that, so it’s natural for people to greet this kind of announcement with a degree of skepticism. That being said, Team-Xecuter is no regular group.

With a long history of console-based meddling, Team-Xecuter’s efforts include hardware solutions for the original PlayStation and PlayStation 2, an array of hacks for the original Xbox (Enigmah and various Xecuter-branded solutions), plus close involvement in prominent Xbox 360 mods. Their pedigree is definitely not up for debate.

For now, the team isn’t releasing any more details on the nature of the hack but they have revealed when the public can expect to get their hands on it.

“Spring 2018 or there around,” they conclude.

Team-Xecuter demo

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

How to Prepare for AWS’s Move to Its Own Certificate Authority

Post Syndicated from Jonathan Kozolchyk original https://aws.amazon.com/blogs/security/how-to-prepare-for-aws-move-to-its-own-certificate-authority/

AWS Certificate Manager image

 

Update from March 28, 2018: We updated the Amazon Trust Services table by replacing an out-of-date value with a new value.


Transport Layer Security (TLS, formerly called Secure Sockets Layer [SSL]) is essential for encrypting information that is exchanged on the internet. For example, Amazon.com uses TLS for all traffic on its website, and AWS uses it to secure calls to AWS services.

An electronic document called a certificate verifies the identity of the server when creating such an encrypted connection. The certificate helps establish proof that your web browser is communicating securely with the website that you typed in your browser’s address field. Certificate Authorities, also known as CAs, issue certificates to specific domains. When a domain presents a certificate that is issued by a trusted CA, your browser or application knows it’s safe to make the connection.

In January 2016, AWS launched AWS Certificate Manager (ACM), a service that lets you easily provision, manage, and deploy SSL/TLS certificates for use with AWS services. These certificates are available for no additional charge through Amazon’s own CA: Amazon Trust Services. For browsers and other applications to trust a certificate, the certificate’s issuer must be included in the browser’s trust store, which is a list of trusted CAs. If the issuing CA is not in the trust store, the browser will display an error message (see an example) and applications will show an application-specific error. To ensure the ubiquity of the Amazon Trust Services CA, AWS purchased the Starfield Services CA, a root found in most browsers and which has been valid since 2005. This means you shouldn’t have to take any action to use the certificates issued by Amazon Trust Services.

AWS has been offering free certificates to AWS customers from the Amazon Trust Services CA. Now, AWS is in the process of moving certificates for services such as Amazon EC2 and Amazon DynamoDB to use certificates from Amazon Trust Services as well. Most software doesn’t need to be changed to handle this transition, but there are exceptions. In this blog post, I show you how to verify that you are prepared to use the Amazon Trust Services CA.

How to tell if the Amazon Trust Services CAs are in your trust store

The following table lists the Amazon Trust Services certificates. To verify that these certificates are in your browser’s trust store, click each Test URL in the following table to verify that it works for you. When a Test URL does not work, it displays an error similar to this example.

Distinguished name | SHA-256 hash of subject public key information | Test URL
CN=Amazon Root CA 1,O=Amazon,C=US | fbe3018031f9586bcbf41727e417b7d1c45c2f47f93be372a17b96b50757d5a2 | Test URL
CN=Amazon Root CA 2,O=Amazon,C=US | 7f4296fc5b6a4e3b35d3c369623e364ab1af381d8fa7121533c9d6c633ea2461 | Test URL
CN=Amazon Root CA 3,O=Amazon,C=US | 36abc32656acfc645c61b71613c4bf21c787f5cabbee48348d58597803d7abc9 | Test URL
CN=Amazon Root CA 4,O=Amazon,C=US | f7ecded5c66047d28ed6466b543c40e0743abe81d109254dcf845d4c2c7853c5 | Test URL
CN=Starfield Services Root Certificate Authority – G2,O=Starfield Technologies\, Inc.,L=Scottsdale,ST=Arizona,C=US | 2b071c59a0a0ae76b0eadb2bad23bad4580b69c3601b630c2eaf0613afa83f92 | Test URL
Starfield Class 2 Certification Authority | 15f14ac45c9c7da233d3479164e8137fe35ee0f38ae858183f08410ea82ac4b4 | Not available*

* Note: Amazon doesn’t own this root and doesn’t have a test URL for it. The certificate can be downloaded from here.

You can calculate the SHA-256 hash of Subject Public Key Information as follows. With the PEM-encoded certificate stored in certificate.pem, run the following openssl commands:

openssl x509 -in certificate.pem -noout -pubkey | openssl asn1parse -noout -inform pem -out certificate.key
openssl dgst -sha256 certificate.key

As an example, with the Starfield Class 2 Certification Authority self-signed cert in a PEM encoded file sf-class2-root.crt, you can use the following openssl commands:

openssl x509 -in sf-class2-root.crt -noout -pubkey | openssl asn1parse -noout -inform pem -out sf-class2-root.key
openssl dgst -sha256 sf-class2-root.key
SHA256(sf-class2-root.key)= 15f14ac45c9c7da233d3479164e8137fe35ee0f38ae858183f08410ea82ac4b4
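
If you would rather script this check, the following sketch computes the same SHA-256 hash of the Subject Public Key Information in Python. It assumes a recent version of the third-party cryptography package is installed and uses the same sf-class2-root.crt file as the openssl example above.

import hashlib

from cryptography import x509
from cryptography.hazmat.primitives import serialization

# Load the PEM-encoded certificate.
with open("sf-class2-root.crt", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

# Serialize the Subject Public Key Information (SPKI) as DER and hash it.
spki = cert.public_key().public_bytes(
    encoding=serialization.Encoding.DER,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)
print(hashlib.sha256(spki).hexdigest())

The printed digest should match the value shown in the table earlier in this post.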

What to do if the Amazon Trust Services CAs are not in your trust store

If your tests of any of the Test URLs failed, you must update your trust store. The easiest way to update your trust store is to upgrade the operating system or browser that you are using.

You will find the Amazon Trust Services CAs in the following operating systems (release dates are in parentheses):

  • Microsoft Windows versions that have January 2005 or later updates installed, Windows Vista, Windows 7, Windows Server 2008, and newer versions
  • Mac OS X 10.4 with Java for Mac OS X 10.4 Release 5, Mac OS X 10.5 and newer versions
  • Red Hat Enterprise Linux 5 (March 2007), Linux 6, and Linux 7 and CentOS 5, CentOS 6, and CentOS 7
  • Ubuntu 8.10
  • Debian 5.0
  • Amazon Linux (all versions)
  • Java 1.4.2_12, Java 5 update 2, and all newer versions, including Java 6, Java 7, and Java 8

All modern browsers trust Amazon’s CAs. You can update the certificate bundle in your browser simply by updating your browser. You can find instructions for updating the following browsers on their respective websites:

If your application is using a custom trust store, you must add the Amazon root CAs to your application’s trust store. The instructions for doing this vary based on the application or platform. Please refer to the documentation for the application or platform you are using.
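
As one hedged illustration of what this can look like for a Python application that uses the requests library, the sketch below appends an Amazon root certificate to a copy of the certifi bundle and points requests at the combined file. The file names are placeholders, and you need to download the root certificate PEM from Amazon Trust Services yourself.

import shutil

import certifi
import requests

# Placeholder file names: the Amazon root PEM downloaded from Amazon Trust
# Services and the combined bundle this script produces.
AMAZON_ROOT_PEM = "AmazonRootCA1.pem"
CUSTOM_BUNDLE = "custom-ca-bundle.pem"

# Start from the bundle that certifi ships and append the Amazon root.
shutil.copyfile(certifi.where(), CUSTOM_BUNDLE)
with open(CUSTOM_BUNDLE, "a") as bundle, open(AMAZON_ROOT_PEM) as amazon_root:
    bundle.write("\n" + amazon_root.read())

# Point requests at the combined bundle instead of its default trust store.
response = requests.get("https://aws.amazon.com/", verify=CUSTOM_BUNDLE)
print(response.status_code)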

AWS SDKs and CLIs

Most AWS SDKs and CLIs are not impacted by the transition to the Amazon Trust Services CA. If you are using a version of the Python AWS SDK or CLI released before October 29, 2013, you must upgrade. The .NET, Java, PHP, Go, JavaScript, and C++ SDKs and CLIs do not bundle any certificates, so their certificates come from the underlying operating system. The Ruby SDK has included at least one of the required CAs since June 10, 2015. Before that date, the Ruby V2 SDK did not bundle certificates.
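
If you are not sure which Python SDK or CLI packages are installed, a quick check such as the following sketch prints the version of each one that is present, so you can compare against the dates above.

import importlib

# Report the installed versions of the Python AWS SDKs and the AWS CLI, if any.
for package in ("boto", "boto3", "botocore", "awscli"):
    try:
        module = importlib.import_module(package)
        print(package, getattr(module, "__version__", "unknown"))
    except ImportError:
        print(package, "not installed")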

Certificate pinning

If you are using a technique called certificate pinning to lock down the CAs you trust on a domain-by-domain basis, you must adjust your pinning to include the Amazon Trust Services CAs. Certificate pinning helps defend you from an attacker using misissued certificates to fool an application into creating a connection to a spoofed host (an illegitimate host masquerading as a legitimate host). The restriction to a specific, pinned certificate is made by checking that the certificate issued is the expected certificate. This is done by checking that the hash of the certificate public key received from the server matches the expected hash stored in the application. If the hashes do not match, the code stops the connection.

AWS recommends against using certificate pinning because it introduces a potential availability risk. If the certificate to which you pin is replaced, your application will fail to connect. If your use case requires pinning, we recommend that you pin to a CA rather than to an individual certificate. If you are pinning to an Amazon Trust Services CA, you should pin to all CAs shown in the table earlier in this post.
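
To make the mechanics concrete, here is a rough sketch of an SPKI pin check in Python. It fetches the certificate a server presents, hashes its public key, and compares the result to an expected value. The host name and pin are placeholders, the check only inspects the leaf certificate, and it assumes the third-party cryptography package; pinning at the CA level, as recommended above, would also require examining the rest of the chain.

import hashlib
import ssl

from cryptography import x509
from cryptography.hazmat.primitives import serialization

# Placeholders: replace with your endpoint and the SPKI hash you pin to.
HOST = "example.amazonaws.com"
EXPECTED_SPKI_SHA256 = "hex-encoded-sha256-of-the-expected-public-key"

def spki_sha256(pem_cert):
    """Return the SHA-256 hash of a certificate's Subject Public Key Information."""
    cert = x509.load_pem_x509_certificate(pem_cert.encode())
    spki = cert.public_key().public_bytes(
        encoding=serialization.Encoding.DER,
        format=serialization.PublicFormat.SubjectPublicKeyInfo,
    )
    return hashlib.sha256(spki).hexdigest()

# Fetch the leaf certificate the server presents and compare its SPKI hash.
pem = ssl.get_server_certificate((HOST, 443))
if spki_sha256(pem) != EXPECTED_SPKI_SHA256:
    raise RuntimeError("SPKI pin mismatch for " + HOST)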

If you have comments about this post, submit them in the “Comments” section below. If you have questions about this post, start a new thread on the ACM forum.

– Jonathan

Federate Database User Authentication Easily with IAM and Amazon Redshift

Post Syndicated from Thiyagarajan Arumugam original https://aws.amazon.com/blogs/big-data/federate-database-user-authentication-easily-with-iam-and-amazon-redshift/

Managing database users through federation allows you to manage authentication and authorization procedures centrally. Amazon Redshift now supports database authentication with IAM, enabling user authentication through enterprise federation. There is no need to manage separate database users and passwords, which further eases database administration. You can now manage users outside of AWS and authenticate them for access to an Amazon Redshift data warehouse. Do this by integrating IAM authentication and a third-party SAML 2.0 identity provider (IdP), such as AD FS, PingFederate, or Okta. In addition, database users can also be automatically created at their first login based on corporate permissions.

In this post, I demonstrate how you can extend the federation to enable single sign-on (SSO) to the Amazon Redshift data warehouse.

SAML and Amazon Redshift

AWS supports Security Assertion Markup Language (SAML) 2.0, which is an open standard for identity federation used by many IdPs. SAML enables federated SSO, which enables your users to sign in to the AWS Management Console. Users can also make programmatic calls to AWS API actions by using assertions from a SAML-compliant IdP. For example, if you use Microsoft Active Directory for corporate directories, you may be familiar with how Active Directory and AD FS work together to enable federation. For more information, see the Enabling Federation to AWS Using Windows Active Directory, AD FS, and SAML 2.0 AWS Security Blog post.

Amazon Redshift now provides the GetClusterCredentials API operation that allows you to generate temporary database user credentials for authentication. You can set up an IAM permissions policy that generates these credentials for connecting to Amazon Redshift. Extending the IAM authentication, you can configure federation of AWS access through a SAML 2.0–compliant IdP. An IAM role can be configured to permit the federated users to call the GetClusterCredentials action and generate temporary credentials to log in to Amazon Redshift databases. You can also set up policies to restrict access to Amazon Redshift clusters, databases, database user names, and user groups.

Amazon Redshift federation workflow

In this post, I demonstrate how you can use a JDBC– or ODBC-based SQL client to log in to the Amazon Redshift cluster using this feature. The SQL clients used with Amazon Redshift JDBC or ODBC drivers automatically manage the process of calling the GetClusterCredentials action, retrieving the database user credentials, and establishing a connection to your Amazon Redshift database. You can also use your database application to programmatically call the GetClusterCredentials action, retrieve database user credentials, and connect to the database. I demonstrate these features using an example company to show how different database users accounts can be managed easily using federation.

The following diagram shows how the SSO process works:

  1. JDBC/ODBC
  2. Authenticate using Corp Username/Password
  3. IdP sends SAML assertion
  4. Call STS to assume role with SAML
  5. STS Returns Temp Credentials
  6. Use Temp Credentials to get Temp cluster credentials
  7. Connect to Amazon Redshift using temp credentials

Walkthrough

Example Corp. is using Active Directory (IdP host: demo.examplecorp.com) to manage federated access for users in its organization. It has AWS account 123456789012 and currently manages an Amazon Redshift cluster with the cluster ID “examplecorp-dw” and the database “analytics” in the us-west-2 region for its Sales and Data Science teams. It wants the following access:

  • Sales users can access the examplecorp-dw cluster using the sales_grp database group
  • Sales users access examplecorp-dw through a JDBC-based SQL client
  • Sales users access examplecorp-dw through an ODBC connection, for their reporting tools
  • Data Science users access the examplecorp-dw cluster using the data_science_grp database group.
  • Partners access the examplecorp-dw cluster and query using the partner_grp database group.
  • Partners are not federated through Active Directory and are provided with separate IAM user credentials (with IAM user name examplecorpsalespartner).
  • Partners can connect to the examplecorp-dw cluster programmatically, using language such as Python.
  • All users are automatically created in Amazon Redshift when they log in for the first time.
  • (Optional) Internal users do not specify database user or group information in their connection string. It is automatically assigned.
  • Data warehouse users can use SSO for the Amazon Redshift data warehouse using the preceding permissions.

Step 1:  Set up IdPs and federation

The Enabling Federation to AWS Using Windows Active Directory post demonstrated how to prepare Active Directory and enable federation to AWS. Using those instructions, you can establish trust between your AWS account and the IdP and enable user access to AWS using SSO.  For more information, see Identity Providers and Federation.

For this walkthrough, assume that this company has already configured SSO to their AWS account: 123456789012 for their Active Directory domain demo.examplecorp.com. The Sales and Data Science teams are not required to specify database user and group information in the connection string. The connection string can be configured by adding SAML Attribute elements to your IdP. Configuring these optional attributes enables internal users to conveniently avoid providing the DbUser and DbGroup parameters when they log in to Amazon Redshift.

The user-name attribute can be set up as follows, with a user ID (for example, nancy) or an email address (for example, [email protected]):

<Attribute Name="https://redshift.amazon.com/SAML/Attributes/DbUser">  
  <AttributeValue>user-name</AttributeValue>
</Attribute>

The AutoCreate attribute can be defined as follows:

<Attribute Name="https://redshift.amazon.com/SAML/Attributes/AutoCreate">
    <AttributeValue>true</AttributeValue>
</Attribute>

The sales_grp database group can be included as follows:

<Attribute Name="https://redshift.amazon.com/SAML/Attributes/DbGroups">
    <AttributeValue>sales_grp</AttributeValue>
</Attribute>

For more information about attribute element configuration, see Configure SAML Assertions for Your IdP.

Step 2: Create IAM roles for access to the Amazon Redshift cluster

The next step is to create IAM policies with permissions to call GetClusterCredentials and provide authorization for Amazon Redshift resources. To grant a SQL client the ability to retrieve the cluster endpoint, region, and port automatically, include the redshift:DescribeClusters action with the Amazon Redshift cluster resource in the IAM role.  For example, users can connect to the Amazon Redshift cluster using a JDBC URL without the need to hardcode the Amazon Redshift endpoint:

Previous:  jdbc:redshift://endpoint:port/database

Current:  jdbc:redshift:iam://clustername:region/dbname

Use IAM to create the following policies. You can also use an existing user or role and assign these policies. For example, if you already created an IAM role for IdP access, you can attach the necessary policies to that role. Here is the policy created for sales users for this example:

Sales_DW_IAM_Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "redshift:DescribeClusters"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:GetClusterCredentials"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw",
                "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:userid": "AIDIODR4TAW7CSEXAMPLE:${redshift:DbUser}@examplecorp.com"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:CreateClusterUser"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:JoinGroup"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecorp-dw/sales_grp"
            ]
        }
    ]
}

The policy uses the following parameter values:

  • Region: us-west-2
  • AWS Account: 123456789012
  • Cluster name: examplecorp-dw
  • Database group: sales_grp
  • IAM role: AIDIODR4TAW7CSEXAMPLE
Each statement in the policy is shown below, followed by a description of what it allows.
{
"Effect":"Allow",
"Action":[
"redshift:DescribeClusters"
],
"Resource":[
"arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw"
]
}

Allow users to retrieve the cluster endpoint, region, and port automatically for the Amazon Redshift cluster examplecorp-dw. This specification uses the resource format arn:aws:redshift:region:account-id:cluster:clustername. For example, the SQL client JDBC URL can be specified in the format jdbc:redshift:iam://clustername:region/dbname.

For more information, see Amazon Resource Names.

{
"Effect":"Allow",
"Action":[
"redshift:GetClusterCredentials"
],
"Resource":[
"arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw",
"arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
],
"Condition":{
"StringEquals":{
"aws:userid":"AIDIODR4TAW7CSEXAMPLE:${redshift:DbUser}@examplecorp.com"
}
}
}

Generates a temporary token to authenticate into the examplecorp-dw cluster. “arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}” restricts the corporate user name to the database user name for that user. This resource is specified using the format: arn:aws:redshift:region:account-id:dbuser:clustername/dbusername.

The Condition block enforces that the AWS user ID should match “AIDIODR4TAW7CSEXAMPLE:${redshift:DbUser}@examplecorp.com”, so that individual users can authenticate only as themselves. The AIDIODR4TAW7CSEXAMPLE role has the Sales_DW_IAM_Policy policy attached.

{
"Effect":"Allow",
"Action":[
"redshift:CreateClusterUser"
],
"Resource":[
"arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
]
}
Automatically creates database users in examplecorp-dw, when they log in for the first time. Subsequent logins reuse the existing database user.
{
"Effect":"Allow",
"Action":[
"redshift:JoinGroup"
],
"Resource":[
"arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecorp-dw/sales_grp"
]
}
Allows sales users to join the sales_grp database group through the resource “arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecorp-dw/sales_grp” that is specified in the format arn:aws:redshift:region:account-id:dbgroup:clustername/dbgroupname.

Similar policies can be created for Data Science users with access to join the data_science_grp group in examplecorp-dw. You can now attach the Sales_DW_IAM_Policy policy to the role that is mapped to the IdP application for SSO. For more information about how to define the claim rules, see Configuring SAML Assertions for the Authentication Response.

Because partners are not authenticated through Active Directory, they are provided with IAM credentials and added to the partner_grp database group. The Partner_DW_IAM_Policy is attached to the IAM users for partners. The following policy allows partners to log in using the IAM user name as the database user name.

Partner_DW_IAM_Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "redshift:DescribeClusters"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:GetClusterCredentials"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw",
                "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
            ],
            "Condition": {
                "StringEquals": {
                    "redshift:DbUser": "${aws:username}"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:CreateClusterUser"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:JoinGroup"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecorp-dw/partner_grp"
            ]
        }
    ]
}

The condition "redshift:DbUser": "${aws:username}" forces an IAM user to use the IAM user name as the database user name.

With the previous steps configured, you can now establish the connection to Amazon Redshift through JDBC- or ODBC-supported clients.

Step 3: Set up database user access

Before you start connecting to Amazon Redshift using the SQL client, set up the database groups for appropriate data access. Log in to your Amazon Redshift database as superuser to create a database group, using CREATE GROUP.

Log in to examplecorp-dw/analytics as superuser and create the following groups:

CREATE GROUP sales_grp;
CREATE GROUP data_science_grp;
CREATE GROUP partner_grp;

Use the GRANT command to define access permissions to database objects (tables/views) for the preceding groups.

Step 4: Connect to Amazon Redshift using the JDBC SQL client

Assume that sales user “nancy” is using the SQL Workbench client and JDBC driver to log in to the Amazon Redshift data warehouse. The following steps help set up the client and establish the connection:

  1. Download the latest Amazon Redshift JDBC driver from the Configure a JDBC Connection page
  2. Build the JDBC URL with the IAM option in the following format:
    jdbc:redshift:iam://examplecorp-dw:us-west-2/sales_db

Because the redshift:DescribeClusters action is assigned to the preceding IAM roles, it automatically resolves the cluster endpoints and the port. Otherwise, you can specify the endpoint and port information in the JDBC URL, as described in Configure a JDBC Connection.

Identify the following JDBC options for providing the IAM credentials (see the “Prepare your environment” section) and configure them in the SQL Workbench Connection Profile:

plugin_name=com.amazon.redshift.plugin.AdfsCredentialsProvider 
idp_host=demo.examplecorp.com (The name of the corporate identity provider host)
idp_port=443  (The port of the corporate identity provider host)
user=examplecorp\nancy (corporate user name)
password=*** (corporate user password)

The SQL workbench configuration looks similar to the following screenshot:

Now, “nancy” can connect to examplecorp-dw by authenticating using the corporate Active Directory. Because the SAML attribute elements are already configured for nancy, she logs in as database user nancy and is assigned to the sales_grp database group. Similarly, other Sales and Data Science users can connect to the examplecorp-dw cluster. A custom Amazon Redshift ODBC driver can also be used to connect using a SQL client. For more information, see Configure an ODBC Connection.

Step 5: Connect to Amazon Redshift using a JDBC SQL client and IAM credentials

This optional step is necessary only when you want to enable users that are not authenticated with Active Directory. Partners are provided with IAM credentials that they can use to connect to the examplecorp-dw Amazon Redshift cluster. These IAM users have the Partner_DW_IAM_Policy attached, which allows them to join the partner_grp database group in Amazon Redshift. The following JDBC URL enables them to connect to the Amazon Redshift cluster:

jdbc:redshift:iam://examplecorp-dw:us-west-2/analytics?AccessKeyID=XXX&SecretAccessKey=YYY&DbUser=examplecorpsalespartner&DbGroup=partner_grp&AutoCreate=true

The AutoCreate option automatically creates a new database user the first time the partner logs in. There are several other options available to conveniently specify the IAM user credentials. For more information, see Options for providing IAM credentials.

Step 6: Connect to Amazon Redshift using an ODBC client for Microsoft Windows

Assume that another sales user “uma” is using an ODBC-based client to log in to the Amazon Redshift data warehouse using Example Corp Active Directory. The following steps help set up the ODBC client and establish the Amazon Redshift connection in a Microsoft Windows operating system connected to your corporate network:

  1. Download and install the latest Amazon Redshift ODBC driver.
  2. Create a system DSN entry.
    1. In the Start menu, locate the driver folder or folders:
      • Amazon Redshift ODBC Driver (32-bit)
      • Amazon Redshift ODBC Driver (64-bit)
      • If you installed both drivers, you have a folder for each driver.
    2. Choose ODBC Administrator, and then type your administrator credentials.
    3. To configure the driver for all users on the computer, choose System DSN. To configure the driver for your user account only, choose User DSN.
    4. Choose Add.
  3. Select the Amazon Redshift ODBC driver, and choose Finish. Configure the following attributes:
    Data Source Name = any friendly name to identify the ODBC connection
    Database = analytics
    User = uma (corporate user name)
    Auth Type = Identity Provider: AD FS
    Password = leave blank (Windows automatically authenticates)
    Cluster ID = examplecorp-dw
    idp_host = demo.examplecorp.com (The name of the corporate IdP host)

This configuration looks like the following:

  4. Choose OK to save the ODBC connection.
  5. Verify that uma is set up with the SAML attributes, as described in the “Set up IdPs and federation” section.

The user uma can now use this ODBC connection to establish the connection to the Amazon Redshift cluster using any ODBC-based tools or reporting tools such as Tableau. Internally, uma authenticates using the IAM role that has the Sales_DW_IAM_Policy attached and is assigned to the sales_grp database group.

Step 7: Connect to Amazon Redshift using Python and IAM credentials

To enable partners to connect to the examplecorp-dw cluster programmatically, use Python on a computer such as an Amazon EC2 instance. Reuse the IAM users that are attached to the Partner_DW_IAM_Policy policy defined in Step 2.

The following steps show this set up on an EC2 instance:

  1. Launch a new EC2 instance with an IAM role that has the Partner_DW_IAM_Policy attached, as described in Using an IAM Role to Grant Permissions to Applications Running on Amazon EC2 Instances. Alternatively, you can attach an existing IAM role to an EC2 instance.
  2. This example uses Python PostgreSQL Driver (PyGreSQL) to connect to your Amazon Redshift clusters. To install PyGreSQL on Amazon Linux, use the following command as the ec2-user:
    sudo easy_install pip
    sudo yum install postgresql postgresql-devel gcc python-devel
    sudo pip install PyGreSQL

  3. The following code snippet demonstrates programmatic access to Amazon Redshift for partner users:
    #!/usr/bin/env python
    """
    Usage:
    python redshift-unload-copy.py <config file> <region>
    
    * Copyright 2014, Amazon.com, Inc. or its affiliates. All Rights Reserved.
    *
    * Licensed under the Amazon Software License (the "License").
    * You may not use this file except in compliance with the License.
    * A copy of the License is located at
    *
    * http://aws.amazon.com/asl/
    *
    * or in the "license" file accompanying this file. This file is distributed
    * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
    * express or implied. See the License for the specific language governing
    * permissions and limitations under the License.
    """
    
    import sys
    import pg
    import boto3
    
    REGION = 'us-west-2'
    CLUSTER_IDENTIFIER = 'examplecorp-dw'
    DB_NAME = 'sales_db'
    DB_USER = 'examplecorpsalespartner'
    
    options = """keepalives=1 keepalives_idle=200 keepalives_interval=200
                 keepalives_count=6"""
    
    set_timeout_stmt = "set statement_timeout = 1200000"
    
    def conn_to_rs(host, port, db, usr, pwd, opt=options, timeout=set_timeout_stmt):
        rs_conn_string = """host=%s port=%s dbname=%s user=%s password=%s
                             %s""" % (host, port, db, usr, pwd, opt)
        print "Connecting to %s:%s:%s as %s" % (host, port, db, usr)
        rs_conn = pg.connect(dbname=rs_conn_string)
        rs_conn.query(timeout)
        return rs_conn
    
    def main():
        # describe the cluster and fetch the IAM temporary credentials
        global redshift_client
        redshift_client = boto3.client('redshift', region_name=REGION)
        response_cluster_details = redshift_client.describe_clusters(ClusterIdentifier=CLUSTER_IDENTIFIER)
        response_credentials = redshift_client.get_cluster_credentials(DbUser=DB_USER,DbName=DB_NAME,ClusterIdentifier=CLUSTER_IDENTIFIER,DurationSeconds=3600)
        rs_host = response_cluster_details['Clusters'][0]['Endpoint']['Address']
        rs_port = response_cluster_details['Clusters'][0]['Endpoint']['Port']
        rs_db = DB_NAME
        rs_iam_user = response_credentials['DbUser']
        rs_iam_pwd = response_credentials['DbPassword']
        # connect to the Amazon Redshift cluster
        conn = conn_to_rs(rs_host, rs_port, rs_db, rs_iam_user,rs_iam_pwd)
        # execute a query
        result = conn.query("SELECT sysdate as dt")
        # fetch results from the query
        for dt_val in result.getresult() :
            print dt_val
        # close the Amazon Redshift connection
        conn.close()
    
    if __name__ == "__main__":
        main()

You can save this Python program in a file (redshiftscript.py) and execute it at the command line as ec2-user:

python redshiftscript.py

Now partners can connect to the Amazon Redshift cluster using the Python script, and authentication is federated through the IAM user.

Summary

In this post, I demonstrated how to use federated access using Active Directory and IAM roles to enable single sign-on to an Amazon Redshift cluster. I also showed how partners outside an organization can be managed easily using IAM credentials.  Using the GetClusterCredentials API action, now supported by Amazon Redshift, lets you manage a large number of database users and have them use corporate credentials to log in. You don’t have to maintain separate database user accounts.

Although this post demonstrated the integration of IAM with AD FS and Active Directory, you can replicate this solution with your choice of SAML 2.0 third-party identity providers (IdPs), such as PingFederate or Okta. For the different supported federation options, see Configure SAML Assertions for Your IdP.

If you have questions or suggestions, please comment below.


Additional Reading

Learn how to establish federated access to your AWS resources by using Active Directory user attributes.


About the Author

Thiyagarajan Arumugam is a Big Data Solutions Architect at Amazon Web Services and designs customer architectures to process data at scale. Prior to AWS, he built data warehouse solutions at Amazon.com. In his free time, he enjoys all outdoor sports and practices the Indian classical drum mridangam.

 

Abandon Proactive Copyright Filters, Huge Coalition Tells EU Heavyweights

Post Syndicated from Andy original https://torrentfreak.com/abandon-proactive-copyright-filters-huge-coalition-tells-eu-heavyweights-171017/

Last September, EU Commission President Jean-Claude Juncker announced plans to modernize copyright law in Europe.

The proposals (pdf) are part of the Digital Single Market reforms, which have been under development for the past several years.

One of the proposals is causing significant concern. Article 13 would require some online service providers to become ‘Internet police’, proactively detecting and filtering allegedly infringing copyright works, uploaded to their platforms by users.

Currently, users are generally able to share whatever they like but should a copyright holder take exception to their upload, mechanisms are available for that content to be taken down. It’s envisioned that proactive filtering, whereby user uploads are routinely scanned and compared to a database of existing protected content, will prevent content becoming available in the first place.

These proposals are of great concern to digital rights groups, who believe that such filters will not only undermine users’ rights but will also place unfair burdens on Internet platforms, many of which will struggle to fund such a program. Yesterday, in the latest wave of opposition to Article 13, a huge coalition of international rights groups came together to underline their concerns.

Headed up by Civil Liberties Union for Europe (Liberties) and European Digital Rights (EDRi), the coalition is formed of dozens of influential groups, including Electronic Frontier Foundation (EFF), Human Rights Watch, Reporters without Borders, and Open Rights Group (ORG), to name just a few.

In an open letter to European Commission President Jean-Claude Juncker, President of the European Parliament Antonio Tajani, President of the European Council Donald Tusk and a string of others, the groups warn that the proposals undermine the trust established between EU member states.

“Fundamental rights, justice and the rule of law are intrinsically linked and constitute core values on which the EU is founded,” the letter begins.

“Any attempt to disregard these values undermines the mutual trust between member states required for the EU to function. Any such attempt would also undermine the commitments made by the European Union and national governments to their citizens.”

Those citizens, the letter warns, would have their basic rights undermined, should the new proposals be written into EU law.

“Article 13 of the proposal on Copyright in the Digital Single Market include obligations on internet companies that would be impossible to respect without the imposition of excessive restrictions on citizens’ fundamental rights,” it notes.

A major concern is that by placing new obligations on Internet service providers that allow users to upload content – think YouTube, Facebook, Twitter and Instagram – they will be forced to err on the side of caution. Should there be any concern whatsoever that content might be infringing, fair use considerations and exceptions will be abandoned in favor of staying on the right side of the law.

“Article 13 appears to provoke such legal uncertainty that online services will have no other option than to monitor, filter and block EU citizens’ communications if they are to have any chance of staying in business,” the letter warns.

But while the potential problems for service providers and users are numerous, the groups warn that Article 13 could also be illegal since it contradicts case law of the Court of Justice.

According to the E-Commerce Directive, platforms are already required to remove infringing content, once they have been advised it exists. The new proposal, should it go ahead, would force the monitoring of uploads, something which goes against the ‘no general obligation to monitor‘ rules present in the Directive.

“The requirement to install a system for filtering electronic communications has twice been rejected by the Court of Justice, in the cases Scarlet Extended (C70/10) and Netlog/Sabam (C 360/10),” the rights groups warn.

“Therefore, a legislative provision that requires internet companies to install a filtering system would almost certainly be rejected by the Court of Justice because it would contravene the requirement that a fair balance be struck between the right to intellectual property on the one hand, and the freedom to conduct business and the right to freedom of expression, such as to receive or impart information, on the other.”

Specifically, the groups note that the proactive filtering of content would violate freedom of expression set out in Article 11 of the Charter of Fundamental Rights. That being the case, the groups expect national courts to disapply it and the rule to be annulled by the Court of Justice.

The latest protests against Article 13 come in the wake of large-scale objections earlier in the year, voicing similar concerns. However, despite the groups’ fears, they have powerful adversaries, each determined to stop the flood of copyrighted content currently being uploaded to the Internet.

Front and center in support of Article 13 is the music industry and its current hot-topic, the so-called Value Gap (1,2,3). The industry feels that platforms like YouTube are able to avoid paying expensive licensing fees (for music in particular) by exploiting the safe harbor protections of the DMCA and similar legislation.

They believe that proactively filtering uploads would significantly help to diminish this problem, which may very well be the case. But at what cost to the general public and the platforms they rely upon? Citizens and scholars feel that freedoms will be affected and it’s likely the outcry will continue.

The ball is now with the EU, whose members will soon have to make what could be the most important decision in recent copyright history. The rights groups, who are urging for Article 13 to be deleted, are clear where they stand.

The full letter is available here (pdf)

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

EU Piracy Report Suppression Raises Questions Over Transparency

Post Syndicated from Andy original https://torrentfreak.com/eu-piracy-report-suppression-raises-questions-transparency-170922/

Over the years, copyright holders have made hundreds of statements against piracy, mainly that it risks bringing industries to their knees through widespread and uncontrolled downloading from the Internet.

But while TV shows like Game of Thrones have been downloaded millions of times, the big question (one could argue the only really important question) is whether this activity actually affects sales. After all, if piracy has a massive negative effect on industry, something needs to be done. If it does not, why all the panic?

Quite clearly, the EU Commission wanted to find out the answer to this potential multi-billion dollar question when it made the decision to invest a staggering 360,000 euros in a dedicated study back in January 2014.

With a final title of ‘Estimating displacement rates of copyrighted content in the EU’, the completed study is an intimidating 307 pages deep. Shockingly, until this week, few people even knew it existed because, for reasons unknown, the EU Commission decided not to release it.

However, thanks to the sheer persistence of Member of the European Parliament Julia Reda, the public now has a copy and it contains quite a few interesting conclusions. But first, some background.

The study uses data from 2014 and covers four broad types of content: music, audio-visual material, books and videogames. Unlike other reports, the study also considered live attendances of music and cinema visits in the key regions of Germany, UK, Spain, France, Poland and Sweden.

On average, 51% of adults and 72% of minors in the EU were found to have illegally downloaded or streamed any form of creative content, with Poland and Spain coming out as the worst offenders. However, here’s the kicker.

“In general, the results do not show robust statistical evidence of displacement of sales by online copyright infringements,” the study notes.

“That does not necessarily mean that piracy has no effect but only that the statistical analysis does not prove with sufficient reliability that there is an effect.”

For a study commissioned by the EU with huge sums of public money, this is a potentially damaging conclusion, not least for the countless industry bodies that lobby day in, day out, for tougher copyright law based on the “fact” that piracy is damaging to sales.

That being said, the study did find that certain sectors can be affected by piracy, notably recent top movies.

“The results show a displacement rate of 40 per cent which means that for every ten recent top films watched illegally, four fewer films are consumed legally,” the study notes.

“People do not watch many recent top films a second time but if it happens, displacement is lower: two legal consumptions are displaced by every ten illegal second views. This suggests that the displacement rate for older films is lower than the 40 per cent for recent top films. All in all, the estimated loss for recent top films is 5 per cent of current sales volumes.”

But while there is some negative effect on the movie industry, others can benefit. The study found that piracy had a slightly positive effect on the videogames industry, suggesting that those who play pirate games eventually become buyers of official content.

On top of displacement rates, the study also looked at the public’s willingness to pay for content, to assess whether price influences pirate consumption. Interestingly, the industry that had the most displaced sales – the movie industry – had the greatest number of people unhappy with its pricing model.

“Overall, the analysis indicates that for films and TV-series current prices are higher than 80 per cent of the illegal downloaders and streamers are willing to pay,” the study notes.

For other industries, where sales were not found to have been displaced or were positively affected by piracy, consumer satisfaction with pricing was greatest.

“For books, music and games, prices are at a level broadly corresponding to the willingness to pay of illegal downloaders and streamers. This suggests that a decrease in the price level would not change piracy rates for books, music and games but that prices can have an effect on displacement rates for films and TV-series,” the study concludes.

So, it appears that products that are priced fairly do not suffer significant displacement from piracy. Those that are priced too high, on the other hand, can expect to lose some sales.

Now that it’s been released, the findings of the study should help to paint a more comprehensive picture of the infringement climate in the EU, while laying to rest some of the wild claims of the copyright lobby. That being said, it shouldn’t have taken the toils of Julia Reda to bring them to light.

“This study may have remained buried in a drawer for several more years to come if it weren’t for an access to documents request I filed under the European Union’s Freedom of Information law on July 27, 2017, after having become aware of the public tender for this study dating back to 2013,” Reda explains.

“I would like to invite the Commission to become a provider of more solid and timely evidence to the copyright debate. Such data that is valuable both financially and in terms of its applicability should be available to everyone when it is financed by the European Union – it should not be gathering dust on a shelf until someone actively requests it.”

The full study can be downloaded here (pdf)

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

No, Google Drive is Definitely Not The New Pirate Bay

Post Syndicated from Andy original https://torrentfreak.com/no-google-drive-is-definitely-not-the-new-pirate-bay-170910/

Now approaching two decades old, the world of true mainstream file-sharing is less of a mystery to the general public than it has ever been.

Most people now understand the concept of shifting files from one place to another, and a significant majority will be aware of the opportunities to do so with infringing content.

Unsurprisingly, this is a major thorn in the side of rightsholders all over the world, who have been scrambling since the turn of the century in a considerable effort to stem the tide. The results of their work have varied, with some sectors hit harder than others.

One area that has taken a bit of a battering recently involves the dominant peer-to-peer platforms reliant on underlying BitTorrent transfers. Several large-scale sites have shut down recently, not least KickassTorrents, Torrentz, and ExtraTorrent, raising questions about what bad news may arrive next for inhabitants of Torrent Land.

Of course, like any other Internet-related activity, sharing has continued to evolve over the years, with streaming and cloud-hosting now a major hit with consumers. In the main, sites which skirt the borders of legality have been the major hosting and streaming players over the years, but more recently it’s become clear that even the most legitimate companies can become unwittingly involved in the piracy scene.

As reported here on TF back in 2014 and again several times this year (1,2,3), cloud-hosting services operated by Google, including Google Drive, are being used to store and distribute pirate content.

That news was echoed again this week, with a report on Gadgets360 reiterating that Google Drive is still being used for movie piracy. What followed was a string of follow-up reports, some of which declared Google’s service to be ‘The New Pirate Bay.’

No. Just no.

While it’s always tempting for publications to squeeze a reference to The Pirate Bay into a piracy article due to the site’s popularity, it’s particularly out of place in this comparison. In no way, shape, or form can a centralized store of data like Google Drive ever replace the underlying technology of sites like The Pirate Bay.

While the casual pirate might love the idea of streaming a movie with a couple of clicks to a browser of his or her choice, the weaknesses of the cloud system cannot be overstated. To begin with, anything hosted by Google is vulnerable to immediate takedown on demand, usually within a matter of hours.

“Google Drive has a variety of piracy counter-measures in place,” a spokesperson told Mashable this week, “and we are continuously working to improve our protections to prevent piracy across all of our products.”

When will we ever hear anything like that from The Pirate Bay? Answer: When hell freezes over. But it’s not just compliance with takedown requests that make Google Drive-hosted files vulnerable.

At the point Google Drive responds to a takedown request, it takes down the actual file. On the other hand, even if Pirate Bay responded to notices (which it doesn’t), it would be unable to do anything about the sharing going on underneath. Removing a torrent file or magnet link from TPB does nothing to negatively affect the decentralized swarm of people sharing files among themselves. Those files stay intact and sharing continues, no matter what happens to the links above.

Importantly, people sharing using BitTorrent do so without any need for central servers – the whole process is decentralized as long as a user can lay his or her hands on a torrent file or magnet link. Those using Google Drive, however, rely on a totally centralized system, where not only is Google king, but it can and will stop the entire party after receiving a few lines of text from a rightsholder.

There is a very good reason why sites like The Pirate Bay have been around for close to 15 years while platforms such as Megaupload, Hotfile, and Rapidshare have all met their makers. File-hosting platforms are expensive-to-run warehouses full of files, each of which brings direct liability for their hosts once they’re made aware that those files are infringing. These days the choice is clear – take the files down or get brought down, it’s as simple as that.

The Pirate Bay, on the other hand, is nothing more than a treasure map (albeit a valuable one) that points the way to content spread all around the globe in the most decentralized way possible. There are no files to delete, no content to disappear. Comparing a vulnerable Google Drive to this kind of robust system couldn’t be further from the mark.

That being said, this is the way things are going. The cloud, it seems, is here to stay in all its forms. Everyone has access to it and uploading content is easier – much easier – than uploading it to a BitTorrent network. A Google Drive upload is simplicity itself for anyone with a mouse and a file; the same cannot be said about The Pirate Bay.

For this reason alone, platforms like Google Drive and the many dozens of others offering a similar service will continue to become havens for pirated content, until the next big round of legislative change. At the moment, each piece of content has to be removed individually but in the future, it’s possible that pre-emptive filters will kill uploads of pirated content before they see the light of day.

When this comes to pass, millions of people will understand why Google Drive, with its bots checking every file upload for alleged infringement, is not The Pirate Bay. At this point, if people have left it too long, it might be too late to reinvigorate BitTorrent networks to their former glory.

People will try to rebuild them, of course, but realizing why they shouldn’t have been left behind at all is probably the best protection.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

From Data Lake to Data Warehouse: Enhancing Customer 360 with Amazon Redshift Spectrum

Post Syndicated from Dylan Tong original https://aws.amazon.com/blogs/big-data/from-data-lake-to-data-warehouse-enhancing-customer-360-with-amazon-redshift-spectrum/

Achieving a 360° view of your customer has become increasingly challenging as companies embrace omni-channel strategies, engaging customers across websites, mobile, call centers, social media, physical sites, and beyond. The promise of a web where online and physical worlds blend makes understanding your customers more challenging, but also more important. Businesses that are successful in this medium have a significant competitive advantage.

The big data challenge requires the management of data at high velocity and volume. Many customers have identified Amazon S3 as a great data lake solution that removes the complexities of managing a highly durable, fault tolerant data lake infrastructure at scale and economically.

AWS data services substantially lessen the heavy lifting of adopting technologies, allowing you to spend more time on what matters most—gaining a better understanding of customers to elevate your business. In this post, I show how a recent Amazon Redshift innovation, Redshift Spectrum, can enhance a customer 360 initiative.

Customer 360 solution

A successful customer 360 view benefits from using a variety of technologies to deliver different forms of insights. These could range from real-time analysis of streaming data from wearable devices and mobile interactions to historical analysis that requires interactive, on-demand queries on billions of transactions. In some cases, insights can only be inferred through AI via deep learning. Finally, the value of your customer data and insights can’t be fully realized until they are operationalized at scale—readily accessible by fleets of applications. Companies are leveraging AWS for the breadth of services that cover these domains to drive their data strategy.

A number of AWS customers stream data from various sources into an S3 data lake through Amazon Kinesis. They use Kinesis and technologies in the Hadoop ecosystem, like Spark running on Amazon EMR, to enrich this data. High-value data is loaded into an Amazon Redshift data warehouse, which allows users to analyze and interact with data through a choice of client tools. Redshift Spectrum expands on this analytics platform by enabling Amazon Redshift to blend and analyze data beyond the data warehouse and across a data lake.

The following diagram illustrates the workflow for such a solution.

This solution delivers value by:

  • Reducing complexity and time to value for deeper insights. For instance, an existing data model in Amazon Redshift may provide insights across dimensions such as customer, geography, time, and product on metrics from sales and financial systems. Down the road, you may gain access to streaming data sources like customer-care call logs and website activity that you want to blend in with the sales data on the same dimensions to understand how web and call center experiences may be correlated with sales performance. Redshift Spectrum can join these dimensions in Amazon Redshift with data in S3 to allow you to quickly gain new insights, and avoid the slow and more expensive alternative of fully integrating these sources with your data warehouse.
  • Providing an additional avenue for optimizing costs and performance. In cases like call logs and clickstream data where volumes could be many TBs to PBs, storing the data exclusively in S3 yields significant cost savings. Interactive analysis on massive datasets may now be economically viable in cases where data was previously analyzed periodically through static reports generated by inexpensive batch processes. In some cases, you can improve the user experience while simultaneously lowering costs. Spectrum is powered by a large-scale infrastructure external to your Amazon Redshift cluster, and excels at scanning and aggregating large volumes of data. For instance, your analysts may be performing data discovery on customer interactions across millions of consumers over years of data across various channels. On this large dataset, certain queries could be slow if you didn’t have a large Amazon Redshift cluster. Alternatively, you could use Redshift Spectrum to achieve a better user experience with a smaller cluster.

Proof of concept walkthrough

To make evaluation easier for you, I’ve conducted a Redshift Spectrum proof-of-concept (PoC) for the customer 360 use case. For those who want to replicate the PoC, the instructions, AWS CloudFormation templates, and public data sets are available in the GitHub repository.

The remainder of this post is a journey through the project, observing best practices in action, and learning how you can achieve business value. The walkthrough involves:

  • An analysis of performance data from the PoC environment involving queries that demonstrate blending and analysis of data across Amazon Redshift and S3. Observe that great results are achievable at scale.
  • Guidance by example on query tuning, design, and data preparation to illustrate the optimization process. This includes tuning a query that combines clickstream data in S3 with customer and time dimensions in Amazon Redshift, and aggregates ~1.9 B out of 3.7 B+ records in under 10 seconds with a small cluster!
  • Guidance and measurements to help you decide between two options: accessing and analyzing data exclusively in Amazon Redshift, or using Redshift Spectrum to access data left in S3.

Stream ingestion and enrichment

The focus of this post isn’t stream ingestion and enrichment on Kinesis and EMR, but be mindful of performance best practices on S3 to ensure good streaming and query performance:

  • Use random object keys: The data files provided for this project are prefixed with SHA-256 hashes to prevent hot partitions. This is important to ensure optimal request rates to support PUT requests from the incoming stream, as well as the large number of parallel GET requests that certain queries from large Amazon Redshift clusters can send.
  • Micro-batch your data stream: S3 isn’t optimized for small random write workloads. Your datasets should be micro-batched into large files. For instance, the “parquet-1” dataset provided batches >7 million records per file. The optimal file size for Redshift Spectrum is usually in the 100 MB to 1 GB range.

If you have an edge case that may pose scalability challenges, AWS would love to hear about it. For further guidance, talk to your solutions architect.

Environment

The project consists of the following environment:

  • Amazon Redshift cluster: 4 X dc1.large
  • Data:
    • Time and customer dimension tables are stored on all Amazon Redshift nodes (ALL distribution style; a sample DDL sketch follows this list):
      • The data originates from the DWDATE and CUSTOMER tables in the Star Schema Benchmark
      • The customer table contains attributes for 3 million customers.
      • The time data is at the day-level granularity, and spans 7 years, from the start of 1992 to the end of 1998.
    • The clickstream data is stored in an S3 bucket, and serves as a fact table.
      • Various copies of this dataset in CSV and Parquet format have been provided, for reasons to be discussed later.
      • The data is a modified version of the uservisits dataset from AMPLab’s Big Data Benchmark, which was generated by Intel’s Hadoop benchmark tools.
      • Changes were minimal, so that existing test harnesses for this test can be adapted:
        • Increased the 751,754,869-row dataset 5X to 3,758,774,345 rows.
        • Added surrogate keys to support joins with customer and time dimensions. These keys were distributed evenly across the entire dataset to represent user visits from six customers over seven years.
        • Values for the visitDate column were replaced to align with the 7-year timeframe, and the added time surrogate key.
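For reference, here is a minimal sketch of what a dimension table pinned to every node looks like. The column list is abridged from the Star Schema Benchmark CUSTOMER table and the sort key is illustrative; the authoritative DDL ships with the PoC CloudFormation templates:

CREATE TABLE customer (
    c_custkey    INT4 NOT NULL,  -- surrogate key joined from the clickstream data
    c_name       VARCHAR(25),
    c_city       VARCHAR(10),
    c_nation     VARCHAR(15),
    c_region     VARCHAR(12),
    c_mktsegment VARCHAR(10)     -- market segment returned by the example queries
)
DISTSTYLE ALL                    -- replicate the table on every node
SORTKEY (c_custkey);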

Queries across the data lake and data warehouse 

Imagine a scenario where a business analyst plans to analyze clickstream metrics like ad revenue over time and by customer, market segment and more. The example below is a query that achieves this effect: 
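The query appears as an image in the original post. For orientation, the sketch below has the same shape; the dimension table names (customer, dwdate) follow the Star Schema Benchmark tables listed in the environment section, and the date filter is a placeholder standing in for the last three months of the 7-year range:

SELECT c.c_name, c.c_mktsegment, t.prettyMonthYear, SUM(uv.adRevenue)
FROM clickstream.uservisits_csv10 AS uv                  -- external (S3) clickstream table
JOIN customer AS c ON c.c_custkey = uv.custKey           -- customer dimension in Amazon Redshift
JOIN dwdate AS t ON uv.yearMonthKey = t.d_yearmonthnum   -- time dimension in Amazon Redshift
WHERE c.c_custkey <= 3                                   -- three of the six customers
  AND t.d_yearmonthnum >= 199810                         -- placeholder: last three months of 1998
GROUP BY c.c_name, c.c_mktsegment, t.prettyMonthYear, uv.yearMonthKey
ORDER BY c.c_name, c.c_mktsegment, uv.yearMonthKey ASC;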

The part of the query that scans the clickstream data in S3 is joined with the time and customer dimension tables that live in Amazon Redshift. The query returns the total ad revenue for three customers over the last three months, along with information on their respective market segments.

Unfortunately, this query takes around three minutes to run and doesn’t deliver the interactive experience that you want. However, there are a number of performance optimizations that you can implement to achieve the desired performance.

Performance analysis

Two key utilities provide visibility into Redshift Spectrum:

  • EXPLAIN
    Provides the query execution plan, which includes info around what processing is pushed down to Redshift Spectrum. Steps in the plan that include the prefix S3 are executed on Redshift Spectrum. For instance, the plan for the previous query has the step “S3 Seq Scan clickstream.uservisits_csv10”, indicating that Redshift Spectrum performs a scan on S3 as part of the query execution.
  • SVL_S3QUERY_SUMMARY
    Statistics for Redshift Spectrum queries are stored in this table. While the execution plan presents cost estimates, this table stores actual statistics for past query runs.

You can get the statistics of your last query by inspecting the SVL_S3QUERY_SUMMARY table with the condition (query = pg_last_query_id()). Inspecting the previous query reveals that the entire dataset of nearly 3.8 billion rows was scanned to retrieve less than 66.3 million rows. Improving scan selectivity in your query could yield substantial performance improvements.
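As a concrete illustration, the two utilities can be used as follows. This is a sketch; the column names are those documented for SVL_S3QUERY_SUMMARY, so verify them against your cluster version:

-- 1) Show the execution plan; steps prefixed with "S3" run on Redshift Spectrum
EXPLAIN
SELECT COUNT(*) FROM clickstream.uservisits_csv10;

-- 2) After running the query itself, inspect the scan statistics it produced
SELECT s3_scanned_rows,         -- rows scanned in S3
       s3query_returned_rows,   -- rows returned to the Amazon Redshift cluster
       s3_scanned_bytes,        -- bytes scanned in S3
       files,                   -- number of S3 files processed
       avg_request_parallelism  -- average parallelism of the S3 requests
FROM svl_s3query_summary
WHERE query = pg_last_query_id();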

Partitioning

Partitioning is a key means of improving scan efficiency. In your environment, the data and tables have already been organized and configured to support partitions. For more information, see the PoC project setup instructions. The clickstream table was defined as:

CREATE EXTERNAL TABLE clickstream.uservisits_csv10
…
PARTITIONED BY(customer int4, visitYearMonth int4)
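Partitions for an external table are registered explicitly, one S3 prefix per partition value. In the PoC environment this step has already been done for you; the statement below is only a sketch with a hypothetical bucket and prefix:

ALTER TABLE clickstream.uservisits_csv10
ADD PARTITION (customer=1, visitYearMonth=199201)
LOCATION 's3://your-poc-bucket/uservisits_csv10/customer=1/visitYearMonth=199201/';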

The entire 3.8 billion-row dataset is organized as a collection of large files where each file contains data exclusive to a particular customer and month in a year. This allows you to partition your data into logical subsets by customer and year/month. With partitions, the query engine can target a subset of files:

  • Only for specific customers
  • Only data for specific months
  • A combination of specific customers and year/months

You can use partitions in your queries. Instead of joining your customer data on the surrogate customer key (that is, c.c_custkey = uv.custKey), the partition key “customer” should be used instead:

SELECT c.c_name, c.c_mktsegment, t.prettyMonthYear, SUM(uv.adRevenue)
…
ON c.c_custkey = uv.customer
…
ORDER BY c.c_name, c.c_mktsegment, uv.yearMonthKey  ASC

This query should run approximately twice as fast as the previous query. If you look at the statistics for this query in SVL_S3QUERY_SUMMARY, you see that only half the dataset was scanned. This is expected because your query is on three out of six customers on an evenly distributed dataset. However, the scan is still inefficient, and you can benefit from using your year/month partition key as well:

SELECT c.c_name, c.c_mktsegment, t.prettyMonthYear, SUM(uv.adRevenue)
…
ON c.c_custkey = uv.customer
…
ON uv.visitYearMonth = t.d_yearmonthnum
…
ORDER BY c.c_name, c.c_mktsegment, uv.visitYearMonth ASC

All joins between the tables are now using partitions. Upon reviewing the statistics for this query, you should observe that Redshift Spectrum scans and returns the exact number of rows, 66,270,117. If you run this query a few times, you should see execution time in the range of 8 seconds, which is a 22.5X improvement on your original query!

Predicate pushdown and storage optimizations 

Previously, I mentioned that Redshift Spectrum performs processing through large-scale infrastructure external to your Amazon Redshift cluster. It is optimized for performing large scans and aggregations on S3. In fact, Redshift Spectrum may even outperform a medium-sized Amazon Redshift cluster on these types of workloads with the proper optimizations. There are two important variables to consider for optimizing large scans and aggregations:

  • File size and count. As a general rule, use files 100 MB-1 GB in size, as Redshift Spectrum and S3 are optimized for reading this object size. However, the number of files a query operates on is directly correlated with the parallelism that query can achieve. There is an inverse relationship between file size and count: the bigger the files, the fewer files there are for the same dataset. Consequently, there is a trade-off between optimizing for object read performance and the amount of parallelism achievable on a particular query. Large files are best for large scans, as such a query likely operates on a sufficiently large number of files. For queries that are more selective and that touch fewer files, you may find that smaller files allow for more parallelism.
  • Data format. Redshift Spectrum supports various data formats. Columnar formats like Parquet can sometimes lead to substantial performance benefits by providing compression and more efficient I/O for certain workloads. Generally, formats like Parquet should be used for query workloads involving large scans and high attribute selectivity. Again, there are trade-offs, as formats like Parquet require more compute power to process than plaintext. For queries on smaller subsets of data, the I/O efficiency benefit of Parquet is diminished. At some point, Parquet may perform the same as, or slower than, plaintext. Latency, compression rates, and the trade-off between user experience and cost should drive your decision. (A sample Parquet table definition follows this list.)
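As an illustration of the format choice, a Parquet-backed external table might be declared as shown below. This is a sketch only: the table name echoes the “parquet-1” dataset mentioned earlier, while the bucket, prefix, and abridged column list are placeholders; the full DDL is part of the PoC repository:

CREATE EXTERNAL TABLE clickstream.uservisits_parquet1 (
    custKey      INT4,    -- surrogate key joining to the customer dimension
    yearMonthKey INT4,    -- surrogate key joining to the time dimension
    adRevenue    FLOAT8   -- metric aggregated in the example queries
)
PARTITIONED BY (customer INT4, visitYearMonth INT4)
STORED AS PARQUET
LOCATION 's3://your-poc-bucket/uservisits_parquet1/';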

To help illustrate how Redshift Spectrum performs on these large aggregation workloads, run a basic query that aggregates the entire ~3.7 billion record dataset on Redshift Spectrum, and compare that with running the query exclusively on Amazon Redshift:

SELECT uv.custKey, COUNT(uv.custKey)
FROM <your clickstream table> as uv
GROUP BY uv.custKey
ORDER BY uv.custKey ASC

For the Amazon Redshift test case, the clickstream data is loaded and distributed evenly across all nodes (EVEN distribution style), with the optimal column compression encodings prescribed by Amazon Redshift’s ANALYZE COMPRESSION command.

The Redshift Spectrum test case uses a Parquet data format, with each file containing all the data for a particular customer in a month. This results in files mostly in the range of 220-280 MB, which is, in effect, the largest file size for this partitioning scheme. If you run tests with the other datasets provided, you will see that this data format and size is optimal and outperforms the others by ~60X.

Performance differences will vary depending on the scenario. The important takeaway is to understand the testing strategy and the workload characteristics where Redshift Spectrum is likely to yield performance benefits. 

The following chart compares the query execution time for the two scenarios. The results indicate that you would have to pay for 12 X DC1.Large nodes to get performance comparable to using a small Amazon Redshift cluster that leverages Redshift Spectrum. 

Chart showing simple aggregation on ~3.7 billion records

So you’ve validated that Spectrum excels at performing large aggregations. Could you benefit by pushing more work down to Redshift Spectrum in your original query? It turns out that you can, by making the following modification:

The clickstream data is stored at a day-level granularity for each customer, while your query rolls up the data to the month level per customer. In the earlier query that uses the year/month partition key, you optimized the query so that it only scans and retrieves the data required, but the day-level data is still sent back to your Amazon Redshift cluster for joining and aggregation. The query shown here pushes the aggregation work down to Redshift Spectrum, as indicated by the query plan:
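The modified query is shown as an image in the original post. A sketch of its shape, with the aggregation moved into a subquery over the external table so that it runs in Redshift Spectrum (table names and filter values are placeholders consistent with the rest of this walkthrough), looks roughly like this:

SELECT c.c_name, c.c_mktsegment, t.prettyMonthYear, uv.totalRevenue
FROM (
        -- the month-level roll-up happens here, inside Redshift Spectrum
        SELECT customer, visitYearMonth, SUM(adRevenue) AS totalRevenue
        FROM clickstream.uservisits_parquet1
        WHERE customer <= 3 AND visitYearMonth >= 199810   -- placeholder filters
        GROUP BY customer, visitYearMonth
     ) AS uv
JOIN customer AS c ON c.c_custkey = uv.customer            -- join on the partition key
JOIN (SELECT * FROM dwdate WHERE d_yearmonthnum >= 199810) AS t
     ON uv.visitYearMonth = t.d_yearmonthnum
ORDER BY c.c_name, c.c_mktsegment, uv.visitYearMonth ASC;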

In this query, Redshift Spectrum aggregates the clickstream data to the month level before it is returned to the Amazon Redshift cluster and joined with the dimension tables. This query should complete in about 4 seconds, which is roughly twice as fast as only using the partition key. The speed increase is evident upon reviewing the SVL_S3QUERY_SUMMARY table:

  • Bytes scanned is 21.6X less because of the Parquet data format.
  • Only 90 records are returned to the Amazon Redshift cluster as a result of the pushdown, instead of ~66.2 million, leading to substantially less join overhead and about 530 MB less data sent back to your cluster.
  • No adverse change in average parallelism.

Assessing the value of Amazon Redshift vs. Redshift Spectrum

At this point, you might be asking yourself, why would I ever not use Redshift Spectrum? Well, you still get additional value for your money by loading data into Amazon Redshift, and querying in Amazon Redshift vs. querying S3.

In fact, it turns out that the last version of our query runs even faster when executed exclusively in native Amazon Redshift, as shown in the following chart:

Chart comparing Amazon Redshift vs. Redshift Spectrum with pushdown aggregation over 3 months of data

As a general rule, queries that aren’t dominated by I/O and that involve multiple joins are better optimized in native Amazon Redshift. For instance, the performance difference between running the partition key query entirely in Amazon Redshift versus with Redshift Spectrum is twice as large as that of the pushdown aggregation query, partly because the former case benefits more from better join performance.

Furthermore, the variability in latency in native Amazon Redshift is lower. For use cases where you have tight performance SLAs on queries, you may want to consider using Amazon Redshift exclusively to support those queries.

On the other hand, when you perform large scans, you could benefit from the best of both worlds: higher performance at lower cost. For instance, imagine that you wanted to enable your business analysts to interactively discover insights across a vast amount of historical data. In the example below, the pushdown aggregation query is modified to analyze seven years of data instead of three months:

SELECT c.c_name, c.c_mktsegment, t.prettyMonthYear, uv.totalRevenue
…
WHERE customer <= 3 and visitYearMonth >= 199201
… 
FROM dwdate WHERE d_yearmonthnum >= 199201) as t
…
ORDER BY c.c_name, c.c_mktsegment, uv.visitYearMonth ASC

This query requires scanning and aggregating nearly 1.9 billion records. As shown in the chart below, Redshift Spectrum substantially speeds up this query. A large Amazon Redshift cluster would have to be provisioned to support this use case. With the aid of Redshift Spectrum, you could use an existing small cluster, keep a single copy of your data in S3, and benefit from economical, durable storage while only paying for what you use via the pay per query pricing model.

Chart comparing Amazon Redshift vs. Redshift Spectrum with pushdown aggregation over 7 years of data

Summary

Redshift Spectrum lowers the time to value for deeper insights on customer data queries spanning the data lake and data warehouse. It can enable interactive analysis on datasets in cases that weren’t economically practical or technically feasible before.

There are cases where you can get the best of both worlds from Redshift Spectrum: higher performance at lower cost. However, there are still latency-sensitive use cases where you may want native Amazon Redshift performance. For more best practice tips, see the 10 Best Practices for Amazon Redshift post.

Please visit the Amazon Redshift Spectrum PoC Environment Github page. If you have questions or suggestions, please comment below.

 


Additional Reading

Learn more about how Amazon Redshift Spectrum extends data warehousing out to exabytes – no loading required.


About the Author

Dylan Tong is an Enterprise Solutions Architect at AWS. He works with customers to help drive their success on the AWS platform through thought leadership and guidance on designing well architected solutions. He has spent most of his career building on his expertise in data management and analytics by working for leaders and innovators in the space.

 

 

Awesome Raspberry Pi cases to 3D print at home

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/3d-printed-raspberry-pi-cases/

Unless you’re planning to fit your Raspberry Pi inside a build, you may find yourself in need of a case to protect it from dust, damage and/or the occasional pet attack. Here are some of our favourite 3D-printed cases, for which files are available online so you can recreate them at home.

TARDIS

TARDIS Raspberry PI 3 case – 3D Printing Time lapse

(Embedded video: a 3D-printing time-lapse of the TARDIS Raspberry Pi 3 case, designed by Thingiverse user Jason3030 – https://www.thingiverse.com/thing:2430122/)

Since I am an avid Whovian, it’s not surprising that this case made its way onto the list. Its outside is aesthetically pleasing to the aspiring Time Lord, and it snugly fits your treasured Pi.



Pop this case on your desk and chuckle with glee every time someone asks what’s inside it:

Person: What’s that?
You: My Raspberry Pi.
Person: What’s a Raspberry Pi?
You: It’s a computer!
Person: There’s a whole computer in that tiny case?
You: Yes…it’s BIGGER ON THE INSIDE!

I’ll get my coat.

Pi crust

Yes, we all wish we’d thought of it first. What better case for a Raspberry Pi than a pie crust?

3D-printed Raspberry Pi cases

While the case is designed to fit the Raspberry Pi Model B, you will be able to upgrade the build to accommodate newer models with a few tweaks.



Just make sure that if you do, you credit Marco Valenzuela, its original baker.

Consoles

Since many people use the Raspberry Pi to run RetroPie, there is a growing trend of 3D-printed console-style Pi cases.

3D-printed Raspberry Pi cases

So why not pop your Raspberry Pi into a case made to look like your favourite vintage console, such as the Nintendo NES or N64?



You could also use an adapter to fit a Raspberry Pi Zero within an actual Atari cartridge, or go modern and print a PlayStation 4 case!

Functional

Maybe you’re looking to use your Raspberry Pi as a component of a larger project, such as a home automation system, learning suite, or makerspace. In that case you may need to attach it to a wall, under a desk, or behind a monitor.

3D-printed Raspberry Pi cases

Coo! Coo!

The Pidgeon, shown above, allows you to turn your Zero W into a surveillance camera, while the piPad lets you keep a breadboard attached for easy access to your Pi’s GPIO pins.



Functional cases with added brackets are great for incorporating your Pi on the sly. The VESA mount case will allow you to attach your Pi to any VESA-compatible monitor, and the Fallout 4 Terminal is just really cool.

Cute

You might want your case to just look cute, especially if it’s going to sit in full view on your desk or shelf.

3D-printed Raspberry Pi cases

The tired cube above is the only one of our featured 3D prints for which you have to buy the files ($1.30), but its adorable face begged to be shared anyway.



If you’d rather save your money for another day, you may want to check out this adorable monster from Adafruit. Be aware that this case will also need some altering to fit newer versions of the Pi.

Our cases

Finally, there are great options for you if you don’t have access to a 3D printer, or if you would like to help the Raspberry Pi Foundation’s mission. You can buy one of the official Raspberry Pi cases for the Raspberry Pi 3 and Raspberry Pi Zero (and Zero W)!

3D-printed Raspberry Pi cases



As with all official Raspberry Pi accessories (and with the Pi itself), your money goes toward helping the Foundation to put the power of digital making into the hands of people all over the world.

3D-printed Raspberry Pi cases

You could also print a replica of the official Astro Pi cases, in which two Pis are currently orbiting the earth on the International Space Station.

Design your own Raspberry Pi case!

If you’ve built a case for your Raspberry Pi, be it with a 3D printer, laser-cutter, or your bare hands, make sure to share it with us in the comments below, or via our social media channels.

And if you’d like to give 3D printing a go, there are plenty of free online learning resources, and sites that offer tutorials and software to get you started, such as TinkerCAD, Instructables, and Adafruit.

The post Awesome Raspberry Pi cases to 3D print at home appeared first on Raspberry Pi.

Running an elastic HiveMQ cluster with auto discovery on AWS

Post Syndicated from The HiveMQ Team original http://www.hivemq.com/blog/running-hivemq-cluster-aws-auto-discovery

hivemq-aws

HiveMQ is a cloud-first MQTT broker with elastic clustering capabilities and a resilient software design, which is a perfect fit for common cloud infrastructures. This blog post discussed the benefits an MQTT broker cluster offers. Today’s post aims to be more practical and talks about how to set up a HiveMQ cluster on one of the most popular cloud computing platforms: Amazon Web Services.

Running HiveMQ on cloud infrastructure

Running a HiveMQ cluster on cloud infrastructure like AWS not only offers the possibility of scaling the infrastructure elastically, it also ensures that state-of-the-art security standards are in place on the infrastructure side. These platforms are typically highly available, and new virtual machines can be spawned in a snap if they are needed. HiveMQ’s unique ability to add (and remove) cluster nodes at runtime without any manual reconfiguration of the cluster allows it to scale linearly on IaaS providers. New cluster nodes can be started (manually or automatically) and the cluster size adapts automatically. For more detailed information about HiveMQ clustering and how to achieve true high availability and linear scalability with HiveMQ, we recommend reading the HiveMQ Clustering Paper.

As Amazon Web Services is among the best-known and most widely used cloud platforms, we want to illustrate the setup of a HiveMQ cluster on AWS in this post. Note that the concepts shown in this step-by-step guide for running an elastic HiveMQ cluster on AWS also apply to other cloud platforms such as Microsoft Azure or Google Cloud Platform.

Setup and Configuration

Amazon Web Services prohibits the use of UDP multicast, which is the default HiveMQ cluster discovery mode. The use of Amazon Simple Storage Service (S3) buckets for auto-discovery is a perfect alternative if the brokers are running on AWS EC2 instances anyway. HiveMQ has a free off-the-shelf plugin available for AWS S3 Cluster Discovery.

The following provides a step-by-step guide on how to set up the brokers on AWS EC2 with automatic cluster member discovery via S3.

Setup Security Group

The first step is creating a security group that allows inbound traffic to the listeners we are going to configure for MQTT communication. It is also vital to have SSH access to the instances. After you have created the security group, you need to edit it and add an additional rule for internal communication between the cluster nodes (meaning the source is the security group itself) on all TCP ports.

To create and edit security groups go to the EC2 console – NETWORK & SECURITY – Security Groups

Inbound traffic

Inbound traffic

Outbound traffic

Outbound traffic

The next step is to create an S3 bucket in the S3 console. Make sure to choose a region close to the one you want to run your HiveMQ instances in.

Option A: Create IAM role and assign to EC2 instance

Our recommendation is to configure your EC2 instances in a way that allows them to access the S3 bucket. This way you don’t need to create a specific user and don’t need to use the user’s credentials in the s3discovery.properties file.

Create IAM Role

Create IAM Role

EC2 Instance Role Type

EC2 Instance Role Type

Select S3 Full Access

Select S3 Full Access

Assign new Role to Instance

Assign new Role to Instance

Option B: Create user and assign IAM policy

The next step is creating a user in the IAM console.

Choose name and set programmatic access

Choose name and set programmatic access

Assign s3 full access role

Assign s3 full access role

Review and create

Review and create

Download credentials

Download credentials

It is important you store these credentials, as they will be needed later for configuring the S3 Cluster Discovery Plugin.

Start EC2 instances with HiveMQ

The next step is spawning 2 or more EC2 instances with HiveMQ. Follow the steps in the HiveMQ User Guide.

Install s3 discovery plugin

The final step is downloading, installing and configuring the S3 Cluster Discovery Plugin.
After you have downloaded the plugin, you need to configure the S3 access in the s3discovery.properties file according to which S3 access option you chose.

Option A:

# AWS Credentials                                          #
############################################################

#
# Use environment variables to specify your AWS credentials
# the following variables need to be set:
# AWS_ACCESS_KEY_ID
# AWS_SECRET_ACCESS_KEY
#
#credentials-type:environment_variables

#
# Use Java system properties to specify your AWS credentials
# the following variables need to be set:
# aws.accessKeyId
# aws.secretKey
#
#credentials-type:java_system_properties

#
# Uses the credentials file which
# can be created by calling 'aws configure' (AWS CLI)
# usually this file is located at ~/.aws/credentials (platform dependent)
# The location of the file can be configured by setting the environment variable
# AWS_CREDENTIAL_PROFILE_FILE to the location of your file
#
#credentials-type:user_credentials_file

#
# Uses the IAM Profile assigned to the EC2 instance running HiveMQ to access S3
# Notice: This only works if HiveMQ is running on an EC2 instance !
#
credentials-type:instance_profile_credentials

#
# Tries to access S3 via the default mechanisms in the following order
# 1) Environment variables
# 2) Java system properties
# 3) User credentials file
# 4) IAM profiles assigned to EC2 instance
#
#credentials-type:default

#
# Uses the credentials specified in this file.
# The variables you must provide are:
# credentials-access-key-id
# credentials-secret-access-key
#
#credentials-type:access_key
#credentials-access-key-id:
#credentials-secret-access-key:

#
# Uses the credentials specified in this file to authenticate with a temporary session
# The variables you must provide are:
# credentials-access-key-id
# credentials-secret-access-key
# credentials-session-token
#
#credentials-type:temporary_session
#credentials-access-key-id:{access_key_id}
#credentials-secret-access-key:{secret_access_key}
#credentials-session-token:{session_token}


############################################################
# S3 Bucket                                                #
############################################################

#
# Region for the S3 bucket used by hivemq
# see http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region for a list of regions for S3
# example: us-west-2
#
s3-bucket-region:<your region here>

#
# Name of the bucket used by HiveMQ
#
s3-bucket-name:<your s3 bucket name here>

#
# Prefix for the filename of every node's file (optional)
#
file-prefix:hivemq/cluster/nodes/

#
# Expiration timeout (in minutes).
# Files with a timestamp older than (timestamp + expiration) will be automatically deleted
# Set to 0 if you do not want the plugin to handle expiration.
#
file-expiration:360

#
# Interval (in minutes) in which the own information in S3 is updated.
# Set to 0 if you do not want the plugin to update its own information.
# If you disable this you also might want to disable expiration.
#
update-interval:180

Option B:

# AWS Credentials                                          #
############################################################

#
# Use environment variables to specify your AWS credentials
# the following variables need to be set:
# AWS_ACCESS_KEY_ID
# AWS_SECRET_ACCESS_KEY
#
#credentials-type:environment_variables

#
# Use Java system properties to specify your AWS credentials
# the following variables need to be set:
# aws.accessKeyId
# aws.secretKey
#
#credentials-type:java_system_properties

#
# Uses the credentials file which
# can be created by calling 'aws configure' (AWS CLI)
# usually this file is located at ~/.aws/credentials (platform dependent)
# The location of the file can be configured by setting the environment variable
# AWS_CREDENTIAL_PROFILE_FILE to the location of your file
#
#credentials-type:user_credentials_file

#
# Uses the IAM Profile assigned to the EC2 instance running HiveMQ to access S3
# Notice: This only works if HiveMQ is running on an EC2 instance !
#
#credentials-type:instance_profile_credentials

#
# Tries to access S3 via the default mechanisms in the following order
# 1) Environment variables
# 2) Java system properties
# 3) User credentials file
# 4) IAM profiles assigned to EC2 instance
#
#credentials-type:default

#
# Uses the credentials specified in this file.
# The variables you must provide are:
# credentials-access-key-id
# credentials-secret-access-key
#
credentials-type:access_key
credentials-access-key-id:<your access key id here>
credentials-secret-access-key:<your secret access key here>

#
# Uses the credentials specified in this file to authenticate with a temporary session
# The variables you must provide are:
# credentials-access-key-id
# credentials-secret-access-key
# credentials-session-token
#
#credentials-type:temporary_session
#credentials-access-key-id:{access_key_id}
#credentials-secret-access-key:{secret_access_key}
#credentials-session-token:{session_token}


############################################################
# S3 Bucket                                                #
############################################################

#
# Region for the S3 bucket used by hivemq
# see http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region for a list of regions for S3
# example: us-west-2
#
s3-bucket-region:<your region here>

#
# Name of the bucket used by HiveMQ
#
s3-bucket-name:<your s3 bucket name here>

#
# Prefix for the filename of every node's file (optional)
#
file-prefix:hivemq/cluster/nodes/

#
# Expiration timeout (in minutes).
# Files with a timestamp older than (timestamp + expiration) will be automatically deleted
# Set to 0 if you do not want the plugin to handle expiration.
#
file-expiration:360

#
# Interval (in minutes) in which the own information in S3 is updated.
# Set to 0 if you do not want the plugin to update its own information.
# If you disable this you also might want to disable expiration.
#
update-interval:180

This file has to be identical on all your cluster nodes.

That’s it. Starting HiveMQ on multiple EC2 instances will now result in them forming a cluster, taking advantage of the S3 bucket for discovery.
You know that your setup was successful when HiveMQ logs something similar to this.

Cluster size = 2, members : [0QMpE, jw8wu].

Enjoy an elastic MQTT broker cluster

We are now able to take advantage of rapid elasticity. Scaling the HiveMQ cluster up or down by adding or removing EC2 instances, without the need for administrative intervention, is now possible.

For production environments it’s recommended to use automatic provisioning of the EC2 instances (e.g. by using Chef, Puppet, Ansible or similar tools) so you don’t need to configure each EC2 instance manually. Of course HiveMQ can also be used with Docker, which can also ease the provisioning of HiveMQ nodes.

Bitcoin, UASF… и политиката

Post Syndicated from Григор original http://www.gatchev.info/blog/?p=2064

Напоследък се заговори из Нета за UASF при Bitcoin. Надали обаче много хора са обърнали внимание на тия акроними. (Обикновено статиите по въпроса на свой ред са салата от други акроними, което също не улеснява разбирането им.) Какво, по дяволите, значи това? И важно ли е?

Всъщност не е особено важно, освен за хора, които сериозно се занимават с криптовалути. Останалите спокойно могат да не му обръщат внимание.

Поне на пръв поглед. Защото дава и сериозно разбиране за ефективността на някои фундаментални политически понятия. Затова смятам да му посветя тук част от времето си – и да изгубя част от вашето.

1. Проблемите на Bitcoin

Електронна валута, която се контролира не от политикани и меринджеи, а от строги правила – мечта, нали? Край на страховете, че поредният популист ще отвори печатницата за пари и ще превърне спестяванията ви в шарена тоалетна хартия… Но идеи без проблеми няма (за реализациите им да не говорим). Така е и с Bitcoin.

Всички транзакции в биткойни се записват в блокове, които образуват верига – така нареченият блокчейн. По този начин всяка стотинка (пардон, сатоши 🙂 ) може да бъде проследена до самото ѝ създаване. Адресите, между които се обменят парите, са анонимни, но самите обмени са публични и явни. Може да ги проследи и провери за валидност всеки, които има нужния софтуер (достъпен свободно) и поддържа „пълен възел“ (full node), тоест е склонен да отдели стотина гигабайта на диска си.

Проблемът е, че блокът на Bitcoin има фиксиран максимален размер – до 1 мегабайт. Той побира максимум 2-3 хиляди транзакции. При 6 блока на час това означава около 15 000 транзакции на час, или около 360 000 на денонощие. Звучи много, но всъщност е абсолютно недостатъчно – доста големи банки правят по повече транзакции на секунда. Та, от известно време насам нуждата от транзакции надхвърля капацитета на блокчейна. Което създава проблем за потребителите на валутата. Някои от тях започват да я изоставят и да се насочват към традиционни валути, или към други криптовалути. Съответно, влиянието и ролята ѝ спада.

2. The state of the solutions

Quite a few solutions to this problem have been proposed. The latest is called SegWit (segregated witness). All of them, however, and this one in particular, face serious resistance from key players in the Bitcoin world.

Fairly soon after Bitcoin’s creation, the rule was introduced that transactions pay a fee. (Otherwise it was trivially easy to generate an enormous number of back-and-forth transactions for a minimal amount and clog the blockchain.) Every transaction specifies how much it will pay to be included in a block. (That is what “legitimizes” it.)

Which of the waiting transactions get included in a block is decided by whoever creates that block. That is the “miner” who solved the puzzle from the previous block. Besides the standard block “reward”, the miner collects the fees of the included transactions. Miners therefore benefit when transactions are as expensive as possible, that is, when the blockchain’s capacity is insufficient.
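
To make the incentive concrete, here is a toy sketch in Python of how a miner might fill a block greedily by fee per byte. It is a deliberate simplification for illustration (real mining software is far more involved), and all the names in it are invented:

# A toy illustration of fee-driven transaction selection (not real Bitcoin code):
# the miner greedily packs the pending transactions with the highest fee per byte
# until the 1 MB block limit is reached.
from dataclasses import dataclass
from typing import List

MAX_BLOCK_BYTES = 1_000_000  # the 1 MB limit discussed above

@dataclass
class Tx:
    txid: str
    size_bytes: int
    fee_satoshi: int

    @property
    def fee_rate(self) -> float:
        return self.fee_satoshi / self.size_bytes

def select_transactions(mempool: List[Tx]) -> List[Tx]:
    chosen, used = [], 0
    for tx in sorted(mempool, key=lambda t: t.fee_rate, reverse=True):
        if used + tx.size_bytes <= MAX_BLOCK_BYTES:
            chosen.append(tx)
            used += tx.size_bytes
    return chosen

The scarcer the block space, the higher the fee rate a transaction must offer to be picked by this kind of selection, which is exactly why miners profit from a congested blockchain.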

On top of that, quite a few miners use a “hack” in the system’s technology, the so-called ASICBOOST. One of SegWit’s advantages is that it blocks such hacks, and thus those “miners”. (You can find the details here.)

The result is that some miners resist the introduction of SegWit. And it is “mining power” that serves as the “democratic vote” in the Bitcoin system. An attempt to introduce SegWit has already been made and failed. To ensure broader consensus, that attempt required SegWit to activate only once 95% of the mining power signaled support. It soon became clear that this would not happen.

3. UASF? WTF? (So what on earth is UASF?)

I do not know the exact percentage of SegWit-rejecting miners. But at the moment mining is centralized to the point where almost all of it is done by a small number of powerful companies. It is entirely possible that the SegWit rejecters hold more than 50% of the mining power. If so, introducing SegWit by relying on that power would be impossible. (Of course, this would mean the decline of Bitcoin in the near future and its transformation from “the king of cryptocurrencies” into a cheap museum exhibit. In the end those miners would have dug their own grave. But if there is one thing in this world that can always be counted on, to the bitter end, it is human stupidity.)

To avoid such a scenario, the developers of the Bitcoin Core Team proposed the so-called User-Activated Soft Fork, UASF for short. Its essence is that from August 1 onward, the nodes in the Bitcoin network that support SegWit will start treating blocks that do not signal support for it as invalid.
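
Conceptually, the rule a UASF node enforces can be sketched like this. This is only an illustration of the idea, not the actual BIP148 / Bitcoin Core code, and the function and field names are invented:

# A conceptual sketch of the UASF rule described above (not real Bitcoin Core code).
from datetime import datetime, timezone

UASF_START = datetime(2017, 8, 1, tzinfo=timezone.utc)

def block_is_acceptable(block_time: datetime, signals_segwit: bool) -> bool:
    # Before August 1 the old rules apply; from then on a UASF node treats
    # any block that does not signal SegWit support as invalid, no matter
    # how much mining power produced it.
    if block_time < UASF_START:
        return True
    return signals_segwit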

The miners who reject SegWit can keep mining the old way. Those who support it will continue the new way. Accordingly, from that moment on the Bitcoin blockchain will split in two: a branch without SegWit and a branch with it.

4. What will the result be?

The prevailing mining power may end up on the first branch; by Satoshi Nakamoto’s rules, that would make it the main one. But if the network is split in two, each half will have its own main branch, so they will not be technically unified. There will be two different currencies named Bitcoin, and each will claim to be the main one.

How will that dispute be resolved? Bitcoin users want lower transaction fees, so the vast majority of them will quickly gravitate to the SegWit chain. And Bitcoin’s value and acceptance rest simply on the fact that people accept it and are willing to use it. So the SegWit-enabled Bitcoin will keep the role (and the price) of the original Bitcoin, while the one without SegWit will get cheaper and lose most of its relevance.

(In fact, a similar “split” has already happened to the No. 2 of the cryptocurrency world, Ethereum. That is why there are Ethereum and Ethereum Classic. The latter lost the battle to be the heir of the original Ethereum, but it still exists, albeit with a much smaller role and price.)

The miners who rejected SegWit will soon find themselves mining something worth pennies. So they will probably, loudly or quietly, switch to supporting SegWit. I would not be surprised if many of them did so as early as August 1. (Although some will surely keep telling the whole world how bad the decision is and what losses they are suffering from it. There may even be lawsuits… We shall see the details.)

5. The politics

If you have made it this far, read carefully: the essence of this post is in this part.

Recently I talked with a proud graduate of a Bulgarian school of economics. I listened to an explanation of how economies of scale do not exist and how it is exactly the other way around. How small firms are more efficient than large ones, and so on…

No wonder they are taught nonsense. Whoever pays, even from behind the scenes, calls the tune. What puzzles me is that the students believe this nonsense when reality is right in front of their eyes, a reality in which large firms ruin and/or buy up the small ones, not the other way around. It cannot be otherwise. Just as Newton’s laws apply equally to laboratory weights and to shipping containers, so the dissipative laws apply equally to pots of water and to economic systems.

In the IT business the dynamics are well above average. Where the business is not regulated and cannot easily be, where things are more laissez-faire, as in Bitcoin mining for example, they are even faster. No wonder mining moved so quickly from millions of individual participants to a small number of tyrannosaurs that can easily form a cartel. Every system internally evolves in that direction… That is why a “perfect system” and “happiness forever after” cannot exist. That is why, if you will, freedom has to be kneaded and baked anew every day.

“The prevailing mining power”, whether as the prevailing number of individuals in a species, or as the bulk of the money, or as control over the memes most popular with voters, can easily become concentrated in a narrow circle of hands. And the laws of the internal evolution of systems, as a concrete expression of the dissipative laws, lead precisely there… At that point every vote starts to support the status quo. Democracy ceases to be an opportunity for change; the only such opportunity left is the separation of the differing views into separate systems. Only then does the new get a real chance to compete with the old.

That is also why every biological species around us once began as a tiny, divergent twig on the then-mighty trunk of another species, one that only paleobiologists know today. And every mighty bank, manufacturing firm, or media company began, as a sum of money, or production capacity, or intellectual property, as an ordinary loan booth, a small workshop, or a little studio. In the shadow of the tyrannosaurs of their day, remembered now only by historians. Having found a way to break off and somehow hide from them, in order to gather the strength to compete with them…

Those who get it, get it.

Perl 5.26.0 released

Post Syndicated from corbet original https://lwn.net/Articles/724363/rss

The Perl 5.26.0 release is out. “Perl 5.26.0 represents approximately 13 months of development since Perl 5.24.0 and contains approximately 360,000 lines of changes across 2,600 files from 86 authors”. See this page for a list of changes in this release; new features include indented here-documents, the ability to declare references to variables, Unicode 9.0 support, and the removal of the current directory (“.”) from @INC by default.

Product or Project?

Post Syndicated from Matt Richardson original https://www.raspberrypi.org/blog/product-or-project/

This column is from The MagPi issue 57. You can download a PDF of the full issue for free, or subscribe to receive the print edition in your mailbox or the digital edition on your tablet. All proceeds from the print and digital editions help the Raspberry Pi Foundation achieve its charitable goals.

[Image: The MagPi magazine and the AIY Projects kit]

Taking inspiration from a widely known inspirational phrase, I like to tell people, “make the thing you wish to see in the world.” In other words, you don’t have to wait for a company to create the exact product you want. You can be a maker as well as a consumer! Prototyping with hardware has become easier and more affordable, empowering people to make products that suit their needs perfectly. And the people making these things aren’t necessarily electrical engineers, computer scientists, or product designers. They’re not even necessarily adults. They’re often self-taught hobbyists who are empowered by maker-friendly technology.

It’s a subject I’ve been very interested in, and I have written about it before. Here’s what I’ve noticed: the flow between maker project and consumer product moves in both directions. In other words, consumer products can start off as maker projects. Just take a look at the story behind many of the crowdfunded products on sites such as Kickstarter. Conversely, consumer products can evolve into maker products as well. The cover story for the latest issue of The MagPi is a perfect example of that. Google has given you the resources you need to build your own dedicated Google Assistant device. How cool is that?

David Pride on Twitter: “@Raspberry_Pi @TheMagP1 Oh this is going to be a ridiculous amount of fun. 😊 #AIYProjects #woodchuck” https://t.co/2sWYmpi6T1

But consumer products becoming hackable hardware isn’t always an intentional move by the product’s maker. In the 2000s, TiVo set-top DVRs were a hot product and their most enthusiastic fans figured out how to hack the product to customise it to meet their needs without any kind of support from TiVo.

Embracing change

But since then, things have changed. For example, when Microsoft’s Kinect for the Xbox 360 was released in 2010, makers were immediately enticed by its capabilities. It not only acted as a camera, but it could also sense depth, a feature that would be useful for identifying the position of objects in a space. At first, there was no hacker support from Microsoft, so Adafruit Industries announced a $3,000 bounty to create open-source drivers so that anyone could access the features of Kinect for their own projects. Since then, Microsoft has embraced the use of Kinect for these purposes.

[Image: iRobot’s Create 2, a hackable version of the Roomba]

Consumer product companies even make versions of their products that are specifically meant for hacking, making, and learning. Belkin’s WeMo home automation product line includes the WeMo Maker, a device that can act as a remote relay or sensor and hook into your home automation system. And iRobot offers Create 2, a hackable version of its Roomba floor-cleaning robot. While iRobot aimed the robot at STEM educators, you could use it for personal projects too. Electronic instrument maker Korg takes its maker-friendly approach to the next level by releasing the schematics for some of its analogue synthesiser products.

Why would a company want to do this? There are a few possible reasons. For one, it’s a way of encouraging consumers to create a community around a product. It could be a way for innovation with the product to continue, unchecked by the firm’s own limits on resources. For certain, it’s an awesome feel-good way for a company to empower their own users. Whatever the reason these products exist, it’s the digital maker who comes out ahead. They have more affordable tools, materials, and resources to create their own customised products and possibly learn a thing or two along the way.

With maker-friendly, hackable products, being a creator and a consumer aren’t mutually exclusive. In fact, you’re probably getting the best of both worlds: great products and great opportunities to make the thing you wish to see in the world.

The post Product or Project? appeared first on Raspberry Pi.