Tag Archives: tracking

Accessing Cell Phone Location Information

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/05/accessing_cell_.html

The New York Times is reporting about a company called Securus Technologies that gives police the ability to track cell phone locations without a warrant:

The service can find the whereabouts of almost any cellphone in the country within seconds. It does this by going through a system typically used by marketers and other companies to get location data from major cellphone carriers, including AT&T, Sprint, T-Mobile and Verizon, documents show.

Another article.

Boing Boing post.

Some notes on eFail

Post Syndicated from Robert Graham original https://blog.erratasec.com/2018/05/some-notes-on-efail.html

I’ve been busy trying to replicate the “eFail” PGP/SMIME bug. I thought I’d write up some notes.

PGP and S/MIME encrypt emails, so that eavesdroppers can’t read them. The bugs potentially allow eavesdroppers to take the encrypted emails they’ve captured and resend them to you, reformatted in a way that allows them to decrypt the messages.

Disable remote/external content in email

The most important defense is to disable “external” or “remote” content from being automatically loaded. This is when HTML-formatted emails attempt to load images from remote websites. This happens legitimately when they want to display images, but not fill up the email with them. But most of the time this is illegitimate, they hide images on the webpage in order to track you with unique IDs and cookies. For example, this is the code at the end of an email from politician Bernie Sanders to his supporters. Notice the long random number assigned to track me, and the width/height of this image is set to one pixel, so you don’t even see it:

Such trackers are so pernicious they are disabled by default in most email clients. This is an example of the settings in Thunderbird:

The problem is that as you read email messages, you often get frustrated by the fact the error messages and missing content, so you keep adding exceptions:

The correct defense against this eFail bug is to make sure such remote content is disabled and that you have no exceptions, or at least, no HTTP exceptions. HTTPS exceptions (those using SSL) are okay as long as they aren’t to a website the attacker controls. Unencrypted exceptions, though, the hacker can eavesdrop on, so it doesn’t matter if they control the website the requests go to. If the attacker can eavesdrop on your emails, they can probably eavesdrop on your HTTP sessions as well.

Some have recommended disabling PGP and S/MIME completely. That’s probably overkill. As long as the attacker can’t use the “remote content” in emails, you are fine. Likewise, some have recommend disabling HTML completely. That’s not even an option in any email client I’ve used — you can disable sending HTML emails, but not receiving them. It’s sufficient to just disable grabbing remote content, not the rest of HTML email rendering.

I couldn’t replicate the direct exfiltration

There rare two related bugs. One allows direct exfiltration, which appends the decrypted PGP email onto the end of an IMG tag (like one of those tracking tags), allowing the entire message to be decrypted.

An example of this is the following email. This is a standard HTML email message consisting of multiple parts. The trick is that the IMG tag in the first part starts the URL (blog.robertgraham.com/…) but doesn’t end it. It has the starting quotes in front of the URL but no ending quotes. The ending will in the next chunk.

The next chunk isn’t HTML, though, it’s PGP. The PGP extension (in my case, Enignmail) will detect this and automatically decrypt it. In this case, it’s some previous email message I’ve received the attacker captured by eavesdropping, who then pastes the contents into this email message in order to get it decrypted.

What should happen at this point is that Thunderbird will generate a request (if “remote content” is enabled) to the blog.robertgraham.com server with the decrypted contents of the PGP email appended to it. But that’s not what happens. Instead, I get this:

I am indeed getting weird stuff in the URL (the bit after the GET /), but it’s not the PGP decrypted message. Instead what’s going on is that when Thunderbird puts together a “multipart/mixed” message, it adds it’s own HTML tags consisting of lines between each part. In the email client it looks like this:

The HTML code it adds looks like:

That’s what you see in the above URL, all this code up to the first quotes. Those quotes terminate the quotes in the URL from the first multipart section, causing the rest of the content to be ignored (as far as being sent as part of the URL).

So at least for the latest version of Thunderbird, you are accidentally safe, even if you have “remote content” enabled. Though, this is only according to my tests, there may be a work around to this that hackers could exploit.

STARTTLS

In the old days, email was sent plaintext over the wire so that it could be passively eavesdropped on. Nowadays, most providers send it via “STARTTLS”, which sorta encrypts it. Attackers can still intercept such email, but they have to do so actively, using man-in-the-middle. Such active techniques can be detected if you are careful and look for them.
Some organizations don’t care. Apparently, some nation states are just blocking all STARTTLS and forcing email to be sent unencrypted. Others do care. The NSA will passively sniff all the email they can in nations like Iraq, but they won’t actively intercept STARTTLS messages, for fear of getting caught.
The consequence is that it’s much less likely that somebody has been eavesdropping on you, passively grabbing all your PGP/SMIME emails. If you fear they have been, you should look (e.g. send emails from GMail and see if they are intercepted by sniffing the wire).

You’ll know if you are getting hacked

If somebody attacks you using eFail, you’ll know. You’ll get an email message formatted this way, with multipart/mixed components, some with corrupt HTML, some encrypted via PGP. This means that for the most part, your risk is that you’ll be attacked only once — the hacker will only be able to get one message through and decrypt it before you notice that something is amiss. Though to be fair, they can probably include all the emails they want decrypted as attachments to the single email they sent you, so the risk isn’t necessarily that you’ll only get one decrypted.
As mentioned above, a lot of attackers (e.g. the NSA) won’t attack you if its so easy to get caught. Other attackers, though, like anonymous hackers, don’t care.
Somebody ought to write a plugin to Thunderbird to detect this.

Summary

It only works if attackers have already captured your emails (though, that’s why you use PGP/SMIME in the first place, to guard against that).
It only works if you’ve enabled your email client to automatically grab external/remote content.
It seems to not be easily reproducible in all cases.
Instead of disabling PGP/SMIME, you should make sure your email client hast remote/external content disabled — that’s a huge privacy violation even without this bug.

Notes: The default email client on the Mac enables remote content by default, which is bad:

Amazon Aurora Backtrack – Turn Back Time

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-aurora-backtrack-turn-back-time/

We’ve all been there! You need to make a quick, seemingly simple fix to an important production database. You compose the query, give it a once-over, and let it run. Seconds later you realize that you forgot the WHERE clause, dropped the wrong table, or made another serious mistake, and interrupt the query, but the damage has been done. You take a deep breath, whistle through your teeth, wish that reality came with an Undo option. Now what?

New Amazon Aurora Backtrack
Today I would like to tell you about the new backtrack feature for Amazon Aurora. This is as close as we can come, given present-day technology, to an Undo option for reality.

This feature can be enabled at launch time for all newly-launched Aurora database clusters. To enable it, you simply specify how far back in time you might want to rewind, and use the database as usual (this is on the Configure advanced settings page):

Aurora uses a distributed, log-structured storage system (read Design Considerations for High Throughput Cloud-Native Relational Databases to learn a lot more); each change to your database generates a new log record, identified by a Log Sequence Number (LSN). Enabling the backtrack feature provisions a FIFO buffer in the cluster for storage of LSNs. This allows for quick access and recovery times measured in seconds.

After that regrettable moment when all seems lost, you simply pause your application, open up the Aurora Console, select the cluster, and click Backtrack DB cluster:

Then you select Backtrack and choose the point in time just before your epic fail, and click Backtrack DB cluster:

Then you wait for the rewind to take place, unpause your application and proceed as if nothing had happened. When you initiate a backtrack, Aurora will pause the database, close any open connections, drop uncommitted writes, and wait for the backtrack to complete. Then it will resume normal operation and being to accept requests. The instance state will be backtracking while the rewind is underway:

The console will let you know when the backtrack is complete:

If it turns out that you went back a bit too far, you can backtrack to a later time. Other Aurora features such as cloning, backups, and restores continue to work on an instance that has been configured for backtrack.

I’m sure you can think of some creative and non-obvious use cases for this cool new feature. For example, you could use it to restore a test database after running a test that makes changes to the database. You can initiate the restoration from the API or the CLI, making it easy to integrate into your existing test framework.

Things to Know
This option applies to newly created MySQL-compatible Aurora database clusters and to MySQL-compatible clusters that have been restored from a backup. You must opt-in when you create or restore a cluster; you cannot enable it for a running cluster.

This feature is available now in all AWS Regions where Amazon Aurora runs, and you can start using it today.

Jeff;

Hard Drive Stats for Q1 2018

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/hard-drive-stats-for-q1-2018/

Backblaze Drive Stats Q1 2018

As of March 31, 2018 we had 100,110 spinning hard drives. Of that number, there were 1,922 boot drives and 98,188 data drives. This review looks at the quarterly and lifetime statistics for the data drive models in operation in our data centers. We’ll also take a look at why we are collecting and reporting 10 new SMART attributes and take a sneak peak at some 8 TB Toshiba drives. Along the way, we’ll share observations and insights on the data presented and we look forward to you doing the same in the comments.

Background

Since April 2013, Backblaze has recorded and saved daily hard drive statistics from the drives in our data centers. Each entry consists of the date, manufacturer, model, serial number, status (operational or failed), and all of the SMART attributes reported by that drive. Currently there are about 97 million entries totaling 26 GB of data. You can download this data from our website if you want to do your own research, but for starters here’s what we found.

Hard Drive Reliability Statistics for Q1 2018

At the end of Q1 2018 Backblaze was monitoring 98,188 hard drives used to store data. For our evaluation below we remove from consideration those drives which were used for testing purposes and those drive models for which we did not have at least 45 drives. This leaves us with 98,046 hard drives. The table below covers just Q1 2018.

Q1 2018 Hard Drive Failure Rates

Notes and Observations

If a drive model has a failure rate of 0%, it only means there were no drive failures of that model during Q1 2018.

The overall Annualized Failure Rate (AFR) for Q1 is just 1.2%, well below the Q4 2017 AFR of 1.65%. Remember that quarterly failure rates can be volatile, especially for models that have a small number of drives and/or a small number of Drive Days.

There were 142 drives (98,188 minus 98,046) that were not included in the list above because we did not have at least 45 of a given drive model. We use 45 drives of the same model as the minimum number when we report quarterly, yearly, and lifetime drive statistics.

Welcome Toshiba 8TB drives, almost…

We mentioned Toshiba 8 TB drives in the first paragraph, but they don’t show up in the Q1 Stats chart. What gives? We only had 20 of the Toshiba 8 TB drives in operation in Q1, so they were excluded from the chart. Why do we have only 20 drives? When we test out a new drive model we start with the “tome test” and it takes 20 drives to fill one tome. A tome is the same drive model in the same logical position in each of the 20 Storage Pods that make up a Backblaze Vault. There are 60 tomes in each vault.

In this test, we created a Backblaze Vault of 8 TB drives, with 59 of the tomes being Seagate 8 TB drives and 1 tome being the Toshiba drives. Then we monitored the performance of the vault and its member tomes to see if, in this case, the Toshiba drives performed as expected.

Q1 2018 Hard Drive Failure Rate — Toshiba 8TB

So far the Toshiba drive is performing fine, but they have been in place for only 20 days. Next up is the “pod test” where we fill a Storage Pod with Toshiba drives and integrate it into a Backblaze Vault comprised of like-sized drives. We hope to have a better look at the Toshiba 8 TB drives in our Q2 report — stay tuned.

Lifetime Hard Drive Reliability Statistics

While the quarterly chart presented earlier gets a lot of interest, the real test of any drive model is over time. Below is the lifetime failure rate chart for all the hard drive models which have 45 or more drives in operation as of March 31st, 2018. For each model, we compute their reliability starting from when they were first installed.

Lifetime Hard Drive Failure Rates

Notes and Observations

The failure rates of all of the larger drives (8-, 10- and 12 TB) are very good, 1.2% AFR (Annualized Failure Rate) or less. Many of these drives were deployed in the last year, so there is some volatility in the data, but you can use the Confidence Interval to get a sense of the failure percentage range.

The overall failure rate of 1.84% is the lowest we have ever achieved, besting the previous low of 2.00% from the end of 2017.

Our regular readers and drive stats wonks may have noticed a sizable jump in the number of HGST 8 TB drives (model: HUH728080ALE600), from 45 last quarter to 1,045 this quarter. As the 10 TB and 12 TB drives become more available, the price per terabyte of the 8 TB drives has gone down. This presented an opportunity to purchase the HGST drives at a price in line with our budget.

We purchased and placed into service the 45 original HGST 8 TB drives in Q2 of 2015. They were our first Helium-filled drives and our only ones until the 10 TB and 12 TB Seagate drives arrived in Q3 2017. We’ll take a first look into whether or not Helium makes a difference in drive failure rates in an upcoming blog post.

New SMART Attributes

If you have previously worked with the hard drive stats data or plan to, you’ll notice that we added 10 more columns of data starting in 2018. There are 5 new SMART attributes we are tracking each with a raw and normalized value:

  • 177 – Wear Range Delta
  • 179 – Used Reserved Block Count Total
  • 181- Program Fail Count Total or Non-4K Aligned Access Count
  • 182 – Erase Fail Count
  • 235 – Good Block Count AND System(Free) Block Count

The 5 values are all related to SSD drives.

Yes, SSD drives, but before you jump to any conclusions, we used 10 Samsung 850 EVO SSDs as boot drives for a period of time in Q1. This was an experiment to see if we could reduce boot up time for the Storage Pods. In our case, the improved boot up speed wasn’t worth the SSD cost, but it did add 10 new columns to the hard drive stats data.

Speaking of hard drive stats data, the complete data set used to create the information used in this review is available on our Hard Drive Test Data page. You can download and use this data for free for your own purpose, all we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone. It is free.

If you just want the summarized data used to create the tables and charts in this blog post, you can download the ZIP file containing the MS Excel spreadsheet.

Good luck and let us know if you find anything interesting.

[Ed: 5/1/2018 – Updated Lifetime chart to fix error in confidence interval for HGST 4TB drive, model: HDS5C4040ALE630]

The post Hard Drive Stats for Q1 2018 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Tips for Success: GDPR Lessons Learned

Post Syndicated from Chad Woolf original https://aws.amazon.com/blogs/security/tips-for-success-gdpr-lessons-learned/

Security is our top priority at AWS, and from the beginning we have built security into the fabric of our services. With the introduction of GDPR (which becomes enforceable on May 25 of 2018), privacy and data protection have become even more ingrained into our security-centered culture. Three weeks ago, well ahead of the deadline, we announced that all AWS services are compliant with GDPR, meaning you can use AWS as a data processor as a way to help solve your GDPR challenges (be sure to visit our GDPR Center for additional information).

When it comes to GDPR compliance, many customers are progressing nicely and much of the initial trepidation is gone. In my interactions with customers on this topic, a few themes have emerged as universal:

  • GDPR is important. You need to have a plan in place if you process personal data of EU data subjects, not only because it’s good governance, but because GDPR does carry significant penalties for non-compliance.
  • Solving this can be complex, potentially involving a lot of personnel and multiple tools. Your GDPR process will also likely span across disciplines – impacting people, processes, and technology.
  • Each customer is unique, and there are many methodologies around assessing your compliance with GDPR. It’s important to be aware of your own individual business attributes.

I thought it might be helpful to share some of our own lessons learned. In our experience in solving the GDPR challenge, the following were keys to our success:

  1. Get your senior leadership involved. We have a regular cadence of detailed status conversations about GDPR with our CEO, Andy Jassy. GDPR is high stakes, and the AWS leadership team knows it. If GDPR doesn’t have the attention it needs with the visibility of top management today, it’s time to escalate.
  2. Centralize the GDPR efforts. Driving all work streams centrally is key. This may sound obvious, but managing this in a distributed manner may result in duplicative effort and/or team members moving in a different direction.
  3. The most important single partner in solving GDPR is your legal team. Having non-legal people make assumptions about how to interpret GDPR for your unique environment is both risky and a potential waste of time and resources. You want to avoid analysis paralysis by getting proper legal advice, collaborating on a direction, and then moving forward with the proper urgency.
  4. Collaborate closely with tech leadership. The “process” people in your organization, the ones who already know how to approach governance problems, are typically comfortable jumping right in to GDPR. But technical teams, including data owners, have set up their software for business application. They may not even know what kind of data they are storing, processing, or transferring to other parts of the business. In the GDPR exercise they need to be aware of (or at least help facilitate) the tracking of data and data elements between systems. This isn’t a typical ask for technical teams, so be prepared to educate and to fully understand data flow.
  5. Don’t live by the established checklists. There are multiple methodologies to solving the compliance challenges of GDPR. At AWS, we ended up establishing core requirements, mapped out by data controller and data processor functions and then, in partnership with legal, decided upon a group of projects based on our known current state. Be careful about using a set methodology, tool or questionnaire to govern your efforts. These generic assessments can help educate, but letting them drive or limit your work could lead to missing something that is key to your own compliance. In this sense, a generic, “one size fits all” solution might not be helpful.
  6. Don’t be afraid to challenge prior orthodoxy. Many times we changed course based on new information. You shouldn’t be afraid to scrap an effort if you determine it’s not working. You should also not be afraid to escalate issues to senior leadership when needed. This is an executive issue.
  7. Look for ways to leverage your work beyond this compliance activity. GDPR requires serious effort, but are the results limited to GDPR compliance? Certainly not. You can use GDPR workflows as a way to ensure better governance moving forward. Privacy and security will require work for the foreseeable future, so make your governance program scalable and usable for other purposes.

One last tip that has made all the difference: think about protecting data subjects and work backwards from there. Customer focus drives us to ask, “what would customers and data subjects want and expect us to do?” Taking GDPR from a pure legal or compliance standpoint may be technically sufficient, but we believe the objectives of security and personal data protection require a more comprehensive view, and you can most effectively shape that view by starting with the individuals GDPR was meant to protect.

If you would like to find out more about our experiences, as well as how we can help you in your efforts, please reach out to us today.

-Chad Woolf

Vice President, AWS Security Assurance

Interested in additional AWS Security news? Follow the AWS Security Blog on Twitter.

COPPA Compliance

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/04/coppa_complianc.html

Interesting research: “‘Won’t Somebody Think of the Children?’ Examining COPPA Compliance at Scale“:

Abstract: We present a scalable dynamic analysis framework that allows for the automatic evaluation of the privacy behaviors of Android apps. We use our system to analyze mobile apps’ compliance with the Children’s Online Privacy Protection Act (COPPA), one of the few stringent privacy laws in the U.S. Based on our automated analysis of 5,855 of the most popular free children’s apps, we found that a majority are potentially in violation of COPPA, mainly due to their use of third-party SDKs. While many of these SDKs offer configuration options to respect COPPA by disabling tracking and behavioral advertising, our data suggest that a majority of apps either do not make use of these options or incorrectly propagate them across mediation SDKs. Worse, we observed that 19% of children’s apps collect identifiers or other personally identifiable information (PII) via SDKs whose terms of service outright prohibit their use in child-directed apps. Finally, we show that efforts by Google to limit tracking through the use of a resettable advertising ID have had little success: of the 3,454 apps that share the resettable ID with advertisers, 66% transmit other, non-resettable, persistent identifiers as well, negating any intended privacy-preserving properties of the advertising ID.

How to retain system tables’ data spanning multiple Amazon Redshift clusters and run cross-cluster diagnostic queries

Post Syndicated from Karthik Sonti original https://aws.amazon.com/blogs/big-data/how-to-retain-system-tables-data-spanning-multiple-amazon-redshift-clusters-and-run-cross-cluster-diagnostic-queries/

Amazon Redshift is a data warehouse service that logs the history of the system in STL log tables. The STL log tables manage disk space by retaining only two to five days of log history, depending on log usage and available disk space.

To retain STL tables’ data for an extended period, you usually have to create a replica table for every system table. Then, for each you load the data from the system table into the replica at regular intervals. By maintaining replica tables for STL tables, you can run diagnostic queries on historical data from the STL tables. You then can derive insights from query execution times, query plans, and disk-spill patterns, and make better cluster-sizing decisions. However, refreshing replica tables with live data from STL tables at regular intervals requires schedulers such as Cron or AWS Data Pipeline. Also, these tables are specific to one cluster and they are not accessible after the cluster is terminated. This is especially true for transient Amazon Redshift clusters that last for only a finite period of ad hoc query execution.

In this blog post, I present a solution that exports system tables from multiple Amazon Redshift clusters into an Amazon S3 bucket. This solution is serverless, and you can schedule it as frequently as every five minutes. The AWS CloudFormation deployment template that I provide automates the solution setup in your environment. The system tables’ data in the Amazon S3 bucket is partitioned by cluster name and query execution date to enable efficient joins in cross-cluster diagnostic queries.

I also provide another CloudFormation template later in this post. This second template helps to automate the creation of tables in the AWS Glue Data Catalog for the system tables’ data stored in Amazon S3. After the system tables are exported to Amazon S3, you can run cross-cluster diagnostic queries on the system tables’ data and derive insights about query executions in each Amazon Redshift cluster. You can do this using Amazon QuickSight, Amazon Athena, Amazon EMR, or Amazon Redshift Spectrum.

You can find all the code examples in this post, including the CloudFormation templates, AWS Glue extract, transform, and load (ETL) scripts, and the resolution steps for common errors you might encounter in this GitHub repository.

Solution overview

The solution in this post uses AWS Glue to export system tables’ log data from Amazon Redshift clusters into Amazon S3. The AWS Glue ETL jobs are invoked at a scheduled interval by AWS Lambda. AWS Systems Manager, which provides secure, hierarchical storage for configuration data management and secrets management, maintains the details of Amazon Redshift clusters for which the solution is enabled. The last-fetched time stamp values for the respective cluster-table combination are maintained in an Amazon DynamoDB table.

The following diagram covers the key steps involved in this solution.

The solution as illustrated in the preceding diagram flows like this:

  1. The Lambda function, invoke_rs_stl_export_etl, is triggered at regular intervals, as controlled by Amazon CloudWatch. It’s triggered to look up the AWS Systems Manager parameter store to get the details of the Amazon Redshift clusters for which the system table export is enabled.
  2. The same Lambda function, based on the Amazon Redshift cluster details obtained in step 1, invokes the AWS Glue ETL job designated for the Amazon Redshift cluster. If an ETL job for the cluster is not found, the Lambda function creates one.
  3. The ETL job invoked for the Amazon Redshift cluster gets the cluster credentials from the parameter store. It gets from the DynamoDB table the last exported time stamp of when each of the system tables was exported from the respective Amazon Redshift cluster.
  4. The ETL job unloads the system tables’ data from the Amazon Redshift cluster into an Amazon S3 bucket.
  5. The ETL job updates the DynamoDB table with the last exported time stamp value for each system table exported from the Amazon Redshift cluster.
  6. The Amazon Redshift cluster system tables’ data is available in Amazon S3 and is partitioned by cluster name and date for running cross-cluster diagnostic queries.

Understanding the configuration data

This solution uses AWS Systems Manager parameter store to store the Amazon Redshift cluster credentials securely. The parameter store also securely stores other configuration information that the AWS Glue ETL job needs for extracting and storing system tables’ data in Amazon S3. Systems Manager comes with a default AWS Key Management Service (AWS KMS) key that it uses to encrypt the password component of the Amazon Redshift cluster credentials.

The following table explains the global parameters and cluster-specific parameters required in this solution. The global parameters are defined once and applicable at the overall solution level. The cluster-specific parameters are specific to an Amazon Redshift cluster and repeat for each cluster for which you enable this post’s solution. The CloudFormation template explained later in this post creates these parameters as part of the deployment process.

Parameter name Type Description
Global parametersdefined once and applied to all jobs
redshift_query_logs.global.s3_prefix String The Amazon S3 path where the query logs are exported. Under this path, each exported table is partitioned by cluster name and date.
redshift_query_logs.global.tempdir String The Amazon S3 path that AWS Glue ETL jobs use for temporarily staging the data.
redshift_query_logs.global.role> String The name of the role that the AWS Glue ETL jobs assume. Just the role name is sufficient. The complete Amazon Resource Name (ARN) is not required.
redshift_query_logs.global.enabled_cluster_list StringList A comma-separated list of cluster names for which system tables’ data export is enabled. This gives flexibility for a user to exclude certain clusters.
Cluster-specific parametersfor each cluster specified in the enabled_cluster_list parameter
redshift_query_logs.<<cluster_name>>.connection String The name of the AWS Glue Data Catalog connection to the Amazon Redshift cluster. For example, if the cluster name is product_warehouse, the entry is redshift_query_logs.product_warehouse.connection.
redshift_query_logs.<<cluster_name>>.user String The user name that AWS Glue uses to connect to the Amazon Redshift cluster.
redshift_query_logs.<<cluster_name>>.password Secure String The password that AWS Glue uses to connect the Amazon Redshift cluster’s encrypted-by key that is managed in AWS KMS.

For example, suppose that you have two Amazon Redshift clusters, product-warehouse and category-management, for which the solution described in this post is enabled. In this case, the parameters shown in the following screenshot are created by the solution deployment CloudFormation template in the AWS Systems Manager parameter store.

Solution deployment

To make it easier for you to get started, I created a CloudFormation template that automatically configures and deploys the solution—only one step is required after deployment.

Prerequisites

To deploy the solution, you must have one or more Amazon Redshift clusters in a private subnet. This subnet must have a network address translation (NAT) gateway or a NAT instance configured, and also a security group with a self-referencing inbound rule for all TCP ports. For more information about why AWS Glue ETL needs the configuration it does, described previously, see Connecting to a JDBC Data Store in a VPC in the AWS Glue documentation.

To start the deployment, launch the CloudFormation template:

CloudFormation stack parameters

The following table lists and describes the parameters for deploying the solution to export query logs from multiple Amazon Redshift clusters.

Property Default Description
S3Bucket mybucket The bucket this solution uses to store the exported query logs, stage code artifacts, and perform unloads from Amazon Redshift. For example, the mybucket/extract_rs_logs/data bucket is used for storing all the exported query logs for each system table partitioned by the cluster. The mybucket/extract_rs_logs/temp/ bucket is used for temporarily staging the unloaded data from Amazon Redshift. The mybucket/extract_rs_logs/code bucket is used for storing all the code artifacts required for Lambda and the AWS Glue ETL jobs.
ExportEnabledRedshiftClusters Requires Input A comma-separated list of cluster names from which the system table logs need to be exported.
DataStoreSecurityGroups Requires Input A list of security groups with an inbound rule to the Amazon Redshift clusters provided in the parameter, ExportEnabledClusters. These security groups should also have a self-referencing inbound rule on all TCP ports, as explained on Connecting to a JDBC Data Store in a VPC.

After you launch the template and create the stack, you see that the following resources have been created:

  1. AWS Glue connections for each Amazon Redshift cluster you provided in the CloudFormation stack parameter, ExportEnabledRedshiftClusters.
  2. All parameters required for this solution created in the parameter store.
  3. The Lambda function that invokes the AWS Glue ETL jobs for each configured Amazon Redshift cluster at a regular interval of five minutes.
  4. The DynamoDB table that captures the last exported time stamps for each exported cluster-table combination.
  5. The AWS Glue ETL jobs to export query logs from each Amazon Redshift cluster provided in the CloudFormation stack parameter, ExportEnabledRedshiftClusters.
  6. The IAM roles and policies required for the Lambda function and AWS Glue ETL jobs.

After the deployment

For each Amazon Redshift cluster for which you enabled the solution through the CloudFormation stack parameter, ExportEnabledRedshiftClusters, the automated deployment includes temporary credentials that you must update after the deployment:

  1. Go to the parameter store.
  2. Note the parameters <<cluster_name>>.user and redshift_query_logs.<<cluster_name>>.password that correspond to each Amazon Redshift cluster for which you enabled this solution. Edit these parameters to replace the placeholder values with the right credentials.

For example, if product-warehouse is one of the clusters for which you enabled system table export, you edit these two parameters with the right user name and password and choose Save parameter.

Querying the exported system tables

Within a few minutes after the solution deployment, you should see Amazon Redshift query logs being exported to the Amazon S3 location, <<S3Bucket_you_provided>>/extract_redshift_query_logs/data/. In that bucket, you should see the eight system tables partitioned by customer name and date: stl_alert_event_log, stl_dlltext, stl_explain, stl_query, stl_querytext, stl_scan, stl_utilitytext, and stl_wlm_query.

To run cross-cluster diagnostic queries on the exported system tables, create external tables in the AWS Glue Data Catalog. To make it easier for you to get started, I provide a CloudFormation template that creates an AWS Glue crawler, which crawls the exported system tables stored in Amazon S3 and builds the external tables in the AWS Glue Data Catalog.

Launch this CloudFormation template to create external tables that correspond to the Amazon Redshift system tables. S3Bucket is the only input parameter required for this stack deployment. Provide the same Amazon S3 bucket name where the system tables’ data is being exported. After you successfully create the stack, you can see the eight tables in the database, redshift_query_logs_db, as shown in the following screenshot.

Now, navigate to the Athena console to run cross-cluster diagnostic queries. The following screenshot shows a diagnostic query executed in Athena that retrieves query alerts logged across multiple Amazon Redshift clusters.

You can build the following example Amazon QuickSight dashboard by running cross-cluster diagnostic queries on Athena to identify the hourly query count and the key query alert events across multiple Amazon Redshift clusters.

How to extend the solution

You can extend this post’s solution in two ways:

  • Add any new Amazon Redshift clusters that you spin up after you deploy the solution.
  • Add other system tables or custom query results to the list of exports from an Amazon Redshift cluster.

Extend the solution to other Amazon Redshift clusters

To extend the solution to more Amazon Redshift clusters, add the three cluster-specific parameters in the AWS Systems Manager parameter store following the guidelines earlier in this post. Modify the redshift_query_logs.global.enabled_cluster_list parameter to append the new cluster to the comma-separated string.

Extend the solution to add other tables or custom queries to an Amazon Redshift cluster

The current solution ships with the export functionality for the following Amazon Redshift system tables:

  • stl_alert_event_log
  • stl_dlltext
  • stl_explain
  • stl_query
  • stl_querytext
  • stl_scan
  • stl_utilitytext
  • stl_wlm_query

You can easily add another system table or custom query by adding a few lines of code to the AWS Glue ETL job, <<cluster-name>_extract_rs_query_logs. For example, suppose that from the product-warehouse Amazon Redshift cluster you want to export orders greater than $2,000. To do so, add the following five lines of code to the AWS Glue ETL job product-warehouse_extract_rs_query_logs, where product-warehouse is your cluster name:

  1. Get the last-processed time-stamp value. The function creates a value if it doesn’t already exist.

salesLastProcessTSValue = functions.getLastProcessedTSValue(trackingEntry=”mydb.sales_2000",job_configs=job_configs)

  1. Run the custom query with the time stamp.

returnDF=functions.runQuery(query="select * from sales s join order o where o.order_amnt > 2000 and sale_timestamp > '{}'".format (salesLastProcessTSValue) ,tableName="mydb.sales_2000",job_configs=job_configs)

  1. Save the results to Amazon S3.

functions.saveToS3(dataframe=returnDF,s3Prefix=s3Prefix,tableName="mydb.sales_2000",partitionColumns=["sale_date"],job_configs=job_configs)

  1. Get the latest time-stamp value from the returned data frame in Step 2.

latestTimestampVal=functions.getMaxValue(returnDF,"sale_timestamp",job_configs)

  1. Update the last-processed time-stamp value in the DynamoDB table.

functions.updateLastProcessedTSValue(“mydb.sales_2000",latestTimestampVal[0],job_configs)

Conclusion

In this post, I demonstrate a serverless solution to retain the system tables’ log data across multiple Amazon Redshift clusters. By using this solution, you can incrementally export the data from system tables into Amazon S3. By performing this export, you can build cross-cluster diagnostic queries, build audit dashboards, and derive insights into capacity planning by using services such as Athena. I also demonstrate how you can extend this solution to other ad hoc query use cases or tables other than system tables by adding a few lines of code.


Additional Reading

If you found this post useful, be sure to check out Using Amazon Redshift Spectrum, Amazon Athena, and AWS Glue with Node.js in Production and Amazon Redshift – 2017 Recap.


About the Author

Karthik Sonti is a senior big data architect at Amazon Web Services. He helps AWS customers build big data and analytical solutions and provides guidance on architecture and best practices.

 

 

 

 

Community profile: Dave Akerman

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/community-profile-dave-akerman/

This column is from The MagPi issue 61. You can download a PDF of the full issue for free, or subscribe to receive the print edition through your letterbox or the digital edition on your tablet. All proceeds from the print and digital editions help the Raspberry Pi Foundation achieve our charitable goals.

The pinned tweet on Dave Akerman’s Twitter account shows a table displaying the various components needed for a high-altitude balloon (HAB) flight. Batteries, leads, a camera and Raspberry Pi, plus an unusually themed payload. The caption reads ‘The Queen, The Duke of York, and my TARDIS”, and sums up Dave’s maker career in a heartbeat.

David Akerman on Twitter

The Queen, The Duke of York, and my TARDIS 🙂 #UKHAS #RaspberryPi

Though writing software for industrial automation pays the bills, the majority of Dave’s time is spent in the world of high-altitude ballooning and the ever-growing community that encompasses it. And, while he makes some money sending business-themed balloons to near space for the likes of Aardman Animations, Confused.com, and the BBC, Dave is best known in the Raspberry Pi community for his use of the small computer in every payload, and his work as a tutor alongside the Foundation’s staff at Skycademy events.

Dave Akerman The MagPi Raspberry Pi Community Profile

Dave continues to help others while breaking records and having a good time exploring the atmosphere.

Dave has dedicated many hours and many, many more miles to assist with the Foundation’s Skycademy programme, helping to explore high-altitude ballooning with educators from across the UK. Using a Raspberry Pi and various other pieces of lightweight tech, Dave and Foundation staff member James Robinson explored the incorporation of high-altitude ballooning into education. Through Skycademy, educators were able to learn new skills and take them to the classroom, setting off their own balloons with their students, and recording the results on Raspberry Pis.

Dave Akerman The MagPi Raspberry Pi Community Profile

Dave’s most recent flight broke a new record. On 13 August 2017, his HAB payload was able to send back the highest images taken by any amateur flight.

But education isn’t the only reason for Dave’s involvement in the HAB community. As with anyone passionate about a specific hobby, Dave strives to break records. The most recent record-breaking flight took place on 13 August 2017, when Dave’s Raspberry Pi Zero HAB sent home the highest images taken by any amateur high-altitude balloon launch: at 43014 metres. No other HAB balloon has provided images from such an altitude, and the lightweight nature of the Pi Zero definitely helped, as Dave went on to mention on Twitter a few days later.

Dave Akerman The MagPi Raspberry Pi Community Profile

Dave is recognised as being the first person to incorporate a Raspberry Pi into a HAB payload, and continues to break records with the help of the little green board. More recently, he’s been able to lighten the load by using the Raspberry Pi Zero.

When the first Pi made its way to near space, Dave tore the computer apart in order to meet the weight restriction. The Pi in the Sky board was created to add the extra features needed for the flight. Since then, the HAT has experienced a few changes.

Dave Akerman The MagPi Raspberry Pi Community Profile

The Pi in the Sky board, created specifically for HAB flights.

Dave first fell in love with high-altitude ballooning after coming across the hobby in a video shared on a photographic forum. With a lifelong interest in space thanks to watching the Moon landings as a boy, plus a talent for electronics and photography, it seems a natural progression for him. Throw in his coding skills from learning to program on a Teletype and it’s no wonder he was ready and eager to take to the skies, so to speak, and capture the curvature of the Earth. What was so great about using the Raspberry Pi was the instant gratification he got from receiving images in real time as they were taken during the flight. While other devices could control a camera and store captured images for later retrieval, thanks to the Pi Dave was able to transmit the files back down to Earth and check the progress of his balloon while attempting to break records with a flight.

Dave Akerman The MagPi Raspberry Pi Community Profile Morph

One of the many commercial flights Dave has organised featured the classic children’s TV character Morph, a creation of the Aardman Animations studio known for Wallace and Gromit. Morph took to the sky twice in his mission to reach near space, and finally succeeded in 2016.

High-altitude ballooning isn’t the only part of Dave’s life that incorporates a Raspberry Pi. Having “lost count” of how many Pis he has running tasks, Dave has also created radio receivers for APRS (ham radio data), ADS-B (aircraft tracking), and OGN (gliders), along with a time-lapse camera in his garden, and he has a few more Pi for tinkering purposes.

The post Community profile: Dave Akerman appeared first on Raspberry Pi.

Streaming Joshua v Parker is Illegal But Re-Streaming is the Real Danger

Post Syndicated from Andy original https://torrentfreak.com/streaming-joshua-v-parker-is-illegal-but-re-streaming-is-the-real-danger-180329/

This Saturday evening, Anthony Joshua and Joseph Parker will string up their gloves and do battle in one of the most important heavyweight bouts of recent times.

Joshua will put an unbeaten professional record and his WBA, IBF and IBO world titles on the line. Parker – also unbeaten professionally – will put his WBO belt up for grabs. It’s a mouthwatering proposition for fight fans everywhere.

While the collision will take place at the Principality Stadium in Cardiff in front of a staggering 80,000 people, millions more will watch the fight in front of the TV at home, having paid Sky Sports Box Office up to £24.95 for the privilege.

Of course, hundreds of thousands won’t pay a penny, instead relying on streams delivered via illicit Kodi addons, Android apps, and IPTV services. While these options are often free, quality and availability on the night is far from guaranteed. Even those paying for premium ‘pirate’ access have been let down at the last minute but in the scheme of things, that’s generally unlikely.

Despite the uncertainty, this morning the Police Intellectual Property Crime Unit and Federation Against Copyright Theft took the unusual step of issuing a joint warning to people thinking of streaming the fight to their homes illegally.

“Consumers need to be aware that streaming without the right permissions or subscriptions is no longer a grey area,” PIPCU and FACT said in a statement.

“In April last year the EU Court of Justice ruled that not only was selling devices allowing access to copyrighted content illegal, but using one to stream TV, sports or films without an official subscription is also breaking the law.”

The decision, which came as part of the BREIN v Filmspeler case, found that obtaining a copyright-protected work “from a website belonging to a third party offering that work without the consent of the copyright holder” was an illegal act.

While watching the fight via illicit streams is undoubtedly illegal, tracking people who simply view content is extremely difficult and there hasn’t been a single prosecution in the UK (or indeed anywhere else that we’re aware of) against anyone doing so.

That being said, those who make content available for others to watch illegally are putting themselves at considerable risk. While professional pirate re-streamers tend to have better security, Joe Public who points his phone at his TV Saturday night to stream the fight on Facebook should take time out to consider his actions.

In January, Sky revealed that 34-year-old Craig Foster had been caught by the company after someone re-streamed the previous year’s Anthony Joshua vs Wladimir Klitschko fight on Facebook Live using Foster’s Sky account.

Foster had paid Sky for the fight but he claims that a friend used his iPad to record the screen and re-stream the fight to Facebook. Sky, almost certainly using tracking watermarks (example below), traced the ‘pirate’ stream back to Foster’s set-top box.

Watermarks during the Mayweather v McGregor fight

The end result was a technical knockout for Sky who suspended Foster’s Sky subscription and then agreed not to launch a lawsuit providing he paid the broadcaster £5,000.

“The public should be aware that misusing their TV subscriptions has serious repercussions,” said PIPCU and FACT referring to the case this morning.

“For example, customers found to be illegally sharing paid-for content can have their subscription account terminated immediately and can expect to be prosecuted and fined.”

While we know for certain this has happened at least once, TorrentFreak contacted FACT this morning for details on how many Sky subscribers have been caught, warned, and/or prosecuted by Sky in this manner. FACT told us they don’t have any figures but offered the following statement from CEO Kieron Sharp.

“Not only is FACT working closely with broadcasters and rights owners to identify the original source of illegally re-streamed content, but with support from law enforcement, government and social media platforms, we are tightening the net on digital piracy,” Sharp said.

Finally, it’s also worth keeping in mind that even when people live-stream an illegal yet non-watermarked stream to Facebook, they can still be traced by Sky.

As revelations this week have shown only too clearly, Facebook knows a staggering amount about its users so tracking an illegal stream back to a person would be child’s play for a determined rightsholder with a court order.

While someone attracting a couple of dozen viewers might not be at a major risk of repercussions, a viral stream might require the use of a calculator to assess the damages claimed by Sky. Like boxing, this kind of piracy is best left to the professionals to avoid painful and unnecessary trauma.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

[$] Recent improvements to Tor

Post Syndicated from jake original https://lwn.net/Articles/750312/rss

We may need Tor, “the onion router”,
more than we ever imagined. Authoritarian states are blocking more and more web
sites
and snooping
on their populations online
—even routine tracking of our online
activities
can reveal information that can be used to undermine
democracy. Thus, there was strong interest in the “State of the Onion”
panel at the 2018 LibrePlanet conference, where
four contributors to the Tor project presented a progress update covering the
past few years.

Subscribers can read on for a report on the panel by guest author Andy Oram.

Tracing Stolen Bitcoin

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/03/tracing_stolen_.html

Ross Anderson has a really interesting paper on tracing stolen bitcoin. From a blog post:

Previous attempts to track tainted coins had used either the “poison” or the “haircut” method. Suppose I open a new address and pay into it three stolen bitcoin followed by seven freshly-mined ones. Then under poison, the output is ten stolen bitcoin, while under haircut it’s ten bitcoin that are marked 30% stolen. After thousands of blocks, poison tainting will blacklist millions of addresses, while with haircut the taint gets diffused, so neither is very effective at tracking stolen property. Bitcoin due-diligence services supplant haircut taint tracking with AI/ML, but the results are still not satisfactory.

We discovered that, back in 1816, the High Court had to tackle this problem in Clayton’s case, which involved the assets and liabilities of a bank that had gone bust. The court ruled that money must be tracked through accounts on the basis of first-in, first out (FIFO); the first penny into an account goes to satisfy the first withdrawal, and so on.

Ilia Shumailov has written software that applies FIFO tainting to the blockchain and the results are impressive, with a massive improvement in precision. What’s more, FIFO taint tracking is lossless, unlike haircut; so in addition to tracking a stolen coin forward to find where it’s gone, you can start with any UTXO and trace it backwards to see its entire ancestry. It’s not just good law; it’s good computer science too.

Tracking Cookies and GDPR

Post Syndicated from Bozho original https://techblog.bozho.net/tracking-cookies-gdpr/

GDPR is the new data protection regulation, as you probably already know. I’ve given a detailed practical advice for what it means for developers (and product owners). However, there’s one thing missing there – cookies. The elephant in the room.

Previously I’ve stated that cookies are subject to another piece of legislation – the ePrivacy directive, which is getting updated and its new version will be in force a few years from now. And while that’s technically correct, cookies seem to be affected by GDPR as well. In a way I’ve underestimated that effect.

When you do a Google search on “GDPR cookies”, you’ll pretty quickly realize that a) there’s not too much information and b) there’s not much technical understanding of the issue.

What appears to be the consensus is that GDPR does change the way cookies are handled. More specifically – tracking cookies. Here’s recital 30:

(30) Natural persons may be associated with online identifiers provided by their devices, applications, tools and protocols, such as internet protocol addresses, cookie identifiers or other identifiers such as radio frequency identification tags. This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them.

How tracking cookies work – a 3rd party (usually an ad network) gives you a code snippet that you place on your website, for example to display ads. That code snippet, however, calls “home” (makes a request to the 3rd party domain). If the 3rd party has previously been used on your computer, it has created a cookie. In the example of Facebook, they have the cookie with your Facebook identifier because you’ve logged in to Facebook. So this cookie (with your identifier) is sent with the request. The request also contains all the details from the page. In effect, you are uniquely identified by an identifier (in the case of Facebook and Google – fully identified, rather than some random anonymous identifier as with other ad networks).

Your behaviour on the website is personal data. It gets associated with your identifier, which in turn is associated with your profile. And all of that is personal data. Who is responsible for collecting the website behaviour data, i.e. who is the “controller”? Is it Facebook (or any other 3rd party) that technically does the collection? No, it’s the website owner, as the behaviour data is obtained on their website, and they have put the tracking piece of code there. So they bear responsibility.

What’s the responsibility? So far it boiled down to displaying the useless “we use cookies” warning that nobody cares about. And the current (old) ePrivacy directive and its interpretations says that this is enough – if the users actions can unambiguously mean that they are fine with cookies – i.e. if they continue to use the website after seeing the warning – then you’re fine. This is no longer true from a GDPR perspective – you are collecting user data and you have to have a lawful ground for processing.

For the data collected by tracking cookies you have two options – “consent” and “legitimate interest”. Legitimate interest will be hard to prove – it is not something that a user reasonably expects, it is not necessary for you to provide the service. If your lawyers can get that option to fly, good for them, but I’m not convinced regulators will be happy with that.

The other option is “consent”. You have to ask your users explicitly – that means “with a checkbox” – to let you use tracking cookies. That has two serious implications – from technical and usability point of view.

  • The technical issue is that the data is sent via 3rd party code as soon as the page loads and before the user can give their consent. And that’s already a violation. You can, of course, have the 3rd party code be dynamically inserted only after the user gives consent, but that will require some fiddling with javascript and might not always work depending on the provider. And you’d have to support opt-out at any time (which would in turn disable the 3rd party snippet). It would require actual coding, rather than just copy-pasting a snippet.
  • The usability aspect is the bigger issue – while you could neatly tuck a cookie warning at the bottom, you’d now have to have a serious, “stop the world” popup that asks for consent if you want anyone to click it. You can, of course, just add a checkbox to the existing cookie warning, but don’t expect anyone to click it.

These aspects pose a significant questions: is it worth it to have tracking cookies? Is developing new functionality worth it, is interrupting the user worth it, and is implementing new functionality just so that users never clicks a hidden checkbox worth it? Especially given that Firefox now blocks all tracking cookies and possibly other browsers will follow?

That by itself is an interesting topic – Firefox has basically implemented the most strict form of requirements of the upcoming ePrivacy directive update (that would turn it into an ePrivacy regulation). Other browsers will have to follow, even though Google may not be happy to block their own tracking cookies. I hope other browsers follow Firefox in tracking protection and the issue will be gone automatically.

To me it seems that it will be increasingly not worthy to have tracking cookies on your website. They add regulatory obligations for you and give you very little benefit (yes, you could track engagement from ads, but you can do that in other ways, arguably by less additional code than supporting the cookie consents). And yes, the cookie consent will be “outsourced” to browsers after the ePrivacy regulation is passed, but we can’t be sure at the moment whether there won’t be technical whack-a-mole between browsers and advertisers and whether you wouldn’t still need additional effort to have dynamic consent for tracking cookies. (For example there are reported issues that Firefox used to make Facebook login fail if tracking protection is enabled. Which could be a simple bug, or could become a strategy by big vendors in the future to force browsers into a less strict tracking protection).

Okay, we’ve decided it’s not worth it managing tracking cookies. But do you have a choice as a website owner? Can you stop your ad network from using them? (Remember – you are liable if users’ data is collected by visiting your website). And currently the answer is no – you can’t disable that. You can’t have “just the ads”. This is part of the “deal” – you get money for the ads you place, but you participate in a big “surveillance” network. Users have a way to opt out (e.g. Google AdWords gives them that option). You, as a website owner, don’t.

Facebook has a recommendations page that says “you take care of getting the consent”. But for example the “like button” plugin doesn’t have an option to not send any data to Facebook.

And sometimes you don’t want to serve ads, just track user behaviour and measure conversion. But even if you ask for consent for that and conditionally insert the plugin/snippet, do you actually know what data it sends? And what it’s used for? Because you have to know in order to inform your users. “Do you agree to use tracking cookies that Facebook has inserted in order to collect data about your behaviour on our website” doesn’t sound compelling.

So, what to do? The easiest thing is just not to use any 3rd party ad-related plugins. But that’s obviously not an option, as ad revenue is important, especially in the publishing industry. I don’t have a good answer, apart from “Regulators should pressure ad networks to provide opt-outs and clearly document their data usage”. They have to do that under GDPR, and while website owners are responsible for their users’ data, the ad networks that are in the role of processors in this case (as you delegate the data collection for your visitors to them) also have obligation to assist you in fulfilling your obligations. So ask Facebook – what should I do with your tracking cookies? And when the regulator comes after a privacy-aware customer files a complaint, you could prove that you’ve tried.

The ethical debate whether it’s wrong to collect data about peoples’ behaviour without their informed consent is an easy one. And that’s why I don’t put blame on the regulators – they are putting the ethical consensus in law. It gets more complicated if not allowing tracking means some internet services are no longer profitable and therefore can’t exist. Can we have the cake and eat it too?

The post Tracking Cookies and GDPR appeared first on Bozho's tech blog.

Introducing the B2 Snapshot Return Refund Program

Post Syndicated from Ahin Thomas original https://www.backblaze.com/blog/b2-snapshot-return-refund-program/

B2 Snapshot Return Refund Program

What Is the B2 Snapshot Return Refund Program?

Backblaze’s mission is making cloud storage astonishingly easy and affordable. That guides our focus — making our customers’ data more usable. Today, we’re pleased to introduce a trial of the B2 Snapshot Return Refund program. B2 customers have long been able to create a Snapshot of their data and order a hard drive with that data sent via FedEx anywhere in the world. Starting today, if the customer sends the drive back to Backblaze within 30 days, they will get a full refund. This new feature is available automatically for B2 customers when they order a Snapshot. There are no extra buttons to push or boxes to check — just send back the drive within 30 days and we’ll refund your money. To put it simply, we are offering the cloud storage industry’s only refundable rapid data egress service.

You Shouldn’t be Afraid to Use Your Own Data

Last week, we cut the price of B2 downloads in half — from 2¢ per GB to 1¢ per GB. That 50% reduction makes B2’s download price 1/5 that of Amazon’s S3 (with B2 storage pricing already 1/4 that of S3). The price reduction and today’s introduction of the B2 Snapshot Return Refund program are deliberate moves to eliminate the industry’s biggest barrier to entry — the cost of using data stored in the cloud.  Storage vendors who make it expensive to restore, or place time lag impediments to access, are reducing the usefulness of your data. We believe this is antithetical to encouraging the use of the cloud in the first place.

Learning From Our Customers

Our Computer Backup product already has a Restore Return Refund program. It’s incredibly popular, and we enjoy the almost daily “you just saved my bacon” letters that come back with the returned hard drives. Our customer surveys have repeatedly demonstrated that the ability to get data back is one of the things that has made our Computer Backup service one of the most popular in the industry. So, it made sense to us that our B2 customers could use a similar program.

There are many ways B2 customers can benefit from using the B2 Snapshot Return Refund program, here is a typical scenario.

Media and Entertainment Workflow Based Snapshots

Businesses in the Media and Entertainment (M&E) industry tend to have large quantities of digital media, and the amount of data will continue to increase in the coming years with more 4K and 8K cameras coming into regular use. When an organization needs to deliver or share that data, they typically have to manually download data from their internal storage system, and copy it on a thumb drive or hard drive, or perhaps create an LTO tape. Once that is done, they take their storage device, label it, and mail to their customer. Not only is this practice costly, time consuming, and potentially insecure, it doesn’t scale well with larger amounts of data.

With just a few clicks, you can easily distribute or share your digital media if it stored in the B2 Cloud. Here’s how the process works:

  1. Log in to your Backblaze B2 account.
  2. Navigate to the bucket where the data is located.
  3. Select the files, or the entire bucket, you wish to send and create a “Snapshot.”
  4. Once the Snapshot is complete you have choices:
    • Download the Snapshot and pay $0.01/GB for the download
    • Have Backblaze copy the Snapshot to an external hard drive and FedEx it anywhere in the world. This stores up to 3.5 TB and costs $189.00. Return the hard drive to Backblaze within 30 days and you’ll get your $189.00 back.
    • Have Backblaze copy the Snapshot to a flash drive and FedEx it anywhere in the world. This stores up to 110 GB and costs $99.00. FedEx shipping to the specified location is included. Return the flash drive to Backblaze within 30 days and you’ll get your $99.00 back.

You can always keep the hard drive or flash drive and Backblaze, of course, will keep your money.

Each drive containing a Snapshot is encrypted. The encryption key can be found in your Backblaze B2 account after you log in. The FedEX tracking number is there as well. When the hard drive arrives at its destination you can provide the encryption key to the recipient and they’ll be able to access the files. Note that the encryption key must be entered each time the hard drive is started, so the data remains protected even if the hard drive is returned to Backblaze.

The B2 Snapshot Return Refund program supports Snapshots as large as 3.5 terabytes. That means you can send about 50 hours of 4k video to a client or partner by selecting the hard drive option. If you select the flash drive option, a Snapshot can be up to 110 gigabytes, which is about 1hr and 45 min of 4k video.

While the example uses an M&E workflow, any workflow requiring the exchange or distribution of large amounts of data across distinct geographies will benefit from this service.

This is a Trial Program

Backblaze fully intends to offer the B2 Snapshot Return Refund Program for a long time. That said, there is no program like this in the industry and so we want to put some guardrails on it to ensure we can offer a sustainable program for all. Thus, the “fine print”:

  • Minimum Snapshot Size — a Snapshot must be greater than 10 GB to qualify for this program. Why? You can download a 10 GB Snapshot in a few minutes. Why pay us to do the same thing and have it take a couple of days??
  • The 30 Day Clock — The clock starts on the day the drive is marked as delivered to you by FedEx and the clock ends on the date postmarked on the package we receive. If that’s 30 days or less, your refund will be granted.
  • 5 Drive Refunds Per Year — We are initially setting a limit of 5 drive refunds per B2 account per year. By placing a cap on the number of drive refunds per year, we are able to provide a service that is responsive to our entire client base. We expect to change or remove this limit once we have enough data to understand the demand and can make sure we are staffed properly.

It is Your Data — Use It

Our industry has a habit of charging little to store data and then usurious amounts to get it back. There are certainly real costs involved in data retrieval. We outlined them in our post on the Cost of Cloud Storage. The industry rates charged for data retrieval are clearly strategic moves to try and lock customers in. To us, that runs counter to trying to do our part to make data useful and our customers’ lives easier. That viewpoint drives our efforts behind lowering our download pricing and the creation of this program.

We hope you enjoy the B2 Snapshot Return Refund program. If you have a moment, please tell us in the comments below how you might use it!

The post Introducing the B2 Snapshot Return Refund Program appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Wanted: Office Administrator

Post Syndicated from Yev original https://www.backblaze.com/blog/wanted-office-administrator-2/

At inception, Backblaze was a consumer company. Thousands upon thousands of individuals came to our website and gave us $5/mo to keep their data safe. But, we didn’t sell business solutions. It took us years before we had a sales team. In the last couple of years, we’ve released products that businesses of all sizes love: Backblaze B2 Cloud Storage and Backblaze for Business Computer Backup. Those businesses want to integrate Backblaze into their infrastructure, so it’s time to expand our teams!

Company Description:
Founded in 2007, Backblaze started with a mission to make backup software elegant and provide complete peace of mind. Over the course of almost a decade, we have become a pioneer in robust, scalable low cost cloud backup. Recently, we launched B2 – robust and reliable object storage at just $0.005/gb/mo. Part of our differentiation is being able to offer the lowest price of any of the big players while still being profitable.

We’ve managed to nurture a team oriented culture with amazingly low turnover. We value our people and their families. Don’t forget to check out our “About Us” page to learn more about the people and some of our perks.

We have built a profitable, high growth business. While we love our investors, we have maintained control over the business. That means our corporate goals are simple – grow sustainably and profitably.

Some Backblaze Perks:

  • Competitive healthcare plans
  • Competitive compensation and 401k
  • All employees receive Option grants
  • Unlimited vacation days
  • Strong coffee
  • Fully stocked Micro kitchen
  • Catered breakfast and lunches
  • Awesome people who work on awesome projects
  • New Parent Childcare bonus
  • Normal work hours
  • Get to bring your pets into the office
  • San Mateo Office – located near Caltrain and Highways 101 & 280.

Want to know what you’ll be doing?

You will play a pivotal role at Backblaze! You will be the glue that binds people together in the office and one of the main engines that keeps our company running. This is an exciting opportunity to help shape the company culture of Backblaze by making the office a fun and welcoming place to work. As an Office Administrator, your priority is to help employees have what they need to feel happy, comfortable, and productive at work; whether it’s refilling snacks, collecting shipments, responding to maintenance requests, ordering office supplies, or assisting with fun social events, your contributions will be critical to our culture.

Office Administrator Responsibilities:

  • Maintain a clean, well-stocked and organized office
  • Greet visitors and callers, route and resolve information requests
  • Ensure conference rooms and kitchen areas are clean and stocked
  • Sign for all packages delivered to the office as well as forward relevant departments
  • Administrative duties as assigned

Facilities Coordinator Responsibilities:

  • Act as point of contact for building facilities and other office vendors and deliveries
  • Work with HR to ensure new hires are welcomed successfully at Backblaze – to include desk/equipment orders, seat planning, and general facilities preparation
  • Work with the “Fun Committee” to support office events and activities
  • Be available after hours as required for ongoing business success (events, building issues)

Jr. Buyer Responsibilities:

  • Assist with creating purchase orders and buying equipment
  • Compare costs and maintain vendor cards in Quickbooks
  • Assist with booking travel, hotel accommodations, and conference rooms
  • Maintain accurate records of purchases and tracking orders
  • Maintain office equipment, physical space, and maintenance schedules
  • Manage company calendar, snack, and meal orders

Qualifications:

  • 1 year experience in an Inventory/Shipping/Receiving/Admin role preferred
  • Proficiency with Microsoft Office applications, Google Apps, Quickbooks, Excel
  • Experience and skill at adhering to a budget
  • High attention to detail
  • Proven ability to prioritize within a multi-tasking environment; highly organized
  • Collaborative and communicative
  • Hands-on, “can do” attitude
  • Personable and approachable
  • Able to lift up to 50 lbs
  • Strong data entry

This position is located in San Mateo, California. Backblaze is an Equal Opportunity Employer.

If this all sounds like you:

  1. Send an email to [email protected] with the position in the subject line.
  2. Tell us a bit about your work history.
  3. Include your resume.

The post Wanted: Office Administrator appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Tamilrockers Arrests: Police Parade Alleged Movie Pirates on TV

Post Syndicated from Andy original https://torrentfreak.com/tamilrockers-arrests-police-parade-alleged-movie-pirates-on-tv-180315/

Just two years ago around 277 million people used the Internet in India. Today there are estimates as high as 355 million and with a population of more than 1.3 billion, India has plenty of growth yet to come.

Also evident is that in addition to a thirst for hard work, many Internet-enabled Indians have developed a taste for Internet piracy. While the US and Europe were the most likely bases for pirate site operators between 2000 and 2015, India now appears in a growing number of cases, from torrent and streaming platforms to movie release groups.

One site that is clearly Indian-focused is the ever-popular Tamilrockers. The movie has laughed in the face of the authorities for a number of years, skipping from domain to domain as efforts to block the site descend into a chaotic game of whack-a-mole. Like The Pirate Bay, Tamilrockers has burned through plenty of domains including tamilrockers.in, tamilrockers.ac, tamilrockers.me, tamilrockers.co, tamilrockers.is, tamilrockers.us and tamilrockers.ro.

Now, however, the authorities are claiming a significant victory against the so-far elusive operators of the site. The anti-piracy cell of the Kerala police announced last evening that they’ve arrested five men said to be behind both Tamilrockers and alleged sister site, DVDRockers.

They’re named as alleged Tamilrockers owner ‘Prabhu’, plus ‘Karthi’ and ‘Suresh’ (all aged 24), plus alleged DVD Rockers owner ‘Johnson’ and ‘Jagan’ (elsewhere reported as ‘Maria John’). The men were said to be generating between US$1,500 and US$3,000 each per month. The average salary in India is around $600 per annum.

While details of how the suspects were caught tend to come later in US and European cases, the Indian authorities are more forthright. According to Anti-Piracy Cell Superintendent B.K. Prasanthan, who headed the team that apprehended the men, it was a trail of advertising revenue crumbs that led them to the suspects.

Prasanthan revealed that it was an email, sent by a Haryana-based ad company to an individual who was arrested in 2016 in a similar case, that helped in tracking the members of Tamilrockers.

“This ad company had sent a mail to [the individual], offering to publish ads on the website he was running. In that email, the company happened to mention that they have ties with Tamilrockers. We got the information about Tamilrockers through this ad company,” Prasanthan said.

That information included the bank account details of the suspects.

Given the technical nature of the sites, it’s perhaps no surprise that the suspects are qualified in the IT field. Prasanthan revealed that all had done well.

“All the gang members were technically qualified. It even included MSc and BSc holders in computer science. They used to record movies in pieces from various parts of the world and join [them together]. We are trying to trace more members of the gang including Karthi’s brothers,” Prasanathan said.

All five men were remanded in custody but not before they were paraded in front of the media, footage which later appeared on TV.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

AWS Documentation is Now Open Source and on GitHub

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-documentation-is-now-open-source-and-on-github/

Earlier this year we made the AWS SDK developer guides available as GitHub repos (all found within the awsdocs organization) and invited interested parties to contribute changes and improvements in the form of pull requests.

Today we are adding over 138 additional developer and user guides to the organization, and we are looking forward to receiving your requests. You can fix bugs, improve code samples (or submit new ones), add detail, and rewrite sentences and paragraphs in the interest of accuracy or clarity. You can also look at the commit history in order to learn more about new feature and service launches and to track improvements to the documents.

Making a Contribution
Before you get started, read the Amazon Open Source Code of Conduct and take a look at the Contributing Guidelines document (generally named CONTRIBUTING.md) for the AWS service of interest. Then create a GitHub account if you don’t already have one.

Once you find something to change or improve, visit the HTML version of the document and click on Edit on GitHub button at the top of the page:

This will allow you to edit the document in source form (typically Markdown or reStructuredText). The source code is used to produce the HTML, PDF, and Kindle versions of the documentation.

Once you are in GitHub, click on the pencil icon:

This creates a “fork” — a separate copy of the file that you can edit in isolation.

Next, make an edit. In general, as a new contributor to an open source project, you should gain experience and build your reputation by making small, high-quality edits. I’ll change “dozens of services” to “over one hundred services” in this document:

Then I summarize my change and click Propose file change:

I examine the differences to verify my changes and then click Create pull request:

Then I review the details and click Create pull request again:

The pull request (also known as a PR) makes its way to the Elastic Beanstalk documentation team and they get to decide if they want to accept it, reject it, or to engage in a conversation with me to learn more. The teams endeavor to respond to PRs within 48 hours, and I’ll be notified via GitHub whenever the status of the PR changes.

As is the case with most open source projects, a steady stream of focused, modest-sized pull requests is preferable to the occasional king-sized request with dozens of edits inside.

If I am interested in tracking changes to a repo over time, I can Watch and/or Star it:

If I Watch a repo, I’ll receive an email whenever there’s a new release, issue, or pull request for that service guide.

Go Fork It
This launch gives you another way to help us to improve AWS. Let me know what you think!

Jeff;

A site for reviews of Tumbleweed snapshots

Post Syndicated from corbet original https://lwn.net/Articles/748362/rss

As leading-edge rolling distributions go, OpenSUSE Tumbleweed is relatively
stable, but it is still true that some snapshots are better than others.
Jimmy Berry has announced the creation of a web site tracking
the quality of each day’s snapshot. “By utilizing a variety of
sources of feedback pertaining to snapshots a stability score is
estimated. The goal is to err on the side of caution and to allow users to
avoid troublesome releases.

Attending Mobile World Congress? Check Out Our Connected Car Demo!

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/attending-mobile-world-congress-check-out-our-connected-car-demo/

Are you planning to attend Mobile World Congress 2018 in Barcelona (one of my favorite cities)? If so, please be sure to check out the connected car demo in Hall 5 Booth 5E41.

The AWS Greengrass team has been working on a proof of concept with our friends at Vodafone and Saguna to show you how connected cars can change the automotive industry. The demo is built around the emerging concept of multi-access edge computing, or MEC.

Car manufacturers want to provide advanced digital technology in their vehicles but don’t want to make significant upgrades to the on-board computing resources due to cost, power, and time-to-market considerations, not to mention the issues that arise when attempting to retrofit cars that are already on the road. MEC offloads processing resources to the edge of the mobile network, for instance a hub site in the access network. This model helps car manufacturers to take advantage of low-latency compute resources while building features that can evolve and improve over the lifetime of the vehicle, often 20 years or more. It also reduces the complexity and the cost of the on-board components.

The MWC demo streams a live video feed over Vodafone’s 4G LTE network, with Saguna’s AI-powered MEC solution that leverages AWS Greengrass. The demo focuses on driver safety, with the goal of helping to detect drivers that are distracted by talking to someone or something in the car. With an on-board camera aimed at the driver, backed up by AI-powered movement tracking and pattern detection running at the edge of the mobile network, distractions can be identified and the driver can be alerted. This architecture also allows manufacturers to enhance existing cars since most of the computing is handled at the edge of the mobile network.

If you couldn’t make it to Mobile World Congress, you can also check out the video for this solution, here.

Jeff;