Tag Archives: lua

COPPA Compliance

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/04/coppa_complianc.html

Interesting research: “‘Won’t Somebody Think of the Children?’ Examining COPPA Compliance at Scale“:

Abstract: We present a scalable dynamic analysis framework that allows for the automatic evaluation of the privacy behaviors of Android apps. We use our system to analyze mobile apps’ compliance with the Children’s Online Privacy Protection Act (COPPA), one of the few stringent privacy laws in the U.S. Based on our automated analysis of 5,855 of the most popular free children’s apps, we found that a majority are potentially in violation of COPPA, mainly due to their use of third-party SDKs. While many of these SDKs offer configuration options to respect COPPA by disabling tracking and behavioral advertising, our data suggest that a majority of apps either do not make use of these options or incorrectly propagate them across mediation SDKs. Worse, we observed that 19% of children’s apps collect identifiers or other personally identifiable information (PII) via SDKs whose terms of service outright prohibit their use in child-directed apps. Finally, we show that efforts by Google to limit tracking through the use of a resettable advertising ID have had little success: of the 3,454 apps that share the resettable ID with advertisers, 66% transmit other, non-resettable, persistent identifiers as well, negating any intended privacy-preserving properties of the advertising ID.

Piracy Falls 6%, in Spain, But It’s Still a Multi-Billion Euro Problem

Post Syndicated from Andy original https://torrentfreak.com/piracy-falls-6-in-spain-but-its-still-a-multi-billion-euro-problem-180409/

The Coalition of Creators and Content Industries, which represents Spain’s leading entertainment industry companies, is keeping a close eye on the local piracy landscape.

The outfit has just published its latest Piracy Observatory and Digital Content Consumption Habits report, carried out by the independent consultant GFK, and there is good news to report on headline piracy figures.

During 2017, the report estimates that people accessed unlicensed digital content just over four billion times, which equates to almost 21.9 billion euros in lost revenues. While this is a significant number, it’s a decrease of 6% compared to 2016 and an accumulated decrease of 9% compared to 2015, the coalition reports.

Overall, movies are most popular with pirates, with 34% helping themselves to content without paying.

“The volume of films accessed illegally during 2017 was 726 million, with a market value of 5.7 billion euros, compared to 6.9 billion in 2016. 35% of accesses happened while the film was still on screens in cinema theaters, while this percentage was 33% in 2016,” the report notes.

TV shows are in a close second position with 30% of users gobbling up 945 million episodes illegally during 2017. A surprisingly high 24% of users went for eBooks, with music relegated to fourth place with ‘just’ 22%, followed by videogames (11%) and football (10%).

The reasons given by pirates for their habits are both varied and familiar. 51% said that original content is too expensive while 43% said that taking the illegal route “is fast and easy”. Half of the pirates said that simply paying for an internet connection was justification for getting content for free.

A quarter of all pirates believe that they aren’t doing anyone any harm, with the same number saying they get content without paying because there are no consequences for doing so. But it isn’t just pirates themselves in the firing line.

Perhaps unsurprisingly given the current climate, the report heavily criticizes search engines for facilitating access to infringing content.

“With 75%, search engines are the main method of accessing illegal content and Google is used for nine out of ten accesses to pirate content,” the report reads.

“Regarding social networks, Facebook is the most used method of access (83%), followed by Twitter (42%) and Instagram (34%). Therefore it is most valuable that Facebook has reached agreements with different industries to become a legal source and to regulate access to content.”

Once on pirate sites, some consumers reported difficulties in determining whether they’re legal or not. Around 15% said that they had “big difficulties” telling whether a site is authorized with 44% saying they had problems “sometimes”.

That being said, given the amount of advertising on pirate sites, it’s no surprise that most knew a pirate site when they visited one and, according to the report, advertising placement is only on the up.

Just over a quarter of advertising appearing on pirate sites features well-known brands, although this is a reduction from more than 37% in 2016. This needs to be further improved, the coalition says, via collaboration between all parties involved in the industry.

A curious claim from the report is that 81% of pirate site users said they were required to register in order to use a platform. This resulted in “transferring personal data” to pirate site operators who gather it in databases that are used for profitable “e-marketing campaigns”.

“Pirate sites also get much more valuable data than one could imagine which allow them to get important economic benefits, as for example, Internet surfing habits, other websites visited by consumers, preferences, likes, and purchase habits,” the report states.

So what can be done to reduce consumer reliance on pirate sites? The report finds that consumers are largely in line with how the entertainment industries believe piracy should or could be tackled.

“The most efficient measures against piracy would be, according to the internet users’ own view, blocking access to the website offering content (78%) and penalizing internet providers (73%),” the report reads.

“Following these two, the best measure to reduce infringements would be, according to consumers, to promote social awareness campaigns against piracy (61%). This suggests that increased collaboration between the content sector and the ISPs (Internet Service Providers) could count on consumers’ support and positive assessment.”

Finally, consumers in Spain are familiar with the legal options, should they wish to take that route in future. Netflix awareness in the country is at 91%, Spotify at 81%, with Movistar+ and HBO at 80% and 68% respectively.

“This invalidates the reasons given by pirate users who said they did so because of the lack of an accessible legal offer at affordable prices,” the report adds.

However, those who take the plunge into the legal world don’t always kick the pirate habit, with the paper stating that users of pirates sites tend to carry on pirating, although they do pirate less in some sectors, notably music. The study also departs from findings in other regions that pirates can also be avid consumers of legitimate content.

Several reports, from the UK, Sweden, Australia, and even from Hollywood, have clearly indicated that pirates are the entertainment industries’ best customers.

In Spain, however, the situation appears to be much more pessimistic, with only 8% of people who access illegal digital content paying for legal content too. That seems low given that Netflix alone had more than a million Spanish subscribers at the end of 2017 and six million Spanish households currently subscribe to other pay TV services.

The report is available here (Spanish, pdf)

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Facebook and Cambridge Analytica

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/03/facebook_and_ca.html

In the wake of the Cambridge Analytica scandal, news articles and commentators have focused on what Facebook knows about us. A lot, it turns out. It collects data from our posts, our likes, our photos, things we type and delete without posting, and things we do while not on Facebook and even when we’re offline. It buys data about us from others. And it can infer even more: our sexual orientation, political beliefs, relationship status, drug use, and other personality traits — even if we didn’t take the personality test that Cambridge Analytica developed.

But for every article about Facebook’s creepy stalker behavior, thousands of other companies are breathing a collective sigh of relief that it’s Facebook and not them in the spotlight. Because while Facebook is one of the biggest players in this space, there are thousands of other companies that spy on and manipulate us for profit.

Harvard Business School professor Shoshana Zuboff calls it “surveillance capitalism.” And as creepy as Facebook is turning out to be, the entire industry is far creepier. It has existed in secret far too long, and it’s up to lawmakers to force these companies into the public spotlight, where we can all decide if this is how we want society to operate and — if not — what to do about it.

There are 2,500 to 4,000 data brokers in the United States whose business is buying and selling our personal data. Last year, Equifax was in the news when hackers stole personal information on 150 million people, including Social Security numbers, birth dates, addresses, and driver’s license numbers.

You certainly didn’t give it permission to collect any of that information. Equifax is one of those thousands of data brokers, most of them you’ve never heard of, selling your personal information without your knowledge or consent to pretty much anyone who will pay for it.

Surveillance capitalism takes this one step further. Companies like Facebook and Google offer you free services in exchange for your data. Google’s surveillance isn’t in the news, but it’s startlingly intimate. We never lie to our search engines. Our interests and curiosities, hopes and fears, desires and sexual proclivities, are all collected and saved. Add to that the websites we visit that Google tracks through its advertising network, our Gmail accounts, our movements via Google Maps, and what it can collect from our smartphones.

That phone is probably the most intimate surveillance device ever invented. It tracks our location continuously, so it knows where we live, where we work, and where we spend our time. It’s the first and last thing we check in a day, so it knows when we wake up and when we go to sleep. We all have one, so it knows who we sleep with. Uber used just some of that information to detect one-night stands; your smartphone provider and any app you allow to collect location data knows a lot more.

Surveillance capitalism drives much of the internet. It’s behind most of the “free” services, and many of the paid ones as well. Its goal is psychological manipulation, in the form of personalized advertising to persuade you to buy something or do something, like vote for a candidate. And while the individualized profile-driven manipulation exposed by Cambridge Analytica feels abhorrent, it’s really no different from what every company wants in the end. This is why all your personal information is collected, and this is why it is so valuable. Companies that can understand it can use it against you.

None of this is new. The media has been reporting on surveillance capitalism for years. In 2015, I wrote a book about it. Back in 2010, the Wall Street Journal published an award-winning two-year series about how people are tracked both online and offline, titled “What They Know.”

Surveillance capitalism is deeply embedded in our increasingly computerized society, and if the extent of it came to light there would be broad demands for limits and regulation. But because this industry can largely operate in secret, only occasionally exposed after a data breach or investigative report, we remain mostly ignorant of its reach.

This might change soon. In 2016, the European Union passed the comprehensive General Data Protection Regulation, or GDPR. The details of the law are far too complex to explain here, but some of the things it mandates are that personal data of EU citizens can only be collected and saved for “specific, explicit, and legitimate purposes,” and only with explicit consent of the user. Consent can’t be buried in the terms and conditions, nor can it be assumed unless the user opts in. This law will take effect in May, and companies worldwide are bracing for its enforcement.

Because pretty much all surveillance capitalism companies collect data on Europeans, this will expose the industry like nothing else. Here’s just one example. In preparation for this law, PayPal quietly published a list of over 600 companies it might share your personal data with. What will it be like when every company has to publish this sort of information, and explicitly explain how it’s using your personal data? We’re about to find out.

In the wake of this scandal, even Mark Zuckerberg said that his industry probably should be regulated, although he’s certainly not wishing for the sorts of comprehensive regulation the GDPR is bringing to Europe.

He’s right. Surveillance capitalism has operated without constraints for far too long. And advances in both big data analysis and artificial intelligence will make tomorrow’s applications far creepier than today’s. Regulation is the only answer.

The first step to any regulation is transparency. Who has our data? Is it accurate? What are they doing with it? Who are they selling it to? How are they securing it? Can we delete it? I don’t see any hope of Congress passing a GDPR-like data protection law anytime soon, but it’s not too far-fetched to demand laws requiring these companies to be more transparent in what they’re doing.

One of the responses to the Cambridge Analytica scandal is that people are deleting their Facebook accounts. It’s hard to do right, and doesn’t do anything about the data that Facebook collects about people who don’t use Facebook. But it’s a start. The market can put pressure on these companies to reduce their spying on us, but it can only do that if we force the industry out of its secret shadows.

This essay previously appeared on CNN.com.

EDITED TO ADD (4/2): Slashdot thread.

Newly released guide provides Australian public sector the ability to evaluate AWS at PROTECTED level

Post Syndicated from Oliver Bell original https://aws.amazon.com/blogs/security/newly-released-guide-provides-australian-public-sector-the-ability-to-evaluate-aws-at-protected-level/

Australian public sector customers now have a clear roadmap to use our secure services for sensitive workloads at the PROTECTED level. For the first time, we’ve released our Information Security Registered Assessors Program (IRAP) PROTECTED documentation via AWS Artifact. This information provides the ability to plan, architect, and self-assess systems built in AWS under the Digital Transformation Agency’s Secure Cloud Guidelines.

In short, this documentation gives public sector customers everything needed to evaluate AWS at the PROTECTED level. And we’re making this resource available to download on-demand through AWS Artifact. When you download the guide, you’ll find a mapping of how AWS meets each requirement to securely and compliantly process PROTECTED data.

With the AWS IRAP PROTECTED documentation, the process of adopting our secure services has never been easier. The information enables individual agencies to complete their own assessments and adopt AWS, but we also continue to work with the Australian Signals Directorate to include our services at the PROTECTED level on the Certified Cloud Services List.

Meanwhile, we’re also excited to announce that there are now 46 services in scope, which mean more options to build secure and innovative solutions, while also saving money and gaining the productivity of the cloud.

If you have questions about this announcement or would like to inquire about how to use AWS for your regulated workloads, contact your account team.

Adding Backdoors at the Chip Level

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/03/adding_backdoor.html

Interesting research into undetectably adding backdoors into computer chips during manufacture: “Stealthy dopant-level hardware Trojans: extended version,” also available here:

Abstract: In recent years, hardware Trojans have drawn the attention of governments and industry as well as the scientific community. One of the main concerns is that integrated circuits, e.g., for military or critical-infrastructure applications, could be maliciously manipulated during the manufacturing process, which often takes place abroad. However, since there have been no reported hardware Trojans in practice yet, little is known about how such a Trojan would look like and how difficult it would be in practice to implement one. In this paper we propose an extremely stealthy approach for implementing hardware Trojans below the gate level, and we evaluate their impact on the security of the target device. Instead of adding additional circuitry to the target design, we insert our hardware Trojans by changing the dopant polarity of existing transistors. Since the modified circuit appears legitimate on all wiring layers (including all metal and polysilicon), our family of Trojans is resistant to most detection techniques, including fine-grain optical inspection and checking against “golden chips”. We demonstrate the effectiveness of our approach by inserting Trojans into two designs — a digital post-processing derived from Intel’s cryptographically secure RNG design used in the Ivy Bridge processors and a side-channel resistant SBox implementation­ — and by exploring their detectability and their effects on security.

The moral is that this kind of technique is very difficult to detect.

Repeat Infringer Policy Doesn’t Have to Be Spelled Out, Appeals Court Rules

Post Syndicated from Ernesto original https://torrentfreak.com/repeat-infringer-policy-doesnt-have-to-be-spelled-out-appeals-court-rules-180324/

The “repeat infringer” issue is a hot topic in US Courts, leading to much uncertainty among various Internet services.

Under the DMCA, companies are required to implement a reasonable policy to deal with frequent offenders.

This not only applies to commercial Internet providers, as Cox found out the hard way, but also to websites that host user-uploaded content, such as video and image hosting services.

Last week the United States Court of Appeals for the Ninth Circuit issued an order that provides some further clarification on how a repeat infringer policy should be documented.

The case in question was filed by adult content producer Ventura Content, which accused the adult-themed site Motherless of copyright infringement. While Motherless relies on user-uploaded content, the adult producer argued that it is liable for pirated content on its site.

In a majority ruling, the Court found that Motherless did not know about the alleged infringements before the lawsuit was filed and removed them within a day of being properly alerted.

This means that the site is entitled to safe harbor protection if it implemented a reasonable repeat infringer policy, which brings us to the crux of the case.

The operator of the site, Joshua Lange, is the sole employee who single-handedly deals with all takedown requests. The site also has a page informing users that there is a repeat infringer policy, without providing specific details.

The adult content producer argued that the site had failed to reasonably implement such a policy, but the Court disagreed, noting that the DMCA doesn’t prescribe a written policy.

“The details of the termination policy are not written down. However, the statute does not say that the policy details must be written, just that the site must inform subscribers of ‘a policy’ of terminating repeat infringers in appropriate circumstances,” the Court states.

In this case, the details of the policy were in the mind of the operator, who made his decisions based on a case-by-case evaluation.

“Lange uses his judgment, not a mechanical test, to terminate infringers based on the volume, history, severity, and intentions behind a user’s infringing content uploads.”

The fact that the details of the policy were not spelled out doesn’t mean that Motherless has no safe harbor protection, although this may be different for large companies.

“A company might need a written policy to tell its employees or independent contractors what to do if there were a significant number of them, but Motherless is not such a firm.

“So the lack of a detailed written policy is not by itself fatal to safe harbor eligibility. Neither is the fact that Motherless did not publicize its internal criteria,” the Court adds.

Surprisingly, the site’s operator didn’t keep any written logs of repeat infringers either. He simply kept track of them in his head and terminated more than a thousand accounts this way. This didn’t work flawlessly, as a few repeat infringers slipped through, but the Court believes it was good enough.

“It is tempting, perhaps, to say that a policy is not ‘reasonably’ implemented if it does not include both a database of users whose uploads have generated DMCA notices and some automated means of catching them if they do it again. But the statute does not require that,” the order reads.

Overall, the Court sides with Motherless and its operator and affirmed the summary judgment in its favor.

This case is unique in many ways. Among other things, it shows that written details or logs are not always required for a “reasonable” repeat infringer policy. While this could be different for large companies, it is likely to be referenced frequently in related cases.

This week, hosting provider Steadfast was quick to use the ruling to argue that it sufficiently adopted and informed users of its repeat infringer policy.

—-

A copy of the Ninth Circuit Court of Appeals ruling is available here (pdf).

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Israeli Security Attacks AMD by Publishing Zero-Day Exploits

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/03/israeli_securit.html

Last week, the Israeli security company CTS Labs published a series of exploits against AMD chips. The publication came with the flashy website, detailed whitepaper, cool vulnerability names — RYZENFALL, MASTERKEY, FALLOUT, and CHIMERA — and logos we’ve come to expect from these sorts of things. What’s new is that the company only gave AMD a day’s notice, which breaks with every norm about responsible disclosure. CTS Labs didn’t release details of the exploits, only high-level descriptions of the vulnerabilities, but it is probably still enough for others to reproduce their results. This is incredibly irresponsible of the company.

Moreover, the vulnerabilities are kind of meh. Nicholas Weaver explains:

In order to use any of the four vulnerabilities, an attacker must already have almost complete control over the machine. For most purposes, if the attacker already has this access, we would generally say they’ve already won. But these days, modern computers at least attempt to protect against a rogue operating system by having separate secure subprocessors. CTS Labs discovered the vulnerabilities when they looked at AMD’s implementation of the secure subprocessor to see if an attacker, having already taken control of the host operating system, could bypass these last lines of defense.

In a “Clarification,” CTS Labs kind of agrees:

The vulnerabilities described in amdflaws.com could give an attacker that has already gained initial foothold into one or more computers in the enterprise a significant advantage against IT and security teams.

The only thing the attacker would need after the initial local compromise is local admin privileges and an affected machine. To clarify misunderstandings — there is no need for physical access, no digital signatures, no additional vulnerability to reflash an unsigned BIOS. Buy a computer from the store, run the exploits as admin — and they will work (on the affected models as described on the site).

The weirdest thing about this story is that CTS Labs describes one of the vulnerabilities, Chimera, as a backdoor. Although it doesn’t t come out and say that this was deliberately planted by someone, it does make the point that the chips were designed in Taiwan. This is an incredible accusation, and honestly needs more evidence before we can evaluate it.

The upshot of all of this is that CTS Labs played this for maximum publicity: over-hyping its results and minimizing AMD’s ability to respond. And it may have an ulterior motive:

But CTS’s website touting AMD’s flaws also contained a disclaimer that threw some shadows on the company’s motives: “Although we have a good faith belief in our analysis and believe it to be objective and unbiased, you are advised that we may have, either directly or indirectly, an economic interest in the performance of the securities of the companies whose products are the subject of our reports,” reads one line. WIRED asked in a follow-up email to CTS whether the company holds any financial positions designed to profit from the release of its AMD research specifically. CTS didn’t respond.

We all need to demand better behavior from security researchers. I know that any publicity is good publicity, but I am pleased to see the stories critical of CTS Labs outnumbering the stories praising it.

EDITED TO ADD (3/21): AMD responds:

AMD’s response today agrees that all four bug families are real and are found in the various components identified by CTS. The company says that it is developing firmware updates for the three PSP flaws. These fixes, to be made available in “coming weeks,” will be installed through system firmware updates. The firmware updates will also mitigate, in some unspecified way, the Chimera issue, with AMD saying that it’s working with ASMedia, the third-party hardware company that developed Promontory for AMD, to develop suitable protections. In its report, CTS wrote that, while one CTS attack vector was a firmware bug (and hence in principle correctable), the other was a hardware flaw. If true, there may be no effective way of solving it.

Response here.

Join the AWS Quest – Help me to Rebuild Ozz!

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/join-the-aws-quest-help-me-to-rebuild-ozz/

If you have been watching my weekly videos, you may have noticed an orange robot in the background from time to time. That’s Ozz, my robot friend and helper. Built from the ground up in my home laboratory, Ozz is an invaluable part of the AWS blogging process!

Sadly, when we announced we are adding the AWS Podcast to the blog, Ozz literally went to pieces and all I have left is a large pile of bricks and some great memories of our time together. From what I can tell, Ozz went haywire over this new development due to excessive enthusiasm!

Ozz, perhaps anticipating that this could happen at some point, buried a set of clues (each pointing to carefully protected plans) in this blog, in the AWS Podcast, and in other parts of the AWS site. If we can find and decode these plans, we can rebuild Ozz, better, stronger, and faster. Unfortunately, due to concerns about the ultra-competitive robot friend market, Ozz concealed each of the plans inside a set of devious, brain-twisting puzzles. You are going to need to look high, low, inside, outside, around, and through the clues in order to figure this one out. You may even need to phone a friend or two.

Your mission, should you choose to accept it, is to find these clues, decode the plans, and help me to rebuild Ozz. The information that I have is a bit fuzzy, but I think there are 20 or so puzzles, each one describing one part of Ozz. If we can solve them all, we’ll get together on Twitch later this month and put Ozz back together.

Are you with me on this? Let’s do it!

Jeff;

Improve the Operational Efficiency of Amazon Elasticsearch Service Domains with Automated Alarms Using Amazon CloudWatch

Post Syndicated from Veronika Megler original https://aws.amazon.com/blogs/big-data/improve-the-operational-efficiency-of-amazon-elasticsearch-service-domains-with-automated-alarms-using-amazon-cloudwatch/

A customer has been successfully creating and running multiple Amazon Elasticsearch Service (Amazon ES) domains to support their business users’ search needs across products, orders, support documentation, and a growing suite of similar needs. The service has become heavily used across the organization.  This led to some domains running at 100% capacity during peak times, while others began to run low on storage space. Because of this increased usage, the technical teams were in danger of missing their service level agreements.  They contacted me for help.

This post shows how you can set up automated alarms to warn when domains need attention.

Solution overview

Amazon ES is a fully managed service that delivers Elasticsearch’s easy-to-use APIs and real-time analytics capabilities along with the availability, scalability, and security that production workloads require.  The service offers built-in integrations with a number of other components and AWS services, enabling customers to go from raw data to actionable insights quickly and securely.

One of these other integrated services is Amazon CloudWatch. CloudWatch is a monitoring service for AWS Cloud resources and the applications that you run on AWS. You can use CloudWatch to collect and track metrics, collect and monitor log files, set alarms, and automatically react to changes in your AWS resources.

CloudWatch collects metrics for Amazon ES. You can use these metrics to monitor the state of your Amazon ES domains, and set alarms to notify you about high utilization of system resources.  For more information, see Amazon Elasticsearch Service Metrics and Dimensions.

While the metrics are automatically collected, the missing piece is how to set alarms on these metrics at appropriate levels for each of your domains. This post includes sample Python code to evaluate the current state of your Amazon ES environment, and to set up alarms according to AWS recommendations and best practices.

There are two components to the sample solution:

  • es-check-cwalarms.py: This Python script checks the CloudWatch alarms that have been set, for all Amazon ES domains in a given account and region.
  • es-create-cwalarms.py: This Python script sets up a set of CloudWatch alarms for a single given domain.

The sample code can also be found in the amazon-es-check-cw-alarms GitHub repo. The scripts are easy to extend or combine, as described in the section “Extensions and Adaptations”.

Assessing the current state

The first script, es-check-cwalarms.py, is used to give an overview of the configurations and alarm settings for all the Amazon ES domains in the given region. The script takes the following parameters:

python es-checkcwalarms.py -h
usage: es-checkcwalarms.py [-h] [-e ESPREFIX] [-n NOTIFY] [-f FREE][-p PROFILE] [-r REGION]
Checks a set of recommended CloudWatch alarms for Amazon Elasticsearch Service domains (optionally, those beginning with a given prefix).
optional arguments:
  -h, --help   		show this help message and exit
  -e ESPREFIX, --esprefix ESPREFIX	Only check Amazon Elasticsearch Service domains that begin with this prefix.
  -n NOTIFY, --notify NOTIFY    List of CloudWatch alarm actions; e.g. ['arn:aws:sns:xxxx']
  -f FREE, --free FREE  Minimum free storage (MB) on which to alarm
  -p PROFILE, --profile PROFILE     IAM profile name to use
  -r REGION, --region REGION       AWS region for the domain. Default: us-east-1

The script first identifies all the domains in the given region (or, optionally, limits them to the subset that begins with a given prefix). It then starts running a set of checks against each one.

The script can be run from the command line or set up as a scheduled Lambda function. For example, for one customer, it was deemed appropriate to regularly run the script to check that alarms were correctly set for all domains. In addition, because configuration changes—cluster size increases to accommodate larger workloads being a common change—might require updates to alarms, this approach allowed the automatic identification of alarms no longer appropriately set as the domain configurations changed.

The output shown below is the output for one domain in my account.

Starting checks for Elasticsearch domain iotfleet , version is 53
Iotfleet Automated snapshot hour (UTC): 0
Iotfleet Instance configuration: 1 instances; type:m3.medium.elasticsearch
Iotfleet Instance storage definition is: 4 GB; free storage calced to: 819.2 MB
iotfleet Desired free storage set to (in MB): 819.2
iotfleet WARNING: Not using VPC Endpoint
iotfleet WARNING: Does not have Zone Awareness enabled
iotfleet WARNING: Instance count is ODD. Best practice is for an even number of data nodes and zone awareness.
iotfleet WARNING: Does not have Dedicated Masters.
iotfleet WARNING: Neither index nor search slow logs are enabled.
iotfleet WARNING: EBS not in use. Using instance storage only.
iotfleet Alarm ok; definition matches. Test-Elasticsearch-iotfleet-ClusterStatus.yellow-Alarm ClusterStatus.yellow
iotfleet Alarm ok; definition matches. Test-Elasticsearch-iotfleet-ClusterStatus.red-Alarm ClusterStatus.red
iotfleet Alarm ok; definition matches. Test-Elasticsearch-iotfleet-CPUUtilization-Alarm CPUUtilization
iotfleet Alarm ok; definition matches. Test-Elasticsearch-iotfleet-JVMMemoryPressure-Alarm JVMMemoryPressure
iotfleet WARNING: Missing alarm!! ('ClusterIndexWritesBlocked', 'Maximum', 60, 5, 'GreaterThanOrEqualToThreshold', 1.0)
iotfleet Alarm ok; definition matches. Test-Elasticsearch-iotfleet-AutomatedSnapshotFailure-Alarm AutomatedSnapshotFailure
iotfleet Alarm: Threshold does not match: Test-Elasticsearch-iotfleet-FreeStorageSpace-Alarm Should be:  819.2 ; is 3000.0

The output messages fall into the following categories:

  • System overview, Informational: The Amazon ES version and configuration, including instance type and number, storage, automated snapshot hour, etc.
  • Free storage: A calculation for the appropriate amount of free storage, based on the recommended 20% of total storage.
  • Warnings: best practices that are not being followed for this domain. (For more about this, read on.)
  • Alarms: An assessment of the CloudWatch alarms currently set for this domain, against a recommended set.

The script contains an array of recommended CloudWatch alarms, based on best practices for these metrics and statistics. Using the array allows alarm parameters (such as free space) to be updated within the code based on current domain statistics and configurations.

For a given domain, the script checks if each alarm has been set. If the alarm is set, it checks whether the values match those in the array esAlarms. In the output above, you can see three different situations being reported:

  • Alarm ok; definition matches. The alarm set for the domain matches the settings in the array.
  • Alarm: Threshold does not match. An alarm exists, but the threshold value at which the alarm is triggered does not match.
  • WARNING: Missing alarm!! The recommended alarm is missing.

All in all, the list above shows that this domain does not have a configuration that adheres to best practices, nor does it have all the recommended alarms.

Setting up alarms

Now that you know that the domains in their current state are missing critical alarms, you can correct the situation.

To demonstrate the script, set up a new domain named “ver”, in us-west-2. Specify 1 node, and a 10-GB EBS disk. Also, create an SNS topic in us-west-2 with a name of “sendnotification”, which sends you an email.

Run the second script, es-create-cwalarms.py, from the command line. This script creates (or updates) the desired CloudWatch alarms for the specified Amazon ES domain, “ver”.

python es-create-cwalarms.py -r us-west-2 -e test -c ver -n "['arn:aws:sns:us-west-2:xxxxxxxxxx:sendnotification']"
EBS enabled: True type: gp2 size (GB): 10 No Iops 10240  total storage (MB)
Desired free storage set to (in MB): 2048.0
Creating  Test-Elasticsearch-ver-ClusterStatus.yellow-Alarm
Creating  Test-Elasticsearch-ver-ClusterStatus.red-Alarm
Creating  Test-Elasticsearch-ver-CPUUtilization-Alarm
Creating  Test-Elasticsearch-ver-JVMMemoryPressure-Alarm
Creating  Test-Elasticsearch-ver-FreeStorageSpace-Alarm
Creating  Test-Elasticsearch-ver-ClusterIndexWritesBlocked-Alarm
Creating  Test-Elasticsearch-ver-AutomatedSnapshotFailure-Alarm
Successfully finished creating alarms!

As with the first script, this script contains an array of recommended CloudWatch alarms, based on best practices for these metrics and statistics. This approach allows you to add or modify alarms based on your use case (more on that below).

After running the script, navigate to Alarms on the CloudWatch console. You can see the set of alarms set up on your domain.

Because the “ver” domain has only a single node, cluster status is yellow, and that alarm is in an “ALARM” state. It’s already sent a notification that the alarm has been triggered.

What to do when an alarm triggers

After alarms are set up, you need to identify the correct action to take for each alarm, which depends on the alarm triggered. For ideas, guidance, and additional pointers to supporting documentation, see Get Started with Amazon Elasticsearch Service: Set CloudWatch Alarms on Key Metrics. For information about common errors and recovery actions to take, see Handling AWS Service Errors.

In most cases, the alarm triggers due to an increased workload. The likely action is to reconfigure the system to handle the increased workload, rather than reducing the incoming workload. Reconfiguring any backend store—a category of systems that includes Elasticsearch—is best performed when the system is quiescent or lightly loaded. Reconfigurations such as setting zone awareness or modifying the disk type cause Amazon ES to enter a “processing” state, potentially disrupting client access.

Other changes, such as increasing the number of data nodes, may cause Elasticsearch to begin moving shards, potentially impacting search performance on these shards while this is happening. These actions should be considered in the context of your production usage. For the same reason I also do not recommend running a script that resets all domains to match best practices.

Avoid the need to reconfigure during heavy workload by setting alarms at a level that allows a considered approach to making the needed changes. For example, if you identify that each weekly peak is increasing, you can reconfigure during a weekly quiet period.

While Elasticsearch can be reconfigured without being quiesced, it is not a best practice to automatically scale it up and down based on usage patterns. Unlike some other AWS services, I recommend against setting a CloudWatch action that automatically reconfigures the system when alarms are triggered.

There are other situations where the planned reconfiguration approach may not work, such as low or zero free disk space causing the domain to reject writes. If the business is dependent on the domain continuing to accept incoming writes and deleting data is not an option, the team may choose to reconfigure immediately.

Extensions and adaptations

You may wish to modify the best practices encoded in the scripts for your own environment or workloads. It’s always better to avoid situations where alerts are generated but routinely ignored. All alerts should trigger a review and one or more actions, either immediately or at a planned date. The following is a list of common situations where you may wish to set different alarms for different domains:

  • Dev/test vs. production
    You may have a different set of configuration rules and alarms for your dev environment configurations than for test. For example, you may require zone awareness and dedicated masters for your production environment, but not for your development domains. Or, you may not have any alarms set in dev. For test environments that mirror your potential peak load, test to ensure that the alarms are appropriately triggered.
  • Differing workloads or SLAs for different domains
    You may have one domain with a requirement for superfast search performance, and another domain with a heavy ingest load that tolerates slower search response. Your reaction to slow response for these two workloads is likely to be different, so perhaps the thresholds for these two domains should be set at a different level. In this case, you might add a “max CPU utilization” alarm at 100% for 1 minute for the fast search domain, while the other domain only triggers an alarm when the average has been higher than 60% for 5 minutes. You might also add a “free space” rule with a higher threshold to reflect the need for more space for the heavy ingest load if there is danger that it could fill the available disk quickly.
  • “Normal” alarms versus “emergency” alarms
    If, for example, free disk space drops to 25% of total capacity, an alarm is triggered that indicates action should be taken as soon as possible, such as cleaning up old indexes or reconfiguring at the next quiet period for this domain. However, if free space drops below a critical level (20% free space), action must be taken immediately in order to prevent Amazon ES from setting the domain to read-only. Similarly, if the “ClusterIndexWritesBlocked” alarm triggers, the domain has already stopped accepting writes, so immediate action is needed. In this case, you may wish to set “laddered” alarms, where one threshold causes an alarm to be triggered to review the current workload for a planned reconfiguration, but a different threshold raises a “DefCon 3” alarm that immediate action is required.

The sample scripts provided here are a starting point, intended for you to adapt to your own environment and needs.

Running the scripts one time can identify how far your current state is from your desired state, and create an initial set of alarms. Regularly re-running these scripts can capture changes in your environment over time and adjusting your alarms for changes in your environment and configurations. One customer has set them up to run nightly, and to automatically create and update alarms to match their preferred settings.

Removing unwanted alarms

Each CloudWatch alarm costs approximately $0.10 per month. You can remove unwanted alarms in the CloudWatch console, under Alarms. If you set up a “ver” domain above, remember to remove it to avoid continuing charges.

Conclusion

Setting CloudWatch alarms appropriately for your Amazon ES domains can help you avoid suboptimal performance and allow you to respond to workload growth or configuration issues well before they become urgent. This post gives you a starting point for doing so. The additional sleep you’ll get knowing you don’t need to be concerned about Elasticsearch domain performance will allow you to focus on building creative solutions for your business and solving problems for your customers.

Enjoy!


Additional Reading

If you found this post useful, be sure to check out Analyzing Amazon Elasticsearch Service Slow Logs Using Amazon CloudWatch Logs Streaming and Kibana and Get Started with Amazon Elasticsearch Service: How Many Shards Do I Need?

 


About the Author

Dr. Veronika Megler is a senior consultant at Amazon Web Services. She works with our customers to implement innovative big data, AI and ML projects, helping them accelerate their time-to-value when using AWS.

 

 

 

HDD vs SSD: What Does the Future for Storage Hold?

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/ssd-vs-hdd-future-of-storage/

SSD 60 TB drive

This is part one of a series. Use the Join button above to receive notification of future posts on this and other topics.

Customers frequently ask us whether and when we plan to move our cloud backup and data storage to SSDs (Solid-State Drives). That’s not a surprising question considering the many advantages SSDs have over magnetic platter type drives, also known as HDDs (Hard-Disk Drives).

We’re a large user of HDDs in our data centers (currently 100,000 hard drives holding over 500 petabytes of data). We want to provide the best performance, reliability, and economy for our cloud backup and cloud storage services, so we continually evaluate which drives to use for operations and in our data centers. While we use SSDs for some applications, which we’ll describe below, there are reasons why HDDs will continue to be the primary drives of choice for us and other cloud providers for the foreseeable future.

HDDs vs SSDs

HDD vs SSD

The laptop computer I am writing this on has a single 512GB SSD, which has become a common feature in higher end laptops. The SSD’s advantages for a laptop are easy to understand: they are smaller than an HDD, faster, quieter, last longer, and are not susceptible to vibration and magnetic fields. They also have much lower latency and access times.

Today’s typical online price for a 2.5” 512GB SSD is $140 to $170. The typical online price for a 3.5” 512 GB HDD is $44 to $65. That’s a pretty significant difference in price, but since the SSD helps make the laptop lighter, enables it to be more resistant to the inevitable shocks and jolts it will experience in daily use, and adds of benefits of faster booting, faster waking from sleep, and faster launching of applications and handling of big files, the extra cost for the SSD in this case is worth it.

Some of these SSD advantages, chiefly speed, also will apply to a desktop computer, so desktops are increasingly outfitted with SSDs, particularly to hold the operating system, applications, and data that is accessed frequently. Replacing a boot drive with an SSD has become a popular upgrade option to breathe new life into a computer, especially one that seems to take forever to boot or is used for notoriously slow-loading applications such as Photoshop.

We covered upgrading your computer with an SSD in our blog post SSD 101: How to Upgrade Your Computer With An SSD.

Data centers are an entirely different kettle of fish. The primary concerns for data center storage are reliability, storage density, and cost. While SSDs are strong in the first two areas, it’s the third where they are not yet competitive. At Backblaze we adopt higher density HDDs as they become available — we’re currently using both 10TB and 12TB drives (among other capacities) in our data centers. Higher density drives provide greater storage density per Storage Pod and Vault and reduce our overhead cost through less required maintenance and lower total power requirements. Comparable SSDs in those sizes would cost roughly $1,000 per terabyte, considerably higher than the corresponding HDD. Simply put, SSDs are not yet in the price range to make their use economical for the benefits they provide, which is the reason why we expect to be using HDDs as our primary storage media for the foreseeable future.

What Are HDDs?

HDDs have been around over 60 years since IBM introduced them in 1956. The first disk drive was the size of a car, stored a mere 3.75 megabytes, and cost $300,000 in today’s dollars.

IBM 350 Disk Storage System — 3.75MB in 1956

The 350 Disk Storage System was a major component of the IBM 305 RAMAC (Random Access Method of Accounting and Control) system, which was introduced in September 1956. It consisted of 40 platters and a dual read/write head on a single arm that moved up and down the stack of magnetic disk platters.

The basic mechanism of an HDD remains unchanged since then, though it has undergone continual refinement. An HDD uses magnetism to store data on a rotating platter. A read/write head is affixed to an arm that floats above the spinning platter reading and writing data. The faster the platter spins, the faster an HDD can perform. Typical laptop drives today spin at either 5400 RPM (revolutions per minute) or 7200 RPM, though some server-based platters spin at even higher speeds.

Exploded drawing of a hard drive

Exploded drawing of a hard drive

The platters inside the drives are coated with a magnetically sensitive film consisting of tiny magnetic grains. Data is recorded when a magnetic write-head flies just above the spinning disk; the write head rapidly flips the magnetization of one magnetic region of grains so that its magnetic pole points up or down, to encode a 1 or a 0 in binary code. If all this sounds like an HDD is vulnerable to shocks and vibration, you’d be right. They also are vulnerable to magnets, which is one way to destroy the data on an HDD if you’re getting rid of it.

The major advantage of an HDD is that it can store lots of data cheaply. One and two terabyte (1,024 and 2,048 gigabytes) hard drives are not unusual for a laptop these days, and 10TB and 12TB drives are now available for desktops and servers. Densities and rotation speeds continue to grow. However, if you compare the cost of common HDDs vs SSDs for sale online, the SSDs are roughly 3-5x the cost per gigabyte. So if you want cheap storage and lots of it, using a standard hard drive is definitely the more economical way to go.

What are the best uses for HDDs?

  • Disk arrays (NAS, RAID, etc.) where high capacity is needed
  • Desktops when low cost is priority
  • Media storage (photos, videos, audio not currently being worked on)
  • Drives with extreme number of reads and writes

What Are SSDs?

SSDs go back almost as far as HDDs, with the first semiconductor storage device compatible with a hard drive interface introduced in 1978, the StorageTek 4305.

Storage Technology 4305 SSD

The StorageTek was an SSD aimed at the IBM mainframe compatible market. The STC 4305 was seven times faster than IBM’s popular 2305 HDD system (and also about half the price). It consisted of a cabinet full of charge-coupled devices and cost $400,000 for 45MB capacity with throughput speeds up to 1.5 MB/sec.

SSDs are based on a type of non-volatile memory called NAND (named for the Boolean operator “NOT AND,” and one of two main types of flash memory). Flash memory stores data in individual memory cells, which are made of floating-gate transistors. Though they are semiconductor-based memory, they retain their information when no power is applied to them — a feature that’s obviously a necessity for permanent data storage.

Samsung SSD

Samsung SSD 850 Pro

Compared to an HDD, SSDs have higher data-transfer rates, higher areal storage density, better reliability, and much lower latency and access times. For most users, it’s the speed of an SSD that primarily attracts them. When discussing the speed of drives, what we are referring to is the speed at which they can read and write data.

For HDDs, the speed at which the platters spin strongly determines the read/write times. When data on an HDD is accessed, the read/write head must physically move to the location where the data was encoded on a magnetic section on the platter. If the file being read was written sequentially to the disk, it will be read quickly. As more data is written to the disk, however, it’s likely that the file will be written across multiple sections, resulting in fragmentation of the data. Fragmented data takes longer to read with an HDD as the read head has to move to different areas of the platter(s) to completely read all the data requested.

Because SSDs have no moving parts, they can operate at speeds far above those of a typical HDD. Fragmentation is not an issue for SSDs. Files can be written anywhere with little impact on read/write times, resulting in read times far faster than any HDD, regardless of fragmentation.

Samsung SSD 850 Pro (back)

Due to the way data is written and read to the drive, however, SSD cells can wear out over time. SSD cells push electrons through a gate to set its state. This process wears on the cell and over time reduces its performance until the SSD wears out. This effect takes a long time and SSDs have mechanisms to minimize this effect, such as the TRIM command. Flash memory writes an entire block of storage no matter how few pages within the block are updated. This requires reading and caching the existing data, erasing the block and rewriting the block. If an empty block is available, a write operation is much faster. The TRIM command, which must be supported in both the OS and the SSD, enables the OS to inform the drive which blocks are no longer needed. It allows the drive to erase the blocks ahead of time in order to make empty blocks available for subsequent writes.

The effect of repeated reading and erasing on an SSD is cumulative and an SSD can slow down and even display errors with age. It’s more likely, however, that the system using the SSD will be discarded for obsolescence before the SSD begins to display read/write errors. Hard drives eventually wear out from constant use as well, since they use physical recording methods, so most users won’t base their selection of an HDD or SSD drive based on expected longevity.

SSD internals

SSD circuit board

Overall, SSDs are considered far more durable than HDDs due to a lack of mechanical parts. The moving mechanisms within an HDD are susceptible to not only wear and tear over time, but to damage due to movement or forceful contact. If one were to drop a laptop with an HDD, there is a high likelihood that all those moving parts will collide, resulting in potential data loss and even destructive physical damage that could kill the HDD outright. SSDs have no moving parts so, while they hold the risk of a potentially shorter life span due to high use, they can survive the rigors we impose upon our portable devices and laptops.

What are the best uses for SSDs?

  • Notebooks, laptops, where performance, lightweight, areal storage density, resistance to shock and general ruggedness are desirable
  • Boot drives holding operating system and applications, which will speed up booting and application launching
  • Working files (media that is being edited: photos, video, audio, etc.)
  • Swap drives where SSD will speed up disk paging
  • Cache drives
  • Database servers
  • Revitalizing an older computer. If you’ve got a computer that seems slow to start up and slow to load applications and files, updating the boot drive with an SSD could make it seem, if not new, at least as if it just came back refreshed from spending some time on the beach.

Stay Tuned for Part 2 of HDD vs SSD

That’s it for part 1. In our second part we’ll take a deeper look at the differences between HDDs and SSDs, how both HDD and SSD technologies are evolving, and how Backblaze takes advantage of SSDs in our operations and data centers.

Here's a tip!Here’s a tip on finding all the posts tagged with SSD on our blog. Just follow https://www.backblaze.com/blog/tag/ssd/.

Don’t miss future posts on HDDs, SSDs, and other topics, including hard drive stats, cloud storage, and tips and tricks for backing up to the cloud. Use the Join button above to receive notification of future posts on our blog.

The post HDD vs SSD: What Does the Future for Storage Hold? appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Spotify Emails Warning to ‘Pirates’ Using Hacked Apps

Post Syndicated from Andy original https://torrentfreak.com/spotify-emails-warning-to-pirates-using-hacked-apps-180305/

Spotify is a fantastic music streaming service used by more than 159 million users around the world. Around 71m of those are premium subscibers according to figures released by the company last December.

Given the above, 88 million Spotify members are using the free tier, meaning that they’re subjected to advertising and other limitations such as shuffle-only play and track skip restrictions.

The idea is that the free user gets a decent level of service but is held back just enough with small irritations to make the jump to a premium subscription a logical step at some point.

What millions of free users don’t know, however, is that there are modified Spotify apps out there that can remove many of these restrictions. All the user has to do is sign up to free Spotify account, download one of the many ‘hacked’ Spotify installation files out there, put in their username and password, and enjoy.

How many people use these hacked versions of Spotify isn’t clear and up to now, it’s been somewhat of a mystery as to why Spotify itself hasn’t done something about them. During the past few days, however, there have been signs that a crackdown could be on the way.

In an email sent to an unknown but significant number of people, Spotify informs users of modified apps that they’re on the company’s radar and there could be consequences for trying to subvert the system.

“We detected abnormal activity on the app you are using so we have disabled it. Don’t worry – your Spotify account is safe,” the email from Spotify reads.

“To access your Spotify account, simply uninstall any unauthorized or modified version of Spotify and download and install the Spotify app from the official Google Play Store. If you need more help, please see our support article on Reinstalling Spotify.”

Users have been popping up on Spotify’s forums asking why they’ve received this email. Some seem to think they’ve done nothing wrong but most signs point to people using modified software.

The warning email from Spotify

While the email signs off with a note thanking the recipient for being a Spotify user, there is also a warning.

“If we detect repeated use of unauthorized apps in violation of our terms, we reserve all rights, including suspending or terminating your account,” Spotify writes.

For people who used their real accounts along with modified apps this could be a problem but many people using hacked versions go in prepared with a secondary or temporary email address and false details.

Quite how far Spotify will go to rid its service of this kind of a user remains unknown but at least for now, the actual effects of this early crackdown seemed mixed.

TorrentFreak has spoken with users who have modified versions and have received the email, yet their installation still works just fine. Others report that they can no longer log in with their modified version.

What is clear, however, is that Spotify has both modified apps and their creators on its radar. On March 1, 2018 the company wrote to Github demanding that a popular Spotify mod known as ‘Dogfood’ be taken down from the repository.

Dogfood is done on Github

The full takedown notice can be found here. It lists Dogfood itself plus a whole bunch of ‘forks’ which have also been taken down by Github.

There were signs in January that the developer of Dogfood might have been under pressure to limit the effectiveness of his app. On January 18 he announced on XDA that some functionality would be removed moving forward.

“In order to comply with XDA’s Rules and CoC, Spotify Dogfood has taken a new direction, and now offers *exclusively* Ad-free music playback,” he wrote.

“Any other features won’t be included anymore in this mod. But, that doesn’t mean anything if you’re a true, a core user of this app, because there will still be regular updates to it, as there has been up until now.”

Where that development will take place now isn’t clear but it clearly won’t be on Github. Indeed, even XDA has been targeted by Spotify, with the site receiving a DMCA notice from the company which required the removal of links and an apparent closure of the whole discussion.

XDA DMCA takedown

For now it seems that Spotify is playing nice, at least with users of modified apps. Whether it will continue with the same relaxed attitude is unclear but it’s hard not to connect the move with its intention to go public and its $23bn valuation.

Still, the company should be more in tune with pirates than most given its history, so may yet have a decent plan up its sleeve.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

The Challenges of Opening a Data Center — Part 2

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/factors-for-choosing-data-center/

Rows of storage pods in a data center

This is part two of a series on the factors that an organization needs to consider when opening a data center and the challenges that must be met in the process.

In Part 1 of this series, we looked at the different types of data centers, the importance of location in planning a data center, data center certification, and the single most expensive factor in running a data center, power.

In Part 2, we continue to look at factors that need to considered both by those interested in a dedicated data center and those seeking to colocate in an existing center.

Power (continued from Part 1)

In part 1, we began our discussion of the power requirements of data centers.

As we discussed, redundancy and failover is a chief requirement for data center power. A redundantly designed power supply system is also a necessity for maintenance, as it enables repairs to be performed on one network, for example, without having to turn off servers, databases, or electrical equipment.

Power Path

The common critical components of a data center’s power flow are:

  • Utility Supply
  • Generators
  • Transfer Switches
  • Distribution Panels
  • Uninterruptible Power Supplies (UPS)
  • PDUs

Utility Supply is the power that comes from one or more utility grids. While most of us consider the grid to be our primary power supply (hats off to those of you who manage to live off the grid), politics, economics, and distribution make utility supply power susceptible to outages, which is why data centers must have autonomous power available to maintain availability.

Generators are used to supply power when the utility supply is unavailable. They convert mechanical energy, usually from motors, to electrical energy.

Transfer Switches are used to transfer electric load from one source or electrical device to another, such as from one utility line to another, from a generator to a utility, or between generators. The transfer could be manually activated or automatic to ensure continuous electrical power.

Distribution Panels get the power where it needs to go, taking a power feed and dividing it into separate circuits to supply multiple loads.

A UPS, as we touched on earlier, ensures that continuous power is available even when the main power source isn’t. It often consists of batteries that can come online almost instantaneously when the current power ceases. The power from a UPS does not have to last a long time as it is considered an emergency measure until the main power source can be restored. Another function of the UPS is to filter and stabilize the power from the main power supply.

Data Center UPS

Data center UPSs

PDU stands for the Power Distribution Unit and is the device that distributes power to the individual pieces of equipment.

Network

After power, the networking connections to the data center are of prime importance. Can the data center obtain and maintain high-speed networking connections to the building? With networking, as with all aspects of a data center, availability is a primary consideration. Data center designers think of all possible ways service can be interrupted or lost, even briefly. Details such as the vulnerabilities in the route the network connections make from the core network (the backhaul) to the center, and where network connections enter and exit a building, must be taken into consideration in network and data center design.

Routers and switches are used to transport traffic between the servers in the data center and the core network. Just as with power, network redundancy is a prime factor in maintaining availability of data center services. Two or more upstream service providers are required to ensure that availability.

How fast a customer can transfer data to a data center is affected by: 1) the speed of the connections the data center has with the outside world, 2) the quality of the connections between the customer and the data center, and 3) the distance of the route from customer to the data center. The longer the length of the route and the greater the number of packets that must be transferred, the more significant a factor will be played by latency in the data transfer. Latency is the delay before a transfer of data begins following an instruction for its transfer. Generally latency, not speed, will be the most significant factor in transferring data to and from a data center. Packets transferred using the TCP/IP protocol suite, which is the conceptual model and set of communications protocols used on the internet and similar computer networks, must be acknowledged when received (ACK’d) and requires a communications roundtrip for each packet. If the data is in larger packets, the number of ACKs required is reduced, so latency will be a smaller factor in the overall network communications speed.

Latency generally will be less significant for data storage transfers than for cloud computing. Optimizations such as multi-threading, which is used in Backblaze’s Cloud Backup service, will generally improve overall transfer throughput if sufficient bandwidth is available.

Those interested in testing the overall speed and latency of their connection to Backblaze’s data centers can use the Check Your Bandwidth tool on our website.
Data center telecommunications equipment

Data center telecommunications equipment

Data center under floor cable runs

Data center under floor cable runs

Cooling

Computer, networking, and power generation equipment generates heat, and there are a number of solutions employed to rid a data center of that heat. The location and climate of the data center is of great importance to the data center designer because the climatic conditions dictate to a large degree what cooling technologies should be deployed that in turn affect the power used and the cost of using that power. The power required and cost needed to manage a data center in a warm, humid climate will vary greatly from managing one in a cool, dry climate. Innovation is strong in this area and many new approaches to efficient and cost-effective cooling are used in the latest data centers.

Switch's uninterruptible, multi-system, HVAC Data Center Cooling Units

Switch’s uninterruptible, multi-system, HVAC Data Center Cooling Units

There are three primary ways data center cooling can be achieved:

Room Cooling cools the entire operating area of the data center. This method can be suitable for small data centers, but becomes more difficult and inefficient as IT equipment density and center size increase.

Row Cooling concentrates on cooling a data center on a row by row basis. In its simplest form, hot aisle/cold aisle data center design involves lining up server racks in alternating rows with cold air intakes facing one way and hot air exhausts facing the other. The rows composed of rack fronts are called cold aisles. Typically, cold aisles face air conditioner output ducts. The rows the heated exhausts pour into are called hot aisles. Typically, hot aisles face air conditioner return ducts.

Rack Cooling tackles cooling on a rack by rack basis. Air-conditioning units are dedicated to specific racks. This approach allows for maximum densities to be deployed per rack. This works best in data centers with fully loaded racks, otherwise there would be too much cooling capacity, and the air-conditioning losses alone could exceed the total IT load.

Security

Data Centers are high-security facilities as they house business, government, and other data that contains personal, financial, and other secure information about businesses and individuals.

This list contains the physical-security considerations when opening or co-locating in a data center:

Layered Security Zones. Systems and processes are deployed to allow only authorized personnel in certain areas of the data center. Examples include keycard access, alarm systems, mantraps, secure doors, and staffed checkpoints.

Physical Barriers. Physical barriers, fencing and reinforced walls are used to protect facilities. In a colocation facility, one customers’ racks and servers are often inaccessible to other customers colocating in the same data center.

Backblaze racks secured in the data center

Backblaze racks secured in the data center

Monitoring Systems. Advanced surveillance technology monitors and records activity on approaching driveways, building entrances, exits, loading areas, and equipment areas. These systems also can be used to monitor and detect fire and water emergencies, providing early detection and notification before significant damage results.

Top-tier providers evaluate their data center security and facilities on an ongoing basis. Technology becomes outdated quickly, so providers must stay-on-top of new approaches and technologies in order to protect valuable IT assets.

To pass into high security areas of a data center requires passing through a security checkpoint where credentials are verified.

Data Center security

The gauntlet of cameras and steel bars one must pass before entering this data center

Facilities and Services

Data center colocation providers often differentiate themselves by offering value-added services. In addition to the required space, power, cooling, connectivity and security capabilities, the best solutions provide several on-site amenities. These accommodations include offices and workstations, conference rooms, and access to phones, copy machines, and office equipment.

Additional features may consist of kitchen facilities, break rooms and relaxation lounges, storage facilities for client equipment, and secure loading docks and freight elevators.

Moving into A Data Center

Moving into a data center is a major job for any organization. We wrote a post last year, Desert To Data in 7 Days — Our New Phoenix Data Center, about what it was like to move into our new data center in Phoenix, Arizona.

Desert To Data in 7 Days — Our New Phoenix Data Center

Visiting a Data Center

Our Director of Product Marketing Andy Klein wrote a popular post last year on what it’s like to visit a data center called A Day in the Life of a Data Center.

A Day in the Life of a Data Center

Would you Like to Know More about The Challenges of Opening and Running a Data Center?

That’s it for part 2 of this series. If readers are interested, we could write a post about some of the new technologies and trends affecting data center design and use. Please let us know in the comments.

Here's a tip!Here’s a tip on finding all the posts tagged with data center on our blog. Just follow https://www.backblaze.com/blog/tag/data-center/.

Don’t miss future posts on data centers and other topics, including hard drive stats, cloud storage, and tips and tricks for backing up to the cloud. Use the Join button above to receive notification of future posts on our blog.

The post The Challenges of Opening a Data Center — Part 2 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

BMG Wants Appeals Court to Rehear Cox Piracy Liability Case

Post Syndicated from Ernesto original https://torrentfreak.com/bmg-wants-appeals-court-to-rehear-cox-piracy-liability-case-180226/

Earlier this month, the Court of Appeals for the Fourth Circuit overturned the $25 million piracy liability verdict against Internet provider Cox.

The panel of three judges concluded that the district court made an error in its jury instruction and ordered a new trial.

The erroneous instruction said that the ISP could be found liable for contributory infringement if it “knew or should have known of such infringing activity.” The Court of Appeals agrees that based on the law, the “should have known” standard is too low.

As a result of the ruling, music publisher BMG Rights Management and Cox would have to go head to head again in a new trial. However, according to BMG, the Court of Appeals itself made a mistake.

A few days ago the copyright holder petitioned the court for a rehearing en banc, asking for a do-over before all the judges of a court.

The music publisher argues that the appeals court judges mistakenly reached their decision based on a legal principle that applies to “inducement” of liability, while BMG was pursuing a claim of “material contribution.”

“The panel’s unprecedented application of a heightened knowledge standard creates a conflict with decisions and pattern jury instructions from other circuits as well as with the common-law rules underlying contributory infringement.

“All of those recognize that BMG’s material-contribution theory requires only constructive knowledge,” BMG’s brief adds.

Even if the appeals court persists with its assertion that the liability standard is “willful blindness” rather than “should have known,” a new trial would not be warranted, according to the music publisher.

They point out that plenty of evidence was presented which proved that Cox was wilfully blind to the copyright infringements and describe the erroneous instruction as a “harmless error of the most benign kind.”

The music publisher’s request for a rehearing is supported by the RIAA, which filed an amicus curiae brief together with the National Music Publishers Association.

Both music industry groups back BMG’s arguments and ask the appeals court to consider a rehearing, stating that it would be in the best interests of artists, songwriters, and other rightsholders.

“The level of copyright infringement that takes place over the Internet is ‘staggering,’ and it is vital that copyright owners have effective mechanisms to address it. It is also critical that copyright owners can adequately address infringement that occurs in other contexts.

“If the panel’s decision is not corrected, it would threaten the very incentives of artists, songwriters, and others to create valuable works and distribute them to the public,” the RIAA and NMPA add.

For the RIAA the case is particularly important since it filed a similar lawsuit against Internet provider Grande Communications last year.

Given what’s at stake, we can assume that Cox will protest the request for a rehearing. And it wouldn’t be a big surprise if other telecommunications companies take the same position.

BMG’s petition is available here (pdf) and a copy of the RIAA/NMPA motion can be found here (pdf).

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Getting product security engineering right

Post Syndicated from Michal Zalewski original http://lcamtuf.blogspot.com/2018/02/getting-product-security-engineering.html

Product security is an interesting animal: it is a uniquely cross-disciplinary endeavor that spans policy, consulting,
process automation, in-depth software engineering, and cutting-edge vulnerability research. And in contrast to many
other specializations in our field of expertise – say, incident response or network security – we have virtually no
time-tested and coherent frameworks for setting it up within a company of any size.

In my previous post, I shared some thoughts
on nurturing technical organizations and cultivating the right kind of leadership within. Today, I figured it would
be fitting to follow up with several notes on what I learned about structuring product security work – and about actually
making the effort count.

The “comfort zone” trap

For security engineers, knowing your limits is a sought-after quality: there is nothing more dangerous than a security
expert who goes off script and starts dispensing authoritatively-sounding but bogus advice on a topic they know very
little about. But that same quality can be destructive when it prevents us from growing beyond our most familiar role: that of
a critic who pokes holes in other people’s designs.

The role of a resident security critic lends itself all too easily to a sense of supremacy: the mistaken
belief that our cognitive skills exceed the capabilities of the engineers and product managers who come to us for help
– and that the cool bugs we file are the ultimate proof of our special gift. We start taking pride in the mere act
of breaking somebody else’s software – and then write scathing but ineffectual critiques addressed to executives,
demanding that they either put a stop to a project or sign off on a risk. And hey, in the latter case, they better
brace for our triumphant “I told you so” at some later date.

Of course, escalations of this type have their place, but they need to be a very rare sight; when practiced routinely, they are a telltale
sign of a dysfunctional team. We might be failing to think up viable alternatives that are in tune with business or engineering needs; we might
be very unpersuasive, failing to communicate with other rational people in a language they understand; or it might be that our tolerance for risk
is badly out of whack with the rest of the company. Whatever the cause, I’ve seen high-level escalations where the security team
spoke of valiant efforts to resist inexplicably awful design decisions or data sharing setups; and where product leads in turn talked about
pressing business needs randomly blocked by obstinate security folks. Sometimes, simply having them compare their notes would be enough to arrive
at a technical solution – such as sharing a less sensitive subset of the data at hand.

To be effective, any product security program must be rooted in a partnership with the rest of the company, focused on helping them get stuff done
while eliminating or reducing security risks. To combat the toxic us-versus-them mentality, I found it helpful to have some team members with
software engineering backgrounds, even if it’s the ownership of a small open-source project or so. This can broaden our horizons, helping us see
that we all make the same mistakes – and that not every solution that sounds good on paper is usable once we code it up.

Getting off the treadmill

All security programs involve a good chunk of operational work. For product security, this can be a combination of product launch reviews, design consulting requests, incoming bug reports, or compliance-driven assessments of some sort. And curiously, such reactive work also has the property of gradually expanding to consume all the available resources on a team: next year is bound to bring even more review requests, even more regulatory hurdles, and even more incoming bugs to triage and fix.

Being more tractable, such routine tasks are also more readily enshrined in SDLs, SLAs, and all kinds of other official documents that are often mistaken for a mission statement that justifies the existence of our teams. Soon, instead of explaining to a developer why they should fix a particular problem right away, we end up pointing them to page 17 in our severity classification guideline, which defines that “severity 2” vulnerabilities need to be resolved within a month. Meanwhile, another policy may be telling them that they need to run a fuzzer or a web application scanner for a particular number of CPU-hours – no matter whether it makes sense or whether the job is set up right.

To run a product security program that scales sublinearly, stays abreast of future threats, and doesn’t erect bureaucratic speed bumps just for the sake of it, we need to recognize this inherent tendency for operational work to take over – and we need to reign it in. No matter what the last year’s policy says, we usually don’t need to be doing security reviews with a particular cadence or to a particular depth; if we need to scale them back 10% to staff a two-quarter project that fixes an important API and squashes an entire class of bugs, it’s a short-term risk we should feel empowered to take.

As noted in my earlier post, I find contingency planning to be a valuable tool in this regard: why not ask ourselves how the team would cope if the workload went up another 30%, but bad financial results precluded any team growth? It’s actually fun to think about such hypotheticals ahead of the time – and hey, if the ideas sound good, why not try them out today?

Living for a cause

It can be difficult to understand if our security efforts are structured and prioritized right; when faced with such uncertainty, it is natural to stick to the safe fundamentals – investing most of our resources into the very same things that everybody else in our industry appears to be focusing on today.

I think it’s important to combat this mindset – and if so, we might as well tackle it head on. Rather than focusing on tactical objectives and policy documents, try to write down a concise mission statement explaining why you are a team in the first place, what specific business outcomes you are aiming for, how do you prioritize it, and how you want it all to change in a year or two. It should be a fluid narrative that reads right and that everybody on your team can take pride in; my favorite way of starting the conversation is telling folks that we could always have a new VP tomorrow – and that the VP’s first order of business could be asking, “why do you have so many people here and how do I know they are doing the right thing?”. It’s a playful but realistic framing device that motivates people to get it done.

In general, a comprehensive product security program should probably start with the assumption that no matter how many resources we have at our disposal, we will never be able to stay in the loop on everything that’s happening across the company – and even if we did, we’re not going to be able to catch every single bug. It follows that one of our top priorities for the team should be making sure that bugs don’t happen very often; a scalable way of getting there is equipping engineers with intuitive and usable tools that make it easy to perform common tasks without having to worry about security at all. Examples include standardized, managed containers for production jobs; safe-by-default APIs, such as strict contextual autoescaping for XSS or type safety for SQL; security-conscious style guidelines; or plug-and-play libraries that take care of common crypto or ACL enforcement tasks.

Of course, not all problems can be addressed on framework level, and not every engineer will always reach for the right tools. Because of this, the next principle that I found to be worth focusing on is containment and mitigation: making sure that bugs are difficult to exploit when they happen, or that the damage is kept in check. The solutions in this space can range from low-level enhancements (say, hardened allocators or seccomp-bpf sandboxes) to client-facing features such as browser origin isolation or Content Security Policy.

The usual consulting, review, and outreach tasks are an important facet of a product security program, but probably shouldn’t be the sole focus of your team. It’s also best to avoid undue emphasis on vulnerability showmanship: while valuable in some contexts, it creates a hypercompetitive environment that may be hostile to less experienced team members – not to mention, squashing individual bugs offers very limited value if the same issue is likely to be reintroduced into the codebase the next day. I like to think of security reviews as a teaching opportunity instead: it’s a way to raise awareness, form partnerships with engineers, and help them develop lasting habits that reduce the incidence of bugs. Metrics to understand the impact of your work are important, too; if your engagements are seen mostly as a yet another layer of red tape, product teams will stop reaching out to you for advice.

The other tenet of a healthy product security effort requires us to recognize at a scale and given enough time, every defense mechanism is bound to fail – and so, we need ways to prevent bugs from turning into incidents. The efforts in this space may range from developing product-specific signals for the incident response and monitoring teams; to offering meaningful vulnerability reward programs and nourishing a healthy and respectful relationship with the research community; to organizing regular offensive exercises in hopes of spotting bugs before anybody else does.

Oh, one final note: an important feature of a healthy security program is the existence of multiple feedback loops that help you spot problems without the need to micromanage the organization and without being deathly afraid of taking chances. For example, the data coming from bug bounty programs, if analyzed correctly, offers a wonderful way to alert you to systemic problems in your codebase – and later on, to measure the impact of any remediation and hardening work.

Amazon Redshift – 2017 Recap

Post Syndicated from Larry Heathcote original https://aws.amazon.com/blogs/big-data/amazon-redshift-2017-recap/

We have been busy adding new features and capabilities to Amazon Redshift, and we wanted to give you a glimpse of what we’ve been doing over the past year. In this article, we recap a few of our enhancements and provide a set of resources that you can use to learn more and get the most out of your Amazon Redshift implementation.

In 2017, we made more than 30 announcements about Amazon Redshift. We listened to you, our customers, and delivered Redshift Spectrum, a feature of Amazon Redshift, that gives you the ability to extend analytics to your data lake—without moving data. We launched new DC2 nodes, doubling performance at the same price. We also announced many new features that provide greater scalability, better performance, more automation, and easier ways to manage your analytics workloads.

To see a full list of our launches, visit our what’s new page—and be sure to subscribe to our RSS feed.

Major launches in 2017

Amazon Redshift Spectrumextend analytics to your data lake, without moving data

We launched Amazon Redshift Spectrum to give you the freedom to store data in Amazon S3, in open file formats, and have it available for analytics without the need to load it into your Amazon Redshift cluster. It enables you to easily join datasets across Redshift clusters and S3 to provide unique insights that you would not be able to obtain by querying independent data silos.

With Redshift Spectrum, you can run SQL queries against data in an Amazon S3 data lake as easily as you analyze data stored in Amazon Redshift. And you can do it without loading data or resizing the Amazon Redshift cluster based on growing data volumes. Redshift Spectrum separates compute and storage to meet workload demands for data size, concurrency, and performance. Redshift Spectrum scales processing across thousands of nodes, so results are fast, even with massive datasets and complex queries. You can query open file formats that you already use—such as Apache Avro, CSV, Grok, ORC, Apache Parquet, RCFile, RegexSerDe, SequenceFile, TextFile, and TSV—directly in Amazon S3, without any data movement.

For complex queries, Redshift Spectrum provided a 67 percent performance gain,” said Rafi Ton, CEO, NUVIAD. “Using the Parquet data format, Redshift Spectrum delivered an 80 percent performance improvement. For us, this was substantial.

To learn more about Redshift Spectrum, watch our AWS Summit session Intro to Amazon Redshift Spectrum: Now Query Exabytes of Data in S3, and read our announcement blog post Amazon Redshift Spectrum – Exabyte-Scale In-Place Queries of S3 Data.

DC2 nodes—twice the performance of DC1 at the same price

We launched second-generation Dense Compute (DC2) nodes to provide low latency and high throughput for demanding data warehousing workloads. DC2 nodes feature powerful Intel E5-2686 v4 (Broadwell) CPUs, fast DDR4 memory, and NVMe-based solid state disks (SSDs). We’ve tuned Amazon Redshift to take advantage of the better CPU, network, and disk on DC2 nodes, providing up to twice the performance of DC1 at the same price. Our DC2.8xlarge instances now provide twice the memory per slice of data and an optimized storage layout with 30 percent better storage utilization.

Redshift allows us to quickly spin up clusters and provide our data scientists with a fast and easy method to access data and generate insights,” said Bradley Todd, technology architect at Liberty Mutual. “We saw a 9x reduction in month-end reporting time with Redshift DC2 nodes as compared to DC1.”

Read our customer testimonials to see the performance gains our customers are experiencing with DC2 nodes. To learn more, read our blog post Amazon Redshift Dense Compute (DC2) Nodes Deliver Twice the Performance as DC1 at the Same Price.

Performance enhancements— 3x-5x faster queries

On average, our customers are seeing 3x to 5x performance gains for most of their critical workloads.

We introduced short query acceleration to speed up execution of queries such as reports, dashboards, and interactive analysis. Short query acceleration uses machine learning to predict the execution time of a query, and to move short running queries to an express short query queue for faster processing.

We launched results caching to deliver sub-second response times for queries that are repeated, such as dashboards, visualizations, and those from BI tools. Results caching has an added benefit of freeing up resources to improve the performance of all other queries.

We also introduced late materialization to reduce the amount of data scanned for queries with predicate filters by batching and factoring in the filtering of predicates before fetching data blocks in the next column. For example, if only 10 percent of the table rows satisfy the predicate filters, Amazon Redshift can potentially save 90 percent of the I/O for the remaining columns to improve query performance.

We launched query monitoring rules and pre-defined rule templates. These features make it easier for you to set metrics-based performance boundaries for workload management (WLM) queries, and specify what action to take when a query goes beyond those boundaries. For example, for a queue that’s dedicated to short-running queries, you might create a rule that aborts queries that run for more than 60 seconds. To track poorly designed queries, you might have another rule that logs queries that contain nested loops.

Customer insights

Amazon Redshift and Redshift Spectrum serve customers across a variety of industries and sizes, from startups to large enterprises. Visit our customer page to see the success that customers are having with our recent enhancements. Learn how companies like Liberty Mutual Insurance saw a 9x reduction in month-end reporting time using DC2 nodes. On this page, you can find case studies, videos, and other content that show how our customers are using Amazon Redshift to drive innovation and business results.

In addition, check out these resources to learn about the success our customers are having building out a data warehouse and data lake integration solution with Amazon Redshift:

Partner solutions

You can enhance your Amazon Redshift data warehouse by working with industry-leading experts. Our AWS Partner Network (APN) Partners have certified their solutions to work with Amazon Redshift. They offer software, tools, integration, and consulting services to help you at every step. Visit our Amazon Redshift Partner page and choose an APN Partner. Or, use AWS Marketplace to find and immediately start using third-party software.

To see what our Partners are saying about Amazon Redshift Spectrum and our DC2 nodes mentioned earlier, read these blog posts:

Resources

Blog posts

Visit the AWS Big Data Blog for a list of all Amazon Redshift articles.

YouTube videos

GitHub

Our community of experts contribute on GitHub to provide tips and hints that can help you get the most out of your deployment. Visit GitHub frequently to get the latest technical guidance, code samples, administrative task automation utilities, the analyze & vacuum schema utility, and more.

Customer support

If you are evaluating or considering a proof of concept with Amazon Redshift, or you need assistance migrating your on-premises or other cloud-based data warehouse to Amazon Redshift, our team of product experts and solutions architects can help you with architecting, sizing, and optimizing your data warehouse. Contact us using this support request form, and let us know how we can assist you.

If you are an Amazon Redshift customer, we offer a no-cost health check program. Our team of database engineers and solutions architects give you recommendations for optimizing Amazon Redshift and Amazon Redshift Spectrum for your specific workloads. To learn more, email us at [email protected].

If you have any questions, email us at [email protected].

 


Additional Reading

If you found this post useful, be sure to check out Amazon Redshift Spectrum – Exabyte-Scale In-Place Queries of S3 Data, Using Amazon Redshift for Fast Analytical Reports and How to Migrate Your Oracle Data Warehouse to Amazon Redshift Using AWS SCT and AWS DMS.


About the Author

Larry Heathcote is a Principle Product Marketing Manager at Amazon Web Services for data warehousing and analytics. Larry is passionate about seeing the results of data-driven insights on business outcomes. He enjoys family time, home projects, grilling out and the taste of classic barbeque.