Tag Archives: credit cards

Google’s Chrome Web Store Spammed With Dodgy ‘Pirate’ Movie Links

Post Syndicated from Andy original https://torrentfreak.com/googles-chrome-web-store-spammed-with-dodgy-pirate-movie-links-180527/

Launched in 2010, Google’s Chrome Store is the go-to place for people looking to pimp their Chrome browser.

Often referred to as apps and extensions, the programs offered by the platform run in Chrome and can perform a dazzling array of functions, from improving security and privacy, to streaming video or adding magnet links to torrent sites.

Also available on the Chrome Store are themes, which can be installed locally to change the appearance of the Chrome browser.

While there are certainly plenty to choose from, some additions to the store over the past couple of months are not what most people have come to expect from the add-on platform.

Free movies on Chrome’s Web Store?

As the image above suggests, unknown third parties appear to be exploiting the Chrome Store’s ‘theme’ section to offer visitors access to a wide range of pirate movies including Black Panther, Avengers: Infinity War and Rampage.

When clicking through to the page offering Ready Player One, for example, users are presented with a theme that apparently allows them to watch the movie online in “Full HD Online 4k.”

Of course, the whole scheme is a dubious scam which eventually leads users to Vioos.co, a platform that tries very hard to give the impression of being a pirate streaming portal but actually provides nothing of use.

Nothing to see here

In fact, as soon as one clicks the play button on movies appearing on Vioos.co, visitors are re-directed to another site called Zumastar which asks people to “create a free account” to “access unlimited downloads & streaming.”

“With over 20 million titles, Zumastar is your number one entertainment resource. Join hundreds of thousands of satisfied members and enjoy the hottest movies,” the site promises.

With this kind of marketing, perhaps we should think about this offer for a second. Done. No thanks.

In extended testing, some visits to Vioos.co resulted in a redirection to EtnaMedia.net, a domain that was immediately blocked by MalwareBytes due to suspected fraud. However, after allowing the browser to make the connection, TF was presented with another apparent subscription site.

We didn’t follow through with a sign-up but further searches revealed upset former customers complaining of money being taken from their credit cards when they didn’t expect that to happen.

Quite how many people have signed up to Zumastar or EtnaMedia via this convoluted route from Google’s Chrome Store isn’t clear but a worrying number appear to have installed the ‘themes’ (if that’s what they are) offered on each ‘pirate movie’ page.

At the time of writing the ‘free Watch Rampage Online Full Movie’ ‘theme’ has 2,196 users, the “Watch Avengers Infinity War Full Movie” variant has 974, the ‘Watch Ready Player One 2018 Full HD’ page has 1,031, and the ‘Watch Black Panther Online Free 123putlocker’ ‘theme’ has more than 1,800. Clearly, a worrying number of people will click and install just about anything.

We haven’t tested the supposed themes to see what they do but it’s a cast-iron guarantee that they don’t offer the movies displayed and there’s always a chance they’ll do something awful. As a rule of thumb, it’s nearly always wise to steer clear of anything with “full movie” in the title, since such listings can rarely be trusted.

Finally, those hoping to get some guidance on quality from the reviews on the Chrome Store will be bitterly disappointed.

Garbage reviews, probably left by the scammers

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Police Forces Around Europe Hit Pirate IPTV Operation

Post Syndicated from Andy original https://torrentfreak.com/police-forces-around-europe-hit-pirate-iptv-operation-180519/

Once upon a time, torrent and web streaming sites were regularly in the headlines while being targeted by the authorities. With the rise of set-top box streaming, actions against pirate IPTV operations are more regularly making the news.

In an operation coordinated by the public prosecutor’s office in Rome, 150 officers of the Provincial Command of the Guardia di Finanza (GdF) this week targeted what appears to be a fairly large unauthorized IPTV provider.

Under the banner Operation Spinoff, in Italy, more than 50 searches were carried out in 20 provinces of 11 regions. Five people were arrested. Elsewhere in Europe – in Switzerland, Germany and Spain – the Polizei Basel-Landschaft, the Kriminal Polizei and the Policia Nacional coordinated to execute warrants.

A small selection of the service on offer

“Through technical and ‘in-the-field’ investigations and the meticulous reconstruction of financial flows, carried out mainly through prepaid credit cards or payment web platforms, investigators have reconstructed the activity of a pyramid-like criminal structure dedicated to the illegal decryption and diffusion of pay-per-view television content through the Internet,” the GdF said in a statement.

Italian authorities report that the core of the IPTV operation was its sources of original content and channels. These were located in a range of diverse locations such as companies, commercial premises, garages and even private homes. Inside each location was equipment to receive, decrypt and capture signals from broadcasters including Sky TV.

Italian police examine hardware

These signals were collected together to form a package of channels which were then transmitted via the Internet and sold to the public in the form of an IPTV subscription. Packages were reportedly priced between 15 and 20 euros per month.

It’s estimated that between the 49 individuals said to be involved in the operation, around one million euros was generated. All are suspected of copyright infringement and money laundering offenses. Of the five Italian citizens reported to be at the core of the operations, four were taken into custody and one placed under house arrest.

Reports identify the suspects as: ‘AS’, born 1979 and residing in Lorrach, Germany. ‘RM’, born 1987 and living in Sarno, Italy. ‘LD’, born 1996 and also living in Sarno, Italy. ‘GP’, born 1990, living in Pordenone, Italy. And ‘SM’, born 1981 and living in Zagarolo, Italy.

More hardware

Players at all levels of the business are under investigation, from the sources who decrypted the signals to the sellers and re-sellers of the content to end users. Also under the microscope are people said to have laundered the operation’s money through credit cards and payment platforms.

The GdF describes the pirate IPTV operation in serious terms, noting that it aimed to set up a “parallel distribution company able to provide services that are entirely analogous to lawful companies, from checks on the feasibility of installing the service to maintaining adequate standards and technical assistance to customers.”

Airline Ticket Fraud

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/05/airline_ticket_.html

New research: “Leaving on a jet plane: the trade in fraudulently obtained airline tickets:”

Abstract: Every day, hundreds of people fly on airline tickets that have been obtained fraudulently. This crime script analysis provides an overview of the trade in these tickets, drawing on interviews with industry and law enforcement, and an analysis of an online blackmarket. Tickets are purchased by complicit travellers or resellers from the online blackmarket. Victim travellers obtain tickets from fake travel agencies or malicious insiders. Compromised credit cards used to be the main method to purchase tickets illegitimately. However, as fraud detection systems improved, offenders displaced to other methods, including compromised loyalty point accounts, phishing, and compromised business accounts. In addition to complicit and victim travellers, fraudulently obtained tickets are used for transporting mules, and for trafficking and smuggling. This research details current prevention approaches, and identifies additional interventions, aimed at the act, the actor, and the marketplace.

Blog post.

Cryptocurrency Security Challenges

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/cryptocurrency-security-challenges/

Physical coins representing cryptocurrencies

Most likely you’ve read the tantalizing stories of big gains from investing in cryptocurrencies. Someone who invested $1,000 into bitcoins five years ago would have over $85,000 in value now. Alternatively, someone who invested in bitcoins three months ago would have seen their investment lose 20% in value. Beyond the big price fluctuations, currency holders are possibly exposed to fraud, bad business practices, and even risk losing their holdings altogether if they are careless in keeping track of the all-important currency keys.

It’s certain that beyond the rewards and risks, cryptocurrencies are here to stay. We can’t ignore how they are changing the game for how money is handled between people and businesses.

Some Advantages of Cryptocurrency

  • Cryptocurrency is accessible to anyone.
  • Decentralization means the network operates on a user-to-user (or peer-to-peer) basis.
  • Transactions can be completed for a fraction of the expense and time required to complete traditional asset transfers.
  • Transactions are digital and cannot be counterfeited or reversed arbitrarily by the sender, as with credit card charge-backs.
  • There aren’t usually transaction fees for cryptocurrency exchanges.
  • Cryptocurrency allows the cryptocurrency holder to send exactly what information is needed and no more to the merchant or recipient, even permitting anonymous transactions (for good or bad).
  • Cryptocurrency operates at the universal level and hence makes transactions easier internationally.
  • There is no other electronic cash system in which your account isn’t owned by someone else.

On top of all that, blockchain, the underlying technology behind cryptocurrencies, is already being applied to a variety of business needs and is itself becoming a hot sector of the tech economy. Blockchain is bringing traceability and cost-effectiveness to supply-chain management, which also improves quality assurance in areas such as food by reducing errors and improving accounting accuracy. Other applications include smart contracts that can be automatically validated, signed and enforced through a blockchain construct, the possibility of secure online voting, and many others.

Like any new, booming market, there are risks involved in these new currencies. Anyone venturing into this domain needs to have their eyes wide open. While the opportunities for making money are real, there are even more ways to lose money.

We’re going to cover two primary approaches to staying safe and avoiding fraud and loss when dealing with cryptocurrencies. The first is to thoroughly vet any person or company you’re dealing with to judge whether they are ethical and likely to succeed in their business segment. The second is keeping your critical cryptocurrency keys safe, which we’ll deal with in this and a subsequent post.

Caveat Emptor — Buyer Beware

The short history of cryptocurrency has already seen the demise of a number of companies that claimed to manage, mine, trade, or otherwise help their customers profit from cryptocurrency. Mt. Gox, GAW Miners, and OneCoin are just three of the many companies that disappeared with their users’ money. This is the traditional equivalent of your bank going out of business and zeroing out your checking account in the process.

That doesn’t happen with banks because of regulatory oversight. But with cryptocurrency, you need to take the time to investigate any company you use to manage or trade your currencies. How long have they been around? Who are their investors? Are they affiliated with any reputable financial institutions? What is the record of their founders and executive management? These are all important questions to consider when evaluating a company in this new space.

Would you give the keys to your house to a service or person you didn’t thoroughly know and trust? Some companies that enable you to buy and sell currencies online will routinely hold your currency keys, which gives them the ability to do anything they want with your holdings, including selling them and pocketing the proceeds if they wish.

That doesn’t mean you shouldn’t ever allow a company to keep your currency keys in escrow. It simply means that you better know with whom you’re doing business and if they’re trustworthy enough to be given that responsibility.

Keys To the Cryptocurrency Kingdom — Public and Private

If you’re an owner of cryptocurrency, you know how this all works. If you’re not, bear with me for a minute while I bring everyone up to speed.

Cryptocurrency has no physical manifestation, such as bills or coins. It exists purely as a computer record. And unlike currencies maintained by governments, such as the U.S. dollar, there is no central authority regulating its distribution and value. Cryptocurrencies use a technology called blockchain, which is a decentralized way of keeping track of transactions. There are many copies of a given blockchain, so no single central authority is needed to validate its authenticity or accuracy.

The validity of each cryptocurrency is determined by a blockchain. A blockchain is a continuously growing list of records, called “blocks”, which are linked and secured using cryptography. Blockchains by design are inherently resistant to modification of the data. They perform as an open, distributed ledger that can record transactions between two parties efficiently and in a verifiable, permanent way. A blockchain is typically managed by a peer-to-peer network collectively adhering to a protocol for validating new blocks. Once recorded, the data in any given block cannot be altered retroactively without the alteration of all subsequent blocks, which requires collusion of the network majority. On a large network, this level of collusion is generally impractical, which makes blockchain networks effectively immutable and trustworthy.
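
The linking described above can be illustrated with a toy hash chain. This is a minimal sketch, not how any particular cryptocurrency implements its ledger: the block layout, the function names and the sample records are all assumptions made for illustration, and there is no network, mining or consensus here.

```python
import hashlib
import json


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(records):
    """Link each record to its predecessor by embedding the predecessor's hash."""
    chain = []
    prev_hash = "0" * 64  # placeholder "previous hash" for the first block
    for index, data in enumerate(records):
        digest = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev_hash": prev_hash, "hash": digest}
        )
        prev_hash = digest
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that fails verification, or None."""
    prev_hash = "0" * 64
    for block in chain:
        expected = block_hash(block["index"], block["data"], prev_hash)
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return block["index"]
        prev_hash = block["hash"]
    return None


# Hypothetical ledger entries, for illustration only.
chain = build_chain(["alice pays bob 1", "bob pays carol 2", "carol pays dan 3"])
untouched = first_invalid_block(chain)

chain[1]["data"] = "bob pays mallory 2"  # tamper with a recorded transaction
tampered = first_invalid_block(chain)
```

Editing the middle block makes verification fail at that block. To hide the change, its hash and the hash of every later block would have to be recomputed, which is the property the paragraph above refers to.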

Blockchain process

The other element common to all cryptocurrencies is their use of public and private keys, which are stored in the currency’s wallet. A cryptocurrency wallet stores the public and private “keys” or “addresses” that can be used to receive or spend the cryptocurrency. With the private key, it is possible to write in the public ledger (blockchain), effectively spending the associated cryptocurrency. With the public key, it is possible for others to send currency to the wallet.

What is a cryptocurrency address?

Cryptocurrency “coins” can be lost if the owner loses the private keys needed to spend the currency they own. It’s as if the owner had lost a bank account number and had no way to verify their identity to the bank, or if they lost the U.S. dollars they had in their wallet. The assets are gone and unusable.

The Cryptocurrency Wallet

Given the importance of these keys, and lack of recourse if they are lost, it’s obviously very important to keep track of your keys.

If you’re being careful in choosing reputable exchanges, app developers, and other services with whom to trust your cryptocurrency, you’ve made a good start in keeping your investment secure. But if you’re careless in managing the keys to your bitcoins, ether, Litecoin, or other cryptocurrency, you might as well leave your money on a cafe tabletop and walk away.

What Are the Differences Between Hot and Cold Wallets?

Just like other numbers you might wish to keep track of — credit cards, account numbers, phone numbers, passphrases — cryptocurrency keys can be stored in a variety of ways. Those who use their currencies for day-to-day purchases most likely will want them handy in a smartphone app, hardware key, or debit card that can be used for purchases. These are called “hot” wallets. Some experts advise keeping the balances in these devices and apps to a minimal amount to avoid hacking or data loss. We typically don’t walk around with thousands of dollars in U.S. currency in our old-style wallets, so this is really a continuation of the same approach to managing spending money.

A “hot” wallet, the Bread mobile app

Some investors with large balances keep their keys in “cold” wallets, or “cold storage,” i.e. a device or location that is not connected online. If funds are needed for purchases, they can be transferred to a more easily used payment medium. Cold wallets can be hardware devices, USB drives, or even paper copies of your keys.

A “cold” wallet, the Trezor hardware wallet

A “cold” wallet, the Ledger Nano S

A “cold” Bitcoin paper wallet

Wallets are suited to holding one or more specific cryptocurrencies, and some people have multiple wallets for different currencies and different purposes.

A paper wallet is nothing other than a printed record of your public and private keys. Some prefer their records to be completely disconnected from the internet, and a piece of paper serves that need. Just like writing down an account password on paper, however, it’s essential to keep the paper secure to avoid giving someone the ability to freely access your funds.

How to Keep Your Keys and Cryptocurrency Secure

In a post this coming Thursday, Securing Your Cryptocurrency, we’ll discuss the best strategies for backing up your cryptocurrency so that your currencies don’t become part of the millions that have been lost. We’ll cover the common (and uncommon) approaches to backing up hot wallets and cold wallets, and to using paper and metal solutions to keep your keys safe.

In the meantime, please tell us of your experiences with cryptocurrencies — good and bad — and how you’ve dealt with the issue of cryptocurrency security.

The post Cryptocurrency Security Challenges appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Internet Users Warned Over Fake 20th Century Fox Piracy ‘Fines’

Post Syndicated from Andy original https://torrentfreak.com/internet-users-warned-over-fake-20th-century-fox-piracy-fines-171220/

Most people who obtain and share large quantities of material online understand that doing so comes with risk, possibly in the form of an ISP-forwarded warning, a letter demanding cash, or even a visit from the police.

While the latter only happens in the rarest of circumstances, warnings are relatively commonplace, especially in the United States where companies like Rightscorp pump them out in their thousands. Letters demanding cash payment, sent by so-called copyright trolls, are less prevalent but these days most people understand the concept of a piracy ‘fine’.

With this level of understanding in the mainstream there are opportunities for scammers, who have periodically tried to extract payments from Internet users who have done nothing wrong. This is currently the case in Germany, where a consumer group is warning of a wave of piracy ‘fines’ being sent out to completely innocent victims.

The emails, which claim to be sent on behalf of 20th Century Fox, allege the recipient has infringed copyright on streaming portal Kinox.to. For this apparent transgression, they demand a payment of more than 375 euros but the whole thing is an elaborate scam.

The 20th Century Fox ‘piracy’ scam

Unlike some fairly primitive previous efforts, however, these emails are actually quite clever.

Citing a genuine ruling from the European Court of Justice which found that streaming pirated content is illegal inside the EU, the cash demand presents personal information about the user, such as IP address, browser, and operating system.

However, instead of obtaining these via an external piracy monitoring system and subsequent court order (as happens with BitTorrent cases), the data is pulled from the user’s machine when a third-party link is clicked.
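
To see why these details prove nothing, consider what any web server learns from a single request. The sketch below is an illustration under stated assumptions: `describe_visitor` is a hypothetical helper, the substring matching is deliberately naive, and the IP address and User-Agent string are example values, not data from the scam emails.

```python
def describe_visitor(remote_addr, headers):
    """Summarize what a server learns from one HTTP request."""
    user_agent = headers.get("User-Agent", "")

    # Order matters: Chrome's User-Agent also contains "Safari/",
    # and Android's also contains "Linux".
    browser = "unknown"
    for token, name in (
        ("Firefox/", "Firefox"),
        ("Chrome/", "Chrome"),
        ("Safari/", "Safari"),
    ):
        if token in user_agent:
            browser = name
            break

    operating_system = "unknown"
    for token, name in (
        ("Windows", "Windows"),
        ("Android", "Android"),
        ("Mac OS X", "macOS"),
        ("Linux", "Linux"),
    ):
        if token in user_agent:
            operating_system = name
            break

    return {"ip": remote_addr, "browser": browser, "os": operating_system}


example = describe_visitor(
    "203.0.113.7",  # documentation-only address range (RFC 5737)
    {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/63.0.3239.84 Safari/537.36"
    },
)
```

Anyone who gets a user to click a link can collect this much, so an email quoting an IP address, browser and operating system is not evidence of any infringement.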

As highlighted by Tarnkappe, who first noticed the warning, there are other elements to the cash demands that point to an elaborate scam.

Perhaps the biggest tell of all is the complete absence of precise details of the alleged infringement, such as the title of the content supposedly obtained along with a time and date. These are common features of all genuine settlement demands so any that fail to mention content should be treated with caution.

“Do not pay. It is rip off. Report to the police,” the local consumer group warns.

Interestingly, warning recipients are advised by the scammers to pay their ‘fine’ directly to a bank account in the United Kingdom. Hopefully it will have been shut down by now but it’s worth mentioning that people should avoid direct bank transfers with anyone they don’t trust.

If any payment must be made, credit cards are a much safer option but in the case of wannabe trolls, they’re best ignored until they appear with proper proof backed up by credible legal documentation. Even then, people should consider putting up a fight, if they’re being unfairly treated.

Top 10 Torrent Site TorrentDownloads Blocked By Chrome and Firefox

Post Syndicated from Andy original https://torrentfreak.com/top-10-torrent-site-torrentdownloads-blocked-by-chrome-and-firefox-171107/

While the popularity of torrent sites isn’t as strong as it used to be, tens of millions of people use them on a daily basis.

Content availability is rich and the majority of the main movie, TV show, game and software releases appear on them within minutes, offering speedy and convenient downloads. Nevertheless, things don’t always go as smoothly as people might like.

Over the past couple of days that became evident to visitors of TorrentDownloads, one of the Internet’s most popular torrent sites.

TorrentDownloads – usually a reliable and tidy platform

Instead of viewing the rather comprehensive torrent index that made the Top 10 Most Popular Torrent Site lists in 2016 and 2017, visitors receive a warning.

“Attackers on torrentdownloads.me may trick you into doing something dangerous like installing software or revealing your personal information (for example, passwords, phone numbers or credit cards),” Chrome users are warned.

“Google Safe Browsing recently detected phishing on torrentdownloads.me. Phishing sites pretend to be other websites to trick you.”

Chrome warning

People using Firefox also receive a similar warning.

“This web page at torrentdownloads.me has been reported as a deceptive site and has been blocked based on your security preferences,” the browser warns.

“Deceptive sites are designed to trick you into doing something dangerous, like installing software, or revealing your personal information, like passwords, phone numbers or credit cards.”

A deeper check on Google’s malware advisory service echoes the same information, noting that the site contains “harmful content” that may “trick visitors into sharing personal info or downloading software.” Checks carried out with MalwareBytes reveal that service blocking the domain too.

TorrentFreak spoke with the operator of TorrentDownloads who told us that the warnings had been triggered by a rogue advertiser which was immediately removed from the site.

“We have already requested a review with Google Webmaster after we removed an old affiliates advertiser and changed the links on the site,” he explained.

“In Google Webmaster they state that the request will be processed within 72 Hours, so I think it will be reviewed today when 72 hours are completed.”

This statement suggests that the site itself wasn’t the direct culprit, but ads hosted elsewhere. That being said, these kinds of warnings look very scary to visitors and sites have to take responsibility, so completely expelling the bad player from the platform was the correct choice. Nevertheless, people shouldn’t be too surprised at the appearance of suspect ads.

Many top torrent sites, including The Pirate Bay and KickassTorrents, have suffered from similar warnings, which are often a product of anti-piracy efforts from the entertainment industries.

In the past, torrent and streaming sites could display ads from top-tier providers with few problems. However, in recent years, the so-called “follow the money” anti-piracy tactic has forced the majority away from pirate sites, meaning they now have to do business with ad networks that may not always be as tidy as one might hope.

While these warnings are the very last thing the sites in question want (they’re hardly good for increasing visitor numbers), they’re a gift to entertainment industry groups.

At the same time as the industries are forcing decent ads away, these alerts provide a great opportunity to warn users about the potential problems left behind as a result. A loose analogy might be deliberately cutting off beer supply to an unlicensed bar then warning people not to go there because the homebrew sucks. In some cases it can be true, but it’s a problem only being exacerbated by industry tactics.

It’s worth noting that no warnings are received by visitors to TorrentDownloads using Android devices, meaning that desktop users were probably the only people at risk. In any event, it’s expected that the warnings will disappear during the next day, so the immediate problems will be over. As far as TF is informed, the offending ads were removed days ago.

That appears to be backed up by checks carried out on a number of other malware scanning services. Norton, Opera, SiteAdvisor, Spamhaus, Yandex and ESET all declare the site to be clean.

Technical Chrome and Firefox users who are familiar with these types of warnings can take steps (Chrome, FF) to bypass the blocks, if they really must.

Spaghetti Download – Web Application Security Scanner

Post Syndicated from Darknet original https://www.darknet.org.uk/2017/10/spaghetti-download-web-application-security-scanner/?utm_source=rss&utm_medium=social&utm_campaign=darknetfeed

Spaghetti is an open-source web application security scanner designed to find various default and insecure files, configurations, and misconfigurations.

It is built on Python 2.7 and can run on any platform which has a Python environment.

Features of Spaghetti Web Application Security Scanner

  • Fingerprints
    • Server
    • Web Frameworks (CakePHP, CherryPy,…)
    • Web Application Firewall (Waf)
    • Content Management System (CMS)
    • Operating System (Linux, Unix,..)
    • Language (PHP, Ruby,…)
    • Cookie Security
  • Bruteforce
    • Admin Interface
    • Common Backdoors
    • Common Backup Directory
    • Common Backup File
    • Common Directory
    • Common File
    • Log File
  • Disclosure
    • Emails
    • Private IP
    • Credit Cards
  • Attacks
    • HTML Injection
    • SQL Injection
    • LDAP Injection
    • XPath Injection
    • Cross Site Scripting (XSS)
    • Remote File Inclusion (RFI)
    • PHP Code Injection
  • Other
    • HTTP Allow Methods
    • HTML Object
    • Multiple Index
    • Robots Paths
    • Web Dav
    • Cross Site Tracing (XST)
    • PHPINFO
    • .Listing
  • Vulns
    • ShellShock
    • Anonymous Cipher (CVE-2007-1858)
    • Crime (SPDY) (CVE-2012-4929)
    • Struts-Shock
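
To give a feel for what the "Disclosure" checks in the list above involve, here is a rough sketch of scanning a response body for emails, private IP addresses and credit card numbers. It is not Spaghetti's code: the function names and regular expressions are assumptions chosen for illustration, and it is written for Python 3 rather than the Python 2.7 the tool targets.

```python
import ipaddress
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_disclosures(body):
    """Collect emails, private IPv4 addresses and Luhn-valid card numbers."""
    emails = sorted(set(EMAIL_RE.findall(body)))

    private_ips = []
    for candidate in IPV4_RE.findall(body):
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # looks like an address but is not one, e.g. 999.1.1.1
        if address.is_private and candidate not in private_ips:
            private_ips.append(candidate)

    cards = []
    for candidate in CARD_RE.findall(body):
        digits = re.sub(r"[ -]", "", candidate)
        if luhn_valid(digits) and digits not in cards:
            cards.append(digits)

    return {"emails": emails, "private_ips": private_ips, "credit_cards": cards}


# Hypothetical response body; 4111 1111 1111 1111 is a widely used test number.
sample = (
    "Contact admin@example.com or see 10.0.0.12 (public: 8.8.8.8). "
    "Card 4111 1111 1111 1111, order 1234567890123."
)
report = find_disclosures(sample)
```

The Luhn check is what keeps the 13-digit order number out of the results. A pattern match alone would flag any long run of digits as a possible card number.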

Using Spaghetti Web Application Security Scanner

[email protected]:~/Spaghetti# python spaghetti.py
_____ _ _ _ _
| __|___ ___ ___| |_ ___| |_| |_|_|
|__ | .

Read the rest of Spaghetti Download – Web Application Security Scanner now! Only available at Darknet.

How to Query Personally Identifiable Information with Amazon Macie

Post Syndicated from Chad Woolf original https://aws.amazon.com/blogs/security/how-to-query-personally-identifiable-information-with-amazon-macie/

Amazon Macie logo

In August 2017 at the AWS Summit New York, AWS launched a new security and compliance service called Amazon Macie. Macie uses machine learning to automatically discover, classify, and protect sensitive data in AWS. In this blog post, I demonstrate how you can use Macie to help enable compliance with applicable regulations, starting with data retention.

How to query retained PII with Macie

Data retention and mandatory data deletion are common topics across compliance frameworks, so knowing what is stored and how long it has been or needs to be stored is of critical importance. For example, you can use Macie for Payment Card Industry Data Security Standard (PCI DSS) 3.2, requirement 3, “Protect stored cardholder data,” which mandates a “quarterly process for identifying and securely deleting stored cardholder data that exceeds defined retention.” You also can use Macie for ISO 27017 requirement 12.3.1, which calls for “retention periods for backup data.” In each of these cases, you can use Macie’s built-in queries to identify the age of data in your Amazon S3 buckets and to help meet your compliance needs.
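
The age check behind a retention review is simple date arithmetic. The sketch below is not Macie's implementation. It assumes object records shaped like the `Contents` entries of an S3 ListObjectsV2 response (`Key` and a timezone-aware `LastModified`), and the function name, sample keys and 90-day period are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def objects_past_retention(objects, retention_days, now=None):
    """Return the keys of objects last modified before the retention cutoff."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [obj["Key"] for obj in objects if obj["LastModified"] < cutoff]


# Hypothetical listing, for illustration only.
listing = [
    {
        "Key": "exports/2016-q4.csv",
        "LastModified": datetime(2016, 12, 31, tzinfo=timezone.utc),
    },
    {
        "Key": "exports/2017-q3.csv",
        "LastModified": datetime(2017, 9, 30, tzinfo=timezone.utc),
    },
]

expired = objects_past_retention(
    listing,
    retention_days=90,  # assumed retention period; take this from your own policy
    now=datetime(2017, 10, 15, tzinfo=timezone.utc),
)
```

With these sample values only `exports/2016-q4.csv` falls outside the window. Whether such an object must be deleted still depends on what it contains, which is where Macie's classification comes in.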

To get started with Macie and run your first queries of personally identifiable information (PII) and sensitive data, follow the initial setup as described in the launch post on the AWS Blog. After you have set up Macie, walk through the following steps to start running queries. Start by focusing on the S3 buckets that you want to inventory, so that you capture important compliance-related activity and data.

To start running Macie queries:

  1. In the AWS Management Console, launch the Macie console (you can type macie to find the console).
  2. Click Dashboard in the navigation pane. This shows you an overview of the risk level and data classification type of all inventoried S3 buckets, categorized by date and type.
    Screenshot of "Dashboard" in the navigation pane
  3. Choose S3 objects by PII priority. This dashboard lets you sort by PII priority and PII types.
    Screenshot of "S3 objects by PII priority"
  4. In this case, I want to find information about credit card numbers. I choose the magnifying glass for the type cc_number (note that PII types can be used for custom queries). This view shows the events where PII classified data has been uploaded to S3. When I scroll down, I see the individual files that have been identified.
    Screenshot showing the events where PII classified data has been uploaded to S3
  5. Before looking at the files, I want to continue to build the query by only showing items with high priority. To do so, I choose the row called Object PII Priority and then the magnifying glass icon next to High.
    Screenshot of refining the query for high priority events
  6. To view the results matching these queries, I scroll down and choose any file listed. This shows vital information such as creation date, location, and object access control list (ACL).
  7. The piece I am most interested in here is the Object PII details line, which tells me more about what was found in the file. In this case, I see name and credit card information, which is what caused the high priority. Scrolling up again, I also see that the query fields have updated as I interacted with the UI.
    Screenshot showing "Object PII details"

Let’s say that I want to get an alert every time Macie finds new data matching this query. This alert can be used to automate response actions by using AWS Lambda and Amazon CloudWatch Events.

  1. I choose the left green icon called Save query as alert.
    Screenshot of "Save query as alert" button
  2. I can customize the alert and change things like category or severity to fit my needs based on the alert data.
  3. Another way to find the information I am looking for is to run custom queries. To start using custom queries, I choose Research in the navigation pane.
    1. To learn more about custom Macie queries and what you can do on the Research tab, see Using the Macie Research Tab.
  4. I change the type of query I want to run from CloudTrail data to S3 objects in the drop-down list menu.
    Screenshot of choosing "S3 objects" from the drop-down list menu
  5. Because I want PII data, I start typing in the query box, which has an autocomplete feature. I choose the pii_types: query. I can now type the data I want to look for. In this case, I want to see all files matching the credit card filter so I type cc_number and press Enter. The query box now says, pii_types:cc_number. I press Enter again to enable autocomplete, and then I type AND pii_types:email to require both a credit card number and email address in a single object.
    The query looks for all files matching the credit card filter ("cc_number")
  6. I choose the magnifying glass to search and Macie shows me all S3 objects that are tagged as PII of type Credit Card. I can further specify that I only want to see PII of type Credit Card that is classified as High priority by adding AND pii_impact:high to the query.
    Screenshot showing narrowing the query results further

As before, I can save this new query as an alert by clicking Save query as alert, which will be triggered by data matching the query going forward.
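
The query built up in the steps above is a string of field:value terms joined with AND. As a minimal sketch of that structure, the snippet below assembles the same string in Python. The helper name build_query is hypothetical and not part of Macie; only the field names pii_types and pii_impact come from the steps above.

```python
def build_query(pii_types, impact=None):
    """Join field:value terms with AND (illustrative helper, not a Macie API)."""
    terms = [f"pii_types:{t}" for t in pii_types]
    if impact is not None:
        terms.append(f"pii_impact:{impact}")
    return " AND ".join(terms)


# Same terms as typed into the Research tab in steps 5 and 6.
query = build_query(["cc_number", "email"], impact="high")
print(query)
```

The resulting string can be pasted into the query box on the Research tab.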

Advanced tip

Try the following advanced queries using Lucene query syntax and save the queries as alerts in Macie.

  • Use a regular-expression based query to search for a minimum of 10 credit card numbers and 10 email addresses in a single object:
    • pii_explain.cc_number:/([1-9][0-9]|[0-9]{3,}) distinct Credit Card Numbers.*/ AND pii_explain.email:/([1-9][0-9]|[0-9]{3,}) distinct Email Addresses.*/
  • Search for objects containing at least one credit card, name, and email address that have an object policy enabling global access (searching for S3 AllUsers or AuthenticatedUsers permissions):
    • (object_acl.Grants.Grantee.URI:"http\://acs.amazonaws.com/groups/global/AllUsers" OR object_acl.Grants.Grantee.URI:"http\://acs.amazonaws.com/groups/global/AuthenticatedUsers") AND (pii_types:cc_number AND pii_types:email AND pii_types:name)
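
To see why the number part of the first query means "10 or more", it can help to try the pattern locally. The sketch below uses Python's re module, which is a different regex engine from Lucene's, although this pattern only uses syntax common to both. Lucene regex queries match against the whole term, so fullmatch is used here to mimic that. The sample summary strings are assumptions modeled on the pattern, not captured Macie output.

```python
import re

# Pattern copied from the credit card half of the first query above.
CC_PATTERN = re.compile(r"([1-9][0-9]|[0-9]{3,}) distinct Credit Card Numbers.*")


def has_at_least_ten(summary):
    """Return True when the whole summary string matches the pattern."""
    return CC_PATTERN.fullmatch(summary) is not None


# Hypothetical summary strings, for illustration only.
samples = [
    "9 distinct Credit Card Numbers",
    "10 distinct Credit Card Numbers",
    "250 distinct Credit Card Numbers found",
]
for text in samples:
    print(text, "->", has_at_least_ten(text))
```

The first alternative, [1-9][0-9], covers 10 through 99, and the second, [0-9]{3,}, covers any count with three or more digits, so single-digit counts never match.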

These are two ways to identify and be alerted about PII by using Macie. In a similar way, you can create custom alerts for various AWS CloudTrail events by choosing a different data set on which to run the queries again. In the examples in this post, I identified credit cards stored in plain text (all data in this post is example data only), determined how long they had been stored in S3 by viewing the result details, and set up alerts to notify or trigger actions on new sensitive data being stored. With queries like these, you can build a reliable data validation program.

If you have comments about this post, submit them in the “Comments” section below. If you have questions about how to use Macie, start a new thread on the Macie forum or contact AWS Support.

-Chad

How Much Does ‘Free’ Premier League Piracy Cost These Days?

Post Syndicated from Andy original https://torrentfreak.com/how-much-does-free-premier-league-piracy-cost-these-days-170902/

Right now, the English Premier League is engaged in perhaps the most aggressively innovative anti-piracy operation the Internet has ever seen. After obtaining a new High Court order, it now has the ability to block ‘pirate’ streams of matches, in real-time, with no immediate legal oversight.

If the Premier League believes a server is streaming one of its matches, it can ask ISPs in the UK to block it, immediately. That’s unprecedented anywhere on the planet.

As previously reported, this campaign caused a lot of problems for people trying to access free and premium streams at the start of the season. Many IPTV services were blocked in the UK within minutes of matches starting, with free streams also dropping like flies. According to information obtained by TF, more than 600 illicit streams were blocked during that weekend.

While some IPTV providers and free streams continued without problems, it seems likely that it’s only a matter of time before the EPL begins to pick off more and more suppliers. To be clear, the EPL isn’t taking services or streams down, it’s only blocking them, which means that people using circumvention technologies like VPNs can get around the problem.

However, this raises the big issue again – that of continuously increasing costs. While piracy is often painted as free, it is not, and as setups get fancier, costs increase too.

Below, we take a very general view of a handful of the many ‘pirate’ configurations currently available, to work out how much ‘free’ piracy costs these days. The list is not comprehensive by any means (and excludes more obscure methods such as streaming torrents, which are always free and rarely blocked), but it gives an idea of costs and how the balance of power might eventually tip.

Basic beginner setup

On a base level, people who pirate online need at least some equipment. That could be an Android smartphone and easily installed free software such as Mobdro or Kodi. An Internet connection is a necessity and if the EPL blocks those all-important streams, a VPN provider is required to circumvent the bans.

Assuming people already have a phone and the Internet, a VPN can be bought for less than £5 per month. This basic setup is certainly cheap but overall it’s an entry level experience that provides quality equal to the effort and money expended.

Equipment: Phone, tablet, PC
Comms: Fast Internet connection, decent VPN provider
Overall performance: Low quality, unpredictable, often unreliable
Cost: £5pm approx for VPN, plus Internet costs

Big screen, basic

For those who like their matches on the big screen, stepping up the chain costs more money. People need a TV with an HDMI input and a fast Internet connection as a minimum, alongside some kind of set-top device to run the necessary software.

Android devices are the most popular and are roughly split into two groups – the small standalone box type and the plug-in ‘stick’ variant such as Amazon’s Firestick.

A cheap Android set-top box

These cost upwards of £30 to £40 but the software to install on them is free. Like the phone, Mobdro is an option, but most people look to a Kodi setup with third-party addons. That said, all streams received on these setups are now vulnerable to EPL blocking so in the long-term, users will need to run a paid VPN.

The problem here is that some devices (including the 1st gen Firestick) aren’t ideal for running a VPN on top of a stream, so people will need to dump their old device and buy something more capable. That could cost another £30 to £40 and more, depending on requirements.

Importantly, none of this investment guarantees a decent stream – that’s down to what’s available on the day – but invariably the quality is low and/or intermittent, at best.

Equipment: TV, decent Android set-top box or equivalent
Comms: Fast Internet connection, decent VPN provider
Overall performance: Low to acceptable quality, unpredictable, often unreliable
Cost: £30 to £50 for set-top box, £5pm approx for VPN, plus Internet

Premium IPTV – PC or Android based

At this point, premium IPTV services come into play. People have a choice of spending varying amounts of money, depending on the quality of experience they require.

First of all, a monthly IPTV subscription with an established provider that isn’t going to disappear overnight is required, which can be a challenge to find in itself. We’re not here to review or recommend services but needless to say, like official TV packages they come in different flavors to suit varying wallet sizes. Some stick around, many don’t.

A decent one with a Sky-like EPG costs between £7 and £15 per month, depending on the quality and depth of streams, and how far in front users are prepared to commit.

Fairly typical IPTV with EPG (VOD shown)

Paying for a year in advance tends to yield better prices but with providers regularly disappearing and faltering in their service levels, people are often reluctant to do so. That said, some providers experience few problems so it’s a bit like gambling – research can improve the odds but there’s never a guarantee.

However, even when a provider, price, and payment period are decided upon, the process of paying for an IPTV service can be less than straightforward.

While some providers are happy to accept PayPal, many will only deal in credit cards, bitcoin, or other obscure payment methods. That sets up more barriers to entry that might deter the less determined customer. And, if time is indeed money, fussing around with new payment processors can be pricey, at least to begin with.

Once subscribed though, watching these streams is pretty straightforward. On a base level, people can use a phone, tablet, or set-top device to receive them, using software such as Perfect Player IPTV, for example. Currently available in free (ad supported) and premium (£2) variants, this software can be set up in a few clicks and will provide a decent user experience, complete with EPG.

Perfect Player IPTV

Those wanting to go down the PC route have more options but by far the most popular is receiving IPTV via a Kodi setup. For the complete novice, it’s not always easy to set up but some IPTV providers supply their own free addons, which streamline the process massively. These can also be used on Android-based Kodi setups, of course.

Nevertheless, if the EPL blocks the provider, a VPN is still going to be needed to access the IPTV service.

An Android tablet running Kodi

So, even if we ignore the cost of the PC and Internet connection, users could still find themselves paying between £10 and £20 per month for an IPTV service and a decent VPN. While more channels than simply football will be available from most providers, this is getting dangerously close to the £18 Sky is asking for its latest football package.

Equipment: TV, PC, or decent Android set-top box or equivalent
Comms: Fast Internet connection, IPTV subscription, decent VPN provider
Overall performance: High quality, mostly reliable, user-friendly (once set up)
Cost: PC or £30/£50 for set-top box, IPTV subscription £7 to £15pm, £5pm approx for VPN, plus Internet, plus time and patience for obscure payment methods.
Note: There are zero refunds when IPTV providers disappoint or disappear

Premium IPTV – Deluxe setup

Moving up to the top of the range, things get even more costly. Those looking to give themselves the full home entertainment-like experience will often move away from the PC and into the living room in front of the TV, armed with a dedicated set-top box. Weapon of choice: the Mag254.

Like Amazon’s FireStick, PC or Android tablet, the Mag254 is an entirely legal, content agnostic device. However, enter the credentials provided by many illicit IPTV suppliers and users are presented with a slick Sky-like experience, far removed from anything available elsewhere. The device is operated by remote control and integrates seamlessly with any HDMI-capable TV.

Mag254 IPTV box

Something like this costs around £70 in the UK, plus the cost of a WiFi adaptor on top, if needed. The cost of the IPTV provider needs to be figured in too, plus a VPN subscription if the provider gets blocked by EPL, which is likely. However, in this respect the Mag254 has a problem – it can’t run a VPN natively. This means that if streams get blocked and people need to use a VPN, they’ll need to find an external solution.

Needless to say, this costs more money. People can either do all the necessary research and buy a VPN-capable router/modem that’s also compatible with their provider (this can stretch to a couple of hundred pounds) or they’ll need to invest in a small ‘travel’ router with VPN client features built in.

‘Travel’ router (with tablet running Mobdro for scale)

These devices are available on Amazon for around £25 and sit in between the Mag254 (or indeed any other wireless device) and the user’s own regular router. Once the details of the VPN subscription are entered into the router, all traffic passing through is encrypted and will tunnel through web blocking measures. They usually solve the problem (ymmv) but of course, this is another cost.

Equipment: Mag254 or similar, with WiFi
Comms: Fast Internet connection, IPTV subscription, decent VPN provider
Overall performance: High quality, mostly reliable, very user-friendly
Cost: Mag254 around £75 with WiFi, IPTV subscription £7 to £15pm, £5pm for VPN (plus £25 for mini router), plus Internet, plus patience for obscure payment methods.
Note: There are zero refunds when IPTV providers disappoint or disappear
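
The summaries above can be turned into a rough first-year comparison. The sketch below is illustrative arithmetic only: the figures come from the cost lines above, ranges are kept as (low, high) pairs, and it assumes that a VPN is needed every month, that prices stay flat for 12 months, and that Internet and existing hardware (phone, TV, PC) cost nothing extra. The setup labels and the function name are made up for the example.

```python
# Amounts in GBP. One-off hardware and monthly fees as (low, high) ranges.
SETUPS = {
    "Basic beginner": {"one_off": (0, 0), "monthly": (5, 5)},
    "Big screen, basic": {"one_off": (30, 50), "monthly": (5, 5)},
    "Premium IPTV (set-top box)": {"one_off": (30, 50), "monthly": (7 + 5, 15 + 5)},
    "Premium IPTV deluxe": {"one_off": (75 + 25, 75 + 25), "monthly": (7 + 5, 15 + 5)},
}

SKY_MONTHLY = 18  # the football package price quoted in the article


def first_year_cost(setup, months=12):
    """Return the (low, high) cost of hardware plus `months` of monthly fees."""
    low = setup["one_off"][0] + months * setup["monthly"][0]
    high = setup["one_off"][1] + months * setup["monthly"][1]
    return low, high


for name, setup in SETUPS.items():
    low, high = first_year_cost(setup)
    print(f"{name}: GBP {low} to {high} in year one")
print(f"Sky football package: GBP {SKY_MONTHLY * 12} in year one")
```

Under these assumptions the premium setups overlap with the price of the Sky package over a year.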

Conclusion

On the whole, people who want a reliable and high-quality Premier League streaming experience cannot get one for free, no matter where they source the content. There are many costs involved, some of which cannot be avoided.

If people aren’t screwing around with annoying and unreliable Kodi streams, they’ll be paying for an IPTV provider, VPN and other equipment. Or, if they want an easy life, they’ll be paying Sky, BT or Virgin Media. That might sound harsh to many pirates but it’s the only truly reliable solution.

However, for those looking for something that’s merely adequate, costs drop significantly. Indeed, if people don’t mind the hassle of wondering whether a sub-VHS quality stream will appear before the big match and stay on throughout, it can all be done on a shoestring.

But perhaps the most important thing to note in respect of costs are the recent changes to the pricing of Premier League content in the UK. As mentioned earlier, Sky now delivers a sports package for £18pm, which sounds like the best deal offered to football fans in recent years. It will be tempting for sure and has all the hallmarks of a price point carefully calculated by Sky.

The big question is whether it will be low enough to tip significant numbers of people away from piracy. The reality is that if another couple of thousand streams get hit hard again this weekend – and the next – and the next – many pirating fans will be watching the season drift away for yet another month, unviewed. That’s got to be frustrating.

The bottom line is that high-quality streaming piracy is becoming a little bit pricey just for football so if it becomes unreliable too – and that’s the Premier League’s goal – the balance of power could tip. At this point, the EPL will need to treat its new customers with respect, in order to keep them feeling both entertained and unexploited.

Fail on those counts – especially the latter – and the cycle will start again.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

The Cost of Cloud Storage

Post Syndicated from Tim Nufire original https://www.backblaze.com/blog/cost-of-cloud-storage/

the cost of the cloud as a percentage of revenue

This week, we’re celebrating the one year anniversary of the launch of Backblaze B2 Cloud Storage. Today’s post is focused on giving you a peek behind the curtain about the costs of providing cloud storage. Why? Over the last 10 years, the most common question we get is still “how do you do it?” In this multi-billion dollar, global industry exhibiting exponential growth, none of the other major players seem to be willing to discuss the underlying costs. By exposing a chunk of the Backblaze financials, we hope to provide a better understanding of what it costs to run “the cloud,” and continue our tradition of sharing information for the betterment of the larger community.

Context
Backblaze built one of the industry’s largest cloud storage systems and we’re proud of that accomplishment. We bootstrapped the business and funded our growth through a combination of our own business operations and just $5.3M in equity financing ($2.8M of which was invested into the business – the other $2.5M was a tender offer to shareholders). To do this, we had to build our storage system efficiently and run as a real, self-sustaining, business. After over a decade in the data storage business, we have developed a deep understanding of cloud storage economics.

Definitions
I promise we’ll get into the costs of cloud storage soon, but some quick definitions first:

    Revenue: Money we collect from customers.
    Cost of Goods Sold (“COGS”): The costs associated with providing the service.
    Operating Expenses (“OpEx”): The costs associated with developing and selling the service.
    Income/Loss: What is left after subtracting COGS and OpEx from Revenue.

I’m going to focus today’s discussion on the Cost of Goods Sold (“COGS”): What goes into it, how it breaks down, and what percent of revenue it makes up. Backblaze is a roughly break-even business with COGS accounting for 47% of our revenue and the remaining 53% spent on our Operating Expenses (“OpEx”) like developing new features, marketing, sales, office rent, and other administrative costs that are required for us to be a functional company.

This post’s focus on COGS should let us answer the commonly asked question of “how do you provide cloud storage for such a low cost?”

Breaking Down Cloud COGS

Providing a cloud storage service requires the following components (COGS and OpEx – below we break out COGS):
cloud infrastructure costs as a percentage of revenue

  • Hardware: 23% of Revenue
    Backblaze stores data on hard drives. Those hard drives are “wrapped” with servers so they can connect to the public and store data. We’ve discussed our approach to how this works with our Vaults and Storage Pods. Our infrastructure is purpose built for data storage. That is, we thought about how data storage ought to work, and then built it from the ground up. Other companies may use different storage media like Flash, SSD, or even tape. But it all serves the same function of being the thing that data actually is stored on. For today, we’ll think of all this as “hardware.”

    We buy storage hardware that, on average, will last 5 years (60 months) before needing to be replaced. To account for hardware costs in a way that can be compared to our monthly expenses, we amortize them and recognize 1/60th of the purchase price each month.

    Storage Pods and hard drives are not the only hardware in our environment. We also have to buy the cabinets and rails that hold the servers, core servers that manage accounts/billing/etc., switches, routers, power strips, cables, and more. (Our post on bringing up a data center goes into some of this detail.) However, Storage Pods and the drives inside them make up about 90% of all the hardware cost.

  • Data Center (Space & Power): 8% of Revenue
    “The cloud” is a great marketing term and one that has caught on for our industry. That said, all “clouds” store data on something physical like hard drives. Those hard drives (and servers) are actual, tangible things that take up actual space on earth, not in the clouds.

    At Backblaze, we lease space in colocation facilities which offer a secure, temperature controlled, reliable home for our equipment. Other companies build their own data centers. It’s the classic rent vs buy decision; but it always ends with hardware in racks in a data center.

    Hardware also needs power to function. Not everyone realizes it, but electricity is a significant cost of running cloud storage. In fact, some data center space is billed simply as a function of an electricity bill.

    Every hard drive storing data adds incremental space and power need. This is a cost that scales with storage growth.

    I also want to make a comment on taxes. We pay sales and property tax on hardware, and it is amortized as part of the hardware section above. However, it’s valuable to think about taxes when considering the data center since the location of the hardware actually drives the amount of taxes on the hardware that gets placed inside of it.

  • People: 7% of Revenue
    Running a data center requires humans to make sure things go smoothly. The more data we store, the more human hands we need in the data center. All drives will fail eventually. When they fail, “stuff” needs to happen to get a replacement drive physically mounted inside the data center and filled with the customer data (all customer data is redundantly stored across multiple drives). The individuals that are associated specifically with managing the data center operations are included in COGS since, as you deploy more hard drives and servers, you need more of these people.

    Customer Support is the other group of people that are part of COGS. As customers use our services, questions invariably arise. To service our customers and get questions answered expediently, we staff customer support from our headquarters in San Mateo, CA. They do an amazing job! Staffing models, internally, are a function of the number of customers and the rate of acquiring new customers.

  • Bandwidth: 3% of Revenue
    We have over 350 PB of customer data being stored across our data centers. The bulk of that has been uploaded by customers over the Internet (the other option, our Fireball service, is 6 months old and is seeing great adoption). Uploading data over the Internet requires bandwidth – basically, an Internet connection similar to the one running to your home or office. But, for a data center, instead of contracting with Time Warner or Comcast, we go “upstream.” Effectively, we’re buying wholesale.

    Understanding how that dynamic plays out with your customer base is a significant driver of how a cloud provider sets its pricing. Being in business for a decade has explicit advantages here. Because we understand our customer behavior, and have reached a certain scale, we are able to buy bandwidth in sufficient bulk to offer the industry’s best download pricing at $0.02 / Gigabyte (compared to $0.05 from Amazon, Google, and Microsoft).

    Why does optimizing download bandwidth charges matter for customers of a data storage business? Because it has a direct relationship to you being able to retrieve and use your data, which is important.

  • Other Fees: 6% of Revenue
    We have grouped the remaining costs inside of “Other Fees.” This includes fees we pay to our payment processor as well as the costs of running our Restore Return Refund program.

    A payment processor is required for businesses like ours that need to accept credit cards securely over the Internet. The bulk of the money we pay to the payment processor is actually passed through to pay the credit card companies like AmEx, Visa, and Mastercard.

    The Restore Return Refund program is a unique program for our consumer and business backup business. Customers can download any and all of their files directly from our website. We also offer customers the ability to order a hard drive with some or all of their data on it, which we then FedEx to the customer wherever in the world she is. If the customer chooses, she can return the drive to us for a full refund. Customers love the program, but it does cost Backblaze money. We choose to subsidize the cost associated with this service in an effort to provide the best customer experience we can.
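
As a quick check on the breakdown above, the component percentages can be summed and compared against the 47% COGS and 53% OpEx figures quoted earlier. The sketch below also shows the straight-line amortization described in the hardware section, where 1/60th of the purchase price is recognized each month. The purchase price used is hypothetical and purely illustrative, as are the variable and function names.

```python
# COGS components as a percentage of revenue, as listed above.
COGS_SHARE = {
    "Hardware": 23,
    "Data Center (Space & Power)": 8,
    "People": 7,
    "Bandwidth": 3,
    "Other Fees": 6,
}

total_cogs = sum(COGS_SHARE.values())  # percentage of revenue spent on COGS
opex = 100 - total_cogs                # remainder for a break-even business


def monthly_amortization(purchase_price, lifetime_months=60):
    """Straight-line amortization: recognize an equal share of the price each month."""
    return purchase_price / lifetime_months


# Hypothetical purchase price, not a Backblaze figure.
example_price = 12_000
print(f"COGS: {total_cogs}% of revenue, OpEx: {opex}% of revenue")
print(f"Monthly expense on a {example_price} purchase: {monthly_amortization(example_price)}")
```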

The Big Picture

At the beginning of the post, I mentioned that Backblaze is, effectively, a break even business. The reality is that our products drive a profitable business but those profits are invested back into the business to fund product development and growth. That means growing our team as the size and complexity of the business expands; it also means being fortunate enough to have the cash on hand to fund “reserves” of extra hardware, bandwidth, data center space, etc. In our first few years as a bootstrapped business, having sufficient buffer was a challenge. Having weathered that storm, we are particularly proud of being in a financial place where we can afford to make things a bit more predictable.

All this adds up to answer the question of how Backblaze has managed to carve out its slice of the cloud market – a market that is a key focus for some of the largest companies of our time. We have innovated a novel, purpose built storage infrastructure with our Vaults and Pods. That infrastructure allows us to keep costs very, very low. Low costs enable us to offer the world’s most affordable, reliable cloud storage.

Does reliable, affordable storage matter? For a company like Vintage Aerial, it enables them to digitize 50 years’ worth of aerial photography of rural America and share that national treasure with the world. Having the best download pricing in the storage industry means Austin City Limits, a PBS show out of Austin, can digitize and preserve over 550 concerts.

We think offering purpose built, affordable storage is important. It empowers our customers to monetize existing assets, make sure data is backed up (and not lost), and focus on their core business because we can handle their data storage needs.

The post The Cost of Cloud Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Tips on Winning the ecommerce Game

Post Syndicated from Sarah Wilson original https://www.anchor.com.au/blog/2017/02/tips-ecommerce-hosting-game/

The ecommerce world is constantly changing and evolving, which is exactly why you must keep on top of the game. Arguably, choosing a reliable host is the most important decision that an eCommerce business makes. That’s why we have noted 5 major reasons why a quality hosting provider is vital.

High Availability

The most important thing to think about when choosing a host and your infrastructure is “How much is it going to cost me when my site goes down?”
If your site is down, especially for a long period of time, you could be losing customers and profits. One way to minimise this is to create a highly available environment on the cloud. This means that there is a ‘redundancy’ plan in place to minimise the chances of your site being offline for even a minute.

SEO Ranking

Having a good SEO ranking isn’t purely based on your content. If your site is extremely slow to load, or doesn’t load at all, the ‘secret Google bots’ will push your site further and further down the results page. We recommend using a CDN (Content Delivery Network) such as Cloudflare to help improve performance.

Security

This may seem like a fairly obvious concern, but making sure you have regular security updates and patches is vital, especially if credit cards or money transfers are involved on your site. Obviously there is no one way to combat every security concern on the internet; however, making sure you have regular backups and 24/7 support will help in any situation.

Scalability 

What happens when you have a sale or run an advertising campaign and suddenly have a flurry of traffic to your site? In order for your site to be able to cope with the new influx, it needs to be scalable. A good hosting provider can make your site scalable so that there is no downtime when your site is hit with a heavy traffic load. Generally, the best direction to follow when scalability is a priority is the cloud or Amazon Web Services. The best part is that not only do you pay only for what you use, but hosting on the Amazon infrastructure also gives you an SLA (Service Level Agreement) guaranteeing 99.95% uptime.
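
To put the 99.95% figure in perspective, it can be converted into the downtime it still allows. The sketch below is plain arithmetic and says nothing about how the SLA is actually measured or enforced. The 30-day month and the function name are assumptions made for the example.

```python
def allowed_downtime_minutes(uptime_percent, days):
    """Minutes of downtime permitted over `days` at the given uptime percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


per_month = allowed_downtime_minutes(99.95, 30)        # assumes a 30-day month
per_year_hours = allowed_downtime_minutes(99.95, 365) / 60

print(f"Per 30-day month: about {per_month:.1f} minutes")
print(f"Per year: about {per_year_hours:.2f} hours")
```

In other words, a 99.95% guarantee still leaves room for roughly 20 minutes of downtime a month, which is worth weighing against the cost of an outage discussed under High Availability.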

Stress-Free Support

Finally, a good hosting provider will take away any stress that is related to hosting. If your site goes down at 3am, you don’t want to be the person having to deal with it. At Anchor, we have a team of expert Sysadmins available 24/7 to take the stress out of keeping your site up and online.

With these 5 points in mind, you can now make 2017 your year, and beat the game that is eCommerce.

If you have security concerns, or are experiencing slow page loads or even downtime, we can perform a free ecommerce site assessment to help define a hosting roadmap that will allow you to speed ahead of the competition. If you would simply like to learn more about eCommerce hosting on Anchor’s award winning hosting network, simply contact us and our friendly staff will get back to you ASAP.

The post Tips on Winning the ecommerce Game appeared first on AWS Managed Services by Anchor.

AWS Hot Startups- January 2017

Post Syndicated from Ana Visneski original https://aws.amazon.com/blogs/aws/aws-hot-startups-january-2017-2/

It is the start of a new year and Tina Barr is back with many more great new startups to check out.
-Ana


Welcome back to another year of hot AWS-powered startups! We have three exciting new startups today:

  • ClassDojo – Connecting teachers, students, and parents to the classroom.
  • Nubank – A financial services startup reimagining the banking experience.
  • Ravelin – A fraud detection company built on machine learning models.

If you missed any of last year’s featured startups, be sure to check out our Year in Review.

ClassDojo (San Francisco)
Founded in 2011 by Liam Don and Sam Chaudhary, ClassDojo is a communication platform for the classroom. Teachers, parents, and students can use it throughout the day as a place to share important moments through photos, videos and messaging. With many classrooms today operating as a one-size-fits-all model, the ClassDojo founders wanted to improve the education system and connect the 700 million primary age kids in the world to the very best content and services. Sam and Liam started out by asking teachers what they would find most helpful for their classrooms, and many expressed that they wanted a more caring and inclusive community – one where they could be connected to everyone who was part of their classroom. With ClassDojo, teachers are able to create their own classroom culture in partnership with students and their parents.

In five years, ClassDojo has expanded to 90% of K-8 schools in the US and 180 other countries, and their content has been translated into over 35 languages. Recently, they have expanded further into classrooms with video series on Empathy and Growth Mindset that were co-created with Harvard and Stanford. These videos have now been seen by 1 in 3 kids under the age of 14 in the U.S. One of their products called Stories allows for instantly updated streams of pictures and videos from the school day, all of which are shared at home with parents. Students can even create their own stories – a timeline or portfolio of what they’ve learned.

Because ClassDojo sees heavy usage during the school day and across many global time zones, their traffic patterns are highly variable. Amazon EC2 autoscaling allows them to meet demand while controlling costs during quieter periods. Their data pipeline is built entirely on AWS – Amazon Kinesis allows them to stream high volumes of data into Amazon Redshift for analysis and into Amazon S3 for archival. They also utilize Amazon Aurora and Amazon RDS to store sensitive relational data, which makes at-rest encryption easy to manage, while scaling to meet very high query volumes with incredibly low latency. All of ClassDojo’s web frontends are hosted on Amazon S3 and served through Amazon CloudFront, and they use AWS WAF rules to secure their frontends against attacks and unauthorized access. To detect fraudulent accounts they have used Amazon Machine Learning, and are also exploring the new Amazon Lex service to provide voice control so that teachers can use their products hands-free in the classroom.

Check out their blog to see how teachers across the world are using ClassDojo in their classrooms!

Nubank (Brazil)
Nubank is a technology-driven financial services startup that is working to redefine the banking standard in Brazil. Founder David Vélez and a team of over 350 engineers, scientists, designers, and analysts have created a banking alternative in one of the world’s fastest growing mobile markets. Not only is Brazil the world’s 5th largest country in both area and population, but it also has one of the highest credit card interest rates in the world. Nubank has reimagined the credit card experience for a world where everyone has access to smartphones and offers a product customers haven’t seen before.

The Brazilian banking industry is both heavily regulated and extremely concentrated. Nubank saw an opportunity for companies that are truly customer-centric and have better data and technology to compete in an industry that has seen little innovation in decades. With Nubank’s mobile app customers are able to block and unblock their credit cards, change their credit limits, pay their bills, and have access to all of their purchases in real time. They also offer 24/7 customer support through digital channels and clear and simple communication. This was previously unheard of in Brazil’s banking industry, and Nubank’s services have been extremely well-received by customers.

From the start, Nubank’s leaders planned for growth. They wanted to build a system that could meet the ever-changing regulatory and business rules, have full auditing capability, and scale in both size and complexity. They use many AWS services including Amazon DynamoDB, Amazon EC2, Amazon S3, and AWS CloudFormation. By using AWS, Nubank developed its credit card processing platform in only seven months and is able to add features with ease.

Go to Nubank’s blog for more information!

Ravelin (London)
Launched in 2015, Ravelin is a fraud detection company that works with many leading e-commerce and on-demand companies in a range of sectors including travel, retail, food delivery, ticketing, and transport. The company’s founders (Martin Sweeney, Leonard Austin, Mairtain O’Riada, and Nicky Lally) began their work while trying to solve fraud issues in an on-demand taxi business, which required making accurate fraud predictions about a customer from limited information and then making the fraud decision almost instantly. They soon found that there was nothing on the market that was able to do this, and so the founders left to start Ravelin.

Ravelin allows its clients to spend less time on manual reviews and instead focus on servicing their customers. Their machine learning models are built to predict good and bad behavior based on the relevant customer behavioral and payment data sent via API. Spotting bad behavior helps Ravelin to prevent fraud, and equally importantly, spotting good patterns means fewer good customers are being blocked. Ravelin chose machine learning as their core technology due to its incredible accuracy at a speed and scale that aligns with how their clients’ businesses operate.

Ravelin uses a suite of AWS services to help their machine learning algorithms detect fraud. Their clients are spread all over the world and their peak traffic times can be unpredictable so they scale their Amazon EC2 infrastructure multiple times a day, which helps with handling increased traffic while minimizing server costs. Ravelin also uses services such as Amazon RDS, Amazon DynamoDB, Amazon ElastiCache, and Amazon Elasticsearch Service. Utilizing these services has allowed the Ravelin team more time to concentrate on building fraud detection software.

For the latest in fraud prevention, be sure to check out Ravelin’s blog!

-Tina Barr

Security Risks of TSA PreCheck

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2016/12/security_risks_12.html

Former TSA Administrator Kip Hawley wrote an op-ed pointing out the security vulnerabilities in the TSA’s PreCheck program:

The first vulnerability in the system is its enrollment process, which seeks to verify an applicant’s identity. We know verification is a challenge: A 2011 Government Accountability Office report on TSA’s system for checking airport workers’ identities concluded that it was “not designed to provide reasonable assurance that only qualified applicants” got approved. It’s not a stretch to believe a reasonably competent terrorist could construct an identity that would pass PreCheck’s front end.

The other step in PreCheck’s “intelligence-driven, risk-based security strategy” is absurd on its face: The absence of negative information about a person doesn’t mean he or she is trustworthy. News reports are filled with stories of people who seemed to be perfectly normal right up to the moment they committed a heinous act. There is no screening algorithm and no database check that can accurately predict human behavior — especially on the scale of millions. It is axiomatic that terrorist organizations recruit operatives who have clean backgrounds and interview well.

None of this is news.

Back in 2004, I wrote:

Imagine you’re a terrorist plotter with half a dozen potential terrorists at your disposal. They all apply for a card, and three get one. Guess which are going on the mission? And they’ll buy round-trip tickets with credit cards and have a “normal” amount of luggage with them.

What the Trusted Traveler program does is create two different access paths into the airport: high security and low security. The intent is that only good guys will take the low-security path, and the bad guys will be forced to take the high-security path, but it rarely works out that way. You have to assume that the bad guys will find a way to take the low-security path.

The Trusted Traveler program is based on the dangerous myth that terrorists match a particular profile and that we can somehow pick terrorists out of a crowd if we only can identify everyone. That’s simply not true. Most of the 9/11 terrorists were unknown and not on any watch list. Timothy McVeigh was an upstanding US citizen before he blew up the Oklahoma City Federal Building. Palestinian suicide bombers in Israel are normal, nondescript people. Intelligence reports indicate that Al Qaeda is recruiting non-Arab terrorists for US operations.

I wrote much the same thing in 2007:

Background checks are based on the dangerous myth that we can somehow pick terrorists out of a crowd if we could identify everyone. Unfortunately, there isn’t any terrorist profile that prescreening can uncover. Timothy McVeigh could probably have gotten one of these cards. So could have Eric Rudolph, the pipe bomber at the 1996 Olympic Games in Atlanta. There isn’t even a good list of known terrorists to check people against; the government list used by the airlines has been the butt of jokes for years.

And have we forgotten how prevalent identity theft is these days? If you think having a criminal impersonating you to your bank is bad, wait until they start impersonating you to the Transportation Security Administration.

The truth is that whenever you create two paths through security — a high-security path and a low-security path — you have to assume that the bad guys will find a way to exploit the low-security path. It may be counterintuitive, but we are all safer if the people chosen for more thorough screening are truly random and not based on an error-filled database or a cursory background check.

In a companion blog post, Hawley has more details about why the program doesn’t work:

In the sense that PreCheck bars people who were identified by intelligence or law enforcement agencies as possible terrorists, then it was intelligence-driven. But using that standard for PreCheck is ridiculous since those people already get extra screening or are on the No-Fly list. The movie Patriots Day, out now, reminds us of the tragic and preventable Boston Marathon bombing. The FBI sent agents to talk to the Tsarnaev brothers and investigate them as possible terror suspects. And cleared them. Even they did not meet the “intelligence-driven” definition used in PreCheck.

The other problem with “intelligence-driven” in the PreCheck context is that intelligence actually tells us the opposite; specifically that terrorists pick clean operatives. If TSA uses current intelligence to evaluate risk, it would not be out enrolling everybody they can into pre-9/11 security for everybody not flagged by the security services.

Hawley and I may agree on the problem, but we have completely opposite solutions. The op-ed was too short to include details, but they’re in a companion blog post. Basically, he wants to screen PreCheck passengers more:

In the interests of space, I left out details of what I would suggest as short-and medium-term solutions. Here are a few ideas:

  • Immediately scrub the PreCheck enrollees for false identities. That can probably be accomplished best and most quickly by getting permission from members, and then using, commercial data. If the results show that PreCheck has already been penetrated, the program should be suspended.
  • Deploy K-9 teams at PreCheck lanes.

  • Use Behaviorally trained officers to interact with and check the credentials of PreCheck passengers.

  • Use Explosives Trace Detection cotton swabs on PreCheck passengers at a much higher rate. Same with removing shoes.

  • Turn on the body scanners and keep them fully utilized.

  • Allow liquids to stay in the carry-on since TSA scanners can detect threat liquids.

  • Work with the airlines to keep the PreCheck experience positive.

  • Work with airports to place PreCheck lanes away from regular checkpoints so as not to diminish lane capacity for non-PreCheck passengers. Rental Car check-in areas could be one alternative. Also, downtown check-in and screening (with secure transport to the airport) is a possibility.

These solutions completely ignore the data from the real-world experiment PreCheck has been. Hawley writes that PreCheck tells us that “terrorists pick clean operatives.” That’s exactly wrong. PreCheck tells us that, basically, there are no terrorists. If 1) it’s an easier way through airport security that terrorists will invariably use, and 2) there have been no instances of terrorists using it in the 10+ years it and its predecessors have been in operation, then the inescapable conclusion is that the threat is minimal. Instead of screening PreCheck passengers more, we should screen everybody else less. This is me in 2012: “I think the PreCheck level of airport screening is what everyone should get, and that the no-fly list and the photo ID check add nothing to security.”

I agree with Hawley that we need to overhaul airport security. Me in 2010: “Airport security is the last line of defense, and it’s not a very good one.” We need to recognize that the actual risk is much lower than we fear, and ratchet airport security down accordingly. And then we need to continue to invest in investigation and intelligence: security measures that work regardless of the tactic or target.

Opening Soon – AWS Office in Dubai to Support Cloud Growth in UAE

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/opening-soon-aws-office-in-dubai-to-support-cloud-growth-in-uae/

The AWS office in Dubai, UAE will open on January 1, 2017.

We’ve been working with the Dubai Investment Development Agency (Dubai FDI) to launch the office, and plan to support startups, government institutions, and some of the Middle East’s historic and most established enterprises as they make the transition to the AWS Cloud.

Resources in Dubai
The office will be staffed with account managers, solutions architects, partner managers, professional services consultants, and support staff to allow customers to interact with AWS in their local setting and language.

In addition to access to the AWS team, customers in the Middle East have access to several important AWS programs including AWS Activate, AWS Educate, and AWS Training and Certification:

  • AWS Activate is designed to provide startups with resources that will help them to get started on AWS, including up to $100,000 (USD) in AWS promotional credits.
  • AWS Educate is a global initiative designed to provide students and educators with the resources needed to accelerate cloud-based learning endeavors.
  • AWS Training and Certification helps technologists to develop the skills to design, deploy, and operate infrastructure in the AWS Cloud.

We are also planning to host AWSome days and other large-scale training events in the region.

Customers in the Middle East
Middle Eastern organizations were among the earliest adopters of cloud services when AWS launched in 2006. Customers based in the region are using AWS to run everything from development and test environments to Big Data analytics, from mobile, web and social applications to enterprise business applications and mission critical workloads.

AWS counts some of the UAE’s most well-known and fastest growing businesses as customers, including PayFort and Careem, as well as government institutions and some of the largest companies in the Middle East, such as flydubai and Middle East Broadcasting Center.

Careem is the leading ride booking app in the Middle East and North Africa. Launched in 2012, Careem runs totally on AWS and over the past three years has grown by 10x in size every year. This is growth that would not have been possible without AWS. After starting with one city, Dubai, Careem now serves millions of commuters in 43 cities across the Middle East, North Africa and Asia. Careem uses over 500 EC2 instances as well as a number of other services such as Amazon S3, Amazon DynamoDB, Elastic Beanstalk and others.

PayFort is a startup based in the United Arab Emirates that provides payment solutions to customers across the Middle East through its payments gateway, FORT. The platform enables organizations to accept online payments via debit and credit cards. PayFort counts Etihad Airways, Ferrari World, and Souq.com among its customers. PayFort chose to run FORT entirely on AWS technologies and as a result is saving 32% over their on-premises costs. Although cost was key for PayFort, it turns out that they chose AWS due to the high level of security that they could achieve with the platform. Compliance with Payment Card Industry Data Security Standard (PCI DSS) and International Organization for Standardization (ISO) 27001 is central to PayFort’s payment services, and both are available with AWS (we were actually the first cloud provider to reach compliance with version 3.1 of PCI DSS).

flydubai is the leading low-cost airline in the Middle East, with over 90 destinations, and was launched by the government of Dubai in 2009. flydubai chose to build their online check-in platform on AWS and went from design to production in four months; the platform is now used by thousands of passengers a day. This timeline would not have been possible without the cloud. Given the seasonal fluctuations in demand for flights, flydubai also needs an IT provider that allows it to cope with spikes in demand. Using AWS allows them to do this, and lead times for new infrastructure services have been reduced from up to 10 weeks to a matter of hours.

Partners
The AWS Partner Network of consulting and technology partners in the region helps our customers to get the most from the cloud. The network includes global members like CSC as well as prominent regional members such as Redington.

Redington is an AWS Consulting Partner and is the Master Value Added Distributor for AWS in the Middle East and North Africa. They are also an Authorized Commercial Reseller of AWS cloud technologies. Redington is helping organizations in the MEA region with cloud assessment, cloud readiness, design, implementation, migration, deployment and optimization of cloud resources. They also have an ecosystem of partners, including ISVs, with experienced and certified AWS engineers who have cross-domain experience.

Join Us
This announcement is part of our continued expansion across Europe, the Middle East, and Asia. As part of our investment in these areas, we created over 10,000 new jobs in 2015. If you are interested in joining our team in Dubai or in any other location around the world, check out the Amazon Jobs site.

Jeff;

 

Combine NoSQL and Massively Parallel Analytics Using Apache HBase and Apache Hive on Amazon EMR

Post Syndicated from Ben Snively original https://blogs.aws.amazon.com/bigdata/post/Tx3EGE8Z90LZ9WX/Combine-NoSQL-and-Massively-Parallel-Analytics-Using-Apache-HBase-and-Apache-Hiv

Ben Snively is a Solutions Architect with AWS

Jon Fritz, a Senior Product Manager for Amazon EMR, co-authored this post

With today’s launch of Amazon EMR release 4.6, you can now quickly and easily provision a cluster with Apache HBase 1.2. Apache HBase is a massively scalable, distributed big data store in the Apache Hadoop ecosystem. It is an open-source, non-relational, versioned database which runs on top of the Hadoop Distributed Filesystem (HDFS), and it is built for random, strictly consistent realtime access for tables with billions of rows and millions of columns. It has tight integration with Apache Hadoop, Apache Hive, and Apache Pig, so you can easily combine massively parallel analytics with fast data access. Apache HBase’s data model, throughput, and fault tolerance are a good match for workloads in ad tech, web analytics, financial services, applications using time-series data, and many more.

Table structure in Apache HBase, like many NoSQL technologies, should be directly influenced by the queries and access patterns of the data. Query performance varies drastically based on the way the cluster has to process and return the data. In this post, we’ll demonstrate query performance differences by showing you how to launch an EMR cluster with HBase and restore a table from a snapshot in Amazon S3. The table in the snapshot contains approximately 3.5 million rows, and you’ll perform look-ups and scans using the HBase shell as well as perform SQL queries over the same dataset using Hive with the Hive query editor in the Hue UI.

Introduction to Apache HBase

HBase is considered a persistent, multidimensional, sorted map, where each cell is indexed by a row key and column key (family and qualifier). Each cell can contain multiple versions of a value captured with a long variable, which defaults to a timestamp value. The reason it’s considered multidimensional is that there are several parameters that contribute to the cell location.
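To make the "multidimensional sorted map" idea concrete, here is a minimal Python sketch of the logical data model only. The class name `TinyTable` and its methods are hypothetical illustrations, not part of any HBase API, and the sketch says nothing about how HBase actually stores data on disk.

```python
import time


class TinyTable:
    """Toy in-memory model of HBase's logical data model (hypothetical, not an HBase API)."""

    def __init__(self):
        # Maps (rowkey, family, qualifier) -> {timestamp: value}
        self._cells = {}

    def put(self, row, family, qualifier, value, ts=None):
        # The version defaults to the current time in milliseconds.
        ts = int(time.time() * 1000) if ts is None else ts
        self._cells.setdefault((row, family, qualifier), {})[ts] = value

    def get(self, row, family, qualifier, ts=None):
        # Return the requested version, or the newest one when ts is omitted.
        versions = self._cells.get((row, family, qualifier), {})
        if not versions:
            return None
        if ts is None:
            ts = max(versions)
        return versions.get(ts)

    def scan(self):
        # Yield the newest value of every cell in sorted key order.
        for key in sorted(self._cells):
            versions = self._cells[key]
            yield key, versions[max(versions)]
```

Each lookup needs the row key, the column family, and the qualifier, with the timestamp as an optional fourth coordinate; that combination is what makes the map multidimensional.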

Table Structure

An HBase table is composed of one or more column families that are defined for the table. Each column family defines shared storage and a set of properties for an arbitrary grouping of columns. Column families are predefined (either at table creation time or by modifying an existing table), but the columns themselves can be dynamically created and discovered while adding or updating rows.

HBase stores data in HDFS, which spreads data stored in a table across the cluster. All the columns in a column family are stored together in a set of HFiles (also known as column-oriented storage). A rowkey, which is immutable and uniquely defines a row, usually spans multiple HFiles. Rowkeys are treated as byte arrays (byte[]) and are stored in sorted order in the multi-dimensional sorted map. Cells in your table are byte arrays too.

Data from a table is served by RegionServers running on nodes in your cluster. Each RegionServer manages a namespace of rowkeys, and HBase can split regions as tables get larger, to keep the namespace for each region to a manageable size.
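As a rough illustration of how a sorted rowkey namespace can be divided into regions, the following Python sketch locates the region for a rowkey with a binary search over region start keys. This is a simplified model under stated assumptions: the function names and the sample start keys are hypothetical, and real HBase clients locate regions through the hbase:meta table rather than a local list.

```python
from bisect import bisect_right


def find_region(region_start_keys, rowkey):
    """Return the index of the region whose key range contains rowkey.

    region_start_keys must be sorted, and the first region starts at b"".
    """
    return bisect_right(region_start_keys, rowkey) - 1


def split_region(region_start_keys, split_key):
    """Return a new sorted list of start keys after splitting at split_key."""
    if split_key in region_start_keys:
        return list(region_start_keys)
    i = bisect_right(region_start_keys, split_key)
    return region_start_keys[:i] + [split_key] + region_start_keys[i:]


# Hypothetical table with two regions: [b"", b"m") and [b"m", end)
starts = [b"", b"m"]
print(find_region(starts, b"armstrong_zula_8570365786"))
print(find_region(split_region(starts, b"g"), b"hall_x_1"))
```

Because rowkeys are kept in sorted order, a split only adds one boundary to the list, and each region keeps a contiguous slice of the key space.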

Query Performance

Lower query latency and higher throughput are achieved when each query scans less data and queries are distributed across all the RegionServers. Querying for a specific row and column allows the cluster to quickly find the RegionServer and the underlying files that store the information and return it to the caller. A partial range, if known, can speed up queries by reducing the rows that need to be scanned when a single row is unknown. By default, HBase also utilizes row-level bloom filters to reduce the number of disk reads per Get request.

The diagram below shows the relationship between cardinality and query performance:

HBase shell query walkthrough

Using the console or the CLI, launch a new EMR 4.6 cluster that has the HBase, Hive, and Hue applications selected. Next, you restore a table snapshot on your cluster. We created this snapshot from an HBase table for this demo; snapshotting is a convenient way to back up and restore tables for production HBase clusters. Below is the schema of the data stored in the snapshot:

The rowkey is a composite value that combines lastname, firstname, and customerId. There are three column families that group columns containing information about address, credit card, and contact information.
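The composite rowkey can be sketched in a few lines of Python. The underscore separator and lowercase form are inferred from the sample key 'armstrong_zula_8570365786' used later in this post; the helper names are hypothetical, and the sketch assumes that names never contain an underscore.

```python
def make_rowkey(lastname, firstname, customer_id):
    """Build a composite rowkey as bytes: lastname_firstname_customerId."""
    return f"{lastname.lower()}_{firstname.lower()}_{customer_id}".encode("utf-8")


def parse_rowkey(rowkey):
    """Split a composite rowkey back into its three parts (all strings)."""
    lastname, firstname, customer_id = rowkey.decode("utf-8").split("_", 2)
    return lastname, firstname, customer_id


keys = sorted([
    make_rowkey("Smith", "Al", "2"),
    make_rowkey("Armstrong", "Zula", "8570365786"),
    make_rowkey("Armstrong", "Bob", "3"),
])
# Sorting the byte strings groups customers by last name, then first name.
print(keys)
```

Putting lastname first means that all customers sharing a last name sit next to each other in the sorted key space, which is what makes the prefix scans later in the walkthrough cheap.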

After the cluster is running, recover the sample HBase snapshot from Amazon S3. HBase uses Hadoop MapReduce and EMRFS under the hood to transfer snapshots quickly from Amazon S3. SSH to the master node of your cluster, and use the HBase shell to create an empty table which will be populated by the restored data.

hbase shell
>> create 'customer', 'address', 'cc', 'contact'

Next, run an HBase command (outside of the shell) to copy the snapshot from Amazon S3 to HDFS on your cluster.

sudo su hbase -s /bin/sh -c 'hbase snapshot export -D hbase.rootdir=s3://us-east-1.elasticmapreduce.samples/hbase-demo-customer-data/snapshot -snapshot customer_snapshot1 -copy-to hdfs://<MASTERDNS>:8020/user/hbase -mappers 2 -chuser hbase -chmod 700'

Finally, use the HBase shell to disable the ‘customer’ table, restore the snapshot, and re-enable the table.

hbase shell
>> disable 'customer'
>> restore_snapshot 'customer_snapshot1'
>> enable 'customer'

To demonstrate how specifying different portions of the key structure in the multi-dimensional map influences performance, you will perform a variety of queries using the HBase shell.

Get the credit card type for a specific customer

In this case, you specify every parameter in the multi-dimensional map and quickly return the cell from the RegionServer. The syntax for this ‘get’ request is ‘TABLE_NAME’, ‘ROWKEY’, ‘COLUMN_FAMILY:COLUMN’.

hbase(main):008:0> get 'customer', 'armstrong_zula_8570365786', 'cc:type'

Get all the address data from the address column family for a specific customer

In this case, you return all of the data from the ‘address’ column family for a specific customer. As you can see, there is a time penalty when compared to just returning one specific cell.

hbase(main):011:0> get 'customer', 'armstrong_zula_8570365786', {COLUMN => 'address'}

                                              

Get all the data on a specific customer

In this case, you return all of the data for a specific customer. There is a time cost for returning additional information.

hbase(main):004:0> get 'customer', 'armstrong_zula_8570365786'

 

Get all the cities for each row that has a customer with a last name starting with “armstrong”

In this case, there is a partial prefix for the rowkey as well as a specific column family/qualifier for which you are querying. The HBase cluster can determine which regions and RegionServers need to be queried. Each of those RegionServers then checks its HFiles for the matching rowkeys and column data.

There are multiple performance improvements that you can enable on your table, such as bloom filters. Bloom filters allow the RegionServer to skip over some of the HFiles quickly.

hbase(main):014:0> scan 'customer', {STARTROW => 'armstrong_', ENDROW => 'armstronh', COLUMN => 'address:city'}
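The scan above bounds the key range with a start row and an end row. As a hedged sketch of why that works, the Python below derives an exclusive end row from a prefix by incrementing its last byte, then selects a slice of a sorted key list. The function names and sample keys are hypothetical; the sketch models only the sorted-key logic, not the HBase client.

```python
from bisect import bisect_left


def prefix_stop_row(prefix):
    """Return the smallest byte string greater than every key starting with prefix."""
    p = bytearray(prefix)
    # Trailing 0xFF bytes cannot be incremented, so drop them first.
    while p and p[-1] == 0xFF:
        p.pop()
    if not p:
        return None  # No upper bound exists for this prefix.
    p[-1] += 1
    return bytes(p)


def range_scan(sorted_keys, start_row, stop_row):
    """Return keys in [start_row, stop_row), like a scan with an inclusive start and exclusive end."""
    lo = bisect_left(sorted_keys, start_row)
    hi = len(sorted_keys) if stop_row is None else bisect_left(sorted_keys, stop_row)
    return sorted_keys[lo:hi]


keys = sorted([
    b"adams_a_1",
    b"armstrong_zula_8570365786",
    b"armstrong_bob_2",
    b"armstrongson_x_3",
    b"arnold_c_4",
    b"smith_d_5",
])
print(range_scan(keys, b"armstrong_", prefix_stop_row(b"armstrong")))
```

Here `prefix_stop_row(b"armstrong")` evaluates to `b"armstronh"`, the same end row used in the shell command, so only the contiguous slice of keys with that last-name prefix is visited rather than the whole table.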

Get all customers that use a Visa credit card

In this example, HBase has to scan over every RegionServer, and search all the HFiles that contain the cc Column Family. The full list of customers that use Visa credit cards is returned.

hbase(main):014:0> scan 'customer', { COLUMNS => 'cc:type',
                   FILTER => "ValueFilter( =, 'binaryprefix:visa' )" }

 

Running SQL analytics over data in HBase tables using Hive

HBase can organize your dataset for fast look-ups and scans, easily update rows, and bulk update your tables. Though it doesn’t have a native SQL interface to run massively parallel analytics workloads, HBase tightly integrates with Apache Hive, which utilizes Hadoop MapReduce as an execution engine, allowing you to write SQL queries on your HBase tables quickly or join data in HBase with other datasets.

For this post, you will write your Hive queries and browse tables in the Hive Metastore in the Hue UI, which runs on the master node of your cluster. To connect to Hue, see Launch the Hue Web Interface.

After you have signed into Hue in your browser, explore the UI by choosing Data Browsers, HBase Browser. Select the “customer” table to view your HBase table in the browser. Here you can see a sample of rows in your table and can query, add, and update rows through the UI. To learn more about the functionality available, see HBase Browser in the Hue documentation. Use the search box to scan HBase for 100 records with the rowkey prefix “smith” and return the credit card type and state:

Smith* +100 [cc:type,address:state]

 

Next, go to the Hive editor under Query Editor and create an external Hive table over your data in HBase:

CREATE EXTERNAL TABLE customer
(rowkey STRING,
street STRING,
city STRING,
state STRING,
zip STRING,
cctype STRING,
ccnumber STRING,
ccexpire STRING,
phone STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,address:street,address:city,address:state,address:zip,cc:type,cc:number,cc:expire,contact:phone', 'hbase.scan.cacheblocks' = 'false', 'hbase.scan.cache' = '1000')
TBLPROPERTIES ('hbase.table.name' = 'customer');

Hive uses the HBase storage handler to interact with HBase tables, and you specify the column information and other information in the SerDe properties. In this table, you disable caching the blocks requested by Hive in RegionServers to avoid replacing records in the cache, and set the number of batched records returned from HBase to a client to 1,000. Other settings for the storage handler can be found in the Class HBaseSerDe topic in the Hive documentation.                        

After creating the Hive table, you can browse the schema and sample the data in the Metastore Tables section under the Data Browsers category:

Now that you have created your table, run some SQL queries. Note that Hive creates one mapper per input split, and calculates one split per region in your HBase table (see the HBase storage handler page for more information). The table in this example has two regions, so the Hive job uses two mappers regardless of the true memory requirements of the job.

Mappers use the memory settings present on the cluster, and the default mapper sizes for each instance type can be found in the EMR documentation. However, the way Hive calculates the number of mappers (1 per region) may cause too few mappers to be created for the actual data size, which in turn can cause out-of-memory issues for your job. To give each mapper more memory than the defaults, you can set these parameters in your Hive query editor (these stay alive for the duration of your Hive session in Hue):

SET mapreduce.map.memory.mb=2000;
SET mapreduce.map.java.opts=-Xmx1500m;

Execute, and now you have increased the mapper memory size. Now, count the rows in the HBase table:

SELECT count (rowkey)
FROM customer;

As you can see from the logs, Hive creates a MapReduce job, and each mapper uses predicate pushdown with the RegionServer scan API to return data efficiently. Now, perform a more advanced query: find the count of each credit card type for all customers from California.

SELECT cctype, count(rowkey) AS count
FROM customer
WHERE state = 'CA'
GROUP BY cctype;

Choose Chart in the results section to view the output in a graphical representation.

Querying, fusing, and aggregating data from S3 and HBase

You can create multiple tables in the Hive Metastore, and have each table backed by different external sources including Amazon S3, HDFS, and HBase. These sources can be used together in a join, enabling you to enrich data across sources and aggregate metrics across datasets.

In this example, you are going to use data stored in Amazon S3 to help enrich the data already stored in HBase. The data in HBase has the abbreviation for each state in each row. However, there is a CSV file in S3 that contains the state abbreviation (as a join key), full state name, governor name, and governor’s political affiliation.

First, you need to create an external table over Amazon S3. Use the Hive query editor in Hue to create the table:

CREATE EXTERNAL TABLE StateInfo(
    StateName STRING,
    Abbr STRING,
    Name STRING,
    Affiliation STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://us-east-1.elasticmapreduce.samples/hbase-demo-customer-data/state-data/';

In the following query, you find the number of customers per credit card type and state name. To do this, query tables in both HBase and S3 as input sources, and join the tables on the state abbreviation:

SELECT cctype, statename, count(rowkey)
FROM customer c
JOIN stateinfo s ON (c.state = s.abbr)
GROUP BY cctype, statename;
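
To make the semantics of this join concrete, here is a small in-memory sketch in Python of what the query computes. The sample rows are hypothetical (the real tables live in HBase and S3), the function name is made up for illustration, and the sketch assumes each abbreviation appears only once in the state table.

```python
from collections import Counter

# Hypothetical sample rows; the real data lives in HBase and S3.
customers = [
    {"rowkey": "c1", "state": "CA", "cctype": "visa"},
    {"rowkey": "c2", "state": "CA", "cctype": "visa"},
    {"rowkey": "c3", "state": "WA", "cctype": "mastercard"},
    {"rowkey": "c4", "state": "ZZ", "cctype": "visa"},  # no matching state row
]
state_info = [
    {"abbr": "CA", "statename": "California"},
    {"abbr": "WA", "statename": "Washington"},
]


def count_by_cctype_and_state(customers, state_info):
    """Inner-join customers to states on the abbreviation, then count per group."""
    name_by_abbr = {s["abbr"]: s["statename"] for s in state_info}
    counts = Counter()
    for c in customers:
        statename = name_by_abbr.get(c["state"])
        if statename is None:
            continue  # an inner join drops customers with no matching state
        counts[(c["cctype"], statename)] += 1
    return counts


if __name__ == "__main__":
    for (cctype, statename), n in count_by_cctype_and_state(customers, state_info).items():
        print(cctype, statename, n)
```

As with the Hive query, customers whose state abbreviation has no match in the state table are left out of the result.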

Next, you can see the top 10 governors who have the most constituents using Visa credit cards:

SELECT name, max(affiliation) AS affiliation, count(rowkey) AS visa_customers
FROM customer c
JOIN stateinfo s ON (c.state = s.abbr)
WHERE cctype = 'visa'
GROUP BY name
ORDER BY visa_customers DESC
LIMIT 10;

Conclusion

In this post, you explored the HBase table structure, restoring table snapshots from S3, and the relative performance of various GET requests. Next, you used Hive to create an external table over data in HBase, and ran SQL queries using the Hive HBase storage handler on your HBase data. Finally, you joined data from an external, relational table in Amazon S3 with data in non-relational format in HBase.

If you have any questions about using HBase 1.2 on Amazon EMR, or have interesting use cases leveraging HBase with other applications in the Apache ecosystem, please comment below.

The disingenuous question (FBIvApple)

Post Syndicated from Robert Graham original http://blog.erratasec.com/2016/02/the-disingenuous-question-fbivapple.html

I need more than 140 characters to respond to this tweet:

If you were a crime victim and key evidence was on suspect’s phone, would you want govt to search phone w/ warrant?
Orin Kerr (@OrinKerr), February 22, 2016

It’s an invalid question to ask. Firstly, it’s asking for the emotional answer, not the logical answer. Secondly, it’s only about half the debate, when the FBI is on your side, and not against you.

The emotional question is like ISIS kidnappings. Logically, we know that the ransom money will fund ISIS’s murderous campaign, killing others. Logically, we know that paying this ransom just encourages more kidnappings of other people, and that if we stuck to a policy of never paying ransoms, then ISIS would stop kidnapping people.

If it were my loved ones at stake, of course I’d do anything to get them back alive and healthy, including pay a ransom. But at the same time, logically, I’d vote for laws to stop people paying ransoms. In other words, I’d vote for laws that I would then happily break should the situation ever apply to me.

Thus, the following question has no meaning in a policy debate over paying ransoms:

If it was your loved one at stake, would you pay the ransom?

Even those who say “no” are being disingenuous. It’s easy to say it because they aren’t in danger of the situation ever happening to them. Most would change their answer to “yes” if it became real.

The second reason the original question is invalid is that it ignores why we have warrants in the first place. Unlimited police power is a bad thing. What you need is a counterbalancing question.

For example, in 2007 (before iPhones became popular) the FBI showed up at my business and threatened me in order to keep something quiet. Specifically, I was to give a talk at a conference on how, contrary to what the company “TippingPoint” claimed, it was easy to decrypt their “signature” files. That company convinced the FBI that it was important to “national security” that I keep such information quiet. So the FBI came to our offices, and first asked politely, then started threatening me, in order to keep the information quiet.

So, in such situations, should the FBI be able to get a warrant and search my phone? Note that a warrant would be easy to get, as the company TippingPoint suggested that I was also trying to blackmail (demanding money to stay quiet). It was a lie: they kept offering to bribe us to keep quiet and we kept telling them “under no circumstances”, but it’s enough to get a warrant in order to go fishing for something else to hang us by.

If FBI threatened you to keep quiet about something, should they be able to search your phone w/ warrant? @OrinKerr
Rob Graham ❄️ (@ErrataRob), February 22, 2016

This is less a meaningful question. Most people are sheep, believing that as long as they don’t stick their heads up above the herd, they are in no danger of getting their heads lopped off. But even if it’s not your head in danger, don’t you want to protect those who do raise their heads?

Rather than a “Going Dark” problem, ours is one of “Going Light”. We all now carry a GPS tracking device in our pocket that contains a microphone and video camera. We are quickly putting a microphone (and sometimes camera) in every room in our house, with devices like smart TVs and Amazon’s Echo. License plate readers line the roads, and face recognition (as well as video cameras) are located everywhere crowds gather. All our credit card transactions are slurped up by the government, as are our phone metadata (even more so since the so-called USA FREEDOM ACT).

The question is whether the “warrant upon probable cause” is sufficient protection for the Going Light problem? Or do we need more limits?

We activists think more limits are needed. The first limits are the ones requiring no special laws. Encryption is basic math; the effort necessary to stop encryption would require a police state worse than that created by the War on Drugs. The government should not be able to conscript programmers to create new technology on their behalf, as in the current Apple-v-FBI case.

The War on Drugs and the War on Terror have made a police state out of America. We jail 10 times more people, per capita, than other free nations (more than virtually any other nation). Law enforcement steals more through “civil asset forfeiture” than burglars do. We can no longer travel without showing our papers at numerous checkpoints. We can no longer communicate nor use credit cards without a record going to a government controlled database.

Yes, this police state works in our favor when it’s us that have been a victim of crime. But on the whole, we are now more in danger from the police state than we are from crime itself.

BTW, @orinkerr is awesome. He asks the question because he honestly wants to know the answer, not because he’s slyly arguing the point. He brings up the question because so many others mention it. I’m using his as the example only because it’s the one that’s handy, and I’m too lazy to hunt down a different one. Update: as he points out.

The CA’s Role in Fighting Phishing and Malware

Post Syndicated from Let's Encrypt - Free SSL/TLS Certificates original https://letsencrypt.org//2015/10/29/phishing-and-malware.html

Since we announced Let’s Encrypt we’ve often been asked how we’ll ensure that we don’t issue certificates for phishing and malware sites. The concern most commonly expressed is that having valid HTTPS certificates helps these sites look more legitimate, making people more likely to trust them.

Deciding what to do here has been tough. On the one hand, we don’t like these sites any more than anyone else does, and our mission is to help build a safer and more secure Web. On the other hand, we’re not sure that certificate issuance (at least for Domain Validation) is the right level on which to be policing phishing and malware sites in 2015. This post explains our thinking in order to encourage a conversation about the CA ecosystem’s role in fighting these malicious sites.

CAs Make Poor Content Watchdogs

Let’s Encrypt is going to be issuing Domain Validation (DV) certificates. On a technical level, a DV certificate asserts that a public key belongs to a domain – it says nothing else about a site’s content or who runs it. DV certificates do not include any information about a website’s reputation, real-world identity, or safety. However, many people believe the mere presence of a DV certificate ought to connote at least some of these things.

Treating a DV certificate as a kind of “seal of approval” for a site’s content is problematic for several reasons.

First, CAs are not well positioned to operate anti-phishing and anti-malware operations – or to police content more generally. They simply do not have sufficient ongoing visibility into sites’ content. The best CAs can do is check with organizations that have much greater content awareness, such as Microsoft and Google. Google and Microsoft consume vast quantities of data about the Web from massive crawling and reporting infrastructures. This data allows them to use complex machine learning algorithms (developed and operated by dozens of staff) to identify malicious sites and content.

Even if a CA checks for phishing and malware status with a good API, the CA’s ability to accurately express information regarding phishing and malware is extremely limited. Site content can change much faster than certificate issuance and revocation cycles, phishing and malware status can be page-specific, and certificates and their related browser UIs contain little, if any, information about phishing or malware status. When a CA doesn’t issue a certificate for a site with phishing or malware content, users simply don’t see a lock icon. Users are much better informed and protected when browsers include anti-phishing and anti-malware features, which typically do not suffer from any of these limitations.

Another issue with treating DV certificates as a “seal of approval” for site content is that there is no standard for CA anti-phishing and anti-malware measures beyond a simple blacklist of high-value domains, so enforcement is inconsistent across the thousands of CAs trusted by major browsers. Even if one CA takes extraordinary measures to weed out bad sites, attackers can simply shop around to different CAs. The bad guys will almost always be able to get a certificate and hold onto it long enough to exploit people. It doesn’t matter how sophisticated the best CA anti-phishing and anti-malware programs are; it only matters how good the worst are. It’s a “find the weakest link” scenario, and weak links aren’t hard to find.

Browser makers have realized all of this. That’s why they are pushing phishing and malware protection features, and evolving their UIs to more accurately reflect the assertions that certificates actually make.

TLS No Longer Optional

When they were first developed in the 1990s, HTTPS and SSL/TLS were considered “special” protections that were only necessary or useful for particular kinds of websites, like online banks and shopping sites accepting credit cards. We’ve since come to realize that HTTPS is important for almost all websites. It’s important for any website that allows people to log in with a password, any website that tracks its users in any way, any website that doesn’t want its content altered, and for any site that offers content people might not want others to know they are consuming. We’ve also learned that any site not secured by HTTPS can be used to attack other sites.

TLS is no longer the exception, nor should it be. That’s why we built Let’s Encrypt. We want TLS to be the default method for communication on the Web. It should just be a fundamental part of the fabric, like TCP or HTTP. When this happens, having a certificate will become an existential issue, rather than a value add, and content policing mistakes will be particularly costly. On a technical level, mistakes will lead to significant down time due to a slow issuance and revocation cycle, and features like HSTS. On a philosophical and moral level, mistakes (innocent or otherwise) will mean censorship, since CAs would be gatekeepers for online speech and presence. This is probably not a good role for CAs.

Our Plan

At least for the time being, Let’s Encrypt is going to check with the Google Safe Browsing API before issuing certificates, and refuse to issue to sites that are flagged as phishing or malware sites. Google’s API is the best source of phishing and malware status information that we have access to, and attempting to do more than query this API before issuance would almost certainly be wasteful and ineffective.
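
The shape of such a pre-issuance check can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how Let’s Encrypt implements it: the function names are hypothetical, and is_flagged is a stand-in for a query to a reputation service such as the Google Safe Browsing API. No real API call is made.

```python
from typing import Callable, Iterable, List


def domains_blocking_issuance(
    domains: Iterable[str],
    is_flagged: Callable[[str], bool],
) -> List[str]:
    """Return the requested domains that a reputation lookup flags.

    is_flagged is a hypothetical stand-in for a query to a service such as
    the Google Safe Browsing API; no real API call is made here.
    """
    # Domain names are case-insensitive, so normalize before the lookup.
    return [d for d in domains if is_flagged(d.lower())]


def should_issue(domains: Iterable[str], is_flagged: Callable[[str], bool]) -> bool:
    """Issue only if none of the requested domains is flagged."""
    return not domains_blocking_issuance(domains, is_flagged)


if __name__ == "__main__":
    # Toy in-memory blocklist used purely for illustration.
    flagged = {"phish.example"}
    print(should_issue(["example.org"], flagged.__contains__))
    print(should_issue(["example.org", "Phish.Example"], flagged.__contains__))
```

A check like this only captures a site’s status at the moment of issuance. As noted above, site content can change much faster than certificate issuance and revocation cycles.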

We’re going to implement this phishing and malware status check because many people are not comfortable with CAs entirely abandoning anti-phishing and anti-malware efforts just yet, even for DV certificates. We’d like to continue the conversation for a bit longer before we abandon what many people perceive to be an important CA behavior, even though we disagree.

Conclusion

The fight against phishing and malware content is an important one, but it does not make sense for CAs to be on the front lines, at least when it comes to DV certificates. That said, we’re going to implement checks against the Google Safe Browsing API while we continue the conversation.

We look forward to hearing what you think. Please let us know.