Tag Archives: hyperlink

Pirate IPTV Sellers Sign Abstention Agreements Under Pressure From BREIN

Post Syndicated from Andy original https://torrentfreak.com/pirate-iptv-sellers-sign-abstention-agreement-under-pressure-from-brein-180528/

Earlier this month, Dutch anti-piracy outfit BREIN revealed details of its case against Netherlands-based company Leaper Beheer BV.

BREIN’s complaint, which was filed at the Limburg District Court in Maastricht, claimed that
Leaper sold access to unlicensed live TV streams and on-demand movies. Around 4,000 live channels and 1,000 movies were included in the package, which was distributed to customers in the form of an .M3U playlist.

BREIN said that distribution of the playlist amounted to a communication to the public in contravention of the EU Copyright Directive. In its defense, Leaper argued that it is not a distributor of content itself and did not make anything available that wasn’t already public.

In a detailed ruling the Court sided with BREIN, noting that Leaper communicated works to a new audience that wasn’t taken into account when the content’s owners initially gave permission for their work to be distributed to the public.

The Court ordered Leaper to stop providing access to the unlicensed streams or face penalties of 5,000 euros per IPTV subscription sold, link offered, or days exceeded, to a maximum of one million euros. Further financial penalties were threatened for non-compliance with other aspects of the ruling.

In a fresh announcement Friday, BREIN revealed that three companies and their directors (Leaper included) have signed cease-and-desist agreements in order to avert summary proceedings. According to BREIN, the companies are the biggest sellers of pirate IPTV subscriptions in the Netherlands.

In addition to Leaper Beheer BV, Growler BV and DITisTV and their respective directors are bound by a number of conditions under their agreements, chief among them an obligation to cease and desist from offering hyperlinks or other technical means of access to protected works belonging to BREIN’s affiliates and their members.

Failure to comply with the terms of the agreement will see the companies face penalties of 10,000 euros per infringement or per day (or part thereof).

DITisTV’s former website now appears to sell shoes and a search for the company using Google doesn’t reveal many flattering results. Consumer website Consumentenbond.nl enjoys the top spot with an article reporting that it received 300 complaints about DITisTV.

“The complainants report that after they have paid, they have not received their order, or that they were not given a refund if they sent back a malfunctioning media player. Some consumers have been waiting for their money for several months,” the article reads.

According to the report, DITisTV pulled the plug on its website last June, probably in response to the European Court of Justice ruling which found that selling piracy-configured media players is illegal.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Court Orders Pirate IPTV Linker to Shut Down or Face Penalties Up to €1.25m

Post Syndicated from Andy original https://torrentfreak.com/court-orders-pirate-iptv-linker-to-shut-down-or-face-penalties-up-to-e1-25m-180911/

There are few things guaranteed in life. Death, taxes, and lawsuits filed regularly by Dutch anti-piracy outfit BREIN.

One of its most recent targets was Netherlands-based company Leaper Beheer BV, which also traded under the names Flickstore, Dump Die Deal and Live TV Store. BREIN filed a complaint at the Limburg District Court in Maastricht, claiming that Leaper provides access to unlicensed live TV streams and on-demand movies.

The anti-piracy outfit claimed that around 4,000 live channels were on offer, including Fox Sports, movie channels, and commercial and public channels. These could be accessed after the customer made a payment, which unlocked a unique activation code to be entered into a set-top box.

BREIN told the court that the code returned an .M3U playlist, which was effectively a hyperlink to IPTV channels and more than 1,000 movies being made available without permission from their respective copyright holders. As such, this amounted to a communication to the public in contravention of the EU Copyright Directive, BREIN argued.

In its defense, Leaper said that it effectively provided a convenient link-shortening service for content that could already be found online in other ways. The company argued that it is not a distributor of content itself and did not make available anything that wasn’t already public. The company added that it was completely down to the consumer whether illegal content was viewed or not.

The key question for the Court was whether Leaper did indeed make a new “communication to the public” under the EU Copyright Directive, a standard the Court of Justice of the European Union (CJEU) says should be interpreted in a manner that provides a high level of protection for rightsholders.

The Court took a three-point approach in arriving at its decision.

  • Did Leaper act in a deliberate manner when providing access to copyright content, especially when its intervention provided access to consumers who would not ordinarily have access to that content?
  • Did Leaper communicate the works via a new method to a new audience?
  • Did Leaper have a profit motive when it communicated works to the public?
The Court found that Leaper did communicate works to the public and intervened “with full knowledge of the consequences of its conduct” when it gave its customers access to protected works.

    “Access to [the content] in a different way would be difficult for those customers, if Leaper were not to provide its services in question,” the Court’s decision reads.

    “Leaper reaches an indeterminate number of potential recipients who can take cognizance of the protected works and form a new audience. The purchasers who register with Leaper are to be regarded as recipients who were not taken into account by the rightful claimants when they gave permission for the original communication of their work to the public.”

    With that, the Court ordered Leaper to cease-and-desist facilitating access to unlicensed streams within 48 hours of the judgment, with non-compliance penalties of 5,000 euros per IPTV subscription sold, link offered, or days exceeded, to a maximum of one million euros.

    But the Court didn’t stop there.

    “Leaper must submit a statement audited by an accountant, supported by (clear, readable copies of) all relevant documents, within 12 days of notification of this judgment of all the relevant (contact) details of the (person or legal persons) with whom the company has had contact regarding the provision of IPTV subscriptions and/or the provision of hyperlinks to sources where films and (live) broadcasts are evidently offered without the permission of the entitled parties,” the Court ruled.

    Failure to comply with this aspect of the ruling will lead to more penalties of 5,000 euros per day up to a maximum of 250,000 euros. Leaper was also ordered to pay BREIN’s costs of 20,700 euros.

Describing the people behind Leaper as “crooks” who previously sold media boxes with infringing addons (of the kind determined to be illegal in the Filmspeler case), BREIN chief Tim Kuik says that the switch of strategy didn’t help them evade the law.

    “[Leaper] sold a link to consumers that gave access to unauthorized content, i.e. pay-TV channels as well as video-on-demand films and series,” BREIN chief Tim Kuik informs TorrentFreak.

    “They did it for profit and should have checked whether the content was authorized. They did not and in fact were aware the content was unauthorized. Which means they are clearly infringing copyright.

    “This is evident from the CJEU case law in GS Media as well as Filmspeler and The Pirate Bay, aka the Dutch trilogy because the three cases came from the Netherlands, but these rulings are applicable throughout the EU.

    “They just keep at it knowing they’re cheating and we’ll take them to the cleaners,” Kuik concludes.

    Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

    E-Mail Tracking

    Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/12/e-mail_tracking_1.html

    Good article on the history and practice of e-mail tracking:

The tech is pretty simple. Tracking clients embed a line of code in the body of an email — usually in a 1×1 pixel image, so tiny it’s invisible, but also in elements like hyperlinks and custom fonts. When a recipient opens the email, the tracking client recognizes that pixel has been downloaded, as well as where and on what device. Newsletter services, marketers, and advertisers have used the technique for years, to collect data about their open rates; major tech companies like Facebook and Twitter followed suit in their ongoing quest to profile and predict our behavior online.

But lately, a surprising — and growing — number of tracked emails are being sent not from corporations, but acquaintances. “We have been in touch with users that were tracked by their spouses, business partners, competitors,” says Florian Seroussi, the founder of OMC. “It’s the wild, wild west out there.”

    According to OMC’s data, a full 19 percent of all “conversational” email is now tracked. That’s one in five of the emails you get from your friends. And you probably never noticed.
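To make the quoted technique concrete, here is a minimal sketch of what the sender’s side can look like: an HTML body with a per-recipient 1×1 image whose URL encodes who received the message. The tracker.example.com endpoint, the IDs, and the helper name are all hypothetical, not taken from any real tracking product.

    // Hypothetical illustration only: building an email body with a per-recipient
    // tracking pixel. The endpoint, IDs, and helper name are invented for this sketch.
    function buildTrackedEmailBody(recipientId, messageId) {
        // Each recipient gets a unique URL, so the tracking server knows who opened what,
        // plus when, from which IP address, and on what device.
        var pixelUrl = 'https://tracker.example.com/open?r=' +
            encodeURIComponent(recipientId) + '&m=' + encodeURIComponent(messageId);

        return '<html><body>' +
            '<p>Hi! Here is this month\'s newsletter.</p>' +
            // A 1x1 image: effectively invisible, but the mail client fetches it on open.
            '<img src="' + pixelUrl + '" width="1" height="1" alt="" style="display:none">' +
            '</body></html>';
    }

    console.log(buildTrackedEmailBody('user-42', 'newsletter-2017-12'));

Blocking remote images in your mail client prevents exactly this request from being made.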

    I admit it’s enticing. I would very much like the statistics that adding trackers to Crypto-Gram would give me. But I still don’t do it.

    Top Ten Ways to Protect Yourself Against Phishing Attacks

    Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/top-ten-ways-protect-phishing-attacks/

    It’s hard to miss the increasing frequency of phishing attacks in the news. Earlier this year, a major phishing attack targeted Google Docs users, and attempted to compromise at least one million Google Docs accounts. Experts say the “phish” was convincing and sophisticated, and even people who thought they would never be fooled by a phishing attack were caught in its net.

    What is phishing?

    Phishing attacks use seemingly trustworthy but malicious emails and websites to obtain your personal account or banking information. The attacks are cunning and highly effective because they often appear to come from an organization or business you actually use. The scam comes into play by tricking you into visiting a website you believe belongs to the trustworthy organization, but in fact is under the control of the phisher attempting to extract your private information.

    Phishing attacks are once again in the news due to a handful of high profile ransomware incidents. Ransomware invades a user’s computer, encrypts their data files, and demands payment to decrypt the files. Ransomware most often makes its way onto a user’s computer through a phishing exploit, which gives the ransomware access to the user’s computer.

The best strategy against phishing is to scrutinize every email and message you receive and never get caught. Easier said than done—even smart people sometimes fall victim to a phishing attack. To minimize the damage in the event of a phishing attack, backing up your data is the ultimate defense and should be part of your anti-phishing and overall anti-malware strategy.

    How do you recognize a phishing attack?

    A phishing attacker may send an email seemingly from a reputable credit card company or financial institution that requests account information, often suggesting that there is a problem with your account. When users respond with the requested information, attackers can use it to gain access to the accounts.

    The image below is a mockup of how a phishing attempt might appear. In this example, courtesy of Wikipedia, the bank is fictional, but in a real attempt the sender would use an actual bank, perhaps even the bank where the targeted victim does business. The sender is attempting to trick the recipient into revealing confidential information by getting the victim to visit the phisher’s website. Note the misspelling of the words “received” and “discrepancy” as recieved and discrepency. Misspellings sometimes are indications of a phishing attack. Also note that although the URL of the bank’s webpage appears to be legitimate, the hyperlink would actually take you to the phisher’s webpage, which would be altogether different from the URL displayed in the message.

    By Andrew Levine – en:Image:PhishingTrustedBank.png, Public Domain, https://commons.wikimedia.org/w/index.php?curid=549747

    Top ten ways to protect yourself against phishing attacks

1. Always think twice when presented with a link in any kind of email or message before you click on it. Ask yourself whether the sender would really ask you to do what is being requested. Most banks and reputable service providers won’t ask you to reveal your account information or password via email. If in doubt, don’t use the link in the message; instead, open a new webpage and go directly to the known website of the organization. Sign in to the site in the normal manner to verify that the request is legitimate.
2. A good precaution is to always hover over a link before clicking on it and observe the status line in your browser to verify that the link in the text and the destination link are in fact the same (see the sketch after this list for a programmatic version of this check).
3. Phishers are clever and getting better all the time, and you might be fooled by a simple ruse into thinking a link is one you recognize. Links can contain hard-to-detect misspellings that send you to a site very different from the one you expected.
4. Be wary even of emails and messages from people you know. It’s very easy to spoof an email so it appears to come from someone you know, or to create a URL that appears to be legitimate, but isn’t.

For example, let’s say that you work for roughmedia.com and you get an email from Chuck in accounting ([email protected]) that has an attachment for you, perhaps a company form you need to fill out. You likely wouldn’t notice in the sender address that the phisher has replaced the “m” in media with an “r” and an “n” that look very much like an “m.” You think it’s good old Chuck in accounting, but it’s actually someone “phishing” for you to open the attachment and infect your computer. This type of attack is known as “spear phishing” because it’s targeted at a specific individual and uses social engineering—specifically, familiarity with the sender—as part of the scheme to fool you into trusting the attachment. This technique is by far the most successful on the internet today. (This example is based on Gimlet Media’s Reply All podcast episode, “What Kind of Idiot Gets Phished?”)

5. Use anti-malware software, but don’t rely on it to catch all attacks. Phishers change their approach often to stay ahead of detection software.
6. If you are asked to enter any valuable information, only do so if you’re on a secure connection. Look for the “https” prefix before the site URL, indicating the site is employing SSL (Secure Sockets Layer). If there is no “s” after “http,” it’s best not to enter any confidential information.
    By Fabio Lanari – Internet1.jpg by Rock1997 modified., GFDL, https://commons.wikimedia.org/w/index.php?curid=20995390
7. Avoid logging in to online banks and similar services via public Wi-Fi networks. Criminals can compromise open networks with man-in-the-middle attacks that capture your information or spoof website addresses over the connection and redirect you to a fake page they control.
8. Email, instant messaging, and gaming social channels are all possible vehicles to deliver phishing attacks, so be vigilant!
9. Lay the foundation for a good defense by choosing reputable tech vendors and service providers that respect your privacy and take steps to protect your data. At Backblaze, we have full-time security teams constantly looking for ways to improve our security.
10. When it is available, always take advantage of multi-factor verification to protect your accounts. The standard categories used for authentication are 1) something you know (e.g. your username and password), 2) something you are (e.g. your fingerprint or retina pattern), and 3) something you have (e.g. an authenticator app on your smartphone). An account that allows only a single factor for authentication is more susceptible to hacking than one that supports multiple factors. Backblaze supports multi-factor authentication to protect customer accounts.
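As a rough companion to tips 2 and 3 above, the sketch below shows a programmatic version of the hover check: it compares the domain shown in a link’s text with the domain the link actually targets, and flags plain-HTTP destinations. The helper name and the sample URLs are made up for illustration.

    // Hypothetical sketch: flag links whose visible text and real destination disagree,
    // or whose destination is not HTTPS. Uses Node.js and its built-in URL parser.
    const { URL } = require('url');

    function linkWarnings(displayText, href) {
        const target = new URL(href);
        const warnings = [];

        if (target.protocol !== 'https:') {
            warnings.push('destination is not HTTPS');
        }

        // If the visible text itself looks like a URL, its hostname should match the
        // hostname the link really points to.
        try {
            const shown = new URL(displayText);
            if (shown.hostname !== target.hostname) {
                warnings.push('text shows ' + shown.hostname +
                              ' but the link goes to ' + target.hostname);
            }
        } catch (e) {
            // Display text is not a URL (e.g. "Click here"); nothing to compare.
        }

        return warnings;
    }

    // The text displays the bank's address, but the href points somewhere else entirely.
    console.log(linkWarnings('https://www.trustedbank.example.com',
                             'http://trustedbank.example.phisher.example.net/login'));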

    Be a good internet citizen, and help reduce phishing and other malware attacks by notifying the organization being impersonated in the phishing attempt, or by forwarding suspicious messages to the Federal Trade Commission at [email protected]. Some email clients and services, such as Microsoft Outlook and Google Gmail, give you the ability to easily report suspicious emails. Phishing emails misrepresenting Apple can be reported to [email protected].

    Backing up your data is an important part of a strong defense against phishing and other malware

    The best way to avoid becoming a victim is to be vigilant against suspicious messages and emails, but also to assume that no matter what you do, it is very possible that your system will be compromised. Even the most sophisticated and tech-savvy of us can be ensnared if we are tired, in a rush, or just unfamiliar with the latest methods hackers are using. Remember that hackers are working full-time on ways to fool us, so it’s very difficult to keep ahead of them.

The best defense is to make sure that any data that could be compromised by hackers—basically all of the data that is reachable via your computer—is not your only copy. You do that by maintaining an active and reliable backup strategy.

Files that are backed up to cloud storage, such as with Backblaze, are not vulnerable to attacks on your local computer in the way that local files, attached drives, network drives, or sync services with local directories on your computer (such as Dropbox) are.

    In the event that your computer is compromised and your files are lost or encrypted, you can recover your files if you have a cloud backup that is beyond the reach of attacks on your computer.

    The post Top Ten Ways to Protect Yourself Against Phishing Attacks appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

    Launch: Amazon Athena adds support for Querying Encrypted Data

    Post Syndicated from Tara Walker original https://aws.amazon.com/blogs/aws/launch-amazon-athena-adds-support-for-querying-encrypted-data/

In November of last year, we brought a service to market that we hoped would be a major step toward helping those who need to securely access and examine massive amounts of data on a daily basis. That service is Amazon Athena, which I think of as a managed service attempting “to leap tall queries in a single bound” over object storage: a service that gives AWS customers the power to easily analyze and query large amounts of data stored in Amazon S3.

Amazon Athena is a serverless interactive query service that enables users to easily analyze data in Amazon S3 using standard SQL. At Athena’s core are Presto, a distributed SQL engine that runs queries with ANSI SQL support, and Apache Hive, which allows Athena to work with popular data formats like CSV, JSON, ORC, Avro, and Parquet and provides common Data Definition Language (DDL) operations like create, drop, and alter tables. Athena enables performant query access to datasets stored in Amazon Simple Storage Service (S3) in both structured and unstructured data formats.

You can write Hive-compliant DDL statements and ANSI SQL statements in the Athena Query Editor from the AWS Management Console, or from SQL clients such as SQL Workbench by downloading and using the Athena JDBC driver. Additionally, the JDBC driver lets you run queries programmatically from your desired BI tools. You can read more about the Amazon Athena service in Jeff’s blog post from the service release in November.
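As an aside, if you would rather call Athena from code than through a JDBC connection, the service also exposes a query API (StartQueryExecution and friends). The sketch below is a minimal, hypothetical example using the AWS SDK for JavaScript; the query, database, and results bucket are placeholders, and error handling is kept to a minimum.

    // Hypothetical sketch: submitting a query through the Athena API with the AWS SDK
    // for JavaScript and polling until it finishes. Database, query, and results bucket
    // are placeholders.
    const AWS = require('aws-sdk');
    const athena = new AWS.Athena({ region: 'us-east-1' });

    athena.startQueryExecution({
        QueryString: 'SELECT c_name, c_acctbal FROM sse_customerinfo LIMIT 10;',
        QueryExecutionContext: { Database: 'tara_customer_db' },
        ResultConfiguration: { OutputLocation: 's3://my-athena-results-bucket/' }
    }, function (err, started) {
        if (err) return console.log(err, err.stack);

        // Poll the query status once a second; fetch the results when it succeeds.
        const queryId = started.QueryExecutionId;
        const timer = setInterval(function () {
            athena.getQueryExecution({ QueryExecutionId: queryId }, function (err, data) {
                if (err) { clearInterval(timer); return console.log(err, err.stack); }
                const state = data.QueryExecution.Status.State;
                if (state === 'SUCCEEDED') {
                    clearInterval(timer);
                    athena.getQueryResults({ QueryExecutionId: queryId }, function (err, results) {
                        if (err) return console.log(err, err.stack);
                        console.log(JSON.stringify(results.ResultSet.Rows, null, 2));
                    });
                } else if (state === 'FAILED' || state === 'CANCELLED') {
                    clearInterval(timer);
                    console.log('Query finished in state:', state);
                }
            });
        }, 1000);
    });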

After releasing the initial features of the Amazon Athena service, the Athena team kept with the Amazon tradition of focusing on the customer by working diligently to make your experience with the service better. Therefore, the team has added a feature that I am excited to announce: Amazon Athena now supports querying encrypted data in Amazon S3. The new feature also enables the encryption of Athena’s query results. Businesses and customers who have requirements and/or regulations to encrypt sensitive data stored in Amazon S3 can now take advantage of the serverless dynamic queries Athena offers with their encrypted data.


    Supporting Encryption

Before we dive into using the new Athena feature, let’s take some time to review the encryption options that S3 and Athena support for customers needing to secure and encrypt data. Currently, S3 supports encrypting data with AWS Key Management Service (KMS). AWS KMS is a managed service for the creation and management of encryption keys used to encrypt data. In addition, S3 supports customers using their own encryption keys to encrypt data. Since it is important to understand the encryption options that Athena supports for datasets stored on S3, in the chart below I have provided a breakdown of the encryption options supported with S3 and Athena, and noted when the new Athena table property, has_encrypted_data, is required for encrypted data access.
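For context on the S3 side, here is a small, hypothetical sketch of writing an object with SSE-KMS using the AWS SDK for JavaScript; the bucket, object key, local file, and KMS key ARN are placeholders. An object written this way is the kind of SSE-KMS data the tables later in this post are built on.

    // Hypothetical sketch: storing an object in S3 with Server-Side Encryption using an
    // AWS KMS-managed key (SSE-KMS) via the AWS SDK for JavaScript.
    const fs = require('fs');
    const AWS = require('aws-sdk');
    const s3 = new AWS.S3({ region: 'us-east-1' });

    s3.putObject({
        Bucket: 'my-encrypted-data-bucket',
        Key: 'customer-data/customers.parquet',
        Body: fs.readFileSync('customers.parquet'),
        ServerSideEncryption: 'aws:kms',                              // encrypt with KMS
        SSEKMSKeyId: 'arn:aws:kms:us-east-1:ACCOUNT_ID:key/KEY_ID'    // placeholder key ARN
    }, function (err, data) {
        if (err) console.log(err, err.stack);
        else console.log('Object stored with SSE-KMS. ETag:', data.ETag);
    });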


    For more information on Amazon S3 encryption with AWS KMS or Amazon S3 Encryption options, review the information in the AWS KMS Developer Guide on How Amazon Simple Storage Service (Amazon S3) Uses AWS KMS and Amazon S3 Developer Guide on Protecting Data Using Encryption respectively.


    Creating & Accessing Encrypted Databases and Tables

    As I noted before, there are a couple of ways to access Athena. Of course, you can access Athena through the AWS Management Console, but you also have the option to use the JDBC driver with SQL clients like SQL Workbench and other Business Intelligence tools. In addition, the JDBC driver allows for programmatic query access.

    Enough discussion, it is time to dig into this new Athena service feature by creating a database and some tables, running queries from the table and encryption of the query results. We’ll accomplish all this by using encrypted data stored in Amazon S3.

If this is your first time logging into the service, you will see the Amazon Athena Getting Started screen as shown below. You need to click the Get Started button to be taken to the Athena Query Editor.

Now that we are in the Athena Query Editor, let’s create a database. If the sample database is shown when you open your Query Editor, simply start typing your own statement in the Query Editor window; this clears the sample query so you can create the new database.

    I will now issue the Hive DDL Command, CREATE DATABASE <dbname> within the Query Editor window to create my database, tara_customer_db.

    Once I receive the confirmation that my query execution was successful in the Results tab of Query Editor, my database should be created and available for selection in the dropdown.

    I now will change my selected database in the dropdown to my newly created database, tara_customer_db.


With my database created, I am able to create tables from my data stored in S3. Since I did not have data encrypted with the various encryption types, the product group was kind enough to give me some sample data files to place in my S3 buckets. The first batch of sample data that I received was encrypted with SSE-KMS which, if you recall from the encryption matrix we discussed above, is the encryption type Server-Side Encryption with AWS KMS–Managed Keys. I stored this set of encrypted data in my S3 bucket aptly named: aws-blog-tew-posts/SSE_KMS_EncryptionData. The second batch of sample data was encrypted with CSE-KMS, which is the encryption type Client-Side Encryption with AWS KMS–Managed Keys, and is stored in my aws-blog-tew-posts/CSE_KMS_EncryptionData S3 bucket. The last batch of data I received is just good old-fashioned plain text, and I have stored this data in the S3 bucket aws-blog-tew-posts/PlainText_Table.

Remember, to access my data in the S3 buckets from the Athena service, I must ensure that my data buckets have the correct permissions to allow Athena to access each bucket and the data contained therein. In addition, working with AWS KMS encrypted data requires users to have roles that include the appropriate KMS key policies. It is important to note that to successfully read KMS encrypted data, users must have the correct permissions for access to S3, Athena, and KMS collectively.

    There are several ways that I can provide the appropriate access permissions between S3 and the Athena service:

    1. Allow access via user policy
    2. Allow access via bucket policy
    3. Allow access with both a bucket policy and user policy.

You can learn more about the Amazon Athena access permissions and/or the Amazon S3 permissions by reviewing the Athena documentation on Setting User and Amazon S3 Bucket Permissions.

Since my data is ready and set up in my S3 buckets, I just need to head over to the Athena Query Editor and create my first new table from the SSE-KMS encrypted data. The DDL command that I will use to create my new table, sse_customerinfo, is as follows:

    CREATE EXTERNAL TABLE sse_customerinfo( 
      c_custkey INT, 
      c_name STRING, 
      c_address STRING, 
      c_nationkey INT, 
      c_phone STRING, 
      c_acctbal DOUBLE, 
      c_mktsegment STRING, 
      c_comment STRING
      ) 
    ROW FORMAT SERDE  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
    STORED AS INPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
    OUTPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' 
    LOCATION  's3://aws-blog-tew-posts/SSE_KMS_EncryptionData';
    

I will enter my DDL command statement for the sse_customerinfo table creation into the Athena Query Editor and click the Run Query button. The Results tab will note that the query ran successfully, and the new table will show up under the tables available for the tara_customer_db database.

    I will repeat this process to create my cse_customerinfo table from the CSE-KMS encrypted batch of data and then the plain_customerinfo table from the unencrypted data source stored in my S3 bucket. The DDL statements used to create my cse_customerinfo table are as follows:

    CREATE EXTERNAL TABLE cse_customerinfo (
      c_custkey INT, 
      c_name STRING, 
      c_address STRING, 
      c_nationkey INT, 
      c_phone STRING, 
      c_acctbal DOUBLE, 
      c_mktsegment STRING, 
      c_comment STRING
    )
    ROW FORMAT SERDE   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
    STORED AS INPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
    OUTPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
    LOCATION   's3://aws-blog-tew-posts/CSE_KMS_EncryptionData'
    TBLPROPERTIES ('has_encrypted_data'='true');
    

Again, I will enter my DDL statements above into the Athena Query Editor and click the Run Query button. If you review the DDL statements used to create the cse_customerinfo table carefully, you will notice that a new table property (TBLPROPERTIES) flag, has_encrypted_data, was introduced with the new Athena encryption capability. This flag tells Athena that the data in S3 to be used with queries for the specified table is encrypted. If you take a moment and refer back to the encryption matrix table I reviewed earlier for the Athena and S3 encryption options, you will see that this flag is only required when you are using the Client-Side Encryption with AWS KMS–Managed Keys option. Once the cse_customerinfo table has been successfully created, a key symbol will appear next to the table, identifying it as an encrypted data table.

    Finally, I will create the last table, plain_customerinfo, from our sample data. Same steps as we performed for the previous tables. The DDL commands for this table are:

    CREATE EXTERNAL TABLE plain_customerinfo(
      c_custkey INT, 
      c_name STRING, 
      c_address STRING, 
      c_nationkey INT, 
      c_phone STRING, 
      c_acctbal DOUBLE, 
      c_mktsegment STRING, 
      c_comment STRING
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
    STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
    LOCATION 's3://aws-blog-tew-posts/PlainText_Table';
    


    Great! We have successfully read encrypted data from S3 with Athena, and created tables based on the encrypted data. I can now run queries against my newly created encrypted data tables.


    Running Queries

Running queries against our new database tables is very simple. Standard SQL statements can be used to query your data stored in Amazon S3. For our query review, I am going to use Athena’s preview data feature. In the list of tables, you will see two icons beside each table. One is a table property icon; selecting it brings up the selected table’s properties. The other, displayed as an eye symbol, is the preview data feature, which generates a simple SELECT query statement for the table.


To demonstrate running queries with Athena, I have chosen to preview data for my plain_customerinfo table by selecting the eye icon next to it. The preview data feature creates the following SQL statement:

    SELECT * FROM plain_customerinfo limit 10;

    The query results from using the preview data feature with my plain_customerinfo table are displayed in the Results tab of the Athena Query Editor and provides the option to download the query results by clicking the file icon.

    The new Athena encrypted data feature also supports encrypting query results and storing these results in Amazon S3. To take advantage of this feature with my query results, I will now encrypt and save my query data in a bucket of my choice. You should note that the data table that I have selected is currently unencrypted.
First, I’ll select the Athena Settings menu and review the current storage settings for my query results. Since I do not have a KMS key to use for encryption, I will select the Create KMS key hyperlink and create a KMS key for use in encrypting my query results with Athena and S3. For details on how to create a KMS key and configure the appropriate user permissions, please see http://docs.aws.amazon.com/kms/latest/developerguide/create-keys.html.

After successfully creating my s3encryptathena KMS key and copying the key ARN for use in my Athena settings, I return to the Athena console Settings dialog and select the Encrypt query results checkbox. I then update the Query result location textbox to point to my S3 bucket, aws-athena-encrypted, which will be the location for storing my encrypted query results.

The only thing left is to select the Encryption type and enter my KMS key. I can do this either by selecting the s3encryptathena key from the Encryption key dropdown or by entering its ARN in the KMS key ARN textbox. In this example, I have chosen to use SSE-KMS for the encryption type. You can see both examples of selecting the KMS key below. Clicking the Save button completes the process.
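The same settings can also be supplied programmatically when a query is started. The snippet below is a hedged sketch of the equivalent ResultConfiguration block using the AWS SDK for JavaScript; the KMS key ARN is a placeholder, and the bucket matches the one chosen above.

    // Hypothetical sketch: the programmatic equivalent of the console settings above,
    // asking Athena to write SSE-KMS encrypted query results to the chosen bucket.
    const AWS = require('aws-sdk');
    const athena = new AWS.Athena({ region: 'us-east-1' });

    athena.startQueryExecution({
        QueryString: 'SELECT * FROM plain_customerinfo limit 10;',
        QueryExecutionContext: { Database: 'tara_customer_db' },
        ResultConfiguration: {
            OutputLocation: 's3://aws-athena-encrypted/',          // Query result location
            EncryptionConfiguration: {
                EncryptionOption: 'SSE_KMS',                       // Encryption type
                KmsKey: 'arn:aws:kms:us-east-1:ACCOUNT_ID:key/KEY_ID'  // placeholder key ARN
            }
        }
    }, function (err, data) {
        if (err) console.log(err, err.stack);
        else console.log('Query started:', data.QueryExecutionId);
    });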

Now I will rerun my current query for my plain_customerinfo table. Remember, this table is not encrypted, but with the Athena settings change adding encryption for query results, the results of queries run against this table will be stored with SSE-KMS encryption using my KMS key.

    After my query rerun, I can see the fruits of my labor by going to the Amazon S3 console and viewing the CSV data files saved in my designated bucket, aws-athena-encrypted, and the SSE-KMS encryption of the bucket and files.
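If you prefer to verify from code rather than the console, a HEAD request on one of the result files reports the server-side encryption that was applied. A small sketch, assuming a placeholder object key:

    // Hypothetical sketch: confirming from code that a result file was written with
    // SSE-KMS. The object key is a placeholder for one of the CSV result files.
    const AWS = require('aws-sdk');
    const s3 = new AWS.S3({ region: 'us-east-1' });

    s3.headObject({
        Bucket: 'aws-athena-encrypted',
        Key: 'path/to/QUERY_EXECUTION_ID.csv'
    }, function (err, data) {
        if (err) return console.log(err, err.stack);
        // For SSE-KMS objects, S3 reports 'aws:kms' and the key that encrypted the object.
        console.log('ServerSideEncryption:', data.ServerSideEncryption);
        console.log('SSEKMSKeyId:', data.SSEKMSKeyId);
    });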


    Summary

Needless to say, this Athena launch has several benefits for those needing to secure data via encryption while still retaining the ability to perform queries and analytics on data stored in varying data formats. Additionally, this release includes improvements I did not dive into in this blog post:

• A new version of the JDBC driver that supports the new encryption feature and key updates.
    • Added the ability to add, replace, and change columns using ALTER TABLE.
    • Added support for querying LZO-compressed data.

See the release documentation in the Athena user guide for more details, and start leveraging Athena to query your encrypted data stored in Amazon S3 now by reviewing the Configuring Encryption Options section in the Athena documentation.

    Learn more about Athena and serverless queries on Amazon S3 by visiting the Athena product page or reviewing the Athena User Guide. In addition, you can dig deeper on the functionality of Athena and data encryption with S3 by reviewing the AWS Big Data Blog post: Analyzing Data in S3 using Amazon Athena and the AWS KMS Developer Guide.

    Happy Encrypting!

    Tara

    Implementing Serverless Manual Approval Steps in AWS Step Functions and Amazon API Gateway

    Post Syndicated from Bryan Liston original https://aws.amazon.com/blogs/compute/implementing-serverless-manual-approval-steps-in-aws-step-functions-and-amazon-api-gateway/


    Ali Baghani, Software Development Engineer

    A common use case for AWS Step Functions is a task that requires human intervention (for example, an approval process). Step Functions makes it easy to coordinate the components of distributed applications as a series of steps in a visual workflow called a state machine. You can quickly build and run state machines to execute the steps of your application in a reliable and scalable fashion.

    In this post, I describe a serverless design pattern for implementing manual approval steps. You can use a Step Functions activity task to generate a unique token that can be returned later indicating either approval or rejection by the person making the decision.

    Key steps to implementation

    When the execution of a Step Functions state machine reaches an activity task state, Step Functions schedules the activity and waits for an activity worker. An activity worker is an application that polls for activity tasks by calling GetActivityTask. When the worker successfully calls the API action, the activity is vended to that worker as a JSON blob that includes a token for callback.

    At this point, the activity task state and the branch of the execution that contains the state is paused. Unless a timeout is specified in the state machine definition, which can be up to one year, the activity task state waits until the activity worker calls either SendTaskSuccess or SendTaskFailure using the vended token. This pause is the first key to implementing a manual approval step.

    The second key is the ability in a serverless environment to separate the code that fetches the work and acquires the token from the code that responds with the completion status and sends the token back, as long as the token can be shared, i.e., the activity worker in this example is a serverless application supervised by a single activity task state.

    In this walkthrough, you use a short-lived AWS Lambda function invoked on a schedule to implement the activity worker, which acquires the token associated with the approval step, and prepares and sends an email to the approver using Amazon SES.

    It is very convenient if the application that returns the token can directly call the SendTaskSuccess and SendTaskFailure API actions on Step Functions. This can be achieved more easily by exposing these two actions through Amazon API Gateway so that an email client or web browser can return the token to Step Functions. By combining a Lambda function that acquires the token with the application that returns the token through API Gateway, you can implement a serverless manual approval step, as shown below.

    In this pattern, when the execution reaches a state that requires manual approval, the Lambda function prepares and sends an email to the user with two embedded hyperlinks for approval and rejection.

    If the authorized user clicks on the approval hyperlink, the state succeeds. If the authorized user clicks on the rejection link, the state fails. You can also choose to set a timeout for approval and, upon timeout, take action, such as resending the email request using retry/catch conditions in the activity task state.
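For reference, the approval and rejection links ultimately trigger ordinary Step Functions API actions, the same SendTaskSuccess and SendTaskFailure calls that API Gateway issues on the approver’s behalf. Below is a minimal sketch of the equivalent direct calls from Node.js, assuming you already hold the task token vended by GetActivityTask.

    // Hypothetical sketch: the direct SDK equivalent of the approve/reject links.
    // The task token is whatever GetActivityTask vended for the paused activity.
    const AWS = require('aws-sdk');
    const stepfunctions = new AWS.StepFunctions({ region: 'us-east-1' });

    function approve(taskToken) {
        stepfunctions.sendTaskSuccess({
            taskToken: taskToken,
            output: JSON.stringify('Approve link was clicked.')
        }, function (err) {
            if (err) console.log(err, err.stack);
            else console.log('Activity task marked as succeeded.');
        });
    }

    function reject(taskToken) {
        stepfunctions.sendTaskFailure({
            taskToken: taskToken,
            error: 'Rejected',
            cause: 'Reject link was clicked.'
        }, function (err) {
            if (err) console.log(err, err.stack);
            else console.log('Activity task marked as failed.');
        });
    }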

    Employee promotion process

    As an example pattern use case, you can design a simple employee promotion process which involves a single task: getting a manager’s approval through email. When an employee is nominated for promotion, a new execution starts. The name of the employee and the email address of the employee’s manager are provided to the execution.

    You’ll use the design pattern to implement the manual approval step, and SES to send the email to the manager. After acquiring the task token, the Lambda function generates and sends an email to the manager with embedded hyperlinks to URIs hosted by API Gateway.

    In this example, I have administrative access to my account, so that I can create IAM roles. Moreover, I have already registered my email address with SES, so that I can send emails with the address as the sender/recipient. For detailed instructions, see Send an Email with Amazon SES.

    Here is a list of what you do:

    1. Create an activity
    2. Create a state machine
    3. Create and deploy an API
    4. Create an activity worker Lambda function
    5. Test that the process works

    Create an activity

    In the Step Functions console, choose Tasks and create an activity called ManualStep.

    stepfunctionsfirst_1.png

    Remember to keep the ARN of this activity at hand.

    stepfunctionsfirst_2.png

    Create a state machine

    Next, create the state machine that models the promotion process on the Step Functions console. Use StatesExecutionRole-us-east-1, the default role created by the console. Name the state machine PromotionApproval, and use the following code. Remember to replace the value for Resource with your activity ARN.

    {
      "Comment": "Employee promotion process!",
      "StartAt": "ManualApproval",
      "States": {
        "ManualApproval": {
          "Type": "Task",
          "Resource": "arn:aws:states:us-east-1:ACCOUNT_ID:activity:ManualStep",
          "TimeoutSeconds": 3600,
          "End": true
        }
      }
    }

    Create and deploy an API

    Next, create and deploy public URIs for calling the SendTaskSuccess or SendTaskFailure API action using API Gateway.

    First, navigate to the IAM console and create the role that API Gateway can use to call Step Functions. Name the role APIGatewayToStepFunctions, choose Amazon API Gateway as the role type, and create the role.

    After the role has been created, attach the managed policy AWSStepFunctionsFullAccess to it.

    stepfunctionsfirst_3.png

    In the API Gateway console, create a new API called StepFunctionsAPI. Create two new resources under the root (/) called succeed and fail, and for each resource, create a GET method.

    stepfunctionsfirst_4.png

You now need to configure each method. Start with the /fail GET method and configure it with the following values:

    • For Integration type, choose AWS Service.
    • For AWS Service, choose Step Functions.
    • For HTTP method, choose POST.
    • For Region, choose your region of interest instead of us-east-1. (For a list of regions where Step Functions is available, see AWS Region Table.)
    • For Action Type, enter SendTaskFailure.
• For Execution role, enter the APIGatewayToStepFunctions role ARN.

    stepfunctionsfirst_5.png

    To be able to pass the taskToken through the URI, navigate to the Method Request section, and add a URL Query String parameter called taskToken.

    stepfunctionsfirst_6.png

    Then, navigate to the Integration Request section and add a Body Mapping Template of type application/json to inject the query string parameter into the body of the request. Accept the change suggested by the security warning. This sets the body pass-through behavior to When there are no templates defined (Recommended). The following code does the mapping:

    {
       "cause": "Reject link was clicked.",
       "error": "Rejected",
       "taskToken": "$input.params('taskToken')"
    }

    When you are finished, choose Save.

    Next, configure the /succeed GET method. The configuration is very similar to the /fail GET method. The only difference is for Action: choose SendTaskSuccess, and set the mapping as follows:

    {
       "output": "\"Approve link was clicked.\"",
       "taskToken": "$input.params('taskToken')"
    }

The last step on the API Gateway console after configuring your API actions is to deploy them to a new stage called respond. You can test your API by choosing the Invoke URL links under either of the GET methods. Because no token is provided in the URI, a ValidationException message should be displayed.

    stepfunctionsfirst_7.png

    Create an activity worker Lambda function

    In the Lambda console, create a Lambda function with a CloudWatch Events Schedule trigger using a blank function blueprint for the Node.js 4.3 runtime. The rate entered for Schedule expression is the poll rate for the activity. This should be above the rate at which the activities are scheduled by a safety margin.

The safety margin accounts for the possibility of lost tokens, retried activities, and polls that happen while no activities are scheduled. For example, if you expect 3 promotions to happen in a certain week, you can schedule the Lambda function to run 4 times a day during that week. Alternatively, a single Lambda function can poll for multiple activities, either in parallel or in series. For this example, use a rate of one time per minute but do not enable the trigger yet.

    stepfunctionsfirst_8.png

Next, create the Lambda function ManualStepActivityWorker using the following Node.js 4.3 code. The function receives the taskToken, employee name, and manager’s email from Step Functions. It embeds the information into an email, and sends the email to the manager.

    
    'use strict';
    console.log('Loading function');
    const aws = require('aws-sdk');
    const stepfunctions = new aws.StepFunctions();
    const ses = new aws.SES();
    exports.handler = (event, context, callback) => {
        
        var taskParams = {
            activityArn: 'arn:aws:states:us-east-1:ACCOUNT_ID:activity:ManualStep'
        };
        
        stepfunctions.getActivityTask(taskParams, function(err, data) {
            if (err) {
                console.log(err, err.stack);
                context.fail('An error occurred while calling getActivityTask.');
            } else {
                if (data === null) {
                    // No activities scheduled
                    context.succeed('No activities received after 60 seconds.');
                } else {
                    var input = JSON.parse(data.input);
                    var emailParams = {
                        Destination: {
                            ToAddresses: [
                                input.managerEmailAddress
                                ]
                        },
                        Message: {
                            Subject: {
                                Data: 'Your Approval Needed for Promotion!',
                                Charset: 'UTF-8'
                            },
                            Body: {
                                Html: {
                                    Data: 'Hi!<br />' +
                                        input.employeeName + ' has been nominated for promotion!<br />' +
                                        'Can you please approve:<br />' +
                                        'https://API_DEPLOYMENT_ID.execute-api.us-east-1.amazonaws.com/respond/succeed?taskToken=' + encodeURIComponent(data.taskToken) + '<br />' +
                                        'Or reject:<br />' +
                                        'https://API_DEPLOYMENT_ID.execute-api.us-east-1.amazonaws.com/respond/fail?taskToken=' + encodeURIComponent(data.taskToken),
                                    Charset: 'UTF-8'
                                }
                            }
                        },
                        Source: input.managerEmailAddress,
                        ReplyToAddresses: [
                                input.managerEmailAddress
                            ]
                    };
                        
                    ses.sendEmail(emailParams, function (err, data) {
                        if (err) {
                            console.log(err, err.stack);
                            context.fail('Internal Error: The email could not be sent.');
                        } else {
                            console.log(data);
                            context.succeed('The email was successfully sent.');
                        }
                    });
                }
            }
        });
    };

    In the Lambda function handler and role section, for Role, choose Create a new role, LambdaManualStepActivityWorkerRole.

    stepfunctionsfirst_9.png

    Add two policies to the role: one to allow the Lambda function to call the GetActivityTask API action by calling Step Functions, and one to send an email by calling SES. The result should look as follows:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents"
          ],
          "Resource": "arn:aws:logs:*:*:*"
        },
        {
          "Effect": "Allow",
          "Action": "states:GetActivityTask",
          "Resource": "arn:aws:states:*:*:activity:ManualStep"
        },
        {
          "Effect": "Allow",
          "Action": "ses:SendEmail",
          "Resource": "*"
        }
      ]
    }

    In addition, as the GetActivityTask API action performs long-polling with a timeout of 60 seconds, increase the timeout of the Lambda function to 1 minute 15 seconds. This allows the function to wait for an activity to become available, and gives it extra time to call SES to send the email. For all other settings, use the Lambda console defaults.

    stepfunctionsfirst_10.png

    After this, you can create your activity worker Lambda function.

    Test the process

    You are now ready to test the employee promotion process.

    In the Lambda console, enable the ManualStepPollSchedule trigger on the ManualStepActivityWorker Lambda function.

    In the Step Functions console, start a new execution of the state machine with the following input:

    { "managerEmailAddress": "[email protected]", "employeeName" : "Jim" } 

    Within a minute, you should receive an email with links to approve or reject Jim’s promotion. Choosing one of those links should succeed or fail the execution.

    stepfunctionsfirst_11.png

    Summary

    In this post, you created a state machine containing an activity task with Step Functions, an API with API Gateway, and a Lambda function to dispatch the approval/failure process. Your Step Functions activity task generated a unique token that was returned later indicating either approval or rejection by the person making the decision. Your Lambda function acquired the task token by polling the activity task, and then generated and sent an email to the manager for approval or rejection with embedded hyperlinks to URIs hosted by API Gateway.

    If you have questions or suggestions, please comment below.

    Is your 2017 cloud resolution to grow your online presence?

    Post Syndicated from Paul Pierpoint original http://www.anchor.com.au/blog/2017/01/2017-cloud-resolution-grow-online-presence/

If your resolution for 2017 is to grow your online presence, provide a better customer experience, or maximise your ecommerce revenue without increasing your workload, then switching your hosting to the cloud is exactly what you’re looking for!

    Moving your website or web application into a cloud hosting environment means you can take advantage of the latest, high speed, highly available infrastructure to power your application. You can easily ‘scale up’ to meet your growing demands and only pay for what you use, making it more cost effective than the lofty fixed monthly costs of traditional dedicated or virtual servers.

One of the biggest mistakes, however, is attempting to manage cloud operations by yourself, as highlighted in our Mistakes to Avoid in AWS eBook. While some car drivers might be capable of carrying out a few basic car maintenance tasks, the majority wouldn’t attempt to reconfigure the clutch or modify the engine, no matter how many YouTube videos or handy guides there might be! Sending the car to a trusted mechanic not only saves a great deal of time and stress—often worth the fee alone—but also gives us the confidence that it is done RIGHT. Similarly, in the world of managing cloud hosting, we make an incredibly powerful and complex setup easy and stress-free for you.

    If you’re an online retailer, then you should know about Fleet — Anchor’s Magento focussed Platform-as-a-Service, powered by Amazon Web Services. Fleet makes it simple to test and deploy new code by abstracting away all of the underlying server, networking and infrastructure complexity, and giving you confidence that your store will automatically scale up and down with customer traffic — saving you money and hassle. Furthermore, if you’re on an Agile/DevOps journey and working towards the holy grail of continuous delivery, it’s easy to bolt Fleet right into your existing workflow thanks to the API and deployment automation smarts.

    If you’d like more information on how to streamline your hosting, and take advantage of all the benefits on offer, simply contact us.

    The post Is your 2017 cloud resolution to grow your online presence? appeared first on AWS Managed Services by Anchor.

    Presenting an Open Source Toolkit for Lightweight Multilingual Entity Linking

    Post Syndicated from mikesefanov original https://yahooeng.tumblr.com/post/154168092396

    yahooresearch:

    By Aasish Pappu, Roi Blanco, and Amanda Stent

What’s the first thing you want to know about any kind of text document (like a Yahoo News or Yahoo Sports article)? What it’s about, of course! That means you want to know something about the people, organizations, and locations that are mentioned in the document. Systems that automatically surface this information are called named entity recognition and linking systems. These are among the most useful components in text analytics as they are required for a wide variety of applications including search, recommender systems, question answering, and sentiment analysis.

    Named entity recognition and linking systems use statistical models trained over large amounts of labeled text data. A major challenge is to be able to accurately detect entities, in new languages, at scale, with limited labeled data available, and while consuming a limited amount of resources (memory and processing power).

    After researching and implementing solutions to enhance our own personalization technology, we are pleased to offer the open source community Fast Entity Linker, our unsupervised, accurate, and extensible multilingual named entity recognition and linking system, along with datapacks for English, Spanish, and Chinese.

For broad usability, our system links text entity mentions to Wikipedia. For example, in the sentence “Yahoo is a company headquartered in Sunnyvale, CA with Marissa Mayer as CEO,” our system would identify the following entities:

    On the algorithmic side, we use entity embeddings, click-log data, and efficient clustering methods to achieve high precision. The system achieves a low memory footprint and fast execution times by using compressed data structures and aggressive hashing functions.

    Entity embeddings are vector-based representations that capture how entities are referred to in context. We train entity embeddings using Wikipedia articles, and use hyperlinked terms in the articles to create canonical entities. The context of an entity and the context of a token are modeled using the neural network architecture in the figure below, where entity vectors are trained to predict not only their surrounding entities but also the global context of word sequences contained within them. In this way, one layer models entity context, and the other layer models token context. We connect these two layers using the same technique that (Quoc and Mikolov ‘14) used to train paragraph vectors.



    Architecture for training word embeddings and entity embeddings simultaneously. Ent represents entities and W represents their context words.

    Search click-log data gives very useful signals to disambiguate partial or ambiguous entity mentions. For example, if searchers for “Fox” tend to click on “Fox News” rather than “20th Century Fox,” we can use this data in order to identify “Fox” in a document. To disambiguate entity mentions and ensure a document has a consistent set of entities, our system supports three entity disambiguation algorithms:

*Currently, only the Forward Backward Algorithm is available in our open source release — the other two will be made available soon!

    These algorithms are particularly helpful in accurately linking entities when a popular candidate is NOT the correct candidate for an entity mention. In the example below, these algorithms leverage the surrounding context to accurately link Manchester City, Swansea City, Liverpool, Chelsea, and Arsenal to their respective football clubs.



    Ambiguous mentions that could refer to multiple entities are highlighted in red. For example, Chelsea could refer to Chelsea Football team or Chelsea neighborhood in New York or London. Unambiguous named entities are highlighted in green.


    Examples of candidate retrieval process in Entity Linking for both ambiguous and unambiguous examples referred in the example above. The correct candidate is highlighted in green.


    At this time, Fast Entity Linker is one of only three freely-available multilingual named entity recognition and linking systems (others are DBpedia Spotlight and Babelfy). In addition to a stand-alone entity linker, the software includes tools for creating and compressing word/entity embeddings and datapacks for different languages from Wikipedia data. As an example, the datapack containing information from all of English Wikipedia is only ~2GB.

    The technical contributions of this system are described in two scientific papers:

    There are numerous possible applications of the open-source toolkit. One of them is attributing sentiment to entities detected in the text, as opposed to the entire text itself. For example, consider the following actual review of the movie “Inferno” from a user on MetaCritic (revised for clarity): “While the great performance of Tom Hanks (wiki_Tom_Hanks) and company make for a mysterious and vivid movie, the plot is difficult to comprehend. Although the movie was a clever and fun ride, I expected more from Columbia (wiki_Columbia_Pictures).”  Though the review on balance is neutral, it conveys a positive sentiment about wiki_Tom_Hanks and a negative sentiment about wiki_Columbia_Pictures.

    Many existing sentiment analysis tools collate the sentiment value associated with the text as a whole, which makes it difficult to track sentiment around any individual entity. With our toolkit, one could automatically extract “positive” and “negative” aspects within a given text, giving a clearer understanding of the sentiment surrounding its individual components.

    Feel free to use the code, contribute to it, and come up with additional applications; our system and models are available at https://github.com/yahoo/FEL.

    Great work from our Yahoo Research team!

    AWS Hot Startups – November 2016 – AwareLabs, Doctor On Demand, Starling Bank, and VigLink

    Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-hot-startups-november-2016-awarelabs-doctor-on-demand-starling-bank-and-viglink/

    Tina is back with another impressive set of startups!

    Jeff;


    This month we are featuring four hot AWS-powered startups:

    • AwareLabs – Helping small businesses build smart websites
    • Doctor On Demand – Delivering fast, easy, and cost-effective access to top healthcare providers.
    • Starling Bank – Mobile banking for the next generation.
    • VigLink – Powering content-driven commerce.

    Make sure to also check out October’s Hot Startups if you missed it!

    AwareLabs (Phoenix/Charlotte)
    AwareLabs is a small, three-person startup focused on helping business owners engage their customers through dozens of integrated applications. The startup was born in November 2011 and began as a website building guide that helped hundreds of entrepreneurs within its first few weeks. Early on, founder Paul Kenjora recognized that small businesses were being slowed down by existing business solutions, and in 2013 he took on the task of creating a business-centric website builder. After attending an AWS seminar, Paul realized that small teams could design and deploy massive infrastructure just as well as heavily funded, high-tech companies; previously, that type of scale required a big company or heavy investment. With the help of AwareLabs, small businesses with limited time and budgets can build the smart websites they need.

    The AwareLabs team relies on AWS to achieve what was previously impossible with a team of their size. They’ve been able to raise less capital, move faster, and deliver a solution customers love. AwareLabs leverages Amazon EC2 extensively for everything from running client websites, to maintaining their own secure code repository. Amazon S3 has also been a game changer in offloading the burden of data storage and reliability. This was the single biggest factor in letting the AwareLabs development team focus on client-facing features instead of infrastructure issues. Amazon SES and Amazon SNS freed their developers to deliver integrated one-click newsletters with intelligent bounce reduction, which was very well received by clients. Finally, AWS has helped AwareLabs be profitable, which is huge for any startup!

    Be sure to check out AwareLabs for your professional website needs!

    Doctor On Demand (San Francisco)
    Doctor On Demand was built to address the growing problem that many of those in the U.S. face – lack of access to healthcare providers. The average wait time to see a physician is three weeks, and in rural areas, it can be even longer. It takes an average of 25 days to see a psychiatrist or psychologist and nearly half of all patients with mental health issues go without treatment. With Doctor On Demand, patients can see a board-certified physician or psychologist in a matter of minutes directly from their smartphone, tablet, or computer. They can also have video visits with providers at any time of day – no matter where they are. Patients simply download the Doctor On Demand app (iOS and Android) or visit www.doctorondemand.com, provide a summary of the reason for their visit, and are connected to a licensed provider in their state. Services are delivered through hundreds of employers and work with dozens of major health plans.

    From the very beginning, AWS has allowed Doctor On Demand to operate securely in the healthcare space. They utilize Amazon EC2, Amazon S3, Amazon CloudFront, Amazon CloudWatch, and AWS Trusted Advisor. With these services they are able to build compliant security and privacy controls, ‘simple’ fault tolerance, and easily set up a disaster recovery site (utilizing multiple AWS Regions). The company says the best part about working with AWS is that they are able to get everything they need on a startup budget.

    Check out the Doctor On Demand blog to keep up with the latest news!

    Starling Bank (UK)
    Starling Bank is on a mission to shake up financial services. Just as TV was radically changed by Netflix, music by the likes of Spotify, and social media by Snapchat, Starling aims to do the same for banking. Founded in 2014 by Anne Boden, Starling uses the latest technology to make the traditional current account obsolete. Having assembled a team of engineers, artists, and economists, Starling is nearing completion of the build of the bank. They will be launching their app in early 2017.

    Many next generation banks continue to stick to the traditional bank model that was built on technology from the 1960s and 70s. Instead of providing a range of products that are sold and cross-sold to unwilling customers, Starling will empower their users through seamless access to a mobile marketplace of financial services and products that best meet their needs at any given time. Customers can enjoy the security and protection of a licensed and regulated bank while also getting access to insights, data, and services that empower them to make decisions about their money.

    Starling Bank uses AWS to provision and scale a secure infrastructure automatically and on demand. They primarily use AWS CloudFormation and Amazon EC2, but also make use of Amazon S3, Amazon RDS, and AWS Lambda.

    Sign up here to be one of Starling’s first customers!

    VigLink (San Francisco)
    Oliver Roup, founder and CEO of VigLink, was first introduced to affiliate marketing as a student at Harvard Business School. His interest in the complex ecosystem prompted him to write a crawler to identify existing product links to Amazon. Roup found that less than half of those links were enrolled in the associates program. It was at this moment that he determined there was a real business opportunity at hand, and VigLink was born.

    Over the last seven years, the company has grown into not only a content monetization platform, but also a platform that provides publishers and merchants with insights into their ecommerce business. At its core, VigLink identifies commercial product mentions within a publisher’s content and automatically transforms them into revenue-generating hyperlinks whose destinations are determined by real-time, advertiser-bid auctions. Since its founding in 2009, VigLink has been backed by top investors including Google Ventures, Emergence Capital Partners, and RRE. Check out a recent interview with Roup and a tour of VigLink’s offices here!

    Since the company’s start, VigLink has utilized AWS extensively. The flexibility to be able to respond to demand elastically without capital costs or hardware maintenance has been game-changing. They use numerous services including Amazon EC2, Amazon S3, Amazon SQS, Amazon RDS, and Amazon Redshift. While continuing to scale, VigLink has recently been able to cut costs by 15% using tools such as AWS Cost Explorer.

    Take a behind-the-scenes look at VigLink in this short video.

    Tina Barr

    Send ECS Container Logs to CloudWatch Logs for Centralized Monitoring

    Post Syndicated from Chris Barclay original http://blogs.aws.amazon.com/application-management/post/TxFRDMTMILAA8X/Send-ECS-Container-Logs-to-CloudWatch-Logs-for-Centralized-Monitoring

    My colleagues Brandon Chavis, Pierre Steckmeyer and Chad Schmutzer sent a nice guest post that demonstrates how to send your container logs to a central source for easy troubleshooting and alarming.

     

    -----

    Amazon EC2 Container Service (Amazon ECS) is a highly scalable, high performance container management service that supports Docker containers and allows you to easily run applications on a managed cluster of Amazon EC2 instances.

    In this multipart blog post, we have chosen to take a universal struggle amongst IT professionals—log collection—and approach it from different angles to highlight possible architectural patterns that facilitate communication and data sharing between containers.

    When building applications on ECS, it is a good practice to follow a microservices approach, which encourages the design of a single application component per container. This design improves flexibility and elasticity, while leading to a loosely coupled architecture for resilience and ease of maintenance. However, this architectural style makes it important to consider how your containers will communicate and share data with each other.

    Why is it useful?

    Application logs are useful for many reasons. They are the primary source of troubleshooting information. In the field of security, they are essential to forensics. Web server logs are often leveraged for analysis (at scale) in order to gain insight into usage, audience, and trends.

    Centrally collecting container logs is a common problem that can be solved in a number of ways. The Docker community has offered solutions such as having working containers map a shared volume; having a log-collecting container; and getting logs from a container that logs to stdout/stderr and retrieving them with docker logs.

    In this post, we present a solution using Amazon CloudWatch Logs. CloudWatch is a monitoring service for AWS cloud resources and the applications you run on AWS. CloudWatch Logs can be used to collect and monitor your logs for specific phrases, values, or patterns. For example, you could set an alarm on the number of errors that occur in your system logs or view graphs of web request latencies from your application logs. The additional advantages here are that you can look at a single pane of glass for all of your monitoring needs because such metrics as CPU, disk I/O, and network for your container instances are already available on CloudWatch.
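
    As one concrete example of that kind of alerting, once the container logs from this post are flowing into CloudWatch Logs you could turn error lines into a custom metric and alarm on it. The boto3 sketch below is illustrative only; the log group, metric, and alarm names are placeholders, not resources created by this walkthrough.

    import boto3

    logs = boto3.client("logs")
    cloudwatch = boto3.client("cloudwatch")

    # Count every log event containing "ERROR" as a data point on a custom metric.
    logs.put_metric_filter(
        logGroupName="httpd-error.log",          # placeholder log group name
        filterName="httpd-errors",
        filterPattern="ERROR",
        metricTransformations=[{
            "metricName": "HttpdErrorCount",
            "metricNamespace": "ECSLogsDemo",
            "metricValue": "1",
        }],
    )

    # Alarm when five or more errors occur within five minutes.
    cloudwatch.put_metric_alarm(
        AlarmName="httpd-error-spike",
        Namespace="ECSLogsDemo",
        MetricName="HttpdErrorCount",
        Statistic="Sum",
        Period=300,
        EvaluationPeriods=1,
        Threshold=5,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
    )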

    Here is how we are going to do it

    Our approach involves setting up a container whose sole purpose is logging. It runs rsyslog and the CloudWatch Logs agent, and we use Docker links to communicate with other containers. With this strategy, it becomes easy to link existing application containers such as Apache and have discrete logs per task. This logging container is defined in each ECS task definition, which is a collection of containers running together on the same container instance. With our container log collection strategy, you do not have to modify your Docker image. Any log mechanism tweak is specified in the task definition.

     

    Note: This blog provisions a new ECS cluster in order to test the following instructions. Also, please note that we are using the US East (N. Virginia) region throughout this exercise. If you would like to use a different AWS region, please make sure to update your configuration accordingly.

    Linking to a CloudWatch logging container

    We will create a container that can be deployed as a syslog host. It will accept standard syslog connections on 514/TCP to rsyslog through container links, and will also forward those logs to CloudWatch Logs via the CloudWatch Logs agent. The idea is that this container can be deployed as the logging component in your architecture (not limited to ECS; it could be used for any centralized logging).

    As a proof of concept, we show you how to deploy a container running httpd, clone some static web content (for this example, we clone the ECS documentation), and have the httpd access and error logs sent to the rsyslog service running on the syslog container via container linking. We also send the Docker and ecs-agent logs from the EC2 instance the task is running on. The logs in turn are sent to CloudWatch Logs via the CloudWatch Logs agent.

    Note: Be sure to replace your information throughout the document as necessary (for example: replace "my_docker_hub_repo" with the name of your own Docker Hub repository).

    We also assume that all following requirements are in place in your AWS account:

    A VPC exists for the account

    There is an IAM user with permissions to launch EC2 instances and create IAM policies/roles

    SSH keys have been generated

    Git and Docker are installed on the image building host

    The user owns a Docker Hub account and a repository ("my_docker_hub_repo" in this document)

    Let’s get started.

    Create the Docker image

    The first step is to create the Docker image to use as a logging container. For this, all you need is a machine that has Git and Docker installed. You could use your own local machine or an EC2 instance.

    Install Git and Docker. The following steps pertain to the Amazon Linux AMI but you should follow the Git and Docker installation instructions respective to your machine.

    $ sudo yum update -y && sudo yum -y install git docker

    Make sure that the Docker service is running:

    $ sudo service docker start

    Clone the GitHub repository containing the files you need:

    $ git clone https://github.com/awslabs/ecs-cloudwatch-logs.git
    $ cd ecs-cloudwatch-logs

    You should now have a directory containing two .conf files and a Dockerfile. Feel free to read the content of these files and identify the mechanisms used.
     

    Log in to Docker Hub:

    $ sudo docker login

    Build the container image (replace the my_docker_hub_repo with your repository name):

    $ sudo docker build -t my_docker_hub_repo/cloudwatchlogs .

    Push the image to your repo:

    $ sudo docker push my_docker_hub_repo/cloudwatchlogs

    Use the build-and-push time to dive deeper into what will live in this container. You can follow along by reading the Dockerfile. Here are a few things worth noting:

    The first RUN updates the distribution and installs rsyslog, pip, and curl.

    The second RUN downloads the AWS CloudWatch Logs agent.

    The third RUN enables remote connections for rsyslog.

    The fourth RUN removes the local6 and local7 facilities to prevent duplicate entries. If you don’t do this, you will see every single Apache log entry in /var/log/syslog.

    The last RUN specifies which output files will receive the log entries on local6 and local7 (e.g., "if the facility is local6 and it is tagged with httpd, put those into this httpd-access.log file").

    We use Supervisor to run more than one process in this container: rsyslog and the CloudWatch Logs agent.

    We expose port 514 for rsyslog to collect log entries via the Docker link.
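
    Before wiring the image into ECS, it can be worth a quick local smoke test. The sketch below uses the Docker SDK for Python; the repository name is the same placeholder used throughout this post, and note that the CloudWatch Logs agent inside the container still needs AWS credentials (or an instance role, once on EC2) before it can actually ship anything.

    # pip install docker
    import time
    import docker

    client = docker.from_env()

    # Run the image in the background and expose the rsyslog port used by container links.
    container = client.containers.run(
        "my_docker_hub_repo/cloudwatchlogs",
        name="cloudwatchlogs-smoketest",
        detach=True,
        ports={"514/tcp": 514},
    )

    # Give Supervisor a moment to start rsyslog and the agent, then peek at the container output.
    time.sleep(5)
    print(container.logs(tail=20).decode("utf-8", errors="replace"))

    container.stop()
    container.remove()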

    Create an ECS cluster

    Now, create an ECS cluster. One way to do so could be to use the Amazon ECS console first run wizard. For now, though, all you need is an ECS cluster.

    7. Navigate to the ECS console and choose Create cluster. Give it a unique name that you have not used before (such as "ECSCloudWatchLogs"), and choose Create.

    Create an IAM role

    The next five steps set up a CloudWatch-enabled IAM role with EC2 permissions and spin up a new container instance with this role. All of this can be done manually via the console, or you can run a CloudFormation template. To use the CloudFormation template, navigate to the CloudFormation console, create a new stack by using this template, and go straight to step 14 (just specify the ECS cluster name used above, choose your preferred instance type, select the appropriate EC2 SSH key, and leave the rest unchanged). Otherwise, continue on to step 8.

    8. Create an IAM policy for CloudWatch Logs and ECS: point your browser to the IAM console, choose Policies and then Create Policy. Choose Select next to Create Your Own Policy. Give your policy a name (e.g., ECSCloudWatchLogs) and paste the text below as the Policy Document value.

    {
    "Version": "2012-10-17",
    "Statement": [
    {
    "Action": [
    "logs:Create*",
    "logs:PutLogEvents"
    ],
    "Effect": "Allow",
    "Resource": "arn:aws:logs:*:*:*"
    },
    {
    "Action": [
    "ecs:CreateCluster",
    "ecs:DeregisterContainerInstance",
    "ecs:DiscoverPollEndpoint",
    "ecs:RegisterContainerInstance",
    "ecs:Submit*",
    "ecs:Poll"
    ],
    "Effect": "Allow",
    "Resource": "*"
    }
    ]
    }

    9. Create a new IAM EC2 service role and attach the above policy to it. In IAM, choose Roles, Create New Role. Pick a name for the role (e.g., ECSCloudWatchLogs). Choose Role Type, Amazon EC2. Find and pick the policy you just created, click Next Step, and then Create Role.
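
    Steps 8 and 9 can also be scripted. A hedged boto3 sketch follows: the role, policy, and instance profile names reuse the example name above, the policy document is assumed to have been saved locally as ecs-cloudwatch-policy.json (a filename chosen for this sketch), and the trust policy is the standard EC2 one.

    import json
    import boto3

    iam = boto3.client("iam")
    name = "ECSCloudWatchLogs"

    # Standard EC2 trust relationship so instances can assume the role.
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }
    iam.create_role(RoleName=name, AssumeRolePolicyDocument=json.dumps(trust_policy))

    # Attach the CloudWatch Logs / ECS policy shown above as an inline policy.
    with open("ecs-cloudwatch-policy.json") as f:
        iam.put_role_policy(RoleName=name, PolicyName=name, PolicyDocument=f.read())

    # EC2 instances pick up the role through an instance profile.
    iam.create_instance_profile(InstanceProfileName=name)
    iam.add_role_to_instance_profile(InstanceProfileName=name, RoleName=name)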

    Launch an EC2 instance and ECS cluster

    10. Launch an instance with the Amazon ECS AMI and the above role in the US East (N. Virginia) region. On the EC2 console page, choose Launch Instance. Choose Community AMIs. In the search box, type "amazon-ecs-optimized" and choose Select for the latest version (2015.03.b). Select the appropriate instance type and choose Next.

    11. Choose the appropriate Network value for your ECS cluster. Make sure that Auto-assign Public IP is enabled. Choose the IAM role that you just created (e.g., ECSCloudWatchLogs). Expand Advanced Details and in the User data field, add the following while substituting your_cluster_name for the appropriate name:

    #!/bin/bash
    echo ECS_CLUSTER=your_cluster_name >> /etc/ecs/ecs.config

    12. Choose Next: Add Storage, then Next: Tag Instance. You can give your container instance a name on this page. Choose Next: Configure Security Group. On this page, you should make sure that both SSH and HTTP are open to at least your own IP address.

    13. Choose Review and Launch, then Launch and Associate with the appropriate SSH key. Note the instance ID.
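
    Steps 10 through 13 can likewise be scripted if you prefer. In the boto3 sketch below, the AMI ID, key pair name, and security group ID are placeholders you would replace with your own values; the instance launches into your default VPC and subnet unless you specify otherwise.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    user_data = "#!/bin/bash\necho ECS_CLUSTER=ECSCloudWatchLogs >> /etc/ecs/ecs.config\n"

    response = ec2.run_instances(
        ImageId="ami-xxxxxxxx",                   # an ECS-optimized AMI ID (placeholder)
        InstanceType="t2.micro",
        MinCount=1,
        MaxCount=1,
        KeyName="my-ssh-key",                     # placeholder key pair name
        SecurityGroupIds=["sg-xxxxxxxx"],         # must allow SSH and HTTP from your IP
        IamInstanceProfile={"Name": "ECSCloudWatchLogs"},
        UserData=user_data,                       # boto3 base64-encodes this for you
    )
    print(response["Instances"][0]["InstanceId"])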

    14. Ensure that your newly spun-up EC2 instance is part of your container instances (note that it may take up to a minute for the container instance to register with ECS). In the ECS console, select the appropriate cluster. Select the ECS Instances tab. You should see a container instance with the instance ID that you just noted after a minute.

    15. On the left pane of the ECS console, choose Task Definitions, then Create new Task Definition. On the JSON tab, paste the code below, overwriting the default text. Make sure to replace "my_docker_hub_repo" with your own Docker Hub repo name and choose Create.

    {
    "volumes": [
    {
    "name": "ecs_instance_logs",
    "host": {
    "sourcePath": "/var/log"
    }
    }
    ],
    "containerDefinitions": [
    {
    "environment": [],
    "name": "cloudwatchlogs",
    "image": "my_docker_hub_repo/cloudwatchlogs",
    "cpu": 50,
    "portMappings": [],
    "memory": 64,
    "essential": true,
    "mountPoints": [
    {
    "sourceVolume": "ecs_instance_logs",
    "containerPath": "/mnt/ecs_instance_logs",
    "readOnly": true
    }
    ]
    },
    {
    "environment": [],
    "name": "httpd",
    "links": [
    "cloudwatchlogs"
    ],
    "image": "httpd",
    "cpu": 50,
    "portMappings": [
    {
    "containerPort": 80,
    "hostPort": 80
    }
    ],
    "memory": 128,
    "entryPoint": ["/bin/bash", "-c"],
    "command": [
    "apt-get update && apt-get -y install wget && echo 'CustomLog \"| /usr/bin/logger -t httpd -p local6.info -n cloudwatchlogs -P 514\" \"%v %h %l %u %t %r %>s %b %{Referer}i %{User-agent}i\"' >> /usr/local/apache2/conf/httpd.conf && echo 'ErrorLogFormat \"%v [%t] [%l] [pid %P] %F: %E: [client %a] %M\"' >> /usr/local/apache2/conf/httpd.conf && echo 'ErrorLog \"| /usr/bin/logger -t httpd -p local7.info -n cloudwatchlogs -P 514\"' >> /usr/local/apache2/conf/httpd.conf && echo ServerName `hostname` >> /usr/local/apache2/conf/httpd.conf && rm -rf /usr/local/apache2/htdocs/* && cd /usr/local/apache2/htdocs && wget -mkEpnp -nH --cut-dirs=4 http://docs.aws.amazon.com/AmazonECS/latest/developerguide/Welcome.html && /usr/local/bin/httpd-foreground"
    ],
    "essential": true
    }
    ],
    "family": "cloudwatchlogs"
    }

    What are some highlights of this task definition?

    The sourcePath value allows the CloudWatch Logs agent running in the log collection container to access the host-based Docker and ECS agent log files. You can change the retention period in CloudWatch Logs.

    The cloudwatchlogs container is marked essential, which means that if log collection goes down, so should the application it is collecting from. Similarly, the web server is marked essential as well. You can easily change this behavior.

    The command section is a bit lengthy. Let us break it down:

    We first install wget so that we can later clone the ECS documentation for display on our web server.

    We then write four lines to httpd.conf. These are the echo commands. They describe how httpd will generate log files and their format. Notice how we tag (-t httpd) these files with httpd and assign them a specific facility (-p localX.info). We also specify that logger is to send these entries to the host -n cloudwatchlogs on port -P 514. This will be handled by linking. Hence, port 514 is left untouched on the machine and we could have as many of these logging containers running as we want.

    %v %h %l %u %t %r %>s %b %{Referer}i %{User-agent}i should look fairly familiar to anyone who has looked into tweaking Apache logs. The initial %v is the server name, and it will be replaced by the container ID. This is how we are able to discern which container the logs come from in CloudWatch Logs.

    We remove the default httpd landing page with rm -rf.

    We instead use wget to download a clone of the ECS documentation.

    And, finally, we start httpd. Note that we redirect httpd log files in our task definition at the command level for the httpd image. Applying the same concept to another image would simply require you to know where your application maintains its log files.


    Create a service

    16. On the Services tab in the ECS console, choose Create. Choose the task definition created in step 15, name the service, and set the number of tasks to 1. Choose Create service.
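
    If you would rather script steps 15 and 16, a boto3 sketch follows. It assumes the task definition JSON above has been saved locally as taskdef.json (a filename chosen for this sketch), that the cluster is the "ECSCloudWatchLogs" example from earlier, and that the service name is arbitrary.

    import json
    import boto3

    ecs = boto3.client("ecs")

    # Register the task definition from step 15.
    with open("taskdef.json") as f:
        taskdef = json.load(f)
    ecs.register_task_definition(
        family=taskdef["family"],
        volumes=taskdef["volumes"],
        containerDefinitions=taskdef["containerDefinitions"],
    )

    # Create the service from step 16 with a single task.
    ecs.create_service(
        cluster="ECSCloudWatchLogs",
        serviceName="cloudwatchlogs-demo",
        taskDefinition="cloudwatchlogs",          # family name; the latest revision is used
        desiredCount=1,
    )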

    17. The task will start running shortly. You can press the refresh icon on your service’s Tasks tab. After the status says "Running", choose the task and expand the httpd container. The container instance IP will be a hyperlink under the Network bindings section’s External link. When you select the link you should see a clone of the Amazon ECS documentation. You are viewing this thanks to the httpd container running on your ECS cluster.

    18. Open the CloudWatch Logs console to view the new ECS log entries.
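
    The same entries can be pulled from code rather than the console. The log group names depend on the agent configuration baked into the image, so the group name in the sketch below is a placeholder; list the groups first to find the real ones.

    import boto3

    logs = boto3.client("logs")

    # List the log groups the CloudWatch Logs agent created.
    for group in logs.describe_log_groups()["logGroups"]:
        print(group["logGroupName"])

    # Tail the most recent events from one of them (placeholder name).
    events = logs.filter_log_events(logGroupName="httpd-access.log", limit=20)
    for event in events["events"]:
        print(event["timestamp"], event["message"])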

    Conclusion

    If you have followed all of these steps, you should now have a two-container task running in your ECS cluster. One container serves web pages while the other collects the log activity from the web container and sends it to CloudWatch Logs. Such a setup can be replicated with any other application; all you need to do is specify a different container image and describe the expected log files in the command section.
