Tag Archives: development

Raspberry Jam Cameroon #PiParty

Post Syndicated from Ben Nuttall original https://www.raspberrypi.org/blog/raspberry-jam-cameroon-piparty/

Earlier this year on 3 and 4 March, communities around the world held Raspberry Jam events to celebrate Raspberry Pi’s sixth birthday. We sent out special birthday kits to participating Jams — it was amazing to know the kits would end up in the hands of people in parts of the world very far from Raspberry Pi HQ in Cambridge, UK.

The Raspberry Jam Camer team: Damien Doumer, Eyong Etta, Loïc Dessap and Lionel Sichom, aka Lionel Tellem

Preparing for the #PiParty

One birthday kit went to Yaoundé, the capital of Cameroon. There, a team of four students in their twenties — Lionel Sichom (aka Lionel Tellem), Eyong Etta, Loïc Dessap, and Damien Doumer — were organising Yaoundé’s first Jam, called Raspberry Jam Camer, as part of the Raspberry Jam Big Birthday Weekend. The team knew one another through their shared interests and skills in electronics, robotics, and programming. Damien explains in his blog post about the Jam that they planned several activities in advance, based on their own projects, so they could be confident of having a few things that would definitely work well for attendees to do and see.

Show-and-tell at Raspberry Jam Cameroon

Loïc presented a Raspberry Pi–based, Android app–controlled robot arm that he had built, and Lionel coded a small video game using Scratch on Raspberry Pi while the audience watched. Damien demonstrated the possibilities of Windows 10 IoT Core on Raspberry Pi, showing how to install it, how to use it remotely, and what you can do with it, including building a simple application.

Loïc Dessap, wearing a Raspberry Jam Big Birthday Weekend T-shirt, sits at a table with a robot arm, a laptop with a Pi sticker and other components. He is making an adjustment to his set-up.

Loïc showcases the prototype robot arm he built

There was lots more too, with others discussing their own Pi projects and talking about the possibilities Raspberry Pi offers, including a Pi-controlled drone and car. Cake was a prevailing theme of the Raspberry Jam Big Birthday Weekend around the world, and Raspberry Jam Camer made sure they didn’t miss out.

A round pink-iced cake decorated with the words "Happy Birthday RBP" and six candles, on a table beside Raspberry Pi stickers, Raspberry Jam stickers and Raspberry Jam fliers

Yay, birthday cake!!

A big success

Most visitors to the Jam were secondary school students, while others were university students and graduates. The majority were unfamiliar with Raspberry Pi, but all wanted to learn about Raspberry Pi and what they could do with it. Damien comments that the fact most people were new to Raspberry Pi made the event more interactive rather than creating any challenges, because the visitors were all interested in finding out about the little computer. The Jam was an all-round success, and the team was pleased with how it went:

What I liked the most was that we sensitized several people about the Raspberry Pi and what one can be capable of with such a small but powerful device. — Damien Doumer

The Jam team rounded off the event by announcing that this was the start of a Raspberry Pi community in Yaoundé. They hope that they and others will be able to organise more Jams and similar events in the area to spread the word about what people can do with Raspberry Pi, and to help them realise their ideas.

The Raspberry Jam Camer team, wearing Raspberry Jam Big Birthday Weekend T-shirts, pose with young Jam attendees outside their venue

Raspberry Jam Camer gets the thumbs-up

The Raspberry Pi community in Cameroon

In a French-language interview about their Jam, the team behind Raspberry Jam Camer said they’d like programming to become the third official language of Cameroon, after French and English; their aim is to popularise programming and digital making across Cameroonian society. Neither of these fields is very familiar to most people in Cameroon, but both are very well aligned with the country’s ambitions for development. The team is conscious of the difficulties around the emergence of information and communication technologies in the Cameroonian context; in response, they are seizing the opportunities Raspberry Pi offers to give children and young people access to modern and constantly evolving technology at low cost.

Thanks to Lionel, Eyong, Damien, and Loïc, and to everyone who helped put on a Jam for the Big Birthday Weekend! Remember, anyone can start a Jam at any time — and we provide plenty of resources to get you started. Check out the Guidebook, the Jam branding pack, our specially-made Jam activities online (in multiple languages), printable worksheets, and more.

The post Raspberry Jam Cameroon #PiParty appeared first on Raspberry Pi.

Fairplay Canada Discredits “Pro-Piracy” TorrentFreak News, Then Cites Us

Post Syndicated from Ernesto original https://torrentfreak.com/fairplay-canada-discredits-pro-piracy-torrentfreak-news-then-cites-us-180520/

At TorrentFreak we do our best to keep readers updated on the latest copyright and piracy news, highlighting issues from different points of view.

We report on the opinions and efforts of copyright holders when it comes to online piracy and we also make room for those who oppose them. That’s how balanced reporting works in our view.

There is probably no site on the Internet that reports on the negative consequences of piracy as much as we do, but for some reason, the term “pro-piracy” is sometimes attached to our reporting. This also happened in the recent reply Fairplay Canada sent to the CRTC.

The coalition of media companies and ISPs is trying to get a pirate site blocking regime implemented in Canada. As part of this effort, it’s countering numerous responses from the public, including one from law professor Michael Geist.

In his submission, Geist pointed out that the Mexican Supreme Court ruled that site blocking is disproportional, referring to our article on the matter. This article was entirely correct at the time it was written, but it appears that the Court later clarified its stance.

Instead of pointing that out to us, or perhaps Geist, Fairplay frames it in a different light.

“Professor Geist dismisses Mexico because, relying on a third party source (the pro-piracy news site TorrentFreak), he believes its Supreme Court has ruled that the regime is disproportionate,” it writes.

Fairplay does not dispute that the Supreme Court initially ruled that a site blockade should target specific content. However, it adds that the court later clarified that blockades are also allowed if a substantial majority of content on a site is infringing.

The bottom line is that, later developments aside, our original article was correct. What bothers us, however, is that the Fairplay coalition is branding us as a “pro-piracy” site. That’s done for a reason, most likely to discredit the accuracy of our reporting.

Pro piracy news site

Luckily we have pretty thick skin, so we’ll get over it. If Fairplay Canada doesn’t trust us, then so be it.

Amusingly, however, this was not the only TorrentFreak article the coalition referenced. In fact, our reporting is cited twice more in the same report but without the pro-piracy branding.

A few pages down from the Geist reference, Fairplay mentions how pirate site blockades do not violate net neutrality in India, referring to our thorough article that explains how the process works.

No pro piracy?

Similarly, we’re also pretty reliable when it comes to reporting on MUSO’s latest piracy data, as Fairplay cites us for that as well. These are the data that play a central role in the coalition’s argumentation and analysis.

We’re not entirely sure how it works, but apparently, we are a “pro-piracy” news site when Fairplay Canada doesn’t like our reporting, and a reliable source when it suits their message.

In any case, we would like to point out that this entire opinion article is written without any pro-piracy messaging. But it appears that every sentence that deviates from the agenda of certain groups may be interpreted as such.

Not sure if you could call that fair play?

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

ISP Telenor Will Block The Pirate Bay in Sweden Without a Shot Fired

Post Syndicated from Andy original https://torrentfreak.com/isp-telenor-will-block-the-pirate-bay-in-sweden-without-a-shot-fired-180520/

Back in 2014, Universal Music, Sony Music, Warner Music, Nordisk Film and the Swedish Film Industry filed a lawsuit against Bredbandsbolaget, one of Sweden’s largest ISPs.

The copyright holders asked the Stockholm District Court to order the ISP to block The Pirate Bay and streaming site Swefilmer, claiming that the provider knowingly facilitated access to the pirate platforms and assisted their pirating users.

Soon after, the ISP fought back, refusing to block the sites in a determined response to the Court.

“Bredbandsbolaget’s role is to provide its subscribers with access to the Internet, thereby contributing to the free flow of information and the ability for people to reach each other and communicate,” the company said in a statement.

“Bredbandsbolaget does not block content or services based on individual organizations’ requests. There is no legal obligation for operators to block either The Pirate Bay or Swefilmer.”

In February 2015 the parties met in court, with Bredbandsbolaget arguing in favor of the “important principle” that ISPs should not be held responsible for content exchanged over the Internet, in the same way the postal service isn’t responsible for the contents of an envelope.

But with TV companies SVT, TV4 Group, MTG TV, SBS Discovery and C More teaming up with the IFPI alongside Paramount, Disney, Warner and Sony in the case, Bredbandsbolaget would need to pull out all the stops to obtain victory. The company worked hard and initially the news was good.

In November 2015, the Stockholm District Court decided that the copyright holders could not force Bredbandsbolaget to block the pirate sites, ruling that the ISP’s operations did not amount to participation in the copyright infringement offenses carried out by some of its ‘pirate’ subscribers.

However, the case subsequently went to appeal, with the brand new Patent and Market Court of Appeal hearing arguments. In February 2017 it handed down its decision, which overruled the earlier ruling of the District Court and ordered Bredbandsbolaget to implement “technical measures” to prevent its customers accessing the ‘pirate’ sites through a number of domain names and URLs.

With nowhere left to go, Bredbandsbolaget and owner Telenor were left hanging onto their original statement which vehemently opposed site-blocking.

“It is a dangerous path to go down, which forces Internet providers to monitor and evaluate content on the Internet and block websites with illegal content in order to avoid becoming accomplices,” they said.

In March 2017, Bredbandsbolaget blocked The Pirate Bay but said it would not give up the fight.

“We are now forced to contest any future blocking demands. It is the only way for us and other Internet operators to ensure that private players should not have the last word regarding the content that should be accessible on the Internet,” Bredbandsbolaget said.

While it’s not clear whether any additional blocking demands have been filed with the ISP, this week an announcement by Bredbandsbolaget parent company Telenor revealed an unexpected knock-on effect. Seemingly without a single shot being fired, The Pirate Bay will now be blocked by Telenor too.

The background lies in Telenor’s acquisition of Bredbandsbolaget back in 2005. Until this week the companies operated under separate brands but will now merge into one entity.

“Telenor Sweden and Bredbandsbolaget today take the final step on their joint trip and become the same company with the same name. As a result, Telenor becomes a comprehensive provider of broadband, TV and mobile communications,” the company said in a statement this week.

“Telenor Sweden and Bredbandsbolaget have shared both logo and organization for the last 13 years. Today, we take the last step in the relationship and consolidate the companies under the same name.”

Up until this final merger, 600,000 Bredbandsbolaget broadband customers were denied access to The Pirate Bay. Now it appears that Telenor’s 700,000 fiber and broadband customers will be affected too. The new single-brand company says it has decided to block the notorious torrent site across its entire network.

“We have not discontinued Bredbandsbolaget, but we have merged Telenor and Bredbandsbolaget and become one,” the company said.

“When we share the same network, The Pirate Bay is blocked by both Telenor and Bredbandsbolaget and there is nothing we plan to change in the future.”

TorrentFreak contacted the PR departments of both Telenor and Bredbandsbolaget requesting information on why a court order aimed at only the latter’s customers would now affect those of the former too, more than doubling the blockade’s reach. Neither company responded, which leaves only speculation as to their motives.

On the one hand, the decision to voluntarily implement an expanded blockade could perhaps be viewed as a little unusual given how much time, effort and money has been invested in fighting web-blockades in Sweden.

On the other, the merger of the companies may present legal difficulties as far as the court order goes and it could certainly cause friction among the customer base of Telenor if some customers could access TPB, and others could not.

In any event, the legal basis for web-blocking on copyright infringement grounds was firmly established last year at the EU level, which means that Telenor would lose any future legal battle, should it decide to dig in its heels. On that basis alone, the decision to block all customers probably makes perfect commercial sense.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Connect, collaborate, and learn at AWS Global Summits in 2018

Post Syndicated from Tina Kelleher original https://aws.amazon.com/blogs/big-data/connect-collaborate-and-learn-at-aws-global-summits-in-2018/

Regardless of your career path, there’s no denying that attending industry events can provide helpful career development opportunities — not only for improving and expanding your skill sets, but for networking as well. According to this article from PayScale.com, experts estimate that somewhere between 70% and 85% of new positions are landed through networking.

If you’d like to network with cloud computing professionals who are tackling some of today’s most innovative and exciting big data solutions, attending big data-focused sessions at an AWS Global Summit is a great place to start.

AWS Global Summits are free events that bring the cloud computing community together to connect, collaborate, and learn about AWS. As the name suggests, these summits are held in major cities around the world, and attract technologists from all industries and skill levels who’re interested in hearing from AWS leaders, experts, partners, and customers.

In addition to networking opportunities with top cloud technology providers, consultants and your peers in our Partner and Solutions Expo, you’ll also hone your AWS skills by attending and participating in a multitude of education and training opportunities.

Here’s a brief sampling of some of the upcoming sessions relevant to big data professionals:

May 31st: Big Data Architectural Patterns and Best Practices on AWS | AWS Summit – Mexico City

June 6th-7th: Various (click on the “Big Data & Analytics” header) | AWS Summit – Berlin

June 20th-21st: [email protected] | Public Sector Summit – Washington DC

June 21st: Enabling Self Service for Data Scientists with AWS Service Catalog | AWS Summit – Sao Paulo

Be sure to check out the main page for AWS Global Summits, where you can see which cities have AWS Summits planned for 2018, register to attend an upcoming event, or provide your information to be notified when registration opens for a future event.

[$] A reworked TCP zero-copy receive API

Post Syndicated from corbet original https://lwn.net/Articles/754681/rss

In April, LWN looked at the new API for zero-copy reception of TCP data that had been merged into the net-next tree for the 4.18 development cycle. After that article was written, a couple of issues came to the fore that required some changes to the API for this feature. Those changes have been made and merged; read on for the details.

[$] Updates in container isolation

Post Syndicated from corbet original https://lwn.net/Articles/754433/rss

At KubeCon + CloudNativeCon Europe 2018, several talks explored the topic of container isolation and security. The last year saw the release of Kata Containers which, combined with the CRI-O project, provided strong isolation guarantees for containers using a hypervisor. During the conference, Google released its own hypervisor called gVisor, adding yet another possible solution for this problem. Those new developments prompted the community to work on integrating the concept of “secure containers” (or “sandboxed containers”) deeper into Kubernetes. This work is now coming to fruition; it prompts us to look again at how Kubernetes tries to keep the bad guys from wreaking havoc once they break into a container.

Puerto Rico’s First Raspberry Pi Educator Workshop

Post Syndicated from Dana Augustin original https://www.raspberrypi.org/blog/puerto-rico-raspberry-pi-workshop/

Earlier this spring, an excited group of STEM educators came together to participate in the first ever Raspberry Pi and Arduino workshop in Puerto Rico.

Their three-day digital making adventure was led by MakerTechPR’s José Rullán and Raspberry Pi Certified Educator Alex Martínez. They ran the event as part of the Robot Makers challenge organized by Yees! and sponsored by Puerto Rico’s Department of Economic Development and Trade to promote entrepreneurial skills within Puerto Rico’s education system.

Over 30 educators attended the workshop, which covered the use of the Raspberry Pi 3 as a computer and digital making resource. The educators received a kit consisting of a Raspberry Pi 3 with an Explorer HAT Pro and an Arduino Uno. At the end of the workshop, the educators were able to keep the kit as a demonstration unit for their classrooms. They were enthusiastic to learn new concepts and immerse themselves in the world of physical computing.

In their first session, the educators were introduced to the Raspberry Pi as an affordable technology for robotic clubs. In their second session, they explored physical computing and the coding languages needed to control the Explorer HAT Pro. They started off coding with Scratch, with which some educators had experience, and ended with controlling the GPIO pins with Python. In the final session, they learned how to develop applications using the powerful combination of Arduino and Raspberry Pi for robotics projects. This gave them a better understanding of how they could engage their students in physical computing.

“The Raspberry Pi ecosystem is the perfect solution in the classroom because to us it is very resourceful and accessible.” – Alex Martínez

Computer science and robotics courses are important for many schools and teachers in Puerto Rico. The simple idea of programming a microcontroller from a $35 computer increases the chances of more students having access to more technology to create things.

Puerto Rico’s education system has faced enormous challenges after Hurricane Maria, including economic collapse and the government’s closure of many schools due to the exodus of families from the island. By attending training like this workshop, educators in Puerto Rico are becoming more experienced in fields like robotics in particular, which are key for 21st-century skills and learning. This, in turn, can lead to more educational opportunities, and hopefully the reopening of more schools on the island.

“We find it imperative that our children be taught STEM disciplines and skills. Our goal is to continue this work of spreading digital making and computer science using the Raspberry Pi around Puerto Rico. We want our children to have the best education possible.” – Alex Martínez

After attending Picademy in 2016, Alex has integrated the Raspberry Pi Foundation’s online resources into his classroom. He has also taught small workshops around the island and in the local Puerto Rican makerspace community. José is an electrical engineer, entrepreneur, educator and hobbyist who enjoys learning to use technology and sharing his knowledge through projects and challenges.

The post Puerto Rico’s First Raspberry Pi Educator Workshop appeared first on Raspberry Pi.

Police Launch Investigation into Huge Pirate Manga Site Mangamura

Post Syndicated from Andy original https://torrentfreak.com/police-launch-investigation-into-huge-pirate-manga-site-mangamura-180514/

Back in March, Japan’s Chief Cabinet Secretary Yoshihide Suga said that the government was considering measures to prohibit access to pirate sites.

While protecting all content is the overall aim, it became clear that the government was determined to protect Japan’s successful manga and anime industries.

It didn’t take long for a reaction. On Friday April 13, the government introduced emergency website blocking measures, seeking cooperation from the country’s ISPs.

NTT Communications Corp., NTT Docomo Inc., and NTT Plala Inc. quickly announced they would block three leading pirate sites – Mangamura, AniTube! and MioMio – which have a huge following in Japan. However, after taking the country by storm during the past two years, Mangamura had already called it quits.

On April 17, in the wake of the government announcement, Mangamura disappeared. It’s unclear whether its vanishing act was directly connected to recent developments, but a program on national public broadcaster NHK, which claimed to have traced the site’s administrators to the United States, Ukraine, and other regions, can’t have helped.

Further details released this morning reveal the intense pressure Mangamura was under. With 100 million visits a month, it was bound to attract attention, and according to Mainichi, several publishing giants ran out of patience last year and reported the platform to the authorities.

Kodansha, Japan’s largest publisher, and three other companies filed criminal complaints with Fukuoka Prefectural Police, Oita Prefectural Police, and other law enforcement departments, claiming the site violated their rights.

“The complaints, which were lodged against an unknown suspect or suspects, were filed on behalf of manga artists who are copyright holders to the pirated works, including Hajime Isayama and Eiichiro Oda, known for their wildly popular ‘Shingeki no Kyojin’ (‘Attack on Titan,’ published by Kodansha) and ‘One Piece’ (Shueisha Inc.), respectively,” the publication reports.

Mangamura launched in January 2016 and became a huge hit in Japan. Anti-piracy group Content Overseas Distribution Association (CODA), which counts publishing giant Kodansha among its members, reports that between September 2017 and February 2018, the site was accessed 620 million times.

Based on a “one visit, one manga title read” formula, CODA estimates that the site caused damages to the manga industry of 319.2 billion yen – around US$2.91 billion.

As a result, police are now stepping up their efforts to identify Mangamura’s operators. Whether that will prove fruitful remains to be seen, but in the meantime, Japan’s site-blocking efforts continue to cause controversy.

As reported last month, lawyer and NTT customer Yuichi Nakazawa launched legal action against NTT, demanding that the corporation immediately end its site-blocking operations.

“NTT’s decision was made arbitrarily on the site without any legal basis. No matter how legitimate the objective of copyright infringement is, it is very dangerous,” Nakazawa told TorrentFreak.

“I felt that ‘freedom,’ which is an important value of the Internet, was threatened. Actually, when the interruption of communications had begun, the company thought it would be impossible to reverse the situation, so I filed a lawsuit at this stage.”

Japan’s Constitution and its Telecommunications Business Act both have “no censorship” clauses, meaning that site-blocking has the potential to be ruled illegal. It’s also illegal in Japan to invade the privacy of Internet users’ communications, which some observers have argued is necessary if users are to be prevented from accessing pirate sites.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Analyze Apache Parquet optimized data using Amazon Kinesis Data Firehose, Amazon Athena, and Amazon Redshift

Post Syndicated from Roy Hasson original https://aws.amazon.com/blogs/big-data/analyzing-apache-parquet-optimized-data-using-amazon-kinesis-data-firehose-amazon-athena-and-amazon-redshift/

Amazon Kinesis Data Firehose is the easiest way to capture and stream data into a data lake built on Amazon S3. This data can be anything—from AWS service logs like AWS CloudTrail log files, Amazon VPC Flow Logs, Application Load Balancer logs, and others. It can also be IoT events, game events, and much more. To efficiently query this data, a time-consuming ETL (extract, transform, and load) process is required to massage and convert the data to an optimal file format, which increases the time to insight. This situation is less than ideal, especially for real-time data that loses its value over time.

To solve this common challenge, Kinesis Data Firehose can now save data to Amazon S3 in Apache Parquet or Apache ORC format. These are optimized columnar formats that are highly recommended for best performance and cost-savings when querying data in S3. This feature directly benefits you if you use Amazon Athena, Amazon Redshift, AWS Glue, Amazon EMR, or any other big data tools that are available from the AWS Partner Network and through the open-source community.

Amazon Connect is a simple-to-use, cloud-based contact center service that makes it easy for any business to provide a great customer experience at a lower cost than common alternatives. Its open platform design enables easy integration with other systems. One of those systems is Amazon Kinesis—in particular, Kinesis Data Streams and Kinesis Data Firehose.

What’s really exciting is that you can now save events from Amazon Connect to S3 in Apache Parquet format. You can then perform analytics using Amazon Athena and Amazon Redshift Spectrum in real time, taking advantage of this key performance and cost optimization. Of course, Amazon Connect is only one example. This new capability opens the door for a great deal of opportunity, especially as organizations continue to build their data lakes.

Amazon Connect includes an array of analytics views in the Administrator dashboard. But you might want to run other types of analysis. In this post, I describe how to set up a data stream from Amazon Connect through Kinesis Data Streams and Kinesis Data Firehose and out to S3, and then perform analytics using Athena and Amazon Redshift Spectrum. I focus primarily on the Kinesis Data Firehose support for Parquet and its integration with the AWS Glue Data Catalog, Amazon Athena, and Amazon Redshift.

Solution overview

Here is how the solution is laid out:

Amazon Connect → Kinesis Data Streams → Kinesis Data Firehose (JSON-to-Parquet conversion) → Amazon S3 → Amazon Athena / Amazon Redshift Spectrum

The following sections walk you through each of these steps to set up the pipeline.

1. Define the schema

When Kinesis Data Firehose processes incoming events and converts the data to Parquet, it needs to know which schema to apply. The reason is that incoming events often contain only a subset of the expected fields, depending on which values the producers emit. A typical process is to normalize the schema during a batch ETL job so that you end up with a consistent schema that can easily be understood and queried. Doing this introduces latency due to the nature of the batch process. To overcome this issue, Kinesis Data Firehose requires the schema to be defined in advance.

To see the available columns and structures, see Amazon Connect Agent Event Streams. For the purpose of simplicity, I opted to make all the columns of type String rather than create the nested structures. But you can definitely do that if you want.

The simplest way to define the schema is to create a table in the Amazon Athena console. Open the Athena console, and paste the following create table statement, substituting your own S3 bucket and prefix for where your event data will be stored. A Data Catalog database is a logical container that holds the different tables that you can create. The default database name shown here should already exist. If it doesn’t, you can create it or use another database that you’ve already created.

CREATE EXTERNAL TABLE default.kfhconnectblog (
  awsaccountid string,
  agentarn string,
  currentagentsnapshot string,
  eventid string,
  eventtimestamp string,
  eventtype string,
  instancearn string,
  previousagentsnapshot string,
  version string
)
STORED AS parquet
LOCATION 's3://your_bucket/kfhconnectblog/'
TBLPROPERTIES ("parquet.compression"="SNAPPY")

That’s all you have to do to prepare the schema for Kinesis Data Firehose.

2. Define the data streams

Next, you need to define the Kinesis data streams that will be used to stream the Amazon Connect events.  Open the Kinesis Data Streams console and create two streams.  You can configure them with only one shard each because you don’t have a lot of data right now.
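
If you prefer to script this step rather than click through the console, the sketch below does the equivalent with boto3. The stream names and region are illustrative assumptions; use whichever names you like, as long as you reference the same streams when you configure Kinesis Data Firehose and Amazon Connect later.

import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

# Two streams, one shard each, matching the low-volume setup described above.
stream_names = ["connect-agent-events", "connect-ctr-events"]

for name in stream_names:
    kinesis.create_stream(StreamName=name, ShardCount=1)

# create_stream returns immediately, so wait until both streams are ACTIVE
# before wiring them into Amazon Connect and Kinesis Data Firehose.
waiter = kinesis.get_waiter("stream_exists")
for name in stream_names:
    waiter.wait(StreamName=name)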

3. Define the Kinesis Data Firehose delivery stream for Parquet

Let’s configure the Data Firehose delivery stream using the data stream as the source and Amazon S3 as the output. Start by opening the Kinesis Data Firehose console and creating a new data delivery stream. Give it a name, and associate it with the Kinesis data stream that you created in Step 2.

As shown in the following screenshot, enable Record format conversion (1) and choose Apache Parquet (2). As you can see, Apache ORC is also supported. Scroll down and provide the AWS Glue Data Catalog database name (3) and table names (4) that you created in Step 1. Choose Next.

To make things easier, the output S3 bucket and prefix fields are automatically populated using the values that you defined in the LOCATION parameter of the create table statement from Step 1. Pretty cool. Additionally, you have the option to save the raw events into another location as defined in the Source record S3 backup section. Don’t forget to add a trailing forward slash “ / “ so that Data Firehose creates the date partitions inside that prefix.

On the next page, in the S3 buffer conditions section, there is a note about configuring a large buffer size. The Parquet file format is highly efficient in how it stores and compresses data. Increasing the buffer size allows you to pack more rows into each output file, which is preferred and gives you the most benefit from Parquet.

Compression using Snappy is automatically enabled for both Parquet and ORC. You can modify the compression algorithm by using the Kinesis Data Firehose API and update the OutputFormatConfiguration.

Be sure to also enable Amazon CloudWatch Logs so that you can debug any issues that you might run into.

Lastly, finalize the creation of the Firehose delivery stream, and continue on to the next section.
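
The console walk-through above is all you need, but if you would rather capture the same configuration in code, here is a hedged boto3 sketch of the delivery stream. Every ARN, role, bucket, and name below is a placeholder assumption; the IAM role must allow Firehose to read the Kinesis stream, read the Glue table definition, and write to the S3 bucket.

import boto3

firehose = boto3.client("firehose", region_name="us-east-1")

firehose.create_delivery_stream(
    DeliveryStreamName="kfhconnectblog",
    DeliveryStreamType="KinesisStreamAsSource",
    KinesisStreamSourceConfiguration={
        "KinesisStreamARN": "arn:aws:kinesis:us-east-1:123456789012:stream/connect-agent-events",
        "RoleARN": "arn:aws:iam::123456789012:role/firehose_delivery_role",
    },
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose_delivery_role",
        "BucketARN": "arn:aws:s3:::your_bucket",
        "Prefix": "kfhconnectblog/",
        # A large buffer packs more rows into each Parquet file, as noted above.
        "BufferingHints": {"SizeInMBs": 128, "IntervalInSeconds": 300},
        "CloudWatchLoggingOptions": {
            "Enabled": True,
            "LogGroupName": "/aws/kinesisfirehose/kfhconnectblog",
            "LogStreamName": "S3Delivery",
        },
        "DataFormatConversionConfiguration": {
            "Enabled": True,
            "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
            # Snappy is the default; pass {"Compression": "GZIP"} inside
            # ParquetSerDe via OutputFormatConfiguration to change it.
            "OutputFormatConfiguration": {"Serializer": {"ParquetSerDe": {}}},
            "SchemaConfiguration": {
                "RoleARN": "arn:aws:iam::123456789012:role/firehose_delivery_role",
                "DatabaseName": "default",
                "TableName": "kfhconnectblog",
                "Region": "us-east-1",
                "VersionId": "LATEST",
            },
        },
    },
)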

4. Set up the Amazon Connect contact center

After setting up the Kinesis pipeline, you now need to set up a simple contact center in Amazon Connect. The Getting Started page provides clear instructions on how to set up your environment, acquire a phone number, and create an agent to accept calls.

After setting up the contact center, in the Amazon Connect console, choose your Instance Alias, and then choose Data Streaming. Under Agent Event, choose the Kinesis data stream that you created in Step 2, and then choose Save.

At this point, your pipeline is complete.  Agent events from Amazon Connect are generated as agents go about their day. Events are sent via Kinesis Data Streams to Kinesis Data Firehose, which converts the event data from JSON to Parquet and stores it in S3. Athena and Amazon Redshift Spectrum can simply query the data without any additional work.

So let’s generate some data. Go back into the Administrator console for your Amazon Connect contact center, and create an agent to handle incoming calls. In this example, I creatively named mine Agent One. After it is created, Agent One can get to work and log into their console and set their availability to Available so that they are ready to receive calls.

To make the data a bit more interesting, I also created a second agent, Agent Two. I then made some incoming and outgoing calls and caused some failures to occur, so I now have enough data available to analyze.

5. Analyze the data with Athena

Let’s open the Athena console and run some queries. One thing you’ll notice is that when we created the schema for the dataset, we defined some of the fields as Strings even though in the documentation they were complex structures.  The reason for doing that was simply to show some of the flexibility of Athena to be able to parse JSON data. However, you can define nested structures in your table schema so that Kinesis Data Firehose applies the appropriate schema to the Parquet file.

Let’s run the first query to see which agents have logged into the system.

The query might look complex, but it’s fairly straightforward:

WITH dataset AS (
  SELECT 
    from_iso8601_timestamp(eventtimestamp) AS event_ts,
    eventtype,
    -- CURRENT STATE
    json_extract_scalar(
      currentagentsnapshot,
      '$.agentstatus.name') AS current_status,
    from_iso8601_timestamp(
      json_extract_scalar(
        currentagentsnapshot,
        '$.agentstatus.starttimestamp')) AS current_starttimestamp,
    json_extract_scalar(
      currentagentsnapshot, 
      '$.configuration.firstname') AS current_firstname,
    json_extract_scalar(
      currentagentsnapshot,
      '$.configuration.lastname') AS current_lastname,
    json_extract_scalar(
      currentagentsnapshot, 
      '$.configuration.username') AS current_username,
    json_extract_scalar(
      currentagentsnapshot, 
      '$.configuration.routingprofile.defaultoutboundqueue.name') AS current_outboundqueue,
    json_extract_scalar(
      currentagentsnapshot, 
      '$.configuration.routingprofile.inboundqueues[0].name') as current_inboundqueue,
    -- PREVIOUS STATE
    json_extract_scalar(
      previousagentsnapshot, 
      '$.agentstatus.name') as prev_status,
    from_iso8601_timestamp(
      json_extract_scalar(
        previousagentsnapshot, 
       '$.agentstatus.starttimestamp')) as prev_starttimestamp,
    json_extract_scalar(
      previousagentsnapshot, 
      '$.configuration.firstname') as prev_firstname,
    json_extract_scalar(
      previousagentsnapshot, 
      '$.configuration.lastname') as prev_lastname,
    json_extract_scalar(
      previousagentsnapshot, 
      '$.configuration.username') as prev_username,
    json_extract_scalar(
      previousagentsnapshot, 
      '$.configuration.routingprofile.defaultoutboundqueue.name') as prev_outboundqueue,
    json_extract_scalar(
      previousagentsnapshot, 
      '$.configuration.routingprofile.inboundqueues[0].name') as prev_inboundqueue
  from kfhconnectblog
  where eventtype <> 'HEART_BEAT'
)
SELECT
  current_status as status,
  current_username as username,
  event_ts
FROM dataset
WHERE eventtype = 'LOGIN' AND current_username <> ''
ORDER BY event_ts DESC

The query output looks something like this:

Here is another query that shows the sessions each of the agents engaged with. It tells us where they were incoming or outgoing, if they were completed, and where there were missed or failed calls.

WITH src AS (
  SELECT
     eventid,
     json_extract_scalar(currentagentsnapshot, '$.configuration.username') as username,
     cast(json_extract(currentagentsnapshot, '$.contacts') AS ARRAY(JSON)) as c,
     cast(json_extract(previousagentsnapshot, '$.contacts') AS ARRAY(JSON)) as p
  from kfhconnectblog
),
src2 AS (
  SELECT *
  FROM src CROSS JOIN UNNEST (c, p) AS contacts(c_item, p_item)
),
dataset AS (
SELECT 
  eventid,
  username,
  json_extract_scalar(c_item, '$.contactid') as c_contactid,
  json_extract_scalar(c_item, '$.channel') as c_channel,
  json_extract_scalar(c_item, '$.initiationmethod') as c_direction,
  json_extract_scalar(c_item, '$.queue.name') as c_queue,
  json_extract_scalar(c_item, '$.state') as c_state,
  from_iso8601_timestamp(json_extract_scalar(c_item, '$.statestarttimestamp')) as c_ts,
  
  json_extract_scalar(p_item, '$.contactid') as p_contactid,
  json_extract_scalar(p_item, '$.channel') as p_channel,
  json_extract_scalar(p_item, '$.initiationmethod') as p_direction,
  json_extract_scalar(p_item, '$.queue.name') as p_queue,
  json_extract_scalar(p_item, '$.state') as p_state,
  from_iso8601_timestamp(json_extract_scalar(p_item, '$.statestarttimestamp')) as p_ts
FROM src2
)
SELECT 
  username,
  c_channel as channel,
  c_direction as direction,
  p_state as prev_state,
  c_state as current_state,
  c_ts as current_ts,
  c_contactid as id
FROM dataset
WHERE c_contactid = p_contactid
ORDER BY id DESC, current_ts ASC

The query output looks similar to the following:

6. Analyze the data with Amazon Redshift Spectrum

With Amazon Redshift Spectrum, you can query data directly in S3 using your existing Amazon Redshift data warehouse cluster. Because the data is already in Parquet format, Redshift Spectrum gets the same great benefits that Athena does.

Here is a simple query that shows how to query the same data from Amazon Redshift. Note that to do this, you need to first create an external schema in Amazon Redshift that points to the AWS Glue Data Catalog.
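
If you haven’t created that external schema yet, the minimal sketch below shows one way to do it. The schema name matches the default_schema used in the query that follows, while the cluster endpoint, credentials, Data Catalog database, and IAM role ARN are assumptions you must replace with your own. Any SQL client works; here the statement is run from Python with psycopg2.

# A minimal sketch, assuming a reachable Amazon Redshift cluster and an IAM
# role attached to it that can read the AWS Glue Data Catalog. The endpoint,
# credentials, and role ARN below are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="your-cluster.abc123example.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="dev",
    user="awsuser",
    password="your_password",
)
conn.autocommit = True  # run the DDL outside an explicit transaction

create_external_schema = """
CREATE EXTERNAL SCHEMA IF NOT EXISTS default_schema
FROM DATA CATALOG
DATABASE 'default'
IAM_ROLE 'arn:aws:iam::123456789012:role/MySpectrumRole'
"""

with conn.cursor() as cur:
    cur.execute(create_external_schema)

conn.close()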

SELECT 
  eventtype,
  json_extract_path_text(currentagentsnapshot,'agentstatus','name') AS current_status,
  json_extract_path_text(currentagentsnapshot, 'configuration','firstname') AS current_firstname,
  json_extract_path_text(currentagentsnapshot, 'configuration','lastname') AS current_lastname,
  json_extract_path_text(
    currentagentsnapshot,
    'configuration','routingprofile','defaultoutboundqueue','name') AS current_outboundqueue
FROM default_schema.kfhconnectblog

The following shows the query output:

Summary

In this post, I showed you how to use Kinesis Data Firehose to ingest and convert data to columnar file format, enabling real-time analysis using Athena and Amazon Redshift. This great feature enables a level of optimization in both cost and performance that you need when storing and analyzing large amounts of data. This feature is equally important if you are investing in building data lakes on AWS.

Additional Reading

If you found this post useful, be sure to check out Analyzing VPC Flow Logs with Amazon Kinesis Firehose, Amazon Athena, and Amazon QuickSight and Work with partitioned data in AWS Glue.


About the Author

Roy Hasson is a Global Business Development Manager for AWS Analytics. He works with customers around the globe to design solutions to meet their data processing, analytics and business intelligence needs. Roy is a big Manchester United fan, cheering his team on and hanging out with his family.

Bell/TSN Letter to University Connects Site-Blocking Support to Students’ Futures

Post Syndicated from Andy original https://torrentfreak.com/bell-tsn-letter-to-university-connects-site-blocking-support-to-students-futures-180510/

In January, a coalition of Canadian companies called on local telecoms regulator CRTC to implement a website-blocking regime in Canada.

The coalition, Fairplay Canada, is a collection of organizations and companies with ties to the entertainment industries and includes Bell, Cineplex, Directors Guild of Canada, Maple Leaf Sports and Entertainment, Movie Theatre Association of Canada, and Rogers Media. Its stated aim is to address Canada’s online piracy problems.

While the CRTC reviews FairPlay Canada’s plans, the coalition has been seeking to drum up support for the blocking regime, encouraging a diverse range of supporters to send submissions endorsing the project. Of course, building a united front among like-minded groups is nothing out of the ordinary, but a situation just uncovered by Canadian law professor Michael Geist, one of the most vocal opponents of the proposed scheme, is bound to raise eyebrows.

Geist discovered a submission by Brian Hutchings, who works as Vice-President, Administration at Brock University in Ontario. Dated March 22, 2018, it notes that one of the university’s most sought-after programs is Sports Management, which helps Brock’s students to become “the lifeblood” of Canada’s sport and entertainment industries.

“Our University is deeply alarmed at how piracy is eroding an industry that employs so many of our co-op students and graduates. Piracy is a serious, pervasive threat that steals creativity, undermines investment in content development and threatens the survival of an industry that is also part of our national identity,” the submission reads.

“Brock ardently supports the FairPlay Canada coalition of more than 25 organizations involved in every aspect of Canada’s film, TV, radio, sports entertainment and music industries. Specifically, we support the coalition’s request that the CRTC introduce rules that would disable access in Canada to the most egregious piracy sites, similar to measures that have been taken in the UK, France and Australia. We are committed to assist the members of the coalition and the CRTC in eliminating the theft of digital content.”

The letter leaves no doubt that Brock University as a whole stands side-by-side with Fairplay Canada but according to a subsequent submission signed by Michelle Webber, President, Brock University Faculty Association (BUFA), nothing could be further from the truth.

Noting that BUFA unanimously supports the position of the Canadian Association of University Teachers which opposes the FairPlay proposal, Webber adds that BUFA stands in opposition to the submission by Brian Hutchings on behalf of Brock University.

“Vice President Hutching’s intervention was undertaken without consultation with the wider Brock University community, including faculty, librarians, and Senate; therefore, his submission should not be seen as indicative of the views of Brock University as a whole.”

BUFA goes on to stress the importance of an open Internet to researchers and educators while raising concerns that the blocking proposals could threaten the principles of net neutrality in Canada.

While the undermining of Hutchings’ position is embarrassing enough, via access-to-information laws Geist has also been able to reveal the chain of events that prompted the Vice-President to write a letter of support on behalf of the whole university.

It began with an email sent by former Brock professor Cheri Bradish to Mark Milliere, TSN’s Senior Vice President and General Manager, with Hutchings copied in. The idea was to connect the pair, with the suggestion that supporting the site-blocking plan would help to mitigate the threat to “future work options” for students.

What followed was a direct email from Mark Milliere to Brian Hutchings, in which the former laid out the contributions his company makes to the university, while again suggesting that support for site-blocking would be in the long-term interests of students seeking employment in the industry.

On March 23, Milliere wrote to Hutchings again, thanking him for “a terrific letter” and stating that “If you need anything from TSN, just ask.”

This isn’t the first time that Bell has asked those beholden to the company to support its site-blocking plans.

Back in February it was revealed that the company had asked its own employees to participate in the site-blocking submission process, without necessarily revealing their affiliations with the company.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

AWS Online Tech Talks – May and Early June 2018

Post Syndicated from Devin Watson original https://aws.amazon.com/blogs/aws/aws-online-tech-talks-may-and-early-june-2018/

Join us this month to learn about some of the exciting new services and solution best practices at AWS. We also have our first re:Invent 2018 webinar series, “How to re:Invent”. Sign up now to learn more; we look forward to seeing you.

Note – All sessions are free and in Pacific Time.

Tech talks featured this month:

Analytics & Big Data

May 21, 2018 | 11:00 AM – 11:45 AM PT – Integrating Amazon Elasticsearch with your DevOps Tooling – Learn how you can easily integrate Amazon Elasticsearch Service into your DevOps tooling and gain valuable insight from your log data.

May 23, 2018 | 11:00 AM – 11:45 AM PT – Data Warehousing and Data Lake Analytics, Together – Learn how to query data across your data warehouse and data lake without moving data.

May 24, 2018 | 11:00 AM – 11:45 AM PT – Data Transformation Patterns in AWS – Discover how to perform common data transformations on the AWS Data Lake.

Compute

May 29, 2018 | 01:00 PM – 01:45 PM PT – Creating and Managing a WordPress Website with Amazon Lightsail – Learn about Amazon Lightsail and how you can create, run and manage your WordPress websites with Amazon’s simple compute platform.

May 30, 2018 | 01:00 PM – 01:45 PM PT – Accelerating Life Sciences with HPC on AWS – Learn how you can accelerate your Life Sciences research workloads by harnessing the power of high performance computing on AWS.

Containers

May 24, 2018 | 01:00 PM – 01:45 PM PT – Building Microservices with the 12 Factor App Pattern on AWS – Learn best practices for building containerized microservices on AWS, and how traditional software design patterns evolve in the context of containers.

Databases

May 21, 2018 | 01:00 PM – 01:45 PM PT – How to Migrate from Cassandra to Amazon DynamoDB – Get the benefits, best practices and guides on how to migrate your Cassandra databases to Amazon DynamoDB.

May 23, 2018 | 01:00 PM – 01:45 PM PT – 5 Hacks for Optimizing MySQL in the Cloud – Learn how to optimize your MySQL databases for high availability, performance, and disaster resilience using RDS.

DevOps

May 23, 2018 | 09:00 AM – 09:45 AM PT – .NET Serverless Development on AWS – Learn how to build a modern serverless application in .NET Core 2.0.

Enterprise & Hybrid

May 22, 2018 | 11:00 AM – 11:45 AM PT – Hybrid Cloud Customer Use Cases on AWS – Learn how customers are leveraging AWS hybrid cloud capabilities to easily extend their datacenter capacity, deliver new services and applications, and ensure business continuity and disaster recovery.

IoT

May 31, 2018 | 11:00 AM – 11:45 AM PT – Using AWS IoT for Industrial Applications – Discover how you can quickly onboard your fleet of connected devices, keep them secure, and build predictive analytics with AWS IoT.

Machine Learning

May 22, 2018 | 09:00 AM – 09:45 AM PT – Using Apache Spark with Amazon SageMaker – Discover how to use Apache Spark with Amazon SageMaker for training jobs and application integration.

May 24, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS DeepLens – Learn how AWS DeepLens provides a new way for developers to learn machine learning by pairing the physical device with a broad set of tutorials, examples, source code, and integration with familiar AWS services.

Management Tools

May 21, 2018 | 09:00 AM – 09:45 AM PT – Gaining Better Observability of Your VMs with Amazon CloudWatch – Learn how CloudWatch Agent makes it easy for customers like Rackspace to monitor their VMs.

Mobile

May 29, 2018 | 11:00 AM – 11:45 AM PT – Deep Dive on Amazon Pinpoint Segmentation and Endpoint Management – See how segmentation and endpoint management with Amazon Pinpoint can help you target the right audience.

Networking

May 31, 2018 | 09:00 AM – 09:45 AM PT – Making Private Connectivity the New Norm via AWS PrivateLink – See how PrivateLink enables service owners to offer private endpoints to customers outside their company.

Security, Identity, & Compliance

May 30, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS Certificate Manager Private Certificate Authority (CA) – Learn how AWS Certificate Manager (ACM) Private Certificate Authority (CA), a managed private CA service, helps you easily and securely manage the lifecycle of your private certificates.

June 1, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS Firewall Manager – Centrally configure and manage AWS WAF rules across your accounts and applications.

Serverless

May 22, 2018 | 01:00 PM – 01:45 PM PT – Building API-Driven Microservices with Amazon API Gateway – Learn how to build a secure, scalable API for your application in our tech talk about API-driven microservices.

Storage

May 30, 2018 | 11:00 AM – 11:45 AM PT – Accelerate Productivity by Computing at the Edge – Learn how AWS Snowball Edge support for compute instances helps accelerate data transfers, execute custom applications, and reduce overall storage costs.

June 1, 2018 | 11:00 AM – 11:45 AM PT – Learn to Build a Cloud-Scale Website Powered by Amazon EFS – Technical deep dive where you’ll learn tips and tricks for integrating WordPress, Drupal and Magento with Amazon EFS.

Bad Software Is Our Fault

Post Syndicated from Bozho original https://techblog.bozho.net/bad-software-is-our-fault/

Bad software is everywhere. One can even claim that every software is bad. Cool companies, tech giants, established companies, all produce bad software. And no, yours is not an exception.

Who’s to blame for bad software? It’s all complicated and many factors are intertwined – there’s business requirements, there’s organizational context, there’s lack of sufficient skilled developers, there’s the inherent complexity of software development, there’s leaky abstractions, reliance on 3rd party software, consequences of wrong business and purchase decisions, time limitations, flawed business analysis, etc. So yes, despite the catchy title, I’m aware it’s actually complicated.

But in every “it’s complicated” scenario, there’s always one or two factors that are decisive. All of them contribute somehow, but the major drivers are usually a handful of things. And in the case of bad software, I think it’s the fault of technical people. Developers, architects, ops.

We don’t seem to care about best practices. And I’ll do some nasty generalizations here, but bear with me. We can spend hours arguing about tabs vs spaces, curly bracket on new line, git merge vs rebase, which IDE is better, which framework is better and other largely irrelevant stuff. But we tend to ignore the important aspects that span beyond the code itself. The context in which the code lives, the non-functional requirements – robustness, security, resilience, etc.

We don’t seem to get security. Even trivial stuff such as user authentication is almost always implemented wrong. These days Twitter and GitHub realized they have been logging plain-text passwords, for example, but that’s just the tip of the iceberg. Too often we ignore the security implications.

“But the business didn’t request the security features”, one may say. The business never requested 2-factor authentication, encryption at rest, PKI, secure (or any) audit trail, log masking, crypto shredding, etc., etc. Because the business doesn’t know these things – we do and we have to put them on the backlog and fight for them to be implemented. Each organization has its specifics and tech people can influence the backlog in different ways, but almost everywhere we can put things there and prioritize them.

The other aspect is testing. We should all be well aware by now that automated testing is mandatory. We have all the tools in the world for unit, functional, integration, performance and whatnot testing, and yet many software projects lack the necessary test coverage to be able to change stuff without accidentally breaking things. “But testing takes time, we don’t have it”. We are perfectly aware that testing saves time, as we’ve all had those “not again!” recurring bugs. And yet we think of all sorts of excuses – “let the QAs test it”, “we have to ship that now, we’ll test it later”, “this is too trivial to be tested”, etc.

And you may say it’s not our job. We don’t define what has to be done, we just do it. We don’t define the budget, the scope, the features. We just write whatever has been decided. And that’s plain wrong. It’s not our job to make money out of our code, and it’s not our job to define what customers need, but apart from that everything is our job. The way the software is structured, the security aspects and security features, the stability of the code base, the way the software behaves in different environments. The non-functional requirements are our job, and putting them on the backlog is our job.

You’ve probably heard that every software becomes “legacy” after 6 months. And that’s because of us, our sloppiness, our inability to mitigate external factors and constraints. Too often we create a mess through “just doing our job”.

And of course that’s a generalization. I happen to know a lot of great professionals who don’t make these mistakes, who strive for excellence and implement things the right way. But our industry as a whole doesn’t. Our industry as a whole produces bad software. And it’s our fault, as developers – as the only people who know why a certain piece of software is bad.

In a talk of his, Bob Martin warns us of the risks of our sloppiness. We have been building websites so far, but we are more and more building stuff that interacts with the real world, directly and indirectly. Ultimately, lives may depend on our software (like the recent unfortunate death caused by a self-driving car). And I’ll agree with Uncle Bob that it’s high time we self-regulate as an industry, before some technically incompetent politician decides to do that.

How, I don’t know. We’ll have to think more about it. But I’m pretty sure it’s our fault that software is bad, and no amount of blaming the management, the budget, the timing, the tools or the process can eliminate our responsibility.

Why do I insist on bashing my fellow software engineers? Because if we start looking at software development with more responsibility; with the fact that if it fails, it’s our fault, then we’re more likely to get out of our current bug-ridden, security-flawed, fragile software hole and really become the experts of the future.

The post Bad Software Is Our Fault appeared first on Bozho's tech blog.

Announcing Local Build Support for AWS CodeBuild

Post Syndicated from Karthik Thirugnanasambandam original https://aws.amazon.com/blogs/devops/announcing-local-build-support-for-aws-codebuild/

Today, we’re excited to announce local build support in AWS CodeBuild.

AWS CodeBuild is a fully managed build service. There are no servers to provision and scale, or software to install, configure, and operate. You just specify the location of your source code, choose your build settings, and CodeBuild runs build scripts for compiling, testing, and packaging your code.

In this blog post, I’ll show you how to set up CodeBuild locally to build and test a sample Java application.

By building an application on a local machine you can:

  • Test the integrity and contents of a buildspec file locally.
  • Test and build an application locally before committing.
  • Identify and fix errors quickly from your local development environment.

Prerequisites

In this post, I am using AWS Cloud9 IDE as my development environment.

If you would like to use AWS Cloud9 as your IDE, follow the express setup steps in the AWS Cloud9 User Guide.

The AWS Cloud9 IDE comes with Docker and Git already installed. If you are going to use your laptop or desktop machine as your development environment, install Docker and Git before you start.

Steps to build CodeBuild image locally

Run git clone https://github.com/aws/aws-codebuild-docker-images.git to download this repository to your local machine.

$ git clone https://github.com/aws/aws-codebuild-docker-images.git

Let’s build a local CodeBuild image for the JDK 8 environment. The Dockerfile for JDK 8 is in /aws-codebuild-docker-images/ubuntu/java/openjdk-8.

Edit the Dockerfile to remove the last line, ENTRYPOINT ["dockerd-entrypoint.sh"], and save the file.

Run cd ubuntu/java/openjdk-8 to change the directory in your local workspace.

Run docker build -t aws/codebuild/java:openjdk-8 . to build the Docker image locally. This command will take a few minutes to complete.

$ cd aws-codebuild-docker-images
$ cd ubuntu/java/openjdk-8
$ docker build -t aws/codebuild/java:openjdk-8 .
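
Once the build finishes, you can optionally confirm that the image exists locally. The tag matches the one used in the build command above; the output columns vary slightly between Docker versions:

# List the locally built CodeBuild image
$ docker images aws/codebuild/java:openjdk-8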

Steps to set up the CodeBuild local agent

Run the following Docker pull command to download the local CodeBuild agent.

$ docker pull amazon/aws-codebuild-local:latest --disable-content-trust=false

Now you have the local agent image on your machine and can run a local build.

Run the following git command to download a sample Java project.

$ git clone https://github.com/karthiksambandam/sample-web-app.git

Steps to use the local agent to build a sample project

Let’s build the sample Java project using the local agent.

Execute the following Docker command to run the local agent and build the sample web app repository you cloned earlier.

$ docker run -it -v /var/run/docker.sock:/var/run/docker.sock -e "IMAGE_NAME=aws/codebuild/java:openjdk-8" -e "ARTIFACTS=/home/ec2-user/environment/artifacts" -e "SOURCE=/home/ec2-user/environment/sample-web-app" amazon/aws-codebuild-local

Note: You need to provide three environment variables: IMAGE_NAME, SOURCE, and ARTIFACTS.

IMAGE_NAME: The name of your build environment image.

SOURCE: The absolute path to your source code directory.

ARTIFACTS: The absolute path to your artifact output folder.
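
The paths in this example assume the default AWS Cloud9 workspace under /home/ec2-user/environment. If your source or artifact directories live elsewhere, adjust the values accordingly. You may also want to create the artifacts folder up front, for example:

# Assumes the Cloud9 default workspace layout; adjust the paths for your machine
$ mkdir -p /home/ec2-user/environment/artifacts
$ ls /home/ec2-user/environment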

When you run the sample project, you get a runtime error that says the YAML file does not exist. This is because a buildspec.yml file is not included in the sample web project. AWS CodeBuild requires a buildspec.yml to run a build. For more information about buildspec.yml, see Build Spec Example in the AWS CodeBuild User Guide.

Let’s add a buildspec.yml file with the following content to the sample-web-app folder and then rebuild the project.

version: 0.2

phases:
  build:
    commands:
      - echo Build started on `date`
      - mvn install

artifacts:
  files:
    - target/javawebdemo.war

$ docker run -it -v /var/run/docker.sock:/var/run/docker.sock -e "IMAGE_NAME=aws/codebuild/java:openjdk-8" -e "ARTIFACTS=/home/ec2-user/environment/artifacts" -e "SOURCE=/home/ec2-user/environment/sample-web-app" amazon/aws-codebuild-local

This time your build should succeed. Upon successful execution, look in the artifacts folder for the final artifacts.zip file to validate the build output.
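
One way to inspect the output is to list the contents of the generated archive. The WAR file name comes from the buildspec above; the exact archive layout can differ slightly depending on the agent version:

# Inspect the build output (paths assume the Cloud9 workspace used earlier)
$ ls /home/ec2-user/environment/artifacts
$ unzip -l /home/ec2-user/environment/artifacts/artifacts.zip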

Conclusion

In this blog post, I showed you how to quickly set up the CodeBuild local agent to build projects right from your local desktop machine or laptop. As you can see, local builds can improve developer productivity by helping you identify and fix errors quickly.

I hope you found this post useful. Feel free to leave your feedback or suggestions in the comments.

Analyze data in Amazon DynamoDB using Amazon SageMaker for real-time prediction

Post Syndicated from YongSeong Lee original https://aws.amazon.com/blogs/big-data/analyze-data-in-amazon-dynamodb-using-amazon-sagemaker-for-real-time-prediction/

Many companies across the globe use Amazon DynamoDB to store and query historical user-interaction data. DynamoDB is a fast NoSQL database used by applications that need consistent, single-digit millisecond latency.

Often, customers want to turn their valuable data in DynamoDB into insights by analyzing a copy of their table stored in Amazon S3. Doing this separates their analytical queries from their low-latency critical paths. This data can be the primary source for understanding customers’ past behavior, predicting future behavior, and generating downstream business value. Customers often turn to DynamoDB because of its great scalability and high availability. After a successful launch, many customers want to use the data in DynamoDB to predict future behaviors or provide personalized recommendations.

DynamoDB is a good fit for low-latency reads and writes, but it’s not practical to scan all data in a DynamoDB database to train a model. In this post, I demonstrate how you can use DynamoDB table data copied to Amazon S3 by AWS Data Pipeline to predict customer behavior. I also demonstrate how you can use this data to provide personalized recommendations for customers using Amazon SageMaker. You can also run ad hoc queries against the data using Amazon Athena. DynamoDB recently released on-demand backups to create full table backups with no performance impact. However, they aren’t suitable for our purposes in this post, so I chose AWS Data Pipeline instead to create managed backups that are accessible from other services.

To do this, I describe how to read the DynamoDB backup file format in Data Pipeline. I also describe how to convert the objects in S3 to a CSV format that Amazon SageMaker can read. In addition, I show how to schedule regular exports and transformations using Data Pipeline. The sample data used in this post is the Bank Marketing Data Set from the UCI Machine Learning Repository.

The solution that I describe provides the following benefits:

  • Separates analytical queries from production traffic on your DynamoDB table, preserving your DynamoDB read capacity units (RCUs) for important production requests
  • Automatically updates your model to get real-time predictions
  • Optimizes for performance (so it doesn’t compete with DynamoDB RCUs after the export) and for cost (using data you already have)
  • Makes it easier for developers of all skill levels to use Amazon SageMaker

All code and data set in this post are available in this .zip file.

Solution architecture

The following diagram shows the overall architecture of the solution.

The steps that data follows through the architecture are as follows:

  1. Data Pipeline regularly copies the full contents of a DynamoDB table as JSON into an S3 bucket.
  2. Exported JSON files are converted to comma-separated value (CSV) format to use as a data source for Amazon SageMaker.
  3. Amazon SageMaker renews the model artifact and updates the endpoint.
  4. The converted CSV is available for ad hoc queries with Amazon Athena.
  5. Data Pipeline controls this flow and repeats the cycle based on the schedule defined by customer requirements.

Building the auto-updating model

This section discusses details about how to read the DynamoDB exported data in Data Pipeline and build automated workflows for real-time prediction with a regularly updated model.

Download sample scripts and data

Before you begin, take the following steps:

  1. Download sample scripts in this .zip file.
  2. Unzip the src.zip file.
  3. Find the automation_script.sh file and edit it for your environment. For example, replace 's3://<your bucket>/<datasource path>/' with your own S3 path to the data source for Amazon SageMaker. In the script, replace the text enclosed in angle brackets (< and >) with your own values.
  4. Upload the json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar file to your S3 path so that the ADD jar command in Apache Hive can refer to it.

For this solution, the banking.csv file should be imported into a DynamoDB table.
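
If you don’t already have such a table, the following is a minimal sketch of creating one with the AWS CLI. The table name (bank_marketing) and the choice of customer_id as a string hash key are assumptions for illustration; use whatever schema and import method (for example, a Data Pipeline import template or a small loader script) fit your environment:

# Hypothetical table name and key schema; adjust to match your data
$ aws dynamodb create-table \
    --table-name bank_marketing \
    --attribute-definitions AttributeName=customer_id,AttributeType=S \
    --key-schema AttributeName=customer_id,KeyType=HASH \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5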

Export a DynamoDB table

To export the DynamoDB table to S3, open the Data Pipeline console and choose the Export DynamoDB table to S3 template. In this template, Data Pipeline creates an Amazon EMR cluster and performs an export in the EMRActivity activity. Set proper intervals for backups according to your business requirements.

One core node (m3.xlarge) provides the default capacity for the EMR cluster and should be suitable for the solution in this post. Leave the option to resize the cluster before running enabled in the TableBackupActivity activity so that Data Pipeline can scale the cluster to match the table size. The process of converting to CSV format and renewing models happens in this EMR cluster.

For a more in-depth look at how to export data from DynamoDB, see Export Data from DynamoDB in the Data Pipeline documentation.

Add the script to an existing pipeline

After you export your DynamoDB table, you add an additional EMR step to EMRActivity by following these steps:

  1. Open the Data Pipeline console and choose the ID for the pipeline that you want to add the script to.
  2. For Actions, choose Edit.
  3. In the editing console, choose the Activities category and add an EMR step using the custom script downloaded in the previous section, as shown below.

Paste the following command into the new step after the data upload step:

s3://#{myDDBRegion}.elasticmapreduce/libs/script-runner/script-runner.jar,s3://<your bucket name>/automation_script.sh,#{output.directoryPath},#{myDDBRegion}

The element #{output.directoryPath} references the S3 path where Data Pipeline exports the DynamoDB data as JSON. The path is passed to the script as its first argument.
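
As a quick orientation, this sketch shows how those arguments arrive inside the script; the full script appears in the following sections:

#!/bin/bash
# $1 - S3 path of the exported DynamoDB JSON data (#{output.directoryPath})
# $2 - AWS Region (#{myDDBRegion})
DATA_PATH=$1
REGION=$2
echo "Converting data under ${DATA_PATH} for region ${REGION}"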

The bash script has two goals, converting data formats and renewing the Amazon SageMaker model. Subsequent sections discuss the contents of the automation script.

Automation script: Convert JSON data to CSV with Hive

We use Apache Hive to transform the data into a new format. The Hive QL script to create an external table and transform the data is included in the custom script that you added to the Data Pipeline definition.

When you run the Hive scripts, do so with the -e option. Also, define the Hive table with the 'org.openx.data.jsonserde.JsonSerDe' row format to parse and read JSON format. The SQL creates a Hive EXTERNAL table, and it reads the DynamoDB backup data on the S3 path passed to it by Data Pipeline.

Note: You should create the table with the “EXTERNAL” keyword to avoid the backup data being accidentally deleted from S3 if you drop the table.

The full automation script for converting follows. Add your own bucket name and data source path in the highlighted areas.

#!/bin/bash
hive -e "
ADD jar s3://<your bucket name>/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar ; 
DROP TABLE IF EXISTS blog_backup_data ;
CREATE EXTERNAL TABLE blog_backup_data (
 customer_id map<string,string>,
 age map<string,string>, job map<string,string>, 
 marital map<string,string>,education map<string,string>, 
 default map<string,string>, housing map<string,string>,
 loan map<string,string>, contact map<string,string>, 
 month map<string,string>, day_of_week map<string,string>, 
 duration map<string,string>, campaign map<string,string>,
 pdays map<string,string>, previous map<string,string>, 
 poutcome map<string,string>, emp_var_rate map<string,string>, 
 cons_price_idx map<string,string>, cons_conf_idx map<string,string>,
 euribor3m map<string,string>, nr_employed map<string,string>, 
 y map<string,string> ) 
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' 
LOCATION '$1/';

INSERT OVERWRITE DIRECTORY 's3://<your bucket name>/<datasource path>/' 
SELECT concat( customer_id['s'],',', 
 age['n'],',', job['s'],',', 
 marital['s'],',', education['s'],',', default['s'],',', 
 housing['s'],',', loan['s'],',', contact['s'],',', 
 month['s'],',', day_of_week['s'],',', duration['n'],',', 
 campaign['n'],',',pdays['n'],',',previous['n'],',', 
 poutcome['s'],',', emp_var_rate['n'],',', cons_price_idx['n'],',',
 cons_conf_idx['n'],',', euribor3m['n'],',', nr_employed['n'],',', y['n'] ) 
FROM blog_backup_data
WHERE customer_id['s'] > 0 ;
"

After creating the external table, the script uses the INSERT OVERWRITE DIRECTORY ... SELECT statement to write CSV data to the S3 path that you designated as the data source for Amazon SageMaker.

Depending on your requirements, you can eliminate or transform columns in the SELECT clause in this step to optimize data analysis. For example, you might remove columns that have unpredictable correlations with the target value, because keeping the wrong columns can expose your model to overfitting during training. In this post, the customer_id column is removed. Overfitting can weaken your predictions. More information about overfitting can be found in the topic Model Fit: Underfitting vs. Overfitting in the Amazon ML documentation.

Automation script: Renew the Amazon SageMaker model

After the CSV data is in place and ready to use, create a new model artifact for Amazon SageMaker with the updated dataset on S3. To renew the model artifact, you must create a new training job. Training jobs can be run using the AWS SDK (for example, Amazon SageMaker via boto3), the Amazon SageMaker Python SDK (which can be installed with the pip install sagemaker command), or the AWS CLI for Amazon SageMaker, which is the approach described in this post.

In addition, consider how to renew your existing model smoothly and without service impact, because your model is called by applications in real time. To do this, first create a new endpoint configuration, and then update the current endpoint with the newly created configuration.

#!/bin/bash
## Define variable 
REGION=$2
DTTIME=`date +%Y-%m-%d-%H-%M-%S`
ROLE="<your AmazonSageMaker-ExecutionRole>" 


# Select containers image based on region.  
case "$REGION" in
"us-west-2" )
    IMAGE="174872318107.dkr.ecr.us-west-2.amazonaws.com/linear-learner:latest"
    ;;
"us-east-1" )
    IMAGE="382416733822.dkr.ecr.us-east-1.amazonaws.com/linear-learner:latest" 
    ;;
"us-east-2" )
    IMAGE="404615174143.dkr.ecr.us-east-2.amazonaws.com/linear-learner:latest" 
    ;;
"eu-west-1" )
    IMAGE="438346466558.dkr.ecr.eu-west-1.amazonaws.com/linear-learner:latest" 
    ;;
 *)
    echo "Invalid Region Name"
    exit 1 ;  
esac

# Start training job and creating model artifact 
TRAINING_JOB_NAME=TRAIN-${DTTIME} 
S3OUTPUT="s3://<your bucket name>/model/" 
INSTANCETYPE="ml.m4.xlarge"
INSTANCECOUNT=1
VOLUMESIZE=5 
aws sagemaker create-training-job --training-job-name ${TRAINING_JOB_NAME} --region ${REGION}  --algorithm-specification TrainingImage=${IMAGE},TrainingInputMode=File --role-arn ${ROLE}  --input-data-config '[{ "ChannelName": "train", "DataSource": { "S3DataSource": { "S3DataType": "S3Prefix", "S3Uri": "s3://<your bucket name>/<datasource path>/", "S3DataDistributionType": "FullyReplicated" } }, "ContentType": "text/csv", "CompressionType": "None" , "RecordWrapperType": "None"  }]'  --output-data-config S3OutputPath=${S3OUTPUT} --resource-config  InstanceType=${INSTANCETYPE},InstanceCount=${INSTANCECOUNT},VolumeSizeInGB=${VOLUMESIZE} --stopping-condition MaxRuntimeInSeconds=120 --hyper-parameters feature_dim=20,predictor_type=binary_classifier  

# Wait until job completed 
aws sagemaker wait training-job-completed-or-stopped --training-job-name ${TRAINING_JOB_NAME}  --region ${REGION}

# Get newly created model artifact and create model
MODELARTIFACT=`aws sagemaker describe-training-job --training-job-name ${TRAINING_JOB_NAME} --region ${REGION}  --query 'ModelArtifacts.S3ModelArtifacts' --output text `
MODELNAME=MODEL-${DTTIME}
aws sagemaker create-model --region ${REGION} --model-name ${MODELNAME}  --primary-container Image=${IMAGE},ModelDataUrl=${MODELARTIFACT}  --execution-role-arn ${ROLE}

# create a new endpoint configuration 
CONFIGNAME=CONFIG-${DTTIME}
aws sagemaker  create-endpoint-config --region ${REGION} --endpoint-config-name ${CONFIGNAME}  --production-variants  VariantName=Users,ModelName=${MODELNAME},InitialInstanceCount=1,InstanceType=ml.m4.xlarge

# create or update the endpoint
STATUS=`aws sagemaker describe-endpoint --endpoint-name  ServiceEndpoint --query 'EndpointStatus' --output text --region ${REGION} `
if [[ "$STATUS" != "InService" ]] ;
then
    aws sagemaker  create-endpoint --endpoint-name  ServiceEndpoint  --endpoint-config-name ${CONFIGNAME} --region ${REGION}    
else
    aws sagemaker  update-endpoint --endpoint-name  ServiceEndpoint  --endpoint-config-name ${CONFIGNAME} --region ${REGION}
fi

Grant permission

Before you execute the script, you must grant proper permission to Data Pipeline. Data Pipeline uses the DataPipelineDefaultResourceRole role by default. I added the following policy to DataPipelineDefaultResourceRole to allow Data Pipeline to create, delete, and update the Amazon SageMaker model and data source in the script.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sagemaker:CreateTrainingJob",
        "sagemaker:DescribeTrainingJob",
        "sagemaker:CreateModel",
        "sagemaker:CreateEndpointConfig",
        "sagemaker:DescribeEndpoint",
        "sagemaker:CreateEndpoint",
        "sagemaker:UpdateEndpoint",
        "iam:PassRole"
      ],
      "Resource": "*"
    }
  ]
}
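
One way to attach the policy is with the AWS CLI. The inline policy name below is arbitrary, and the policy document is assumed to be saved locally as sagemaker-renewal-policy.json:

# Policy name is arbitrary; the JSON file path is an assumption for this example
$ aws iam put-role-policy \
    --role-name DataPipelineDefaultResourceRole \
    --policy-name SageMakerModelRenewal \
    --policy-document file://sagemaker-renewal-policy.json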

Use real-time prediction

After you deploy a model into production using Amazon SageMaker hosting services, your client applications use this API to get inferences from the model hosted at the specified endpoint. This approach is useful for interactive web, mobile, or desktop applications.

Following is a simple Python code example that calls the Amazon SageMaker endpoint by name (“ServiceEndpoint”) and uses the response for real-time prediction.

=== Python sample for real-time prediction ===

#!/usr/bin/env python
import boto3
import json 

client = boto3.client('sagemaker-runtime', region_name ='<your region>' )
new_customer_info = '34,10,2,4,1,2,1,1,6,3,190,1,3,4,3,-1.7,94.055,-39.8,0.715,4991.6'
response = client.invoke_endpoint(
    EndpointName='ServiceEndpoint',
    Body=new_customer_info, 
    ContentType='text/csv'
)
result = json.loads(response['Body'].read().decode())
print(result)
--- output(response) ---
{u'predictions': [{u'score': 0.7528127431869507, u'predicted_label': 1.0}]}
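
You can issue the same request from the command line if that is more convenient. Depending on your AWS CLI version, the payload may need to be passed differently (for example, AWS CLI v2 expects --cli-binary-format raw-in-base64-out for a raw text body):

# Same prediction request via the CLI; the sample record matches the Python example above
$ aws sagemaker-runtime invoke-endpoint \
    --endpoint-name ServiceEndpoint \
    --content-type text/csv \
    --body '34,10,2,4,1,2,1,1,6,3,190,1,3,4,3,-1.7,94.055,-39.8,0.715,4991.6' \
    output.json
$ cat output.json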

Solution summary

The solution takes the following steps:

  1. Data Pipeline exports DynamoDB table data into S3. The original JSON data should be kept so that the table can be restored in the rare event that this is needed. Data Pipeline then converts the JSON to CSV so that Amazon SageMaker can read the data. Note: Select only meaningful attributes when you convert to CSV. For example, if you judge that the “campaign” attribute is not correlated, you can eliminate this attribute from the CSV.
  2. Train the Amazon SageMaker model with the new data source.
  3. When a new customer comes to your site, you can judge how likely it is for this customer to subscribe to your new product based on the prediction score provided by Amazon SageMaker.
  4. If the new user subscribes to your new product, your application must update the attribute “y” to the value 1 (for yes), as sketched in the CLI example after this list. This updated data is provided for the next model renewal as a new data source and serves to improve the accuracy of your predictions. With each new entry, your application can become smarter and deliver better predictions.
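
The following is a hedged sketch of that attribute update with the AWS CLI; the table name and key value are placeholders for illustration:

# Table name and customer_id value are hypothetical; #y maps to the "y" attribute
$ aws dynamodb update-item \
    --table-name bank_marketing \
    --key '{"customer_id": {"S": "customer-0001"}}' \
    --update-expression "SET #y = :subscribed" \
    --expression-attribute-names '{"#y": "y"}' \
    --expression-attribute-values '{":subscribed": {"N": "1"}}'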

Running ad hoc queries using Amazon Athena

Amazon Athena is a serverless query service that makes it easy to analyze large amounts of data stored in Amazon S3 using standard SQL. Athena is useful for examining data and collecting statistics or informative summaries about data. You can also use the powerful analytic functions of Presto, as described in the topic Aggregate Functions of Presto in the Presto documentation.

With the Data Pipeline scheduled activity, recent CSV data is always located in S3 so that you can run ad hoc queries against the data using Amazon Athena. I show this with example SQL statements following. For an in-depth description of this process, see the post Interactive SQL Queries for Data in Amazon S3 on the AWS News Blog. 

Creating an Amazon Athena table and running queries

You can simply create an EXTERNAL table for the CSV data on S3 in the Amazon Athena console.

=== Table Creation ===
CREATE EXTERNAL TABLE datasource (
 age int, 
 job string, 
 marital string , 
 education string, 
 default string, 
 housing string, 
 loan string, 
 contact string, 
 month string, 
 day_of_week string, 
 duration int, 
 campaign int, 
 pdays int , 
 previous int , 
 poutcome string, 
 emp_var_rate double, 
 cons_price_idx double,
 cons_conf_idx double, 
 euribor3m double, 
 nr_employed double, 
 y int 
)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ',' ESCAPED BY '\\' LINES TERMINATED BY '\n' 
LOCATION 's3://<your bucket name>/<datasource path>/';

The following query calculates the correlation coefficient between the target attribute and other attributes using Amazon Athena.

=== Sample Query ===

SELECT corr(age,y) AS correlation_age_and_target, 
 corr(duration,y) AS correlation_duration_and_target, 
 corr(campaign,y) AS correlation_campaign_and_target,
 corr(contact,y) AS correlation_contact_and_target
FROM ( SELECT age , duration , campaign , y , 
 CASE WHEN contact = 'telephone' THEN 1 ELSE 0 END AS contact 
 FROM datasource 
 ) datasource ;
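
If you prefer to run such queries outside the console, the Athena CLI works as well. The output location and database below are placeholders; Athena writes query results to the S3 location you specify:

# <your bucket name> and the database are placeholders; adjust for your environment
$ aws athena start-query-execution \
    --query-string "SELECT corr(age, y) AS correlation_age_and_target FROM datasource" \
    --query-execution-context Database=default \
    --result-configuration OutputLocation=s3://<your bucket name>/athena-results/
$ aws athena get-query-results --query-execution-id <execution id returned above>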

Conclusion

In this post, I introduce an example of how to analyze DynamoDB data by using a copy of the table in Amazon S3, which preserves your DynamoDB table’s read capacity. You can then use the analyzed data as a new data source to train an Amazon SageMaker model for accurate real-time prediction. In addition, you can run ad hoc queries against the data on S3 using Amazon Athena. I also show how to automate these procedures by using Data Pipeline.

You can adapt this example to your specific use case, and hopefully this post helps you accelerate your development. You can find more examples and use cases for Amazon SageMaker in the video AWS 2017: Introducing Amazon SageMaker on the AWS website.

Additional Reading

If you found this post useful, be sure to check out Serving Real-Time Machine Learning Predictions on Amazon EMR and Analyzing Data in S3 using Amazon Athena.

About the Author

Yong Seong Lee is a Cloud Support Engineer for AWS Big Data Services. He is interested in every technology related to data/databases and helping customers who have difficulties in using AWS services. His motto is “Enjoy life, be curious and have maximum experience.”

Sci-Hub ‘Pirate Bay For Science’ Security Certs Revoked by Comodo

Post Syndicated from Andy original https://torrentfreak.com/sci-hub-pirate-bay-for-science-security-certs-revoked-by-comodo-ca-180503/

Sci-Hub is often referred to as the “Pirate Bay of Science”. Like its namesake, it offers masses of unlicensed content for free, mostly against the wishes of copyright holders.

While The Pirate Bay will index almost anything, Sci-Hub is dedicated to distributing tens of millions of academic papers and articles, something which has turned itself into a target for publishing giants like Elsevier.

Sci-Hub and its Kazakhstan-born founder Alexandra Elbakyan have been under sustained attack for several years but more recently have been fending off an unprecedented barrage of legal action initiated by the American Chemical Society (ACS), a leading source of academic publications in the field of chemistry.

After winning a default judgment for $4.8 million in copyright infringement damages last year, ACS was further granted a broad injunction.

It required various third-party services (including domain registries, hosting companies and search engines) to stop facilitating access to the site. This plunged Sci-Hub into a game of domain whac-a-mole, one that continues to this day.

Determined to head Sci-Hub off at the pass, ACS obtained additional authority to tackle the evasive site and any new domains it may register in the future.

While Sci-Hub has been hopping around domains for a while, this week a new development appeared on the horizon. Visitors to some of the site’s domains were greeted with errors indicating that the domains’ security certificates had been revoked.

Tests conducted by TorrentFreak revealed clear revocations on Sci-Hub.hk and Sci-Hub.nz, both of which returned the error ‘NET::ERR_CERT_REVOKED’.

Certificate revoked

These certificates were first issued and then revoked by Comodo CA, the world’s largest certification authority. TF contacted the company who confirmed that it had been forced to take action against Sci-Hub.

“In response to a court order against Sci-Hub, Comodo CA has revoked four certificates for the site,” Jonathan Skinner, Director, Global Channel Programs at Comodo CA informed TorrentFreak.

“By policy Comodo CA obeys court orders and the law to the full extent of its ability.”

Comodo refused to confirm any additional details, including whether these revocations were anything to do with the current ACS injunction. However, Susan R. Morrissey, Director of Communications at ACS, told TorrentFreak that the revocations were indeed part of ACS’ legal action against Sci-Hub.

“[T]he action is related to our continuing efforts to protect ACS’ intellectual property,” Morrissey confirmed.

Sci-Hub operates multiple domains (an up-to-date list is usually available on Wikipedia) that can be switched at any time. At the time of writing the domain sci-hub.ga currently returns ‘ERR_SSL_VERSION_OR_CIPHER_MISMATCH’ while .CN and .GS variants both have Comodo certificates that expired last year.

When TF first approached Comodo earlier this week, Sci-Hub’s certificates with the company hadn’t been completely wiped out. For example, the domain https://sci-hub.tw operated perfectly, with an active and non-revoked Comodo certificate.

Still in the game…but not for long

By Wednesday, however, the domain was returning the now-familiar “revoked” message.

These domain issues are the latest technical problems to hit Sci-Hub as a result of the ACS injunction. In February, Cloudflare terminated service to several of the site’s domains.

“Cloudflare will terminate your service for the following domains sci-hub.la, sci-hub.tv, and sci-hub.tw by disabling our authoritative DNS in 24 hours,” Cloudflare told Sci-Hub.

While ACS has certainly caused problems for Sci-Hub, the platform is extremely resilient and remains online.

The domains https://sci-hub.is and https://sci-hub.nu are fully operational with certificates issued by Let’s Encrypt, a free and open certificate authority supported by the likes of Mozilla, EFF, Chrome, Private Internet Access, and other prominent tech companies.

It’s unclear whether these certificates will be targeted in the future but Sci-Hub doesn’t appear to be in the mood to back down.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Epic Settles With Copyright Infringing Fortnite Cheater, PUBG Cheaters Arrested

Post Syndicated from Ernesto original https://torrentfreak.com/epic-settles-with-copyright-infringing-fortnight-cheater-pubg-cheaters-arrested-180502/

Last year, Epic Games released Fortnite’s free-to-play “Battle Royale” game mode, generating massive interest among gamers.

Unfortunately, not all players stick to the rules. Thousands of people are trying to gain an advantage through cheats, ruining the game for those who play fair.

The same is true for PlayerUnknown’s Battlegrounds (PUBG), which predates Fortnite and shares many of the same characteristics. While the games are very much alike, the same can’t be said for the way cheaters are treated.

Over the past month, Epic Games has filed lawsuits against several people who violated the company’s copyrights, by creating, promoting – and in some cases – selling cheats. While copyright infringement cases can easily bankrupt defendants, that’s not what Epic is after.

This week the company signed another ‘settlement’, this time with Joseph Sperry, a.k.a. “Spoezy,” in a North Carolina federal court. Sperry, who stood accused of creating and selling cheats, admitted to the copyright infringement allegations and signed a consent judgment.

“Defendant directly infringed Epic’s copyrights in Fortnite. Defendant used the cheats. His use of the cheats created unauthorized derivative works of Epic’s copyright protected Fortnite code that are substantially similar to Epic’s copyrighted work,” the judgment reads.

“In addition to creating and using the cheats, Defendant promoted, marketed, and sold these cheats to third parties, and actively encouraged and induced these other cheaters to purchase and use the cheats to gain an unfair advantage in Fortnite.”

The order includes an injunction which bars Sperry from cheating or promoting cheats in the future, but it doesn’t list any damages. Only if Sperry breaks the agreement will he be required to pay $5,000.

From the various Fortnite settlements we’ve seen to date, it’s clear that Epic Games is not after money. Its main goal is to stop the cheating and to hold cheaters accountable, but the company doesn’t go any further, for now.

This is quite a large contrast between several enforcement actions that were taken against alleged PUBG cheaters in China a few days ago.

Although there were no specific copyright infringement charges mentioned, Chinese authorities reported that fifteen people were arrested in connection with PUBG cheating.

“15 major suspects including ‘OMG’, ‘FL’, ‘火狐’, ‘须弥’ and ‘炎黄’ were arrested for developing hack programs, hosting marketplaces for hack programs, and brokering transactions. Currently the suspects have been fined approximately 30mil RNB ($5.1mil USD),” a statement reads.

PlayerUnknown shared the developments late last week and added that it will continue to crack down on those who continue to cheat.

“We take cheating extremely seriously. Developing, selling, promoting, or using unauthorized hacking/cheating programs isn’t just unfair for others playing PUBG—in many places, it’s also against the law,” the company said, commenting on the news.

Without further details, it’s hard to compare the Chinese cheating ‘operations’ to the Fortnite cases. However, Epic’s moderate approach clearly differs from the Chinese crackdown against PUBG cheaters.

A copy of the consent judgment against Sperry is available here (pdf).

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

CI/CD with Data: Enabling Data Portability in a Software Delivery Pipeline with AWS Developer Tools, Kubernetes, and Portworx

Post Syndicated from Kausalya Rani Krishna Samy original https://aws.amazon.com/blogs/devops/cicd-with-data-enabling-data-portability-in-a-software-delivery-pipeline-with-aws-developer-tools-kubernetes-and-portworx/

This post is written by Eric Han – Vice President of Product Management Portworx and Asif Khan – Solutions Architect

Data is the soul of an application. As containers make it easier to package and deploy applications faster, testing plays an even more important role in the reliable delivery of software. Given that all applications have data, development teams want a way to reliably control, move, and test using real application data or, at times, obfuscated data.

For many teams, moving application data through a CI/CD pipeline, while honoring compliance and maintaining separation of concerns, has been a manual task that doesn’t scale. At best, it is limited to a few applications, and is not portable across environments. The goal should be to make running and testing stateful containers (think databases and message buses where operations are tracked) as easy as with stateless (such as with web front ends where they are often not).

Why is state important in testing scenarios? One reason is that many bugs manifest only when code is tested against real data. For example, we might simply want to test a database schema upgrade but a small synthetic dataset does not exercise the critical, finer corner cases in complex business logic. If we want true end-to-end testing, we need to be able to easily manage our data or state.

In this blog post, we define a CI/CD pipeline reference architecture that can automate data movement between applications. We also provide the steps to follow to configure the CI/CD pipeline.

Stateful Pipelines: Need for Portable Volumes

As part of continuous integration, testing, and deployment, a team may need to reproduce a bug found in production against a staging setup. Here, the hosting environment is comprised of a cluster with Kubernetes as the scheduler and Portworx for persistent volumes. The testing workflow is then automated by AWS CodeCommit, AWS CodePipeline, and AWS CodeBuild.

Portworx offers Kubernetes storage that can be used to make persistent volumes portable between AWS environments and pipelines. The addition of Portworx to the AWS Developer Tools continuous deployment for Kubernetes reference architecture adds persistent storage and storage orchestration to a Kubernetes cluster. The example uses MongoDB as the demonstration of a stateful application. In practice, the workflow applies to any containerized application such as Cassandra, MySQL, Kafka, and Elasticsearch.

Using the reference architecture, a developer calls CodePipeline to trigger a snapshot of the running production MongoDB database. Portworx then creates a block-based, writable snapshot of the MongoDB volume. Meanwhile, the production MongoDB database continues serving end users and is uninterrupted.

Without the Portworx integrations, a manual process would require an application-level backup of the database instance that is outside of the CI/CD process. For larger databases, this could take hours and impact production. The use of block-based snapshots follows best practices for resilient and non-disruptive backups.

As part of the workflow, CodePipeline deploys a new MongoDB instance for staging onto the Kubernetes cluster and mounts the second Portworx volume that has the data from production. CodePipeline triggers the snapshot of a Portworx volume through an AWS Lambda function, as shown here.

AWS Developer Tools with Kubernetes: Integrated Workflow with Portworx

In the following workflow, a developer is testing changes to a containerized application that calls on MongoDB. The tests are performed against a staging instance of MongoDB. The same workflow applies if changes were on the server side. The original production deployment is scheduled as a Kubernetes deployment object and uses Portworx as the storage for the persistent volume.

The continuous deployment pipeline runs as follows:

  • Developers integrate bug fix changes into a main development branch that gets merged into a CodeCommit master branch.
  • Amazon CloudWatch triggers the pipeline when code is merged into a master branch of an AWS CodeCommit repository.
  • AWS CodePipeline sends the new revision to AWS CodeBuild, which builds a Docker container image with the build ID.
  • AWS CodeBuild pushes the new Docker container image tagged with the build ID to an Amazon ECR registry.
  • Kubernetes downloads the new container (for the database client) from Amazon ECR and deploys the application (as a pod) and staging MongoDB instance (as a deployment object).
  • AWS CodePipeline, through a Lambda function, calls Portworx to snapshot the production MongoDB and deploy a staging instance of MongoDB.
  • Portworx provides a snapshot of the production instance as the persistent storage of the staging MongoDB.
    • The MongoDB instance mounts the snapshot.

At this point, the staging setup mimics a production environment. Teams can run integration and full end-to-end tests, using partner tooling, without impacting production workloads. The full pipeline is shown here.

Summary

This reference architecture showcases how development teams can easily move data between production and staging for the purposes of testing. Instead of taking application-specific manual steps, all operations in this CodePipeline architecture are automated and tracked as part of the CI/CD process.

This integrated experience is part of making stateful containers as easy as stateless. With AWS CodePipeline for CI/CD process, developers can easily deploy stateful containers onto a Kubernetes cluster with Portworx storage and automate data movement within their process.

The reference architecture and code are available on GitHub:

  • Reference architecture: https://github.com/portworx/aws-kube-codesuite
  • Lambda function source code for Portworx additions: https://github.com/portworx/aws-kube-codesuite/blob/master/src/kube-lambda.py
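
To explore the reference architecture locally, you can clone the repository; this only fetches the sample code, and you still need to follow the included instructions to stand up the pipeline:

# Clone the reference architecture and view the Lambda function listed above
$ git clone https://github.com/portworx/aws-kube-codesuite.git
$ cat aws-kube-codesuite/src/kube-lambda.py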

For more information about persistent storage for containers, visit the Portworx website. For more information about AWS CodePipeline, see the AWS CodePipeline User Guide.

[$] The memory-management development process

Post Syndicated from corbet original https://lwn.net/Articles/752985/rss

The memory-management subsystem is maintained by a small but dedicated
group of developers. How healthy is that development community? Michal
Hocko raised that question during the memory-management track at the 2018
Linux Storage, Filesystem, and Memory-Management Summit. Hocko is worried,
but it appears that his concerns are not universally felt.

Welcome Steven: Associate Front End Developer

Post Syndicated from Yev original https://www.backblaze.com/blog/welcome-steven-associate-front-end-developer/

The Backblaze web team is growing! As we add more features and work on our website, we need more hands to get things done. Enter Steven, who joins us as an Associate Front End Developer. Steven is going to be getting his hands dirty and diving into the fun-filled world of web development. Let’s learn a bit more about Steven, shall we?

What is your Backblaze Title?
Associate Front End Developer.

Where are you originally from?
The Bronx, New York born and raised.

What attracted you to Backblaze?
The team behind Backblaze made me feel like family from the moment I stepped in the door. The level of respect and dedication they showed me is the same respect and dedication they show their customers. Those qualities made wanting to be a part of Backblaze a no brainer!

What do you expect to learn while being at Backblaze?
I expect to grow as a software developer and human being by absorbing as much as I can from the immensely talented people I’ll be surrounded by.

Where else have you worked?
I previously worked at The Greenwich Hotel where I was a front desk concierge and bellman. If the team at Backblaze is anything like the team I was a part of there then this is going to be a fun ride.

Where did you go to school?
I studied at Baruch College and Bloc.

What’s your dream job?
My dream job is one where I’m able to express 100% of my creativity.

Favorite place you’ve traveled?
Santiago, Dominican Republic.

Favorite hobby?
Watching my Yankees, Knicks or Jets play.

Of what achievement are you most proud?
Becoming a Software Developer…

Star Trek or Star Wars?
Star Wars! May the force be with you…

Coke or Pepsi?
… Water. Black iced tea? One of god’s finer creations.

Favorite food?
Mangu con Los Tres Golpes (Mashed Plantains with Fried Salami, Eggs & Cheese).

Why do you like certain things?
I like things that give me good vibes.

Anything else you’d like to tell us?
If you break any complex concept down into its simplest parts, you’ll have an easier time trying to fully grasp it.

Those are some serious words of wisdom from Steven. We look forward to him helping us get cool stuff out the door!

The post Welcome Steven: Associate Front End Developer appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.