Tag Archives: radio

Mission Space Lab flight status announced!

Post Syndicated from Erin Brindley original https://www.raspberrypi.org/blog/mission-space-lab-flight-status-announced/

In September of last year, we launched our 2017/2018 Astro Pi challenge with our partners at the European Space Agency (ESA). Students from ESA member and associate countries had the chance to design science experiments and write code to be run on one of our two Raspberry Pis on the International Space Station (ISS).

Astro Pi Mission Space Lab logo

Submissions for the Mission Space Lab challenge have just closed, and the results are in! Students had the opportunity to design an experiment for one of the following two themes:

  • Life in space
    Making use of Astro Pi Vis (Ed) in the European Columbus module to learn about the conditions inside the ISS.
  • Life on Earth
    Making use of Astro Pi IR (Izzy), which will be aimed towards the Earth through a window to learn about Earth from space.

ESA astronaut Alexander Gerst, speaking from the replica of the Columbus module at the European Astronaut Center in Cologne, has a message for all Mission Space Lab participants:

ESA astronaut Alexander Gerst congratulates Astro Pi 2017-18 winners


Flight status

We had a total of 212 Mission Space Lab entries from 22 countries. Of these, 114 fantastic projects have been given flight status, and the teams’ project code will run in space!

But they’re not winners yet. In April, the code will be sent to the ISS, and then the teams will receive their experimental data back. Next, to gain deeper insight into the process of scientific endeavour, they will need to produce a final report analysing their findings. Winners will be chosen based on the merit of their final report, and the winning teams will get exclusive prizes. Check the list below to see if your team got flight status.

Belgium

Flight status achieved:

  • Team De Vesten, Campus De Vesten, Antwerpen
  • Ursa Major, CoderDojo Belgium, West-Vlaanderen
  • Special operations STEM, Sint-Claracollege, Antwerpen

Canada

Flight status achieved:

  • Let It Grow, Branksome Hall, Toronto
  • The Dark Side of Light, Branksome Hall, Toronto
  • Genie On The ISS, Branksome Hall, Toronto
  • Byte by PIthons, Youth Tech Education Society & Kid Code Jeunesse, Edmonton
  • The Broadviewnauts, Broadview, Ottawa

Czech Republic

Flight status achieved:

  • BLEK, Střední Odborná Škola Blatná, Strakonice

Denmark

Flight status achieved:

  • 2y Infotek, Nærum Gymnasium, Nærum
  • Equation Quotation, Allerød Gymnasium, Lillerød
  • Team Weather Watchers, Allerød Gymnasium, Allerød
  • Space Gardners, Nærum Gymnasium, Nærum

Finland

Flight status achieved:

  • Team Aurora, Hyvinkään yhteiskoulun lukio, Hyvinkää

France

Flight status achieved:

  • INC2, Lycée Raoul Follereau, Bourgogne
  • Space Project SP4, Lycée Saint-Paul IV, Reunion Island
  • Dresseurs2Python, clg Albert CAMUS, essonne
  • Lazos, Lycée Aux Lazaristes, Rhone
  • The space nerds, Lycée Saint André Colmar, Alsace
  • Les Spationautes Valériquais, lycée de la Côte d’Albâtre, Normandie
  • AstroMega, Institut de Genech, north
  • Al’Crew, Lycée Algoud-Laffemas, Auvergne-Rhône-Alpes
  • AstroPython, clg Albert CAMUS, essonne
  • Aruden Corp, Lycée Pablo Neruda, Normandie
  • HeroSpace, clg Albert CAMUS, essonne
  • GalaXess [R]evolution, Lycée Saint Cricq, Nouvelle-Aquitaine
  • AstroBerry, clg Albert CAMUS, essonne
  • Ambitious Girls, Lycée Adam de Craponne, PACA

Germany

Flight status achieved:

  • Uschis, St. Ursula Gymnasium Freiburg im Breisgau, Breisgau
  • Dosi-Pi, Max-Born-Gymnasium Germering, Bavaria

Greece

Flight status achieved:

  • Deep Space Pi, 1o Epal Grevenon, Grevena
  • Flox Team, 1st Lyceum of Kifissia, Attiki
  • Kalamaria Space Team, Second Lyceum of Kalamaria, Central Macedonia
  • The Earth Watchers, STEM Robotics Academy, Thessaly
  • Celestial_Distance, Gymnasium of Kanithos, Sterea Ellada – Evia
  • Pi Stars, Primary School of Rododaphne, Achaias
  • Flarions, 5th Primary School of Salamina, Attica

Ireland

Flight status achieved:

  • Plant Parade, Templeogue College, Leinster
  • For Peats Sake, Templeogue College, Leinster
  • CoderDojo Clonakilty, Co. Cork

Italy

Flight status achieved:

  • Trentini DOP, CoderDojo Trento, TN
  • Tarantino Space Lab, Liceo G. Tarantino, BA
  • Murgia Sky Lab, Liceo G. Tarantino, BA
  • Enrico Fermi, Liceo XXV Aprile, Veneto
  • Team Lampone, CoderDojoTrento, TN
  • GCC, Gali Code Club, Trentino Alto Adige/Südtirol
  • Another Earth, IISS “Laporta/Falcone-Borsellino”
  • Anti Pollution Team, IIS “L. Einaudi”, Sicily
  • e-HAND, Liceo Statale Scientifico e Classico ‘Ettore Majorana’, Lombardia
  • scossa team, ITTS Volterra, Venezia
  • Space Comet Sisters, Scuola don Bosco, Torino

Luxembourg

Flight status achieved:

  • Spaceballs, Atert Lycée Rédange, Diekirch
  • Aline in space, Lycée Aline Mayrisch Luxembourg (LAML)

Poland

Flight status achieved:

  • AstroLeszczynPi, I Liceum Ogolnoksztalcace im. Krola Stanislawa Leszczynskiego w Jasle, podkarpackie
  • Astrokompasy, High School nr XVII in Wrocław named after Agnieszka Osiecka, Lower Silesian
  • Cosmic Investigators, Publiczna Szkoła Podstawowa im. Św. Jadwigi Królowej w Rzezawie, Małopolska
  • ApplePi, III Liceum Ogólnokształcące im. prof. T. Kotarbińskiego w Zielonej Górze, Lubusz Voivodeship
  • ELE Society 2, Zespol Szkol Elektronicznych i Samochodowych, Lubuskie
  • ELE Society 1, Zespol Szkol Elektronicznych i Samochodowych, Lubuskie
  • SpaceOn, Szkola Podstawowa nr 12 w Jasle – Gimnazjum Nr 2, Podkarpackie
  • Dewnald Ducks, III Liceum Ogólnokształcące w Zielonej Górze, lubuskie
  • Nova Team, III Liceum Ogolnoksztalcace im. prof. T. Kotarbinskiego, lubuskie district
  • The Moons, Szkola Podstawowa nr 12 w Jasle – Gimnazjum Nr 2, Podkarpackie
  • Live, Szkoła Podstawowa nr 1 im. Tadeusza Kościuszki w Zawierciu, śląskie
  • Storm Hunters, I Liceum Ogolnoksztalcace im. Krola Stanislawa Leszczynskiego w Jasle, podkarpackie
  • DeepSky, Szkoła Podstawowa nr 1 im. Tadeusza Kościuszki w Zawierciu, śląskie
  • Small Explorers, ZPO Konina, Malopolska
  • AstroZSCL, Zespół Szkół w Czerwionce-Leszczynach, śląskie
  • Orchestra, Szkola Podstawowa nr 12 w Jasle, Podkarpackie
  • ApplePi, I Liceum Ogolnoksztalcace im. Krola Stanislawa Leszczynskiego w Jasle, podkarpackie
  • Green Crew, Szkoła Podstawowa nr 2 w Czeladzi, Silesia

Portugal

Flight status achieved:

  • Magnetics, Escola Secundária João de Deus, Faro
  • ECA_QUEIROS_PI, Secondary School Eça de Queirós, Lisboa
  • ESDMM Pi, Escola Secundária D. Manuel Martins, Setúbal
  • AstroPhysicists, EB 2,3 D. Afonso Henriques, Braga

Romania

Flight status achieved:

  • Caelus, “Tudor Vianu” National High School of Computer Science, District One
  • CodeWarriors, “Tudor Vianu” National High School of Computer Science, District One
  • Dark Phoenix, “Tudor Vianu” National High School of Computer Science, District One
  • ShootingStars, “Tudor Vianu” National High School of Computer Science, District One
  • Astro Pi Carmen Sylva 2, Liceul Teoretic “Carmen Sylva”, Constanta
  • Astro Meridian, Astro Club Meridian 0, Bihor

Slovenia

Flight status achieved:

  • astrOSRence, OS Rence
  • Jakopičevca, Osnovna šola Riharda Jakopiča, Ljubljana

Spain

Flight status achieved:

  • Exea in Orbit, IES Cinco Villas, Zaragoza
  • Valdespartans, IES Valdespartera, Zaragoza
  • Valdespartans2, IES Valdespartera, Zaragoza
  • Astropithecus, Institut de Bruguers, Barcelona
  • SkyPi-line, Colegio Corazón de María, Asturias
  • ClimSOLatic, Colegio Corazón de María, Asturias
  • Científicosdelsaz, IES Profesor Pablo del Saz, Málaga
  • Canarias 2, IES El Calero, Las Palmas
  • Dreamers, M. Peleteiro, A Coruña
  • Canarias 1, IES El Calero, Las Palmas

The Netherlands

Flight status achieved:

  • Team Kaki-FM, Rkbs De Reiger, Noord-Holland

United Kingdom

Flight status achieved:

  • Binco, Teignmouth Community School, Devon
  • 2200 (Saddleworth), Detached Flight Royal Air Force Air Cadets, Lancashire
  • Whatevernext, Albyn School, Highlands
  • GraviTeam, Limehurst Academy, Leicestershire
  • LSA Digital Leaders, Lytham St Annes Technology and Performing Arts College, Lancashire
  • Mead Astronauts, Mead Community Primary School, Wiltshire
  • STEAMCademy, Castlewood Primary School, West Sussex
  • Lux Quest, CoderDojo Banbridge, Co. Down
  • Temparatus, Dyffryn Taf, Carmarthenshire
  • Discovery STEMers, Discovery STEM Education, South Yorkshire
  • Code Inverness, Code Club Inverness, Highland
  • JJB, Ashton Sixth Form College, Tameside
  • Astro Lab, East Kent College, Kent
  • The Life Savers, Scratch and Python, Middlesex
  • JAAPiT, Taylor Household, Nottingham
  • The Heat Guys, The Archer Academy, Greater London
  • Astro Wantenauts, Wantage C of E Primary School, Oxfordshire
  • Derby Radio Museum, Radio Communication Museum of Great Britain, Derbyshire
  • Bytesyze, King’s College School, Cambridgeshire

Other

Flight status achieved:

  • Intellectual Savage Stars, Lycée français de Luanda, Luanda

 

Congratulations to all successful teams! We are looking forward to reading your reports.

The post Mission Space Lab flight status announced! appeared first on Raspberry Pi.

Jumping Air Gaps

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/02/jumping_air_gap_2.html

Nice profile of Mordechai Guri, who researches a variety of clever ways to steal data over air-gapped computers.

Guri and his fellow Ben-Gurion researchers have shown, for instance, that it's possible to trick a fully offline computer into leaking data to another nearby device via the noise its internal fan generates, by changing air temperatures in patterns that the receiving computer can detect with thermal sensors, or even by blinking out a stream of information from a computer hard drive LED to the camera on a quadcopter drone hovering outside a nearby window. In new research published today, the Ben-Gurion team has even shown that they can pull data off a computer protected by not only an air gap, but also a Faraday cage designed to block all radio signals.
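To make the general idea concrete, here is a minimal, illustrative sketch of the simplest form of such an optical covert channel: encoding bytes as timed on/off states of an indicator LED (on-off keying). The `set_led` stand-in, the bit period, and the payload are assumptions for illustration only; the actual research modulates a hard drive LED far faster than this.

```python
# Illustrative sketch of an LED-based covert channel (on-off keying).
# set_led() is a stand-in: real attacks toggle the HDD LED, e.g. by
# issuing disk reads, and a camera pointed at the machine samples the
# LED state once per BIT_PERIOD to recover the bits.
import time

BIT_PERIOD = 0.1  # seconds per bit; illustrative, real attacks are far faster

def set_led(on: bool) -> None:
    print("LED on" if on else "LED off")  # stand-in for the real LED toggle

def transmit(data: bytes) -> None:
    """Blink out each byte, most significant bit first."""
    for byte in data:
        for i in range(7, -1, -1):
            set_led(bool((byte >> i) & 1))
            time.sleep(BIT_PERIOD)
    set_led(False)  # return to idle

transmit(b"key")
```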

Here’s a page with all the research results.

BoingBoing post.

China to Start Blocking Unauthorized VPN Providers This April

Post Syndicated from Andy original https://torrentfreak.com/china-to-start-blocking-unauthorized-vpn-providers-this-april-180203/

Back in January 2017, China’s Ministry of Industry and Information Technology announced a 14-month campaign to crack down on ‘unauthorized’ Internet platforms.

China said that Internet technologies and services had been expanding in a “disorderly” fashion, so regulation was required. No surprise then that the campaign targeted censorship-busting VPN services, which are used by citizens and corporations to traverse the country’s Great Firewall.

Heralding a “nationwide Internet network access services clean-up”, China warned that anyone operating such a service would require a government telecommunications business license. It’s now been more than a year since that announcement and much has happened in the interim.

In July 2017, Apple removed 674 VPN apps from its App Store and in September, a local man was jailed for nine months for selling VPN software. In December, another man was jailed for five-and-a-half years for selling a VPN service without an appropriate license from the government.

This week the government provided an update on the crackdown, telling the media that it will begin forcing local and foreign companies and individuals to use only government-approved systems to access the wider Internet.

Ministry of Industry and Information Technology (MIIT) chief engineer Zhang Feng reiterated earlier comments that VPN operators must be properly licensed by the government, adding that unlicensed VPNs will be subjected to new rules which come into force on March 31. The government plans to block unauthorized VPN providers, official media reported.

“We want to regulate VPNs which unlawfully conduct cross-border operational activities,” Zhang told reporters.

“Any foreign companies that want to set up a cross-border operation for private use will need to set up a dedicated line for that purpose,” he said.

“They will be able to lease such a line or network legally from the telecommunications import and export bureau. This shouldn’t affect their normal operations much at all.”

Radio Free Asia reports that state-run telecoms companies including China Mobile, China Unicom, and China Telecom, which are approved providers, have all been ordered to prevent their 1.3 billion subscribers from accessing blocked content with VPNs.

“The campaign aims to regulate the market environment and keep it fair and healthy,” Zhang added. “[As for] VPNs which unlawfully conduct cross-border operational activities, we want to regulate this.”

So, it appears that VPN providers are still allowed in China, so long as they’re officially licensed and approved by the government. However, in order to get that licensing they need to comply with government regulations, which means that people cannot use them to access content restricted by the Great Firewall.

All that being said, Zhang is reported as saying that people shouldn’t be concerned that their data is insecure as a result – neither providers nor the government are able to access content sent over a state-approved VPN service, he claimed.

“The rights for using normal intentional telecommunications services is strictly protected,” said Zhang, adding that regulation means that communications are “secure”.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Piracy Can Help Music Sales of Many Artists, Research Shows

Post Syndicated from Ernesto original https://torrentfreak.com/piracy-can-help-music-sales-of-many-artists-research-shows-180128/

The debate over whether online piracy helps or hurts music sales has been dragging on for several decades now.

The issue has been researched extensively with both positive and negative effects being reported, often varying based on the type of artist, music genre and media, among other variables.

One of the more extensive studies was published this month in the peer-reviewed Information Economics and Policy journal, by Queen’s University economics researcher Jonathan Lee.

In a paper titled ‘Purchase, pirate, publicize: Private-network music sharing and market album sales’ he examined the effect of BitTorrent-based piracy on both digital and physical music sales.

We covered an earlier version of the study two years ago when it was still a work in progress. With updates to the research methods and the data sample, the results are now clearer.

The file-sharing data was obtained from an unnamed private BitTorrent tracker and covers a data set of 250,000 albums and more than five million downloads. These were matched to US sales data for thousands of albums provided by Nielsen SoundScan.

By refining the estimation approach and updating the matching technique, the final version of the paper shows some interesting results.

Based on the torrent tracker data, Lee finds that piracy can boost sales of mid-tier artists, both for physical CDs and digital downloads. For the most popular artists, this effect is reversed. In both cases, the impact is the largest for digital sales.

“I now find that top artists are harmed and mid-tier artists may be helped in both markets, but that these effects are larger for digital sales,” Lee tells TorrentFreak. “This is consistent with the idea that people are more willing to switch between digital piracy and digital sales than between digital piracy and physical CDs.”

The findings lead to the conclusion that there is no ideal ‘one-size-fits-all’ response to piracy. In fact, some unauthorized sharing may be a good thing.

This is in line with observations from musicians themselves over the past few years. Several top artists have acknowledged the positive effects of piracy, including Ed Sheeran, who recently said that he owes his career to it.

“I know that’s a bad thing to say, because I’m part of a music industry that doesn’t like illegal file sharing,” Sheeran said in an interview with CBS. “Illegal file sharing was what made me. It was students in England going to university, sharing my songs with each other.”

Sheeran sharing on TPB

Today, Sheeran is in a totally different position of course. As one of the top artists, he would now be hurt by piracy. However, the new stars of tomorrow may still reap the benefits.

According to the researcher, the music industry should realize that shutting down pirate sites may not always be the best option. On the contrary, file-sharing sites may be useful as promotional platforms in some cases.

“Following above, a policy of total shutdown of private file sharing networks seems excessively costly (compared with their relatively small impact on sales) and unwise (as a one-size-fits-all policy). It would be better to make legal consumption more convenient, reducing the demand for piracy as an alternative to purchasing,” Lee tells us.

“It would also be smart to experiment with releasing music onto piracy networks themselves, especially for up-and-coming artists, similar to the free promotion afforded by commercial radio.”

The researcher makes another interesting extrapolation from the findings. In recent years, some labels and artists have signed exclusive deals with some streaming platforms. This means that content is not available everywhere, and this fragmentation may make piracy look more appealing.

“Here you can view piracy as a non-fragmented alternative platform to Spotify et al. Thus consumers will have a strong incentive to use a single non-fragmented platform (piracy) over having multiple subscriptions to fragmented platforms,” Lee says.

It would be better for the labels to publish their music on all platforms, and to make these more appealing and convenient than the pirate alternative.

The data used for the research was collected several years ago before the big streaming boom, so it might be that the results are different today. However, it is clear that the effect of piracy on sales is not as uniform as the music industry often portrays it.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Thor:Ragnarok Director Says He “Illegally Torrented” Clips for the Showreel

Post Syndicated from Andy original https://torrentfreak.com/thorragnarok-director-says-illegally-torrented-clips-showreel-180127/

It’s not often that movies escape being pirated online, but last weekend was a pretty miserable one for the people behind Thor: Ragnarok.

Just four months after the superhero movie’s theatrical debut, the Marvel hit was due to be released on disc February 26th, with digital distribution on iTunes planned for February 19th.

However, due to what appeared to be some kind of pre-order blunder, the $180 million movie was leaked online, resulting in a pirate frenzy that’s still ongoing.

But with the accidental early release of Thor: Ragnarok making waves within the torrent scene and beyond, it seems ironic that its talented director actually has another relationship with piracy that most people aren’t aware of.

In an interview for ‘Q’, a show broadcast on Canada’s CBC radio, Taika Waititi noted that Thor: Ragnarok might be a “career ender” for him, something that was previously highlighted in the media.

However, the softly spoken New Zealander also said some other things that flew completely under the radar but, given recent developments, now have new significance.

Speaking with broadcaster Tom Power, Waititi revealed that when putting together his promotional showreel for Thor: Ragnarok, he obtained its source material from illegal sources.

Explaining the process used to acquire clips to create his ‘sizzle reel’ (a short video highlighting a director’s vision and tone for a proposed movie), Waititi revealed his less-than-official approach.

“I cut together little clips and shots – I basically illegally torrented and, erm, you know, ripped clips from the Internet,” Waititi said.

“Of a bunch of different things?” Power asked.

“I don’t mind saying that…erm…on the radio,” Waititi added, unconvincingly.

With Power quickly assuring the director that admitting doing something illegal was OK on air, Waititi perhaps realized it probably wasn’t.

“You can cut that out,” he suggested.

That Waititi took the ‘pirate’ approach to obtaining source material for his ‘sizzle reel’ isn’t really a surprise. Content is freely accessible online, crucially in formats that are easier to consume and edit than anything even Waititi could get access to at short notice. And, since every film in memory is just a few clicks away, it’d be counter-intuitive not to use the resource in the name of creativity.

Overall then, it’s extremely unlikely that Waititi’s pirate confession will come to much. Two of his previous feature films, ‘Boy’ and ‘Hunt For The Wilderpeople’, have each held the title of highest-grossing New Zealand film, the latter achieving the accolade in 2017.

Also in 2017, Waititi was named New Zealander of the Year in recognition of his “outstanding contribution to the well being of the nation.” Praise doesn’t come much higher than that.

How many torrent swarms he helped to keep healthy is destined to remain a secret forever, but as an emerging movie hero in his own right, people will forgive him that.

H/T Trioval

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Detecting Drone Surveillance with Traffic Analysis

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/01/detecting_drone.html

This is clever:

Researchers at Ben Gurion University in Beer Sheva, Israel have built a proof-of-concept system for counter-surveillance against spy drones that demonstrates a clever, if not exactly simple, way to determine whether a certain person or object is under aerial surveillance. They first generate a recognizable pattern on whatever subject — a window, say — someone might want to guard from potential surveillance. Then they remotely intercept a drone’s radio signals to look for that pattern in the streaming video the drone sends back to its operator. If they spot it, they can determine that the drone is looking at their subject.

In other words, they can see what the drone sees, pulling out their recognizable pattern from the radio signal, even without breaking the drone’s encrypted video.

The details have to do with the way drone video is compressed:

The researchers’ technique takes advantage of an efficiency feature streaming video has used for years, known as “delta frames.” Instead of encoding video as a series of raw images, it’s compressed into a series of changes from the previous image in the video. That means when a streaming video shows a still object, it transmits fewer bytes of data than when it shows one that moves or changes color.

That compression feature can reveal key information about the content of the video to someone who’s intercepting the streaming data, security researchers have shown in recent research, even when the data is encrypted.
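As a rough illustration of why this works, here is a small simulated sketch (not the researchers’ code; every number in it is invented): flicker a pattern at a known rhythm while logging the size of each intercepted, still-encrypted video packet, and a drone that is watching you produces a bitrate that correlates with your flicker.

```python
# Simulated sketch: delta-frame compression leaks whether a drone's video
# stream contains our flickering pattern. Frame sizes and the 400-byte
# bump are invented numbers purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
FPS, SECONDS = 30, 20
n = FPS * SECONDS
flicker = (np.arange(n) // FPS) % 2        # pattern toggles once per second
baseline = rng.normal(1000, 50, n)         # bytes per frame, static scene
watched = baseline + 400 * flicker         # changing pixels inflate delta frames

def correlation(frame_sizes, pattern):
    """Pearson correlation between observed frame sizes and our flicker."""
    return float(np.corrcoef(frame_sizes, pattern)[0, 1])

print("drone watching us:  ", correlation(watched, flicker))   # close to 1
print("drone watching else:", correlation(baseline, flicker))  # close to 0
```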

Research paper and video.

Now Open AWS EU (Paris) Region

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-open-aws-eu-paris-region/

Today we are launching our 18th AWS Region, our fourth in Europe. Located in the Paris area, the new Region lets AWS customers better serve end users in and around France.

The Details
The new EU (Paris) Region provides a broad suite of AWS services including Amazon API Gateway, Amazon Aurora, Amazon CloudFront, Amazon CloudWatch, CloudWatch Events, Amazon CloudWatch Logs, Amazon DynamoDB, Amazon Elastic Compute Cloud (EC2), EC2 Container Registry, Amazon ECS, Amazon Elastic Block Store (EBS), Amazon EMR, Amazon ElastiCache, Amazon Elasticsearch Service, Amazon Glacier, Amazon Kinesis Streams, Polly, Amazon Redshift, Amazon Relational Database Service (RDS), Amazon Route 53, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Amazon Simple Storage Service (S3), Amazon Simple Workflow Service (SWF), Amazon Virtual Private Cloud, Auto Scaling, AWS Certificate Manager (ACM), AWS CloudFormation, AWS CloudTrail, AWS CodeDeploy, AWS Config, AWS Database Migration Service, AWS Direct Connect, AWS Elastic Beanstalk, AWS Identity and Access Management (IAM), AWS Key Management Service (KMS), AWS Lambda, AWS Marketplace, AWS OpsWorks Stacks, AWS Personal Health Dashboard, AWS Server Migration Service, AWS Service Catalog, AWS Shield Standard, AWS Snowball, AWS Snowball Edge, AWS Snowmobile, AWS Storage Gateway, AWS Support (including AWS Trusted Advisor), Elastic Load Balancing, and VM Import.

The Paris Region supports all sizes of C5, M5, R4, T2, D2, I3, and X1 instances.

There are also four edge locations for Amazon Route 53 and Amazon CloudFront: three in Paris and one in Marseille, all with AWS WAF and AWS Shield. Check out the AWS Global Infrastructure page to learn more about current and future AWS Regions.

The Paris Region will benefit from three AWS Direct Connect locations. Telehouse Voltaire is available today. AWS Direct Connect will also become available at Equinix Paris in early 2018, followed by Interxion Paris.

All AWS infrastructure regions around the world are designed, built, and regularly audited to meet the most rigorous compliance standards and to provide high levels of security for all AWS customers. These include ISO 27001, ISO 27017, ISO 27018, SOC 1 (Formerly SAS 70), SOC 2 and SOC 3 Security & Availability, PCI DSS Level 1, and many more. This means customers benefit from all the best practices of AWS policies, architecture, and operational processes built to satisfy the needs of even the most security sensitive customers.

AWS is certified under the EU-US Privacy Shield, and the AWS Data Processing Addendum (DPA) is GDPR-ready and available now to all AWS customers to help them prepare for May 25, 2018 when the GDPR becomes enforceable. The current AWS DPA, as well as the AWS GDPR DPA, allows customers to transfer personal data to countries outside the European Economic Area (EEA) in compliance with European Union (EU) data protection laws. AWS also adheres to the Cloud Infrastructure Service Providers in Europe (CISPE) Code of Conduct. The CISPE Code of Conduct helps customers ensure that AWS is using appropriate data protection standards to protect their data, consistent with the GDPR. In addition, AWS offers a wide range of services and features to help customers meet the requirements of the GDPR, including services for access controls, monitoring, logging, and encryption.

From Our Customers
Many AWS customers are preparing to use this new Region. Here’s a small sample:

Societe Generale, one of the largest banks in France and the world, has accelerated their digital transformation while working with AWS. They developed SG Research, an application that makes reports from Societe Generale’s analysts available to corporate customers in order to improve the decision-making process for investments. The new AWS Region will reduce latency between applications running in the cloud and in their French data centers.

SNCF is the national railway company of France. Their mobile app, powered by AWS, delivers real-time traffic information to 14 million riders. Extreme weather, traffic events, holidays, and engineering works can cause usage to peak at hundreds of thousands of users per second. They are planning to use machine learning and big data to add predictive features to the app.

Radio France, the French public radio broadcaster, offers seven national networks, and uses AWS to accelerate its innovation and stay competitive.

Les Restos du Coeur is a French charity that provides assistance to the needy, delivering food packages and helping with their social and economic reintegration into French society. It is using AWS for its CRM system to track the assistance given to each of its beneficiaries and the impact this is having on their lives.

AlloResto by JustEat (a leader in the French FoodTech industry) is using AWS to scale during traffic peaks and to accelerate its innovation process.

AWS Consulting and Technology Partners
We are already working with a wide variety of consulting, technology, managed service, and Direct Connect partners in France. Here’s a partial list:

AWS Premier Consulting Partners: Accenture, Capgemini, Claranet, CloudReach, DXC, and Edifixio.

AWS Consulting Partners: ABC Systemes, Atos International SAS, CoreExpert, Cycloid, Devoteam, LINKBYNET, Oxalide, Ozones, Scaleo Information Systems, and Sopra Steria.

AWS Technology Partners: Axway, Commerce Guys, MicroStrategy, Sage, Software AG, Splunk, Tibco, and Zerolight.

AWS in France
We have been investing in Europe, with a focus on France, for the last 11 years. We have also been developing documentation and training programs to help our customers to improve their skills and to accelerate their journey to the AWS Cloud.

As part of our commitment to AWS customers in France, we plan to train more than 25,000 people in the coming years, helping them develop highly sought after cloud skills. They will have access to AWS training resources in France via AWS Academy, AWSome days, AWS Educate, and webinars, all delivered in French by AWS Technical Trainers and AWS Certified Trainers.

Use it Today
The EU (Paris) Region is open for business now and you can start using it today!
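As a quick sketch of what that looks like in practice: the Region code for EU (Paris) is eu-west-3, and the bucket name below is hypothetical.

```python
# Point an AWS SDK client at the new EU (Paris) Region, eu-west-3.
import boto3

s3 = boto3.client("s3", region_name="eu-west-3")

# Buckets outside us-east-1 need an explicit location constraint.
s3.create_bucket(
    Bucket="example-paris-bucket",  # hypothetical name; bucket names are global
    CreateBucketConfiguration={"LocationConstraint": "eu-west-3"},
)
print(s3.get_bucket_location(Bucket="example-paris-bucket"))
```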

Jeff;

 

New Zealand Prepares Consultation to Modernize Copyright Laws

Post Syndicated from Andy original https://torrentfreak.com/new-zealand-prepares-consultation-to-modernize-copyright-laws-171218/

The Copyright Act 1994 is the key legislation governing New Zealand’s handling of intellectual property issues, covering protection, infringement, exceptions and enforcement. It last underwent a review more than a decade ago resulting in the Copyright (New Technologies) Amendment Act 2008.

Like much copyright law worldwide, New Zealand’s legislation has struggled to keep pace with technological change so, during the summer, the last government announced plans for a review with several key goals:

  • Assess the performance of the Copyright Act against the objectives of New Zealand’s copyright regime.
  • Identify barriers to achieving the objectives of New Zealand’s copyright regime, and the level of impact that these barriers have.
  • Formulate a preferred approach to addressing these issues – including amendments to the Copyright Act, and the commissioning of further work on any other regulatory or non-regulatory options that are identified.

The former government planned to initiate a public consultation in the second quarter of 2018, with a review being informed by the responses. According to an announcement Friday, the new government plans to go ahead with the overhaul, beginning in April as previously envisioned.

Many of the hot topics in the United States, Europe and closer to home in Australia are expected to come to the forefront, including site-blocking, service provider safe harbor provisions, and the thorny issue of fair use.

Speaking with RadioNZ, New Zealand Screen Association managing director Matthew Cheetham says that new legislation is required to keep pace with a rapidly moving landscape.

“In New Zealand, piracy is almost an accepted thing, because no one’s really doing anything about it, because no one actually can do anything about it,” Cheetham says.

“As new technologies have evolved, the law has struggled to keep pace with those new technologies and to make sure that the law is fit for purpose in the digital age.”

As the local representative for several Hollywood studios, it’s no surprise that NZSA will be seeking amendments that will force ISPs to block access to popular pirate sites, as they do already in the UK, Europe, and Australia.

“If the site is infringing [a court] can order internet service providers to block access to that site. Forty-two countries around the world have recognised that blocking access when it’s carefully defined is a perfectly legitimate avenue for rights holders to protect their rights,” Cheetham notes.

While there hasn’t been a major copyright overhaul in more than a decade, New Zealand is no stranger to prolonged exercises to try and stop piracy.

The country spent huge amounts of time and money late last decade in order to come up with the Copyright (Infringing File Sharing) Amendment Act 2011. It laid out a system under which pirates received escalating warnings culminating in eventual disconnection from the Internet. But, with escalating costs (between NZ$20 and NZ$25 per notice), the scheme was ultimately an expensive flop.

“We have an entire regime that allows copyright holders to seek and send notices to users that are committing piracy and actually have a process in a court-based system that allows remedies to be pursued,” Internet New Zealand deputy chief executive Andrew Cushen told RadioNZ.

“None of them are using it. Why would we now look at a wholly different solution that none of them are going to use as well?”

As someone who has been acutely affected by New Zealand’s approach to intellectual property rights enforcement, Kim Dotcom certainly has an interest in the development of local copyright law. The Megaupload founder was arrested in 2012 for alleged copyright offenses that he insists aren’t even a crime in New Zealand. So what advice does he have for the review?

According to the entrepreneur, the NZ Copyright Act is “mostly good”, noting that it protects both ISPs and consumers. Given the chance, however, he would remind judges about the purpose of the act.

“The NZ Copyright Act is a code. The Copyright Act creates a special property right. No other act applies to this special property right, including the crimes act,” Dotcom informs TF.

“This might be a helpful yardstick for Judges who don’t understand the Copyright Act and attempt to create new and unintended law from the bench. Just like in my case.”

Only time will tell how the public consultation will play out but it seems likely that tackling the “Value Gap” situation will be high up the agenda, especially if that can be achieved by eroding Internet companies’ safe harbors under copyright law. Expect that to receive significant push-back from the technology sector.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

2017-12-13 miscellaneous

Post Syndicated from Vasil Kolev original https://vasil.ludost.net/blog/?p=3370

Lots of things in one place, since there’s never any time for blogging.

The lab is organising a big Christmas LAN party on 21 December, with all sorts of hardware and games.

We also set up a podcast recording studio in the lab again, and even recorded a test podcast (the recording has had very little processing; it probably needs a bit more gain). Overall, the software side could be improved a little (i.e. I should set aside an hour and automate it some more), and we should get another stand for one of the microphones (instead of it resting in a didgeridoo propped up on a guitar stand), but it seems to do the job.

And one last lab note: we’ve accumulated so much odd hardware that I’m considering a workshop/competition to see who can get the most of it running. The current state of things is that I could hook the VAX up over fibre (which I’d very much like to do one of these days when I have a little time).

As for me, I need sleep. These days I do manage to get some, but in general I find it hard to string together genuinely restful weekends in which I mostly sleep and recover, so I need to come up with something; OpenFest itself wore me out terribly (I went several weeks around it without any real rest). Recently I even had a day that amounted to two 8-hour shifts of work (the second was sorting the video and similar equipment from the fest back into the lab, since it needed doing and was taking up space in all the wrong places).

Work is fun: every day I discover new things that don’t work, plus strange bugs and design decisions in components that people have supposedly tested, use, and consider fine. One of the fresher examples is how InfluxDB’s continuous queries, given enough databases and data, can simply never catch up, because they run in a single thread that gets invoked who knows when. I managed to replace them with 200 lines of Python code (and, judging by what Google turns up, I’m not the only one).
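A minimal sketch of the general approach described above (an illustration, not the actual 200-line replacement; the database names, interval, and target retention policy are invented):

```python
# Sketch: replace InfluxDB continuous queries with an external worker that
# runs the same "SELECT ... INTO ..." downsampling per database from a
# thread pool, so one slow database cannot starve the others.
from concurrent.futures import ThreadPoolExecutor
from influxdb import InfluxDBClient  # pip install influxdb (1.x client)

DATABASES = ["metrics_app1", "metrics_app2"]  # illustrative names

def downsample(db: str) -> None:
    client = InfluxDBClient(host="localhost", port=8086, database=db)
    # Aggregate the last 10 minutes of raw points into 1-minute means,
    # writing into a "downsampled" retention policy (assumed to exist).
    client.query(
        'SELECT mean(*) INTO "downsampled"./:MEASUREMENT/ '
        'FROM /.*/ WHERE time > now() - 10m GROUP BY time(1m), *'
    )

with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(downsample, DATABASES))
```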

On the book front, one of the main bits of news is that the author of Worm has finished his latest project (Twig) and has started writing in the Worm universe again (the new work is called Ward), which is great for everyone who loves 5000-page books.

This weekend there’s a hackathon at FMI, to which we lent some slightly odd equipment from the lab. Is there anyone who could pitch the teams the idea of decoding the radio broadcasts that feed data to the information boards at bus stops? The necessary hardware is there, it probably wouldn’t be hard to demodulate with existing tools like gnuradio, and there’s not even any need to finish the part that could push arbitrary text to those boards…

Preparations for FOSDEM are underway. After the latest tests (which we did at a hackathon there on site), my code that uses openpgm doesn’t retransmit, and after a day of debugging (sticking prints in various places and trying to work out what exactly is meant by those people who, in functions named “check-something-or-other”, mutate the global state, and whom I’ve come to rather dislike) I couldn’t find why it doesn’t work. I’m considering hiding away somewhere over the holidays to debug it, or finally coming up with a TCP-based solution.
(How has nobody written multicast TCP? We should set it as a task for someone who doesn’t know it’s impossible, and see what comes out…)

I’ll stop before this becomes completely incoherent.

Matt’s steampunk radio jukebox

Post Syndicated from Janina Ander original https://www.raspberrypi.org/blog/matts-steampunk-radio-jukebox/

Matt Van Gastel breathed new life into his great-grandparents’ 1930s Westinghouse with a Raspberry Pi, an amplifier HAT, Google Music, and some serious effort. The result is a really beautiful, striking piece.

Matt Van Gastel Steampunk Radio Raspberry Pi

The radio

With a background in radio electronics, Matt Van Gastel had always planned to restore his great-grandparents’ mid-30s Westinghouse radio. “I even found the original schematics glued to the bottom of the base of the main electronics assembly,” he explains in his Instructables walkthrough. However, considering the age of the piece and the cost of sourcing parts for a repair, he decided to take the project in a slightly different direction.



“I pulled the main electronics assembly out quite easily, it was held in by four flat head screws […] I decided to make a Steampunk themed Jukebox based off this main assembly and power it with a Raspberry Pi,” he writes.

The build

Matt added JustBoom’s Amp HAT to a Raspberry Pi 3 to boost the sound quality and functionality of the board.

He spent a weekend prototyping and testing the electronics before deciding on his final layout. After a little time playing around with different software, Matt chose Mopidy, a flexible music server written in Python. Mopidy lets him connect to his music-streaming service of choice, Google Music, and also allows AirPlay connectivity for other wireless devices.
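For anyone wanting to follow along, a Mopidy setup like this is driven by a single config file. Here is a hedged sketch of what the relevant sections might look like, assuming the mopidy-gmusic extension for Google Music; section and key names vary between extension versions, and the credentials are placeholders.

```ini
# Sketch of /etc/mopidy/mopidy.conf for a streaming jukebox like this one.

[audio]
# Play out through the amp HAT's ALSA device
output = alsasink

[http]
enabled = true
# Allow control from other devices on the LAN
hostname = 0.0.0.0

[gmusic]
enabled = true
# Placeholder credentials for the mopidy-gmusic extension
username = you@example.com
password = an-app-password
```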

Stripping out the old electronics from inside the Westinghouse radio easily made enough space for Matt’s new, much smaller, setup. Reserving various pieces for the final build, and scrubbing the entire unit to within an inch of its life with soap and water, he moved on to the aesthetics of the piece.

The steampunk

LED Nixie tubes, a 1950s DC voltmeter, and spray paint all contributed to the final look of the radio. It has a splendid steampunk look that works wonderfully with the vintage of the original radio.



Retrofit and steampunk Raspberry Pi builds

From old pub jukeboxes to Bakelite kitchen radios, we’ve seen lots of retrofit audio visual Pi projects over the years, with all kinds of functionality and in all sorts of styles.

Americana – does exactly what it says on the tin jukebox

For more steampunk inspiration, check out phrazelle’s laptop and Derek Woodroffe’s tentacle hat. And for more audiophile builds, Tijuana Rick’s 60s Wurlitzer and Steve Devlin’s 50s wallbox are stand-out examples.

The post Matt’s steampunk radio jukebox appeared first on Raspberry Pi.

Pioneers winners: only you can save us

Post Syndicated from Erin Brindley original https://www.raspberrypi.org/blog/pioneers-winners-only-you-can-save-us/

She asked for help, and you came to her aid. Pioneers, the winners of the Only you can save us challenge have been picked!

Can you see me? Only YOU can save us!

I need your help. This is a call-out for those aged between 11 and 16 in the UK and Republic of Ireland. Something has gone very, very wrong and only you can save us. I’ve collected together as much information for you as I can. You’ll find it at http://www.raspberrypi.org/pioneers.

The challenge

In August we intercepted an emergency communication from a lonesome survivor. She seemed to be in quite a bit of trouble, and asked all you young people aged 11 to 16 to come up with something to help tackle the oncoming crisis, using whatever technology you had to hand. You had ten weeks to work in teams of two to five with an adult mentor to fulfil your mission.

The judges

We received your world-saving ideas, and our savvy survivor pulled together a ragtag bunch of apocalyptic experts to help us judge which ones would be the winning entries.

Dr Shini Somara

Dr Shini Somara is an advocate for STEM education and a mechanical engineer. She was host of The Health Show and has appeared in documentaries for the BBC, PBS Digital, and Sky. You can check out her work hosting Crash Course Physics on YouTube.

Prof Lewis Dartnell is an astrobiologist and author of the book The Knowledge: How to Rebuild Our World From Scratch.

Emma Stephenson has a background in aeronautical engineering and currently works in the Shell Foundation’s Access to Energy and Sustainable Mobility portfolio.

Currently sifting through the entries with the other judges of #makeyourideas with @raspberrypifoundation @_raspberrypi_


The winners

Our survivor is currently putting your entries to good use repairing, rebuilding, and defending her base. Our judges chose the following projects as outstanding examples of world-saving digital making.

Theme winner: Computatron

Raspberry Pioneers 2017 – Nerfus Dislikus Killer Robot

This is our entry to the pioneers ‘Only you can save us’ competition. Our team name is Computatrum. Hope you enjoy!

Are you facing an unknown enemy whose only weakness is Nerf bullets? Then this is the robot for you! We loved the especially apocalyptic feel of the Computatron’s cleverly hacked and repurposed elements. The team even used an old floppy disc mechanism to help fire their bullets!

Technically brilliant: Robot Apocalypse Committee

Pioneers Apocalypse 2017 – RationalPi

Thousands of lines of code… Many sheets of acrylic… A camera, touchscreen and fingerprint scanner… This is our entry into the Raspberry Pi Pioneers2017 ‘Only YOU can Save Us’ theme. When zombies or other survivors break into your base, you want a secure way of storing your crackers.

The Robot Apocalypse Committee is back, and this time they’ve brought cheese! The crew designed a cheese- and cracker-dispensing machine complete with face and fingerprint recognition to ensure those rations last until the next supply drop.

Best explanation: Pi Chasers

Tala – Raspberry Pi Pioneers Project

Hi! We are PiChasers and we entered the Raspberry Pi Pioneers challenge last time, when the theme was “Make it Outdoors!”, but now we’ve been faced with another theme: “Apocalypse”. We spent a while thinking of an original thing that would help in an apocalypse and decided upon a ‘text-only phone’ which uses local radio communication rather than cellular.

This text-based communication device encased in a tupperware container could be a lifesaver in a crisis! And luckily, the Pi Chasers produced an excellent video and amazing GitHub repo, ensuring that any and all survivors will be able to build their own in the safety of their base.

Most inspiring journey: Three Musketeers

Pioneers Entry – The Apocalypse

Pioneers Entry Team Name: The Three Musketeers Team Participants: James, Zach and Tom

We all know that zombies are terrible at geometry, and the Three Musketeers used this fact to their advantage when building their zombie security system. We were impressed to see the team working together to overcome the roadblocks they faced along the way.

We appreciate what you’re trying to do: Zombie Trolls

Zombie In The Middle

Uploaded by CDA Bodgers on 2017-12-01.

Playing piggy in the middle with zombies sure is a unique way of saving humankind from total extinction! We loved this project idea, and although the Zombie Trolls had a little trouble with their motors, we’re sure with a little more tinkering this zombie-fooling contraption could save us all.

Most awesome

Our judges also wanted to give a special commendation to the following teams for their equally awesome apocalypse-averting ideas:

  • PiRates, for their multifaceted zombie-proofing defence system and the high production value of their video
  • Byte them Pis, for their beautiful zombie-detecting doormat
  • Unatecxon, for their impressive bunker security system
  • Team Crompton, for their pressure-activated door system
  • Team Ernest, for their adventures in LEGO

The prizes

All our winning teams have secured exclusive digital maker boxes. These are jam-packed with tantalising tech to satisfy all tinkering needs.

Our theme winners have also secured themselves a place at Coolest Projects 2018 in Dublin, Ireland!

Thank you to everyone who got involved in this round of Pioneers. Look out for your awesome submission swag arriving in the mail!

The post Pioneers winners: only you can save us appeared first on Raspberry Pi.

Remote Hack of a Boeing 757

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/12/remote_hack_of_.html

Last month, the DHS announced that it was able to remotely hack a Boeing 757:

“We got the airplane on Sept. 19, 2016. Two days later, I was successful in accomplishing a remote, non-cooperative, penetration,” said Robert Hickey, aviation program manager within the Cyber Security Division of the DHS Science and Technology (S&T) Directorate.

“[Which] means I didn’t have anybody touching the airplane, I didn’t have an insider threat. I stood off using typical stuff that could get through security and we were able to establish a presence on the systems of the aircraft.” Hickey said the details of the hack and the work his team are doing are classified, but said they accessed the aircraft’s systems through radio frequency communications, adding that, based on the RF configuration of most aircraft, “you can come to grips pretty quickly where we went” on the aircraft.

Marvellous retrofitted home assistants

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/retrofitted-home-assistants/

As more and more digital home assistants are appearing on the consumer market, it’s not uncommon to see the towering Amazon Echo or sleek Google Home when visiting friends or family. But we, the maker community, are rarely happy unless our tech stands out from the rest. So without further ado, here’s a roundup of some fantastic retrofitted home assistant projects you can recreate and give pride of place in your kitchen, on your bookshelf, or wherever else you’d like to talk to your virtual, disembodied PA.

Google AIY Robot Conversion

Turned an 80s Tomy Mr Money into a little Google AIY / Raspberry Pi based assistant.

Matt ‘Circuitbeard’ Brailsford’s Tomy Mr Money Google AIY Assistant is just one of many home-brew home assistants makers have built since the release of APIs for Amazon Alexa and Google Home. Here are some more…

Teddy Ruxpin

Oh Teddy, how exciting and mysterious you were when I unwrapped you back in the mid-eighties. With your awkwardly moving lips and twitching eyelids, you were the cream of the crop of robotic toys! How was I to know that during my thirties, you would become augmented with home assistant software and suddenly instil within me a fear unlike any I’d felt before? (Save for my lifelong horror of ET…)

Alexa Ruxpin – Raspberry Pi & Alexa Powered Teddy Bear

There are tons of virtual assistants out on the market: Siri, Ok Google, Alexa, etc. I had this crazy idea…what if I made the virtual assistant real…kinda. I decided to take an old animatronic teddy bear and hack it so that it ran Amazon Alexa.

Several makers around the world have performed surgery on Teddy to install a Raspberry Pi within his stomach and integrate him with Amazon Alexa Voice or Google’s AIY Projects Voice kit. And because these makers are talented, they’ve also managed to hijack Teddy’s wiring to make his lips move in time with his responses to your commands. Freaky…

Speaking of freaky: check out Zack’s Furlexa — an Amazon Alexa Furby that will haunt your nightmares.

Give old tech new life

Devices that were the height of technology when you purchased them may now be languishing in your attic collecting dust. With new and improved versions of gadgets and gizmos being released almost constantly, it is likely that your household harbours a spare whosit or whatsit which you can dismantle and give a new Raspberry Pi heart and purpose.

Take, for example, Martin Mander’s Google Pi intercom. By gutting and thoroughly cleaning a vintage intercom, Martin fashioned a suitable housing for the Google AIY Projects Voice kit to create a new home assistant for his house:

1986 Google Pi Intercom

This is a 1986 Radio Shack Intercom that I’ve converted into a Google Home style device using a Raspberry Pi and the Google AIY (Artificial Intelligence Yourself) kit that came free with the MagPi magazine (issue 57). It uses the Google Assistant to answer questions and perform actions, using IFTTT to integrate with smart home accessories and other web services.

Not only does this build look fantastic, it’s also a great conversation starter for any visitors who had a similar device during the eighties.

Also take a look at Martin’s 1970s Amazon Alexa phone for more nostalgic splendour.

Put it in a box

…and then I’ll put that box inside of another box, and then I’ll mail that box to myself, and when it arrives…


A GIF. A harmless, little GIF…and proof of the comms team’s obsession with The Emperor’s New Groove.

You don’t have to be fancy when it comes to housing your home assistant. And often, especially if you’re working with the smaller people in your household, the results of a simple homespun approach are just as delightful.

Here are Hannah and her dad Tom, explaining how they built a home assistant together and fit it inside an old cigar box:

Raspberry Pi 3 Amazon Echo – The Alexa Kids Build!

My 7 year old daughter and I decided to play around with the Raspberry Pi and build ourselves an Amazon Echo (Alexa). The video tells you about what we did and the links below will take you to all the sites we used to get this up and running.

Also see the Google AIY Projects Voice kit — the cardboard box-est of home assistant boxes.

Make your own home assistant

And now it’s your turn! I challenge you all (and also myself) to create a home assistant using the Raspberry Pi. Whether you decide to fit Amazon Alexa inside an old shoebox or Google Home inside your sister’s Barbie, I’d love to see what you create using the free home assistant software available online.

Check out these other home assistants for Raspberry Pi, and keep an eye on our blog to see what I manage to create as part of the challenge.

Ten virtual house points for everyone who shares their build with us online, either in the comments below or by tagging us on your social media account.

The post Marvellous retrofitted home assistants appeared first on Raspberry Pi.

MagPi 64: get started with electronics

Post Syndicated from Rob Zwetsloot original https://www.raspberrypi.org/blog/magpi-64/

Hey folks, Rob here again! You get a double dose of me this month, as today marks the release of The MagPi 64. In this issue we give you a complete electronics starter guide to help you learn how to make circuits that connect to your Raspberry Pi!

The front cover of MagPi 64

MAGPI SIXTY-FOOUUUR!

Wires, wires everywhere!

In the electronics feature, we’ll teach you how to identify different components in circuit diagrams, we’ll explain what they do, and we’ll give you some basic wiring instructions so you can take your first steps. The feature also includes step-by-step tutorials on how to make a digital radio and a range-finder, meaning you can test out your new electronics skills immediately!
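As a taster of the kind of circuit the feature walks through, here is a hedged sketch of reading an HC-SR04 ultrasonic range-finder with the gpiozero library. The pin numbers are assumptions; the magazine’s wiring diagram is the authority here.

```python
# Sketch: read an HC-SR04 ultrasonic range-finder with gpiozero.
from time import sleep
from gpiozero import DistanceSensor

# BCM pin numbers are assumptions; match them to your own wiring.
sensor = DistanceSensor(echo=17, trigger=4)

while True:
    # .distance is reported in metres (capped at max_distance, 1 m by default)
    print(f"Distance: {sensor.distance * 100:.1f} cm")
    sleep(1)
```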

Christmas tutorials

Electronics are cool, but what else is in this issue? Well, we have exciting news about the next Google AIY Projects Vision kit, which forgoes audio for images, allowing you to build a smart camera with your Raspberry Pi.

We’ve also included guides on how to create your own text-based adventure game and a kaleidoscope camera. And, just in time for the festive season, there’s a tutorial for making a 3D-printed Pi-powered Christmas tree star. All this in The MagPi 64, along with project showcases, reviews, and much more!

Kaleido Cam

Using a normal webcam or the Raspberry Pi camera, produce real-time kaleidoscope effects with the Raspberry Pi. This video shows the normal mode, along with an auto pre-rotate, and a horizontal and vertical flip.

Get The MagPi 64

Issue 64 is available today from WHSmith, Tesco, Sainsbury’s, and Asda. If you live in the US, head over to your local Barnes & Noble or Micro Center in the next few days. You can also get the new issue online from our store, or digitally via our Android and iOS apps. And don’t forget, there’s always the free PDF as well.

Subscribe for free goodies

Want to support the Raspberry Pi Foundation and the magazine, and get some cool free stuff? If you take out a twelve-month print subscription to The MagPi, you’ll get a Pi Zero W, Pi Zero case, and adapter cables absolutely free! This offer does not currently have an end date.

We hope you enjoy this issue!

Nintendo Sixty-FOOOOOOOOOOUR

Brandon gets an N64 for Christmas 1998 and gets way too excited. Inquiries about usage / questions / comments? [email protected] © n64kids.com

The post MagPi 64: get started with electronics appeared first on Raspberry Pi.

Swedish Data Authority Investigates Piracy Settlement Letters

Post Syndicated from Andy original https://torrentfreak.com/swedish-data-authority-investigates-piracy-settlement-letters-171115/

Companies that aim to turn piracy into profit have been in existence for more than a decade, but the controversy around their practices continues.

Most, known colloquially as ‘copyright trolls’, monitor peer-to-peer networks such as BitTorrent, collecting IP addresses and other data in order to home in on a particular Internet account. From there, ISPs are sued to hand over that particular subscriber’s personal details. Once they’re obtained, the pressure begins.

At this point, trolls are in direct contact with the public, usually by letter. Their tone is almost always semi-aggressive, warning account holders that their actions are undermining entire industries. However, as if by magic, all the harm can be undone if they pay up a few hundred dollars, euros, or pounds – quickly.

That’s the case in Sweden, where law firm Njord Law is representing the well-known international copyright trolls behind the movies CELL, IT, London Has Fallen, Mechanic: Resurrection, Criminal, and September of Shiraz.

“Have you, or other people with access to the aforementioned IP address, such as children living at home, viewed or tried to watch [a pirate movie] at the specified time?” Njord Law now writes in its letters to alleged pirates.

“If so, the case can be terminated by paying 4,500 SEK [$550].”

It’s clear that the companies involved are diving directly for cash. Indeed, letter recipients are told they have just two weeks to pay up or face further issues. The big question now is whether these demands are permissible under law, not necessarily from a copyright angle but due to the way they are presented to the alleged pirates.

The Swedish Data Protection Authority (Datainspektionen) is a public authority tasked with protecting the privacy of the individual in the information society. Swedish Radio reports that it has received several complaints from Swedes who have received cash demands and as a result is investigating whether the letters are legal.

The authority first has to determine whether the letters can be regarded as a debt collection measure. If so, the senders will have to comply with special laws and will also require special permission.

“They have not classified this as a debt collection fee, but it is not that element that is crucial. A debt collection measure is determined by whether there is any kind of pressure on the recipient to make a payment. Then there is the question of whether such pressure can be considered a debt collection measure,” says lawyer Camilla Sparr.

Of course, the notion that the letters exist for the purposes of collecting a debt is rejected by Njord Law. Lawyer Jeppe Brogaard Clausen says that his company has had no problems in this respect in other jurisdictions.

“We have encountered the same issue in Denmark and Finland and it was judged by the authorities that there is no talk about a debt collection letter,” Clausen told SR.

A lot hinges on the investigation of the Data Protection Authority. Njord Law has already obtained permission to find out the identities behind tens of thousands of IP addresses, including a single batch where 25,000 customers of ISP Telia were targeted.

At least 5,000 letters demanding payment have been sent out already, and another 5,000 are lined up for the next few months. Clausen says their purpose is to change Swedes’ attitudes towards illegal file-sharing, but there’s a broad belief that they’re part of a global network of companies whose aim is to generate profit from piracy.

But while the Data Protection Authority does its work, there is plenty of advice for letter recipients who don’t want to cave in to demands for cash. Last month, Copyright Professor Sanna Wolk advised them to ignore the letters entirely.

“Do not pay. You do not even have to answer it,” Wolk told people receiving a letter.

“In the end, it’s the court that will decide whether you have to pay or not. We have seen this type of letter in the past, and only very few times those in charge of the claims have taken it to court.”

Of course, should copyright holders actually take a matter to court, then recipients must contest the claim since failure to do so could result in a default judgment. This means they lose the case without even having had the opportunity to mount a defense.

Importantly, one such defense could be that the individual didn’t carry out the offense, perhaps because their WiFi isn’t password protected or because they share their connection with others.

“Someone who has an open network cannot be held responsible for copyright violations – such as downloading movies – if they provide others with access to their internet connection. This has been decided in a European Court ruling last year,” Wolk noted.


Sony & Warner Sue TuneIn For Copyright Infringement in UK High Court

Post Syndicated from Andy original https://torrentfreak.com/sony-warner-sue-tunein-for-copyright-infringement-in-uk-high-court-171109/

When it comes to providing digital online audio content, TuneIn is one of the world’s giants.

Whether music, news, sport or just chat, TuneIn provides more than 120,000 radio stations and five million podcasts to 75 million global users, both for free and via a premium tier service.

Accessible from devices including cellphones, tablets, smart TVs, digital receivers, games consoles and even cars, TuneIn reaches more than 230 countries and territories worldwide. One, however, is about to cause the company a headache.

According to a report from Music Business Worldwide (MBW), Sony Music Entertainment and Warner Music Group are suing TuneIn over unlicensed streams.

MBW sources say that the record labels filed proceedings in the UK High Court last week, claiming that TuneIn committed copyright infringement on at least 800 music streams accessible in the UK.

While TuneIn does offer premium streams to customers, the service primarily acts as an index for radio streams hosted by their respective third-party creators. It describes itself as “an audio guide service” which indicates it does not directly provide the content listened to by its users.

However, previous EU rulings (such as one related to The Pirate Bay) have determined that providing an index to content is tantamount to a communication to the public, which for unlicensed content would amount to infringement in the UK.

While it would be difficult to avoid responsibility this way, TuneIn states on its website that it makes no claim that its service is legal in any country other than the United States.

“Those who choose to access or use the Service from locations outside the United States of America do so on their own initiative and are responsible for compliance with local laws, if and to the extent local laws are applicable,” the company writes.

“Access to the Service from jurisdictions where the contents or practices of the Service are illegal, unauthorized or penalized is strictly prohibited.”

All that being said, the specific details of the Sony/Warner complaint are not yet publicly available so the precise nature of the High Court action is yet to be determined.

TorrentFreak contacted the BPI, the industry body that represents both Sony and Warner in the UK, for comment on the lawsuit. A spokesperson informed us that they are not directly involved in the action.

We also contacted both the IFPI and San Francisco-based TuneIn for further comment but at the time of publication, we were yet to hear back from either.

TuneIn reportedly has until the end of November to file a defense.


Russian Site-Blocking Chiefs Under Investigation For Fraud

Post Syndicated from Andy original https://torrentfreak.com/russian-site-blocking-chiefs-under-investigation-for-fraud-171024/

Over the past several years, Rozcomnadzor has become a highly controversial government body in Russia. With responsibility for ordering web-blockades against sites the country deems disruptive, it’s effectively Russia’s online censorship engine.

In total, Rozcomnadzor has ordered the blocking of more than 82,000 sites. Within that total, at least 4,000 have been rendered inaccessible on copyright grounds, with an additional 41,000 innocent platforms blocked as collateral damage.

This massive over-blocking has been widely criticized in Russia but until now, Rozcomnadzor has appeared pretty much untouchable. However, a scandal is now engulfing the organization after at least four key officials were charged with fraud offenses.

News that something was potentially amiss began leaking out two weeks ago, when Russian publication Vedomosti reported on a court process in which the initials of the defendants appeared to coincide with officials at Rozcomnadzor.

The publication suspected that three men were involved: Rozcomnadzor spokesman Vadim Ampelonsky, head of the legal department Boris Yedidin, and Alexander Veselchakov, who acts as an advisor to the head of the department monitoring radio frequencies.

The prosecution’s case indicated that the defendants were involved in “fraud committed by an organized group either on an especially large scale or entailing the deprivation of citizen’s rights.” However, no further details were made available, with the head of Rozcomnadzor, Alexander Zharov, claiming he knew nothing about a criminal case and refusing to answer questions.

It later transpired that four employees had been charged with fraud, including Anastasiya Zvyagintseva, who acts as the general director of CRFC, an agency under the control of Rozcomnadzor.

According to Kommersant, Zvyagintseva’s involvement is at the core of the matter. She claims to have been forced to put “ghost employees” on the payroll, whose wages were then funneled to existing employees in order to top up their salaries.

The investigation into the scandal certainly runs deep. It’s reported that FSB officers have been spying on Rozcomnadzor officials for six months, listening to their phone conversations, monitoring their bank accounts, and even watching the ATMs they used.

Local media reports indicate that the illegal salary scheme ran from 2012 until February 2017 and involved some 20 million rubles ($347,000) of illegal payments. These were allegedly used to retain ‘valuable’ employees when their regular salaries were not lucrative enough to keep them at the site-blocking body.

While Zvyagintseva has been released pending trial, Ampelonsky, Yedidin, and Veselchakov have been placed under house arrest by the Chertanovsky Court of Moscow until November 7.

Rozcomnadzor’s website is currently inaccessible.


PureVPN Explains How it Helped the FBI Catch a Cyberstalker

Post Syndicated from Andy original https://torrentfreak.com/purevpn-explains-how-it-helped-the-fbi-catch-a-cyberstalker-171016/

Early October, Ryan S. Lin, 24, of Newton, Massachusetts, was arrested on suspicion of conducting “an extensive cyberstalking campaign” against a 24-year-old Massachusetts woman, as well as her family members and friends.

The Department of Justice described Lin’s offenses as a “multi-faceted” computer hacking and cyberstalking campaign. Launched in April 2016 when he began hacking into the victim’s online accounts, Lin allegedly obtained personal photographs and sensitive information about her medical and sexual histories and distributed that information to hundreds of other people.

Details of what information the FBI compiled on Lin can be found in our earlier report but aside from his alleged crimes (which are both significant and repugnant), it was PureVPN’s involvement in the case that caused the most controversy.

In a report compiled by an FBI special agent, it was revealed that the Hong Kong-based company’s logs helped the authorities net the alleged criminal.

“Significantly, PureVPN was able to determine that their service was accessed by the same customer from two originating IP addresses: the RCN IP address from the home Lin was living in at the time, and the software company where Lin was employed at the time,” the agent’s affidavit reads.

Among many in the privacy community, this revelation was met with disappointment. On the PureVPN website the company claims to carry no logs and on a general basis, it’s expected that so-called “no-logging” VPN providers should provide people with some anonymity, at least as far as their service goes. Now, several days after the furor, the company has responded to its critics.

In a fairly lengthy statement, the company begins by confirming that it definitely doesn’t log what websites a user views or what content he or she downloads.

“PureVPN did not breach its Privacy Policy and certainly did not breach your trust. NO browsing logs, browsing habits or anything else was, or ever will be shared,” the company writes.

However, that’s only half the problem. While it doesn’t log user activity (what sites people visit or content they download), it does log the IP addresses that customers use to access the PureVPN service. These, given the right circumstances, can be matched to external activities thanks to logs carried by other web companies.

PureVPN talks about logs held by Google’s Gmail service to illustrate its point.

“A network log is automatically generated every time a user visits a website. For the sake of this example, let’s say a user logged into their Gmail account. Every time they accessed Gmail, the email provider created a network log,” the company explains.

“If you are using a VPN, Gmail’s network log would contain the IP provided by PureVPN. This is one half of the picture. Now, if someone asks Google who accessed the user’s account, Google would state that whoever was using this IP, accessed the account.

“If the user was connected to PureVPN, it would be a PureVPN IP. The inquirer [in the Lin case, the FBI] would then share timestamps and network logs acquired from Google and ask them to be compared with the network logs maintained by the VPN provider.”

Now, if PureVPN carried no logs – literally no logs – it would not be able to help with this kind of inquiry. That was the case last year when the FBI approached Private Internet Access for information and the company was unable to assist.

However, as is made pretty clear by PureVPN’s explanation, the company does log user IP addresses and timestamps which reveal when a user was logged on to the service. It doesn’t matter that PureVPN doesn’t log what the user allegedly did online, since the third-party service already knows that information to the precise second.

Following the example, Gmail knows that a user sent an email at 10:22am on Monday October 16 from a PureVPN IP address. So, if PureVPN is approached by the FBI, the company can confirm that User X was using the same IP address at exactly the same time, and that his home IP address was XXX.XX.XXX.XX. Effectively, the combined logs link one IP address to the other and the user is revealed. It’s that simple.
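
To make this mechanism concrete, here is a minimal R sketch of the cross-referencing described above. Everything in it is hypothetical for illustration: the data frames, column names, and IP addresses do not reflect PureVPN’s or Google’s actual log formats.

# Hypothetical log from a third-party service (e.g. a mail provider):
# which IP address accessed the account, and when
service_log <- data.frame(
  event_time = as.POSIXct(c("2017-10-16 10:22:00", "2017-10-16 14:03:00")),
  source_ip  = c("192.0.2.10", "198.51.100.7")  # VPN exit IPs seen by the service
)

# Hypothetical VPN connection log: which customer IP used which exit IP, and when
vpn_log <- data.frame(
  session_start = as.POSIXct("2017-10-16 10:00:00"),
  session_end   = as.POSIXct("2017-10-16 12:00:00"),
  exit_ip       = "192.0.2.10",
  customer_ip   = "203.0.113.55"  # the subscriber's home IP address
)

# Join on the exit IP, then keep events that fall inside a session window
joined  <- merge(service_log, vpn_log, by.x = "source_ip", by.y = "exit_ip")
matched <- joined[joined$event_time >= joined$session_start &
                  joined$event_time <= joined$session_end, ]
matched$customer_ip  # the home IP now linked to the account activity

With timestamped logs on both sides, the join is trivial. This is why incoming-IP logging defeats anonymity even when browsing activity itself is never recorded.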

It is for this reason that in TorrentFreak’s annual summary of no-logging VPN providers, the very first question we ask every single company reads as follows:

Do you keep ANY logs which would allow you to match an IP-address and a time stamp to a user/users of your service? If so, what information do you hold and for how long?

Clearly, if a company says “yes we log incoming IP addresses and associated timestamps”, any claim to total user anonymity is ended right there and then.

While not completely useless (a logging service will still stop the prying eyes of ISPs and similar surveillance, while also defeating throttling and site-blocking), if you’re a whistle-blower with a job or even your life to protect, this level of protection is entirely inadequate.

The take-home points from this controversy are numerous, but perhaps the most important is for people to read and understand VPN provider logging policies.

Secondly, and just as importantly, VPN providers need to be extremely clear about the information they log. Not tracking browsing or downloading activities is all well and good, but if home IP addresses and timestamps are stored, this needs to be made clear to the customer.

Finally, VPN users should not be evil. There are plenty of good reasons to stay anonymous online but cyberstalking, death threats and ruining people’s lives are not included. Fortunately, the FBI have offline methods for catching this type of offender, and long may that continue.

PureVPN’s blog post is available here.


Predict Billboard Top 10 Hits Using RStudio, H2O and Amazon Athena

Post Syndicated from Gopal Wunnava original https://aws.amazon.com/blogs/big-data/predict-billboard-top-10-hits-using-rstudio-h2o-and-amazon-athena/

Success in the popular music industry is typically measured in terms of the number of Top 10 hits artists have to their credit. The music industry is a highly competitive multi-billion dollar business, and record labels incur various costs in exchange for a percentage of the profits from sales and concert tickets.

Predicting the success of an artist’s release in the popular music industry can be difficult. One release may be extremely popular, resulting in widespread play on TV, radio and social media, while another single may turn out quite unpopular, and therefore unprofitable. Record labels need to be selective in their decision making, and predictive analytics can help them with decision making around the type of songs and artists they need to promote.

In this walkthrough, you leverage H2O.ai, Amazon Athena, and RStudio to make predictions on whether a song might make it to the Top 10 Billboard charts. You explore the GLM, GBM, and deep learning modeling techniques using H2O’s rapid, distributed and easy-to-use open source parallel processing engine. RStudio is a popular IDE, licensed either commercially or under AGPLv3, for working with R. This is ideal if you don’t want to connect to a server via SSH and use code editors such as vi to do analytics. RStudio is available in a desktop version, or a server version that allows you to access R via a web browser. RStudio’s Notebooks feature is used to demonstrate the execution of code and output. In addition, this post showcases how you can leverage Athena for query and interactive analysis during the modeling phase. A working knowledge of statistics and machine learning would be helpful to interpret the analysis being performed in this post.

Walkthrough

Your goal is to predict whether a song will make it to the Top 10 Billboard charts. For this purpose, you will be using multiple modeling techniques―namely GLM, GBM and deep learning―and choose the model that is the best fit.

This solution involves the following steps:

  • Install and configure RStudio with Athena
  • Log in to RStudio
  • Install R packages
  • Connect to Athena
  • Create a dataset
  • Create models

Install and configure RStudio with Athena

Use the following AWS CloudFormation stack to install, configure, and connect RStudio on an Amazon EC2 instance with Athena.

Launching this stack creates all required resources and prerequisites:

  • Amazon EC2 instance with Amazon Linux (minimum size of t2.large is recommended)
  • Provisioning of the EC2 instance in an existing VPC and public subnet
  • Installation of Java 8
  • Assignment of an IAM role to the EC2 instance with the required permissions for accessing Athena and Amazon S3
  • Security group allowing access to the RStudio and SSH ports from the internet (I recommend restricting access to these ports)
  • S3 staging bucket required for Athena (referenced within RStudio as ATHENABUCKET)
  • RStudio username and password
  • Setup logs in Amazon CloudWatch Logs (if needed for additional troubleshooting)
  • Amazon EC2 Systems Manager agent, which makes it easy to manage and patch the instance

All AWS resources are created in the US-East-1 Region. To avoid cross-region data transfer fees, launch the CloudFormation stack in the same region. To check the availability of Athena in other regions, see Region Table.

Log in to RStudio

The instance security group has been automatically configured to allow incoming connections on the RStudio port 8787 from any source internet address. You can edit the security group to restrict source IP access. If you have trouble connecting, ensure that port 8787 isn’t blocked by subnet network ACLs or by your outgoing proxy/firewall.

  1. In the CloudFormation stack, choose Outputs, Value, and then open the RStudio URL. You might need to wait for a few minutes until the instance has been launched.
  2. Log in to RStudio with the username and password you provided during setup.

Install R packages

Next, install the required R packages from the RStudio console. You can download the R notebook file containing just the code.

#install pacman – a handy package manager for managing installs
if("pacman" %in% rownames(installed.packages()) == FALSE)
{install.packages("pacman")}  
library(pacman)
p_load(h2o,rJava,RJDBC,awsjavasdk)
h2o.init(nthreads = -1)
##  Connection successful!
## 
## R is connected to the H2O cluster: 
##     H2O cluster uptime:         2 hours 42 minutes 
##     H2O cluster version:        3.10.4.6 
##     H2O cluster version age:    4 months and 4 days !!! 
##     H2O cluster name:           H2O_started_from_R_rstudio_hjx881 
##     H2O cluster total nodes:    1 
##     H2O cluster total memory:   3.30 GB 
##     H2O cluster total cores:    4 
##     H2O cluster allowed cores:  4 
##     H2O cluster healthy:        TRUE 
##     H2O Connection ip:          localhost 
##     H2O Connection port:        54321 
##     H2O Connection proxy:       NA 
##     H2O Internal Security:      FALSE 
##     R Version:                  R version 3.3.3 (2017-03-06)
## Warning in h2o.clusterInfo(): 
## Your H2O cluster version is too old (4 months and 4 days)!
## Please download and install the latest version from http://h2o.ai/download/
#install aws sdk if not present (pre-requisite for using Athena with an IAM role)
if (!aws_sdk_present()) {
  install_aws_sdk()
}

load_sdk()
## NULL

Connect to Athena

Next, establish a connection to Athena from RStudio, using an IAM role associated with your EC2 instance. Use ATHENABUCKET to specify the S3 staging directory.

URL <- 'https://s3.amazonaws.com/athena-downloads/drivers/AthenaJDBC41-1.0.1.jar'
fil <- basename(URL)
#download the file into current working directory
if (!file.exists(fil)) download.file(URL, fil)
#verify that the file has been downloaded successfully
list.files()
## [1] "AthenaJDBC41-1.0.1.jar"
drv <- JDBC(driverClass="com.amazonaws.athena.jdbc.AthenaDriver", fil, identifier.quote="'")

con <- jdbcConnection <- dbConnect(drv, 'jdbc:awsathena://athena.us-east-1.amazonaws.com:443/',
                                   s3_staging_dir=Sys.getenv("ATHENABUCKET"),
                                   aws_credentials_provider_class="com.amazonaws.auth.DefaultAWSCredentialsProviderChain")

Verify the connection. The results returned depend on your specific Athena setup.

con
## <JDBCConnection>
dbListTables(con)
##  [1] "gdelt"               "wikistats"           "elb_logs_raw_native"
##  [4] "twitter"             "twitter2"            "usermovieratings"   
##  [7] "eventcodes"          "events"              "billboard"          
## [10] "billboardtop10"      "elb_logs"            "gdelthist"          
## [13] "gdeltmaster"         "twitter"             "twitter3"

Create a dataset

For this analysis, you use a sample dataset combining information from Billboard and Wikipedia with Echo Nest data in the Million Songs Dataset. Upload this dataset into your own S3 bucket. The table below provides a description of the fields used in this dataset.

year: Year that song was released
songtitle: Title of the song
artistname: Name of the song artist
songid: Unique identifier for the song
artistid: Unique identifier for the song artist
timesignature: Variable estimating the time signature of the song
timesignature_confidence: Confidence in the estimate for the timesignature
loudness: Continuous variable indicating the average amplitude of the audio in decibels
tempo: Variable indicating the estimated beats per minute of the song
tempo_confidence: Confidence in the estimate for tempo
key: Variable with twelve levels indicating the estimated key of the song (C, C#, ..., B)
key_confidence: Confidence in the estimate for key
energy: Variable that represents the overall acoustic energy of the song, using a mix of features such as loudness
pitch: Continuous variable that indicates the pitch of the song
timbre_0_min thru timbre_11_min: Variables that indicate the minimum values over all segments for each of the twelve values in the timbre vector
timbre_0_max thru timbre_11_max: Variables that indicate the maximum values over all segments for each of the twelve values in the timbre vector
top10: Indicator for whether or not the song made it to the Top 10 of the Billboard charts (1 if it was in the top 10, and 0 if not)

Create an Athena table based on the dataset

In the Athena console, select the default database, sampledb, or create a new database.

Run the following create table statement.

create external table if not exists billboard
(
year int,
songtitle string,
artistname string,
songID string,
artistID string,
timesignature int,
timesignature_confidence double,
loudness double,
tempo double,
tempo_confidence double,
key int,
key_confidence double,
energy double,
pitch double,
timbre_0_min double,
timbre_0_max double,
timbre_1_min double,
timbre_1_max double,
timbre_2_min double,
timbre_2_max double,
timbre_3_min double,
timbre_3_max double,
timbre_4_min double,
timbre_4_max double,
timbre_5_min double,
timbre_5_max double,
timbre_6_min double,
timbre_6_max double,
timbre_7_min double,
timbre_7_max double,
timbre_8_min double,
timbre_8_max double,
timbre_9_min double,
timbre_9_max double,
timbre_10_min double,
timbre_10_max double,
timbre_11_min double,
timbre_11_max double,
Top10 int
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://aws-bigdata-blog/artifacts/predict-billboard/data'
;

Inspect the table definition for the ‘billboard’ table that you have created. If you chose a database other than sampledb, replace that value with your choice.

dbGetQuery(con, "show create table sampledb.billboard")
##                                      createtab_stmt
## 1       CREATE EXTERNAL TABLE `sampledb.billboard`(
## 2                                       `year` int,
## 3                               `songtitle` string,
## 4                              `artistname` string,
## 5                                  `songid` string,
## 6                                `artistid` string,
## 7                              `timesignature` int,
## 8                `timesignature_confidence` double,
## 9                                `loudness` double,
## 10                                  `tempo` double,
## 11                       `tempo_confidence` double,
## 12                                       `key` int,
## 13                         `key_confidence` double,
## 14                                 `energy` double,
## 15                                  `pitch` double,
## 16                           `timbre_0_min` double,
## 17                           `timbre_0_max` double,
## 18                           `timbre_1_min` double,
## 19                           `timbre_1_max` double,
## 20                           `timbre_2_min` double,
## 21                           `timbre_2_max` double,
## 22                           `timbre_3_min` double,
## 23                           `timbre_3_max` double,
## 24                           `timbre_4_min` double,
## 25                           `timbre_4_max` double,
## 26                           `timbre_5_min` double,
## 27                           `timbre_5_max` double,
## 28                           `timbre_6_min` double,
## 29                           `timbre_6_max` double,
## 30                           `timbre_7_min` double,
## 31                           `timbre_7_max` double,
## 32                           `timbre_8_min` double,
## 33                           `timbre_8_max` double,
## 34                           `timbre_9_min` double,
## 35                           `timbre_9_max` double,
## 36                          `timbre_10_min` double,
## 37                          `timbre_10_max` double,
## 38                          `timbre_11_min` double,
## 39                          `timbre_11_max` double,
## 40                                     `top10` int)
## 41                             ROW FORMAT DELIMITED 
## 42                         FIELDS TERMINATED BY ',' 
## 43                            STORED AS INPUTFORMAT 
## 44       'org.apache.hadoop.mapred.TextInputFormat' 
## 45                                     OUTPUTFORMAT 
## 46  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
## 47                                        LOCATION
## 48    's3://aws-bigdata-blog/artifacts/predict-billboard/data'
## 49                                  TBLPROPERTIES (
## 50            'transient_lastDdlTime'='1505484133')

Run a sample query

Next, run a sample query to obtain a list of all songs from Janet Jackson that made it to the Billboard Top 10 charts.

dbGetQuery(con, " SELECT songtitle,artistname,top10   FROM sampledb.billboard WHERE lower(artistname) =     'janet jackson' AND top10 = 1")
##                       songtitle    artistname top10
## 1                       Runaway Janet Jackson     1
## 2               Because Of Love Janet Jackson     1
## 3                         Again Janet Jackson     1
## 4                            If Janet Jackson     1
## 5  Love Will Never Do (Without You) Janet Jackson 1
## 6                     Black Cat Janet Jackson     1
## 7               Come Back To Me Janet Jackson     1
## 8                       Alright Janet Jackson     1
## 9                      Escapade Janet Jackson     1
## 10                Rhythm Nation Janet Jackson     1

Determine how many songs in this dataset are specifically from the year 2010.

dbGetQuery(con, " SELECT count(*)   FROM sampledb.billboard WHERE year = 2010")
##   _col0
## 1   373

The sample dataset provides certain song properties of interest that can be analyzed to gauge the impact to the song’s overall popularity. Look at one such property, timesignature, and determine the value that is the most frequent among songs in the database. Time signature is a measure of the number of beats per bar and the type of note that carries the beat.

Running the query directly may result in an error, as shown in the commented lines below. This error is a result of trying to retrieve a large result set over a JDBC connection, which can cause out-of-memory issues at the client level. To address this, reduce the fetch size and run again.

#t<-dbGetQuery(con, " SELECT timesignature FROM sampledb.billboard")
#Note:  Running the preceding query results in the following error: 
#Error in .jcall(rp, "I", "fetch", stride, block): java.sql.SQLException: The requested #fetchSize is more than the allowed value in Athena. Please reduce the fetchSize and try #again. Refer to the Athena documentation for valid fetchSize values.
# Use the dbSendQuery function, reduce the fetch size, and run again
r <- dbSendQuery(con, " SELECT timesignature     FROM sampledb.billboard")
dftimesignature<- fetch(r, n=-1, block=100)
dbClearResult(r)
## [1] TRUE
table(dftimesignature)
## dftimesignature
##    0    1    3    4    5    7 
##   10  143  503 6787  112   19
nrow(dftimesignature)
## [1] 7574

From the results, observe that 6787 songs have a timesignature of 4.

Next, determine the song with the highest tempo.

dbGetQuery(con, " SELECT songtitle,artistname,tempo   FROM sampledb.billboard WHERE tempo = (SELECT max(tempo) FROM sampledb.billboard) ")
##                   songtitle      artistname   tempo
## 1 Wanna Be Startin' Somethin' Michael Jackson 244.307

Create the training dataset

Your model needs to be trained such that it can learn and make accurate predictions. Split the data into training and test datasets, and create the training dataset first. This dataset contains all observations from the year 2009 and earlier. You may face the same JDBC connection issue pointed out earlier, so this query also uses dbSendQuery with a reduced fetch size.

#BillboardTrain <- dbGetQuery(con, "SELECT * FROM sampledb.billboard WHERE year <= 2009")
#Running the preceding query results in the following error:-
#Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", : Unable to retrieve #JDBC result set for SELECT * FROM sampledb.billboard WHERE year <= 2009 (Internal error)
#Follow the same approach as before to address this issue.

r <- dbSendQuery(con, "SELECT * FROM sampledb.billboard WHERE year <= 2009")
BillboardTrain <- fetch(r, n=-1, block=100)
dbClearResult(r)
## [1] TRUE
BillboardTrain[1:2,c(1:3,6:10)]
##   year           songtitle artistname timesignature
## 1 2009 The Awkward Goodbye    Athlete             3
## 2 2009        Rubik's Cube    Athlete             3
##   timesignature_confidence loudness   tempo tempo_confidence
## 1                    0.732   -6.320  89.614   0.652
## 2                    0.906   -9.541 117.742   0.542
nrow(BillboardTrain)
## [1] 7201

Create the test dataset

BillboardTest <- dbGetQuery(con, "SELECT * FROM sampledb.billboard where year = 2010")
BillboardTest[1:2,c(1:3,11:15)]
##   year              songtitle        artistname key
## 1 2010 This Is the House That Doubt Built A Day to Remember  11
## 2 2010        Sticks & Bricks A Day to Remember  10
##   key_confidence    energy pitch timbre_0_min
## 1          0.453 0.9666556 0.024        0.002
## 2          0.469 0.9847095 0.025        0.000
nrow(BillboardTest)
## [1] 373

Convert the training and test datasets into H2O dataframes

train.h2o <- as.h2o(BillboardTrain)
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=================================================================| 100%
test.h2o <- as.h2o(BillboardTest)
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=================================================================| 100%

Inspect the column names in your H2O dataframes.

colnames(train.h2o)
##  [1] "year"                     "songtitle"               
##  [3] "artistname"               "songid"                  
##  [5] "artistid"                 "timesignature"           
##  [7] "timesignature_confidence" "loudness"                
##  [9] "tempo"                    "tempo_confidence"        
## [11] "key"                      "key_confidence"          
## [13] "energy"                   "pitch"                   
## [15] "timbre_0_min"             "timbre_0_max"            
## [17] "timbre_1_min"             "timbre_1_max"            
## [19] "timbre_2_min"             "timbre_2_max"            
## [21] "timbre_3_min"             "timbre_3_max"            
## [23] "timbre_4_min"             "timbre_4_max"            
## [25] "timbre_5_min"             "timbre_5_max"            
## [27] "timbre_6_min"             "timbre_6_max"            
## [29] "timbre_7_min"             "timbre_7_max"            
## [31] "timbre_8_min"             "timbre_8_max"            
## [33] "timbre_9_min"             "timbre_9_max"            
## [35] "timbre_10_min"            "timbre_10_max"           
## [37] "timbre_11_min"            "timbre_11_max"           
## [39] "top10"

Create models

You need to designate the independent and dependent variables prior to applying your modeling algorithms. Because you’re trying to predict the ‘top10’ field, this would be your dependent variable and everything else would be independent.

Create your first model using GLM. Because GLM works best with numeric data, you create your model by dropping non-numeric variables. You only use the variables in the dataset that describe the numerical attributes of the song in the logistic regression model. You won’t use these variables:  “year”, “songtitle”, “artistname”, “songid”, or “artistid”.

y.dep <- 39
x.indep <- c(6:38)
x.indep
##  [1]  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
## [24] 29 30 31 32 33 34 35 36 37 38

Create Model 1: All numeric variables

Create Model 1 with the training dataset, using GLM as the modeling algorithm and H2O’s built-in h2o.glm function.

modelh1 <- h2o.glm( y = y.dep, x = x.indep, training_frame = train.h2o, family = "binomial")
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=====                                                            |   8%
  |                                                                       
  |=================================================================| 100%

Measure the performance of Model 1, using H2O’s built-in performance function.

h2o.performance(model=modelh1,newdata=test.h2o)
## H2OBinomialMetrics: glm
## 
## MSE:  0.09924684
## RMSE:  0.3150347
## LogLoss:  0.3220267
## Mean Per-Class Error:  0.2380168
## AUC:  0.8431394
## Gini:  0.6862787
## R^2:  0.254663
## Null Deviance:  326.0801
## Residual Deviance:  240.2319
## AIC:  308.2319
## 
## Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
##          0   1    Error     Rate
## 0      255  59 0.187898  =59/314
## 1       17  42 0.288136   =17/59
## Totals 272 101 0.203753  =76/373
## 
## Maximum Metrics: Maximum metrics at their respective thresholds
##                         metric threshold    value idx
## 1                       max f1  0.192772 0.525000 100
## 2                       max f2  0.124912 0.650510 155
## 3                 max f0point5  0.416258 0.612903  23
## 4                 max accuracy  0.416258 0.879357  23
## 5                max precision  0.813396 1.000000   0
## 6                   max recall  0.037579 1.000000 282
## 7              max specificity  0.813396 1.000000   0
## 8             max absolute_mcc  0.416258 0.455251  23
## 9   max min_per_class_accuracy  0.161402 0.738854 125
## 10 max mean_per_class_accuracy  0.124912 0.765006 155
## 
## Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or ` 
h2o.auc(h2o.performance(modelh1,test.h2o)) 
## [1] 0.8431394

The AUC metric provides insight into how well the classifier is able to separate the two classes. In this case, the value of 0.8431394 indicates that the classification is good. (A value of 0.5 indicates a worthless test, while a value of 1.0 indicates a perfect test.)
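
As a sanity check, you can recompute AUC directly from the predicted probabilities: it equals the probability that a randomly chosen actual Top 10 hit is scored higher than a randomly chosen non-hit, with ties counted as half. Here is a minimal sketch reusing modelh1 and the test data from above; the p1 column name follows H2O’s standard binomial prediction output.

# Score the test set and extract the Top 10 probability column
pred1  <- as.data.frame(h2o.predict(modelh1, test.h2o))
actual <- BillboardTest$top10

# Rank-based AUC: compare every (hit, non-hit) pair of predicted scores
pos <- pred1$p1[actual == 1]
neg <- pred1$p1[actual == 0]
auc_manual <- mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))
auc_manual  # should agree with h2o.auc, approximately 0.8431394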

Next, inspect the coefficients of the variables in the dataset.

dfmodelh1 <- as.data.frame(h2o.varimp(modelh1))
dfmodelh1
##                       names coefficients sign
## 1              timbre_0_max  1.290938663  NEG
## 2                  loudness  1.262941934  POS
## 3                     pitch  0.616995941  NEG
## 4              timbre_1_min  0.422323735  POS
## 5              timbre_6_min  0.349016024  NEG
## 6                    energy  0.348092062  NEG
## 7             timbre_11_min  0.307331997  NEG
## 8              timbre_3_max  0.302225619  NEG
## 9             timbre_11_max  0.243632060  POS
## 10             timbre_4_min  0.224233951  POS
## 11             timbre_4_max  0.204134342  POS
## 12             timbre_5_min  0.199149324  NEG
## 13             timbre_0_min  0.195147119  POS
## 14 timesignature_confidence  0.179973904  POS
## 15         tempo_confidence  0.144242598  POS
## 16            timbre_10_max  0.137644568  POS
## 17             timbre_7_min  0.126995955  NEG
## 18            timbre_10_min  0.123851179  POS
## 19             timbre_7_max  0.100031481  NEG
## 20             timbre_2_min  0.096127636  NEG
## 21           key_confidence  0.083115820  POS
## 22             timbre_6_max  0.073712419  POS
## 23            timesignature  0.067241917  POS
## 24             timbre_8_min  0.061301881  POS
## 25             timbre_8_max  0.060041698  POS
## 26                      key  0.056158445  POS
## 27             timbre_3_min  0.050825116  POS
## 28             timbre_9_max  0.033733561  POS
## 29             timbre_2_max  0.030939072  POS
## 30             timbre_9_min  0.020708113  POS
## 31             timbre_1_max  0.014228818  NEG
## 32                    tempo  0.008199861  POS
## 33             timbre_5_max  0.004837870  POS
## 34                                    NA <NA>

Typically, songs with heavier instrumentation tend to be louder (have higher values in the variable “loudness”) and more energetic (have higher values in the variable “energy”). This knowledge is helpful for interpreting the modeling results.

You can make the following observations from the results:

  • The coefficient estimates for the confidence values associated with the time signature, key, and tempo variables are positive. This suggests that higher confidence leads to a higher predicted probability of a Top 10 hit.
  • The coefficient estimate for loudness is positive, meaning that mainstream listeners prefer louder songs with heavier instrumentation.
  • The coefficient estimate for energy is negative, meaning that mainstream listeners prefer songs that are less energetic, which are those songs with light instrumentation.

These coefficients lead to contradictory conclusions for Model 1. This could be due to multicollinearity issues. Inspect the correlation between the variables “loudness” and “energy” in the training set.

cor(train.h2o$loudness,train.h2o$energy)
## [1] 0.7399067

This number indicates that these two variables are highly correlated, and Model 1 does indeed suffer from multicollinearity. Typically, a correlation value of -1.0 to -0.5 or 0.5 to 1.0 indicates strong correlation, and a value of -0.1 to 0.1 indicates weak correlation. To avoid this correlation issue, omit one of these two variables and re-create the models.

You build two variations of the original model:

  • Model 2, in which you keep “energy” and omit “loudness”
  • Model 3, in which you keep “loudness” and omit “energy”

You compare these two models and choose the model with a better fit for this use case.

Create Model 2: Keep energy and omit loudness

colnames(train.h2o)
##  [1] "year"                     "songtitle"               
##  [3] "artistname"               "songid"                  
##  [5] "artistid"                 "timesignature"           
##  [7] "timesignature_confidence" "loudness"                
##  [9] "tempo"                    "tempo_confidence"        
## [11] "key"                      "key_confidence"          
## [13] "energy"                   "pitch"                   
## [15] "timbre_0_min"             "timbre_0_max"            
## [17] "timbre_1_min"             "timbre_1_max"            
## [19] "timbre_2_min"             "timbre_2_max"            
## [21] "timbre_3_min"             "timbre_3_max"            
## [23] "timbre_4_min"             "timbre_4_max"            
## [25] "timbre_5_min"             "timbre_5_max"            
## [27] "timbre_6_min"             "timbre_6_max"            
## [29] "timbre_7_min"             "timbre_7_max"            
## [31] "timbre_8_min"             "timbre_8_max"            
## [33] "timbre_9_min"             "timbre_9_max"            
## [35] "timbre_10_min"            "timbre_10_max"           
## [37] "timbre_11_min"            "timbre_11_max"           
## [39] "top10"
y.dep <- 39
x.indep <- c(6:7,9:38)
x.indep
##  [1]  6  7  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
## [24] 30 31 32 33 34 35 36 37 38
modelh2 <- h2o.glm( y = y.dep, x = x.indep, training_frame = train.h2o, family = "binomial")
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=======                                                          |  10%
  |                                                                       
  |=================================================================| 100%

Measure the performance of Model 2.

h2o.performance(model=modelh2,newdata=test.h2o)
## H2OBinomialMetrics: glm
## 
## MSE:  0.09922606
## RMSE:  0.3150017
## LogLoss:  0.3228213
## Mean Per-Class Error:  0.2490554
## AUC:  0.8431933
## Gini:  0.6863867
## R^2:  0.2548191
## Null Deviance:  326.0801
## Residual Deviance:  240.8247
## AIC:  306.8247
## 
## Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
##          0  1    Error     Rate
## 0      280 34 0.108280  =34/314
## 1       23 36 0.389831   =23/59
## Totals 303 70 0.152815  =57/373
## 
## Maximum Metrics: Maximum metrics at their respective thresholds
##                         metric threshold    value idx
## 1                       max f1  0.254391 0.558140  69
## 2                       max f2  0.113031 0.647208 157
## 3                 max f0point5  0.413999 0.596026  22
## 4                 max accuracy  0.446250 0.876676  18
## 5                max precision  0.811739 1.000000   0
## 6                   max recall  0.037682 1.000000 283
## 7              max specificity  0.811739 1.000000   0
## 8             max absolute_mcc  0.254391 0.469060  69
## 9   max min_per_class_accuracy  0.141051 0.716561 131
## 10 max mean_per_class_accuracy  0.113031 0.761821 157
## 
## Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
dfmodelh2 <- as.data.frame(h2o.varimp(modelh2))
dfmodelh2
##                       names coefficients sign
## 1                     pitch  0.700331511  NEG
## 2              timbre_1_min  0.510270513  POS
## 3              timbre_0_max  0.402059546  NEG
## 4              timbre_6_min  0.333316236  NEG
## 5             timbre_11_min  0.331647383  NEG
## 6              timbre_3_max  0.252425901  NEG
## 7             timbre_11_max  0.227500308  POS
## 8              timbre_4_max  0.210663865  POS
## 9              timbre_0_min  0.208516163  POS
## 10             timbre_5_min  0.202748055  NEG
## 11             timbre_4_min  0.197246582  POS
## 12            timbre_10_max  0.172729619  POS
## 13         tempo_confidence  0.167523934  POS
## 14 timesignature_confidence  0.167398830  POS
## 15             timbre_7_min  0.142450727  NEG
## 16             timbre_8_max  0.093377516  POS
## 17            timbre_10_min  0.090333426  POS
## 18            timesignature  0.085851625  POS
## 19             timbre_7_max  0.083948442  NEG
## 20           key_confidence  0.079657073  POS
## 21             timbre_6_max  0.076426046  POS
## 22             timbre_2_min  0.071957831  NEG
## 23             timbre_9_max  0.071393189  POS
## 24             timbre_8_min  0.070225578  POS
## 25                      key  0.061394702  POS
## 26             timbre_3_min  0.048384697  POS
## 27             timbre_1_max  0.044721121  NEG
## 28                   energy  0.039698433  POS
## 29             timbre_5_max  0.039469064  POS
## 30             timbre_2_max  0.018461133  POS
## 31                    tempo  0.013279926  POS
## 32             timbre_9_min  0.005282143  NEG
## 33                                    NA <NA>

h2o.auc(h2o.performance(modelh2,test.h2o)) 
## [1] 0.8431933

You can make the following observations:

  • The AUC metric is 0.8431933.
  • Inspecting the coefficient of the variable energy, Model 2 suggests that songs with high energy levels tend to be more popular. This is as per expectation.
  • Because H2O orders variables by importance, and energy appears near the bottom of the list, the variable energy is not significant in this model.

You can conclude that Model 2 is not ideal for this use case, as energy is not significant.

Create Model 3: Keep loudness but omit energy

colnames(train.h2o)
##  [1] "year"                     "songtitle"               
##  [3] "artistname"               "songid"                  
##  [5] "artistid"                 "timesignature"           
##  [7] "timesignature_confidence" "loudness"                
##  [9] "tempo"                    "tempo_confidence"        
## [11] "key"                      "key_confidence"          
## [13] "energy"                   "pitch"                   
## [15] "timbre_0_min"             "timbre_0_max"            
## [17] "timbre_1_min"             "timbre_1_max"            
## [19] "timbre_2_min"             "timbre_2_max"            
## [21] "timbre_3_min"             "timbre_3_max"            
## [23] "timbre_4_min"             "timbre_4_max"            
## [25] "timbre_5_min"             "timbre_5_max"            
## [27] "timbre_6_min"             "timbre_6_max"            
## [29] "timbre_7_min"             "timbre_7_max"            
## [31] "timbre_8_min"             "timbre_8_max"            
## [33] "timbre_9_min"             "timbre_9_max"            
## [35] "timbre_10_min"            "timbre_10_max"           
## [37] "timbre_11_min"            "timbre_11_max"           
## [39] "top10"
y.dep <- 39
x.indep <- c(6:12,14:38)
x.indep
##  [1]  6  7  8  9 10 11 12 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
## [24] 30 31 32 33 34 35 36 37 38
modelh3 <- h2o.glm( y = y.dep, x = x.indep, training_frame = train.h2o, family = "binomial")
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |========                                                         |  12%
  |                                                                       
  |=================================================================| 100%
perfh3<-h2o.performance(model=modelh3,newdata=test.h2o)
perfh3
## H2OBinomialMetrics: glm
## 
## MSE:  0.0978859
## RMSE:  0.3128672
## LogLoss:  0.3178367
## Mean Per-Class Error:  0.264925
## AUC:  0.8492389
## Gini:  0.6984778
## R^2:  0.2648836
## Null Deviance:  326.0801
## Residual Deviance:  237.1062
## AIC:  303.1062
## 
## Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
##          0  1    Error     Rate
## 0      286 28 0.089172  =28/314
## 1       26 33 0.440678   =26/59
## Totals 312 61 0.144772  =54/373
## 
## Maximum Metrics: Maximum metrics at their respective thresholds
##                         metric threshold    value idx
## 1                       max f1  0.273799 0.550000  60
## 2                       max f2  0.125503 0.663265 155
## 3                 max f0point5  0.435479 0.628931  24
## 4                 max accuracy  0.435479 0.882038  24
## 5                max precision  0.821606 1.000000   0
## 6                   max recall  0.038328 1.000000 280
## 7              max specificity  0.821606 1.000000   0
## 8             max absolute_mcc  0.435479 0.471426  24
## 9   max min_per_class_accuracy  0.173693 0.745763 120
## 10 max mean_per_class_accuracy  0.125503 0.775073 155
## 
## Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
dfmodelh3 <- as.data.frame(h2o.varimp(modelh3))
dfmodelh3
##                       names coefficients sign
## 1              timbre_0_max 1.216621e+00  NEG
## 2                  loudness 9.780973e-01  POS
## 3                     pitch 7.249788e-01  NEG
## 4              timbre_1_min 3.891197e-01  POS
## 5              timbre_6_min 3.689193e-01  NEG
## 6             timbre_11_min 3.086673e-01  NEG
## 7              timbre_3_max 3.025593e-01  NEG
## 8             timbre_11_max 2.459081e-01  POS
## 9              timbre_4_min 2.379749e-01  POS
## 10             timbre_4_max 2.157627e-01  POS
## 11             timbre_0_min 1.859531e-01  POS
## 12             timbre_5_min 1.846128e-01  NEG
## 13 timesignature_confidence 1.729658e-01  POS
## 14             timbre_7_min 1.431871e-01  NEG
## 15            timbre_10_max 1.366703e-01  POS
## 16            timbre_10_min 1.215954e-01  POS
## 17         tempo_confidence 1.183698e-01  POS
## 18             timbre_2_min 1.019149e-01  NEG
## 19           key_confidence 9.109701e-02  POS
## 20             timbre_7_max 8.987908e-02  NEG
## 21             timbre_6_max 6.935132e-02  POS
## 22             timbre_8_max 6.878241e-02  POS
## 23            timesignature 6.120105e-02  POS
## 24                      key 5.814805e-02  POS
## 25             timbre_8_min 5.759228e-02  POS
## 26             timbre_1_max 2.930285e-02  NEG
## 27             timbre_9_max 2.843755e-02  POS
## 28             timbre_3_min 2.380245e-02  POS
## 29             timbre_2_max 1.917035e-02  POS
## 30             timbre_5_max 1.715813e-02  POS
## 31                    tempo 1.364418e-02  NEG
## 32             timbre_9_min 8.463143e-05  NEG
## 33                                    NA <NA>
h2o.sensitivity(perfh3,0.5)
## Warning in h2o.find_row_by_threshold(object, t): Could not find exact
## threshold: 0.5 for this set of metrics; using closest threshold found:
## 0.501855569251422. Run `h2o.predict` and apply your desired threshold on a
## probability column.
## [[1]]
## [1] 0.2033898
h2o.auc(perfh3)
## [1] 0.8492389

You can make the following observations:

  • The AUC metric is 0.8492389.
  • From the confusion matrix, the model correctly predicts that 33 songs will be Top 10 hits (true positives). However, it also has 28 false positives (songs that the model predicted would be Top 10 hits, but ended up not being Top 10 hits) and misses 26 actual hits (false negatives).
  • Loudness has a positive coefficient estimate, meaning that this model predicts that songs with heavier instrumentation tend to be more popular. This is the same conclusion from Model 2.
  • Loudness is significant in this model.

Overall, Model 3 predicts a higher number of top 10 hits with an accuracy rate that is acceptable. To choose the best fit for production runs, record labels should consider the following factors:

  • Desired model accuracy at a given threshold
  • Number of correct predictions for top10 hits
  • Tolerable number of false positives or false negatives (the sketch after this list shows how to inspect these at a chosen threshold)
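
One rough way to weigh these factors is to inspect the confusion matrix at thresholds other than the F1-optimal one that H2O prints by default. Here is a minimal sketch using the perfh3 metrics object from earlier; the 0.3 cutoff is an arbitrary example, and this assumes the thresholds argument of h2o.confusionMatrix available in recent h2o releases.

# Confusion matrix at a manually chosen probability threshold
h2o.confusionMatrix(perfh3, thresholds = 0.3)

Lowering the threshold flags more songs as hits (fewer false negatives, more false positives); raising it does the opposite. A label that mostly fears wasted investment would push the threshold up, while one that fears missing a hit would pull it down.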

Next, make predictions using Model 3 on the test dataset.

predict.regh <- h2o.predict(modelh3, test.h2o)
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=================================================================| 100%
print(predict.regh)
##   predict        p0          p1
## 1       0 0.9654739 0.034526052
## 2       0 0.9654748 0.034525236
## 3       0 0.9635547 0.036445318
## 4       0 0.9343579 0.065642149
## 5       0 0.9978334 0.002166601
## 6       0 0.9779949 0.022005078
## 
## [373 rows x 3 columns]
predict.regh$predict
##   predict
## 1       0
## 2       0
## 3       0
## 4       0
## 5       0
## 6       0
## 
## [373 rows x 1 column]
dpr<-as.data.frame(predict.regh)
#Rename the predicted column 
colnames(dpr)[colnames(dpr) == 'predict'] <- 'predict_top10'
table(dpr$predict_top10)
## 
##   0   1 
## 312  61

The first set of output results specifies the probabilities associated with each predicted observation. For example, observation 1 is 96.54739% likely not to be a Top 10 hit, and 3.4526052% likely to be one (predict=1 indicates a Top 10 hit and predict=0 indicates not a Top 10 hit). The second set of results lists the actual predictions made. From the third set of results, this model predicts that 61 songs will be Top 10 hits.
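
If a different operating point is preferred over H2O’s F1-optimal threshold, you can apply your own cutoff to the p1 probability column, which is also what the earlier h2o.sensitivity warning suggests. A minimal sketch, with the 0.5 cutoff as an arbitrary example:

# Shortlist songs by applying a custom cutoff to the predicted probability
dpr$predict_strict <- ifelse(dpr$p1 >= 0.5, 1, 0)
table(dpr$predict_strict)

A stricter cutoff yields a shorter, higher-confidence list of candidate hits to hand to the production company.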

Compute the baseline accuracy, by assuming that the baseline predicts the most frequent outcome, which is that most songs are not Top 10 hits.

table(BillboardTest$top10)
## 
##   0   1 
## 314  59

Now observe that the baseline model would get 314 observations correct, and 59 wrong, for an accuracy of 314/(314+59) = 0.8418231.

It seems that Model 3, with an accuracy of 0.8552 (that is, 1 - 0.144772, one minus the error rate shown in its confusion matrix), provides you with a small improvement over the baseline model. But is this model useful for record labels?

View the two models from an investment perspective:

  • A production company is interested in investing in songs that are more likely to make it to the Top 10. The company’s objective is to minimize the risk of financial losses attributed to investing in songs that end up unpopular.
  • How many songs does Model 3 correctly predict as a Top 10 hit in 2010? Looking at the confusion matrix, you see that it predicts 33 Top 10 hits correctly at the optimal threshold, which is more than half of the 59 songs that actually made the Top 10 that year.
  • It will be more useful to the record label if you can provide the production company with a list of songs that are highly likely to end up in the Top 10.
  • The baseline model is not useful, as it simply does not label any song as a hit.

Considering the three models built so far, you can conclude that Model 3 proves to be the best investment choice for the record label.

GBM model

H2O provides you with the ability to explore other learning models, such as GBM and deep learning. Now build a model using the GBM technique, with the built-in h2o.gbm function.

Before you do this, you need to convert the target variable to a factor for multinomial classification techniques.

train.h2o$top10 <- as.factor(train.h2o$top10)
gbm.modelh <- h2o.gbm(y = y.dep, x = x.indep,
                      training_frame = train.h2o,
                      ntrees = 500, max_depth = 4,
                      learn_rate = 0.01, seed = 1122,
                      distribution = "multinomial")
## 
  |                                                                       
  |=================================================================| 100%
perf.gbmh<-h2o.performance(gbm.modelh,test.h2o)
perf.gbmh
## H2OBinomialMetrics: gbm
## 
## MSE:  0.09860778
## RMSE:  0.3140188
## LogLoss:  0.3206876
## Mean Per-Class Error:  0.2120263
## AUC:  0.8630573
## Gini:  0.7261146
## 
## Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
##          0  1    Error     Rate
## 0      266 48 0.152866  =48/314
## 1       16 43 0.271186   =16/59
## Totals 282 91 0.171582  =64/373
## 
## Maximum Metrics: Maximum metrics at their respective thresholds
##                       metric threshold    value idx
## 1                     max f1  0.189757 0.573333  90
## 2                     max f2  0.130895 0.693717 145
## 3               max f0point5  0.327346 0.598802  26
## 4               max accuracy  0.442757 0.876676  14
## 5              max precision  0.802184 1.000000   0
## 6                 max recall  0.049990 1.000000 284
## 7            max specificity  0.802184 1.000000   0
## 8           max absolute_mcc  0.169135 0.496486 104
## 9 max min_per_class_accuracy  0.169135 0.796610 104
## 10 max mean_per_class_accuracy  0.169135 0.805948 104
## 
## Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
h2o.sensitivity(perf.gbmh,0.5)
## Warning in h2o.find_row_by_threshold(object, t): Could not find exact
## threshold: 0.5 for this set of metrics; using closest threshold found:
## 0.501205344484314. Run `h2o.predict` and apply your desired threshold on a
## probability column.
## [[1]]
## [1] 0.1355932
h2o.auc(perf.gbmh)
## [1] 0.8630573

This model correctly predicts 43 Top 10 hits, which is 10 more than Model 3 predicted. Moreover, the AUC metric is higher than the one obtained from Model 3.

As seen above, H2O’s API provides the ability to obtain key statistical measures required to analyze the models easily, using several built-in functions. The record label can experiment with different parameters to arrive at the model that predicts the maximum number of Top 10 hits at the desired level of accuracy and threshold.
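
One way to run such experiments systematically is a grid search. The following is a minimal sketch using H2O's built-in h2o.grid function; the grid values below are illustrative starting points, not tuned recommendations:

# Train one GBM per combination of the hyperparameter values below
gbm.grid <- h2o.grid("gbm",
                     x = x.indep, y = y.dep,
                     training_frame = train.h2o,
                     seed = 1122,
                     hyper_params = list(ntrees = c(250, 500),
                                         max_depth = c(3, 4, 5),
                                         learn_rate = c(0.01, 0.05)))
# Rank the trained models by AUC to surface the strongest candidate
h2o.getGrid(gbm.grid@grid_id, sort_by = "auc", decreasing = TRUE)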

H2O also allows you to experiment with deep learning models. Deep learning models can learn features implicitly, but they can be more computationally expensive.

Now, create a deep learning model with the h2o.deeplearning function, using the same training and test datasets created before. The time taken to run this model depends on the type of EC2 instance chosen for this purpose.  For models that require more computation, consider using accelerated computing instances such as the P2 instance type.

system.time(
  dlearning.modelh <- h2o.deeplearning(y = y.dep,
                                       x = x.indep,
                                       training_frame = train.h2o,
                                       epochs = 250,
                                       hidden = c(250, 250),
                                       activation = "Rectifier",
                                       seed = 1122,
                                       distribution = "multinomial")
)
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=================================================================| 100%
##    user  system elapsed 
##   1.216   0.020 166.508
perf.dl<-h2o.performance(model=dlearning.modelh,newdata=test.h2o)
perf.dl
## H2OBinomialMetrics: deeplearning
## 
## MSE:  0.1678359
## RMSE:  0.4096778
## LogLoss:  1.86509
## Mean Per-Class Error:  0.3433013
## AUC:  0.7568822
## Gini:  0.5137644
## 
## Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
##          0  1    Error     Rate
## 0      290 24 0.076433  =24/314
## 1       36 23 0.610169   =36/59
## Totals 326 47 0.160858  =60/373
## 
## Maximum Metrics: Maximum metrics at their respective thresholds
##                       metric threshold    value idx
## 1                     max f1  0.826267 0.433962  46
## 2                     max f2  0.000000 0.588235 239
## 3               max f0point5  0.999929 0.511811  16
## 4               max accuracy  0.999999 0.865952  10
## 5              max precision  1.000000 1.000000   0
## 6                 max recall  0.000000 1.000000 326
## 7            max specificity  1.000000 1.000000   0
## 8           max absolute_mcc  0.999929 0.363219  16
## 9 max min_per_class_accuracy  0.000004 0.662420 145
## 10 max mean_per_class_accuracy  0.000000 0.685334 224
## 
## Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
h2o.sensitivity(perf.dl,0.5)
## Warning in h2o.find_row_by_threshold(object, t): Could not find exact
## threshold: 0.5 for this set of metrics; using closest threshold found:
## 0.496293348880151. Run `h2o.predict` and apply your desired threshold on a
## probability column.
## [[1]]
## [1] 0.3898305
h2o.auc(perf.dl)
## [1] 0.7568822

The AUC metric for this model is 0.7568822, which is lower than what you got from the earlier models. I recommend further experimentation with different hyperparameters, such as the learning rate, the number of epochs, or the number and size of hidden layers.
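
For example, here is an illustrative variation that uses more, smaller hidden layers with dropout and fewer epochs; the values are assumptions to experiment with, not tuned recommendations:

# A deeper but narrower network with dropout; compare its AUC to the model above
dlearning.modelh2 <- h2o.deeplearning(y = y.dep, x = x.indep,
                                      training_frame = train.h2o,
                                      epochs = 100,
                                      hidden = c(100, 100, 100),
                                      activation = "RectifierWithDropout",
                                      seed = 1122,
                                      distribution = "multinomial")
h2o.auc(h2o.performance(dlearning.modelh2, test.h2o))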

H2O’s built-in functions provide many key statistical measures that can help measure model performance. Here are some of these key terms.

  • Sensitivity: Measures the proportion of positives that are correctly identified. It is also called the true positive rate, or recall.
  • Specificity: Measures the proportion of negatives that are correctly identified. It is also called the true negative rate.
  • Threshold: The cutoff point that balances specificity and sensitivity. While the model may not produce its highest accuracy at this point, it is not biased towards positives or negatives.
  • Precision: The fraction of retrieved instances that are relevant to the information needed; for example, how many of the observations classified as positive are actually positive.
  • AUC: Provides insight into how well the classifier separates the two classes. It is especially useful when the sample distribution is highly skewed and a model tends to overfit to a single class. A common grading scale:

0.9 – 1.0 = excellent (A)
0.8 – 0.9 = good (B)
0.7 – 0.8 = fair (C)
0.6 – 0.7 = poor (D)
0.5 – 0.6 = fail (F)
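
Most of these measures can be pulled directly from a performance object. Here is a minimal sketch, assuming the h2o package's other metric accessors (h2o.specificity, h2o.precision, h2o.accuracy) follow the same pattern as the h2o.sensitivity call used earlier:

# Metrics at an illustrative 0.5 threshold, using the GBM performance object
h2o.sensitivity(perf.gbmh, 0.5)
h2o.specificity(perf.gbmh, 0.5)
h2o.precision(perf.gbmh, 0.5)
h2o.accuracy(perf.gbmh, 0.5)
h2o.auc(perf.gbmh)   # AUC is threshold-independent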

Here’s a summary of the metrics generated from H2O’s built-in functions for the three models that produced useful results.

Metric                Model 3                  GBM Model                Deep Learning Model
Accuracy (max)        0.882038 (t=0.435479)    0.876676 (t=0.442757)    0.865952 (t=0.999999)
Precision (max)       1.0 (t=0.821606)         1.0 (t=0.802184)         1.0 (t=1.0)
Recall (max)          1.0                      1.0                      1.0 (t=0)
Specificity (max)     1.0                      1.0                      1.0 (t=1)
Sensitivity (t=0.5)   0.2033898                0.1355932                0.3898305
AUC                   0.8492389                0.8630573                0.7568822

Note: ‘t’ denotes threshold.

Your options at this point could be narrowed down to Model 3 and the GBM model, based on the AUC and accuracy metrics observed earlier.  If the slightly lower accuracy of the GBM model is deemed acceptable, the record label can choose to go to production with the GBM model, as it can predict a higher number of Top 10 hits.  The AUC metric for the GBM model is also higher than that of Model 3.

Record labels can experiment with different learning techniques and parameters before arriving at a model that proves to be the best fit for their business. Because deep learning models can be computationally expensive, record labels can choose more powerful EC2 instances on AWS to run their experiments faster.

Conclusion

In this post, I showed how the popular music industry can use analytics to predict the type of songs that make the Top 10 Billboard charts. By running H2O’s scalable machine learning platform on AWS, data scientists can easily experiment with multiple modeling techniques and interactively query the data using Amazon Athena, without having to manage the underlying infrastructure. This helps record labels make critical decisions on the type of artists and songs to promote in a timely fashion, thereby increasing sales and revenue.

If you have questions or suggestions, please comment below.


Additional Reading

Learn how to build and explore a simple geospatial GEOINT application using SparkR.


About the Authors

Gopal Wunnava is a Partner Solution Architect with the AWS GSI Team. He works with partners and customers on big data engagements, and is passionate about building analytical solutions that drive business capabilities and decision making. In his spare time, he loves all things sports- and movies-related and is fond of old classics like Asterix and Obelix comics and Hitchcock movies.

Bob Strahan, a Senior Consultant with AWS Professional Services, contributed to this post.

[$] Steps toward a privacy-preserving phone

Post Syndicated from jake original https://lwn.net/Articles/735597/rss

What kind of cell phone would emerge from a concerted effort to design privacy in from the beginning, using free software as much as possible? Some answers are provided by a crowdfunding campaign launched in August by Purism SPC, which has used two such campaigns successfully in the past to build a business around secure laptops. The Librem 5, with a five-inch screen and radio chip for communicating with cell phone companies, represents Purism’s hope to bring the same privacy-enhancing vision to the mobile space, which is much more demanding in its threats, technology components, and user experience.