Tag Archives: Core

Bitcoin, UASF… and Politics

Post Syndicated from Григор original http://www.gatchev.info/blog/?p=2064

Lately the Net has been abuzz about UASF in Bitcoin. Hardly many people have paid attention to these acronyms, though. (Articles on the subject are usually a salad of other acronyms themselves, which doesn't make them any easier to understand.) What the hell does it mean? And is it important?

Actually, it isn't particularly important, except to people who are seriously involved with cryptocurrencies. Everyone else can safely ignore it.

At least at first glance. Because it also offers real insight into how effective some fundamental political concepts are. So I intend to devote some of my time to it here – and to waste some of yours.

1. Bitcoin's problems

An electronic currency controlled not by politicians and wheeler-dealers but by strict rules – a dream, isn't it? An end to the fear that the next populist will fire up the money printer and turn your savings into colorful toilet paper… But there are no ideas without problems (to say nothing of their implementations). So it is with Bitcoin.

All Bitcoin transactions are recorded in blocks that form a chain – the so-called blockchain. In this way every penny (pardon, satoshi 🙂 ) can be traced back to its very creation. The addresses between which the money moves are anonymous, but the exchanges themselves are public and overt. They can be traced and checked for validity by anyone who has the necessary software (freely available) and maintains a "full node", that is, who is willing to set aside a hundred or so gigabytes of disk space.

The problem is that the Bitcoin block has a fixed maximum size of 1 megabyte. It holds at most two to three thousand transactions. At 6 blocks per hour, that works out to about 15,000 transactions per hour, or about 360,000 per day. It sounds like a lot, but it is utterly insufficient – plenty of large banks process more transactions per second. So, for some time now the demand for transactions has exceeded the blockchain's capacity. That creates a problem for the currency's users. Some of them are starting to abandon it for traditional currencies or for other cryptocurrencies. Its influence and role are shrinking accordingly.
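The arithmetic above can be checked in a few lines. (The figures are rough: the ~400-byte average transaction size is an assumption for illustration, not a protocol constant.)

```python
# Back-of-the-envelope capacity of the 1 MB Bitcoin block size limit.
BLOCK_SIZE_BYTES = 1_000_000   # protocol maximum (pre-SegWit)
AVG_TX_SIZE_BYTES = 400        # assumed average transaction size
BLOCKS_PER_HOUR = 6            # one block every ~10 minutes

tx_per_block = BLOCK_SIZE_BYTES // AVG_TX_SIZE_BYTES
tx_per_hour = tx_per_block * BLOCKS_PER_HOUR
tx_per_day = tx_per_hour * 24
tx_per_second = tx_per_day / 86_400

print(tx_per_block)             # 2500 transactions per block
print(tx_per_hour)              # 15000 per hour
print(tx_per_day)               # 360000 per day
print(round(tx_per_second, 1))  # 4.2 per second
```

A few transactions per second, network-wide, is the whole capacity – which is why fee pressure appears the moment demand exceeds it.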

2. The state of the solutions

Quite a few solutions to this problem have been proposed. The latest is called SegWit (segregated witness). All of them, however (and this one in particular), meet serious resistance from key players in Bitcoin.

Relatively soon after Bitcoin's creation, a rule was introduced that transactions must pay a fee. (Otherwise it was all too easy to clog the blockchain by generating a huge number of transactions shuffling a tiny sum back and forth.) Each transaction states how much it will pay to be included in a block. (That is what "legitimizes" it.)

Which of the waiting transactions get into a block is decided by whoever creates that block: the "miner" who solved the puzzle set by the previous block. Besides the standard block "reward", the miner collects the fees of the transactions included. Miners therefore profit when transactions are as expensive as possible – that is, when the blockchain's capacity is insufficient.
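The incentive described above is easy to see in a toy model: a fee-maximizing miner sorts the waiting transactions by fee rate and fills the block until it hits the size cap. (A simplified sketch with made-up numbers; real nodes also account for transaction dependencies.)

```python
# Toy mempool entries: (txid, size_bytes, fee_satoshi).
MAX_BLOCK = 1_000_000

def select_transactions(mempool, max_block=MAX_BLOCK):
    """Greedily fill a block, highest fee-per-byte first."""
    chosen, used = [], 0
    for txid, size, fee in sorted(mempool, key=lambda t: t[2] / t[1], reverse=True):
        if used + size <= max_block:
            chosen.append(txid)
            used += size
    return chosen

mempool = [("a", 250, 50_000), ("b", 500, 40_000), ("c", 400, 10_000)]
# With an 800-byte cap, the low-fee-rate transaction "c" is left waiting.
print(select_transactions(mempool, max_block=800))  # ['a', 'b']
```

When capacity is scarce, the transactions that lose this auction must either raise their fees or wait – exactly the dynamic that makes scarcity profitable for miners.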

On top of that, quite a few miners exploit a "hack" in the system's technology – the so-called ASICBOOST. One of SegWit's advantages is that it blocks such hacks – and thus those "miners". (You can find details here.)

The result is that some miners resist the introduction of SegWit. And it is "mining power" that serves as the "democratic vote" in the Bitcoin system. An attempt to introduce SegWit has already been made, and it failed. For the sake of a stronger consensus, that attempt required SegWit to be adopted once 95% of the mining power supported it. It soon became clear that this was not going to happen.

3. UASF? WTF? (That is: what the heck is UASF?)

I don't know the exact percentage of miners who reject SegWit. But at the moment mining is centralized to the point where almost all of it is done by a small number of powerful companies. It is entirely possible that SegWit's opponents hold over 50% of the mining power. If so, introducing SegWit on the strength of that vote would be impossible. (Of course, that would mean Bitcoin's decline in the near future and its transformation from "king of the cryptocurrencies" into a cheap museum exhibit. In the end those miners will have dug their own grave. But if there is one thing in this world you can count on always and to the bitter end, it is human stupidity.)

To avoid that scenario, the developers of the Bitcoin Core Team proposed the so-called User-Activated Soft Fork, UASF for short. Its essence is that from August 1 onward, the nodes in the Bitcoin network that support SegWit will start treating blocks that do not signal support for it as invalid.

Miners who reject SegWit can keep mining the old way; those who support it will continue the new way. From that moment on, the Bitcoin blockchain will accordingly split in two – a branch without SegWit and a branch with it.
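The mechanics of the split can be modeled as a single extra validity check: UASF nodes reject blocks that do not signal SegWit support, so the two groups of nodes begin accepting different chains. (A toy model with hypothetical field names, not the actual Bitcoin consensus code.)

```python
from dataclasses import dataclass

@dataclass
class Block:
    height: int
    signals_segwit: bool  # whether the block signals SegWit support

def is_valid(block, node_enforces_uasf, uasf_active=True):
    """Ordinary consensus checks are assumed to pass; only the
    UASF signaling rule is modeled here."""
    if node_enforces_uasf and uasf_active and not block.signals_segwit:
        return False
    return True

blocks = [Block(1, True), Block(2, False)]
legacy_chain = [b.height for b in blocks if is_valid(b, node_enforces_uasf=False)]
uasf_chain = [b.height for b in blocks if is_valid(b, node_enforces_uasf=True)]
print(legacy_chain)  # [1, 2] – legacy nodes accept both blocks
print(uasf_chain)    # [1]    – UASF nodes reject the non-signaling block
```

Once the two validity rules disagree on even one block, each side extends its own tip, and the single chain becomes two.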

4. What will the outcome be?

The bulk of the mining power may end up on the first branch – which, by Satoshi Nakamoto's rules, would make it the main one. But with the network split in two, each half will have its own main branch, so the two will not be technically unified. There will be two different currencies named Bitcoin, each claiming to be the real one.

How will that dispute be settled? Bitcoin users want lower transaction fees, so the overwhelming majority of them will quickly move to the SegWit chain. And Bitcoin's value and acceptance rest simply on the fact that people accept it and are willing to use it. The SegWit-enabled Bitcoin will therefore keep the role (and the price) of the original Bitcoin, while the one without SegWit will depreciate and lose most of its relevance.

(In fact, a similar "split" has already happened to the No. 2 of the cryptocurrency world – Ethereum. That is why there are both Ethereum and Ethereum Classic. The latter lost the battle to be the successor of the original Ethereum, but it still exists, albeit with a much smaller role and price.)

The miners who rejected SegWit will soon find themselves mining something worth peanuts. So, loudly or quietly, they will probably switch to supporting SegWit. I wouldn't be surprised if many of them did so as early as August 1. (Though some will surely keep wailing to the world about how bad the decision is and what losses it inflicts on them. There may even be lawsuits… We shall see the details.)

5. The politics

If you have made it this far, read carefully – the essence of this post is in this part.

I recently talked with a proud graduate of a Bulgarian school of economics. I sat through an explanation of how economies of scale do not exist and how it is exactly the other way around. How small companies are more efficient than large ones, and so on…

No wonder they are taught nonsense: he who pays, even from behind the scenes, calls the tune. What puzzles me is that the students believe this nonsense when reality is right before their eyes – a reality in which the big companies ruin and/or buy up the small ones, not the other way around. It cannot be otherwise. Just as Newton's laws hold equally for laboratory weights and for shipping containers, the dissipative laws hold equally for pots of water and for economic systems.

In the IT business the dynamics are well above average. Where it is not and cannot easily be regulated, where things are more laissez-faire – as in Bitcoin mining, for example – they are greater still. No wonder mining went so quickly from millions of individual participants to a small number of easily cartelized tyrannosaurs. Every system evolves internally in that direction… That is why a "perfect system" and "happiness ever after" cannot exist. That is why, if you like, freedom has to be kneaded and baked fresh every day.

The "predominant mining power" – whether as the predominant number of individuals in a species, or as the bulk of the money, or as control of the memes most popular among voters – can easily end up concentrated in a narrow circle of hands. And the laws of the internal evolution of systems, as a concrete expression of the dissipative laws, lead exactly there… At that point every vote starts to back the status quo. Democracy ceases to be a path to change – the only such path left is the separation of the differing views into distinct systems. Only then does the new get a real chance to compete with the old.

That is why every biological species around us once began as a tiny divergent twig on the then-mighty trunk of another species – one remembered today only by paleobiologists. And every mighty bank, manufacturer or media company began – as a sum of money, or production capacity, or intellectual property – as a humble loan shop, or a little workshop, or a studio. In the shadow of the tyrannosaurs of its day, remembered now only by historians. Having found a way to split off and somehow hide from them, until it gathered the strength to compete with them…

Those who get it, get it.

[$] Making Python faster

Post Syndicated from jake original https://lwn.net/Articles/725114/rss

The Python core developers, and Victor Stinner in particular, have been focusing on improving the performance of Python 3 over the last few years. At PyCon 2017, Stinner gave a talk on some of the optimizations that have been added recently and the effect they have had on various benchmarks. Along the way, he took a detour into some improvements that have been made for benchmarking Python.

2017 Maintainer and Kernel Summit planning

Post Syndicated from corbet original https://lwn.net/Articles/725374/rss

The Kernel Summit is undergoing some changes this year; the core developers’ gathering from previous events will be replaced by a half-day “maintainers summit” consisting of about 30 people. The process of selecting those people, and of selecting topics for the open technical session, is underway now; interested developers are encouraged to submit their topic ideas.

Pirate Bay Facilitates Piracy and Can be Blocked, Top EU Court Rules

Post Syndicated from Ernesto original https://torrentfreak.com/pirate-bay-facilitates-piracy-and-can-be-blocked-top-eu-court-rules-170614/

In 2014, The Court of The Hague handed down its decision in a long-running case which had previously forced two Dutch ISPs, Ziggo and XS4ALL, to block The Pirate Bay.

The Court ruled against local anti-piracy outfit BREIN, concluding that the blockade was ineffective and restricted the ISPs’ entrepreneurial freedoms.

The Pirate Bay was unblocked by all local ISPs while BREIN took the matter to the Supreme Court, which subsequently referred the case to the EU Court of Justice, seeking further clarification.

After a careful review of the case, the Court of Justice today ruled that The Pirate Bay can indeed be blocked.

While the operators don’t share anything themselves, they knowingly provide users with a platform to share copyright-infringing links. This can be seen as “an act of communication” under the EU Copyright Directive, the Court concludes.

“Whilst it accepts that the works in question are placed online by the users, the Court highlights the fact that the operators of the platform play an essential role in making those works available,” the Court explains in a press release (pdf).

According to the ruling, The Pirate Bay indexes torrents in a way that makes it easy for users to find infringing content while the site makes a profit. The Pirate Bay is aware of the infringements, and although moderators sometimes remove “faulty” torrents, infringing links remain online.

“In addition, the same operators expressly display, on blogs and forums accessible on that platform, their intention of making protected works available to users, and encourage the latter to make copies of those works,” the Court writes.

The ruling means that there are no major obstacles for the Dutch Supreme Court to issue an ISP blockade, but a final decision in the underlying case will likely take a few more months.

A decision at the European level is important, as it may also affect court orders in other countries where The Pirate Bay and other torrent sites are already blocked, including Austria, Belgium, Finland, Italy, and its home turf Sweden.

Despite the negative outcome, the Pirate Bay team is not overly worried.

“Copyright holders will remain stubborn and fight to hold onto a dying model. Clueless and corrupt law makers will put corporate interests before the public’s. Their combined jackassery is what keeps TPB alive,” TPB’s plc365 tells TorrentFreak.

“The reality is that regardless of the ruling, nothing substantial will change. Maybe more ISPs will block TPB. More people will use one of the hundreds of existing proxies, and even more new ones will be created as a result.”

Pirate Bay moderator “Xe” notes that while it’s an extra barrier to access the site, blockades will eventually help people to get around censorship efforts, which are not restricted to TPB.

“They’re an issue for everyone in the sense that they’re an obstacle which has to be overcome. But learning how to work around them isn’t hard and knowing how to work around them is becoming a core skill for everyone who uses the Internet.

“Blockades are not a major issue for the site in the sense that they’re nothing new: we’ve long since adapted to them. We serve the needs of millions of people every day in spite of them,” Xe adds.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Firefox 54 released

Post Syndicated from ris original https://lwn.net/Articles/725275/rss

Firefox 54.0 has been released. The release notes are somewhat sparse; however, this blog post contains more information about some of the under-the-hood changes:

To make Firefox run even complex sites faster, we’ve been changing it to run using multiple operating system processes. Translation? The old Firefox used a single process to run all the tabs in a browser. Modern browsers split the load into several independent processes. We named our project to split Firefox into multiple processes ‘Electrolysis (E10S)’ after the chemical process that divides water into its core elements. E10S is the largest change to Firefox code in our history. And today we’re launching our next big phase of the E10S initiative.

Teaching tech

Post Syndicated from Eevee original https://eev.ee/blog/2017/06/10/teaching-tech/

A sponsored post from Manishearth:

I would kinda like to hear about any thoughts you have on technical teaching or technical writing. Pedagogy is something I care about. But I don’t know how much you do, so feel free to ignore this suggestion 🙂

Good news: I care enough that I’m trying to write a sorta-kinda-teaching book!

Ironically, one of the biggest problems I’ve had with writing the introduction to that book is that I keep accidentally rambling on for pages about problems and difficulties with teaching technical subjects. So maybe this is a good chance to get it out of my system.

Phaser

I recently tried out a new thing. It was Phaser, but this isn’t a dig on them in particular, just a convenient example fresh in my mind. If anything, they’re better than most.

As you can see from Phaser’s website, it appears to have tons of documentation. Two of the six headings are “LEARN” and “EXAMPLES”, which seems very promising. And indeed, Phaser offers:

  • Several getting-started walkthroughs
  • Possibly hundreds of examples
  • A news feed that regularly links to third-party tutorials
  • Thorough API docs

Perfect. Beautiful. Surely, a dream.

Well, almost.

The examples are all microscopic, usually focused around a single tiny feature — many of them could be explained just as well with one line of code. There are a few example games, but they’re short aimless demos. None of them are complete games, and there’s no showcase either. Games sometimes pop up in the news feed, but most of them don’t include source code, so they’re not useful for learning from.

Likewise, the API docs are just API docs, leading to the sorts of problems you might imagine. For example, in a few places there’s a mention of a preUpdate stage that (naturally) happens before update. You might rightfully wonder what kinds of things happen in preUpdate — and more importantly, what should you put there, and why?

Let’s check the API docs for Phaser.Group.preUpdate:

The core preUpdate – as called by World.

Okay, that didn’t help too much, but let’s check what Phaser.World has to say:

The core preUpdate – as called by World.

Ah. Hm. It turns out World is a subclass of Group and inherits this method — and thus its unaltered docstring — from Group.

I did eventually find some brief docs attached to Phaser.Stage (but only by grepping the source code). It mentions what the framework uses preUpdate for, but not why, and not when I might want to use it too.


The trouble here is that there’s no narrative documentation — nothing explaining how the library is put together and how I’m supposed to use it. I get handed some brief primers and a massive reference, but nothing in between. It’s like buying an O’Reilly book and finding out it only has one chapter followed by a 500-page glossary.

API docs are great if you know specifically what you’re looking for, but they don’t explain the best way to approach higher-level problems, and they don’t offer much guidance on how to mesh nicely with the design of a framework or big library. Phaser does a decent chunk of stuff for you, off in the background somewhere, so it gives the strong impression that it expects you to build around it in a particular way… but it never tells you what that way is.

Tutorials

Ah, but this is what tutorials are for, right?

I confess I recoil whenever I hear the word “tutorial”. It conjures an image of a uniquely useless sort of post, which goes something like this:

  1. Look at this cool thing I made! I’ll teach you how to do it too.

  2. Press all of these buttons in this order. Here’s a screenshot, which looks nothing like what you have, because I’ve customized the hell out of everything.

  3. You did it!

The author is often less than forthcoming about why they made any of the decisions they did, where you might want to try something else, or what might go wrong (and how to fix it).

And this is to be expected! Writing out any of that stuff requires far more extensive knowledge than you need just to do the thing in the first place, and you need to do a good bit of introspection to sort out something coherent to say.

In other words, teaching is hard. It’s a skill, and it takes practice, and most people blogging are not experts at it. Including me!


With Phaser, I noticed that several of the third-party tutorials I tried to look at were 404s — sometimes less than a year after they were linked on the site. Pretty major downside to relying on the community for teaching resources.

But I also notice that… um…

Okay, look. I really am not trying to rag on this author. I’m not. They tried to share their knowledge with the world, and that’s a good thing, something worthy of praise. I’m glad they did it! I hope it helps someone.

But for the sake of example, here is the most recent entry in Phaser’s list of community tutorials. I have to link it, because it’s such a perfect example. Consider:

  • The post itself is a bulleted list of explanation followed by a single contiguous 250 lines of source code. (Not that there’s anything wrong with bulleted lists, mind you.) That code contains zero comments and zero blank lines.

  • This is only part two in what I think is a series aimed at beginners, yet the title and much of the prose focus on object pooling, a performance hack that’s easy to add later and that’s almost certainly unnecessary for a game this simple. There is no explanation of why this is done; the prose only says you’ll understand why it’s critical once you add a lot more game objects.

  • It turns out I only have two things to say here so I don’t know why I made this a bulleted list.

In short, it’s not really a guided explanation; it’s “look what I did”.

And that’s fine, and it can still be interesting. I’m not sure English is even this person’s first language, so I’m hardly going to criticize them for not writing a novel about platforming.

The trouble is that I doubt a beginner would walk away from this feeling very enlightened. They might be closer to having the game they wanted, so there’s still value in it, but it feels closer to having someone else do it for them. And an awful lot of tutorials I’ve seen — particularly of the “post on some blog” form (which I’m aware is the genre of thing I’m writing right now) — look similar.

This isn’t some huge social problem; it’s just people writing on their blog and contributing to the corpus of written knowledge. It does become a bit stickier when a large project relies on these community tutorials as its main set of teaching aids.


Again, I’m not ragging on Phaser here. I had a slightly frustrating experience with it, coming in knowing what I wanted but unable to find a description of the semantics anywhere, but I do sympathize. Teaching is hard, writing documentation is hard, and programmers would usually rather program than do either of those things. For free projects that run on volunteer work, and in an industry where anything other than programming is a little undervalued, getting good docs written can be tricky.

(Then again, Phaser sells books and plugins, so maybe they could hire a documentation writer. Or maybe the whole point is for you to buy the books?)

Some pretty good docs

Python has pretty good documentation. It introduces the language with a tutorial, then documents everything else in both a library and language reference.

This sounds an awful lot like Phaser’s setup, but there’s some considerable depth in the Python docs. The tutorial is highly narrative and walks through quite a few corners of the language, stopping to mention common pitfalls and possible use cases. I clicked an arbitrary heading and found a pleasant, informative read that somehow avoids being bewilderingly dense.

The API docs also take on a narrative tone — even something as humble as the collections module offers numerous examples, use cases, patterns, recipes, and hints of interesting ways you might extend the existing types.
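As a taste of why the collections docs read so well, the module's reference shows each type solving a small concrete problem rather than just listing signatures; in that spirit:

```python
from collections import Counter, defaultdict

words = "the quick brown fox jumps over the lazy dog the end".split()

# Counter: tally hashable items and query the most common ones.
counts = Counter(words)
print(counts.most_common(1))  # [('the', 3)]

# defaultdict: group items without checking for missing keys first.
by_initial = defaultdict(list)
for word in words:
    by_initial[word[0]].append(word)
print(by_initial["t"])  # ['the', 'the', 'the']
```

Two lines of use-case per type is often all it takes to turn a dry reference entry into something a reader can act on.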

I’m being a little vague and hand-wavey here, but it’s hard to give specific examples without just quoting two pages of Python documentation. Hopefully you can see right away what I mean if you just take a look at them. They’re good docs, Bront.

I’ve likewise always enjoyed the SQLAlchemy documentation, which follows much the same structure as the main Python documentation. SQLAlchemy is a database abstraction layer plus ORM, so it can do a lot of subtly intertwined stuff, and the complexity of the docs reflects this. Figuring out how to do very advanced things correctly, in particular, can be challenging. But for the most part it does a very thorough job of introducing you to a large library with a particular philosophy and how to best work alongside it.

I softly contrast this with, say, the Perl documentation.

It’s gotten better since I first learned Perl, but Perl’s docs are still a bit of a strange beast. They exist as a flat collection of manpage-like documents with terse names like perlootut. The documentation is certainly thorough, but much of it has a strange… allocation of detail.

For example, perllol — the explanation of how to make a list of lists, which somehow merits its own separate documentation — offers no fewer than nine similar variations of the same code for reading a file into a nested list of the words on each line. Where Python offers examples for a variety of different problems, Perl shows you a lot of subtly different ways to do the same basic thing.
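For contrast, the task perllol belabors – reading input into nested lists of the words on each line – takes one comprehension in Python (an in-memory sample stands in for the file so the snippet is self-contained):

```python
import io

# A stand-in for an open file object.
sample = io.StringIO("one two three\nfour five\nsix\n")

# One inner list of words per line.
lol = [line.split() for line in sample]
print(lol)  # [['one', 'two', 'three'], ['four', 'five'], ['six']]
```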

A similar problem is that Perl’s docs sometimes offer far too much context; consider the references tutorial, which starts by explaining that references are a powerful “new” feature in Perl 5 (first released in 1994). It then explains why you might want to nest data structures… from a Perl 4 perspective, thus explaining why Perl 5 is so much better.

Some stuff I’ve tried

I don’t claim to be a great teacher. I like to talk about stuff I find interesting, and I try to do it in ways that are accessible to people who aren’t lugging around the mountain of context I already have. This being just some blog, it’s hard to tell how well that works, but I do my best.

I also know that I learn best when I can understand what’s going on, rather than just seeing surface-level cause and effect. Of course, with complex subjects, it’s hard to develop an understanding before you’ve seen the cause and effect a few times, so there’s a balancing act between showing examples and trying to provide an explanation. Too many concrete examples feel like rote memorization; too much abstract theory feels disconnected from anything tangible.

The attempt I’m most pleased with is probably my post on Perlin noise. It covers a fairly specific subject, which made it much easier. It builds up one step at a time from scratch, with visualizations at every point. It offers some interpretations of what’s going on. It clearly explains some possible extensions to the idea, but distinguishes those from the core concept.

It is a little math-heavy, I grant you, but that was hard to avoid with a fundamentally mathematical topic. I had to be economical with the background information, so I let the math be a little dense in places.
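None of the post's actual code appears here, but the core idea it builds up to – random gradients at lattice points, a fade curve, interpolation between the two neighboring influences – fits in a short 1D sketch:

```python
import math
import random

random.seed(42)
# One random gradient (slope) per integer lattice point.
gradients = [random.uniform(-1, 1) for _ in range(256)]

def fade(t):
    # Perlin's quintic: zero first and second derivative at t=0 and t=1,
    # so adjacent cells join smoothly.
    return t * t * t * (t * (t * 6 - 15) + 10)

def noise1d(x):
    i = math.floor(x)
    t = x - i
    # Influence of each neighboring gradient at the point x.
    left = gradients[i % 256] * t
    right = gradients[(i + 1) % 256] * (t - 1)
    # Blend the two influences with the fade curve.
    return left + fade(t) * (right - left)

print(noise1d(0.0))  # 0.0 – gradient noise is zero at every lattice point
```

The zero-at-lattice-points property is what distinguishes gradient noise from plain value noise, and it is one of the things the visualizations in the post make obvious.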

But the best part about it by far is that I learned a lot about Perlin noise in the process of writing it. In several places I realized I couldn’t explain what was going on in a satisfying way, so I had to dig deeper into it before I could write about it. Perhaps there’s a good guideline hidden in there: don’t try to teach as much as you know?

I’m also fairly happy with my series on making Doom maps, though they meander into tangents a little more often. It’s hard to talk about something like Doom without meandering, since it’s a convoluted ecosystem that’s grown organically over the course of 24 years and has at least three ways of doing anything.


And finally there’s the book I’m trying to write, which is sort of about game development.

One of my biggest grievances with game development teaching in particular is how often it leaves out important touches. Very few guides will tell you how to make a title screen or menu, how to handle death, how to get a Mario-style variable jump height. They’ll show you how to build a clearly unfinished demo game, then leave you to your own devices.

I realized that the only reliable way to show how to build a game is to build a real game, then write about it. So the book is laid out as a narrative of how I wrote my first few games, complete with stumbling blocks and dead ends and tiny bits of polish.

I have no idea how well this will work, or whether recapping my own mistakes will be interesting or distracting for a beginner, but it ought to be an interesting experiment.

Announcing Rust 1.18

Post Syndicated from ris original https://lwn.net/Articles/724889/rss

Version 1.18 of the Rust programming language has been released. One of the largest changes is a long time coming: core team members Carol Nichols and Steve Klabnik have been writing a new edition of “The Rust Programming Language”, the official book about Rust. It’s being written openly on GitHub and has over a hundred contributors in total. This release includes the first draft of the second edition in our online documentation. 19 out of 20 chapters have a draft; the draft of chapter 20 will land in Rust 1.19.

Online Platforms Should Collaborate to Ban Piracy and Terrorism, Report Suggests

Post Syndicated from Andy original https://torrentfreak.com/online-platforms-collaborate-ban-piracy-terrorism-report-suggests-170608/

With deep ties to the content industries, the Digital Citizens Alliance periodically produces reports on Internet piracy. It has published reports on cyberlockers and tried to blame Cloudflare for the spread of malware, for example.

One of the key themes pursued by DCA is that Internet piracy is inextricably linked to a whole bunch of other online evils and that tackling the former could deliver a much-needed body blow to the latter.

Its new report, titled ‘Trouble in Our Digital Midst’, takes this notion and runs with it, bundling piracy with everything from fake news to hacking, to malware and brand protection, to the sextortion of “young girls and boys” via their computer cameras.

The premise of the report is that cybercrime as a whole is undermining America’s trust in the Internet, noting that 64% of US citizens say that their trust in digital platforms has dropped in the last year. Given the topics under the spotlight, it doesn’t take long to see where this is going – Internet platforms like Google, Facebook and YouTube must tackle the problem.

“When asked, ‘In your opinion, are digital platforms doing enough to keep the Internet safe and trustworthy, or do they need to do more?’ a staggering 75 percent responded that they need to do more to keep the Internet safe,” the report notes.

It’s abundantly clear that the report is mostly about piracy but a lot of effort has been expended to ensure that people support its general call for the Internet to be cleaned up. By drawing attention to things that even most pirates might find offensive, it’s easy to find more people in agreement.

“Nearly three-quarters of respondents see the pairing of brand name advertising with offensive online content – like ISIS/terrorism recruiting videos – as a threat to the continued trust and integrity of the Internet,” the report notes.

Of course, this is an incredibly sensitive topic. When big brand ads turned up next to terrorist recruiting videos on YouTube, there was an almighty stink, and rightly so. However, at every turn, the DCA report manages to weave the issue of piracy into the equation, noting that the problem includes the “$200 million in advertising that shows up on illegal content theft websites often unbeknownst to the brands.”

The overriding theme is that platforms like Google, Facebook, and YouTube should be able to tackle all of these problems in the same way. Filtering out a terrorist video is the same as removing a pirate movie. And making sure that ads for big brands don’t appear alongside terrorist videos will be just as easy as starving pirates of revenue, the suggestion goes.

But if terrorism doesn’t grind your gears, what about fake news?

“64 percent of Americans say that the Fake News issue has made them less likely to trust the Internet as a source of information,” the report notes.

At this juncture, Facebook gets a gentle pat on the back for dealing with fake news and employing 3,000 people to monitor for violent videos being posted to the network. This shows that the company “takes seriously” the potential harm bad actors pose to Internet safety. But in keeping with the theme running throughout the report, it’s clear DCA are carefully easing in the thin end of the wedge.

“We are at only the beginning of thinking through other kinds of illicit and illegal activity happening on digital platforms right now that we must gain or re-gain control over,” DCA writes.

Quite. In the very next sentence, the group goes on to warn about the sale of drugs and stolen credit cards, adding that the sale of illicit streaming devices (modified Kodi boxes etc) is actually an “insidious yet effective delivery mechanism to infect computers with malware such as Remote Access Trojans.”

Both Amazon and Facebook receive praise in the report for their recent banning (1,2) of augmented Kodi devices but their actions are actually framed as the companies protecting their own reputations, rather than the interests of the media groups that have been putting them under pressure.

“And though this issue underscores the challenges faced by digital platforms – not all of which act with the same level of responsibility – it also highlights the fact digital platforms can and will step up when their own brands are at stake,” the report reads.

But pirate content and Remote Access Trojans through Kodi boxes are only the beginning. Pirate sites are playing a huge part as well, DCA claims, with one in three “content theft websites” exposing people to identity theft, ransomware, and sextortion via “the computer cameras of young girls and boys.”

Worse still, if that were possible, the lack of policing by online platforms means that people are able to “showcase live sexual assaults, murders, and other illegal conduct.”

DCA says that with all this in mind, Americans are looking for online digital platforms to help them. The group claims that citizens need proactive protection from these ills and want companies like Facebook to take similar steps to those taken when warning consumers about fake news and violent content.

So what can be done to stop this tsunami of illegality? According to DCA, platforms like Google, Facebook, YouTube, and Twitter need to up their game and tackle the problem together.

“While digital platforms collaborate on policy and technical issues, there is no evidence that they are sharing information about the bad actors themselves. That enables criminals and bad actors to move seamlessly from platform to platform,” DCA writes.

“There are numerous examples of industry working together to identify and share information about exploitive behavior. For example, casinos share information about card sharks and cheats, and for decades the retail industry has shared information about fraudulent credit cards. A similar model would enable digital platforms and law enforcement to more quickly identify and combat those seeking to leverage the platforms to harm consumers.”

How this kind of collaboration could take place in the real world is open to interpretation but the DCA has a few suggestions of its own. Again, it doesn’t shy away from pulling people on side with something extremely offensive (in this case child pornography) in order to push what is clearly an underlying anti-piracy agenda.

“With a little help from engineers, digital platforms could create fingerprints of unlawful conduct that is shared across platforms to proactively block such conduct, as is done in a limited capacity with child pornography,” DCA explains.

“If these and other newly developed measures were adopted, digital platforms would have the information to enable them to make decisions whether to de-list or demote websites offering illicit goods and services, and the ability to stop the spread of illegal behavior that victimizes its users.”
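Stripped of specifics, the fingerprint-sharing idea amounts to a blocklist of content hashes that every participating platform contributes to and consults. The report offers no technical detail, so the sketch below is purely illustrative (the function names and data layout are invented, not drawn from the report):

```python
import hashlib

# A shared blocklist of fingerprints of known-bad content, in the spirit
# of the DCA proposal. Everything here is illustrative.
shared_blocklist = set()

def fingerprint(content):
    # The simplest possible "fingerprint" is a cryptographic hash; deployed
    # systems for imagery use robust perceptual hashes instead, which
    # survive re-encoding and resizing.
    return hashlib.sha256(content).hexdigest()

def report_bad_content(content):
    # Called by any platform in the scheme that identifies bad content
    shared_blocklist.add(fingerprint(content))

def should_block(content):
    # Consulted at upload time by every other platform
    return fingerprint(content) in shared_blocklist
```

Note that exact-match hashing is trivially evaded by flipping a single byte, which is one reason the report’s “with a little help from engineers” framing understates the difficulty.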

The careful framing of the DCA report means that there’s something for everyone. If you don’t agree with them on tackling piracy, then their malware, fake news, or child exploitation angles might do the trick. It’s quite a clever strategy but one that the likes of Google, Facebook, and YouTube will recognize immediately.

And they need to – because apparently, it’s their job to sort all of this out. Good luck with that.

The full report can be found here (pdf)

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

AWS Greengrass – Run AWS Lambda Functions on Connected Devices

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-greengrass-run-aws-lambda-functions-on-connected-devices/

I first told you about AWS Greengrass in the post that I published during re:Invent (AWS Greengrass – Ubiquitous Real-World Computing). We launched a limited preview of Greengrass at that time and invited you to sign up if you were interested.

As I noted at the time, many AWS customers want to collect and process data out in the field, where connectivity is often slow and sometimes either intermittent or unreliable. Greengrass allows them to extend the AWS programming model to small, simple, field-based devices. It builds on AWS IoT and AWS Lambda, and supports access to the ever-increasing variety of services that are available in the AWS Cloud.

Greengrass gives you access to compute, messaging, data caching, and syncing services that run in the field, and that do not depend on constant, high-bandwidth connectivity to an AWS Region. You can write Lambda functions in Python 2.7 and deploy them to your Greengrass devices from the cloud while using device shadows to maintain state. Your devices and peripherals can talk to each other using local messaging that does not pass through the cloud.

Now Generally Available
Today we are making Greengrass generally available in the US East (Northern Virginia) and US West (Oregon) Regions. During the preview, AWS customers were able to get hands-on experience with Greengrass and to start building applications and businesses around it. I’ll share a few of these early successes later in this post.

The Greengrass Core code runs on each device. It allows you to deploy and run Lambda applications on the device, supports local MQTT messaging across a secure network, and also ensures that conversations between devices and the cloud are made across secure connections. The Greengrass Core also supports secure, over-the-air software updates, including Lambda functions. It includes a message broker, a Lambda runtime, a Thing Shadows implementation, and a deployment agent.

A Greengrass Core and (optionally) other devices make up a Greengrass Group. The group includes configuration data, the list of devices and the identity of the Greengrass Core, a list of Lambda functions, and a set of subscriptions that define where the messages should go. All of this information is copied to the Greengrass Core devices during the deployment process.

Your Lambda functions can use APIs in three distinct SDKs:

AWS SDK for Python – This SDK allows your code to interact with Amazon Simple Storage Service (S3), Amazon DynamoDB, Amazon Simple Queue Service (SQS), and other AWS services.

AWS IoT Device SDK – This SDK (available for Node.js, Python, Java, and C++) helps you to connect your hardware devices to AWS IoT. The C++ SDK has a few extra features including access to the Greengrass Discovery Service and support for root CA downloads.

AWS Greengrass Core SDK – This SDK provides APIs for invoking other Lambda functions locally, publishing messages, and working with thing shadows.

You can run the Greengrass Core on x86 and ARM devices that have version 4.4.11 (or newer) of the Linux kernel, with the OverlayFS and user namespace features enabled. While most deployments of Greengrass will be targeted at specialized, industrial-grade hardware, you can also run the Greengrass Core on a Raspberry Pi or an EC2 instance for development and test purposes.

For this post, I used a Raspberry Pi attached to a BrickPi, connected to my home network via WiFi:

The Raspberry Pi, the BrickPi, the case, and all of the other parts are available in the BrickPi 3 Starter Kit. You will need some Linux command-line expertise and a decent amount of manual dexterity to put all of this together, but if I did it then you surely can.

Greengrass in Action
I can access Greengrass from the Console, API, or CLI. I’ll use the Console. The intro page of the Greengrass Console lets me define groups, add Greengrass Cores, and add devices to my groups:

I click on Get Started and then on Use easy creation:

Then I name my group:

And name my first Greengrass Core:

I’m ready to go, so I click on Create Group and Core:

This runs for a few seconds and then offers up my security resources (two keys and a certificate) for downloading, along with the Greengrass Core:

I download the security resources and put them in a safe place, and select and download the desired version of the Greengrass Core software (ARMv7l for my Raspberry Pi), and click on Finish.

Now I power up my Pi, and copy the security resources and the software to it (I put them in an S3 bucket and pulled them down with wget). Here’s my shell history at that point:

Following the directions in the user guide, I create a new user and group, run the rpi-update script, and install several packages including sqlite3 and openssl. After a couple of reboots, I am ready to proceed!

Next, still following the directions, I untar the Greengrass Core software and move the security resources to their final destination (/greengrass/configuration/certs), giving them generic names along the way. Here’s what the directory looks like:

The next step is to associate the core with an AWS IoT thing. I return to the Console, click through the group and the Greengrass Core, and find the Thing ARN:

I insert the names of the certificates and the Thing ARN into the config.json file, and also fill in the missing sections of the iotHost and ggHost:
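For reference, the end result is a config.json shaped roughly like this — a sketch from memory of the Greengrass documentation of the day (check the field names against the current user guide; every value below is a placeholder):

```json
{
  "coreThing": {
    "caPath": "root-ca.pem",
    "certPath": "cloud.pem.crt",
    "keyPath": "cloud.pem.key",
    "thingArn": "arn:aws:iot:us-east-1:123456789012:thing/MyFirstGroup_Core",
    "iotHost": "ABCDEFEXAMPLE.iot.us-east-1.amazonaws.com",
    "ggHost": "greengrass.iot.us-east-1.amazonaws.com"
  },
  "runtime": {
    "cgroup": {
      "useSystemd": "yes"
    }
  }
}
```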

I start the Greengrass daemon (this was my second attempt; I had a typo in one of my path names the first time around):

After all of this pleasant time at the command line (taking me back to my Unix v7 and BSD 4.2 days), it is time to go visual once again! I visit my AWS IoT dashboard and see that my Greengrass Core is making connections to IoT:

I go to the Lambda Console and create a Lambda function using the Python 2.7 runtime (the IAM role does not matter here):

I publish the function in the usual way and, hop over to the Greengrass Console, click on my group, and choose to add a Lambda function:

Then I choose the version to deploy:

I also configure the function to be long-lived instead of on-demand:

My code will publish messages to AWS IoT, so I create a subscription by specifying the source and destination:

I set up a topic filter (hello/world) on the subscription as well:

I confirm my settings and save my subscription and I am just about ready to deploy my code. I revisit my group, click on Deployments, and choose Deploy from the Actions menu:

I choose Automatic detection to move forward:

Since this is my first deployment, I need to create a service-level role that gives Greengrass permission to access other AWS services. I simply click on Grant permission:

I can see the status of each deployment:

The code is now running on my Pi! It publishes messages to topic hello/world; I can see them by going to the IoT Console, clicking on Test, and subscribing to the topic:

And here are the messages:

With all of the setup work taken care of, I can do iterative development by uploading, publishing, and deploying new versions of my code. I plan to use the BrickPi to control some LEGO Technic motors and to publish data collected from some sensors. Stay tuned for that post!
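The function itself is tiny. Here’s a sketch of a long-lived publisher along the lines of what I deployed — the payload shape is illustrative, and the stub fallback is only there so the logic can be exercised off-device:

```python
import json

try:
    # The Greengrass Core SDK is available on the device itself
    import greengrasssdk
    client = greengrasssdk.client("iot-data")
except ImportError:
    # Off-device stub for local testing; illustrative only
    class _StubClient(object):
        def publish(self, topic, payload):
            return {"topic": topic, "payload": payload}
    client = _StubClient()

def build_message(counter):
    # Payload shape is illustrative; anything JSON-serializable works
    return json.dumps({"message": "Hello from my Pi!", "counter": counter})

def publish_once(counter, topic="hello/world"):
    # A long-lived function would wrap this in a loop with a sleep
    return client.publish(topic=topic, payload=build_message(counter))

def function_handler(event, context):
    # Entry point required by Lambda; unused for a long-lived function
    return None
```

The subscription created earlier (with its hello/world topic filter) is what actually routes these messages from the Core up to AWS IoT.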

Greengrass Pricing
You can run the Greengrass Core on three devices free for one year as part of the AWS Free Tier. At the next level (3 to 10,000 devices) two options are available:

  • Pay as You Go – $0.16 per month per device.
  • Annual Commitment – $1.49 per year per device, a savings of roughly 22% over the monthly rate.

If you want to run the Greengrass Core on more than 10,000 devices or make a longer commitment, please get in touch with us; details on all pricing models are on the Greengrass Pricing page.

Jeff;

[$] Classes and types in the Python typing module

Post Syndicated from jake original https://lwn.net/Articles/724639/rss

Mark Shannon is concerned that the Python core developers may be replaying a mistake: treating two distinct things as being the same. Treating byte strings and Unicode text strings interchangeably is part of what led to Python 3, so he would rather not see that happen again with types and classes. The Python typing module, which is meant to support type hints, currently implements types as classes. That leads to several kinds of problems, as Shannon described in his session at the 2017 Python Language Summit.
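The mismatch is easy to demonstrate in current CPython — an example (ours, not from the talk) of the sort of friction involved: subscripted generics from typing are built with class machinery, yet refuse to act like classes where it matters.

```python
from typing import List

# List[int] is a typing construct, distinct from the builtin list class
assert List[int] is not list

# Unlike a real class, though, it cannot take part in instance checks
def usable_in_isinstance(tp):
    try:
        isinstance([], tp)
        return True
    except TypeError:
        return False

assert usable_in_isinstance(list)          # ordinary classes work
assert not usable_in_isinstance(List[int])  # subscripted generics raise TypeError
```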

Symantec Patent Protects Torrent Users Against Malware

Post Syndicated from Ernesto original https://torrentfreak.com/symantec-patent-protects-torrent-users-against-malware-170606/

In recent years we have documented a wide range of patent applications, several of which had a clear anti-piracy angle.

Symantec Corporation, known for the popular anti-virus software Norton Security, is taking a more torrent-friendly approach. At least, that’s what a recently obtained patent suggests.

The patent describes a system that can be used to identify fake torrents and malware-infected downloads, which are a common problem on badly-moderated torrent sites. Downloaders of these torrents are often redirected to scam websites or lured into installing malware.

Here’s where Symantec comes in with their automatic torrent moderating solution. Last week the company obtained a patent for a system that can rate the trustworthiness of torrents and block suspicious content to protect users.

“While the BitTorrent protocol represents a popular method for distributing files, this protocol also represents a common means for distributing malicious software. Unfortunately, torrent hosting sites generally fail to provide sufficient information to reliably predict whether such files are trustworthy,” the patent reads.

Unlike traditional virus scans, where the file itself is scanned for malicious traits, the patented technology uses a reputation score to make the evaluation.

The trustworthiness of torrents is determined by factors including the reputation of the original uploaders, torrent sites, trackers and other peers. For example, if a seeder’s IP address is linked to several malicious torrents, it will get a low reputation score.

“For example, if an entity has been involved in several torrent transactions that involved malware-infected target files, the reputation information associated with the entity may indicate that the entity has a poor reputation, indicating a high likelihood that the target file represents a potential security risk,” Symantec notes.

In contrast, if a torrent is seeded by a user that only shares non-malicious files, the trustworthiness factor goes up.
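In spirit, the patented evaluation reduces to aggregating per-entity track records into a single trust score and acting on a threshold. The toy model below is our reading of the idea, not code from the patent — the weights, thresholds, and data layout are all invented:

```python
# Toy reputation model in the spirit of the patent's description.
malicious_history = {
    # entity -> (malicious transactions seen, total transactions seen)
    "seeder:198.51.100.7": (8, 10),
    "tracker:tracker.example.org": (1, 200),
    "uploader:trusted_releaser": (0, 150),
}

def entity_reputation(entity):
    bad, total = malicious_history.get(entity, (0, 0))
    if total == 0:
        return 0.5  # unknown entities get a neutral score
    return 1.0 - bad / float(total)

def torrent_trust_score(entities):
    # Aggregate pessimistically: a torrent is only as trustworthy
    # as its least trustworthy associated entity
    return min(entity_reputation(e) for e in entities)

def security_action(score, threshold=0.5):
    # The patent lists several possible actions; "block" stands in for all
    return "block" if score < threshold else "allow"
```

A seeder with eight malicious transactions out of ten drags any torrent it touches below the threshold, while a clean uploader sails through.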

Reputation information

If a torrent file has a high likelihood of being linked to malware or other malicious content, the system can take appropriate “security actions.” This may be as simple as deleting the suspicious torrent, or a more complex response such as blocking all related network traffic.

“Examples of such security actions include, without limitation, alerting a user of the potential security risk, blocking access to the target file until overridden by the user, blocking network traffic associated with the torrent transaction, quarantining the target file, and/or deleting the target file,” Symantec writes.

Security actions

Symantec Corporation applied for the patent nearly four years ago, but thus far we haven’t seen it used in the real world.

Many torrent users would likely appreciate an extra layer of security, although they might be concerned about overblocking and possible monitoring of their download habits. This means that, for now, they will have to rely on site moderators, and most importantly, common sense.


MPAA Chief Praises Site-Blocking But Italians Love Piracy – and the Quality

Post Syndicated from Andy original https://torrentfreak.com/mpaa-chief-praises-site-blocking-but-italians-love-pirate-quality-170606/

After holding a reputation for being soft on piracy for many years, in more recent times Italy has taken a much tougher stance. The country now takes regular action against pirate sites and has a fairly aggressive site-blocking mechanism.

On Monday, the industry gathered in Rome and was presented with new data from local anti-piracy outfit FAPAV. The research revealed that while there has been some improvement over the past six years, 39% of Italians are still consuming illicit movies, TV shows, sporting events and other entertainment, at the rate of 669m acts of piracy every year.

While movie piracy is down 4% from 2010, the content most often consumed by pirates is still films, with 33% of the adult population engaging in illicit consumption during the past year.

The downward trend was not shared by TV shows, however. In the past seven years, piracy has risen to 22% of the population, up 13% on figures from 2010.

In keeping with the MPAA’s recent coding of piracy in 1.0, 2.0, and 3.0 variants (P2P as 1.0, streaming websites as 2.0, streaming devices/Kodi as 3.0), FAPAV said that Piracy 2.0 had become even more established recently, with site operators making considerable technological progress.

“The research tells us we can not lower our guard, we always have to work harder and with greater determination in communication and awareness, especially with regard to digital natives,” said FAPAV Secretary General, Bagnoli Rossi.

The FAPAV chief said that there needs to be emphasis in two areas. One, changing perceptions among the public over the seriousness of piracy via education and two, placing pressure on websites using the police, judiciary, and other law enforcement agencies.

“The pillars of anti-piracy protection are: the judicial authority, self-regulatory agreements, communication and educational activities,” said Rossi, adding that cooperation with Italy’s AGCOM had resulted in 94 sites being blocked over three years.

FAPAV research has traditionally focused on people aged 15 and up but the anti-piracy group believes that placing more emphasis on younger people (aged 10-14) is important since they also consume a lot of pirated content online. MPAA chief Chris Dodd, who was at the event, agreed with the sentiment.

“Today’s youth are the future of the audiovisual industry. Young people must learn to respect the people who work in film and television that in 96% of cases never appear [in front of camera] but still work behind the scenes,” Dodd said.

“It is important to educate and direct them towards legal consumption, which creates jobs and encourages investment. Technology has expanded options to consume content legally and at any time and place, but at the same time has given attackers the opportunity to develop illegal businesses.”

Despite large-scale site-blocking not being a reality in the United States, Dodd was also keen to praise Italy for its efforts while acknowledging the wider blocking regimes in place across the EU.

“We must not only act by blocking pirate sites (we have closed a little less than a thousand in Europe) but also focus on legal offers. Today there are 480 legal online distribution services worldwide. We must have more,” Dodd said.

The outgoing MPAA chief reiterated that movies, music, games and a wide range of entertainment products are all available online legally now. Nevertheless, piracy remains a “growing phenomenon” that has criminals at its core.

“Piracy is composed of criminal organizations, ready to steal sensitive data and to make illegal profits any way they can. It’s a business that harms the entire audiovisual market, which in Europe alone has a million working professionals. To promote the culture of legality means protecting this market and its collective heritage,” Dodd said.

In Italy, convincing pirates to go legal might be more easily said than done. Not only do millions download video every year, but the majority of pirates are happy with the quality too. 89% said they were pleased with the quality of downloaded movies while the satisfaction with TV shows was even greater with 91% indicating approval.


When a Big Torrent Site Dies, Some Hope it Will Be Right Back

Post Syndicated from Andy original https://torrentfreak.com/when-a-big-torrent-site-dies-some-hope-it-will-be-right-back-170604/

For a niche that has had millions of words written about it over the past 18 years or so, most big piracy stories have had the emotions of people at their core.

When The Pirate Bay was taken down by the police eleven years ago it was global news, but the real story was the sense of disbelief and loss felt by millions of former users. Outsiders may dismiss these feelings, but they are very common and very real.

Of course, those negative emotions soon turned to glee when the site returned days later, but full-on, genuine resurrections are something that few big sites have been able to pull off since. What we have instead today is the sudden disappearance of iconic sites and a scrambling by third-party opportunists to fill in the gaps with look-alike platforms.

The phenomenon has affected many big sites, from The Pirate Bay itself through to KickassTorrents, YTS/YIFY, and more recently, ExtraTorrent. When sites disappear, it’s natural for former users to look for replacements. And when those replacements look just like the real deal there’s a certain amount of comfort to be had. For many users, these sites provide the perfect antidote to their feelings of loss.

That being said, the clone site phenomenon has seriously got out of hand. Pioneered by players in the streaming site scene, fake torrent sites can now be found in abundance wherever there is a brand worth copying. ExtraTorrent operator SaM knew this when he closed his site last month, and he took the time to warn people away from them personally.

“Stay away from fake ExtraTorrent websites and clones,” he said.

It’s questionable how many listened.

Within days, users were flooding to fake ExtraTorrent sites, encouraged by some elements of the press. Despite having previously reported SaM’s clear warnings, some publications were still happy to report that ExtraTorrent was back, purely based on the word of the fake sites themselves. And I’ve got a bridge for sale, if you have the cash.

While misleading news reports must take some responsibility, it’s clear that when big sites go down a kind of grieving process takes place among dedicated former users, making some more likely to clutch at straws. While some simply move on, others who have grown more attached to a platform they used to call home can go into denial.

This reaction has often been seen in TF’s mailbox, when YTS/YIFY went down in particular. More recently, dozens of emails informed us that ExtraTorrent had gone, with many others asking when it was coming back. But the ones that stood out most were from people who had read SaM’s message, read TF’s article stating that ALL clones were fakes, yet still wanted to know if sites a, b and c were legitimate or not.

We approached a user on Reddit who had asked similar questions and been derided by other users for his apparent reluctance to accept that ExtraTorrent had gone. We didn’t find stupidity (as a few in /r/piracy had cruelly suggested) but a genuine sense of loss.

“I loved the site dude, what can I say?” he told TF. “Just kinda got used to it and hung around. Before I knew it I was logging in every day. In time it just felt like home. I miss it.”

The user hadn’t seen the articles claiming that one of the imposter ExtraTorrent sites was the real deal. He did, however, seem a bit unsettled when we told him it was a fake. But when we asked if he was going to stop using it, we received an emphatic “no”.

“Dude it looks like ET and yeah it’s not quite the same but I can get my torrents. Why does it matter what crew [runs it]?” he said.

It does matter, of course. The loss of a proper torrent site like ExtraTorrent, which had releasers and a community, can never be replaced by a custom-skinned Pirate Bay mirror. No matter how much it looks like a lost friend, it’s actually a pig in lipstick that contributes little to the ecosystem.

That being said, it’s difficult to counter the fact that some of these clones make people happy. They fill a void that other sites, for mainly cosmetic reasons, can’t fill. With this in mind, the grounds for criticism weaken a little – but not much.

For anyone who has watched the Black Mirror episode ‘Be Right Back‘, it’s clear that sudden loss can be a hard thing for humans to accept. When trying to fill the gap, what might initially seem like a good replacement is almost certainly destined to disappoint longer term, when the sub-standard copy fails to capture the heart and soul of the real deal.

It’s an issue that will occupy the piracy scene for some time to come, but interestingly, it’s also an argument that Hollywood has used against piracy itself for decades. But that’s another story.


Torrents Help Researchers Worldwide to Study Babies’ Brains

Post Syndicated from Ernesto original https://torrentfreak.com/torrents-help-researchers-worldwide-to-study-babies-brains-170603/

One of the core pillars of academic research is sharing.

By letting other researchers know what you do, ideas are criticized, improved upon and extended. In today’s digital age, sharing is easier than ever before, especially with help from torrents.

One of the leading scientific projects that has adopted BitTorrent is the developing Human Connectome Project, or dHCP for short. The goal of the project is to map the brain wiring of developing babies in the wombs of their mothers.

To do so, a consortium of researchers with expertise ranging from computer science, to MRI physics and clinical medicine, has teamed up across three British institutions: Imperial College London, King’s College London and the University of Oxford.

The collected data is extremely valuable for the neuroscience community and the project has received mainstream press coverage and financial backing from the European Research Council. Not only to build the dataset, but also to share it with researchers around the globe. This is where BitTorrent comes in.

Sharing more than 150 GB of data with researchers all over the world can be quite a challenge. Regular HTTP downloads are not really up to the task, and many other transfer options have a high failure rate.

Baby brain scan (Credit: Developing Human Connectome Project)

This is why Jonathan Passerat-Palmbach, a Research Associate in the Department of Computing at Imperial College London, came up with the idea to embrace BitTorrent instead.

“For me, it was a no-brainer from day one that we couldn’t rely on plain old HTTP to make this dataset available. Our first pilot release is 150GB, and I expect the next ones to reach a couple of TB. Torrents seemed like the de facto solution to share this data with the world’s scientific community.” Passerat-Palmbach says.

The researchers opted to go for the Academic Torrents tracker, which specializes in sharing research data. A torrent with the first batch of images was made available there a few weeks ago.

“This initial release contains 3,629 files accounting for 167.20GB of data. While this figure might not appear extremely large at the moment, it will significantly grow as the project aims to make the data of 1,000 subjects available by the time it has completed.”

Torrent of the first dataset

The download numbers are nowhere near those of an average Hollywood blockbuster, of course. Thus far the tracker has registered just 28 downloads. That said, as a superior and open file-transfer protocol, BitTorrent does aid critical research, helping scientists learn more about the development of conditions such as ADHD and autism.
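Part of what makes BitTorrent a better fit than plain HTTP for a 150 GB dataset is built-in integrity checking: a .torrent file carries a SHA-1 hash for every fixed-size piece, so a client can detect a corrupted piece and re-fetch just that piece rather than restarting the whole transfer. The idea in miniature (piece size and data below are illustrative):

```python
import hashlib

PIECE_SIZE = 256 * 1024  # 256 KiB; real torrents pick a power-of-two piece size

def piece_hashes(data, piece_size=PIECE_SIZE):
    # A .torrent file stores a SHA-1 hash for every fixed-size piece
    return [hashlib.sha1(data[i:i + piece_size]).digest()
            for i in range(0, len(data), piece_size)]

def corrupted_pieces(downloaded, expected_hashes, piece_size=PIECE_SIZE):
    # Only pieces whose hash mismatches need to be re-fetched; unlike a
    # plain HTTP download, one bad byte doesn't force a full restart
    got = piece_hashes(downloaded, piece_size)
    return [i for i, (g, e) in enumerate(zip(got, expected_hashes)) if g != e]
```

For a multi-terabyte release, that per-piece recovery is exactly the property the dHCP team was after.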

Interestingly, the biggest challenges of implementing the torrent solution were not of a technical nature. Most time and effort went into assuring other team members that this was the right solution.

“I had to push for more than a year for the adoption of torrents within the consortium. While my colleagues could understand the potential of the approach and its technical inputs, they remained skeptical as to the feasibility to implement such a solution within an academic context and its reception by the world community.

“However, when the first dataset was put together, amounting to 150GB, it became obvious all the HTTP and FTP fallback plans would not fit our needs,” Passerat-Palmbach adds.

Baby brain scans (Credit: Developing Human Connectome Project)

When the consortium finally agreed that BitTorrent was an acceptable way to share the data, local IT staff at the university had to give their seal of approval. Imperial College London doesn’t allow torrent traffic to flow freely across the network, so an exception had to be made.

“Torrents are blocked across the wireless and VPN networks at Imperial. Getting an explicit firewall exception created for our seeding machine was not a walk in the park. It was the first time they were faced with such a situation and we were clearly told that it was not to become the rule.”

Then, finally, the data could be shared around the world.

While BitTorrent is probably the most efficient way to share large files, there were other proprietary solutions that could do the same. However, Passerat-Palmbach preferred not to force other researchers to install “proprietary black boxes” on their machines.

Torrents are free and open, which is more in line with the Open Access approach many academics take today.

Looking back, it certainly wasn’t a walk in the park to share the data via BitTorrent. Passerat-Palmbach was frequently confronted with the piracy stigma that torrents carry among many of his peers, even among younger generations.

“Considering how hard it was to convince my colleagues within the project to actually share this dataset using torrents (‘isn’t it illegal?’ and other kinds of misconceptions…), I think there’s still a lot of work to do to demystify the use of torrents with the public.

“I was even surprised to see that these misconceptions spread out not only to more senior scientists but also to junior researchers who I was expecting to be more tech-aware,” Passerat-Palmbach adds.

That said, the hard work is done now, and in the months and years ahead the neuroscience community will have access to petabytes of important data, with help from BitTorrent. That is definitely worth the effort.

Finally, we thought it was fitting to end with Passerat-Palmbach’s “pledge to seed,” which he shared with his peers. Keep on sharing!


On the importance of seeding

Dear fellow scientist,

Thank you very much for the interest you are showing in the dHCP dataset!

Once you start downloading the dataset, you’ll notice that your torrent client mentions a sharing / seeding ratio. It means that as soon as you start downloading the dataset, you become part of our community of sharers and contribute to making the dataset available to other researchers all around the world!

There’s no reason to be scared! It’s perfectly legal as long as you’re allowed to have a copy of the dataset (that’s the bit you need to forward to your lab’s IT staff if they’re blocking your ports).

You’re actually providing a tremendous contribution to dHCP by spreading the data, so thank you again for that!

With your help, we can make sure this data remains available and can be downloaded relatively fast in the future. Over time, the dataset will grow and your contribution will be more and more important so that each and every one of you can still obtain the data in the smoothest possible way.

We cannot do it without you. By seeding, you’re actually saying “cheers!” to your peers whom you downloaded your data from. So leave your client open and stay tuned!

All this is made possible thanks to the amazing folks at academictorrents and their infrastructure, so kudos academictorrents!

You can learn more about their project here and get some help to get started with torrent downloading here.

Jonathan Passerat-Palmbach


WannaCry and Vulnerabilities

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/06/wannacry_and_vu.html

There is plenty of blame to go around for the WannaCry ransomware that spread throughout the Internet earlier this month, disrupting work at hospitals, factories, businesses, and universities. First, there are the writers of the malicious software, which blocks victims’ access to their computers until they pay a fee. Then there are the users who didn’t install the Windows security patch that would have prevented an attack. A small portion of the blame falls on Microsoft, which wrote the insecure code in the first place. One could certainly condemn the Shadow Brokers, a group of hackers with links to Russia who stole and published the National Security Agency attack tools that included the exploit code used in the ransomware. But before all of this, there was the NSA, which found the vulnerability years ago and decided to exploit it rather than disclose it.

All software contains bugs or errors in the code. Some of these bugs have security implications, granting an attacker unauthorized access to or control of a computer. These vulnerabilities are rampant in the software we all use. A piece of software as large and complex as Microsoft Windows will contain hundreds of them, maybe more. These vulnerabilities have obvious criminal uses that can be neutralized if patched. Modern software is patched all the time — either on a fixed schedule, such as once a month with Microsoft, or whenever required, as with the Chrome browser.

When the US government discovers a vulnerability in a piece of software, however, it decides between two competing equities. It can keep it secret and use it offensively, to gather foreign intelligence, help execute search warrants, or deliver malware. Or it can alert the software vendor and see that the vulnerability is patched, protecting the country — and, for that matter, the world — from similar attacks by foreign governments and cybercriminals. It’s an either-or choice. As former US Assistant Attorney General Jack Goldsmith has said, “Every offensive weapon is a (potential) chink in our defense — and vice versa.”

This is all well-trod ground, and in 2010 the US government put in place an interagency Vulnerabilities Equities Process (VEP) to help balance the trade-off. The details are largely secret, but a 2014 blog post by then President Barack Obama’s cybersecurity coordinator, Michael Daniel, laid out the criteria that the government uses to decide when to keep a software flaw undisclosed. The post’s contents were unsurprising, listing questions such as “How much is the vulnerable system used in the core Internet infrastructure, in other critical infrastructure systems, in the US economy, and/or in national security systems?” and “Does the vulnerability, if left unpatched, impose significant risk?” They were balanced by questions like “How badly do we need the intelligence we think we can get from exploiting the vulnerability?” Elsewhere, Daniel has noted that the US government discloses to vendors the “overwhelming majority” of the vulnerabilities that it discovers — 91 percent, according to NSA Director Michael S. Rogers.

The particular vulnerability in WannaCry is code-named EternalBlue, and it was discovered by the US government — most likely the NSA — sometime before 2014. The Washington Post reported both how useful the bug was for attack and how much the NSA worried about it being used by others. It was a reasonable concern: many of our national security and critical infrastructure systems contain the vulnerable software, which imposed significant risk if left unpatched. And yet it was left unpatched.

There’s a lot we don’t know about the VEP. The Washington Post says that the NSA used EternalBlue “for more than five years,” which implies that it was discovered after the 2010 process was put in place. It’s not clear if all vulnerabilities are given such consideration, or if bugs are periodically reviewed to determine if they should be disclosed. That said, any VEP that allows something as dangerous as EternalBlue — or the Cisco vulnerabilities that the Shadow Brokers leaked last August — to remain unpatched for years isn’t serving national security very well. As a former NSA employee said, the quality of intelligence that could be gathered was “unreal.” But so was the potential damage. The NSA must avoid hoarding vulnerabilities.

Perhaps the NSA thought that no one else would discover EternalBlue. That’s another one of Daniel’s criteria: “How likely is it that someone else will discover the vulnerability?” This is often referred to as NOBUS, short for “nobody but us.” Can the NSA discover vulnerabilities that no one else will? Or are vulnerabilities discovered by one intelligence agency likely to be discovered by another, or by cybercriminals?

In the past few months, the tech community has acquired some data about this question. In one study, two colleagues from Harvard and I examined over 4,300 disclosed vulnerabilities in common software and concluded that 15 to 20 percent of them are rediscovered within a year. Separately, researchers at the Rand Corporation looked at a different and much smaller data set and concluded that fewer than six percent of vulnerabilities are rediscovered within a year. The questions the two papers ask are slightly different and the results are not directly comparable (we’ll both be discussing these results in more detail at the Black Hat Conference in July), but clearly, more research is needed.

People inside the NSA are quick to discount these studies, saying that the data don’t reflect their reality. They claim that there are entire classes of vulnerabilities the NSA uses that are not known in the research world, making rediscovery less likely. This may be true, but the evidence we have from the Shadow Brokers is that the vulnerabilities that the NSA keeps secret aren’t consistently different from those that researchers discover. And given the alarming ease with which both the NSA and CIA are having their attack tools stolen, rediscovery isn’t limited to independent security research.

But even if it is difficult to make definitive statements about vulnerability rediscovery, it is clear that vulnerabilities are plentiful. Any vulnerabilities that are discovered and used for offense should only remain secret for as short a time as possible. I have proposed six months, with the right to appeal for another six months in exceptional circumstances. The United States should satisfy its offensive requirements through a steady stream of newly discovered vulnerabilities that, when fixed, also improve the country’s defense.

The VEP needs to be reformed and strengthened as well. A report from last year by Ari Schwartz and Rob Knake, who both previously worked on cybersecurity policy at the White House National Security Council, makes some good suggestions on how to further formalize the process, increase its transparency and oversight, and ensure periodic review of the vulnerabilities that are kept secret and used for offense. This is the least we can do. A bill recently introduced in both the Senate and the House calls for this and more.

In the case of EternalBlue, the VEP did have some positive effects. When the NSA realized that the Shadow Brokers had stolen the tool, it alerted Microsoft, which released a patch in March. This prevented a true disaster when the Shadow Brokers exposed the vulnerability on the Internet. It was only unpatched systems that were susceptible to WannaCry a month later, including versions of Windows so old that Microsoft normally didn’t support them. Although the NSA must take its share of the responsibility, no matter how good the VEP is, or how many vulnerabilities the NSA reports and the vendors fix, security won’t improve unless users download and install patches, and organizations take responsibility for keeping their software and systems up to date. That is one of the important lessons to be learned from WannaCry.

This essay originally appeared in Foreign Affairs.

Building High-Throughput Genomic Batch Workflows on AWS: Batch Layer (Part 3 of 4)

Post Syndicated from Andy Katz original https://aws.amazon.com/blogs/compute/building-high-throughput-genomic-batch-workflows-on-aws-batch-layer-part-3-of-4/

Aaron Friedman is a Healthcare and Life Sciences Partner Solutions Architect at AWS

Angel Pizarro is a Scientific Computing Technical Business Development Manager at AWS

This post is the third in a series on how to build a genomics workflow on AWS. In Part 1, we introduced a general architecture, shown below, and highlighted the three common layers in a batch workflow:

  • Job
  • Batch
  • Workflow

In Part 2, you built a Docker container for each job that needed to run as part of your workflow, and stored them in Amazon ECR.

In Part 3, you tackle the batch layer and build a scalable, elastic, and easily maintainable batch engine using AWS Batch.

AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. It dynamically provisions the optimal quantity and type of compute resources (for example, CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs that you submit. With AWS Batch, you do not need to install and manage your own batch computing software or server clusters, which allows you to focus on analyzing results, such as those of your genomic analysis.

Integrating applications into AWS Batch

If you are new to AWS Batch, we recommend reading Setting Up AWS Batch to ensure that you have the proper permissions and AWS environment.

After you have a working environment, you define several types of resources:

  • IAM roles that provide service permissions
  • A compute environment that launches and terminates compute resources for jobs
  • A custom Amazon Machine Image (AMI)
  • A job queue to submit the units of work and to schedule the appropriate resources within the compute environment to execute those jobs
  • Job definitions that define how to execute an application

After the resources are created, you’ll test the environment and create an AWS Lambda function to send generic jobs to the queue.

This genomics workflow covers the basic steps. For more information, see Getting Started with AWS Batch.

Creating the necessary IAM roles

AWS Batch simplifies batch processing by managing a number of underlying AWS services so that you can focus on your applications. As a result, you create IAM roles that give the service permissions to act on your behalf. In this section, deploy the AWS CloudFormation template included in the GitHub repository and extract the ARNs for later use.

To deploy the stack, run the following command from the top level of the repo:

aws cloudformation create-stack --template-body file://batch/setup/iam.template.yaml --stack-name iam --capabilities CAPABILITY_NAMED_IAM

You can capture the output from this stack in the Outputs tab of the CloudFormation console.
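Alternatively, you can pull the same outputs programmatically with boto3. The helper below flattens the Outputs list into a simple dict; the live call is commented out because it requires AWS credentials, and the output key name shown is a hypothetical placeholder — check the template for the real keys:

```python
def outputs_to_map(outputs):
    """Flatten a CloudFormation Outputs list into a {key: value} dict."""
    return {o["OutputKey"]: o["OutputValue"] for o in outputs}


# Live usage sketch (requires AWS credentials); "iam" matches the stack
# name used above, but the output key names depend on the template:
# import boto3
# cfn = boto3.client("cloudformation")
# stack = cfn.describe_stacks(StackName="iam")["Stacks"][0]
# arns = outputs_to_map(stack["Outputs"])
# SERVICEROLE = arns["BatchServiceRoleArn"]  # hypothetical key name
```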

Creating the compute environment

In AWS Batch, you set up a managed compute environment. Managed compute environments automatically launch and terminate compute resources on your behalf, based on the aggregate resources needed by your jobs, such as vCPU and memory, and simple boundaries that you define.

When defining your compute environment, specify the following:

  • Desired instance types in your environment
  • Min and max vCPUs in the environment
  • The Amazon Machine Image (AMI) to use
  • Percentage of the On-Demand price to bid on the Spot Market
  • VPC subnets that can be used

AWS Batch then provisions an elastic and heterogeneous pool of Amazon EC2 instances based on the aggregate resource requirements of jobs sitting in the RUNNABLE state. If a mix of CPU and memory-intensive jobs are ready to run, AWS Batch provisions the appropriate ratio and size of CPU and memory-optimized instances within your environment. For this post, you will use the simplest configuration, in which instance types are set to "optimal", allowing AWS Batch to choose from the latest C, M, and R EC2 instance families.

While you could create this compute environment in the console, we provide the following CLI commands. Replace the subnet IDs and key name with your own private subnets and key, and the image-id with the image you will build in the next section.

ACCOUNTID=<your account id>
SERVICEROLE=<from output in CloudFormation template>
IAMFLEETROLE=<from output in CloudFormation template>
JOBROLEARN=<from output in CloudFormation template>
SUBNETS=<comma delimited list of subnets>
SECGROUPS=<your security groups>
SPOTPER=50 # percentage of on demand
IMAGEID=<ami-id corresponding to the one you created>
INSTANCEROLE=<from output in CloudFormation template>
REGISTRY=${ACCOUNTID}.dkr.ecr.us-east-1.amazonaws.com
KEYNAME=<your key name>
MAXCPU=1024 # max vCPUs in compute environment
ENV=myenv

# Creates the compute environment
aws batch create-compute-environment --compute-environment-name genomicsEnv-$ENV --type MANAGED --state ENABLED --service-role ${SERVICEROLE} --compute-resources type=SPOT,minvCpus=0,maxvCpus=$MAXCPU,desiredvCpus=0,instanceTypes=optimal,imageId=$IMAGEID,subnets=$SUBNETS,securityGroupIds=$SECGROUPS,ec2KeyPair=$KEYNAME,instanceRole=$INSTANCEROLE,bidPercentage=$SPOTPER,spotIamFleetRole=$IAMFLEETROLE
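A newly created compute environment takes a short while to become usable, so it can be handy to wait for it to report ENABLED/VALID before creating queues against it. A minimal sketch, using the response shape of the DescribeComputeEnvironments API (the live boto3 call is commented out because it requires AWS credentials):

```python
def environment_ready(response, name):
    """Check a DescribeComputeEnvironments response for an ENABLED, VALID environment."""
    for ce in response.get("computeEnvironments", []):
        if ce.get("computeEnvironmentName") == name:
            return ce.get("state") == "ENABLED" and ce.get("status") == "VALID"
    return False


# Live usage sketch (requires AWS credentials; ENV=myenv assumed):
# import boto3, time
# batch = boto3.client("batch")
# while not environment_ready(
#         batch.describe_compute_environments(computeEnvironments=["genomicsEnv-myenv"]),
#         "genomicsEnv-myenv"):
#     time.sleep(10)
```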

Creating the custom AMI for AWS Batch

While you can use default Amazon ECS-optimized AMIs with AWS Batch, you can also provide your own image in managed compute environments. We will use this feature to provision additional scratch EBS storage on each of the instances that AWS Batch launches and also to encrypt both the Docker and scratch EBS volumes.

AWS Batch has the same requirements for your AMI as Amazon ECS. To build the custom image, modify the default Amazon ECS-Optimized Amazon Linux AMI in the following ways:

  • Attach a 1 TB scratch volume to /dev/sdb
  • Encrypt the Docker and new scratch volumes
  • Mount the scratch volume to /docker_scratch by modifying /etc/fstab

The first two tasks can be addressed when you create the custom AMI in the console. Spin up a small t2.micro instance, and proceed through the standard EC2 instance launch.

After your instance has launched, record the IP address and then SSH into the instance. Copy and paste the following code:

sudo yum -y update
sudo parted /dev/xvdb mklabel gpt
sudo parted /dev/xvdb mkpart primary 0% 100%
sudo mkfs -t ext4 /dev/xvdb1
sudo mkdir /docker_scratch
echo -e '/dev/xvdb1\t/docker_scratch\text4\tdefaults\t0\t0' | sudo tee -a /etc/fstab
sudo mount -a

This auto-mounts your scratch volume to /docker_scratch, which is your scratch directory for batch processing. Next, create your new AMI and record the image ID.

Creating the job queues

AWS Batch job queues are used to coordinate the submission of batch jobs. Your jobs are submitted to job queues, which can be mapped to one or more compute environments. Job queues have priority relative to each other. You can also specify the order in which they consume resources from your compute environments.

In this solution, use two job queues. The first is for high priority jobs, such as alignment or variant calling. Set this with a high priority (1000) and map it back to the previously created compute environment. Next, set a second job queue for low priority jobs, such as quality statistics generation. To create these job queues, enter the following CLI commands:

aws batch create-job-queue --job-queue-name highPriority-${ENV} --compute-environment-order order=0,computeEnvironment=genomicsEnv-${ENV}  --priority 1000 --state ENABLED
aws batch create-job-queue --job-queue-name lowPriority-${ENV} --compute-environment-order order=0,computeEnvironment=genomicsEnv-${ENV}  --priority 1 --state ENABLED

Creating the job definitions

To run the Isaac aligner container image locally, supply the Amazon S3 locations for the FASTQ input sequences, the reference genome to align to, and the output BAM file. For more information, see tools/isaac/README.md.

The Docker container itself also requires information about a suitable mountable volume so that it can read and write temporary files without running out of space.

Note: In the following example, the FASTQ files as well as the reference files to run are in a publicly available bucket.

FASTQ1=s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz
FASTQ2=s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz
REF=s3://aws-batch-genomics-resources/reference/isaac/
BAM=s3://mybucket/genomic-workflow/test_results/bam/

mkdir ~/scratch

docker run --rm -ti -v ${HOME}/scratch:/scratch $REPO_URI --bam_s3_folder_path $BAM \
--fastq1_s3_path $FASTQ1 \
--fastq2_s3_path $FASTQ2 \
--reference_s3_path $REF \
--working_dir /scratch 

Containers run locally can typically use whatever spare CPU and memory the host machine has available. In AWS Batch, the CPU and memory requirements are hard limits that are allocated to the container image at runtime.

Isaac is a fairly resource-intensive algorithm, as it creates an uncompressed index of the reference genome in memory to match the query DNA sequences. The large memory space is shared across multiple CPU threads, and Isaac can scale almost linearly with the number of CPU threads given to it as a parameter.

To fit these characteristics, choose an optimal instance size to maximize the number of CPU threads based on a given large memory footprint, and deploy a Docker container that uses all of the instance resources. In this case, we chose a host instance with 80+ GB of memory and 32+ vCPUs. The following code is example JSON that you can pass to the AWS CLI to create a job definition for Isaac.

aws batch register-job-definition --job-definition-name isaac-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/isaac",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":80000,
"vcpus":32,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

You can copy and paste the following code for the other three job definitions:

aws batch register-job-definition --job-definition-name strelka-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/strelka",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":32000,
"vcpus":32,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

aws batch register-job-definition --job-definition-name snpeff-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/snpeff",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":10000,
"vcpus":4,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

aws batch register-job-definition --job-definition-name samtoolsStats-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/samtools_stats",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":10000,
"vcpus":4,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

The value for "image" comes from the previous post on creating a Docker image and publishing to ECR. The value for jobRoleArn you can find from the output of the CloudFormation template that you deployed earlier. In addition to providing the number of CPU cores and memory required by Isaac, you also give it a storage volume for scratch and staging. The volume comes from the previously defined custom AMI.

Testing the environment

After you have created the Isaac job definition, you can submit the job using the AWS Batch submitJob API action. While the base mappings for Docker run are taken care of in the job definition that you just built, the specific job parameters should be specified in the container overrides section of the API call. Here’s what this would look like in the CLI, using the same parameters as in the bash commands shown earlier:

aws batch submit-job --job-name testisaac --job-queue highPriority-${ENV} --job-definition isaac-${ENV}:1 --container-overrides '{
"command": [
    "--bam_s3_folder_path", "s3://mybucket/genomic-workflow/test_batch/bam/",
    "--fastq1_s3_path", "s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz",
    "--fastq2_s3_path", "s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz",
    "--reference_s3_path", "s3://aws-batch-genomics-resources/reference/isaac/",
    "--working_dir", "/scratch",
    "--cmd_args", " --exome "
]
}'

When you execute a submitJob call, jobId is returned. You can then track the progress of your job using the describeJobs API action:

aws batch describe-jobs --jobs <jobId returned from submitJob>
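The same check is easy to script. Here is a minimal polling sketch: the helper extracts the status from a DescribeJobs response and is testable offline, while the live boto3 loop is commented out because it requires AWS credentials and the real jobId returned from submit-job:

```python
import time

TERMINAL_STATES = ("SUCCEEDED", "FAILED")


def job_state(response):
    """Extract (status, statusReason) for the first job in a DescribeJobs response."""
    job = response["jobs"][0]
    return job["status"], job.get("statusReason", "")


# Live polling sketch (requires AWS credentials and a real jobId):
# import boto3
# batch = boto3.client("batch")
# status, reason = job_state(batch.describe_jobs(jobs=[job_id]))
# while status not in TERMINAL_STATES:
#     time.sleep(30)
#     status, reason = job_state(batch.describe_jobs(jobs=[job_id]))
```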

You can also track the progress of all of your jobs in the AWS Batch console dashboard.

To see exactly what a RUNNING job is doing, use the link in the AWS Batch console to go to the appropriate log stream in CloudWatch Logs.

Completing the batch environment setup

To finish, create a Lambda function to submit a generic AWS Batch job.

In the Lambda console, create a Python 2.7 Lambda function named batchSubmitJob. Copy and paste the following code. This is similar to the batch-submit-job-python27 Lambda blueprint. Use the LambdaBatchExecutionRole that you created earlier. For more information about creating functions, see Step 2.1: Create a Hello World Lambda Function.

from __future__ import print_function

import json
import boto3

batch_client = boto3.client('batch')

def lambda_handler(event, context):
    # Log the received event
    print("Received event: " + json.dumps(event, indent=2))
    # Get parameters for the SubmitJob call
    # http://docs.aws.amazon.com/batch/latest/APIReference/API_SubmitJob.html
    job_name = event['jobName']
    job_queue = event['jobQueue']
    job_definition = event['jobDefinition']
    
    # containerOverrides, dependsOn, and parameters are optional
    container_overrides = event.get('containerOverrides') or {}
    parameters = event.get('parameters') or {}
    depends_on = event.get('dependsOn') or []
    
    try:
        response = batch_client.submit_job(
            dependsOn=depends_on,
            containerOverrides=container_overrides,
            jobDefinition=job_definition,
            jobName=job_name,
            jobQueue=job_queue,
            parameters=parameters
        )
        
        # Log response from AWS Batch
        print("Response: " + json.dumps(response, indent=2))
        
        # Return the jobId
        event['jobId'] = response['jobId']
        return event
    
    except Exception as e:
        print(e)
        message = 'Error submitting Batch job'
        print(message)
        raise Exception(message)
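To exercise the function, you can configure a test event in the Lambda console. The sketch below builds a minimal payload: the queue and job definition names match the ones created earlier with ENV=myenv, but the command flags and the S3 path are hypothetical placeholders you would swap for your own:

```python
import json

# Minimal test event for the batchSubmitJob function above. The queue and
# job definition names follow the earlier CLI commands (ENV=myenv); the
# command flags and S3 path are illustrative placeholders.
test_event = {
    "jobName": "samtoolsStatsTest",
    "jobQueue": "lowPriority-myenv",
    "jobDefinition": "samtoolsStats-myenv:1",
    "containerOverrides": {
        "command": ["--bam_s3_path", "s3://mybucket/genomic-workflow/test_results/bam/test.bam"]
    },
}

print(json.dumps(test_event, indent=2))
```

Note that the optional fields (parameters, dependsOn) can be omitted entirely; the handler defaults them.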

Conclusion

In Part 3 of this series, you successfully set up your data processing, or batch, environment in AWS Batch. We also provided a Python script in the corresponding GitHub repo that takes care of all of the above CLI arguments for you, as well as building out the job definitions for all of the jobs in the workflow: Isaac, Strelka, SAMtools, and snpEff. You can check the script’s README for additional documentation.

In Part 4, you’ll cover the workflow layer using AWS Step Functions and AWS Lambda.

Please leave any questions and comments below.

[$] Keeping Python competitive

Post Syndicated from jake original https://lwn.net/Articles/723949/rss

Victor Stinner sees a need to improve Python performance in order to keep
it competitive with other languages. He brought up some ideas for doing
that in a 2017 Python Language Summit session. No solid conclusions were
reached, but there is a seemingly growing segment of the core developers
who are interested in pushing Python’s performance much further, possibly
breaking the existing C API in the process.