The Bytecode Alliance is an industry partnership with the aim of forging WebAssembly’s outside-the-browser future by collaborating on implementing standards and proposing new ones. The newly formed alliance has “a vision of a WebAssembly ecosystem that is secure by default, fixing cracks in today’s software foundations”. The alliance is currently working on a standalone WebAssembly runtime, two use-case-specific runtimes, runtime components, and language tooling.
This year saw the second edition of the Automated Testing Summit (ATS) and the first that was open to all. Last year’s ATS was an invitation-only gathering of around 35 developers (that was described in an LWN article), while this year’s event attracted around 50 attendees; both were held in conjunction with the Embedded Linux Conference Europe (ELCE), in Edinburgh, Scotland for 2018 and in Lyon, France this year. The basic problem has not changed—more collaboration is needed between the different kernel testing systems—but the starting points have been identified and work is progressing, albeit slowly. Part of the problem, of course, is that all of these testing efforts have their own constituencies and customers, who must be kept up and running, even while any of this collaborative development is going on.
With AWS CloudFormation, you can model your entire infrastructure with text files. In this way, you can treat your infrastructure as code and apply software development best practices, such as putting it under version control, or reviewing architectural changes with your team before deployment.
Sometimes AWS resources initially created using the console or the AWS Command Line Interface (CLI) need to be managed using CloudFormation. For example, you (or a different team) may create an IAM role, a Virtual Private Cloud, or an RDS database in the early stages of a migration, and then have to spend time including them in the same stack as the final application. In such cases, you often end up recreating the resources from scratch using CloudFormation, and then migrating configuration and data from the original resource.
To make these steps easier for our customers, you can now import existing resources into a CloudFormation stack!
It was already possible to remove resources from a stack without deleting them by setting the DeletionPolicy to Retain. This, together with the new import operation, enables a new range of possibilities. For example, you are now able to:
Create a new stack importing existing resources.
Import existing resources into an already created stack.
During the resource import operation, CloudFormation checks that:
The imported resources do not already belong to another stack in the same region (be careful with global resources such as IAM roles).
The target resources exist and you have sufficient permissions to perform the operation.
The properties and configuration values are valid against the resource type schema, which defines the required and acceptable properties and their supported values.
The resource import operation does not check that the template configuration and the actual configuration are the same. Since the import operation supports the same resource types as drift detection, I recommend running drift detection after importing resources in a stack.
Importing Existing Resources into a New Stack

In my AWS account, I have an S3 bucket and a DynamoDB table, both with some data inside, and I’d like to manage them using CloudFormation. In the CloudFormation console, I have two new options:
I can create a new stack importing existing resources.
I can import resources into an existing stack.
In this case, I want to start from scratch, so I create a new stack. The next step is to provide a template with the resources to import.
I upload the following template with two resources to import: a DynamoDB table and an S3 bucket.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Import test
Resources:
  ImportedTable:
    Type: AWS::DynamoDB::Table
    DeletionPolicy: Retain
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
  ImportedBucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain
```
In this template I am setting DeletionPolicy to Retain for both resources. In this way, if I remove them from the stack, they will not be deleted. This is a good option for resources which contain data you don’t want to delete by mistake, or that you may want to move to a different stack in the future. It is mandatory for imported resources to have a deletion policy set, so you can safely and easily revert the operation, and be protected from mistakenly deleting resources that were imported by someone else.
I now have to provide an identifier to map the logical IDs in the template with the existing resources. In this case, I use the DynamoDB table name and the S3 bucket name. For other resource types, there may be multiple ways to identify them and you can select which property to use in the drop-down menus.
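The same mapping can be supplied from the AWS CLI by creating a change set of type IMPORT. This is only a sketch; the stack name, logical IDs, and resource names below are illustrative:

```shell
# Create an import change set, mapping logical IDs to existing resources
aws cloudformation create-change-set \
    --stack-name imported-stack \
    --change-set-name import-resources \
    --change-set-type IMPORT \
    --template-body file://template.yaml \
    --resources-to-import '[
      {"ResourceType": "AWS::DynamoDB::Table",
       "LogicalResourceId": "ImportedTable",
       "ResourceIdentifier": {"TableName": "my-table"}},
      {"ResourceType": "AWS::S3::Bucket",
       "LogicalResourceId": "ImportedBucket",
       "ResourceIdentifier": {"BucketName": "my-bucket"}}
    ]'

# Review the change set, then execute it to perform the import
aws cloudformation execute-change-set \
    --stack-name imported-stack \
    --change-set-name import-resources
```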
In the final recap, I review changes before applying them. Here I check that I’m targeting the right resources to import with the right identifiers. This is actually a CloudFormation Change Set that will be executed when I import the resources.
When importing resources into an existing stack, no changes are allowed to the stack’s existing resources. The import operation only allows the Change Set action of Import. Changes to parameters are allowed as long as they don’t cause changes to resolved values of properties in existing resources. You can change the template for existing resources to replace hard-coded values with a Ref to a resource being imported. For example, you may have a stack with an EC2 instance using an existing IAM role that was created using the console. You can now import the IAM role into the stack and replace the hard-coded value used by the EC2 instance in the template with a Ref to the role.
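As a sketch of that pattern (the logical IDs and properties here are illustrative, not taken from an actual stack), the imported role is referenced instead of a hard-coded name:

```yaml
Resources:
  ImportedRole:                  # imported IAM role, name illustrative
    Type: AWS::IAM::Role
    DeletionPolicy: Retain
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: ec2.amazonaws.com
            Action: sts:AssumeRole
  InstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Roles:
        - !Ref ImportedRole      # previously a hard-coded role name
```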
Moving on, each resource has its corresponding import events in the CloudFormation console.
When the import is complete, in the Resources tab, I see that the S3 bucket and the DynamoDB table are now part of the stack.
To be sure the imported resources are in sync with the stack template, I use drift detection.
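From the CLI, drift detection is a detect call followed by polling for the result; the stack name here is illustrative:

```shell
# Start drift detection on the stack
DETECTION_ID=$(aws cloudformation detect-stack-drift \
    --stack-name imported-stack \
    --query StackDriftDetectionId --output text)

# Poll for the overall detection status
aws cloudformation describe-stack-drift-detection-status \
    --stack-drift-detection-id "$DETECTION_ID"

# List per-resource drift details
aws cloudformation describe-stack-resource-drifts \
    --stack-name imported-stack
```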
All stack-level tags, including automatically created ones, are propagated to resources that support tagging. For example, I can use the AWS CLI to get the tag set associated with the S3 bucket I just imported into my stack. Those tags give me the CloudFormation stack name and ID, and the logical ID of the resource in the stack template.
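For example, assuming the bucket is named my-imported-bucket, the call looks like:

```shell
aws s3api get-bucket-tagging --bucket my-imported-bucket
```

The returned TagSet includes keys such as aws:cloudformation:stack-name, aws:cloudformation:stack-id, and aws:cloudformation:logical-id.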
Available Now

You can use the new CloudFormation import operation via the console, AWS Command Line Interface (CLI), or AWS SDKs, in the following regions: US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), Canada (Central), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), EU (Frankfurt), EU (Ireland), EU (London), EU (Paris), and South America (São Paulo).
Despite some of the most intense opposition seen in recent years, on March 26, 2019, the EU Parliament adopted the Copyright Directive.
The main controversy surrounded Article 17 (previously known as Article 13), which places greater restrictions on user-generated content platforms like YouTube.
Rightsholders, from the music industry in particular, welcomed the new reality. Without official licensing arrangements in place, or strong efforts to obtain licensing alongside best efforts to take down infringing content and keep it down, online content-sharing service providers (OCSSPs) such as YouTube can potentially be held liable for infringing content.
This uncertainty led many to fear for the future of fair use, with the specter of content upload platforms deploying strict automated filters that err on the side of caution in order to avoid negative legal consequences under the new law.
While the legislation has been passed at the EU level, it still has to be written into Member States’ local law. With that in mind, more than 50 EU academics have published a set of recommendations that they believe have the potential to limit restrictions on user freedoms as a result of the new legislation.
A key recommendation is that national implementations should “fully explore” legal mechanisms for broad licensing of copyrighted content. The academics are calling for this to ensure that the preventative obligations of OCSSPs are limited in application wherever possible.
The academics hope that broad licensing can avoid situations where, to avoid liability, OCSSPs would otherwise have to prove they have made “best efforts” to ensure works specified by rightsholders are rendered inaccessible, or show that they have “acted expeditiously” to remove content and prevent its reupload following a request from a rightsholder.
“Otherwise, the freedom of EU citizens to participate in democratic online content creation and distribution will be encroached upon and freedom of expression and information in the online environment would be curtailed,” the academics warn.
The academics’ recommendations are focused on ensuring that non-infringing works don’t become collateral damage as OCSSPs scramble to cover their own backs and avoid liability.
For example, the preventative obligations listed above should generally not come into play when content is used for quotation, criticism, or review, or for the purpose of caricature, parody or pastiche. If content is removed or filtered incorrectly, however, Member States must ensure that online content-sharing service providers put in place an “effective and expeditious” complaint and redress system.
The prospect of automatic filtering at the point of upload was a hugely controversial matter before Article 17 passed but the academics believe they have identified ways to ensure that freedom of expression and access to information can be better protected.
“[W]e recommend that where preventive measures [as detailed above] are applied, especially where they lead to the filtering and blocking of uploaded content before it is made available to the public, Member States should, to the extent possible, limit their application to cases of prima facie [upon first impression] copyright infringement,” the academics write.
“In this context, a prima facie copyright infringement means the upload of protected material that is identical or equivalent to the ‘relevant and necessary information’ previously provided by the rightholders to OCSSPs, including information previously considered infringing. The concept of equivalent information should be interpreted strictly.”
The academics say that if content is removed on the basis of prima facie infringement, users are entitled to activate the complaint and redress procedure. If there is no prima facie infringement, content should not be removed until its legal status is determined.
In cases where user-uploaded content does not meet the prima facie standard but matches “relevant and necessary information” (fingerprints etc) supplied by rightsholders, OCSSPs must grant users the ability to declare that content is not infringing due to fair use-type exceptions.
“The means to provide such declaration should be concise, transparent, intelligible, and be presented to the user in an easily accessible form, using clear and plain language (e.g. a standard statement clarifying the status of the uploaded content, such as ‘This is a permissible quotation’ or ‘This is a permissible parody’),” the recommendations read.
If users don’t provide a declaration within a “reasonable” time following upload, the OCSSP (YouTube etc) should be “allowed” to remove the content, with users granted permission to activate the complaint and redress procedure.
Rightsholders who still maintain that content was removed correctly must then justify the deletion, detailing why it is a prima facie case of infringement and not covered by a fair use-type exemption, particularly the one cited by the user.
A human review should then be conducted at the OCSSP, which should not be held liable for infringement under Article 17 until the process is complete and legality determined.
Given that Article 17 has passed, there appears to be limited room to maneuver and there is a long way to go before all Member States write its terms into local law.
However, even if the above safeguarding recommendations are implemented, it’s clear that substantial resources will have to be expended to ensure that everyone’s rights are protected. As a result, platforms lacking YouTube-sized budgets will undoubtedly feel the pinch.
Safeguarding User Freedoms in Implementing Article 17 of the Copyright in the Digital Single Market Directive: Recommendations from European Academics is available here.
With the 75th anniversary of the D-Day landings very much in the news this year, Adam Clark found himself interested in all things relating to that era. So it wasn’t long before he found himself on the Internet Archive listening to some of the amazing recordings of radio broadcasts from that time. In this month’s HackSpace magazine, Adam details how he built his WW2 radio-broadcast time machine using a Raspberry Pi Zero W, and provides you with the code to build your own.
As good as the recordings on the Internet Archive were, it felt as if something was missing by listening to them on a modern laptop, so I wanted something to play them back on that was more evocative of that time, and would perhaps capture the feeling of listening to them on a radio set.
I also wanted to make the collection portable and to make the interface for selecting and playing the tracks as easy as possible – this wasn’t going to be screen-based!
Another important consideration was to house the project in something that would not look out of place in the living room, and not to give away the fact that it was being powered by modern tech.
So I came up with the idea of using an original radio as the project case, and to use as many of the original knobs and dials as possible. I also had the idea to repurpose the frequency dial to select individual years of the war and to play broadcasts from whichever year was selected.
Of course, the Raspberry Pi was the obvious choice to run all this, and ideally I wanted to use a Raspberry Pi Zero to keep the costs down and perhaps to allow future expansion beyond a standalone playback device.
Right off the bat, I knew that I would have a couple of obstacles to overcome as the Raspberry Pi Zero doesn’t have an easy way to play audio out, and I also wanted to have analogue inputs for the controls. So the first thing was to get some audio playing to see if this was possible.
The first obstacle was to find a satisfactory way to playback audio. In the past, I have had some success using PWM pins, but this needs a low-pass filter as well as an amplifier, and the quality of audio was never as good as I’d hoped for.
One alternative is to use one of the many HATs available, but these come at a price, as they are normally aimed at more serious audio quality. I wanted to keep the cost down, so these were excluded. The remaining option was a mono I2S 3W amplifier breakout board, the MAX98357A from Adafruit, which is extremely simple to use.
As the BBC didn’t start broadcasting stereo commercially until the late 1950s, this was also very apt for the radio (which only has one speaker). Connecting up this board is very easy – it just requires three GPIO pins, power, and the speaker. For this, I just soldered some female jumper leads to the breakout board and connected them to the header pins of the Raspberry Pi Zero. There are detailed instructions on the Adafruit website for this which basically entails running their install script.
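Under the hood, the Adafruit install script essentially enables the I2S DAC device-tree overlay in /boot/config.txt. The exact lines can vary by OS release, but they are roughly these:

```
# /boot/config.txt - I2S audio for the MAX98357A
dtoverlay=hifiberry-dac
dtparam=i2s=on
# and the onboard analogue audio is disabled by removing/commenting:
# dtparam=audio=on
```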
I’d now got a nice playback device that would easily play the MP3 files downloaded from archive.org and so the next task was to find a suitable second-hand radio set.
Preparing the case
After a lot of searching on auction sites, I eventually found a radio that was going to be suitable: wasn’t too large, was constructed from wood, and looked old enough to convince the casual observer. I had to settle for something that actually came from the early 1950s, but it drew on design influences from earlier years and wasn’t too large as a lot of the real period ones tended to be (and it was only £15). This is a fun project, so a bit of leeway was fine by me in this respect.
When the radio arrived, my first thought as a tinkerer was perhaps I should get the valves running, but a quick piece of research turned up that I’d probably have to replace all the resistors and capacitors and all the old wiring and then hope that the valves still worked. Then discovering that the design used a live chassis running at 240 V soon convinced me that I should get back on track and replace everything.
With a few bolts and screws removed, I soon had an empty case.
I then stripped out all the interior components and set about restoring the case and dial glass, seeing what I could use by way of the volume and power controls. Sadly, there didn’t seem to be any way to hook into the old controls, so I needed to design a new chassis to mount all the components, which I did in Tinkercad, an online 3D CAD package. The design was then downloaded and printed on my 3D printer.
It took a couple of iterations, and during this phase, I wondered if I could use the original speaker. It turned out to be absolutely great, and the audio took on a new quality and brought even more authenticity to the project.
The case itself was pretty worn and faded, and the varnish had cracked, so I decided to strip it back. The surface was actually veneer, but you can still sand this. After a few applications of Nitromors to remove the varnish, it was sanded to remove the scratches and finished off with fine sanding.
The wood around the speaker grille was pretty cracked and had started to delaminate. I carefully removed the speaker grille cloth, and fixed these with a few dabs of wood glue, then used some Tamiya brown paint to colour the edges of the wood to blend it back in with the rest of the case. I was going to buy replacement cloth, but it’s fairly pricey – I had discovered a trick of soaking the cloth overnight in neat washing-up liquid and cold water, and it managed to lift the years of grime out and give it a new lease of life.
At this point, I should have just varnished or used Danish oil on the case, but bitten by the restoration bug I thought I would have a go at French polishing. This gave me a huge amount of respect for anyone that can do this properly. It’s messy, time-consuming, and a lot of work. I ended up having to do several coats, and with all the polishing involved, this was probably one of the most time-consuming tasks, plus I ended up with some pretty stained fingers as a result.
The rest of the case was pretty easy to clean, and the brass dial pointer polished up nice and shiny with some Silvo polish. The cloth was glued back in place, and the next step was to sort out the dial and glass.
Frequency, volume, glass, and knobs
Unfortunately, the original glass was cracked, so a replacement part was cut from some Makrolon sheet, also known as Lexan. I prefer this to acrylic as it’s much easier to cut and far less likely to crack when drilling it. It’s used as machine guards as well and can even be bent if necessary.
With the dial, I scanned it into the PC and then in PaintShop I replaced the existing frequency scale with a range of years running from 1939 to 1945, as the aim was for anyone using the radio to just dial the year they wanted to listen to. The program will then read the value of the potentiometer, and randomly select a file to play from that year.
It was also around about now that I had to come up with some means of having the volume control the sound and an interface for the frequency dial. Again there are always several options to consider, and I originally toyed with using a couple of rotary encoders and using one of these with the built-in push button as the power switch, but eventually decided to just use some potentiometers. Now I just had to come up with an easy way to read the analogue value of the pots and get that into the program.
There are quite a few good analogue-to-digital boards and HATs available, but with simplicity in mind, I chose to use an MCP3002 chip as it was only about £2. This is a two-channel analogue-to-digital converter (ADC) and outputs the data as a 10-bit value onto the SPI bus. This sounds easy when you say it, but it proved to be one of the trickier technical tasks, as none of the code around for the four-channel MCP3008 seemed to work for the MCP3002, nor did many of the examples that were around for the MCP3002 – I think I went through about a dozen examples. At long last, I did find some code examples that worked, and with a bit of modification, I had a simple way of reading the values from the two potentiometers. You can download the original code by Stéphane Guerreau from GitHub. To use this on your Raspberry Pi, you’ll also need to run raspi-config and switch on the SPI interface. Then it is simply a case of hooking up the MCP3002, connecting the pots between the 3v3 line and ground, and reading the voltage level from the wiper of the pots. When coding this, I just opted for some simple if-then statements in Python to determine where the dial was pointing, and tweaked the values in the code until I got each year to be picked out.
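A minimal sketch of that reading code, assuming the spidev library and the usual MCP3002 command bytes; the year thresholds and names are illustrative, not Adam’s exact values:

```python
def decode_mcp3002(raw):
    """Combine the two-byte SPI response into a 10-bit reading.

    The low 2 bits of the first byte are the top bits of the sample.
    """
    return ((raw[0] & 0x03) << 8) | raw[1]


def dial_to_year(value):
    """Map a 10-bit pot reading (0-1023) onto the dial years 1939-1945.

    1024 / 7 years is roughly 147 counts per year; in practice you
    tweak the thresholds until each year is picked out reliably.
    """
    return 1939 + min(value // 147, 6)


def read_channel(spi, channel):
    """Read channel 0 or 1 of the MCP3002 over SPI."""
    # Command byte: start bit, single-ended mode, channel select, MSB first
    cmd = 0x68 if channel == 0 else 0x78
    return decode_mcp3002(spi.xfer2([cmd, 0x00]))


if __name__ == "__main__":
    try:
        import spidev            # only available on the Raspberry Pi
    except ImportError:
        spidev = None
    if spidev:
        spi = spidev.SpiDev()
        spi.open(0, 0)           # SPI bus 0, chip select 0
        spi.max_speed_hz = 1_000_000
        print("Dial set to", dial_to_year(read_channel(spi, 0)))
```

The same pattern extends to the second channel for the volume pot.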
Power supply and control
One of the challenges when using a Raspberry Pi in headless mode is that it likes to be shut down in an orderly fashion rather than just having the power cut. There are lots of examples that show how you can hook up a push button to a GPIO pin and initiate a shutdown script, but to get the Raspberry Pi to power back up, you need to physically reset the power. To overcome this piece of the puzzle, I use a Pimoroni OnOff SHIM, which cleverly lets you press a button to start up, and then press and hold it for a second to start a shutdown. It’s costly in comparison to the price of a Raspberry Pi Zero, but I’ve not found a more convenient option. The power itself is supplied by an old power bank that I had, which is ample enough to power the radio long enough to be shown off, and it can be powered by a USB connector if longer-term use is required.
To illuminate the dial, I connected a small LED in series with a 270R resistor to the 3v3 rail so that it would come on as soon as the Raspberry Pi received power, and this lets you easily see when it’s on without waiting for the Raspberry Pi to start up.
If you’re interested in the code Adam used to build his time machine, especially if you’re considering making your own, you’ll find it all in this month’s HackSpace magazine. Download the latest issue for free here, subscribe for more issues here, or visit your local newsagent or the Raspberry Pi Store, Cambridge to pick up the magazine in physical, real-life, in-your-hands print.
The Uber car that hit and killed Elaine Herzberg in Tempe, Ariz., in March 2018 could not recognize all pedestrians, and was being driven by an operator likely distracted by streaming video, according to documents released by the U.S. National Transportation Safety Board (NTSB) this week.
But while the technical failures and omissions in Uber’s self-driving car program are shocking, the NTSB investigation also highlights safety failures that include the vehicle operator’s lapses, lax corporate governance of the project, and limited public oversight.
The details of what happened in the seconds before the collision are worth reading. They describe a cascading series of issues that led to the collision and the fatality.
As computers continue to become part of everyday things, and affect the world in a direct physical manner, this kind of scrutiny will become even more important.
I have deliberately not written here about the case of the priest and his lawsuit in Sliven. As I have discussed many times before, the question is not medical but legal and psychological. Doctors and immunologists do not debate whether vaccines should be administered, but which ones and in what combination. How to encourage people not to skip them, for themselves and their children, is a matter of the social system, health culture, and conditions in a given society.
Everyone agrees that the decision of the Sliven Administrative Court rests on a misreading of the law and the convention, and that it will fall on appeal. We are still waiting for an answer from the Ministry of Health on whether and what it has undertaken in this case at all. Meanwhile, however, the priest’s “achievement” has led not only to raised eyebrows but to jubilation across many anti-vaxxer groups.
Although symbolic, what Yanakiev has done carries a significant negative public effect. We see a similar one from others actively fighting immunization and the prevention of infectious diseases in general. So, as I did last year, I decided to do something good in his name.
A donation has now been made in Father Yanakiev’s name to the global fight against polio. The donation is 100 leva: one lev for every case recorded so far this year, according to the latest data. Thanks to the Bill and Melinda Gates Foundation, the donation is tripled. The money raised will be used to buy more vaccines, store them, and provide local health workers and their physical security. If he wishes, he can even download a card commemorating his donation.
This year we saw a fourfold increase in cases of one of the wild strains of poliomyelitis in the last two countries where it still occurs, Afghanistan and Pakistan. The good news, though, is that strain 3 has now been officially declared eradicated, after the same fate reached strain 2, and the entire continent of Africa has been declared polio-free.
There is still a long way to go before this disease is consigned to history. I am glad that Father Yanakiev is helping, in this way, to get more people immunized, even though his other activities effectively help bring back diseases like measles and rubella.
If you too want to donate to the fight against polio and other diseases, you can do so at End Polio, on the UNICEF website, or to the Bob Keegan fund supporting health workers who have been victims of attacks by religious anti-vax fanatics. If you want to learn more about the fight against polio, you will find everything at the GEI initiative.
In February, several major Hollywood studios filed a lawsuit against Omniverse One World Television.
Under the flag of anti-piracy group ACE, the companies accused Omniverse and its owner Jason DeMeo of supplying pirate streaming channels to various IPTV services.
Omniverse sold live-streaming services to third-party distributors, such as Dragon Box and HDHomerun, which in turn offered live TV streaming packages to customers. According to ACE, the company was a pirate streaming TV supplier, offering these channels without permission from its members.
Omniverse disagreed with this characterization and countered that it did everything by the book. It relied on a deal with the licensed cable company Hovsat, which has a long-standing agreement with DirecTV to distribute a broad range of TV channels with few restrictions.
As time went on, however, it transpired that the streaming provider was clearly worried about the legal threat. After several of its distributors distanced themselves from the service, Omniverse decided to wind down its business.
The streaming provider also filed a third-party complaint (pdf) against Hovsat for indemnification and breach of contract, among other things. Omniverse believed that it was properly licensed and wanted Hovsat to pay the damages for any alleged infringements if that was not the case.
That there are damages became crystal clear yesterday, when ACE announced that it had obtained a consent judgment against Omniverse. Both parties have agreed to settle the matter with the streaming provider committing to pay a $50 million settlement.
“Damages are awarded in favor of Plaintiffs and against Defendants, jointly and severally, in the total amount of fifty million dollars,” the proposed judgment reads.
The agreement also includes a permanent injunction that prevents Omniverse and its owner Jason DeMeo from operating the service and being involved in supplying or offering pirate streaming channels in any other way.
The damages amount of $50 million is a substantial figure. In the past, however, we have seen that the public figure can be substantially higher than what’s agreed in private. In any case, Omniverse may hold Hovsat accountable, as previously suggested.
Karen Thorland, Senior Vice President at the Motion Picture Association, which has a leading role in the ACE coalition, is pleased with the outcome.
“This judgment and injunction are a major win for creators, audiences, and the legitimate streaming market, which has been undermined by Omniverse and its ‘back office’ piracy infrastructure for years,” Thorland says.
Over the past years, ACE has built a steady track record of successful cases against IPTV providers and services. In addition to Omniverse, it also helped to shut down SetTV, Dragon Box, TickBox, Vader Streams, and many third-party Kodi addons.
The consent judgment and permanent injunction (pdf) have yet to be signed off by the court but since both parties are in agreement, that’s mostly a formality.
In July 2013, VMRO activist Stoyan Bozhinov, who is now entering the National Assembly in place of the newly elected mayor of Sandanski, sold his company SMRM Trans EOOD to one Vachko Vachkov. At that point the company owed almost half a million leva to the National Revenue Agency (NAP). More than a year earlier, in April 2012, NAP had issued the company an audit assessment for 489,254.07 leva for a denied right to deduct tax credit on invoices.
The new owner challenged NAP’s audit assessment in court, but it was conclusively upheld by Decision 1322 of 2017 of the Supreme Administrative Court (VAS). In the meantime, through the company, Vachko Vachkov took out loans from DSK Bank and managed to run up debts totalling an impressive 3.6 million leva to NAP and DSK Bank combined.
With the sale of a company owing nearly half a million to the treasury, Stoyan Bozhinov beat the record of former GERB MP Dimitar Gamishev. As Bivol revealed, in 2016 Gamishev managed to rid himself of a company that owed NAP 200,000 leva by selling it to a straw man.
After the ensuing scandal, Gamishev resigned as an MP, and NAP sued him to recover the sum. The tax authorities also asked that 134 similar transactions transferring assets worth over 20 million leva be declared void (as reported by the newspaper Sega).
Bozhinov: “I am super clean, but I was screwed over”
Bivol contacted Stoyan Bozhinov for comment via the social network Facebook. He was adamant that he is “super clean” and was “screwed over” by the company’s bad management, which had been entrusted to another person, Maria Miteva.
“The company was actually run by another person, with a general power of attorney, and I did not have a single signed document. When I saw that the company was piling up debts, I asked the general proxy to transfer the company to an employee of hers and to litigate with NAP as much as she wanted. The proxy did just that. The proxy’s name is Maria Miteva. The whole time, the company’s accounting and management were handled by Maria Miteva. I did not sign off on a single deal,” Bozhinov stated.
Asked whether he nonetheless feels responsible for the large debt, Bozhinov elaborated:
I was not involved with this company; I had simply handed it over to Maria to run. When I learned it was not doing well, I asked her to transfer it to some friend of hers, which she did. I was screwed over in this case too.
Stoyan Bozhinov disagreed with the reminder that the case resembles Gamishev’s. “The liability was not final, and then this company took out loans, and the state could have recovered everything. I don’t know why it didn’t,” he said. “Besides, the legislation was different then.”
It remains unclear, however, how transferring the liabilities to another person “to litigate with NAP” solves the reputational problem of unpaid tax on this scale and the use of front men.
Maria Miteva has been questioned by the Varna District Prosecutor’s Office in a criminal case over tax fraud. Pre-trial proceedings No. 330 ZM-477/2015 were opened on 25 April 2012 following a notification from NAP’s territorial directorate in Varna. NAP’s complaint targeted the company Makskom 81 EOOD, and SMRM Trans figures among the companies to which invoices were issued. During questioning, Miteva claimed the transaction was real and had been paid through a bank.
Делото е спирано и възобновявано няколко пъти през 2013, 2014, 2015 и 2016 г. НАП обаче постоянства и дори през 2016 г. успява да осъди Прокуратурата да продължи досъдебното производство, след като наблюдаващия прокурор го прекратява за пореден път. Това става с определение номер 47 от 18.03.2016 г. на Окръжен съд – Разград. По-нататъшната съдба на това досъдебно производство не е известна.
През 2010 г. транспортният бизнес на семейството на Божинов претърпява неудачи в Гърция. Властите в южната ни съседка конфискуват камиони със стока на фирмата “Сторос” ЕООД, която тогава е собственост на майката на Божинов. Задържани са и шофьорите и започват наказателни дела.
“Няма осъдени шофьори. Не е осъдена и фирма Сторос. Там спедиторите бяха правили някакви далавера. Те са осъдени, гръцки спедиторите. Ние сме били добросъвестни, но за съжаление така е в транспорта.” – коментира Божинов този случай.
Впоследствие камионите са продадени на търг от гръцката държава на безценица. Божинов разказа, че обмислял да съди гръцката държава, тъй като фирмата му претърпяла щети без да има вина. Но тежките процедури и големия депозит го разубедили да завежда дело в Гърция.
FogHorn is an intelligent Internet of Things (IoT) edge solution that delivers data processing and real-time inference where data is created. Referring to itself as “the only ‘real’ edge intelligence solution in the market today,” FogHorn is powered by a hyper-efficient Complex Event Processor (CEP) that delivers comprehensive data enrichment and real-time analytics on high volumes, varieties, and velocities of streaming sensor data, and is optimized for constrained compute footprints and limited connectivity.
Andrea Sabet, an AWS Solutions Architect, speaks with Ramya Ravichandar, Vice President of Products at FogHorn, about how FogHorn integrates with IoT MQTT for edge-to-edge communication as well as Amazon SageMaker for deep learning model deployment. The edgefication process involves running inference on real-time streaming data against a trained deep learning model. Drifts in model accuracy trigger a callback to SageMaker for retraining.
In the weeks leading up to re:Invent 2019, we’ll share conversations we’ve had with people at AWS who will be presenting at the event so you can learn more about them and some of the interesting work that they’re doing.
How long have you been at AWS, and what do you enjoy most in your current role?
It’s been two and a half years already! Time has flown. I’m the product manager for AWS CloudHSM. As with most product managers at AWS, I’m the CEO of my product. I spend a lot of my time talking to customers who are looking to use CloudHSM, to understand the problems they are looking to solve. My goal is to make sure they are looking at their problems correctly. Often, my role as a product manager is to coach. I ask a lot of why’s. I learned this approach after I came to AWS—before that I had the more traditional product management approach of listening to customers to take requirements, prioritize them, do the marketing, all of that. This notion of deeply understanding what customers are trying to do and then helping them find the right path forward—which might not be what they were thinking of originally—is something I’ve found unique to AWS. And I really enjoy that piece of my work.
What are you currently working on that you’re excited about?
CloudHSM is a hardware security module (HSM) that lets you generate and use your own encryption keys on AWS. However, CloudHSM is weird in that, by design, you’re explicitly outside the security boundary of AWS managed services when you use it: You don’t use AWS IAM roles, and HSM transactions aren’t captured in AWS CloudTrail. You transact with your HSM over an end-to-end encrypted channel between your application and your HSM. It’s more similar to having to operate a 3rd party application in Amazon Elastic Compute Cloud (EC2) than it is to using an AWS managed service. My job, without breaking the security and control the service offers, is to continue to make customers’ lives better through more elastic, user-friendly, and reliable HSM experiences.
We’re currently working on simplifying cross-region synchronization of CloudHSM clusters. We’re also working on simplifying management operations, like adjusting key attributes or rotating user passwords.
Another really exciting thing that we’re working on is auto-scaling for HSM clusters based on load metrics, to make CloudHSM even more elastic. CloudHSM already broke the mold of traditional HSMs with zero-config cluster scaling. Now, we’re looking to expand how customers can leverage this capability to control costs without sacrificing availability.
What’s the most challenging part of your job?
For one, time management. AWS is so big, and our influence is so vast, that there’s no end to how much you can do. As Amazonians, we want to take ownership of our work, and our bias for action pushes us to accomplish everything quickly. Still, you have to live to fight another day, so prioritizing and saying no is necessary. It’s hard!
I also challenge myself to continue to cultivate the patience and collaboration that gets a customer on a good security path. It’s very easy to say, This is what they’re asking for, so let’s build it—it’s easy, it’s fast, let’s do it. But that’s not the customer obsessed solution. It’s important to push for the correct, long-term outcome for our customers, and that often means training, and bringing in Solutions Architects and Support. It means being willing to schedule the meetings and take the calls and go out to the conferences. It’s hard, but it’s the right thing to do.
What’s your favorite part of your job?
Shipping products. It’s fun to announce something new, and then watch people jump on it and get really excited.
I still really enjoy demonstrating the elastic nature of CloudHSM. It sounds silly, but you can delete a CloudHSM instance and then create a new HSM with a simple API call or console button click. We save your state, so it picks up right where you left off. When you demo that to customers who are used to the traditional way of using on-premises HSMs, their eyes light up—it’s like being a kid in the candy store. They see a meaningful improvement to the experience of managing HSMs that they never thought was possible. It’s so much fun to see their reaction.
What does cloud security mean to you, personally?
At the risk of hubris, I believe that to some extent, cloud security is about the survival of the human race. 15-20 years ago, we didn’t have smart phones, and the internet was barely alive. What happened on one side of the planet didn’t immediately and irrevocably affect what happened on the opposite side of the planet. Now, in this connected world, my children’s classrooms are online, my assets, our family videos, our security system—they are all online. With all the flexibility of digital systems comes an enormous amount of responsibility on the service and solution providers. Entire governments, populations, and countries depend on cloud-based systems. It’s vital that we stay ten steps ahead of any potential risk. I think cloud security functions similar to the way that antibiotics and vaccinations function—it allows us to prevent, detect and treat issues before they become serious threats. I am very, very proud to be part of a team that is constantly looking ahead and raising the bar in this area.
What’s the most common misperception you encounter with customers about cloud security?
That you have to directly configure and use your HSMs to be secure in the cloud. In other words, I’m constantly telling people they do not need to use my product.
To some extent, when customers adopt CloudHSM, it means that we at AWS have not succeeded at giving them an easier to use, lower cost, fully managed option. CloudHSM is expensive. As easy as we’ve made it to use, customers still have to manage their own availability, their own throttling, their own users, their own IT monitoring.
We want customers to be able to use fully managed security services like AWS KMS, ACM Private CA, AWS Code Signing, AWS Secrets Manager and similar services instead of rolling their own solution using CloudHSM. We’re constantly working to pull common CloudHSM use cases into other managed services. In fact, the main talk that I’m doing at re:Invent will put all of our security services into this context. I’m trying to make the point that traditional wisdom says that you have to use a dedicated cryptographic module via CloudHSM to be secure. However, practical wisdom, with all of the advances that we’ve made in all of the other services, almost always indicates that KMS or one of the other managed services is the better option.
In your opinion, what’s the biggest challenge facing cloud security right now?
From my vantage point, I think the challenge is the disconnect between compliance and security officers and DevOps teams.
DevOps people want to know things like: Can you rotate your keys? Can you detect breaches? Can you be agile with your encryption? But I think that security and compliance folks still tend to gravitate toward a focus on creating and tracking keys and cryptographic material. When you try to adapt those older, more established methodologies, I think you give away a lot of the power and flexibility that would give you better resilience.
Five or more years from now, what changes do you think we’ll see across the security landscape?
I think what’s coming is a fundamental shift in the roots of trust. Right now, the prevailing notion is that the roots of trust are physically, logically, and administratively separate from your day to day compute. With Nitro and Firecracker and more modern, scalable ways of local roots of trust, I look forward to a day, maybe ten years from now, when HSMs are obsolete altogether, and customers can take their key security wherever they go.
I also think there is a lot of work being done, and to be done, in encrypted search. If at the end of the day you can’t search data, it’s hard to get the full value out of it. At the same time, you can’t have it in clear text. Searchable encryption currently has and will likely always have limitations, but we’re optimistic that encrypted search for meaningful use cases can be delivered at scale.
I talk to customers at networking conferences run by AWS—and also recently at Grace Hopper—about what content they’d like from us. A recurring request is guidance on navigating the many options for security and cryptography on AWS. They’re not sure where to start, what they should use, or the right way to think about all these security services.
So the genesis of this talk was basically, Hey, let’s provide some kind of decision tree to give customers context for the different use cases they’re trying to solve and the services that AWS provides for those use cases! For each use case, we’ll show the recommended managed service, the alternative service, and the pros and cons of both. We want the customer’s decision process to go beyond just considerations of cost and day one complexity.
What are you hoping that your audience will do differently as a result of attending this session?
I’d like DevOps attendees to be able to articulate their operational needs to their security planning teams more succinctly and with greater precision. I’d like auditors and security planners to have a wider, more realistic view of AWS services and capabilities. I’d like customers as a whole to make the right choice for their business and their own customers. It’s really important for teams as a whole to understand the problem they’re trying to solve. If they can go into their planning and Ops meetings armed with a clear, comprehensive view of the capabilities that AWS offers, and if they can make their decisions from the position of rational information, not preconceived notions, then I think I’ll have achieved the goals of this session.
You’re also co-presenting a deep-dive session along with Rohit Mathur on CloudHSM. What can you tell us about the session that’s not described in the re:Invent catalog?
So, what the session actually should be called is: If you must use CloudHSM, here’s how you don’t shoot your foot.
In the first half of the deep dive, we explain how CloudHSM is different than traditional HSMs. When we made it agile, elastic, and durable, we changed a lot of the traditional paradigms of how HSMs are set up and operated. So we’ll spend a good bit of time explaining how things are different. While there are many things you don’t have to worry about, there are some things that you really have to get right in order for your CloudHSM cluster to work for you as you expect it to.
We’ll talk about how to get maximum power, flexibility, and economy out of the CloudHSM clusters that you’re setting up. It’s somewhat different from a traditional model, where the HSM was just one appliance owned by one customer, and the hardware, software, and support all came from a single vendor. CloudHSM is AWS native, so you still have the single-tenant, third-party, FIPS 140-2 validated hardware, but your software and support are coming from AWS. A lot of the integrations and operational aspects of it are very “cloudy” in nature now. Getting customers comfortable with how to program, monitor, and scale is a lot of what we’ll talk about in this session.
We’ll also cover some other big topics. I’m very excited that we’ll talk about trusted key wrapping. It’s a new feature that allows you to mark certain keys as trusted and then control the attributes of keys that are wrapped and unwrapped with those trusted keys. It’s going to open up a lot of flexibility for customers as they implement their workloads. We’ll include cross-region disaster recovery, which tends to be one of the more gnarly problems that customers are trying to solve. You have several different options to solve it depending on your workloads, so we’ll walk you through those options. Finally, we’ll definitely go through performance because that’s where we see a lot of customer concerns, and we really want our users to get the maximum throughput for their HSM investments.
Any advice for first-time attendees coming to re:Invent?
Wear comfortable shoes … and bring Chapstick. If you’ve never been to re:Invent before, prepare to be overwhelmed!
Also, come prepared with your hard questions and seek out AWS experts to answer them. You’ll find resources at the Security booth, you can DM us on Twitter, catch us before or after talks, or just reach out to your account manager to set up a meeting. We want to meet customers while we’re there, and solve problems for you, so seek us out!
You like philosophy. Who’s your favorite philosopher and why?
Rabindranath Tagore. He’s an Indian poet who writes with deep insight about homeland, faith, change, and humanity. I spent my early childhood in the US, then grew up in Bombay and have lived across the Pacific Northwest, the East Coast, the Midwest, and down south in Louisiana in equal measure. When someone asks me where I’m from, I have a hard time answering honestly because I’m never really sure. I like Tagore’s poems because he frames that ambiguity in a way that makes sense. If you abstract the notion of home to the notion of what makes you feel at home, then answers are easier to find!
THE INSTITUTE
Citizens in several cities including Aspen, Colo.; Bern, Switzerland; San Diego, Calif.; and Totnes, England have been protesting the installation of 5G wireless base stations over concerns about the harmful effects these network nodes could have on humans, animals, and plants. They point to the potential danger of radio frequency (RF) radiation emitted from antennas installed in close proximity to people.
Protestors also cite the lack of scientific evidence showing that 5G signals, specifically those transmitting in the millimeter wave region of the electromagnetic spectrum, are safe. Today’s mobile devices operate at frequencies below 6 gigahertz, while 5G will use frequencies from 600 megahertz and above, including the millimeter wave bands between 30 GHz and 300 GHz.
Enough concern has been raised about 5G that some cities have cancelled or delayed the installation of the base stations.
The Institute asked two members of the IEEE initiative about their take on the controversy over 5G. IEEE Fellow Rod Waterhouse is on the editorial board of the initiative’s Tech Focus publication and edited the 5G report. His research interests include antennas, electromagnetics, and microwave photonics engineering. He’s the CTO and cofounder of Octane Wireless in Hanover, Md.
IEEE Senior Member David Witkowski is cochair of the initiative’s Deployment Working Group. He’s a wireless and telecommunications industry expert. Witkowski is the executive director of the Wireless Communications Initiative for Joint Venture Silicon Valley, a nonprofit based in San Jose, Calif., that works to solve problems in that region such as communications, education, and transportation.
Most of the concerns about 5G’s supposed negative impact on health stem from its cell towers having such a different architecture than the ones supporting today’s 3G and 4G cellular networks, Waterhouse says. Those towers are kilometers apart and placed on tall, raised structures that are typically located away from populated areas. Because a 5G base station can be smaller than a backpack, it can be placed just about anywhere, such as on top of light poles, street lights, and rooftops. That means the stations will be located near houses, apartment buildings, schools, stores, parks, and farms.
“Wireless companies are going to incorporate the devices into everyday structures, such as benches and bus stops, so they’ll be lower to the ground and closer to people,” Waterhouse says. “There also will be more of these base stations [compared with the number of cell towers around today] because of their limited reach. A 5G mm network requires cell antennas to be located every 100 to 200 meters.”
That being said, one of the benefits of these small base stations is that they would not have to transmit as much power as current cell towers, because the coverage areas are smaller.
“If the same amount of power that’s currently transmitted from a cell tower located 30 meters up were to be transmitted from a 5G base station installed at a bus stop, then there would be cause for concern,” says Waterhouse, “But that will not be the case.”
A 5G radio replacing a 4G radio at 750 MHz will have the same coverage as the 4G radio, presuming no change to the antenna, according to Witkowski. But, of course, it will provide higher data rates and quicker network response times.
Waterhouse predicts that 5G will be rolled out in two stages. The first, he says, would operate in bands closer to the slice of spectrum—below 6 GHz—where 4G equipment works. “There will be a little bit more bandwidth or faster data rates for everyone,” he says. “Also, 5G base stations will only be in certain small areas, not everywhere.”
In the next phase, which he calls 5G Plus, there will be huge improvement in bandwidth and data rates because there will be more base stations and they will be using mm wave frequencies.
Witkowski says U.S. carriers that already have dense deployments in sub-6 GHz bands will start deployment of 5G in the K/Ka band and mm wave. There also will be some swapping of 3G and 4G radios for newer 5G radios.
“For the U.S. carriers that have access to vacated/re-farmed spectrum, such as T-Mobile in 600 MHz and Sprint in 2.5 GHz, their deployment strategy will be to leave 3G/4G alone for now, and add 5G into these lower bands,” Witkowski says.
The ICNIRP and IEEE guidelines on RF exposure, which are periodically revised, were both updated this year. The limits for local exposure (for frequencies above 6 GHz) were set even lower. Belgium, India, Russia, and other countries have established even more restrictive limits.
As to whether the millimeter wave bands are safe, Waterhouse explains that because RF from cellular sites is on the non-ionizing radiation spectrum, it’s not the kind of radiation that could damage DNA and possibly cause cancer. The only known biological impact of RF on humans is heating tissue. Excessive exposure to RF causes a person’s entire body to overheat to dangerous levels. Local exposure can damage skin tissue or corneas.
“The actual impact and the depth of penetration into the human body is less at higher frequencies,” he says. “The advantage of that is your skin won’t be damaged because millimeter waves will reflect off the skin’s surface.”
Waterhouse admits that although mm waves have been used for many different applications—including astronomy and military uses—the effect of their use in telecommunications is not well understood. Waterhouse says it’s up to regulatory bodies overseeing the telecommunication companies to ensure the safety of 5G. The general perception is that mm waves are safe but should still be monitored, he says.
“The majority of the scientific community does not think there’s an issue,” Waterhouse says. “However, it would be unscientific to flat out say there are no reasons to worry.”
Many opponents insist that 5G must be proven safe before regulators allow deployments. The problem with this assertion, according to Witkowski, is that it isn’t logically possible to prove anything with 100 percent certainty.
“Showering, cooking breakfast, commuting to work, eating in a restaurant, being out in public—everything we do carries risk,” he says. “Whether we’re talking about 3G, 4G, or 5G, the question of electromagnetic radiation safety (EMR) is whether the risks are manageable. The first medical studies on possible health effects from EMR started almost 60 years ago, and literally thousands of studies since then reported either no health risk or inconclusive findings. A relatively small number of studies have claimed to find some evidence of risk, but those studies have never been reproduced—and reproducibility is a key factor in good science.
We should continue to look at the question of EMR health effects, but the vast majority of evidence says there’s no reason to pause deployments.”
A set of patches has just been pushed into the mainline repository (and stable updates) for yet another set of hardware vulnerabilities. “TSX async abort” (or TAA) exposes information through the usual side channels by way of internal buffers used with the transactional memory (TSX) instructions. Mitigation is done by disabling TSX or by clearing the relevant buffers when switching between kernel and user mode. Given that this is not the first problem with TSX, disabling it entirely is recommended; a microcode update may be needed to do so, though. This commit contains documentation on this vulnerability and its mitigation.
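On Linux, you can check how the running kernel is handling TAA by reading its sysfs vulnerability report. The sketch below assumes a kernel new enough to expose the `tsx_async_abort` entry; on older kernels (or non-x86 machines) the file simply won't exist:

```python
from pathlib import Path

def taa_status(sysfs="/sys/devices/system/cpu/vulnerabilities/tsx_async_abort"):
    """Return the kernel's reported TAA mitigation status, or 'unknown'
    when the kernel predates the TAA patches and does not expose the file."""
    try:
        return Path(sysfs).read_text().strip()
    except OSError:
        return "unknown"

print(taa_status())
```

A patched kernel typically reports something like “Mitigation: Clear CPU buffers” or “Mitigation: TSX disabled” here.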
There are also fixes for another vulnerability: it seems that accessing a memory address immediately after the size of the page containing it was changed (from a regular to a huge page, for example) can cause the processor to lock up. This behavior is considered undesirable by many. The vulnerability only exists for pages marked as executable; the mitigation is to force all executable pages to be the regular, 4K page size.
At Netflix, we spend a lot of effort to make it easy for our members to find content they will love. To make this happen, we personalize many aspects of our service, including which movies and TV shows we present on each member’s homepage. Over the years, we have built a recommendation system that uses many different machine learning algorithms to create these personalized recommendations. We also apply additional business logic to handle constraints like maturity filtering and deduplication of videos. All of these algorithms and logic come together in our page generation system to produce a personalized homepage for each of our members, which we have outlined in a previous post. While a diverse set of algorithms working together can produce a great outcome, innovating on such a complex system can be difficult. For instance, adding a single feature to one of the recommendation algorithms can change how the whole page is put together. Conversely, a big change to such a ranking system may only have a small incremental impact (for instance because it makes the ranking of a row similar to that of another existing row).
With systems driven by machine learning, it is important to measure the overall system-level impact of changes to a model, not just the local impact on the model performance itself. One way to do this is by running A/B tests. Netflix typically A/B tests all changes before rolling them out to all members. A drawback to this approach is that tests take time to run and require that experimental models be ready to run in production. In machine learning, offline metrics are often used to measure the performance of model changes on historical data. With a good offline metric, we can gain a reasonable understanding of how a particular model change would perform online. We would like to extend this approach, which is typically applied to a single machine-learned model, and apply it to the entire homepage generation system. This would allow us to measure the potential impact of offline changes in any of the models or logic involved in creating the homepage before running an A/B test.
To achieve this goal, we have built a system that simulates what a member’s homepage would have been given an experimental change and compares it against the page the member actually saw in the service. This provides an indication of the overall quality of the change. While we primarily use this for evaluating modifications to our machine learning algorithms, such as what happens when we have a new row selection or ranking algorithm, we can also use it to evaluate any changes in the code used to construct the page, from filtering rules to new row types. A key feature of this system is the ability to reconstruct a view of the systemic and user-level data state at a certain point in the past. As such, the system uses time-travel mechanisms for more precise reconstruction of an experience and coordinates time-travel across multiple systems. Thus, the simulator allows us to rapidly evaluate new ideas without needing to expose members to the changes.
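Comparing a simulated page against the page the member actually saw amounts to computing some similarity measure between the two. The row representation and the overlap metric below are illustrative assumptions, not Netflix's actual metric:

```python
def page_overlap(actual_rows, simulated_rows):
    """Fraction of the actual page's rows that also appear, identically,
    on the simulated page. Each row is (row_type, ordered video ids)."""
    actual = {tuple(r) for r in actual_rows}
    simulated = {tuple(r) for r in simulated_rows}
    if not actual:
        return 1.0
    return len(actual & simulated) / len(actual)

actual = [("trending", (1, 2, 3)), ("comedies", (4, 5, 6))]
simulated = [("trending", (1, 2, 3)), ("thrillers", (7, 8, 9))]
print(page_overlap(actual, simulated))  # 0.5
```

A low overlap signals that the experimental change meaningfully reshapes the page, which is exactly the kind of system-level effect an offline metric on a single model would miss.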
In this blog post, we will go into more detail about this page simulation system and discuss some of the lessons we learned along the way.
Why Is This Hard?
A simulation system needs to run on many samples to generate reliable results. In our case, this requirement translates to generating millions of personalized homepages. Naturally, some problems of scale come into the picture, including:
How to ensure that the executions run within a reasonable time frame
How to coordinate work despite the distributed nature of the system
How to ensure that the system is easy to use and extend for future types of experiments
At a high level, the Page Simulation system consists of the following stages:
We’ll go through each of these stages in more detail below.
The experiment scope determines the set of experimental pages that will be simulated and which data sources will be used to generate those pages. Thus, the experimenter needs to tailor the scope to the metrics the experiment aims to measure. This involves defining three aspects:
A data source
Stratification rules for profile selection
Number of profiles for the experiment
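The three aspects above could be captured in a small configuration object; the field names and values here are hypothetical, chosen only to make the shape of an experiment scope concrete:

```python
from dataclasses import dataclass, field

@dataclass
class ExperimentScope:
    data_source: str                      # "time_travel" or "live"
    stratification: dict = field(default_factory=dict)
    num_profiles: int = 1000              # start small for a dry run

scope = ExperimentScope(
    data_source="time_travel",
    stratification={"country": ["US", "FR"], "tenure": "long_term"},
    num_profiles=1_000_000,               # scale up for significant results
)
print(scope.data_source, scope.num_profiles)
```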
We provide two different mechanisms for data retrieval: via time travel and via live service calls.
In the first approach, we use data from time-travel infrastructure built at Netflix to compute pages as they would have been at some point in the past. In the experimentation landscape, this gives us the ability to backtest the performance of experimental page generation model accurately. In particular, it lets us compare a new page against a page that a member has seen and interacted with in the past, including what actions they took in the session.
The second approach retrieves data in the exact same way as the live production system. To simulate production systems closely, in this mode, we randomly select profiles that have recently logged into Netflix. The primary drawback of using live data is that we can only compute a limited set of metrics compared to the time-travel approach. However, this type of experiment is still valuable in the following scenarios:
Doing final sanity checks before allocating a new A/B test or rolling out a new feature
Analyzing changes in page composition, which are measures of the rows and videos on the page. These measures are needed to validate that the changes we seek to test are having the intended effect without unexpected side-effects
Determining if two approaches are producing sufficiently similar pages that we may not need to test both
Early detection of negative interactions between two features that will be rolled out simultaneously
Once the data source is specified, a combination of different stratification types can be applied to refine user selection. Some examples of stratification types are:
Country — select profiles based on their country
Tenure — select profiles based on their membership tenure; long-term members vs members in trial period
Login device — select users based on their active device type; e.g. Smart TV, Android, or devices supporting certain feature sets
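Applying a combination of stratification rules amounts to filtering the candidate profiles; a minimal sketch, with illustrative profile fields standing in for whatever the real system records:

```python
def stratify(profiles, countries=None, min_tenure_days=None, device=None):
    """Keep only profiles that match every specified stratification rule."""
    selected = []
    for p in profiles:
        if countries is not None and p["country"] not in countries:
            continue
        if min_tenure_days is not None and p["tenure_days"] < min_tenure_days:
            continue
        if device is not None and p["device"] != device:
            continue
        selected.append(p)
    return selected

profiles = [
    {"id": 1, "country": "US", "tenure_days": 900, "device": "smart_tv"},
    {"id": 2, "country": "FR", "tenure_days": 10,  "device": "android"},
]
print(stratify(profiles, countries={"US"}, min_tenure_days=365))
```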
Number of Profiles
We typically start with a small number to perform a dry run of the experiment configuration and then extend it to millions of users to ensure reliable and statistically significant results.
Simulating Modified Behavior
Once the experiment scope is determined, experimenters specify the modifications they would like to test within the page generation framework. Generally, these changes can be made by either modifying the configuration of the existing system or by implementing new code and deploying it to the simulation system.
There are several ways to control what changes are run in the simulator, including but not limited to:
1. A/B test allocations
Collect metrics of the behavior of an A/B test that is not yet allocated
Analyze the behavior across cells using custom metrics
Inspect the effect of cross-allocating members to multiple A/B tests
2. Page generation models
Compare performance of different page generation models
Evaluate interactions between different models (when page is constructed using multiple models)
3. Device capabilities and page geometry
Evaluate page composition for different geometries. Page geometry is the number of rows and columns, which differs between device types
Multiple modifications can be grouped together to define different experimental variants. During metrics computation we collect each metric at the level of variant and stratum. This detailed breakdown of metrics allows for a fine-grained attribution of any shifts in page characteristics.
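Collecting each metric at the (variant, stratum) level is, at its core, a keyed aggregation. The sketch below assumes simple per-page observations and a mean as the aggregate, both of which are illustrative simplifications:

```python
from collections import defaultdict

def aggregate(observations):
    """Average each metric per (variant, stratum) key.
    observations: iterable of (variant, stratum, metric_name, value)."""
    sums = defaultdict(lambda: [0.0, 0])
    for variant, stratum, metric, value in observations:
        cell = sums[(variant, stratum, metric)]
        cell[0] += value
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

obs = [
    ("control",    "US", "page_overlap", 0.8),
    ("control",    "US", "page_overlap", 0.6),
    ("new_ranker", "US", "page_overlap", 0.9),
]
print(aggregate(obs))
```

Keeping the stratum in the key is what enables the fine-grained attribution mentioned above: a shift confined to one country or tenure group shows up in its own cell instead of being washed out in a global average.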
The lifecycle of an experiment starts when a user (Engineer, Researcher, Data Scientist or Product Manager) configures an experiment and submits it for execution (detailed below). Once the execution is complete, they get detailed Tableau reports. Those reports contain page composition and other important metrics regarding their experiment, which can be split by the different variants under test.
The execution workflow for the experiment proceeds through the following stages:
Partition the experiment into smaller chunks
Compute pages asynchronously for each partition
Compute experiment metrics
In the Page Simulation system, an experiment is configured as a single entity; however, when executing the experiment, the system splits it into multiple partitions. This is needed to isolate different parts of the experiment for the following reasons:
Some modifications to the page algorithm might impact the latency of page generation significantly
When time traveling to different points in the past, a separate cluster of the page generation system is needed for each target time (more on this later)
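The partitioning step above can be sketched as grouping page-computation requests by the properties that require isolation. The function and field names here are illustrative assumptions; the real system keys on more dimensions than these two.

```python
from collections import defaultdict

def partition_experiment(requests):
    """Split an experiment's requests into isolated partitions.

    Requests that target different time-travel timestamps, or whose
    modifications have different latency profiles, must run on separate
    page-generation clusters, so they land in separate partitions.
    `time_travel_ts` and `latency_class` are hypothetical keys.
    """
    partitions = defaultdict(list)
    for req in requests:
        key = (req["time_travel_ts"], req["latency_class"])
        partitions[key].append(req)
    return dict(partitions)
```

Each resulting partition then gets its own queue and its own page-generation cluster, as described in the next section.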
Asynchronous Page Computation
We embrace asynchronous computation as much as possible, especially in the page computation stage, which can be very compute-intensive and time consuming due to the heavy machine-learned models we often test. Each experiment partition is sent out as an event to a Request Poster. The Request Poster is responsible for reading data and applying stratification to select profiles for each partition. For each selected profile, page computation requests are generated and sent to a dedicated queue per partition. Each queue is then processed by a separate Page Generation cluster that is launched to serve a particular partition. Once the generator is running, it processes the requests in the queue to compute the simulated pages. Generated pages are then persisted to an S3-backed Hive table for metrics processing.
We chose queue-based communication between the systems instead of RESTful calls to decouple the systems and allow easy retries of each request, as well as of individual experiment partitions. Writing the generated pages to Hive and running the Metrics Computation stage out-of-band lets us modify or add new metrics on previously generated pages, avoiding the need to regenerate them.
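The queue-based flow with per-request retries can be sketched as follows. This is a single-process simplification, assuming an in-memory queue in place of the real distributed queues, and `generate_page` stands in for a call to the Page Generation cluster.

```python
import queue

def process_partition(requests, generate_page, max_retries=2):
    """Process page-generation requests for one partition.

    Failed requests are re-enqueued up to `max_retries` times, mirroring
    the easy-retry property the queue-based design provides. The real
    system uses distributed queues, one per partition.
    """
    q = queue.Queue()
    for req in requests:
        q.put((req, 0))  # (request, attempts so far)
    pages = []
    while not q.empty():
        req, attempts = q.get()
        try:
            pages.append(generate_page(req))
        except Exception:
            if attempts < max_retries:
                q.put((req, attempts + 1))  # retry transient failures
    return pages
```

Because the queue decouples producer and consumer, a failed request costs only its own retry rather than a restart of the whole partition.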
Creating Mini Netflix Ecosystem on the Fly
The page generation system at Netflix consists of many interdependent services. Experiments can simulate new behaviors in any number of these microservices. Thus, for each experiment, we need to create an isolated mini Netflix ecosystem where each service exhibits its respective new behavior. Because of this isolation requirement, we architected a system that can create a mini Netflix ecosystem on the fly.
Our approach is to create Docker container stacks to define a mini Netflix ecosystem for each simulation. We use Titus as a container management platform, which was built internally at Netflix. We configure each cluster using custom bootstrapping code in order to create different server environments, for example to initialize the containers with different machine-learned model versions and other data to precisely replicate time-traveled state in the past. Because we would like to time-travel all the services together to replicate a specific point in time in the past, we created a new capability to start stacks of multiple services with a common time configuration and route traffic between them on-the-fly per experiment to maintain temporal accuracy of the data. This capability provides the precision we need to simulate and correlate metrics correctly with actions of our members that happened in the past.
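The key idea above, one shared time-travel configuration propagated to every service in the stack, can be sketched as a small config model. Everything here is a hypothetical stand-in: the real Titus bootstrapping code and its configuration format are internal to Netflix.

```python
from dataclasses import dataclass, field

@dataclass
class ServiceSpec:
    """One microservice in the experiment's container stack."""
    name: str
    image: str

@dataclass
class StackConfig:
    """A per-experiment mini-ecosystem definition.

    Every service in the stack is bootstrapped with the same
    time-travel timestamp, so all data sources and models are
    replicated as of one consistent point in the past.
    """
    time_travel_ts: int
    services: list = field(default_factory=list)

    def env_for(self, service: ServiceSpec) -> dict:
        # the shared clock is what keeps the stack temporally consistent
        return {
            "SERVICE_NAME": service.name,
            "TIME_TRAVEL_TS": str(self.time_travel_ts),
        }
```

Routing traffic only between services of the same stack then guarantees that a simulated page never mixes data from two different points in time.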
Achieving high temporal accuracy across multiple systems and data sources is challenging. It took us several iterations to determine the correct set of data and services to include in this time-travel scheme for accurate simulation of pages in time-travel mode. To this end, we developed tools that compared real pages computed by our live production system with those produced by our simulators, both in terms of the final output and the features involved in our models. To ensure that we maintain temporal accuracy going forward, we also automated these checks to avoid future regressions and identify new data sources that we need to handle. As such, the system is architected in a flexible way so we can easily incorporate more downstream systems into the time-travel experiment workflow.
Once the generated pages are saved to a Hive table, the system sends a signal to the workflow manager (Controller) for the completion of the page generation experiment. This signal triggers a Spark job to calculate the metrics, normalize the results and save both the raw and normalized data to Hive. Experimenters can then access the results of their experiment either using pre-configured Tableau reports or from notebooks that pull the raw data from Hive. If necessary, they can also access the simulated pages to compute new experiment-specific metrics.
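The normalization step in the metrics job can be illustrated with a minimal sketch. The actual job runs in Spark over Hive tables; here a plain dict stands in for the raw per-variant aggregates, and normalizing against a control variant is an assumption about what "normalize" means in this pipeline.

```python
def normalize_metrics(raw, baseline_variant="control"):
    """Express each variant's metric relative to a baseline variant.

    `raw` maps variant name to an aggregated metric value. In the real
    system both the raw and the normalized tables are persisted to Hive,
    so reports can show either view.
    """
    base = raw[baseline_variant]
    return {variant: value / base for variant, value in raw.items()}
```

Keeping both raw and normalized values saved means new report views can be built later without re-running the Spark job.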
Experiment Workflow Management
Given the asynchronous nature of the experiment workflow and the need to govern the lifecycle of multiple clusters dedicated to each partition, we needed a solution to manage the experiment workflow. Thus, we built a simple and lightweight workflow management system with the following capabilities:
Automatic retry of workflow steps in case of a transient failure
Conditional execution of workflow steps
Recording execution history
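The three capabilities listed above (retry, conditional execution, and execution history) can be captured in a very small engine. This is a from-scratch sketch, not Netflix's internal workflow manager; class and method names are invented for illustration.

```python
class Workflow:
    """Minimal workflow engine: ordered steps with automatic retry on
    transient failure, per-step execution conditions, and a history log."""

    def __init__(self, max_retries=2):
        self.max_retries = max_retries
        self.steps = []    # (name, fn, condition)
        self.history = []  # (name, outcome)

    def add_step(self, name, fn, condition=lambda: True):
        self.steps.append((name, fn, condition))

    def run(self):
        for name, fn, condition in self.steps:
            if not condition():
                self.history.append((name, "skipped"))  # conditional execution
                continue
            for attempt in range(self.max_retries + 1):
                try:
                    fn()
                    self.history.append((name, "ok"))
                    break
                except Exception:
                    if attempt == self.max_retries:
                        self.history.append((name, "failed"))
                        return False  # exhausted retries: abort the workflow
        return True
```

A step that fails once with a transient error is simply retried; only a step that exhausts its retries aborts the experiment, and the history records every outcome for later inspection.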
We use this simple workflow engine for the execution of the following tasks:
Govern the lifecycle of page generation services dedicated to each partition (external startup, shutdown tasks)
Initialize metrics computation when page generation for all partitions is complete
Terminate the experiment when it does not have a sufficient page yield (i.e., the error rate is too high)
Send out notifications to experiment owners on the status of the experiment
Listen to the heartbeat of all components in the experimentation system and terminate the experiment when an issue is detected
To facilitate lifecycle management and to monitor the overall health of an experiment, we built a separate micro-service called Status Keeper. This service provides the following capabilities:
Expose a detailed report with granular metrics about different steps (Controller / Request Poster / Page Generator and Metrics Processor) in the system
Aid in lifecycle decisions to fast-fail the experiment when the failure threshold has been reached
Store and retrieve status and aggregate metrics
Throughout the experiment workflow, each application in the Page Simulation system reports its status to the Status Keeper. We combine all the status and metrics recorded by each application in the system to create a view of the overall health of the system.
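Combining per-component reports into one overall health view can be sketched as a simple aggregation rule. The component names and status values below are assumptions; the real Status Keeper tracks richer, granular metrics per step.

```python
def overall_health(component_statuses):
    """Fold per-component statuses into one experiment-level health.

    `component_statuses` maps a component name (e.g. "controller",
    "request_poster", "page_generator", "metrics_processor" -- names
    are illustrative) to one of "RUNNING", "DONE", or "FAILED".
    Any failure dominates; otherwise the experiment is done only
    when every component is done.
    """
    statuses = component_statuses.values()
    if any(s == "FAILED" for s in statuses):
        return "FAILED"
    if all(s == "DONE" for s in statuses):
        return "DONE"
    return "RUNNING"
```

A rule like "any failure dominates" is what lets the lifecycle manager fast-fail an experiment as soon as one component reports a problem, instead of waiting for the whole workflow to time out.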
Need for Offline Metrics
An important part of improving our page generation approach is having good offline metrics to track model performance and to compare different model variants. Usually, there is not a perfect correspondence between offline results and results from A/B testing (if there were, it would do away with the need for online testing). For example, suppose we build two model variants and we find that one is better than the other according to our offline metric. The online A/B test performance will usually be measured by a different metric, and it may turn out that the model that’s worse on the offline metric is actually the better model online, or even that there is no statistically significant difference between the two models online. Given that A/B tests need to run for a while to measure long-term metrics, finding an offline metric that provides an accurate pulse of how the testing might pan out is critical. So one of the main objectives in building our page simulation system was to come up with offline metrics that correspond better with online A/B metrics.
One major source of discrepancy between online and offline results is presentation bias. The real pages we presented to our members are the result of ranking videos and rows from our current production page generation models. Thus, the engagement data (what members click, play or thumb) we get as a result can be strongly influenced by those models. Members can only see and play from rows that the production system served to them. Thus, it is important that our offline metrics mitigate this bias (i.e. it should not unduly favor or disfavor the production model).
In the absence of A/B testing results on new candidate models, there is no ground truth to compare offline metrics against. However, because of the system described above, we can simulate how a member’s page might have looked at a past point-in-time if it had been generated by our new model instead of the production model. Because of time travel, we could also build the new model based on the data available at that time so as to get us as close as possible to the unobserved counterfactual page that the new model would have shown.
Given these pages, the next question to answer was exactly what numerical metrics we can use for validating the effectiveness of our offline metrics. This turned out to be easy with the new system because we could use models from past A/B tests to ascertain how well the offline metrics computed on the simulated pages correlated with the actual online metrics for those A/B tests. That is, we could take the hypothetical pages generated by certain models, evaluate them according to an offline metric, and then see how well those offline metrics correspond to online ones. After trying out a few variations, we were able to settle on a suite of metrics that had a much stronger correlation with corresponding online metrics across many A/B tests as compared to our previous offline metric, as shown below.
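The validation idea described above, checking how well an offline metric tracks online results across past A/B tests, can be illustrated with a plain correlation computation. The data points here are hypothetical; the actual validation suite spans many tests and metrics.

```python
def pearson(offline_scores, online_lifts):
    """Pearson correlation between offline metric values (computed on
    simulated pages for past test models) and the online metric
    movements those same models actually produced in A/B tests.
    A value near 1.0 means the offline metric is a good proxy."""
    n = len(offline_scores)
    mx = sum(offline_scores) / n
    my = sum(online_lifts) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(offline_scores, online_lifts))
    sx = sum((x - mx) ** 2 for x in offline_scores) ** 0.5
    sy = sum((y - my) ** 2 for y in online_lifts) ** 0.5
    return cov / (sx * sy)
```

Candidate offline metrics can then be ranked by this correlation across historical tests, and the suite with the strongest correlation adopted.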
Having such offline metrics that strongly correlate with online metrics allows us to experiment more rapidly and reject model variants which may not be significantly better than the current production model, thus saving valuable A/B testing bandwidth and time. It has also helped us detect bugs early in the model development process, when the offline metrics run strongly counter to our hypothesis. This has saved many development and experimentation cycles, and has enabled us to try out more ideas.
In addition, these offline metrics enable us to:
Compare models trained with different objective functions
Compare models trained on different datasets
Compare page construction related changes outside of our machine learning models
Reconcile effects due to changes arising out of many A/B tests running simultaneously
Personalizing home pages for users is a hard problem and one that traditionally required us to run A/B tests to find out whether a new approach works. However, our Page Simulation system allows us to rapidly try out new ideas and obtain results without needing to expose our members to all these experiences. Being able to create a mini Netflix ecosystem on the fly helps us iterate fast and allows us to try out more far-fetched ideas. Building this system was a big collaboration between our engineering and research teams that allows our researchers to run page simulations and our engineers to quickly extend the system to accommodate new types of simulations. This, in turn, has resulted in improvements of the personalized homepages for our members. If you are interested in helping us solve these types of problems and helping entertain the world, please take a look at some of our open positions on the Netflix jobs page.
Page Simulator was originally published in Netflix TechBlog on Medium.