Tag Archives: Demo

Google on Collision Course With Movie Biz Over Piracy & Safe Harbor

Post Syndicated from Andy original https://torrentfreak.com/google-on-collision-course-with-movie-biz-over-piracy-safe-harbor-180219/

Wherever Google has a presence, rightsholders are around to accuse the search giant of not doing enough to deal with piracy.

Over the past several years, the company has been attacked by both the music and movie industries but despite overtures from Google, criticism still floods in.

In Australia, things are definitely heating up. Village Roadshow, one of the nation’s foremost movie companies, has been an extremely vocal Google critic since 2015 but now its co-chief, the outspoken Graham Burke, seems to want to take things to the next level.

As part of yet another broadside against Google, Burke has for the second time in a month accused Google of playing a large part in online digital crime.

“My view is they are complicit and they are facilitating crime,” Burke said, adding that if Google wants to sue him over his comments, they’re very welcome to do so.

It’s highly unlikely that Google will take the bait. Burke’s attempt at pushing the issue further into the spotlight will have been spotted a mile off but in any event, legal battles with Google aren’t really something that Burke wants to get involved in.

Australia is currently in the midst of a consultation process for the Copyright Amendment (Service Providers) Bill 2017 which would extend the country’s safe harbor provisions to a broader range of service providers including educational institutions, libraries, archives, key cultural institutions and organizations assisting people with disabilities.

For its part, Village Roadshow is extremely concerned that these provisions may be extended to other providers – specifically Google – who might then use expanded safe harbor to deflect more liability in respect of piracy.

“Village Roadshow….urges that there be no further amendments to safe harbor and in particular there is no advantage to Australia in extending safe harbor to Google,” Burke wrote in his company’s recent submission to the government.

“It is very unlikely given their size and power that as content owners we would ever sue them but if we don’t have that right then we stand naked. Most importantly if Google do the right thing by Australia on the question of piracy then there will be no issues. However, they are very far from this position and demonstrably are facilitating crime.”

Accusations of crime facilitation are nothing new for Google, with rightsholders in the US and Europe having accused the company of the same a number of times over the years. In response, Google always insists that it abides by relevant laws and actually goes much further in tackling piracy than legislation currently requires.

On the safe harbor front, Google begins by saying that not expanding provisions to service providers will have a seriously detrimental effect on business development in the region.

“[Excluding] online service providers falls far short of a balanced, pro-innovation environment for Australia. Further, it takes Australia out of step with other digital economies by creating regulatory uncertainty for [venture capital] investment and startup/entrepreneurial success,” Google’s submission reads.

“[T]he Draft Bill’s narrow safe harbor scheme places Australian-based startups and online service providers — including individual bloggers, websites, small startups, video-hosting services, enterprise cloud companies, auction sites, online marketplaces, hosting providers for real-estate listings, photo hosting services, search engines, review sites, and online platforms —in a disadvantaged position compared with global startups in countries that have strong safe harbor frameworks, such as the United States, Canada, United Kingdom, Singapore, South Korea, Japan, and other EU countries.

“Under the new scheme, Australian-based startups and service providers, unlike their international counterparts, will not receive clear and consistent legal protection when they respond to complaints from rightsholders about alleged instances of online infringement by third-party users on their services,” Google notes.

Interestingly, Google then delivers what appears to be a thinly veiled threat.

One of the key anti-piracy strategies touted by the mainstream entertainment companies is collaboration between rightsholders and service providers, including the latter providing voluntary tools to police infringement online. Google says that if service providers are given a raw deal on safe harbor, the extent of future cooperation may be at risk.

“If Australian-based service providers are carved out of the new safe harbor regime post-reform, they will operate from a lower incentive to build and test new voluntary tools to combat online piracy, potentially reducing their contributions to innovation in best practices in both Australia and international markets,” the company warns.

But while Village Roadshow argues against safe harbors and warns that piracy could kill the movie industry, it is quietly optimistic that the tide is turning.

In a presentation to investors last week, the company said that reducing piracy would have “only an upside” for its business but also added that new research indicates that “piracy growth [is] getting arrested.” As a result, the company says that it will build on the notion that “74% of people see piracy as ‘wrong/theft’” and will call on Australians to do the right thing.

In the meantime, the pressure on Google will continue but lawsuits – in either direction – won’t provide an answer.

Village Roadshow’s submission can be found here, Google’s here (pdf).

This IoT Pet Monitor barks back

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/iot-pet-monitor/

Jennifer Fox, founder of FoxBot Industries, uses a Raspberry Pi pet monitor to check the sound levels of her home while she is out, allowing her to keep track of when her dog Marley gets noisy or agitated, and to interact with the gorgeous furball accordingly.

Bark Back Project Demo

A quick overview and demo of the Bark Back, a project to monitor and interact with your pet. Check out the full tutorial here: https://learn.sparkfun.com/tutorials/bark-back-interactive-pet-monitor

Marley, bark!

Using a Raspberry Pi 3, speakers, SparkFun’s MEMS microphone breakout board, and an analogue-to-digital converter (ADC), the IoT Pet Monitor is fairly easy to recreate, all thanks to Jennifer’s full tutorial on the FoxBot website.

Building the pet monitor

In a nutshell, once the Raspberry Pi and the appropriate bits and pieces are set up, you’ll need to sign up at CloudMQTT — it’s free if you select the Cute Cat account. CloudMQTT will create an invisible bridge between your home and wherever you are that isn’t home, so that you can check in on your pet monitor.

Screenshot CloudMQTT account set-up — IoT Pet Monitor Bark Back Raspberry Pi

Image c/o FoxBot Industries

Within the project code, you’ll be able to calculate the peak-to-peak amplitude of sound the microphone picks up. Then you can decide how noisy is too noisy when it comes to the occasional whine and bark of your beloved pup.
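
For a feel of what that check might look like, here is a minimal sketch. The read_adc() function is a stand-in for however your ADC returns samples; the tutorial uses SparkFun’s own wiring and code, so treat this purely as an illustration:

import random
import time

SAMPLE_WINDOW = 0.25   # seconds of audio to inspect per check
THRESHOLD = 200        # raw ADC counts; tune this to decide what counts as "too noisy"

def read_adc():
    # Placeholder: swap in your real ADC read (for example via spidev, as in
    # the SparkFun tutorial). Random values just keep the sketch runnable.
    return random.randint(300, 700)

def peak_to_peak(window=SAMPLE_WINDOW):
    # Sample the mic for `window` seconds and return loudest minus quietest
    samples = []
    start = time.time()
    while time.time() - start < window:
        samples.append(read_adc())
    return max(samples) - min(samples)

if peak_to_peak() > THRESHOLD:
    print("Marley is being noisy!")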

MEMS microphone breakout board — IoT Pet Monitor Bark Back Raspberry Pi

The MEMS microphone breakout board collects sound data and relays it back to the Raspberry Pi via the ADC.
Image c/o FoxBot Industries

Next you can import sounds to a preset song list that will be played back when the volume rises above your predefined threshold. As Jennifer states in the tutorial, the sounds can easily be recorded via apps such as Garageband, or even on your mobile phone.
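
Playback itself can be as simple as picking a random clip from a folder. Here is one possible approach using pygame’s mixer (the tutorial may well use a different audio library, so treat the paths and library choice as assumptions):

import glob
import random
import pygame

pygame.mixer.init()
SOUNDS = glob.glob("/home/pi/sounds/*.wav")   # wherever you keep your recordings

def bark_back():
    # Play one randomly chosen clip through the speakers
    if SOUNDS:
        pygame.mixer.music.load(random.choice(SOUNDS))
        pygame.mixer.music.play()
        while pygame.mixer.music.get_busy():
            pygame.time.wait(100)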

Using the pet monitor

Whenever the Bark Back IoT Pet Monitor is triggered to play back audio, this information is fed to the CloudMQTT service, allowing you to see if anything is going on back home.
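
If you’re curious what that reporting step might look like, here is a rough sketch using the paho-mqtt client. The hostname, port, credentials, and topic are placeholders that you’d copy from your CloudMQTT instance details page:

import time
import paho.mqtt.client as mqtt

client = mqtt.Client()
# Use the server, port, user, and password shown on your CloudMQTT instance page
client.username_pw_set("YOUR_CLOUDMQTT_USER", "YOUR_CLOUDMQTT_PASSWORD")
client.connect("YOUR_INSTANCE.cloudmqtt.com", 1883)
client.loop_start()

def report_bark(amplitude):
    # Publish the trigger event; subscribe to the same topic from your laptop
    # or phone to see what's happening back home
    payload = "bark at %s, amplitude %d" % (time.strftime("%H:%M:%S"), amplitude)
    client.publish("barkback/events", payload)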

A sitting dog with a doll in its mouth — IoT Pet Monitor Bark Back Raspberry Pi

*incoherent coos of affection from Alex*
Image c/o FoxBot Industries

And as Jennifer recommends, an update of the project could include a camera or sensors to feed back more information about your home environment.

If you’ve created something similar, be sure to let us know in the comments. And if you haven’t, but you’re now planning to build your own IoT pet monitor, be sure to let us know in the comments. And if you don’t have a pet but just want to say hi…that’s right, be sure to let us know in the comments.

The post This IoT Pet Monitor barks back appeared first on Raspberry Pi.

N-O-D-E’s always-on networked Pi Plug

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/node-pi-plug/

N-O-D-E’s Pi Plug is a simple approach to using a Raspberry Pi Zero W as an always-on networked device without a tangle of wires.

Pi Plug 2: Turn The Pi Zero Into A Mini Server

Today I’m back with an update on the Pi Plug I made a while back. This prototype is still in the works, and is much more modular than the previous version. https://N-O-D-E.net/piplug2.html https://github.com/N-O-D-E/piplug

The Pi Zero Power Case

In a video early last year, YouTuber N-O-D-E revealed his Pi Zero Power Case, an all-in-one always-on networked computer that fits snugly against a wall power socket.

NODE Plug Raspberry Pi Plug

The project uses an official Raspberry Pi power supply, a Zero4U USB hub, and a Raspberry Pi Zero W, and it allows completely wireless connection to a network. N-O-D-E cut the power cord and soldered its wires directly to the power input of the USB hub. The hub powers the Zero via pogo pins that connect directly to the test pads beneath.

The Power Case is a neat project, but it may be a little daunting for anyone not keen on cutting and soldering the power supply wires.

Pi Plug 2

In his overhaul of the design, N-O-D-E has created a modular reimagining of his previous always-on networked computer, one that sits more neatly against the wall socket and requires absolutely no soldering or hacking of physical hardware.

Pi Plug

The Pi Plug 2 uses a USB power supply alongside two custom PCBs and a Zero W. While one PCB houses a USB connector that slots directly into the power supply, two blobs of solder on the second PCB press against the test pads beneath the Zero W. When connected, the PCBs run power directly from the wall socket to the Raspberry Pi Zero W. Neat!

NODE Plug Raspberry Pi
NODE Plug Raspberry Pi
NODE Plug Raspberry Pi
NODE Plug Raspberry Pi

While N-O-D-E isn’t currently selling these PCBs in his online store, all files are available on GitHub, so have a look if you want to recreate the Pi Plug.

Uses

In another video — and seriously, if you haven’t checked out N-O-D-E’s YouTube channel yet, you really should — he demonstrates a few changes that can turn your Zero into a USB dongle computer. This is a great hack if you don’t want to carry a power supply around in your pocket. As N-O-D-E explains:

Besides simply SSH’ing into the Pi, you could also easily install a remote desktop client and use the GUI. You can share your computer’s internet connection with the Pi and use it just like you would normally, but now without the need for a monitor, chargers, adapters, cables, or peripherals.

We’re keen to see how our community is hacking their Zeros and Zero Ws in order to take full advantage of the small footprint of the computer, so be sure to share your projects and ideas with us, either in the comments below or via social media.

The post N-O-D-E’s always-on networked Pi Plug appeared first on Raspberry Pi.

Join Us for AWS Security Week February 20–23 in San Francisco!

Post Syndicated from Craig Liebendorfer original https://aws.amazon.com/blogs/security/join-us-for-aws-security-week-february-20-23-in-san-francisco/

AWS Pop-up Loft image

Join us for AWS Security Week, February 20–23 at the AWS Pop-up Loft in San Francisco, where you can participate in four days of themed content that will help you secure your workloads on AWS. Each day will highlight a different security and compliance topic, and will include an overview session, a customer or partner speaker, a deep dive into the day’s topic, and a hands-on lab or demos of relevant AWS or partner services.

Tuesday (February 20) will kick off the week with a day devoted to identity and governance. On Wednesday, we will dig into secure configuration and automation, including a discussion about upcoming General Data Protection Regulation (GDPR) requirements. On Thursday, we will cover threat detection and remediation, which will include an Amazon GuardDuty lab. And on Friday, we will discuss incident response on AWS.

Sessions, demos, and labs about each of these topics will be led by seasoned security professionals from AWS, who will help you understand not just the basics, but also the nuances of building applications in the AWS Cloud in a robust and secure manner. AWS subject-matter experts will be available for “Ask the Experts” sessions during breaks.

Register today!

– Craig

Pirate Streaming Search Engine Exploits Crunchyroll Vulnerability

Post Syndicated from Ernesto original https://torrentfreak.com/pirate-streaming-search-engine-exploits-crunchyroll-vulnerability-180213/

With 20 million members around the world, Crunchyroll is one of the largest on-demand streaming platforms for anime and manga content.

Much like Hollywood, the site has competition from pirate streaming sites which offer their content without permission. These usually stream pirated videos which are hosted on external sites.

However, this week Crunchyroll is facing a more direct attack. The people behind the new streaming meta-search engine StreamCR say they’ve found a way to stream the site’s content from its own servers, without paying.

“This works due to a vulnerability in the Crunchyroll system,” StreamCR’s operators tell TorrentFreak.

Simply put, StreamCR uses an active Crunchyroll account to locate the video streams and embeds them on its own website. This allows people to access Crunchyroll videos in the best quality without paying.

“This gives access to the full library in the region of our server, retrieving it as long as we’re not bound by the regular regional restriction. For this, we pick a US server as American Crunchyroll has the most library of content.”

Stream in various qualities

The exploit was developed in-house, the StreamCR team informs us. While it works fine at the moment, the team realizes that this may not last forever, as Crunchyroll might eventually patch the vulnerability.

However, the meta-search engine will have made its point by then.

“We expect them to fix this, Why wouldn’t they? In the meantime, this can demonstrate how vulnerable Crunchyroll is at the moment,” they tell us.

The site’s ultimate plan is to become the go-to search engine for people looking to stream all kinds of pirated videos. In addition to Crunchyroll, StreamCR also indexes various pirate sites, including YesMovies, Gomovies, and 9anime.

“StreamCR’s goal is to let people access streams with ease from a universal site, we’re trying to have a Google-like experience for finding online streams,” they say.

TorrentFreak reached out to Crunchyroll asking for a comment on the issue, but at the time of publication, we have yet to hear back.

The Fisher Piano: make music in the air

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/air-piano/

Piano keys are so limiting! Why not swap them out for LEDs and the wealth of instruments in Pygame to build air keys, as demonstrated by Instructables maker 2fishy?

Raspberry Pi LED Light Schroeder Piano – Twinkle Little Star

Raspberry Pi LED Light Schroeder Piano – Twinkle Little Star

Keys? Where we’re going you don’t need keys!

This project, created by either Yolanda or Ken Fisher (or both!), uses an array of LEDs and photoresistors to form a MIDI sequencer. Twelve LEDs replace piano keys, and another three change octaves and access the menu.

Each LED is paired with a photoresistor, which detects the emitted light to form a closed circuit. Interrupting the light beam — in this case with a finger — breaks the circuit, telling the Python program to perform an action.

2fishy LED light piano raspberry pi

We’re all hoping this is just the scaled-down prototype of a full-sized LED grand piano

Using Pygame, the 2fishy team can access 75 different instruments and 128 notes per instrument, making their wooden piano more than just a one-hit wonder.
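
To give a flavour of how the beam-break detection and Pygame’s MIDI support fit together, here is an illustrative sketch. The GPIO pins and the beam_broken() logic are assumptions made for the example; the 2fishy build reads its photoresistors through its own circuit, so see the Instructables schematics for the real wiring:

import time
import pygame.midi
import RPi.GPIO as GPIO

KEY_PINS = [4, 17, 27, 22]   # placeholder GPIO pins, one per "key"
NOTES = [60, 62, 64, 65]     # MIDI note numbers (C, D, E, F)

GPIO.setmode(GPIO.BCM)
for pin in KEY_PINS:
    GPIO.setup(pin, GPIO.IN)

pygame.midi.init()
player = pygame.midi.Output(pygame.midi.get_default_output_id())
player.set_instrument(0)     # values 0-127 select the instrument; 0 is acoustic grand

def beam_broken(pin):
    # Assumption: the photoresistor circuit reads high when a finger blocks
    # the LED's light beam; your circuit may invert this
    return GPIO.input(pin) == GPIO.HIGH

try:
    while True:
        for pin, note in zip(KEY_PINS, NOTES):
            if beam_broken(pin):
                player.note_on(note, 127)
                time.sleep(0.3)
                player.note_off(note, 127)
        time.sleep(0.01)
finally:
    GPIO.cleanup()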

Piano building

The duo made the piano’s body out of plywood, hardboard, and dowels, and equipped it with a Raspberry Pi 2, a speaker, and the aforementioned LEDs and photoresistors.

2fishy LED light piano raspberry pi

A Raspberry Pi 2 and speaker sit within the wooden body, with LEDs and photoresistors in place of the keys.

A complete how-to for the build, including some rather fancy and informative schematics, is available at Instructables, where 2fishy received a bronze medal for their project. Congratulations!

Learn more

If you’d like to learn more about using Pygame, check out The MagPi’s Make Games with Python Essentials Guide, available both in print and as a free PDF download.

And for more music-based projects using a variety of tech, be sure to browse our free resources.

Lastly, if you’d like to see more piano-themed Raspberry Pi projects, take a look at our Big Minecraft Piano, these brilliant piano stairs, this laser-guided piano teacher, and our video below about the splendid Street Fighter duelling pianos we witnessed at Maker Faire.

Pianette: Piano Street Fighter at Maker Faire NYC 2016

Two pianos wired up as Playstation 2 controllers allow users to battle…musically! We caught up with makers Eric Redon and Cyril Chapellier of foobarflies.

The post The Fisher Piano: make music in the air appeared first on Raspberry Pi.

Sharing Secrets with AWS Lambda Using AWS Systems Manager Parameter Store

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/sharing-secrets-with-aws-lambda-using-aws-systems-manager-parameter-store/

This post courtesy of Roberto Iturralde, Sr. Application Developer- AWS Professional Services

Application architects are faced with key decisions throughout the process of designing and implementing their systems. One decision common to nearly all solutions is how to manage the storage and access rights of application configuration. Shared configuration should be stored centrally and securely with each system component having access only to the properties that it needs for functioning.

With AWS Systems Manager Parameter Store, developers have access to central, secure, durable, and highly available storage for application configuration and secrets. Parameter Store also integrates with AWS Identity and Access Management (IAM), allowing fine-grained access control to individual parameters or branches of a hierarchical tree.

This post demonstrates how to create and access shared configurations in Parameter Store from AWS Lambda. Both encrypted and plaintext parameter values are stored with only the Lambda function having permissions to decrypt the secrets. You also use AWS X-Ray to profile the function.

Solution overview

This example is made up of the following components:

  • An AWS SAM template that defines:
    • A Lambda function and its permissions
    • An unencrypted Parameter Store parameter that the Lambda function loads
    • A KMS key that only the Lambda function can access. You use this key to create an encrypted parameter later.
  • Lambda function code in Python 3.6 that demonstrates how to load values from Parameter Store at function initialization for reuse across invocations.

Launch the AWS SAM template

To create the resources shown in this post, you can download the SAM template or choose the button to launch the stack. The template requires one parameter, an IAM user name, which is the name of the IAM user to be the admin of the KMS key that you create. In order to perform the steps listed in this post, this IAM user will need permissions to execute Lambda functions, create Parameter Store parameters, administer keys in KMS, and view the X-Ray console. If you have these privileges in your IAM user account, you can use your own account to complete the walkthrough. You cannot use the root user to administer the KMS keys.

SAM template resources

The following sections show the code for the resources defined in the template.
Lambda function

ParameterStoreBlogFunctionDev:
    Type: 'AWS::Serverless::Function'
    Properties:
      FunctionName: 'ParameterStoreBlogFunctionDev'
      Description: 'Integrating lambda with Parameter Store'
      Handler: 'lambda_function.lambda_handler'
      Role: !GetAtt ParameterStoreBlogFunctionRoleDev.Arn
      CodeUri: './code'
      Environment:
        Variables:
          ENV: 'dev'
          APP_CONFIG_PATH: 'parameterStoreBlog'
          AWS_XRAY_TRACING_NAME: 'ParameterStoreBlogFunctionDev'
      Runtime: 'python3.6'
      Timeout: 5
      Tracing: 'Active'

  ParameterStoreBlogFunctionRoleDev:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          -
            Effect: Allow
            Principal:
              Service:
                - 'lambda.amazonaws.com'
            Action:
              - 'sts:AssumeRole'
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole'
      Policies:
        -
          PolicyName: 'ParameterStoreBlogDevParameterAccess'
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              -
                Effect: Allow
                Action:
                  - 'ssm:GetParameter*'
                Resource: !Sub 'arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:parameter/dev/parameterStoreBlog*'
        -
          PolicyName: 'ParameterStoreBlogDevXRayAccess'
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              -
                Effect: Allow
                Action:
                  - 'xray:PutTraceSegments'
                  - 'xray:PutTelemetryRecords'
                Resource: '*'

In this YAML code, you define a Lambda function named ParameterStoreBlogFunctionDev using the SAM AWS::Serverless::Function type. The environment variables for this function include the ENV (dev) and the APP_CONFIG_PATH where you find the configuration for this app in Parameter Store. X-Ray tracing is also enabled for profiling later.

The IAM role for this function extends the AWSLambdaBasicExecutionRole by adding IAM policies that grant the function permissions to write to X-Ray and get parameters from Parameter Store, limited to paths under /dev/parameterStoreBlog*.
Parameter Store parameter

SimpleParameter:
    Type: AWS::SSM::Parameter
    Properties:
      Name: '/dev/parameterStoreBlog/appConfig'
      Description: 'Sample dev config values for my app'
      Type: String
      Value: '{"key1": "value1","key2": "value2","key3": "value3"}'

This YAML code creates a plaintext string parameter in Parameter Store in a path that your Lambda function can access.
KMS encryption key

ParameterStoreBlogDevEncryptionKeyAlias:
    Type: AWS::KMS::Alias
    Properties:
      AliasName: 'alias/ParameterStoreBlogKeyDev'
      TargetKeyId: !Ref ParameterStoreBlogDevEncryptionKey

  ParameterStoreBlogDevEncryptionKey:
    Type: AWS::KMS::Key
    Properties:
      Description: 'Encryption key for secret config values for the Parameter Store blog post'
      Enabled: True
      EnableKeyRotation: False
      KeyPolicy:
        Version: '2012-10-17'
        Id: 'key-default-1'
        Statement:
          -
            Sid: 'Allow administration of the key & encryption of new values'
            Effect: Allow
            Principal:
              AWS:
                - !Sub 'arn:aws:iam::${AWS::AccountId}:user/${IAMUsername}'
            Action:
              - 'kms:Create*'
              - 'kms:Encrypt'
              - 'kms:Describe*'
              - 'kms:Enable*'
              - 'kms:List*'
              - 'kms:Put*'
              - 'kms:Update*'
              - 'kms:Revoke*'
              - 'kms:Disable*'
              - 'kms:Get*'
              - 'kms:Delete*'
              - 'kms:ScheduleKeyDeletion'
              - 'kms:CancelKeyDeletion'
            Resource: '*'
          -
            Sid: 'Allow use of the key'
            Effect: Allow
            Principal:
              AWS: !GetAtt ParameterStoreBlogFunctionRoleDev.Arn
            Action:
              - 'kms:Encrypt'
              - 'kms:Decrypt'
              - 'kms:ReEncrypt*'
              - 'kms:GenerateDataKey*'
              - 'kms:DescribeKey'
            Resource: '*'

This YAML code creates an encryption key with a key policy with two statements.

The first statement allows a given user (${IAMUsername}) to administer the key. Importantly, this includes the ability to encrypt values using this key and disable or delete this key, but does not allow the administrator to decrypt values that were encrypted with this key.

The second statement grants your Lambda function permission to encrypt and decrypt values using this key. The alias for this key in KMS is ParameterStoreBlogKeyDev, which is how you reference it later.

Lambda function

Here I walk you through the Lambda function code.

import os, traceback, json, configparser, boto3
from aws_xray_sdk.core import patch_all
patch_all()

# Initialize boto3 client at global scope for connection reuse
client = boto3.client('ssm')
env = os.environ['ENV']
app_config_path = os.environ['APP_CONFIG_PATH']
full_config_path = '/' + env + '/' + app_config_path
# Initialize app at global scope for reuse across invocations
app = None

class MyApp:
    def __init__(self, config):
        """
        Construct new MyApp with configuration
        :param config: application configuration
        """
        self.config = config

    def get_config(self):
        return self.config

def load_config(ssm_parameter_path):
    """
    Load configparser from config stored in SSM Parameter Store
    :param ssm_parameter_path: Path to app config in SSM Parameter Store
    :return: ConfigParser holding loaded config
    """
    configuration = configparser.ConfigParser()
    try:
        # Get all parameters for this app
        param_details = client.get_parameters_by_path(
            Path=ssm_parameter_path,
            Recursive=False,
            WithDecryption=True
        )

        # Loop through the returned parameters and populate the ConfigParser
        if 'Parameters' in param_details and len(param_details.get('Parameters')) > 0:
            for param in param_details.get('Parameters'):
                param_path_array = param.get('Name').split("/")
                section_position = len(param_path_array) - 1
                section_name = param_path_array[section_position]
                config_values = json.loads(param.get('Value'))
                config_dict = {section_name: config_values}
                print("Found configuration: " + str(config_dict))
                configuration.read_dict(config_dict)

    except:
        print("Encountered an error loading config from SSM.")
        traceback.print_exc()
    finally:
        return configuration

def lambda_handler(event, context):
    global app
    # Initialize app if it doesn't yet exist
    if app is None:
        print("Loading config and creating new MyApp...")
        config = load_config(full_config_path)
        app = MyApp(config)

    return "MyApp config is " + str(app.get_config()._sections)

Beneath the import statements, you import the patch_all function from the AWS X-Ray library, which you use to patch boto3 to create X-Ray segments for all your boto3 operations.

Next, you create a boto3 SSM client at the global scope for reuse across function invocations, following Lambda best practices. Using the function environment variables, you assemble the path where you expect to find your configuration in Parameter Store. The class MyApp is meant to serve as an example of an application that would need its configuration injected at construction. In this example, you create an instance of ConfigParser, a class in Python’s standard library for handling basic configurations, to give to MyApp.

The load_config function loads all the parameters from Parameter Store at the level immediately beneath the path provided in the Lambda function environment variables. Each parameter found is put into a new section in ConfigParser. The name of the section is the name of the parameter, less the base path. In this example, the full parameter name is /dev/parameterStoreBlog/appConfig, which is put in a section named appConfig.

Finally, the lambda_handler function initializes an instance of MyApp if it doesn’t already exist, constructing it with the loaded configuration from Parameter Store. Then it simply returns the currently loaded configuration in MyApp. The impact of this design is that the configuration is only loaded from Parameter Store the first time that the Lambda function execution environment is initialized. Subsequent invocations reuse the existing instance of MyApp, resulting in improved performance. You see this in the X-Ray traces later in this post. For more advanced use cases where configuration changes need to be received immediately, you could implement an expiry policy for your configuration entries or push notifications to your function.

To confirm that everything was created successfully, test the function in the Lambda console.

  1. Open the Lambda console.
  2. In the navigation pane, choose Functions.
  3. In the Functions pane, filter to ParameterStoreBlogFunctionDev to find the function created by the SAM template earlier. Open the function name to view its details.
  4. On the top right of the function detail page, choose Test. You may need to create a new test event. The input JSON doesn’t matter as this function ignores the input.

After running the test, you should see output similar to the following. This demonstrates that the function successfully fetched the unencrypted configuration from Parameter Store.

Create an encrypted parameter

You currently have a simple, unencrypted parameter and a Lambda function that can access it.

Next, you create an encrypted parameter that only your Lambda function has permission to use for decryption. This limits read access for this parameter to only this Lambda function.

To follow along with this section, deploy the SAM template for this post in your account and make your IAM user name the KMS key admin mentioned earlier.

  1. In the Systems Manager console, under Shared Resources, choose Parameter Store.
  2. Choose Create Parameter.
    • For Name, enter /dev/parameterStoreBlog/appSecrets.
    • For Type, select Secure String.
    • For KMS Key ID, choose alias/ParameterStoreBlogKeyDev, which is the key that your SAM template created.
    • For Value, enter {"secretKey": "secretValue"}.
    • Choose Create Parameter.
  3. If you now try to view the value of this parameter by choosing the name of the parameter in the parameters list and then choosing Show next to the Value field, you won’t see the value appear. This is because, even though you have permission to encrypt values using this KMS key, you do not have permissions to decrypt values.
  4. In the Lambda console, run another test of your function. You now also see the secret parameter that you created and its decrypted value.

If you do not see the new parameter in the Lambda output, this may be because the Lambda execution environment is still warm from the previous test. Because the parameters are loaded at Lambda startup, you need a fresh execution environment to refresh the values.

Adjust the function timeout to a different value in the Advanced Settings at the bottom of the Lambda Configuration tab. Choose Save and test to trigger the creation of a new Lambda execution environment.
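
If you prefer to script this step rather than click through the console, the same encrypted parameter can be created with boto3. This is only a sketch; run it as the IAM user who administers the key, because the Lambda role is only allowed to decrypt. The names match the ones used throughout this post.

import json
import boto3

ssm = boto3.client('ssm')

# Create the SecureString parameter, encrypted with the key from the SAM template
ssm.put_parameter(
    Name='/dev/parameterStoreBlog/appSecrets',
    Description='Sample dev secrets for my app',
    Value=json.dumps({'secretKey': 'secretValue'}),
    Type='SecureString',
    KeyId='alias/ParameterStoreBlogKeyDev'
)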

Profiling the impact of querying Parameter Store using AWS X-Ray

By using the AWS X-Ray SDK to patch boto3 in your Lambda function code, each invocation of the function creates traces in X-Ray. In this example, you can use these traces to validate the performance impact of your design decision to only load configuration from Parameter Store on the first invocation of the function in a new execution environment.

From the Lambda function details page where you tested the function earlier, under the function name, choose Monitoring. Choose View traces in X-Ray.

This opens the X-Ray console in a new window filtered to your function. Be aware of the time range field next to the search bar if you don’t see any search results.
In this screenshot, I’ve invoked the Lambda function twice, one time 10.3 minutes ago with a response time of 1.1 seconds and again 9.8 minutes ago with a response time of 8 milliseconds.

Looking at the details of the longer running trace by clicking the trace ID, you can see that the Lambda function spent the first ~350 ms of the full 1.1 sec routing the request through Lambda and creating a new execution environment for this function, as this was the first invocation with this code. This is the portion of time before the initialization subsegment.

Next, it took 725 ms to initialize the function, which includes executing the code at the global scope (including creating the boto3 client). This is also a one-time cost for a fresh execution environment.

Finally, the function executed for 65 ms, of which 63.5 ms was the GetParametersByPath call to Parameter Store.

Looking at the trace for the second, much faster function invocation, you see that the majority of the 8 ms execution time was Lambda routing the request to the function and returning the response. Only 1 ms of the overall execution time was attributed to the execution of the function, which makes sense given that after the first invocation you’re simply returning the config stored in MyApp.

While the Traces screen allows you to view the details of individual traces, the X-Ray Service Map screen allows you to view aggregate performance data for all traced services over a period of time.

In the X-Ray console navigation pane, choose Service map. Selecting a service node shows the metrics for node-specific requests. Selecting an edge between two nodes shows the metrics for requests that traveled that connection. Again, be aware of the time range field next to the search bar if you don’t see any search results.

After invoking your Lambda function several more times by testing it from the Lambda console, you can view some aggregate performance metrics. Look at the following:

  • From the client perspective, requests to the Lambda service for the function are taking an average of 50 ms to respond. The function is generating ~1 trace per minute.
  • The function itself is responding in an average of 3 ms. In the following screenshot, I’ve clicked on this node, which reveals a latency histogram of the traced requests showing that over 95% of requests return in under 5 ms.
  • Parameter Store is responding to requests in an average of 64 ms, but note the much lower trace rate in the node. This is because you only fetch data from Parameter Store on the initialization of the Lambda execution environment.

Conclusion

Deduplication, encryption, and restricted access to shared configuration and secrets are key components of any mature architecture. Serverless architectures designed using event-driven, on-demand, compute services like Lambda are no different.

In this post, I walked you through a sample application accessing unencrypted and encrypted values in Parameter Store. These values were created in a hierarchy by application environment and component name, with the permissions to decrypt secret values restricted to only the function needing access. The techniques used here can become the foundation of secure, robust configuration management in your enterprise serverless applications.

Chrome and Firefox Block 123movies Over “Harmful Programs”

Post Syndicated from Ernesto original https://torrentfreak.com/chrome-and-firefox-block-123movies-over-harmful-programs-180209/

With millions of visitors per day, 123movies(hub), also known as Gomovies, is one of the largest pirate streaming sites on the web.

Today, however, many visitors were welcomed by a dangerous-looking red banner instead of the usual homepage.

“The site ahead contains harmful programs,” Chrome warns its users. “Attackers on 123movieshub.to might attempt to trick you into installing programs that harm your browsing experience.”

It is not clear what the problem is in this particular case, but these types of notifications are often triggered by malicious or deceptive third-party advertising that has appeared on a site.

Warning

These warning messages are triggered by Google’s Safebrowsing algorithm which flags websites that pose a potential danger to visitors. Chrome, Firefox, and others use this service to prevent users from running into unwanted software.

In addition to the browser block, Google generally informs the site’s owners that their domain will be demoted in search results until the issue is resolved.

Google previously informed us that these kinds of warnings automatically disappear when the flagged sites no longer violate Google’s policy. This can take a day or two, but sometimes longer.
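
For site owners who want to check whether a domain is currently flagged, Google also exposes a Safe Browsing Lookup API. The following is a rough Python sketch, not an official example; it assumes you have created an API key in the Google developer console, and an empty response means no match was found.

import requests

API_KEY = "YOUR_API_KEY"
ENDPOINT = "https://safebrowsing.googleapis.com/v4/threatMatches:find?key=" + API_KEY

body = {
    "client": {"clientId": "example-check", "clientVersion": "1.0"},
    "threatInfo": {
        "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"],
        "platformTypes": ["ANY_PLATFORM"],
        "threatEntryTypes": ["URL"],
        "threatEntries": [{"url": "http://123movieshub.to/"}],
    },
}

# An empty JSON object means the URL is not currently on a Safe Browsing list
matches = requests.post(ENDPOINT, json=body).json()
print(matches if matches else "No Safe Browsing match for this URL")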

This isn’t the first time that Google has flagged such a large website. Many pirate sites, including The Pirate Bay, have been affected by this issue in the past.

Chrome and Firefox users should be familiar with these intermittent warning notices by now. If users believe that an affected site is harmless they can always take steps (Chrome, FF) to bypass the blocks, but that’s completely at their own risk.

Astro Pi celebrates anniversary of ISS Columbus module

Post Syndicated from David Honess original https://www.raspberrypi.org/blog/astro-pi-celebrates-anniversary/

Right now, 400km above the Earth aboard the International Space Station, are two very special Raspberry Pi computers. They were launched into space on 6 December 2015 and are, most assuredly, the farthest-travelled Raspberry Pi computers in existence. Each year they run experiments that school students create in the European Astro Pi Challenge.

Raspberry Astro Pi units on the International Space Station

Left: Astro Pi Vis (Ed); right: Astro Pi IR (Izzy). Image credit: ESA.

The European Columbus module

Today marks the tenth anniversary of the launch of the European Columbus module. The Columbus module is the European Space Agency’s largest single contribution to the ISS, and it supports research in many scientific disciplines, from astrobiology and solar science to metallurgy and psychology. More than 225 experiments have been carried out inside it during the past decade. It’s also home to our Astro Pi computers.

Here’s a video from 7 February 2008, when Space Shuttle Atlantis went skywards carrying the Columbus module in its cargo bay.

STS-122 Launch NASA TV Coverage

NASA TV coverage of STS-122, the 121st Space Shuttle launch, from February 7th, 2008. Liftoff was at 2:45:30 P.M. ET, and the coverage begins exactly one hour before launch.

Today, coincidentally, is also the deadline for the European Astro Pi Challenge: Mission Space Lab. Participating teams have until midnight tonight to submit their experiments.

Anniversary celebrations

At 16:30 GMT today there will be a live event on NASA TV for the Columbus module anniversary with NASA flight engineers Joe Acaba and Mark Vande Hei.

Our Astro Pi computers will be joining in the celebrations by displaying a digital birthday candle that the crew can blow out. It works by detecting an increase in humidity when someone blows on it. The video below demonstrates the concept.

AstroPi candle

Uploaded by Effi Edmonton on 2018-01-17.

Do try this at home

The exact Astro Pi code that will run on the ISS today is available for you to download and run on your own Raspberry Pi and Sense HAT. You’ll notice that the program includes code to make it stop automatically when the date changes to 8 February. This is just to save time for the ground control team.

If you have a Raspberry Pi and a Sense HAT, you can use the terminal commands below to download and run the code yourself:

wget http://rpf.io/colbday -O birthday.py
chmod +x birthday.py
./birthday.py

When you see a blank blue screen with the brightness increasing, the Sense HAT is measuring the baseline humidity. It does this every 15 minutes so it can recalibrate to take account of natural changes in background humidity. A humidity increase of 2% is needed to blow out the candle, so if the background humidity changes by more than 2% in 15 minutes, it’s possible to get a false positive. Press Ctrl + C to quit.
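
If you would like to see the core idea in just a few lines, here is a simplified sketch using the Sense HAT library: take a baseline reading, then wait for a 2% rise in humidity. It leaves out the official script’s 15-minute recalibration, and the message is just an example.

from sense_hat import SenseHat
import time

sense = SenseHat()
baseline = sense.get_humidity()
sense.clear(0, 0, 255)                 # the "lit" candle: a blue screen

while True:
    if sense.get_humidity() - baseline > 2:   # a 2% rise counts as a blow
        sense.clear()                         # candle blown out
        sense.show_message("Happy birthday, Columbus!")
        break
    time.sleep(0.5)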

Please tweet pictures of your candles to @astro_pi – we might share yours! And if we’re lucky, we might catch a glimpse of the candle on the ISS during the NASA TV event at 16:30 GMT today.

The post Astro Pi celebrates anniversary of ISS Columbus module appeared first on Raspberry Pi.

Build a Multi-Tenant Amazon EMR Cluster with Kerberos, Microsoft Active Directory Integration and EMRFS Authorization

Post Syndicated from Songzhi Liu original https://aws.amazon.com/blogs/big-data/build-a-multi-tenant-amazon-emr-cluster-with-kerberos-microsoft-active-directory-integration-and-emrfs-authorization/

One of the challenges faced by our customers—especially those in highly regulated industries—is balancing the need for security with flexibility. In this post, we cover how to enable multi-tenancy and increase security by using EMRFS (EMR File System) authorization, the Amazon S3 storage-level authorization on Amazon EMR.

Amazon EMR is an easy, fast, and scalable analytics platform enabling large-scale data processing. EMRFS authorization provides Amazon S3 storage-level authorization by configuring EMRFS with multiple IAM roles. With this functionality enabled, different users and groups can share the same cluster and assume their own IAM roles respectively.

Simply put, on Amazon EMR, we can now have an Amazon EC2 role per user assumed at run time instead of one general EC2 role at the cluster level. When the user is trying to access Amazon S3 resources, Amazon EMR evaluates against a predefined mappings list in EMRFS configurations and picks up the right role for the user.

In this post, we will discuss what EMRFS authorization is (Amazon S3 storage-level access control) and show how to configure the role mappings with detailed examples. You will then have the desired permissions in a multi-tenant environment. We also demo Amazon S3 access from HDFS command line, Apache Hive on Hue, and Apache Spark.

EMRFS authorization for Amazon S3

There are two prerequisites for using this feature:

  1. Users must be authenticated, because EMRFS needs to map the current user/group/prefix to a predefined user/group/prefix. There are several authentication options. In this post, we launch a Kerberos-enabled cluster that manages the Key Distribution Center (KDC) on the master node, and enable a one-way trust from the KDC to a Microsoft Active Directory domain.
  2. The application must support accessing Amazon S3 via EMRFS. Applications that have their own S3FileSystem APIs (for example, Presto) are not supported at this time.

EMRFS supports three types of mapping entries: user, group, and Amazon S3 prefix. Let’s use an example to show how this works.

Assume that you have the following three identities in your organization, and they are defined in the Active Directory:

To enable all these groups and users to share the EMR cluster, you need to define the following IAM roles:

In this case, you create a separate Amazon EC2 role that doesn’t give any permission to Amazon S3. Let’s call the role the base role (the EC2 role attached to the EMR cluster), which in this example is named EMR_EC2_RestrictedRole. Then, you define all the Amazon S3 permissions for each specific user or group in their own roles. The restricted role serves as the fallback when the user doesn’t match any user or group mapping and isn’t trying to access any of the Amazon S3 prefixes defined in the list.

Important: For all other roles, like emrfs_auth_group_role_data_eng, you need to add the base role (EMR_EC2_RestrictedRole) as the trusted entity so that it can assume other roles. See the following example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::511586466501:role/EMR_EC2_RestrictedRole"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

The following is an example policy for the admin user role (emrfs_auth_user_role_admin_user):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": "*"
        }
    ]
}

We are assuming the admin user has access to all buckets in this example.

The following is an example policy for the data science group role (emrfs_auth_group_role_data_sci):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::emrfs-auth-data-science-bucket-demo/*",
                "arn:aws:s3:::emrfs-auth-data-science-bucket-demo"
            ],
            "Action": [
                "s3:*"
            ]
        }
    ]
}

This role grants all Amazon S3 permissions to the emrfs-auth-data-science-bucket-demo bucket and all the objects in it. Similarly, the policy for the role emrfs_auth_group_role_data_eng is shown below:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::emrfs-auth-data-engineering-bucket-demo/*",
                "arn:aws:s3:::emrfs-auth-data-engineering-bucket-demo"
            ],
            "Action": [
                "s3:*"
            ]
        }
    ]
}

Example role mappings configuration

To configure EMRFS authorization, you use an EMR security configuration. Here is the configuration we use in this post.

Consider the following scenario.

First, the admin user admin1 tries to log in and run a command to access Amazon S3 data through EMRFS. The first role emrfs_auth_user_role_admin_user on the mapping list, which is a user role, is mapped and picked up. Then admin1 has access to the Amazon S3 locations that are defined in this role.

Then a user from the data engineer group (grp_data_engineering) tries to access a data bucket to run some jobs. When EMRFS sees that the user is a member of the grp_data_engineering group, the group role emrfs_auth_group_role_data_eng is assumed, and the user has proper access to Amazon S3 that is defined in the emrfs_auth_group_role_data_eng role.

Next comes a third user, who is not an admin and doesn’t belong to any of the groups. After the top three entries fail to match, EMRFS evaluates whether the user is trying to access an Amazon S3 prefix defined in the last mapping entry. This type of mapping entry is called the prefix type. If the user is trying to access s3://emrfs-auth-default-bucket-demo/, then the prefix mapping is in effect, and the prefix role emrfs_auth_prefix_role_default_s3_prefix is assumed.

If the user is not trying to access any of the Amazon S3 paths that are defined on the list—which means it failed the evaluation of all the entries—it only has the permissions defined in the EMR_EC2_RestrictedRole. This role is assumed by the EC2 instances in the cluster.

In this process, all the mappings defined are evaluated in the defined order, and the first role that is mapped is assumed, and the rest of the list is skipped.
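
To make that evaluation order concrete, here is a plain-Python illustration of the decision logic. This is not EMRFS code, just a sketch of the behaviour described above, using the role names from this post.

BASE_ROLE = "EMR_EC2_RestrictedRole"

MAPPINGS = [
    {"type": "user",   "match": "admin1",               "role": "emrfs_auth_user_role_admin_user"},
    {"type": "group",  "match": "grp_data_engineering", "role": "emrfs_auth_group_role_data_eng"},
    {"type": "group",  "match": "grp_data_science",     "role": "emrfs_auth_group_role_data_sci"},
    {"type": "prefix", "match": "s3://emrfs-auth-default-bucket-demo/",
     "role": "emrfs_auth_prefix_role_default_s3_prefix"},
]

def role_for(user, groups, s3_path):
    # Evaluate the mappings in order; the first match wins, otherwise fall back
    # to the base EC2 role attached to the cluster
    for m in MAPPINGS:
        if m["type"] == "user" and user == m["match"]:
            return m["role"]
        if m["type"] == "group" and m["match"] in groups:
            return m["role"]
        if m["type"] == "prefix" and s3_path.startswith(m["match"]):
            return m["role"]
    return BASE_ROLE

print(role_for("dataeng1", ["grp_data_engineering"], "s3://emrfs-auth-data-engineering-bucket-demo/x"))
# -> emrfs_auth_group_role_data_eng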

Setting up an EMR cluster and mapping Active Directory users and groups

Now that we know how EMRFS authorization role mapping works, the next thing we need to think about is how we can use this feature in an easy and manageable way.

Active Directory setup

Many customers manage their users and groups using Microsoft Active Directory or other tools like OpenLDAP. In this post, we create the Active Directory on an Amazon EC2 instance running Windows Server and create the users and groups we will be using in the example below. After setting up Active Directory, we use the Amazon EMR Kerberos auto-join capability to establish a one-way trust from the KDC running on the EMR master node to the Active Directory domain on the EC2 instance. You can use your own directory services as long as it talks to the LDAP (Lightweight Directory Access Protocol).

To create and join Active Directory to Amazon EMR, follow the steps in the blog post Use Kerberos Authentication to Integrate Amazon EMR with Microsoft Active Directory.

After configuring Active Directory, you can create all the users and groups using the Active Directory tools and add users to appropriate groups. In this example, we created users like admin1, dataeng1, datascientist1, grp_data_engineering, and grp_data_science, and then add the users to the right groups.

Join the EMR cluster to an Active Directory domain

For clusters with Kerberos, Amazon EMR now supports automated Active Directory domain joins. You can use the security configuration to configure the one-way trust from the KDC to the Active Directory domain. You also configure the EMRFS role mappings in the same security configuration.

The following is an example of the EMR security configuration with a trusted Active Directory domain EMRKRB.TEST.COM and the EMRFS role mappings as we discussed earlier:

The EMRFS role mapping configuration is shown in this example:

We will also provide an example AWS CLI command that you can run.

Launching the EMR cluster and running the tests

Now you have configured Kerberos and EMRFS authorization for Amazon S3.

Additionally, you need to configure Hue with Active Directory using the Amazon EMR configuration API in order to log in using the AD users created before. The following is an example of Hue AD configuration.

[
  {
    "Classification":"hue-ini",
    "Properties":{

    },
    "Configurations":[
      {
        "Classification":"desktop",
        "Properties":{

        },
        "Configurations":[
          {
            "Classification":"ldap",
            "Properties":{

            },
            "Configurations":[
              {
                "Classification":"ldap_servers",
                "Properties":{

                },
                "Configurations":[
                  {
                    "Classification":"AWS",
                    "Properties":{
                      "base_dn":"DC=emrkrb,DC=test,DC=com",
                      "ldap_url":"ldap://emrkrb.test.com",
                      "search_bind_authentication":"false",
                      "bind_dn":"CN=adjoiner,CN=users,DC=emrkrb,DC=test,DC=com",
                      "bind_password":"Abc123456",
                      "create_users_on_login":"true",
                      "nt_domain":"emrkrb.test.com"
                    },
                    "Configurations":[

                    ]
                  }
                ]
              }
            ]
          },
          {
            "Classification":"auth",
            "Properties":{
              "backend":"desktop.auth.backend.LdapBackend"
            },
            "Configurations":[

            ]
          }
        ]
      }
    ]
  }
]

Note: In the preceding configuration JSON file, change the values as required before pasting it into the software setting section in the Amazon EMR console.

Now let’s use this configuration and the security configuration you created before to launch the cluster.

In the Amazon EMR console, choose Create cluster. Then choose Go to advanced options. On the Step1: Software and Steps page, under Edit software settings (optional), paste the configuration in the box.

The rest of the setup is the same as an ordinary cluster setup, except in the Security Options section. In Step 4: Security, under Permissions, choose Custom, and then choose the RestrictedRole that you created before.

Choose the appropriate subnets (these should meet the base requirement in order for a successful Active Directory join—see the Amazon EMR Management Guide for more details), and choose the appropriate security groups to make sure it talks to the Active Directory. Choose a key so that you can log in and configure the cluster.

Most importantly, choose the security configuration that you created earlier to enable Kerberos and EMRFS authorization for Amazon S3.

You can use the following AWS CLI command to create a cluster.

aws emr create-cluster --name "TestEMRFSAuthorization" \
--release-label emr-5.10.0 \
--instance-type m3.xlarge \
--instance-count 3 \
--ec2-attributes InstanceProfile=EMR_EC2_DefaultRole,KeyName=MyEC2KeyPair \
--service-role EMR_DefaultRole \
--security-configuration MyKerberosConfig \
--configurations file://hue-config.json \
--applications Name=Hadoop Name=Hive Name=Hue Name=Spark \
--kerberos-attributes Realm=EC2.INTERNAL,KdcAdminPassword=<YourClusterKDCAdminPassword>,ADDomainJoinUser=<YourADUserLogonName>,ADDomainJoinPassword=<YourADUserPassword>,CrossRealmTrustPrincipalPassword=<MatchADTrustPwd>

Note: If you create the cluster using CLI, you need to save the JSON configuration for Hue into a file named hue-config.json and place it on the server where you run the CLI command.

After the cluster gets into the Waiting state, try to connect by using SSH into the cluster using the Active Directory user name and password.

ssh -l <YourADUserLogonName>@emrkrb.test.com <EMR IP or DNS name>

Quickly run two commands to show that the Active Directory join is successful:

  1. id [user name] shows the mapped AD users and groups in Linux.
  2. hdfs groups [user name] shows the mapped group in Hadoop.

Both should return the current Active Directory user and group information if the setup is correct.

Now, you can test the user mapping first. Log in with the admin1 user, and run a Hadoop list directory command:

hadoop fs -ls s3://emrfs-auth-data-science-bucket-demo/

Now switch to a user from the data engineer group.

Retry the previous command to access the admin’s bucket. It should throw an Amazon S3 Access Denied exception.

When you try listing the Amazon S3 bucket that a data engineer group member has accessed, it triggers the group mapping.

hadoop fs -ls s3://emrfs-auth-data-engineering-bucket-demo/

It successfully returns the listing results. Next we will test Apache Hive and then Apache Spark.

 

To run jobs successfully, you need to create a home directory for every user in HDFS for staging data under /user/<username>. Users can configure a step to create a home directory at cluster launch time for every user who has access to the cluster. In this example, you use Hue since Hue will create the home directory in HDFS for the user at the first login. Here Hue also needs to be integrated with the same Active Directory as explained in the example configuration described earlier.

First, log in to Hue as a data engineer user, and open a Hive Notebook in Hue. Then run a query to create a new table pointing to the data engineer bucket, s3://emrfs-auth-data-engineering-bucket-demo/table1_data_eng/.

You can see that the table was created successfully. Now try to create another table pointing to the data science group’s bucket, where the data engineer group doesn’t have access.

It failed and threw an Amazon S3 Access Denied error.

Now insert one line of data into the successfully created table.

Next, log out, switch to a data science group user, and create another table, test2_datasci_tb.

The creation is successful.

The last task is to test Spark (it requires the user directory, but Hue created one in the previous step).

Now let’s come back to the command line and run some Spark commands.

Log in to the master node as the datascientist1 user:

Start the SparkSQL interactive shell by typing spark-sql, and run the show tables command. It should list the tables that you created using Hive.

As a data science group user, try select on both tables. You will find that you can only select the table defined in the location that your group has access to.
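A rough sketch of that session, reusing the illustrative table names from the Hive steps above (your names may differ):

-- Run inside the spark-sql shell as a data science group user.
SHOW TABLES;
SELECT * FROM test2_datasci_tb LIMIT 10;   -- succeeds: the group has access to this S3 location
SELECT * FROM test1_dataeng_tb LIMIT 10;   -- fails with an Amazon S3 Access Denied error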

Conclusion

EMRFS authorization for Amazon S3 enables you to have multiple roles on the same cluster, providing flexibility to configure a shared cluster for different teams to achieve better efficiency. The Active Directory integration and group mapping make it much easier for you to manage your users and groups, and provides better auditability in a multi-tenant environment.


Additional Reading

If you found this post useful, be sure to check out Use Kerberos Authentication to Integrate Amazon EMR with Microsoft Active Directory and Launching and Running an Amazon EMR Cluster inside a VPC.


About the Authors

Songzhi Liu is a Big Data Consultant with AWS Professional Services. He works closely with AWS customers to provide them Big Data & Machine Learning solutions and best practices on the Amazon cloud.


Progressing from tech to leadership

Post Syndicated from Michal Zalewski original http://lcamtuf.blogspot.com/2018/02/on-leadership.html

I’ve been a technical person all my life. I started doing vulnerability research in the late 1990s – and even today, when I’m not fiddling with CNC-machined robots or making furniture, I’m probably cobbling together a fuzzer or writing a book about browser protocols and APIs. In other words, I’m a geek at heart.

My career is a different story. Over the past two decades and change, I went from writing CGI scripts and setting up WAN routers for a chain of shopping malls, to doing pentests for institutional customers, to designing a series of network monitoring platforms and handling incident response for a big telco, to building and running the product security org for one of the largest companies in the world. It’s been an interesting ride – and now that I’m on the hook for the well-being of about 100 folks across more than a dozen subteams around the world, I’ve been thinking a bit about the lessons learned along the way.

Of course, I’m a bit hesitant to write such a post: sometimes, your efforts pan out not because of your approach, but despite it – and it’s possible to draw precisely the wrong conclusions from such anecdotes. Still, I’m very proud of the culture we’ve created and the caliber of folks working on our team. It happened through the work of quite a few talented tech leads and managers even before my time, but it did not happen by accident – so I figured that my observations may be useful for some, as long as they are taken with a grain of salt.

But first, let me start on a somewhat somber note: what nobody tells you is that one’s level on the leadership ladder tends to be inversely correlated with several measures of happiness. The reason is fairly simple: as you get more senior, a growing number of people will come to you expecting you to solve increasingly fuzzy and challenging problems – and you will no longer be patted on the back for doing so. This should not scare you away from such opportunities, but it definitely calls for a particular mindset: your motivation must come from within. Look beyond the fight-of-the-day; find satisfaction in seeing how far your teams have come over the years.

With that out of the way, here’s a collection of notes, loosely organized into three major themes.

The curse of a techie leader

Perhaps the most interesting observation I have is that for a person coming from a technical background, building a healthy team is first and foremost about the subtle art of letting go.

There is a natural urge to stay involved in any project you’ve started or helped improve; after all, it’s your baby: you’re familiar with all the nuts and bolts, and nobody else can do this job as well as you. But as your sphere of influence grows, this becomes a choke point: there are only so many things you could be doing at once. Just as importantly, the project-hoarding behavior robs more junior folks of the ability to take on new responsibilities and bring their own ideas to life. In other words, when done properly, delegation is not just about freeing up your plate; it’s also about empowerment and about signalling trust.

Of course, when you hand your project over to somebody else, the new owner will initially be slower and more clumsy than you; but if you pick the new leads wisely, give them the right tools and the right incentives, and don’t make them deathly afraid of messing up, they will soon excel at their new jobs – and be grateful for the opportunity.

A related affliction of many accomplished techies is the conviction that they know the answers to every question even tangentially related to their domain of expertise; that belief is coupled with a burning desire to have the last word in every debate. When practiced in moderation, this behavior is fine among peers – but for a leader, one of the most important skills to learn is knowing when to keep your mouth shut: people learn a lot better by experimenting and making small mistakes than by being schooled by their boss, and they often try to read into your passing remarks. Don’t run an authoritarian camp focused on total risk aversion or perfectly efficient resource management; just set reasonable boundaries and exit conditions for experiments so that they don’t spiral out of control – and be amazed by the results every now and then.

Death by planning

When nothing is on fire, it’s easy to get preoccupied with maintaining the status quo. If your current headcount or budget request lists all the same projects as last year’s, or if you ever find yourself ending an argument by deferring to a policy or a process document, it’s probably a sign that you’re getting complacent. In security, complacency usually ends in tears – and when it doesn’t, it leads to burnout or boredom.

In my experience, your goal should be to develop a cadre of managers or tech leads capable of coming up with clever ideas, prioritizing them among themselves, and seeing them to completion without your day-to-day involvement. In your spare time, make it your mission to challenge them to stay ahead of the curve. Ask your vendor security lead how they’d streamline their work if they had a 40% jump in the number of vendors but no extra headcount; ask your product security folks what’s the second line of defense or containment should your primary defenses fail. Help them get good ideas off the ground; set some mental success and failure criteria to be able to cut your losses if something does not pan out.

Of course, malfunctions happen even in the best-run teams; to spot trouble early on, instead of overzealous project tracking, I found it useful to encourage folks to run a data-driven org. I’d usually ask them to imagine that a brand new VP shows up in our office and, as his first order of business, asks “why do you have so many people here and how do I know they are doing the right things?”. Not everything in security can be quantified, but hard data can validate many of your assumptions – and will alert you to unseen issues early on.

When focusing on data, it’s important not to treat pie charts and spreadsheets as an art unto itself; if you run a security review process for your company, your CSAT scores are going to reach 100% if you just rubberstamp every launch request within ten minutes of receiving it. Make sure you’re asking the right questions; instead of “how satisfied are you with our process”, try “is your product better as a consequence of talking to us?”

Whenever things are not progressing as expected, it is a natural instinct to fall back to micromanagement, but it seldom truly cures the ill. It’s probable that your team disagrees with your vision or its feasibility – and that you’re either not listening to their feedback, or they don’t think you’d care. It’s good to assume that most of your employees are as smart or smarter than you; barking your orders at them more loudly or more frequently does not lead anyplace good. It’s good to listen to them and either present new facts or work with them on a plan you can all get behind.

In some circumstances, all that’s needed is honesty about the business trade-offs, so that your team feels like your “partner in crime”, not a victim of circumstance. For example, we’d tell our folks that by not falling behind on basic, unglamorous work, we earn the trust of our VPs and SVPs – and that this translates into the independence and the resources we need to pursue more ambitious ideas without being told what to do; it’s how we game the system, so to speak. Oh: leading by example is a pretty powerful tool at your disposal, too.

The human factor

I’ve come to appreciate that hiring decent folks who can get along with others is far more important than trying to recruit conference-circuit superstars. In fact, hiring superstars is a decidedly hit-and-miss affair: while certainly not a rule, there is a proportion of folks who put the maintenance of their celebrity status ahead of job responsibilities or the well-being of their peers.

For teams, one of the most powerful demotivators is a sense of unfairness and disempowerment. This is where tech-originating leaders can shine, because their teams usually feel that their bosses understand and can evaluate the merits of the work. But it also means you need to be decisive and actually solve problems for them, rather than just letting them vent. You will need to make unpopular decisions every now and then; in such cases, I think it’s important to move quickly, rather than prolonging the uncertainty – but it’s also important to sincerely listen to concerns, explain your reasoning, and be frank about the risks and trade-offs.

Whenever you see a clash of personalities on your team, you probably need to respond swiftly and decisively; being right should not justify being a bully. If you don’t react to repeated scuffles, your best people will probably start looking for other opportunities: it’s draining to put up with constant pie fights, no matter if the pies are thrown straight at you or if you just need to duck one every now and then.

More broadly, personality differences seem to be a much better predictor of conflict than any technical aspects underpinning a debate. As a boss, you need to identify such differences early on and come up with creative solutions. Sometimes, all you need is taking some badly-delivered but valid feedback and having a conversation with the other person, asking some questions that can help them reach the same conclusions without feeling that their worldview is under attack. Other times, the only path forward is making sure that some folks simply don’t run into each other for a while.

Finally, dealing with low performers is a notoriously hard but important part of the game. Especially within large companies, there is always the temptation to just let it slide: sideline a struggling person and wait for them to either get over their issues or leave. But this sends an awful message to the rest of the team; for better or worse, fairness is important to most. Simply firing the low performers is seldom the best solution, though; successful recovery cases are what sets great managers apart from the average ones.

Oh, one more thought: people in leadership roles have their allegiance divided between the company and the people who depend on them. The obligation to the company is more formal, but the impact you have on your team is longer-lasting and more intimate. When the obligations to the employer and to your team collide in some way, make sure you can make the right call; it might be one of the most consequential decisions you’ll ever make.

Playboy’s Copyright Lawsuit Threatens Online Expression, Boing Boing Argues

Post Syndicated from Ernesto original https://torrentfreak.com/playboys-copyright-lawsuit-threatens-online-expression-boing-boing-argues-180202/

Early 2016, Boing Boing co-editor Xeni Jardin published an article in which she linked to an archive of every Playboy centerfold image published up to that point.

“Kind of amazing to see how our standards of hotness, and the art of commercial erotic photography, have changed over time,” Jardin commented.

While the linked material undoubtedly appealed to many readers, Playboy itself took offense to the fact that infringing copies of their work were being shared in public. While Boing Boing didn’t upload or store the images in question, the publisher filed a complaint.

Playboy accused the blog’s parent company Happy Mutants of various counts of copyright infringement, claiming that it exploited their playmates’ images for commercial purposes.

Last month Boing Boing responded to the allegations with a motion to dismiss. The case should be thrown out, it argued, noting that linking to infringing material for the purpose of reporting and commentary is not against the law.

This prompted Playboy to fire back, branding Boing Boing a “clickbait” site. Playboy informed the court that the popular blog profits off the work of others and has no fair use defense.

Before the California District Court decides on the matter, Boing Boing took the opportunity to reply to Playboy’s latest response. According to the defense, Playboy’s case is an attack on people’s freedom of expression.

“Playboy claims this is an important case. It is partially correct: if the Court allows this case to go forward, it will send a dangerous message to everyone engaged in ordinary online commentary,” Boing Boing’s reply reads.

Referencing a previous Supreme Court decision, the blog says that the Internet democratizes access to speech, with websites as a form of modern-day pamphlets.

Links to source materials posted by third parties give these “pamphlets” more weight as they allow readers to form their own opinion on the matter, Boing Boing argues. If the court upholds Playboy’s arguments, however, this will become a risky endeavor.

“Playboy, however, would apparently prefer a world in which the ‘pamphleteer’ must ask for permission before linking to primary sources, on pain of expensive litigation,” the defense writes.

“This case merely has to survive a motion to dismiss to launch a thousand more expensive lawsuits, chilling a broad variety of lawful expression and reporting that merely adopts the common practice of linking to the material that is the subject of the report.”

The defense says that there are several problems with Playboy’s arguments. Among other things, Boing Boing argues that it did nothing to cause the unauthorized posting of Playboy’s work on Imgur and YouTube.

Another key argument is that linking to copyright-infringing material should be considered fair use, since it was for purposes of criticism, commentary, and news reporting.

“Settled precedent requires dismissal, both because Boing Boing did not induce or materially contribute to any copyright infringement and, in the alternative, because Boing Boing engaged in fair use,” the defense writes.

Instead of going after Boing Boing for contributory infringement, Playboy could actually try to uncover the people who shared the infringing material, they argue. There is nothing that prevents them from doing so.

After hearing the arguments from both sides it is now up to the court to decide how to proceed. Given what’s at stake, the eventual outcome in this case is bound to set a crucial precedent.

A copy of Boing Boing’s reply is available here (pdf).

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

The problematic Wannacry North Korea attribution

Post Syndicated from Robert Graham original http://blog.erratasec.com/2018/01/the-problematic-wannacry-north-korea.html

Last month, the US government officially “attributed” the Wannacry ransomware worm to North Korea. This attribution has three flaws, which are a good lesson for attribution in general.

It was an accident

The most important fact about Wannacry is that it was an accident. We’ve had 30 years of experience with Internet worms teaching us that worms are always accidents. While launching worms may be intentional, their effects cannot be predicted. While they appear to have targets, like Slammer against South Korea, or Witty against the Pentagon, further analysis shows this was just a random effect that was impossible to predict ahead of time. Only in hindsight are these effects explainable.

We should hold those causing accidents accountable, too, but it’s a different accountability. The U.S. has caused more civilian deaths in its War on Terror than the terrorists caused triggering that war. But we hold these to be morally different: the terrorists targeted the innocent, whereas the U.S. takes great pains to avoid civilian casualties.

Since we are talking about blaming those responsible for accidents, we also must include the NSA in that mix. The NSA created, then allowed the release of, weaponized exploits. That’s like accidentally dropping a load of unexploded bombs near a village. When those bombs are then used, those having lost the weapons are held guilty along with those using them. Yes, while we should blame the hacker who added ETERNALBLUE to their ransomware, we should also blame the NSA for losing control of ETERNALBLUE.

A country and its assets are different

Was it North Korea, or hackers affiliated with North Korea? These aren’t the same.

It’s hard for North Korea to have hackers of its own. It doesn’t have citizens who grow up with computers to pick from. Moreover, an internal hacking corps would create tainted citizens exposed to dangerous outside ideas. Update: Some people have pointed out that Kim Il-sung University in the capital does have some contact with the outside world, with academics granted limited Internet access, so I guess some tainting is allowed. Still, what we know of North Korea’s hacking efforts largely comes from hackers they employ outside North Korea. It was the Lazarus Group, outside North Korea, that did Wannacry.

Instead, North Korea develops external hacking “assets”, supporting several external hacking groups in China, Japan, and South Korea. This is similar to how intelligence agencies develop human “assets” in foreign countries. While these assets do things for their handlers, they also have normal day jobs, and do many things that are wholly independent and even sometimes against their handler’s interests.

For example, this Muckrock FOIA dump shows how “CIA assets” independently worked for Castro and assassinated a Panamanian president. That they also worked for the CIA does not make the CIA responsible for the Panamanian assassination.

That CIA/intelligence assets work this way is well-known and uncontroversial. The fact that countries use hacker assets like this is the controversial part. These hackers do act independently, yet we refuse to consider this when we want to “attribute” attacks.

Attribution is political

We have far better attribution for the nPetya attacks. It was less accidental (they clearly desired to disrupt Ukraine), and the hackers were much closer to the Russian government (Russian citizens). Yet, the Trump administration isn’t fighting Russia, it is fighting North Korea, so it doesn’t officially attribute nPetya to Russia, but does attribute Wannacry to North Korea.

Trump is in conflict with North Korea. He is looking for ways to escalate the conflict. Attributing Wannacry helps achieve his political objectives.

That it was blatantly politics is demonstrated by the way it was released to the press. It wasn’t released in the normal way, where the administration can stand behind it and get challenged on the particulars. Instead, it was pre-released through the normal system of “anonymous government officials” to the NYTimes, and then backed up with an op-ed in the Wall Street Journal. The government leaks information like this when it’s weak, not when it’s strong.

The proper way is to release the evidence upon which the decision was made, so that the public can challenge it. Among the questions the public would ask is whether they believe it was North Korea’s intention to cause precisely this effect, such as disabling the British NHS. Or, whether it was merely hackers “affiliated” with North Korea, or hackers carrying out North Korea’s orders. We cannot challenge the government this way because the government intentionally holds itself above such accountability.

Conclusion

We believe hacking groups tied to North Korea are responsible for Wannacry. Yet, even if that’s true, we still have three attribution problems. We still don’t know if that was intentional, in pursuit of some political goal, or an accident. We still don’t know if it was at the direction of North Korea, or whether their hacker assets acted independently. We still don’t know if the government has answers to these questions, or whether it’s exploiting this doubt to achieve political support for actions against North Korea.

Top 8 Best Practices for High-Performance ETL Processing Using Amazon Redshift

Post Syndicated from Thiyagarajan Arumugam original https://aws.amazon.com/blogs/big-data/top-8-best-practices-for-high-performance-etl-processing-using-amazon-redshift/

An ETL (Extract, Transform, Load) process enables you to load data from source systems into your data warehouse. This is typically executed as a batch or near-real-time ingest process to keep the data warehouse current and provide up-to-date analytical data to end users.

Amazon Redshift is a fast, petabyte-scale data warehouse that enables you to make data-driven decisions easily. With Amazon Redshift, you can get insights into your big data in a cost-effective fashion using standard SQL. You can set up any type of data model, from star and snowflake schemas to simple de-normalized tables, for running any analytical queries.

To operate a robust ETL platform and deliver data to Amazon Redshift in a timely manner, design your ETL processes to take account of Amazon Redshift’s architecture. When migrating from a legacy data warehouse to Amazon Redshift, it is tempting to adopt a lift-and-shift approach, but this can result in performance and scale issues long term. This post guides you through the following best practices for ensuring optimal, consistent runtimes for your ETL processes:

  • COPY data from multiple, evenly sized files.
  • Use workload management to improve ETL runtimes.
  • Perform table maintenance regularly.
  • Perform multiple steps in a single transaction.
  • Load data in bulk.
  • Use UNLOAD to extract large result sets.
  • Use Amazon Redshift Spectrum for ad hoc ETL processing.
  • Monitor daily ETL health using diagnostic queries.

1. COPY data from multiple, evenly sized files

Amazon Redshift is an MPP (massively parallel processing) database, where all the compute nodes divide and parallelize the work of ingesting data. Each node is further subdivided into slices, with each slice having one or more dedicated cores, equally dividing the processing capacity. The number of slices per node depends on the node type of the cluster. For example, each DS2.XLARGE compute node has two slices, whereas each DS2.8XLARGE compute node has 16 slices.

When you load data into Amazon Redshift, you should aim to have each slice do an equal amount of work. When you load the data from a single large file or from files split into uneven sizes, some slices do more work than others. As a result, the process runs only as fast as the slowest, or most heavily loaded, slice. In the example shown below, a single large file is loaded into a two-node cluster, resulting in only one of the nodes, “Compute-0”, performing all the data ingestion:

When splitting your data files, ensure that they are of approximately equal size – between 1 MB and 1 GB after compression. The number of files should be a multiple of the number of slices in your cluster. Also, I strongly recommend that you individually compress the load files using gzip, lzop, or bzip2 to efficiently load large datasets.

When loading multiple files into a single table, use a single COPY command for the table, rather than multiple COPY commands. Amazon Redshift automatically parallelizes the data ingestion. Using a single COPY command to bulk load data into a table ensures optimal use of cluster resources, and quickest possible throughput.
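For illustration, a single COPY that loads a set of gzipped, evenly sized files sharing a common prefix could look like the following; the table, bucket, and IAM role are placeholders:

-- One COPY ingests every file under the prefix; Amazon Redshift spreads the files across slices.
COPY orders
FROM 's3://my-etl-bucket/orders/part_'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
GZIP;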

2. Use workload management to improve ETL runtimes

Use Amazon Redshift’s workload management (WLM) to define multiple queues dedicated to different workloads (for example, ETL versus reporting) and to manage the runtimes of queries. As you migrate more workloads into Amazon Redshift, your ETL runtimes can become inconsistent if WLM is not appropriately set up.

I recommend limiting the overall concurrency of WLM across all queues to around 15 or less. This WLM guide helps you organize and monitor the different queues for your Amazon Redshift cluster.

When managing different workloads on your Amazon Redshift cluster, consider the following for the queue setup:

  • Create a queue dedicated to your ETL processes. Configure this queue with a small number of slots (5 or fewer). Amazon Redshift is designed for analytics queries, rather than transaction processing. The cost of COMMIT is relatively high, and excessive use of COMMIT can result in queries waiting for access to the commit queue. Because ETL is a commit-intensive process, having a separate queue with a small number of slots helps mitigate this issue.
  • Claim extra memory available in a queue. When executing an ETL query, you can take advantage of wlm_query_slot_count to claim the extra memory available in a particular queue. For example, a typical ETL process might involve COPYing raw data into a staging table so that downstream ETL jobs can run transformations that calculate daily, weekly, and monthly aggregates. To speed up the COPY process (so that the downstream tasks can start in parallel sooner), the wlm_query_slot_count can be increased for this step, as sketched just after this list.
  • Create a separate queue for reporting queries. Configure query monitoring rules on this queue to further manage long-running and expensive queries.
  • Take advantage of the dynamic memory parameters. They swap the memory from your ETL to your reporting queue after the ETL job has completed.
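As a hedged sketch of the slot-count technique described above (assuming the ETL queue is configured with 5 slots, and using placeholder table, bucket, and role names):

-- Claim all 5 slots of the ETL queue for this session so the COPY gets the queue's full memory.
SET wlm_query_slot_count TO 5;
COPY stage_tbl FROM 's3://my-etl-bucket/batch/' IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole' GZIP;
-- Return to the default of one slot per query for the rest of the session.
SET wlm_query_slot_count TO 1;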

3. Perform table maintenance regularly

Amazon Redshift is a columnar database, which enables fast transformations for aggregating data. Performing regular table maintenance ensures that transformation ETLs are predictable and performant. To get the best performance from your Amazon Redshift database, you must ensure that database tables regularly are VACUUMed and ANALYZEd. The Analyze & Vacuum schema utility helps you automate the table maintenance task and have VACUUM & ANALYZE executed in a regular fashion.

  • Use VACUUM to sort tables and remove deleted blocks

During a typical ETL refresh process, tables receive new incoming records using COPY, and unneeded data (cold data) is removed using DELETE. New rows are added to the unsorted region in a table. Deleted rows are simply marked for deletion.

DELETE does not automatically reclaim the space occupied by the deleted rows. Adding and removing large numbers of rows can therefore cause the unsorted region and the number of deleted blocks to grow. This can degrade the performance of queries executed against these tables.

After an ETL process completes, perform VACUUM to ensure that user queries execute in a consistent manner. The complete list of tables that need VACUUMing can be found using the Amazon Redshift Utils table_info script.

Use the following approaches to ensure that VACUUM is completed in a timely manner:

  • Use wlm_query_slot_count to claim all the memory allocated in the ETL WLM queue during the VACUUM process.
  • DROP or TRUNCATE intermediate or staging tables, thereby eliminating the need to VACUUM them.
  • If your table has a compound sort key with only one sort column, try to load your data in sort key order. This helps reduce or eliminate the need to VACUUM the table.
  • Consider using time series tables. This helps reduce the amount of data you need to VACUUM.

  • Use ANALYZE to update database statistics

Amazon Redshift uses a cost-based query planner and optimizer that relies on statistics about tables to make good decisions about the query plan for the SQL statements. Regular statistics collection after the ETL completes ensures that user queries run fast, and that daily ETL processes are performant. The Amazon Redshift Utils table_info script provides insights into the freshness of the statistics. Keeping the share of stale statistics (pct_stats_off) below 20% ensures effective query plans for the SQL queries.
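As a minimal sketch of such a post-ETL maintenance step, using the daily_table name from the transaction example below:

-- Re-sort rows and reclaim space left behind by deleted rows.
VACUUM FULL daily_table;
-- Refresh planner statistics so subsequent queries get good plans.
ANALYZE daily_table;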

4. Perform multiple steps in a single transaction

ETL transformation logic often spans multiple steps. Because commits in Amazon Redshift are expensive, if each ETL step performs a commit, multiple concurrent ETL processes can take a long time to execute.

To minimize the number of commits in a process, the steps in an ETL script should be surrounded by a BEGIN…END statement so that a single commit is performed only after all the transformation logic has been executed. Here is an example of a multi-step ETL script that performs one commit at the end:

BEGIN;
CREATE TEMPORARY TABLE staging_table (..);
INSERT INTO staging_table SELECT .. FROM source;        -- transformation logic
DELETE FROM daily_table WHERE dataset_date = ?;
INSERT INTO daily_table SELECT .. FROM staging_table;   -- daily aggregate
DELETE FROM weekly_table WHERE weekending_date = ?;
INSERT INTO weekly_table SELECT .. FROM staging_table;  -- weekly aggregate
COMMIT;

5. Loading data in bulk

Amazon Redshift is designed to store and query petabyte-scale datasets. Using Amazon S3 you can stage and accumulate data from multiple source systems before executing a bulk COPY operation. The following methods allow efficient and fast transfer of these bulk datasets into Amazon Redshift:

  • Use a manifest file to ingest large datasets that span multiple files. The manifest file is a JSON file that lists all the files to be loaded into Amazon Redshift. Using a manifest file ensures that Amazon Redshift has a consistent view of the data to be loaded from S3, while also ensuring that duplicate files do not result in the same data being loaded more than one time.
  • Use temporary staging tables to hold the data for transformation. These tables are automatically dropped after the ETL session is complete. Temporary tables can be created using the CREATE TEMPORARY TABLE syntax, or by issuing a SELECT … INTO #TEMP_TABLE query. Explicitly specifying the CREATE TEMPORARY TABLE statement allows you to control the DISTRIBUTION KEY, SORT KEY, and compression settings to further improve performance.
  • Use ALTER TABLE APPEND to swap data from the staging tables to the target table, as sketched just after this list. Data in the source table is moved to matching columns in the target table. Column order doesn’t matter. After data is successfully appended to the target table, the source table is empty. ALTER TABLE APPEND is much faster than a similar CREATE TABLE AS or INSERT INTO operation because data blocks are moved rather than duplicated.
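For illustration, appending staged rows into a target table might look like the following; the table names are placeholders, and note that ALTER TABLE APPEND cannot run inside an explicit transaction block:

-- Move all rows from the staging table into the target table.
-- The staging table is left empty, and no data is physically duplicated.
ALTER TABLE daily_table APPEND FROM staging_table;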

6. Use UNLOAD to extract large result sets

Fetching a large number of rows using SELECT is expensive and takes a long time. When a large amount of data is fetched from the Amazon Redshift cluster, the leader node has to hold the data temporarily until the fetches are complete. Further, data is streamed out sequentially, which results in longer elapsed time. As a result, the leader node can become hot, which not only affects the SELECT that is being executed, but also throttles resources for creating execution plans and managing the overall cluster resources. Here is an example of a large SELECT statement. Notice that the leader node is doing most of the work to stream out the rows:

Use UNLOAD to extract large result sets directly to S3. After it’s in S3, the data can be shared with multiple downstream systems. By default, UNLOAD writes data in parallel to multiple files according to the number of slices in the cluster. All the compute nodes participate to quickly offload the data into S3.

If you are extracting data for use with Amazon Redshift Spectrum, you should use the MAXFILESIZE parameter to keep files around 150 MB. Similar to item 1 above, having many evenly sized files ensures that Redshift Spectrum can do the maximum amount of work in parallel.
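A sketch of such an UNLOAD follows; the query, bucket, and IAM role are placeholders:

-- Write the result set in parallel to gzipped files of at most 150 MB each.
UNLOAD ('SELECT * FROM weekly_tbl')
TO 's3://my-etl-bucket/extracts/weekly_'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
MAXFILESIZE 150 MB
GZIP;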

7. Use Redshift Spectrum for ad hoc ETL processing

Events such as data backfill, promotional activity, and special calendar days can trigger additional data volumes that affect the data refresh times in your Amazon Redshift cluster. To help address these spikes in data volumes and throughput, I recommend staging data in S3. After data is organized in S3, Redshift Spectrum enables you to query it directly using standard SQL. In this way, you gain the benefits of additional capacity without having to resize your cluster.
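As a hedged sketch of what that can look like (the schema, table, columns, bucket, and role are placeholders; the external schema assumes an AWS Glue or Athena data catalog):

-- Register an external schema backed by the data catalog.
CREATE EXTERNAL SCHEMA spectrum_etl
FROM DATA CATALOG DATABASE 'etl_staging'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
CREATE EXTERNAL DATABASE IF NOT EXISTS;

-- Define an external table over the staged files, then query it with standard SQL.
CREATE EXTERNAL TABLE spectrum_etl.clickstream (
  event_time TIMESTAMP,
  user_id    BIGINT,
  url        VARCHAR(2048)
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION 's3://my-etl-bucket/spectrum/clickstream/';

SELECT COUNT(*) FROM spectrum_etl.clickstream WHERE event_time >= '2017-07-02';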

For tips on getting started with and optimizing the use of Redshift Spectrum, see the previous post, 10 Best Practices for Amazon Redshift Spectrum.

8. Monitor daily ETL health using diagnostic queries

Monitoring the health of your ETL processes on a regular basis helps identify the early onset of performance issues before they have a significant impact on your cluster. The following monitoring scripts can be used to provide insights into the health of your ETL processes:

  • commit_stats.sql – Commit queue statistics from past days, showing the largest queue length and queue time first.
    Use when: DML statements such as INSERT/UPDATE/COPY/DELETE operations take several times longer to execute when multiple of these operations are in progress.
    Solution: Set up separate WLM queues for the ETL process and limit the concurrency to < 5.

  • copy_performance.sql – COPY command statistics for the past days.
    Use when: Daily COPY operations take longer to execute.
    Solution: Follow the best practices for the COPY command. Analyze data growth with the incoming datasets and consider a cluster resize to meet the expected SLA.

  • table_info.sql – Table skew and unsorted statistics along with storage and key information.
    Use when: Transformation steps take longer to execute.
    Solution: Set up regular VACUUM jobs to address unsorted rows and reclaim the deleted blocks so that transformation SQL executes optimally. Consider a table redesign to avoid data skew.

  • v_check_transaction_locks.sql – Monitor transaction locks.
    Use when: INSERT/UPDATE/COPY/DELETE operations on particular tables do not respond in a timely manner, compared to when they run after the ETL.
    Solution: Multiple DML statements are operating on the same target table at the same moment from different transactions. Set up ETL job dependencies so that they execute serially for the same target table.

  • v_get_schema_priv_by_user.sql – Get the schemas that a user has access to.
    Use when: Reporting users can view intermediate tables.
    Solution: Set up separate database groups for reporting and ETL users, and grant access to objects using GRANT.

  • v_generate_tbl_ddl.sql – Get the table DDL.
    Use when: You need to create an empty table with the same structure as the target table for data backfill.
    Solution: Generate the DDL using this script for data backfill.

  • v_space_used_per_tbl.sql – Monitor space used by individual tables.
    Use when: Amazon Redshift data warehouse space growth is trending upwards more than normal.
    Solution: Analyze the individual tables that are growing at a higher rate than normal. Consider data archival using UNLOAD to S3 and Redshift Spectrum for later analysis. Use unscanned_table_summary.sql to find unused tables and archive or drop them.

  • top_queries.sql – Return the top 50 most time-consuming statements aggregated by their text.
    Use when: ETL transformations are taking longer to execute.
    Solution: Analyze the top transformation SQL and use EXPLAIN to find opportunities for tuning the query plan.

There are several other useful scripts available in the amazon-redshift-utils repository. The AWS Lambda Utility Runner runs a subset of these scripts on a scheduled basis, allowing you to automate much of the monitoring of your ETL processes.

Example ETL process

The following ETL process reinforces some of the best practices discussed in this post. Consider the following four-step daily ETL workflow where data from an RDBMS source system is staged in S3 and then loaded into Amazon Redshift. Amazon Redshift is used to calculate daily, weekly, and monthly aggregations, which are then unloaded to S3, where they can be further processed and made available for end-user reporting using a number of different tools, including Redshift Spectrum and Amazon Athena.

Step 1: Extract from the RDBMS source to an S3 bucket

In this ETL process, the data extract job fetches change data every hour and stages it in multiple hourly files. For example, the staged S3 folder looks like the following:

$ aws s3 ls s3://<<S3 Bucket>>/batch/2017/07/02/
2017-07-02 01:59:58   81900220 20170702T01.export.gz
2017-07-02 02:59:56   84926844 20170702T02.export.gz
2017-07-02 03:59:54   78990356 20170702T03.export.gz
…
2017-07-02 22:00:03   75966745 20170702T21.export.gz
2017-07-02 23:00:02   89199874 20170702T22.export.gz
2017-07-02 00:59:59   71161715 20170702T23.export.gz

Organizing the data into multiple, evenly sized files enables the COPY command to ingest this data using all available resources in the Amazon Redshift cluster. Further, the files are compressed (gzipped) to further reduce COPY times.

Step 2: Stage data to the Amazon Redshift table for cleansing

Ingesting the data can be accomplished using a JSON-based manifest file. Using the manifest file ensures that S3 eventual consistency issues can be eliminated and also provides an opportunity to dedupe any files if needed. A sample manifest20170702.json file looks like the following:

{
  "entries": [
    {"url":" s3://<<S3 Bucket>>/batch/2017/07/02/20170702T01.export.gz", "mandatory":true},
    {"url":" s3://<<S3 Bucket>>/batch/2017/07/02/20170702T02.export.gz", "mandatory":true},
    …
    {"url":" s3://<<S3 Bucket>>/batch/2017/07/02/20170702T23.export.gz", "mandatory":true}
  ]
}

The data can be ingested using the following command:

SET wlm_query_slot_count TO <<max available concurrency in the ETL queue>>;
COPY stage_tbl FROM 's3:// <<S3 Bucket>>/batch/manifest20170702.json' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' manifest;

Because the downstream ETL processes depend on this COPY command to complete, the wlm_query_slot_count is used to claim all the memory available to the queue. This helps the COPY command complete as quickly as possible.

Step 3: Transform data to create daily, weekly, and monthly datasets and load into target tables

Data is staged in the “stage_tbl” from where it can be transformed into the daily, weekly, and monthly aggregates and loaded into target tables. The following job illustrates a typical weekly process:

BEGIN;
INSERT INTO ETL_LOG (..) VALUES (..);
DELETE FROM weekly_tbl WHERE dataset_week = <<current week>>;
INSERT INTO weekly_tbl (..)
  SELECT date_trunc('week', dataset_day) AS week_begin_dataset_date, SUM(C1) AS C1, SUM(C2) AS C2
  FROM stage_tbl
  GROUP BY date_trunc('week', dataset_day);
INSERT INTO AUDIT_LOG VALUES (..);
COMMIT;

As shown above, multiple steps are combined into one transaction to perform a single commit, reducing contention on the commit queue.

Step 4: Unload the daily dataset to populate the S3 data lake bucket

The transformed results are now unloaded into another S3 bucket, where they can be further processed and made available for end-user reporting using a number of different tools, including Redshift Spectrum and Amazon Athena.

unload ('SELECT * FROM weekly_tbl WHERE dataset_week = <<current week>>') TO 's3:// <<S3 Bucket>>/datalake/weekly/20170526/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';

Summary

Amazon Redshift lets you easily operate petabyte-scale data warehouses on the cloud. This post summarized the best practices for operating scalable ETL natively within Amazon Redshift. I demonstrated efficient ways to ingest and transform data, along with close monitoring. I also demonstrated the best practices being used in a typical sample ETL workload to transform the data into Amazon Redshift.

If you have questions or suggestions, please comment below.

 


About the Author

Thiyagarajan Arumugam is a Big Data Solutions Architect at Amazon Web Services and designs customer architectures to process data at scale. Prior to AWS, he built data warehouse solutions at Amazon.com. In his free time, he enjoys all outdoor sports and practices the Indian classical drum mridangam.

 

Detecting Drone Surveillance with Traffic Analysis

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/01/detecting_drone.html

This is clever:

Researchers at Ben Gurion University in Beer Sheva, Israel have built a proof-of-concept system for counter-surveillance against spy drones that demonstrates a clever, if not exactly simple, way to determine whether a certain person or object is under aerial surveillance. They first generate a recognizable pattern on whatever subject — a window, say — someone might want to guard from potential surveillance. Then they remotely intercept a drone’s radio signals to look for that pattern in the streaming video the drone sends back to its operator. If they spot it, they can determine that the drone is looking at their subject.

In other words, they can see what the drone sees, pulling out their recognizable pattern from the radio signal, even without breaking the drone’s encrypted video.

The details have to do with the way drone video is compressed:

The researchers’ technique takes advantage of an efficiency feature streaming video has used for years, known as “delta frames.” Instead of encoding video as a series of raw images, it’s compressed into a series of changes from the previous image in the video. That means when a streaming video shows a still object, it transmits fewer bytes of data than when it shows one that moves or changes color.

That compression feature can reveal key information about the content of the video to someone who’s intercepting the streaming data, security researchers have shown in recent research, even when the data is encrypted.

Research paper and video.

SUPER game night 3: GAMES MADE QUICK??? 2.0

Post Syndicated from Eevee original https://eev.ee/blog/2018/01/23/super-game-night-3-games-made-quick-2-0/

Game night continues with a smorgasbord of games from my recent game jam, GAMES MADE QUICK??? 2.0!

The idea was to make a game in only a week while watching AGDQ, as an alternative to doing absolutely nothing for a week while watching AGDQ. (I didn’t submit a game myself; I was chugging along on my Anise game, which isn’t finished yet.)

I can’t very well run a game jam and not play any of the games, so here’s some of them in no particular order! Enjoy!

These are impressions, not reviews. I try to avoid major/ending spoilers, but big plot points do tend to leave impressions.

Weather Quest, by timlmul

short · rpg · jan 2017 · (lin)/mac/win · free on itch · jam entry

Weather Quest is its author’s first shipped game, written completely from scratch (the only vendored code is a micro OO base). It’s very short, but as someone who has also written LÖVE games completely from scratch, I can attest that producing something this game-like in a week is a fucking miracle. Bravo!

For reference, a week into my first foray, I think I was probably still writing my own Tiled importer like an idiot.

Only Mac and Windows builds are on itch, but it’s a LÖVE game, so Linux folks can just grab a zip from GitHub and throw that at love.

FINAL SCORE: ⛅☔☀

Pancake Numbers Simulator, by AnorakThePrimordial

short · sim · jan 2017 · lin/mac/win · free on itch · jam entry

Given a stack of N pancakes (of all different sizes and in no particular order), the Nth pancake number is the most flips you could possibly need to sort the pancakes in order with the smallest on top. A “flip” is sticking a spatula under one of the pancakes and flipping the whole sub-stack over. There’s, ah, a video embedded on the game page with some visuals.

Anyway, this game lets you simulate sorting a stack via pancake flipping, which is surprisingly satisfying! I enjoy cleaning up little simulated messes, such as… incorrectly-sorted pancakes, I guess?

This probably doesn’t work too well as a simulator for solving the general problem — you’d have to find an optimal solution for every permutation of N pancakes to be sure you were right. But it’s a nice interactive illustration of the problem, and if you know the pancake number for your stack size of choice (which I wish the game told you — for seven pancakes, it’s 8), then trying to restore a stack in that many moves makes for a nice quick puzzle.

FINAL SCORE: 18/11

Framed Animals, by chridd

short · metroidvania · jan 2017 · web/win · free on itch · jam entry

The concept here was to kill the frames, save the animals, which is a delightfully literal riff on a long-running AGDQ/SGDQ donation incentive — people vote with their dollars to decide whether Super Metroid speedrunners go out of their way to free the critters who show you how to walljump and shinespark. Super Metroid didn’t have a showing at this year’s AGDQ, and so we have this game instead.

It’s rough, but clever, and I got really into it pretty quickly — each animal you save gives you a new ability (in true Metroid style), and you get to test that ability out by playing as the animal, with only that ability and no others, to get yourself back to the most recent save point.

I did, tragically, manage to get myself stuck near what I think was about to be the end of the game, so some of the animals will remain framed forever. What an unsatisfying conclusion.

Gravity feels a little high given the size of the screen, and like most tile-less platformers, there’s not really any way to gauge how high or long your jump is before you leap. But I’m only even nitpicking because I think this is a great idea and I hope the author really does keep working on it.

FINAL SCORE: $136,596.69

Battle 4 Glory, by Storyteller Games

short · fighter · jan 2017 · win · free on itch · jam entry

This is a Smash Bros-style brawler, complete with the four players, the 2D play area in a 3D world, and the random stage obstacles showing up. I do like the Smash style, despite not otherwise being a fan of fighting games, so it’s nice to see another game chase that aesthetic.

Alas, that’s about as far as it got — which is pretty far for a week of work! I don’t know what more to say, though. The environments are neat, but unless I’m missing something, the only actions at your disposal are jumping and very weak melee attacks. I did have a good few minutes of fun fruitlessly mashing myself against the bumbling bots, as you can see.

FINAL SCORE: 300%

Icnaluferu Guild, Year Sixteen, by CHz

short · adventure · jan 2017 · web · free on itch · jam entry

Here we have the first of several games made with bitsy, a micro game making tool that basically only supports walking around, talking to people, and picking up items.

I tell you this because I think half of my appreciation for this game is in the ways it wriggled against those limits to emulate a Zelda-like dungeon crawler. Everything in here is totally fake, and you can’t really understand just how fake unless you’ve tried to make something complicated with bitsy.

It’s pretty good. The dialogue is entertaining (the rest of your party develops distinct personalities solely through oneliners, somehow), the riffs on standard dungeon fare are charming, and the Link’s Awakening-esque perspective walls around the edges of each room are fucking glorious.

FINAL SCORE: 2 bits

The Lonely Tapes, by JTHomeslice

short · rpg · jan 2017 · web · free on itch · jam entry

Another bitsy entry, this one sees you play as a Wal— sorry, a JogDawg, which has lost its cassette tapes and needs to go recover them!

(A cassette tape is like a VHS, but for music.)

(A VHS is—)

I have the sneaking suspicion that I missed out on some musical in-jokes, due to being uncultured swine. I still enjoyed the game — it’s always clear when someone is passionate about the thing they’re writing about, and I could tell I was awash in that aura even if some of it went over my head. You know you’ve done good if someone from way outside your sphere shows up and still has a good time.

FINAL SCORE: Nine… Inch Nails? They’re a band, right? God I don’t know write your own damn joke

Pirate Kitty-Quest, by TheKoolestKid

short · adventure · jan 2017 · win · free on itch · jam entry

I completely forgot I’d even given “my birthday” and “my cat” as mostly-joking jam themes until I stumbled upon this incredible gem. I don’t think — let me just check here and — yeah no this person doesn’t even follow me on Twitter. I have no idea who they are?

BUT THEY MADE A GAME ABOUT ANISE AS A PIRATE, LOOKING FOR TREASURE

PIRATE. ANISE

PIRATE ANISE!!!

This game wins the jam, hands down. 🏆

FINAL SCORE: Yarr, eight pieces o’ eight

CHIPS Mario, by NovaSquirrel

short · platformer · jan 2017 · (lin/mac)/win · free on itch · jam entry

You see this? This is fucking witchcraft.

This game is made with MegaZeux. MegaZeux games look like THIS. Text-mode, bound to a grid, with two colors per cell. That’s all you get.

Until now, apparently?? The game is a tech demo of “unbound” sprites, which can be drawn on top of the character grid without being aligned to it. And apparently have looser color restrictions.

The collision is a little glitchy, which isn’t surprising for a MegaZeux platformer; I had some fun interactions with platforms a couple times. But hey, goddamn, it’s free-moving Mario, in MegaZeux, what the hell.

(I’m looking at the most recently added games on DigitalMZX now, and I notice that not only is this game in the first slot, but NovaSquirrel’s MegaZeux entry for Strawberry Jam last February is still in the seventh slot. RIP, MegaZeux. I’m surprised a major feature like this was even added if the community has largely evaporated?)

FINAL SCORE: n/a, disqualified for being probably summoned from the depths of Hell

d!¢< pic, by 573 Games

short · story · jan 2017 · web · free on itch · jam entry

This is a short story about not sending dick pics. It’s very short, so I can’t say much without spoiling it, but: you are generally prompted to either text something reasonable, or send a dick pic. You should not send a dick pic.

It’s a fascinating artifact, not because of the work itself, but because it’s so terse that I genuinely can’t tell what the author was even going for. And this is the kind of subject where the author was, surely, going for something. Right? But was it genuinely intended to be educational, or was it tongue-in-cheek about how some dudes still don’t get it? Or is it side-eying the player who clicks the obviously wrong option just for kicks, which is the same reason people do it for real? Or is it commentary on how “send a dick pic” is a literal option for every response in a real conversation, too, and it’s not that hard to just not do it — unless you are one of the kinds of people who just feels a compulsion to try everything, anything, just because you can? Or is it just a quick Twine and I am way too deep in this? God, just play the thing, it’s shorter than this paragraph.

I’m also left wondering when it is appropriate to send a dick pic. Presumably there is a correct time? Hopefully the author will enter Strawberry Jam 2 to expound upon this.

FINAL SCORE: 3½” 😉

Marble maze, by Shtille

short · arcade · jan 2017 · win · free on itch · jam entry

Ah, hm. So this is a maze navigated by rolling a marble around. You use WASD to move the marble, and you can also turn the camera with the arrow keys.

The trouble is… the marble’s movement is always relative to the world, not the camera. That means if you turn the camera 30° and then try to move the marble, it’ll move at a 30° angle from your point of view.

That makes navigating a maze, er, difficult.

Camera-relative movement is the kind of thing I take so much for granted that I wouldn’t even think to do otherwise, and I think it’s valuable to look at surprising choices that violate fundamental conventions, so I’m trying to take this as a nudge out of my comfort zone. What could you design in an interesting way that used world-relative movement? Probably not the player, but maybe something else in the world, as long as you had strong landmarks? Hmm.

FINAL SCORE: ᘔ

Refactor: flight, by fluffy

short · arcade · jan 2017 · lin/mac/win · free on itch · jam entry

Refactor is a game album, which is rather a lot what it sounds like, and Flight is one of the tracks. Which makes this a single, I suppose.

It’s one of those games where you move down an oddly-shaped tunnel trying not to hit the walls, but with some cute twists. Coins and gems hop up from the bottom of the screen in time with the music, and collecting them gives you points. Hitting a wall costs you some points and kills your momentum, but I don’t think outright losing is possible, which is great for me!

Also, the monk cycles through several animal faces. I don’t know why, and it’s very good. One of those odd but memorable details that sits squarely on the intersection of abstract, mysterious, and a bit weird, and refuses to budge from that spot.

The music is great too? Really chill all around.

FINAL SCORE: 🎵🎵🎵🎵

The Adventures of Klyde

short · adventure · jan 2017 · web · free on itch · jam entry

Another bitsy game, this one starring a pig (humorously symbolized by a giant pig nose with ears) who must collect fruit and solve some puzzles.

This is charmingly nostalgic for me — it reminds me of some standard fare in engines like MegaZeux, where the obvious things to do when presented with tiles and pickups were to make mazes. I don’t mean that in a bad way; the maze is the fundamental environmental obstacle.

A couple places in here felt like invisible teleport mazes I had to brute-force, but I might have been missing a hint somewhere. I did make it through with only a little trouble, but alas — I stepped in a bad warp somewhere and got sent to the upper left corner of the starting screen, which is surrounded by walls. So Klyde’s new life is being trapped eternally in a nowhere space.

FINAL SCORE: 19/20 apples

And more

That was only a third of the games, and I don’t think even half of the ones I’ve played. I’ll have to do a second post covering the rest of them? Maybe a third?

Or maybe this is a ludicrous format for commenting on several dozen games and I should try to narrow it down to the ones that resonated the most for Strawberry Jam 2? Maybe??