Tag Archives: Tested

Sharing Secrets with AWS Lambda Using AWS Systems Manager Parameter Store

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/sharing-secrets-with-aws-lambda-using-aws-systems-manager-parameter-store/

This post courtesy of Roberto Iturralde, Sr. Application Developer- AWS Professional Services

Application architects are faced with key decisions throughout the process of designing and implementing their systems. One decision common to nearly all solutions is how to manage the storage and access rights of application configuration. Shared configuration should be stored centrally and securely with each system component having access only to the properties that it needs for functioning.

With AWS Systems Manager Parameter Store, developers have access to central, secure, durable, and highly available storage for application configuration and secrets. Parameter Store also integrates with AWS Identity and Access Management (IAM), allowing fine-grained access control to individual parameters or branches of a hierarchical tree.

This post demonstrates how to create and access shared configurations in Parameter Store from AWS Lambda. Both encrypted and plaintext parameter values are stored with only the Lambda function having permissions to decrypt the secrets. You also use AWS X-Ray to profile the function.

Solution overview

This example is made up of the following components:

  • An AWS SAM template that defines:
    • A Lambda function and its permissions
    • An unencrypted Parameter Store parameter that the Lambda function loads
    • A KMS key that only the Lambda function can access. You use this key to create an encrypted parameter later.
  • Lambda function code in Python 3.6 that demonstrates how to load values from Parameter Store at function initialization for reuse across invocations.

Launch the AWS SAM template

To create the resources shown in this post, you can download the SAM template or choose the button to launch the stack. The template requires one parameter, an IAM user name: the IAM user who will be the admin of the KMS key that you create. To perform the steps in this post, this IAM user needs permissions to execute Lambda functions, create Parameter Store parameters, administer keys in KMS, and view the X-Ray console. If your IAM user account has these privileges, you can use your own account to complete the walkthrough. You cannot use the root user to administer the KMS keys.

SAM template resources

The following sections show the code for the resources defined in the template.
Lambda function

ParameterStoreBlogFunctionDev:
    Type: 'AWS::Serverless::Function'
    Properties:
      FunctionName: 'ParameterStoreBlogFunctionDev'
      Description: 'Integrating lambda with Parameter Store'
      Handler: 'lambda_function.lambda_handler'
      Role: !GetAtt ParameterStoreBlogFunctionRoleDev.Arn
      CodeUri: './code'
      Environment:
        Variables:
          ENV: 'dev'
          APP_CONFIG_PATH: 'parameterStoreBlog'
          AWS_XRAY_TRACING_NAME: 'ParameterStoreBlogFunctionDev'
      Runtime: 'python3.6'
      Timeout: 5
      Tracing: 'Active'

  ParameterStoreBlogFunctionRoleDev:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          -
            Effect: Allow
            Principal:
              Service:
                - 'lambda.amazonaws.com'
            Action:
              - 'sts:AssumeRole'
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole'
      Policies:
        -
          PolicyName: 'ParameterStoreBlogDevParameterAccess'
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              -
                Effect: Allow
                Action:
                  - 'ssm:GetParameter*'
                Resource: !Sub 'arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:parameter/dev/parameterStoreBlog*'
        -
          PolicyName: 'ParameterStoreBlogDevXRayAccess'
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              -
                Effect: Allow
                Action:
                  - 'xray:PutTraceSegments'
                  - 'xray:PutTelemetryRecords'
                Resource: '*'

In this YAML code, you define a Lambda function named ParameterStoreBlogFunctionDev using the SAM AWS::Serverless::Function type. The environment variables for this function include the ENV (dev) and the APP_CONFIG_PATH where you find the configuration for this app in Parameter Store. X-Ray tracing is also enabled for profiling later.

The IAM role for this function extends the AWSLambdaBasicExecutionRole by adding IAM policies that grant the function permissions to write to X-Ray and get parameters from Parameter Store, limited to paths under /dev/parameterStoreBlog*.
Parameter Store parameter

SimpleParameter:
    Type: AWS::SSM::Parameter
    Properties:
      Name: '/dev/parameterStoreBlog/appConfig'
      Description: 'Sample dev config values for my app'
      Type: String
      Value: '{"key1": "value1","key2": "value2","key3": "value3"}'

This YAML code creates a plaintext string parameter in Parameter Store in a path that your Lambda function can access.
KMS encryption key

ParameterStoreBlogDevEncryptionKeyAlias:
    Type: AWS::KMS::Alias
    Properties:
      AliasName: 'alias/ParameterStoreBlogKeyDev'
      TargetKeyId: !Ref ParameterStoreBlogDevEncryptionKey

  ParameterStoreBlogDevEncryptionKey:
    Type: AWS::KMS::Key
    Properties:
      Description: 'Encryption key for secret config values for the Parameter Store blog post'
      Enabled: True
      EnableKeyRotation: False
      KeyPolicy:
        Version: '2012-10-17'
        Id: 'key-default-1'
        Statement:
          -
            Sid: 'Allow administration of the key & encryption of new values'
            Effect: Allow
            Principal:
              AWS:
                - !Sub 'arn:aws:iam::${AWS::AccountId}:user/${IAMUsername}'
            Action:
              - 'kms:Create*'
              - 'kms:Encrypt'
              - 'kms:Describe*'
              - 'kms:Enable*'
              - 'kms:List*'
              - 'kms:Put*'
              - 'kms:Update*'
              - 'kms:Revoke*'
              - 'kms:Disable*'
              - 'kms:Get*'
              - 'kms:Delete*'
              - 'kms:ScheduleKeyDeletion'
              - 'kms:CancelKeyDeletion'
            Resource: '*'
          -
            Sid: 'Allow use of the key'
            Effect: Allow
            Principal:
              AWS: !GetAtt ParameterStoreBlogFunctionRoleDev.Arn
            Action:
              - 'kms:Encrypt'
              - 'kms:Decrypt'
              - 'kms:ReEncrypt*'
              - 'kms:GenerateDataKey*'
              - 'kms:DescribeKey'
            Resource: '*'

This YAML code creates an encryption key with a key policy with two statements.

The first statement allows a given user (${IAMUsername}) to administer the key. Importantly, this includes the ability to encrypt values using this key and disable or delete this key, but does not allow the administrator to decrypt values that were encrypted with this key.
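To see this split in practice, here is a small sketch (not part of the blog’s sample code) that you could run with the key administrator’s credentials once the stack is deployed; the access-denied error on decrypt is the expected outcome:

import boto3
from botocore.exceptions import ClientError

kms = boto3.client('kms')

# Encrypting with the key works for the administrator...
ciphertext = kms.encrypt(
    KeyId='alias/ParameterStoreBlogKeyDev',
    Plaintext=b'a secret value'
)['CiphertextBlob']

# ...but decrypting that same ciphertext should fail, because the admin
# statement grants kms:Encrypt but not kms:Decrypt.
try:
    kms.decrypt(CiphertextBlob=ciphertext)
except ClientError as error:
    print('Decrypt denied as expected: ' + str(error))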

The second statement grants your Lambda function permission to encrypt and decrypt values using this key. The alias for this key in KMS is ParameterStoreBlogKeyDev, which is how you reference it later.

Lambda function

Here I walk you through the Lambda function code.

import os, traceback, json, configparser, boto3
from aws_xray_sdk.core import patch_all
patch_all()

# Initialize boto3 client at global scope for connection reuse
client = boto3.client('ssm')
env = os.environ['ENV']
app_config_path = os.environ['APP_CONFIG_PATH']
full_config_path = '/' + env + '/' + app_config_path
# Initialize app at global scope for reuse across invocations
app = None

class MyApp:
    def __init__(self, config):
        """
        Construct new MyApp with configuration
        :param config: application configuration
        """
        self.config = config

    def get_config(self):
        return self.config

def load_config(ssm_parameter_path):
    """
    Load configparser from config stored in SSM Parameter Store
    :param ssm_parameter_path: Path to app config in SSM Parameter Store
    :return: ConfigParser holding loaded config
    """
    configuration = configparser.ConfigParser()
    try:
        # Get all parameters for this app
        param_details = client.get_parameters_by_path(
            Path=ssm_parameter_path,
            Recursive=False,
            WithDecryption=True
        )

        # Loop through the returned parameters and populate the ConfigParser
        if 'Parameters' in param_details and len(param_details.get('Parameters')) > 0:
            for param in param_details.get('Parameters'):
                param_path_array = param.get('Name').split("/")
                section_position = len(param_path_array) - 1
                section_name = param_path_array[section_position]
                config_values = json.loads(param.get('Value'))
                config_dict = {section_name: config_values}
                print("Found configuration: " + str(config_dict))
                configuration.read_dict(config_dict)

    except:
        print("Encountered an error loading config from SSM.")
        traceback.print_exc()
    finally:
        return configuration

def lambda_handler(event, context):
    global app
    # Initialize app if it doesn't yet exist
    if app is None:
        print("Loading config and creating new MyApp...")
        config = load_config(full_config_path)
        app = MyApp(config)

    return "MyApp config is " + str(app.get_config()._sections)

Beneath the import statements, you import the patch_all function from the AWS X-Ray library, which you use to patch boto3 to create X-Ray segments for all your boto3 operations.

Next, you create a boto3 SSM client at the global scope for reuse across function invocations, following Lambda best practices. Using the function environment variables, you assemble the path where you expect to find your configuration in Parameter Store. The class MyApp is meant to serve as an example of an application that would need its configuration injected at construction. In this example, you create an instance of ConfigParser, a class in Python’s standard library for handling basic configurations, to give to MyApp.

The load_config function loads all the parameters from Parameter Store at the level immediately beneath the path provided in the Lambda function environment variables. Each parameter found is put into a new section in ConfigParser. The name of the section is the name of the parameter, minus the base path. In this example, the full parameter name is /dev/parameterStoreBlog/appConfig, which is put in a section named appConfig.
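To make that concrete, here is a short, hypothetical usage sketch of the ConfigParser returned by load_config for the sample parameter created by the SAM template:

config = load_config('/dev/parameterStoreBlog')

# The parameter name minus the base path becomes the section name...
print(config.sections())                # ['appConfig']
# ...and the keys of the JSON value become options within that section.
print(config.get('appConfig', 'key1'))  # value1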

Finally, the lambda_handler function initializes an instance of MyApp if it doesn’t already exist, constructing it with the loaded configuration from Parameter Store. Then it simply returns the currently loaded configuration in MyApp. The impact of this design is that the configuration is only loaded from Parameter Store the first time that the Lambda function execution environment is initialized. Subsequent invocations reuse the existing instance of MyApp, resulting in improved performance. You see this in the X-Ray traces later in this post. For more advanced use cases where configuration changes need to be received immediately, you could implement an expiry policy for your configuration entries or push notifications to your function.
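As one possible sketch of such an expiry policy (not part of the sample code; the TTL value is arbitrary), you could track when the configuration was loaded and rebuild MyApp once the interval has passed:

import time

CONFIG_TTL_SECONDS = 300  # hypothetical refresh interval
config_loaded_at = 0

def get_app():
    """Return the cached MyApp, reloading its config from Parameter Store
    once CONFIG_TTL_SECONDS have elapsed since the last load."""
    global app, config_loaded_at
    if app is None or time.time() - config_loaded_at > CONFIG_TTL_SECONDS:
        app = MyApp(load_config(full_config_path))
        config_loaded_at = time.time()
    return app

# lambda_handler would then call get_app() instead of checking `app is None` itself.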

To confirm that everything was created successfully, test the function in the Lambda console.

  1. Open the Lambda console.
  2. In the navigation pane, choose Functions.
  3. In the Functions pane, filter to ParameterStoreBlogFunctionDev to find the function created by the SAM template earlier. Open the function name to view its details.
  4. On the top right of the function detail page, choose Test. You may need to create a new test event. The input JSON doesn’t matter as this function ignores the input.

After running the test, you should see output similar to the following. This demonstrates that the function successfully fetched the unencrypted configuration from Parameter Store.

Create an encrypted parameter

You currently have a simple, unencrypted parameter and a Lambda function that can access it.

Next, you create an encrypted parameter that only your Lambda function has permission to use for decryption. This limits read access for this parameter to only this Lambda function.

To follow along with this section, deploy the SAM template for this post in your account and make your IAM user name the KMS key admin mentioned earlier.

  1. In the Systems Manager console, under Shared Resources, choose Parameter Store.
  2. Choose Create Parameter.
    • For Name, enter /dev/parameterStoreBlog/appSecrets.
    • For Type, select Secure String.
    • For KMS Key ID, choose alias/ParameterStoreBlogKeyDev, which is the key that your SAM template created.
    • For Value, enter {"secretKey": "secretValue"}.
    • Choose Create Parameter.
  3. If you now try to view the value of this parameter by choosing the name of the parameter in the parameters list and then choosing Show next to the Value field, you won’t see the value appear. This is because, even though you have permission to encrypt values using this KMS key, you do not have permissions to decrypt values.
  4. In the Lambda console, run another test of your function. You now also see the secret parameter that you created and its decrypted value.

If you do not see the new parameter in the Lambda output, this may be because the Lambda execution environment is still warm from the previous test. Because the parameters are loaded at Lambda startup, you need a fresh execution environment to refresh the values.

Adjust the function timeout to a different value in the Advanced Settings at the bottom of the Lambda Configuration tab. Choose Save and test to trigger the creation of a new Lambda execution environment.
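If you prefer to create the encrypted parameter programmatically rather than through the console, a roughly equivalent boto3 call is sketched below; it assumes you run it as a principal allowed to encrypt with the key, and it uses the same parameter name and key alias as the walkthrough:

import boto3

ssm = boto3.client('ssm')

ssm.put_parameter(
    Name='/dev/parameterStoreBlog/appSecrets',
    Description='Sample secret config values for my app',
    Type='SecureString',
    KeyId='alias/ParameterStoreBlogKeyDev',
    Value='{"secretKey": "secretValue"}'
)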

Profiling the impact of querying Parameter Store using AWS X-Ray

By using the AWS X-Ray SDK to patch boto3 in your Lambda function code, each invocation of the function creates traces in X-Ray. In this example, you can use these traces to validate the performance impact of your design decision to only load configuration from Parameter Store on the first invocation of the function in a new execution environment.

From the Lambda function details page where you tested the function earlier, under the function name, choose Monitoring. Choose View traces in X-Ray.

This opens the X-Ray console in a new window filtered to your function. Be aware of the time range field next to the search bar if you don’t see any search results.
In this screenshot, I’ve invoked the Lambda function twice, one time 10.3 minutes ago with a response time of 1.1 seconds and again 9.8 minutes ago with a response time of 8 milliseconds.

Looking at the details of the longer running trace by clicking the trace ID, you can see that the Lambda function spent the first ~350 ms of the full 1.1 sec routing the request through Lambda and creating a new execution environment for this function, as this was the first invocation with this code. This is the portion of time before the initialization subsegment.

Next, it took 725 ms to initialize the function, which includes executing the code at the global scope (including creating the boto3 client). This is also a one-time cost for a fresh execution environment.

Finally, the function executed for 65 ms, of which 63.5 ms was the GetParametersByPath call to Parameter Store.

Looking at the trace for the second, much faster function invocation, you see that the majority of the 8 ms execution time was Lambda routing the request to the function and returning the response. Only 1 ms of the overall execution time was attributed to the execution of the function, which makes sense given that after the first invocation you’re simply returning the config stored in MyApp.

While the Traces screen allows you to view the details of individual traces, the X-Ray Service Map screen allows you to view aggregate performance data for all traced services over a period of time.

In the X-Ray console navigation pane, choose Service map. Selecting a service node shows the metrics for node-specific requests. Selecting an edge between two nodes shows the metrics for requests that traveled that connection. Again, be aware of the time range field next to the search bar if you don’t see any search results.

After invoking your Lambda function several more times by testing it from the Lambda console, you can view some aggregate performance metrics. Look at the following:

  • From the client perspective, requests to the Lambda service for the function are taking an average of 50 ms to respond. The function is generating ~1 trace per minute.
  • The function itself is responding in an average of 3 ms. In the following screenshot, I’ve clicked on this node, which reveals a latency histogram of the traced requests showing that over 95% of requests return in under 5 ms.
  • Parameter Store is responding to requests in an average of 64 ms, but note the much lower trace rate in the node. This is because you only fetch data from Parameter Store on the initialization of the Lambda execution environment.

Conclusion

Deduplication, encryption, and restricted access to shared configuration and secrets are key components of any mature architecture. Serverless architectures designed using event-driven, on-demand compute services like Lambda are no different.

In this post, I walked you through a sample application accessing unencrypted and encrypted values in Parameter Store. These values were created in a hierarchy by application environment and component name, with the permissions to decrypt secret values restricted to only the function needing access. The techniques used here can become the foundation of secure, robust configuration management in your enterprise serverless applications.

Rightscorp Has a Massive Database of ‘Repeat Infringers’ to Pursue

Post Syndicated from Ernesto original https://torrentfreak.com/rightscorp-has-a-massive-database-of-repeat-infringers-to-pursue-180208/

Last week the Fourth Circuit Court of Appeals ruled that ISPs are required to terminate ‘repeat infringers’ based on allegations from copyright holders alone, a topic that has been contested for years.

This means that copyright holders now have a bigger incentive to send takedown notices, as ISPs can’t easily ignore them. That’s music to the ears of the various piracy tracking companies, Rightscorp included.

The piracy monetization company always maintained that multiple complaints from copyright holders are enough to classify someone as a repeat infringer, without a court order, and the Fourth Circuit has now reached the same conclusion.

“After years of uncertainty on these issues, it is gratifying for the US Court of Appeals to proclaim the law on ISP liability for subscriber infringements to be essentially what Rightscorp has always said it is,” Rightscorp President Christopher Sabec says.

Rightscorp is pleased to see that the court shares its opinion since the verdict also provides new business opportunities. The company informs TorrentFreak that it’s ready to help copyright holders to hold ISPs responsible.

“Rightscorp has always stood with content holders who wish to protect their rights against ISPs that are not taking action against repeat infringers,” Sabec tells us.

“Now, with the law addressing ISP liability for subscriber infringements finally sharpened and clarified at the appellate level, we are ready to support all efforts by rights holders to compel ISPs to abide by their responsibilities under the DMCA.”

The piracy tracking company has a treasure trove of piracy data at its disposal to issue takedown requests or back lawsuits. Over the past five years, it amassed nearly a billion “records” of copyright infringements.

“Rightscorp’s data records include no less than 969,653,557 infringements over the last five years,” Sabec says.

This number includes a lot of repeat infringers, obviously. It’s made up of IP-addresses downloading the same file on several occasions and/or multiple files over time.

While it’s unlikely that account holders will be disconnected based on infringements that happened years ago, this type of historical data can be used in court cases. Rightscorp’s infringement notices are the basis of the legal action against Cox, and are being used as evidence in a separate RIAA case against ISP Grande Communications as well.

Grande previously said that it refused to act on Rightscorp’s notices because it doubts their accuracy, but the tracking company contests this. That case is still ongoing and a final decision has yet to be reached.

For now, however, Rightscorp is marketing its hundreds of millions of recorded copyright infringements as an opportunity for rightsholders. And for a company that can use some extra cash in hand, that’s good news.


Cloudflare Terminates Service to Sci-Hub Domain Names

Post Syndicated from Ernesto original https://torrentfreak.com/cloudflare-terminates-service-to-sci-hub-domain-names-180205/

While Sci-Hub is praised by thousands of researchers and academics around the world, copyright holders are doing everything in their power to wipe the site from the web.

Following a $15 million defeat against Elsevier last June, the American Chemical Society (ACS) won a default judgment of $4.8 million in copyright damages a few months later.

The publisher was further granted a broad injunction, requiring various third-party services to stop providing access to the site. This includes domain registries, hosting companies and search engines.

Soon after the order was signed, several of Sci-Hub’s domain names became unreachable as domain registries complied with the court order. This resulted in a domain name whack-a-mole, but all this time Sci-Hub remained available.

Last weekend another problem appeared for Sci-Hub. This time ACS went after CDN provider Cloudflare, which informed the site that a court order requires the company to disconnect several domain names.

“Cloudflare has received the attached court order, Case 1:17-cv-00726-LMB-JFA,” the company writes. “Cloudflare will terminate your service for the following domains sci-hub.la, sci-hub.tv, and sci-hub.tw by disabling our authoritative DNS in 24 hours.”

According to Sci-Hub’s operator, losing access to Cloudflare is not “critical,” but it may “cause a short pause in website operation.”

Sci-Hub’s Cloudflare tweet

Cloudflare’s actions are significant because the company previously protested a similar order. When the RIAA used the permanent injunction in the MP3Skull case to compel Cloudflare to disconnect the site, the CDN provider refused.

The RIAA argued that Cloudflare was operating “in active concert or participation” with the pirates. The CDN provider objected, but the court eventually ordered Cloudflare to take action, although it did not rule on the “active concert or participation” part.

In the Sci-Hub case “active concert or participation” is also a requirement for the injunction to apply. While it specifically mentions ISPs and search engines, ACS Director Glenn Ruskin previously stressed that companies won’t be targeted for simply linking users to Sci-Hub.

“The court’s affirmative ruling does not apply to search engines writ large, but only to those entities who have been in active concert or participation with Sci-Hub, such as websites that host ACS content stolen by Sci-Hub,” Ruskin told us at the time.

Cloudflare does more than linking of course, but the company doesn’t see itself as a web hosting service either. While it still may not agree with the “active concert” classification, there’s no evidence that Cloudflare objected in court this time.

As for Sci-Hub, they have to look elsewhere if they want another CDN provider. For now, however, the site remains widely available.


Success at Apache: A Newbie’s Narrative

Post Syndicated from mikesefanov original https://yahooeng.tumblr.com/post/170536010891


Kuhu Shukla (bottom center) and team at the 2017 DataWorks Summit


By Kuhu Shukla

This post first appeared here on the Apache Software Foundation blog as part of ASF’s “Success at Apache” monthly blog series.

As I sit at my desk on a rather frosty morning with my coffee, looking up new JIRAs from the previous day in the Apache Tez project, I feel rather pleased. The latest community release vote is complete, the bug fixes that we so badly needed are in and the new release that we tested out internally on our many thousand strong cluster is looking good. Today I am looking at a new stack trace from a different Apache project process and it is hard to miss how much of the exceptional code I get to look at every day comes from people all around the globe. A contributor leaves a JIRA comment before he goes on to pick up his kid from soccer practice while someone else wakes up to find that her effort on a bug fix for the past two months has finally come to fruition through a binding +1.

Yahoo – which joined AOL, HuffPost, Tumblr, Engadget, and many more brands to form the Verizon subsidiary Oath last year – has been at the frontier of open source adoption and contribution since before I was in high school. So while I have no historical trajectories to share, I do have a story on how I found myself in an epic journey of migrating all of Yahoo’s jobs from Apache MapReduce to Apache Tez, a then-new DAG-based execution engine.

Oath grid infrastructure is through and through driven by Apache technologies, be it storage through HDFS, resource management through YARN, job execution frameworks with Tez, or user interface engines such as Hive, Hue, Pig, Sqoop, Spark, and Storm. Our grid solution is specifically tailored to Oath’s business-critical data pipeline needs using the polymorphic technologies hosted, developed and maintained by the Apache community.

On the third day of my job at Yahoo in 2015, I received a YouTube link on An Introduction to Apache Tez. I watched it carefully, trying to keep up with all the questions I had, and recognized a few names from my academic readings of YARN ACM papers. I continued to ramp up on YARN and HDFS, the foundational Apache technologies Oath heavily contributes to even today. For the first few weeks I spent time picking out my favorite (necessary) mailing lists to subscribe to and getting started on setting up a pseudo-distributed Hadoop cluster. I continued to find my footing with newbie contributions and being ever more careful with whitespaces in my patches. One thing was clear – Tez was the next big thing for us. By the time I could truly call myself a contributor in the Hadoop community, nearly 80-90% of the Yahoo jobs were now running with Tez. But just like hiking up the Grand Canyon, the last 20% is where all the pain was. Being a part of the solution to this challenge was a happy prospect and thankfully contributing to Tez became a goal in my next quarter.

The next sprint planning meeting ended with me getting my first major Tez assignment – progress reporting. The progress reporting in Tez was non-existent – “Just needs an API fix,”  I thought. Like almost all bugs in this ecosystem, it was not easy. How do you define progress? How is it different for different kinds of outputs in a graph? The questions were many.

I, however, did not have to go far to get answers. The Tez community actively came to a newbie’s rescue, finding answers and posing important questions. I started attending the bi-weekly Tez community sync-up calls and asking existing contributors and committers for course correction. Suddenly the team was much bigger, the goals much more chiseled. This was new to anyone like me who came from the networking industry, where the most open part of the code is the RFCs and the implementation details are often hidden. These meetings served as a clean room for our coding ideas and experiments. Ideas were shared, down to which data structure we should pick and what a future user of Tez would take from it. In between the usual status updates, extensive knowledge transfers were made.

Oath uses Apache Pig and Apache Hive extensively and most of the urgent requirements and requests came from Pig and Hive developers and users. Each issue led to a community JIRA and as we started running Tez at Oath scale, new feature ideas and bugs around performance and resource utilization materialized. Every year most of the Hadoop team at Oath travels to the Hadoop Summit where we meet our cohorts from the Apache community and we stand for hours discussing the state of the art and what is next for the project. One such discussion set the course for the next year and a half for me.

We needed an innovative way to shuffle data. Frameworks like MapReduce and Tez have a shuffle phase in their processing lifecycle wherein the data from upstream producers is made available to downstream consumers. Even though Apache Tez was designed with a feature set corresponding to optimization requirements in Pig and Hive, the Shuffle Handler Service was retrofitted from MapReduce at the time of the project’s inception. With several thousands of jobs on our clusters leveraging these features in Tez, the Shuffle Handler Service became a clear performance bottleneck. So as we stood talking about our experience with Tez with our friends from the community, we decided to implement a new Shuffle Handler for Tez. All the conversation points were tracked now through an umbrella JIRA TEZ-3334 and the to-do list was long. I picked a few JIRAs and as I started reading through I realized, this is all new code I get to contribute to and review. There might be a better way to put this, but to be honest it was just a lot of fun! All the whiteboards were full, the team took walks post lunch and discussed how to go about defining the API. Countless hours were spent debugging hangs while fetching data and looking at stack traces and Wireshark captures from our test runs. Six months in and we had the feature on our sandbox clusters. There were moments ranging from sheer frustration to absolute exhilaration with high fives as we continued to address review comments and fixing big and small issues with this evolving feature.

As much as owning your code is valued everywhere in the software community, I would never go on to say “I did this!” In fact, “we did!” It is this strong sense of shared ownership and fluid team structure that makes the open source experience at Apache truly rewarding. This is just one example. A lot of the work that was done in Tez was leveraged by the Hive and Pig community and cross Apache product community interaction made the work ever more interesting and challenging. Triaging and fixing issues with the Tez rollout led us to hit a 100% migration score last year and we also rolled the Tez Shuffle Handler Service out to our research clusters. As of last year we have run around 100 million Tez DAGs with a total of 50 billion tasks over almost 38,000 nodes.

In 2018 as I move on to explore Hadoop 3.0 as our future release, I hope that if someone outside the Apache community is reading this, it will inspire and intrigue them to contribute to a project of their choice. As an astronomy aficionado, going from a newbie Apache contributor to a newbie Apache committer was very much like looking through my telescope - it has endless possibilities and challenges you to be your best.

About the Author:

Kuhu Shukla is a software engineer at Oath and did her Master’s in Computer Science at North Carolina State University. She works on the Big Data Platforms team on Apache Tez, YARN and HDFS with a lot of talented Apache PMCs and Committers in Champaign, Illinois. A recent Apache Tez Committer herself, she continues to contribute to YARN and HDFS and spoke at the 2017 DataWorks Hadoop Summit on “Tez Shuffle Handler: Shuffling At Scale With Apache Hadoop”. Prior to that she worked on Juniper Networks’ router and switch configuration APIs. She likes to participate in open source conferences and women in tech events. In her spare time she loves singing Indian classical and jazz, laughing, whale watching, hiking and peering through her Dobsonian telescope.

Building Blocks of Amazon ECS

Post Syndicated from Tiffany Jernigan original https://aws.amazon.com/blogs/compute/building-blocks-of-amazon-ecs/

So, what’s Amazon Elastic Container Service (ECS)? ECS is a managed service for running containers on AWS, designed to make it easy to run applications in the cloud without worrying about configuring the environment for your code to run in. Using ECS, you can easily deploy containers to host a simple website or run complex distributed microservices using thousands of containers.

Getting started with ECS isn’t too difficult. To fully understand how it works and how you can use it, it helps to understand the basic building blocks of ECS and how they fit together!

Let’s begin with an analogy

Imagine you’re in a virtual reality game with blocks and portals, in which your task is to build kingdoms.

In your spaceship, you pull up a holographic map of your upcoming destination: Nozama, a golden-orange planet. Looking at its various regions, you see that the nearest one is za-southwest-1 (SW Nozama). You set your destination, and use your jump drive to jump to the outer atmosphere of za-southwest-1.

As you approach SW Nozama, you see three portals, 1a, 1b, and 1c. Each portal lets you transport directly to an isolated zone (Availability Zone), where you can start construction on your new kingdom (cluster), Royaume.

With your supply of blocks, you take the portal to 1b, and erect the surrounding walls of your first territory (instance)*.

Before you get ahead of yourself, there are some rules to keep in mind. For your territory to be a part of Royaume, the land ordinance requires construction of a building (container), specifically a castle, from which your territory’s lord (agent)* rules.

You can then create architectural plans (task definitions) to build your developments (tasks), consisting of up to 10 buildings per plan. A development can be built now within this or any territory, or multiple territories.

If you do decide to create more territories, you can either stay here in 1b or take a portal to another location in SW Nozama and start building there.

Amazon EC2 building blocks

We currently provide two launch types: EC2 and Fargate. With Fargate, the Amazon EC2 instances are abstracted away and managed for you. Instead of worrying about ECS container instances, you can just worry about tasks. In this post, the infrastructure components used by ECS that are handled by Fargate are marked with a *.

Instance*

EC2 instances are good ol’ virtual machines (VMs). And yes, don’t worry, you can connect to them (via SSH). Because customers have varying needs in memory, storage, and computing power, many different instance types are offered. Just want to run a small application or try a free trial? Try t2.micro. Want to run memory-optimized workloads? R3 and X1 instances are a couple options. There are many more instance types as well, which cater to various use cases.

AMI*

Sorry if you wanted to immediately march forward, but before you create your instance, you need to choose an AMI. AMI stands for Amazon Machine Image. What does that mean? Basically, an AMI provides the information required to launch an instance: root volume, launch permissions, and volume-attachment specifications. You can find and choose a Linux or Windows AMI provided by AWS, the user community, the AWS Marketplace (for example, the Amazon ECS-Optimized AMI), or you can create your own.

Region

AWS is divided into regions that are geographic areas around the world (for now it’s just Earth, but maybe someday…). These regions have semi-evocative names such as us-east-1 (N. Virginia), us-west-2 (Oregon), eu-central-1 (Frankfurt), ap-northeast-1 (Tokyo), etc.

Each region is designed to be completely isolated from the others, and consists of multiple, distinct data centers. This creates a “blast radius” for failure so that even if an entire region goes down, the others aren’t affected. Like many AWS services, to start using ECS, you first need to decide the region in which to operate. Typically, this is the region nearest to you or your users.

Availability Zone

AWS regions are subdivided into Availability Zones. A region has at minimum two zones, and up to a handful. Zones are physically isolated from each other, spanning one or more different data centers, but are connected through low-latency, fiber-optic networking, and share some common facilities. EC2 is designed so that the most common failures only affect a single zone to prevent region-wide outages. This means you can achieve high availability in a region by spanning your services across multiple zones and distributing across hosts.

Amazon ECS building blocks

Container

Well, without containers, ECS wouldn’t exist!

Are containers virtual machines?
Nope! Virtual machines virtualize the hardware (benefits), while containers virtualize the operating system (even more benefits!). If you looked inside a container, you would see that it is made up of processes running on the host, tied together by kernel constructs like namespaces, cgroups, etc. But you don’t need to bother about that level of detail, at least not in this post!

Why containers?
Containers give you the ability to build, ship, and run your code anywhere!

Before the cloud, you needed to self-host and therefore had to buy machines in addition to setting up and configuring the operating system (OS), and running your code. In the cloud, with virtualization, you can just skip to setting up the OS and running your code. Containers make the process even easier—you can just run your code.

Additionally, all of the dependencies travel in a package with the code, which is called an image. This allows containers to be deployed on any host machine. From the outside, it looks like a host is just holding a bunch of containers. They all look the same, in the sense that they are generic enough to be deployed on any host.

With ECS, you can easily run your containerized code and applications across a managed cluster of EC2 instances.

Are containers a fairly new technology?
The concept of containerization is not new. Its origins date back to 1979 with the creation of chroot. However, it wasn’t until the early 2000s that containers became a major technology. The most significant milestone to date was the release of Docker in 2013, which led to the popularization and widespread adoption of containers.

What does ECS use?
While other container technologies exist (LXC, rkt, etc.), because of its massive adoption and use by our customers, ECS was designed first to work natively with Docker containers.

Container instance*

Yep, you are back to instances. An instance is just slightly more complex in the ECS realm though. Here, an ECS container instance is an EC2 instance that runs the agent, has a specifically defined IAM policy and role, and has been registered into your cluster.

And as you probably guessed, in these instances, you are running containers. 

AMI*

These container instances can use any AMI as long as it meets the following specifications: a modern Linux distribution running the agent and the Docker daemon, along with any Docker runtime dependencies.

Want it more simplified? Well, AWS created the Amazon ECS-Optimized AMI for just that. Not only does that AMI come preconfigured with all of the previously mentioned specifications, it’s tested and includes the recommended ecs-init upstart process to run and monitor the agent.

Cluster

An ECS cluster is a grouping of (container) instances* (or tasks in Fargate) that lie within a single region, but can span multiple Availability Zones – spanning zones is even a good idea for redundancy. When you launch an instance (or tasks in Fargate), unless you specify otherwise, it registers with the cluster named “default”. If “default” doesn’t exist, it is created. You can also scale and delete your clusters.
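If you would rather not rely on the implicit “default” cluster, a minimal boto3 sketch for creating and inspecting a named one (the name borrows from the analogy above) could look like this:

import boto3

ecs = boto3.client('ecs')

# Create a named cluster instead of relying on "default".
ecs.create_cluster(clusterName='Royaume')

# Container instances running the agent register themselves into the cluster;
# this call lists whatever has registered so far.
print(ecs.list_container_instances(cluster='Royaume'))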

Agent*

The Amazon ECS container agent is a Go program that runs in its own container within each EC2 instance that you use with ECS. (It’s also available open source on GitHub!) The agent is the intermediary component that takes care of the communication between the scheduler and your instances. Want to register your instance into a cluster? (Why wouldn’t you? A cluster is both a logical boundary and a provider of a pool of resources!) Then you need to run the agent on it.

Task

When you want to start a container, it has to be part of a task. Therefore, you have to create a task first. Succinctly, tasks are a logical grouping of 1 to N containers that run together on the same instance, with N defined by you, up to 10. Let’s say you want to run a custom blog engine. You could put together a web server, an application server, and an in-memory cache, each in their own container. Together, they form a basic frontend unit.

Task definition

Ah, but you cannot create a task directly. You have to create a task definition that tells ECS that “task definition X is composed of this container (and maybe that other container and that other container too!).” It’s kind of like an architectural plan for a city. Some other details it can include are how the containers interact, container CPU and memory constraints, and task permissions using IAM roles.

Then you can tell ECS, “start one task using task definition X.” It might sound like unnecessary planning at first. As soon as you start to deal with multiple tasks, scaling, upgrades, and other “real life” scenarios, you’ll be glad that you have task definitions to keep track of things!
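As a rough, hypothetical illustration of those two steps with boto3 (the image name and resource sizes are made up, and a real blog engine would have more than one container), registering a task definition and starting a task from it might look like this:

import boto3

ecs = boto3.client('ecs')

# The "architectural plan": a task definition with a single container.
ecs.register_task_definition(
    family='blog-frontend',
    containerDefinitions=[
        {
            'name': 'web',
            'image': 'my-registry/blog-web:latest',  # hypothetical image
            'cpu': 128,
            'memory': 256,
            'essential': True,
            'portMappings': [{'containerPort': 80}]
        }
    ]
)

# "Start one task using task definition X."
ecs.run_task(cluster='Royaume', taskDefinition='blog-frontend', count=1)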

Scheduler*

So, the scheduler schedules… sorry, this should be more helpful, huh? The scheduler is part of the “hosted orchestration layer” provided by ECS. Wait a minute, what do I mean by “hosted orchestration”? Simply put, hosted means that it’s operated by ECS on your behalf, without you having to care about it. Your applications are deployed in containers running on your instances, but the managing of tasks is taken care of by ECS. One less thing to worry about!

Also, the scheduler is the component that decides what (which containers) gets to run where (on which instances), according to a number of constraints. Say that you have a custom blog engine to scale for high availability. You could create a service, which, by default, spreads tasks across all zones in the chosen region. And if you want each task to be on a different instance, you can use the distinctInstance task placement constraint. ECS makes sure that not only does this happen, but also that if a task fails, it is started again.

Service

To ensure that you always have your task running without managing it yourself, you can create a service based on the task that you defined and ECS ensures that it stays running. A service is a special construct that says, “at any given time, I want to make sure that N tasks using task definition X1 are running.” If N=1, it just means “make sure that this task is running, and restart it if needed!” And with N>1, you’re basically scaling your application until you hit N, while also ensuring each task is running.
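Continuing the hypothetical sketch from the task definition section, a service that keeps two copies of that task running, spread over different container instances via the distinctInstance constraint mentioned earlier, could be created like this:

import boto3

ecs = boto3.client('ecs')

ecs.create_service(
    cluster='Royaume',
    serviceName='blog-frontend-service',
    taskDefinition='blog-frontend',
    desiredCount=2,  # "make sure that 2 tasks are running"
    # Ask the scheduler to place each task on a different container instance.
    placementConstraints=[{'type': 'distinctInstance'}]
)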

So, what now?

Hopefully you, at the very least, learned a tiny something. All comments are very welcome!

Want to discuss ECS with others? Join the amazon-ecs slack group, which members of the community created and manage.

Also, if you’re interested in learning more about the core concepts of ECS and its relation to EC2, here are some resources:

Pages
Amazon ECS landing page
AWS Fargate landing page
Amazon ECS Getting Started
Nathan Peck’s AWSome ECS

Docs
Amazon EC2
Amazon ECS

Blogs
AWS Compute Blog
AWS Blog

GitHub code
Amazon ECS container agent
Amazon ECS CLI

AWS videos
Learn Amazon ECS
AWS videos
AWS webinars

 

— tiffany

 @tiffanyfayj

 

Raspberry Pi Spy’s Alexa Skill

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/pi-spy-alexa-skill/

With Raspberry Pi projects using home assistant services such as Amazon Alexa and Google Home becoming more and more popular, we invited Raspberry Pi maker Matt ‘Raspberry Pi Spy‘ Hawkins to write a guest post about his latest project, the Pi Spy Alexa Skill.

Pi Spy Alexa Skill Raspberry Pi

Pi Spy Skill

The Alexa system uses Skills to provide voice-activated functionality, and it allows you to create new Skills to add extra features. With the Pi Spy Skill, you can ask Alexa what function each pin on the Raspberry Pi’s GPIO header provides, for example by using the phrase “Alexa, ask Pi Spy what is Pin 2.” In response to a phrase such as “Alexa, ask Pi Spy where is GPIO 8”, Alexa can now also tell you on which pin you can find a specific GPIO reference number.

This information is already available in various forms, but I thought it would be useful to retrieve it when I was busy soldering or building circuits and had no hands free.

Creating an Alexa Skill

There is a learning curve to creating a new Skill, and in some regards it was similar to mobile app development.

A Skill consists of two parts: the first is created within the Amazon Developer Console and defines the structure of the voice commands Alexa should recognise. The second part is a webservice that can receive data extracted from the voice commands and provide a response back to the device. You can create the webservice on a webserver, internet-connected device, or cloud service.

I decided to use Amazon’s AWS Lambda service. Once set up, this allows you to write code without having to worry about the server it is running on. It also supports Python, so it fit in nicely with most of my other projects.

To get started, I logged into the Amazon Developer Console with my personal Amazon account and navigated to the Alexa section. I created a new Skill named Pi Spy. Within a Skill, you define an Intent Schema and some Sample Utterances. The schema defines individual intents, and the utterances define how these are invoked by the user.

Here is how my ExaminePin intent is defined in the schema:

Pi Spy Alexa Skill Raspberry Pi

Example utterances then attempt to capture the different phrases the user might speak to their device.

Pi Spy Alexa Skill Raspberry Pi

Whenever Alexa matches a spoken phrase to an utterance, it passes the name of the intent and the variable PinID to the webservice.

In the test section, you can check what JSON data will be generated and passed to your webservice in response to specific phrases. This allows you to verify that the webservice’s responses are correct.

Pi Spy Alexa Skill Raspberry Pi

Over on the AWS Services site, I created a Lambda function based on one of the provided examples to receive the incoming requests. Here is the section of that code which deals with the ExaminePin intent:

Pi Spy Alexa Skill Raspberry Pi

For this intent, I used a Python dictionary to match the incoming pin number to its description. Another Python function deals with the GPIO queries. A URL to this Lambda function was added to the Skill as its ‘endpoint’.
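Matt’s actual code is only shown as a screenshot, but a stripped-down sketch of that idea, with a hypothetical (and deliberately incomplete) pin dictionary and a plain-text Alexa response, might look something like this:

# Hypothetical subset of the pin lookup table; the real Skill covers the full header.
PIN_DESCRIPTIONS = {
    '1': '3.3 volt power',
    '2': '5 volt power',
    '6': 'ground',
    '8': 'GPIO 14, UART transmit',
}

def build_response(text):
    """Wrap plain text in the response structure the Alexa service expects."""
    return {
        'version': '1.0',
        'response': {
            'outputSpeech': {'type': 'PlainText', 'text': text},
            'shouldEndSession': True
        }
    }

def lambda_handler(event, context):
    intent = event['request']['intent']
    if intent['name'] == 'ExaminePin':
        pin_id = intent['slots']['PinID']['value']
        description = PIN_DESCRIPTIONS.get(pin_id, 'a pin I do not know about')
        return build_response('Pin ' + pin_id + ' is ' + description)
    return build_response('Sorry, I did not understand that.')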

As with the Skill, the Python code can be tested to iron out any syntax errors or logic problems.

With suitable configuration, it would be possible to create the webservice on a Pi, and that is something I’m currently working on. This approach is particularly interesting, as the Pi can then be used to control local hardware devices such as cameras, lights, or pet feeders.

Note

My Alexa Skill is currently only available to UK users. I’m hoping Amazon will choose to copy it to the US service, but I think that is down to its perceived popularity, or it may be done in bulk based on release date. In the next update, I’ll be adding an American English version to help speed up this process.

The post Raspberry Pi Spy’s Alexa Skill appeared first on Raspberry Pi.

When You Have A Blockchain, Everything Looks Like a Nail

Post Syndicated from Bozho original https://techblog.bozho.net/blockchain-everything-looks-like-nail/

Blockchain, AI, big data, NoSQL, microservices, single page applications, cloud, SOA. What do these have in common? They have been or are hyped. At some point they were “the big thing” du jour. Everyone was investigating the possibility of using them, everyone was talking about them, there were meetups, conferences, articles on Hacker News and Reddit. There are more examples, of course (which is the JavaScript framework this month?) but I’ll focus my examples on those above.

Another thing they have in common is that they are useful. All of them have some pretty good applications that are definitely worth the time and investment.

Yet another thing they have in common is that they are far from universally applicable. I’ve argued that monoliths are often still the better approach and that microservices introduce too much complexity for the average project. Big Data is something very few organizations actually have; AI/machine learning can help a wide variety of problems, but it is just a tool in a toolbox, not the solution to all problems. Single page applications are great for, yeah, applications, but most websites are still websites, not feature-rich frontends – you don’t need an SPA for every type of website. NoSQL has solved niche issues, and issues of scale that few companies have had, but nothing beats a good old relational database for the typical project out there. “The cloud” is not always where you want your software to be; and SOA just means everything (ESBs, direct integrations, even microservices, according to some). And the blockchain – it seems to be having limited success beyond cryptocurrencies.

And finally, another trait many of them share is that the hype has settled down. Only yesterday I read an article about the “death of the microservices madness”. I don’t see nearly as many new NoSQL databases as a few years ago, some of the projects that have been popular have faded. SOA and “the cloud” are already “boring”, and we’ve realized we don’t actually have big data if it fits in an Excel spreadsheet. SPAs and AI are still high in popularity, but we are getting a good understanding as a community why and when they are useful.

But it seems that nuanced reality has never stopped us from hyping a particular technology or approach. And maybe that’s okay, in order to give a promising, though niche, technology the spotlight and let it shine in the particular use cases where it fits.

But countless projects have and will suffer from our collective inability to filter through these hypes. I’d bet millions of developer hours have been wasted in trying to use the above technologies where they just didn’t fit. It’s like that scene from Idiocracy where a guy tries to fit a rectangular figure into a circular hole.

And the new one is not “the blockchain”. I won’t repeat my rant, but in summary – it doesn’t solve many of the problems companies are trying to solve with it right now just because it’s cool. Or at least it doesn’t solve them better than existing solutions. Many pilots will be carried out, many hours will be wasted in figuring out why that thing doesn’t work. A few of those projects will be a good fit and will actually bring value.

Do you need to reach multi-party consensus for the data you store? Can all stakeholders support the infrastructure to run their node(s)? Do they have the staff to administer the node(s)? Do you need to execute distributed application code on the data? Won’t it be easier to just deploy RESTful APIs and integrate the parties through that? Do you need to store all the data, or just parts of it, to guarantee data integrity?

“If all you have is a hammer, everything looks like a nail,” as the famous saying goes. In the software industry we repeatedly find new and cool hammers and then try to hit as many nails as we can. But only a few of them are actual nails. The rest remain ugly, hard to support, “who was the idiot that wrote this” and “I wasn’t here when the decisions were made” types of projects.

I don’t have the illusion that we will calm down and skip the next hypes. Especially if adding the hyped word to your company raises your stock price. But if there’s one thing I’d like people to ask themselves when choosing a technology stack, it is “do we really need that to solve our problems?”.

If the answer is really “yes”, then great, go ahead and deploy the multi-organization permissioned blockchain, or fork Ethereum, or whatever. If not, you can still do a project at home that you can safely abandon. And if you need some pilot project to figure out whether the new piece of technology would be beneficial – go ahead and try it. But have a baseline – the fact that it somehow worked doesn’t mean it’s better than old, tested models of doing the same thing.

The post When You Have A Blockchain, Everything Looks Like a Nail appeared first on Bozho's tech blog.

New AWS Auto Scaling – Unified Scaling For Your Cloud Applications

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-auto-scaling-unified-scaling-for-your-cloud-applications/

I’ve been talking about scalability for servers and other cloud resources for a very long time! Back in 2006, I wrote “This is the new world of scalable, on-demand web services. Pay for what you need and use, and not a byte more.” Shortly after we launched Amazon Elastic Compute Cloud (EC2), we made it easy for you to do this with the simultaneous launch of Elastic Load Balancing, EC2 Auto Scaling, and Amazon CloudWatch. Since then we have added Auto Scaling to other AWS services including ECS, Spot Fleets, DynamoDB, Aurora, AppStream 2.0, and EMR. We have also added features such as target tracking to make it easier for you to scale based on the metric that is most appropriate for your application.
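For readers who haven’t used those service-specific features, here is a hedged boto3 sketch of target tracking on a DynamoDB table through the existing Application Auto Scaling API (the table name and capacity limits are made up):

import boto3

autoscaling = boto3.client('application-autoscaling')

# Make the table's read capacity a scalable target...
autoscaling.register_scalable_target(
    ServiceNamespace='dynamodb',
    ResourceId='table/my-table',                         # hypothetical table
    ScalableDimension='dynamodb:table:ReadCapacityUnits',
    MinCapacity=5,
    MaxCapacity=100
)

# ...then track a target of roughly 70% read capacity utilization.
autoscaling.put_scaling_policy(
    PolicyName='my-table-read-tracking',
    ServiceNamespace='dynamodb',
    ResourceId='table/my-table',
    ScalableDimension='dynamodb:table:ReadCapacityUnits',
    PolicyType='TargetTrackingScaling',
    TargetTrackingScalingPolicyConfiguration={
        'TargetValue': 70.0,
        'PredefinedMetricSpecification': {
            'PredefinedMetricType': 'DynamoDBReadCapacityUtilization'
        }
    }
)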

Introducing AWS Auto Scaling
Today we are making it easier for you to use the Auto Scaling features of multiple AWS services from a single user interface with the introduction of AWS Auto Scaling. This new service unifies and builds on our existing, service-specific, scaling features. It operates on any desired EC2 Auto Scaling groups, EC2 Spot Fleets, ECS tasks, DynamoDB tables, DynamoDB Global Secondary Indexes, and Aurora Replicas that are part of your application, as described by an AWS CloudFormation stack or in AWS Elastic Beanstalk (we’re also exploring some other ways to flag a set of resources as an application for use with AWS Auto Scaling).

You no longer need to set up alarms and scaling actions for each resource and each service. Instead, you simply point AWS Auto Scaling at your application and select the services and resources of interest. Then you select the desired scaling option for each one, and AWS Auto Scaling will do the rest, helping you to discover the scalable resources and then creating a scaling plan that addresses the resources of interest.

If you have tried to use any of our Auto Scaling options in the past, you undoubtedly understand the trade-offs involved in choosing scaling thresholds. AWS Auto Scaling gives you a variety of scaling options: You can optimize for availability, keeping plenty of resources in reserve in order to meet sudden spikes in demand. You can optimize for costs, running close to the line and accepting the possibility that you will tax your resources if that spike arrives. Alternatively, you can aim for the middle, with a generous but not excessive level of spare capacity. In addition to optimizing for availability, cost, or a blend of both, you can also set a custom scaling threshold. In each case, AWS Auto Scaling will create scaling policies on your behalf, including appropriate upper and lower bounds for each resource.

AWS Auto Scaling in Action
I will use AWS Auto Scaling on a simple CloudFormation stack consisting of an Auto Scaling group of EC2 instances and a pair of DynamoDB tables. I start by removing the existing Scaling Policies from my Auto Scaling group:

Then I open up the new Auto Scaling console and select the stack:

Behind the scenes, Elastic Beanstalk applications are always launched via a CloudFormation stack. In the screen shot above, awseb-e-sdwttqizbp-stack is an Elastic Beanstalk application that I launched.

I can click on any stack to learn more about it before proceeding:

I select the desired stack and click on Next to proceed. Then I enter a name for my scaling plan and choose the resources that I’d like it to include:

I choose the scaling strategy for each type of resource:

After I have selected the desired strategies, I click Next to proceed. Then I review the proposed scaling plan, and click Create scaling plan to move ahead:

The scaling plan is created and in effect within a few minutes:

I can click on the plan to learn more:

I can also inspect each scaling policy:

I tested my new policy by applying a load to the initial EC2 instance, and watched the scale out activity take place:

I also took a look at the CloudWatch metrics for the EC2 Auto Scaling group:

Available Now
We are launching AWS Auto Scaling today in the US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), and Asia Pacific (Singapore) Regions, with more to follow. There’s no charge for AWS Auto Scaling; you pay only for the CloudWatch Alarms that it creates and any AWS resources that you consume.

As is often the case with our new services, this is just the first step on what we hope to be a long and interesting journey! We have a long roadmap, and we’ll be adding new features and options throughout 2018 in response to your feedback.

Jeff;

Continuous Deployment to Kubernetes using AWS CodePipeline, AWS CodeCommit, AWS CodeBuild, Amazon ECR and AWS Lambda

Post Syndicated from Chris Barclay original https://aws.amazon.com/blogs/devops/continuous-deployment-to-kubernetes-using-aws-codepipeline-aws-codecommit-aws-codebuild-amazon-ecr-and-aws-lambda/

Thank you to my colleague Omar Lari for this blog on how to create a continuous deployment pipeline for Kubernetes!


You can use Kubernetes and AWS together to create a fully managed, continuous deployment pipeline for container based applications. This approach takes advantage of Kubernetes’ open-source system to manage your containerized applications, and the AWS developer tools to manage your source code, builds, and pipelines.

This post describes how to create a continuous deployment architecture for containerized applications. It uses AWS CodeCommit, AWS CodePipeline, AWS CodeBuild, and AWS Lambda to deploy containerized applications into a Kubernetes cluster. In this environment, developers can remain focused on developing code without worrying about how it will be deployed, and development managers can be satisfied that the latest changes are always deployed.

What is Continuous Deployment?

There are many articles, posts, and even conferences dedicated to the practice of continuous deployment. For the purposes of this post, I will summarize continuous deployment in the following points:

  • Code is more frequently released into production environments
  • More frequent releases allow for smaller, incremental changes, reducing risk and enabling simplified rollbacks if needed
  • Deployment is automated and requires minimal user intervention

For more information, see “Practicing Continuous Integration and Continuous Delivery on AWS”.

How can you use continuous deployment with AWS and Kubernetes?

You can leverage AWS services that support continuous deployment to automatically take your code from a source code repository to production in a Kubernetes cluster with minimal user intervention. To do this, you can create a pipeline that will build and deploy committed code changes as long as they meet the requirements of each stage of the pipeline.

To create the pipeline, you will use the following services:

  • AWS CodePipeline. AWS CodePipeline is a continuous delivery service that models, visualizes, and automates the steps required to release software. You define stages in a pipeline to retrieve code from a source code repository, build that source code into a releasable artifact, test the artifact, and deploy it to production. Only code that successfully passes through all these stages will be deployed. In addition, you can optionally add other requirements to your pipeline, such as manual approvals, to help ensure that only approved changes are deployed to production.
  • AWS CodeCommit. AWS CodeCommit is a secure, scalable, and managed source control service that hosts private Git repositories. You can privately store and manage assets such as your source code in the cloud and configure your pipeline to automatically retrieve and process changes committed to your repository.
  • AWS CodeBuild. AWS CodeBuild is a fully managed build service that compiles source code, runs tests, and produces artifacts that are ready to deploy. You can use AWS CodeBuild to both build your artifacts, and to test those artifacts before they are deployed.
  • AWS Lambda. AWS Lambda is a compute service that lets you run code without provisioning or managing servers. You can invoke a Lambda function in your pipeline to prepare the built and tested artifact for deployment by Kubernetes to the Kubernetes cluster.
  • Kubernetes. Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It provides a platform for running, deploying, and managing containers at scale.

An Example of Continuous Deployment to Kubernetes:

The following example illustrates leveraging AWS developer tools to continuously deploy to a Kubernetes cluster:

  1. Developers commit code to an AWS CodeCommit repository and create pull requests to review proposed changes to the production code. When the pull request is merged into the master branch in the AWS CodeCommit repository, AWS CodePipeline automatically detects the changes to the branch and starts processing the code changes through the pipeline.
  2. AWS CodeBuild packages the code changes as well as any dependencies and builds a Docker image. Optionally, another pipeline stage tests the code and the package, also using AWS CodeBuild.
  3. The Docker image is pushed to Amazon ECR after a successful build and/or test stage.
  4. AWS CodePipeline invokes an AWS Lambda function that includes the Kubernetes Python client as part of the function’s resources. The Lambda function performs a string replacement on the Docker image tag in the Kubernetes deployment file so that it matches the tag applied during the build, and therefore the image pushed to Amazon ECR (a sketch of such a function appears after this list).
  5. After the deployment manifest update is completed, AWS Lambda invokes the Kubernetes API to update the image in the Kubernetes application deployment.
  6. Kubernetes performs a rolling update of the pods in the application deployment to match the Docker image specified in Amazon ECR.
    The pipeline is now live and responds to changes to the master branch of the CodeCommit repository. The pipeline is also fully extensible: you can add steps to perform further testing, or a step to deploy into a staging environment before the code ships to the production cluster.
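
The exact Lambda code depends on your cluster, manifests, and how credentials are supplied, but as a rough sketch (not the function shipped with the reference architecture), updating a deployment’s image with the Kubernetes Python client could look like the following. The deployment name, namespace, container name, image URI, and bundled kubeconfig path are all hypothetical placeholders:

from kubernetes import client, config

def handler(event, context):
    # Hypothetical: a kubeconfig file bundled with the deployment package
    # (Lambda extracts the package to /var/task).
    config.load_kube_config(config_file='/var/task/kubeconfig')

    # Hypothetical image URI; in the real pipeline the tag would come from the
    # CodePipeline job data and match the image pushed to Amazon ECR.
    new_image = '123456789012.dkr.ecr.us-east-1.amazonaws.com/guestbook:abc1234'

    # Patch only the container image; Kubernetes then performs a rolling update.
    apps = client.AppsV1Api()
    apps.patch_namespaced_deployment(
        name='guestbook',
        namespace='default',
        body={'spec': {'template': {'spec': {
            'containers': [{'name': 'guestbook', 'image': new_image}]
        }}}}
    )
    return {'status': 'patched', 'image': new_image}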

An example pipeline in AWS CodePipeline that supports this architecture can be seen below:

Conclusion

We are excited to see how you leverage this pipeline to help ease your developer experience as you develop applications in Kubernetes.

You’ll find an AWS CloudFormation template with everything necessary to spin up your own continuous deployment pipeline at the CodeSuite – Continuous Deployment Reference Architecture for Kubernetes repo on GitHub. The repository details exactly how the pipeline is provisioned and how you can use it to deploy your own applications. If you have any questions, feedback, or suggestions, please let us know!

Graphite 1.1: Teaching an Old Dog New Tricks

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2018/01/11/graphite-1.1-teaching-an-old-dog-new-tricks/

The Road to Graphite 1.1

I started working on Graphite just over a year ago, when @obfuscurity asked me to help out with some issues blocking the Graphite 1.0 release. Little did I know that a year later, that would have resulted in 262 commits (and counting), and that with the help of the other Graphite maintainers (especially @deniszh, @iksaif & @cbowman0) we would have added a huge amount of new functionality to Graphite.

There are a huge number of new additions and updates in this release. In this post I’ll give a tour of some of the highlights, including tag support, syntax and function updates, custom function plugins, and Python 3.x support.

Tagging!

The single biggest feature in this release is the addition of tag support, which brings the ability to describe metrics in a much richer way and to write more flexible and expressive queries.

Traditionally series in Graphite are identified using a hierarchical naming scheme based on dot-separated segments called nodes. This works very well and is simple to map into a hierarchical structure like the whisper filesystem tree, but it means that the user has to know what each segment represents, and makes it very difficult to modify or extend the naming scheme since everything is based on the positions of the segments within the hierarchy.

The tagging system gives users the ability to encode information about the series in a collection of tag=value pairs which are used together with the series name to uniquely identify each series, and the ability to query series by specifying tag-based matching expressions rather than constructing glob-style selectors based on the positions of specific segments within the hierarchy. This is broadly similar to the system used by Prometheus and makes it possible to use Graphite as a long-term storage backend for metrics gathered by Prometheus with full tag support.

When using tags, series names are specified using the new tagged carbon format: name;tag1=value1;tag2=value2. This format is backward compatible with most existing carbon tooling, and makes it easy to adapt existing tools to produce tagged metrics simply by changing the metric names. The OpenMetrics format is also supported for ingestion, and is normalized into the standard Graphite format internally.
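
As a quick illustration, here is a minimal sketch of sending a tagged metric to Carbon from Python. It assumes a Carbon plaintext listener on its default port (2003) on localhost, and the metric name and tag values are invented for the example:

import socket
import time

# Tagged metric in the new carbon format: name;tag1=value1;tag2=value2
metric = 'disk.used;datacenter=dc1;rack=a1;server=web01'
value = 42.5
timestamp = int(time.time())

# Carbon's plaintext protocol expects "<name> <value> <timestamp>\n" per line.
sock = socket.create_connection(('localhost', 2003))
sock.sendall('{} {} {}\n'.format(metric, value, timestamp).encode('utf-8'))
sock.close()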

At its core, the tagging system is implemented as a tag database (TagDB) alongside the metrics, which allows series to be efficiently queried by individual tag values rather than having to traverse the metrics tree looking for series that match the specified query. Internally, the tag index is stored in one of a number of pluggable tag databases; the currently supported options are the internal graphite-web database, Redis, or an external system that implements the Graphite tagging HTTP API. Carbon automatically keeps the index up to date with any tagged series it sees.

The new seriesByTag function is used to query the TagDB and will return a list of all the series that match the expressions passed to it. seriesByTag supports both exact and regular expression matches, and can be used anywhere you would previously have specified a metric name or glob expression.

There are new dedicated functions for grouping and aliasing series by tag (groupByTags and aliasByTags), and you can also use tags interchangeably with node numbers in the standard Graphite functions like aliasByNode, groupByNodes, asPercent, mapSeries, etc.
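
For example, a tagged query can be issued against the render API just like any other target. The sketch below uses the Python requests library; the Graphite host and tag values are hypothetical, and seriesByTag accepts both exact (tag=value) and regular-expression (tag=~pattern) matches:

import requests

# Hypothetical Graphite host and tag values, for illustration only.
params = {
    'target': "aliasByTags(seriesByTag('name=disk.used', 'datacenter=dc1', 'rack=~a.*'), 'server')",
    'from': '-1h',
    'format': 'json',
}
resp = requests.get('http://graphite.example.com/render', params=params)
for series in resp.json():
    print(series['target'], series['datapoints'][-1])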

Piping Syntax & Function Updates

One of the huge strengths of the Graphite render API is the ability to chain together multiple functions to process data, but until now (unless you were using a tool like Grafana) writing chained queries could be painful as each function had to be wrapped around the previous one. With this release it is now possible to “pipe” the output of one processing function into the next, and to combine piped and nested functions.

For example:

alias(movingAverage(scaleToSeconds(sumSeries(stats_global.production.counters.api.requests.*.count),60),30),'api.avg')

Can now be written as:

sumSeries(stats_global.production.counters.api.requests.*.count)|scaleToSeconds(60)|movingAverage(30)|alias('api.avg')

OR

stats_global.production.counters.api.requests.*.count|sumSeries()|scaleToSeconds(60)|movingAverage(30)|alias('api.avg')

Another source of frustration with the old function API was the inconsistent implementation of aggregations, with different functions being used in different parts of the API, and some functions simply not being available. In 1.1 all functions that perform aggregation (whether across series or across time intervals) now support a consistent set of aggregations; average, median, sum, min, max, diff, stddev, count, range, multiply and last. This is part of a new approach to implementing functions that emphasises using shared building blocks to ensure consistency across the API and solve the problem of a particular function not working with the aggregation needed for a given task.

To that end a number of new functions have been added that each provide the same functionality as an entire family of “old” functions; aggregate, aggregateWithWildcards, movingWindow, filterSeries, highest, lowest and sortBy.

Each of these functions accepts an aggregation method parameter, for example aggregate(some.metric.*, 'sum') implements the same functionality as sumSeries(some.metric.*).

It can also be used with different aggregation methods to replace averageSeries, stddevSeries, multiplySeries, diffSeries, rangeOfSeries, minSeries, maxSeries and countSeries. All those functions are now implemented as aliases for aggregate, and it supports the previously-missing median and last aggregations.

The same is true for the other functions, and the summarize, smartSummarize, groupByNode, groupByNodes and the new groupByTags functions now all support the standard set of aggregations. Gone are the days of wishing that sortByMedian or highestRange were available!

For more information on the functions available check the function documentation.

Custom Functions

No matter how many functions are available there are always going to be specific use-cases where a custom function can perform analysis that wouldn’t otherwise be possible, or provide a convenient alias for a complicated function chain or specific set of parameters.

In Graphite 1.1 we added support for easily adding one-off custom functions, as well as for creating and sharing plugins that can provide one or more functions.

Each function plugin is packaged as a simple python module, and will be automatically loaded by Graphite when placed into the functions/custom folder.

An example of a simple function plugin that translates the name of every series passed to it into UPPERCASE:

from graphite.functions.params import Param, ParamTypes

def toUpperCase(requestContext, seriesList):
  """Custom function that changes series names to UPPERCASE"""
  for series in seriesList:
    series.name = series.name.upper()
  return seriesList

toUpperCase.group = 'Custom'
toUpperCase.params = [
  Param('seriesList', ParamTypes.seriesList, required=True),
]

SeriesFunctions = {
  'upper': toUpperCase,
}

Once installed, the function is not only available for use within Graphite, but is also exposed via the new Function API, which allows the function definition and documentation to be automatically loaded by tools like Grafana. This means that users can select and use the new function in exactly the same way as the internal functions.

More information on writing and using custom functions is available in the documentation.

Clustering Updates

One of the biggest changes from the 0.9 to 1.0 releases was the overhaul of the clustering code, and with 1.1.1 that process has been taken even further to optimize performance when using Graphite in a clustered deployment. In the past it was common for a request to require the frontend node to make multiple requests to the backend nodes to identify matching series and to fetch data, and the code for handling remote vs local series was overly complicated. In 1.1.1 we took a new approach where all render data requests pass through the same path internally, and multiple backend nodes are handled individually rather than grouped together into a single finder. This has greatly simplified the codebase, making it much easier to understand and reason about, while allowing much more flexibility in design of the finders. After these changes, render requests can now be answered with a single internal request to each backend node, and all requests for both remote and local data are executed in parallel.

To maintain Graphite’s ability to scale out horizontally, the tagging system works seamlessly within a clustered environment, with each node responsible for the series stored on that node. Calls to load tagged series via seriesByTag are fanned out to the backend nodes and results are merged on the query node, just like they are for non-tagged series.

Python 3 & Django 1.11 Support

Graphite 1.1 finally brings support for Python 3.x: both graphite-web and carbon are now tested against Python 2.7, 3.4, 3.5, 3.6 and PyPy. Django releases 1.8 through 1.11 are also supported. Sorting out the compatibility issues between Python 2.x and 3.x took considerable effort, but it is a huge step forward for the long-term support of the project! With the new Django 2.x series supporting only Python 3.x, we will need to evaluate our long-term support for Python 2.x, but the Django 1.11 series is supported through 2020 so there is time to consider the options there.

Watch This Space

Efforts are underway to add support for the new functionality across the ecosystem of tools that work with Graphite: adding collectd tagging support, Prometheus remote read & write with tags (and native Prometheus remote read/write support in Graphite), and last but not least Graphite tag support in Grafana.

We’re excited about the possibilities that the new capabilities in 1.1.x open up, and can’t wait to see how the community puts them to work.

Download the 1.1.1 release and check out the release notes here.

Instrumenting Web Apps Using AWS X-Ray

Post Syndicated from Bharath Kumar original https://aws.amazon.com/blogs/devops/instrumenting-web-apps-using-aws-x-ray/

This post was written by James Bowman, Software Development Engineer, AWS X-Ray

AWS X-Ray helps developers analyze and debug distributed applications and underlying services in production. You can identify and analyze root-causes of performance issues and errors, understand customer impact, and extract statistical aggregations (such as histograms) for optimization.

In this blog post, I will provide a step-by-step walkthrough for enabling X-Ray tracing in the Go programming language. You can use these steps to add X-Ray tracing to any distributed application.

Revel: A web framework for the Go language

This section will assist you with designing a guestbook application. Skip to the “Integrating with AWS X-Ray” section below if you already have a Go language application.

Revel is a web framework for the Go language. It facilitates the rapid development of web applications by providing a predefined framework for controllers, views, routes, filters, and more.

To get started with Revel, run revel new github.com/jamesdbowman/guestbook. A project base is then copied to $GOPATH/src/github.com/jamesdbowman/guestbook.

$ tree -L 2
.
├── README.md
├── app
│   ├── controllers
│   ├── init.go
│   ├── routes
│   ├── tmp
│   └── views
├── conf
│   ├── app.conf
│   └── routes
├── messages
│   └── sample.en
├── public
│   ├── css
│   ├── fonts
│   ├── img
│   └── js
└── tests
    └── apptest.go

Writing a guestbook application

A basic guestbook application can consist of just two routes: one to sign the guestbook and another to list all entries.
Let’s set up these routes by adding a Book controller, which can be routed to by modifying ./conf/routes.

./app/controllers/book.go:
package controllers

import (
    "math/rand"
    "time"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/endpoints"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/dynamodb"
    "github.com/aws/aws-sdk-go/service/dynamodb/dynamodbattribute"
    "github.com/revel/revel"
)

const TABLE_NAME = "guestbook"
const SUCCESS = "Success.\n"
const DAY = 86400

var letters = []rune("ABCDEFGHIJKLMNOPQRSTUVWXYZ")

func init() {
    rand.Seed(time.Now().UnixNano())
}

// randString returns a random string of len n, used for DynamoDB Hash key.
func randString(n int) string {
    b := make([]rune, n)
    for i := range b {
        b[i] = letters[rand.Intn(len(letters))]
    }
    return string(b)
}

// Book controls interactions with the guestbook.
type Book struct {
    *revel.Controller
    ddbClient *dynamodb.DynamoDB
}

// Signature represents a user's signature.
type Signature struct {
    Message string
    Epoch   int64
    ID      string
}

// ddb returns the controller's DynamoDB client, instantiating a new client if necessary.
func (c Book) ddb() *dynamodb.DynamoDB {
    if c.ddbClient == nil {
        sess := session.Must(session.NewSession(&aws.Config{
            Region: aws.String(endpoints.UsWest2RegionID),
        }))
        c.ddbClient = dynamodb.New(sess)
    }
    return c.ddbClient
}

// Sign allows users to sign the book.
// The message is to be passed as application/json typed content, listed under the "message" top level key.
func (c Book) Sign() revel.Result {
    var s Signature

    err := c.Params.BindJSON(&s)
    if err != nil {
        return c.RenderError(err)
    }
    now := time.Now()
    s.Epoch = now.Unix()
    s.ID = randString(20)

    item, err := dynamodbattribute.MarshalMap(s)
    if err != nil {
        return c.RenderError(err)
    }

    putItemInput := &dynamodb.PutItemInput{
        TableName: aws.String(TABLE_NAME),
        Item:      item,
    }
    _, err = c.ddb().PutItem(putItemInput)
    if err != nil {
        return c.RenderError(err)
    }

    return c.RenderText(SUCCESS)
}

// List allows users to list all signatures in the book.
func (c Book) List() revel.Result {
    scanInput := &dynamodb.ScanInput{
        TableName: aws.String(TABLE_NAME),
        Limit:     aws.Int64(100),
    }
    res, err := c.ddb().Scan(scanInput)
    if err != nil {
        return c.RenderError(err)
    }

    messages := make([]string, 0)
    for _, v := range res.Items {
        messages = append(messages, *(v["Message"].S))
    }
    return c.RenderJSON(messages)
}

./conf/routes:
POST /sign Book.Sign
GET /list Book.List

Creating the resources and testing

For the purposes of this blog post, the application will be run and tested locally. We will store and retrieve messages from an Amazon DynamoDB table. Use the following AWS CLI command to create the guestbook table:

aws dynamodb create-table --region us-west-2 --table-name "guestbook" --attribute-definitions AttributeName=ID,AttributeType=S AttributeName=Epoch,AttributeType=N --key-schema AttributeName=ID,KeyType=HASH AttributeName=Epoch,KeyType=RANGE --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

Now, let’s test our sign and list routes. If everything is working correctly, the following result appears:

$ curl -d '{"message":"Hello from cURL!"}' -H "Content-Type: application/json" http://localhost:9000/book/sign
Success.
$ curl http://localhost:9000/book/list
[
  "Hello from cURL!"
]%

Integrating with AWS X-Ray

Download and run the AWS X-Ray daemon

The AWS SDKs emit trace segments over UDP on port 2000. (This port can be configured.) In order for the trace segments to make it to the X-Ray service, the daemon must listen on this port and batch the segments in calls to the PutTraceSegments API.
For information about downloading and running the X-Ray daemon, see the AWS X-Ray Developer Guide.

Installing the AWS X-Ray SDK for Go

To download the SDK from GitHub, run go get -u github.com/aws/aws-xray-sdk-go/... The SDK will appear in the $GOPATH.

Enabling the incoming request filter

The first step to instrumenting an application with AWS X-Ray is to enable the generation of trace segments on incoming requests. The SDK conveniently provides an implementation of http.Handler which does exactly that. To ensure incoming web requests travel through this handler, we can modify app/init.go, adding a custom function to be run on application start.

import (
    "github.com/aws/aws-xray-sdk-go/xray"
    "github.com/revel/revel"
)

...

func init() {
  ...
    revel.OnAppStart(installXRayHandler)
}

func installXRayHandler() {
    revel.Server.Handler = xray.Handler(xray.NewFixedSegmentNamer("GuestbookApp"), revel.Server.Handler)
}

The application will now emit a segment for each incoming web request. The service graph appears:

You can customize the name of the segment to make it more descriptive by providing an alternate implementation of SegmentNamer to xray.Handler. For example, you can use xray.NewDynamicSegmentNamer(fallback, pattern) in place of the fixed namer. This namer will use the host name from the incoming web request (if it matches pattern) as the segment name. This is often useful when you are trying to separate different instances of the same application.

In addition, HTTP-centric information such as method and URL is collected in the segment’s http subsection:

"http": {
    "request": {
        "url": "/book/list",
        "method": "GET",
        "user_agent": "curl/7.54.0",
        "client_ip": "::1"
    },
    "response": {
        "status": 200
    }
},

Instrumenting outbound calls

To provide detailed performance metrics for distributed applications, the AWS X-Ray SDK needs to measure the time it takes to make outbound requests. Trace context is passed to downstream services using the X-Amzn-Trace-Id header. To draw a detailed and accurate representation of a distributed application, outbound call instrumentation is required.

AWS SDK calls

The AWS X-Ray SDK for Go provides a one-line AWS client wrapper that enables the collection of detailed per-call metrics for any AWS client. We can modify the DynamoDB client instantiation to include this line:

// ddb returns the controller's DynamoDB client, instantiating a new client if necessary.
func (c Book) ddb() *dynamodb.DynamoDB {
    if c.ddbClient == nil {
        sess := session.Must(session.NewSession(&aws.Config{
            Region: aws.String(endpoints.UsWest2RegionID),
        }))
        c.ddbClient = dynamodb.New(sess)
        xray.AWS(c.ddbClient.Client) // add subsegment-generating X-Ray handlers to this client
    }
    return c.ddbClient
}

We also need to ensure that the segment generated by our xray.Handler is passed to these AWS calls so that the X-Ray SDK knows to which segment these generated subsegments belong. In Go, the context.Context object is passed throughout the call path to achieve this goal. (In most other languages, some variant of ThreadLocal is used.) AWS clients provide a *WithContext method variant for each AWS operation, which we need to switch to:

_, err = c.ddb().PutItemWithContext(c.Request.Context(), putItemInput)
res, err := c.ddb().ScanWithContext(c.Request.Context(), scanInput)

We now see much more detail in the Timeline view of the trace for the sign and list operations:

We can use this detail to help diagnose throttling on our DynamoDB table. In the following screenshot, the purple in the DynamoDB service graph node indicates that our table is underprovisioned. The red in the GuestbookApp node indicates that the application is throwing faults due to this throttling.

HTTP calls

Although the guestbook application does not make any non-AWS outbound HTTP calls in its current state, there is a similar one-liner to wrap HTTP clients that make outbound requests. xray.Client(c *http.Client) wraps an existing http.Client (or nil if you want to use a default HTTP client). For example:

resp, err := ctxhttp.Get(ctx, xray.Client(nil), "https://aws.amazon.com/")

Instrumenting local operations

X-Ray can also assist in measuring the performance of local compute operations. To see this in action, let’s create a custom subsegment inside the randString method:


// randString returns a random string of len n, used for DynamoDB Hash key.
func randString(ctx context.Context, n int) string {
    var s string
    // xray.Capture records the wrapped work as a custom subsegment named "randString".
    xray.Capture(ctx, "randString", func(innerCtx context.Context) error {
        b := make([]rune, n)
        for i := range b {
            b[i] = letters[rand.Intn(len(letters))]
        }
        s = string(b)
        return nil
    })
    return s
}

// we'll also need to change the callsite

s.ID = randString(c.Request.Context(), 20)

Summary

By now, you are an expert on how to instrument X-Ray for your Go applications. Instrumenting X-Ray with your applications is an easy way to analyze and debug performance issues and understand customer impact. Please feel free to give any feedback or comments below.

For more information about advanced configuration of the AWS X-Ray SDK for Go, see the AWS X-Ray SDK for Go in the AWS X-Ray Developer Guide and the aws/aws-xray-sdk-go GitHub repository.

For more information about some of the advanced X-Ray features such as histograms, annotations, and filter expressions, see the Analyzing Performance for Amazon Rekognition Apps Written on AWS Lambda Using AWS X-Ray blog post.

Might Google Class “Torrent” a Dirty Word? France is About to Find Out

Post Syndicated from Andy original https://torrentfreak.com/might-google-class-torrent-a-dirty-word-france-is-about-to-find-out-171223/

Like most countries, France is struggling to find ways to stop online piracy running rampant. A number of options have been tested thus far, with varying results.

One of the more interesting cases has been running since 2015, when music industry group SNEP took Google and Microsoft to court demanding automated filtering of ‘pirate’ search results featuring three local artists.

Before the High Court of Paris, SNEP argued that searches for the artists’ names plus the word “torrent” returned mainly infringing results on Google and Bing. Filtering out results with both sets of terms would reduce the impact of people finding pirate content through search, they said.

While SNEP claimed that its request was in line with Article L336-2 of France’s intellectual property code, which allows for “all appropriate measures” to prevent infringement, both Google and Microsoft fought back, arguing that such filtering would be disproportionate and could restrict freedom of expression.

The Court eventually sided with the search engines, noting that torrent is a common noun that refers to a neutral communication protocol.

“The requested measures are thus tantamount to general monitoring and may block access to lawful websites,” the High Court said.

Despite being told that its demands were too broad, SNEP decided to appeal. The case was heard in November where concerns were expressed over potential false positives.

Since SNEP even wants sites with “torrent” in their URL filtered out via “fully automated procedures that do not require human intervention”, this very site – TorrentFreak.com – could be sucked in. To counter that eventuality, SNEP proposed some kind of whitelist, NextInpact reports.

With no real consensus on how to move forward, the parties were advised to enter discussions on how to get closer to the aim of reducing piracy without causing collateral damage. Last week the parties agreed to enter negotiations, so the details will now have to be hammered out between their respective law firms. Failing that, they will face a ruling from the court.

If this last scenario plays out, the situation appears to favor the search engines, who have a High Court ruling in their favor and already offer comprehensive takedown tools for copyright holders to combat the exploitation of their content online.

Meanwhile, other elements of the French recording industry have booked a notable success against several pirate sites.

SCPP, which represents Warner, Universal, Sony and thousands of others, went to court in February this year demanding that local ISPs Bouygues, Free, Orange, SFR and Numéricable prevent their subscribers from accessing ExtraTorrent, isoHunt, Torrent9 and Cpasbien.

Like SNEP in the filtering case, SCPP also cited Article L336-2 of France’s intellectual property code, demanding that the sites plus their variants, mirrors and proxies should be blocked by the ISPs so that their subscribers can no longer gain access.

This week the Paris Court of First Instance sided with the industry group, ordering the ISPs to block the sites. The service providers were also told to pick up the bill for costs.

These latest cases are yet more examples of France’s determination to crack down on piracy.

Early December it was revealed that since its inception, nine million piracy warnings have been sent to citizens via the Hadopi anti-piracy agency. Since the launch of its graduated response regime in 2010, more than 2,000 cases have been referred to prosecutors, resulting in 189 criminal convictions.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Google Defeats Worldwide Site Blocking Order in US Court

Post Syndicated from Ernesto original https://torrentfreak.com/google-defeats-worldwide-site-blocking-order-in-us-court-171218/

As the largest search engine on the Internet, Google has received its fair share of takedown requests. Over the past year, the company removed roughly a billion links from its search results.

However, this doesn’t mean that Google will remove everything it’s asked to. When a Canadian court demanded that the search engine delist sites that offered unlawful and competing products of Equustek Solutions, it fought back.

After several years in court, the Supreme Court of Canada directed Google to remove the websites from its search results last summer. This order wasn’t limited to Canada alone, but applied worldwide.

Worried about the possible negative consequences the broad verdict could have, Google then took the case to the US, and with success.

A federal court in California already signed a preliminary injunction a few weeks ago, disarming the Canadian order, and a few days ago ruled that Google has won its case.

Case closed

According to the California court, the Canadian Supreme court ruling violates the First Amendment of the U.S. Constitution, putting free speech at risk.

It would also go against Section 230 of the Communications Decency Act, which offers search engines and other Internet services immunity from liability for material published by others.

“The Canadian order would eliminate Section 230 immunity for service providers that link to third-party websites,” the court wrote.

“By forcing intermediaries to remove links to third-party material, the Canadian order undermines the policy goals of Section 230 and threatens free speech on the global internet.”

After a legal battle that kept the Canadian court busy since 2014, the US case was solved rather quickly. Equustek Solutions didn’t show up and failed to defend itself, which made it an easy win.

Now that the permanent injunction is signed the case will be closed. While Google still has to delist the contested pages in Canada, it no longer has to do the same worldwide.

As highlighted previously, the order is very important in the broader scheme. If foreign courts are allowed to grant worldwide blockades, free speech could be severely hampered.

Today it’s a relatively unknown Canadian company, but the damage could be much more severe if the Chinese Government asked Google to block the websites of VPN providers, or any other information they don’t like.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

FCC Repeals U.S. Net Neutrality Rules

Post Syndicated from Ernesto original https://torrentfreak.com/fcc-repeals-u-s-net-neutrality-rules-171214/

In recent months, millions of people have protested the FCC’s plan to repeal U.S. net neutrality rules, which were put in place by the Obama administration.

However, an outpouring of public outrage, criticism from major tech companies, and even warnings from pioneers of the Internet had no effect.

Today the FCC voted to repeal the old rules, effectively ending net neutrality.

Under the net neutrality rules that have been in effect during recent years, ISPs were specifically prohibited from blocking, throttling, and paid prioritization of “lawful” traffic. In addition, Internet providers could be regulated as carriers under Title II.

Now that these rules have been repealed, Internet providers have more freedom to experiment with paid prioritization. Under the new guidelines, they can charge customers extra for access to some online services, or throttle certain types of traffic.

Most critics of the repeal fear that, now that the old net neutrality rules are in the trash, ‘fast lanes’ for some services, and throttling for others, will become commonplace in the U.S.

This could also mean that BitTorrent traffic becomes a target once again. After all, it was Comcast’s ‘secretive’ BitTorrent throttling that started the broader net neutrality debate, now ten years ago.

Comcast’s throttling history is a sensitive issue, also for the company itself.

Before the Obama-era net neutrality rules, the ISP vowed that it would no longer discriminate against specific traffic classes. Ahead of the FCC vote yesterday, it doubled down on this promise.

“Despite repeated distortions and biased information, as well as misguided, inaccurate attacks from detractors, our Internet service is not going to change,” writes David Cohen, Comcast’s Chief Diversity Officer.

“We have repeatedly stated, and reiterate today, that we do not and will not block, throttle, or discriminate against lawful content.”

It’s worth highlighting the term “lawful” in the last sentence. It is by no means a promise that pirate sites won’t be blocked.

As we’ve highlighted in the past, blocking pirate sites was already an option under the now-repealed rules. The massive copyright loophole made sure of that. Targeting all torrent traffic is even an option, in theory.

That said, today’s FCC vote certainly makes it easier for ISPs to block or throttle BitTorrent traffic across the entire network. For the time being, however, there are no signs that any ISPs plan to do so.

If they do, we will know soon enough. The FCC requires all ISPs to be transparent under the new plan. They have to disclose network management practices, blocking efforts, commercial prioritization, and the like.

And with the current focus on net neutrality, ISPs are likely to tread carefully, or else they might just face an exodus of customers.

Finally, it’s worth highlighting that today’s vote is not the end of the road yet. Net neutrality supporters are planning to convince Congress to overturn the repeal. In addition, there is also talk of taking the matter to court.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

What is HAMR and How Does It Enable the High-Capacity Needs of the Future?

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/hamr-hard-drives/

HAMR drive illustration

During Q4, Backblaze deployed 100 petabytes worth of Seagate hard drives to our data centers. The newly deployed Seagate 10 and 12 TB drives are doing well and will help us meet our near term storage needs, but we know we’re going to need more drives — with higher capacities. That’s why the success of new hard drive technologies like Heat-Assisted Magnetic Recording (HAMR) from Seagate are very relevant to us here at Backblaze and to the storage industry in general. In today’s guest post we are pleased to have Mark Re, CTO at Seagate, give us an insider’s look behind the hard drive curtain to tell us how Seagate engineers are developing the HAMR technology and making it market ready starting in late 2018.

What is HAMR and How Does It Enable the High-Capacity Needs of the Future?

Guest Blog Post by Mark Re, Seagate Senior Vice President and Chief Technology Officer

Earlier this year Seagate announced plans to make the first hard drives using Heat-Assisted Magnetic Recording, or HAMR, available by the end of 2018 in pilot volumes. Even as today’s market has embraced 10TB+ drives, the need for 20TB+ drives remains imperative in the relative near term. HAMR is the Seagate research team’s next major advance in hard drive technology.

HAMR is a technology that over time will enable a big increase in the amount of data that can be stored on a disk. A small laser is attached to a recording head, designed to heat a tiny spot on the disk where the data will be written. This allows a smaller bit cell to be written as either a 0 or a 1. The smaller bit cell size enables more bits to be crammed into a given surface area — increasing the areal density of data, and increasing drive capacity.

It sounds almost simple, but the science and engineering expertise required, the research, experimentation, lab development and product development to perfect this technology has been enormous. Below is an overview of the HAMR technology and you can dig into the details in our technical brief that provides a point-by-point rundown describing several key advances enabling the HAMR design.

As much time and resources as have been committed to developing HAMR, the need for its increased data density is indisputable. Demand for data storage keeps increasing. Businesses’ ability to manage and leverage more capacity is a competitive necessity, and IT spending on capacity continues to increase.

History of Increasing Storage Capacity

For the last 50 years, areal density in the hard disk drive has been growing faster than Moore’s law, which is a very good thing. After all, customers from data centers and cloud service providers to creative professionals and game enthusiasts rarely go shopping for a hard drive just like the one they bought two years ago. The demand for storage capacity inevitably increases, so the technology constantly evolves.

According to the Advanced Storage Technology Consortium, HAMR will be the next significant storage technology innovation to increase the amount of storage in the area available to store data, also called the disk’s “areal density.” We believe this boost in areal density will help fuel hard drive product development and growth through the next decade.

Why do we Need to Develop Higher-Capacity Hard Drives? Can’t Current Technologies do the Job?

Why is HAMR’s increased data density so important?

Data has become critical to all aspects of human life, changing how we’re educated and entertained. It affects and informs the ways we experience each other and interact with businesses and the wider world. IDC research shows the datasphere — all the data generated by the world’s businesses and billions of consumer endpoints — will continue to double in size every two years. IDC forecasts that by 2025 the global datasphere will grow to 163 zettabytes (that is a trillion gigabytes). That’s ten times the 16.1 ZB of data generated in 2016. IDC cites five key trends intensifying the role of data in changing our world: embedded systems and the Internet of Things (IoT), instantly available mobile and real-time data, cognitive artificial intelligence (AI) systems, increased security data requirements, and critically, the evolution of data from playing a business background to playing a life-critical role.

Consumers use the cloud to manage everything from family photos and videos to data about their health and exercise routines. Real-time data created by connected devices — everything from Fitbit, Alexa and smart phones to home security systems, solar systems and autonomous cars — are fueling the emerging Data Age. On top of the obvious business and consumer data growth, our critical infrastructure like power grids, water systems, hospitals, road infrastructure and public transportation all demand and add to the growth of real-time data. Data is now a vital element in the smooth operation of all aspects of daily life.

All of this entails a significant infrastructure cost behind the scenes with the insatiable, global appetite for data storage. While a variety of storage technologies will continue to advance in data density (Seagate announced the first 60TB 3.5-inch SSD unit for example), high-capacity hard drives serve as the primary foundational core of our interconnected, cloud and IoT-based dependence on data.

HAMR Hard Drive Technology

Seagate has been working on heat assisted magnetic recording (HAMR) in one form or another since the late 1990s. During this time we’ve made many breakthroughs in making reliable near field transducers, special high capacity HAMR media, and figuring out a way to put a laser on each and every head that is no larger than a grain of salt.

The development of HAMR has required Seagate to consider and overcome a myriad of scientific and technical challenges including new kinds of magnetic media, nano-plasmonic device design and fabrication, laser integration, high-temperature head-disk interactions, and thermal regulation.

A typical hard drive inside any computer or server contains one or more rigid disks coated with a magnetically sensitive film consisting of tiny magnetic grains. Data is recorded when a magnetic write-head flies just above the spinning disk; the write head rapidly flips the magnetization of one magnetic region of grains so that its magnetic pole points up or down, to encode a 1 or a 0 in binary code.

Increasing the amount of data you can store on a disk requires cramming magnetic regions closer together, which means the grains need to be smaller so they won’t interfere with each other.

Heat Assisted Magnetic Recording (HAMR) is the next step to enable us to increase the density of grains — or bit density. Current projections are that HAMR can achieve 5 Tbpsi (Terabits per square inch) on conventional HAMR media, and in the future will be able to achieve 10 Tbpsi or higher with bit patterned media (in which discrete dots are predefined on the media in regular, efficient, very dense patterns). These technologies will enable hard drives with capacities higher than 100 TB before 2030.

The major problem with packing bits so closely together is that if you do that on conventional magnetic media, the bits (and the data they represent) become thermally unstable, and may flip. So, to make the grains maintain their stability — their ability to store bits over a long period of time — we need to develop a recording media that has higher coercivity. That means it’s magnetically more stable during storage, but it is more difficult to change the magnetic characteristics of the media when writing (harder to flip a grain from a 0 to a 1 or vice versa).

That’s why HAMR’s first key hardware advance required developing a new recording media that keeps bits stable — using high anisotropy (or “hard”) magnetic materials such as iron-platinum alloy (FePt), which resist magnetic change at normal temperatures. Over years of HAMR development, Seagate researchers have tested and proven out a variety of FePt granular media films, with varying alloy composition and chemical ordering.

In fact the new media is so “hard” that conventional recording heads won’t be able to flip the bits, or write new data, under normal temperatures. If you add heat to the tiny spot on which you want to write data, you can make the media’s coercive field lower than the magnetic field provided by the recording head — in other words, enable the write head to flip that bit.

So, a challenge with HAMR has been to replace conventional perpendicular magnetic recording (PMR), in which the write head operates at room temperature, with a write technology that heats the thin film recording medium on the disk platter to temperatures above 400 °C. The basic principle is to heat a tiny region of several magnetic grains for a very short time (~1 nanoseconds) to a temperature high enough to make the media’s coercive field lower than the write head’s magnetic field. Immediately after the heat pulse, the region quickly cools down and the bit’s magnetic orientation is frozen in place.

Applying this dynamic nano-heating is where HAMR’s famous “laser” comes in. A plasmonic near-field transducer (NFT) has been integrated into the recording head, to heat the media and enable magnetic change at a specific point. Plasmonic NFTs are used to focus and confine light energy to regions smaller than the wavelength of light. This enables us to heat an extremely small region, measured in nanometers, on the disk media to reduce its magnetic coercivity.

Moving HAMR Forward

HAMR write head

As always in advanced engineering, the devil — or many devils — is in the details. As noted earlier, our technical brief provides a point-by-point short illustrated summary of HAMR’s key changes.

Although hard work remains, we believe this technology is nearly ready for commercialization. Seagate has the best engineers in the world working towards a goal of a 20 Terabyte drive by 2019. We hope we’ve given you a glimpse into the amount of engineering that goes into a hard drive. Keeping up with the world’s insatiable appetite to create, capture, store, secure, manage, analyze, rapidly access and share data is a challenge we work on every day.

With thousands of HAMR drives already being made in our manufacturing facilities, our internal and external supply chain is solidly in place, and volume manufacturing tools are online. This year we began shipping initial units for customer tests, and production units will ship to key customers by the end of 2018. Prepare for breakthrough capacities.

The post What is HAMR and How Does It Enable the High-Capacity Needs of the Future? appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Linaro ERP 17.12 released

Post Syndicated from corbet original https://lwn.net/Articles/741352/rss

Linaro has announced the 17.12 release of its “Enterprise Reference Platform” distribution. “The goal of the Linaro Enterprise Reference Platform is to provide a fully tested, end to end, documented, open source implementation for ARM based Enterprise servers. The Reference Platform includes kernel, a community supported userspace and additional relevant open source projects, and is validated against existing firmware releases.”

Managing AWS Lambda Function Concurrency

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/managing-aws-lambda-function-concurrency/

One of the key benefits of serverless applications is the ease in which they can scale to meet traffic demands or requests, with little to no need for capacity planning. In AWS Lambda, which is the core of the serverless platform at AWS, the unit of scale is a concurrent execution. This refers to the number of executions of your function code that are happening at any given time.

Thinking about concurrent executions as a unit of scale is a fairly unique concept. In this post, I dive deeper into this and talk about how you can make use of per function concurrency limits in Lambda.

Understanding concurrency in Lambda

Instead of diving right into the guts of how Lambda works, here’s an appetizing analogy: a magical pizza.
Yes, a magical pizza!

This magical pizza has some unique properties:

  • It has a fixed maximum number of slices, such as 8.
  • Slices automatically re-appear after they are consumed.
  • When you take a slice from the pizza, it does not re-appear until it has been completely consumed.
  • One person can take multiple slices at a time.
  • You can easily ask to have the number of slices increased, but they remain fixed at any point in time otherwise.

Now that the magical pizza’s properties are defined, here’s a hypothetical situation of some friends sharing this pizza.

Shawn, Kate, Daniela, Chuck, Ian and Avleen get together every Friday to share a pizza and catch up on their week. As there are just six of them, they can easily all enjoy a slice of pizza at a time. As they finish each slice, it re-appears in the pizza pan and they can take another slice again. Given the magical properties of their pizza, they can continue to eat all they want, but with two very important constraints:

  • If any of them take too many slices at once, the others may not get as much as they want.
  • If they take too many slices, they might also eat too much and get sick.

One particular week, some of the friends are hungrier than the rest, taking two slices at a time instead of just one. If more than two of them try to take two pieces at a time, this can cause contention for pizza slices. Some of them would wait hungry for the slices to re-appear. They could ask for a pizza with more slices, but then run the same risk again later if more hungry friends join than planned for.

What can they do?

If the friends agreed to accept a limit on the maximum number of slices they each eat concurrently, both of these issues are avoided. Some could have a maximum of 2 of the 8 slices, and others could have limits that were higher or lower, as long as they kept the total at or under eight slices to be eaten at one time. This would keep anyone from going hungry or eating too much. The six friends can happily enjoy their magical pizza without worry!

Concurrency in Lambda

Concurrency in Lambda actually works similarly to the magical pizza model. Each AWS Account has an overall AccountLimit value that is fixed at any point in time, but can be easily increased as needed, just like the count of slices in the pizza. As of May 2017, the default limit is 1000 “slices” of concurrency per AWS Region.

Also like the magical pizza, each concurrency “slice” can only be consumed individually one at a time. After consumption, it becomes available to be consumed again. Services invoking Lambda functions can consume multiple slices of concurrency at the same time, just like the group of friends can take multiple slices of the pizza.

Let’s take our example of the six friends and bring it back to AWS services that commonly invoke Lambda:

  • Amazon S3
  • Amazon Kinesis
  • Amazon DynamoDB
  • Amazon Cognito

In a single account with the default concurrency limit of 1000 concurrent executions, any of these four services could invoke enough functions to consume the entire limit or some part of it. Just like with the pizza example, there is the possibility for two issues to pop up:

  • One or more of these services could invoke enough functions to consume a majority of the available concurrency capacity. This could cause others to be starved for it, causing failed invocations.
  • A service could consume too much concurrent capacity and cause a downstream service or database to be overwhelmed, which could cause failed executions.

For Lambda functions that are launched in a VPC, you have the potential to consume the available IP addresses in a subnet or the maximum number of elastic network interfaces to which your account has access. For more information, see Configuring a Lambda Function to Access Resources in an Amazon VPC. For information about elastic network interface limits, see Network Interfaces section in the Amazon VPC Limits topic.

One way to solve both of these problems is applying a concurrency limit to the Lambda functions in an account.

Configuring per function concurrency limits

You can now set a concurrency limit on individual Lambda functions in an account. The concurrency limit that you set reserves a portion of your account level concurrency for a given function. All of your functions’ concurrent executions count against this account-level limit by default.

If you set a concurrency limit for a specific function, then that function’s concurrency limit allocation is deducted from the shared pool and assigned to that specific function. AWS also reserves 100 units of concurrency for all functions that don’t have a specified concurrency limit set. This helps to make sure that future functions have capacity to be consumed.

Going back to the example of the consuming services, you could set throttles for the functions as follows:

Amazon S3 function = 350
Amazon Kinesis function = 200
Amazon DynamoDB function = 200
Amazon Cognito function = 150
Total = 900

With the 100 reserved for all non-concurrency reserved functions, this totals the account limit of 1000.
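
If you prefer to apply these limits programmatically rather than through the console, the same setting is available via the PutFunctionConcurrency API. Here is a minimal sketch using the AWS SDK for Python (Boto3); the function names are hypothetical placeholders matching the services above:

import boto3

lambda_client = boto3.client('lambda')

# Hypothetical function names; replace with the functions in your own account.
reserved_limits = {
    's3-triggered-function': 350,
    'kinesis-triggered-function': 200,
    'dynamodb-triggered-function': 200,
    'cognito-triggered-function': 150,
}

for function_name, limit in reserved_limits.items():
    # Reserves a portion of the account-level concurrency for this function.
    lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=limit,
    )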

Here’s how this works. To start, create a basic Lambda function that is invoked via Amazon API Gateway. This Lambda function returns a single “Hello World” statement with an added sleep time between 2 and 5 seconds. The sleep time simulates an API providing some sort of capability that can take a varied amount of time. The goal here is to show how an API under load can reach its concurrency limit, and what happens when it does.
To create the example function

  1. Open the Lambda console.
  2. Choose Create Function.
  3. For Author from scratch, enter the following values:
    1. For Name, enter a value (such as concurrencyBlog01).
    2. For Runtime, choose Python 3.6.
    3. For Role, choose Create new role from template and enter a name aligned with this function, such as concurrencyBlogRole.
  4. Choose Create function.
  5. The function is created with some basic example code. Replace that code with the following:

import time
from random import randint

# Chosen once at function initialization (once per execution environment)
seconds = randint(2, 5)

def lambda_handler(event, context):
    time.sleep(seconds)
    return {
        "statusCode": 200,
        "body": "Hello world, slept " + str(seconds) + " seconds",
        "headers": {
            "Access-Control-Allow-Headers": "Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token",
            "Access-Control-Allow-Methods": "GET,OPTIONS",
        },
    }

  6. Under Basic settings, set Timeout to 10 seconds. While this function should only ever take up to 5-6 seconds (with the 5-second maximum sleep), this gives you a little extra room if it runs longer.

  7. Choose Save at the top right.

At this point, your function is configured for this example. Test it and confirm this in the console:

  1. Choose Test.
  2. Enter a name (it doesn’t matter for this example).
  3. Choose Create.
  4. In the console, choose Test again.
  5. You should see output similar to the following:

Now configure API Gateway so that you have an HTTPS endpoint to test against.

  1. In the Lambda console, choose Configuration.
  2. Under Triggers, choose API Gateway.
  3. Open the API Gateway icon now shown as attached to your Lambda function:

  4. Under Configure triggers, leave the default values for API Name and Deployment stage. For Security, choose Open.
  5. Choose Add, then Save.

API Gateway is now configured to invoke Lambda at the Invoke URL shown under its configuration. You can take this URL and test it in any browser or command line, using tools such as “curl”:


$ curl https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01
Hello world, slept 2 seconds

Throwing load at the function

Now start throwing some load against your API Gateway + Lambda function combo. Right now, your function is limited only by the total amount of concurrency available in the account. In this example account, 850 units of unreserved concurrency remain out of the full account limit of 1000, because a few concurrency limits have already been configured (in addition to the 100 units held back for all functions without a configured limit). You can find all of this information on the main Dashboard page of the Lambda console:

For generating load in this example, use an open source tool called “hey” (https://github.com/rakyll/hey), which works similarly to ApacheBench (ab). You test from an Amazon EC2 instance running the default Amazon Linux AMI from the EC2 console. For more help with configuring an EC2 instance, follow the steps in the Launch Instance Wizard.

After the EC2 instance is running, SSH into the host and run the following:


sudo yum install go
go get -u github.com/rakyll/hey

“hey” is easy to use. For these tests, specify a total number of requests (5,000) and a concurrency of 50 against the API Gateway URL, as follows (replace the URL with your own):


$ ./go/bin/hey -n 5000 -c 50 https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01

The output from “hey” tells you interesting bits of information:


$ ./go/bin/hey -n 5000 -c 50 https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01

Summary:
Total: 381.9978 secs
Slowest: 9.4765 secs
Fastest: 0.0438 secs
Average: 3.2153 secs
Requests/sec: 13.0891
Total data: 140024 bytes
Size/request: 28 bytes

Response time histogram:
0.044 [1] |
0.987 [2] |
1.930 [0] |
2.874 [1803] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
3.817 [1518] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
4.760 [719] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
5.703 [917] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
6.647 [13] |
7.590 [14] |
8.533 [9] |
9.477 [4] |

Latency distribution:
10% in 2.0224 secs
25% in 2.0267 secs
50% in 3.0251 secs
75% in 4.0269 secs
90% in 5.0279 secs
95% in 5.0414 secs
99% in 5.1871 secs

Details (average, fastest, slowest):
DNS+dialup: 0.0003 secs, 0.0000 secs, 0.0332 secs
DNS-lookup: 0.0000 secs, 0.0000 secs, 0.0046 secs
req write: 0.0000 secs, 0.0000 secs, 0.0005 secs
resp wait: 3.2149 secs, 0.0438 secs, 9.4472 secs
resp read: 0.0000 secs, 0.0000 secs, 0.0004 secs

Status code distribution:
[200] 4997 responses
[502] 3 responses

You can see a helpful histogram and latency distribution. Remember that this Lambda function has a random sleep period in it, so it isn’t entirely representative of a real-life workload. Those three 502s warrant digging deeper, but could be due to Lambda cold-start timing combined with the “seconds” variable being at its maximum of 5, causing those invocations to time out. AWS X-Ray and the Amazon CloudWatch logs generated by both API Gateway and Lambda could help you troubleshoot this.
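
If you want to chase those 502s from a script rather than the console, one rough approach is to search the function’s CloudWatch log group for timeout messages. This is only a sketch: it assumes the default /aws/lambda/<function-name> log group naming and the standard “Task timed out” wording that Lambda writes when a function times out.

import boto3

logs = boto3.client("logs")

# Lambda writes to /aws/lambda/<function-name> by default
response = logs.filter_log_events(
    logGroupName="/aws/lambda/concurrencyBlog01",
    filterPattern='"Task timed out"',
)

for event in response["events"]:
    print(event["timestamp"], event["message"].strip())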

Configuring a concurrency reservation

Now that you’ve established that you can generate this load against the function, I show you how to limit it and protect a backend resource from being overloaded by all of these requests.

  1. In the console, choose Configure.
  2. Under Concurrency, for Reserve concurrency, enter 25.

  3. Choose Save at the top right.

You could also set this with the AWS CLI using the Lambda put-function-concurrency command or see your current concurrency configuration via Lambda get-function. Here’s an example command:


$ aws lambda get-function --function-name concurrencyBlog01 --output json --query Concurrency
{
"ReservedConcurrentExecutions": 25
}

Either way, you’ve set the concurrency reservation for this function to 25. This acts as both a limit and a reservation: the function can always run up to 25 concurrent executions, and anything beyond that is throttled. Depending on the invoking service, throttling can result in a number of different outcomes, as shown in the documentation on Throttling Behavior. This change has also reduced the unreserved account concurrency available to your other functions by 25.
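
To confirm that effect on the account-level pool from code, a small boto3 sketch like the following would do; the fields printed are the documented outputs of get_account_settings and get_function:

import boto3

lambda_client = boto3.client("lambda")

# Account-wide view: the total limit and what remains unreserved
account = lambda_client.get_account_settings()["AccountLimit"]
print("Account limit:", account["ConcurrentExecutions"])
print("Unreserved:", account["UnreservedConcurrentExecutions"])

# Per-function view: the reservation configured above
concurrency = lambda_client.get_function(FunctionName="concurrencyBlog01").get("Concurrency", {})
print("Reserved for concurrencyBlog01:", concurrency.get("ReservedConcurrentExecutions"))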

Rerun the same load generation as before and see what happens. Previously, you tested at 50 concurrency, which worked just fine. By limiting the Lambda functions to 25 concurrency, you should see rate limiting kick in. Run the same test again:


$ ./go/bin/hey -n 5000 -c 50 https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01

While this test runs, refresh the Monitoring tab on your function detail page. You see the following warning message:

This is great! It means that your throttle is working as configured and you are now protecting your downstream resources from too much load from your Lambda function.
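
The same signal shows up outside the console in the AWS/Lambda Throttles metric. As a rough sketch (the 15-minute window and 60-second period are arbitrary choices for this example):

from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client("cloudwatch")

# Sum of throttled invocations for the function over the last 15 minutes
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/Lambda",
    MetricName="Throttles",
    Dimensions=[{"Name": "FunctionName", "Value": "concurrencyBlog01"}],
    StartTime=datetime.utcnow() - timedelta(minutes=15),
    EndTime=datetime.utcnow(),
    Period=60,
    Statistics=["Sum"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"])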

Here is the output from a new “hey” command:


$ ./go/bin/hey -n 5000 -c 50 https://ofixul557l.execute-api.us-east-1.amazonaws.com/prod/concurrencyBlog01
Summary:
Total: 379.9922 secs
Slowest: 7.1486 secs
Fastest: 0.0102 secs
Average: 1.1897 secs
Requests/sec: 13.1582
Total data: 164608 bytes
Size/request: 32 bytes

Response time histogram:
0.010 [1] |
0.724 [3075] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
1.438 [0] |
2.152 [811] |∎∎∎∎∎∎∎∎∎∎∎
2.866 [11] |
3.579 [566] |∎∎∎∎∎∎∎
4.293 [214] |∎∎∎
5.007 [1] |
5.721 [315] |∎∎∎∎
6.435 [4] |
7.149 [2] |

Latency distribution:
10% in 0.0130 secs
25% in 0.0147 secs
50% in 0.0205 secs
75% in 2.0344 secs
90% in 4.0229 secs
95% in 5.0248 secs
99% in 5.0629 secs

Details (average, fastest, slowest):
DNS+dialup: 0.0004 secs, 0.0000 secs, 0.0537 secs
DNS-lookup: 0.0002 secs, 0.0000 secs, 0.0184 secs
req write: 0.0000 secs, 0.0000 secs, 0.0016 secs
resp wait: 1.1892 secs, 0.0101 secs, 7.1038 secs
resp read: 0.0000 secs, 0.0000 secs, 0.0005 secs

Status code distribution:
[502] 3076 responses
[200] 1924 responses

This looks quite different from the previous load test. A large percentage of the requests failed fast because the concurrency throttle rejected them (those in the 0.724-second bucket of the histogram). The timing shown in the histogram is the total round trip from the EC2 instance to API Gateway, including the time for Lambda to reject the request. It’s also worth noting that this example used an edge-optimized endpoint in API Gateway. Under Status code distribution, you can see that 3076 of the 5000 requests failed with a 502, showing that the backend integration of API Gateway and Lambda rejected those requests.

Other uses

Managing function concurrency can be useful in a few other ways beyond just limiting the impact on downstream services and providing a reservation of concurrency capacity. Here are two other uses:

  • Emergency kill switch
  • Cost controls

Emergency kill switch

On occasion, due to issues with applications I’ve managed in the past, I’ve had a need to disable a certain function or capability of an application. By setting the concurrency reservation and limit of a Lambda function to zero, you can do just that.

With the reservation set to zero, every invocation of the Lambda function results in a throttle. You can then work on the related parts of the infrastructure or application that aren’t working, and reconfigure the concurrency limit to allow invocations again.
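
In practice, flipping the kill switch on and then off again could look something like this boto3 sketch, reusing the function from this walkthrough:

import boto3

lambda_client = boto3.client("lambda")

# Kill switch: a reservation of zero throttles every invocation
lambda_client.put_function_concurrency(
    FunctionName="concurrencyBlog01",
    ReservedConcurrentExecutions=0,
)

# Once the issue is resolved, remove the reservation so the function
# draws from the unreserved account pool again (or set a new limit)
lambda_client.delete_function_concurrency(FunctionName="concurrencyBlog01")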

Cost controls

While I’ve mentioned using concurrency limits to control the downstream impact on services or databases that your Lambda function calls, another resource you might want to be careful with is money. Setting a concurrency throttle is another way to help control costs during development and testing of your application.

You might want to guard against a function performing a recursive action too quickly, or against a development workload generating too much concurrency. You might also want to protect development resources connected to this function from generating too much cost, such as APIs that your Lambda function calls.

Conclusion

Concurrent executions as a unit of scale are a fairly unique characteristic of Lambda functions. Placing limits on how many concurrency “slices” your function can consume can prevent a single function from consuming all of the available concurrency in an account. Limits can also prevent a function from overwhelming a backend resource that isn’t as scalable.

Unlike monolithic applications or even microservices where there are mixed capabilities in a single service, Lambda functions encourage a sort of “nano-service” of small business logic directly related to the integration model connected to the function. I hope you’ve enjoyed this post and configure your concurrency limits today!

The Pi Towers Secret Santa Babbage

Post Syndicated from Mark Calleja original https://www.raspberrypi.org/blog/secret-santa-babbage/

Tired of pulling names out of a hat for office Secret Santa? Upgrade your festive tradition with a Raspberry Pi, thermal printer, and everybody’s favourite microcomputer mascot, Babbage Bear.

Raspberry Pi Babbage Bear Secret Santa

The name’s Santa. Secret Santa.

It’s that time of year again, when the cosiness gets turned up to 11 and everyone starts thinking about jolly fat men, reindeer, toys, and benevolent home invasion. At Raspberry Pi, we’re running a Secret Santa pool: everyone buys a gift for someone else in the office. Obviously, the person you buy for has to be picked in secret and at random, or the whole thing wouldn’t work. With that in mind, I created Secret Santa Babbage to do the somewhat mundane task of choosing gift recipients. This could’ve just been done with some names in a hat, but we’re Raspberry Pi! If we don’t make a Python-based Babbage robot wearing a jaunty hat and programmed to spread Christmas cheer, who will?

Secret Santa Babbage

Ho ho ho!

Mecha-Babbage Xmas shenanigans

The script the robot runs is pretty basic: a list of names entered as comma-separated strings is shuffled at the press of a GPIO button, then a name is popped off the end and stored as a variable. The name is matched to a photo of the person stored on the Raspberry Pi, and a thermal printer pinched from Alex’s super awesome PastyCam (blog post forthcoming, maybe) prints out the picture and name of the person you will need to shower with gifts at the Christmas party. (Well, OK — with one gift. No more than five quid’s worth. Nothing untoward.) There’s also a redo function, just in case you pick yourself: press another button and the last picked name — still stored as a variable — is appended to the list again, which is shuffled once more, and a new name is popped off the end.
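
The real script lives in the GitHub repo linked at the end of this post; as a stripped-down illustration of the draw-and-redo logic described above, here is a minimal sketch. The pin numbers, names, and printing step are assumptions for this example, not the actual build:

import random
from signal import pause

from gpiozero import Button, LED

# Placeholder names; the real build enters these as comma-separated strings
names = ["Alice", "Bob", "Carol", "Dave"]

status_led = LED(17)     # stays lit while the script is running (pin assumed)
busy_led = LED(27)       # lit while a request is being processed (pin assumed)
draw_button = Button(2)  # button hidden in Babbage's paw
redo_button = Button(3)  # press if you draw your own name

last_pick = None

def draw():
    global last_pick
    if not names:
        return  # everyone has already been drawn
    busy_led.on()
    random.shuffle(names)
    last_pick = names.pop()
    # The real build matches the name to a photo and sends it to the thermal printer
    print("You are buying for:", last_pick)
    busy_led.off()

def redo():
    # Put the last name back into the pool, reshuffle, and draw again
    if last_pick is not None:
        names.append(last_pick)
    draw()

status_led.on()
draw_button.when_pressed = draw
redo_button.when_pressed = redo
pause()  # wait for button presses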

Secret Santa Babbage prototyping

Prototyping!

As the build was a bit of a rush job undertaken at the request of our ‘Director of Vibe’ Emily, there are a few things I’d like to improve about this functionality that I didn’t get around to — more on that later. To add some extra holiday spirit to the project at the last minute, I used Pygame to play a WAV file of Santa’s jolly laugh while Babbage chooses a name for you. The file is included in the GitHub repo along with everything else, because ‘tis the season, etc., etc.
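
The Pygame part is only a couple of calls. A minimal sketch, with the WAV filename assumed rather than taken from the repo:

import pygame

pygame.mixer.init()
laugh = pygame.mixer.Sound("santa_laugh.wav")  # filename assumed for this sketch
laugh.play()  # played while Babbage chooses a name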

Secret Santa Babbage prototyping

Editor’s note: Considering these desk adornments, Mark’s Secret Santa gift-giver has a lot to go on.

Writing the code for Xmas Mecha-Babbage was fairly straightforward, though it uses some tricky bits for managing the thermal printer. You’ll need to install the drivers to make it go, as well as the CUPS package for managing the print hosting. You can find instructions for these things here, thanks to the wonderful Adafruit crew. Also, for reasons I couldn’t fathom, this will all only work on a Pi 2 and not a Pi 3, as there are some compatibility issues with the thermal printer otherwise. (I also tested the script on a Pi Zero W…no dice.)

Building a Christmassy throne

The hardest (well, fiddliest) parts of making the whole build were constructing the throne and wiring the bear. Using MakerCase, Inkscape, a bit of ingenuity, and a laser cutter, I was able to rig up a Christmassy plywood throne which has a hole through the seat so I could run the wires down from Babbage and to the Pi inside. I finished the throne by rubbing a couple of fingers of beeswax into it; as well as making the wood shine just a little bit and protecting it against getting wet, this had the added bonus of making it smell awesome.

Secret Santa Babbage inside

Next year’s iteration will be mulled wine–scented.

I next soldered two LEDs to some lengths of wire, and then ran the wires through holes at the top of the throne and down the back along a small channel I had carved with a narrow chisel to connect them to the Pi’s GPIO pins. The green LED will remain on as long as Babbage is running his program, and the red one will light up while he is processing your request. Once the red LED goes off again, the next person can have a go. I also laser-cut a final piece of wood to overlay the back of Babbage’s Xmas throne and cover the wiring a bit.

Creating a Xmas cyborg bear

Taking two 6 mm tactile buttons, I clipped the spiky metal legs off one side of each (the buttons were going into a stuffed Christmas toy, after all) and soldered a length of wire to each of the remaining legs. Next, I made a small incision into Babbage with my trusty Swiss army knife (in a place that actually made me cringe a little) and fed the buttons up into his paws. At some point in this process I was standing in the office wrestling with the bear and muttering to myself, which elicited some very strange looks from my colleagues.

Secret Santa Babbage throne

Poor Babbage…

One thing to note here: make sure the wires remain attached at the solder points while you push them up into Babbage’s paws. The first time I tried it, I snapped one of my connections and had to start again. It helped to remove some stuffing to make a tunnel, replace it afterwards, and support the joints with a fingertip as I poked the wire in. Finally, a couple of squirts of hot glue to keep Babbage’s furry cheeks firmly on the seat, and done!

Secret Santa Babbage

Next year: Game of Thrones–inspired candy cane throne

The Secret Santa Babbage masterpiece

The whole build process was the perfect holiday mix of cheerful and macabre, and while getting the thermal printer to work was a little time-consuming, the finished product definitely raised some smiles around the office and added a bit of interesting digital flavour to a staid office tradition. And it also helped people who are new to the office or from other branches of the Foundation to know for whom they will be buying a gift.

Secret Santa Babbage

Ready to dispense Christmas cheer!

There are a few ways in which I’ll polish this project before next year, such as having the script write the names to external text files to create a record that will persist in case of a reboot, and maybe having Secret Santa Babbage play you a random Christmas carol when you squeeze his paw instead of just laughing merrily every time. (I also thought about adding electric shocks for those people who are on the naughty list, but HR said no. Bah, humbug!)

Make your own

The code and laser cut plans for the whole build are available here. If you plan to make your own, let us know which stuffed toy you will be turning into a Secret Santa cyborg! And if you’ve been working on any other Christmas-themed Raspberry Pi projects, we’d like to see those too, so tag us on social media to share the festive maker cheer.

The post The Pi Towers Secret Santa Babbage appeared first on Raspberry Pi.

GPIO expander: access a Pi’s GPIO pins on your PC/Mac

Post Syndicated from Gordon Hollingworth original https://www.raspberrypi.org/blog/gpio-expander/

Use the GPIO pins of a Raspberry Pi Zero while running Debian Stretch on a PC or Mac with our new GPIO expander software! With this tool, you can easily access a Pi Zero’s GPIO pins from your x86 laptop without using SSH, and you can also take advantage of your x86 computer’s processing power in your physical computing projects.

A Raspberry Pi zero connected to a laptop - GPIO expander

What is this magic?

Running our x86 Stretch distribution on a PC or Mac, whether installed on the hard drive or as a live image, is a great way of taking advantage of a well controlled and simple Linux distribution without the need for a Raspberry Pi.

The downside of not using a Pi, however, is that there aren’t any GPIO pins with which your Scratch or Python programs could communicate. This is a shame, because it means you are limited in your physical computing projects.

I was thinking about this while playing around with the Pi Zero’s USB booting capabilities, having seen people employ the Linux gadget USB mode to use the Pi Zero as an Ethernet device. It struck me that, using the udev subsystem, we could create a simple GUI application that automatically pops up when you plug a Pi Zero into your computer’s USB port. Then the Pi Zero could be programmed to turn into an Ethernet-connected computer running pigpio to provide you with remote GPIO pins.

So we went ahead and built this GPIO expander application, and your PC or Mac can now have GPIO pins which are accessible through Scratch or the GPIO Zero Python library. Note that you can only use this tool to access the Pi Zero.

You can also install the application on a Raspberry Pi. Theoretically, you could connect a number of Pi Zeros to a single Pi and (without a USB hub) use a maximum of 140 pins (presumably four Zeros on the Pi’s four USB ports at 28 GPIO pins each, plus the host Pi’s own 28)! But I’ve not tested this — one for you, I think…

Making the GPIO expander work

If you’re using a PC or Mac and you haven’t set up x86 Debian Stretch yet, you’ll need to do that first. An easy way to do it is to download a copy of the Stretch release from this page and image it onto a USB stick. Boot from the USB stick (on most computers, you just need to press F10 during booting and select the stick when asked), and then run Stretch directly from the USB key. You can also install it to the hard drive, but be aware that installing it will overwrite anything that was on your hard drive before.

Whether on a Mac, PC, or Pi, boot through to the Stretch desktop, open a terminal window, and install the GPIO expander application:

sudo apt install usbbootgui

Next, plug in your Raspberry Pi Zero (don’t insert an SD card), and after a few seconds the GUI will appear.

A screenshot of the GPIO expander GUI

The Raspberry Pi USB programming GUI

Select GPIO expansion board and click OK. The Pi Zero will now be programmed as a locally connected Ethernet port (if you run ifconfig, you’ll see the new interface usb0 coming up).

What’s really cool about this is that your plugged-in Pi Zero is now running pigpio, which allows you to control its GPIOs through the network interface.

With Scratch 2

To utilise the pins with Scratch 2, just click on the start bar and select Programming > Scratch 2.

In Scratch, click on More Blocks, select Add an Extension, and then click Pi GPIO.

Two new blocks will be added: the first is used to set the output pin, the second is used to get the pin value (it is true if the pin is read high).

This is a simple application using a Pibrella I had hanging around:

A screenshot of a Scratch 2 program - GPIO expander

With Python

This is a Python example using the GPIO Zero library to flash an LED:

$ export GPIOZERO_PIN_FACTORY=pigpio
$ export PIGPIO_ADDR=fe80::1%usb0
$ python3
>>> from gpiozero import LED
>>> led = LED(17)
>>> led.blink()
A Raspberry Pi zero connected to a laptop - GPIO expander

The pinout command line tool is your friend

Note that in the code above the IP address of the Pi Zero is an IPv6 address and is shortened to fe80::1%usb0, where usb0 is the network interface created by the first Pi Zero.
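
If you’d rather not rely on environment variables, GPIO Zero can also be pointed at the remote pigpio daemon from within Python by passing a pin factory explicitly. A minimal sketch, assuming the same fe80::1%usb0 address as above:

from gpiozero import LED
from gpiozero.pins.pigpio import PiGPIOFactory

# Talk to the pigpio daemon running on the Pi Zero over the usb0 interface
factory = PiGPIOFactory(host="fe80::1%usb0")

led = LED(17, pin_factory=factory)
led.blink()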

With pigs directly

Another option you have is to use the pigpio library and the pigs application, redirecting the commands to the Pi Zero’s network port over IPv6. To do this, you’ll first need to set an environment variable for the redirection:

$ export PIGPIO_ADDR=fe80::1%usb0
$ pigs bc2 0x8000
$ pigs bs2 0x8000

With the commands above, you should be able to flash the LED on the Pi Zero: bc2 and bs2 clear and set pins in the second GPIO bank (GPIO 32 and above), and bit 0x8000 corresponds to GPIO 47, which drives the Pi Zero’s onboard ACT LED.

The secret sauce

I know there’ll be some people out there who would be interested in how we put this together. And I’m sure many people are interested in the ‘buildroot’ we created to run on the Pi Zero — after all, there are lots of things you can create if you’ve got a Pi Zero on the end of a piece of IPv6 string! For a closer look, find the build scripts for the GPIO expander here and the source code for the USB boot GUI here.

And be sure to share your projects built with the GPIO expander by tagging us on social media or posting links in the comments!

The post GPIO expander: access a Pi’s GPIO pins on your PC/Mac appeared first on Raspberry Pi.

[$] BPF-based error injection for the kernel

Post Syndicated from corbet original https://lwn.net/Articles/740146/rss

Diligent developers do their best to anticipate things that can go wrong
and write appropriate error-handling code. Unfortunately, error-handling
code is especially hard to test and, as a result, often goes untested; the
code meant to deal with errors, in other words, is likely to contain errors
itself. One way of finding those bugs is to inject errors into a running
system and watch how it responds; the kernel may soon have a new
mechanism for doing this sort of injection.