Slackware 15 released

Post Syndicated from original https://lwn.net/Articles/883731/rss

Version 15 of the
venerable Slackware distribution has been released.

The challenge this time around was to adopt as much of the good
stuff out there as we could without changing the character of the
operating system. Keep it familiar, but make it modern. And boy
did we have our work cut out for us. We adopted PAM (finally) as
projects we needed dropped support for pure shadow passwords. We
switched from ConsoleKit2 to elogind, making it much easier to
support software that targets that Other Init System and bringing
us up-to-date with the XDG standards. We added support for PipeWire
as an alternate to PulseAudio, and for Wayland sessions in addition
to X11.

A bit more information can be found in the release notes. Many of us
got our start with Slackware; it is good to see that
it’s still out there and true to form.

7Rapid Questions With Our APAC Sales Manager, Soumi

Post Syndicated from Rapid7 original https://blog.rapid7.com/2022/02/03/7rapid-questions-with-our-apac-sales-manager-soumi/

For this installment of 7Rapid Questions, we sat down with Soumi Mukherjee, APAC Sales Manager – ANZ North Sales, to learn more about what drives her in her role at Rapid7.

1. Why did you join Rapid7?

The truth is I joined for the people. I worked for a Rapid7 channel partner prior, and my interaction with the Rapid7 team back then gave me an impression of a company built on a culture of respect, trust, and high standards. I wanted to be a part of it!

2. Describe what your teams do in a few words?

We help customers achieve their cybersecurity goals by connecting them to the right products and solutions.

3. What can a candidate do to stand out in the interview process?

Be authentic, and bring your real self! If you’re asked to do a presentation round and you do your best pitch by whiteboarding and not slides, do what is best for you!

4. Which of our core values do you embody the most?

I am an obsessive learner, and I feel the spirit of “Never Done” speaks to me naturally. The cybersecurity landscape is constantly changing, so it’s important to continue learning and anticipating what’s next.

5. What is it that makes cybersecurity such an exciting field?

No boring day, in a nutshell! The evolving threat landscape means, to stay relevant, we have to keep innovating and improving. This aspect keeps me excited.

6. What three words would you use to describe the culture at Rapid7?

Accountability, collaboration, empathy.

7. What’s the best team activity you’ve done?

I miss the in-person team events and can’t wait to get back to them when we can. One of my favorites was a barista course. Our team even competed to see who could make the best latte art! It was an amazing event, and I even have a certificate!

Want to join Soumi and her team? We’re hiring! Browse our open roles at Rapid7 here.

How to Build the Right Tech Stack for Your MSP

Post Syndicated from Kari Rivas original https://www.backblaze.com/blog/how-to-build-the-right-tech-stack-for-your-msp/

As a managed service provider (MSP) or IT consultant, your bottom line depends on having the right tools at the right prices to maintain your margins while still providing the resources and functionality your clients need. And you’ve likely seen the resources and functionality your clients need changing over the past few years towards an increased focus on cybersecurity and disaster recovery.

More and more companies are hiring remotely, which increases security risks; ransomware attacks on small and medium-sized businesses (SMBs) have increased; and severe natural disasters are threatening on-premises office technology. Having the right tech stack for your MSP demonstrates to current and potential clients that they can trust you to safeguard their valuable data and systems against the threats of today and tomorrow.

Level up your value proposition with insights on building a competitive “right-sized” tech stack at our upcoming webinar, “The Essential MSP Tech Stack,” on Tuesday, February 15th at 1 p.m. CST/2 p.m. EST.

➔ Sign Up for the Webinar

Read on to get a preview of what will be covered during the webinar.

The Top Considerations for an Essential MSP Tech Stack

SMBs outsource their IT to MSPs and consultants because they don’t have the time, knowledge, or resources to shop around for the right tech solution for themselves. They may not even know what criteria they should be using to evaluate solutions, and this can lead to them shopping around among MSPs based on price alone.

Sourcing solutions with a lower cost to you means you can price your services more competitively and better attract customers. But pricing is just one of the considerations you should make when purchasing software. Have you also thought about scalability, and whether your tech stack can grow with you as your client base grows? Or what kinds of support options your software provider has available?

Pricing is important, yes, but there are several other factors by which you should judge your tech stack options, including features, automation options, and integrations, which will be covered in more detail during the webinar.

Right-sizing Your MSP Tech Stack

To develop your MSP offering, you’ll also want to think about what MSP services are most in demand in your area and what solutions you can offer the most efficiently and cost-effectively. It’s not “essential” to offer everything. The right tech stack is the one that brings you the most clients at the greatest profitability.

You may even want to do some research on the other MSPs in your geographic area. Is there something you can offer that they do not? Play to your strengths—what technical areas do you know the best?

As you start to develop your offering, consider the following areas of managed IT services and how they might help you attract clients:

Backup and Cloud Storage for MSPs

When it comes to managed backup and cloud storage, Backblaze and our partner, MSP360, have you covered. Backblaze provides easy and affordable server and workstation backup, and our integration with MSP360 provides a seamless experience to back up standalone and multiple servers to Backblaze B2 Cloud Storage.

MSPs and IT organizations with multiple servers can manage all of their machines from one centralized, web-based admin console. Backblaze B2 backups are “set it and forget it” after the initial setup. Data is kept in hot storage and available immediately when needed. And B2 Cloud Storage is extremely affordable at $5/TB per month without any additional fees or tiered pricing structure.

Our integration with MSP360 includes advanced backup protection features like flexible scheduling, compression, encryption, and ransomware protection. We’ve even made it super easy to get started on your own. Just use our online onboarding tool to create both Backblaze B2 and MSP360 accounts at the same time.

Bundling MSP Services to Streamline the Purchase Decision

Consider bundling your services to make it easier for clients to buy from you and understand how you’ll help protect their business. For instance, the joint solution from MSP360 and Backblaze can be bundled as part of a disaster recovery, backup, and storage package. You could also create tiers of services, like a “bronze” level disaster recovery, backup, and storage package; a “silver” level package that includes all of the above plus monitoring, tech management, and installation services; and a “gold” level package that functions essentially like fully outsourced IT.

Non-IT Tools for the MSP Tech Stack

Finally, as you build your MSP, don’t forget that your tech stack may need to include non-IT tools as well. You’ll need a way to oversee business accounting and your books, a way to manage your client relationships, leads, and sales, plus software to manage employees, payroll, and other aspects of general business management.

Ready to Upgrade Your Tech Stack?

Having the right tech stack isn’t a matter of checking all the boxes on a list of software. It’s a strategic decision about what your potential clients will most value, what you’re best equipped to offer, and how you can make a profit. Instead of trying to meet every possible need, ensure that you have the “right-sized” tech stack to service the types of clients you represent without paying extra for bloated software that may go unused. You can often have a healthier business by specializing in just a few areas and attracting the right types of clients, rather than trying to cater to everyone.

Want to learn more? Join our webinar on Tuesday, February 15th at 1 p.m. CST/2 p.m. EST to learn more about how to build the tech stack for your MSP.

The post How to Build the Right Tech Stack for Your MSP appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

[$] Stray-write protection for persistent memory

Post Syndicated from original https://lwn.net/Articles/883352/rss

Persistent memory has a number of advantages; it is fast, CPU-addressable,
available in large quantities and, of course, persistent. But it also,
arguably, poses a higher risk of suffering corruption as a result of bugs
in the kernel. Protecting against this possibility is the objective of this
patch set
from Ira Weiny, which makes use of Intel’s “protection keys
supervisor” (PKS) feature to make it harder for the kernel to inadvertently write
to persistent memory.

How ENGIE scales their data ingestion pipelines using Amazon MWAA

Post Syndicated from Anouar Zaaber original https://aws.amazon.com/blogs/big-data/how-engie-scales-their-data-ingestion-pipelines-using-amazon-mwaa/

ENGIE, one of the largest utility providers in France and a global player in the zero-carbon energy transition, produces, transports, and deals in electricity, gas, and energy services. With 160,000 employees worldwide, ENGIE is a decentralized organization and operates 25 business units with a high level of delegation and empowerment. ENGIE’s decentralized global customer base had accumulated lots of data, and it required a smarter, unique approach and solution to align its initiatives and provide data that is ingestible, organizable, governable, sharable, and actionable across its global business units.

In 2018, the company’s business leadership decided to accelerate its digital transformation through data and innovation by becoming a data-driven company. Yves Le Gélard, chief digital officer at ENGIE, explains the company’s purpose: “Sustainability for ENGIE is the alpha and the omega of everything. This is our raison d’être. We help large corporations and the biggest cities on earth in their attempts to transition to zero carbon as quickly as possible because it is actually the number one question for humanity today.”

ENGIE, like any other large enterprise, uses multiple extract, transform, and load (ETL) tools to ingest data into its data lake on AWS. However, these tools usually come with expensive licensing plans. “The company needed a uniform method of collecting and analyzing data to help customers manage their value chains,” says Gregory Wolowiec, the Chief Technology Officer who leads ENGIE’s data program. ENGIE wanted a license-free application, well integrated with multiple technologies and with a continuous integration, continuous delivery (CI/CD) pipeline, to more easily scale all its ingestion processes.

ENGIE started using Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to solve this issue and started moving various data sources from on-premises applications and ERPs, AWS services like Amazon Redshift, Amazon Relational Database Service (Amazon RDS), Amazon DynamoDB, external services like Salesforce, and other cloud providers to a centralized data lake on top of Amazon Simple Storage Service (Amazon S3).

Amazon MWAA is used in particular to collect and store harmonized operational and corporate data from different on-premises and software as a service (SaaS) data sources into a centralized data lake. The purpose of this data lake is to create a “group performance cockpit” that enables efficient, data-driven analysis and thoughtful decision-making by the ENGIE management board.

In this post, we share how ENGIE created a CI/CD pipeline for an Amazon MWAA project template using an AWS CodeCommit repository and plugged it into AWS CodePipeline to build, test, and package the code and custom plugins. In this use case, we developed a custom plugin to ingest data from Salesforce based on the Airflow Salesforce open-source plugin.

Solution overview

The following diagrams illustrate the solution architecture defining the implemented Amazon MWAA environment and its associated pipelines. It also describes the customer use case for Salesforce data ingestion into Amazon S3.

The following diagram shows the architecture of the deployed Amazon MWAA environment and the implemented pipelines.

The preceding architecture is fully deployed via infrastructure as code (IaC). The implementation includes the following:

  • Amazon MWAA environment – A customizable Amazon MWAA environment packaged with plugins and requirements and configured in a secure manner.
  • Provisioning pipeline – The admin team can manage the Amazon MWAA environment using the included CI/CD provisioning pipeline. This pipeline includes a CodeCommit repository plugged into CodePipeline to continuously update the environment and its plugins and requirements.
  • Project pipeline – This CI/CD pipeline comes with a CodeCommit repository that triggers CodePipeline to continuously build, test and deploy DAGs developed by users. Once deployed, these DAGs are made available in the Amazon MWAA environment.

The following diagram shows the data ingestion workflow, which includes the following steps:

  1. The DAG is triggered by Amazon MWAA manually or based on a schedule.
  2. Amazon MWAA initiates data collection parameters and calculates batches.
  3. Amazon MWAA distributes processing tasks among its workers.
  4. Data is retrieved from Salesforce in batches.
  5. Amazon MWAA assumes an AWS Identity and Access Management (IAM) role with the necessary permissions to store the collected data into the target S3 bucket.
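
Step 2 of the workflow above (calculating batches) can be sketched in plain Python. This is a minimal illustration, assuming monthly batches aligned on calendar months; the function name monthly_batches and the half-open window convention are our assumptions, not code taken from the ENGIE project.

```python
from datetime import date


def monthly_batches(end, months):
    """Return `months` consecutive (start, stop) windows, oldest first.

    Each window is half-open: it starts on the first day of a month and
    stops on the first day of the next one, so adjacent windows never overlap.
    """
    # Index months as year * 12 + (month - 1) so arithmetic crosses year boundaries.
    last = end.year * 12 + (end.month - 1)
    windows = []
    for index in range(last - months + 1, last + 1):
        start = date(index // 12, index % 12 + 1, 1)
        following = index + 1
        stop = date(following // 12, following % 12 + 1, 1)
        windows.append((start, stop))
    return windows


batches = monthly_batches(date(2022, 2, 3), months=6)
for start, stop in batches:
    print(start.isoformat(), "->", stop.isoformat())
```

Each window can then be handed to a separate worker task, which is what lets Amazon MWAA distribute the retrieval in step 3.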

This AWS Cloud Development Kit (AWS CDK) construct is implemented with the following security best practices:

  • With the principle of least privilege, you grant permissions to only the resources or actions that users need to perform tasks.
  • S3 buckets are deployed with security compliance rules: encryption, versioning, and blocking public access.
  • Authentication and authorization management is handled with AWS Single Sign-On (AWS SSO).
  • Airflow stores connections to external sources in a secure manner either in Airflow’s default secrets backend or an alternative secrets backend such as AWS Secrets Manager or AWS Systems Manager Parameter Store.

For this post, we step through a use case that ingests data from Salesforce into an ENGIE data lake in order to transform it and build business reports.

Prerequisites for deployment

For this walkthrough, the following are prerequisites:

  • Basic knowledge of the Linux operating system
  • Access to an AWS account with administrator or power user (or equivalent) IAM role policies attached
  • Access to a shell environment, or optionally AWS CloudShell

Deploy the solution

To deploy and run the solution, complete the following steps:

  1. Install AWS CDK.
  2. Bootstrap your AWS account.
  3. Define your AWS CDK environment variables.
  4. Deploy the stack.

Install AWS CDK

The described solution is fully deployed with AWS CDK.

AWS CDK is an open-source software development framework to model and provision your cloud application resources using familiar programming languages. If you want to familiarize yourself with AWS CDK, the AWS CDK Workshop is a great place to start.

Install AWS CDK using the following commands:

npm install -g aws-cdk
# To check the installation
cdk --version

Bootstrap your AWS account

First, you need to make sure the environment where you’re planning to deploy the solution has been bootstrapped. You only need to do this one time per environment where you want to deploy AWS CDK applications. If you’re unsure whether your environment has been bootstrapped already, you can always run the command again:

cdk bootstrap aws://YOUR_ACCOUNT_ID/YOUR_REGION

Define your AWS CDK environment variables

On Linux or macOS, define your environment variables with the following code:

export CDK_DEFAULT_ACCOUNT=YOUR_ACCOUNT_ID
export CDK_DEFAULT_REGION=YOUR_REGION

On Windows, use the following code:

setx CDK_DEFAULT_ACCOUNT YOUR_ACCOUNT_ID
setx CDK_DEFAULT_REGION YOUR_REGION
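
Before deploying, it can help to confirm that both variables are visible to the process that runs the cdk command. The following is a small sketch of such a check; the helper name bootstrap_target is hypothetical, and it only rebuilds the aws://ACCOUNT/REGION string used by the bootstrap command shown earlier.

```python
import os


def bootstrap_target(env=None):
    """Build the aws://ACCOUNT/REGION argument used by `cdk bootstrap`."""
    env = os.environ if env is None else env
    required = ("CDK_DEFAULT_ACCOUNT", "CDK_DEFAULT_REGION")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return f"aws://{env['CDK_DEFAULT_ACCOUNT']}/{env['CDK_DEFAULT_REGION']}"


# Example with explicit values instead of the real process environment.
example_env = {"CDK_DEFAULT_ACCOUNT": "123456789012", "CDK_DEFAULT_REGION": "eu-west-1"}
print(bootstrap_target(example_env))
```

Calling bootstrap_target() without arguments reads the real environment and raises an error that names whichever variable is missing.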

Deploy the stack

By default, the stack deploys a basic Amazon MWAA environment with the associated pipelines described previously. It creates a new VPC in order to host the Amazon MWAA resources.

The stack can be customized using the parameters listed in the following table.

To pass a parameter to the construct, you can use the AWS CDK runtime context. If you intend to customize your environment with multiple parameters, we recommend using the cdk.json context file with version control to avoid unexpected changes to your deployments. Throughout our example, we pass only one parameter to the construct. Therefore, for the simplicity of the tutorial, we use the --context or -c option to the cdk command, as in the following example:

cdk deploy -c paramName=paramValue -c paramName=paramValue ...

Parameter | Description | Default | Valid values
vpcId | VPC ID where the cluster is deployed. If none, a new VPC is created, which requires the cidr parameter. | None | VPC ID
cidr | The CIDR for the VPC that is created to host Amazon MWAA resources. Used only if vpcId is not defined. | 172.31.0.0/16 | IP CIDR
subnetIds | Comma-separated list of subnet IDs where the cluster is deployed. If none, looks for private subnets in the same Availability Zone. | None | Subnet IDs list (comma-separated)
envName | Amazon MWAA environment name | MwaaEnvironment | String
envTags | Amazon MWAA environment tags | None | See the following JSON example: '{"Environment":"MyEnv", "Application":"MyApp", "Reason":"Airflow"}'
environmentClass | Amazon MWAA environment class | mw1.small | mw1.small, mw1.medium, mw1.large
maxWorkers | Amazon MWAA maximum workers | 1 | int
webserverAccessMode | Amazon MWAA environment access mode (private or public) | PUBLIC_ONLY | PUBLIC_ONLY, PRIVATE_ONLY
secretsBackend | Amazon MWAA environment secrets backend | Airflow | SecretsManager
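
When several parameters are involved, the -c flags can be assembled programmatically rather than typed by hand. The sketch below is one way to do that, assuming the parameter names from the table; the helper cdk_deploy_command is hypothetical and only builds the argument list, it does not run the deployment.

```python
import json
import shlex


def cdk_deploy_command(context):
    """Return the argv list for `cdk deploy` with one -c flag per parameter."""
    argv = ["cdk", "deploy"]
    for name, value in context.items():
        # Dict-valued parameters such as envTags are passed as a JSON string.
        if isinstance(value, dict):
            value = json.dumps(value, separators=(",", ":"))
        argv += ["-c", f"{name}={value}"]
    return argv


argv = cdk_deploy_command({
    "vpcId": "YOUR_VPC_ID",
    "maxWorkers": 2,
    "envTags": {"Environment": "MyEnv", "Application": "MyApp"},
})
# shlex.join quotes the JSON value so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

Keeping the parameters in a dictionary also makes it straightforward to move them into the cdk.json context file later, as recommended above.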

Clone the GitHub repository:

git clone https://github.com/aws-samples/cdk-amazon-mwaa-cicd

Deploy the stack using the following command:

cd mwaairflow && \
pip install . && \
cdk synth && \
cdk deploy -c vpcId=YOUR_VPC_ID

The following screenshot shows the stack deployment:

The following screenshot shows the deployed stack:

Create solution resources

For this walkthrough, you need a Salesforce account.

If you don’t have a Salesforce account, you can create a Salesforce developer account:

  1. Sign up for a developer account.
  2. Copy the host from the email that you receive.
  3. Log in to your new Salesforce account.
  4. Choose the profile icon, then Settings.
  5. Choose Reset my Security Token.
  6. Check your email and copy the security token that you receive.

After you complete these prerequisites, you’re ready to create the following resources:

  • An S3 bucket for Salesforce output data
  • An IAM role and IAM policy to write the Salesforce output data on Amazon S3
  • A Salesforce connection on the Airflow UI to be able to read from Salesforce
  • An AWS connection on the Airflow UI to be able to write on Amazon S3
  • An Airflow variable on the Airflow UI to store the name of the target S3 bucket

Create an S3 bucket for Salesforce output data

To create an output S3 bucket, complete the following steps:

  1. On the Amazon S3 console, choose Create bucket.

The Create bucket wizard opens.

  2. For Bucket name, enter a DNS-compliant name for your bucket, such as airflow-blog-post.
  3. For Region, choose the Region where you deployed your Amazon MWAA environment, for example, US East (N. Virginia) us-east-1.
  4. Choose Create bucket.

For more information, see Creating a bucket.

Create an IAM role and IAM policy to write the Salesforce output data on Amazon S3

In this step, we create an IAM policy that allows Amazon MWAA to write on your S3 bucket.

  1. On the IAM console, in the navigation pane, choose Policies.
  2. Choose Create policy.
  3. Choose the JSON tab.
  4. Enter the following JSON policy document, and replace airflow-blog-post with your bucket name:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:ListBucket"],
          "Resource": ["arn:aws:s3:::airflow-blog-post"]
        },
        {
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:GetObject",
            "s3:DeleteObject"
          ],
          "Resource": ["arn:aws:s3:::airflow-blog-post/*"]
        }
      ]
    }

  5. Choose Next: Tags.
  6. Choose Next: Review.
  7. For Name, choose a name for your policy (for example, airflow_data_output_policy).
  8. Choose Create policy.
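
If you create several output buckets, the same policy document can be generated for any bucket name instead of being edited by hand. This is a minimal sketch that reproduces the JSON above; the function name s3_output_policy is our own. Note that s3:ListBucket applies to the bucket ARN itself, whereas the object actions apply to the objects under it (the /* suffix), which is why the policy has two statements.

```python
import json


def s3_output_policy(bucket_name):
    """Build the IAM policy document that lets Airflow write to one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading, writing, and deleting are object-level actions.
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


policy = s3_output_policy("airflow-blog-post")
print(json.dumps(policy, indent=2))
```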

Let’s attach the IAM policy to a new IAM role that we use in our Airflow connections.

  1. On the IAM console, choose Roles in the navigation pane and then choose Create role.
  2. In the Or select a service to view its use cases section, choose S3.
  3. For Select your use case, choose S3.
  4. Search for the name of the IAM policy that we created in the previous step (airflow_data_output_policy) and select the policy.
  5. Choose Next: Tags.
  6. Choose Next: Review.
  7. For Role name, choose a name for your role (airflow_data_output_role).
  8. Review the role and then choose Create role.

You’re redirected to the Roles section.

  9. In the search box, enter the name of the role that you created and choose it.
  10. Copy the role ARN to use later to create the AWS connection on Airflow.

Create a Salesforce connection on the Airflow UI to be able to read from Salesforce

To read data from Salesforce, we need to create a connection using the Airflow user interface.

  1. On the Airflow UI, choose Admin.
  2. Choose Connections, and then the plus sign to create a new connection.
  3. Fill in the fields with the required information.

The following table provides more information about each value.

Field | Mandatory | Description | Values
Conn Id | Yes | Connection ID to define and to be used later in the DAG | For example, salesforce_connection
Conn Type | Yes | Connection type | HTTP
Host | Yes | Salesforce host name | host-dev-ed.my.salesforce.com or host.lightning.force.com. Replace the host with your Salesforce host and don’t add the http:// as prefix.
Login | Yes | The Salesforce user name. The user must have read access to the Salesforce objects. | [email protected]
Password | Yes | The corresponding password for the defined user. | MyPassword123
Port | No | Salesforce instance port. By default, 443. | 443
Extra | Yes | Specify the extra parameters (as a JSON dictionary) that can be used in the Salesforce connection. security_token is the Salesforce security token for authentication. To get the Salesforce security token in your email, you must reset your security token. | {"security_token":"AbCdE..."}
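
Two details in this table are easy to get wrong: the Host field must not carry a scheme, and the Extra field must be valid JSON. The sketch below prepares the form values accordingly. It is only an illustration; the function name, the dictionary keys, and the example login are hypothetical and not part of Airflow or of the sample repository.

```python
import json
from urllib.parse import urlparse


def salesforce_connection_fields(host, login, password, security_token, port=443):
    """Return the values to enter in the Airflow connection form as a dict."""
    host = host.strip()
    # The Host field expects a bare host name, so drop any scheme and path.
    if "://" in host:
        host = urlparse(host).netloc
    host = host.rstrip("/")
    if not security_token:
        raise ValueError("security_token is mandatory in the Extra field")
    return {
        "conn_id": "salesforce_connection",
        "conn_type": "HTTP",
        "host": host,
        "login": login,
        "password": password,
        "port": port,
        "extra": json.dumps({"security_token": security_token}),
    }


fields = salesforce_connection_fields(
    "https://host-dev-ed.my.salesforce.com/",
    "user@example.com",
    "MyPassword123",
    "AbCdE",
)
print(fields["host"], fields["extra"])
```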

Create an AWS connection in the Airflow UI to be able to write on Amazon S3

An AWS connection is required to upload data into Amazon S3, so we need to create a connection using the Airflow user interface.

  1. On the Airflow UI, choose Admin.
  2. Choose Connections, and then choose the plus sign to create a new connection.
  3. Fill in the fields with the required information.

The following table provides more information about the fields.

Field | Mandatory | Description | Value
Conn Id | Yes | Connection ID to define and to be used later in the DAG | For example, aws_connection
Conn Type | Yes | Connection type | Amazon Web Services
Extra | Yes | The Region is required. You also need to provide the ARN of the role that we created earlier. | {"region":"eu-west-1", "role_arn":"arn:aws:iam::123456789101:role/airflow_data_output_role"}
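
A stray space or line break copied along with the role ARN is enough to make the role assumption fail, so it is worth building the Extra value rather than typing it. The following sketch does that under a simplifying assumption: the pattern only accepts ARNs in the standard aws partition, and the helper name aws_connection_extra is hypothetical.

```python
import json
import re

# Simplified pattern: standard "aws" partition, 12-digit account ID, role path/name.
ROLE_ARN_PATTERN = re.compile(r"^arn:aws:iam::\d{12}:role/[\w+=,.@/-]+$")


def aws_connection_extra(region, role_arn):
    """Return the JSON string for the Extra field of the AWS connection."""
    role_arn = role_arn.strip()  # remove whitespace picked up when copying
    if not ROLE_ARN_PATTERN.match(role_arn):
        raise ValueError(f"Not a valid IAM role ARN: {role_arn!r}")
    return json.dumps({"region": region, "role_arn": role_arn})


extra = aws_connection_extra(
    "eu-west-1",
    "arn:aws:iam::123456789101:role/airflow_data_output_role ",
)
print(extra)
```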

Create an Airflow variable on the Airflow UI to store the name of the target S3 bucket

We create a variable to set the name of the target S3 bucket. This variable is used by the DAG. So, we need to create a variable using the Airflow user interface.

  1. On the Airflow UI, choose Admin.
  2. Choose Variables, then choose the plus sign to create a new variable.
  3. For Key, enter bucket_name.
  4. For Val, enter the name of the S3 bucket that you created in a previous step (airflow-blog-post).

Create and deploy a DAG in Amazon MWAA

To be able to ingest data from Salesforce into Amazon S3, we need to create a DAG (Directed Acyclic Graph). To create and deploy the DAG, complete the following steps:

  1. Create a local Python DAG.
  2. Deploy your DAG using the project CI/CD pipeline.
  3. Run your DAG on the Airflow UI.
  4. Display your data in Amazon S3 (with S3 Select).

Create a local Python DAG

The provided SalesforceToS3Operator allows you to ingest data from Salesforce objects to an S3 bucket. Refer to standard Salesforce objects for the full list of objects you can ingest data from with this Airflow operator.

In this use case, we ingest data from the Opportunity Salesforce object. We retrieve the last 6 months’ data in monthly batches and we filter on a specific list of fields.

The DAG provided in the sample in the GitHub repository imports the last 6 months of the Opportunity object (one file per month) by filtering the list of retrieved fields.

This operator takes two connections as parameters:

  • An AWS connection that is used to upload ingested data into Amazon S3.
  • A Salesforce connection to read data from Salesforce.

The following table provides more information about the parameters.

Parameter | Type | Mandatory | Description
sf_conn_id | string | Yes | Name of the Airflow connection that has the following information: user name, password, security token
sf_obj | string | Yes | Name of the relevant Salesforce object (Account, Lead, Opportunity)
s3_conn_id | string | Yes | The destination S3 connection ID
s3_bucket | string | Yes | The destination S3 bucket
s3_key | string | Yes | The destination S3 key
sf_fields | string | No | The (optional) list of fields that you want to get from the object (Id, Name, and so on). If none (the default), then this gets all fields for the object.
fmt | string | No | The (optional) format that the S3 key of the data should be in. Possible values include CSV (default), JSON, and NDJSON.
from_date | date format | No | A specific date-time (optional) formatted input to run queries from for incremental ingestion. Evaluated against the SystemModStamp attribute. Not compatible with the query parameter and should be in date-time format (for example, 2021-01-01T00:00:00Z). Default: None
to_date | date format | No | A specific date-time (optional) formatted input to run queries to for incremental ingestion. Evaluated against the SystemModStamp attribute. Not compatible with the query parameter and should be in date-time format (for example, 2021-01-01T00:00:00Z). Default: None
query | string | No | A specific query (optional) to run for the given object. This overrides default query creation. Default: None
relationship_object | string | No | Some queries require relationship objects to work, and these are not the same names as the Salesforce object. Specify that relationship object here (optional). Default: None
record_time_added | boolean | No | Set this optional value to true if you want to add a Unix timestamp field to the resulting data that marks when the data was fetched from Salesforce. Default: False
coerce_to_timestamp | boolean | No | Set this optional value to true if you want to convert all fields with dates and datetimes into Unix timestamp (UTC). Default: False
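
To make the interaction between sf_obj, sf_fields, from_date, and to_date more concrete, here is a sketch of the kind of SOQL query these parameters could translate into. The operator’s actual query construction is not shown in this post, so the function build_soql and the choice of an inclusive lower bound and exclusive upper bound are assumptions made for illustration. The sketch also requires an explicit field list.

```python
def build_soql(sf_obj, sf_fields, from_date=None, to_date=None):
    """Compose a SOQL query that filters on SystemModStamp."""
    query = f"SELECT {', '.join(sf_fields)} FROM {sf_obj}"
    conditions = []
    # SOQL date-time values and date literals are written without quotes.
    if from_date:
        conditions.append(f"SystemModStamp >= {from_date}")
    if to_date:
        conditions.append(f"SystemModStamp < {to_date}")
    if conditions:
        query += " WHERE " + " AND ".join(conditions)
    return query


example_query = build_soql(
    "Opportunity",
    ["Id", "Name", "Amount"],
    from_date="2021-01-01T00:00:00Z",
    to_date="2021-02-01T00:00:00Z",
)
print(example_query)
```

Passing a value for the query parameter bypasses this kind of generated query entirely, which is why the table marks from_date and to_date as not compatible with it.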

The first step is to import the operator in your DAG:

from operators.salesforce_to_s3_operator import SalesforceToS3Operator

Then define your DAG default ARGs, which you can use for your common task parameters:

from datetime import timedelta

from airflow.utils.dates import days_ago

# These args will get passed on to each operator
# You can override them on a per-task basis during operator initialization
default_args = {
    'owner': '[email protected]',
    'depends_on_past': False,
    'start_date': days_ago(2),
    'retries': 0,
    'retry_delay': timedelta(minutes=1),
    'sf_conn_id': 'salesforce_connection',
    's3_conn_id': 'aws_connection',
    's3_bucket': 'salesforce-to-s3',
}
...

Finally, you define the tasks to use the operator.

The following examples illustrate some use cases.

Salesforce object full ingestion

This task ingests all the content of the Salesforce object defined in sf_obj. This selects all the object’s available fields and writes them into the defined format in fmt. See the following code:

...
salesforce_to_s3 = SalesforceToS3Operator(
    task_id="Opportunity_to_S3",
    sf_conn_id=default_args["sf_conn_id"],
    sf_obj="Opportunity",
    fmt="ndjson",
    s3_conn_id=default_args["s3_conn_id"],
    s3_bucket=default_args["s3_bucket"],
    s3_key=f"salesforce/raw/dt={s3_prefix}/{table.lower()}.json",
    dag=salesforce_to_s3_dag,
)
...
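
The three values accepted by fmt differ mainly in how records are laid out in the output file. The following sketch illustrates the difference on two made-up records using only the Python standard library. It is not the operator’s serialization code, and the exact output of the operator may differ in details such as column order or quoting.

```python
import csv
import io
import json

# Made-up records standing in for rows returned by Salesforce.
records = [
    {"Id": "006A", "Name": "First deal", "Amount": 1000},
    {"Id": "006B", "Name": "Second deal", "Amount": 2500},
]


def to_csv(rows):
    """One header line, then one comma-separated line per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """A single JSON array containing every record."""
    return json.dumps(rows)


def to_ndjson(rows):
    """One standalone JSON object per line (newline-delimited JSON)."""
    return "\n".join(json.dumps(row) for row in rows)


print(to_csv(records))
print(to_json(records))
print(to_ndjson(records))
```

NDJSON is convenient for large extracts because each line can be parsed on its own, without loading the whole file.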

Salesforce object partial ingestion based on fields

This task ingests specific fields of the Salesforce object defined in sf_obj. The selected fields are defined in the optional sf_fields parameter. See the following code:

...
salesforce_to_s3 = SalesforceToS3Operator(
    task_id="Opportunity_to_S3",
    sf_conn_id=default_args["sf_conn_id"],
    sf_obj="Opportunity",
    sf_fields=["Id","Name","Amount"],
    fmt="ndjson",
    s3_conn_id=default_args["s3_conn_id"],
    s3_bucket=default_args["s3_bucket"],
    s3_key=f"salesforce/raw/dt={s3_prefix}/{table.lower()}.json",
    dag=salesforce_to_s3_dag,
)
...

Salesforce object partial ingestion based on time period

This task ingests all the fields of the Salesforce object defined in sf_obj. The time period can be relative using from_date or to_date parameters or absolute by using both parameters.

The following example illustrates relative ingestion from the defined date:

...
salesforce_to_s3 = SalesforceToS3Operator(
    task_id="Opportunity_to_S3",
    sf_conn_id=default_args["sf_conn_id"],
    sf_obj="Opportunity",
    from_date="YESTERDAY",
    fmt="ndjson",
    s3_conn_id=default_args["s3_conn_id"],
    s3_bucket=default_args["s3_bucket"],
    s3_key=f"salesforce/raw/dt={s3_prefix}/{table.lower()}.json",
    dag=salesforce_to_s3_dag,
)
...

The from_date and to_date parameters support the Salesforce date-time format. Each can be either a specific date or a date literal (for example, TODAY, LAST_WEEK, LAST_N_DAYS:5). For more information about date formats, see Date Formats and Date Literals.

For the full DAG, refer to the sample in the GitHub repository.

This code dynamically generates tasks that run queries to retrieve the data of the Opportunity object in the form of 1-month batches.

The sf_fields parameter allows us to extract only the selected fields from the object.
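
The dynamic task generation can be pictured as producing one set of operator arguments per monthly window. The sketch below shows that idea without importing Airflow. The task ID scheme and the dt= partition value are assumptions made for illustration, and the sample in the GitHub repository may name things differently.

```python
from datetime import date


def task_kwargs(sf_obj, windows, sf_fields):
    """Yield one dict of operator keyword arguments per (start, stop) window."""
    for start, stop in windows:
        month = start.strftime("%Y-%m")
        yield {
            "task_id": f"{sf_obj}_to_S3_{month}",
            "sf_obj": sf_obj,
            "sf_fields": sf_fields,
            "from_date": f"{start.isoformat()}T00:00:00Z",
            "to_date": f"{stop.isoformat()}T00:00:00Z",
            "fmt": "ndjson",
            "s3_key": f"salesforce/raw/dt={month}/{sf_obj.lower()}.json",
        }


# Two example windows; a real DAG would compute six of them.
windows = [
    (date(2021, 12, 1), date(2022, 1, 1)),
    (date(2022, 1, 1), date(2022, 2, 1)),
]
tasks = list(task_kwargs("Opportunity", windows, ["Id", "Name", "Amount"]))
for kwargs in tasks:
    print(kwargs["task_id"], kwargs["s3_key"])
```

In a DAG file, each dictionary would be unpacked into SalesforceToS3Operator together with the connection IDs, bucket, and dag arguments shown in the earlier examples.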

Save the DAG locally as salesforce_to_s3.py.

Deploy your DAG using the project CI/CD pipeline

As part of the CDK deployment, a CodeCommit repository and CodePipeline pipeline were created in order to continuously build, test, and deploy DAGs into your Amazon MWAA environment.

To deploy the new DAG, the source code should be committed to the CodeCommit repository. This triggers a CodePipeline run that builds, tests, and deploys your new DAG and makes it available in your Amazon MWAA environment.

  1. Sign in to the CodeCommit console in your deployment Region.
  2. Under Source, choose Repositories.

You should see a new repository mwaaproject.

  3. Push your new DAG to the mwaaproject repository under dags. You can either use the CodeCommit console or the Git command line to do so:
    1. CodeCommit console:
      1. Choose the project CodeCommit repository name mwaaproject and navigate under dags.
      2. Choose Add file and then Upload file and upload your new DAG.
    2. Git command line:
      1. To be able to clone and access your CodeCommit project with the Git command line, make sure your Git client is properly configured. Refer to Setting up for AWS CodeCommit.
      2. Clone the repository with the following command after replacing <region> with your project Region:
        git clone https://git-codecommit.<region>.amazonaws.com/v1/repos/mwaaproject

      3. Copy the DAG file under dags and add it with the command:
        git add dags/salesforce_to_s3.py

      4. Commit your new file with a message:
        git commit -m "add salesforce DAG"

      5. Push the local file to the CodeCommit repository:
        git push

The new commit triggers a new pipeline that builds, tests, and deploys the new DAG. You can monitor the pipeline on the CodePipeline console.

  1. On the CodePipeline console, choose Pipeline in the navigation pane.
  2. On the Pipelines page, you should see mwaaproject-pipeline.
  3. Choose the pipeline to display its details.

After checking that the pipeline run is successful, you can verify that the DAG is deployed to the S3 bucket and therefore available on the Amazon MWAA console.

  1. On the Amazon S3 console, look for a bucket starting with mwaairflowstack-mwaaenvstackne and go under dags.

You should see the new DAG.

  1. On the Amazon MWAA console, choose DAGs.

You should be able to see the new DAG.

Run your DAG on the Airflow UI

Go to the Airflow UI and toggle on the DAG.

This triggers your DAG automatically.

Later, you can continue manually triggering it by choosing the run icon.

Choose the DAG and Graph View to see the run of your DAG.

If you run into any issues, you can check the logs of the failed tasks from the task instance context menu.

Display your data in Amazon S3 (with S3 Select)

To display your data, complete the following steps:

  1. On the Amazon S3 console, in the Buckets list, choose the name of the bucket that contains the output of the Salesforce data (airflow-blog-post).
  2. In the Objects list, choose the name of the folder that has the object that you copied from Salesforce (opportunity).
  3. Choose the raw folder and the dt folder with the latest timestamp.
  4. Select any file.
  5. On the Actions menu, choose Query with S3 Select.
  6. Choose Run SQL query to preview the data.
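If you prefer to preview the data programmatically, the same kind of query can be sent through the S3 API. The sketch below only builds the request parameters; the object key is a placeholder, the LIMIT value is arbitrary, and the commented boto3 call assumes boto3 is installed and credentials are configured.

```python
def build_select_request(bucket: str, key: str, limit: int = 5) -> dict:
    """Build keyword arguments for an S3 Select query over an ndjson object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": f"SELECT * FROM s3object s LIMIT {limit}",
        # The operator in this post writes ndjson, i.e. one JSON document per line.
        "InputSerialization": {"JSON": {"Type": "LINES"}},
        "OutputSerialization": {"JSON": {}},
    }


# The key below is a placeholder; use the path of a file you selected in the console.
request = build_select_request("airflow-blog-post", "path/to/opportunity.json")
print(request["Expression"])

# With boto3 available, the request could be sent like this:
# import boto3
# response = boto3.client("s3").select_object_content(**request)
# for event in response["Payload"]:
#     if "Records" in event:
#         print(event["Records"]["Payload"].decode("utf-8"))
```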

Clean up

To avoid incurring future charges, delete the AWS CloudFormation stack and the resources that you deployed as part of this post.

  1. On the AWS CloudFormation console, delete the stack MWAAirflowStack.

Alternatively, to clean up the deployed resources from the command line, you can run the following AWS CDK command:

cdk destroy MWAAirflowStack

Make sure you are in the root path of the project when you run the command.

After confirming that you want to destroy the CloudFormation stack, the solution’s resources are deleted from your AWS account.

The following screenshot shows the process of destroying the stack:

The following screenshot confirms the stack is undeployed.

  1. Navigate to the Amazon S3 console and locate the two buckets containing mwaairflowstack-mwaaenvstack and mwaairflowstack-mwaaproj that were created during the deployment.
  2. Select each bucket, delete its contents, then delete the bucket.
  3. Delete the IAM role created to write to the S3 buckets.

Conclusion

ENGIE discovered significant value by using Amazon MWAA, enabling its global business units to ingest data in more productive ways. This post presented how ENGIE scaled their data ingestion pipelines using Amazon MWAA. The first part of the post described the architecture components and how to successfully deploy a CI/CD pipeline for an Amazon MWAA project template using a CodeCommit repository and plug it into CodePipeline to build, test, and package the code and custom plugins. The second part walked you through the steps to automate the ingestion process from Salesforce using Airflow with an example. For the Airflow configuration, you used Airflow variables, but you can also use Secrets Manager with Amazon MWAA using the secretsBackend parameter when deploying the stack.

The use case discussed in this post is just one example of how you can use Amazon MWAA to make it easier to set up and operate end-to-end data pipelines in the cloud at scale. For more information about Amazon MWAA, check out the User Guide.


About the Authors

Anouar Zaaber is a Senior Engagement Manager in AWS Professional Services. He leads internal AWS, external partner, and customer teams to deliver AWS cloud services that enable the customers to realize their business outcomes.

Amine El Mallem is a Data/ML Ops Engineer in AWS Professional Services. He works with customers to design, automate, and build solutions on AWS for their business needs.

Armando Segnini is a Data Architect with AWS Professional Services. He spends his time building scalable big data and analytics solutions for AWS Enterprise and Strategic customers. Armando also loves to travel with his family all around the world and take pictures of the places he visits.

Mohamed-Ali Elouaer is a DevOps Consultant with AWS Professional Services. He is part of the AWS ProServe team, helping enterprise customers solve complex problems related to automation, security, and monitoring using AWS services. In his free time, he likes to travel and watch movies.

Julien Grinsztajn is an Architect at ENGIE. He is part of the Digital & IT Consulting ENGIE IT team working on the definition of the architecture for complex projects related to data integration and network security. In his free time, he likes to travel the oceans to meet sharks and other marine creatures.

Demonstrate your AWS Cloud Storage knowledge and skills with new digital badges!

Post Syndicated from Steve Roberts original https://aws.amazon.com/blogs/aws/demonstrate-your-aws-cloud-storage-knowledge-and-skills-with-new-digital-badges/

Are you a cloud storage professional or an on-premises storage pro who’s curious about cloud storage? Are you interested in demonstrating your AWS Storage knowledge and skills to potential employers and your community of peers? If so, I’d like to bring to your attention the recent launch of digital badges aligned to Learning Plans for Block Storage and Object Storage on AWS Skill Builder. According to a 2021 blog post by Indeed, cloud computing is the number one in-demand skill employers are looking for.

The new, verifiable, digital badges are available to everyone who scores at least 80 percent in the assessments associated with Learning Plans. The badges prove your knowledge and skills for Object Storage and/or Block Storage in the AWS Cloud. Badges, distributed and managed through Credly, carry with them metadata that enables verification of the issuer and the credential and lists the skills and knowledge demonstrated by the holder. Sharing badges on your résumé, in your peer community, and via social media helps you develop your career in cloud computing and celebrates your achievements. Some of you may be familiar with AWS re:Post, which launched during re:Invent 2021; your badges can be showcased in your AWS re:Post user profile too.

Object and Block Storage digital badges

AWS Skill Builder Learning Plans and digital badges for Block and Object Storage
Digital badges are available today for the Block Storage and Object Storage Learning Plans on AWS Skill Builder. Block Storage has a focus on Amazon Elastic Block Store (EBS), while Object Storage is focused on Amazon Simple Storage Service (Amazon S3). Both plans contain free learning content to help you build your knowledge in each of these areas and get ready for the assessments.

AWS Skill Builder offers a range of Learning Plans related to cloud computing skills. Learning Plans correspond to roles (architect, developer, etc.) and domains (databases, storage, etc.); each one is specifically designed to build your knowledge with a clear set of outcomes for you to achieve. Freely available, the Learning Plans and related assessments can be taken anywhere, anytime, providing equal and fair learning for all.

Badge assessments are linked to curriculum standards and are developed by service teams, field subject matter experts (SMEs), and content/curriculum SMEs. Therefore, employers can feel satisfied that the badges attained by a potential employee were awarded due to actual demonstrated skills and knowledge for Block and/or Object Storage. By the way, if you feel you have existing skills and knowledge and would prefer to skip straight to the assessment, you can. If you don’t pass, you’ll be guided to fill in your knowledge gaps, and you can then retake the assessment after 24 hours. To earn a badge, you need to score a minimum of 80 percent in the assessment.

The Block Storage and Object Storage Learning Plans are designed for you to take on your own, and you can track your own progress, making it easier to learn in your own time and manage your own learning development. They’re a great opportunity to refresh your skills, check your skills, or learn new ones.

Start collecting digital storage learning badges today
The Learning Plans and new digital badges for Block Storage and Object Storage help you showcase your in-demand knowledge and skills related to AWS Storage. As I mentioned earlier, enrollment in Learning Plans and the subsequent assessments are both free for everyone. Find out more, and get started, at https://aws.amazon.com/training/badges. And be sure to share your accomplishment by posting on social media with the hashtag #AWSTraining and show off your badges!

— Steve

Enhance Your Contact Center Solution with Automated Voice Authentication and Visual IVR

Post Syndicated from Soonam Jose original https://aws.amazon.com/blogs/architecture/enhance-your-contact-center-solution-with-automated-voice-authentication-and-visual-ivr/

Recently, the Accenture AWS Business Group (AABG) assisted a customer in developing a secure and personalized Interactive Voice Response (IVR) contact center experience that receives and processes payments and responds to customer inquiries.

Our solution uses Amazon Connect at its core to help customers efficiently engage with customer service agents. To ensure transactions are completed securely and to prevent fraud, the architecture provides voice authentication using Amazon Connect Voice ID and a visual portal to submit payments. The visual IVR feature allows customers to easily provide the required information online while the IVR is on standby. The solution also provides agents the information they need to effectively and efficiently understand and resolve callers’ inquiries, which helps improve the quality of their service.

Overview of solution

Our IVR is designed using Contact Flows on Amazon Connect and uses the following services:

  • Amazon Lex provides the voice-based intent analysis. Intent analysis is the process of determining the underlying intention behind customer interactions.
  • Amazon Connect integrates with other AWS services using AWS Lambda.
  • Amazon DynamoDB stores customer data.
  • Amazon Pinpoint notifies customers via text and email.
  • AWS Amplify provides the customized agent dashboard and generates the visual IVR portal.

Figure 1 shows how this architecture routes customer calls:

  1. Callers dial the main line to interact with the IVR in Amazon Connect.
  2. Amazon Connect Voice ID sets up a voiceprint for first-time callers or performs voice authentication for repeat callers for added security.
  3. Upon successful voice authentication, callers can proceed to IVR self-service functions, such as checking their account balance or making a payment. Amazon Lex handles the voice intent analysis.
  4. When callers make a payment request, they are given the option to be handed off securely to a visual IVR portal to process their payment.
  5. If a caller requests to be connected to an agent, the agent will be presented with the customer’s information and IVR interaction details on their agent dashboard.

Figure 1. Architecture diagram

Customer IVR experience

Figure 2 describes how callers navigate through the IVR:

  1. The IVR asks the caller the purpose of the call.
  2. The caller’s answer is sent for voice intent analysis. The IVR also attempts to authenticate the caller’s voice using Amazon Connect Voice ID. If authenticated, the caller is automatically routed to the correct flow based on the analyzed intent.
    • For the “Account Balance” flow, the caller is provided the account balance information.
    • For the “Make a Payment” flow, the caller can use the IVR or a visual IVR portal to process the payment. Upon payment completion, the caller is immediately notified via SMS or email that their transaction has completed.

Both flows allow the caller to be transferred to an agent. The caller also has the option to be called back when an agent becomes available or to choose a specific date and time for the callback.

Figure 2. Customer IVR experience diagram

The intelligent self-service IVR solution includes the following features:

  • The IVR can redirect callers to a payment portal for scenarios like making a payment while the IVR remains on standby.
  • IVR transaction tracking helps agents understand the current status of the caller’s transaction and quickly determine the caller’s situation.
  • Callers have the option to receive a call as soon as the next agent becomes available or they can schedule a time that works for them to receive a callback.
  • IVR activity logging gives agents a detailed summary of the caller’s actions within the IVR.
  • Transaction confirmation notifies callers of successful transactions via SMS or email.

Solution walkthrough

Amazon Connect Voice ID authenticates a caller’s voice as an added level of security. It requires 30 seconds of a caller’s speech to create the initial enrollment voiceprint and 10 seconds to authenticate. If there is not enough net speech to perform the voice authentication, the IVR asks the caller more questions, such as their first name and last name, until it has collected enough net speech.

The IVR falls back to dual tone multi-frequency (DTMF) input for the caller’s credentials in case the system cannot successfully authenticate. This can include information like the last four digits of their national identification number or postal code.

In contact flows, you will enable voice authentication by adding the “Set security behavior” contact block and specifying the authentication threshold, as shown in Figure 3.


Figure 3. Set security behavior contact block

Figure 4 shows the “Check security status” contact block, which determines if the user has been successfully authenticated or not. It also shows results that it may return if the caller is not successfully authenticated, including, “Not authenticated,” “Inconclusive,” “Not enrolled,” “Opted out,” and “Error.”


Figure 4. Check security status contact block

Providing a personalized experience for callers

To provide a personalized experience for callers, sample customer data is stored in a DynamoDB table. A Lambda function queries this table when callers call the contact center. The query returns information about the caller, such as their name, so the IVR can offer a customized greeting.

Transaction tracking

The Lambda function can also query the table to determine whether a customer previously called and attempted to make a payment but didn’t complete it successfully. This feature is called “transaction tracking.” Here’s how it works:

  • When the caller progresses through the “make a payment” flow, a field in the table is updated to reflect their transaction’s status.
  • If the payment is abandoned, the status in the table remains open, and the IVR prompts the caller to pick up where they left off the next time they call.
  • Once they have successfully completed their payment, we update the status in the table to “complete.”
  • When the IVR confirms that the caller’s payment has gone through, they will receive a confirmation via SMS and email. A Lambda function in the contact flow receives the caller’s phone number and email address. Then it distributes the confirmation messages via Amazon Pinpoint.
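The status handling described above can be sketched in a few lines. This is only an illustration: a plain dictionary stands in for the DynamoDB table, and the field name transaction_status, the sample phone number, and the prompt wording are assumptions rather than the solution’s actual schema.

```python
# In-memory stand-in for the DynamoDB customer table (hypothetical schema).
customers = {
    "+15555550100": {"name": "Alex", "transaction_status": None},
}


def start_payment(table: dict, phone: str) -> None:
    """Mark the transaction as open when the caller enters the payment flow."""
    table[phone]["transaction_status"] = "open"


def complete_payment(table: dict, phone: str) -> None:
    """Mark the transaction as complete once the payment has gone through."""
    table[phone]["transaction_status"] = "complete"


def greeting(table: dict, phone: str) -> str:
    """Pick a greeting; offer to resume if an earlier payment was abandoned."""
    record = table[phone]
    if record["transaction_status"] == "open":
        return f"Welcome back, {record['name']}. Would you like to finish your payment?"
    return f"Hello, {record['name']}. How can I help you today?"


phone = "+15555550100"
start_payment(customers, phone)
resume_prompt = greeting(customers, phone)   # the caller hung up mid-payment
complete_payment(customers, phone)
normal_prompt = greeting(customers, phone)   # the payment has been completed
print(resume_prompt)
print(normal_prompt)
```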

If a call is escalated to an agent, the “Check contact attributes” contact block in Figure 5 helps to check the caller’s intent and provide the agent with a customized whisper.


Figure 5. Agent whisper sample contact flow

Making payments via the payment portal

To make a payment, an Amazon Lex bot presents the caller with the option to provide payment details over the phone or through a visual IVR portal.

If they choose to use the visual IVR portal (Figure 6), they can enter their payment details while maintaining an open phone connection with the contact center, in case they need additional assistance. Here’s how it works:

  • When callers choose to use the payment portal, this triggers a Lambda function that generates a universally unique identifier (UUID) and provides the caller a unique PIN.
  • The UUID and PIN are stored in the DynamoDB table along with the caller’s information.
  • Another Lambda function generates a secure link using the UUID. It then uses Amazon Pinpoint to send the link to the caller over text message to their phone number on record. When they open the link, they are prompted to enter their unique PIN.
  • Then, the webpage makes an API call that validates the payment request by comparing the entered PIN to the PIN stored in the DynamoDB table.
  • Once validated, the caller can enter their payment information.

Figure 6. Visual IVR portal
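The UUID and PIN handshake described above might look roughly like the following. This is a minimal sketch under stated assumptions: a dictionary stands in for the DynamoDB table, the six-digit PIN length and the portal URL are hypothetical, and the SMS delivery through Amazon Pinpoint is left out.

```python
import hmac
import secrets
import uuid

PORTAL_BASE_URL = "https://payments.example.com/pay"  # hypothetical portal address


def create_payment_session(table: dict, phone: str):
    """Generate a UUID and a random PIN, and store both with the caller's number."""
    session_id = str(uuid.uuid4())
    pin = f"{secrets.randbelow(10**6):06d}"  # assumed format: six digits
    table[session_id] = {"phone": phone, "pin": pin}
    return session_id, pin


def payment_link(session_id: str) -> str:
    """Build the link that would be sent to the caller by text message."""
    return f"{PORTAL_BASE_URL}/{session_id}"


def validate_pin(table: dict, session_id: str, entered_pin: str) -> bool:
    """Compare the entered PIN with the stored one in constant time."""
    record = table.get(session_id)
    if record is None:
        return False
    return hmac.compare_digest(record["pin"].encode(), entered_pin.encode())


sessions = {}
session_id, pin = create_payment_session(sessions, "+15555550100")
print(payment_link(session_id))
print(validate_pin(sessions, session_id, pin))
```

Keeping the PIN out of the link means that someone who only sees the text message still cannot open the payment form.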

Figure 7 illustrates the visual IVR portal contact flow:

  • Every 10 seconds, a Lambda function checks the caller’s payment status. It provides the caller the option to escalate to an agent if they have questions.
  • If the caller does not fill out all the information when they hit “Submit Payment,” an IVR prompt will ask them to provide all payment details before proceeding.
  • The IVR phone call stays active until the user’s payment status is updated to “complete” in the DynamoDB table. This generates an IVR prompt stating that their payment was successful.

Figure 7. Visual IVR portal sample contact flow
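In the solution this loop lives in the contact flow, which invokes a Lambda function on each pass. The Python sketch below only mirrors the control logic; the wait_for_payment helper, the max_checks limit, and the injectable sleep function are illustrative assumptions.

```python
import time
from typing import Callable


def wait_for_payment(
    get_status: Callable[[], str],
    interval_seconds: float = 10.0,
    max_checks: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the payment status until it is "complete" or the checks run out."""
    for _ in range(max_checks):
        if get_status() == "complete":
            return True
        # In the real flow, this pause is where the caller can ask for an agent.
        sleep(interval_seconds)
    return False


# Simulated status lookups; a real check would read the DynamoDB record.
statuses = iter(["open", "open", "complete"])
pauses = []
done = wait_for_payment(lambda: next(statuses), sleep=pauses.append)
print(done, pauses)
```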

Generating a chat transcript for agents

When the customer’s call is escalated to an agent, the agent receives a chat transcript. Here’s how it works:

  • After the caller’s intent is captured at the start of the call, the IVR logs activity using a “Set contact attribute” contact block, which stores the value of $.Lex.SessionAttributes.transcript.
  • This transcript is used in a Lambda function to build a chat interface.
  • This transcript is shown on the agent’s dashboard, along with the Contact Control Panel (CCP) and a few key pieces of caller information.

Figure 8. IVR transcript

The agent’s customized dashboard and the visual IVR portal are deployed and hosted on Amplify. This allows us to seamlessly connect to our code repository and automate deployments after changes are committed. It also removes the need to configure Amazon Simple Storage Service (Amazon S3) buckets, an Amazon CloudFront distribution, and Amazon Route 53 DNS to host our front-end components.

This solution also offers callers the ability to opt in for a callback or to schedule a callback. A “Check queue status” contact block checks the current time in queue, and if it reaches a certain threshold, the IVR will offer a callback. The caller has the option to receive a call as soon as the next agent becomes available or to schedule a time to receive a callback. A Lex bot gathers the date and time slots, which are then passed to a Lambda function that will validate the proposed callback option.

Once confirmed, the scheduled callback is placed into a DynamoDB table along with the caller’s phone number. Another Lambda function scans the table every 5 minutes to see if there are any callbacks scheduled within that 5-minute time period. You’ll add an Amazon EventBridge trigger to the Lambda function that specifies a schedule expression like cron(0/5 8-17 ? * MON-FRI *), which means the Lambda function will execute every 5 minutes, Monday through Friday, from 8:00 AM to 5:55 PM (UTC).
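To see which times of day the minute and hour fields of that expression select, you can expand them by hand. The helper below is a small sketch that handles only the two field forms used here (a range such as 8-17 and an increment such as 0/5); it is not a general cron parser.

```python
def expand_field(field: str, max_value: int) -> list:
    """Expand a numeric cron field written as 'a-b' (range) or 'a/n' (increment)."""
    if "/" in field:
        start, step = field.split("/")
        return list(range(int(start), max_value + 1, int(step)))
    if "-" in field:
        first, last = field.split("-")
        return list(range(int(first), int(last) + 1))
    return [int(field)]


minutes = expand_field("0/5", 59)   # 0, 5, 10, ..., 55
hours = expand_field("8-17", 23)    # 8, 9, ..., 17
runs = [f"{hour:02d}:{minute:02d}" for hour in hours for minute in minutes]
print(runs[0], runs[-1], len(runs))  # 08:00 17:55 120
```

Because the hour range is inclusive, the function runs through the 17:00 hour as well. EventBridge rules evaluate cron expressions in UTC, so shift the hour range if your business hours are in another time zone.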

Conclusion

This solution helps you increase customer satisfaction by making it easier for callers to complete transactions over the phone. The visual IVR provides an added web-based support experience for submitting payments. It also improves the quality of service of your customer service agents by making relevant information available to agents during the call.

This solution also allows you to scale out the resources to handle increasing demand. Custom features can easily be added using serverless technology, such as Lambda functions or other cloud-native services on AWS.

Ready to get started? The AABG helps customers accelerate their pace of digital innovation and realize incremental business value from cloud adoption and transformation. Connect with our team at [email protected] to learn how to use machine learning in your products and services.

Looking for more architecture content? AWS Architecture Center provides reference architecture diagrams, vetted architecture solutions, Well-Architected best practices, patterns, icons, and more!

The final 4.4 stable kernel has been released

Post Syndicated from original https://lwn.net/Articles/883685/rss

With a lengthier-than-usual message, Greg Kroah-Hartman has released the
4.4.302 stable kernel; it will be the last
from the stable kernel team in the 4.4.x series. “Do not use it
anymore unless you really know what you are doing.” He notes that the
Civil Infrastructure Platform (CIP) project is considering maintaining
4.4 into the future; those interested should contact CIP. He also added
some statistics showing a nearly six-year lifetime for the branch with
8.44 changes per day from over 3500 developers.

It was a good kernel branch, helped out by many to work as well as it
has, thanks to all for your help with this. It has powered many
millions, maybe a few billion, devices out in the world, but now it’s
time to say good-bye.

AWS cloud services adhere to CISPE Data Protection Code of Conduct for added GDPR assurance

Post Syndicated from Chad Woolf original https://aws.amazon.com/blogs/security/aws-cloud-services-adhere-to-cispe-data-protection-code-of-conduct/

French version
German version

I’m happy to announce that AWS has declared 52 services under the Cloud Infrastructure Service Providers Europe Data Protection Code of Conduct (CISPE Code). This provides an independent verification and an added level of assurance to our customers that our cloud services can be used in compliance with the General Data Protection Regulation (GDPR).

Validated by the European Data Protection Board (EDPB) and approved by the French Data Protection Authority (CNIL), the CISPE Code assures organizations that their cloud infrastructure service provider meets the requirements applicable to personal data processed on their behalf (customer data) under the GDPR. The CISPE Code also raises the bar on data protection and privacy for cloud services in Europe, going beyond current GDPR requirements. For example:

  • Data in Europe: The CISPE Code goes beyond GDPR compliance by requiring cloud infrastructure service providers to give customers the choice to use services to store and process customer data exclusively in the European Economic Area (EEA).
  • Data privacy: The CISPE Code prohibits cloud infrastructure service providers from using customer data for data mining, profiling, or direct marketing.
  • Cloud infrastructure focused: The CISPE Code addresses the specific roles and responsibilities of cloud infrastructure service providers (not represented in more general codes).

These 52 AWS services have now been independently verified as complying with the CISPE Code. The verification process was conducted by Ernst & Young CertifyPoint (EY CertifyPoint), an independent, globally recognized monitoring body accredited by CNIL. AWS is bound by the CISPE Code’s requirements for the 52 declared services, and we are committed to bringing additional services into the scope of the CISPE compliance program.

About the CISPE Data Protection Code of Conduct

The CISPE Code is the first pan-European data protection code of conduct for cloud infrastructure service providers. In May 2021, the CISPE Code was approved by the EDPB, acting on behalf of the 27 data protection authorities across Europe; and in June 2021, the Code was formally adopted by the CNIL, acting as the lead supervisory authority.

EY CertifyPoint is accredited as an independent monitoring body for the CISPE Code by CNIL, based on criteria approved by the EDPB. EY CertifyPoint is responsible for supervising AWS’s ongoing compliance with the CISPE Code for all declared services.

AWS and the GDPR

To earn and maintain customer trust, AWS is committed to providing customers and partners an environment to deploy AWS services in compliance with the GDPR, and to build their own GDPR-compliant products, services, and solutions.

For more information, see the AWS General Data Protection Regulation (GDPR) Center.

Further information

A list of the 52 AWS services that are verified as compliant with the CISPE Code is available on the CISPE Public Register site.

AWS helps customers accelerate cloud-driven innovation and succeed at home and globally. You can read more about our ongoing commitments to protect EU customers’ data in the EU data protection section of the AWS Cloud Security site.



Les services cloud d’AWS adhèrent au code de conduite du CISPE sur la protection des données pour une garantie de conformité supplémentaire au RGPD.

par Chad Woolf

Je suis heureux d’annoncer qu’AWS a déclaré 52 services sous le Code de conduite sur la protection des données des fournisseurs de services d’infrastructure cloud en Europe (Code CISPE). Ceci donne une vérification indépendante et un niveau d’assurance supplémentaire à nos clients quant à la conformité de nos services cloud qu’ils utilisent avec le Règlement Général sur la Protection des Données (RGPD).

Validé par le Conseil Européen de la Protection des Données (CEPD) et approuvé par la Commission Nationale de l’Informatique et des Libertés (CNIL), le Code CISPE assure aux organisations que leur fournisseur de services d’infrastructure cloud répond aux exigences applicables aux données personnelles traitées en leur nom (données clients) sur base du RGPD. Le Code CISPE met la barre plus haut en matière de protection des données et de vie privée pour les services cloud en Europe, allant au-delà des exigences actuelles du RGPD. Par exemple :

  • Données en Europe : Le Code CISPE va au-delà de la conformité au RGPD en exigeant des fournisseurs de services d’infrastructure cloud qu’ils donnent aux clients le choix d’utiliser les services de stockage et de traitement des données clients exclusivement dans l’Espace Economique Européen (EEE).
  • Confidentialité des données : Le Code CISPE interdit aux fournisseurs de services d’infrastructure cloud d’utiliser les données clients pour l’exploration de données, le profilage ou le marketing direct.
  • Ciblage sur l’infrastructure cloud : Le Code CISPE traite des rôles et des responsabilités spécifiques des fournisseurs de services d’infrastructure cloud (non représentés dans des codes plus généraux).

Ces 52 services AWS ont aujourd’hui été vérifiés de manière indépendante comme étant conformes au Code CISPE. Le processus de vérification a été mené par Ernst & Young CertifyPoint (EY CertifyPoint), un organisme de contrôle indépendant et mondialement reconnu, accrédité par la CNIL. AWS est lié par les exigences du Code CISPE pour les 52 services déclarés, et nous nous engageons à faire entrer des services supplémentaires dans le champ d’application du programme de conformité CISPE.

À propos du Code de conduite sur la protection des données du CISPE

Le Code CISPE est le premier code de conduite paneuropéen sur la protection des données destiné aux fournisseurs de services d’infrastructure cloud. En mai 2021, le Code CISPE a été approuvé par le CEPD, agissant au nom des 27 autorités de protection des données à travers l’Europe ; et en juin 2021, le Code a été formellement adopté par la CNIL, agissant en tant qu’autorité de contrôle principale.

EY CertifyPoint est accrédité en tant qu’organisme indépendant de contrôle du Code CISPE par la CNIL, sur la base de critères approuvés par le CEPD. EY CertifyPoint est chargé de superviser la conformité permanente d’AWS au Code CISPE pour tous les services déclarés.

AWS et le RGPD

Pour gagner et conserver la confiance des clients, AWS s’engage à fournir aux clients et aux partenaires un environnement permettant de déployer les services AWS en conformité avec le RGPD, et de créer leurs propres produits, services et solutions conformes au RGPD.

Pour plus d’informations, consultez le Centre AWS sur le Règlement Général sur la Protection des Données (RGPD).

Informations complémentaires

Une liste des 52 services AWS qui ont été vérifiés comme étant conformes au code CISPE est disponible sur le site du registre public CISPE.

AWS aide ses clients à accélérer l’innovation basée sur le cloud et à réussir chez eux et dans le monde entier. Vous pouvez en savoir plus sur nos engagements continus en matière de protection des données des clients de l’UE dans la section Protection des Données de l’UE du site AWS Cloud Security.



 

AWS-Cloud-Dienste befolgen den CISPE-Verhaltenskodex für Datenschutz als zusätzliche Sicherheit bezüglich DSGVO

von Chad Woolf

Mit großer Freude darf ich verkünden, dass AWS 52 Dienste als im Einklang mit dem Verhaltenskodex für Cloud-Infrastruktur-Dienstanbieter in Europa (CISPE-Kodex) deklariert hat. Dies bietet unseren Kunden eine unabhängige Verifizierung und ein zusätzliches Maß an Sicherheit, dass unsere Cloud-Dienste in Übereinstimmung mit der Datenschutz-Grundverordnung (DSGVO) genutzt werden können.

Der CISPE-Kodex wurde vom Europäischen Datenschutzausschuss (EDSA) geprüft und von der französischen Datenschutzbehörde (CNIL) genehmigt. Er bietet Unternehmen die Sicherheit, dass ihr Cloud-Infrastruktur-Dienstanbieter die Anforderungen erfüllt, die für in ihrem Auftrag verarbeitete personenbezogene Daten (Kundendaten) gemäß der DSGVO gelten. Der CISPE-Kodex erhöht auch die Messlatte für Datenschutz für Cloud-Dienste in Europa, indem er über die aktuellen DSGVO-Anforderungen hinausgeht. Zum Beispiel:

  • Daten in Europa: Der CISPE-Kodex geht über die DSGVO-Konformität hinaus, indem er Cloud-Infrastruktur-Dienstanbieter dazu verpflichtet, ihren Kunden die Wahl zu geben, Dienste zur Speicherung und Verarbeitung von Kundendaten ausschließlich im Europäischen Wirtschaftsraum (EWR) zu nutzen.
  • Datenschutz: Der CISPE-Kodex verbietet Cloud-Infrastruktur-Dienstanbietern, Kundendaten für Data Mining, Profiling oder Direktmarketing zu verwenden.
  • Schwerpunkt auf Cloud-Infrastruktur: Der CISPE-Kodex adressiert die spezifischen Rollen und Verantwortlichkeiten von Cloud-Infrastruktur-Dienstanbietern (dies ist in allgemeineren Kodizes nicht abgebildet).

Für diese 52 AWS-Dienste wurde nun unabhängig verifiziert, dass sie mit dem CISPE-Kodex konform sind. Der Überprüfungsprozess wurde von Ernst & Young CertifyPoint (EY CertifyPoint) durchgeführt, einer unabhängigen, weltweit anerkannten Überprüfungsstelle, die von der CNIL akkreditiert ist. AWS ist an die Anforderungen des CISPE-Kodex für die 52 deklarierten Dienste gebunden, und wir sind bestrebt, zusätzliche Dienste in den Umfang des CISPE-Compliance-Programms aufzunehmen.

About the CISPE Data Protection Code of Conduct

The CISPE Code is the first pan-European code of conduct for cloud infrastructure service providers. In May 2021, the CISPE Code was approved by the EDPB, acting on behalf of the 27 data protection authorities across Europe. In June 2021, the Code was formally adopted by the CNIL, acting as the lead supervisory authority.

EY CertifyPoint is accredited by the CNIL as an independent monitoring body for the CISPE Code, based on criteria approved by the EDPB. EY CertifyPoint is responsible for monitoring AWS’s ongoing compliance with the CISPE Code for all declared services.

AWS and the GDPR

To earn and maintain customer trust, AWS is committed to providing customers and partners with an environment in which they can use AWS services in compliance with the GDPR and build their own GDPR-compliant products, services, and solutions.

For more information, see the AWS General Data Protection Regulation (GDPR) Center.

Further information

A list of the 52 AWS services verified as compliant with the CISPE Code is available on the CISPE Public Register website.

AWS helps customers accelerate cloud-driven innovation and succeed both at home and around the world. For more about our ongoing commitment to protecting EU customers’ data, see the EU data protection section of the AWS Cloud Security website.


If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Chad Woolf

Chad joined Amazon in 2010 and built the AWS compliance functions from the ground up, including audit and certifications, privacy, contract compliance, control automation engineering and security process monitoring. Chad’s work also includes enabling public sector and regulated industry adoption of the AWS cloud, and he leads the AWS trade and product compliance team.

 

Velociraptor Version 0.6.3: Dig Deeper With More Speed and Scalability

Post Syndicated from Carlos Canto original https://blog.rapid7.com/2022/02/03/velociraptor-version-0-6-3-dig-deeper-with-more-speed-and-scalability/

Rapid7 is very excited to announce the latest Velociraptor release 0.6.3. This release has been in the making for a few months now and has several exciting new features.

Scalability and speed have been the main focus of development since our previous release. Working with some of our larger partners on scaling Velociraptor to a large number of endpoints, we’ve addressed a number of challenges that we believe have improved Velociraptor for everyone at any level of scale.

Performance running on EFS

Running on a distributed filesystem such as EFS presents many advantages, not the least of which is removing the risk that disk space will run out. Many users previously faced disk full errors when running large hunts and accidentally collecting too much data from endpoints. Since Velociraptor is so fast, it’s quite easy to do a hunt collecting a large number of files, but before you know it, the disk may be full.

Using EFS removes this risk, since storage is essentially infinite (but not free). So there is a definite advantage to running the data store on EFS even when not running multiple frontends. When scaling to multiple frontends, EFS use is essential to facilitate a shared distributed filesystem among all the servers.

However, EFS presents some challenges. Although conceptually EFS behaves as a transparent filesystem, in reality the added network latency of EFS IO has caused unacceptable performance issues.

In this release, we employed a number of strategies to improve performance on EFS — and potentially other distributed filesystems, such as NFS. You can read all about the new changes here, but the gist is that added caching and delayed writing strategies help isolate the GUI performance from the underlying EFS latency, making the GUI snappy and quick even with slow filesystems.
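As a rough illustration of the general idea (not Velociraptor’s actual implementation), the Python sketch below shows a write-behind cache: reads and writes are served from memory, and changes reach the slow backing store only when flush() is called. The class name, its methods, and the dict used as a stand-in for the slow filesystem are all assumptions made for this example.

```python
class WriteBehindCache:
    """Toy write-behind cache in front of a slow key/value store."""

    def __init__(self, backend):
        self._backend = backend  # any dict-like object standing in for slow storage
        self._cache = {}         # in-memory copies of values
        self._dirty = set()      # keys changed since the last flush

    def get(self, key):
        # Only touch the slow backend on a cache miss.
        if key not in self._cache:
            self._cache[key] = self._backend[key]
        return self._cache[key]

    def set(self, key, value):
        # Writes return immediately; the backend is updated later.
        self._cache[key] = value
        self._dirty.add(key)

    def flush(self):
        # Push all pending writes to the backend and report how many there were.
        count = len(self._dirty)
        for key in sorted(self._dirty):
            self._backend[key] = self._cache[key]
        self._dirty.clear()
        return count


slow_store = {}
cache = WriteBehindCache(slow_store)
cache.set("client/C.1/label", "web")
pending_before_flush = dict(slow_store)  # still empty: nothing written yet
value_seen_by_gui = cache.get("client/C.1/label")
flushed = cache.flush()
```

The trade-off in any scheme like this is that data not yet flushed can be lost if the process stops, which is why the flush interval matters in practice.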

We encourage everyone to test the new release on an EFS backend, to assess the performance on this setup — there are many advantages to this configuration. While this configuration is still considered experimental, it’s running successfully in a number of environments.

Searching and indexing

More as a side effect of the EFS work, Velociraptor 0.6.3 moves the client index into memory. This means that searching for clients by DNS name or labels is almost instant, significantly improving the performance of these operations over previous versions.

VQL queries that walk over all clients are now very fast as well. For example, the following query iterates over all clients (maybe thousands!) and checks if their last IP came from a particular subnet:

SELECT * , split(sep=":", string=last_ip)[0] AS LastIp
FROM clients()
WHERE cidr_contains(ip=LastIp, ranges="192.168.1.0/16")

This query will complete in a few seconds even with a large number of clients.
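For readers less familiar with VQL, the same filter can be sketched in Python with the standard ipaddress module. This is only an analogy for what the query computes: the client records and the helper names below are made up for illustration. Note that Python needs strict=False to accept a range such as 192.168.1.0/16, whose host bits are set.

```python
import ipaddress


def host_part(last_ip):
    # Mirrors split(sep=":", string=last_ip)[0]; assumes IPv4 "address:port".
    return last_ip.split(":")[0]


def cidr_contains(ip, ranges):
    # True if ip falls inside any of the given CIDR ranges.
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(r, strict=False) for r in ranges)


# Hypothetical client records, for illustration only.
clients = [
    {"client_id": "C.1", "last_ip": "192.168.7.20:51234"},
    {"client_id": "C.2", "last_ip": "10.0.0.5:40000"},
]

matches = [
    c["client_id"]
    for c in clients
    if cidr_contains(host_part(c["last_ip"]), ["192.168.1.0/16"])
]
```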

The GUI search bar can now search for IP addresses (e.g. ip:192.168*), and the online only filter is much faster as a result.
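To give a feel for why an in-memory client list makes such searches cheap, here is a small Python sketch of a prefix-based search over a dictionary of clients. It is not how Velociraptor implements its index; the client records, the label: prefix, and the fallback to hostname matching are assumptions for this example. Only the ip: prefix with a wildcard comes from the text above.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory client records.
CLIENTS = {
    "C.1": {"host": "web01.example.com", "ip": "192.168.1.10", "labels": {"web"}},
    "C.2": {"host": "db01.example.com", "ip": "10.0.0.5", "labels": {"db"}},
    "C.3": {"host": "web02.example.com", "ip": "192.168.2.11", "labels": {"web", "dmz"}},
}


def search(term):
    """Return sorted client ids matching a search term."""
    field, _, pattern = term.partition(":")
    if field == "ip":
        return sorted(c for c, rec in CLIENTS.items() if fnmatchcase(rec["ip"], pattern))
    if field == "label":
        return sorted(c for c, rec in CLIENTS.items() if pattern in rec["labels"])
    # No recognised prefix: treat the whole term as a hostname wildcard.
    return sorted(c for c, rec in CLIENTS.items() if fnmatchcase(rec["host"], term))


by_ip = search("ip:192.168*")
by_label = search("label:db")
by_host = search("web*")
```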

Searching is much faster

Another benefit of rapid index searching is that we can now quickly estimate how many hosts will be affected by a hunt (calculated based on how many hosts are included and how many are excluded from the hunt). When users have multiple label groups, this helps to quickly understand how targeted a specific hunt is.
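The estimate described here is essentially set arithmetic over labels. A minimal Python sketch follows, assuming that a host is in scope when it carries at least one included label (or when no include labels are given) and none of the excluded labels. Those matching rules, the function name, and the sample data are assumptions, not taken from the Velociraptor source.

```python
def estimate_hunt_scope(client_labels, include, exclude):
    """Count hosts matching the include labels but none of the exclude labels."""
    included = {
        cid for cid, labels in client_labels.items()
        if not include or labels & include
    }
    excluded = {cid for cid, labels in client_labels.items() if labels & exclude}
    return len(included - excluded)


# Hypothetical label assignments.
client_labels = {
    "C.1": {"web"},
    "C.2": {"db"},
    "C.3": {"web", "dmz"},
}

web_not_dmz = estimate_hunt_scope(client_labels, include={"web"}, exclude={"dmz"})
all_not_dmz = estimate_hunt_scope(client_labels, include=set(), exclude={"dmz"})
```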

Estimating hunt scope

Regular expressions and Yara rules

Velociraptor artifacts are just a way of wrapping a VQL query inside a YAML file for ease of use. Artifacts accept parameters that are passed to the VQL itself, controlling how it runs.

Velociraptor artifacts accept a number of parameters of different types. Sometimes they accept a Windows path: for example, the Windows.EventLogs.EvtxHunter artifact accepts a Windows glob path like %SystemRoot%\System32\Winevt\Logs\*.evtx. In the same artifact, we can also provide a PathRegex, which is a regular expression.

A regular expression is not the same thing as a path at all. In fact, when users mix the two up and enter something like C:\Windows\System32 in a regular expression field, the result is not a valid expression for that path, because backslashes have a specific meaning in a regular expression.
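The pitfall is easy to reproduce outside Velociraptor. The sketch below uses Python’s re module as a stand-in (Velociraptor itself is written in Go, and regex engines differ in detail). In Python, this particular string happens to compile, because \W and \S are character classes, but it no longer matches the literal path. Escaping the backslashes gives the pattern the user actually meant.

```python
import re

path = r"C:\Windows\System32"

# Used directly as a pattern, "\W" means "any non-word character" and "\S"
# means "any non-whitespace character", so the literal path does not match.
naive_hit = re.search(path, path)

# re.escape() turns each backslash into an escaped, literal backslash.
escaped_pattern = re.escape(path)
escaped_hit = re.search(escaped_pattern, path)
```

Other paths fare worse: in Python, a pattern such as C:\Users is rejected outright because \U is not a recognised escape.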

In 0.6.3, there are now dedicated GUI elements for Regular Expression inputs. Special regex patterns, such as backslash sequences, are visually distinct. Additionally, the GUI verifies that the regex is syntactically correct and offers suggestions. Users can type ? to receive further regular expression suggestions and help them build their regex.

Entering regex in the GUI

To receive a RegEx GUI selector in your custom artifacts, simply denote the parameter’s type as regex.

Similarly, other artifacts require the user to enter a Yara rule to use the yara() VQL plugin. The Yara domain specific language (DSL) is rather verbose, so even for very simple search terms (e.g. a simple keyword search) a full rule needs to be constructed.

To help with this task, the GUI now presents a specific Yara GUI element. Users can press ? to automatically fill in a skeleton Yara rule suitable for a simple keyword match. Additionally, syntax highlighting gives visual feedback on the validity of the Yara syntax.

Entering Yara Rules in the GUI

Some artifacts allow file upload as a parameter to the artifact. This allows users to upload larger inputs, for example a large Yara rule-set. The content of the file will be made available to the VQL running on the client transparently.

To receive a Yara GUI selector in your custom artifacts, simply denote the parameter’s type as yara. To allow uploads in your artifact parameters, simply denote the parameter as an upload type. Within the VQL, the content of the uploaded file will be available as that parameter.

Overriding Generic.Client.Info

When a new client connects to the Velociraptor server, the server performs an Interrogation flow by scheduling the Generic.Client.Info artifact on it. This artifact collects basic metadata about the client, such as the type of OS it is, the hostname, and the version of Velociraptor. This information is used to feed the search index and is also displayed in the “VQL drilldown” page of the Host Information screen.

In the latest release, it’s possible to customize the Generic.Client.Info artifact, and Velociraptor will use the customized version instead to interrogate new clients. This allows users to add more deployment specific collections to the interrogate flow and customize the “VQL drilldown” page. Simply search for Generic.Client.Info in the View Artifact screen, and customize as needed.

Root certificates are now embedded

By default, Golang searches for root certificates from the running system so it can verify TLS connections. This behavior caused problems when running Velociraptor on very old unpatched systems that did not receive the latest Let’s Encrypt Root Certificate update. We decided it was safer to just include the root certs in the binary so we don’t need to rely on the OS itself.

Additionally, Velociraptor will now accept additional root certs embedded in its config file — just add all the certs in PEM format under the Client.Crypto.root_certs key in the config file. This helps deployments that must use a MITM proxy or traffic inspection proxies.

When adding a Root Certificate to the configuration file, Velociraptor will treat that certificate as part of the public PKI roots — therefore, you’ll need to have Client.use_self_signed_ssl as false.

This allows Velociraptor to trust the TLS connection — however, bear in mind that Velociraptor’s internal encryption channel is still present. The MITM proxy won’t be able to actually decode the data or interfere with the communications by injecting or modifying data. Only the outer layer of TLS encryption can be stripped by the MITM proxy.

VQL changes

Glob plugin improvements

The glob plugin now has a new option: recursion_callback. This allows much finer control over which directories to visit making file searches much more efficient and targeted. To learn more about it, read our previous Velociraptor blog post “Searching for Files.”
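The idea of steering a recursive search with a callback is not specific to VQL. As a loose analogy only (this is not the glob plugin’s API), the Python sketch below prunes directories during os.walk based on a caller-supplied function, so whole subtrees are never visited. The function names and the directory layout are invented for this example.

```python
import os
import tempfile


def walk_pruned(root, should_descend):
    """List files under root, skipping directories the callback rejects."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to enter pruned directories.
        dirnames[:] = sorted(
            d for d in dirnames if should_descend(os.path.join(dirpath, d))
        )
        for name in sorted(filenames):
            found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return found


with tempfile.TemporaryDirectory() as root:
    # Build a tiny tree: one directory we care about, one we want to skip.
    for rel in ("src/app.py", "node_modules/pkg/index.js"):
        full = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()

    files = walk_pruned(root, lambda d: os.path.basename(d) != "node_modules")
```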

Notable new artifacts

Many people use Velociraptor to collect and hunt for data from endpoints. Once the data is inspected and analyzed, often the data is no longer needed.

To help with the task of expiring old data, the latest release incorporates the Server.Utils.DeleteManyFlows and Server.Utils.DeleteMonitoringData artifacts that allow users to remove older collections. This helps manage disk usage and reduce ongoing costs.

Try it out!

If you’re interested in the new features, take Velociraptor for a spin by downloading it from our release page. It’s available for free on GitHub under an open source license.

As always, please file bugs on the GitHub issue tracker or submit questions to our mailing list by emailing [email protected]. You can also chat with us directly on our Discord server.


Security updates for Thursday

Post Syndicated from original https://lwn.net/Articles/883676/rss

Security updates have been issued by Debian (librecad), Fedora (flatpak, flatpak-builder, and glibc), Mageia (chromium-browser-stable, connman, libtiff, and rust), openSUSE (lighttpd), Oracle (cryptsetup, nodejs:14, and rpm), Red Hat (varnish:6), SUSE (kernel and unbound), and Ubuntu (linux, linux-aws, linux-aws-5.11, linux-aws-5.13, linux-gcp, linux-gcp-5.11, linux-hwe-5.13, linux-kvm, linux-oem-5.13, linux-oracle, linux-oracle-5.11, linux-raspi, linux, linux-aws, linux-aws-5.4, linux-bluefield, linux-gcp, linux-gcp-5.4, linux-gkeop, linux-gkeop-5.4, linux-hwe-5.4, linux-ibm, linux-kvm, linux-oracle, linux-oracle-5.4, linux, linux-aws, linux-aws-hwe, linux-azure, linux-dell300x, linux-gcp-4.15, linux-hwe, linux-kvm, linux-oracle, linux-raspi2, linux-snapdragon, linux-gke, linux-gke-5.4, mysql-5.7, mysql-8.0, python-django, and samba).

It’s back: The Hello World podcast for the computing education community

Post Syndicated from Janina Ander original https://www.raspberrypi.org/blog/hello-world-podcast-season-3-computing-education/

We set out last year to gather more stories, ideas, and inspiration from and for the computing education community in between Hello World magazine issues: we launched the Hello World podcast. On the podcast, we dive deeper into articles from Hello World, and we speak with people from all over the world who work as teachers, educators, and other computing education professionals.

Season 3 of the Hello World podcast starts on Monday

The Hello World podcast helps connect the global community of computing educators and Hello World readers, and lets them share their experiences. After two seasons and a short pause during the autumn, we are finally back with a brand-new Hello World podcast season. Regular listeners will also notice new theme music!

Each episode, we explore computing, coding, and digital making education by delving into an exciting topic together with our guests: experts, practitioners, and other members of the Hello World community.

In season 3, we’re exploring:

  • The role of makerspaces, both within schools and the wider community 
  • The relevance of imagination and storytelling to computing 
  • Computing in the context of science and ecology
  • How learners can promote and support computing as digital leaders
  • And much more…

Meet our guests for episode 1 of the new season

In our first episode, which will be available from 7 February, your hosts Carrie Anne and James ask the question “What role do makerspaces play in the classroom?”. We talk to two fantastic guests, each with a wealth of experience in designing and developing makerspaces:

Nick Provenzano

Nick Provenzano, who is a Teacher and Makerspace Director at University Liggett School in Michigan. He is also an author, makerspace builder, international keynote speaker and Raspberry Pi Certified Educator.

Chris Hillidge

Chris Hillidge, who established FabLab Warrington in 2016 and manages the STEM strategy for students aged 4 to 19 across The Challenge Academy Trust. Chris is a Specialist Leader of Education, consultant, and Raspberry Pi Certified Educator.

If you’ve not tried out the Hello World podcast yet, why not get started by diving into one of our most popular episodes?

You’ll find the upcoming season and past episodes on your favourite podcast platform, where you can also subscribe to never miss an episode. Alternatively, you can listen via your browser at helloworld.cc/podcast.

The post It’s back: The Hello World podcast for the computing education community appeared first on Raspberry Pi.

Need to Keep Analytics Data in the EU? Cloudflare Zaraz Can Offer a Solution

Post Syndicated from Yair Dovrat original https://blog.cloudflare.com/keep-analytics-tracking-data-in-the-eu-cloudflare-zaraz/

A recent decision from the Austrian Data Protection Authority (the Datenschutzbehörde) has network engineers scratching their heads and EU companies that use Google Analytics scrambling. The Datenschutzbehörde found that an Austrian website’s use of Google Analytics violates the EU General Data Protection Regulation (GDPR) as interpreted by the “Schrems II” case because Google Analytics can involve sending full or truncated IP addresses to the United States.

While disabling such trackers might be one (extreme) solution, doing so would leave website operators blind to how users are engaging with their site. A better approach: find a way to use tools like Google Analytics, but do so with an approach that protects the privacy of personal information and keeps it in the EU, avoiding a data transfer altogether. Enter Cloudflare Zaraz.

But before we get into just how Cloudflare Zaraz can help, we need to explain a bit of the background for the Datenschutzbehörde’s ruling, and why it’s a big deal.

What are the privacy and data localization issues?

The GDPR is a comprehensive data privacy law that applies to EU residents’ personal data, regardless of where it is processed. The GDPR itself does not insist that personal data must be processed only in Europe. Instead, it provides a number of legal mechanisms to ensure that GDPR-level protections are available for EU personal data if it is transferred outside the EU to a third country like the United States. Data transfers from the EU to the US were, until the 2020 “Schrems II” decision, permitted under an agreement called the EU-US Privacy Shield Framework.

The Schrems II decision refers to the July 2020 decision by the Court of Justice of the European Union that invalidated the EU-US Privacy Shield. The Court found that the Privacy Shield was not an effective means to protect EU data from US government surveillance authorities once data was transferred to the US, and therefore that under the Privacy Shield, EU personal data would not receive the level of protection guaranteed by the GDPR. However, the Court upheld other valid transfer mechanisms that are designed to allow EU personal data to be transferred to the US in a way that is consistent with the GDPR and that ensure EU personal data won’t be accessed by US government authorities in a way that violates the GDPR. One of those was the use of Standard Contractual Clauses, which are legal agreements approved by the EU Commission that enable data transfers, but they can only be used if supplementary measures are also in place.

Following the Schrems II case, the “NOYB” advocacy group founded by Max Schrems (the lawyer and activist who brought the legal action against Facebook that ultimately ended with the Schrems II ruling) filed 101 complaints against European websites that used Google Analytics and Facebook Connect trackers on the grounds that use of these trackers violates the Schrems II ruling because they send EU personal data to the United States without putting in place sufficient supplementary measures.

That issue of supplementary measures figured prominently in the Austrian data regulator’s decision. In its decision, the Datenschutzbehörde said that a European company could not use Google Analytics on its Austrian website because Google Analytics was sending the IP addresses of visitors to that website to Google’s servers in the United States. The Datenschutzbehörde reiterated earlier case law out of the EU that IP addresses can be sufficiently linked to individuals and therefore constitute personal data, so the GDPR applies. The regulator also found that IP addresses are not pseudonymous, and that Google doesn’t have sufficient supplementary measures in place to prevent US government authorities from accessing the data. As a result, the regulator found the use of Google Analytics and the transmission of IP addresses to the United States in this case violated the GDPR as interpreted by the Schrems II case. Since the Datenschutzbehörde announced its decision, Norway’s data protection authority announced it is joining the Austrian decision.

Google Analytics decision sets worrisome precedent

It’s important to remember that the Austrian ruling relates to one website’s use and implementation of Google Analytics. It is not a ban on Google Analytics throughout Europe. But is it a harbinger of more sweeping actions from data regulators? Any website might use dozens of third-party tools. If any of the third-party tools are transferring personal data to the US, they could attract the attention of an EU data regulator. Even if those tools are not collecting personal data or sensitive information intentionally, there remains a concern with the use of third-party tools, which stems from how the Internet is built and operates.

Every time a user loads a website, those tools load and establish a connection between the end user’s browser and the third-party server. This connection is used for multiple purposes, such as requesting a script, reporting analytics data, or downloading an image pixel. In every such communication, the IP address of the visitor is exposed. This is how communication between a browser and a server has worked over the Internet since the Internet’s infancy.

The implications of the decision are therefore profound. If other European regulators adopt the Austrian ruling, and its conclusion that even the transfer of truncated IP addresses to the United States could constitute transfers of personal data that violate GDPR, the industry will likely need to fundamentally rethink current Internet architecture and the way IP addresses are used. Cloudflare increasingly believes that we’ll eventually solve these challenges by completely disassociating IP addresses from identity. We’ve partnered with others in the industry to pioneer new protocols like Oblivious DNS over HTTPS that divorce IP addresses from content being queried online to help begin to make this future a reality.

While we can envision this future, our customers need immediate ways to address regulators’ concerns. The median website in 2021 used 21 third-party solutions on mobile and 23 on desktop. At the 90th percentile, these numbers climbed to 89 third-party solutions on mobile, and 91 on desktop. Taking into account the Austrian DPA ruling, according to which the EU company itself is responsible for making sure no personal data is transmitted to the United States without proper handling, we can conclude that companies may soon become responsible for every third-party solution implemented on their website. And since this is a staggering number of tools, it demands a scalable solution. Luckily, that is exactly what we have built.

Zaraz’s solution leverages Cloudflare’s global network and Workers platform

Zaraz is a third-party manager, built for speed, privacy and security. With Zaraz, customers can load analytics tools, advertising pixels, interactive widgets, and many other types of third-party tools without making any changes to their code.

Zaraz loads third-party tools in the cloud, using Cloudflare Workers. There are multiple reasons why we chose to build on Workers, and you can read more about it in this blog post. By using Workers to offload third-party tools to the cloud and away from the browser, Zaraz creates an extra layer of security and control over Personally Identifiable Information (PII), Protected Health Information (PHI), or other sensitive pieces of information that are often unintentionally passed to third-party vendors.

In the traditional way of loading third-party tools, either via a Tag Management Software (TMS), a Customer Data Platform (CDP) or by including JavaScript snippets directly in the HTML, the browser always sends requests to the third-party domain. This is problematic for a bunch of reasons, but mainly because even if you wanted to, you can’t hide the user’s IP address. It is revealed with every HTTP request. It is also problematic because those tools execute remote JavaScript resources, and you have almost no visibility over the actions they take in the browser or the data they transmit.

We can use the Google Analytics example to illustrate the difference. When a website loads Google Analytics, either via Google Tag Manager or directly from the HTML, the browser downloads the analytics.js file that loads Google Analytics. It then sends an HTTP POST request from the browser to Google’s endpoint: https://www.google-analytics.com/collect. Both of these requests reveal the end user’s IP address and may carry personal data, such as the Google Client ID, for example as query parameters appended to the URL.

In comparison, when you use Zaraz to load Google Analytics, there’s simply no communication at all between the browser and Google’s endpoint. Instead, Zaraz works as an intermediary, and the entire communication is between Zaraz (which runs on Workers servers, isolated from the browser) and the third party. You can think of Zaraz as an extra protection layer between the browser and the third-party endpoint, and this extra layer allows us to include some powerful privacy features.

For example, Zaraz allows customers to decide whether to transfer an end user’s IP address to Google Analytics or not. As simple as that. When configuring a new third-party tool like Google Analytics, you can choose in the tool’s settings page to hide IP addresses.
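Conceptually, once an intermediary sits between the browser and the vendor, the visitor’s IP address reaches the vendor only if the intermediary chooses to include it. The Python sketch below illustrates that one decision. It is a simplified model, not Zaraz’s code: the function, the payload field names, and the hide_ip flag are all hypothetical.

```python
def build_outbound_event(page_url, client_ip, hide_ip):
    """Build the payload a server-side intermediary forwards to a vendor."""
    event = {"page_url": page_url}
    # The vendor sees the intermediary's own connection, so the visitor's IP
    # is shared only when it is explicitly added to the payload.
    if not hide_ip:
        event["client_ip"] = client_ip
    return event


hidden = build_outbound_event("https://example.com/pricing", "203.0.113.7", hide_ip=True)
shown = build_outbound_event("https://example.com/pricing", "203.0.113.7", hide_ip=False)
```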

You can currently use this feature with Google Analytics and the Facebook Pixel/Conversion API. But with more and more tools opening up their APIs and allowing server-to-server integrations, we expect the number of tools that support this feature to grow rapidly.

A somewhat similar feature Zaraz offers is the Zaraz Data Loss Prevention (DLP) feature, currently used by several of our Enterprise customers. The DLP feature scans every request going to a third-party endpoint to make sure it doesn’t include sensitive information such as names, email addresses, Social Security numbers, credit card numbers, IP addresses, and more. Using this feature, customers can either mask the data or simply be alerted when a tool is collecting such personal data. It gives full visibility and control over the information shared with third parties.
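Pattern-based scanning of this kind can be sketched in a few lines of Python. The example below is a simplified illustration, not Zaraz’s DLP engine: the two regular expressions are deliberately loose, and the function name and mask format are assumptions. It returns both a masked copy of the outgoing data and the list of pattern types found, mirroring the "mask or alert" choice described above.

```python
import re

# Deliberately simple patterns; real DLP rules would be stricter.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def scan_and_mask(payload):
    """Return (masked_payload, sorted list of pattern names that matched)."""
    findings = []
    masked = payload
    for name, pattern in PATTERNS.items():
        if pattern.search(masked):
            findings.append(name)
            masked = pattern.sub("[" + name + "]", masked)
    return masked, sorted(findings)


masked, findings = scan_and_mask("user=jane@example.com&ip=203.0.113.7&page=/home")
clean, no_findings = scan_and_mask("page=/home")
```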

How Zaraz Can Help with Data Localization

Right now, you might be asking yourself, “wait, but how is Cloudflare different from Google, and won’t end users’ logs go to Cloudflare’s US servers as well?” This is a great question, and where the combination of Zaraz with the Cloudflare global network makes us shine. We offer Enterprise customers Zaraz in combination with two powerful features of Cloudflare’s Data Localisation Suite: Regional Services, and the Customer Metadata Boundary.

Cloudflare Regional Services allows you to choose where you want the Cloudflare services to run, including the Zaraz service. To meet your compliance obligations, you may need control over where your data is inspected. Cloudflare Regional Services helps you decide where your data should be handled, without losing the performance benefits our network provides.

Let’s say you run a website for a European bank. Let’s also assume you enabled the Data Localisation Suite for the EU. When a person in the EU visits your website, an HTTP request is sent to activate Zaraz. Since Zaraz is running in a first-party context, meaning under your own domain, all the Data Localisation settings will apply to it as well. So the network will direct the traffic to the EU, without inspecting its content, and run Zaraz there.

The EU Customer Metadata Boundary expands the Data Localisation Suite to ensure that a customer’s end-user traffic metadata stays in the EU. “Metadata” can be a scary term, but it’s a simple concept — it just means “data about data.” In other words, it’s a description of activity that happened on our network. Using the EU Customer Metadata Boundary means that this type of metadata would be saved only in the EU.

And what about the end user’s personal data handled by Zaraz? By default, Zaraz doesn’t log or save any piece of information about the end user, with one exception in the case of error logging. To make our service better, we are saving logs of errors, so we can fix any issues. For customers that are using the Data Localisation Suite, this is something we can toggle off, which means that no log data whatsoever will be saved by Zaraz.

What Does the Future Hold for Privacy Features?

Since the Zaraz acquisition, we have been talking to hundreds of Cloudflare enterprise customers, and thousands of users using the beta for the free version of Zaraz. And we have gathered a shortlist of features that we plan to develop in 2022.

  • The Zaraz Consent Manager. Zaraz is fundamentally changing the way third-party tools are implemented on the web. So, in order to provide our customers with full control over user consent management, we realized we should build our own tool to allow customers to do so easily. The Zaraz consent manager will be fully integrated with Zaraz and will allow customers to take actions according to the user choices in a few clicks.
  • Geolocation Triggers. We are planning to add the option to create trigger rules based on an end user’s current location. This means you could configure tools to only load if the user is visiting your site from a specific region. You’d be able to even send specific events or properties according to the end-user’s location. This feature should help global companies to set granular configurations that meet the requirements of their global operations.
  • DLP pattern templates. At the moment, our DLP feature can scan requests going to third-party tools according to the patterns that enterprise customers create themselves. In the near future, we will introduce templates to help customers scan for common PII with more ease.

This is just a taste of what’s coming. If you have any ideas for privacy features you’d like to see, reach out to [email protected] – we would love to hear from you!

If you would like to explore the free beta version, please click here. If you are an Enterprise customer and want to learn more about Zaraz’s privacy features, please click here to join the waitlist. To join our Discord channel, click here.

Interview with the Head of the NSA’s Research Directorate

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/02/interview-with-the-head-of-the-nsas-research-directorate.html

MIT Technology Review published an interview with Gil Herrera, the new head of the NSA’s Research Directorate. There’s a lot of talk about quantum computing, monitoring 5G networks, and the problems of big data:

The math department, often in conjunction with the computer science department, helps tackle one of NSA’s most interesting problems: big data. Despite public reckoning over mass surveillance, NSA famously faces the challenge of collecting such extreme quantities of data that, on top of legal and ethical problems, it can be nearly impossible to sift through all of it to find everything of value. NSA views the kind of “vast access and collection” that it talks about internally as both an achievement and its own set of problems. The field of data science aims to solve them.

“Everyone thinks their data is the messiest in the world, and mine maybe is because it’s taken from people who don’t want us to have it, frankly,” said Herrera’s immediate predecessor at the NSA, the computer scientist Deborah Frincke, during a 2017 talk at Stanford. “The adversary does not speak clearly in English with nice statements into a mic and, if we can’t understand it, send us a clearer statement.”

Making sense of vast stores of unclear, often stolen data in hundreds of languages and even more technical formats remains one of the directorate’s enduring tasks.

GitHub Availability Report: January 2022

Post Syndicated from Scott Sanders original https://github.blog/2022-02-02-github-availability-report-january-2022/

In January, we experienced no incidents resulting in service downtime to our core services. However, we do want to acknowledge an incident in February that we are continuing to investigate.

February 2 19:12 UTC (lasting 26 minutes)

Our service monitors detected a high rate of errors for issues, pull requests, GitHub Codespaces, and GitHub Actions services. We have mitigated the incident and are confident it has been fully resolved.

Due to the recency of this incident, we are still investigating the contributing factors and will provide a more detailed update in next month’s report.

Please follow our status page for real time updates. To learn more about what we’re working on, check out the GitHub engineering blog.
