Security updates for Friday

Post Syndicated from original https://lwn.net/Articles/837105/rss

Security updates have been issued by Debian (libproxy, pacemaker, and thunderbird), Fedora (nss), openSUSE (kernel), Oracle (curl, librepo, qt and qt5-qtbase, and tomcat), Red Hat (firefox), SUSE (firefox, java-1_7_0-openjdk, and openldap2), and Ubuntu (apport, libmaxminddb, openjdk-8, openjdk-lts, and slirp).

Why Zabbix throttling preprocessing is a key point for high-frequency monitoring

Post Syndicated from Dmitry Lambert original https://blog.zabbix.com/why-zabbix-throttling-preprocessing-is-a-key-point-for-high-frequency-monitoring/12364/

Sometimes we need much more than collecting generic data from our servers or network devices. For high-frequency monitoring, we need functionality to offload the core components from the excessive load. Throttling is exactly what allows you to drop repetitive values at the preprocessing level and collect only the values that change.

Contents

I. High-frequency monitoring (0:33)

1. High-frequency monitoring issues (2:25)
2. Throttling (5:55)

Throttling has been available since Zabbix 4.2 and is highly effective for high-frequency monitoring.

High-frequency monitoring

We have to set update intervals for all of the items we create in Configuration > Host > Items > Create item.

Setting update interval

The smallest update interval for regular items in Zabbix is one second. If we want to monitor all items, including memory usage, network bandwidth, or CPU load once per second, this can be considered a high-frequency interval. However, in the case of industrial equipment or telemetry data, we’ll most likely need the data more often, for instance, every 1 millisecond.

The easiest way to send data every millisecond is to use Zabbix sender — a small utility to send values to the Zabbix server or the proxy. But first, these values should be gathered.

High-frequency monitoring issues

Selecting an update interval for different items

We have to think about performance, as the more data we have, the more performance issues will arise and the more powerful hardware we’ll have to buy.

If the data grabbed from a host is constantly changing, it makes sense to collect the data every 10 or 100 milliseconds, for instance. This means that every time we receive a new value, we have to process it with triggers, store it in the database, and visualize it in Latest data.

There are also values that do not tend to change very frequently, but without throttling we would still collect a new value every millisecond and process it with all our triggers and internal processes, even if the value has not changed for hours.

Throttling

The best way to solve this problem is throttling.

To illustrate it, in Configuration > Hosts, let’s create a ‘Throttling’ host and add it to a group.

Creating host

Then we’ll create an item to work as a Zabbix sender item.

Creating Zabbix sender item

NOTE. For a Zabbix sender item, the Type should always be ‘Zabbix trapper’.

Then open the CLI and reload the config cache:

zabbix_server -R config_cache_reload

Now we can send values with zabbix_sender, specifying the IP address of the Zabbix server, the hostname (which is case-sensitive), the item key, and the value, in this case 1:

zabbix_sender -z 127.0.0.1 -s Throttling -k youtube -o 1

If we send the value “1” several times, all of these values will be displayed in Monitoring > Latest data.

Displaying the values grabbed from the host

NOTE. It’s possible to filter the Latest data to display only the needed host and set a sufficient range of the last values to be displayed.
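
To try this out quickly, here is a minimal sketch (assuming zabbix_sender is installed and the server runs locally, matching the command above) that sends the same value several times in a row:

import subprocess
import time

# Send the same value ten times, once per second, to the 'youtube' trapper item
# on the 'Throttling' host (the same parameters as the zabbix_sender call above).
for _ in range(10):
    subprocess.run(
        ["zabbix_sender", "-z", "127.0.0.1", "-s", "Throttling", "-k", "youtube", "-o", "1"],
        check=True,
    )
    time.sleep(1)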

Using this method we are spamming the Zabbix server. So, we can add throttling to the settings of our item in the Pre-processing tab in Configuration > Hosts.

NOTE. There are no other parameters to configure besides this Pre-processing step from the throttling menu.

Discard unchanged

Discard unchanged throttling option

With the ‘Discard unchanged’ throttling option, only changed values will be processed by the server, while repeated identical values will be ignored.

Throttling ignores identical values

Discard unchanged with a heartbeat

If we change the pre-processing settings for our item in the Pre-processing tab in Configuration > Hosts to ‘Discard unchanged with a heartbeat’, we have one additional parameter to specify: the interval at which values are still stored even if they are identical.

Discard unchanged with a heartbeat

So, if we specify 120 seconds, then in Monitoring > Latest data we’ll get a value once every 120 seconds, even if the values are identical.

Displaying identical values with an interval

This throttling option is useful when we have nodata() triggers. With the ‘Discard unchanged’ option, identical data is dropped, so nodata() triggers will fire even though the item is still being collected. If we use ‘Discard unchanged with heartbeat’, identical values are still stored at the heartbeat interval, so the trigger won’t fire.

In simpler terms, the ‘Discard unchanged’ option drops all identical values, while ‘Discard unchanged with heartbeat’ stores identical values at the specified interval.
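
The preprocessing logic itself is easy to reason about. The following is not Zabbix source code, just a small Python model of the two options (the function name and signature are made up for illustration) to make the difference concrete:

import time

def should_store(value, last_value, last_stored_at, heartbeat=None):
    """Model of the two throttling preprocessing options.

    'Discard unchanged': store a value only when it differs from the last stored one.
    'Discard unchanged with heartbeat': additionally store an identical value once
    the heartbeat interval has elapsed since the last stored value.
    """
    if value != last_value:
        return True
    if heartbeat is not None and time.time() - last_stored_at >= heartbeat:
        return True
    return False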

Watch the video.

New Zealand Election Fraud

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/11/new-zealand-election-fraud.html

It seems that this election season has not gone without fraud. In New Zealand, a vote for “Bird of the Year” has been marred by fraudulent votes:

More than 1,500 fraudulent votes were cast in the early hours of Monday in the country’s annual bird election, briefly pushing the Little-Spotted Kiwi to the top of the leaderboard, organizers and environmental organization Forest & Bird announced Tuesday.

Those votes — which were discovered by the election’s official scrutineers — have since been removed. According to election spokesperson Laura Keown, the votes were cast using fake email addresses that were all traced back to the same IP address in Auckland, New Zealand’s most populous city.

It feels like writing this story was a welcome distraction from writing about the US election:

“No one has to worry about the integrity of our bird election,” she told Radio New Zealand, adding that every vote would be counted.

Asked whether Russia had been involved, she denied any “overseas interference” in the vote.

I’m sure that’s a relief to everyone involved.

Automated Origin CA for Kubernetes

Post Syndicated from Terin Stock original https://blog.cloudflare.com/automated-origin-ca-for-kubernetes/

In 2016, we launched the Cloudflare Origin CA, a certificate authority optimized for making it easy to secure the connection between Cloudflare and an origin server. Running our own CA has allowed us to support fast issuance and renewal, simple and effective revocation, and wildcard certificates for our users.

Out of the box, managing TLS certificates and keys within Kubernetes can be challenging and error prone. The secret resources have to be constructed correctly, as components expect secrets with specific fields. Some forms of domain verification require manually rotating secrets to pass. Once you’re successful, don’t forget to renew before the certificate expires!

cert-manager is a project to fill this operational gap, providing Kubernetes resources that manage the lifecycle of a certificate. Today we’re releasing origin-ca-issuer, an extension to cert-manager integrating with Cloudflare Origin CA to easily create and renew certificates for your account’s domains.

Origin CA Integration

Creating an Issuer

After installing cert-manager and origin-ca-issuer, you can create an OriginIssuer resource. This resource creates a binding between cert-manager and the Cloudflare API for an account. Different issuers may be connected to different Cloudflare accounts in the same Kubernetes cluster.

apiVersion: cert-manager.k8s.cloudflare.com/v1
kind: OriginIssuer
metadata:
  name: prod-issuer
  namespace: default
spec:
  signatureType: OriginECC
  auth:
    serviceKeyRef:
      name: service-key
      key: key

This creates a new OriginIssuer named “prod-issuer” that issues certificates using ECDSA signatures, and the secret “service-key” in the same namespace is used to authenticate to the Cloudflare API.
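
The OriginIssuer above expects a Kubernetes secret named “service-key” with a field named “key” holding your Origin CA service key. You would normally create it with kubectl; as a sketch, the same thing with the official Kubernetes Python client might look like this (the service key value is a placeholder):

from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running in a pod

# Secret referenced by the OriginIssuer's auth.serviceKeyRef above:
# name "service-key", data key "key".
secret = client.V1Secret(
    metadata=client.V1ObjectMeta(name="service-key", namespace="default"),
    string_data={"key": "<your-origin-ca-service-key>"},  # placeholder value
)
client.CoreV1Api().create_namespaced_secret(namespace="default", body=secret)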

Signing an Origin CA Certificate

After creating an OriginIssuer, we can now create a Certificate with cert-manager. This defines the domains, including wildcards, that the certificate should be issued for, how long the certificate should be valid, and when cert-manager should renew the certificate.

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-com
  namespace: default
spec:
  # The secret name where cert-manager
  # should store the signed certificate.
  secretName: example-com-tls
  dnsNames:
    - example.com
  # Duration of the certificate.
  duration: 168h
  # Renew a day before the certificate expiration.
  renewBefore: 24h
  # Reference the Origin CA Issuer you created above,
  # which must be in the same namespace.
  issuerRef:
    group: cert-manager.k8s.cloudflare.com
    kind: OriginIssuer
    name: prod-issuer

Once created, cert-manager begins managing the lifecycle of this certificate, including creating the key material, crafting a certificate signing request (CSR), and constructing a certificate request that will be processed by the origin-ca-issuer.

When signed by the Cloudflare API, the certificate will be made available, along with the private key, in the Kubernetes secret specified within the secretName field. You’ll be able to use this certificate on servers proxied behind Cloudflare.

Extra: Ingress Support

If you’re using an Ingress controller, you can use cert-manager’s Ingress support to automatically manage Certificate resources based on your Ingress resource.

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/issuer: prod-issuer
    cert-manager.io/issuer-kind: OriginIssuer
    cert-manager.io/issuer-group: cert-manager.k8s.cloudflare.com
  name: example
  namespace: default
spec:
  rules:
    - host: example.com
      http:
        paths:
          - backend:
              serviceName: examplesvc
              servicePort: 80
            path: /
  tls:
    # specifying a host in the TLS section will tell cert-manager 
    # what DNS SANs should be on the created certificate.
    - hosts:
        - example.com
      # cert-manager will create this secret
      secretName: example-tls

Building an External cert-manager Issuer

An external cert-manager issuer is a specialized Kubernetes controller. There’s no direct communication between cert-manager and external issuers at all; this means that you can use any existing tools and best practices for developing controllers to develop an external issuer.

We’ve decided to use the excellent controller-runtime project to build origin-ca-issuer, running two reconciliation controllers.

OriginIssuer Controller

The OriginIssuer controller watches for the creation and modification of OriginIssuer custom resources. The controller creates a Cloudflare API client using the referenced details and credentials. This API client instance will later be used to sign certificates through the API. The controller periodically retries creating the API client; once it succeeds, it updates the OriginIssuer’s status to ready.

CertificateRequest Controller

The CertificateRequest controller watches for the creation and modification of cert-manager’s CertificateRequest resources. These resources are created automatically by cert-manager as needed during a certificate’s lifecycle.

The controller looks for CertificateRequests that reference a known OriginIssuer (this reference is copied by cert-manager from the originating Certificate resource) and ignores all resources that do not match. The controller then verifies that the OriginIssuer is in the ready state before transforming the certificate request into an API request using the previously created client.

On a successful response, the signed certificate is added to the certificate request, which cert-manager then uses to create or update the secret resource. On an unsuccessful request, the controller periodically retries.

Learn More

Up-to-date documentation and complete installation instructions can be found in our GitHub repository. Feedback and contributions are greatly appreciated. If you’re interested in Kubernetes at Cloudflare, including building controllers like these, we’re hiring.

PRIMM: encouraging talk in programming lessons

Post Syndicated from Oliver Quinlan original https://www.raspberrypi.org/blog/primm-talk-in-programming-lessons-research-seminar/

Whenever you learn a new subject or skill, at some point you need to pick up the particular language that goes with that domain. And the only way to really feel comfortable with this language is to practice using it. It’s exactly the same when learning programming.

A girl doing Scratch coding in a Code Club classroom

In our latest research seminar, we focused on how we educators and our students can talk about programming. The seminar presentation was given by our Chief Learning Officer, Dr Sue Sentance. She shared the work she and her collaborators have done to develop a research-based approach to teaching programming called PRIMM, and to work with teachers to investigate the effects of PRIMM on students.

Sue Sentance

As well as providing a structure for programming lessons, Sue’s research on PRIMM helps us think about ways in which learners can investigate programs, start to understand how they work, and then gradually develop the language to talk about them themselves.

Productive talk for education

Sue began by taking us through the rich history of educational research into language and dialogue. This work has been heavily developed in science and mathematics education, as well as language and literacy.

In particular the work of Neil Mercer and colleagues has shown that students need guidance to develop and practice using language to reason, and that developing high-quality language improves understanding. The role of the teacher in this language development is vital.

Sue’s work draws on these insights to consider how language can be used to develop understanding in programming.

Why is programming challenging for beginners?

Sue identified shortcomings of some teaching approaches that are common in the computing classroom but may not be suitable for all beginners.

  • ‘Copy code’ activities for learners take a long time, lead to dreaded syntax errors, and don’t necessarily build more understanding.
  • When teachers model the process of writing a program, this can be very helpful, but for beginners there may still be a huge jump from being able to follow the modeling to being able to write a program from scratch themselves.

PRIMM was designed by Sue and her collaborators as a language-first approach where students begin not by writing code, but by reading it.

What is PRIMM?

PRIMM stands for ‘Predict, Run, Investigate, Modify, Make’. In this approach, rather than copying code or writing programs from scratch, beginners instead start by focussing on reading working code.

In the Predict stage, the teacher provides learners with example code to read, discuss, and make output predictions about. Next, they run the code to see how the output compares to what they predicted. In the Investigate stage, the teacher sets activities for the learners to trace, annotate, explain, and talk about the code line by line, in order to help them understand what it does in detail.

In the seminar, Sue took us through a mini example of the stages of PRIMM where we predicted the output of Python Turtle code. You can follow along on the recording of the seminar to get the experience of what it feels like to work through this approach.
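
As an illustration only (this is not the seminar’s actual example), a Predict activity might use a short Python Turtle program like the one below: learners read it, discuss and predict what it will draw, and only then run it to check.

import turtle

pen = turtle.Turtle()

# Predict: what shape will this draw, and how many sides will it have?
for _ in range(5):
    pen.forward(100)
    pen.right(72)

turtle.done()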

The impact of PRIMM on learning

The PRIMM approach is informed by research, and it is also the subject of research by Sue and her collaborators. They’ve conducted two studies to measure the effectiveness of PRIMM: an initial pilot, and a larger mixed-methods study with 13 teachers and 493 students with a control group.

The larger study used a pre- and post-test, and found that the group that experienced the PRIMM approach performed better on the tests than the control group. The researchers also collected a wealth of qualitative feedback from teachers. The feedback suggested that the approach can help students develop a language to express their understanding of programming, and that there was much more productive peer conversation in the PRIMM lessons (sometimes this meant less talk, but at a more advanced level).

The PRIMM structure also gave some teachers a greater capacity to talk about the process of teaching programming. It facilitated the discussion of teaching ideas and learning approaches for the teachers, as well as developing language approaches that students used to learn programming concepts.

The research results suggest that learners taught using PRIMM appear to be developing the language skills to talk coherently about their programming. The effectiveness of PRIMM is also evidenced by the number of teachers who have taken up the approach, building in their own activities and in some cases remixing the PRIMM terminology to develop their own take on a language-first approach to teaching programming.

Future research will investigate in detail how PRIMM encourages productive talk in the classroom, and will link the approach to other work on semantic waves. (For more on semantic waves in computing education, see this seminar by Jane Waite and this symposium talk by Paul Curzon.)

Resources for educators who want to try PRIMM

If you would like to try out PRIMM with your learners, use our free support materials:

Join our next seminar

If you missed the seminar, you can find the presentation slides alongside the recording of Sue’s talk on our seminars page.

Our next seminar takes place on Tuesday 1 December at 17:00–18:30 GMT / 12:00–13:30 EST / 9:00–10:30 PT / 18:00–19:30 CET. Dr David Weintrop from the University of Maryland will present on the role of block-based programming in computer science education. To join, simply sign up with your name and email address.

Once you’ve signed up, we’ll email you the seminar meeting link and instructions for joining. If you attended this past seminar, the link remains the same.

The post PRIMM: encouraging talk in programming lessons appeared first on Raspberry Pi.

Migrating from Vertica to Amazon Redshift

Post Syndicated from Seetha Sarma original https://aws.amazon.com/blogs/big-data/migrating-from-vertica-to-amazon-redshift/

Amazon Redshift powers analytical workloads for Fortune 500 companies, startups, and everything in between. With Amazon Redshift, you can query petabytes of structured and semi-structured data across your data warehouse, operational database, and your data lake using standard SQL.

When you use Vertica, you have to install and upgrade Vertica database software and manage the cluster OS and hardware. Amazon Redshift is a fully managed cloud solution; you don’t have to install and upgrade database software and manage the OS and the hardware. In this post, we discuss the best practices for migrating from a self-managed Vertica cluster to the fully managed Amazon Redshift solution. We discuss how to plan for the migration, including sizing your Amazon Redshift cluster and strategies for data placement. We look at the tools for schema conversion and see how to choose the right keys for distributing and sorting your data. We also see how to speed up the data migration to Amazon Redshift based on your data size and network connectivity. Finally, we cover how cluster management on Amazon Redshift differs from Vertica.

Migration planning

When planning your migration, start with where you want to place the data. Your business use case drives what data gets loaded to Amazon Redshift and what data remains on the data lake. In this section, we discuss how to size the Amazon Redshift cluster based on the size of the Vertica dataset that you’re moving to Amazon Redshift. We also look at the Vertica schema and decide the best data distribution and sorting strategies to use for Amazon Redshift, if you choose to do it manually.

Data placement

Amazon Redshift powers the lake house architecture, which enables you to query data across your data warehouse, data lake, and operational databases to gain faster and deeper insights not possible otherwise. In a Vertica data warehouse, you plan the capacity for all your data, whereas with Amazon Redshift, you can plan your data warehouse capacity much more efficiently. If you have a huge historical dataset being shared by multiple compute platforms, then it’s a good candidate to keep on Amazon Simple Storage Service (Amazon S3) and utilize Amazon Redshift Spectrum. Also, streaming data coming from Kafka and Amazon Kinesis Data Streams can add new files to an existing external table by writing to Amazon S3 with no resource impact to Amazon Redshift. This has a positive impact on concurrency. Amazon Redshift Spectrum is good for heavy scan and aggregate work. For tables that are frequently accessed from a business intelligence (BI) reporting or dashboarding interface and for tables frequently joined with other Amazon Redshift tables, it’s optimal to have tables loaded in Amazon Redshift.

Vertica has Flex tables to handle JSON data. You don’t need to load the JSON data to Amazon Redshift. You can use external tables to query JSON data stored on Amazon S3 directly from Amazon Redshift. You create external tables in Amazon Redshift within an external schema.

Vertica users typically create a projection on a Vertica table to optimize for a particular query. If necessary, use materialized views in Amazon Redshift. Vertica also has aggregate projection, which acts like a synchronized materialized view. With materialized views in Amazon Redshift, you can store the pre-computed results of queries and efficiently maintain them by incrementally processing the latest changes made to the source tables. Subsequent queries referencing the materialized views use the pre-computed results to run much faster. You can create materialized views based on one or more source tables using filters, inner joins, aggregations, grouping, functions, and other SQL constructs.

Cluster sizing

When you create a cluster on the Amazon Redshift console, you can get a recommendation of your cluster configuration based on the size of your data and query characteristics (see the following screenshot).

Amazon Redshift offers different node types to accommodate your workloads. We recommend using RA3 nodes so you can size compute and storage independently to achieve improved price and performance. Amazon Redshift takes advantage of optimizations such as data block temperature, data block age, and workload patterns to optimize performance and manage automatic data placement across tiers of storage in the RA3 clusters.

ETL pipelines and BI reports typically use temporary tables that are only valid for a session. Vertica has local and global temporary tables. If you’re using Vertica local temporary tables, no change is required during migration. Vertica local tables and Amazon Redshift temporary tables have similar behavior. They’re visible only to the session and get dropped when the session ends. Vertica global tables persist across sessions until they are explicitly dropped. If you use them now, you have to change them to permanent tables in Amazon Redshift and drop them when they’re no longer needed.

Data distribution, sorting, and compression

Amazon Redshift optimizes for performance by distributing the data across compute nodes and sorting the data. Make sure to set the sort key, distribution style, and compression encoding of the tables to take full advantage of the massively parallel processing (MPP) capabilities. The choice of distribution style and sort keys varies based on the data model and access patterns. Use the data distribution and column order of the Vertica tables to help choose the distribution keys and sort keys on Amazon Redshift.

Distribution keys

Choose a column with high cardinality and evenly distributed values as the distribution key. Profile the data for the columns used as distribution keys. Vertica has segmentation, which specifies how to distribute data for superprojections of a table, where the data to be hashed consists of one or more column values. The columns used in segmentation are most likely good candidates for distribution keys on Amazon Redshift. If you have multiple columns in segmentation, pick the column that provides the highest cardinality to reduce the possibility of high data skew.

Besides supporting data distribution by key, Amazon Redshift also supports other distribution styles: ALL, EVEN, and AUTO. Use ALL distribution for small dimension tables and EVEN distribution for larger tables, or use AUTO distribution, where Amazon Redshift changes the distribution style from ALL to EVEN as the table size reaches a threshold.

Sort keys

Amazon Redshift stores your data on disk in sorted order using the sort key. The Amazon Redshift query optimizer uses the sort order to produce optimal query plans. Review whether one of the raw columns used in the Vertica table’s ORDER BY clause is the best column to use as the sort key in the Amazon Redshift table.

The ORDER BY fields in Vertica superprojections are good candidates for a sort key in Amazon Redshift, but the design criteria for sort order in Amazon Redshift are different from those in Vertica. In a Vertica projection’s ORDER BY clause, you place the low-cardinality columns with a high probability of RLE encoding before the high-cardinality columns. In Amazon Redshift, you can set the SORTKEY to AUTO, choose a single column as the SORTKEY, or define a compound sort key. You define compound sort keys using multiple columns, starting with the most frequently used column first. All the columns in the compound sort key are used, in the order in which they are listed, to sort the data. You can use a compound sort key when query predicates use a subset of the sort key columns in order. Amazon Redshift stores the table rows on disk in sorted order and uses metadata to track the minimum and maximum values for each 1 MB block, called a zone map. Amazon Redshift uses the zone map and the sort key to filter blocks, thereby reducing the scanning cost and efficiently handling range-restricted predicates.

Profile the data for the columns used for sort keys. Make sure the first column of the sort key is not encoded. Choose timestamp columns or columns used in frequent range filtering, equality filtering, or joins as sort keys in Amazon Redshift.

Encoding

You don’t always have to select compression encodings; Amazon Redshift automatically assigns RAW compression for columns that are defined as sort keys, AZ64 compression for numeric and timestamp columns, and LZO compression for VARCHAR columns. When you select compression encodings manually, choose AZ64 for numeric and date/time data stored in Amazon Redshift. AZ64 encoding has consistently better performance and compression than LZO. It offers compression comparable to ZSTD but significantly better performance.

Tooling

After we decide the data placement, cluster size, partition keys, and sort keys, the next step is to look at the tooling for schema conversion and data migration.

You can use AWS Schema Conversion Tool (AWS SCT) to convert your schema, which can automate about 80% of the conversion, including the conversion of DISTKEY and SORTKEY, or you can choose to convert the Vertica DDLs to Amazon Redshift manually.

To efficiently migrate your data, you want to choose the right tools depending on the data size. If you have a dataset that is smaller than a couple of terabytes, you can migrate your data using AWS Data Migration Service (AWS DMS) or AWS SCT data extraction agents. When you have more than a few terabytes of data, your tool choice depends on your network connectivity. When there is no dedicated network connection, you can run the AWS SCT data extraction agents to copy the data to AWS Snowball Edge and ship the device back to AWS to complete the data export to Amazon S3. If you have a dedicated network connection to AWS, you can run the S3EXPORT or S3EXPORT_PARTITION commands available in Vertica 9.x directly from the Vertica nodes to copy the data in parallel to the S3 bucket.

The following diagram visualizes the migration process.

Schema conversion

AWS SCT uses an extension pack schema to implement system functions of the source database that are required when writing your converted schema to your target database instance. Review the database migration assessment report for compatibility. AWS SCT can use source metadata and statistical information to determine the distribution key and sort key. AWS SCT adds a sort key in the Amazon Redshift table for the raw column used in the Vertica table’s ORDER BY clause.

The following code is an example of Vertica CREATE TABLE and CREATE PROJECTION statements:

CREATE TABLE My_Schema.My_Table
(
    Product_id int,
    Product_name varchar(50),
    Product_type varchar(50),
    Product_category varchar(50),
    Quantity int,
    Created_at timestamp DEFAULT "sysdate"()
)
PARTITION BY (date_trunc('day', My_Table.Created_at));


CREATE PROJECTION My_Schema.My_Table_Projected
(
 Product_id ENCODING COMMONDELTA_COMP,
 Product_name,
 Product_type ENCODING RLE,
 Product_category ENCODING RLE,
 Quantity,
 Created_at ENCODING GCDDELTA
)
AS
 SELECT Product_id,
        Product_name,
        Product_type,
        Product_category,
        Quantity,
        Created_at
 FROM My_Schema.My_Table 
 ORDER BY Product_type,
          Product_category,
          Product_id,
          Product_name
SEGMENTED BY hash(Product_id) ALL NODES KSAFE 1;

The following code is the corresponding Amazon Redshift CREATE TABLE statement:

CREATE TABLE My_Schema.My_Table
(
    Product_id integer,
    Product_name varchar(50),
    Product_type varchar(50),
    Product_category varchar(50),
    Quantity integer,
    Created_at timestamp DEFAULT sysdate
)
DISTKEY (Product_id) 
SORTKEY (Created_at);

Data migration

To significantly reduce the data migration time from large Vertica clusters (if you have a dedicated network connection from your premises to AWS with good bandwidth), run the S3EXPORT or S3EXPORT_PARTITION function in Vertica 9.x, which exports the data in parallel from the Vertica nodes directly to Amazon S3.

The Parquet files generated by S3EXPORT don’t have any partition key on them, because partitioning consumes time and resources on the database where the S3EXPORT runs, which is typically the Vertica production database. The following code is one command you can use:

SELECT S3EXPORT(* USING PARAMETERS url='s3://myBucket/myTable')
OVER (PARTITION BEST)
FROM myTable;

The following code is another command option:

SELECT S3EXPORT_PARTITION(* USING PARAMETERS url='s3://mytable/bystate.date', multipart=false)
OVER (PARTITION by state, year) from myTable;

Performance

In this section, we look at best practices for ETL performance while copying the data from Amazon S3 to Amazon Redshift. We also discuss how to handle Vertica partition swapping and partition dropping scenarios in Amazon Redshift.

Copying using an Amazon S3 prefix

Make sure the ETL process is running from Amazon Elastic Compute Cloud (Amazon EC2) servers or other managed services within AWS. Exporting your data from Vertica as multiple files to Amazon S3 gives you the option to load your data in parallel to Amazon Redshift. While converting the Vertica ETL scripts, use the COPY command with an Amazon S3 object prefix to load an Amazon Redshift table in parallel from data files stored under that prefix on Amazon S3. See the following code:

copy mytable
from 's3://mybucket/data/mytable/' 
iam_role 'arn:aws:iam::<myaccount>:role/MyRedshiftRole';
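
If your ETL process runs on Amazon EC2 or another AWS service, one way to issue that COPY is through the Amazon Redshift Data API. The following is a minimal sketch using boto3; the cluster identifier, database, and database user are placeholder values for illustration:

import time
import boto3

# Placeholders: replace with your own cluster, database, and database user.
copy_sql = """
    COPY mytable
    FROM 's3://mybucket/data/mytable/'
    IAM_ROLE 'arn:aws:iam::<myaccount>:role/MyRedshiftRole';
"""

client = boto3.client("redshift-data")

# Submit the COPY; the Data API runs the statement asynchronously.
resp = client.execute_statement(
    ClusterIdentifier="my-redshift-cluster",
    Database="dev",
    DbUser="awsuser",
    Sql=copy_sql,
)

# Poll until the statement finishes.
while True:
    status = client.describe_statement(Id=resp["Id"])["Status"]
    if status in ("FINISHED", "FAILED", "ABORTED"):
        break
    time.sleep(5)

print("COPY status:", status)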

Loading data using Amazon Redshift Spectrum queries

When you want to transform the exported Vertica data before loading to Amazon Redshift, or when you want to load only a subset of data into Amazon Redshift, use an Amazon Redshift Spectrum query. Create an external table in Amazon Redshift pointing to the exported Vertica data stored in Amazon S3 within an external schema. Put your transformation logic in a SELECT query, and ingest the result into Amazon Redshift using a CREATE TABLE or SELECT INTO statement:

CREATE TABLE mytable AS SELECT … FROM s3_external_schema.xxx WHERE …;

SELECT … INTO mytable FROM s3_external_schema.xxx WHERE …;

Handling Vertica partitions

Vertica has partitions, and the data loads use partition swapping and partition dropping. In Amazon Redshift, we can use the sort key, staging table, and alter table append to achieve similar results. First, the Amazon Redshift ETL job should use the sort key as filter conditions to insert the incremental data into a staging table or a temporary table in Amazon Redshift, for example the date from the MyTimeStamp column between yesterday and today. The ETL job should then delete data from the primary table that matches the filter conditions. The delete operation is very efficient in Amazon Redshift because of the sort key on the source partition column. The Amazon Redshift ETL jobs can then use alter table append to move the new data to the primary table. See the following code:

INSERT INTO stage_table
SELECT * FROM source_table
WHERE date_trunc('day', source_table.MyTimestamp) BETWEEN <yesterday> AND <today>;

DELETE FROM target_table
USING stage_table
WHERE <target_table.key> = <stage_table.key>;

ALTER TABLE target_table APPEND FROM stage_table
[ IGNOREEXTRA | FILLTARGET ];

Cluster management

When a Vertica node fails, Vertica remains queryable but performance is degraded until all the data is restored to the recovered node. When an Amazon Redshift node fails, Amazon Redshift automatically detects and replaces the failed node in your data warehouse cluster and replays read-only queries. Amazon Redshift makes your replacement node available immediately and loads your most frequently accessed data from the S3 bucket first to allow you to resume querying your data as quickly as possible.

Vertica cluster resize, similar to Amazon Redshift classic resize, takes a few hours depending on data volume to rebalance the data when nodes are added or removed. With Amazon Redshift elastic resize, the cluster resize completes within minutes. We recommend elastic resize for most use cases to shorten the cluster downtime and schedule resizes to handle seasonal spikes in your workload.

Conclusion

This post shared some best practices for migrating your data warehouse from Vertica to Amazon Redshift. It also pointed out the differences between Amazon Redshift and Vertica in handling queries, data management, cluster management, and temporary tables. Create your cluster on the Amazon Redshift console and convert your schema using AWS SCT to start your migration to Amazon Redshift. If you have any questions or comments, please share your thoughts in the comments section.


About the Authors

Seetha Sarma is a Senior Database Solutions Architect with Amazon Web Services.

Veerendra Nayak is a Senior Database Solutions Architect with Amazon Web Services.

Who is covering for Gazprom? Line ministers withhold information about TurkStream

Post Syndicated from Nikolay Marchenko original https://bivol.bg/%D1%80%D0%B5%D1%81%D0%BE%D1%80%D0%BD%D0%B8-%D0%BC%D0%B8%D0%BD%D0%B8%D1%81%D1%82%D1%80%D0%B8-%D0%BA%D1%80%D0%B8%D1%8F%D1%82-%D0%B8%D0%BD%D1%84%D0%BE%D1%80%D0%BC%D0%B0%D1%86%D0%B8%D1%8F-%D0%B7%D0%B0.html

Thursday, 12 November 2020

The key ministers in Boyko Borissov's government whose portfolios cover the implementation of the TurkStream 2 gas pipeline project have for weeks refused to provide any information about the project to the investigative journalism site Bivol. While we were waiting for answers about the contracts with Russian companies and the status of Belarusian and Russian workers along the pipeline route, TASS suddenly reported that construction of the pipe had been completed and that it had been laid all the way to the Serbian border. The US Embassy has likewise met Bivol's questions with at least two weeks of silence regarding potential sanctions and the American company that supplied turbines for the compressor station near the village of Rasovo. The well-founded suspicion is that the point of this stalling was to let the project be completed, despite the protests of NGOs led by the civic movement "Bulgaria United with One Goal" (BOEC) and political forces such as "Yes, Bulgaria" and the Green Movement.

Boyko Borissov, Vladimir Putin, Recep Tayyip Erdoğan, and Aleksandar Vučić launch TurkStream

At the end of October 2020, Bivol approached ministers Temenuzhka Petkova, Denitsa Sacheva, and Petya Stavreva with questions about the conditions under which the so-called Balkan Stream is being built on the territory of the Republic of Bulgaria.

Bivol duly sent its letters, addressed to these ministers, under the Access to Public Information Act (APIA). Once the two-week statutory deadline had expired, however, none of the three ministries had answered the substance of the questions posed by our outlet.

It turned out that the three ministers resorted to the typical bureaucratic trick used so successfully by Bulgaria's prosecution service, the so-called

"forwarding for reasons of competence".

It is one thing to forward a request to subordinate institutions in one's capacity as principal, as Deputy Energy Minister Zhecho Stankov and Energy Minister Temenuzhka Petkova did when they forwarded the inquiry to Bulgartransgaz.

It is quite another when the Minister of Regional Development forwards the APIA request not only to her subordinates at the Directorate for National Construction Control (DNSK), but also to colleagues at the Ministry of Labour and Social Policy (MLSP) and the Ministry of Health.

Ministers pass the buck

Energy Minister Temenuzhka Petkova reportedly has no information on TurkStream

The questions to Temenuzhka Petkova fall within her competence, since they concern the key energy project implemented by Bulgartransgaz, a company of the state-owned Bulgarian Energy Holding (BEH) whose principal is the Minister of Energy. Here is what Bivol wrote:

Dear Madam Minister,

We request, under the Access to Public Information Act (APIA), information on the contracts of Bulgartransgaz EAD with the companies Arkad Engineering and Construction, Arkad S.p.A., the consortium "Gas Development and Expansion in Bulgaria", "Completions Development S.à r.l. – Bulgaria Branch", IDC (Infrastructure Development and Construction), and OAO Beltruboprovodstroy concerning the construction of the expansion of the Bulgarian gas transmission network to the Serbian border, together with copies of the contracts and their annexes with personal data redacted under the GDPR.

Only on 10 November 2020 did the Ministry of Energy's press office inform Bivol's reporter by phone what was happening, or rather not happening, with the case file. As early as 30 October, the team should have received by email the cover letter from Zhecho Stankov and Temenuzhka Petkova to the management of Bulgartransgaz, instructing it to provide answers within the statutory deadlines.

Evidently, "the ministry has no information" about BTG's contracts with the companies building TurkStream 2. Since no such letter ever arrived, the ministry's registry forwarded it to Bivol again on 10 November 2020, after our call to the ministry's press office. The letter was addressed to BTG's executive director Alexander Malinov (pictured in the lead photo with Prime Minister Boyko Borissov).

The MLSP, for its part, received Bivol's APIA request on 28 October 2020 and kept it in its registry for almost ten days. The content was the same, with the same questions:

Denitsa Sacheva does not comment on the workplace accidents on TurkStream

Dear Madam Minister,

We request, under the Access to Public Information Act (APIA), information on whether there have been inspections and oversight of the employment contracts concluded and the social security contributions (pension and health insurance) paid by the Bulgarian companies Bulgartransgaz EAD, Glavbolgarstroy AD, Glavbolgarstroy International AD, Strimona Stroy, Turbo Machina Bulgaria EOOD, and Nova Stroy Team EOOD.

We are interested in how the labour legislation of the Republic of Bulgaria is being observed, given the thousands of workers from foreign companies present along the route of the expansion of the Bulgarian gas transmission network to the Serbian border.

How are health insurance contributions and compensation paid in cases of workplace accidents, road accidents involving workers from the construction sites, and cases of mass COVID-19 infection requiring workers to be isolated?

This includes workers from outside the EU (the Russian Federation, the Republic of Belarus, the Republic of Serbia, the Republic of Turkey, Saudi Arabia, Ukraine, and others). We refer to the legal entities registered in Bulgaria for the implementation of the energy project: Bonatti S.p.A., Max Streicher S.p.A., Completions Development S.à r.l. – Bulgaria Branch, Arkad Engineering and Construction, Arkad S.p.A., IDC (Infrastructure Development and Construction), and OAO Beltruboprovodstroy.

Accordingly, we request copies of the documentation related to the inspections of labour-law compliance following the accidents registered at the construction sites of this energy project, the contracts concluded, the payment of social security contributions and accident compensation, and the annexes thereto, with personal data redacted under the GDPR.

Temenuzhka Petkova's letter to the executive director of Bulgartransgaz

On 6 November, the request was then forwarded "for reasons of competence" to the executive director of the General Labour Inspectorate Executive Agency, Rumyana Mihaylova, to the executive director of the National Revenue Agency, Galya Dimitrova, and to the governor of the National Social Security Institute (NOI), Ivaylo Ivanov:

"DEAR LADIES AND GENTLEMEN,

The Ministry of Labour and Social Policy has received a request, ref. No. AU-2-22 of 28.10.2020, from Mr Nikolay Sergeev Marchenko, a journalist at the website BIVOL. A request with ref. No. AU-2-24 of 06.11.2020, with identical content and submitted by the same applicant, forwarded for reasons of competence by the Ministry of Regional Development and Public Works, has also been received.

I hereby inform you that the Ministry of Labour and Social Policy does not administer the information requested in the above-mentioned applications.

On the grounds of Art. 32, para. 1 of the Access to Public Information Act, I hereby forward to you, for reasons of competence, the enclosed requests for access to public information received by the Ministry of Labour and Social Policy."

The letter is signed by the secretary general of Denitsa Sacheva's ministry, Milena Petrova.

The ministry headed by Regional Development Minister Petya Stavreva, to which Bivol also submitted an APIA request, acted in the same way:

Dear Madam Minister,

We request, under the Access to Public Information Act (APIA), information on whether there have been inspections and oversight of the construction activities of the Bulgarian and foreign companies acting as contractors and subcontractors along the route of the expansion of the Bulgarian gas transmission network to the Serbian border.

These are: Bulgartransgaz EAD, Glavbolgarstroy AD, Glavbolgarstroy International AD, Strimona Stroy, Turbo Machina Bulgaria EOOD, and Nova Stroy Team EOOD, including companies from outside the EU (the Russian Federation, the Republic of Belarus, the Republic of Serbia, the Republic of Turkey, Saudi Arabia, Ukraine, and others). We refer to the legal entities registered in Bulgaria for the implementation of the energy project: Bonatti S.p.A., Max Streicher S.p.A., Completions Development S.à r.l. – Bulgaria Branch, Arkad Engineering and Construction, Arkad S.p.A., IDC (Infrastructure Development and Construction), and OAO Beltruboprovodstroy.

Regional Minister Petya Stavreva forwarded us… to the health minister

We are interested in how the construction legislation of the Republic of Bulgaria and the EU is being observed, given the thousands of workers from foreign companies.

How is construction supervision being carried out, given reports of unregulated night work, workplace accidents, road accidents involving workers, and cases of mass COVID-19 infection with dozens of workers placed in isolation?

Accordingly, we request copies of the documentation related to construction control and supervision of the sites and construction works of this energy project, the lawfulness of the contracts concluded, and the annexes thereto, with personal data redacted under the GDPR.

There, the reply bears the signature of the director of the Legal Directorate of the Ministry of Regional Development and Public Works (MRDPW), Boyanka Georgieva, who forwarded the request "for reasons of competence" outside the ministry:

"DEAR LADIES AND GENTLEMEN,

On the grounds of Art. 32, para. 1 of the Access to Public Information Act, we hereby forward to you, for reasons of competence, the enclosed request for access to public information, ref. No. 94-00-99/28.10.2020, submitted by Nikolay Marchenko, a journalist at the website Bivol."

Our letter was forwarded not only to the MLSP, which we had already approached, but also to the Ministry of Health and to the supervisory body subordinate to the MRDPW, the DNSK.

Washington is in no hurry with sanctions

Interestingly, the Ambassador of the United States to Bulgaria, Herro Mustafa, is also in no hurry to answer the questions Bivol has raised about the construction of the pipeline with the participation of American companies, which supplied not only heavy construction machinery but also turbines for the Rasovo compressor station.

Bivol's official inquiry was sent on 30 October 2020, with the relevant background, links to our outlet's investigations, and several general questions about the US position and plans to respond to the project's completion:

US Ambassador Herro Mustafa is in no hurry to comment on the sensitive topic

Your Excellency,

We request information about US plans to add the contractors of the TurkStream 2 gas pipeline project to the sanctions list.

We draw your attention to the fact that investigations by Bivol and the civic movement "Bulgaria United with One Goal" (BOEC) have established that the state-owned concern Bulgartransgaz EAD has contracts with the European branch of the US company Solar Turbines –

Solar Turbines Europe S.A. of Belgium, which delivered three compressors to the village of Rasovo for the Rasovo compressor station for the needs of TurkStream 2. They were filmed by drone by BOEC.

We also found, on the linear section of the pipeline near the village of Makresh (Montana Province), heavy machinery made by the American company Caterpillar, branded by Eltrak Group, the official distributor for Greece and Bulgaria (see: Dubious Bulgarian Companies Provide Cover Up Construction-Equipment for Turkstream 2). According to Bulgaria's Information System for Management and Monitoring of EU Funds (ISUN), Eltrak Bulgaria is a supplier, under an EU-funded project, to the company Verda 008 OOD, whose American-made machinery can be seen at the pipeline construction sites.

On the Bulgarian side, the subcontractors are Glavbolgarstroy AD and Glavbolgarstroy International AD, in the consortium FEROSTAAL BALKANGAZ with Ferrostaal Industrieanlagen GmbH of Germany. Among their declared subcontractors are Turbo Machina Bulgaria EOOD and Himkomplekt Engineering AD.

We draw your attention to the fact that Turbo Machina Bulgaria EOOD has the same owners as the Varna-based company Bright Engineering OOD, which in 2007, 2010, 2012, and 2016 was

a subcontractor of the two thermal power plants owned by US companies: AES-3C Maritza East 1 EOOD and ContourGlobal Maritsa East 3 AD.

Bivol: Alexander Lukashenko is building TurkStream with EU money

The non-EU, non-NATO companies on TurkStream 2 are Arkad Engineering and Construction (Saudi Arabia) and Arkad S.p.A. (Milan, Italy), control of which, according to our information, has passed into the hands of companies ultimately owned by OAO Gazprom (Russian Federation).

Also working on the site is the consortium "Gas Development and Expansion in Bulgaria" (GDEB), with the participation of Consorzio Varna 1 (Bonatti S.p.A. and Max Streicher S.p.A.). Regarding the Italian company Bonatti, we established in partnership with the Investigative Reporting Project Italy (IRPI) (see Company Suspected of Ties with Cosa-nostra Wants to Build Turkstream 2 in Bulgaria) that it is linked to the Sicilian Cosa Nostra crime group and is a subcontractor, in Italy and Greece, on the US- and EU-backed Trans Adriatic and Trans Anatolian pipeline projects.

The company IDC – Infrastructure Development and Construction (Belgrade, Serbia) has been found to be owned, through a network of Serbian and offshore legal entities, by Gazprom, which has long been under US and EU sanctions. The Belarusian state concern OAO Beltruboprovodstroy (see: Lukashenko is Building Turkstream in Bulgaria with European Money) is a long-standing Gazprom subcontractor and was excluded by Lithuania's state security agency from the Poland–Lithuania gas interconnector project over its potential ties to Belarusian and Russian special services.

Does the presence of American suppliers, equipment, and heavy machinery mean that Washington does not object to US companies taking part in Gazprom projects? Has Solar Turbines, through its Belgian branch, circumvented the sanctions announced by the State Department for Nord Stream 2 and TurkStream 2?

Will there be a review of the lawfulness of the cooperation between American companies and Bulgartransgaz and subcontractors of Gazprom, which is under US and EU sanctions?

Could sanctions be imposed on Bulgarian companies such as Bulgarian Energy Holding EAD, Bulgartransgaz EAD, Glavbolgarstroy International AD, and others implementing TurkStream 2, a project at odds with the interests of the EU and NATO? Is the Republic of Bulgaria violating its NATO membership treaty by allowing representatives of Russian services to guard the IDC and Arkad office/base in the town of Brusartsi, after they used physical force against Bivol and BOEC while the site was being filmed?

What is the Embassy's position on the now-completed project? Is it already subject to sanctions alongside Nord Stream 2, or is that yet to be decided within the announced 30-day period?

On 30 October, the US Embassy's information specialist, Raisa Yordanova, replied to our outlet that the inquiry had been received.

"We are trying to find someone to answer your questions," read the email from Herro Mustafa's press office.

On 2 November, Raisa Yordanova would not comment on whether, and within what timeframe, any official response could be expected from the US diplomatic mission in Bulgaria:

"Unfortunately, I cannot give a specific answer to your question."

During a phone call on 10 November, our team was again told that it was not known when Ambassador Herro Mustafa's answer regarding possible sanctions over TurkStream 2 could be expected.

Bivol on the construction machinery at the pipeline sites

Is this lack of an official position the result of uncertainty over the US election results, where, according to preliminary data, the incumbent head of state Donald Trump is losing the battle for the White House?

Or is it linked to the desire of the outgoing president and his Secretary of State Mike Pompeo not to spoil relations with Russian President Vladimir Putin over a pipe that has, in any case, already been completed?

Moreover, every president and secretary of state has the right to replace ambassadors after taking office. Herro Mustafa could therefore be replaced after January 2021 if Joe Biden takes the presidency and Barack Obama's former foreign policy adviser Susan Rice becomes US Secretary of State.

In that case, the American diplomat may understandably have no desire to spoil relations with her Russian counterpart in Sofia, Anatoly Makarov, for whom the completion of the rebranded South Stream in the form of TurkStream 2 could prove grounds for a promotion in his diplomatic career.

The pipe is ready, the answers are missing

While this endless official correspondence was under way, on 9 November 2020 Russia's main state news agency suddenly reported that the pipeline through Bulgaria was already complete.

"Construction of Balkan Stream is de facto complete, and the Bulgarian section of the pipeline has been connected to the one in Serbia. This was reported to TASS by an informed source involved in the construction of the pipeline," wrote the newspaper Sega, along with a number of other national media outlets.

It turned out that the authorities and the contractors were most concerned not with whether the pipeline would be put into operation given the risk of US sanctions, but with whether there would be a lavish banquet to mark the completion of the work.

Boyko Borissov and Temenuzhka Petkova "building" TurkStream 2

"Balkan Stream is complete and has been connected to the pipeline in Serbia. There are no official events in connection with the completion of the pipeline because of the coronavirus pandemic, and a ceremony will probably take place later."

That is what the unnamed source in Sofia told the TASS agency.

In October, Prime Minister Boyko Borissov announced that construction of the Balkan Stream pipeline in the country was "almost finished".

At the time, the prime minister again stressed that with the implementation of the Balkan Stream project Bulgaria would

"become a regional strategic gas distribution hub and will guarantee the diversification of gas supplies."

In fact, Balkan Stream is the continuation of the TurkStream pipeline across Bulgarian territory, and for the US State Department it is one and the same pipeline. In official documents it is referred to as "a project for the expansion of the gas transmission infrastructure of the company Bulgartransgaz in parallel with the northern main gas pipeline to the Bulgarian-Serbian border".

The first section of TurkStream was launched on 8 January 2020; the pipe runs along the bottom of the Black Sea from Russia to the Turkish coast. This has already cost BTG transit fees, since Turkey, as the largest consumer, had for years received its gas through the old Trans-Balkan pipeline, whose transit route runs Russia – Ukraine – Moldova – Romania – Bulgaria.

Since the beginning of 2020, however, Ankara has been receiving the bulk of its volumes from Gazprom through the offshore section of TurkStream. In this way Moscow has managed to punish not only Ukraine but also the countries along the route by cutting transit fees to a minimum; from now on, Bulgaria collects them only for transmission to small consumers such as Serbia and North Macedonia.

The responsible deputy prime ministers keep silent on TurkStream (Photo: Dir.bg)

Bivol also asked the responsible deputy prime ministers, Tomislav Donchev (EU funds) and Mariana Nikolova (economic policy), for their position on the project:

whether they consider it normal for construction machinery purchased with EU funds to be used on the TurkStream 2 sites,

and whether the government is prepared to put the pipeline into operation given the risk of US sanctions.

As expected, the two deputy prime ministers ignored the questions posed by our outlet.

Meanwhile, Gazprom's Serbian subsidiary IDC has filed a complaint with the Brusartsi district police department over the entry of "unidentified persons" onto "its territory" following the BOEC and Bivol action in front of the company's office in early October this year.

The district prosecutor of Lom, for his part, is still reviewing the complaints that the Bivol and BOEC teams filed with the Brusartsi police at the time over the use of physical force by unidentified Russian-speaking IDC employees, who tried to push us back beyond the barrier of the municipal land the company uses as a base.

On 9 November, BOEC also submitted an official complaint to the DNSK at the MRDPW demanding that work on the pipeline be halted due to the lack of adequate construction supervision and possible violations along the route:

"On the grounds of Art. 224 and Art. 225 of the Spatial Development Act, we at BOEC demand that the DNSK open an inspection and proceed to an immediate halt of construction activities at the compressor station near the village of Rasovo, Montana Province."

"As well as along the TurkStream route, until the inspection is completed and the violations identified are remedied," the civic NGO's complaint reads.

Meet the newest AWS Heroes including the first DevTools Heroes!

Post Syndicated from Ross Barich original https://aws.amazon.com/blogs/aws/meet-the-newest-aws-heroes-including-the-first-devtools-heroes/

The AWS Heroes program recognizes individuals from around the world who have extensive AWS knowledge and go above and beyond to share their expertise with others. The program continues to grow, to better recognize the most influential community leaders across a variety of technical disciplines.

Introducing AWS DevTools Heroes
Today we are introducing AWS DevTools Heroes: passionate advocates of the developer experience on AWS and the tools that enable that experience. DevTools Heroes excel at sharing their knowledge and building community through open source contributions, blogging, speaking, community organizing, and social media. Through their feedback, content, and contributions DevTools Heroes help shape the AWS developer experience and evolve the AWS DevTools, such as the AWS Cloud Development Kit, the AWS SDKs, and AWS Code suite of services.

The first cohort of AWS DevTools Heroes includes:

Bhuvaneswari Subramani – Bengaluru, India

DevTools Hero Bhuvaneswari Subramani is Director Engineering Operations at Infor. With two decades of IT experience, specializing in Cloud Computing, DevOps, and Performance Testing, she is one of the community leaders of AWS User Group Bengaluru. She is also an active speaker at AWS community events and industry conferences, and delivers guest lectures on Cloud Computing for staff and students at engineering colleges across India. Her workshops, presentations, and blogs on AWS Developer Tools always stand out.

Jared Short – Washington DC, USA

DevTools Hero Jared Short is an engineer at Stedi, where they are using AWS native tooling and serverless services to build a global network for exchanging B2B transactions in a standard format. Jared’s work includes early contributions to the Serverless Framework. These days, his focus is working with AWS CDK and other toolsets to create intuitive and joyful developer experiences for teams on AWS.

Matt Coulter – Belfast, Northern Ireland

DevTools Hero Matt Coulter is a Technical Architect for Liberty IT, focused on creating the right environment for empowered teams to rapidly deliver business value in a well-architected, sustainable, and serverless-first way. Matt has been creating this environment by building CDK Patterns, an open source collection of serverless architecture patterns built using AWS CDK that reference the AWS Well-Architected Framework. Matt also created CDK Day, which was the first community-driven, global conference focused on everything CDK (AWS CDK, CDK for Terraform, CDK for Kubernetes, and others).

Paul Duvall – Washington DC, USA

DevTools Hero Paul Duvall is co-founder and former CTO of Stelligent. He is principal author of “Continuous Integration: Improving Software Quality and Reducing Risk,” and is also the author of many other publications including “Continuous Compliance on AWS,” “Continuous Encryption on AWS,” and “Continuous Security on AWS.” Paul hosted the DevOps on AWS Radio podcast for over three years and has been an enthusiastic user and advocate of AWS Developer Tools since their respective releases.

Sebastian Korfmann – Hamburg, Germany

DevTools Hero Sebastian Korfmann is an entrepreneurial Software Engineer with a current focus on Cloud Tooling, Infrastructure as Code, and the Cloud Development Kit (CDK) ecosystem in particular. He is a core contributor to the CDK for Terraform project, which enables users to define infrastructure using TypeScript, Python, and Java while leveraging the hundreds of providers and thousands of module definitions provided by Terraform and the Terraform ecosystem. With cdk.dev, Sebastian co-founded a community-driven hub for all things CDK, and he runs a weekly newsletter covering the growing CDK ecosystem.

Steve Gordon – East Sussex, United Kingdom

DevTools Hero Steve Gordon is a Pluralsight author and senior engineer who is passionate about community and all things .NET related, having worked with .NET for over 16 years. Steve has used AWS extensively for five years as a platform for running .NET microservices. He blogs regularly about running .NET on AWS, including deep dives into how the .NET SDK works, building cloud-native services, and how to deploy .NET containers to Amazon ECS. Steve founded .NET South East, a .NET Meetup group based in Brighton.

Thorsten Höger – Stuttgart, Germany

DevTools Hero Thorsten Höger is CEO and cloud consultant at Taimos, where he is advising customers on how to use AWS. Being a developer, he focuses on improving development processes and automating everything to build efficient deployment pipelines for customers of all sizes. As a supporter of open-source software, Thorsten is maintaining or contributing to several projects on GitHub, like test frameworks for AWS Lambda, Amazon Alexa, or developer tools for AWS. He is also the maintainer of the Jenkins AWS Pipeline plugin and one of the top three non-AWS contributors to AWS CDK.


Meet the rest of the new AWS Heroes
There is more good news! We are thrilled to introduce the remaining new AWS Heroes in this cohort, including the first Heroes from Argentina, Lebanon, and Saudi Arabia:

Ahmed Samir – Riyadh, Saudi Arabia

Community Hero Ahmed Samir is a Cloud Architect and mentor with more than 12 years of experience in IT. He is the leader of three Arabic Meetups in Riyadh: AWS, Amazon SageMaker, and Kubernetes, where he has organized and delivered over 40 Meetup events. Ahmed frequently shares knowledge and evangelizes AWS in Arabic through his social media accounts. He also holds AWS and Kubernetes certifications.

Anas Khattar – Beirut, Lebanon

Community Hero Anas Khattar is co-founder of Digico Solutions. He founded the AWS User Group Lebanon in 2018 and coordinates monthly meetups and workshops on a variety of cloud topics, which helped grow the group to more than 1,000 AWSome members. He also regularly speaks at tech conferences and authors tech blogs on Dev Community, sharing his AWS experiences and best practices. In close collaboration with the regional AWS community leaders and builders, Anas organized AWS Community Day MENA, which started in September 2020 with 12 User Groups from 10 countries, and hosted 27 speakers over 2 days.

Chris Gong – New York, USA

Community Hero Chris Gong is constantly exploring the different ways that cloud services can be applied in game development. Passionate about sharing his knowledge with the world, he routinely creates tutorials and educational videos on his YouTube channel, Flopperam, where the primary focus has been AWS Game Tech and Unreal Engine, specifically the multiplayer and networking aspects of game development. Although Amazon GameLift has been his biggest interest, Chris has plans to cover the usage of other AWS services in game development while exploring how they can be integrated with other game engines besides Unreal Engine.

Damian Olguin – Cordoba, Argentina

Community Hero Damian Olguin is a tech entrepreneur and one of the founders of Teracloud, an AWS APN Partner. As a community leader, he promotes knowledge-sharing experiences in user communities within LATAM. He is co-organizer of AWS User Group Cordoba and co-host of #DeepFridays, a Twitch streaming show that promotes AI/ML technology adoption by playing with DeepRacer, DeepLens, and DeepComposer. Damian is a public speaker who has spoken at AWS Community Day Buenos Aires 2019, AWS re:Invent 2019, and AWS Community Day LATAM 2020.

Denis Bauer – Sydney, Australia

Data Hero Denis Bauer is a Principal Research Scientist at Australia’s government research agency (CSIRO). Her open source products include VariantSpark, the Machine Learning tool for analysing ultra-high dimensional data, which was the first AWS Marketplace health product from a public sector organization. Denis is passionate about facilitating the digital transformation of the health and life-science sector by building a strong community of practice through open source technology, keynote presentations, and inclusive interdisciplinary collaborations. For example, the collaboration with her organization’s visionary Scientific Computing and Cloud Platforms experts as well as AWS Data Hero Lynn Langit has enabled the creation of cloud-based bioinformatics solutions used by 10,000 researchers annually.

Denis Dyack – St. Catharines, Canada

Community Hero Denis Dyack, a video game industry veteran of more than 30 years, is the Founder and CEO of Apocalypse Studios. His studio evangelizes using a cloud-first approach in game development and partners with AWS Game Tech to move the medium of the games industry forward. In his years of experience speaking at games conferences and within AWS communities, Denis has been an advocate for building on Amazon Lumberyard and more recently in moving over game development pipelines to AWS.

Emrah Şamdan – Ankara, Turkey

Serverless Hero Emrah Şamdan is the VP of Products at Thundra. In order to expand the serverless community globally in a pandemic, he co-organized the quarterly held ServerlessDays Virtual. He’s also a local community organizer for AWS Community Day Turkey, ServerlessDays Istanbul, and bi-weekly meetups at Cloud and Serverless Turkey. He’s currently part of the core organizer team of global ServerlessDays and is continuously looking for ways to expand the community. He frequently writes about serverless and cloud-native microservices on Medium and on the Thundra blog.

Franck Pachot – Lausanne, Switzerland

Data Hero Franck Pachot is the Principal Consultant and Database Evangelist at dbi services (an AWS Select Partner) and is passionate about all databases. With over 20 years of experience in development, data modeling, infrastructure, and all DBA tasks, Franck is a recognized database expert across Oracle and AWS. Franck is also an AWS Academy educator for Powercoders, and holds AWS Certified Database Specialty and Oracle Certified Master certifications. Franck contributes to technical communities, educating customers on AWS Databases through his blog, Twitter, and podcast in French. He is active in the Data community, and enjoys talking and meeting other data enthusiasts at conferences.

Hiroko Nishimura – Washington DC, USA

Community Hero Hiroko Nishimura (Hiro) is the founder of AWS Newbies and Cloud Newbies, which help people with non-traditional technical backgrounds begin their explorations into the AWS Cloud. As a “career switcher” herself, she has been community building since 2018 to help others deconstruct Cloud Computing jargon so they, too, can begin a career in the Cloud. Finally putting her degrees in Special Education to good use, Hiro teaches “Introduction to AWS for Non-Engineers” courses at LinkedIn Learning, and introductory coding lessons at egghead.

Juliano Cristian – Santa Catarina, Brazil

Community Hero Juliano Cristian is CEO of Game Business Accelerator Academy and co-founder of Game Developers SC which participates in the AWS APN program. He organizes the AWS Game Tech Lumberyard User Group in Florianópolis, Brazil, holding Meetups, Practical Labs, Game Jams, and Workshops. He also conducts many lectures at universities, speaking with more than 90 educational institutions across Brazil, introducing students to cloud computing and AWS Game Tech services. Whenever he can, Juliano also participates in other AWS User Groups in Brazil and Latin America, working to build an increasingly motivated and productive community.

Jungyoul Yu – Seoul, Korea

Machine Learning Hero Jungyoul Yu works at Danggeun Market as a DevOps Engineer, and is a leader of the AWS DeepRacer Group, part of the AWS Korea User Group. He was one of the AWS DeepRacer League finalists in AWS Summit Seoul and AWS re:Invent 2019. Starting off with zero ML experience, Jungyoul used AWS DeepRacer to learn ML techniques and began sharing his learnings both in the AWS DeepRacer Community, with User Groups, at AWS Community Day, and at various meetups. He has also shared many blog posts and sample code such as DeepRacer Reward Function Simulator, Rank Notifier, and Auto submit bot.

Juv Chan – Singapore

Machine Learning Hero Juv Chan is an AI automation engineer at UBS. He is the AWS DeepRacer League Singapore Summit 2019 champion and a re:Invent Championship Cup 2019 finalist. He is the lead organizer for the AWS DeepRacer Beginner Challenge global virtual community race in 2020. Juv is also involved in sharing his DeepRacer and Machine Learning knowledge with the AWS Machine Learning community at both global and regional scale. Juv is a contributing writer for both Towards Data Science and Towards AI platforms, where he blogs about AWS AI/ML and cloud relevant topics.

Lukonde Mwila – Johannesburg, South Africa

Container Hero Lukonde Mwila is a Senior Software Engineer at Entelect. He has a passion for sharing knowledge through speaking engagements such as meetups and tech conferences, as well as writing technical articles. His talk at DockerCon 2020 on deploying multi-container applications to AWS was one of the top rated and most viewed sessions of the event. He is 3x AWS certified and is an advocate for containerization and serverless technologies. Lukonde enjoys sharing experiences of building out AWS infrastructure on Medium and sharing open source projects on GitHub for the developer community to easily consume, replicate, and improve for their own benefit.

Magdalena Zawada – Rybnik, Poland

Community Hero Magdalena Zawada is Director of Strategy and Expansion at LCloud Ltd. She has been working in IT for 11 years and started her adventure with AWS technology in 2013 as CEO of Hostersi Ltd. Magdalena willingly shares her knowledge and experience with the community, co-organizing AWS UG Warsaw meetings. She also organizes a series of events to support preparation for obtaining AWS Cloud Practitioner Certifications. Magdalena belongs to numerous industry organizations, including FinOps Foundation and ISSA Information Systems Security Association Poland, and in 2019 she was a participant of the AWS re:Invent Community Leader “We Power Tech” Diversity Grant in Las Vegas.

Nick Walter – Lincoln, USA

Data Hero Nick Walter has over 15 years of experience with enterprise IT solutions, including expertise and certifications in AWS, VMware, and Oracle. A passionate evangelist for data management solutions on AWS, Nick can often be found blogging, hosting webinars, or presenting at conferences regarding the latest trends in business critical database technologies. Recently, Nick has focused on helping clients find cost effective ways to handle both the technical and licensing challenges of migrating application stacks backed by commercial database engines, such as Oracle or MS SQL Server, into AWS.

Renato Losio – Berlin, Germany

Data Hero Renato Losio is the Principal Cloud Architect at Funambol, a provider of white label cloud services. He has been working with AWS technologies since 2011, and holds 7 AWS certifications (including the Database Specialty). Renato enjoys speaking at international events, including DevOps Pro Europe, DevOpsConf Russia, All Day DevOps, Codemotion, and Percona Live. Passionate about knowledge sharing, Renato is an editor at InfoQ and writes about different cloud-related topics on his blog, cloudiamo.com. Through his various platforms, he has covered different topics across AWS Databases, such as Amazon RDS Proxy, Amazon RDS, and Amazon Aurora.

Tomasz Lakomy – Poznan, Poland

Community Hero Tomasz Lakomy is a Senior Frontend Engineer at OLX Group, and an egghead.io instructor. Over the last two years, he’s been diving into the world of AWS and sharing what he’s learned with others. After passing the AWS Certified Solutions Architect: Associate exam in 2019 he has recorded multiple courses on serverless technologies, including “Build an App with the AWS Cloud Development Kit” and “Learn AWS Lambda from Scratch.” In addition, he’s active on his Twitter, blog (tlakomy.com), as well as The Practical Dev community, where he posts articles on career advice, testing and – of course – AWS.

If you’d like to learn more about the new Heroes, or connect with a Hero near you, please visit the AWS Hero website.

Ross;

[$] iproute2 and libbpf: vendoring on the small scale

Post Syndicated from original https://lwn.net/Articles/836911/rss

LWN’s recent article on Kubernetes in Debian discussed the challenges of packaging a massive project with hundreds of dependencies. Many of the issues that arose there, however, are not limited to such projects, as can be seen in the ongoing discussion about whether a copy of the relatively small libbpf library should be shipped with the iproute2 collection of networking tools. Fast-moving projects, it would seem, continue to feel limited by the restrictions imposed by the Linux distribution model.

Majority of Alexa Now Running on Faster, More Cost-Effective Amazon EC2 Inf1 Instances

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/majority-of-alexa-now-running-on-faster-more-cost-effective-amazon-ec2-inf1-instances/

Today, we are announcing that the Amazon Alexa team has migrated the vast majority of their GPU-based machine learning inference workloads to Amazon Elastic Compute Cloud (EC2) Inf1 instances, powered by AWS Inferentia. This resulted in 25% lower end-to-end latency, and 30% lower cost compared to GPU-based instances for Alexa’s text-to-speech workloads. The lower latency allows Alexa engineers to innovate with more complex algorithms and to improve the overall Alexa experience for our customers.

AWS built AWS Inferentia chips from the ground up to provide the lowest-cost machine learning (ML) inference in the cloud. They power the Inf1 instances that we launched at AWS re:Invent 2019. Inf1 instances provide up to 30% higher throughput and up to 45% lower cost per inference compared to GPU-based G4 instances, which were, before Inf1, the lowest-cost instances in the cloud for ML inference.

Alexa is Amazon’s cloud-based voice service that powers Amazon Echo devices and more than 140,000 models of smart speakers, lights, plugs, smart TVs, and cameras. Today customers have connected more than 100 million devices to Alexa. And every month, tens of millions of customers interact with Alexa to control their home devices (“Alexa, increase temperature in living room,” “Alexa, turn off bedroom”), to listen to radio stations and music (“Alexa, start Maxi 80 on bathroom,” “Alexa, play Van Halen from Spotify”), to be informed (“Alexa, what is the news?” “Alexa, is it going to rain today?”), or to be educated or entertained with 100,000+ Alexa Skills.

If you ask Alexa where she lives, she’ll tell you she is right here, but her head is in the cloud. Indeed, Alexa’s brain is deployed on AWS, where she benefits from the same agility, large-scale infrastructure, and global network we built for our customers.

How Alexa Works
When I’m in my living room and ask Alexa about the weather, I trigger a complex system. First, the on-device chip detects the wake word (Alexa). Once detected, the microphones record what I’m saying and stream the sound for analysis in the cloud. At a high level, there are two phases to analyze the sound of my voice. First, Alexa converts the sound to text. This is known as Automatic Speech Recognition (ASR). Once the text is known, the second phase is to understand what I mean. This is Natural Language Understanding (NLU). The output of NLU is an Intent (what does the customer want) and associated parameters. In this example (“Alexa, what’s the weather today?”), the intent might be “GetWeatherForecast” and the parameter can be my postcode, inferred from my profile.
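
To make the idea concrete, here is a purely hypothetical sketch of what such an NLU output could look like as a Python data structure; the field names and values below are illustrative only and do not represent Alexa's actual internal format.

# Hypothetical NLU output for "Alexa, what's the weather today?"
# (illustrative only; not Alexa's real internal representation)
nlu_output = {
    "intent": "GetWeatherForecast",      # what the customer wants
    "parameters": {
        "postcode": "10117",             # inferred from the user's profile
        "date": "today",
    },
}

# A fulfillment service would route on the intent name:
if nlu_output["intent"] == "GetWeatherForecast":
    postcode = nlu_output["parameters"]["postcode"]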

This whole process uses Artificial Intelligence heavily to transform the sound of my voice to phonemes, phonemes to words, words to phrases, phrases to intents. Based on the NLU output, Alexa routes the intent to a service to fulfill it. The service might be internal to Alexa or external, like one of the skills activated on my Alexa account. The fulfillment service processes the intent and returns a response as a JSON document. The document contains the text of the response Alexa must say.

The last step of the process is to generate the voice of Alexa from the text. This is known as Text-To-Speech (TTS). As soon as the TTS starts to produce sound data, it is streamed back to my Amazon Echo device: “The weather today will be partly cloudy with highs of 16 degrees and lows of 8 degrees.” (I live in Europe, these are Celsius degrees 🙂 ). This Text-To-Speech process also relies heavily on machine learning models to build a phrase that sounds natural in terms of pronunciation, rhythm, connections between words, intonation, and so on.

Alexa is one of the most popular hyperscale machine learning services in the world, with billions of inference requests every week. Of Alexa’s three main inference workloads (ASR, NLU, and TTS), TTS workloads initially ran on GPU-based instances. But the Alexa team decided to move to the Inf1 instances as fast as possible to improve the customer experience and reduce the service compute cost.

What is AWS Inferentia?
AWS Inferentia is a custom chip, built by AWS, to accelerate machine learning inference workloads and optimize their cost. Each AWS Inferentia chip contains four NeuronCores. Each NeuronCore implements a high-performance systolic array matrix multiply engine, which massively speeds up typical deep learning operations such as convolution and transformers. NeuronCores are also equipped with a large on-chip cache, which helps cut down on external memory accesses, dramatically reducing latency and increasing throughput.

AWS Inferentia can be used natively from popular machine-learning frameworks like TensorFlow, PyTorch, and MXNet, with AWS Neuron. AWS Neuron is a software development kit (SDK) for running machine learning inference using AWS Inferentia chips. It consists of a compiler, run-time, and profiling tools that enable you to run high-performance and low latency inference.
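
As a rough illustration of the developer workflow described above, the following sketch compiles a small PyTorch model with the Neuron SDK's tracing API. It assumes the torch and torch-neuron packages are installed (for example, via an AWS Deep Learning AMI); the model and input shapes are placeholders rather than anything Alexa actually runs.

# Minimal sketch: compile a PyTorch model for AWS Inferentia with torch-neuron.
# The model below is a placeholder standing in for a real inference workload.
import torch
import torch.nn as nn
import torch_neuron  # registers the torch.neuron namespace

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).eval()
example_input = torch.zeros(1, 128)  # shape is illustrative only

# Compile ahead of time for the NeuronCores on an Inf1 instance.
model_neuron = torch.neuron.trace(model, example_inputs=[example_input])
model_neuron.save("model_neuron.pt")

# At inference time on Inf1, load and invoke it like any TorchScript model.
restored = torch.jit.load("model_neuron.pt")
output = restored(example_input)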

Who Else is Using Amazon EC2 Inf1?
In addition to Alexa, Amazon Rekognition is also adopting AWS Inferentia. Running models such as object classification on Inf1 instances resulted in 8x lower latency and doubled throughput compared to running these models on GPU instances.

Customers, from Fortune 500 companies to startups, are using Inf1 instances for machine learning inference. For example, Snap Inc.​ incorporates machine learning (ML) into many aspects of Snapchat, and exploring innovation in this field is a key priority for them. Once they heard about AWS Inferentia, they collaborated with AWS to adopt Inf1 instances to help with ML deployment, including around performance and cost. They started with their recommendation models inference, and are now looking forward to deploying more models on Inf1 instances in the future.

Conde Nast, one of the world’s leading media companies, saw a 72% reduction in cost of inference compared to GPU-based instances for its recommendation engine. And Anthem, one of the leading healthcare companies in the US, observed 2x higher throughput compared to GPU-based instances for its customer sentiment machine learning workload.

How to Get Started with Amazon EC2 Inf1
You can start using Inf1 instances today.

If you prefer to manage your own machine learning application development platforms, you can get started by either launching Inf1 instances with AWS Deep Learning AMIs, which include the Neuron SDK, or you can use Inf1 instances via Amazon Elastic Kubernetes Service or Amazon ECS for containerized machine learning applications. To learn more about running containers on Inf1 instances, read this blog to get started on ECS and this blog to get started on EKS.

The easiest and quickest way to get started with Inf1 instances is via Amazon SageMaker, a fully managed service that enables developers to build, train, and deploy machine learning models quickly.

Get started with Inf1 on Amazon SageMaker today.

— seb

PS: The team just released this video, check it out!

Combining encryption and signing with AWS asymmetric keys

Post Syndicated from J.D. Bean original https://aws.amazon.com/blogs/security/combining-encryption-and-signing-with-aws-asymmetric-keys/

In this post, I discuss how to use AWS Key Management Service (KMS) to combine asymmetric digital signature and asymmetric encryption of the same data.

The addition of support for asymmetric keys in AWS KMS has exciting use cases for customers. The ability to create, manage, and use public and private key pairs with KMS enables you to perform digital signing operations using RSA and Elliptic Curve Cryptography (ECC) keys. AWS KMS asymmetric keys can also be used to perform digital encryption operations using RSA keys. You can use these features together to digitally sign and encrypt the same data.

Another notable property of AWS KMS asymmetric keys is that they enable disconnected use cases. For example AWS KMS asymmetric keys can be used to cryptographically verify a digital signature client-side without the need for a network connection. AWS KMS asymmetric keys also enable scenarios where customers can use KMS to securely manage decryption of data that has been encrypted by a partner’s system that does not integrate with AWS APIs or have access to AWS account credentials. For the sake of simplicity, however, the example that I discuss in this post describes a connected use case where all cryptographic actions are performed server-side in AWS KMS using AWS credentials. The use of AWS KMS asymmetric keys throughout this post allows the overall approach to be adapted to disconnected and/or non-AWS-integrated use cases.

Overview

This post contains three basic steps.

  1. Create and configure AWS asymmetric customer master keys (CMK), AWS Identity and Access Management (IAM) roles, and key policies.
  2. Use your asymmetric CMKs to encrypt and sign a sample message in the role of a sender.
  3. Use AWS KMS to decrypt and verify the message signature of the sample message archive you generated in the previous procedure using your asymmetric CMKs in the role of a receiver.

Prerequisites

The commands I use in this tutorial were tested using AWS Command Line Interface (AWS CLI) version 2.50 on Amazon Linux 2. In order to run these commands in your own local environment, ensure that you have first installed and updated the AWS CLI.

I assume you have at least one administrator identity available to you that has broad rights for creating roles, assuming roles, as well as creating, managing and using KMS keys. This can be a federated identity (for example, from your corporate identity provider or from a social identity), or it can be an AWS IAM user. Where no AWS identity is mentioned, I assume that you will be accessing the AWS Management Console or the AWS CLI using this administrator identity.

For simplicity, I create the KMS keys in the same Region as each other. You must specify an AWS Region when using the AWS CLI, either explicitly or by setting a default Region. Before beginning, you should select an AWS Region to work in, such as US East (N. Virginia). If you have not configured the AWS CLI in your environment, please review the Configuration basics section of the AWS Command Line Interface User Guide for instructions. You may revert this configuration once you have finished if you do not wish to continue using a default Region with your AWS CLI. Take note of your selected Region. When working in the AWS Console, if you do not see resources, such as AWS KMS keys, that you expect, you may want to confirm that you are viewing resources in your chosen Region. For more information on selecting your Region in the AWS Console, see Choosing a Region in the AWS Management Console Getting Started Guide.

Create and configure resources

In the first phase of this tutorial you create and configure two asymmetric AWS KMS CMKs, two AWS IAM roles, and configure the key policies for both of your KMS CMKs to grant permissions to the roles, as shown in the following figure.

Figure 1: Create keys, roles, and key policies

Create asymmetric signing and encryption key pairs

In the first step, you create two asymmetric master keys (CMK). One is configured for signing and verifying digital signatures while the other is configured for encrypting and decrypting data.

Note: The CMKs configured for this post are examples. RSA and Elliptic curve CMKs key specs can differ in a variety of dimensions. The RSA or elliptic curve key spec that you choose might be determined by your security standards or the requirements of your task. Different CMK key specs are priced differently and are subject to different request quotas because they each have different performance profiles. In general, use RSA or ECC keys with the highest security level that is practical and affordable for your task. For more information on CMK configuration options, please review the How to choose your CMK configuration section of the KMS Developer Guide.

To create a CMK for encryption and decryption

  1. Use the KMS CreateKey API. Pass RSA_4096 for the CustomerMasterKeySpec parameter and ENCRYPT_DECRYPT for the KeyUsage parameter in the AWS CLI example command below in order to generate an RSA 4096 key pair for encryption and decryption using AWS KMS.
    aws kms create-key --customer-master-key-spec RSA_4096 \
        --key-usage ENCRYPT_DECRYPT \
        --description "Sample Digital Encryption Key Pair"
    

    Note: If successful, this command returns a KeyMetadata object. Take note of the KeyID value in this object.

  2. As a best practice, assign an alias for your key. Use the following command to assign an alias of sample-encrypt-decrypt-key to your newly created CMK (replace the target-key-id value of 1234abcd-12ab-34cd-56ef-1234567890ab with your KeyID). Mapping a human-readable alias to the KeyID will make it easier to identify, use, and manage.
    aws kms create-alias \
        --alias-name alias/sample-encrypt-decrypt-key \
        --target-key-id 1234abcd-12ab-34cd-56ef-1234567890ab
    

To create a CMK for signature and verification

  1. Use the KMS CreateKey API. Pass ECC_NIST_P521 for the CustomerMasterKeySpec parameter and SIGN_VERIFY for the KeyUsage parameter in the AWS CLI example command below in order to generate an elliptic curve (ECC) key pair for signature creation and verification using AWS KMS.
    aws kms create-key --customer-master-key-spec \
        ECC_NIST_P521  \
        --key-usage SIGN_VERIFY \
        --description "Sample Digital Signature Key Pair"
    

    Note: If successful, this command returns a KeyMetadata object. Take note of the KeyID value.

  2. Use the following command to assign an alias of sample-sign-verify-key to your newly created CMK (replace the target-key-id value of 1234abcd-12ab-34cd-56ef-1234567890ab with your KeyID).
    aws kms create-alias \
        --alias-name alias/sample-sign-verify-key \
        --target-key-id 1234abcd-12ab-34cd-56ef-1234567890ab
    

Create sender and receiver roles

For the next step of this tutorial, you create two AWS principals. Use the steps that follow to create two roles—a sender principal and a receiver principal. Later, you will grant permissions to perform private key operations (sign and decrypt) and public key operations (verify and encrypt) to these roles.

To create and configure the roles

  1. Navigate to the AWS Identity and Access Management (IAM) Create role console dialogue that allows entities in a specified account to assume the role. Enter your Account ID and choose Next, as shown in the following figure.

    Note: If you don’t know your AWS account ID, please read Finding your AWS account ID in the AWS IAM User Guide for guidance on how to obtain this information.

    Figure 2: Enter your account ID to begin creating a role in AWS IAM

  2. Select Next through the next two screens.

    Note: By clicking next through these dialogues you do not attach an IAM permissions policy or a tag to this new role.

  3. On the final screen, enter a Role name of SenderRole and a Role description of your choice, as shown in the following figure.

    Figure 3: Create the sender role

  4. Choose Create role to finish creating the sender role.
  5. To create the receiver role, repeat the preceding role creation process. However, in step 3, substitute the name ReceiverRole for SenderRole.

Configure key policy permissions

Best practice is to adhere to the principle of least privilege and provide each AWS principal with the minimal permissions necessary to perform its tasks. The sender and receiver roles that you created in the previous step currently have no permissions in your account. For this scenario, the receiver principal must be granted permission to verify digital signatures and decrypt data in AWS KMS using your asymmetric CMKs and the sender principal must be granted permission to create digital signatures and encrypt data in KMS using your asymmetric CMKs.

To provide access control permissions for AWS KMS actions to your AWS principals, attach a key policy to each of your CMKs.

Modify the CMK key policy

For the sample-encrypt-decrypt-key CMK, grant the IAM role for the sender principal (SenderRole) kms:Encrypt permissions and the IAM role for the receiver principal (ReceiverRole) kms:Decrypt permissions in the CMK key policy.

To modify the CMK key policy (console)

  1. Navigate to the AWS KMS page in the AWS Console and select customer-managed keys.
  2. Select your sample-encrypt-decrypt-key CMK.
  3. In the key policy section, choose edit.
  4. To allow your receiver principal to use the CMK to decrypt data encrypted under that CMK, append the following statement to the key policy (replace the account ID value of 111122223333 with your own).
    {
        "Sid": "Allow use of the CMK for decryption",
        "Effect": "Allow",
        "Principal": {"AWS":"arn:aws:iam::111122223333:role/ReceiverRole"},
        "Action": "kms:Decrypt",
        "Resource": "*"
    }
    

  5. To allow your sender principal to use the CMK to encrypt data, append the following statement to the key policy (replace the account ID value of 111122223333 with your own):
    {
        "Sid": "Allow use of the CMK for encryption",
        "Effect": "Allow",
        "Principal": {"AWS":"arn:aws:iam::111122223333:role/SenderRole"},
        "Action": "kms:Encrypt",
        "Resource": "*"
    }
    

  6. Choose Save changes.

Note: The kms:Encrypt permission is sufficient to permit the sender principal to encrypt small amounts of arbitrary data using your CMK directly.

Grant sign and verify permissions to the CMK key policy

For the sample-sign-verify-key CMK, grant the IAM role for the sender principal (SenderRole) kms:Sign permissions in the CMK key policy and the IAM role for the receiver principal (ReceiverRole) kms:Verify permissions in the CMK key policy.

To grant sign and verify permissions

  1. Using the same process as above, navigate to the key policy edit dialog for the sample-sign-verify-key CMK in the AWS console.
  2. To allow your sender principal to use the CMK to create digital signatures, append the following statement to the key policy (replace the account ID value of 111122223333 with your own).
    {
        "Sid": "Allow use of the CMK for digital signing",
        "Effect": "Allow",
        "Principal": {"AWS":"arn:aws:iam::111122223333:role/SenderRole"},
        "Action": "kms:Sign",
        "Resource": "*"
    }
    

  3. To allow your receiver principal to use the CMK to verify signatures created by that CMK, append the following statement to the key policy (replace the account ID value of 111122223333 with your own):
    {
        "Sid": "Allow use of the CMK for digital signature verification",
        "Effect": "Allow",
        "Principal": {"AWS":"arn:aws:iam::111122223333:role/ReceiverRole"},
        "Action": "kms:Verify",
        "Resource": "*"
    }
    

  4. Choose Save changes.

Key permissions summary

When these key policy edits have been completed, the sender principal:

  • Will have permissions to encrypt data using the sample-encrypt-decrypt-key CMK and generate digital signatures using the sample-sign-verify-key CMK.
  • Will not have permissions to decrypt or to verify signatures using the CMKs.

The receiver principal:

  • Will have permissions to decrypt data which has been encrypted using the sample-encrypt-decrypt-key CMK and to verify signatures created using the sample-sign-verify-key CMK.
  • Will not have permissions to encrypt or to generate signatures using the CMKs.

Figure 4: Summary of key policy permissions

Signing and encrypting a sample message

So far, you’ve created two asymmetric CMKs, created a set of sender and receiver roles, and configured permissions for those roles in each of your CMK key policies. In the second phase of this tutorial, you assume the role of sender and use your asymmetric signature and verification CMK to sign a sample message. You then bundle the sample message and its corresponding digital signature together into an archive and use your encryption and decryption asymmetric CMK to encrypt the archive.

Figure 5: Creating a message signature and encrypting the message along with its signature

Note: The order of operations in this process is that the message is first signed and then the signature and the message are encrypted together. This order is intentional. When a message is signed and then encrypted, neither the contents nor the identity of the sender will be available to unauthorized 3rd parties. If the order of operations were reversed, however, and a message was first encrypted and then signed it could leak information about the sender’s identity to unauthorized 3rd parties. Moreover, when a message is encrypted and then signed, an unauthorized 3rd party with access to the files could discard the authentic signature created by the sender and replace it with a valid signature created by their own key. This creates the potential for a 3rd party to deceptively create the appearance that they are the legitimate sender of the message and exploit that misperception further.

Assume the sender role

Start by assuming the sender role. In order to successfully assume a role you must authenticate as an IAM principal which has permission to perform sts:AssumeRole. If the principal you are authenticated as lacks this permission, you will not be able to assume the sender role.

To assume the sender role

  1. Run the following command, but be sure to replace the account ID value of 111122223333 with your account ID:
    aws sts assume-role \
        --role-arn arn:aws:iam::111122223333:role/SenderRole \
        --role-session-name AWSCLI-Session
    

  2. The return value for this command provides an access key ID, secret key, and session token. Substitute them into their respective places in the following commands and execute:
    export AWS_ACCESS_KEY_ID=ExampleAccessKeyID1
    export AWS_SECRET_ACCESS_KEY=ExampleSecretKey1
    export AWS_SESSION_TOKEN=ExampleSessionToken1
    

  3. Confirm that you’ve successfully assumed the sender role by issuing:
    aws sts get-caller-identity
    

    Note: If the output of this command contains the text assumed-role/SenderRole, then you’ve successfully assumed the sender role.

Create a message

Now, create a sample message file called message.json.

To create a message

Run the following command to create a message with the following content:

echo '
{ 
    "message": "The Magic Words are Squeamish Ossifrage", 
    "sender": "Sender Principal" 
}
' > ./message.json 

Create a digital signature

Creating and verifying a digital signature for the message provides confidence that the message contents haven’t been altered after being sent. This characteristic is known as integrity. Furthermore, when access to a signing key is scoped to a particular principal, creating and verifying a digital signature for the message provides confidence in the sender’s identity. This characteristic is known as authenticity. Finally, a high degree of confidence in both the integrity and authenticity of a message limits the plausible ability of a sender to fraudulently deny having signed a message. This characteristic is known as non-repudiation.

To create a digital signature

Run the following command to create a digital signature for message.json:

aws kms sign \
    --key-id alias/sample-sign-verify-key \
    --message-type RAW \
    --signing-algorithm ECDSA_SHA_512 \
    --message fileb://message.json \
    --output text \
    --query Signature | base64 --decode > message.sig

This generates an independent digital signature file, message.sig, for message.json. Any modification to the contents of message.json, such as changing the sender or message fields, will now cause signature validation of message.sig to fail for message.json.

Encrypt the message and signature

Even with the benefits of a digital signature, the message could still be viewed by any party with access to the file. In order to provide confidence that the message contents aren’t exposed to unauthorized parties, you can encrypt the message. This characteristic is known as confidentiality. In order to retain the benefits of your digital signature you can encrypt the message and corresponding signature together in a single package.

To encrypt the message and signature

  1. Combine your message and signature into an archive. For example, with the GNU Tar utility you can issue the following:
    tar -czvf message.tar.gz message.sig message.json
    

    This will create a new archive file named message.tar.gz containing both your message and message signature.

  2. Encrypt the archive using AWS KMS. To do so, issue the following command:
    aws kms encrypt \
        --key-id alias/sample-encrypt-decrypt-key \
        --encryption-algorithm RSAES_OAEP_SHA_256 \
        --plaintext fileb://message.tar.gz \
        --output text \
        --query CiphertextBlob | base64 --decode > message.enc
    

    This will output a message.enc file containing an encrypted copy of the message.tar.gz archive.
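
If you prefer to script the sender-side steps above rather than run the AWS CLI commands, the following is a minimal boto3 sketch of the same sign-then-encrypt flow. It assumes the same key aliases and file names used in this post and that the sender role's temporary credentials are already exported in your environment; treat it as an illustration, not a drop-in replacement for the tutorial commands.

# Sender side: sign message.json, bundle it with the signature, encrypt the bundle.
# Assumes the sender role's credentials are in the environment and that
# message.json and the two CMK aliases from this post already exist.
import tarfile
import boto3

kms = boto3.client("kms")

# 1. Create a digital signature for the message (same as `aws kms sign`).
with open("message.json", "rb") as f:
    message = f.read()
signature = kms.sign(
    KeyId="alias/sample-sign-verify-key",
    Message=message,
    MessageType="RAW",
    SigningAlgorithm="ECDSA_SHA_512",
)["Signature"]
with open("message.sig", "wb") as f:
    f.write(signature)

# 2. Bundle the message and its signature into an archive.
with tarfile.open("message.tar.gz", "w:gz") as tar:
    tar.add("message.json")
    tar.add("message.sig")

# 3. Encrypt the archive (same as `aws kms encrypt`).
with open("message.tar.gz", "rb") as f:
    archive = f.read()
ciphertext = kms.encrypt(
    KeyId="alias/sample-encrypt-decrypt-key",
    Plaintext=archive,
    EncryptionAlgorithm="RSAES_OAEP_SHA_256",
)["CiphertextBlob"]
with open("message.enc", "wb") as f:
    f.write(ciphertext)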

Decrypting and verifying a sample message

Now that you’ve created, signed, and encrypted a message, let’s change gears and see what working with this message.enc file is like from the perspective of a receiving party. In the final phase of this tutorial you assume the role of receiver and use your asymmetric CMKs to decrypt the encrypted message archive and verify the digital signature that you created. Finally, you will view your message. The process is shown in the following figure.

Figure 6: Decrypting a message archive and verifying the message signature

Assume the receiver role

Assume the receiver role so that you can simulate receiving a signed and encrypted message. As before, in order to assume the receiver role you must authenticate as an IAM principal which has permission to perform sts:AssumeRole. If the principal you are authenticated as lacks this permission, you will not be able to assume the receiver role.

To assume the receiver role

  1. Copy the message.enc file to a new directory to create a clean working space and navigate there in a terminal session.
  2. Assume your receiver role. To do so, execute the following command, replacing the account ID value of 111122223333 with your own:
    aws sts assume-role \
        --role-arn arn:aws:iam::111122223333:role/ReceiverRole \
    	--role-session-name AWSCLI-Session
    

  3. The return value for this command provides an access key ID, secret key, and session token. Substitute them into their respective places in the following commands and execute:
    export AWS_ACCESS_KEY_ID=ExampleAccessKeyID1
    export AWS_SECRET_ACCESS_KEY=ExampleSecretKey1
    export AWS_SESSION_TOKEN=ExampleSessionToken1
    

  4. Confirm that you have successfully assumed the receiver role by issuing:
    aws sts get-caller-identity
    

If the output of this command contains the text assumed-role/ReceiverRole then you have successfully assumed the receiver role.

Decrypt the encrypted message archive in AWS KMS

Decrypt the encrypted message archive to access the plaintext of the message and message signature files.

To decrypt the encrypted message archive

  1. Issue the following command:
    aws kms decrypt \
        --key-id alias/sample-encrypt-decrypt-key \
        --ciphertext-blob fileb://message.enc \
        --encryption-algorithm RSAES_OAEP_SHA_256 \
        --output text \
        --query Plaintext | base64 --decode > message.tar.gz
    

  2. This will create an unencrypted message.tar.gz file that you can unpack with:
    tar -xzvf message.tar.gz
    

This, in turn, will expand the archive contents message.sig and message.json in your working directory.

Verify the message signature

To verify the signature on the message issue the following command:

aws kms verify \
    --key-id alias/sample-sign-verify-key \
    --message-type RAW \
    --message fileb://message.json \
    --signing-algorithm ECDSA_SHA_512 \
    --signature fileb://message.sig

In the response you should see that SignatureValid is marked true indicating that the signature has been verified using the specified sample-sign-verify-key that you granted the sender principal permission to generate signatures with.
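
For completeness, here is a minimal boto3 sketch of the receiver-side decrypt-and-verify flow, again assuming the receiver role's credentials are in your environment and the same aliases and file names as above. It mirrors the CLI steps in this section and is intended only as an illustration.

# Receiver side: decrypt message.enc, unpack it, and verify the signature.
# Assumes the receiver role's credentials are already exported in the environment.
import tarfile
import boto3

kms = boto3.client("kms")

# 1. Decrypt the archive (same as `aws kms decrypt`).
with open("message.enc", "rb") as f:
    ciphertext = f.read()
plaintext = kms.decrypt(
    KeyId="alias/sample-encrypt-decrypt-key",
    CiphertextBlob=ciphertext,
    EncryptionAlgorithm="RSAES_OAEP_SHA_256",
)["Plaintext"]
with open("message.tar.gz", "wb") as f:
    f.write(plaintext)

# 2. Unpack the message and its signature.
with tarfile.open("message.tar.gz", "r:gz") as tar:
    tar.extractall()

# 3. Verify the signature (same as `aws kms verify`).
with open("message.json", "rb") as f:
    message = f.read()
with open("message.sig", "rb") as f:
    signature = f.read()
result = kms.verify(
    KeyId="alias/sample-sign-verify-key",
    Message=message,
    MessageType="RAW",
    SigningAlgorithm="ECDSA_SHA_512",
    Signature=signature,
)
print("SignatureValid:", result["SignatureValid"])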

View the message

Finally, open message.json and view the file’s contents by issuing the following command:

less message.json

You will see that the contents of the file have not been modified and still read:

{ 
    "message": "The Magic Words are Squeamish Ossifrage", 
    "sender": "Sender Principal" 
}

Note: Be careful to avoid making any changes to the contents of this file. Even a minor modification of the message contents will compromise the integrity of the message and cause future attempts at signature validation using your message.sig file to fail.

Summary

In this tutorial, you signed and encrypted data using two AWS KMS asymmetric CMKs and later decrypted and verified your signature using those CMKs.

You first created two asymmetric CMKs in AWS KMS, one for creating and verifying digital signatures and the other for encrypting and decrypting data. You then configured key policy permissions for your sender and receiver principals. Acting as your sender principal, you digitally signed a message in AWS KMS, added the message and signature to an archive and then encrypted that archive in AWS KMS. Next you assumed your receiver role and decrypted the archive in AWS KMS, viewed your message, and verified its signature in AWS KMS.

To learn more about the asymmetric keys feature of AWS KMS, please read the AWS KMS Developer Guide. If you have questions about the asymmetric keys feature, please start a new thread on the AWS KMS forum. If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

J.D. Bean

J.D. is a Senior Solutions Architect at AWS working with public sector organizations and financial institutions based out of New York City. His interests include security, privacy, and compliance. He is passionate about his work enabling AWS customers’ successful cloud journeys. J.D. holds a Bachelor of Arts from The George Washington University and a Juris Doctor from New York University School of Law.

Building an event-driven application with AWS Lambda and the Amazon Redshift Data API

Post Syndicated from Manash Deb original https://aws.amazon.com/blogs/big-data/building-an-event-driven-application-with-aws-lambda-and-the-amazon-redshift-data-api/

Event-driven applications are becoming popular with many customers, where applications run in response to events. A primary benefit of this architecture is the decoupling of producer and consumer processes, allowing greater flexibility in application design.

An example of an event-driven application is an automated workflow being triggered by an event, which runs a series of transformations in the data warehouse. At the end of this workflow, another event gets initiated to notify end-users about the completion of those transformations and that they can start analyzing the transformed dataset.

In this post, we explain how you can easily design a similar event-driven application with Amazon Redshift, AWS Lambda, and Amazon EventBridge. In response to a scheduled event defined in EventBridge, this application automatically triggers a Lambda function to run a stored procedure performing extract, load, and transform (ELT) operations in an Amazon Redshift data warehouse, using its out-of-the-box integration with the Amazon Redshift Data API. This stored procedure copies the source data from Amazon Simple Storage Service (Amazon S3) to Amazon Redshift and aggregates the results. When complete, it sends an event to EventBridge, which triggers a Lambda function to send notification to end-users through Amazon Simple Notification Service (Amazon SNS) to inform them about the availability of updated data in Amazon Redshift.

This event-driven serverless architecture offers greater extensibility and simplicity, making it easier to maintain, faster to release new features, and less exposed to the impact of changes. It also simplifies adding other components or third-party products to the application without many changes.

Prerequisites

As a prerequisite for creating the application in this post, you need to set up an Amazon Redshift cluster and associate it with an AWS Identity and Access Management (IAM) role. For more information, see Getting Started with Amazon Redshift.

Solution overview

The following architecture diagram highlights the end-to-end solution, which you can provision automatically with an AWS CloudFormation template.

The workflow includes the following steps:

  1. The EventBridge rule EventBridgeScheduledEventRule is initiated based on a cron schedule.
  2. The rule triggers the Lambda function LambdaRedshiftDataApiETL, with the action run_sql as an input parameter. The Python code for the Lambda function is available in the GitHub repo.
  3. The function performs an asynchronous call to the stored procedure run_elt_process in Amazon Redshift, performing ELT operations using the Amazon Redshift Data API.
  4. The stored procedure uses the Amazon S3 location event-driven-app-with-lambda-redshift/nyc_yellow_taxi_raw/ as the data source for the ELT process. We have pre-populated this with the NYC Yellow Taxi public dataset for the year 2015 to test this solution.
  5. When the stored procedure is complete, the EventBridge rule EventBridgeRedshiftEventRule is triggered automatically to capture the event based on the source parameter redshift-data from the Amazon Redshift Data API.
  6. The rule triggers the Lambda function LambdaRedshiftDataApiETL, with the action notify as an input parameter.
  7. The function uses the SNS topic RedshiftNotificationTopicSNS to send an automated email notification to end-users that the ELT process is complete.

The Amazon Redshift database objects required for this solution are provisioned automatically by the Lambda function LambdaSetupRedshiftObjects as part of the CloudFormation template initiation by invoking the function LambdaRedshiftDataApiETL, which creates the following objects in Amazon Redshift:

  • Table nyc_yellow_taxi, which we use to copy the New York taxi dataset from Amazon S3
  • Materialized view nyc_yellow_taxi_volume_analysis, providing an aggregated view of table
  • Stored procedure run_elt_process to take care of data transformations

The Python code for this function is available in the GitHub repo.

We also use the IAM role LambdaRedshiftDataApiETLRole for the Lambda function LambdaRedshiftDataApiETL to allow the following permissions:

  • Federate to the Amazon Redshift cluster through getClusterCredentials permission, avoiding password credentials
  • Initiate queries in the Amazon Redshift cluster through redshift-data API calls
  • Log with Amazon CloudWatch for troubleshooting purposes
  • Send notifications through Amazon SNS

A sample IAM role for this function is available in the GitHub repo.

Lambda is a key service in this solution because it initiates queries in Amazon Redshift using the redshift-data client. Based on the input parameter action, this function can asynchronously initiate Structured Query Language (SQL) statements in Amazon Redshift, thereby avoiding chances of timing out in case of long-running SQL statements. It can also publish custom notifications through Amazon SNS. Also, it uses the Amazon Redshift Data API temporary credentials functionality, which allows it to communicate with Amazon Redshift using IAM permissions without the need for any password-based authentication. With the Data API, you also don’t need to configure drivers and connections for your Amazon Redshift cluster, because that’s handled automatically.
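
As an illustration of that pattern, the sketch below shows how a Lambda handler might call the Data API with boto3. The cluster identifier, database, user, SQL text, and SNS topic ARN are placeholders, and the actual handler in the GitHub repo differs in its details.

# Minimal sketch of calling the Amazon Redshift Data API from a Lambda handler.
# Cluster, database, user, SQL, and topic ARN below are placeholders.
import boto3

redshift_data = boto3.client("redshift-data")
sns = boto3.client("sns")

def handler(event, context):
    action = event.get("action", "run_sql")

    if action == "run_sql":
        # Fire-and-forget: execute_statement returns immediately with a statement Id,
        # so the function does not wait for the long-running stored procedure.
        response = redshift_data.execute_statement(
            ClusterIdentifier="my-redshift-cluster",   # placeholder
            Database="dev",                            # placeholder
            DbUser="awsuser",                          # temporary credentials via IAM
            Sql="call run_elt_process();",
            WithEvent=True,                            # emit an EventBridge event on completion
        )
        # The Id can be polled later with describe_statement if needed.
        return {"statement_id": response["Id"]}

    if action == "notify":
        sns.publish(
            TopicArn="arn:aws:sns:us-east-1:111122223333:RedshiftNotificationTopicSNS",
            Subject="ELT process complete",
            Message="The Amazon Redshift ELT process has finished.",
        )
        return {"notified": True}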

Deploying the CloudFormation template

When your Amazon Redshift cluster is set up, use the provided CloudFormation template to automatically create all required resources for this solution in your AWS account. For more information, see Getting started with AWS CloudFormation.

The template requires you to provide the following parameters:

  • RedshiftClusterIdentifier – Cluster identifier for your Amazon Redshift cluster.
  • DbUsername – Amazon Redshift database user name that has access to run the SQL script.
  • DatabaseName – Name of the Amazon Redshift database where the SQL script runs.
  • RedshiftIAMRoleARN – ARN of the IAM role associated with the Amazon Redshift cluster.
  • NotificationEmailId – Email to send event notifications through Amazon SNS.
  • ExecutionSchedule – Cron expression to schedule the ELT process through an EventBridge rule.
  • SqlText – SQL text to run as part of the ELT process. Don’t change the default value call run_elt_process(); if you want to test this solution with the test dataset provided for this post.

The following screenshot shows the stack details on the AWS CloudFormation console.

Testing the pipeline

After setting up the architecture, you should have an automated pipeline to trigger based on the schedule you defined in the EventBridge rule’s cron expression. You can view the CloudWatch logs and troubleshoot issues in the Lambda function. The following screenshot shows the logs for our setup.

You can also view the query status on the Amazon Redshift console, which allows you to view detailed execution plans for the queries you ran. Although the stored procedure may take around 6 minutes to complete, the Lambda function finishes in seconds. This is primarily because the execution from Lambda on Amazon Redshift is asynchronous. Therefore, the function completes after initiating the process in Amazon Redshift, without waiting for the query to finish.

When this process is complete, you receive the email notification that the ELT process is complete.

You may then view the updated data in your business intelligence tool, like Amazon QuickSight, or query data directly in Amazon Redshift Query Editor (see the following screenshot) to view the most recent data processed by this event-driven architecture.

Conclusion

The Amazon Redshift Data API enables you to painlessly interact with Amazon Redshift and enables you to build event-driven and cloud-native applications. We demonstrated how to build an event-driven application with Amazon Redshift, Lambda, and EventBridge. For more information about the Data API, see Using the Amazon Redshift Data API to interact with Amazon Redshift clusters and Using the Amazon Redshift Data API.


About the Authors

Manash Deb is a Senior Analytics Specialist Solutions Architect. He has worked in different database and data warehousing technologies for more than 15 years.

Debu Panda, a senior product manager at AWS, is an industry leader in analytics, application platform, and database technologies. He has more than 20 years of experience in the IT industry and has published numerous articles on analytics, enterprise Java, and databases and has presented at multiple conferences. He is lead author of the EJB 3 in Action (Manning Publications 2007, 2014) and Middleware Management (Packt).

Fei Peng is a Software Dev Engineer working in the Amazon Redshift team.

Vanguard Perspectives: Microsoft 365 to Veeam Backup to Backblaze B2 Cloud Storage

Post Syndicated from Natasha Rabinov original https://www.backblaze.com/blog/vanguard-perspectives-microsoft-365-to-veeam-backup-to-backblaze-b2-cloud-storage/

Ben Young works for vBridge, a cloud service provider in New Zealand. He specializes in the automation and integration of a broad range of cloud and virtualization technologies. Ben is also a member of the Veeam® Vanguard program, Veeam's top-level influencer community (he is not a Veeam employee). Because Backblaze's new S3 Compatible APIs enable Backblaze B2 Cloud Storage as an endpoint in the Veeam ecosystem, we reached out to Ben, in his role as a Veeam Vanguard, to break down some common use cases for us. If you're working with Veeam and Microsoft 365, this post from Ben could help save you some time and headaches.

—Natasha Rabinov, Backblaze

Backing Up Microsoft Office 365 via Veeam in Backblaze B2 Cloud Storage

Veeam Backup for Microsoft Office 365 v4 included a number of enhancements, one of which was support for object-based repositories. This is a common trend in new Veeam product releases: the flagship Veeam Backup & Replication™ product now supports a growing number of object storage capabilities.

So, why object storage over block-based repositories? There are a number of reasons, but scalability is, I believe, the biggest. Object storage platforms are designed to handle petabytes of data with very good durability, and they are simply better suited to that scale than block-based repositories.

With the data scalability sorted, you only need to worry about monitoring and scaling out the compute workload of the proxy servers (worker nodes). Did I mention you no longer need to juggle data moves between repositories?! These enhancements create a number of opportunities to simplify your workflows.

So naturally, with the recent announcement from Backblaze saying they now have S3 Compatible API support, I wanted to try it out with Veeam Backup for Microsoft Office 365.
Let’s get started. You will need:

  • A Backblaze B2 account: You can create one here for free. The first 10GB are complimentary so you can give this a go without even entering a credit card.
  • A Veeam Backup for Microsoft Office 365 environment setup: You can also get this for free (up to 10 users) with their Community Edition.
  • An organization connected to the Veeam Backup for Microsoft Office 365 environment: View the options and how-to guide here.

Configuring Your B2 Cloud Storage Bucket

In the Backblaze B2 console, you need to create a bucket. If you already have one, you may notice that there is a blank entry next to “endpoint.” This is because buckets created before May 4, 2020 cannot be used with the Backblaze S3 Compatible APIs.

So, let’s create a new bucket. I used “VeeamBackupO365.”

This bucket will now appear with an S3 endpoint, which we will need for use in Veeam Backup for Microsoft Office 365.

Before you can use the new bucket, you’ll need to create some application keys/credentials. Head into the App Keys settings in Backblaze and select “create new.” Fill out your desired settings and, as good practice, make sure you only give access to this bucket, or the buckets you want to be accessible.

Your application key(s) will now appear. Make sure to save these keys somewhere secure, such as a password manager, as they will only appear once. Keep them handy for now, as you are going to need them shortly.

The Backblaze setup is now done.
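
Optionally, because the bucket now speaks the standard S3 API, you can sanity-check the endpoint and application key from any S3 client before opening Veeam. Here is a minimal boto3 sketch; the endpoint, bucket name, and credential placeholders are assumptions you would replace with your own values.

import boto3

# Placeholders: use the S3 endpoint shown next to your bucket and the
# application key you just created.
s3 = boto3.client(
    's3',
    endpoint_url='https://s3.us-west-002.backblazeb2.com',
    aws_access_key_id='<your-keyID>',
    aws_secret_access_key='<your-applicationKey>',
)

# If the key is scoped correctly, listing the bucket should succeed.
response = s3.list_objects_v2(Bucket='VeeamBackupO365')
print(response.get('KeyCount', 0), 'objects in the bucket')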

Configuring Your Veeam Backup

Now you’ll need to head over to your Veeam Backup for Microsoft Office 365 Console.

Note: You could also achieve all of this via PowerShell or the RESTful API included with this product if you want to automate the process.

It is time to create a new backup repository in Veeam. Click into your Backup Infrastructure panel and add a new backup repository and give it a name…

…Then select the “S3 Compatible” option:

Enter the S3 endpoint shown for your bucket in the Backblaze console into the Service endpoint field of the Veeam wizard. This will be something along the lines of: s3.*.backblazeb2.com.
Now select “Add Credential,” and enter the App Key ID and Secret that you generated as part of the Backblaze setup.

With your new credentials selected, hit "Next." Your bucket(s) will now show up. Select your desired backup bucket (in this case I'm selecting the one I created earlier: "VeeamBackupO365"). Now you need to browse for a folder that Veeam will use as the root folder for its backups. If this is a new bucket, you will need to create one via the Veeam console like I did below, called "Data."

If you are curious, you can take a quick look back in your Backblaze account, after hitting “Next,” to confirm that Veeam has created the folder you entered, plus some additional parent folders, as you can see in the example below:

Now you can select your desired retention. Remember, all jobs targeting this repository will use this retention setting, so if you need a different retention for, say, Exchange and OneDrive, you will need two different repositories and you will need to target each job appropriately.

Once you’ve selected your retention, the repository is ready for use and can be used for backup jobs.

Now you can create a new backup job. For this demo, I am going to only back up my user account. The target will be our new repository backed by Backblaze S3 Compatible storage. The wizard walks users through this process.

Giving the backup job a name.

Select your entire organization or desired users/groups and what to process (Exchange, OneDrive, and/or Sharepoint).

Select the object-backed backblazeb2-s3 backup repository you created.

That is it! Right click and run the job—you can see it starting to process your organization.
As this is the first job you've run, it may take some time, and you might notice it slowing down. This slowdown is a result of the Microsoft data being pulled out of O365. But Veeam is smart enough to have added in some clever user-hopping: as it detects throttling, it will jump to another user and then loop back to the others, to ensure your jobs finish as quickly as possible.

While this is running, if you open up Backblaze again you will see the usage starting to show.

Done and Done

And there it is—a fully functional backup of your Microsoft Office 365 tenancy using Veeam Backup for Microsoft Office 365 and Backblaze B2 Cloud Storage.

We really appreciate Ben’s guide and hope it helps you try out Backblaze as a repository for your Veeam data. If you do—or if you’ve already set us as a storage target—we’d love to hear how it goes in the comments.
You can reach out to Ben at @benyoungnz on Twitter, or his blog, https://benyoung.blog.

The post Vanguard Perspectives: Microsoft 365 to Veeam Backup to Backblaze B2 Cloud Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.
