AWS CloudTrail network activity events for VPC endpoints now generally available

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/aws-cloudtrail-network-activity-events-for-vpc-endpoints-now-generally-available/

Today, I’m happy to announce the general availability of network activity events for Amazon Virtual Private Cloud (Amazon VPC) endpoints in AWS CloudTrail. This feature helps you to record and monitor AWS API activity traversing your VPC endpoints, helping you strengthen your data perimeter and implement better detective controls.

Previously, it was hard to detect potential data exfiltration attempts and unauthorized access to the resources within your network through VPC endpoints. While VPC endpoint policies could be configured to prevent access from external accounts, there was no built-in mechanism to log denied actions or detect when external credentials were used at a VPC endpoint. This often required you to build custom solutions to inspect and analyze TLS traffic, which could be operationally costly and negate the benefits of encrypted communications.

With this new capability, you can now opt in to log all AWS API activity passing through your VPC endpoints. CloudTrail records these events as a new event type called network activity events, which capture both control plane and data plane actions passing through a VPC endpoint.

Network activity events in CloudTrail provide several key benefits:

  • Comprehensive visibility – Log all API activity traversing VPC endpoints, regardless of the AWS account initiating the action.
  • External credential detection – Identify when credentials from outside your organization are accessing your VPC endpoint.
  • Data exfiltration prevention – Detect and investigate potential unauthorized data movement attempts.
  • Enhanced security monitoring – Gain insights into all AWS API activity at your VPC endpoints without the need to decrypt TLS traffic.
  • Visibility for regulatory compliance – Improve your ability to meet regulatory requirements by tracking all API activity passing through.

Getting started with network activity events for VPC endpoint logging
To enable network activity events, I go to the AWS CloudTrail console and choose Trails in the navigation pane. I choose Create trail to create a new one. I enter a name in the Trail name field and choose an Amazon Simple Storage Service (Amazon S3) bucket to store the event logs. When I create a trail in CloudTrail, I can specify an existing Amazon S3 bucket or create a new bucket to store my trail’s event logs.

If you set Log file SSE-KMS encryption to Enabled, you have two options: Choose New to create a new AWS Key Management Service (AWS KMS) key or choose Existing to choose an existing KMS key. If you chose New, you need to type an alias in the AWS KMS alias field. CloudTrail encrypts your log files with this KMS key and adds the policy for you. The KMS key and Amazon S3 must be in the same AWS Region. For this example, I use an existing KMS key. I enter the alias in the AWS KMS alias field and leave the rest as default for this demo. I choose Next for the next step.

In the Choose log events step, I choose Network activity events under Events. I choose the event source from the list of AWS services, such as cloudtrail.amazonaws.com, ec2.amazonaws.com, kms.amazonaws.com, s3.amazonaws.com, and secretsmanager.amazonaws.com. I add two network activity event sources for this demo. For the first source, I select the ec2.amazonaws.com option. For Log selector template, I can use templates for common use cases or create fine-grained filters for specific scenarios. For example, to log all API activities traversing the VPC endpoint, I can choose the Log all events template. For this demo, I choose the Log network activity access denied events template to log only access denied events. Optionally, I can enter a name in the Selector name field to identify the log selector template, such as Include network activity events for Amazon EC2.

As a second example, I choose Custom to create custom filters on multiple fields, such as eventName and vpcEndpointId. I can specify specific VPC endpoint IDs or filter the results to include only the VPC endpoints that match specific criteria. For Advanced event selectors, I choose vpcEndpointId from the Field dropdown, choose equals as Operator, and enter the VPC endpoint ID. When I expand the JSON view, I can see my event selectors as a JSON block. I choose Next and after reviewing the selections, I choose Create trail.
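If you script your trail configuration, the two selectors from this walkthrough might look like the following sketch. It writes the selectors to a local file and applies them with the AWS CLI only when you opt in. The trail name, file name, VPC endpoint ID, second selector name, and the choice of s3.amazonaws.com as the second event source are hypothetical placeholders. The field values NetworkActivity and VpceAccessDenied are assumptions based on the CloudTrail advanced event selector documentation, not something shown in the console steps above.

```shell
set -eu

# Hypothetical names: replace with your own trail and VPC endpoint ID.
TRAIL_NAME="my-network-activity-trail"
VPCE_ID="vpce-0123456789abcdef0"
SELECTORS_FILE="network-activity-selectors.json"

# Selector 1: access denied events for Amazon EC2 (mirrors the template choice).
# Selector 2: all events for one VPC endpoint (mirrors the Custom filter).
cat > "$SELECTORS_FILE" <<EOF
[
  {
    "Name": "Include network activity events for Amazon EC2",
    "FieldSelectors": [
      { "Field": "eventCategory", "Equals": ["NetworkActivity"] },
      { "Field": "eventSource", "Equals": ["ec2.amazonaws.com"] },
      { "Field": "errorCode", "Equals": ["VpceAccessDenied"] }
    ]
  },
  {
    "Name": "Amazon S3 activity through one VPC endpoint",
    "FieldSelectors": [
      { "Field": "eventCategory", "Equals": ["NetworkActivity"] },
      { "Field": "eventSource", "Equals": ["s3.amazonaws.com"] },
      { "Field": "vpcEndpointId", "Equals": ["$VPCE_ID"] }
    ]
  }
]
EOF

# Applying the selectors replaces the trail's current selectors, so this
# step runs only when APPLY=yes is set in the environment.
if [ "${APPLY:-no}" = "yes" ]; then
  aws cloudtrail put-event-selectors \
    --trail-name "$TRAIL_NAME" \
    --advanced-event-selectors "file://$SELECTORS_FILE"
else
  echo "Selectors written to $SELECTORS_FILE (set APPLY=yes to apply them)"
fi
```

Review the generated file against the JSON view in the console before applying it to a trail that already has event selectors.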

After it’s configured, CloudTrail will begin logging network activity events for my VPC endpoints, helping me analyze and act on this data. To analyze AWS CloudTrail network activity events, you can use the CloudTrail console, AWS Command Line Interface (AWS CLI), and AWS SDK to retrieve relevant logs. You can also use CloudTrail Lake to capture, store and analyze your network activity events. If you are using Trails, you can use Amazon Athena to query and filter these events based on specific criteria. Regular analysis of these events can help you maintain security, comply with regulations, and optimize your network infrastructure in AWS.

Now available
CloudTrail network activity events for VPC endpoint logging provide you with a powerful tool to enhance your security posture, detect potential threats, and gain deeper insights into your VPC network traffic. This feature addresses your critical needs for comprehensive visibility and control over your AWS environments.

Network activity events for VPC endpoints are available in all commercial AWS Regions.

For pricing information, visit AWS CloudTrail pricing.

To get started with CloudTrail network activity events, visit AWS CloudTrail. For more information on CloudTrail and its features, refer to the AWS CloudTrail documentation.

— Esra

Migrate from Standard brokers to Express brokers in Amazon MSK using Amazon MSK Replicator

Post Syndicated from Subham Rakshit original https://aws.amazon.com/blogs/big-data/migrate-from-standard-brokers-to-express-brokers-in-amazon-msk-using-amazon-msk-replicator/

Amazon Managed Streaming for Apache Kafka (Amazon MSK) now offers a new broker type called Express brokers. It’s designed to deliver up to 3 times more throughput per broker, scale up to 20 times faster, and reduce recovery time by 90% compared to Standard brokers running Apache Kafka. Express brokers come preconfigured with Kafka best practices by default, support Kafka APIs, and provide the same low latency performance that Amazon MSK customers expect, so you can continue using existing client applications without any changes. Express brokers provide straightforward operations with hands-free storage management by offering unlimited storage without pre-provisioning, eliminating disk-related bottlenecks. To learn more about Express brokers, refer to Introducing Express brokers for Amazon MSK to deliver high throughput and faster scaling for your Kafka clusters.

Creating a new cluster with Express brokers is straightforward, as described in Amazon MSK Express brokers. However, if you have an existing MSK cluster, you need to migrate to a new Express-based cluster. In this post, we discuss how you should plan and perform the migration to Express brokers for your existing MSK workloads on Standard brokers. Express brokers offer a different user experience and a different shared responsibility boundary, so using them on an existing cluster is not possible. However, you can use Amazon MSK Replicator to copy all data and metadata from your existing MSK cluster to a new cluster composed of Express brokers.

MSK Replicator offers a built-in replication capability to seamlessly replicate data from one cluster to another. It automatically scales the underlying resources, so you can replicate data on demand without having to monitor or scale capacity. MSK Replicator also replicates Kafka metadata, including topic configurations, access control lists (ACLs), and consumer group offsets.

In the following sections, we discuss how to use MSK Replicator to replicate the data from a Standard broker MSK cluster to an Express broker MSK cluster and the steps involved in migrating the client applications from the old cluster to the new cluster.

Planning your migration

Migrating from Standard brokers to Express brokers requires thorough planning and careful consideration of various factors. In this section, we discuss key aspects to address during the planning phase.

Assessing the source cluster’s infrastructure and needs

It’s crucial to evaluate the capacity and health of the current (source) cluster to make sure it can handle additional consumption during migration, because MSK Replicator will retrieve data from the source cluster. Key checks include:

    • CPU utilization – The combined CPU User and CPU System utilization per broker should remain below 60%.
    • Network throughput – The cluster-to-cluster replication process adds extra egress traffic, because it might need to replicate the existing data based on business requirements along with the incoming data. For instance, if the ingress volume is X GB/day and data is retained in the cluster for 2 days, replicating the data from the earliest offset would cause the total egress volume for replication to be 2X GB. The cluster must accommodate this increased egress volume.

As an example, suppose your existing source cluster has an average data ingress of 100 MBps and a peak data ingress of 400 MBps, with retention of 48 hours. Let’s assume you have one consumer of the data you produce to your Kafka cluster, which means that your egress traffic will be the same as your ingress traffic. Based on this requirement, you can use the Amazon MSK sizing guide to calculate the broker capacity you need to safely handle this workload. In the spreadsheet, you will need to provide your average and maximum ingress/egress traffic in the cells, as shown in the following screenshot.

Because you need to replicate all the data produced in your Kafka cluster, the consumption will be higher than the regular workload. Taking this into account, your overall egress traffic will be at least twice the size of your ingress traffic.
However, when you run a replication tool, the resulting egress traffic will be higher than twice the ingress because you also need to replicate the existing data along with the new incoming data in the cluster. In the preceding example, you have an average ingress of 100 MBps and you retain data for 48 hours, which means that you have a total of approximately 17.3 TB (100 MBps × 172,800 seconds) of existing data in your source cluster that needs to be copied over on top of the new data that’s coming through. Let’s further assume that your goal for the replicator is to catch up in 30 hours. In this case, your replicator needs to copy data at 260 MBps (100 MBps for ingress traffic + 160 MBps (17.3 TB/30 hours) for existing data) to catch up in 30 hours. The following figure illustrates this process.

Therefore, in the sizing guide’s egress cells, you need to add an additional 260 MBps to your average data out and peak data out to estimate the size of the cluster you should provision to complete the replication safely and on time.

Replication tools act as a consumer to the source cluster, so there is a chance that this replication consumer can consume higher bandwidth, which can negatively impact the existing application client’s produce and consume requests. To control the replication consumer throughput, you can use a consumer-side Kafka quota in the source cluster to limit the replicator throughput. This makes sure that the replicator consumer will throttle when it goes beyond the limit, thereby safeguarding the other consumers. However, if the quota is set too low, the replication throughput will suffer and the replication might never end. Based on the preceding example, you can set a quota for the replicator to be at least 260 MBps, otherwise the replication will not finish in 30 hours.

  • Volume throughput – Data replication might involve reading from the earliest offset (based on business requirement), impacting your primary storage volume, which in this case is Amazon Elastic Block Store (Amazon EBS). The VolumeReadBytes and VolumeWriteBytes metrics should be checked to make sure the source cluster volume throughput has additional bandwidth to handle any additional read from the disk. Depending on the cluster size and replication data volume, you should provision storage throughput in the cluster. With provisioned storage throughput, you can increase the Amazon EBS throughput up to 1000 MBps depending on the broker size. The maximum volume throughput can be specified depending on broker size and type, as mentioned in Manage storage throughput for Standard brokers in a Amazon MSK cluster. Based on the preceding example, the replicator will start reading from the disk and the volume throughput of 260 MBps will be shared across all the brokers. However, existing consumers can lag, which will cause reading from the disk, thereby increasing the storage read throughput. Also, there is storage write throughput due to incoming data from the producer. In this scenario, enabling provisioned storage throughput will increase the overall EBS volume throughput (read + write) so that existing producer and consumer performance doesn’t get impacted due to the replicator reading data from EBS volumes.
  • Balanced partitions – Make sure partitions are well-distributed across brokers, with no skewed leader partitions.
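The network throughput arithmetic from the preceding example, and the quota floor that follows from it, can be sketched as shown here. This is a rough calculation under stated assumptions, not a sizing tool: it treats 1 TB as 1,000,000 MB and 1 MB as 1,000,000 bytes, and it assumes a hypothetical three-broker source cluster with evenly spread partition leaders. Because Kafka enforces byte-rate quotas per broker, the aggregate 260 MBps target is divided across brokers. The principal name is a placeholder; how the replicator appears to the brokers depends on your authentication setup.

```shell
set -eu

# Figures from the example above, plus a hypothetical broker count.
avg_ingress_mbps=100
retention_hours=48
catchup_hours=30
broker_count=3

# Existing data on the source cluster = ingress rate x retention period.
backlog_mb=$(( avg_ingress_mbps * retention_hours * 3600 ))

# Rate needed to copy that backlog within the catch-up window
# (integer division truncates; these figures divide evenly).
backlog_mbps=$(( backlog_mb / (catchup_hours * 3600) ))

# The replicator must also keep up with newly arriving data.
replicator_mbps=$(( avg_ingress_mbps + backlog_mbps ))

# Split the aggregate target across brokers and round up.
per_broker_bytes=$(( (replicator_mbps * 1000000 + broker_count - 1) / broker_count ))

echo "Existing data: ${backlog_mb} MB"
echo "Replicator read rate: ${replicator_mbps} MBps (${avg_ingress_mbps} new + ${backlog_mbps} backlog)"
echo "Per-broker consumer_byte_rate floor: ${per_broker_bytes} bytes/sec"

# Hypothetical principal name; the command is printed for review, not executed.
REPLICATOR_PRINCIPAL="replicator-principal"
echo "/home/ec2-user/kafka/bin/kafka-configs.sh" \
  "--bootstrap-server ${BS_SRC:-<<SOURCE_MSK_BOOTSTRAP_ADDRESS>>}" \
  "--command-config /home/ec2-user/kafka/config/client_iam.properties" \
  "--alter --add-config consumer_byte_rate=${per_broker_bytes}" \
  "--entity-type users --entity-name ${REPLICATOR_PRINCIPAL}"
```

Setting the quota below this floor would stretch the catch-up time beyond 30 hours, so treat the computed value as a minimum and leave headroom for peak ingress.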

Depending on the assessment, you might need to vertically scale up or horizontally scale out the source cluster before migration.

Assessing the target cluster’s infrastructure and needs

Use the same sizing tool to estimate the size of your Express broker cluster. Typically, fewer Express brokers might be needed compared to Standard brokers for the same workload because depending on the instance size, Express brokers allow up to three times more ingress throughput.

Configuring Express brokers

Express brokers employ opinionated and optimized Kafka configurations, so it’s important to differentiate between configurations that are read-only and those that are read/write during planning. Read/write broker-level configurations should be configured separately as a pre-migration step in the target cluster. Although MSK Replicator will replicate most topic-level configurations, certain topic-level configurations are always set to default values in an Express cluster: replication-factor, min.insync.replicas, and unclean.leader.election.enable. If the default values differ from the source cluster, these configurations will be overridden.

As part of the metadata, MSK Replicator also copies certain ACL types, as mentioned in Metadata replication. It doesn’t explicitly copy the write ACLs except the deny ones. Therefore, if you’re using SASL/SCRAM or mTLS authentication with ACLs rather than AWS Identity and Access Management (IAM) authentication, write ACLs need to be explicitly created in the target cluster.
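As a rough illustration of creating write ACLs on the target cluster, the following sketch prints one kafka-acls.sh command per topic so you can review it before running it. The principal, topic list, and client properties file are hypothetical and depend on your own SASL/SCRAM or mTLS setup; the Kafka installation path is the one used on the client instance later in this post.

```shell
set -eu

# Hypothetical values: adjust to your principals, topics, and client config.
PRODUCER_PRINCIPAL="User:clickstream-producer"
TOPICS="clickstream"
CLIENT_CONFIG="/home/ec2-user/kafka/config/client.properties"

acl_count=0
for topic in $TOPICS; do
  # Allow the producer principal to write to this topic on the target cluster.
  echo "/home/ec2-user/kafka/bin/kafka-acls.sh" \
    "--bootstrap-server ${BS_TGT:-<<TARGET_MSK_BOOTSTRAP_ADDRESS>>}" \
    "--command-config ${CLIENT_CONFIG}" \
    "--add --allow-principal ${PRODUCER_PRINCIPAL}" \
    "--operation Write --topic ${topic}"
  acl_count=$(( acl_count + 1 ))
done

echo "Prepared ${acl_count} write ACL command(s)"
```

Listing the ACLs on the source cluster first (kafka-acls.sh --list) is one way to build the topic and principal list.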

Client connectivity to the target cluster

Deployment of the target cluster can occur within the same virtual private cloud (VPC) or a different one. Consider any changes to client connectivity, including updates to security groups and IAM policies, during the planning phase.

Migration strategy: All at once vs. wave

Two migration strategies can be adopted:

  • All at once – All topics are replicated to the target cluster simultaneously, and all clients are migrated at once. Although this approach simplifies the process, it generates significant egress traffic and involves risks to multiple clients if issues arise. However, if there is any failure, you can roll back by redirecting the clients to use the source cluster. It’s recommended to perform the cutover during non-business hours and communicate with stakeholders beforehand.
  • Wave – Migration is broken into phases, moving a subset of clients (based on business requirements) in each wave. After each phase, the target cluster’s performance can be evaluated before proceeding. This reduces risks and builds confidence in the migration but requires meticulous planning, especially for large clusters with many microservices.

Each strategy has its pros and cons. Choose the one that aligns best with your business needs. For insights, refer to Goldman Sachs’ migration strategy to move from on-premises Kafka to Amazon MSK.

Cutover plan

Although MSK Replicator facilitates seamless data replication with minimal downtime, it’s essential to devise a clear cutover plan. This includes coordinating with stakeholders, stopping producers and consumers in the source cluster, and restarting them in the target cluster. If a failure occurs, you can roll back by redirecting the clients to use the source cluster.

Schema registry

When migrating from a Standard broker to an Express broker cluster, schema registry considerations remain unaffected. Clients can continue using existing schemas for both producing and consuming data with Amazon MSK.

Solution overview

In this setup, two Amazon MSK provisioned clusters are deployed: one with Standard brokers (source) and the other with Express brokers (target). Both clusters are located in the same AWS Region and VPC, with IAM authentication enabled. MSK Replicator is used to replicate topics, data, and configurations from the source cluster to the target cluster. The replicator is configured to maintain identical topic names across both clusters, providing seamless replication without requiring client-side changes.

During the first phase, the source MSK cluster handles client requests. Producers write to the clickstream topic in the source cluster, and a consumer group with the group ID clickstream-consumer reads from the same topic. The following diagram illustrates this architecture.

When data replication to the target MSK cluster is complete, we need to evaluate the health of the target cluster. After confirming the cluster is healthy, we need to migrate the clients in a controlled manner. First, we need to stop the producers, reconfigure them to write to the target cluster, and then restart them. Then, we need to stop the consumers after they have processed all remaining records in the source cluster, reconfigure them to read from the target cluster, and restart them. The following diagram illustrates the new architecture.

After verifying that all clients are functioning correctly with the target cluster using Express brokers, we can safely decommission the source MSK cluster with Standard brokers and the MSK Replicator.

Deployment steps

In this section, we discuss the step-by-step process to replicate data from an MSK Standard broker cluster to an Express broker cluster using MSK Replicator, along with the client migration strategy. For the purposes of this post, we use the all at once migration strategy.

Provision the MSK cluster

Download the AWS CloudFormation template to provision the MSK cluster. Deploy the template in us-east-1 with the stack name migration.

This will create the VPC, subnets, and two Amazon MSK provisioned clusters: one with Standard brokers (source) and another with Express brokers (target) within the VPC configured with IAM authentication. It will also create a Kafka client Amazon Elastic Compute Cloud (Amazon EC2) instance from which we can use the Kafka command line to create and view Kafka topics and produce and consume messages to and from the topic.

Configure the MSK client

On the Amazon EC2 console, connect to the EC2 instance named migration-KafkaClientInstance1 using Session Manager, a capability of AWS Systems Manager.

After you log in, you need to configure the source MSK cluster bootstrap address to create a topic and publish data to the cluster. You can get the bootstrap address for IAM authentication from the details page for the MSK cluster (migration-standard-broker-src-cluster) on the Amazon MSK console, under View Client Information. You also need to update the producer.properties and consumer.properties files to reflect the bootstrap address of the standard broker cluster.

sudo su - ec2-user

export BS_SRC=<<SOURCE_MSK_BOOTSTRAP_ADDRESS>>
sed -i "s/BOOTSTRAP_SERVERS_CONFIG=/BOOTSTRAP_SERVERS_CONFIG=${BS_SRC}/g" producer.properties 
sed -i "s/bootstrap.servers=/bootstrap.servers=${BS_SRC}/g" consumer.properties

Create a topic

Create a clickstream topic using the following commands:

/home/ec2-user/kafka/bin/kafka-topics.sh --bootstrap-server=$BS_SRC \
--create --replication-factor 3 --partitions 3 \
--topic clickstream \
--command-config=/home/ec2-user/kafka/config/client_iam.properties

Produce and consume messages to and from the topic

Run the clickstream producer to generate events in the clickstream topic:

cd /home/ec2-user/clickstream-producer-for-apache-kafka/

java -jar target/KafkaClickstreamClient-1.0-SNAPSHOT.jar -t clickstream \
-pfp /home/ec2-user/producer.properties -nt 8 -rf 3600 -iam \
-gsr -gsrr <<REGION>> -grn default-registry -gar

Open another Session Manager instance and from that shell, run the clickstream consumer to consume from the topic:

cd /home/ec2-user/clickstream-consumer-for-apache-kafka/

java -jar target/KafkaClickstreamConsumer-1.0-SNAPSHOT.jar -t clickstream \
-pfp /home/ec2-user/consumer.properties -nt 3 -rf 3600 -iam \
-gsr -gsrr <<REGION>> -grn default-registry

Keep the producer and consumer running. If not interrupted, the producer and consumer will run for 60 minutes before they exit. The -rf parameter controls how long, in seconds, the producer and consumer will run.

Create an MSK replicator

To create an MSK replicator, complete the following steps:

  1. On the Amazon MSK console, choose Replicators in the navigation pane.
  2. Choose Create replicator.
  3. In the Replicator details section, enter a name and optional description.

  4. In the Source cluster section, provide the following information:
    1. For Cluster region, choose us-east-1.
    2. For MSK cluster, enter the MSK cluster Amazon Resource Name (ARN) for the Standard broker.

After the source cluster is selected, it automatically selects the subnets associated with the primary cluster and the security group associated with the source cluster. You can also select additional security groups.

Make sure that the security groups have outbound rules to allow traffic to your cluster’s security groups. Also make sure that your cluster’s security groups have inbound rules that accept traffic from the replicator security groups provided here.

  5. In the Target cluster section, for MSK cluster, enter the MSK cluster ARN for the Express broker.

After the target cluster is selected, it automatically selects the subnets and the security group associated with the target cluster. You can also select additional security groups.

Now let’s provide the replicator settings.

  6. In the Replicator settings section, provide the following information:
    1. For this example, we keep Topics to replicate at its default value, which replicates all topics from the primary to the secondary cluster.
    2. For Replicator starting position, we configure it to replicate from the earliest offset, so that we can get all the events from the start of the source topics.
    3. To configure the topic name in the secondary cluster as identical to the primary cluster, we select Keep the same topic names for Copy settings. This makes sure that the MSK clients don’t need to add a prefix to the topic names.
    4. For this example, we keep the Consumer Group Replication setting as default (make sure it’s enabled to allow redirected clients to resume processing data from the last processed offset).
    5. We set Target Compression type as None.

The Amazon MSK console will automatically create the required IAM policies. If you’re deploying using the AWS Command Line Interface (AWS CLI), SDK, or AWS CloudFormation, you have to create the IAM policy and use it as per your deployment process.

  7. Choose Create to create the replicator.

The process will take around 15–20 minutes to deploy the replicator. When the MSK replicator is running, this will be reflected in the status.

Monitor replication

When the MSK replicator is up and running, monitor the MessageLag metric. This metric indicates how many messages are yet to be replicated from the source MSK cluster to the target MSK cluster. The MessageLag metric should come down to 0.

Migrate clients from source to target cluster

When the MessageLag metric reaches 0, it indicates that all messages have been replicated from the source MSK cluster to the target MSK cluster. At this stage, you can cut over client applications from the source to the target cluster. Before initiating this step, confirm the health of the target cluster by reviewing the Amazon MSK metrics in Amazon CloudWatch and making sure that the client applications are functioning properly. Then complete the following steps:

  1. Stop the producers writing data to the source (old) cluster with Standard brokers and reconfigure them to write to the target (new) cluster with Express brokers.
  2. Before migrating the consumers, make sure that the MaxOffsetLag metric for the consumers has dropped to 0, confirming that they have processed all existing data in the source cluster.
  3. When this condition is met, stop the consumers and reconfigure them to read from the target cluster.

The offset lag happens if the consumer is consuming slower than the rate the producer is producing data. The flat line in the following metric visualization shows that the producer has stopped producing to the source cluster while the consumer attached to it continues to consume the existing data and eventually consumes all the data, therefore the metric goes to 0.

  4. Now you can update the bootstrap address in producer.properties and consumer.properties to point to the target Express-based MSK cluster. You can get the bootstrap address for IAM authentication from the MSK cluster (migration-express-broker-dest-cluster) on the Amazon MSK console under View Client Information.
export BS_TGT=<<TARGET_MSK_BOOTSTRAP_ADDRESS>>
sed -i "s/BOOTSTRAP_SERVERS_CONFIG=.*/BOOTSTRAP_SERVERS_CONFIG=${BS_TGT}/g" producer.properties
sed -i "s/bootstrap.servers=.*/bootstrap.servers=${BS_TGT}/g" consumer.properties

  5. Run the clickstream producer to generate events in the clickstream topic:
cd /home/ec2-user/clickstream-producer-for-apache-kafka/

java -jar target/KafkaClickstreamClient-1.0-SNAPSHOT.jar -t clickstream \
-pfp /home/ec2-user/producer.properties -nt 8 -rf 60 -iam \
-gsr -gsrr <<REGION>> -grn default-registry -gar

  6. Open another Session Manager instance and, from that shell, run the clickstream consumer to consume from the topic:
cd /home/ec2-user/clickstream-consumer-for-apache-kafka/

java -jar target/KafkaClickstreamConsumer-1.0-SNAPSHOT.jar -t clickstream \
-pfp /home/ec2-user/consumer.properties -nt 3 -rf 60 -iam \
-gsr -gsrr <<REGION>> -grn default-registry

We can see that the producers and consumers are now producing to and consuming from the target Express-based MSK cluster. The producers and consumers will run for 60 seconds before they exit.

The following screenshot shows the messages produced to the new Express-based MSK cluster for 60 seconds.

Migrate stateful applications

Stateful applications such as Kafka Streams, KSQL, Apache Spark, and Apache Flink use their own checkpointing mechanisms to store consumer offsets instead of relying on Kafka’s consumer group offset mechanism. When migrating topics from a source cluster to a target cluster, the Kafka offsets in the source will differ from those in the target. As a result, migrating a stateful application along with its state requires careful consideration, because the existing offsets are incompatible with the target cluster’s offsets. Before migrating stateful applications, it is crucial to stop producers and make sure that consumer applications have processed all data from the source MSK cluster.

Migrate Kafka Streams and KSQL applications

Kafka Streams and KSQL store consumer offsets in internal changelog topics. It is advisable not to replicate these internal changelog topics to the target MSK cluster. Instead, the Kafka Streams application should be configured to start from the earliest offset of the source topics in the target cluster. This allows the state to be rebuilt. However, this method results in duplicate processing, because all the data in the topic is reprocessed. Therefore, the target destination (such as a database) must be idempotent to handle these duplicates effectively.

To optimize performance, Express brokers don’t allow you to configure segment.bytes. Therefore, the internal topics need to be manually created before the Kafka Streams application is migrated to the new Express-based cluster. For more information, refer to Using Kafka Streams with MSK Express brokers and MSK Serverless.

Migrate Spark applications

Spark stores offsets in its checkpoint location, which should be a file system compatible with HDFS, such as Amazon Simple Storage Service (Amazon S3). After migrating the Spark application to the target MSK cluster, you should remove the checkpoint location, causing the Spark application to lose its state. To rebuild the state, configure the Spark application to start processing from the earliest offset of the source topics in the target cluster. This will lead to re-processing all the data from the start of the topic and therefore will generate duplicate data. Consequently, the target destination (such as a database) must be idempotent to effectively handle these duplicates.

Migrate Flink applications

Flink stores consumer offsets within the state of its Kafka source operator. When checkpoints are completed, the Kafka source commits the current consuming offset to provide consistency between Flink’s checkpoint state and the offsets committed on Kafka brokers. Unlike other systems, Flink applications don’t rely on the __consumer_offsets topic to track offsets; instead, they use the offsets stored in Flink’s state.

During Flink application migration, one approach is to start the application without a Savepoint. This approach discards the entire state and reverts to reading from the last committed offset of the consumer group. However, this prevents the application from accurately rebuilding the state of downstream Flink operators, leading to discrepancies in computation results. To address this, you can either avoid replicating the consumer group of the Flink application or assign a new consumer group to the application when restarting it in the target cluster. Additionally, configure the application to start reading from the earliest offset of the source topics. This enables re-processing all data from the source topics and rebuilding the state. However, this method will result in duplicate data, so the target system (such as a database) must be idempotent to handle these duplicates effectively.

Alternatively, you can reset the state of the Kafka source operator. Flink uses operator IDs (UIDs) to map the state to specific operators. When restarting the application from a Savepoint, Flink matches the state to operators based on their assigned IDs. It is recommended to assign a unique ID to each operator to enable seamless state restoration from Savepoints. To reset the state of the Kafka source operator, change its operator ID. Passing the operator ID as a parameter in a configuration file can simplify this process. Restart the Flink application with the --allowNonRestoredState parameter (if you are running self-managed Flink). This will reset only the state of the Kafka source operator, leaving other operator states unaffected. As a result, the Kafka source operator resumes from the last committed offset of the consumer group, avoiding full reprocessing and state rebuilding. Although this might still produce some duplicates in the output, it results in no data loss. This approach is applicable only when using the DataStream API to build Flink applications.
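For self-managed Flink, the restart described above might be assembled as in the following sketch. The savepoint path, JAR name, and the --kafka-source-uid program argument are hypothetical: your application decides how it reads the new operator ID. Only the flink run options -s and --allowNonRestoredState come from the Flink CLI itself.

```shell
set -eu

# Hypothetical values: adjust to your savepoint location and application.
SAVEPOINT_PATH="s3://my-bucket/savepoints/savepoint-example"
JOB_JAR="clickstream-flink-app.jar"
NEW_SOURCE_UID="kafka-source-express-v2"

# -s restores from the savepoint; --allowNonRestoredState lets Flink skip
# state that no longer maps to an operator (here, the old Kafka source ID).
flink_cmd="flink run -s ${SAVEPOINT_PATH} --allowNonRestoredState ${JOB_JAR} --kafka-source-uid ${NEW_SOURCE_UID}"

# Printed for review; run the command once the values are correct.
echo "$flink_cmd"
```

Keeping the operator ID in a parameter, as in this sketch, means the reset needs no code change, only a new value at restart.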

Conclusion

Migrating from a Standard broker MSK cluster to an Express broker MSK cluster using MSK Replicator provides a seamless, efficient transition with minimal downtime. By following the steps and strategies discussed in this post, you can take advantage of the high-performance, cost-effective benefits of Express brokers while maintaining data consistency and application uptime.

Ready to optimize your Kafka infrastructure? Start planning your migration to Amazon MSK Express brokers today and experience improved scalability, speed, and reliability. For more details, refer to the Amazon MSK Developer Guide.


About the Author

Subham Rakshit is a Senior Streaming Solutions Architect for Analytics at AWS based in the UK. He works with customers to design and build streaming architectures so they can get value from analyzing their streaming data. His two little daughters keep him occupied most of the time outside work, and he loves solving jigsaw puzzles with them. Connect with him on LinkedIn.

Foundational blocks of Amazon SageMaker Unified Studio: An admin’s guide to implement unified access to all your data, analytics, and AI

Post Syndicated from Lakshmi Nair original https://aws.amazon.com/blogs/big-data/foundational-blocks-of-amazon-sagemaker-unified-studio-an-admins-guide-to-implement-unified-access-to-all-your-data-analytics-and-ai/

Amazon SageMaker Unified Studio (preview) provides a unified experience for using data, analytics, and AI capabilities. You can use familiar AWS services for model development, generative AI, data processing, and analytics—all within a single, governed environment. Users can now build, deploy, and execute end-to-end workflows from a single interface. SageMaker Unified Studio is built on the foundations of Amazon DataZone, where it uses domains to categorize and structure the data assets, while offering project-based collaboration features that allow teams to securely share artifacts and work together across various compute services. This experience allows multiple personas to seamlessly collaborate, while operating under appropriate access controls and governance policies.

In this post, we focus on the admin persona and deep dive into the foundational building blocks while implementing the self-service access to all your data.

Conceptual framework

SageMaker Unified Studio offers an integrated development experience organized into three distinct planes, each serving different personas and purposes within the development lifecycle. This architecture enables seamless collaboration while maintaining clear boundaries of responsibility.

As shown in the following figure, each plane represents a distinct layer of functionality that works in harmony with the others to create a complete data and machine learning (ML) solution.

foundational planes

The planes are as follows:

  • Infrastructure plane – The infrastructure plane forms the foundation of SageMaker Unified Studio. Here administrators and domain owners of the organization provision the underlying infrastructure and define rules for users of the data factory plane to deploy the compute resources for data and ML operations in self-service mode. They can also decide to onboard existing resources or pre-create them. They can set up access controls and permissions to enforce and allocate resources to different teams and projects. This layer makes sure that all necessary computational resources are available and properly governed for downstream computation.
  • Data factory plane – The data factory plane functions like a sophisticated vending machine for compute resources, where data scientists and ML engineers can select and utilize preconfigured compute resources or deploy new ones. The data product developers, data engineers, and data scientists can create collaboration spaces and build data products by consuming infrastructure resources, with all the underlying complexity abstracted away.
  • Product experience plane – At the outermost layer, the product experience plane serves as a discovery and collaboration hub where business units (data producers and data consumers) can explore available data products from the asset catalog. This plane drives users to engage in data-driven conversations with knowledge and insights shared across the organization. Through the product experience plane, data product owners can use automated workflows to capture data lineage and data quality metrics and oversee access controls. They can track how their data products are being used and continuously improve the value proposition of their data assets.

In this post, we focus on the infrastructure plane deployment steps from an administrator’s perspective, outlining key responsibilities and actions required and how to configure and organize your assets under specific business units and teams and authorize policies during the initial setup phase.

Roles and responsibilities of the domain owner (admin) for the infrastructure plane

As shown in the following figure, the infrastructure plane revolves around three pivotal operational paradigms: onboard, organize, and authorize.

The details of the three essential functions in the foundational layer are as follows:

  • Onboard – The domain owner establishes a foundational environment by creating a domain, which represents an organizational entity that connects your assets, users, resources, and code repository configurations. They can onboard the users who are authorized to access the self-serve unified studio. The self-serve unified studio is a browser-based web application where you can analyze, discover, catalog, govern, and share data in a self-serve manner. The admin can enable the necessary blueprints and create project profiles to set up the underlying data infrastructure. In a multi-account (mesh) scenario, the admin can also onboard the business units by associating their AWS accounts.
  • Organize – Here the domain owner creates hierarchies to organize and isolate projects within individual business units. The method of creating hierarchical representation of business units or team-level organization is through domain units. This makes sure that each business unit takes ownership of their assets. The admin can also delegate ownership within these business units.
  • Authorize – The admin or owners of individual business units or line of business (domain unit owners) can manage user policies—project-specific policies that dictate certain actions these principals can perform under a domain unit.

Now that we have discussed the core functions, let’s delve into the workflow that brings these concepts together.

Process workflow (infrastructure plane)

In the following figure, we break down the roles and responsibilities of domain owners to unit administrators through a sequence of operations, providing infrastructure deployment and management.

process workflow

The workflow consists of the following steps:

  1. The root domain owner (admin) creates a SageMaker Unified Studio domain from the console. After the domain is created, you get a SageMaker Unified Studio URL—a browser-based web application that can authenticate you with your AWS Identity and Access Management (IAM) user credentials or with credentials from your identity provider (IdP) through AWS IAM Identity Center or with your SAML credentials.
  2. As part of the onboarding process, the admin onboards single sign-on (SSO) users, SSO groups, and IAM users who are authorized to log in to SageMaker Unified Studio. IAM roles can be onboarded on the domain as well, but can be used for programmatic access only. During the quick setup deployment of the domain, default project profile templates are created. A project profile is a collection of blueprints that holds configurations of AWS tools and services. You can create the following project profiles:
    1. Generative AI application development – Provides you with the tooling capabilities to build generative AI applications using Amazon Bedrock foundation models (FMs) and tools.
    2. SQL analytics – Provides you with a SQL editor to query the data in Amazon SageMaker Lakehouse, Amazon Redshift, and Amazon Athena.
    3. Data analytics and AI-ML model development – Provides you with tools to build and orchestrate ML and generative AI models powered by AWS Glue, Athena, Amazon Managed Workflows for Apache Airflow (Amazon MWAA), Amazon SageMaker AI, and SageMaker Lakehouse.
    4. Custom project profile – Provides capabilities to build custom templates that can bundle multiple blueprints with varied tooling capabilities to suit your business needs.

Admins can also authorize project profile templates to specific users and groups, enforcing the capability to control resource deployment based on user personas. By default, all users are authorized to use default project profiles. However, this can be changed by the admin to limit the access of certain project profiles to certain users and groups.

The quick setup also establishes a default Git connection to AWS CodeCommit for users to manage their code repository. However, you also have the option to create and enable new Git connections to GitHub, GitHub Enterprise Server, GitLab, and GitLab self-managed. The Free Tier release of Amazon Q is enabled by default for all users of the SageMaker Unified Studio domain. Amazon Q Developer Pro can be configured if IAM Identity Center is configured for users of the domain.

Finally, as part of the initial setup, the admin provides access to Amazon Bedrock serverless models.

In a multi-account scenario, the central admin associates AWS accounts, and the associated account admins accept the association and enable the blueprints for the project profiles that the central admin would create. Refer to the appendix at the end of this post for more details.

  3. To organize the data assets within the organization, the admin logs in to the SageMaker Unified Studio URL and creates domain units aligned with the business divisions.
  4. Each domain unit receives delegated ownership, enabling autonomous management of assets within their designated scope. This domain-based isolation provides clear boundaries while allowing unit owners to independently govern their assets and enforce relevant policies.

Steps 3 and 4 are optional as part of the quick deployment setup. Users can directly log in to SageMaker Unified Studio to build data products for their business use case if domain units are not an immediate requirement. If no domain units are created, all users and groups fall under the root domain, and authorization policies are applied at the root domain level.

Behind the scenes

While users interact with a streamlined project creation interface in SageMaker Unified Studio, a sophisticated orchestration of components operates beneath the surface. This abstraction allows the admin to deploy infrastructure through simple selections while the system handles resource provisioning automatically. Let’s examine the underlying process behind the scenes, as illustrated in the following figure.

conceptual diagram of blueprints

This workflow consists of the following steps:

  1. Administrators enable the blueprints containing the AWS CloudFormation templates that have information on how to create and set up the underlying data infrastructure. These blueprints are automatically enabled during the quick setup deployment.
  2. Project profiles bundle these blueprint configurations into templates. These templates determine which infrastructure components deploy when a project is created.
  3. When users select a project profile within SageMaker Unified Studio, the system automatically triggers the relevant CloudFormation stack and deploys the necessary infrastructure resources in the form of environments. Environments are the actual data infrastructure behind a project.

In a multi-account scenario, the associated account admin enables the blueprints. However, the project profile creation happens at the root domain account. The project profile template will include the associated account details and the linked blueprints from the associated account. Refer to the appendix at the end of this post for more details.

Now that we have understood the functional building blocks of SageMaker Unified Studio, let’s proceed with the deployment walkthrough. We will create a domain using the quick setup deployment for single account. Refer to the appendix for multi-account deployment steps.

Prerequisites

You will need to complete the following prerequisites before you can follow the instructions in the next section:

  1. Sign up for an AWS account.
  2. Create a user with administrative access.
  3. Enable IAM Identity Center in the same AWS Region where you want to create your SageMaker Unified Studio domain. Confirm the Regions in which SageMaker Unified Studio is currently available. Set up your IdP and synchronize identities and groups with IAM Identity Center. For more information, refer to IAM Identity Center Identity source tutorials.
  4. To use Amazon Bedrock FMs, grant access to base models.

Set up domain

Complete the following steps to create a new SageMaker Unified Studio domain:

  1. Sign in to the SageMaker console in the Region in which IAM Identity Center is enabled.
  2. Choose Create a Unified Studio domain.

create domain

  1. Select the Quick setup (recommended for exploration).
  2. Choose Create VPC (you can also use your own VPC but to simplify the cleanup, we opted to use a new VPC).

create vpc

This will open a new tab to deploy the CloudFormation stack to create the VPC and the necessary private and public subnets.

  1. For Stack name, enter a unique name for the stack (if the default name already exists).
  2. Keep the parameter for useVpcEndpoints as false.
  3. Choose Create stack.

create stack

  1. After the stack is created, go to the domain creation page and refresh the page, as shown in the following screenshot.

refresh

  1. For Name, enter a unique name for the domain.
  2. Keep the default selections for Domain Execution role, Domain Service role, Provisioning role, and Manage Access role.
  3. The configuration automatically selects the VPC and private subnets.

domain roles

service roles

  1. Keep the default selection for Model provisioning role and Model consumption role.
  2. Choose Continue.

prov roles

  1. Provide the email address of the SSO user that exists in IAM Identity Center.

The SSO user selected here is used as the administrator in SageMaker Unified Studio. If the account doesn’t have IAM Identity Center set up, then it will create an IAM Identity Center account instance, so long as the account is permitted to do so. An SSO or IAM user is required so that a user is able to log in to the studio after the domain is created.

  1. Choose Create domain.

create IdC

  1. After the domain is created, a dialog box pops up. You can close the dialog box to set up authorization policies and onboard users.

dialog box

On the domain detail page, the Amazon SageMaker Unified Studio URL is listed. You can authenticate with your IAM user credentials or with credentials from your IdP through IAM Identity Center or with your SAML credentials. To authorize users to log in to the URL, the administrator must onboard the users to the domain. We see this as part of the next steps.

Unified Studio URL

Onboard users and associated accounts

Complete the following steps:

  1. To onboard users, go to the User management tab and choose Add.
  2. On the Add menu, choose either Add SSO users and groups or Add IAM users.

You can also add IAM roles for the purpose of managing the domain programmatically. However, you can’t use IAM roles to log in to the SageMaker Unified Studio URL. After you add the users, they will appear with the status Assigned. The status changes to Activated only when the user logs in to the SageMaker Unified Studio URL.

onboard users

  1. If you want to onboard multiple AWS accounts to your domain account, go to the Account associations tab and choose Request association.

This enables domain users to publish and consume data from these AWS accounts.

associate accounts

For a multi-account setup, by sending an association request to another AWS account, you share the root domain with the other AWS account through AWS Resource Access Manager (AWS RAM). The admin of the associated account accepts the invitation. To access the compute resources of the associated accounts from SageMaker Unified Studio, the admin of the associated account must enable the necessary blueprints. Refer to the appendix to understand the cross-account deployment steps.

Project profiles and authorizing users

For the quick setup deployment, when you navigate to the Blueprints tab, you will notice all the blueprints are automatically enabled. Also, on the Project profiles tab, you will find default project profiles are available to the user.

project profiles

Leave the rest of the tabs with the default options.

Create a custom project profile and authorize users (optional)

In the following example, we show the steps to create a custom project profile by bundling selected blueprints. We also show the steps to authorize only restricted users to use this project profile template. This example creates a custom project profile with selective blueprints. This enables the user to create a data lake environment with AWS Glue database and Athena workgroup to query the data. The user can also create an Amazon MWAA environment for orchestration. You can also change or override the configuration parameters of the blueprint by using the Tooling configurations option within the project profile.

Because SageMaker Unified Studio is in preview mode, the naming conventions of some visual elements might appear different in the current version.

When you create a project profile, you can add blueprint deployment settings in two modes: on create and on demand. On create mode deploys the blueprint deployment settings as soon as the project is created. On demand mode deploys the blueprint deployment settings when users need them.

Create a project, create domain units, and delegate ownership (optional)

In the following example, the administrator logs in to SageMaker Unified Studio and creates the retail domain unit. The admin also delegates ownership to the retail business user. The retail business user logs in to SageMaker Unified Studio and creates a project with the authorized project profile template.

With these configurations in place, you have successfully completed the initial infrastructure plane deployment from an administrative perspective.

Authorization of blueprints (optional)

By default, all domain users have authorization to create projects with the enabled blueprints across domain units. If you want to restrict the usage of the blueprint within a specific domain unit (in this case, the retail domain unit, as shown in the following screenshot), you need to revoke the existing permissions and authorize the specific domain units. By limiting the use of blueprints to a particular domain unit, users can only create projects using the blueprint within that domain unit. To apply authorization settings to child domain units, enable the Cascade to all child domain units option.

blueprints authorization
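To make the effect of the cascade option concrete, the following is a minimal Python model of a domain unit hierarchy. It is only a sketch: the `CHILDREN` mapping, the unit names other than retail, and the `authorized_units` function are hypothetical and do not correspond to any SageMaker Unified Studio API.

```python
# Hypothetical domain unit hierarchy: parent -> list of child domain units.
CHILDREN = {
    "root": ["retail", "finance"],
    "retail": ["retail-emea", "retail-apac"],
}


def authorized_units(unit, cascade, children=CHILDREN):
    """Return the domain units in which a blueprint may be used."""
    units = [unit]
    if cascade:
        # With cascading enabled, every descendant inherits the authorization.
        for child in children.get(unit, []):
            units.extend(authorized_units(child, True, children))
    return units


print(authorized_units("retail", cascade=False))
print(authorized_units("retail", cascade=True))
```

Without cascading, only the retail domain unit itself is authorized; with cascading, the authorization also reaches its child domain units in this model.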

Clean up

Make sure you remove the SageMaker Unified Studio resources to mitigate any unexpected costs. This involves a few steps:

  1. If you had multiple projects and subscribed to assets, unsubscribe from all assets.
  2. Note the names of all AWS Glue databases and Athena workgroups created by your projects.
  3. Delete any connections you created in the data explorer that you don’t want to keep.
  4. Note the project IDs.
  5. Delete the projects. If you encounter any errors, check the AWS CloudFormation console and find the failed stack. Fix the error that failed the stack deletion and delete the projects.
  6. Note down the domain ID.
  7. Delete the domain.
  8. Delete the S3 bucket named amazon-datazone-AWSACCOUNTID-AWSREGION-DOMAINID.
  9. Delete the AWS Glue databases and Athena workgroups you noted earlier.
  10. Delete the CloudFormation stack for the VPC (if you followed that step in the setup).

If you have additional resources that haven’t been deleted, you can also use tags to identify and delete specific resources.
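As a small aid for step 8, the following Python sketch assembles the bucket name from the pattern given in the cleanup steps. The function name `datazone_bucket_name` and the sample account ID, Region, and domain ID are hypothetical placeholders; substitute the values you noted during cleanup.

```python
def datazone_bucket_name(account_id, region, domain_id):
    """Build the bucket name following amazon-datazone-AWSACCOUNTID-AWSREGION-DOMAINID."""
    return f"amazon-datazone-{account_id}-{region}-{domain_id}"


# Placeholder values for illustration only.
name = datazone_bucket_name("111122223333", "us-east-1", "exampledomainid")
print(name)
```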

Conclusion

In this post, we discussed the foundational building blocks of SageMaker Unified Studio and how, by abstracting complex technical implementations behind user-friendly interfaces, organizations can maintain standardized governance while enabling efficient resource management across business units. This approach provides consistency in infrastructure deployment while providing the flexibility needed for diverse business requirements.

To learn more, refer to the Amazon SageMaker Unified Studio Administrator Guide.

Appendix: Multi-account administration

This section illustrates the cross-account association. After the account invitation is accepted by the associated account owner, follow the instructions as shown in the following example to understand how to enable the blueprints. After the blueprints are enabled in the associated accounts, the root domain account can create project profile templates with the parameters of the associated account, including its linked blueprints. The example then demonstrates how the retail domain unit user can deploy compute resources and create data using the resources from the associated account.


About the Authors

Lakshmi Nair is a Senior Analytics Specialist Solutions Architect at AWS. She specializes in designing advanced analytics systems across industries. She focuses on crafting cloud-based data platforms, enabling real-time streaming, big data processing, and robust data governance. She can be reached via LinkedIn.

Fabrizio Napolitano is a Principal Specialist Solutions Architect for DB and Analytics. He has worked in the analytics space for the last 20 years, and has recently and quite by surprise become a Hockey Dad after moving to Canada.

Exabyte Scale Hard Drive Investments

Post Syndicated from Chris Opat original https://www.backblaze.com/blog/exabyte-scale-hard-drive-investments/

A decorative image showing several servers connected to the same network.

Not many companies run exabyte scale data platforms, and not many companies open source their drive data—at Backblaze, we do both. From that perch, I’m sharing how I think about buying hard drives at exabyte scale, including the intentional design decisions and trade-offs I make as an expert in the field, and what you can apply to your own operations whether you’re running a couple hundred terabytes or petabytes on-premises. 

TL;DR: Bigger drives aren’t always better

You’d think, as a cloud platform managing massive amounts of data, we’d be delighted that drive density continues to grow. But it’s not as simple as that. While we do run cohorts of 20TB+ drives in our environment, there are a few reasons it doesn’t always make sense to fill our servers up with the densest drives we can buy.

Drive size and IOPS starvation

Drives have a finite amount of capacity to perform input/output operations per second (IOPS). The larger the drive, the more those IOPS become a contentious consumable—creating a triangle of tension between storage capacity, reading, and writing. You can store more data on a 20TB drive, but you can only read and write as fast as that one drive allows. Conversely, you can store the same amount of data on five 4TB drives and 5x your IOPS capacity through concurrency. 
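The arithmetic behind that comparison can be sketched in a few lines of Python. The per-drive figure of 150 IOPS is an assumed, illustrative number rather than a measured one, and the sketch assumes every drive delivers roughly the same IOPS regardless of its capacity.

```python
ASSUMED_IOPS_PER_DRIVE = 150  # hypothetical figure, for illustration only


def aggregate_iops(drive_count, iops_per_drive=ASSUMED_IOPS_PER_DRIVE):
    """Total IOPS available when requests are spread across all drives."""
    return drive_count * iops_per_drive


def iops_per_tb(drive_count, drive_tb, iops_per_drive=ASSUMED_IOPS_PER_DRIVE):
    """IOPS available per terabyte stored."""
    return aggregate_iops(drive_count, iops_per_drive) / (drive_count * drive_tb)


one_big = aggregate_iops(1)      # one 20TB drive
five_small = aggregate_iops(5)   # five 4TB drives, same 20TB total

print(five_small / one_big)      # ratio of aggregate IOPS
print(iops_per_tb(1, 20), iops_per_tb(5, 4))
```

Under these assumptions the total capacity is identical, but the IOPS available per stored terabyte differs by the same factor as the drive count.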

For high demand workloads with high concurrency requirements for reading and writing files—like AI inferencing, for example—you’ll want to carefully consider the balance point between the right drive size and the performance you need to get out of the system. The ability to read, write, or delete content has to peacefully coexist with the ability for your storage infrastructure to service any of those three needs. Now, you might be thinking: If that’s a constraint, what about SSDs? I’ll get to that down below.

Drive size and rebuilds

When managing large data at scale, we employ Reed-Solomon erasure coding to rebuild drives upon failure to maintain data durability. The larger the drive, the more painful and slow the rebuild when that drive eventually fails. The rebuild process can take hours or even days, depending on the size of the drive and the workload on the system. That can impact performance, especially if the storage system is already under heavy use, and increases the risk of another failure while the rebuild is in progress. While we mitigate that risk in a variety of ways, it may not be feasible for smaller shops to do so.
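A rough lower bound on rebuild time is simply drive capacity divided by the sustained rate at which data can be written back. The Python sketch below uses an assumed rate of 200 MB/s, which is a hypothetical number; real rebuilds compete with production traffic and are usually slower than this bound.

```python
def min_rebuild_hours(drive_tb, rebuild_mb_per_s):
    """Lower bound on rebuild time: capacity divided by sustained rebuild rate."""
    total_bytes = drive_tb * 1e12                      # decimal terabytes
    seconds = total_bytes / (rebuild_mb_per_s * 1e6)   # decimal megabytes per second
    return seconds / 3600


ASSUMED_RATE_MB_PER_S = 200  # hypothetical sustained rebuild rate

for size_tb in (4, 20):
    hours = min_rebuild_hours(size_tb, ASSUMED_RATE_MB_PER_S)
    print(f"{size_tb}TB drive: at least {hours:.1f} hours")
```

At a fixed rate the bound scales linearly with capacity, so a drive five times larger takes at least five times as long to rebuild.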

If you’re in a business that relies on real-time data access—financial institutions, healthcare providers, e-commerce platforms, for example—you need drives that balance capacity and rebuild speed. Higher-capacity drives may offer better storage density but smaller or enterprise-grade drives with faster rebuild times and higher endurance may be a better choice for businesses where continuous uptime and/or durability is critical.

HDD vs. SSD: Unit economics

The moral of the story is that the way you invest in drives, and how much you take things like drive size, drive type, and the failure rates we publish into consideration absolutely depends on your use case. It’s not as simple as looking at our Drive Stats and picking the drive with the lowest annualized failure rate. 

In Backblaze’s early days, when we were focused on consumer backup, drive density and durability were the most important part of the equipment for us. We didn’t care about speed. As our customers increasingly bring us newer and more demanding use cases, our calculus for the kinds of drives we fill our data centers with will change with them. 

The post Exabyte Scale Hard Drive Investments appeared first on Backblaze Blog | Cloud Storage & Cloud Backup

New leadership for Asahi Linux

Post Syndicated from corbet original https://lwn.net/Articles/1009528/

The Asahi Linux project, which is working to support Linux on Apple
silicon, has announced the
resignation of Hector “marcan” Martin as its lead, and his replacement by a
seven-person committee. “Today’s news is bittersweet. We are grateful
to marcan for kicking off this project and tirelessly working on it these
past years. Our community will miss him. Still, with your support, the
project has a bright future to come”. Martin has explained his reasons
for leaving at length in this blog post.

[$] Multi-size THP creation, two different ways

Post Syndicated from corbet original https://lwn.net/Articles/1009039/

Huge pages can increase the performance of many programs, but they can also
have unfortunate performance impacts of their own. Over the last few
years, multi-size transparent huge pages (mTHPs) have increasingly been
seen as a happy medium that brings the benefits of huge pages at a lower cost.
The system cannot benefit from mTHPs, though, if it does not create them;
two developers have independently posted patches to enable the creation of
mTHPs in the background.

CVE-2025-1094: PostgreSQL psql SQL injection (FIXED)

Post Syndicated from Stephen Fewer original https://blog.rapid7.com/2025/02/13/cve-2025-1094-postgresql-psql-sql-injection-fixed/

Rapid7 discovered a high-severity SQL injection vulnerability, CVE-2025-1094, affecting the PostgreSQL interactive tool psql. This discovery was made while Rapid7 was performing research into the recent exploitation of CVE-2024-12356 — an unauthenticated remote code execution (RCE) vulnerability that affects both BeyondTrust Privileged Remote Access (PRA) and BeyondTrust Remote Support (RS). Rapid7 discovered that in every scenario we tested, a successful exploit for CVE-2024-12356 had to include exploitation of CVE-2025-1094 in order to achieve remote code execution. While CVE-2024-12356 was patched by BeyondTrust in December 2024, and this patch successfully blocks exploitation of both CVE-2024-12356 and CVE-2025-1094, the patch did not address the root cause of CVE-2025-1094, which remained a zero-day until Rapid7 discovered and reported it to PostgreSQL.

All supported versions before PostgreSQL 17.3, 16.7, 15.11, 14.16, and 13.19 are affected. CVE-2025-1094 has a CVSS 3.1 base score of 8.1 (High). More information is available in the PostgreSQL advisory.

Impact

CVE-2025-1094 arises from an incorrect assumption that when attacker-controlled untrusted input has been safely escaped via PostgreSQL’s string escaping routines, it cannot be leveraged to generate a successful SQL injection attack. Rapid7 found that SQL injection is, in fact, still possible in a certain scenario when escaped untrusted input is included as part of a SQL statement executed by the interactive psql tool.

Because of how PostgreSQL string escaping routines handle invalid UTF-8 characters, in combination with how invalid byte sequences within the invalid UTF-8 characters are processed by psql, an attacker can leverage CVE-2025-1094 to generate a SQL injection.

An attacker who can generate a SQL injection via CVE-2025-1094 can then achieve arbitrary code execution (ACE) by leveraging the interactive tool’s ability to run meta-commands. Meta-commands extend the interactive tool’s functionality by providing a wide variety of additional operations that the interactive tool can perform. The meta-command identified by the exclamation mark symbol (\!) allows for an operating system shell command to be executed. An attacker can leverage CVE-2025-1094 to perform this meta-command, thus controlling the operating system shell command that is executed.

Alternatively, an attacker who can generate a SQL injection via CVE-2025-1094 can execute arbitrary attacker-controlled SQL statements.

Credit

This vulnerability was discovered by Stephen Fewer, Principal Security Researcher at Rapid7 and is being disclosed in accordance with Rapid7’s vulnerability disclosure policy.

Analysis

A technical analysis of CVE-2025-1094, as it relates to the exploitation of the BeyondTrust vulnerability CVE-2024-12356, is available in AttackerKB.

A Metasploit exploit module that exploits CVE-2025-1094 against a vulnerable BeyondTrust Privileged Remote Access (PRA) and Remote Support (RS) target is available here.

Vendor Statement

The PostgreSQL Global Development Group provides information on security vulnerability reporting, release processes, and known vulnerability fixes at https://www.postgresql.org/support/security/.

Remediation

To remediate CVE-2025-1094, PostgreSQL users should upgrade to PostgreSQL 17.3, 16.7, 15.11, 14.16, or 13.19. For additional details, please see the PostgreSQL advisory.
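The fixed versions listed above can be turned into a quick check. The following Python sketch is an illustration only: the function name `is_affected` is hypothetical, it expects a plain "major.minor" version string, and it rejects major versions outside the supported branches named in this post rather than guessing their status.

```python
# First fixed minor release for each supported major version, as listed above.
FIXED_MINOR = {17: 3, 16: 7, 15: 11, 14: 16, 13: 19}


def is_affected(version):
    """Return True if a 'major.minor' PostgreSQL version predates the fix."""
    major_str, _, minor_str = version.partition(".")
    major, minor = int(major_str), int(minor_str)
    if major not in FIXED_MINOR:
        # Branches not listed in the advisory summary are out of scope here.
        raise ValueError(f"PostgreSQL {major} is not covered by this check")
    return minor < FIXED_MINOR[major]


for v in ("16.6", "17.3", "13.18"):
    print(v, "affected" if is_affected(v) else "fixed")
```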

Rapid7 customers

InsightVM and Nexpose customers will be able to assess their exposure to CVE-2025-1094 with an authenticated vulnerability check expected to be available in today’s (February 13) content release.

For CVE-2024-12356 affecting BeyondTrust Privileged Remote Access (PRA) and Remote Support (RS) products, InsightVM and Nexpose customers have been able to assess exposure with authenticated checks for Windows systems (Scan Engine only checks) as of the February 10, 2025 content release.

Disclosure timeline

  • January 27, 2025: Rapid7 makes initial contact with the PostgreSQL security team and discloses vulnerability details.
  • January 29, 2025: The PostgreSQL development group confirms the finding; Rapid7 and PostgreSQL developers agree on a coordinated disclosure date.
  • February 11, 2025: The PostgreSQL development group provides a CVE ID and affected versions.
  • February 13, 2025: This disclosure.

Automatic Audit Logs: new updates deliver increased transparency and accountability

Post Syndicated from Sahidya Devadoss original https://blog.cloudflare.com/introducing-automatic-audit-logs/

What are audit logs and why do they matter?

Audit logs are a critical tool for tracking and recording changes, actions, and resource access patterns within your Cloudflare environment. They provide visibility into who performed an action, what the action was, when it occurred, where it happened, and how it was executed. This enables security teams to identify vulnerabilities, ensure regulatory compliance, and assist in troubleshooting operational issues. Audit logs provide critical transparency and accountability. That’s why we’re making them “automatic” — eliminating the need for individual Cloudflare product teams to manually send events. Instead, audit logs are generated automatically in a standardized format when an action is performed, providing complete visibility and ensuring comprehensive coverage across all our products.

What’s new?

We’re excited to announce the beta release of Automatic Audit Logs — a system that unifies audit logging across Cloudflare products. This new system is designed to give you a complete and consistent view of your environment’s activity. Here’s how we’ve enhanced our audit logging capabilities:

  • Standardized logging: Previously, audit log generation depended on separate internal teams, which could lead to gaps and inconsistencies. Now, audit logs are automatically produced in a seamless and standardized way, eliminating reliance on individual teams and ensuring consistency across all Cloudflare services.

  • Expanded Product Coverage: Automatic Audit Logs now extend our coverage from 62 to 111 products, boosting overall coverage from 75% to 95%. We now capture actions from key endpoints such as the /accounts, /zones, and /organizations APIs.

  • Granular Filtering: With uniformly formatted logs, you can quickly pinpoint specific actions, users, methods, and resources, making investigations faster and more efficient.

  • Enhanced Context and Transparency: Each log entry includes detailed context like the authentication method used, whether the action was performed via the API or Dashboard, and mappings to Cloudflare Ray IDs for better traceability.

  • Comprehensive Activity Capture: In addition to create, edit, and delete actions, the system now records GET requests and failed attempts, ensuring that no critical activity goes unnoticed.

This new system reflects Cloudflare’s commitment to building a safer, more transparent Internet. It also supports Cloudflare’s pledge to CISA’s Cybersecurity Commitment, reinforcing our dedication to increase our customers’ ability to gather evidence of cybersecurity intrusions.

Automatic Audit Logs (beta release) is available exclusively through the API.

The journey of an audit log: how Cloudflare creates reliable, secure records

At Cloudflare, we’ve always made audit logs available through the Audit Log API, but the experience has not been very consistent.

Why? Individual product teams were responsible for creating and maintaining their audit logs. This resulted in inconsistencies, gaps in coverage, and a fragmented user experience.

Recognizing the importance of reliable audit logs, we set out to improve coverage across all Cloudflare products. Our goal was to standardize, secure, and automate the process, giving users comprehensive insights into user-initiated actions while enhancing visibility and usability. Let’s take a closer look at how an audit log is created at Cloudflare.

Which APIs are audit logged? 

Audit logs are generated for all user requests made via the public API or the Cloudflare dashboard. While a few exceptions exist, such as GraphQL requests and static assets, the majority of user actions are captured.

When a user action occurs, the request is forwarded to our audit logging pipeline. This ensures that logs are generated automatically for all products, close to the source of the action, and that they capture the most relevant details.

For RESTful APIs that produce JSON, sanitized request bodies are logged to prevent any sensitive information from being included in the audit logs. For GET requests, which are typically read-only and may generate large responses, only the action performed and the resource accessed are logged, avoiding unnecessary overhead while still maintaining essential visibility.

Streaming HTTP requests

Any user-initiated action on Cloudflare, whether through the API or the Dashboard, is handled by the API Gateway. The HTTP request, along with its corresponding request and response data, is then forwarded to a Worker called the Audit Log Redactor. This allows audit logging to happen automatically without relying on internal teams to send events.

To minimize latency, the API Gateway streams these requests to the redactor Worker via RPC (Remote Procedure Calls) using service bindings. This approach ensures the requests are successfully sent without going through a publicly accessible URL.

Redacting sensitive information

Once the Worker receives the HTTP request, it references the Cloudflare OpenAPI Schema to handle sensitive information. OpenAPI is a widely adopted, machine-readable, and human-friendly specification format that is used to define HTTP APIs. It relies on JSON Schema to describe the API’s underlying data.  

Using the OpenAPI Schema, the redactor Worker identifies the corresponding API schema for the HTTP request. It then redacts any sensitive information, leaving only those fields explicitly marked as auditable in the schema. This redaction process ensures that no sensitive data progresses further down the pipeline while retaining enough information to debug and analyze how an action impacted a resource’s value.

Each Cloudflare product team defines its APIs within the OpenAPI schema and marks specific fields as auditable. This provides visibility into resource changes while safeguarding sensitive data.
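The allow-list redaction described above can be sketched roughly as follows. This is a minimal illustration, not Cloudflare's implementation: the `auditable` dictionary is a hypothetical stand-in for the fields marked as auditable in the OpenAPI schema (the real annotation format is not shown here), and `redact`, `webhook_secret`, and `internal_note` are invented names.

```python
def redact(body: dict, auditable: dict) -> dict:
    """Keep only the fields that the (hypothetical) schema marks as auditable.

    `auditable` maps a field name to True (keep the value) or to a nested
    dict describing which sub-fields to keep. Everything else is dropped.
    """
    kept = {}
    for key, rule in auditable.items():
        if key not in body:
            continue
        value = body[key]
        if isinstance(rule, dict) and isinstance(value, dict):
            kept[key] = redact(value, rule)  # recurse into nested objects
        elif rule is True:
            kept[key] = value
    return kept


# Hypothetical request body; only some fields are marked as auditable.
request_body = {
    "name": "Advanced Security Events Alert",
    "enabled": False,
    "webhook_secret": "s3cr3t",
    "filters": {"zones": ["<zone_id>"], "internal_note": "do not log"},
}
auditable = {"name": True, "enabled": True, "filters": {"zones": True}}

redacted = redact(request_body, auditable)
print(redacted)
```

Working from an allow-list means a newly added field stays out of the logs until someone explicitly marks it as auditable, which matches the "only those explicitly marked" behavior described above.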

Once redacted, the data moves through Cloudflare’s data pipeline. This pipeline includes several key components, including Logfwdr, Logreceiver, and Buftee buffers, where the sanitized data is eventually pushed, awaiting further processing.


Ingesting and building the audit log

The Ingestor service consumes messages from Buftee buffers and transforms individual requests into audit log records. Using a fixed schema, the Ingestor ensures that audit logs remain standardized across all Cloudflare products, regardless of scale.

Because API Gateway — the system from which the majority of Automatic Audit Logs are recorded, as noted above — handles tens of thousands of requests per second, the Ingestor was designed to process multiple requests concurrently. 


Plot of the audit request rate. The x-axis indicates time and the y-axis indicates the total number of audit requests handled per second.

Enriching and storing the logs

From a security perspective, it is critical to capture who initiated a change and how they were authenticated. To achieve this, the audit log is enriched with user details and authentication information extracted from custom response headers.

Additional contextual details, such as the account name, are retrieved by making calls to internal services. To enhance performance, a read-through caching mechanism is used. The system first checks the cache for a response; if none is found, it fetches the data from internal services and caches it for future use.
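A read-through cache of this kind can be sketched in a few lines. This is an illustrative example only: `ReadThroughCache`, `fetch_account_name`, and the in-memory dictionary are assumptions, and a production cache would also need expiry and size limits.

```python
class ReadThroughCache:
    """Check the cache first; on a miss, call the loader and remember the result."""

    def __init__(self, loader):
        self._loader = loader  # function that fetches a value from the backing service
        self._store = {}

    def get(self, key):
        if key in self._store:
            return self._store[key]  # cache hit: no call to the backing service
        value = self._loader(key)    # cache miss: fetch from the backing service
        self._store[key] = value     # keep it for future lookups
        return value


calls = []


def fetch_account_name(account_id):
    """Hypothetical stand-in for a call to an internal service."""
    calls.append(account_id)
    return f"Account {account_id}"


cache = ReadThroughCache(fetch_account_name)
first = cache.get("abc123")
second = cache.get("abc123")  # served from the cache; the loader is not called again
print(first, second, len(calls))
```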

Once the audit logs are fully transformed and enriched, they are stored in a database in batches to prevent overwhelming the system. For the beta release, we are storing 30 days of audit logs in the database. This will be extended to 18 months for our GA (General Availability) release in the second half of 2025.
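Writing in batches can be illustrated with a small generator that groups records before they are handed to a database write. The batch size and the `batches` helper are assumptions for the sake of the example; the source does not say how batches are sized or flushed.

```python
from typing import Iterable, Iterator, List


def batches(records: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield lists of at most `size` records, preserving their order."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller batch


audit_logs = [{"id": i} for i in range(5)]
grouped = list(batches(audit_logs, size=2))
for group in grouped:
    # A real pipeline would issue one bulk insert per group here.
    print(len(group), group)
```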

Sample audit log

Here is a complete sample audit log generated when an alert notification policy is updated. It provides all the essential details to answer the who, what, when, where, and how of the action.

Audit logs are always associated with an account, and some actions also include user and zone information when relevant. The action section outlines what changed and when, while the actor section provides context on who made the change and how it was performed, including whether it was done via the API or through the UI.

Information about the resource is also included, so you can easily identify what was altered (in this case, the Advanced Security Events Alert was updated). Additionally, raw API request details are provided, allowing users to trace the audit log back to a specific API call.

curl -X PUT https://api.cloudflare.com/client/v4/accounts/<account_id>/alerting/v3/policies/<policy_id> --data-raw '{...}'
       {
            "account": {
                "id": "<account_id>",
                "name": "Example account"
            },
            "action": {
                "description": "Update a Notification policy",
                "result": "success",
                "time": "2025-01-23T18:25:14.749Z",
                "type": "update"
            },
            "actor": {
                "context": "dash",
                "email": "<actor_email>",
                "id": "<actor-id>",
                "ip_address": "127.0.0.1",
                "token": {},
                "type": "user"
            },
            "id": "<audit_log_id>",
            "raw": {
                "cf_ray_id": "<ray_id>",
                "method": "PUT",
                "status_code": 200,
                "uri": "/accounts/<account_id>/alerting/v3/policies/<policy_id>",
                "user_agent": "Postman"
            },
            "resource": {
                "id": "<resource-id>",
                "product": "alerting",
                "request": {
                    "alert_type": "clickhouse_alert_fw_ent_anomaly",
                    "enabled": false,
                    "filters": {
                        "services": [
                            "securitylevel",
                            "ratelimit",
                            "firewallrules"
                        ],
                        "zones": [
                            "<zone_id>"
                        ]
                    },
                    "name": "Advanced Security Events Alert"
                },
                "response": {
                    "id": "<resource_id>"
                },
                "scope": "accounts",
                "type": "policies"
            }
       }
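To show how the sections of such a record answer the who, what, when, where, and how, here is a small sketch that reads a trimmed copy of the sample above. The `summarize` function and the wording of its output are illustrative choices; only the field names come from the sample.

```python
import json

# Trimmed copy of the sample audit log above (placeholders left as-is).
SAMPLE = """
{
  "account": {"id": "<account_id>", "name": "Example account"},
  "action": {"description": "Update a Notification policy", "result": "success",
             "time": "2025-01-23T18:25:14.749Z", "type": "update"},
  "actor": {"context": "dash", "id": "<actor-id>", "ip_address": "127.0.0.1", "type": "user"},
  "raw": {"method": "PUT", "status_code": 200,
          "uri": "/accounts/<account_id>/alerting/v3/policies/<policy_id>"},
  "resource": {"product": "alerting", "scope": "accounts", "type": "policies"}
}
"""


def summarize(log: dict) -> dict:
    """Map the sections of an audit log record to who/what/when/where/how."""
    actor, action, raw, resource = log["actor"], log["action"], log["raw"], log["resource"]
    return {
        "who": f'{actor["type"]} {actor["id"]}',
        "what": f'{action["type"]} ({action["result"]}): {action["description"]}',
        "when": action["time"],
        "where": f'{resource["product"]}/{resource["type"]} in account {log["account"]["id"]}',
        "how": f'{actor["context"]} via {raw["method"]} {raw["uri"]}',
    }


summary = summarize(json.loads(SAMPLE))
for question, answer in summary.items():
    print(f"{question}: {answer}")
```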

Upcoming enhancements

For General Availability (GA) we are focusing on developing a new user interface in the Dashboard for Automatic Audit Logs, extracting additional auditable fields for the audit logs — including system-initiated actions and user-level actions such as login events — and enabling audit log export via Logpush. In the longer term, we plan to introduce dashboards, trend analysis, and alerting features for audit logs to further enhance their utility and ease of use. By enhancing our audit log system, Cloudflare is taking another step toward empowering users to manage their environments with greater transparency, security, and efficiency. 

Get started with Automatic Audit Logs

Automatic Audit Logs are now available for testing. We encourage you to explore the new features and provide your valuable feedback.

Retrieve audit logs using the following endpoint:

/accounts/<account_id>/logs/audit?since=<date>&before=<date>
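As a sketch of how a client might build a request for this endpoint, the example below assembles the URL and headers without sending anything. Several details are assumptions rather than documented behavior: the base URL is taken from the curl example earlier in this post, the ISO-style dates stand in for `<date>`, and bearer-token authentication is assumed. Check the API documentation for the exact parameter formats.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Base URL as it appears in the curl example earlier in this post.
API_BASE = "https://api.cloudflare.com/client/v4"


def build_audit_log_request(account_id: str, since: str, before: str, token: str) -> Request:
    """Build (but do not send) a GET request for the audit log endpoint."""
    query = urlencode({"since": since, "before": before})
    url = f"{API_BASE}/accounts/{quote(account_id, safe='')}/logs/audit?{query}"
    # Assumption: the API accepts an API token as a bearer token.
    return Request(url, headers={"Authorization": f"Bearer {token}"})


req = build_audit_log_request("abc123", "2025-02-01", "2025-02-08", "<api_token>")
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; that step is left out so the example runs without credentials.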

You can access detailed documentation for the Automatic Audit Logs Beta API release here.

Please note that the Beta release does not include updates to the Audit Logs UI in the Cloudflare Dashboard. The existing UI and API for the current audit logs will remain available until Automatic Audit Logs reach General Availability.

We want your feedback: Your feedback is essential to improving Automatic Audit Logs. Please consider filling out a short survey.

Security updates for Thursday

Post Syndicated from jake original https://lwn.net/Articles/1009450/

Security updates have been issued by AlmaLinux (doxygen and openssl), Debian (dcmtk and webkit2gtk), Fedora (chromium, clevis-pin-tpm2, envision, fido-device-onboard, gotify-desktop, keylime-agent-rust, keyring-ima-signer, libkrun, python3.10, python3.11, python3.14, rust-afterburn, rust-cargo-vendor-filterer, rust-coreos-installer, rust-eif_build, rust-gst-plugin-reqwest, rust-nu, rust-openssl, rust-openssl-sys, rust-pore, rust-rpm-sequoia, rust-sequoia-keyring-linter, rust-sequoia-octopus-librnp, rust-sequoia-policy-config, rust-sequoia-sqv, rust-sevctl, rust-snphost, rust-tealdeer, rustup, and s390utils), Mageia (ffmpeg, php-tcpdf, python-tornado, and subversion), Red Hat (openssl and python-jinja2), SUSE (crun, glibc, kernel, libngtcp2-16, libtasn1, netty, ovmf, podman, python, and python3), and Ubuntu (ansible, digikam, linux-aws, linux-aws-5.15, linux-azure-6.8, and ruby2.7).

DOGE as a National Cyberattack

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2025/02/doge-as-a-national.html

In the span of just weeks, the US government has experienced what may be the most consequential security breach in its history—not through a sophisticated cyberattack or an act of foreign espionage, but through official orders by a billionaire with a poorly defined government role. And the implications for national security are profound.

First, it was reported that people associated with the newly created Department of Government Efficiency (DOGE) had accessed the US Treasury computer system, giving them the ability to collect data on and potentially control the department’s roughly $5.45 trillion in annual federal payments.

Then, we learned that uncleared DOGE personnel had gained access to classified data from the US Agency for International Development, possibly copying it onto their own systems. Next, the Office of Personnel Management—which holds detailed personal data on millions of federal employees, including those with security clearances—was compromised. After that, Medicaid and Medicare records were compromised.

Meanwhile, only partially redacted names of CIA employees were sent over an unclassified email account. DOGE personnel are also reported to be feeding Education Department data into artificial intelligence software, and they have also started working at the Department of Energy.

This story is moving very fast. On Feb. 8, a federal judge blocked the DOGE team from accessing the Treasury Department systems any further. But given that DOGE workers have already copied data and possibly installed and modified software, it’s unclear how this fixes anything.

In any case, breaches of other critical government systems are likely to follow unless federal employees stand firm on the protocols protecting national security.

The systems that DOGE is accessing are not esoteric pieces of our nation’s infrastructure—they are the sinews of government.

For example, the Treasury Department systems contain the technical blueprints for how the federal government moves money, while the Office of Personnel Management (OPM) network contains information on who and what organizations the government employs and contracts with.

What makes this situation unprecedented isn’t just the scope, but also the method of attack. Foreign adversaries typically spend years attempting to penetrate government systems such as these, using stealth to avoid being seen and carefully hiding any tells or tracks. The Chinese government’s 2015 breach of OPM was a significant US security failure, and it illustrated how personnel data could be used to identify intelligence officers and compromise national security.

In this case, external operators with limited experience and minimal oversight are doing their work in plain sight and under massive public scrutiny: gaining the highest levels of administrative access and making changes to the United States’ most sensitive networks, potentially introducing new security vulnerabilities in the process.

But the most alarming aspect isn’t just the access being granted. It’s the systematic dismantling of security measures that would detect and prevent misuse—including standard incident response protocols, auditing, and change-tracking mechanisms—by removing the career officials in charge of those security measures and replacing them with inexperienced operators.

The Treasury’s computer systems have such an impact on national security that they were designed with the same principle that guides nuclear launch protocols: No single person should have unlimited power. Just as launching a nuclear missile requires two separate officers turning their keys simultaneously, making changes to critical financial systems traditionally requires multiple authorized personnel working in concert.

This approach, known as “separation of duties,” isn’t just bureaucratic red tape; it’s a fundamental security principle as old as banking itself. When your local bank processes a large transfer, it requires two different employees to verify the transaction. When a company issues a major financial report, separate teams must review and approve it. These aren’t just formalities—they’re essential safeguards against corruption and error. These measures have been bypassed or ignored. It’s as if someone found a way to rob Fort Knox by simply declaring that the new official policy is to fire all the guards and allow unescorted visits to the vault.

The implications for national security are staggering. Sen. Ron Wyden said his office had learned that the attackers gained privileges that allow them to modify core programs in Treasury Department computers that verify federal payments, access encrypted keys that secure financial transactions, and alter audit logs that record system changes. Over at OPM, reports indicate that individuals associated with DOGE connected an unauthorized server into the network. They are also reportedly training AI software on all of this sensitive data.

This is much more critical than the initial unauthorized access. These new servers have unknown capabilities and configurations, and there’s no evidence that this new code has gone through any rigorous security testing protocols. The AIs being trained are certainly not secure enough for this kind of data. All are ideal targets for any adversary, foreign or domestic, also seeking access to federal data.

There’s a reason why every modification—hardware or software—to these systems goes through a complex planning process and includes sophisticated access-control mechanisms. The national security crisis is that these systems are now much more vulnerable to dangerous attacks at the same time that the legitimate system administrators trained to protect them have been locked out.

By modifying core systems, the attackers have not only compromised current operations, but have also left behind vulnerabilities that could be exploited in future attacks—giving adversaries such as Russia and China an unprecedented opportunity. These countries have long targeted these systems. And they don’t just want to gather intelligence—they also want to understand how to disrupt these systems in a crisis.

The technical details of how these systems operate, their security protocols, and their vulnerabilities are now potentially exposed to unknown parties without any of the usual safeguards. Instead of having to breach heavily fortified digital walls, these parties can simply walk through doors that are being propped open, and then erase evidence of their actions.

The security implications span three critical areas.

First, system manipulation: External operators can now modify operations while also altering audit trails that would track their changes. Second, data exposure: Beyond accessing personal information and transaction records, these operators can copy entire system architectures and security configurations—in one case, the technical blueprint of the country’s federal payment infrastructure. Third, and most critically, is the issue of system control: These operators can alter core systems and authentication mechanisms while disabling the very tools designed to detect such changes. This is more than modifying operations; it is modifying the infrastructure that those operations use.

To address these vulnerabilities, three immediate steps are essential. First, unauthorized access must be revoked and proper authentication protocols restored. Next, comprehensive system monitoring and change management must be reinstated—which, given the difficulty of cleaning a compromised system, will likely require a complete system reset. Finally, thorough audits must be conducted of all system changes made during this period.

This is beyond politics—this is a matter of national security. Foreign national intelligence organizations will be quick to take advantage of both the chaos and the new insecurities to steal US data and install backdoors to allow for future access.

Each day of continued unrestricted access makes the eventual recovery more difficult and increases the risk of irreversible damage to these critical systems. While the full impact may take time to assess, these steps represent the minimum necessary actions to begin restoring system integrity and security protocols.

Assuming that anyone in the government still cares.

This essay was written with Davi Ottenheimer, and originally appeared in Foreign Policy.

Teaching about AI in K–12 education: Thoughts from the USA

Post Syndicated from Katharine Childs original https://www.raspberrypi.org/blog/teaching-about-ai-in-k-12-education-thoughts-from-the-usa/

As artificial intelligence continues to shape our world, understanding how to teach about AI has never been more important. Our new research seminar series brings together educators and researchers to explore approaches to AI and data science education. In the first seminar, we welcomed Shuchi Grover, Director of AI and Education Research at Looking Glass Ventures. Shuchi began by exploring the theme of teaching using AI, then moved on to discussing teaching about AI in K–12 (primary and secondary) education. She emphasised that it is crucial to teach about AI before using it in the classroom, and this blog post will focus on her insights in this area.

Shuchi Grover gave an insightful talk discussing how to teach about AI in K–12 education.

An AI literacy framework

From her research, Shuchi has developed a framework for teaching about AI that is structured as four interlocking components, each representing a key area of understanding:

  • Basic understanding of AI, which refers to foundational knowledge such as what AI is, types of AI systems, and the capabilities of AI technologies
  • Ethics and human–AI relationship, which includes the role of humans in regard to AI, ethical considerations, and public perceptions of AI
  • Computational thinking/literacy, which relates to how AI works, including building AI applications and training machine learning models
  • Data literacy, which addresses the importance of data, including examining data features, data visualisation, and biases

This framework shows the multifaceted nature of AI literacy, which involves an understanding of both technical aspects and ethical and societal considerations. 

Shuchi’s framework for teaching about AI includes four broad areas.

Shuchi emphasised the importance of learning about AI ethics, highlighting the topic of bias. There are many ways that bias can be embedded in applications of AI and machine learning, including through the data sets that are used and the design of machine learning models. Shuchi discussed supporting learners to engage with the topic through exploring bias in facial recognition software, sharing activities and resources to use in the classroom that can prompt meaningful discussion, such as this talk by Joy Buolamwini. She also highlighted the Kapor Foundation’s Responsible AI and Tech Justice: A Guide for K–12 Education, which contains questions that educators can use with learners to help them to carefully consider the ethical implications of AI for themselves and for society. 

Computational thinking and AI

In computer science education, computational thinking is generally associated with traditional rule-based programming — it has often been used to describe the problem-solving approaches and processes associated with writing computer programs following rule-based principles in a structured and logical way. However, with the emergence of machine learning, Shuchi described a need for computational thinking frameworks to be expanded to also encompass data-driven, probabilistic approaches, which are foundational for machine learning. This would support learners’ understanding and ability to work with the models that increasingly influence modern technology.

A group of young people and educators smiling while engaging with a computer.

Example activities from research studies

Shuchi shared that a variety of pedagogies have been used in recent research projects on AI education, ranging from hands-on experiences, such as using APIs for classification, to discussions focusing on ethical aspects. You can find out more about these pedagogies in her award-winning paper Teaching AI to K-12 Learners: Lessons, Issues and Guidance. This plurality of approaches ensures that learners can engage with AI and machine learning in ways that are both accessible and meaningful to them.

Research projects exploring teaching about AI and machine learning have involved a range of different approaches.

Shuchi shared examples of activities from two research projects that she has led:

  • CS Frontiers engaged high school students in a number of activities involving using NetsBlox and accessing real-world data sets. For example, in one activity, students participated in data science activities such as creating data visualisations to answer questions about climate change. 
  • AI & Cybersecurity for Teens explored approaches to teaching AI and machine learning to 13- to 15-year-olds through the use of cybersecurity scenarios. The project aimed to provide learners with insights into how machine learning models are designed, how they work, and how human decisions influence their development. An example activity guided students through building a classification model to analyse social media accounts to determine whether they may be bot accounts or accounts run by a human.
A screenshot from an activity to classify social media accounts 

Closing thoughts

At the end of her talk, Shuchi shared some final thoughts addressing teaching about AI to K–12 learners: 

  • AI learning requires contextualisation: Think about the data sets, ethical issues, and examples of AI tools and systems you use to ensure that they are relatable to learners in your context.
  • AI should not be a solution in search of a problem: Both teachers and learners need to be educated about AI before they start to use it in the classroom, so that they are informed consumers.

Join our next seminar

In our current seminar series, we are exploring teaching about AI and data science. Join us at our next seminar on Tuesday 11 March at 17:00–18:30 GMT to hear Lukas Höper and Carsten Schulte from Paderborn University discuss supporting middle school students to develop their data awareness. 

To sign up and take part in the seminar, click the button below — we will then send you information about joining. We hope to see you there.

I want to join the next seminar

The schedule of our upcoming seminars is online. You can catch up on past seminars on our previous seminars and recordings page.

The post Teaching about AI in K–12 education: Thoughts from the USA appeared first on Raspberry Pi Foundation.

The EU–Mercosur deal. Why are European farmers unhappy?

Post Syndicated from Анахит Хачикян original https://www.toest.bg/sdelkata-es-merkosur-zashto-sa-nedovolni-evropeyskite-zemedeltsi/

2024 was a year of farmers' protests across Europe. Angry farmers from France, Belgium, Germany, the Netherlands, Poland, and other countries repeatedly blockaded Brussels with tractors of various sizes, lit fires, and burned tires to voice their discontent with the European Union's environmental rules and common agricultural policy. There were also protests in Bulgarian, French, Greek, Polish, Spanish, and German cities. The trade agreement between the EU and Mercosur (Brazil, Argentina, Paraguay, and Uruguay), signed at the end of 2024, set the tone for the following year, in which farmers' protests in Europe are expected to be even stormier and fiercer.

Why are farmers protesting, and how does this affect all of us in Europe and in Bulgaria?

How often do you stop to think about where the steak on your plate or the bag of sugar in your cupboard comes from? Where were they produced and packaged before reaching the supermarket or the restaurant kitchen? For now at least, the likelihood that they were produced in Europe, or even in Bulgaria, is relatively high. According to European Commission data, around 80–90% of the food in European markets and grocery stores is produced in the European Union, which not only feeds its own population but is also the world's largest exporter of agricultural and food products.

Of course, it depends on the goods: on the one hand, the EU exports wine, pork, pasta and chocolate products, and dairy products. On the other hand, it imports fruit, nuts, cocoa, spices, palm oil, oilseeds, and soy. Most of what the EU imports in fact makes its later exports possible: imported cocoa goes into exported chocolate, and imported soy feeds the animals from which meat and dairy products for export are later made. The EU is increasingly keen to emphasize its trade potential in an uncertain world where it competes with the unpredictable United States, brazen China, slowly advancing India, and a number of other countries looking for a market in which to gain an initial foothold and, if possible, gradually flood.

In this respect, the EU–Mercosur agreement is the ideal marriage of convenience. It provides for the progressive elimination of customs duties, which will increase imports and exports in both directions and create one of the largest free trade areas, covering more than 780 million people. The EU is currently the Latin American region's second-largest trading partner after China, and in 2023 the Union exported goods worth 56 billion euros there.

It is estimated that if the agreement enters into force, the elimination of duties will save European exporters around 4 billion euros a year. More than 350 protected European designations, such as Parmigiano Reggiano cheese, Parma ham, and Bulgarian rose oil, will enjoy the same protected status on the Latin American market too, while South American products will become cheaper and more accessible to European consumers.

Дотук новините изглеждат по-скоро добри. Но както е казал Удхаус, привлекателността на лова зависи изцяло от това от коя страна на пушката се намираш. Европейските земеделци виждат като заплаха масивното нахлуване на латиноамерикански хранителни продукти, особено ако са на привлекателни цени и не са подчинени на същите строги производствени изисквания както храните, произведени в Европа. 

Използването на пестициди и антибиотици, експлоатацията на природни ресурси, както и неспазването на трудови и социални норми в страните от Меркосур ги превръща в нелоялни конкуренти на европейските земеделци, които вече от години недоволстват, че са принудени да се съобразяват с наложените им от Брюксел строги екологични и санитарни мерки. Сега южноамериканските им събратя ще получат достъп до същия пазар, без да има контрол над производствените им методи. 

In the European Union, livestock farmers have only a few days to register every newborn animal, and its short life journey from the barn to your kitchen can be traced and analysed: from the feed and medicines it receives, through the transport conditions on the way to the slaughterhouse, to the way its existence ends. That level of tracing is not possible for meat coming from outside the EU, which has long been present on the European market, though in smaller quantities.

European farmers are not the only ones unhappy with the trade agreement.

Environmentalists also warn of the danger of even larger-scale logging of the Amazon rainforest because of the expected rise in demand for farmland, which in turn will worsen climate change. Cattle raising is already responsible for 80% of deforestation in the Amazon rainforest, and Brazil is a leader in converting forests into cattle pastures and soy fields, two products that will attract even greater interest once the duties are removed.

The EU-Mercosur treaty mentions the climate targets for reducing greenhouse gases set out in the Paris Agreement, but it provides only for dialogue and monitoring, with no concrete penalty mechanisms or binding obligations concerning the environment. The consequences will also be negative for local people and indigenous populations in South America, who will not share in the profits from increased trade but will be forced to endure ever cruder interference in their traditional way of life. Not to mention the additional harmful emissions caused by shipping larger quantities of products across the ocean, and the inevitable air pollution, which already causes a range of serious illnesses and premature deaths in many parts of the world.

The undisputed winners are the large producers and exporters, both in South America and in Europe.

They, however, are a minority on both sides of the ocean. According to European Commission data from 2020, there are 9.1 million farms in the EU, of which about two thirds are smaller than 5 hectares; in other words, the majority of European farmers own small family farms where they grow produce for their own consumption and for the local market. It is they who find the EU-Mercosur deal insulting and see it as disregard for all the effort they have made in recent years to comply with the increasingly draconian production measures imposed by Brussels and to develop sustainable agriculture that protects the environment. To them the deal now looks like an unfair application of double standards that obligingly benefits other continents to the detriment of ours.

Some other benefits we do not talk about

But the Mercosur countries know very well that there is no free lunch, and that if the Europeans insist on this agreement, it is in order to obtain benefits beyond agricultural trade. One of them concerns the purchase of raw materials. The Mercosur countries will not tax exports to the EU of a range of materials, such as nickel, copper, aluminium, raw materials for steel, germanium, and others, which Europe badly needs and for which it currently depends mainly on China.

The South Americans will also gradually remove duties on imports of European cars, which are currently losing out to China on the global car market. That is why countries such as Germany, with its strong car industry, support the agreement, which will allow them to sell more German cars in South America, while countries such as France, which receives the largest subsidy under the Common Agricultural Policy and pursues a very strong national protectionist policy in agriculture, are against it. The supporters also include Spain, which for historical reasons has always maintained closer trade and economic ties with the region, as well as the Scandinavian countries, which as a rule defend free trade in the name of strategic interests.

In the opposing camp, France is for now timidly followed by Italy, the Netherlands, Poland, and Belgium, but some of them will prefer to slip away discreetly and abstain in the Council vote rather than voice loud disagreement.

Where do we stand?

According to European Commission data from 2016–2017 (the latest available at European level on relations between Bulgaria and the South American region), Mercosur ranked 21st among Bulgaria's trading partners outside the EU. A total of 183 Bulgarian companies exported products worth EUR 67 million to Mercosur, while imports amounted to EUR 79 million. The new agreement may offer further opportunities for exports of Bulgarian dairy products, wine, and rose oil, as well as raw materials for industry. On the other hand, Bulgarian farmers, just like their counterparts in other European countries, may suffer from unfair competition and price undercutting, and may join the protests of the largest European farming organisation, Copa-Cogeca. The Bulgarian Farmers' Union has already signed the protest letter that more than 50 European sector organisations sent to the European Commission back in November 2024, when the deal was still being negotiated.

What comes next?

The negotiations on the EU-Mercosur deal took about 25 years, and nobody yet knows how long it will take for the agreement to be ratified by the Council and the European Parliament and to enter into force at national level in the EU member states. A debate on the subject is coming up in the plenary chamber of the European Parliament in Strasbourg, and we have yet to see how the European political groups will position themselves and manoeuvre between national priorities and pan-European interests.

Beyond the specific debate, this is about the choice Europe is making between liberalism and protectionism, about keeping European industry competitive, but also about the principles that must be upheld without compromise if there is to be a sustainable future. It is no longer only about the economic prosperity of continents and the trade rivalry among states and international organisations, as it was when the EU was created as a free trade area after the Second World War, but about the so-called planetary boundaries, whose neglect may turn out to be a collective act of suicide in the long run.

While the big leaders take these weighty decisions, what remains for us is to rely on our common sense and stick to the simple rules of healthy living for ourselves, for the environment, and for all living beings on the planet: think globally, eat locally, and support domestic producers, because in the end the responsibility lies on each of our plates.

The opinion expressed is personal and does not represent the position of the European Parliament.

An assessment of the actions of DOGE and Musk on government efficiency

Post Syndicated from Bozho original https://blog.bozho.net/blog/4451

The Trump administration began with a pledge to optimise the efficiency of the administration through the so-called DOGE (Department of Government Efficiency), led by Elon Musk. On the first day I wrote that I would follow DOGE's work on optimising the administration with interest. This is a task we have too, and e-government is a tool for it. On the very day of the executive order creating DOGE, I wrote to colleagues that "the risky part is that every agency must provide them with all unclassified documents" and that they would probably go looking for "skeletons in the closet".

That risk, which stems from the executive order, quickly materialised through the actions of staff of DOGE (or of USDS, the US government unit for IT services and modernisation of the administration, which is DOGE's main instrument). The staff, IT specialists aged 20 to 25, obtained access to the systems and databases of a number of key agencies, including by attaching external hard drives. This caused serious discontent, and rightly so.

I do not dispute the right of specialised units to access data and documents across the entire structure of a government. Nor do I deny the right to close down bodies or terminate funding: these are political decisions (subject to judicial review) that every administration is entitled to make (and everyone is entitled to disagree with them and challenge them in court). All the more so because there is certainly fraud and inefficiency in spending, and an outside look at the data and systems can identify and stop it.

But for the implementation of these political decisions to be legitimate, it has to follow the rules. Musk and his people seem to believe that they are above the rules, that they can take over the information systems of key agencies by "order from above". These things can be done properly, not least for the sake of information security and data protection, which such ad hoc actions put at risk.

I will give a few examples from my time as minister, and before that as an adviser, fully aware that the scale is different, but that there are nonetheless some very direct parallels.

The first example is SEBRA, the Ministry of Finance's system for budget payments. Musk is currently doing practically the same thing: extracting and possibly publishing all government payments. Only we did it properly. Together with the Minister of Finance, we submitted a draft decision of the Council of Ministers that designated the payment data as a priority dataset, in fulfilment of the requirements of the Access to Public Information Act. Under that decision, the Ministry of Finance (MF) was obliged to provide the data to the Ministry of e-Government (MEU) in a specified format, so that it could be processed, anonymised, and published. MF instructed its contractor to extract the data and handed it over securely to MEU, where the processing and anonymisation code was published in the open-source repository. In parallel, a working group was set up that amended a regulation so that the system publishes such data automatically. Had we done it Musk's way, I would have sent an adviser of mine and said, "plug in his laptop and give him access to the database." And that would have been wrong.

In 2016 we opened up the data from the Commercial Register and from the public procurement register. I was an adviser at the time, but the opening of the data was accompanied by official letters, and I never had access to the database: experts from the respective institutions wrote the queries against the database, while I received a test database without real data, on which I could also work on the queries for extracting the information. We also published the code of the tool for anonymising the Commercial Register data. This was the result of official correspondence, on a valid legal basis in the Access to Public Information Act.

As minister (formally elected by parliament, after all, rather than an unknown staffer), I have asked to look at systems, but always with someone else showing me, while I only directed what exactly should be extracted (even though I could have found it myself, and faster). In those cases the situation was always "sit here and show me" and "on the grounds of … please provide me with the following data", not "give me access to the database and get out".

Yes, there is a risk of refusal or sabotage of these efforts, but then there are other measures: whoever sabotages can be reassigned or even dismissed for failing to carry out a lawful order. At present, however, Musk is doing the opposite: his people (not vetted through the proper procedure) are acting unlawfully and are being stopped by the courts. In some places the internal teams rightly classified these actions as an "insider threat". In principle, every access to the databases must leave a trace, and the access of the usual staff must also be controlled, because it is not only outsiders who can misuse the data; insiders sometimes do so too.

The question of access to institutions' data is extremely serious and cannot be settled by one blanket sentence in a presidential order. It is so serious because a modern state depends to a large extent on its registers and databases. Their security and completeness underpin a range of policies and determine their success or failure. "Capturing" the registers and databases is a key instrument of power with great negative potential.

Stepping back from these specifics, if DOGE had acted "properly", it would have taken a few months longer, but its actions would have had more legitimacy. Evidently, however, in keeping with other actions and statements of the US government, they would rather tear down and trample on the rules.

Government efficiency is important, and curbing unnecessary spending is a valid political priority. But just as it can be a legitimate goal, it can also be a smokescreen for less savoury things. And although I come from the startup world, I do not agree with the approach that a government and an administration can be run like a startup. In startups the goal is to move fast, at the risk of breaking things. In the public sector it is more important not to break things, because people's fates and lives depend on it, which is why things happen more slowly. And when you act by breaking the rules, citing political expediency, you do not generate trust, you destroy it.

The post An assessment of the actions of DOGE and Musk on government efficiency was first published on БЛОГодаря.
