How to give your students structure as they learn programming skills

Post Syndicated from Jan Ander original https://www.raspberrypi.org/blog/how-to-give-your-students-structure-as-they-learn-programming-skills/

Creating a computer program involves many different skills — knowing how to code is just one part. When we teach programming to young people, we want to guide them to learn these skills in a structured way. The ‘levels of abstraction’ framework is a great tool for doing that. This blog describes how using the framework will benefit you and your learners in the computing classroom.

Two learners at a laptop in a computing classroom.

We’re also excited to share our new Pedagogy Quick Read, which you can download for free to:

  • Find practical tips for using the ‘levels of abstraction’ framework with your learners
  • Read a summary of the research behind the framework

Learning to program: Everything at once?

Creating a program from the ground up can be daunting, especially for new learners. Without support, they’ll likely get stuck sooner or later; programs rarely work the first time round. And the more complex the problem that a program is addressing, the more likely it is that the first version of the program won’t work.

In a computing classroom, two girls concentrate on their programming task.

One reason that learning to program can be challenging is that it involves understanding a lot of specific concepts and applying many varied skills. From early on in their learning journey, young people need to have a firm grasp of concepts such as repetition, selection, variables, and functions. Also fundamental to learning to program well is the skill of abstraction: understanding a task and identifying which details are relevant and which can be ignored.

To get to grips with all these different concepts and skills, young people need structure — otherwise they’ll try to hold everything in their head at once, and likely feel overwhelmed by the cognitive load. This sort of experience may cause them to disengage instead of persisting. They may even decide that programming is not for them.

In light of these challenges, the ‘levels of abstraction’ framework is a great tool for teaching.

The benefits of the ‘levels of abstraction’ framework

The framework breaks programming down into four levels, each focusing on a different aspect of creating a program:

  • Problem: Analysing the problem or task the program should address, to understand and record the requirements.
  • Design: Turning the analysis into an algorithm — a set of steps for the computer to follow to create the desired output. This can involve flowcharts or storyboards, but importantly no code.
  • Code: Developing the code based on the design (and building the physical components if any are involved).
  • Running the code: Testing the code, checking outputs, and debugging where necessary.

Throughout the process of developing a program, learners (and professional programmers) move between these levels as they implement their designs and debug them, sometimes even returning to the problem level if more analysis or clarification is needed.

Young child in the classroom using Scratch to program.

Potential benefits of the ‘levels of abstraction’ framework for teachers:

  • It helps you break down the activity of programming into discrete parts.
  • It helps you engage your learners, as you can show them that programming involves more than knowing how to code.
  • If your learners get stuck with their programming, the framework can help you guide them to a solution.

Potential benefits for learners:

  • The framework will help them think through all the steps needed to create a program that works, and practise their problem-solving skills and analytical thinking.
  • They will more readily see how programming connects to their world — at the problem level — and find aspects of programming where they have strengths and can use their creativity.
  • They will gain a stronger idea of how software is built in the tech sector.

Our new Quick Read shares tips on how to best use the framework in your teaching.

Things to aim for when using the framework with your learners:

  • Be aware of what level they are working at and when it’s time to switch to a different one.
  • Understand that, when they encounter an issue with their program, they can step back and use the framework to figure out where the issue comes from. The issue might be a bug in the code, the algorithm not working as intended, or a description of the problem not taking into account something important.

We hope you find the framework useful. If you have ideas for how to use it in your teaching, why not share them in the comments?

Teaching programming: The wider context

When following the ‘levels of abstraction’ approach, learners need to explain how programs work and debug them. That means program comprehension is a key skill here. You may have already helped your learners to develop and practise this skill, for example with the PRIMM approach. The Block Model is another useful tool for helping your learners talk about various aspects of a program. And if you use the pair programming approach in programming activities, your learners can improve their program comprehension by talking about their code with each other. On our website, you’ll find more guidance on the best ways to teach programming and computing.

Photo of a young person coding on a desktop computer.

And what about generative artificial intelligence (AI) tools for programmers? In the age of AI, we think young people still need to learn to code because it empowers them to navigate and think critically about all digital technologies, including AI. And while generative AI tools can help a skilled programmer create quality code more quickly, more research is needed to show whether such tools help school-age young people build their understanding as they learn to code. You can see some of the great work being done in this area if you catch up with our 2024 research seminar series.

The ‘levels of abstraction’ framework is useful in your teaching no matter what tools young people use to create programs. Even with an AI tool, they will still need to work at all four levels of abstraction to program effectively. 

The post How to give your students structure as they learn programming skills appeared first on Raspberry Pi Foundation.

Xsight Labs E1 DPU Offers Up to 64 Arm Neoverse N2 Cores and 2x 400Gbps Networking

Post Syndicated from Rohit Kumar original https://www.servethehome.com/xsight-labs-e1-dpu-offers-up-to-64-arm-neoverse-n2-cores-and-2x-400gbps-networking/

The new Xsight Labs E1 DPU can offer up to 64 Arm Neoverse N2 cores along with 32 PCIe Gen5 lanes, dual 400Gbps networking and many offloads

The post Xsight Labs E1 DPU Offers Up to 64 Arm Neoverse N2 Cores and 2x 400Gbps Networking appeared first on ServeTheHome.

RocksDB 101: Optimizing stateful streaming in Apache Spark with Amazon EMR and AWS Glue

Post Syndicated from Melody Yang original https://aws.amazon.com/blogs/big-data/rocksdb-101-optimizing-stateful-streaming-in-apache-spark-with-amazon-emr-and-aws-glue/

Real-time streaming data processing is a strategic imperative that directly impacts business competitiveness. Organizations face mounting pressure to process massive data streams instantaneously—from detecting fraudulent transactions and delivering personalized customer experiences to optimizing complex supply chains and responding to market dynamics milliseconds ahead of competitors.

Apache Spark Structured Streaming addresses these critical business challenges through its stateful processing capabilities, enabling applications to maintain and update intermediate results across multiple data streams or time windows. The RocksDB state store was introduced in Apache Spark 3.2, offering a more efficient alternative to the default HDFS-based in-memory store. RocksDB excels in stateful streaming scenarios that require handling large quantities of state data. It delivers performance benefits particularly by reducing Java virtual machine (JVM) memory pressure and garbage collection (GC) overhead.

This post explores RocksDB’s key features and demonstrates its implementation using Spark on Amazon EMR and AWS Glue, providing you with the knowledge you need to scale your real-time data processing capabilities.

RocksDB state store overview

Spark Structured Streaming processes fall into two categories:

  • Stateful: Requires tracking intermediate results across micro-batches (for example, when running aggregations and de-duplication).
  • Stateless: Processes each batch independently.

A state store is required by stateful applications that track intermediate query results. This is essential for computations that depend on continuous events and change results based on each batch of input, or on aggregate data over time, including late arriving data. By default, Spark offers a state store that keeps states in JVM memory, which is performant and sufficient for most general streaming cases. However, if you have a large number of stateful operations in a streaming application (such as streaming aggregation, streaming dropDuplicates, and stream-stream joins), the default in-memory state store might face out-of-memory (OOM) issues because of a large JVM memory footprint or frequent GC pauses, resulting in degraded performance.

Advantages of RocksDB over in-memory state store

RocksDB addresses the challenges of an in-memory state store through off-heap memory management and efficient checkpointing.

  • Off-heap memory management: RocksDB stores state data in OS-managed off-heap memory, reducing GC pressure. While off-heap memory still consumes machine memory, it doesn’t occupy space in the JVM. Instead, its core memory structures, such as block cache or memTables, allocate directly from the operating system, bypassing the JVM heap. This approach makes RocksDB an optimal choice for memory-intensive applications.
  • Efficient checkpointing: RocksDB automatically saves state changes to checkpoint locations, such as Amazon Simple Storage Service (Amazon S3) paths or local directories, helping to ensure full fault tolerance. When interacting with S3, RocksDB is designed to improve checkpointing efficiency; it does this through incremental updates and compaction to reduce the amount of data transferred to S3 during checkpoints, and by persisting fewer large state files compared to the many small files of the default state store, reducing S3 API calls and latency.

Implementation considerations

RocksDB operates as a native C++ library embedded within the Spark executor, using off-heap memory. While it doesn’t fall under JVM GC control, it still affects overall executor memory usage from the YARN or OS perspective. RocksDB’s off-heap memory usage might exceed YARN container limits without triggering container termination, potentially leading to OOM issues. You should consider the following approaches to manage Spark’s memory:

Adjust the Spark executor memory size

Increase spark.executor.memoryOverhead or spark.executor.memoryOverheadFactor to leave more room for off-heap usage. The following example sets half (4 GB) of spark.executor.memory (8 GB) as the memory overhead size.

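# Total executor memory = 8GB (heap) + 4GB (overhead) = 12GB
#   spark.executor.memory         = JVM heap
#   spark.executor.memoryOverhead = off-heap allocation (RocksDB + other native)
# Comments are kept above the command because a comment placed after a
# trailing backslash breaks the shell line continuation.
spark-submit \
. . . . . . . .
--conf spark.executor.memory=8g \
--conf spark.executor.memoryOverhead=4g \
. . . . . . . .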
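
As a rough check of the arithmetic above, the following sketch estimates the memory requested per executor container. It assumes Spark's documented default of a 10% overhead factor with a 384 MiB floor when spark.executor.memoryOverhead isn't set explicitly. The helper name executor_container_mb is hypothetical, and the estimate ignores other components such as PySpark memory.

```python
# Hypothetical helper: estimate the memory requested per executor container.
MIN_OVERHEAD_MB = 384  # assumed floor, per Spark's documented default


def executor_container_mb(heap_mb, overhead_mb=None, overhead_factor=0.10):
    """Return heap + overhead in MB.

    If overhead_mb is not given, fall back to heap_mb * overhead_factor,
    but never less than MIN_OVERHEAD_MB.
    """
    if overhead_mb is None:
        overhead_mb = max(int(heap_mb * overhead_factor), MIN_OVERHEAD_MB)
    return heap_mb + overhead_mb


# The example above: 8 GB heap + 4 GB explicit overhead
explicit = executor_container_mb(8 * 1024, overhead_mb=4 * 1024)
# The same heap with the default overhead factor
default = executor_container_mb(8 * 1024)
print(explicit, default)  # 12288 9011
```

The gap between the two numbers is the extra room that the explicit overhead leaves for RocksDB and other native allocations.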

For Amazon EMR on Amazon Elastic Compute Cloud (Amazon EC2), you can enable YARN memory control with strict container memory enforcement through the polling method. With the following settings, YARN preempts containers that exceed their memory limits, which helps avoid node-wide OOM failures:

yarn.nodemanager.resource.memory.enforced = false
yarn.nodemanager.elastic-memory-control.enabled = false
yarn.nodemanager.pmem-check-enabled = true 
or 
yarn.nodemanager.vmem-check-enabled = true

Off-heap memory control

Use RocksDB-specific settings to configure memory usage. More details can be found in the Best practices and considerations section.

Get started with RocksDB on Amazon EMR and AWS Glue

To turn on the RocksDB state store in Spark, configure your application with the following setting:

spark.sql.streaming.stateStore.providerClass=org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider
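
If you would rather pass the setting at submit time than hard-code it in the application, a small helper can assemble the --conf arguments for spark-submit. This is a minimal sketch; the helper name to_submit_args is hypothetical and not part of Spark.

```python
# Hypothetical helper: turn a dict of Spark settings into spark-submit arguments.
ROCKSDB_PROVIDER = (
    "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider"
)


def to_submit_args(confs):
    """Return a flat list like ['--conf', 'key=value', ...], sorted by key."""
    args = []
    for key in sorted(confs):
        args += ["--conf", f"{key}={confs[key]}"]
    return args


confs = {"spark.sql.streaming.stateStore.providerClass": ROCKSDB_PROVIDER}
submit_args = to_submit_args(confs)
print(" ".join(submit_args))
```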

In the following sections, we explore creating a sample Spark Structured Streaming job with RocksDB enabled running on Amazon EMR and AWS Glue respectively.

RocksDB on Amazon EMR

Amazon EMR versions 6.6.0 and later support RocksDB, including Amazon EMR on EC2, Amazon EMR Serverless, and Amazon EMR on Amazon Elastic Kubernetes Service (Amazon EKS). In this case, we use Amazon EMR on EC2 as an example.

Use the following steps to run a sample streaming job with RocksDB enabled.

  1. Upload the following sample script to s3://<YOUR_S3_BUCKET>/script/sample_script.py
from pyspark.sql import SparkSession
from pyspark.sql.functions import explode, split, col, expr
import random

# List of words
words = ["apple", "banana", "orange", "grape", "melon", 
         "peach", "berry", "mango", "kiwi", "lemon"]

# Create random strings from words
def generate_random_string():
    return " ".join(random.choices(words, k=5)) 
    
    
# Create Spark Session
spark = SparkSession \
    .builder \
    .appName("StreamingWordCount") \
    .config("spark.sql.streaming.stateStore.providerClass","org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider") \
    .getOrCreate()


# Register UDF
spark.udf.register("random_string", generate_random_string)

# Create streaming data
raw_stream = spark.readStream \
    .format("rate") \
    .option("rowsPerSecond", 1) \
    .load() \
    .withColumn("words", expr("random_string()"))

# Execute word counts
wordCounts = raw_stream.select(explode(split(raw_stream.words, " ")).alias("word")).groupby("word").count()

# Output the results
query = wordCounts \
    .writeStream \
    .outputMode("complete") \
    .format("console") \
    .start()

query.awaitTermination()
  2. On the AWS Management Console for Amazon EMR, choose Create cluster.
  3. For Name and applications – required, select the latest Amazon EMR release.
  4. For Steps, choose Add. For Type, select Spark application.
  5. For Name, enter GettingStartedWithRocksDB and s3://<YOUR_S3_BUCKET>/script/sample_script.py as the Application location.
  6. Choose Save step.
  7. For other settings, choose the appropriate settings based on your use case.
  8. Choose Create cluster to start the streaming application via Amazon EMR step.

RocksDB on AWS Glue

AWS Glue 4.0 and later versions support RocksDB. Use the following steps to run the sample job with RocksDB enabled on AWS Glue.

  1. On the AWS Glue console, in the navigation pane, choose ETL jobs.
  2. Choose Script editor and Create script.
  3. For the job name, enter GettingStartedWithRocksDB.
  4. Copy the script from the previous example and paste it on the Script tab.
  5. On the Job details tab, for Type, select Spark Streaming.
  6. Choose Save, and then choose Run to start the streaming job on AWS Glue.

Walkthrough details

Let’s dive deep into the script to understand how to run a simple stateful Spark application with RocksDB using the following example PySpark code.

  1. First, set up RocksDB as your state store by configuring the provider class:
spark = SparkSession \
    .builder \
    .appName("StreamingWordCount") \
    .config("spark.sql.streaming.stateStore.providerClass","org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider") \
    .getOrCreate()
  2. To simulate streaming data, create a data stream using the rate source type. It generates one record per second, containing five random fruit names from a pre-defined list.
# List of words
words = ["apple", "banana", "orange", "grape", "melon", 
         "peach", "berry", "mango", "kiwi", "lemon"]

# Create random strings from words
def generate_random_string():
    return " ".join(random.choices(words, k=5))
# Register UDF
spark.udf.register("random_string", generate_random_string)

# Create streaming data
raw_stream = spark.readStream \
    .format("rate") \
    .option("rowsPerSecond", 1) \
    .load() \
    .withColumn("words", expr("random_string()"))
  3. Create a word counting operation on the incoming stream. This is a stateful operation because it maintains running counts between processing intervals; that is, previous counts must be stored to calculate the new totals.
# Split raw_stream into words and count them
wordCounts = raw_stream.select(explode(split(raw_stream.words, " ")).alias("word")).groupby("word").count()
  4. Finally, output the word count totals to the console:
# Output the results
query = wordCounts \
    .writeStream \
    .outputMode("complete") \
    .format("console") \
    .start()

Input data

In the same sample code, test data (raw_stream) is generated at a rate of one row per second, as shown in the following example:

+-----------------------+-----+--------------------------------+
|timestamp              |value|words                           |
+-----------------------+-----+--------------------------------+
|2025-04-18 07:05:57.204|125  |berry peach orange banana banana|
+-----------------------+-----+--------------------------------+

Output result

The streaming job produces the following results in the output logs. It demonstrates how Spark Structured Streaming maintains and updates the state across multiple micro-batches:

  • Batch 0: Starts with an empty state
  • Batch 1: Processes multiple input records, resulting in initial counts for every one of the 10 fruits (for example, banana appears 8 times)
  • Batch 2: Running totals based on new occurrences from the next set of records are added to the counts (for example, banana increases from 8 to 15, indicating 7 new occurrences).

-------------------------------------------
Batch: 0
-------------------------------------------
+----+-----+
|word|count|
+----+-----+
+----+-----+

-------------------------------------------
Batch: 1
-------------------------------------------
+------+-----+
|  word|count|
+------+-----+
|banana|    8|
|orange|    4|
| apple|    3|
| berry|    5|
| lemon|    7|
|  kiwi|    6|
| melon|    8|
| peach|    8|
| mango|    7|
| grape|    9|
+------+-----+

-------------------------------------------
Batch: 2
-------------------------------------------
+------+-----+
|  word|count|
+------+-----+
|banana|   15|
|orange|    8|
| apple|    7|
| berry|   11|
| lemon|   12|
|  kiwi|   11|
| melon|   16|
| peach|   15|
| mango|   12|
| grape|   13|
+------+-----+
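
The running totals above can be imitated in plain Python to show what "state" means here. This is only a sketch of the idea, not of how Spark or RocksDB implement it: the Counter plays the role of the state store, and the two micro-batches are made-up data rather than the rows from the job output.

```python
from collections import Counter


def run_micro_batches(batches):
    """Yield the running word counts after each micro-batch.

    `state` survives between batches, which is what makes the
    aggregation stateful.
    """
    state = Counter()
    for rows in batches:
        for row in rows:
            state.update(row.split(" "))
        yield dict(state)


# Two tiny, made-up micro-batches (not the data from the job above)
batches = [
    ["berry peach orange banana banana"],
    ["banana kiwi", "peach banana"],
]
snapshots = list(run_micro_batches(batches))
print(snapshots[0]["banana"], snapshots[1]["banana"])  # 2 4
```

Without the carried-over state, the second batch would report banana as 2 instead of the running total of 4.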

State store logs

RocksDB generates detailed logs during the job run, like the following:

INFO    2025-04-18T07:52:28,378 83933   org.apache.spark.sql.execution.streaming.MicroBatchExecution    [stream execution thread for [id = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, runId = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx]] 60  Streaming query made progress: {
  "id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "runId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "name": null,
  "timestamp": "2025-04-18T07:52:27.642Z",
  "batchId": 39,
  "numInputRows": 1,
  "inputRowsPerSecond": 100.0,
  "processedRowsPerSecond": 1.3623978201634879,
  "durationMs": {
    "addBatch": 648,
    "commitOffsets": 39,
    "getBatch": 0,
    "latestOffset": 0,
    "queryPlanning": 10,
    "triggerExecution": 734,
    "walCommit": 35
  },
  "stateOperators": [
    {
      "operatorName": "stateStoreSave",
      "numRowsTotal": 10,
      "numRowsUpdated": 4,
      "allUpdatesTimeMs": 18,
      "numRowsRemoved": 0,
      "allRemovalsTimeMs": 0,
      "commitTimeMs": 3629,
      "memoryUsedBytes": 174179,
      "numRowsDroppedByWatermark": 0,
      "numShufflePartitions": 36,
      "numStateStoreInstances": 36,
      "customMetrics": {
        "rocksdbBytesCopied": 5009,
        "rocksdbCommitCheckpointLatency": 533,
        "rocksdbCommitCompactLatency": 0,
        "rocksdbCommitFileSyncLatencyMs": 2991,
        "rocksdbCommitFlushLatency": 44,
        "rocksdbCommitPauseLatency": 0,
        "rocksdbCommitWriteBatchLatency": 0,
        "rocksdbFilesCopied": 4,
        "rocksdbFilesReused": 24,
        "rocksdbGetCount": 8,
        "rocksdbGetLatency": 0,
        "rocksdbPinnedBlocksMemoryUsage": 3168,
        "rocksdbPutCount": 4,
        "rocksdbPutLatency": 0,
        "rocksdbReadBlockCacheHitCount": 8,
        "rocksdbReadBlockCacheMissCount": 0,
        "rocksdbSstFileSize": 35035,
        "rocksdbTotalBytesRead": 136,
        "rocksdbTotalBytesReadByCompaction": 0,
        "rocksdbTotalBytesReadThroughIterator": 0,
        "rocksdbTotalBytesWritten": 228,
        "rocksdbTotalBytesWrittenByCompaction": 0,
        "rocksdbTotalBytesWrittenByFlush": 5653,
        "rocksdbTotalCompactionLatencyMs": 0,
        "rocksdbWriterStallLatencyMs": 0,
        "rocksdbZipFileBytesUncompressed": 266452
      }
    }
  ],
  "sources": [
    {
      "description": "RateStreamV2[rowsPerSecond=1, rampUpTimeSeconds=0, numPartitions=default",
      "startOffset": 63,
      "endOffset": 64,
      "latestOffset": 64,
      "numInputRows": 1,
      "inputRowsPerSecond": 100.0,
      "processedRowsPerSecond": 1.3623978201634879
    }
  ],
  "sink": {
    "description": "org.apache.spark.sql.execution.streaming.ConsoleTable$@2cf39784",
    "numOutputRows": 10
  }
}

In Amazon EMR on EC2, these logs are available on the node where the YARN ApplicationMaster container is running. They can be found at /var/log/hadoop-yarn/containers/<Application ID>/<container_id>/stderr.

As for AWS Glue, you can find the RocksDB metrics in Amazon CloudWatch, under the log group /aws-glue/jobs/error.

RocksDB metrics

The metrics from the preceding logs provide insights into RocksDB status. The following are some example metrics you might find useful when investigating streaming job issues:

  • rocksdbCommitCheckpointLatency: Time spent writing checkpoints to local storage
  • rocksdbCommitCompactLatency: Duration of checkpoint compaction operations during checkpoint commits
  • rocksdbSstFileSize: Current size of SST files in RocksDB.
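
If you want to track these metrics programmatically, the sketch below pulls them out of a progress entry. It assumes the progress is available as a Python dictionary, for example by parsing the JSON from the log or from the streaming query's last progress in PySpark. The function name summarize_rocksdb_metrics and the derived blockCacheHitRatio field are illustrative, not part of Spark.

```python
def summarize_rocksdb_metrics(progress):
    """Pull a few RocksDB metrics out of a streaming progress dictionary."""
    summaries = []
    for op in progress.get("stateOperators", []):
        custom = op.get("customMetrics", {})
        hits = custom.get("rocksdbReadBlockCacheHitCount", 0)
        misses = custom.get("rocksdbReadBlockCacheMissCount", 0)
        lookups = hits + misses
        summaries.append({
            "operator": op.get("operatorName"),
            "checkpointLatency": custom.get("rocksdbCommitCheckpointLatency"),
            "sstFileSize": custom.get("rocksdbSstFileSize"),
            # None when there were no block cache lookups at all
            "blockCacheHitRatio": hits / lookups if lookups else None,
        })
    return summaries


# Trimmed copy of the progress entry shown in the log above
progress = {
    "batchId": 39,
    "stateOperators": [{
        "operatorName": "stateStoreSave",
        "customMetrics": {
            "rocksdbCommitCheckpointLatency": 533,
            "rocksdbReadBlockCacheHitCount": 8,
            "rocksdbReadBlockCacheMissCount": 0,
            "rocksdbSstFileSize": 35035,
        },
    }],
}
summary = summarize_rocksdb_metrics(progress)
print(summary)
```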

Deep dive into RocksDB key concepts

To better understand the state metrics shown in the logs, we deep dive into RocksDB’s key concepts: MemTable, sorted string table (SST) file, and checkpoints. Additionally, we provide some tips for best practices and fine-tuning.

High level architecture

RocksDB is a local, non-distributed persistent key-value store embedded in Spark executors. It enables scalable state management for streaming workloads, backed by Spark’s checkpointing for fault tolerance. As shown in the preceding figure, RocksDB stores data in memory and also on disk. RocksDB’s ability to spill data over to disk is what allows Spark Structured Streaming to handle state data that exceeds the available memory.

Memory:

  • Write buffers (MemTables): Designated memory to buffer writes before flushing onto disk
  • Block cache (read buffer): Reduces query time by caching results from disk

Disk:

  • SST files: Sorted String Table saved as SST file format for fast access

MemTable: Stored off-heap

MemTable, shown in the preceding figure, is an in-memory store where data is first written off-heap, before being flushed to disk as an SST file. RocksDB caches the latest two batches of data (hot data) in MemTable to reduce streaming process latency. By default, RocksDB only has two MemTables—one is active and the other is read-only. If you have sufficient memory, the configuration spark.sql.streaming.stateStore.rocksdb.maxWriteBufferNumber can be increased to have more than two MemTables. Among these MemTables, there is always one active table, and the rest are read-only MemTables used as write buffers.

SST files: Stored on Spark executor’s local disk

SST files are block-based tables stored on the Spark executor’s local disk. When the in-memory state data no longer fits into a MemTable (whose size is set by the Spark configuration writeBufferSizeMB), the active table is marked as immutable and becomes a read-only MemTable, which is then asynchronously flushed to local disk in the SST file format. While flushing, the immutable MemTable can still be read, so that the most recent state data is available with minimal read latency.

Reading from RocksDB follows the sequence demonstrated by the preceding diagram:

  1. Read from the active MemTable.
  2. If not found, iterate through read-only MemTables in the order of newest to oldest.
  3. If not found, read from BlockCache (read buffer).
  4. If it misses, load the index (one index per SST) from disk into BlockCache. Look up the key in the index and, on a hit, load the data block into BlockCache and return the result.
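
The lookup order above can be sketched with a toy model. This is not RocksDB's implementation: the class ToyStateStore is hypothetical, MemTable size is counted in entries rather than megabytes, the flush is synchronous, and the block cache stores single entries instead of blocks and indexes.

```python
import itertools


class ToyStateStore:
    """Toy model of the read and write path; not RocksDB's real code."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # entries; stand-in for writeBufferSizeMB
        self.active = {}        # active MemTable
        self.read_only = []     # read-only MemTables, newest first
        self.sst_files = []     # (file_id, table) pairs on "disk", newest first
        self.block_cache = {}   # (file_id, key) -> value
        self._ids = itertools.count()

    def put(self, key, value):
        self.active[key] = value
        if len(self.active) >= self.memtable_limit:
            # The full table becomes read-only; a new active table takes over
            self.read_only.insert(0, self.active)
            self.active = {}

    def flush(self):
        """Write read-only MemTables to SST files, oldest first."""
        while self.read_only:
            table = self.read_only.pop()  # last element is the oldest
            self.sst_files.insert(0, (next(self._ids), table))

    def get(self, key):
        """Return (value, where_it_was_found)."""
        if key in self.active:
            return self.active[key], "active MemTable"
        for table in self.read_only:
            if key in table:
                return table[key], "read-only MemTable"
        for file_id, table in self.sst_files:
            if (file_id, key) in self.block_cache:
                return self.block_cache[(file_id, key)], "block cache"
            if key in table:
                self.block_cache[(file_id, key)] = table[key]
                return table[key], "SST file"
        return None, "not found"


store = ToyStateStore(memtable_limit=2)
store.put("apple", 1)
store.put("banana", 2)   # table is full and becomes read-only
store.put("kiwi", 3)     # lands in the new active table
trace = [store.get("kiwi")[1], store.get("apple")[1]]
store.flush()
trace += [store.get("apple")[1], store.get("apple")[1], store.get("mango")[1]]
print(trace)
```

The trace lists where each lookup was served from: the first read of apple after the flush goes to the SST file, and the repeat read is served from the block cache.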

SST files are stored on executors’ local directories under the path of spark.local.dir (default: /tmp) or yarn.nodemanager.local-dirs:

  • Amazon EMR on EC2 – ${yarn.nodemanager.local-dirs}/usercache/hadoop/appcache/<yarn_app_id>/<spark_app_id>/
  • Amazon EMR Serverless, Amazon EMR on EKS, AWS Glue – ${spark.local.dir}/<spark_app_id>/

Additionally, by using application logs, you can track the MemTable flush and SST file upload status under the file path:

  • Amazon EMR on EC2 – /var/log/hadoop-yarn/containers/<application_id>/<container_id>/stderr
  • Amazon EMR on EKS – /var/log/spark/user/<spark_app_name>-<spark_executor_ID>/stderr

The following is an example command to check the SST file status in an executor log from Amazon EMR on EKS:

cat /var/log/spark/user/<spark_app_name>-<spark_executor_ID>/stderr/current | grep old

or

kubectl logs <spark_executor_pod_name> --namespace emr -c spark-kubernetes-executor | grep old

The following screenshot is an example of the output of either command.

You can use the following examples to check if the MemTable records were deleted and flushed out to SST:

cat /var/log/spark/user/<spark_app_name>-<spark_executor_ID>/stderr/current | grep deletes

or

kubectl logs <spark_executor_pod_name> --namespace emr -c spark-kubernetes-executor | grep deletes

The following screenshot is an example of the output of either command.

Checkpoints: Stored on the executor’s local disk or in an S3 bucket

To handle fault tolerance and fail over from the last committed point, RocksDB supports checkpoints. The checkpoint files are usually stored on the executor’s disk or in an S3 bucket, including snapshot and delta or changelog data files.

Starting with Amazon EMR 7.0 and AWS Glue 5.0, RocksDB state store provides a new feature called changelog checkpointing to enhance checkpoint performance. When the changelog is enabled (disabled by default) using the setting spark.sql.streaming.stateStore.rocksdb.changelogCheckpointing.enabled, RocksDB writes smaller change logs to the storage location (the local disk by default) instead of frequently persisting large snapshot data. Note that snapshots are still created but less frequently, as shown in the following screenshot.

Here’s an example of a checkpoint location path when overridden to an S3 bucket: s3://<S3BUCKET>/<checkpointDir>/state/0/spark_partition_ID/state_version_ID.zip

Best practices and considerations

This section outlines key strategies for fine-tuning RocksDB performance and avoiding common pitfalls.

1. Memory management for RocksDB

To prevent OOM errors on Spark executors, you can configure RocksDB’s memory usage at either the node level or instance level:

  • Node level (recommended): Enforce a global off-heap memory limit per executor. In this context, each executor is treated as a RocksDB node. If an executor processes N partitions of a stateful operator, it runs N RocksDB instances.
  • Instance level: Fine-tune individual RocksDB instances.

Node-level memory control per executor

Starting with Amazon EMR 7.0 and AWS Glue 5.0 (Spark 3.5), a critical Spark configuration, boundedMemoryUsage, was introduced (through SPARK-43311) to enforce a global memory cap at a single executor level that is shared by multiple RocksDB instances. This prevents RocksDB from consuming unbounded off-heap memory, which could lead to OOM errors or executor termination by resource managers such as YARN or Kubernetes.

The following example shows the node-level configuration:

 # Bound total memory usage per executor 
 "spark.sql.streaming.stateStore.rocksdb.boundedMemoryUsage": "true"
 # Set a static total memory size per executor
 "spark.sql.streaming.stateStore.rocksdb.maxMemoryUsageMB": "500"
 # For read-heavy workloads, split memory allocation between write buffers (30%) and block cache (70%) 
 "spark.sql.streaming.stateStore.rocksdb.writeBufferCacheRatio": "0.3"

Instance-level memory control for a single RocksDB instance

For granular memory management, you can configure individual RocksDB instances using the following settings:

# Control MemTable (write buffer) size and count
"spark.sql.streaming.stateStore.rocksdb.writeBufferSizeMB": "64"
"spark.sql.streaming.stateStore.rocksdb.maxWriteBufferNumber": "4"
  • writeBufferSizeMB (default: 64, suggested: 64 – 128): Controls the maximum size of a single MemTable in RocksDB, affecting memory usage and write throughput. This setting is available in Spark 3.5 ([SPARK-42819]) and later. It determines the size of the memory buffer before state data is flushed to disk. Larger buffer sizes can improve write performance by reducing SST flush frequency but will increase the executor’s memory usage.
  • maxWriteBufferNumber (default: 2, suggested: 3 – 4): Sets the total number of active and immutable MemTables.

For read-heavy workloads, prioritize the following block cache tuning over write buffers to reduce disk I/O. You can configure SST block size and caching as follows:

"spark.sql.streaming.stateStore.rocksdb.blockSizeKB": "64"
"spark.sql.streaming.stateStore.rocksdb.blockCacheSizeMB": "128"
  • blockSizeKB (default: 4, suggested: 64–128): When an active MemTable is full, it becomes a read-only MemTable. From there, new writes continue to accumulate in a new table. The read-only MemTable is flushed into SST files on the disk. The data in SST files is approximately chunked into fixed-sized blocks (default is 4 KB). Each block, in turn, keeps multiple data entries. When writing data to SST files, you can compress or encode data efficiently within a block, which often results in a smaller data size compared with its raw format.

For workloads with a small state size (such as less than 10 GB), the default block size is usually sufficient. For a large state (such as more than 50 GB), increasing the block size can improve compression efficiency and sequential read performance but increases CPU overhead.

  • blockCacheSizeMB (default: 8, suggested: 64–512, large state: more than 1024): When retrieving data from SST files, RocksDB provides a cache layer (block cache) to improve the read performance. It first locates the data block where the target record might reside, then caches the block to memory, and finally searches that record within the cached block. To avoid frequent reads of the same block, the block cache can be used to keep the loaded blocks in memory.
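
To compare the two approaches in this section, the sketch below works through the arithmetic. It assumes that writeBufferCacheRatio is the share of maxMemoryUsageMB given to write buffers, and that without a node-level bound each RocksDB instance holds its own write buffers and block cache. The function names and the count of four instances per executor are hypothetical.

```python
def node_level_split(max_memory_mb, write_buffer_cache_ratio):
    """Split a node-level cap into (write buffer MB, block cache MB)."""
    if not 0.0 <= write_buffer_cache_ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    write_buffers_mb = round(max_memory_mb * write_buffer_cache_ratio)
    return write_buffers_mb, max_memory_mb - write_buffers_mb


def instance_level_estimate(write_buffer_size_mb, max_write_buffer_number,
                            block_cache_size_mb, instances):
    """Rough upper bound in MB when every instance fills all of its buffers."""
    per_instance = (write_buffer_size_mb * max_write_buffer_number
                    + block_cache_size_mb)
    return per_instance * instances


# Node-level example values from this section: 500 MB cap, ratio 0.3
node_split = node_level_split(500, 0.3)
# Instance-level example values: 64 MB x 4 buffers + 128 MB cache, 4 instances
unbounded_estimate = instance_level_estimate(64, 4, 128, instances=4)
print(node_split, unbounded_estimate)  # (150, 350) 1536
```

With these numbers, four unbounded instances could use about three times the node-level cap, which is why the estimate grows quickly as an executor handles more stateful partitions.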

2. Clean up state data at checkpoint

To help ensure that your state file sizes and storage costs remain under control when checkpoint performance becomes a concern, use the following Spark configurations to adjust cleanup frequency, retention limits, and checkpoint file types:

# clean up RocksDB state every 30 seconds
"spark.sql.streaming.stateStore.maintenanceInterval":"30s"
# retain only the last 50 state versions  
"spark.sql.streaming.minBatchesToRetain":"50"
# use changelog instead of snapshots
"spark.sql.streaming.stateStore.rocksdb.changelogCheckpointing.enabled":"true"
  • maintenanceInterval (default: 60 seconds): Running maintenance less often (a longer interval) can help reduce maintenance cost and background I/O. However, longer intervals increase file listing time, because state stores often scan every retained file.
  • minBatchesToRetain (default: 100, suggested: 10–50): Limits the number of state versions retained at checkpoint locations. Reducing this number results in fewer files being persisted and reduces storage usage.
  • changelogCheckpointing (default: false, suggested: true): Traditionally, RocksDB snapshots and uploads incremental SST files to the checkpoint location. To avoid this cost, changelog checkpointing was introduced in Amazon EMR 7.0+ and AWS Glue 5.0. It writes only the state changes made since the last checkpoint.
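If you submit jobs with spark-submit, the same settings can be passed as repeated --conf flags. The following Python sketch renders the cleanup settings shown above in that form. The names CLEANUP_CONF and to_spark_submit_args are hypothetical, and how you actually pass configuration depends on how your EMR or Glue job is launched.

```python
# Cleanup-related settings from the text above, rendered as spark-submit flags.
# CLEANUP_CONF and to_spark_submit_args are illustrative names.
CLEANUP_CONF = {
    "spark.sql.streaming.stateStore.maintenanceInterval": "30s",
    "spark.sql.streaming.minBatchesToRetain": "50",
    "spark.sql.streaming.stateStore.rocksdb.changelogCheckpointing.enabled": "true",
}


def to_spark_submit_args(conf: dict[str, str]) -> list[str]:
    """Turn a config mapping into repeated `--conf key=value` arguments."""
    args: list[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_spark_submit_args(CLEANUP_CONF)))
```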

To track an SST file’s retention status, you can search RocksDBFileManager entries in the executor logs. Consider the following logs in Amazon EMR on EKS as an example. The output (shown in the screenshot) shows that four SST files under version 102 were uploaded to an S3 checkpoint location, and that an old changelog state file with version 97 was cleaned up.

cat /var/log/spark/user/<spark_app_name>-<spark_executor_ID>/stderr/current | grep RocksDBFileManager

or

kubectl logs <spark_executor_pod_name> -n emr -c spark-kubernetes-executor | grep RocksDBFileManager

3. Optimize local disk usage

RocksDB consumes local disk space when generating SST files at each Spark executor. While disk usage doesn’t scale linearly, RocksDB can accumulate storage over time based on state data size. If the available local disk space runs out while a streaming job is running, “No space left on device” errors can occur.
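To notice a low-disk condition before a job fails, you could periodically check free space on the volume that holds the executor’s local directory. The following Python sketch uses the standard library’s shutil.disk_usage for this. The 15% threshold and the function names are assumptions for illustration, and the path to monitor depends on your cluster setup.

```python
import shutil


def free_disk_fraction(path: str) -> float:
    """Return the fraction of the filesystem at `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def is_disk_low(path: str, min_free_fraction: float = 0.15) -> bool:
    """True when free space is below the threshold (15% is an assumed default)."""
    return free_disk_fraction(path) < min_free_fraction


if __name__ == "__main__":
    # "." is a placeholder; point this at the executor's local directory.
    print(f"free: {free_disk_fraction('.'):.1%}, low: {is_disk_low('.')}")
```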

To optimize disk usage by RocksDB, adjust the following Spark configurations:

# compact state files during commit (default: false)
"spark.sql.streaming.stateStore.rocksdb.compactOnCommit": "true"
# number of delta SST files before they are consolidated into a snapshot file (default: 10)
"spark.sql.streaming.stateStore.minDeltasForSnapshot": "5"

Infrastructure adjustments can further mitigate the disk issue:

For Amazon EMR:

For AWS Glue:

  • Use AWS Glue G.2X or larger worker types to avoid the limited disk capacity of G.1X workers.
  • Schedule regular maintenance windows at optimal timing to free up disk space based on workload needs.

Conclusion

In this post, we explored RocksDB as the new state store implementation in Apache Spark Structured Streaming, available on Amazon EMR and AWS Glue. RocksDB offers advantages over the default HDFS-backed in-memory state store, particularly for applications dealing with large-scale stateful operations. RocksDB helps prevent JVM memory pressure and garbage collection issues common with the default state store.

The implementation is straightforward, requiring minimal configuration changes, though you should pay careful attention to memory and disk space management for optimal performance. While RocksDB is not guaranteed to reduce job latency, it provides a robust solution for handling large-scale stateful operations in Spark Structured Streaming applications.

We encourage you to evaluate RocksDB for your use cases, particularly if you’re experiencing memory pressure issues with the default state store or need to handle large amounts of state data in your streaming applications.


About the authors

Melody Yang is a Senior Big Data Solution Architect for Amazon EMR at AWS. She is an experienced analytics leader working with AWS customers to provide best practice guidance and technical advice in order to assist their success in data transformation. Her areas of interests are open-source frameworks and automation, data engineering and DataOps.

Dai Ozaki is a Cloud Support Engineer on the AWS Big Data Support team. He is passionate about helping customers build data lakes using ETL workloads. In his spare time, he enjoys playing table tennis.

Noritaka Sekiyama is a Principal Big Data Architect with Amazon Web Services (AWS) Analytics services. He’s responsible for building software artifacts to help customers. In his spare time, he enjoys cycling on his road bike.

Amir Shenavandeh is a Sr Analytics Specialist Solutions Architect and Amazon EMR subject matter expert at Amazon Web Services. He helps customers with architectural guidance and optimisation. He leverages his experience to help people bring their ideas to life, focusing on distributed processing and big data architectures.

Xi Yang is a Senior Hadoop System Engineer and Amazon EMR subject matter expert at Amazon Web Services. He is passionate about helping customers resolve challenging issues in the Big Data area.

Networking of Amazon MQ for RabbitMQ event source mapping for AWS Lambda

Post Syndicated from Rafal Pawlaszek original https://aws.amazon.com/blogs/compute/networking-of-amazon-mq-for-rabbitmq-event-source-mapping-for-aws-lambda/

Event-driven architectures with message brokers need careful attention to security best practices. Amazon MQ for RabbitMQ combined with AWS Lambda enables serverless event processing. However, implementing defense in depth and least privilege principles necessitates a clear understanding of networking requirements. This is particularly important when working with different subnet types and their impact on service connectivity.

This post explores the networking aspects of Lambda event source mapping for Amazon MQ for RabbitMQ. Learn how deployment options influence your networking setup and security posture to make informed architectural decisions. These networking concepts are essential for building secure, scalable solutions, regardless of your experience level with message brokers.

For clarity in this post, when we refer to “RabbitMQ”, we mean Amazon MQ for RabbitMQ.

Prerequisites

The following prerequisites are necessary to complete this post:

  • An Amazon Web Services (AWS) account
  • Basic understanding of AWS networking concepts
  • Familiarity with Amazon MQ for RabbitMQ
  • Basic knowledge of Lambda

Furthermore, to enable setup of the discussed architectures, this post is accompanied by a GitHub repository that uses AWS Cloud Development Kit (AWS CDK).

Repository prerequisites

The following prerequisites are necessary for the repository:

Repository setup

Clone the https://github.com/aws-samples/sample-amazonmq-rabbitmq-lambda-esm repository. This repository contains all the necessary code and instructions to create relevant architectures using AWS CDK.

Install dependencies and build

Install the necessary NPM dependencies by running the following commands:

npm install
npm run build

Amazon MQ for RabbitMQ networking deployment options

Public accessibility is the primary networking differentiator when deploying a RabbitMQ broker in AWS. Although the broker operates in the Amazon MQ service account, the networking configuration varies based on this choice.

Public broker

When you deploy a publicly accessible broker, Amazon MQ provisions all networking components in the service account. The service provides a DNS name that resolves to an IP address of the Network Load Balancer (NLB) in that account. This configuration doesn’t support security groups. All security measures must be implemented through the RabbitMQ broker’s authentication and authorization mechanisms. The following diagram shows this communication flow.

Figure-1 DNS resolution of a public Amazon MQ for RabbitMQ broker.

Private broker

A private broker routes networking through an Amazon Virtual Private Cloud (Amazon VPC) in your account. Amazon MQ uses AWS PrivateLink to provision VPC Endpoints, which serve as entry points for broker communication.

The following diagram shows how client applications communicate with RabbitMQ:

  1. The client application connects to Amazon Route 53 Resolver
  2. Route 53 Resolver resolves the DNS name to the VPC Endpoint’s IP address
  3. The client communicates with the broker through PrivateLink
  4. Security groups protect the VPC Endpoint’s Elastic Network Interfaces (ENIs)

Figure-2 DNS resolution of a private Amazon MQ for RabbitMQ broker.

A private broker deployment offers two networking options:

  • Custom VPC configuration – Specify:
    • Subnets for VPC Endpoint creation
    • At least one security group to protect the VPC Endpoints
  • Default VPC configuration – Leave VPC options blank to use:
    • Default VPC
    • Default security group

Amazon MQ for RabbitMQ Lambda event source mapping building blocks

RabbitMQ solutions offer two approaches for message processing:

  • Create a custom client to read messages from broker queues
  • Use Lambda functions with event source mapping (ESM) for automated message retrieval

The ESM is a Lambda service resource that reads messages from the broker and invokes the Lambda function synchronously. In the remainder of this post, we refer to this Lambda function as the listener.

ESM connectivity depends on the following:

For public brokers, ESM uses public connectivity. For private brokers, ESM:

  • Assumes the listener’s IAM Role
  • Creates ENIs in the same subnets as the broker’s VPC Endpoints
  • Uses the same security groups that protect the VPC Endpoints

The listener’s IAM Role must include these Amazon Elastic Compute Cloud (Amazon EC2) permissions:

  • CreateNetworkInterface
  • DeleteNetworkInterface
  • DescribeNetworkInterfaces
  • DescribeSecurityGroups
  • DescribeSubnets
  • DescribeVpcs
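The permissions above can be expressed as an IAM policy document. The following Python sketch builds one as a plain dictionary. It covers only the EC2 actions listed here, the wildcard resource is an assumption you may want to narrow, and the names ESM_EC2_ACTIONS and build_esm_network_policy are hypothetical. The listener’s role needs further permissions that are outside the scope of this list.

```python
import json

# EC2 actions from the list above, needed by the listener's IAM Role
# so that the ESM can manage ENIs for a private broker.
ESM_EC2_ACTIONS = [
    "ec2:CreateNetworkInterface",
    "ec2:DeleteNetworkInterface",
    "ec2:DescribeNetworkInterfaces",
    "ec2:DescribeSecurityGroups",
    "ec2:DescribeSubnets",
    "ec2:DescribeVpcs",
]


def build_esm_network_policy() -> dict:
    """Return an IAM policy document that allows only the EC2 actions above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ESM_EC2_ACTIONS,
                "Resource": "*",  # assumption: narrow this where your setup allows
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_esm_network_policy(), indent=2))
```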

To view ESM ENIs:

  1. Open the AWS Management Console
  2. Navigate to EC2 > Network Interfaces
  3. Look for ENIs with the following naming pattern:
    AWS Lambda VPC ENI-armq-<ACCOUNT_ID>-<ESM_ID>-<remainder>

    where:

    • ACCOUNT_ID – The AWS account number containing the ESM
    • ESM_ID – The unique identifier of the ESM

The following image shows example ESM ENIs.

Figure-3 An example list of interfaces that Amazon MQ for RabbitMQ creates for private brokers.

Disabling or deleting the ESM removes the ESM components.

An enabled ESM needs connectivity to AWS STS, Secrets Manager, the RabbitMQ broker, and Lambda, because the ESM queue polling process follows these steps:

  1. Assumes the listener’s IAM Role
  2. Retrieves RabbitMQ credentials from Secrets Manager
  3. Establishes broker communication
  4. Invokes the listener when messages are present

You have two options to enable private broker connectivity to support the queue polling process:

  1. Deploy VPC endpoints in ESM subnets for:
    • AWS Security Token Service (AWS STS)
    • Secrets Manager
    • Lambda
  2. Deploy NAT gateway in ESM subnets
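For the first option, each interface VPC endpoint is created for a Region-specific service name. The following Python sketch derives those names for the three services listed above. It assumes the com.amazonaws.<region>.<service> naming used in commercial AWS Regions, so verify the names for your Region before use. The function name is hypothetical.

```python
# Service suffixes for the three AWS services the ESM calls while polling.
ESM_ENDPOINT_SERVICES = ("sts", "secretsmanager", "lambda")


def endpoint_service_names(region: str) -> list[str]:
    """Build interface endpoint service names for the given Region.

    Assumes the com.amazonaws.<region>.<service> convention.
    """
    return [f"com.amazonaws.{region}.{service}" for service in ESM_ENDPOINT_SERVICES]


if __name__ == "__main__":
    for name in endpoint_service_names("eu-west-1"):
        print(name)
```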

ESM networking configuration options

The following sections detail ESM networking configurations for different deployment scenarios.

Option 1: Public broker

In this approach all network communication happens on the Amazon MQ service’s side. The ESM, when enabled, uses public connectivity.

To observe the architecture implemented in your account, go to the cloned repository root location, make sure that you are signed in with AWS CLI, and run the following:

cdk deploy PublicRabbitMqInstanceStack

Option 2: Private broker in a default VPC

Deploying a private RabbitMQ broker without specifying a VPC tells the Amazon MQ service to set up the networking in the default VPC, using the public subnets in that VPC. The default security group is used for securing the broker’s VPC Endpoints.

Creating the ESM provisions dedicated ENIs in the public subnets where the RabbitMQ broker’s VPC Endpoints reside, with the default security group applied. The default security group allows inbound traffic from itself on all protocols and the full port range, so the ESM can route traffic through the VPC Endpoint.

Although the subnet is public with internet gateway access, the ESM ENIs operate in private address space, preventing direct communication with AWS services. To enable proper communication, create VPC Endpoints for AWS STS, Secrets Manager, and Lambda. These endpoints allow the ESM to communicate with AWS services through private IP addresses within your VPC. The following diagram shows the complete communication path from the ESM to the broker.

Figure-4 Networking configuration and request flow for a private broker provisioned in the default VPC.

To observe the architecture implemented in your account, go to the cloned repository root location, make sure that you are signed in with AWS CLI, and run the following:

cdk deploy PrivateRabbitMqInstanceDefaultVpcStack

Option 3: Private broker in a Custom VPC with NAT

When deploying a private RabbitMQ broker in a custom VPC, specify either a single subnet for a standalone broker or multiple subnets for a cluster deployment. The deployment also needs a security group for the VPC Endpoint ENIs.

Configure the security group with a self-referencing inbound rule on the AMQP port. This configuration enables communication between the ESM and the RabbitMQ VPC Endpoints’ ENIs.
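A self-referencing rule names the security group itself as the traffic source. The following Python sketch builds such a rule in the shape that the boto3 EC2 call authorize_security_group_ingress accepts in its IpPermissions list. It assumes the broker listens for AMQP over TLS on port 5671, and the security group ID in the usage line is a placeholder.

```python
AMQPS_PORT = 5671  # assumption: AMQP over TLS port used by the broker


def self_referencing_amqp_rule(security_group_id: str) -> dict:
    """Inbound rule that allows AMQP traffic from members of the same group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": AMQPS_PORT,
        "ToPort": AMQPS_PORT,
        # The source is the group itself, which makes the rule self-referencing.
        "UserIdGroupPairs": [{"GroupId": security_group_id}],
    }


if __name__ == "__main__":
    # Placeholder ID; replace with the group that protects the VPC Endpoints.
    print(self_referencing_amqp_rule("sg-0123456789abcdef0"))
```

Because the ESM ENIs and the VPC Endpoint ENIs share this security group, one rule of this kind covers traffic between them.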

The following diagram shows how ESM resources communicate through networking components when deployed in a private subnet with NAT gateway. This architecture demonstrates the complete communication path from the ESM to the broker.

Figure-5 Networking configuration and request flow for a private broker provisioned in a private VPC subnet with NAT.

To observe the architecture implemented in your account, go to the cloned repository root location, make sure that you are signed in with AWS CLI, and run the following:

cdk deploy PrivateRabbitMqInstanceCustomVpcWithNatStack

Option 4: Private broker in a Custom VPC with isolated subnets

This configuration builds upon the previous architecture but introduces isolated subnets. These subnets restrict all internet connectivity, permitting only internal VPC network traffic. Although the broker networking components mirror Option 3, the isolation introduces more considerations.

The security group still needs an open AMQP port for queue operations, but the subnet isolation prevents the ESM from directly accessing AWS services. To address this limitation, deploy VPC Endpoints for AWS STS, Secrets Manager, and Lambda within the isolated subnets. These endpoints create a private communication path for the ESM to interact with essential AWS services without needing internet access.

The following diagram shows the communication architecture for ESM resources deployed in isolated subnets. It demonstrates how VPC Endpoints enable secure communication between the ESM and AWS services while maintaining network isolation. This architecture makes sure that the ESM can fulfill its message processing responsibilities without compromising security through internet exposure.

Figure-6 Networking configuration and request flow for a private broker provisioned in an isolated VPC subnet.

To observe the architecture implemented in your account, go to the cloned repository root location, make sure that you are signed in with AWS CLI, and run the following:

cdk deploy PrivateRabbitMqInstanceCustomVpcIsolatedSubnetStack

Option 5: Private broker in a Custom VPC with public subnets

The final configuration places the broker in public subnets while maintaining the core deployment requirements from the previous options. Despite the public subnet placement, the ESM’s networking behavior presents an important consideration: ESM ENIs operate in private address space, preventing direct internet communication even with an internet gateway present.

This architecture necessitates VPC Endpoints for AWS service communication, similar to Option 2. Any attempts to route ESM traffic through the internet gateway fail because the ENIs operate in private address space. Understanding this limitation is crucial for proper deployment planning.

The following diagram shows the ESM communication architecture in public subnets. Despite the different subnet type, this configuration mirrors the isolated subnet approach in its use of VPC Endpoints. These endpoints enable the ESM to communicate with AWS STS, Secrets Manager, and Lambda services through private, secure connections within the VPC.

Figure-7 Networking configuration and request flow for a private broker provisioned in a public VPC subnet.

To observe the architecture implemented in your account, go to the cloned repository root location, make sure that you are signed in with AWS CLI, and run the following:

cdk deploy PrivateRabbitMqInstanceCustomVpcPublicSubnetStack
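The five options above can be summarized as a small lookup table. The following Python sketch records, for each option, whether the ESM creates ENIs in your VPC and how it reaches AWS STS, Secrets Manager, and Lambda. The class and field names are hypothetical, and the table only restates what the sections above describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EsmNetworking:
    broker: str              # deployment described in the text
    esm_enis_in_vpc: bool    # whether the ESM creates ENIs in your VPC
    aws_service_path: str    # how the ESM reaches STS, Secrets Manager, and Lambda


OPTIONS = {
    1: EsmNetworking("public broker", False, "public connectivity"),
    2: EsmNetworking("private broker, default VPC", True, "VPC endpoints"),
    3: EsmNetworking("private broker, custom VPC with NAT", True, "NAT gateway"),
    4: EsmNetworking("private broker, custom VPC, isolated subnets", True, "VPC endpoints"),
    5: EsmNetworking("private broker, custom VPC, public subnets", True, "VPC endpoints"),
}


def options_needing_vpc_endpoints() -> list[int]:
    """Return the option numbers that rely on VPC endpoints."""
    return [n for n, opt in OPTIONS.items() if opt.aws_service_path == "VPC endpoints"]


if __name__ == "__main__":
    print(options_needing_vpc_endpoints())
```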

Cleaning up

To prevent unexpected AWS charges, remove resources you’ve created. The following AWS CDK command helps you safely remove all deployed resources:

cdk destroy --all

Conclusion

This post explored the relationship between AWS Lambda event source mapping and RabbitMQ networking configurations. We examined various deployment scenarios, from public brokers to isolated subnets, each presenting unique considerations for secure and effective implementation.

Understanding these networking patterns enables you to make informed architectural decisions when deploying Amazon MQ for RabbitMQ with Lambda event source mapping. Whether choosing public accessibility or implementing private networking with VPC Endpoints, understanding the consequences of choosing specific networking configurations allows you to apply security best practices while meeting your application’s messaging needs. As you implement these patterns, consider your specific security requirements and operational needs to choose the most appropriate configuration for your use case.

Take the next step in optimizing your serverless messaging architecture. Dive into the AWS documentation, experiment with the RabbitMQ and Lambda integration patterns discussed, and discover how these networking configurations can elevate the security and performance of your own applications. Start implementing these strategies today to build more robust, scalable solutions.

Amazon Linux 2023 achieves FIPS 140-3 validation

Post Syndicated from Mahak Arora original https://aws.amazon.com/blogs/compute/amazon-linux-2023-achieves-fips-140-3-validation/

AWS announced that Amazon Linux 2023 (AL2023) has achieved Federal Information Processing Standards (FIPS) 140-3 Level 1 validation of our cryptographic modules, marking a significant milestone in our commitment to providing secure, compliant operating system options for regulated workloads. FIPS-certified modules are particularly important for US and Canadian government workloads, healthcare applications requiring HIPAA compliance, financial services, defense contractors, and other regulated industries. FIPS 140-3, which supersedes FIPS 140-2, represents the latest government security standard for cryptographic modules, jointly validated by the National Institute of Standards and Technology (NIST) and the Canadian Centre for Cyber Security (CCCS) through the Cryptographic Module Validation Program (CMVP). The validation follows the rigorous requirements outlined in the FIPS 140-3 standard and encompasses critical cryptographic modules including OpenSSL, the Linux Kernel Cryptographic API, NSS, GnuTLS, and Libgcrypt.

These modules have been extensively tested to have robust security capabilities such as approved cryptographic algorithms, secure key management, strong entropy generation, and protected memory boundaries. The validation process was conducted by a NIST-accredited lab, and further reviewed by the Cryptographic Module Validation Program (CMVP). Additionally, the certificate details can be verified on the CMVP Active Validation List.

To enable FIPS mode on AL2023, refer to our FIPS Mode enablement guide for AL2023. Amazon Linux maintains its compliance information through the AWS Compliance Programs portal for FIPS 140-3, the official NIST guidelines, and the Compliance FAQs, to help you meet global regulatory requirements. For regular updates and best practices, follow the AWS Security Blog and the FIPS-related FAQs for Amazon Linux 2 and Amazon Linux 2023, which provide detailed configuration steps and operational guidance for regulated environments. You can also reach out to your AWS account team for help finding the resources you need.

If you have questions about this post, contact AWS Support.

[$] The hierarchical constant bandwidth server scheduler

Post Syndicated from corbet original https://lwn.net/Articles/1024757/

The POSIX realtime model, which is implemented in the Linux kernel, can ensure that a realtime process obtains the CPU time it needs to get its job done. It can be less effective, though, when there are multiple realtime processes competing for the available CPU resources. The hierarchical constant bandwidth server patch series, posted by Yuri Andriaccio with work by Luca Abeni, Alessio Balsini, and Andrea Parri, is a modification to the Linux scheduler intended to make it possible to configure systems with multiple realtime tasks in a deterministic and correct manner.

Empower AI agents with user context using Amazon Cognito

Post Syndicated from Abrom Douglas original https://aws.amazon.com/blogs/security/empower-ai-agents-with-user-context-using-amazon-cognito/

Amazon Cognito is a managed customer identity and access management (CIAM) service that enables seamless user sign-up and sign-in for web and mobile applications. Through user pools, Amazon Cognito provides a user directory with strong authentication features, including passkeys, federation to external identity providers (IdPs), and OAuth 2.0 flows for secure machine-to-machine (M2M) authorization.

Amazon Cognito issues standard JSON Web Tokens (JWTs) and supports the customization of identity and access tokens for user authentication by using the pre token generation Lambda trigger. Learn more about this in How to customize access tokens in Amazon Cognito user pools. Amazon Cognito has extended token customization capabilities to support access token customization for M2M and the ability to pass metadata from the client during M2M authorization. Application builders can use these two features to support multiple use cases, including customizing access tokens based on unique runtime policies, entitlements, environment, or passed metadata. This can simplify and enrich M2M authentication and authorization scenarios and opens up new possibilities for emerging use cases, such as identity and access management for AI agents.

This post demonstrates how Amazon Cognito enables AI agents to perform authorized actions on behalf of users through user-contextualized access tokens for OAuth 2.0-enabled resource servers. AI agents represent a class of autonomous services that require robust identity management and precise access control, especially when acting on behalf of users. By using the Amazon Cognito client credentials flow with access token customization, you can establish distinct identities for AI agents that carry critical information about their capabilities, scope of access, and intended use cases. This approach provides a foundation for more secure, auditable AI agent operations while maintaining clear boundaries around their authorized activities.

The identity of an AI agent can be represented within Amazon Cognito as an app client. The AI agent obtains an access token (a JWT) through an OAuth 2.0 client credentials grant. This JWT can be customized to contain claims that represent the authenticated human user on whose behalf the AI agent is acting. This token can then be used to authorize access to other services that have established trust with the Amazon Cognito user pool by trusting the issuer and audience of the token. For example, this third-party service could be a claims processor, a travel agency service, or a scheduling service acting on behalf of a user. The focus of this post is on foundational building blocks using Amazon Cognito for AI agents and how to obtain a customized access token with user context.

Solution overview and reference architecture

Looking at an example architecture (Figure 1), a user signs in to a web or mobile application using an Amazon Cognito user pool, and tokens for the user are returned to the client. Here, the application could be a serverless digital assistant using an Amazon Bedrock agent that needs to gather and process data residing in a third-party cross-domain service. The AI agent obtains its own access token by performing an OAuth 2.0 client credentials grant while passing the user’s access token as context using the aws_client_metadata request parameter. The AI agent receives the user-contextualized access token and calls an external, third-party, or cross-domain service that trusts the issuer and audience of the AI agent’s access token issued from an Amazon Cognito user pool. The cross-domain service can obtain the JSON Web Key Set (JWKS) to verify the token and extract claims representing both the AI agent and, most importantly, the underlying user. Authorization takes place within the cross-domain service using the claims of the customized access token, and Amazon Verified Permissions is used for fine-grained authorization. See Figure 1 for a detailed flow of this example.

Figure 1: AI agent identity reference architecture


  1. The user navigates to the application through the client.
  2. There is no existing session or token for the user, so the user authentication flow with the Amazon Cognito user pool begins.
  3. After a successful sign-in, Amazon Cognito returns access, ID, and refresh tokens to the client for the user.
  4. As the user interacts with the AI agent through the application, the client sends the user’s access token to an Amazon API Gateway endpoint.
  5. The API gateway integrates with the AI agent, which is using an Amazon Bedrock agent. As an example, this can use several AWS Lambda functions interacting with an Amazon Bedrock Knowledge Base or a Retrieval-Augmented Generation (RAG) process.
  6. The AI agent obtains its own access token from an Amazon Cognito user pool using an OAuth 2.0 client credentials grant. The user’s access token, obtained in step 3, is sent with the token request in the aws_client_metadata request parameter.

Note: You can use different Amazon Cognito user pools for user authentication and for agent (machine) authentication. This promotes separation and provides the ability to apply different settings and controls on each user pool if needed to meet security requirements.

  7. Amazon Cognito validates the client ID and secret from the AI agent and invokes the pre token generation Lambda trigger to customize the access token for the AI agent.

Note: Within the pre token generation Lambda trigger, the user’s access token is verified before returning a customized access token to the AI agent using the aws-jwt-verify library.

  8. The customized access token is returned to the AI agent, including custom claims representing the user.
  9. The AI agent, using its own access token, calls the cross-domain service to perform the requested action on behalf of the user. For example, this can be a third-party reservation system or a photo sharing service.
  10. The resource server in the cross-domain service verifies that the access token from the AI agent is valid. The resource server must be pre-configured to trust the user pool that issued the agent access token.
  11. Coarse- and fine-grained authorization can happen either locally in the service code or using Verified Permissions.
  12. A response from the cross-domain service flows back to the AI agent, if necessary.
  13. A response from the AI agent to the user application or client is returned, if necessary.
  14. Actions that take place throughout the flow are logged in AWS CloudTrail, providing end-to-end logging and auditing.

Implementation details

Let’s take a deeper look into the three core components of this scenario:

  1. The AI agent obtaining its own OAuth 2.0 access token
  2. The Amazon Cognito pre token generation Lambda trigger used to enrich the AI agent’s access token with user context
  3. The cross-domain resource server performing fine-grained authorization

AI agent

Figure 2: AI agent obtaining a user access token from the frontend application through API Gateway



Amazon Bedrock Agents is used in this solution with a custom orchestration configured to use Lambda. When the application interacts with the Amazon Bedrock agent, the custom orchestrator initiation begins with the agent passing the user’s access token to a Lambda function as part of the custom orchestration (shown in Figure 2). The Lambda function validates the user’s token to verify that it’s not expired and hasn’t been tampered with. This custom orchestrator begins the process for the agent to obtain its own OAuth access token and to access downstream and cross-domain resources on behalf of the user. The human user’s access token is included in the call from the application through the client. To learn more about Amazon Bedrock Agents custom orchestrator, see Getting started with Amazon Bedrock Agents custom orchestrator. The following is an example of what a human user’s decoded access token provided through an API Gateway REST API might look like.

{
  "sub": "user-identity-4e4c-example-7cede8e609a2",
  "cognito:groups": [
    "exampleChatApplicationAccess"
  ],
  "iss": "https://cognito-idp.<region>.amazonaws.com/<region>_example",
  "version": 2,
  "client_id": "1example23456789",
  "origin_jti": "",
  "token_use": "access",
  "scope": "openid profile email",
  "auth_time": 499192140,
  "exp": 1445444940,
  "iat": 499192140,
  "jti": "",
  "username": "my-example-username"
}
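Claims like these come from the middle segment of the token, which is base64url-encoded JSON. The following Python sketch decodes that segment so you can inspect the claims while debugging. It performs no signature or expiry checks, so it is not a substitute for the verification steps this post describes. The function name is hypothetical.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload (second segment) of a JWT WITHOUT verifying it.

    For inspection only: the signature, issuer, and expiry are not checked.
    """
    payload_b64 = token.split(".")[1]
    # base64url segments omit padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

For example, decode_jwt_payload(token)["token_use"] returns "access" for a token like the one shown above.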

The following is a Node.js code sample that an AI agent can use to obtain its own access token from Amazon Cognito. This can be the Lambda function that is part of the custom orchestration for the Amazon Bedrock agent. Notice the clientMetadata variable, which is passed to the Cognito /token endpoint in the aws_client_metadata request parameter. This request parameter is where the user’s access token is provided. The code example also sets an attribute called callerApp to ExampleChatApplication, which serves as a unique identifier for the application. The callerApp value is preconfigured in the backend of the solution. This unique application identifier is included in the customized access token for the agent and used for additional authorization checks later. As security best practices, use AWS Secrets Manager to store the client ID and client secret and retrieve them at runtime, and verify the user’s access token before passing it to the AI agent backend.

// Obtains an access token for the AI agent with the OAuth 2.0 client credentials grant.
// Uses the global fetch API, which is available in Node.js 18 and later.
async function getAccessToken() {
  const clientId = 'exampleAiAgentClientId'; // use Secrets Manager
  const clientSecret = 'exampleAiAgentClientSecret'; // use Secrets Manager
  const tokenEndpoint = 'https://mydomain.auth.<region>.amazoncognito.com/oauth2/token';
  const scope = 'crossDomainService/read userData/read';
  // User context that is made available to the pre token generation Lambda trigger
  const clientMetadata = '{"onBehalfOfToken":"<HUMAN-USER-ACCESS-TOKEN>", "callerApp":"ExampleChatApplication"}';

  // The client ID and secret are sent with HTTP Basic authentication
  const basicAuth = Buffer.from(`${clientId}:${clientSecret}`).toString('base64');

  const body = new URLSearchParams({
    grant_type: 'client_credentials',
    scope,
    aws_client_metadata: clientMetadata
  });

  const res = await fetch(tokenEndpoint, {
    method: 'POST',
    headers: {
      'Authorization': `Basic ${basicAuth}`,
      'Content-Type': 'application/x-www-form-urlencoded'
    },
    body
  });

  if (!res.ok) {
    const error = await res.text();
    throw new Error(`Token request failed: ${res.status} ${error}`);
  }

  const { access_token } = await res.json();
  // Avoid logging the token itself, because it is a bearer credential
  console.log('Access token received');

  return access_token;
}

getAccessToken().catch(err => console.error('Error:', err.message));

The access token for the AI agent is returned only if the client ID and secret are correct and the provided user access token is valid. However, before it’s returned, the AI agent’s access token is customized by the Amazon Cognito pre token generation Lambda trigger.

Amazon Cognito pre token generation Lambda trigger

Figure 3: AI agent access token customization with Cognito pre token generation Lambda trigger


After the AI agent’s action calls the Amazon Cognito /token endpoint with a valid client ID and secret, Cognito invokes the pre token generation Lambda trigger. The following is an example Lambda function that reads the aws_client_metadata request parameter, which contains the user’s access token and the callerApp attribute that was defined while the user was authenticating. The Lambda function verifies the access token provided by the user (shown in Figure 3). It uses the aws-jwt-verify library to confirm that the token has not expired, that its signature is valid (so the token has not been tampered with), and that the token is an access token. The Lambda function is also preconfigured to accept user tokens only from a specific issuer and audience, which protects against malicious context injection. This is also an opportunity to perform additional authorization, for example, checking whether the user is a member of certain groups.
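
The group membership check mentioned above could be sketched as follows. This is a minimal illustration under stated assumptions: isMemberOfGroup is a hypothetical helper name, the payload values are placeholders, and the function expects a payload that has already been verified. It reads the cognito:groups claim, which is the same claim the Lambda function in this section copies into the onBehalfOf structure.

```javascript
// Hypothetical helper: returns true when the verified token payload lists the
// required group in its cognito:groups claim.
function isMemberOfGroup(decodedJWT, requiredGroup) {
  const groups = decodedJWT?.['cognito:groups'];
  return Array.isArray(groups) && groups.includes(requiredGroup);
}

// Example payload shaped like a verified user access token (placeholder values).
const examplePayload = {
  sub: 'user-identity-4e4c-example-7cede8e609a2',
  username: 'my-example-username',
  'cognito:groups': ['readaccess']
};

console.log(isMemberOfGroup(examplePayload, 'readaccess'));  // true
console.log(isMemberOfGroup(examplePayload, 'adminaccess')); // false
```

A missing or malformed claim is treated as no membership, so the check fails closed.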

After the token is verified, the Lambda function customizes the access token to be returned to the AI agent.

import { CognitoJwtVerifier } from "aws-jwt-verify";

// Initialize the JWT verifier to verify the user’s access token
// Provide the user pool ID, token use, and client ID 
const jwtVerifier = CognitoJwtVerifier.create({
  userPoolId: process.env.USER_POOL_ID,  // user pool for user authentication
  clientId: process.env.CLIENT_ID,
  // groups: "exampleChatApplicationAccess", // optional group membership authorization
  tokenUse: 'access'
});

export const handler = async function(event, context) {
  try {
    const onBehalfOfToken = event.request.clientMetadata?.onBehalfOfToken || '';
    // It’s recommended that the provided “callerApp” value from the application is authorized for use with the app client for the AI agent
    const callerApp = event.request.clientMetadata?.callerApp || '';

    // The below console log will display the authenticated user’s JWT
    // Keep this logging with caution in a production environment
    console.log('Original event:', event);

    // Verify the access token from the human user
    // You could optionally also perform some authorization checks here as well
    // Example: check for the membership of a group
    let decodedJWT;
    if (onBehalfOfToken) {
      try {
        decodedJWT = await jwtVerifier.verify(onBehalfOfToken);
        console.log('Decoded JWT:', decodedJWT);
      } catch (err) {
        console.error('Token verification failed:', err);
        throw new Error('Token verification failed');
      }
    }

    // Create the onBehalfOf claim structure
    const behalfOfClaim = decodedJWT ? {
      sub: decodedJWT.sub,
      username: decodedJWT.username,
      groups: decodedJWT['cognito:groups'] || []
    } : {};

    // Customized token returned to client
    event.response = {
      "claimsAndScopeOverrideDetails": {
        "accessTokenGeneration": {
          "claimsToAddOrOverride": {
            "onBehalfOf": behalfOfClaim,
            "callerApp": callerApp
          },
        }
      }
    };

    return event;

  } catch (error) {
    console.error('Error in Lambda execution:', error);
    throw error;
  }
};

Notice in the preceding Lambda function that two custom claims are dynamically created within event.response: onBehalfOf and callerApp. The onBehalfOf claim contains nested claims that were extracted from the human user’s access token. The callerApp value is carried forward from the frontend application and provided alongside the user access token. It’s recommended to also verify the callerApp value against custom logic to add an additional layer of protection. The returned access token for the AI agent would look like the following (decoded) JWT payload.


{
  "sub": "agent-identity-4e4c-example-7cede8e609a2",
  "onBehalfOf": {
    "sub": "user-identity-4e4c-example-7cede8e609a2",
    "username": "my-example-username",
    "groups": [
      "readaccess"
    ]
  },
  "iss": "https://cognito-idp..amazonaws.com/_example",
  "version": 2,
  "client_id": "1example23456789",
  "callerApp": "ExampleChatApplication",
  "token_use": "access",
  "scope": "crossDomainService123/read userData/read",
  "auth_time": 499192140,
  "exp": 1445444940,
  "iat": 499192140,
  "jti": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
}
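
The recommendation to verify the callerApp value could be implemented as an allow-list keyed by the agent’s app client ID. The following is a minimal sketch, not the original solution’s logic: ALLOWED_CALLER_APPS and isCallerAppAllowed are hypothetical names, and the entries are placeholders. It assumes the check runs inside the pre token generation Lambda function, where the app client ID of the caller is available in the trigger event (event.callerContext.clientId).

```javascript
// Hypothetical allow-list: which callerApp values each agent app client may present.
const ALLOWED_CALLER_APPS = new Map([
  ['exampleAiAgentClientId', new Set(['ExampleChatApplication'])]
]);

// Returns true only when callerApp is registered for the given app client ID.
function isCallerAppAllowed(clientId, callerApp) {
  const allowed = ALLOWED_CALLER_APPS.get(clientId);
  return allowed !== undefined && allowed.has(callerApp);
}

console.log(isCallerAppAllowed('exampleAiAgentClientId', 'ExampleChatApplication')); // true
console.log(isCallerAppAllowed('exampleAiAgentClientId', 'UnknownApplication'));     // false
```

If the check fails, the Lambda function can throw an error before setting event.response, in the same way it does when token verification fails.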

Cross-domain resource server authorization check

At this point, shown in Figure 4, the human user has successfully authenticated to the web application, the human user’s access token was sent as context to the backend, an AI agent obtained its own customized access token containing the human user context, and now the agent is ready to call an external cross-domain service.

Figure 4: Cross-domain resource server performing fine-grained authorization with Amazon Verified Permissions


As shown in Figure 4, the cross-domain service is the resource server and therefore needs to perform an authorization check. For this example, we’ll keep things straightforward and make sure that three core things are verified:

  1. The AI agent’s OAuth access token is valid
  2. The AI agent is authorized to access this service
  3. The AI agent is authorized to interact with the user data

Depending on your use case and requirements, you might also need to verify that the user’s consent has been obtained prior to the AI agent acting on their behalf. Ultimately, you want to verify that the AI agent can access a user’s data on their behalf and only for the purpose for which consent has been provided by the user.

For the token verification, use the aws-jwt-verify library again. The following is a Node.js example to verify the AI agent’s access token.

import { CognitoJwtVerifier } from "aws-jwt-verify";

// add custom logic to verify that AI agent is authorized to perform this action on behalf of the user

// Verifier that expects valid access tokens:
const verifier = CognitoJwtVerifier.create({
  userPoolId: "<user_pool_id>", // user pool for AI agent authentication
  tokenUse: "access",
  clientId: "<client_id>",
});

try {
  const payload = await verifier.verify(
    "eyJraWQeyJhdF9oYXNoIjoidk..." //this will be the AI agent's access token
  );
  console.log("Token is valid. Payload:", payload);
} catch {
  console.log("Token not valid!");
}
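
Signature verification covers only the first of the three checks listed earlier. The remaining two (the agent is authorized for this service, and for this user’s data) can be sketched as plain checks on the verified payload. This is an illustrative sketch, not the original solution’s code: checkAgentClaims and its option names are hypothetical, and the claim names (scope, callerApp, onBehalfOf.sub) are taken from the example token shown earlier.

```javascript
// Hypothetical helper: inspects an already-verified agent token payload.
function checkAgentClaims(payload, { requiredScope, resourceOwnerSub, allowedCallerApps }) {
  // The scope claim is a space-separated string in the example token.
  const scopes = typeof payload.scope === 'string' ? payload.scope.split(' ') : [];
  if (!scopes.includes(requiredScope)) {
    return { allowed: false, reason: 'missing scope' };
  }
  if (!allowedCallerApps.includes(payload.callerApp)) {
    return { allowed: false, reason: 'unexpected callerApp' };
  }
  // The agent may only act for the user who owns the requested data.
  if (!payload.onBehalfOf || payload.onBehalfOf.sub !== resourceOwnerSub) {
    return { allowed: false, reason: 'user mismatch' };
  }
  return { allowed: true, reason: 'ok' };
}

// Placeholder payload with the same shape as the example agent token.
const agentPayload = {
  scope: 'crossDomainService123/read userData/read',
  callerApp: 'ExampleChatApplication',
  onBehalfOf: { sub: 'user-identity-4e4c-example-7cede8e609a2' }
};

const result = checkAgentClaims(agentPayload, {
  requiredScope: 'crossDomainService123/read',
  resourceOwnerSub: 'user-identity-4e4c-example-7cede8e609a2',
  allowedCallerApps: ['ExampleChatApplication']
});
console.log(result.allowed); // true
```

Each failed check returns a reason, which makes denied requests easier to audit without exposing token contents.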

Fine-grained authorization with Verified Permissions

As a security best practice, enforce fine-grained, identity-based authorization in line with zero trust principles by using Verified Permissions. The preceding Node.js code sample is a basic validation of the AI agent’s access token that can happen within the application logic. Instead of keeping authorization logic within the resource server, you can use Verified Permissions to offload the authorization policies to a managed service. The following is an example Cedar policy for this use case.

permit(
    principal == Agent::"agent-identity-4e4c-example-7cede8e609a2",
    action == Action::"readOnly",
    resource == Resource::"crossDomainService123::userData"
)
when {
    resource.scope == Scope::"crossDomainService123/read" &&
    resource.owner == User::"user-identity-4e4c-example-7cede8e609a2" &&
    context.onBehalfOf.sub == "user-identity-4e4c-example-7cede8e609a2" &&
    context.callerApp == "ExampleChatApplication"
};

With the preceding Cedar policy example, you are permitting the AI agent to read userData from the crossDomainService123 resource. This is only permitted when the AI agent’s access token contains the crossDomainService123/read scope and when the resource owner and the onBehalfOf user (from the access token) are the same, the human user in this case. The when clause also includes a condition to make sure that this interaction was initiated from ExampleChatApplication.

The cross-domain resource server would use the AI agent’s access token and call the Verified Permissions IsAuthorizedWithToken API. To learn more, see Simplify fine-grained authorization with Amazon Verified Permissions and Amazon Cognito.

The following is a Node.js example using the IsAuthorizedWithToken API from Verified Permissions using the AWS SDK for JavaScript v3.

import { VerifiedPermissionsClient, IsAuthorizedWithTokenCommand } from "@aws-sdk/client-verifiedpermissions";

const client = new VerifiedPermissionsClient({ region: "<region>" });

// Dynamically provided token 
const jwtToken = "eyJraWQiOiJrMWtleSIsInR..."; //AI agent's access token

async function checkAccess() {
  const input = {
    policyStoreId: "ps-abc123example", // your AVP policy store ID
    accessToken: jwtToken,
    action: {
      actionType: "Action",
      actionId: "readOnly"
    },
    resource: {
      entityType: "crossDomainService123",
      entityId: "userData"
    }
  };

  const command = new IsAuthorizedWithTokenCommand(input);

  try {
    const response = await client.send(command);
    console.log("Authorization Decision:", response.decision);
  } catch (err) {
    console.error("Authorization error:", err);
  }
}

Based on the preceding examples of the AI agent’s access token (with user context), the Cedar policy, and the IsAuthorizedWithToken API call, the resource server would get an Allow decision for this action to take place. The following is an example of the authorization decision response.

{
    "decision": "Allow",
    "determiningPolicies": [{
        "determiningPolicyId": "ps-abc123example"
    }],
    "errors": []
}
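
The IsAuthorizedWithToken sample earlier in this section only logs the decision. A resource server also has to act on it, and a common design choice is to deny by default. The following is a minimal sketch of that idea: isAllowed is a hypothetical helper name, and the case-insensitive comparison is an assumption made so that the check does not depend on the exact casing of the decision string.

```javascript
// Hypothetical helper: returns true only for an explicit allow decision.
// A missing response, a missing decision, or any other value is a denial.
function isAllowed(response) {
  const decision = typeof response?.decision === 'string'
    ? response.decision.toUpperCase()
    : '';
  return decision === 'ALLOW';
}

console.log(isAllowed({ decision: 'Allow', errors: [] })); // true
console.log(isAllowed({ decision: 'Deny', errors: [] }));  // false
console.log(isAllowed(undefined));                         // false
```

The resource server would return the requested user data only when this check passes and respond with an authorization error otherwise.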

Before this policy can be evaluated, you must define a schema that includes the relevant entity types (Agent, User, Resource, Scope, and so on), and create corresponding entities in your policy store that match the IDs used in the policy and request.

Bringing it all together, the requested data from the AI agent, on behalf of the user, is returned from the cross-domain service to the AI agent. This additional data can now be used within the context of the AI agent workload. The entire solution can be used for a chat application, such as the one described in Protect sensitive data in RAG applications with Amazon Bedrock.

Conclusion

Amazon Cognito M2M access token customization and support for passing client metadata provide you with the extensibility to solve complex use cases and enable emerging ones like AI agent identity and access management. For example, passing contextual client metadata and customizing access tokens at runtime can help software as a service (SaaS) and multi-tenant service providers scale to an unlimited number of resource servers, because these can be dynamically determined at runtime. As organizations increasingly explore the use of AI agents, having a secure, scalable identity management solution becomes crucial for maintaining control and accountability. By using these new features, you can build more secure and scalable solutions with Amazon Cognito to prepare for the future of autonomous AI agent use cases.

Use the comments section to leave feedback about this post. If you have questions about this post, start a new thread on Amazon Cognito re:Post or contact AWS Support.


Abrom Douglas III

Abrom is a Senior Solutions Architect within AWS Identity with nearly 20 years of software engineering and security experience, specializing in the identity and access management space. He loves speaking with customers about how identity and access management can provide secure outcomes that enable both business and technology initiatives. In his free time, he enjoys cheering for Arsenal FC, photography, travel, volunteering, and competing in duathlons.

Ghostwriting Scam

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2025/06/ghostwriting-scam.html

The variations seem to be endless. Here’s a fake ghostwriting scam that seems to be making boatloads of money.

This is a big story about scams being run from Texas and Pakistan estimated to run into tens if not hundreds of millions of dollars, viciously defrauding Americans with false hopes of publishing bestseller books (a scam you’d not think many people would fall for but is surprisingly huge). In January, three people were charged with defrauding elderly authors across the United States of almost $44 million by “convincing the victims that publishers and filmmakers wanted to turn their books into blockbusters.”

[$] Getting Lustre upstream

Post Syndicated from jake original https://lwn.net/Articles/1025268/

The Lustre filesystem has a long history, some of which intersects with Linux. It was added to the staging tree in 2013, but was bounced out of staging in 2018, due to a lack of progress and a development model that was incompatible with the kernel’s. Lustre may be working its way back into the kernel, though. In a filesystem-track session at the 2025 Linux Storage, Filesystem, Memory Management, and BPF Summit (LSFMM+BPF), Timothy Day and James Simmons led a discussion on how to get Lustre into the mainline.

Security updates for Wednesday

Post Syndicated from jzb original https://lwn.net/Articles/1025862/

Security updates have been issued by Debian (gst-plugins-bad1.0, konsole, and libblockdev), Oracle (buildah, containernetworking-plugins, gimp, git-lfs, gvisor-tap-vsock, kernel, libvpx, podman, and skopeo), Red Hat (apache-commons-beanutils and thunderbird), Slackware (xorg), SUSE (gdm, golang-github-prometheus-alertmanager, golang-github-prometheus-node_exporter, golang-github-prometheus-prometheus, govulncheck-vulndb, grafana, kernel, Multi-Linux Manager, Multi-Linux Manager Client Tools, openssl-3, pam, python-cryptography, python-requests, python-setuptools, python3-requests, SUSE Manager Server, systemd, ucode-intel, xorg-x11-server, and xwayland), and Ubuntu (dwarfutils, mujs, node-katex, xorg-server, xorg-server-hwe-16.04, xorg-server-hwe-18.04, and xorg-server, xwayland).

We Want Our Euro

Post Syndicated from Искрен Иванов original https://www.toest.bg/iskame-si-evroto/


One of the most striking features of Bulgarian political reality is the way certain groups in society express their discontent. In many European countries, an interest group’s position takes shape after a good deal of reading, informed study, and joint work, whereas in Bulgaria the way people come to perceive one social problem or another often leads Bulgarian citizens into the trap of disinformation.

The debate, if we can call it that at all, over Bulgaria’s accession to the eurozone is painfully marked by this reality, which stems from two factors: the political elite’s inability to share with its voters the advantages and disadvantages of adopting the common European currency, and the strong wave of disinformation that presents the euro as the embodiment of everything bad that awaits Bulgaria.

In other words, it is not so much that the campaign against the euro is successful as that people are poorly informed about what awaits us in the eurozone. That is why this article will try to analyze the political effects of Bulgaria’s accession to the eurozone, and we will see that their benefits and advantages far exceed the economic ones.

Why do we need the eurozone? Are the Danes and the Poles any more foolish than we are!?

Opponents of the single European currency hardly realize that the main reason for adopting the euro actually lies in the history and specifics of the Transition. The end of socialism in 1989 opened the door to private property and market initiative, the two characteristics of capitalism that, in political and social terms, go far beyond the legendary predictability and calm of the “developed” socialist society.

Within a few years, the Bulgarians who stayed in or returned to the country after the changes tried to establish themselves firmly in this new system, which gave many hope for a better future. The optimism collapsed when Bulgaria went through its most serious economic crisis since the end of the Second World War, which has gone down in our history as the infamous Videnov winter. The only events with a deeper psychological impact than this unfortunate episode are the years of slaughter and ruin that befell our country during the Second Balkan War and the two world wars.

In short, in January 1997 the Bulgarian economy collapsed ingloriously, and the lev was put on life support under a currency board. Those years were the most humiliating in the history of the Bulgarian currency, which even under socialism had gold backing and was firmly pegged to the US dollar. The damage done to our currency was especially deep because it would never again manage to recover from the devaluation and instead had to follow the exchange rate of the mark slavishly so that money could regain its purchasing power and the economy could be rebuilt.

In the days when Germany held a symbolic funeral for the mark and replaced it with the euro, it became clear that Bulgaria was becoming a precedent, since its semi-functioning currency was in practice tied to the exchange rate of another currency that no longer existed.

After this brief historical retrospective, we should recognize that the reason Bulgaria has come to this point is the economic fiasco of 1997. Adopting the euro will in practice complete what the currency board began. If we nevertheless try to imagine an alternative historical reality without the Videnov winter, the lev would have managed to keep a fixed rate against the dollar and would today fit comfortably into the Danish, Polish, or Romanian scenario. But since history has no “ifs”, the dilemma facing our country is this: to choose the lev, a currency with a fictitious purchasing value that will devalue more and more over time, or to choose the euro, whose gold backing guarantees its position as the second-strongest currency in the world.

The political effects of adopting the euro

The first and most important political effect concerns Bulgaria’s lasting and unequivocal positioning as a European state in the global economic order. The notion of a “global economic order” is especially important for understanding the advantages of the eurozone, because this order is in practice a small family in which several reserve currencies coexist. The US dollar and the euro hold the largest shares, while the Japanese yen, the Chinese yuan, and the British pound, a remnant of Britain’s colonial past, hold peripheral ones. The euro is in practice stronger than both the Asian currencies and the pound, and only the dollar is able to compete with it. The negative effects of this competition are often cushioned by the stable relations between the economies of the United States and Europe, which will evidently outlive the current turbulence.

It is important to explain that when you have a reserve currency anchored to the value of gold, you can print money. As much as you want. Have you ever wondered how America always finds money to wage war and defend its interests around the world? Because its currency is the strongest in the world and the treasury can print an unlimited quantity of dollars, which it regularly injects into the American military machine and economy.

The same logic applies to the euro: whenever a member state is threatened by an economic crisis, the European Central Bank in Frankfurt is able to print currency. Under these conditions, Bulgaria’s political stability and confidence will grow, because opponents of the euro will no longer be able to legitimize their claims that we are the disgraced debtor of the International Monetary Fund and the World Bank, whose loans future generations of Bulgarians will be repaying. Bulgaria will gain monetary and economic sovereignty rather than pseudo-monetary vassalage.

The second political effect is the impact that joining the eurozone will have on corruption and the gray economy. They will continue to exist, since no developed democracy is fully immune to the effects of corruption, contrary to the pretensions of autocracies and totalitarian regimes, which are the most vivid contemporary symbol of corruption itself. But the public standing of corruption and the gray economy, as well as the level at which they exist, will change. Their effects will be less tangible for citizens, and corrupt practices will be a privilege of the political elites and the circles around them, taking far more refined forms.

Put differently, corruption in Bulgaria will no longer be Balkan but European, which will give Bulgaria the standing and mandate to present itself as a developed democratic state whose corrupt practices follow European standards and the sanction of Frankfurt. And while this is welcome to pro-Western circles in Bulgaria, for those who want Bulgaria to remain outside the European family such a step will be highly delegitimizing, because they will not be able to take money directly from their Russian sponsors.

The third political effect concerns breaking the dependence on the Eurasian economic space. Bulgaria will become part of the global economic space that bold men such as Henry Morgenthau Jr., Harry Dexter White, Gustav Stresemann, and Jacques Delors mapped out for future generations. In fact, one of the reasons the euro is so disliked is that sizable groups in Bulgarian society are unwilling to part with their memories of socialism, yet at the same time refuse to acknowledge that low pension levels are due to the inflation that cut down the Bulgarian economy in 1997.

The same is true of those groups that see European capitalism as something frightening and hostile, without realizing that, compared with the Anglo-Saxon model, it guarantees the development of a middle class. Such problems are the product of serious disinformation in Bulgarian society, which aims to convince it that the single European currency will turn Bulgaria into a colony.

To sum up, joining the eurozone will not only help our country gain monetary sovereignty and a more significant role in the global economy, but will also position it firmly as part of Europe.

Under these conditions, the euro will help close the gap between the very rich and the very poor and, given a consistent and constructive economic policy, will encourage the formation of a European middle class.

The inconvenient truth is that adopting the euro will also impose much stricter standards and conditions on labor dynamics, which will require the country to become considerably more transparent in applying national and European legislation. This will pull many opponents of the euro out of the comfortable role of idle critics of NATO and the EU who assemble their arguments from conspiracy theories rather than rational reasoning, and it will allow a genuinely critical view of how the Union and the Alliance function to come to the fore. That view has brought European and American representatives into conflict more than once, but in the end the outcome has always been the same: on both sides of the ocean, they have realized that they are like one family whose economic rise is the engine of the world economy.

What if we fail?

What would happen in a parallel reality in which Bulgaria remains outside the eurozone? Or what would happen if the lack of consensus in our political debate leads to instability that Brussels and Frankfurt will not look kindly on?

If Bulgaria does not join the eurozone, this will create quite fertile ground for the growth of anti-Western sentiment in Bulgaria and for the country’s isolation. This is in practice a road back to the Malta consensus, but since that has long ceased to function, Bulgaria may find itself in the position of a failed state inside Europe. After all, it is one thing for the Bulgarian economy to exist without the euro being on the agenda at all, and another for it to function under conditions in which adoption of the single European currency has failed.

This will consolidate anti-European circles and make it possible to create an economic climate that repositions Bulgaria within the Euro-Atlantic space. Our country will once again find itself in the role of a bridge between East and West and, as is well known, over the years this role has brought us many problems.

A failure to adopt the single European currency will provide a mandate to call into question the entire geopolitical orientation of our country.

This will not happen suddenly or at once, but it will take more tangible form through narratives that the European Union has no future and that its decline is inevitable.

This is the place to say that no one knows exactly what the future holds and that the Union has found itself in difficult moments more than once. The older generation of European politicians remembers the period of “Eurosclerosis” and the crisis of the European currency snake. But Europe has always found a way to survive, because it has reformed itself. That reform will again be very painful today, given the geopolitical challenges facing the Old Continent, but sooner or later it will begin. The plans of Germany and France to divide the Union into concentric circles that benefit to differing degrees from the integration process are telling. If Bulgaria ends up in one of the outer circles (which is inevitable if we do not join the eurozone), it is unlikely that in the coming decades we will manage to be a full-fledged member of the European Union.

In the long run, if we fail to adopt the euro, it is quite possible that we will again find ourselves at the entrance of the tunnel we have been trying to leave for three decades. This will happen when the lev’s life support fails and the currency board stops working. And when the country falls into an economic crisis and stops servicing its steeply rising external debt, which will happen sooner or later if we enter a new economic crisis as a result of a global recession or latent political instability. Many would call such a claim a delusion, but few people in the glorious summer of 1994 expected that only three years later they would be queuing at banks to withdraw their savings. And indeed, if these events repeat themselves, the political price that citizens and future generations will pay will be very high.
