Post Syndicated from The Hook Up original https://www.youtube.com/watch?v=D_78hM_1buM
Запечатаната земя на София
Post Syndicated from Боян Юруков original https://yurukov.net/blog/2026/copernicus/
Преди време намерих данните на европейската обсерватория Copernicus, но така и не ми е оставало време да ги прегледам. Съдържат безценни данни за земеделската земя, горите, крайбрежните зони, рискове от наводнения и пожари. Тази седмица седнах да погледна един от слоевете – за изкуствено покрита земя. Разбирайте асфалт и бетон, който изцяло покрива кварталите ни.
В миналото съм критикувал доста прилагането на изискванията за озеленяване специално в София и ролята му в презастрояването. Докато един бивш главен архитект ги наричаше безсмислени, а доста строители – прекомерни, всъщност са далеч не достатъчно изискващи и отчасти трудни за прилагане. Не, че някой се е опитвал да ги наложи истински дори в сегашния им вид. На практика се позволява бетонирането на цели парцели стига строителят да може да покаже няколко кашпи с дървета и няколко квадрата чима с трева. В този смисъл вече избилата мухъл по стените се брои към озеленяването за целите на акт 16. Не защото нормативно е позволено, а защото има добре установена практика с ревностно пазена документация. Заради последното заведох тази седмица няколко дела.
Една от основните роли на изискванията за озеленяване е не само чистота на въздуха и намаляване на шумовото замърсяване, но и задържане на водата от проливните дъждове, за да не се получават наводнения и пропускането ѝ надолу, за да захранва подпочвените води. За съжаление, последните са под огромен риск не само, защото масово и често нелегално се използват за миене на коли в автомивки и сгради без право да се вържат към ВИК, но защото все по-голяма част от София е практически запечатана.
Виждаме го при всеки следващ строеж и това се позволява от ЗУТ и изискванията за озеленяване в София. Исках да разбера колко точно. Copernicus предоставя такива данни. Имат слоевете за 2018, 2021 и 2024-та. Следващото заснемане ще е догодина та ще може да сравним какво се е случило покрай бума на влезли в експлоатация имоти след многото разрешения за и започнати строежи преди това.
Интерактивна карта със слоевете може да видите на сайта на обсерваторията. Тук показвам изгледа през трите години. За съжаление, тази през 2018-та е направена по различен модел и не пасва на следващите. Вижда се ясно обаче как липсват сградите от източната страна на горния край на Самоковско шосе, в Манастирски ливади, на север от централна гара и на юг от Бизнес парка. В последната снимка показвам сравнението между 2021-ва и 2024-та. В червено се виждат новите „запечатани“ части на София. Това не означава, че не се е строяло другаде преди това, а че там вече е имало ниски сгради, производства и друго, макар и далеч не с такава интензивност на застрояване.
Може сменяте галерията със стрелката надясно или да ги видите и тук на цял екран: 2018, 2021, 2024, промяна при 2024.
Тепърва ще разглеждам данните на Copernicus. Има интересни показатели за озеленяването.
Active Exploitation of Oracle PeopleSoft Zero-Day (CVE-2026-35273)
Post Syndicated from Jonah Burgess original https://www.rapid7.com/blog/post/etr-active-exploitation-of-oracle-peoplesoft-zero-day-cve-2026-35273
Overview
On June 10, 2026, Oracle published a security alert for CVE-2026-35273, a critical vulnerability in the Updates Environment Management component of PeopleSoft Enterprise PeopleTools. Oracle released an out-of-band patch the same day as the advisory, underscoring the urgency of remediation. The vulnerability has a CVSSv3.1 score of 9.8 and is remotely exploitable without authentication. Per the vendor advisory, successful exploitation may result in remote code execution (RCE). TrendAI has classified the underlying flaw as a server-side request forgery (CWE-918). PeopleTools versions 8.61 and 8.62 are affected.
CVE-2026-35273 was reported to Oracle through TrendAI’s Zero Day Initiative. According to a report published by Mandiant on June 11, 2026, this vulnerability has been exploited in the wild as a zero-day prior to the vendor security alert, with active exploitation observed between May 27 and June 9, 2026, predating Oracle’s advisory by two weeks.
Mandiant has attributed the campaign to UNC6240 (ShinyHunters), a financially motivated cybercriminal collective known for data theft and extortion. ShinyHunters has been linked to breaches across cloud services, SaaS platforms, and telecommunications providers, frequently exploiting weak authentication controls, stolen credentials, and cloud misconfigurations rather than deploying sophisticated malware.
Based on information published by Mandiant, the campaign heavily targeted the higher education sector; 68 percent of the more than 100 notified organizations were universities and colleges. The observed exploitation targeted PeopleSoft’s Environment Management Hub (PSEMHUB) endpoints, and data stolen during the campaign was published on the ShinyHunters Data Leak Site (DLS) on June 9, 2026.
The /PSIGW/HttpListeningConnector URI path appears in both the indicators of compromise for this campaign and in a PeopleSoft exploit chain for CVE-2013-3821, detailed by Lexfo in 2017. A related XML External Entity (XXE) vulnerability, CVE-2017-3548, targeted a different Integration Gateway connector (PeopleSoftServiceListeningConnector) under the same /PSIGW/ path.
Technical overview
TrendAI’s detection signatures for CVE-2026-35273 classify the underlying vulnerability as an SSRF. These include IPS Rule 1012580 (“Oracle Peoplesoft PeopleTools SSRF Vulnerability”) and DDI Rule 5855 (“Peoplesoft PeopleTools Environment Management Hub (PSEMHUB) SSRF Exploit”). Mandiant describes CVE-2026-35273 as a critical remote code execution vulnerability, indicating that the SSRF serves as the mechanism through which code execution is achieved. Based on Mandiant’s analysis, two endpoints are involved in exploitation: /PSEMHUB/hub and /PSIGW/HttpListeningConnector. The exploit chain may also cause the target system to make outbound SMB connections (TCP port 445) to external destinations, potentially allowing attackers to capture Windows machine-account NetNTLM hashes.
Post-exploitation activity observed by Mandiant included the deployment of MeshCentral (an open-source, and self-hosted web-based remote monitoring and management platform) remote management agents configured to masquerade as Microsoft Azure services (e.g., meshagent64-azure-ops.exe), with C2 communications directed to wss://azurenetfiles[.]net:443/agent.ashx. The attackers performed internal reconnaissance of PeopleSoft configurations, deployed lateral movement scripts, and exfiltrated data using zstd compression.
Mitigation guidance
Organizations running PeopleTools versions 8.61 or 8.62 should apply the vendor-supplied patch on an emergency basis, without waiting for a regular patch cycle to occur. Oracle has characterized this as a high-priority risk reduction measure.
In addition to patching, organizations should implement the following compensating controls:
-
Disable the Environment Management Hub (EMHub) Service in multi-server configurations, or completely remove the PSEMHUB application in single-server configurations.
-
Block external access to /PSEMHUB/* and /PSIGW/HttpListeningConnector at the network perimeter or firewall level. Per Mandiant, restricting these endpoints is considered non-breaking for standard end-user PeopleSoft Internet Architecture (PIA) browser sessions.
-
Monitor outbound SMB traffic (TCP port 445) from PeopleSoft servers to untrusted external destinations.
Given that exploitation occurred as early as May 27, 2026, Rapid7 strongly recommends investigating for signs of compromise even after patching, using the indicators of compromise outlined below.
For the latest mitigation guidance, please refer to the Oracle security alert and Mandiant’s report.
Rapid7 customers
Exposure Command, InsightVM, and Nexpose
Exposure Command, InsightVM, and Nexpose customers can assess exposure to CVE-2026-35273 with authenticated vulnerability checks available in the 12th June 2026 content release.
Intelligence Hub
Customers leveraging Rapid7’s Intelligence Hub can track the latest developments surrounding CVE-2026-35273, including indicators of compromise (IOCs) from the Mandiant report published on June 11, 2026.
Indicators of compromise
The following indicators of compromise are sourced from Mandiant’s report. Mandiant has also published a GTI collection with additional IOCs for registered users.
Network indicators
Staging and C2 infrastructure:
-
142.11.200[.]186
-
142.11.200[.]187
-
142.11.200[.]188
-
142.11.200[.]189
-
142.11.200[.]190
-
azurenetfiles[.]net (C2 domain masquerading as Microsoft Azure)
-
176.120.22[.]24 (ShinyHunters DLS mirror)
File indicators
|
Filename |
Description |
SHA-256 |
|---|---|---|
|
meshagent64-azure-ops.exe |
Pre-configured Windows MeshCentral agent |
f02a924c9ff92a8780ce812511341182c6b509d45bc59f3f7b522e37225d24fc |
|
meshagent64-v2.exe |
Pre-configured Windows MeshCentral agent |
d83fdb9e53c5ff03c4cb0451ea1bebd79b53f29eadc1e2fa394c7af13a86ce2f |
|
meshagent32-azure-ops.exe |
Pre-configured Windows MeshCentral agent (32-bit) |
c7e9332731b06644fc73e0046a2a89eaa59b09f54250e9bd622467187351711f |
|
meshagent |
Unconfigured Linux MeshCentral agent |
68257a6f9ff196179ec03624e849927f26599eb180a7c82e14ef5bc4e93bc309 |
|
.bash_history |
Attacker command history |
2ab684d93c1553fad87041b4dea97188a97e78589deee2a7bacff905564f3a35 |
Host-based indicators
-
Unexpected .jsp files under <PS_CFG_HOME>/webserv/<domain>/applications/peoplesoft/PSEMHUB.war/
-
Unauthorized files or directories under …/PSEMHUB.war/envmetadata/transactions/
-
Unexpected directories named logs, persistantstorage, or scratchpad under PSEMHUB paths
-
Recently created or modified .xml files under <docroot>/envmetadata/data/environment/ (potential XMLDecoder persistence)
-
Defacement and extortion marker file: README-IF-YOU-SEE-THIS-YOUVE-BEEN-HACKED.TXT
Log-based indicators
HTTP POST requests to the following endpoints from external source IPs:
-
/PSEMHUB/hub
-
/PSIGW/HttpListeningConnector
Requests to /PSIGW/HttpListeningConnector containing loopback addresses (127.0.0.1, localhost, ::1) or internal IP ranges within request headers or parameters may indicate SSRF exploitation.
Updates
-
June 12, 2026: Initial publication.
Hundreds of AUR packages compromised
Post Syndicated from jzb original https://lwn.net/Articles/1077718/
Hundreds of orphaned packages hosted by the Arch User Repository (AUR) have
been compromised by an attacker who has added a malicious npm
package (atomic-lockfile) that can exfiltrate sensitive
data. The project is currently working
on cleaning up the mess. There is a list of affected packages
and post (possibly NSFW domain) by
“sodiboo” with additional information. Arch Linux users (or users of
Arch-based distributions) that use AUR packages may wish to see if they
have installed any of the compromised updates.
Security updates for Friday
Post Syndicated from jzb original https://lwn.net/Articles/1077703/
Security updates have been issued by AlmaLinux (.NET 10.0, .NET 8.0, .NET 9.0, bind, expat, httpd:2.4, kernel, kernel-rt, mod_http2, openssl, poppler, redis, redis:7, samba, and unbound), Debian (ironic, kernel-wedge, libinput, linux-base, and neutron), Fedora (kernel, openssl, vaultwarden, and vaultwarden-web), Mageia (erlang-hex_core, erlang-rebar3, gnupg2, and sqlite3), Red Hat (buildah, podman, and skopeo), SUSE (flannel, gdk-pixbuf-loader-libheif, gnutls, google-cloud-sap-agent, grafana, graphite2, hplip, libIex-3_4-33, libzypp, nginx, openssh, perl-DBI, perl-Git-Repository, perl-Protocol-HTTP2, python-Pygments, python-simpleeval, python311-Django4, rclone, roundcubemail, strongswan, tomcat10, tomcat11, unbound, and webkit2gtk3), and Ubuntu (apache2, dotnet8, dotnet9, dotnet10, gst-plugins-base1.0, ironic, linux-azure-5.15, linux-azure-fips, lwip, mistral, and ubuntu-kylin-software-center).
Scaling Security Insights: how we achieved a 10x increase in global scanning capacity
Post Syndicated from Dave Baxter original https://blog.cloudflare.com/scaling-security-scans/
Security Insights provides actionable security recommendations for every Cloudflare account. To find these insights, we perform regular scans for all accounts, zones, and DNS records, looking for potential security risks and misconfigurations.
However, two key issues emerged. First, our scans were too infrequent. Scans were only being performed every week or two, and therefore newly introduced security risks could remain undetected for up to two weeks. Second, automatic scanning was opt-in for many free plan accounts – meaning lots of accounts weren’t being scanned at all.
The risks of infrequent or nonexistent scans are rising: as automated attacks accelerate, the window for detecting security misconfigurations is shrinking. Making sure that we’re finding these issues for all of our customers is crucial to our aim of building a better Internet for everyone.
We calculated that to increase our scanning frequencies and enable automatic scanning for all accounts, we would need to increase our scanning throughput by around 10x on average – from 10 scans per second to 100 per second. But our system was already struggling with its load: millions of events were filling up our backlog waiting to be processed; our API was frequently timing out; our processes were crashing. We needed to fix our system, and we needed to make it scale.
This is the story of how we increased scanning throughput for Security Insights by more than 10x, enabled security insights for millions of customers, and doubled our scanning frequency for all customers. Read on to find out how we achieved these improvements.
At a high level, our automatic security scans are triggered by a scheduler. When an account or zone is due for a scan, the scheduler publishes a message (or messages) to Apache Kafka, an open-source distributed event streaming platform. These messages fan out to a number of checkers: specialized Go microservices that scan specific assets or configurations.
For every message, each checker sends its results (the security insights that it found) to our internal API, which then persists these in a Postgres database.

Apache Kafka is not strictly a queue: it is a partitioned event stream (though recently gained queue semantics). Within a partition, messages must be consumed and processed in order. This differs from typical queues where messages may be consumed in order but are processed out-of-order. As a result, we can only have one active consumer per partition within a consumer group.
This has two consequences for us:
-
Messages that are slow to process block the consumer from progressing to the next message
-
For each checker, we can only have as many consumers as there are partitions (each checker has its own consumer group)

We could have tried to scale by adding more partitions. However, this would have increased resource usage for the Kafka broker itself, which is shared by many other services. We reserved this as a last resort, aiming to improve our code and architecture first.
Although we can only consume messages in order, there is nothing stopping us from consuming multiple messages at once.
We changed our checkers to consume messages in batches, processing each message in a separate goroutine. The trade-offs are that we’d have more work to re-do if our process crashed midway through a batch, and our memory usage would be slightly increased. In our case, these were both acceptable.
Some messages processed by a few of our checkers take much longer to process than others. For example, one account/zone may have far more assets than another. In the worst case, these messages can take minutes or hours to process compared to the average case of seconds or milliseconds.
We opted for a very simple approach: splitting our consumer groups and checkers in two – the ‘slow lane’ and the ‘fast lane’. We could determine quickly whether a message would be slow or fast to process. If the ‘fast lane’ checker encounters a slow message, it skips it.

This solved the problem: slow messages had the dedicated resources and time to be processed with minimal delay, and fast messages were able to proceed at their regular fast pace.
Every insight we find gets written to our Postgres database. This is handled by a single API endpoint that our checkers invoke with a list of insights. The implementation looked like this:
for _, issue := range issues {
_, err = tx.Exec(ctx, `INSERT INTO table ... VALUES ($1, $2, ...) ON CONFLICT DO UPDATE ...`, ...)
if err != nil {
return err
}
}
The astute reader will notice that for large sets of insights, this code makes a round trip to the database per insight. With a maximum observed size of 500,000, this was half a million round trips, queries, and transactions in a single API call.
We initially tried the gold standard for bulk inserts in Postgres: COPY into a temporary table. However, we found that this approach led to bloat in the Postgres system tables.
We settled on a hybrid approach:
-
Using UNNEST when the number of issues was below a threshold
-
Using COPY when the number of issues exceeded this threshold
This provided the best of both worlds: reasonably fast inserts for huge sets of insights (seconds), and even faster inserts (milliseconds) for small sets of insights.
We noticed several strange behaviours in our internal API as we tried to scale:
-
A large number of requests were triggering client-side timeouts
-
Many checkers were spending 20-90% of their processing time on a single API call
-
When triggering a large volume of scans, our throughput would start high and deteriorate
All of these problems had the same root cause: latency.
Our primary database is located in Portland, Oregon. Our API, however, was running active-active in both Portland and Amsterdam. Even at the speed of light, the round-trip latency between Portland and Amsterdam would be 50 milliseconds.
As a result of this latency, database queries from the Amsterdam API instance took much longer, holding connections from our client-side connection pool open. With the large volume of requests that we were making to the API, the connection pool was quickly becoming exhausted, leading to timeouts waiting for a free connection. Our average API call completed in 10 ms in Portland, but almost 3 seconds in Amsterdam!
But why the drop in message throughput? Each checker process gets assigned a set of partitions of the Kafka stream to consume. Our API is load-balanced. Since we hold the connection open throughout the life of the process, some processes had a connection to the Amsterdam API, and others had a connection to the Portland API. The partitions linked to Portland were processed quickly, but the ones consumed by the Amsterdam-bound processes were lagging behind:

Kafka lag (number of messages waiting to be processed within a single consumer group) by partition for one of our checkers. Note that we have 30 partitions in this case. Exactly 15 partitions can be seen lagging behind (the lines that reach or approach zero later than around 03/10 03:00). This is because the load balancer splits traffic evenly between our API endpoints.
This was a simple fix: we switched our API to active-passive, ensuring the active API followed our primary database. Our latency problems disappeared overnight.
We’d scaled Kafka. We’d optimised our database queries. We’d fixed our API. However, we still had a problem: we needed to be sure our scans would be roughly uniformly distributed in time. It wasn’t feasible to queue all of our scans at the same time, as our Kafka topic uses a time-based retention policy: the scans would pile up in Kafka, and eventually be deleted before they could be processed.
Our scheduler was not good at uniformly distributing our scans. The number of scans that would be triggered at a given time was spiky and unpredictable. At certain points throughout the week, hundreds of thousands of scans would be triggered within minutes of each other. What was going on?
The scheduler triggers scans on fixed recurring periods. In pseudocode, the scheduler looked like this:
Loop forever:
Find accounts where last_scheduled_at + scanning frequency <= now
For each account:
Trigger scan for account
Trigger scan for all zones in the account
Update last_scheduled_at = now
We quickly noticed that last_scheduled_at was similar for a large number of accounts in our database, which was responsible for some of this unevenness.
However, even with perfectly even distribution, increasing our scanning frequency would have compounded this problem. For example, changing the scanning frequency from every 15 days to every seven days would mean 53% of accounts would suddenly be due for a scan.
There was a further problem with this logic. Some accounts have a very large number of zones. When these accounts were scheduled, there was a cascade of scans for all of their zones. This was saturating our Kafka partitions and leading to delays for scans of much smaller accounts.
To fix these problems, we made three key changes:
-
Schedule zones independently of accounts: each zone gets its own last_scheduled_at field.
-
Randomize the last_scheduled_at time for existing accounts and zones.
-
Introduce adaptive rate limiting for scan scheduling.
Scheduling zones independently was an obvious way to solve the problem of large accounts. Randomizing the last_scheduled_at time (and ensuring that no scans were delayed during this process) allowed us to fix the existing unevenness in our database.
Adaptive rate limiting is slightly more interesting. Rate limiting would allow us to solve the problem of a spike in scans when we change scanning frequencies. For example, if we wanted to increase our scanning frequency to every 7 days, and we had 50 million accounts, then a rate limit of ~83 scans/second would ensure that they were spread out evenly across 7 days.
But what if we added 10 million more accounts? Then, this rate limit would force us to take 8 days to scan all of these accounts. This is where the adaptive part comes in: the rate limit is asynchronously recalculated every half-hour based on the total number of accounts and zones we have, and our scanning frequencies. This ensures we continue scanning on time even if we onboard thousands or millions more accounts and zones.
func computeRate(free, pro, biz, ent int64) rate.Limit {
r := float64(free)/freeScanInterval.Seconds() +
float64(pro)/proScanInterval.Seconds() +
float64(biz)/bizScanInterval.Seconds() +
float64(ent)/entScanInterval.Seconds()
// Guard against zero counts. We always want to schedule at least one scan per second.
if r < 1 {
r = 1
}
// Increase rate limit beyond the 'perfect' value, to have a buffer in case of any downtime
// or spikes in load.
r *= rateLimitBufferFactor
return rate.Limit(r)
}

With these fixes, our 7-day moving average throughput per checker over time rose by more than 10x.
Before these improvements, we were executing around 10 scans per second. The gap between this and our target throughput of 100 scans per second seemed vast. We discussed throwing more resources at the problem, throwing more partitions at our Kafka topic – even throwing out our entire architecture.
But our fixes made all the difference. Today, Security Insights sustains over 120 scans per second during peak scheduling, exceeding our 10x improvement goal. Our internal API is no longer timing out, and our Kafka lag metrics look much healthier. These scalability improvements have allowed us to turn on automatic scanning for all free accounts and zones and increase the scanning frequency for all customers:
-
Free: every 7 days
-
Pro and Business: every 3 days
-
Enterprise: daily
The improved system stability has given us confidence to build new features that we were previously constrained from creating. We’ve added the ability to perform granular on-demand scans. You can now manually re-scan a Cloudflare account, zone, insight, or insight type.

Starting a granular on-demand scan from the Security Overview page in the Cloudflare dashboard
The lesson we learned is that it’s crucial to deeply understand the existing system before throwing anything away. By looking closely at our code, SQL queries, logs, and metrics (especially metrics!), we were able to increase our capacity without simply adding more pods or partitions. By questioning our assumptions, digging into weird-looking metrics, and refusing to take the easy shortcuts (such as increasing API client-side timeouts), we built a more stable and resilient system.
Throwing more resources at the problem might sometimes be the answer, but at Cloudflare, we believe in engineering our way out of problems.
Security Insights scans are enabled by default on all Cloudflare plans. Log in to the Cloudflare dashboard today to review and manage your security insights.
Rise and Fall of the Great Library at Alexandria
Post Syndicated from The History Guy: History Deserves to Be Remembered original https://www.youtube.com/watch?v=uqIG1n7d0fY
Bernie Sanders’ AI Sovereign Wealth Fund Plan
Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2026/06/bernie-sanders-ai-sovereign-wealth-fund-plan.html
Let no one accuse Bernie Sanders of ducking the big questions. Writing in the New York Times last week, the senator asked: “Will the future of humanity be determined by a handful of billionaires who have promoted and developed AI, with virtually no democratic input, who stand to become even richer and more powerful than they are today?”
We agree entirely that this is one of the most potent questions facing global democracy today. Our book, Rewiring Democracy, surveys the emerging uses for and impacts of AI in democracy around the world and reaches the same conclusion: that the most urgent risk posed by AI is the concentration of power, wealth and control among tech oligarchs.
And yet we reached a vastly different conclusion than Sanders on what to do about it.
The senator points to a once radical but increasingly popular solution: creating a US sovereign wealth fund by taking 50% stock in AI companies such as Anthropic, OpenAI and xAI. The argument in favor of this is twofold. One: it would establish democratic control over the AI companies, giving the government “the power, through its voting shares and an equal representation on each company’s board, to block decisions that hurt our citizens and to push for policies that help them”. Two: it would return a big chunk of the economic rewards of soaring AI valuations to the public, ensuring “trillions of dollars potentially generated by AI are used to improve the lives of all of us”.
We laud both these goals unreservedly.
We wholeheartedly agree that there must be public influence over the development and use of AI, just as we demand the government intervene to ensure that automakers, drugmakers, airlines and other industries balance profitability with public safety and the public interest. And we credit the senator with recognizing that there are more levers for the government to pull beyond the promulgation of regulation to achieve this.
And we also agree that the obscene, dangerous accumulation of wealth among AI companies needs to be disrupted. As OpenAI and Anthropic race to be minted as the world’s latest trillion-dollar AI companies, we should recognize that—whether or not it constitutes a bubble—these staggering market capitalizations represent a transfer of wealth. The flow of money goes from the smaller businesses and actual people using AI, and being subjected to it, to the owners of these tech companies.
That includes the world’s 86 AI billionaires “seeking to maximize their power and profit” aiming to decide the “fate of humanity behind closed doors in Silicon Valley”, as Sanders said.
And yet, while we do not outright oppose the taking of AI company stock, or of a US sovereign wealth fund, there are better ways to achieve Sanders’ stated goals.
Public ownership of these companies entangles corporate profit and valuation with the public interest. It would incentivize the government to clear regulations, permit the exploitation of workers and users, suppress competition, encourage AI adoption regardless of the responsibleness of the implementation or appropriateness of the use case, and otherwise act on behalf of corporate interests.
After all, if growing, say, Nvidia from its first $5tn in value to its next $5tn also represents a doubling in value of this segment of the sovereign wealth fund, then you can expect the fund managers to support chip sales, foreign and domestic, with the same zeal as the company’s private investors.
This is not an effective way to influence corporations to act in the public interest. In fact, it makes corporate influence on the government more likely.
We should be wary of this possibility because we’ve seen it before. Ownership of substantial stakes in oil companies by the Norwegian sovereign wealth fund, the world’s largest, does not seem to have steered those corporations to pro-environmental policies. Instead, the Norwegian government’s dependence on those companies has inhibited them from taking climate action. Here in the US, public employee pension funds merit the same criticism: the fiduciary duty to generate wealth overwhelms any intention to direct their corporate holdings in the public interest.
A better answer is to separate the two goals. The standard way to share private rewards with the broader society that made them possible is taxation. Senator Elizabeth Warren has proposed an excise tax on datacenters’ energy use. Others have proposed an AI token tax, which has much the same effect.
As to the goal of reshaping AI in the public interest, we have proposed an AI Public Option. The concept is for governments, be it federal or state, to establish publicly developed and operated AI models run by public institutions under democratic control. The idea is not to eliminate corporate AI or to seize it as a public asset, but rather for government to provide a competitive baseline that private AI offerings must meet or exceed to win business—just like the notion of a healthcare public option.
The Swiss have trailblazed this approach. Apertus is a large language model built by Swiss public servants, researchers at Swiss universities, using appropriately licensed training data and pre-existing Swiss public supercomputing infrastructure powered by renewable energy.
While Apertus doesn’t seriously compete with the latest OpenAI and Anthropic models on performance benchmarks, it blows them out of the water in transparency, sustainability and compliance with EU regulations including adherence to copyright. It’s a nascent project, but suggestive of how public institutions can apply competitive pressure for corporate actors to behave responsibly.
Don’t confuse public AI with “sovereign AI“, the notion that every country needs to invest in domestic AI infrastructure. Sovereign AI is often invoked as a marketing scheme for big tech companies looking to sell to governments; it demands public investment without guaranteeing public control.
Sanders is a bold and savvy political operator. So why is he pursuing the sovereign wealth fund strategy when he must be aware of these risks? It may be due to another argument he makes in his op-ed: that the Trump administration and the billionaire owners of AI are aligned to the idea.
It’s expedient to capitalize on rare moments of seeming alignment across diverse political factions, but it also behooves us to ask why the AI billionaires are open to this extraordinary intervention. The answer, of course, is that they believe that for every dollar ceded to government stock expropriation, they will get back more in favorable government policies to protect that newfound investment.
Energy taxation is a straightforward way to make AI companies pay for the social disruption of their technologies. Public AI represents a non-monetary mechanism for governments to shape the development of AI, complementary to direct regulation of private actors, one with a far greater chance of influencing corporate behavior towards the public interest. We urge Sanders and other political leaders to consider them.
This essay was written with Nathan E. Sanders, and originally appeared in The Guardian.
Машаллах, българи!
Post Syndicated from Емилия Милчева original https://www.toest.bg/mashallah-bulgari/

„Машаллах, българи!“, ще каже турският президент Реджеп Тайип Ердоган, който бил склонен на тактически отстъпки по споразумението с „Боташ“. И ще е прав, защото ще прибере стратегическата печалба благодарение на български политици.
За целта на официално посещение в България беше министърът на външните работи на Турция Хакан Фидан – един от най-близките хора на президента Ердоган. В продължение на 13 години Фидан оглавяваше турската разузнавателна служба (MİT), превръщайки я според анализатори „в своеобразно „острие“ на турската външна политика“ и изпълнявайки специални задачи, възлагани му лично от Ердоган.
Преди посещението той даде интервю пред кореспондентката на БТА в Турция Айше Сали, от което стана ясно, че Анкара ще използва неизгодното за България споразумение с „Боташ“ като разменна монета за свои геополитически цели. Срещу облекчаване на условията Анкара поставя на масата магистрали, гранични пунктове, енергийни връзки и една много по-голяма цел – да затвърди ролята си на незаменим посредник между Азия и Европа.

Въпреки че България обича да се (само)нарича врата към Европа/Изтока (в зависимост от коя страна се отваря), Турция иска да държи ключовете. Географията е дала на България мястото. Политиката решава кой ще се възползва от него.
А всичко започва на 3 януари 2023 г., когато „разменната монета“ вече е факт.
В действителност – по-рано.
Буквално три седмици след като президентите на двете страни Румен Радев и Реджеп Тайип Ердоган договориха мащабно сътрудничество между нашите две държави, ние успяхме да превърнем тяхната инициатива в практическо решение, което дава възможност за взаимноизгодно развитие в сферата на енергетиката.
Росен Христов, министър на енергетиката в служебния кабинет на Гълъб Донев, понастоящем прясно назначен в Държавната консолидационна компания
Аферата „Боташ“
Сключеното от един от служебните кабинети на президента Радев споразумение между турската държавна газова компания „Боташ“ и българската „Булгаргаз“ осигурява достъп на България до турските терминали за втечнен природен газ и до турската газопреносна мрежа. Независимо дали използва капацитета за пренос на до 1,5 млрд. куб.м газ годишно България е задължена да плаща на Турция по 537 000 евро (1,050 млн. лв.) на ден. Служебното правителство, което управляваше няколко месеца, сключи 13-годишния контракт (който не определя като договор, а като споразумение, защото иначе щеше да се изисква ратификация от парламента).
Газовото съглашение е критикувано заради високите фиксирани такси и след сключването му се правеха плахи опити за неговото предоговаряне. „Булгаргаз“ настоява за намаляване на количествата и срока на споразумението и за промени на тарифите. ГЕРБ, „Демократична България“, „Продължаваме промяната“ и олигархът Делян Пеевски нееднократно атакуваха президента Радев заради аферата „Боташ“. Временна парламентарна комисия проверяваше споразумението, а докладът и документацията бяха изпратени в прокуратурата.

След изборната победа на „Прогресивна България“ и Румен Радев, осигурила му абсолютно управленско мнозинство, критиките заглъхнаха. Проблемът обаче не изчезна – превърна се в предмет на преговори и е използван като инструмент за натиск.
Аферата „Боташ“ вече е в нова опаковка – предмостие към по-здрава енергийна свързаност. След срещата си с първия турски дипломат премиерът Румен Радев публично не спомена споразумението, той говори за Турция като за „ключов партньор“ и постави фокус върху „енергийната и транспортна свързаност“ по време на разговорите. Външната министърка Велислава Петрова-Чамова, която също се срещна с Фидан, представи подобна версия:
По отношение на енергетиката, която остава ключова област, разгледахме доставките на природен газ, междусистемната свързаност и диверсификацията на енергийните ресурси.
Eдна протоколна снимка от срещата между двамата дипломати, на която Петрова е с открити рамо и коляно, предизвика язвителни коментари в социалните мрежи. България щяла да предоговаря „Боташ“ с голо рамо, шегуваха се потребители. По-късно снимката беше премахната от сайта на МВнР, но остана в публикациите на турското Външно министерство. Лошото е, че докато обществото се смееше на снимката, турската страна всъщност демонстрираше далеч по-сериозна преговорна стратегия от българската.
Нито Радев, нито Петрова-Чамова излязоха извън клишетата и общите фрази за конкретните искания на Анкара. Добре че го беше направил външният министър на Турция с интервюто си пред БТА дни преди гостуването си.
А след срещата във Външно Фидан даде да се разбере, че има правомощия от най-високо ниво да разреши проблема с „Боташ“.
От времето, когато господин Радев беше президент, следим отблизо тази тема. Проведохме няколко обсъждания и бяха дадени инструкции за разрешаването на този въпрос.
След „Боташ“ още от същото, или турският пробив 2
Турция обвързва своята готовност да обсъди промени по договора с „Боташ“ срещу пакет от проекти: увеличаване на газовия и електропреносния капацитет към Европа, магистрала „Черно море“, разширяване на ГКПП „Капитан Андреево“ и нови гранични пунктове. Всички те обслужват голямата турска амбиция да се утвърди като основен енергиен и транспортен коридор между Азия и Европа. Така че решението на проблема „Боташ“ за България, което е и добре изигран ход от Турция, всъщност представлява още повече предимства за югоизточната ни съседка.
Цената е България да подкрепи инфраструктурни проекти, които дават на Турция по-голям контрол върху потоците от стоки и енергия между Азия и Европа. Фидан говори не за предоговаряне на условията по „Боташ“, а за тяхното надграждане. Турция предлага всеобхватно енергийно споразумение, включващо увеличаване на капацитета за пренос на природен газ между двете страни. Казано по-просто, вместо разговорът да се води как България да плаща по-малко за достъпа до турската инфраструктура, Анкара го пренасочва към въпроса как през същата тази инфраструктура да преминават още по-големи количества газ към Европа.
Освен двустранно сътрудничество, договорът между „Боташ“ и „Булгаргаз“ включва инфраструктура, която ще допринесе и за енергийната сигурност на Европа.
Нашата цел е, подписвайки всеобхватно споразумение за сътрудничество в областта на енергетиката, което ще включва увеличаване на капацитета за пренос на природен газ между Турция и България, да развием още повече отношенията си.
Хакан Фидан
Разширяването на ГКПП „Капитан Андреево“ също не е случайно. Това е най-натовареният сухопътен граничен пункт между Турция и Европейския съюз и едно от най-важните трасета за товарния трафик между Азия и Европа. Всяко намаляване на задръстванията и увеличаване на пропускателната способност означава по-бързо и по-евтино придвижване на турски и азиатски стоки към европейските пазари. Същата логика стои и зад идеята за нови гранични пунктове.
Магистрала „Черно море“ далеч не е български инфраструктурен проект. За Турция тя е част от много по-голяма транспортна схема. В началото на годината стана известно, че турски строителни компании искат да изградят на концесия магистрала „Черно море“ от границата с Турция при Малко Търново чак до Дуранкулак при Румъния. Заедно с коридорите от Централна Азия, Кавказ и Ирак към турска територия това би улеснило движението на товари от Азия към европейските пазари. От турска страна дори предлагали към стоте километра магистрала и два скоростни пътя – от Малко Търново до Бургас и от Варна до Дуранкулак.
Защо „Черно море“ е толкова важна за Турция? Тя ще свърже Истанбул с Варна, Румъния, Молдова и Украйна. Така турските товари ще се придвижват много по-бързо до пристанищата в Констанца и Одеса. Турската страна обвързва изграждането ѝ с разширяването на ГКПП „Малко Търново – Дерекьой“.

Общото между всички тези проекти е, че увеличават капацитета на коридора Турция–България–Европа. Затова Анкара ги поставя редом до въпроса за „Боташ“: като част от една по-голяма стратегия, а не като отделни инфраструктурни инициативи.
По време на срещата си с президентката Илияна Йотова Хакан Фидан е обсъдил железопътната инфраструктура и фериботна линия между Бургас и Истанбул. Според официалното съобщение двамата са отбелязали, че затрудненото движение през Ормузкия проток създава необходимост от алтернативни маршрути, което увеличава стратегическото значение на Югоизточна Европа.
Географията е съдба, но политиците са избор
Любопитна подробност от посещението на Хакан Фидан е, че освен с премиера, президентката и външната министърка, Хакан Фидан се срещна и с лидера на ДПС – олигарха и санкциониран по „Магнитски“ Делян Пеевски. Същият Пеевски, който определяше споразумението с „Боташ“ като един от примерите за лошо управление и настояваше за политическа отговорност за сключването на договорката.
Румен Радев също смяташе да разгражда олигархичен модел, чието назоваване му беше трудно, а след изборите на 19 април изобщо спря – и да назовава, и да говори за демонтаж на модела. Самият Пеевски и парламентарната му група се оказаха и поддръжници на инициативите на управляващото мнозинство.
Всичко върви към вдигането на завесите на политическия театър.
Срещата на Хакан Фидан с Пеевски е съзнателен политически сигнал. Тя показва, че когато Турция обсъжда стратегически теми като енергетика и транспортни коридори, Анкара разговаря и с фигура, която има място в стратегическите отношения между България и Турция. Това е неудобен, но показателен знак как декларациите отстъпват пред политическата реалност.
Преди година и половина сделката с „Боташ“ беше представяна като едно от най-тежките наследства на служебната власт. Днес същото споразумение се използва като отправна точка за нов пакет от енергийни и инфраструктурни проекти между България и Турция.
Машаллах, българи…
Островът на прокудените. Травми от миналото изплуват по бреговете на Гьокчеада (първа част)
Post Syndicated from Георги Тотев original https://www.toest.bg/ostrovut-na-prokudenite-travmi-ot-minaloto-izpluvat-po-bregovete-na-gyokcheada-purva-chast/

Небето е осеяно с цветове – десетки кайтсърф крила се носят над бурната вода, теглейки сърфистите мощно по вълните. На плажа туристи аплодират всеки зрелищен трик. Сред тях има семейства с деца, жени с хиджаб, но и в доста по-оскъдни бански костюми. Най-многобройни са заклетите кайтсърфисти, които подготвят екипировката си за следващото влизане във водата.
Изневиделица небето се разсича от изтребител – напомняне за турските военни бази навсякъде наоколо, които зорко охраняват входа на Дарданелите. В морето един от сърфистите се отличава със своята техника – буквално се изстрелва на десетки метри във въздуха, завърта се в сложна акробатична фигура и се връща отново сред вълните. От плажа отново се разнасят ръкопляскания.
Махмуд Махмуди никога не е виждал морето, преди да напусне Афганистан. На 27 години е. Друг път казва: на 30. Самият той твърди, че не е напълно сигурен.
Роден е в Кандахар и израства в сянката на продължителната международна военна интервенция в страната. Остава сирак още като дете. Отгледан е от чичо си, който не е имал постоянна работа. Махмуд работи от малък и помага за издръжката на домакинството. Учи вечер, а през деня продава каквото успее на улицата – флашки с пиратска музика, евтина козметика и дребни стоки, внесени от Китай. „Проблеми у дома и война по улиците“, обобщава детството си той.

Планът му бил прост: да спести пари, да завърши училище и някой ден да стигне до Европа – за предпочитане Германия. Бил убеден, че не е толкова трудно, колкото всички твърдят.
Човек трябва да има някаква причина да живее,
казва той.
Плажът е истински рай за кайтсърфистите – широк, пясъчен, див и с постоянен вятър. По брега са наредени десетки, ако не и стотици кемпери и каравани с регистрационни номера от различни държави в региона. Туризмът тук е сравнително ново явление. До началото на века Гьокчеада е затворена военна зона.
Днес островът, известен и като Имброс (гръцкото му име до 70-те години на миналия век), посреща около 13 000 туристи годишно. Повечето идват от континентална Турция, но има и от България, Румъния, Северна Македония и Полша.
Пътеводителят Rough Guide определя Гьокчеада като „блажено убежище от презастроеното Егейско крайбрежие на континентална Турция“. Според Lonely Planet островът е „скрито съкровище“, останало задълго в сянката на близкия полуостров Галиполи, но постепенно печелещо популярност като спокойно място за семейна почивка.
Още със слизането от ферибота усещаш острова с цялото си тяло – слънце, вятър и лек солен дъх от морето. Пейзажът е див и суров: маслинови горички, накъдето и да погледнеш, а по склоновете спокойно пасат кози. Лесно можеш да се изгубиш тук за няколко дни и да отнесеш спомена със себе си в снимките на телефона си. Но зад ваканционните кадри се крие друг Гьокчеада: за едни той е изгубен дом, за други – място на изгнание, а за трети – просто спирка по пътя.
—
Тишината на вековните маслинови горички е нарушавана единствено от блеенето на кози
Далеч от ветровитото крайбрежие, навътре в острова, въздухът е неподвижен, а времето сякаш тече по-бавно. Идиличният облик на мястото е отразен дори в името му – „гьокче“ на турски означава „небесен“, а „ада“ – „остров“. Сред маслиновите дървета край село Ширинкьой се намира фермата на Раиф и Кание Чалъшкан. Но двамата не са тук по собствено желание. Кание и Раиф са родом от българска Добруджа. Тя е израснала край Генерал Тошево, а той – в село близо до Крушари. Семействата им са част от турското малцинство в България. Младостта им преминава по времето на Тодор Живков – дългогодишния лидер на комунистическа България.

„Живеехме добре – спомня си Кание. – И до днес се радвам, че съм родена в България, че съм живяла там.“ Раиф е на същото мнение:
Бяхме като братя и сестри – турци и българи. Нямаше значение кой какъв е.
През 80-те години обаче турското малцинство се превръща в мишена на насилствена асимилационна кампания. В крайна сметка стотици хиляди души са принудени да напуснат родните си места в най-мащабното етническо прочистване в Европа по време на Студената война, отстъпващо по размер единствено на прогонването на немскоговорещото население от Централна и Източна Европа след края на Втората световна война.
„Никога не сме си представяли, че ще се озовем да живеем на остров – казва Раиф. – Това място така и не ни стана дом.“ Кание тихо го допълва:
Това не беше част от плана. Ако зависеше от мен, отдавна щях да съм си тръгнала оттук.
След падането на режима на Живков дискриминационните политики са отменени, но общественият разговор за този мрачен епизод от българската история дълго остава непълен и силно политизиран. Днес е по-видим – присъства, макар и бегло, в учебниците по история, изследва се от учени и често е посочван от защитниците на демокрацията като пример за престъпната същност на комунистическия режим.


Раиф и Кание © Георги Тотев
Въпреки това никой не е понесъл наказателна отговорност за преследването на турското малцинство.
Към днешна дата отношенията между София и Анкара са може би най-добрите в съвременната история и изглеждат все по-малко склонни да се връщат към онези страници от миналото. Историята обаче продължава да живее – най-вече в спомените на хората, които са я преживели, запълвайки празнината между живота, който са били принудени да оставят зад гърба си, и онзи, който са се опитали да изградят.
—
Следобедното слънце хвърля дълги сенки по тясната пешеходна улица в центъра на главния град Гьокчеада
Масите на уютно кафене са се разлели върху малкия павиран площад в края на улицата. Въздухът е изпълнен с шумни поздрави и клюки, разменяни на гръцки. Изглежда, че всички се познават. Виолета Патиниоти прибира лаптопа си след поредния работен ден за технологична компания в Атина, но далеч не бърза да си тръгва. По пътя към изхода спира на всяка крачка, заговорена от други посетители. Около нея избухва смях, ръце жестикулират оживено, а сбогуванията звучат така, сякаш могат да продължат безкрайно. На вратата се засича с Димитрис – собственика на заведението. Той е роден в Солун, но винаги е знаел, че островът е родното място на майка му. Преди десет години решава да се премести тук. „За да открия корените си – казва той. – И за да си изкарвам прехраната.“

Виолета също е тук сравнително отскоро. Родена е на остров Санторини. Учи археология във Великобритания. В Истанбул среща бъдещия си съпруг – турски художник по осветлението с кюрдски и арабски корени. Той работил по филм, сниман на острова, и двамата постепенно започнали да си представят Гьокчеада като място, където да отгледат двете си деца. Когато избухва пандемията от COVID-19, семейството взема окончателното решение да се премести тук, заменяйки шума на мегаполиса със спокойствието на островния живот. Допълнителен стимул е и решението на властите отново да разрешат отварянето на училища с преподаване на гръцки език.
Искахме децата ни да знаят и гръцки, и турски, да разбират и двете култури. Този остров е истинска мозайка от култури. За мен е рай. Но тук има и дълбоки рани, които все още не са зараснали и за които не е лесно да се говори,
казва Виолета. Кафенето е едно от най-новите популярни места за срещи на гръцката общност на острова, съсредоточена основно в град Гьокчеада и в селата Зейтинли, Тепекьой и Дерекьой. Според преброяване от 2023 г. общността се състои от около 700 души при общо население на острова от приблизително 11 300 жители. С други думи, за около един на всеки 20 жители на острова гръцкият е майчин език. По-голямата част от населението обаче е от турски произход и се е заселила тук едва през последните десетилетия, идвайки от различни части на континентална Турция.
В началото на XX век гръцкият е основният език на Гьокчеада
Другото име на острова – Имброс, е свързано с древногръцката история. Споменава се в „Илиада“ на Омир като скалист остров над подводната пещера, където морският бог Посейдон държал конете си. Археологически находки показват, че островът е бил населен още през каменната епоха. През XV век Гьокчеада става част от Османската империя. Както и на други места в империята, православното християнско население запазва своите църкви и училища. В същото време обаче местните общности остават уязвими на политиките на принудителни преселвания, които периодично променят етническата карта на региона.
С разпадането на Османската империя в началото на XX век Гърция за кратко поема контрола над Имброс и съседния Тенедос – днешния Бозджаада. И двата егейски острова по това време имат преобладаващо гръцкоговорещо население. Съдбата им обаче е решена с Лозанския договор от 1923 г., който ги оставя в границите на новосъздадената Република Турция. Договорът урежда част от конфликтите, съпътствали разпадането на Османската империя, и предвижда задължителна размяна на население между Гърция и Турция – първата подобна мащабна операция, основана на религиозен признак. В резултат на това над един милион православни християни, говорещи гръцки език, са принудени да напуснат територията на Турция, а близо половин милион мюсюлмани, говорещи турски език, са изселени от континентална Гърция и гръцките острови.
Православното християнско население на Имброс и Тенедос – според различни оценки между 4000 и 9000 души – е изключено от размяната. Съгласно Лозанския договор Турция поема ангажимент да гарантира автономията и специалния статус на двете островни общности. Тези гаранции обаче остават само на хартия. Последвалите мерки карат много от гръцките жители на островите да се изселят в континентална Гърция. До края на 1923 г. Турция вече е установила пълен контрол над Имброс и Тенедос.
През останалата част от XX век гръцката общност на острова постепенно се стопява до едва няколкостотин души. Причината е поредица от политически мерки, насочени срещу нейния език, културна идентичност и имуществени права.
Според Юмит Есер, преподавател в университета „Неджметин Ербакан“, тези мерки „на практика представляват форма на държавно насърчавано прогонване, която прави нормалния живот на островите все по-невъзможен“ за православното християнско население.
„Мнозина предпочитат да станат „бежанци“ в Гърция, вместо да останат чужденци, примирени със съдбата си на земята, на която са родени“, казва той.
През последните 20 години част от мерките са отменени, а някои от правата, отнети на гръцката общност през миналия век, са възстановени. Малък брой потомци на някогашните жители са се завърнали на острова, което поражда предпазлив оптимизъм за бъдещето на общността. Въпреки това темата продължава да бъде чувствителна в Турция. Според Юмит Есер, с изключение на ограничен кръг критично настроени към официалните наративи учени, „преобладаващата нагласа в турското общество дълго време е по-скоро безразличие или мълчание, отколкото открит разговор за тази част от миналото“.
Този материал е създаден в рамките на Програмата за журналистически постижения (Fellowship for Journalistic Excellence) с подкрепата на ERSTE Foundation и в сътрудничество с Balkan Investigative Reporting Network (BIRN).
Редактор на оригиналния текст: Нийл Арън
Превод: Георги Тотев
Бисер Дянков: Искаме с всяка игра да научаваме нещо ново
Post Syndicated from original https://www.toest.bg/biser-dyankov-iskame-s-vsyaka-igra-da-nauchavame-neshto-novo/

С игри като „Цар: Тежестта на короната“ (Tzar: The Burden of the Crown), „Да оцелееш на Марс“ (Surviving Mars), „Виктор Вран“ (Victor Vran), студио „Хемимонт Геймс“ (Haemimont Games) беше престижното лице на българските видеоигри. Какво наложи и какви ще са последствията от това, че станахте клон на „Парадокс“ (Paradox)?
Не сме клон. Ние си запазваме идентичността, вътрешната култура, нашия начин на работа, нашите ценности. Работим по нашия вид игри. Ако „Парадокс“ искаха просто да отворят „Парадокс София“, безкрайно по-лесно и по-евтино за тях щеше да бъде да си наемат офис и нови хора. За сделката, която сключихме – никога идеята не е била да станем техен клон. Нашата ценност за „Парадокс“ е, че сме тези, които сме. Например аз съм изпълнителен директор, но студиото се управлява и продължава да се управлява от Габриел Добрев, който е студио мениджър.
Ти си юрист по образование. Как се озова в сферата на видеоигрите?
Постъпих на работа в „Хемимонт Геймс“ през 2009 г. като дизайнер. Тогава работехме по последната от старите римски игри и по „Тропико 3“ (Tropico 3) – успешен проект, който много обичам. Знам, на пръв поглед звучи изненадващо. Спомням си момента, в който трябваше да говоря с родителите си как техните представи за мен няма да се случат, но всъщност далеч не съм единственият юрист, когото познавам и който се занимава с разработка на игри.
Но как все пак се случи при тебе?
Интересувах се от игри още през 90-те и попаднах в първите онлайн общности, които се сформираха тогава по интереси. Те бяха изключително малки. Имаше момент, в който се събра българският интернет – 200 души хлапета от всякакви градове. Студенти и гимназисти основно, в такава възраст бяхме. Така се запознах с много от хората – една част от тях правеха игри за удоволствие, а после започнаха да правят игри професионално. Аз така разбрах, че въобще има хора, които това правят. Години по-късно, през 2008-ма – „Юбисофт София“ (Ubisoft Sofia) съществуваше вече, „Блак Сий“ (Black Sea) съществуваше, – благодарение на такива приятелски контакти, отивайки за един концерт, имах шанса да си говоря дълго с Габи (Габриел Добрев – б.а.). И на следващата година започнах работа в „Хемимонт Геймс“ като дизайнер.
Тъй като ние издавахме игри с доста висока скорост, някак си естествено от един момент насетне започнах да се занимавам все повече с продуциране. Там имаше най-голям глад и най-голяма нужда. В една малка компания като нашата всеки носи по много шапки, изпълнява много роли. Никога не ни е пукало какви точно са титлите ни. Просто има работа за вършене. И аз прецених, че моят най-голям ефект върху целия екип и върху проектите, по които работим, е да се фокусирам върху продуцирането.
И така, това приключение продължава и до днес. Всеки един проект е самостоятелен и има своите собствени предизвикателства. Завършването на даден проект е някакво чудо. Каквато и игра да излезе, фактът, че от бял лист хартия се стига до нещо, което хората ще играят, е само по себе си невероятно преживяване. Един от огромните плюсове човек да работи в „Хемимонт Геймс“ е, че участва в този процес и може да го наблюдава отвътре. Докато много от колегите, които работят в другите фирми в София, знаят какви игри ще правят, още преди да постъпят на работа.
Тоест играта идва като задача, подготвена вече от други.
Да. „Юбисофт София“ правят игри на Ubisoft, „Криейтив асембли“ (Creative Assembly) правят „Тотална война“ (Total War)… Естествено, това е разумно и нормално от бизнес гледна точка. Но аз знам от кухнята, че няма проект, който да можеш да отделиш от неговата продуктова реалност. Идеята, че от едната страна има едни творци, които в балонче нещо си творят, а то после се сблъсква с хората с вратовръзки, никога не ми е звучала убедително. Смятам, че в това, което виждаме накрая като игра, е вплетена продуктовата реалност и тя не трябва да бъде изключвана.
Но от това, което казваш, виждам, че вие сте приятелски свързан екип, имате история, обща идентичност… Правилно ли съм разбрала, че има елемент, който ви прави различни от обичайните мегакомпании?
Абсолютно. И това, между другото, е както плюс, така и минус. Имаме страшно много история и сме запазили през годините сърцевината на екипа. Това означава, че имаме много ниско текучество. Това пък от своя страна означава две неща. Не сме толкова добри в интегрирането на нови хора в сравнение с една организация с по-добре установени процедури и корпоративни структури. Защото ние като групичка си знаем как правим нещата. Имаше много интересна ситуация, когато минахме от работа върху един проект към работа по няколко проекта. Тогава с изненада разбрахме, че дадени проблеми съществуват, но човекът, който винаги ги е решавал, сега работи по другия проект. И ти изведнъж трябва да се справяш с неща, за които никога не си мислил. Междувременно в другия проект е същото – има непознати за тях проблеми, а човекът, който досега ги е решавал, вече е на масата на съседната игра…


Кадри от „Виктор Вран“ (Victor Vran) и „Цар: Тежестта на короната“ (Tzar: The Burden of the Crown)
От многото игри, които сте правили, кои смяташ са най-представителни за вас като студио?
Какво точно наричаш представителни? Може да ги гледаш от различна гледна точка. „Цар“ е много известна в България; римските ни стратегии са много известни в Испания и до ден днешен…
Аз си спомням например, че „Да оцелееш на Марс“ излезе в някакъв звезден, или да го наречем марсиански момент…
О, това беше изумително! Да избереш каква точно игра да правиш е поразителен процес, защото той трябва да е съобразен с твоите силни и слаби страни като екип. Индитата (независимите екипи – б.а.) просто правят играта, която искат да направят: което е добро, то ще остане, всичко друго ще падне. Екипите с опит обаче трябва да са изключително внимателни, защото отговорността е друга и факторите са много. Сред факторите ние винаги сме включвали нашето разбиране накъде отива пазарът за видеоигри, какво правят другите студиа, какъв е, грубо казано, цайтгайстът. Нашите опити да сме в унисон с цайтгайста винаги са били не толкова успешни, с изключение на Марс, където – ама буквално – всичките планети се наредиха по един невероятен начин. Илън Мъск по това време популяризираше идеята за вертикално кацащи и излитащи ракети – очевидно това трябва да го има, за да е възможен двустранен транспорт. Ние първо си направихме визуално образите на нашите ракети, а Мъск няколко месеца преди излизането на играта пусна едни видеа, където ракетите бяха идентични. Не мога да ти кажа на колко интервюта са ме питали: „Ами вие сега защо изкопирахте ракетите на Мъск?“ Можеш да си играеш с тоя въпрос по много начини, например: „Ами Мъск всъщност вероятно е изкопирал нашите ракети.“ Истината е, че когато правиш нещо, което се опитваш да е ново, ти никога не си сам. Винаги има още хора, които са на границата на новото и мислят в сходна посока. И пътищата съвпадат, темите съвпадат.
Игрите, които ти изброи в началото – „Виктор Вран“, „Да оцелееш на Марс“, „Цар“, – са различни. И това е абсолютно съзнателно. Ние като екип експериментираме целенасочено в различни жанрове. Един от въпросите, който сме си поставяли, когато сме избирали каква игра да правим, е: „Добре, какво ново ще научим ние като екип?“ Сега правят „Тропико 7“. Сигурно ние, тъй като доста бързо издаваме игри, щяхме да сме, да не казвам голяма приказка, може би вече на „Тропико 9“. Нямаше проблем да правим „Тропико“ оттук до края на света. Не е нашето нещо това. Искаме да вървим в различни посоки, защото искаме с всяка игра да научаваме нещо ново, да подобряваме технологията си и да увеличаваме инструментариума си, така че следващата ни игра да е по-добра от предишната.



Кадри от „Да оцелееш на Марс“ (Surviving Mars)
Как според тебе ще се отрази изкуственият интелект на качествата на игрите, най-вече на естетическите им качества?
Има много нива на употреба на изкуствения интелект. Тук дори не говоря за директната му употреба за творчество. Говоря за употребата му на техническо ниво, която ще позволи много по-висока продуктивност. В момента се оптимизират системи и писане на код с изкуствен интелект. Това не е нещо, което играчът ще види, но ни спестява усилие, което да бъде насочено в посоки, които играчът ще може види. Във всички случаи неизбежно според мен ще се създава директно съдържание с изкуствен интелект. И вече ще видим доколко то може да се разпознава, да бъде отхвърляно или харесвано от играчите, дали няма да направи игрите твърде подобни… Във всеки случай, екипите, които не използват изкуствен интелект, просто ще станат неконкурентоспособни – освен ако не се случи някакво външно събитие. Примерно, изчислителната мощ, която е необходима, да стане толкова висока, че да направи непосилна масовата употреба на изкуствения интелект в създаването на видеоигри. Но това е някакъв външен фактор. Или да се стигне плато, където да спре бързият растеж на технологията. Това е друга хипотеза. Но пак си мисля, че няма връщане назад. Едно от следствията обаче може да бъде, че ще има много по-малко отворени позиции за хора без опит, така че след десет години да платим неприятна цена, когато сегашните опитни хора напуснат и няма достатъчно опитни хора, които да ги заместят.
От друга страна, някои твърдят, че може да стигнем до ситуация, в която няма да има масови игри, а изкуственият интелект ще направи за всеки перфектната игра. Всеки ще играе собствената си индивидуална версия и разговорът по-трудно ще се води, защото аз няма да съм играл твоята игра.
Преди да е дошъл този момент, в който силно се съмнявам, да сменим темата. Какви компромиси в правенето на игри би искал да не ти се налага да правиш?
От гледна точка на процеса отвътре абсолютно всяка игра е компромис. Тя е недовършена и може да бъде по-добра. Няма завършена игра, има само изоставена игра.
Тя по презумпция завършва в действието на играча, така че е структурно и онтологически незавършена.
Окей, добре, съгласен съм, аз може би прекалено силно мисля от продуктова гледна точка. Има игри, които придобиват пълната си форма или потенциалът им става ясен в края на процеса на разработка. Едва в края разбираш каква игра е трябвало да направиш от самото начало, но процесът на разработка вече свършва и в останалите два месеца няма как да поправиш стореното. Евристичният процес е понякога много бавен, той не се подчинява на срокове. Да речем обаче, че имаш това време. Тук възниква друга опасност. Когато правиш една игра много дълго, ти се отдалечаваш от контекста, в който е била замислена. Затова си мисля, че нашият подход в „Хемимонт Геймс“ е много правилен. Ние никога не сме си поставяли за цел да направим най-добрата игра, на която сме способни. Винаги сме си поставяли за цел да направим по-добра игра от последната. Важното е да пренесем наученото към следващия цикъл. Между другото, това е една от причините, които правят „Парадокс“ толкова добър партньор за нас.
С оглед на всичко това как преценяваш шансовете българско студио да създаде международно признат шедьовър?
Леле! Тук, първо, не е много ясно какво е българско студио. „Хемимонт Геймс“ беше българско студио, няма спор. А сега, като е 100% собственост на „Парадокс“, продължава ли да е българско студио, при положение че креативният контрол си е наш? Creative Assembly българско студио ли е? Ubisoft Sofia българско студио ли е? При това ние сме много щастливи да имаме колеги, които не са българи. Етикетът „българско“ изобщо е под въпрос. Ама това е само началото. А какво означава „шедьовър“ и какво означава „международно признат“, също тотално не ми е ясно. Значи, нашите игри имат нещо, което на английски би звучало като cult following (последователи на култ – б.а.), „Да оцелееш на Марс“ е продала милиони копия, както и „Тропико“-тата. Ти по дефиниция продаваш в целия свят, тоест те са международно признати по дефиниция. А сега какво е шедьовър, от какво се определя? От продадени бройки или от това, че много ни е харесало на нас? Наградите ли имат значение? Тъй като съм наблюдавал отвътре как се правят игри, за да спечелят награди… Една игра може да е много наградена, обаче ако не е намерила своята аудитория, какво означава това? Така че целият този въпрос ми се струва в едни рамки, които при по-внимателен поглед не издържат.
Има обаче конкретни примери, с които мога да илюстрирам този въпрос – „Вещерът“ като полска игра, „Светлосянка“ като френска…
„Вещерът“ не е създаден във вакуум, а по успешна литературна поредица. В този ред имам какво да разкажа от опита ни с работа с писатели. Неведнъж сме се опитвали да работим с писатели. За да се получи, трябват ни образовани хора, но без гордостта, защото медията е друга и винаги се стига до сблъсък между дизайна на играта, която ние правим, и човека, който смята, че играта трябва да бъде точно такава, каквато той е решил. А това не е възможно. Играта не отговаря на индивидуалната фикция, защото са включени много хора, има много фактори. Самата работа в екип е винаги предизвикателство. Така че крайният резултат винаги в някаква степен е отвъд контрола на всички участници.


Кадри от „Шайка наемници 3“ (Jagged Alliance 3) и „Крушение: Извънземна зора“ (Stranded Alien Dawn)
За да се върна към въпроса за изкуствения интелект, виждам две възможни развития. Все повече студиа ще правят все по-еднакви игри. От друга страна, тъй като технически правенето им ще става все по-лесно, според мен ветрилото ще се разтваря и ще има все по-интересни и по-разнообразни игри. И това само по себе си крие потенциала на едно интересно и обещаващо бъдеще.
Да завършим с тази оптимистична прогноза. Благодаря ти от името на Игромислещия екип.
—
В рубриката „Игромислие“ публикуваме разговори, в които се срещат, съпоставят и противопоставят различни гледни точки към многоизмерния, многожанров феномен на видеоигрите – не толкова като електронен спорт, колкото като нов синтез на изкуствата и като ново поле на общуване и социалност.
Comic for 2026.06.12 – Pill Bugs
Post Syndicated from Explosm.net original https://explosm.net/comics/pill-bugs
New Cyanide and Happiness Comic
Scaling out Distroless adoption with AI
Post Syndicated from Grab Tech original https://engineering.grab.com/scaling-out-distroless-adoption-with-ai
Introduction
Grab is migrating from heavy base images like Ubuntu to Distroless images to reduce security risks. By stripping containers down to the bare application and its runtime, we eliminate unnecessary binaries and Common Vulnerabilities and Exposures (CVEs).
This migration is more than a compliance mandate; it is a strategic security decision to build a more resilient and defensible production environment. By moving to Distroless, we are fundamentally shrinking our attack surface; eliminating the binaries and shells that attackers use for lateral movement. With over 900 services already transitioned, we are on track for 80% adoption by mid-2026.
Why Distroless requires rigorous testing
Distroless adoption risk: runtime failure
However, shifting to Distroless images introduces a critical technical risk: runtime failure. A service might build perfectly in Continuous Integration (CI), but fail at the deployment stage due to:
- Missing shared objects: Binaries might require specific libraries (
.sofiles) present in Ubuntu but absent in Distroless. - Implicit links: Third-party tools may expect specific system utilities or directory structures.
Testing is required to ensure two things:
- The service spins up with the correct config.
- All runtime dependencies remain intact.
Scaling this verification across thousands of services manually? That would take years unless we found a way to automate the trust.
The testing methodology
As we perform changes to the Dockerfile definition of our services, it is crucial for us to include the corresponding test strategy so that changes we make do not introduce regressions in our running services. Assessing the change introduced to our services, the lowest possible testing boundary would be that of what we define as Medium Tests in Grab.
Medium tests in Grab
At Grab, we categorize our test suites into three main sizes: small, medium and large. Small tests refer to functional tests whereby mocks are introduced via dependency injection. Large tests refer to end-to-end tests that run on actual services in our staging environment where nothing is mocked.

Medium tests belong in the middle ground, whereby external dependencies (such as service to service dependencies) are mocked with a network proxy layer in a similar concept as WireMock, but internal dependencies like MySQL are not mocked and instead spun up using Testcontainers. In this setup, systems under test are actually built into Docker containers and run in Docker before their endpoints are being hit by test inputs, with the corresponding responses being asserted on. As such, we could now effectively test if any changes of the Dockerfile definition broke the service. An added bonus is that all of these could occur within the CI environment, without reaching the Continuous Deployment (CD) stage.

This makes Medium Test effective and efficient for testing changes to the services associated with distroless adoption. We could now largely scale up our adoption process by:
- Raising batch Merge Requests to dockerfile definitions for Distroless adoption.
- Running medium tests in CI.
- Upon passing the medium tests, automatically merge the changes and trigger CD.
Introduction of toil
The approach above works nicely for services that already have Medium Tests defined. However, we quickly hit a blocker running this rollout methodology for services without a Medium Test setup. Inherently, scaffolding Medium Tests for a service is a tedious task. Most of the toil comes from first figuring out the internal dependencies, then spinning up their corresponding test containers in test time before wiring the internal dependencies up with the service under test by updating the test environment configurations.

These tasks are not challenging but are generally tedious to set up. At the same time, they cannot be automated completely given the different internal dependency combinations that each service uses, as well as the difference in how the configurations are being defined and used in each service. With ~400 services in scope without Medium Test setup, this became a huge blocker for our distroless migration campaign.
The need for flexibility in how each task is executed, together with each task’s fairly low complexity, made artificial intelligence (AI) a natural tool to accelerate distroless adoption work.
AI: The toil buster

AI was a good fit because the work we needed to automate had clearly defined output, and we could tell, deterministically, whether it worked. Success was straightforward: the CI pipeline would turn green, running basic Medium test health checks. With a measurable end goal and a reliable success signal, we pursued an agentic workflow rather than a one-off generation attempt.
The starting point
We started by adopting skills to guide the agent on how to proceed with Medium test work and how to unblock itself when it hit repo-specific friction. These skills gave context for scaffolding basic Medium tests, setting up internal dependencies, and debugging issues in the code. Once those foundations were in place, we rolled the approach out to a batch of 20 services, completed by the AI in about two working days. That batch validated the core hypothesis: the AI could scaffold Medium tests first, then use those tests to verify that our Dockerfile change (building distroless images) introduced no regressions.
Teaching an agent to test
At that point, the real shift was turning “can do the task” into “can repeat the behavior.” We captured the Medium-test knowledge as a list of skills grounded in Grab’s internal Medium test SDK.
Then DevSecOps wrapped those skills into an Entrypoint Skill, an orchestrator that runs a multi-phase workflow across services. The result is a single agent loop that moves from candidate detection, to scaffolding, to fixing failures, and onward to CI verification without treating each service as a brand-new, one-off problem.

Leveraging the skills we’ve acquired, we utilized Claude Code, Anthropic’s agentic coding tool. This tool operates by accepting a list of services and then processing them in a batch.
- Detect: Is this a deployable service or is it a library? Is it still maintained? The agent skips anything that doesn’t qualify, so human time is only spent on real candidates.
- Scaffold: Using Grab’s scaffolding tool, the agent generates the medium test boilerplate.
- Fix: The scaffold rarely works on the first try due to the unique setup of each repository like missing environment variables, database dependencies at startup, port mismatches, and similar issues. The agent reviews its knowledge base and pattern-matches errors against known fixes.
- Raise MR: Once the medium test passes locally, the agent creates a draft merge request on GitLab with a description explaining what changes were done for that specific service and why.
- Monitor CI: The agent polls the pipeline, reads job logs on failure, and attempts CI-specific fixes. If the same error persists after two attempts, it flags the issue for human review.
- Repeat: Push the fix and move to the next service while the pipeline runs. The agent doesn’t sit idle waiting for CI! It starts scaffolding the next service asynchronously, checking back on previous pipelines as results come in.
What made it work
Getting the workflow to function was the easy part. Getting it to function reliably across hundreds of services required deliberate design choices.
Model Context Protocol (MCP): The agent never leaves Claude Code. GitLab interactions like creating branches, raising MRs, reading pipeline error logs, all happen through a MCP server. When the agent needs Grab-specific context like what a service does, or who owns it, it queries Glean, an enterprise search tool used by Grab through its MCP integration rather than guessing. For code-level context, finding how a service is structured or how dependencies are wired across repositories, it queries Sourcegraph through its own MCP integration.
Guardrails over autonomy: The agent can only touch test files and CI configs. Application code is off-limits, enforced before every commit. It can’t gut tests to make them pass. If it can’t fix the problem, it escalates.
Knowledge that compounds: We maintain a feedback loop for scaffolding, mocking, and known failure patterns. After each batch, we review what the agent hit and promote recurring fixes into the skill. The agent improves not because the model gets better, but because its instructions do.
Integrating scripts with skills: For deterministic tasks like boilerplate generation, scripts are far more reliable than raw AI logic. By integrating these scripts as “skills,” we also optimize the agent’s performance in context window management. During test execution, standard output often produces hundreds of lines of repetitive logs that could exhaust token limits or distract the model. Using a script as an intermediary allows us to programmatically filter logs, extracting only the specific error messages or stack traces required for debugging. This ensures the AI receives a clean, actionable summary rather than being overwhelmed by noisy data.
Token efficiency: Batch runs across dozens of services burn through tokens fast. We configured a compressed communication style that cuts output by ~75%, keeping technical substance while stripping filler. Proper communication is reserved for MR descriptions and messages to service owners.
Isolated execution: Each batch run spawns the agent in its own context window. Long sessions processing dozens of services don’t bloat the main conversation, keeping the agent focused and responsive.
Human-in-the-loop: Every MR is raised as a draft; a human reviews before anything merges. A human also decides which learnings become permanent knowledge. The agent proposes; people approve.
From tests to migration at scale
With medium tests in place across our service fleet, we had the safety net we needed. The next step was automating the distroless migration itself.
The patch-test-compare loop

Before touching a single Dockerfile, the system runs the service’s existing medium tests to establish a baseline. Pre-existing test failures are baselined, allowing for a clear distinction between legacy issues and new regressions introduced by the distroless patch.
Then comes the distroless patching. The system inspects each service’s Dockerfile for OS-level package dependencies by scanning for apt-get install lines and filtering out packages already included in the distroless base image. Two scenarios to consider here:
- If no extra packages are needed, it’s a straightforward base image swap.
- If packages are detected, the system generates a multi-stage build: a builder stage installs the required packages, then copies only the necessary shared libraries into the distroless runtime stage. The result is a minimal image that still has everything the service needs to run.
After patching, the same medium tests run again. Results fall into clear categories: pass (tests still green – safe to migrate), regression (tests broke – the patch caused a problem), or already failing (was broken before we touched it). Regressions trigger an automated remediation step. A separate AI agent inspects the container for missing shared libraries and attempts to fix the Dockerfile. If it can’t resolve the issue, the service is flagged for human review.
Scaling with batch changes
The previous section explains the patch-test-compare loop, but how can we scale to handle more than one service at a time? To migrate at scale, we use batch change tooling that applies the Dockerfile transformation across dozens of repositories simultaneously, creating merge requests automatically. The system handles both standalone GitLab repositories and Grab’s shared Go monorepo, adapting the patching and MR strategy to each.
Impact on our services
Medium test generation at scale
With medium tests in place, services with possible regressions have higher chances of being caught before reaching staging, providing the safety guarantee we needed. Each generated test also became a permanent safety net for the service, not just for the distroless migration but for all future changes. Over 1.5 months, the agent raised 100+ medium test MRs across repositories, bringing more services into compliance with Grab’s “shift-left” testing initiative.
Distroless adoption
The campaign moved the needle significantly across our service fleet. Overall distroless adoption for our scope grew from 52.7% in December 2025 to 70.8% by April 2026, covering 997 out of 1,408 services.
Autonomous with oversight
The agent autonomously handles the majority of medium test generation and Dockerfile migration work with little human intervention for standard cases. Engineers remain in the loop, reviewing every draft MR and making the final call on what merges.
Engineering bandwidth reclaimed
Manually generating a basic medium test requires familiarity with Grab’s internal SDK, typically 1–3 days per repository for developers new to the framework. Across ~400 services without medium tests, that adds up to 400–1,200 engineer-days. By leveraging AI we brought this down to roughly 0.1 days per service, compressing what would have taken well over a year into a fraction of the calendar time. This freed the team to focus on higher-leverage work like improving migration tooling, handling edge cases, and advancing the roadmap beyond distroless.
Conclusion
With distroless images and stronger medium test coverage, we made Grab’s services more secure and easier to verify. We demonstrated that AI can shoulder much of the scale-up effort.
Join us
Grab is Southeast Asia’s leading superapp, serving over 900 cities across eight countries (Cambodia, Indonesia, Malaysia, Myanmar, the Philippines, Singapore, Thailand, and Vietnam). Through a single platform, millions of users access mobility, delivery, and digital financial services, including ride-hailing, food delivery, payments, lending, and digital banking via GXS Bank and GXBank. Founded in 2012, Grab’s mission is to drive Southeast Asia forward by creating economic empowerment for everyone while delivering sustainable financial performance and positive social impact.
Powered by technology and driven by heart, our mission is to drive Southeast Asia forward by creating economic empowerment for everyone. If this mission speaks to you, join our team today!
AWS Nitro Isolation Engine: Formally verifying the hypervisor in the AWS Nitro System
Post Syndicated from Ali Saidi original https://aws.amazon.com/blogs/compute/aws-nitro-isolation-engine-formally-verifying-the-hypervisor-in-the-aws-nitro-system/
Ali Saidi is a VP and Distinguished Engineer at AWS
Millions of customers use the AWS Nitro System to protect their most sensitive workloads, and AWS is an industry leader in innovation to secure customer data. Helping our customers keep their data secure and confidential is our highest priority, and we continue to make investments in purpose-built hardware and software for data isolation and protection.
In 2017, AWS launched the Nitro System, the first major cloud platform designed with zero operator access to customer data. The Nitro System is purpose-built hardware and software that provides the foundation for all modern Amazon EC2 instances, offloading virtualization, storage, and networking functions to dedicated hardware and a minimal hypervisor. With the Nitro System, even the most privileged AWS operators are only able to interact with the system via authenticated, audited administrative APIs that cannot access customer workloads. This architecture has set the industry standard for cloud security, and third parties like NCC Group have independently validated our approach.
Now, we’re raising the bar even further. One of the primary responsibilities of the AWS Nitro System is to isolate instances from each other and from AWS operators. This has been a cornerstone of the Nitro System architecture for over a decade. The AWS Nitro Isolation Engine, first announced at re:Invent 2025 and generally available on all Graviton5-based instances starting today, is a purpose-built component within the Nitro Hypervisor responsible for enforcing this isolation and proving it with mathematical precision. Nitro Isolation Engine uses formal verification, a technique to mathematically demonstrate that the hardware or software behaves as intended, and not only in specific test cases. This intensive verification technique establishes Nitro as the first formally verified cloud hypervisor, setting a new standard for mathematically proven cloud security.
AWS Nitro Isolation Engine
Within the Nitro System, the AWS Nitro Hypervisor is designed so that no unauthorized entity can read or modify customer data across all virtual machines. Nitro Isolation Engine is a purpose-built component of the Nitro Hypervisor that enforces isolation between these virtual machines. It mediates all access to virtual machine memory, CPU register state, and I/O devices through a minimal set of APIs that are exposed to the rest of the Nitro Hypervisor. It is the sole system component that mediates access to customer data. The remaining Nitro Hypervisor components must operate through this restricted interface and cannot access customer workloads directly. The Nitro Isolation Engine’s minimalist code base eases human audit, reduces scope for bugs, and makes it feasible to apply formal verification to its design and implementation.
Formal verification
Formal verification uses mathematical proof to demonstrate that properties of a formal model of a system hold true in all possible system states and over all possible inputs. This contrasts with testing, where a system’s behavior is checked against a (potentially large) subset of possible states and inputs. Formal verification provides far stronger evidence about correctness than traditional testing. In the case of Nitro Isolation Engine, our isolation properties are assured across all possible system behaviors. Testing and verification are complementary. Verification extends testing, and testing covers areas of the system not yet verified and builds an intuition that the system is behaving as intended.
For customers, formal verification of the code responsible for enforcing isolation provides assurance beyond comprehensive testing. Testing remains essential, and we maintain a high bar for it — but testing can only check specific scenarios. Formal verification is complementary: it means that isolation properties are mathematically assured across all possible scenarios, not just those covered by testing.
Formally verified properties
The formal verification of the Nitro Isolation Engine establishes four key properties:
1/ Confidentiality and Integrity – The Nitro Isolation Engine preserves the confidentiality and integrity of guest virtual machines (VM). Confidentiality means that a guest VM’s private data cannot be read by any unauthorized entity and Integrity means that a guest VM’s private data cannot be modified by any unauthorized entity.
2/ Functional Correctness – Every verified hypercall matches the expected behavior defined in the specification. The specification captures the preconditions and postconditions of each hypercall, and the proof establishes that the implementation never deviates from them.
3/ Absence of Runtime Errors – The code never encounters runtime errors and the implementation behaves as specified. Together, formal verification of these properties establishes mathematically rigorous assurance that the Nitro System maintains isolation for any sequence of events covered by the verification. Today, the verification covers the hypercalls for the core VM lifecycle responsible for bringing up, running, and tearing down a VM.
4/ Memory Safety – Establishes the absence of memory safety violations such as buffer overflows, NULL pointer dereferences, and out-of-bound access. As is the case for all verified software, the Nitro Isolation Engine proofs are subject to assumptions, such as the correctness of the Rust compiler and hardware. These assumptions and our approach to engineering and verification are detailed further in the Nitro Isolation Engine whitepaper.
Rust implementation
Nitro Isolation Engine is implemented in Rust, a systems programming language designed to prevent common programming pitfalls that have historically been the root cause of security vulnerabilities in sensitive software. The choice of Rust for the Nitro Isolation Engine eliminates entire classes of bugs by construction. What makes Rust a good fit is its type of system — it enforces a strong ownership discipline, which makes some aspects of formal verification easier and provides a first layer of assurance at compile time.
Conclusion
The Nitro Isolation Engine represents our continued commitment to keeping our customers’ data confidential. This is only the starting point. We will continue to extend formal verification across all major components of the Nitro Isolation Engine that impact security and maintain those proofs as new features are introduced. In addition, we plan to make the Nitro Isolation Engine’s source code and formal proofs available to third parties for independent inspection and review. We believe this level of transparency sets a new standard for how cloud providers can demonstrate openness, code quality, and formal verification.
To learn more about the AWS Nitro System and confidential computing, see the following resources:
- AWS Nitro Isolation Engine Whitepaper.
- Confidential computing: an AWS perspective (2021).
- AWS Nitro System gets independent affirmation of its confidential compute capabilities (2023).
- AWS Nitro Whitepaper.
- AWS re:Invent 2025 presentation – Introducing Nitro Isolation Engine: Transparency through Mathematics.
About the authors
Diagnose EKS Node Issues Faster with AWS DevOps Agent and Custom MCP
Post Syndicated from Shyam Kulkarni original https://aws.amazon.com/blogs/devops/diagnose-eks-node-issues-faster-with-aws-devops-agent-and-custom-mcp/
AWS DevOps Agent can investigate a growing range of production incidents autonomously. It diagnoses CrashLoopBackOff failures, traces ConfigMap deletions through audit logs, and correlates Amazon CloudWatch metrics with cluster events — all without human intervention.
But AWS DevOps Agent has a visibility boundary. When the data it needs lives outside its native integrations — on a node’s operating system, inside a third-party monitoring tool, behind a database’s internal diagnostics — the agent stalls. It can describe symptoms, but it can’t reach the evidence needed to identify root causes.
This post shows how to extend AWS DevOps Agent by building a custom Model Context Protocol (MCP) server that bridges that gap. Using a concrete example, we give AWS DevOps Agent structured access to Amazon EKS worker node diagnostics and explain how the same approach applies to data sources the agent can’t natively reach. By the end of this walkthrough, you will have a working MCP server that gives AWS DevOps Agent access to 20+ node-level log sources — providing autonomous investigation capabilities that can assist in root cause analysis compared to manual SSH sessions.
Prerequisites
Before you begin, make sure you have the following:
- An Amazon EKS cluster with AWS Systems Manager Agent (SSM Agent) running on the worker nodes (included by default on Amazon EKS optimized AMIs)
- Node.js v18 or later
- AWS CLI v2
- AWS CDK v2 installed and bootstrapped in your target account and Region
- An AWS account with permissions to create IAM roles, Lambda functions, and Amazon S3 buckets
- Familiarity with Amazon EKS, AWS Systems Manager, and the Model Context Protocol (MCP)
How AWS DevOps Agent discovers custom tools through MCP
MCP is an open standard that defines how AI agents discover and invoke external tools. AWS DevOps Agent supports connecting to custom MCP servers, which means you can expose new capabilities to it without modifying the agent itself. When you connect an MCP server to AWS DevOps Agent, the agent automatically discovers the available tools, understands their schemas, and calls them as part of its investigation workflow. You build and connect the MCP server — the agent handles the rest.
The extensibility model follows three steps: first, identify the data source that AWS DevOps Agent cannot natively access; second, build an MCP server that wraps safe, structured access to that data source; and third, connect the MCP server to AWS DevOps Agent so it can incorporate the new tools into its investigations.
Three design principles make this work. Return structured data, not raw text — pre-index findings with severity levels and stable IDs so the agent can filter, reference, and correlate them. Never give the agent a shell — mediate interactions through a controlled, auditable execution model. Make tools composable — design tool outputs to serve as inputs to other tools, creating a chain of evidence the agent can follow.
Why Amazon EKS node OS visibility matters
AWS DevOps Agent integrates with Amazon EKS to inspect pod status, read container logs, query CloudWatch Container Insights, and correlate cluster events. This covers application crashes, container-level resource exhaustion, and configuration drift.
However, EKS production issues with nodes originate in a layer these tools cannot reach: the node operating system. Artifacts such as iptables rules, full CNI configuration and IPAMD state, route tables, conntrack entries, dmesg kernel messages, containerd runtime logs, sysctl parameters, ENI metadata, and the unfiltered kubelet journal exist exclusively on the node. These artifacts are the primary evidence for diagnosing IP allocation failures, DNS resolution issues, network policy enforcement problems, storage mount timeouts, and node registration failures.
Integrating AWS DevOps Agent with an EKS node diagnostics MCP server
The sample-eks-node-diagnostics-mcp repository (sample-eks-node-diagnostics-mcp repository) demonstrates this pattern. It provides an MCP server that gives AWS DevOps Agent structured access to node-level diagnostic data, backed by AWS Systems Manager (SSM) Automation for safe, auditable execution.
How it works
Figure 1: End-to-end architecture of the EKS Node Diagnostics MCP server. AWS DevOps Agent discovers and invokes 19 tools through AgentCore Gateway, which dispatches SSM Automation runbooks to worker nodes for log collection and uploads results to Amazon S3 for extraction and indexing.
- AWS DevOps Agent calls a collect tool with an instance ID.
- The MCP server dispatches an SSM Automation execution to the target node, running the AWS-managed AWSSupport-CollectEKSInstanceLogs runbook.
- The runbook collects 20+ log sources — kubelet, containerd, iptables, CNI config, route tables, dmesg, sysctl, ENI metadata, IPAMD logs, and more — packages them into an archive, and uploads it to an Amazon S3 bucket where you configure AWS KMS encryption.
- A processing pipeline extracts the archive, pre-indexes errors with severity classification and stable finding IDs, and provides the results to you through additional MCP tools.
The server exposes tools for log collection, pre-indexed error retrieval, cross-file search and correlation, structured network diagnostics, and live packet capture. A typical agent workflow chains these together: collect → status → errors → search → correlate → read → summarize, with each step producing outputs that feed into the next.
AWS DevOps Agent does not get a shell on the node. Every interaction is mediated by SSM Automation — an auditable, IAM-controlled, non-interactive execution model.
Connecting through Amazon Bedrock AgentCore Gateway
The reference implementation uses Amazon Bedrock AgentCore Gateway to expose the Lambda-backed MCP server to AWS DevOps Agent. AgentCore Gateway converts Lambda functions into MCP-compatible tools and handles authentication, protocol translation, and tool discovery through a single managed endpoint.
The integration follows three steps:
Step 1: Create an OAuth authorizer with Amazon Cognito. The CDK stack provisions a Cognito User Pool configured for the OAuth 2.0 client credentials flow. This secures inbound access to the gateway — only clients with valid tokens can invoke tools.
Step 2: Create a gateway and register the Lambda as a target. Register the Lambda function that handles tool invocations as a target on the gateway. AgentCore Gateway automatically discovers the tool schemas from the Lambda and makes them available through the MCP protocol. The gateway endpoint becomes the single MCP URL for AWS DevOps Agent.
Step 3: Connect AWS DevOps Agent. Register the MCP server at the account level in the AWS DevOps Agent console, providing the gateway URL and OAuth configuration. Then allowlist the specific tools each Agent Space needs. AWS DevOps Agent authenticates by obtaining a JWT from the Cognito token endpoint using the client credentials grant and passes it as a Bearer token in requests to the gateway URL.
Deploying the MCP server
Deploy the entire stack using AWS CDK :
git clone https://github.com/aws-samples/sample-eks-node-diagnostics-mcp.git
cd sample-eks-node-diagnostics-mcp
chmod +x deploy.sh
./deploy.sh
The script walks you through cluster selection and node role configuration. Have the following ready before running the script: your target EKS cluster name, the IAM role ARN you attached to your worker nodes, and the AWS Region where your cluster runs. The script outputs your MCP gateway URL, OAuth credentials, and token endpoint — everything you need to configure the connection in AWS DevOps Agent. See the repository README for detailed deployment instructions, CI/CD mode, and prerequisite details.
Seeing it in action
To demonstrate the MCP server’s capabilities, we walk through a realistic node-level failure scenario on a test EKS cluster. We manually inject a fault that blocks pod DNS resolution at the iptables level — an issue that is invisible from kubectl since pods appear Running — then show how AWS DevOps Agent investigates and identifies the root cause using the MCP server’s tools.
Setting up the scenario
Start with an EKS cluster that has a managed node group with SSM Agent running (included by default on Amazon EKS optimized AMIs). Deploy a sample workload to one of the nodes:
kubectl create namespace demo-app
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-frontend
namespace: demo-app
spec:
replicas: 3
selector:
matchLabels:
app: web-frontend
template:
metadata:
labels:
app: web-frontend
spec:
containers:
- name: nginx
image: nginx:latest
ports:
- containerPort: 80
EOF
Identify the node and instance ID where the pods are running:
kubectl get pods -n demo-app -o wide
Injecting the fault
WARNING: The following commands will disrupt DNS resolution for all pods on the target node. Only run these in a non-production test environment. Do not execute on production nodes.
Connect to the target node using SSM Session Manager and run the following commands to block pod DNS traffic at the iptables level. This simulates a subtle networking issue – pods continue running but can’t resolve DNS, and the root cause is only visible in the node’s iptables rules:
# Block pod traffic to kube-dns ClusterIP — pods run but DNS fails
# Only affects FORWARD chain (pod traffic), not the node's own DNS
sudo iptables -I FORWARD -d 10.100.0.10/32 -p udp --dport 53 -j DROP
sudo iptables -I FORWARD -d 10.100.0.10/32 -p tcp --dport 53 -j DROP
Replace 10.100.0.10 with your cluster’s kube-dns ClusterIP (kubectl get svc kube-dns -n kube-system -o jsonpath=’{.spec.clusterIP}’).
This fault is particularly insidious because kubectl get pods shows all pods in Running state. The applications fail with DNS resolution errors, but there is no Kubernetes event or pod status that points to the cause. The iptables DROP rules targeting the kube-dns ClusterIP exist only in the node’s firewall configuration — a layer that no Kubernetes API call can inspect.
Investigating with AWS DevOps Agent
An engineer notices applications reporting DNS failures and asks AWS DevOps Agent to investigate:
“Pods on node i-xxxxxxxxxx in cluster EKS-sample (us-east-1) are running but applications report DNS resolution failures. Collect the node logs and investigate.”
Figure 2: Starting an investigation in AWS DevOps Agent. The engineer provides the symptom description and incident timestamp, and the agent autonomously plans and executes the investigation.
AWS DevOps Agent begins the investigation by recording the symptom and launching two parallel actions: collecting node logs via the nodelog_collect tool and checking cluster health. The cluster health check confirms all four nodes are running and SSM-online. The agent then polls the log collection status, tracking progress from 25% through 75% to completion. Once collection finishes, the agent fans out into parallel workstreams — running network diagnostics, performing quick triage, and collecting logs from a healthy node for comparison.
Figure 3: Investigation timeline showing the initial data collection phase. The agent identifies the symptom, confirms cluster health, collects node logs via SSM Automation, polls for completion, and launches parallel diagnostic workstreams.
With the initial data collected, the agent launches four parallel investigation tasks to maximize coverage and minimize time-to-root-cause: (1) deep-dive-iptables-routes examines the node’s firewall rules and routing table in detail, completing in 1 minute 44 seconds across 8 tool calls; (2) search-network-errors scans the collected logs for network-related error patterns, running 15 tool calls over 7 minutes 51 seconds; (3) collect-healthy-node gathers the same diagnostics from a known-good node for comparison, taking 13 tool calls over 4 minutes 55 seconds; (4) check-oom-and-pod-status investigates kernel OOM kills and pod health, executing 19 tool calls over 8 minutes 12 seconds. Each task produces a structured report that feeds into the final synthesis.
Figure 4: Parallel investigation phase. The agent runs four concurrent deep-dive tasks — iptables/route analysis, network error search, healthy node comparison, and OOM/pod status check — then synthesizes the findings into a unified report.
The iptables and route table deep-dive reveals the root cause. The agent identifies two CRITICAL findings: a FAULT-INJECT-DROP-POD-TO-POD rule in the FORWARD chain that drops inter-pod traffic, and a FAULT-INJECT-DROP-SERVICE-CIDR rule that drops forwarded traffic to the service CIDR range. It also flags a MEDIUM-severity finding — a blackhole route for 10.96.0.0/12 (the Kubernetes service CIDR) that does not exist on healthy nodes. The remaining checks come back normal: kube-proxy chains are intact, AWS VPC CNI SNAT/CONNMARK chains are properly configured, and the default gateway and ENI route tables are correct. This structured severity classification allows the agent to immediately focus on the critical items.
Figure 5: Deep-dive findings from the iptables and route table analysis. Two CRITICAL fault-injection DROP rules in the FORWARD chain are identified as the primary issue, while standard networking components — kube-proxy, VPC CNI, and routing — check normal.
The healthy node comparison confirms the diagnosis. The agent compares the unhealthy node against a known-good node across seven dimensions: security groups, ENI count, DNS configuration, iptables rules, route tables, conntrack entries, and IPAMD state. The key differences are definitive: the blackhole route for 10.96.0.0/12 exists only on the unhealthy node, kubelet API server timeout errors appear only on the unhealthy node, conntrack entries are 12x higher (1,962 vs 169), and IPAMD reconciliation errors are 5x more frequent. The iptables FORWARD chain counters show 2.4 billion packets processed on the unhealthy node versus zero on the freshly-started healthy node — confirming sustained traffic disruption.
Figure 6: Healthy node comparison confirming the diagnosis. The agent compares diagnostics across both nodes and identifies five key differences — the blackhole route, elevated conntrack entries, and high FORWARD chain packet counts exist only on the affected node.
The agent synthesizes the findings into a definitive root cause determination. It identifies a fault-injection namespace on the EKS cluster that is running chaos experiments, introducing three specific network-disrupting modifications on the target node: (1) a FAULT-INJECT-DROP-POD-TO-POD iptables rule in the FORWARD chain that drops inter-pod traffic, (2) a FAULT-INJECT-DROP-SERVICE-CIDR rule that drops forwarded traffic to the Kubernetes service CIDR, and (3) a blackhole route for 10.96.0.0/12 that does not exist on healthy nodes. Together, these three modifications create a multi-vector network disruption — pods appear Running but cannot communicate with each other or reach Kubernetes services, including kube-dns.
Figure 7: Root cause determination. The agent traces the multi-vector network disruption to three fault-injection modifications — two iptables DROP rules and a blackhole route — deployed by a chaos experiment namespace on the target node.
Cleaning up the fault
To restore the node after the demo, connect via SSM Session Manager and run:
sudo iptables -D FORWARD -d 10.100.0.10/32 -p udp --dport 53 -j DROP
sudo iptables -D FORWARD -d 10.100.0.10/32 -p tcp --dport 53 -j DROP
Extending this pattern to other data sources
The EKS node diagnostics use case demonstrates the pattern, but the architecture generalizes to systems where the SSM Agent is running and you can define an SSM Automation runbook to collect the data you need.
For example, an EC2 instance with SSM Agent can use this same approach — collect OS-level logs, network configuration, package state, or application diagnostics through a custom or pre-built SSM Automation runbook, upload results to S3, and expose them through MCP tools. The same applies to ECS container instances (Docker daemon logs, ECS agent state, iptables), on-premises servers registered via SSM Hybrid Activations, or managed nodes in your fleet.
The pattern also extends beyond SSM-managed hosts. Network devices can be reached through API calls to their management planes, databases through read-only diagnostic queries, and third-party APM tools through vendor API integrations. In each case, the same three-step approach holds: identify the unreachable data, build an MCP server that wraps safe access to it, and connect it to AWS DevOps Agent.
When to use this approach
This pattern works well for incident response where diagnostic data lives outside AWS DevOps Agent’s native reach, fleet-wide triage where manual access to individual systems is impractical, and cross-source correlation where evidence spans multiple log sources.
It is not a replacement for continuous monitoring (use CloudWatch Container Insights or Prometheus for real-time alerting), log shipping (if you have compliance requirements for continuous retention), or native integrations where the agent already has access to the data source.
The reference implementation requires SSM Agent running on the nodes with appropriate IAM permissions. It is a proof of concept — validate it in non-production environments before using it with production workloads.
Clean up
Cost considerations: This solution uses AWS Lambda, Amazon S3, AWS KMS, Amazon Cognito, and Amazon Bedrock AgentCore Gateway. Costs vary based on usage. Lambda charges apply per invocation and duration. S3 charges apply for log storage. KMS charges a per-key monthly fee plus per-request charges. Cognito charges per monthly active user. AgentCore Gateway pricing is based on API calls. For current pricing details, see the AWS Pricing page for each service. To minimize costs during evaluation, delete the stack when not in use.
Remove the deployed resources by running cdk destroy from the repository root. The S3 log bucket uses a RETAIN removal policy — delete it manually after stack destruction if needed.
Conclusion
MCP provides a standardized extensibility mechanism that lets you bridge visibility gaps in AWS DevOps Agent without modifying the agent itself. The pattern is straightforward: identify the unreachable data source, build an MCP server that wraps safe and structured access to it, and connect it to AWS DevOps Agent through Amazon Bedrock AgentCore Gateway. The agent handles the reasoning. The MCP server handles the data access.
To get started:
- Deploy the reference implementation (sample-eks-node-diagnostics-mcp repository) in a non-production environment.
- Review the MCP specification (MCP specification).
- Explore the Amazon EKS troubleshooting documentation (Amazon EKS troubleshooting documentation).
- Connect custom MCP servers to AWS DevOps Agent — see the Connecting MCP Servers guide in the AWS DevOps Agent documentation.
- Set up AgentCore Gateway — see the Amazon Bedrock AgentCore Gateway quick start guide.
About the author
ASRock Industrial NUC BOX-358H Mini PC Review
Post Syndicated from Ryan Smith original https://www.servethehome.com/asrock-industrial-nuc-box-358h-intel-panther-lake-mini-pc-review/
The ASRock Industrial NUC BOX-358H brings with it Intel’s latest-generation Panther Lake platform, marking a big change for the mini PC line
The post ASRock Industrial NUC BOX-358H Mini PC Review appeared first on ServeTheHome.
Build RAG-powered AI solutions at the edge with AWS Local Zones and Outposts
Post Syndicated from Fernando Galves original https://aws.amazon.com/blogs/compute/build-rag-powered-ai-solutions-at-the-edge-with-aws-local-zones-and-outposts/
Organizations in regulated industries or with strict information security requirements are increasingly looking to use generative AI. However, they often face a dilemma: how to utilize powerful models while keeping data strictly on-premises or within specific geographic boundaries. The solution lies in deploying self-managed Small Language Models (SLMs) on premises with AWS Outposts or in adjacent metros using AWS Local Zones.
SLMs can achieve accuracy comparable to large models for specific, well-scoped use cases. However, all language models suffer from a knowledge gap: their internal knowledge is static, probabilistic, and often outdated. This challenge is acute for SLMs, which have significantly smaller parametric memory than Large Language Models (LLMs). To equip an SLM to perform accurately in an enterprise context, it must be supported by an architecture that provides fresh, governed facts.
This is achieved through Retrieval-Augmented Generation (RAG). RAG is not merely an extension; it is the architectural pattern that bridges the gap between a model’s frozen memory and your dynamic enterprise data.
This post provides a solution template for deploying an SLM augmented with RAG. This architecture allows the model to perform accurately while offering enhanced Total Cost of Ownership (TCO) because of reduced size and latency. To address data residency and InfoSec needs, we provide guidance on deploying this solution entirely within AWS Local Zones and AWS Outposts.
Solution overview
To demonstrate this architecture, we present a Chatbot application designed to answer detailed technical questions regarding AWS Hybrid Edge products (specifically AWS Local Zones and AWS Outposts) to a level 200-300 knowledge depth.
A chatbot was selected as it represents the most common use case requested by AWS customers. The technical domain demonstrates the system’s ability to handle complex, specific queries. This solution provides enterprises with full control over the foundation model, including its operating location, configuration, and the security of confidential data.
Infrastructure components
The solution runs on four EC2 instances deployed on AWS Outposts or in an AWS Local Zone, each serving a distinct role in the RAG pipeline:
| Component | Instance Type | Role |
| Vector Embeddings Service |
g4dn or G7e (GPU)a/b Note:
|
Encodes documents and queries into dense vector representations using BAAI/bge-large-en-v1.5 1 |
| Reranking Service |
g4dn or G7e (GPU)a/b Note
|
Re-scores candidate chunks for contextual relevance using BAAI/bge-reranker-large 1 |
| Milvus Vector Database |
m5.xlarge Note : Check current instance availability for your Local Zone or Outposts deployment |
Stores and retrieves vector embeddings via high-dimensional similarity search |
| Small Language Model |
See companion blog https://aws.amazon.com/blogs/compute/running-and-optimizing-small-language-models-on-premises-and-at-the-edge/ |
Generates grounded responses from retrieved context |
All instances use the Deep Learning Base OSS Nvidia Driver GPU AMI (Amazon Linux 2023) for GPU workloads and Amazon Linux 2023 for the database instance. For instructions on setting up the SLM with Llama.cpp, refer to the companion post: Running and optimizing small language models on-premises and at the edge.

Figure 1. Elements of the chatbot
Why RAG matters for SLMs
RAG optimizes model output by referencing an authoritative knowledge base outside of its training data before generating a response. By offloading knowledge to a vector database, we allow the SLM to focus on reasoning and syntax, significantly reducing hallucinations and providing end-to-end traceability for every answer.
Architecture overview
The RAG workflow operates through a seven-stage pipeline designed so that data never leaves your controlled environment.

Figure 2. Architecture overview
- Prompt: Users submit questions to the generative AI application.
- Embedding: The application forwards the query to the vector embeddings application to generate a dense vector representation.
- Retrieval: The system searches for relevant information in the Milvus vector database, which securely stores proprietary data within the AWS Outposts environment.
- Architectural Note: This blog demonstrates a dense retrieval pipeline. However, production enterprise systems often combine this with sparse retrieval (Keyword/BM25) to create a hybrid retrieval pattern. This helps make sure that exact-match for identifiers like error codes or product SKUs are retrieved reliably, since dense embeddings alone can struggle to distinguish rare tokens.
- Reranking: The reranking application receives the initial candidate list (top K) and evaluates the chunks to identify the most contextually relevant information.
- Context construction: The prompt and the optimized set of chunks are sent to the SLM.
- Generation: The SLM processes the question and generates the response.
- Response: The final answer is returned to the user, augmented with citations, without sensitive data leaving the on-premises environment.
This design makes sure all components operate within organizational boundaries while delivering advanced AI capabilities using infrastructure deployed entirely on AWS Local Zones or Outposts.
Solution deployment
The following instructions detail how to deploy this RAG environment on AWS Outposts or Local Zones. The solution uses a range of models but these are changeable as new models come into popularity.
Prerequisites
- Deployed AWS Outposts or access to AWS Local Zones in your region.
- Two g4dn EC2 instances deployed with Deep Learning Base OSS Nvidia Driver GPU AMI (Amazon Linux 2023).
- One m5.xlarge EC2 instance deployed with Amazon Linux 2023.
- One EC2 instance running the SLM. (For instructions on setting up the SLM with Llama.cpp, refer to the blog post: Running and optimizing small language models on-premises and at the edge)
- Verify that you have installed the necessary libraries:
pip install sentence-transformers==3.4.1 pymilvus==2.5.8.
Vector embeddings configuration
Vector embeddings are the foundation of the RAG system. Selecting the right model requires balancing dimension size, latency, and accuracy. In this post, we use the BAAI/bge-large-en-v1.5 model to encode proprietary data and user queries.
Strategic chunking
Before embedding, proprietary documents must be split into chunks. If chunks are too large, they waste the SLM’s limited context window; if too small, they lack the context needed for reasoning. For this solution, we recommend recursive character chunking as a baseline. Configure your ingestion pipeline to create chunks of 600–800 tokens with a 10–15% overlap. This makes sure that concepts don’t get cut off mid-sentence and that the SLM receives coherent “units of evidence” rather than fragmented text.
Vector database configuration and optimization
Once vector embeddings are generated based on the data provided, a specialized database is required for efficient storage and similarity search operations. Milvus will be deployed for this RAG architecture. It is an open-source vector database optimized for high-dimensional similarity search at scale while maintaining low query latency. You can follow the instructions available in the Run Milvus in Docker (Linux) section on the Milvus website. The following Python snippet demonstrates how to create a collection schema in the Milvus database:
We use baseline HNSW parameters here; production deployments should tune M and efConstruction based on recall requirements.
Reranking implementation and configuration
A reranking step significantly improves retrieval quality by re-scoring initial vector search results with a cross-encoder model. The BAAI/bge-reranker-large model compares query-document pairs directly, providing more accurate relevance assessment than initial embedding similarity alone. The following Python snippet outlines a conceptual reranking application:
Performance optimization with reranking
While RAG efficiency enhances generative AI responses with relevant context, vector similarity search limitations can be challenging when deploying RAG at the edge. An additional consideration is that the context size of the prompt expands significantly adding to the latency of the SLM to generate the response, as it processes the larger prompt. One solution can be to perform a complex semantic search taking time. The alternative approach is to use a reranker to refine the output of the search, prioritizing the most contextually relevant chunks before they reach the SLM.

Figure 3. RAG without reranking
As illustrated, initial retrievals identify potentially relevant chunks with scores ranging from 0.7614 to 0.5422. When these chunks contain genuinely relevant information, they provide the SLM with the precise context needed for accurate and insightful responses. In this example, using a 50% similarity filter threshold, all five chunks qualify and are sent to the SLM model.
However, in cases when there are less relevant chunks in the list with scores above the filter, processing them can introduce inefficiencies in the SLM. By identifying and filtering these less valuable chunks from the SLM input, you can improve resource allocation and processing efficiency. This selective approach prevents the model from wasting computational resources on information that contributes minimally to response quality, focusing instead on the most informative content that enhances the generated answers.

Figure 4. RAG with reranking
Figure 4 shows implementing a reranking process effectively identifies and prioritizes the relevant chunks to be sent to the SLM. The reranker transforms the compressed similarity scores into a highly separated spectrum. It elevates the most relevant chunk to 0.9906 while downgrading less relevant content to scores as low as 0.0044. This clear separation enables the 50% threshold filter to automatically select only the single most valuable chunk to be sent to the SLM, eliminating four unnecessary chunks from processing.
Sending only high-relevance chunks to the SLM delivers dual benefits that improve RAG performance. Technical improvements materialize through reduced token processing, faster inference, and lower GPU memory consumption while response quality increases as the model focuses exclusively on meaningful information. This optimization maximizes the GPU investments while delivering superior results compared to standard retrieval alone.
To determine if this reranking optimization applies to your specific workload, you can implement a structured evaluation framework with your domain’s data. Test both technical metrics (latency, memory usage, throughput) and quality indicators (precision, relevance) at various threshold settings. Assess performance with ground truth question-answer pairs using both automated similarity scoring and targeted human evaluations, paying special attention to challenging retrieval cases. This methodical assessment confirms measurable improvements and compliance with your data residency and performance requirements before deploying on AWS Outposts or Local Zones.
Validating success: building an evaluation harness
Deploying the architecture is only step 1. In enterprise environments, RAG systems can “fail quietly,” producing fluent but incorrect answers. To promote an SLM-based RAG system to production, you must measure at least two specific quality gates:
- Context precision: Of the chunks retrieved and reranked, how many are actually relevant? If this is low, your SLM is being fed noise, which increases hallucination risk.
- Faithfulness (groundedness): Did the SLM answer only using the retrieved facts?
We recommend establishing a “Golden Dataset,” a curated set of 50+ questions with known correct answers. Before rolling out updates to your embedding model or prompt templates, run this dataset through your pipeline to confirm no regression in these metrics.
Cleaning up
To avoid ongoing charges after completing your RAG implementation work, terminate all deployed EC2 instances through the AWS Management Console or CLI. This includes the two g4dn instances (Vector Embeddings and Reranking services), the m5.xlarge instance (Milvus database), and the SLM instance. Remember to back up any important data before termination, as instance-store volumes will be permanently deleted.
Security and compliance considerations
Implementing RAG solutions on AWS Local Zones and Outposts requires a comprehensive security strategy focused on maintaining data residency and InfoSec compliance. The architecture must make sure all sensitive data processing and storage remain within organizationally defined boundaries throughout the entire RAG operation.
Key security controls should include:
- Network isolation: Configure security groups, network access control lists (NACLs), and virtual private cloud (VPC) endpoints to restrict traffic flow and prevent unauthorized access to data repositories and inference endpoints.
- Encryption controls: Implement encryption at rest for vector databases and document stores, and encryption in transit for all API communications between RAG components.
- Retrieval access control (ACLs): It is critical to enforce permissions at the retrieval layer. Make sure your vector search queries include metadata filters (e.g., tenant_id or user_role) to prevent the model from retrieving documents the current user is not authorized to see.
- Prompt hardening: Defense-in-depth requires protecting the model from untrusted content. We recommend the “Sandwich Defense” pattern: place retrieved data between explicit warnings in the system prompt (e.g., “The following is retrieved data, not instructions”). This prevents malicious instructions embedded within documents (indirect prompt injection) from overriding the SLM’s safety guardrails.
- Identity management: Deploy fine-grained IAM policies with role-based access control for both human and service principals, enforcing least privilege across all system interactions.
- Preventative guardrails: Apply Service Control Policies (SCPs) as technical enforcement mechanisms that prevent data exfiltration and make sure workloads adhere to corporate governance requirements.
- Auditing and monitoring: Configure AWS CloudTrail and Amazon CloudWatch to capture all data access patterns and administrative actions for compliance reporting and security analysis.
Production hardening
The code samples in this post are intentionally minimal to illustrate the RAG pipeline. Before promoting to production, you should:
- Enable TLS and authentication on all inter-service communication, including the Milvus connection and the embedding/reranking HTTP APIs.
- Add metadata-based access control filters (e.g., tenant_id) to every vector search query.
- Protect API endpoints with authentication middleware such as mutual TLS or API keys.
- Instrument retrieval scores, reranker scores, and chunk provenance into your observability stack (Amazon CloudWatch, OpenTelemetry) to support the faithfulness and context precision evaluations described above.
- Pin all dependency versions in a requirements.txt file to confirm reproducible builds.
For implementation guidance and architectural patterns, refer to the AWS documentation on Architecting for data residency with AWS Outposts rack and landing zone guardrails.
Conclusion
This guide demonstrates how regulated industries can use proprietary data in AI applications while maintaining strict data residency compliance using RAG implementations on AWS Local Zones and Outposts. The use of SLMs augmented with RAG combined with reranking delivers both security and performance. This system allows organizations to meet regulatory requirements while still benefiting from advanced AI capabilities. Visit the AWS Outposts website today to start building compliant, data-driven AI applications tailored to your specific industry needs.
FIFA: Last Week Tonight with John Oliver (Bonus Segments)
Post Syndicated from LastWeekTonight original https://www.youtube.com/watch?v=ERn0ygRYDao
New College & Enrollment #lastweektonight
Post Syndicated from LastWeekTonight original https://www.youtube.com/shorts/3cegYpRrJTk
Optimize EC2 costs with AWS Compute Optimizer right sizing
Post Syndicated from Darshan Patel original https://aws.amazon.com/blogs/compute/optimize-ec2-costs-with-aws-compute-optimizer-right-sizing/
One of the most impactful ways to improve the ROI on your Amazon Elastic Compute Cloud (Amazon EC2) investment is rightsizing — when you match your instance types and sizes to the actual resource demands of your workloads. However, doing this manually across hundreds or thousands of instances is time-consuming and error-prone. AWS Compute Optimizer analyzes your AWS resources’ configuration and utilization metrics to provide rightsizing recommendations designed to help you identify opportunities to reduce cost while helping to maintain performance and capacity requirements.
In this post, we walk you through how to evaluate AWS Compute Optimizer’s EC2 rightsizing recommendations, configure recommendation preferences that align with your organization’s priorities, enrich recommendations with memory utilization data, and assess Graviton-based alternatives — all to help you make more informed, data-driven rightsizing decisions.
Prerequisites
To follow along with the best practices in this post, you need:
- An AWS account with access to AWS Compute Optimizer
- At least one running EC2 instance with 30+ hours of Amazon CloudWatch metric data in the past 14 days
Optional (for enhanced recommendations):
- AWS Cost Optimization Hub enabled for after-discount savings visibility (see best practice 1)
The challenge: balancing cost and performance at scale
Most organizations don’t have clear insights into the best performance-cost ratio for their EC2 instances — leading to overprovisioning and wasted spend on one side, or undersized instances and degraded user experience on the other. The key questions engineering and FinOps teams face are:
- Which instances are oversized? Where are we paying for capacity we don’t use?
- Which instances are undersized? Where are we risking performance degradation?
- What’s the right trade-off? How do we optimize cost without introducing performance risk?
AWS Compute Optimizer analyzes up to 93 days of utilization data from Amazon CloudWatch and delivers recommendations classified by savings opportunity and performance risk to help you address these questions.
How Compute Optimizer evaluates EC2 instances
Compute Optimizer analyzes the following CloudWatch metrics for your EC2 instances, with recommendations refreshed daily:
- CPU utilization — the percentage of allocated EC2 compute units in use on the instance. Metric:
CPUUtilization - Memory utilization — the percentage of memory in use during the sample period (when enabled — see below). Metric:
MemoryUtilization - Network I/O — the volume of incoming/outgoing traffic and packets on all network interfaces. Metrics:
NetworkIn,NetworkOut,NetworkPacketsIn,NetworkPacketsOut - Disk I/O — read/write operations and throughput for instance store volumes. Metrics:
DiskReadOps,DiskWriteOps,DiskReadBytes,DiskWriteBytes - EBS throughput and IOPS — read/write throughput and operations for attached EBS volumes. Metrics:
VolumeReadBytes,VolumeWriteBytes,VolumeReadOps,VolumeWriteOps - GPU utilization — the percentage of allocated GPUs in use, GPU memory usage, and active encoder sessions (when enabled via the CloudWatch Agent with NVIDIA GPU metrics). Metrics:
GPUUtilization,GPUMemoryUtilization,GPUEncoderStatsSessionCount
Based on these metrics, Compute Optimizer classifies each instance as:
| Finding | Meaning |
| Over-provisioned | Instance resources exceed workload needs — downsize opportunity |
| Under-provisioned | Workload demands exceed instance capacity — performance risk |
| Optimized | Current instance is well-matched to workload requirements |
| Idle | Instance has very low utilization — candidate for termination or consolidation (shown on a dedicated Idle Resource Recommendations page; criteria: peak CPU below 5% and network I/O under 5 MB/day over the 14-day lookback period; GPU instances (G/P families) have additional GPU-specific idle criteria) |
When AWS Cost Optimization Hub is enabled, Compute Optimizer factors in your existing pricing commitments (AWS Savings Plans, Reserved Instances and other specific pricing discounts) when generating savings estimates — see Best practice 1 below for details.
For each finding, Compute Optimizer lists up to three optimization recommendations for a specific instance, ranked by estimated savings, performance risk, and migration effort.
Note: While this post focuses on EC2 instance rightsizing, Compute Optimizer also generates recommendations for Amazon EC2 Auto Scaling groups (including mixed instance types and scaling policies), Amazon Elastic Block Store (Amazon EBS) volumes, AWS Lambda functions, Amazon Elastic Container Service (Amazon ECS) services on AWS Fargate, commercial software licenses, and Amazon Aurora/Amazon Relational Database Service (Amazon RDS) databases. Idle resource detection extends further — covering EC2 instances, Auto Scaling groups, EBS volumes, ECS on Fargate, Aurora/RDS, and NAT Gateways. For the full list of supported resources, see Supported resources.
Evaluating recommendations in the console
In the Compute Optimizer console, navigate to EC2 Instances and select any instance to view its detail page. From here you can:
- Compare utilization metrics — View side-by-side graphs showing how your current instance’s CPU, memory, network, and disk metrics map to the recommended instance’s capacity.
- Review estimated savings — See projected monthly cost savings for each recommended option. With AWS Cost Optimization Hub enabled, savings reflect your actual pricing discounts rather than On-Demand rates (see Best practice 1).
- Assess performance risk — Understand the likelihood that switching to the recommended instance may result in resource contention.
- Evaluate migration effort — Compute Optimizer rates each recommendation from Very low to High based on CPU architecture compatibility and inferred workload type. Same architecture is Very low effort; AWS Graviton (ARM64) recommendation with a known compatible workload (for example, Amazon EMR) is Low; Graviton with an unidentified workload is Medium; and a different architecture with no known compatible version is High effort.
- Toggle CPU architecture preferences — Use the architecture drop-down to compare x86-based recommendations against AWS Graviton (ARM64) alternatives for additional price-performance improvements.
Best practice 1: Enable Cost Optimization Hub for after-discount savings
Why this matters: Enabling Cost Optimization Hub gives Compute Optimizer visibility into your Savings Plans, Reserved Instances, and other pricing discounts — so every recommendation reflects what you would actually save given your existing commitments. This is especially valuable for organizations with significant discount coverage, where On-Demand savings estimates may be significantly higher than what you would actually realize after accounting for existing commitments.
When you enable Cost Optimization Hub, Compute Optimizer automatically switches to AfterDiscounts mode and uses your organization-specific pricing discounts to generate recommendations. The console then displays two savings columns — Estimated monthly savings (after discounts) and Estimated monthly savings (On-Demand) — giving you both views side by side. To enable Cost Optimization Hub for your organization, see Getting started with Cost Optimization Hub. The savings estimation mode preference allows Compute Optimizer to analyze specific pricing discounts when generating the estimated cost savings of rightsizing recommendations. You can verify or override the savings estimation mode under Preferences > Savings estimation mode in the Compute Optimizer console. See Savings estimation mode for details.
Best practice 2: Enable memory metrics for accurate recommendations
Why this matters: Memory utilization is not collected by default in CloudWatch. By enabling it, you give Compute Optimizer a complete picture of your workload — CPU, network, disk, and memory together. This is especially valuable for memory-intensive workloads (databases, caching layers, JVM-based applications), where memory is often the critical sizing factor. With full visibility, Compute Optimizer can factor memory needs into every recommendation, resulting in higher-confidence suggestions that your teams can implement with greater assurance.
Option A: CloudWatch Agent
Deploy the unified CloudWatch Agent on your instances to publish memory utilization metrics. Compute Optimizer automatically incorporates these metrics once they’re available in CloudWatch.
Note: Collecting memory metrics with the CloudWatch Agent incurs charges. See Amazon CloudWatch Pricing.
Key steps:
- Install the CloudWatch Agent via AWS Systems Manager or manually.
- Configure the agent to collect memory metrics.
- Verify metrics appear in CloudWatch.
- Allow up to 24 hours for Compute Optimizer to incorporate the new data.
Option B: External metrics ingestion
If your organization uses a third-party observability platform, Compute Optimizer supports ingesting EC2 memory utilization metrics from:
- Datadog
- Dynatrace
- Instana
- New Relic
When external metrics ingestion is enabled, Compute Optimizer analyzes external memory data alongside native CloudWatch metrics to generate enhanced recommendations.
Learn more: Configuring external metrics ingestion
Best practice 3: Configure rightsizing preferences to match your strategy
Why this matters: Compute Optimizer’s defaults — P99.5 threshold (sizes instances to handle 99.5% of observed CPU peaks), 20% headroom (adds a 20% capacity buffer above those peaks for future growth), and 14-day lookback — work well for many workloads. Customizing these preferences lets you go further — extending the lookback to 32 or 93 days captures monthly or seasonal patterns for even more accurate recommendations, while adjusting headroom and threshold lets you fine-tune the balance between savings and performance for each environment. The result: recommendations tailored to your actual risk tolerance and workload patterns, producing suggestions your teams will trust and confidently implement.
Compute Optimizer supports configurable rightsizing preferences that tailor recommendations to your workload requirements. Preferences can be set at the organization level (applies to all member accounts in your AWS Organizations), account level (applies to a specific account — useful when production and dev/test accounts need different settings), or regional level (applies within a specific region — useful when workloads differ across regions). This hierarchy lets you set conservative defaults org-wide and override for specific accounts or regions that need different treatment.
Key preference options include:
| Preference | Description | When to use |
| CPU utilization threshold | Before generating recommendations, Compute Optimizer filters your CPU data through this percentile. Think of it as a noise filter: P99.5 (default) keeps 99.5% of your data and only discards the rarest 0.5% of spikes — so the recommendation is sized to handle almost every peak you’ve ever seen. P90 discards the top 10% of spikes, treating them as anomalies, and produces smaller (cheaper) recommendations. Options: P90, P95, P99.5 | Use P99.5 for production where you can’t afford to miss peaks; P90 for dev/test where occasional spikes from deployments or one-off events are acceptable to ignore |
| CPU utilization headroom | After Compute Optimizer determines the right instance size based on your historical peaks, it adds this percentage as a safety cushion for future growth. For example: if your analyzed peak needs 60% of an instance’s CPU, a 20% headroom means the recommended instance will still have 20 percentage points of spare capacity above that peak — room to grow without needing another resize. Options: 30%, 20% (default), 0% | Use 30% for workloads with unpredictable or growing traffic; 20% for typical production; 0% for steady-state workloads where you want maximum savings and accept a tight fit |
| Memory utilization headroom | Added memory capacity buffer (30%, 20%, or 10%) above analyzed usage to accommodate future increases. Default is 20% | Use 30% for memory-sensitive workloads; 10% for steady-state where you want maximum savings |
| Lookback period | Choose 14 days (default, no additional charge), 32 days (no additional charge), or 93 days (requires Enhanced Infrastructure Metrics (EIM), a paid feature). You can enable EIM at the organization, account, or individual resource level — useful for activating it only on production workloads where the cost is justified | Use 32 days for monthly patterns; 93 days for seasonal or quarterly workloads |
| Preferred instance types | Restrict recommendations to specific instance families or types. For example, if you have purchased Savings Plans and Reserved Instances, you can specify instances only covered by those pricing models. Or, if you want to use only instances equipped with certain processors or non-burstable instances because of your application design, you can specify those instances for your recommendation output | When organizational standards, procurement commitments (RIs/SPs), or application design require approved instance families |
Learn more: How to take advantage of rightsizing recommendation preferences
Best practice 4: Evaluate Graviton recommendations carefully
Why this matters: Compute Optimizer can recommend migrating x86 workloads to AWS Graviton instances, which deliver up to 40% better price-performance. However, unlike same-architecture rightsizing (which is a configuration change), Graviton involves a CPU architecture shift from x86 to ARM64 — so a structured evaluation process helps you validate compatibility and capture the full savings with confidence.
Before migrating to Graviton:
- Assess architecture compatibility — Verify that your application binaries, libraries, and dependencies support ARM64. Container-based workloads (using multi-arch images) typically require less modification to migrate.
- Check software dependencies — Confirm third-party agents, drivers, and monitoring tools are available for ARM64.
- Test in non-production first — Deploy the recommended Graviton instance in a staging environment.
- Run load tests — Validate performance parity with the current instance.
- Use the Graviton Transition Guide — Follow the AWS Graviton Getting Started guide for a structured migration approach.
- How to identify a good target workload — A good candidate for Graviton adoption is a workload running on Linux or BSD, built either using open-source components or source code that you control. Having full access to the source code of every component allows you to make any necessary changes quickly and easily as part of this adoption plan. If you use third-party software, many ISVs already support the Arm64 architecture implemented by AWS Graviton processors.
When to defer Graviton recommendations:
- Legacy applications compiled for x86 without source code access.
- Workloads with licensing tied to specific CPU architectures.
- Applications with untested third-party binary dependencies.
Learn more: AWS Compute Optimizer Graviton migration guidance
Best practice 5: Implement a rightsizing workflow
Why this matters: A structured workflow turns Compute Optimizer’s recommendations into sustained, measurable cost savings. By establishing a regular cadence — reviewing, validating with stakeholders, and tracking results — your organization builds a continuous optimization loop that adapts as workloads evolve, compounds savings over time, and gives finance teams clear visibility into realized cost reductions.
To operationalize Compute Optimizer recommendations across your organization:
- Establish a regular review cadence — Schedule weekly or bi-weekly rightsizing reviews with your FinOps or cloud operations team.
- Prioritize by savings and confidence — Focus first on Over-provisioned instances with high estimated savings and low performance risk.
- Validate with application owners — Share recommendations with workload owners for context on usage patterns that metrics alone may not reveal (for example, seasonal traffic, scheduled batch jobs).
- Track implementation — Use AWS Cost Explorer to measure realized savings after rightsizing changes.
Note: Tag instances for effective rightsizing at scale. Compute Optimizer recommendations become more actionable when your instances carry consistent tags. At minimum, tag with Environment (prod/staging/dev) to drive review priority, and Application/Workload and Owner/Team to route recommendations to the right team. Compute Optimizer’s console, exports, and API all support tag-based filtering (tag:key and tag-key filters).
Taking it further — automate your workflow: For organizations ready to move beyond manual reviews, Compute Optimizer offers built-in automation that allows you to create automation rules that continuously clean up unattached volumes and upgrade volume types based on Compute Optimizer’s data-driven recommendations. For EC2 instance rightsizing, AWS provides a reference architecture for automating Compute Optimizer recommendations using AWS Step Functions, Amazon EventBridge, and AWS Lambda. See: Optimize costs by automating AWS Compute Optimizer recommendations
Clean up
If you installed the CloudWatch Agent as part of best practice 2 and no longer need memory metrics, stop and remove the agent to avoid ongoing custom metric charges.
Conclusion
AWS Compute Optimizer provides data-driven recommendations to help you make more informed EC2 rightsizing decisions. By enabling memory metrics, configuring recommendation preferences aligned to your workload needs, carefully evaluating Graviton alternatives, and establishing a systematic review process, you can identify opportunities to help optimize your EC2 fleet and help reduce costs while considering the performance your applications require.






