Съвременната турска музика и европейската сцена

2024-03-15 Емине Садкъ

Post Syndicated from Емине Садкъ original https://www.toest.bg/suvremennata-turska-muzika-i-evropeyskata-scena/

Преди да започнете да четете, може да си пуснете плейлист с повече от двайсет съвременни турски банди. Ако пък нямате време за плейлисти, не пропускайте следните имена: Аyyuka; Islandman; Collectif Medz Bazar; Nekropsi; Gayes Su Akyol.

В предишните две статии говорихме за музиката на турските гастарбайтери в Германия през 60-те и 80-те години – музика, която се превръща както в дом, така и във форма на протест за стотици хиляди работници, намиращи се в една болезнено непозната страна. Представихме също и т.нар. ориенталски хип-хоп от 90-те години, служещ за съпротива срещу неонацизма и отстояване на идентичността на второто поколение турци – децата на гурбетчиите, родени през 70-те години в Германия.

В тази заключителна статия ще се фокусираме върху съвременната турска музика като цяло, която не разчита на нищо друго, освен на идентичността и автентичността си. Ще стане дума и за сливането ѝ с различни музикални форми, както и за мястото ѝ на европейската и световната сцена.

Съвременната алтернативна турска музика е отделен жанр със собствена уникалност и ритъм,

които трудно могат да бъдат сбъркани, но за да бъде поставен този жанр на мястото му в музикалните магазини, критиците най-често го причисляват към необятното и не съвсем ясно понятие World music. Според мен по-скоро е Crossing worlds music.

Във филма на Фатих Акин „Музиката на Истанбул“ (2005) членовете на една от най-емблематичните турски рок-гръндж банди – Replikas, заявяват, че дълго време са се борили с това да не принадлежат на голямата сцена и каквото и да правят, никога да не звучат така, както звучат западните банди. Накрая осъзнават, че точно в това е тяхната сила и че използването на традиционната турска (ориенталска) музика им дава по-голяма дълбочина и автентичност.

Трудно се изкоренява ориенталската мелодия от душата на един народ и е особено ценно, че сегашната турска музика успява така изкусно да вплете нишката на своето ДНК, без да звучи нито остаряло, нито досадно.

През 1934 г., по време на културните промени на Ататюрк, турската музика в радиостанциите е забранена.

На нейно място звучи класическа и европейска, но тъкмо затова повечето хора по онова време слушат египетски радиа.

Първото истинско западно влияние върху турската музика е на „Бийтълс“,

които вдъхновяват както великана на турския рок Баръш Манчо (Barış Manco), така и Джем Караджа и Еркин Корай, създали жанра анадолски рок. Преди няколко години беше осъществен проектът Аnatolian Rock Revival Project, представящ част от този стил. Освен че са преведени на английски, песните са анимирани по много красив начин.

Съвременната турска музика и европейската сцена — Отляво надясно: Баръш Манчо, Джем Караджа, Еркин Корай

Бих ви окуражила да изгледате клипове на споменатите по-горе изпълнители, за да разберете колко много са повлияли на съвременните турски банди, както и защо хипарското, психеделично, груув излъчване и звучене са все още част от облика на турската алтернативна музика, част от нейната уникалност.

В този ред на мисли, не можем да продължим нататък, без да споменем BaBa ZuLa и тяхната отдаденост на експерименталния ориенталски ъндърграунд психеделичен рок, познат с изтънчеността си, доброто познаване на традиционното звучене, но и със задължителната изненада, която бих нарекла „сладка лудост“.

За мен беше удоволствие да чуя дарк-ембиънт-психеделичния албум Bu Bir Ruya („Това е сън“) през 2018-та. Албумът е част от проекта DirtMusic на Hugo Race – бивш член на бандата Nick Cave and The Bad Seeds, и BaBa ZuLa пасват невероятно. Албумът е многолик, преплита ориенталска музика и дарк ембиънт китара, а текстовете на песните са или силно политически, или твърде лични. Пее се както на английски, така и на турски. Човек би си помислил, че специално за мен са го записали този албум! Слушах го до припадък.

Друг албум, който слушах безброй пъти, беше Gece („Нощ“), на една от най-нашумелите по света турскоезични банди, изпълняваща анадолски рок – Altın Gün („Златен ден“). Тя затвърди и популяризира хипарското, груув усещане за анадолския рок, но и вдигна летвата с талантливо и умело адаптиране. Групата е базирана в Амстердам. Създадена е от нидерландеца Яспер Ферхюлст (басист), бивш член на инди банда, който иска да свири анадолски рок, но не знае турски. Разлепя обяви в турските бакалии из цял Амстердам, пуска публикации в социалните мрежи и след известно време намира Мерве Дасдемир (вокал) и Ердиндж Еджевит (баглама и вокал). Така се сформира единствената (до този момент) турскoезична банда, номинирана за „Грами“ през 2019 г., която Турция разпознава като своя. Един от водещите ежедневници на Турция Hürriyet я нарича „нашата гордост на червения килим“.

Във връзка с наградите и популярността обръщам специално внимание на лейбъла GlitterBeat, част от който са BaBa ZuLa, Gaye Su Akyol и Altın Gün. Каталогът им е пъстър и пълен с качествена музика от Африка, Азия, Южна Америка и др.

През 2021 г. започва проектът #60JahreMusik, с който се отбелязва 60-годишнината на турската музика в Германия. Сайтът, в който са събрани интервюта, видеоклипове, статии за музиката на различните поколения, е наречен Истанбул-Берлин. Това е много смислен проект и единствен по рода си архив и каталог на турски музиканти, живеещи в Германия. В сайта може да се намерят статии на английски и всеки, който проявява по-голям интерес по темата, ще открие достатъчно лични истории и още музика.

Проследяването на присъствието на турската култура от последните шейсет години в Европа чрез музиката за мен е интересен начин да се ориентираме къде се намира тя на европейската карта в момента. Колкото далечна и непозната е в началото на 60-те в Германия, толкова разпознаваема е през последните няколко години.

Съвременната турска музика вече е част от европейската и световната сцена. И не се бори за нищо – просто е.

Свободна е да бъде себе си; приета е като равна. За това помагат не само глобализацията и открехнатата врата, през която успяват да преминат нови културни течения, но и стабилното присъствие на турската музика в Европа вече шест десетилетия. Помага и приемането на алтернативната западна музика в турската култура.

Културният диалог, който тече в момента, и приемствеността са важни за нас, живеещи между звуците на Изтока и Запада.

Казват, че най-лесният начин да научиш за един народ е да слушаш неговата музика. Мисля, че ние все повече се опознаваме едни други. Но този процес е крехък, затова е добре към него да се подхожда с внимание и разбиране. За да продължим да се срещаме отвъд границите, отвъд мисълта за тях.

Run and manage open source InfluxDB databases with Amazon Timestream

2024-03-15 Donnie Prakoso

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/run-and-manage-open-source-influxdb-databases-with-amazon-timestream/

Starting today, you can use InfluxDB as a database engine in Amazon Timestream. This support makes it easy for you to run near real-time time-series applications using InfluxDB and open source APIs, including open source Telegraf agents that collect time-series observations.

Now you have two database engines to choose in Timestream: Timestream for LiveAnalytics and Timestream for InfluxDB.

You should use the Timestream for InfluxDB engine if your use cases require near real-time time-series queries or specific features in InfluxDB, such as using Flux queries. Another option is the existing Timestream for LiveAnalytics engine, which is suitable if you need to ingest more than tens of gigabytes of time-series data per minute and run SQL queries on petabytes of time-series data in seconds.

With InfluxDB support in Timestream, you can use a managed instance that is automatically configured for optimal performance and availability. Furthermore, you can increase resiliency by configuring multi-Availability Zone support for your InfluxDB databases.

Timestream for InfluxDB and Timestream for LiveAnalytics complement each other for low-latency and large-scale ingestion of time-series data.

Getting started with Timestream for InfluxDB
Let me show you how to get started.

First, I create an InfluxDB instance. I navigate to the Timestream console, go to InfluxDB databases in Timestream for InfluxDB and select Create Influx database.

On the next page, I specify the database credentials for the InfluxDB instance.

I also specify my instance class in Instance configuration and the storage type and volume to suit my needs.

In the next part, I can choose a multi-AZ deployment, which synchronously replicates data to a standby database in a different Availability Zone or just a single instance of InfluxDB. In the multi-AZ deployment, if a failure is detected, Timestream for InfluxDB will automatically fail over to the standby instance without data loss.

Then, I configure how to connect to my InfluxDB instance in Connectivity configuration. Here, I have the flexibility to define network type, virtual private cloud (VPC), subnets, and database port. I also have the flexibility to configure my InfluxDB instance to be publicly accessible by specifying public subnets and set the public access to Publicly Accessible, allowing Amazon Timestream will assign a public IP address to my InfluxDB instance. If you choose this option, make sure that you have proper security measures to protect your InfluxDB instances.

In this demo, I set my InfluxDB instance as Not publicly accessible, which also means I can only access it through the VPC and subnets I defined in this section.

Once I configure my database connectivity, I can define the database parameter group and the log delivery settings. In Parameter group, I can define specific configurable parameters that I want to use for my InfluxDB database. In the log delivery settings, I also can define which Amazon Simple Storage Service (Amazon S3) bucket I have to export the system logs. To learn more about the required AWS Identity and Access Management (IAM) policy for the Amazon S3 bucket, visit this page.

Once I’m happy with the configuration, I select Create Influx database.

Once my InfluxDB instance is created, I can see more information on the detail page.

With the InfluxDB instance created, I can also access the InfluxDB user interface (UI). If I configure my InfluxDB as publicly accessible, I can access the UI using the console by selecting InfluxDB UI. As shown on the setup, I configured my InfluxDB instance as not publicly accessible. In this case, I need to access the InfluxDB UI with SSH tunneling through an Amazon Elastic Compute Cloud (Amazon EC2) instance within the same VPC as my InfluxDB instance.

With the URL endpoint from the detail page, I navigate to the InfluxDB UI and use the username and password I configured in the creation process.

With access to the InfluxDB UI, I can now create a token to interact with my InfluxDB instance.

I can also use the Influx command line interface (CLI) to create a token. Before I can create the token, I create a configuration to interact with my InfluxDB instance. The following is the sample command to create a configuration:

influx config create --config-name demo  \
    --host-url https://<TIMESTREAM for INFLUX DB ENDPOINT> \
   --org demo-org  
   --username-password [USERNAME] \
   --active

With the InfluxDB configuration created, I can now create an operator, all-access or read/write token. The following is an example for creating an all-access token to grant permissions to all resources in the organization that I defined:

influx auth create --org demo-org --all-access

With the required token for my use case, I can use various tools, such as the Influx CLI, Telegraf agent, and InfluxDB client libraries, to start ingesting data into my InfluxDB instance. Here, I’m using the Influx CLI to write sample home sensor data in the line protocol format, which you can also get from the InfluxDB documentation page.

influx write \
  --bucket demo-bucket \
  --precision s "
home,room=Living\ Room temp=21.1,hum=35.9,co=0i 1641024000
home,room=Kitchen temp=21.0,hum=35.9,co=0i 1641024000
home,room=Living\ Room temp=21.4,hum=35.9,co=0i 1641027600
home,room=Kitchen temp=23.0,hum=36.2,co=0i 1641027600
home,room=Living\ Room temp=21.8,hum=36.0,co=0i 1641031200
home,room=Kitchen temp=22.7,hum=36.1,co=0i 1641031200
home,room=Living\ Room temp=22.2,hum=36.0,co=0i 1641034800
home,room=Kitchen temp=22.4,hum=36.0,co=0i 1641034800
home,room=Living\ Room temp=22.2,hum=35.9,co=0i 1641038400
home,room=Kitchen temp=22.5,hum=36.0,co=0i 1641038400
home,room=Living\ Room temp=22.4,hum=36.0,co=0i 1641042000
home,room=Kitchen temp=22.8,hum=36.5,co=1i 1641042000
home,room=Living\ Room temp=22.3,hum=36.1,co=0i 1641045600
home,room=Kitchen temp=22.8,hum=36.3,co=1i 1641045600
home,room=Living\ Room temp=22.3,hum=36.1,co=1i 1641049200
home,room=Kitchen temp=22.7,hum=36.2,co=3i 1641049200
home,room=Living\ Room temp=22.4,hum=36.0,co=4i 1641052800
home,room=Kitchen temp=22.4,hum=36.0,co=7i 1641052800
home,room=Living\ Room temp=22.6,hum=35.9,co=5i 1641056400
home,room=Kitchen temp=22.7,hum=36.0,co=9i 1641056400
home,room=Living\ Room temp=22.8,hum=36.2,co=9i 1641060000
home,room=Kitchen temp=23.3,hum=36.9,co=18i 1641060000
home,room=Living\ Room temp=22.5,hum=36.3,co=14i 1641063600
home,room=Kitchen temp=23.1,hum=36.6,co=22i 1641063600
home,room=Living\ Room temp=22.2,hum=36.4,co=17i 1641067200
home,room=Kitchen temp=22.7,hum=36.5,co=26i 1641067200
"

Finally, I can query the data using the InfluxDB UI. I navigate to the Data Explorer page in the InfluxDB UI, create a simple Flux script, and select Submit.

Timestream for InﬂuxDB makes it easier for you to develop applications using InfluxDB, while continuing to use your existing tools to interact with the database. With the multi-AZ configuration, you can increase the availability of your InfluxDB data without worrying about the underlying infrastructure.

AWS and InfluxDB partnership
Celebrating this launch, here’s what Paul Dix, Founder and Chief Technology Officer at InfluxData, said about this partnership:

“The future of open source is powered by the public cloud—reaching the broadest community through simple entry points and practical user experience. Amazon Timestream for InfluxDB delivers on that vision. Our partnership with AWS turns InfluxDB open source into a force multiplier for real-time insights on time-series data, making it easier than ever for developers to build and scale their time-series workloads on AWS.”

Things to know
Here are some additional information that you need to know:

Availability – Timestream for InfluxDB is now generally available in the following AWS Regions: US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Mumbai, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland, Stockholm).

Migration scenario – To migrate from a self-managed InfluxDB instance, you can simply restore a backup from an existing InfluxDB database into Timestream for InfluxDB. If you need to migrate from existing Timestream LiveAnalytics engine to Timestream for InfluxDB, you can leverage Amazon S3. Read more on how to do migration for various use cases on Migrating data from self-managed InfluxDB to Timestream for InfluxDB page.

Supported version – Timestream for InfluxDB currently supports the open source 2.7.5 version of InfluxDB

Pricing – To learn more about pricing, please visit Amazon Timestream pricing.

Demo – To see Timestream for InfluxDB in action, have a look at this demo created by my colleague, Derek:

Start building time-series applications and dashboards with millisecond response times using Timestream for InfluxDB. To learn more, visit Amazon Timestream for InfluxDB page.

Happy building!
— Donnie

Comic for 2024.03.15 – Sick Dog

2024-03-15 Explosm.net

Post Syndicated from Explosm.net original https://explosm.net/comics/sick-dog

New Cyanide and Happiness Comic

Schwa

2024-03-15 xkcd.com

Post Syndicated from xkcd.com original https://xkcd.com/2907/

Doug's cousin, the one from London, runs a Bumble love cult.

Broadcom Now Sampling 51.2T Co-Packaged Optics Switch

2024-03-14 Rohit Kumar

Post Syndicated from Rohit Kumar original https://www.servethehome.com/broadcom-now-sampling-51-2t-co-packaged-optics-switch/

Broadcom is now sampling its Tomahawk 5 Bailly 51.2T co-packaged optics switch solution to transition the industry to more efficient networks

The post Broadcom Now Sampling 51.2T Co-Packaged Optics Switch appeared first on ServeTheHome.

Undersea cable failures cause Internet disruptions for multiple African countries

2024-03-14 João Tomé

Post Syndicated from João Tomé original https://blog.cloudflare.com/undersea-cable-failures-cause-internet-disruptions-across-africa-march-14-2024

Internet connectivity in several African countries was disrupted today, March 14, 2024. Beginning at approximately 05:00 UTC, west and central African countries were most impacted, as was South Africa. Based on published reports and social media posts from impacted network providers, the disruption is believed to be due to multiple undersea cable failures in the region. From The Gambia to Côte d’Ivoire, including a major network in South Africa (Vodacom), a total of 11 African countries were impacted, based on our observations.

Cloudflare Radar data shows a pattern of disruptions from the north to the south of West Africa over time. It began south of Senegal, with The Gambia, Guinea, and Liberia experiencing disruptions around 05:00 UTC.

In The Gambia and Guinea, the disruptions lasted about 30 minutes, while in Liberia, the disruption has lasted more than 12 hours.

Moving south, around 07:30 UTC, disruptions were observed in Côte d’Ivoire and Ghana.

Niger, a landlocked nation in Central Africa, experienced a disruption at 09:15, lasting just over two hours.

This was followed by disruptions starting around 10:30 UTC in Nigeria, Benin, Cameroon, and Togo. These disruptions were ongoing at the time of writing.

At approximately the same time, a significant disruption was observed on Vodacom’s South African network (AS29975). Traffic began to recover after 13:30 UTC, and appears to have reached close to normal levels by 16:00 UTC.

The importance of submarine cables

This series of disruptions serves as a reminder of how dependent the Internet is on submarine cables, which are estimated to carry over 90% of intercontinental data traffic. Only a small percentage of general use is done via satellite networks. There are 529 active submarine cables and 1,444 landings that are currently active or under construction, running to an estimated 1.3 million km around the globe.

We have written about submarine cable-related outages before, from Tonga to the AAE-1 & SMW5 cable cuts of June 2022.

Reports from several local networks, including South Africa’s Vodacom, MTN in Nigeria, and Celtiis in Bénin, reference multiple submarine cable failures. Microsoft was more detailed, stating on their Azure status page that “multiple fiber cables on the West Coast of Africa — WACS, MainOne, SAT3, ACE — have been impacted which reduced total capacity supporting our Regions in South Africa”. The company also explains that the recent cable cuts in the Red Sea in combination with today’s cable issues, “has impacted all Africa capacity”.

In addition to the impacts to the Microsoft Azure cloud platform, the website of MainOne, owners of the MainOne submarine cable, was offline for several hours. DNS for mainone.net is handled by name servers located in MainOne’s address space. It appears that a portion of the IPv4 address space for AS37282 (MAINONE) stopped being announced between 07:30 and 15:00 UTC, and once this address space was being routed again, both the nameservers and website became reachable.

This map from TeleGeography highlights the impacted submarine cables: WACS (West Africa Cable System), MainOne, SAT-3/WASC, and ACE.

The disruptions are now being reported by news media outlets, including in South Africa, where the emphasis is not only on the latest outage but also on the problem with the submarine cable operator Seacom. This operator experienced a service-impacting outage on its cable system in the Red Sea. On March 8, the company stated that it is waiting for permits to start repairing its broken submarine cable in the Red Sea.

We will keep monitoring the situation. Follow the Cloudflare Radar Outage Center for the latest updates, and follow us on social media at @CloudflareRadar (X), cloudflare.social/@radar (Mastodon), and radar.cloudflare.com (Bluesky).

Abby Wambach | International Women’s Day 2024 | Talks at Google

2024-03-14 Talks at Google

Post Syndicated from Talks at Google original https://www.youtube.com/watch?v=a2VhBwQwDTo

How VMware Tanzu CloudHealth migrated from self-managed Kafka to Amazon MSK

2024-03-14 Rivlin Pereira

Post Syndicated from Rivlin Pereira original https://aws.amazon.com/blogs/big-data/how-vmware-tanzu-cloudhealth-migrated-from-self-managed-kafka-to-amazon-msk/

This is a post co-written with Rivlin Pereira & Vaibhav Pandey from Tanzu CloudHealth (VMware by Broadcom).

VMware Tanzu CloudHealth is the cloud cost management platform of choice for more than 20,000 organizations worldwide, who rely on it to optimize and govern their largest and most complex multi-cloud environments. In this post, we discuss how the VMware Tanzu CloudHealth DevOps team migrated their self-managed Apache Kafka workloads (running version 2.0) to Amazon Managed Streaming for Apache Kafka (Amazon MSK) running version 2.6.2. We discuss the system architectures, deployment pipelines, topic creation, observability, access control, topic migration, and all the issues we faced with the existing infrastructure, along with how and why we migrated to the new Kafka setup and some lessons learned.

Kafka cluster overview

In the fast-evolving landscape of distributed systems, VMware Tanzu CloudHealth’s next-generation microservices platform relies on Kafka as its messaging backbone. For us, Kafka’s high-performance distributed log system excels in handling massive data streams, making it indispensable for seamless communication. Serving as a distributed log system, Kafka efficiently captures and stores diverse logs, from HTTP server access logs to security event audit logs.

Kafka’s versatility shines in supporting key messaging patterns, treating messages as basic logs or structured key-value stores. Dynamic partitioning and consistent ordering ensure efficient message organization. The unwavering reliability of Kafka aligns with our commitment to data integrity.

The integration of Ruby services with Kafka is streamlined through the Karafka library, acting as a higher-level wrapper. Our other language stack services use similar wrappers. Kafka’s robust debugging features and administrative commands play a pivotal role in ensuring smooth operations and infrastructure health.

Kafka as an architectural pillar

In VMware Tanzu CloudHealth’s next-generation microservices platform, Kafka emerges as a critical architectural pillar. Its ability to handle high data rates, support diverse messaging patterns, and guarantee message delivery aligns seamlessly with our operational needs. As we continue to innovate and scale, Kafka remains a steadfast companion, enabling us to build a resilient and efficient infrastructure.

Why we migrated to Amazon MSK

For us, migrating to Amazon MSK came down to three key decision points:

Simplified technical operations – Running Kafka on a self-managed infrastructure was an operational overhead for us. We hadn’t updated Kafka version 2.0.0 for a while, and Kafka brokers were going down in production, causing issues with topics going offline. We also had to run scripts manually for increasing replication factors and rebalancing leaders, which was additional manual effort.
Deprecated legacy pipelines and simplified permissions – We were looking to move away from our existing pipelines written in Ansible to create Kafka topics on the cluster. We also had a cumbersome process of giving team members access to Kafka machines in staging and production, and we wanted to simplify this.
Cost, patching, and support – Because Apache Zookeeper is completely managed and patched by AWS, moving to Amazon MSK was going to save us time and money. In addition, we discovered that running Amazon MSK with the same type of brokers on Amazon Elastic Compute Cloud (Amazon EC2) was cheaper to run on Amazon MSK. Combined with the fact that we get security patches applied on brokers by AWS, migrating to Amazon MSK was an easy decision. This also meant that the team was freed up to work on other important things. Finally, getting enterprise support from AWS was also critical in our final decision to move to a managed solution.

How we migrated to Amazon MSK

With the key drivers identified, we moved ahead with a proposed design to migrate existing self-managed Kafka to Amazon MSK. We conducted the following pre-migration steps before the actual implementation:

Assessment:
- Conducted a meticulous assessment of the existing EC2 Kafka cluster, understanding its configurations and dependencies
- Verified Kafka version compatibility with Amazon MSK
Amazon MSK setup with Terraform
- Used Terraform to provision an MSK cluster, embracing infrastructure as code (IaC) principles for repeatability and version control
- Configured Amazon Virtual Private Cloud (Amazon VPC) security groups, AWS Identity and Access Management (IAM) roles, and VPC settings through Terraform to establish a robust foundation for Amazon MSK
Network configuration:
- Ensured seamless network connectivity between the EC2 Kafka and MSK clusters, fine-tuning security groups and firewall settings

After the pre-migration steps, we implemented the following for the new design:

Automated deployment, upgrade, and topic creation pipelines for MSK clusters:
- In the new setup, we wanted to have automated deployments and upgrades of the MSK clusters in a repeatable fashion using an IaC tool. Therefore, we created custom Terraform modules for MSK cluster deployments as well as upgrades. These modules where called from a Jenkins pipeline for automated deployments and upgrades of the MSK clusters. For Kafka topic creation, we were using an Ansible-based home-grown pipeline, which wasn’t stable and led to a lot of complaints from dev teams. As a result, we evaluated options for deployments to Kubernetes clusters and used the Strimzi Topic Operator to create topics on MSK clusters. Topic creation was automated using Jenkins pipelines, which dev teams could self-service.
Better observability for clusters:
- The old Kafka clusters didn’t have good observability. We only had alerts on Kafka broker disk size. With Amazon MSK, we took advantage of open monitoring using Prometheus. We stood up a standalone Prometheus server that scraped metrics from MSK clusters and sent them to our internal observability tool. As a result of improved observability, we were able to set up robust alerting for Amazon MSK, which wasn’t possible with our old setup.
Improved COGS and better compute infrastructure:
- For our old Kafka infrastructure, we had to pay for managing Kafka, Zookeeper instances, plus any additional broker storage costs and data transfer costs. With the move to Amazon MSK, because Zookeeper is completely managed by AWS, we only have to pay for Kafka nodes, broker storage, and data transfer costs. As a result, in final Amazon MSK setup for production, we saved not only on infrastructure costs but also operational costs.
Simplified operations and enhanced security:
- With the move to Amazon MSK, we didn’t have to manage any Zookeeper instances. Broker security patching was also taken care by AWS for us.
- Cluster upgrades became simpler with the move to Amazon MSK; it’s a straightforward process to initiate from the Amazon MSK console.
- With Amazon MSK, we got broker automatic scaling out of the box. As a result, we didn’t have to worry about brokers running out of disk space, thereby leading to additional stability of the MSK cluster.
- We also got additional security for the cluster because Amazon MSK supports encryption at rest by default, and various options for encryption in transit are also available. For more information, refer to Data protection in Amazon Managed Streaming for Apache Kafka.

During our pre-migration steps, we validated the setup on the staging environment before moving ahead with production.

Kafka topic migration strategy

With the MSK cluster setup complete, we performed a data migration of Kafka topics from the old cluster running on Amazon EC2 to the new MSK cluster. To achieve this, we performed the following steps:

Set up MirrorMaker with Terraform – We used Terraform to orchestrate the deployment of a MirrorMaker cluster consisting of 15 nodes. This demonstrated the scalability and flexibility by adjusting the number of nodes based on the migration’s concurrent replication needs.
Implement a concurrent replication strategy – We implemented a concurrent replication strategy with 15 MirrorMaker nodes to expedite the migration process. Our Terraform-driven approach contributed to cost optimization by efficiently managing resources during the migration and ensured the reliability and consistency of the MSK and MirrorMaker clusters. It also showcased how the chosen setup accelerates data transfer, optimizing both time and resources.
Migrate data – We successfully migrated 2 TB of data in a remarkably short timeframe, minimizing downtime and showcasing the efficiency of the concurrent replication strategy.
Set up post-migration monitoring – We implemented robust monitoring and alerting during the migration, contributing to a smooth process by identifying and addressing issues promptly.

The following diagram illustrates the architecture after the topic migration was complete.
Mirror-maker setup

Challenges and lessons learned

Embarking on a migration journey, especially with large datasets, is often accompanied by unforeseen challenges. In this section, we delve into the challenges encountered during the migration of topics from EC2 Kafka to Amazon MSK using MirrorMaker, and share valuable insights and solutions that shaped the success of our migration.

Challenge 1: Offset discrepancies

One of the challenges we encountered was the mismatch in topic offsets between the source and destination clusters, even with offset synchronization enabled in MirrorMaker. The lesson learned here was that offset values don’t necessarily need to be identical, as long as offset sync is enabled, which makes sure the topics have the correct position to read the data from.

We addressed this problem by using a custom tool to run tests on consumer groups, confirming that the translated offsets were either smaller or caught up, indicating synchronization as per MirrorMaker.

Challenge 2: Slow data migration

The migration process faced a bottleneck—data transfer was slower than anticipated, especially with a substantial 2 TB dataset. Despite a 20-node MirrorMaker cluster, the speed was insufficient.

To overcome this, the team strategically grouped MirrorMaker nodes based on unique port numbers. Clusters of five MirrorMaker nodes, each with a distinct port, significantly boosted throughput, allowing us to migrate data within hours instead of days.

Challenge 3: Lack of detailed process documentation

Navigating the uncharted territory of migrating large datasets using MirrorMaker highlighted the absence of detailed documentation for such scenarios.

Through trial and error, the team crafted an IaC module using Terraform. This module streamlined the entire cluster creation process with optimized settings, enabling a seamless start to the migration within minutes.

Final setup and next steps

As a result of the move to Amazon MSK, our final setup after topic migration looked like the following diagram.
MSK Blog
We’re considering the following future improvements:

Move to AWS Graviton based nodes for Amazon MSK for faster performance and cost savings. For more details about Graviton with Amazon MSK, refer to Amazon MSK now supports Graviton3-based M7g instances for new provisioned clusters.
Use tiered storage to save even more on COGS.
Create dedicated clusters for certain needs, because we can now create these clusters much faster with automation.

Conclusion.

In this post, we discussed how VMware Tanzu CloudHealth migrated their existing Amazon EC2-based Kafka infrastructure to Amazon MSK. We walked you through the new architecture, deployment and topic creation pipelines, improvements to observability and access control, topic migration challenges, and the issues we faced with the existing infrastructure, along with how and why we migrated to the new Amazon MSK setup. We also talked about all the advantages that Amazon MSK gave us, the final architecture we achieved with this migration, and lessons learned.

For us, the interplay of offset synchronization, strategic node grouping, and IaC proved pivotal in overcoming obstacles and ensuring a successful migration from Amazon EC2 Kafka to Amazon MSK. This post serves as a testament to the power of adaptability and innovation in migration challenges, offering insights for others navigating a similar path.

If you’re running self-managed Kafka on AWS, we encourage you to try the managed Kafka offering, Amazon MSK.

About the Authors

Rivlin Pereira is Staff DevOps Engineer at VMware Tanzu Division. He is very passionate about Kubernetes and works on CloudHealth Platform building and operating cloud solutions that are scalable, reliable and cost effective.

Vaibhav Pandey, a Staff Software Engineer at Broadcom, is a key contributor to the development of cloud computing solutions. Specializing in architecting and engineering data storage layers, he is passionate about building and scaling SaaS applications for optimal performance.

Raj Ramasubbu is a Senior Analytics Specialist Solutions Architect focused on big data and analytics and AI/ML with Amazon Web Services. He helps customers architect and build highly scalable, performant, and secure cloud-based solutions on AWS. Raj provided technical expertise and leadership in building data engineering, big data analytics, business intelligence, and data science solutions for over 18 years prior to joining AWS. He helped customers in various industry verticals like healthcare, medical devices, life science, retail, asset management, car insurance, residential REIT, agriculture, title insurance, supply chain, document management, and real estate.

Todd McGrath is a data streaming specialist at Amazon Web Services where he advises customers on their streaming strategies, integration, architecture, and solutions. On the personal side, he enjoys watching and supporting his 3 teenagers in their preferred activities as well as following his own pursuits such as fishing, pickleball, ice hockey, and happy hour with friends and family on pontoon boats. Connect with him on LinkedIn.

Satya Pattanaik is a Sr. Solutions Architect at AWS. He has been helping ISVs build scalable and resilient applications on AWS Cloud. Prior joining AWS, he played significant role in Enterprise segments with their growth and success. Outside of work, he spends time learning “how to cook a flavorful BBQ” and trying out new recipes.

A Disaster Recovery Game Plan for Media Teams

2024-03-14 James Flores

Post Syndicated from James Flores original https://www.backblaze.com/blog/a-disaster-recovery-game-plan-for-media-teams/

A decorative image showing icons representing file types surrounding a cloud. The background has sports imagery incorporated.

When it comes to content creation, every second you can spend honing your craft counts. Which means things like disaster recovery planning are often overlooked—they’re tasks that easily get bumped to the bottom of every to-do list. Yet, the consequences of data loss or downtime can be huge, affecting everything from marketing strategy to viewer engagement.

For years, LTO tape has been a staple in disaster recovery (DR) plans for media teams that focus on everything from sports teams to broadcast news to TV and film production. Using an on-premises network attached storage (NAS) backed up to LTO tapes stored on-site, occasionally with a second copy off-site, is the de facto DR strategy for many. And while your off-site backup may be in a different physical location, more often than not, it’s the same city and still vulnerable to some of the same threats.

As in all areas of business, the key to a successful DR plan is preparation. Having a solid DR plan in place can be the difference between bouncing back swiftly or facing downtime. Today, I’ll lay out some of the challenges media teams face with disaster recovery and share some of the most cost-effective and time-efficient solutions.

Disaster Recovery Challenges

Let’s dive into some potential issues media teams face when it comes to disaster recovery.

Insufficient Resources

It’s easy to deprioritize disaster recovery when you’re facing budgetary constraints. You’re often faced with a trade-off: protect your data assets or invest in creating more. You might have limited NAS or LTO capacity, so you’re constantly evaluating what is worthy of protecting. Beyond cost, you might also be facing space limitations where investing in more infrastructure means not just shouldering the price of new tapes or drives, but also building out space to house them.

Simplicity vs. Comprehensive Coverage: Keeping Up With Scale

We’ve all heard the saying “keep it simple, stupid.” But sometimes you sacrifice adequate coverage for the sake of simplicity. Maybe you established a disaster recovery plan early on, but haven’t revisited it as your team scaled. Broadcasting and media management can quickly become complex, involving multiple departments, facilities, and stakeholders. If you haven’t revisited your plan, you may have gaps in your readiness to respond to threats.

As media teams grow and evolve, their disaster recovery needs may also change, meaning disaster recovery backups should be easy, automated, and geographically distanced.

The LTO Fallacy

No matter how well documented your processes may be, it’s inevitable that any process that requires a physical component is subject to human error. And managing LTO tapes is nothing if not a physical process. You’re manually inserting LTO tapes into an LTO deck to perform a backup. You’re then physically placing that tape and its replicas in the correct location in your library. These processes have a considerable margin of error; any deviation from an established procedure compromises the recovery process.

Additionally, LTO components—the decks and the tapes themselves—age like any other piece of equipment. And ensuring that all appropriate staff members are adequately trained and aware of any nuances of the LTO system becomes crucial in understanding the recovery process. Achieving consistent training across all levels of the organization and maintaining hardware can be challenging, leading to gaps in preparedness.

Embracing Cloud Readiness

As a media team faced with the challenges outlined above, you need solutions. Enter cloud readiness. Cloud-based storage offers unparalleled scalability, flexibility, and reliability, making it ideal for safeguarding media for teams large and small. By leveraging the power of the cloud, media teams can ensure seamless access to vital information from any location, at any time. Whether it’s raw footage, game footage, or final assets, cloud storage enables rapid recovery and minimal disruption in the event of a disaster.

Cloud Storage Considerations for Media Teams

Migrating to a cloud-based disaster recovery model requires careful planning and consideration. Here are some key factors for sports teams to keep in mind:

Data Security: Content security is becoming more and more of a top priority with many in the media space concerned about footage leakage and the growing monetization of archival content. Ensure your cloud provider employs robust security measures like encryption, and verify compliance with industry standards to maintain data privacy, especially if your media content involves sensitive or confidential information.
Cost Efficiency: Given the cost of NAS servers, LTO tapes, and external hard drives, scaling on-premises solutions indefinitely is not always the best solution. Extending your storage to the cloud makes scaling easy, but it’s not without its own set of considerations. Evaluate the cost structure of different cloud providers, considering factors like storage capacity, data transfer costs, and retention minimums.
Geospatial Redundancy: Driving LTO tapes to different locations or even shipping them to secure sites can become a logistical nightmare. When data is stored in the cloud, it not only can be accessed from anywhere but the replication of that data across geographic locations can be automated. Consider the geographical locations of the cloud servers to ensure optimal accessibility for your team, minimizing latency and providing a smooth user experience.
Interoperability: With data securely stored in the cloud it becomes instantly accessible to not only users but across different systems, platforms, and applications. This facilitates interoperability with applications like cloud media asset managers (MAMs) or cloud editing solutions and even simplifies media distribution. When choosing a cloud provider, consider APIs and third-party integrations that might enhance the functionality of your media production environment.
Testing and Training: Testing and training are paramount in disaster recovery to ensure a swift and effective response when crises strike. Rigorous testing identifies vulnerabilities, fine-tunes procedures, and validates recovery strategies. Simulated scenarios enable teams to practice and refine their roles, enhancing coordination and readiness. Regular training instills confidence and competence, reducing downtime during actual disasters. By prioritizing testing and training, your media team can bolster resilience, safeguard critical data, and increase the likelihood of a seamless recovery in the face of unforeseen disasters.

Cloud Backup in Action

For Trailblazer Studios, a leading media production company, satisfying internal and external backup requirements led to a complex and costly manual system of LTO tape and spinning disk drive redundancies. They utilized Backblaze’s cloud storage to streamline their data recovery processes and enhance their overall workflow efficiency.

Backblaze is our off-site production backup. The hope is that we never need to use it, but it gives us peace of mind.

—Kevin Shattuck, Systems Administrator, Trailblazer Studios

The Road Ahead

As media continues to embrace digital transformation, the need for robust disaster recovery solutions has never been greater. By transitioning away from on-premises solutions like LTO tape and embracing cloud readiness, organizations can future-proof their operations and ensure uninterrupted production. And, while cloud readiness creates a more secure foundation for disaster recovery, having data in the cloud creates a pathway into the future teams can take advantage of a wave of cloud tools designed to foster productivity and efficiency.

With the right strategy in place, media teams can turn potential disasters into mere setbacks, while taking advantage of their new cloud centric posterity maintaining their competitive edge.

The post A Disaster Recovery Game Plan for Media Teams appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

3 more awesome HACS frontend components to compliment your setup

2024-03-14 BeardedTinker

Post Syndicated from BeardedTinker original https://www.youtube.com/watch?v=WJaFUsyFp5g

AWS Pi Day 2024: Use your data to power generative AI

2024-03-14 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-pi-day-2024-use-your-data-to-power-generative-ai/

Today is AWS Pi Day! Join us live on Twitch, starting at 1 PM Pacific time.

On this day 18 years ago, a West Coast retail company launched an object storage service, introducing the world to Amazon Simple Storage Service (Amazon S3). We had no idea it would change the way businesses across the globe manage their data. Fast forward to 2024, every modern business is a data business. We’ve spent countless hours discussing how data can help you drive your digital transformation and how generative artificial intelligence (AI) can open up new, unexpected, and beneficial doors for your business. Our conversations have matured to include discussion around the role of your own data in creating differentiated generative AI applications.

Because Amazon S3 stores more than 350 trillion objects and exabytes of data for virtually any use case and averages over 100 million requests per second, it may be the starting point of your generative AI journey. But no matter how much data you have or where you have it stored, what counts the most is its quality. Higher quality data improves the accuracy and reliability of model response. In a recent survey of chief data officers (CDOs), almost half (46 percent) of CDOs view data quality as one of their top challenges to implementing generative AI.

This year, with AWS Pi Day, we’ll spend Amazon S3’s birthday looking at how AWS Storage, from data lakes to high performance storage, has transformed data strategy to becom the starting point for your generative AI projects.

This live online event starts at 1 PM PT today (March 14, 2024), right after the conclusion of AWS Innovate: Generative AI + Data edition. It will be live on the AWS OnAir channel on Twitch and will feature 4 hours of fresh educational content from AWS experts. Not only will you learn how to use your data and existing data architecture to build and audit your customized generative AI applications, but you’ll also learn about the latest AWS storage innovations. As usual, the show will be packed with hands-on demos, letting you see how you can get started using these technologies right away.

Data for generative AI
Data is growing at an incredible rate, powered by consumer activity, business analytics, IoT sensors, call center records, geospatial data, media content, and other drivers. That data growth is driving a flywheel for generative AI. Foundation models (FMs) are trained on massive datasets, often from sources like Common Crawl, which is an open repository of data that contains petabytes of web page data from the internet. Organizations use smaller private datasets for additional customization of FM responses. These customized models will, in turn, drive more generative AI applications, which create even more data for the data flywheel through customer interactions.

There are three data initiatives you can start today regardless of your industry, use case, or geography.

First, use your existing data to differentiate your AI systems. Most organizations sit on a lot of data. You can use this data to customize and personalize foundation models to suit them to your specific needs. Some personalization techniques require structured data, and some do not. Some others require labeled data or raw data. Amazon Bedrock and Amazon SageMaker offer you multiple solutions to fine-tune or pre-train a wide choice of existing foundation models. You can also choose to deploy Amazon Q, your business expert, for your customers or collaborators and point it to one or more of the 43 data sources it supports out of the box.

But you don’t want to create a new data infrastructure to help you grow your AI usage. Generative AI consumes your organization’s data just like existing applications.

Second, you want to make your existing data architecture and data pipelines work with generative AI and continue to follow your existing rules for data access, compliance, and governance. Our customers have deployed more than 1,000,000 data lakes on AWS. Your data lakes, Amazon S3, and your existing databases are great starting points for building your generative AI applications. To help support Retrieval-Augmented Generation (RAG), we added support for vector storage and retrieval in multiple database systems. Amazon OpenSearch Service might be a logical starting point. But you can also use pgvector with Amazon Aurora for PostgreSQL and Amazon Relational Database Service (Amazon RDS) for PostgreSQL. We also recently announced vector storage and retrieval for Amazon MemoryDB for Redis, Amazon Neptune, and Amazon DocumentDB (with MongoDB compatibility).

You can also reuse or extend data pipelines that are already in place today. Many of you use AWS streaming technologies such as Amazon Managed Streaming for Apache Kafka (Amazon MSK), Amazon Managed Service for Apache Flink, and Amazon Kinesis to do real-time data preparation in traditional machine learning (ML) and AI. You can extend these workflows to capture changes to your data and make them available to large language models (LLMs) in near real-time by updating the vector databases, make these changes available in the knowledge base with MSK’s native streaming ingestion to Amazon OpenSearch Service, or update your fine-tuning datasets with integrated data streaming in Amazon S3 through Amazon Kinesis Data Firehose.

When talking about LLM training, speed matters. Your data pipeline must be able to feed data to the many nodes in your training cluster. To meet their performance requirements, our customers who have their data lake on Amazon S3 either use an object storage class like Amazon S3 Express One Zone, or a file storage service like Amazon FSx for Lustre. FSx for Lustre provides deep integration and enables you to accelerate object data processing through a familiar, high performance, file interface.

The good news is that if your data infrastructure is built using AWS services, you are already most of the way towards extending your data for generative AI.

Third, you must become your own best auditor. Every data organization needs to prepare for the regulations, compliance, and content moderation that will come for generative AI. You should know what datasets are used in training and customization, as well as how the model made decisions. In a rapidly moving space like generative AI, you need to anticipate the future. You should do it now and do it in a way that is fully automated while you scale your AI system.

Your data architecture uses different AWS services for auditing, such as AWS CloudTrail, Amazon DataZone, Amazon CloudWatch, and OpenSearch to govern and monitor data usage. This can be easily extended to your AI systems. If you are using AWS managed services for generative AI, you have the capabilities for data transparency built in. We launched our generative AI capabilities with CloudTrail support because we know how critical it is for enterprise customers to have an audit trail for their AI systems. Any time you create a data source in Amazon Q, it’s logged in CloudTrail. You can also use a CloudTrail event to list the API calls made by Amazon CodeWhisperer. Amazon Bedrock has over 80 CloudTrail events that you can use to audit how you use foundation models.

During the last AWS re:Invent conference, we also introduced Guardrails for Amazon Bedrock. It allows you to specify topics to avoid, and Bedrock will only provide users with approved responses to questions that fall in those restricted categories

New capabilities just launched
Pi Day is also the occasion to celebrate innovation in AWS storage and data services. Here is a selection of the new capabilities that we’ve just announced:

The Amazon S3 Connector for PyTorch now supports saving PyTorch Lightning model checkpoints directly to Amazon S3. Model checkpointing typically requires pausing training jobs, so the time needed to save a checkpoint directly impacts end-to-end model training times. PyTorch Lightning is an open source framework that provides a high-level interface for training and checkpointing with PyTorch. Read the What’s New post for more details about this new integration.

Amazon S3 on Outposts authentication caching – By securely caching authentication and authorization data for Amazon S3 locally on the Outposts rack, this new capability removes round trips to the parent AWS Region for every request, eliminating the latency variability introduced by network round trips. You can learn more about Amazon S3 on Outposts authentication caching on the What’s New post and on this new post we published on the AWS Storage blog channel.

Mountpoint for Amazon S3 Container Storage Interface (CSI) driver is available for Bottlerocket – Bottlerocket is a free and open source Linux-based operating system meant for hosting containers. Built on Mountpoint for Amazon S3, the CSI driver presents an S3 bucket as a volume accessible by containers in Amazon Elastic Kubernetes Service (Amazon EKS) and self-managed Kubernetes clusters. It allows applications to access S3 objects through a file system interface, achieving high aggregate throughput without changing any application code. The What’s New post has more details about the CSI driver for Bottlerocket.

Amazon Elastic File System (Amazon EFS) increases per file system throughput by 2x – We have increased the elastic throughput limit up to 20 GB/s for read operations and 5 GB/s for writes. It means you can now use EFS for even more throughput-intensive workloads, such as machine learning, genomics, and data analytics applications. You can find more information about this increased throughput on EFS on the What’s New post.

There are also other important changes that we enabled earlier this month.

Amazon S3 Express One Zone storage class integrates with Amazon SageMaker – It allows you to accelerate SageMaker model training with faster load times for training data, checkpoints, and model outputs. You can find more information about this new integration on the What’s New post.

Amazon FSx for NetApp ONTAP increased the maximum throughput capacity per file system by 2x (from 36 GB/s to 72 GB/s), letting you use ONTAP’s data management features for an even broader set of performance-intensive workloads. You can find more information about Amazon FSx for NetApp ONTAP on the What’s New post.

What to expect during the live stream
We will address some of these new capabilities during the 4-hour live show today. My colleague Darko will host a number of AWS experts for hands-on demonstrations so you can discover how to put your data to work for your generative AI projects. Here is the schedule of the day. All times are expressed in Pacific Time (PT) time zone (GMT-8):

Extend your existing data architecture to generative AI (1 PM – 2 PM).
If you run analytics on top of AWS data lakes, you’re most of your way there to your data strategy for generative AI.
Accelerate the data path to compute for generative AI (2 PM – 3 PM).
Speed matters for compute data path for model training and inference. Check out the different ways we make it happen.
Customize with RAG and fine-tuning (3 PM – 4 PM).
Discover the latest techniques to customize base foundation models.
Be your own best auditor for GenAI (4 PM – 5 PM).
Use existing AWS services to help meet your compliance objectives.

Join us today on the AWS Pi Day live stream.

I hope I’ll meet you there!

— seb

Rapid7’s Ciara Cullinan Recognized as Community Trailblazer in Belfast Awards Program

2024-03-14 Rapid7

Post Syndicated from Rapid7 original https://blog.rapid7.com/2024/03/14/rapid7s-ciara-cullinan-recognized-as-community-trailblazer-in-belfast-awards-program/

Rapid7’s Ciara Cullinan Recognized as Community Trailblazer in Belfast Awards Program

At the 2024 Women Who Code She Rocks Awards, Rapid7 Software Engineer II Ciara Cullinan was recognized with their ‘Community Trailblazer’ award.

According to Women Who Code, “This award celebrates the efforts of someone who brings people together and creates genuine connections in our tech community. Whether this is online or in-person, this person demonstrates exceptional commitment to building a thriving and inclusive community.”

When it comes to building community, Ciara is a true champion who is consistently looking for ways to establish and grow meaningful connections among her team, across the organization, and in the local tech industry. Whether it’s encouraging engagement in various slack channels with ‘water cooler’ questions and ice breakers, or driving Rapid7’s sponsorship of Women Techmakers, she’s proactively seeking out ways to bring people together while growing her own network in the process.

“I think a lot of times – and especially for women – we focus on perfection in our work. We can be hesitant to share things until we have it 100% figured out ourselves. However, when we are able to build strong personal connections with our colleagues, or even others in the industry, the bravery to put something forward or ask for feedback comes much easier. That connection opens up the door to have honest conversations, share ideas, and provide feedback. This is where we can work together to drive impact and grow our skills, which lead to rewarding career experiences and growth.”

In addition to her role as an engineer, Ciara is an active member of Rapid7 Women. Rapid7 Women is an employee resource group that aims to support, enable, and empower all employees identifying as women to bring their best, true selves to work every day through community, action, and activism. Ciara actively contributes to this mission by helping build global and local initiatives for the group. As mentioned in her nomination submission, “Ciara collaborates with colleagues from around the globe, in different business units and roles to build a Women program that caters to supporting not only Women identifying individuals, but also seeks to educate allies on how to be a culture contributor exhibiting inclusive leadership traits.”

Ciara also highlights the importance of bringing more women into the tech industry, and how organizations like Women Who Code can make a difference. “In my role I am one of two women on the team. As technology continues to evolve and things like Artificial Intelligence become part of our everyday life, it’s important to get more women involved in the field to combat any implicit bias in the things that are being built. Bringing more diverse perspectives into a team can also help drive innovation and help organizations work through challenges more efficiently. Awards and programs like this help showcase what’s possible for the next generation of women, allowing them see and then realize the potential a career in tech could hold for them.”

To learn more about Women Who Code’s Belfast community, visit their website.

To learn more about Rapid7’s culture, and our Rapid Impact Groups, visit our careers page.

[$] The first half of the 6.9 merge window

2024-03-14 corbet

Post Syndicated from corbet original https://lwn.net/Articles/965141/

As of this writing, just over 4,900 non-merge changesets have been pulled
into the mainline for the 6.9 release. This work includes the usual array
of changes all over the kernel tree; read on for a summary of the most
significant work merged during the first part of the 6.9 merge window.

NIS2 Directive Compliance Checklist for Companies

2024-03-14 Editor

Post Syndicated from Editor original https://nebosystems.eu/nis2-compliance-checklist-guide/

NIS2 Directive Compliance Checklist for Companies

In response to the evolving cybersecurity threats, the European Union has introduced the Network & Information System (NIS2) Directive, setting a new standard for cybersecurity measures across member states. Understanding and complying with these requirements is crucial for organizations operating within the EU.

This checklist is designed to help companies understand whether they are affected by the NIS2 Directive (Directive (EU) 2022/2555) and need to comply with its cybersecurity requirements. Answering these questions will provide an initial assessment of your company’s obligations under the Directive.

Section 1: Company Size and Type

Is your company considered a medium-sized enterprise or larger according to the EU definition? (More than 50 employees and an annual turnover or balance sheet exceeding €10 million)

Does your company operate in the digital infrastructure, including as a DNS service provider, TLD name registry, or cloud computing service provider?

Is your company a small enterprise or micro-enterprise that plays a key role in society, the economy, or within specific sectors or types of service? (Consider if your services are critical even if your company is small.)

Section 2: Sector-Specific Questions

Is your company involved in any of the following sectors?

Energy

Transport

Banking

Financial Market Infrastructure

Health sector

Drinking water

Digital infrastructure

Public administration

Space

None of the above

Does your company provide essential services within these sectors that, if disrupted, would have a significant impact on societal or economic activities?

Section 3: Operational Impact

Does your company rely heavily on network and information systems for the provision of your services?

In the event of a cybersecurity incident, could your company’s services be significantly disrupted, leading to substantial financial loss or societal impact?

Section 4: Exclusions

Is your company’s primary activity related to national security, public security, defense, or law enforcement? (Note: If only marginally related, you might still fall under the Directive.)

Is your company a public administration entity that predominantly carries out activities in the areas of national security, public security, defense, or law enforcement?

Section 5: Additional Considerations

Has your company been previously identified as an operator of essential services under the NIS Directive or any national legislation related to cybersecurity?

Is your company part of the supply chain for critical services in any of the sectors identified in question 4?

Conclusion

Questions 1, 2, or 3 (Company Size and Type): If you answered “Yes” to any of these, your company falls within the scope of the NIS2 Directive due to its size, operation within digital infrastructure, or significant role despite being a small or microenterprise. Next Steps: Assess specific obligations under the NIS2 Directive and begin implementing necessary cybersecurity measures and reporting mechanisms.
Question 4 (Sector Involvement): A “Yes” response indicates your company operates in a sector directly affected by the NIS2 Directive. Next Steps: Identify sector-specific cybersecurity requirements and engage with sector regulators or national cybersecurity authorities for guidance.
Question 5 (Provision of Essential Services): If “Yes,” your services are crucial, making compliance with the NIS2 Directive imperative to ensure service continuity and security. Next Steps: Prioritize establishing a comprehensive risk management framework and incident response plan as per NIS2 requirements.
Questions 6 and 7 (Operational Impact): Affirmative answers highlight your reliance on network and information systems and potential significant impacts from cybersecurity incidents. Next Steps: Strengthen your cybersecurity infrastructure, focusing on resilience and rapid incident response capabilities.
Questions 8 and 9 (Exclusions): If you answered “Yes,” your company might be excluded due to its primary focus on national security or law enforcement. However, marginal involvement doesn’t grant exclusion. Next Steps: Clarify your exclusion status with legal experts and, if applicable, review your cybersecurity practices to ensure they’re adequate for your operational needs.
Question 10 (Previous Identification as Essential Service Operator): A “Yes” answer suggests your company was already under obligations similar to those in the NIS2 Directive, which will likely continue or expand under the new directive. Next Steps: Update your cybersecurity and compliance strategies to align with NIS2 enhancements and consult with authorities for transitional requirements.
Question 11 (Part of the Supply Chain for Critical Services): Answering “Yes” indicates your role in the supply chain could bring you under the NIS2 Directive’s purview, especially with its increased focus on supply chain security. Next Steps: Evaluate your cybersecurity practices in the context of supply chain integrity, collaborate with your partners to understand your shared responsibilities, and implement any necessary security and reporting enhancements.

Please note that this checklist provides a preliminary assessment, and the specific obligations under the NIS2 Directive may vary based on national transposition and interpretation by regulatory authorities.

Download the NIS2 Compliance Checklist

General Advice

Regardless of your answers, it’s advisable for all companies, especially those operating within or closely related to critical sectors, to adopt robust cybersecurity measures. The evolving cybersecurity landscape and the interconnected nature of digital services mean that comprehensive security practices are essential for resilience against cyber threats.

For companies potentially falling under the NIS2 Directive, consider the following steps:

Review and Update Security Policies: Ensure that your cybersecurity policies are up-to-date and align with the best practices.
Engage with Regulatory Authorities: Reach out to your national cybersecurity authority or sector-specific regulatory bodies to clarify your status under the NIS2 Directive and to obtain guidance on compliance.
Consult Legal and Cybersecurity Experts: Seek advice from professionals specializing in cybersecurity law and technical security measures to ensure that your company meets all legal obligations and effectively mitigates cyber risks.
Implement a Compliance Plan: Develop or update your cybersecurity compliance plan to address the requirements of the NIS2 Directive, focusing on risk management, incident reporting, supply chain security, and other relevant areas.

Remember, even if your company is not directly affected by the NIS2 Directive, adopting its principles can enhance your cybersecurity posture and potentially offer a competitive advantage by demonstrating a commitment to security to your clients and partners.

Ready to ensure your company is NIS2 compliant? Contact Nebosystems today for expert NIS2 compliance consulting. Our team is dedicated to helping you navigate these regulations, ensuring your cybersecurity measures are robust and compliant. Explore our NIS2 Compliance Cybersecurity Solutions for more information on how we can assist.

Reference: NIS2 Directive (Directive (EU) 2022/2555). EUR-Lex.

Security updates for Thursday

2024-03-14 jake

Post Syndicated from jake original https://lwn.net/Articles/965470/

Security updates have been issued by Debian (chromium and openvswitch), Fedora (chromium, python-multipart, thunderbird, and xen), Mageia (java-17-openjdk and screen), Red Hat (.NET 7.0, .NET 8.0, kernel-rt, kpatch-patch, postgresql:13, and postgresql:15), Slackware (expat), SUSE (glibc, python-Django, python-Django1, sudo, and vim), and Ubuntu (expat, linux-ibm, linux-ibm-5.4, linux-oracle, linux-oracle-5.4, linux-lowlatency, linux-raspi, python-cryptography, texlive-bin, and xorg-server).

Upcoming Let’s Encrypt certificate chain change and impact for Cloudflare customers

2024-03-14 Dina Kozlov

Post Syndicated from Dina Kozlov original https://blog.cloudflare.com/upcoming-lets-encrypt-certificate-chain-change-and-impact-for-cloudflare-customers

Let’s Encrypt, a publicly trusted certificate authority (CA) that Cloudflare uses to issue TLS certificates, has been relying on two distinct certificate chains. One is cross-signed with IdenTrust, a globally trusted CA that has been around since 2000, and the other is Let’s Encrypt’s own root CA, ISRG Root X1. Since Let’s Encrypt launched, ISRG Root X1 has been steadily gaining its own device compatibility.

On September 30, 2024, Let’s Encrypt’s certificate chain cross-signed with IdenTrust will expire. To proactively prepare for this change, on May 15, 2024, Cloudflare will stop issuing certificates from the cross-signed chain and will instead use Let’s Encrypt’s ISRG Root X1 chain for all future Let’s Encrypt certificates.

The change in the certificate chain will impact legacy devices and systems, such as Android devices version 7.1.1 or older, as those exclusively rely on the cross-signed chain and lack the ISRG X1 root in their trust store. These clients may encounter TLS errors or warnings when accessing domains secured by a Let’s Encrypt certificate.

According to Let’s Encrypt, more than 93.9% of Android devices already trust the ISRG Root X1 and this number is expected to increase in 2024, especially as Android releases version 14, which makes the Android trust store easily and automatically upgradable.

We took a look at the data ourselves and found that, from all Android requests, 2.96% of them come from devices that will be affected by the change. In addition, only 1.13% of all requests from Firefox come from affected versions, which means that most (98.87%) of the requests coming from Android versions that are using Firefox will not be impacted.

Preparing for the change

If you’re worried about the change impacting your clients, there are a few things that you can do to reduce the impact of the change. If you control the clients that are connecting to your application, we recommend updating the trust store to include the ISRG Root X1. If you use certificate pinning, remove or update your pin. In general, we discourage all customers from pinning their certificates, as this usually leads to issues during certificate renewals or CA changes.

If you experience issues with the Let’s Encrypt chain change, and you’re using Advanced Certificate Manager or SSL for SaaS on the Enterprise plan, you can choose to switch your certificate to use Google Trust Services as the certificate authority instead.

For more information, please refer to our developer documentation.

While this change will impact a very small portion of clients, we support the shift that Let’s Encrypt is making as it supports a more secure and agile Internet.

Embracing change to move towards a better Internet

Looking back, there were a number of challenges that slowed down the adoption of new technologies and standards that helped make the Internet faster, more secure, and more reliable.

For starters, before Cloudflare launched Universal SSL, free certificates were not attainable. Instead, domain owners had to pay around $100 to get a TLS certificate. For a small business, this is a big cost and without browsers enforcing TLS, this significantly hindered TLS adoption for years. Insecure algorithms have taken decades to deprecate due to lack of support of new algorithms in browsers or devices. We learned this lesson while deprecating SHA-1.

Supporting new security standards and protocols is vital for us to continue improving the Internet. Over the years, big and sometimes risky changes were made in order for us to move forward. The launch of Let’s Encrypt in 2015 was monumental. Let’s Encrypt allowed every domain to get a TLS certificate for free, which paved the way to a more secure Internet, with now around 98% of traffic using HTTPS.

In 2014, Cloudflare launched elliptic curve digital signature algorithm (ECDSA) support for Cloudflare-issued certificates and made the decision to issue ECDSA-only certificates to free customers. This boosted ECDSA adoption by pressing clients and web operators to make changes to support the new algorithm, which provided the same (if not better) security as RSA while also improving performance. In addition to that, modern browsers and operating systems are now being built in a way that allows them to constantly support new standards, so that they can deprecate old ones.

For us to move forward in supporting new standards and protocols, we need to make the Public Key Infrastructure (PKI) ecosystem more agile. By retiring the cross-signed chain, Let’s Encrypt is pushing devices, browsers, and clients to support adaptable trust stores. This allows clients to support new standards without causing a breaking change. It also lays the groundwork for new certificate authorities to emerge.

Today, one of the main reasons why there’s a limited number of CAs available is that it takes years for them to become widely trusted, that is, without cross-signing with another CA. In 2017, Google launched a new publicly trusted CA, Google Trust Services, that issued free TLS certificates. Even though they launched a few years after Let’s Encrypt, they faced the same challenges with device compatibility and adoption, which caused them to cross-sign with GlobalSign’s CA. We hope that, by the time GlobalSign’s CA comes up for expiration, almost all traffic is coming from a modern client and browser, meaning the change impact should be minimal.

Sending and receiving CloudEvents with Amazon EventBridge

2024-03-14 David Boyne

Post Syndicated from David Boyne original https://aws.amazon.com/blogs/compute/sending-and-receiving-cloudevents-with-amazon-eventbridge/

Amazon EventBridge helps developers build event-driven architectures (EDA) by connecting loosely coupled publishers and consumers using event routing, filtering, and transformation. CloudEvents is an open-source specification for describing event data in a common way. Developers can publish CloudEvents directly to EventBridge, filter and route them, and use input transformers and API Destinations to send CloudEvents to downstream AWS services and third-party APIs.

Overview

Event design is an important aspect in any event-driven architecture. Developers building event-driven architectures often overlook the event design process when building their architectures. This leads to unwanted side effects like exposing implementation details, lack of standards, and version incompatibility.

Without event standards, it can be difficult to integrate events or streams of messages between systems, brokers, and organizations. Each system has to understand the event structure or rely on custom-built solutions for versioning or validation.

CloudEvents is a specification for describing event data in common formats to provide interoperability between services, platforms, and systems using Cloud Native Computing Foundation (CNCF) projects. As CloudEvents is a CNCF graduated project, many third-party brokers and systems adopt this specification.

Using CloudEvents as a standard format to describe events makes integration easier and you can use open-source tooling to help build event-driven architectures and future proof any integrations. EventBridge can route and filter CloudEvents based on common metadata, without needing to understand the business logic within the event itself.

CloudEvents support two implementation modes, structured mode and binary mode, and a range of protocols including HTTP, MQTT, AMQP, and Kafka. When publishing events to an EventBridge bus, you can structure events as CloudEvents and route them to downstream consumers. You can use input transformers to transform any event into the CloudEvents specification. Events can also be forwarded to public APIs, using EventBridge API destinations, which supports both structured and binary mode encodings, enhancing interoperability with external systems.

Standardizing events using Amazon EventBridge

When publishing events to an EventBridge bus, EventBridge uses its own event envelope and represents events as JSON objects. EventBridge requires that you define top-level fields, such as detail-type and source. You can use any event/payload in the detail field.

This example event shows an OrderPlaced event from the orders-service that is unstructured without any event standards. The data within the event contains the order_id, customer_id and order_total.

{
  "version": "0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c57",
  "account": "1234567890",
  "time": "2023-05-23T11:38:46Z",
  "region": "us-east-1",
  "detail-type": "OrderPlaced",
  "source": "myapp.orders-service",
  "resources": [],
  "detail": {
    "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    }
  }
}

Publishers may also choose to add an additional metadata field along with the data field within the detail field to help define a set of standards for their events.

{
  "version": "0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
  "account": "1234567890",
  "time": "2023-05-23T12:38:46Z",
  "region": "us-east-1",
  "detail-type": "OrderPlaced",
  "source": "myapp.orders-service",
  "resources": [],
  "detail": {
    "metadata": {
      "idempotency_key": "29d2b068-f9c7-42a0-91e3-5ba515de5dbe",
      "correlation_id": "dddd9340-135a-c8c6-95c2-41fb8f492222",
      "domain": "ORDERS",
      "time": "1707908605"
    },
    "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    }
  }
}

This additional event information helps downstream consumers, improves debugging, and can manage idempotency. While this approach offers practical benefits, it duplicates solutions that are already solved with the CloudEvents specification.

Publishing CloudEvents using Amazon EventBridge

When publishing events to EventBridge, you can use CloudEvents structured mode. A structured-mode message is where the entire event (attributes and data) is encoded in the message body, according to a specific event format. A binary-mode message is where the event data is stored in the message body, and event attributes are stored as part of the message metadata.

CloudEvents has a list of required fields but also offers flexibility with optional attributes and extensions. CloudEvents also offers a solution to implement idempotency, requiring that the combination of id and source must uniquely identify an event, which can be used as the idempotency key in downstream implementations.

{
  "version": "0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
  "account": "1234567890",
  "time": "2023-05-23T12:38:46Z",
  "region": "us-east-1",
  "detail-type": "OrderPlaced",
  "source": "myapp.orders-service",
  "resources": [],
  "detail": {
    "specversion": "1.0",
    "id": "bba4379f-b764-4d90-9fb2-9f572b2b0b61",
    "source": "myapp.orders-service",
    "type": "OrderPlaced",
    "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    },
    "time": "2024-01-01T17:31:00Z",
    "dataschema": "https://us-west-2.console.aws.amazon.com/events/home?region=us-west-2#/registries/discovered-schemas/schemas/myapp.orders-service%40OrderPlaced",
    "correlationid": "dddd9340-135a-c8c6-95c2-41fb8f492222",
    "domain": "ORDERS"
  }
}

By incorporating the required fields, the OrderPlaced event is now CloudEvents compliant. The event also contains optional and extension fields for additional information. Optional fields such as dataschema can be useful for brokers and consumers to retrieve a URI path to the published event schema. This example event references the schema in the EventBridge schema registry, so downstream consumers can fetch the schema to validate the payload.

Mapping existing events into CloudEvents using input transformers

When you define a target in EventBridge, input transformations allow you to modify the event before it reaches its destination. Input transformers are configured per target, allowing you to convert events when your downstream consumer requires the CloudEvents format and you want to avoid duplicating information.

Input transformers allow you to map EventBridge fields, such as id, region, detail-type, and source, into corresponding CloudEvents attributes.

This example shows how to transform any EventBridge event into CloudEvents format using input transformers, so the target receives the required structure.

{
  "version": "0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
  "account": "1234567890",
  "time": "2024-01-23T12:38:46Z",
  "region": "us-east-1",
  "detail-type": "OrderPlaced",
  "source": "myapp.orders-service",
  "resources": [],
  "detail": {
    "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
    "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
    "order_total": "120.00"
  }
}

Using this input transformer and input template EventBridge transforms the event schema into the CloudEvents specification for downstream consumers.

Input transformer for CloudEvents:

{
  "id": "$.id",
  "source": "$.source",
  "type": "$.detail-type",
  "time": "$.time",
  "data": "$.detail"
}

Input template for CloudEvents:

{
  "specversion": "1.0",
  "id": "<id>",
  "source": "<source>",
  "type": "<type>",
  "time": "<time>",
  "data": <data>
}

This example shows the event payload that is received by downstream targets, which is mapped to the CloudEvents specification.

{
  "specversion": "1.0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
  "source": "myapp.orders-service",
  "type": "OrderPlaced",
  "time": "2024-01-23T12:38:46Z",
  "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    }
}

For more information on using input transformers with CloudEvents, see this pattern on Serverless Land.

Transforming events into CloudEvents using API destinations

EventBridge API destinations allows you to trigger HTTP endpoints based on matched rules to integrate with third-party systems using public APIs. You can route events to APIs that support the CloudEvents format by using input transformations and custom HTTP headers to convert EventBridge events to CloudEvents. API destinations now supports custom content-type headers. This allows you to send structured or binary CloudEvents to downstream consumers.

Sending binary CloudEvents using API destinations

When sending binary CloudEvents over HTTP, you must use the HTTP binding specification and set the necessary CloudEvents headers. These headers tell the downstream consumer that the incoming payload uses the CloudEvents format. The body of the request is the event itself.

CloudEvents headers are prefixed with ce-. You can find the list of headers in the HTTP protocol binding documentation.

This example shows the Headers for a binary event:

POST /order HTTP/1.1 
Host: webhook.example.com
ce-specversion: 1.0
ce-type: OrderPlaced
ce-source: myapp.orders-service
ce-id: bba4379f-b764-4d90-9fb2-9f572b2b0b61
ce-time: 2024-01-01T17:31:00Z
ce-dataschema: https://us-west-2.console.aws.amazon.com/events/home?region=us-west-2#/registries/discovered-schemas/schemas/myapp.orders-service%40OrderPlaced
correlationid: dddd9340-135a-c8c6-95c2-41fb8f492222
domain: ORDERS
Content-Type: application/json; charset=utf-8

This example shows the body for a binary event:

{
  "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
  "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
  "order_total": "120.00"
}

For more information when using binary CloudEvents with API destinations, explore this pattern available on Serverless Land.

Sending structured CloudEvents using API destinations

To support structured mode with CloudEvents, you must specify the content-type as application/cloudevents+json; charset=UTF-8, which tells the API consumer that the payload of the event is adhering to the CloudEvents specification.

POST /order HTTP/1.1
Host: webhook.example.com
 
Content-Type: application/cloudevents+json; charset=utf-8
{
    "specversion": "1.0",
    "id": "bba4379f-b764-4d90-9fb2-9f572b2b0b61",
    "source": "myapp.orders-service",
    "type": "OrderPlaced",      
    "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    },
    "time": "2024-01-01T17:31:00Z",
    "dataschema": "https://us-west-2.console.aws.amazon.com/events/home?region=us-west-2#/registries/discovered-schemas/schemas/myapp.orders-service%40OrderPlaced",
    "correlationid": "dddd9340-135a-c8c6-95c2-41fb8f492222",
    "domain":"ORDERS"
}

Conclusion

Carefully designing events plays an important role when building event-driven architectures to integrate producers and consumers effectively. The open-source CloudEvents specification helps developers to standardize integration processes, simplifying interactions between internal systems and external partners.

EventBridge allows you to use a flexible payload structure within an event’s detail property to standardize events. You can publish structured CloudEvents directly onto an event bus in the detail field and use payload transformations to allow downstream consumers to receive events in the CloudEvents format.

EventBridge simplifies integration with third-party systems using API destinations. Using the new custom content-type headers with input transformers to modify the event structure, you can send structured or binary CloudEvents to integrate with public APIs.

For more serverless learning resources, visit Serverless Land.

Какво да очакват хората с ипотеки след въвеждането на еврото

2024-03-14 VassilKendov

Post Syndicated from VassilKendov original https://kendov.com/%D0%BA%D0%B0%D0%BA%D0%B2%D0%BE-%D0%B4%D0%B0-%D0%BE%D1%87%D0%B0%D0%BA%D0%B2%D0%B0%D1%82-%D1%85%D0%BE%D1%80%D0%B0%D1%82%D0%B0-%D1%81-%D0%B8%D0%BF%D0%BE%D1%82%D0%B5%D0%BA%D0%B8-%D1%81%D0%BB%D0%B5%D0%B4/

Пълният запис на видеото с Роси Денева и Васил Кендов можете да намерите в платената секция на Patreon канала Kendov.com

Очаквани промени

– Лихвите по кредитите ще нараснат и ще се изравнят с европейските (между 6-7.5%)
– Кредитирането ще се свие под натиска на ЕЦБ и БНБ
– Цените на имотите ще се диференцират по показател ново и старо строителство

Моля използвайте приложената форма за записване на час за среща
[contact-form-7]

The post Какво да очакват хората с ипотеки след въвеждането на еврото appeared first on Kendov.com.

Mitigating a token-length side-channel attack in our AI products

2024-03-14 Celso Martinho

Post Syndicated from Celso Martinho original https://blog.cloudflare.com/ai-side-channel-attack-mitigated

Since the discovery of CRIME, BREACH, TIME, LUCKY-13 etc., length-based side-channel attacks have been considered practical. Even though packets were encrypted, attackers were able to infer information about the underlying plaintext by analyzing metadata like the packet length or timing information.

Cloudflare was recently contacted by a group of researchers at Ben Gurion University who wrote a paper titled “What Was Your Prompt? A Remote Keylogging Attack on AI Assistants” that describes “a novel side-channel that can be used to read encrypted responses from AI Assistants over the web”.
The Workers AI and AI Gateway team collaborated closely with these security researchers through our Public Bug Bounty program, discovering and fully patching a vulnerability that affects LLM providers. You can read the detailed research paper here.

Since being notified about this vulnerability, we’ve implemented a mitigation to help secure all Workers AI and AI Gateway customers. As far as we could assess, there was no outstanding risk to Workers AI and AI Gateway customers.

How does the side-channel attack work?

In the paper, the authors describe a method in which they intercept the stream of a chat session with an LLM provider, use the network packet headers to infer the length of each token, extract and segment their sequence, and then use their own dedicated LLMs to infer the response.

The two main requirements for a successful attack are an AI chat client running in streaming mode and a malicious actor capable of capturing network traffic between the client and the AI chat service. In streaming mode, the LLM tokens are emitted sequentially, introducing a token-length side-channel. Malicious actors could eavesdrop on packets via public networks or within an ISP.

An example request vulnerable to the side-channel attack looks like this:

curl -X POST \
https://api.cloudflare.com/client/v4/accounts/<account-id>/ai/run/@cf/meta/llama-2-7b-chat-int8 \
  -H "Authorization: Bearer <Token>" \
  -d '{"stream":true,"prompt":"tell me something about portugal"}'

Let’s use Wireshark to inspect the network packets on the LLM chat session while streaming:

The first packet has a length of 95 and corresponds to the token “Port” which has a length of four. The second packet has a length of 93 and corresponds to the token “ug” which has a length of two, and so on. By removing the likely token envelope from the network packet length, it is easy to infer how many tokens were transmitted and their sequence and individual length just by sniffing encrypted network data.

Since the attacker needs the sequence of individual token length, this vulnerability only affects text generation models using streaming. This means that AI inference providers that use streaming — the most common way of interacting with LLMs — like Workers AI, are potentially vulnerable.

This method requires that the attacker is on the same network or in a position to observe the communication traffic and its accuracy depends on knowing the target LLM’s writing style. In ideal conditions, the researchers claim that their system “can reconstruct 29% of an AI assistant’s responses and successfully infer the topic from 55% of them”. It’s also important to note that unlike other side-channel attacks, in this case the attacker has no way of evaluating its prediction against the ground truth. That means that we are as likely to get a sentence with near perfect accuracy as we are to get one where only things that match are conjunctions.

Mitigating LLM side-channel attacks

Since this type of attack relies on the length of tokens being inferred from the packet, it can be just as easily mitigated by obscuring token size. The researchers suggested a few strategies to mitigate these side-channel attacks, one of which is the simplest: padding the token responses with random length noise to obscure the length of the token so that responses can not be inferred from the packets. While we immediately added the mitigation to our own inference product — Workers AI, we wanted to help customers secure their LLMs regardless of where they are running them by adding it to our AI Gateway.

As of today, all users of Workers AI and AI Gateway are now automatically protected from this side-channel attack.

What we did

Once we got word of this research work and how exploiting the technique could potentially impact our AI products, we did what we always do in situations like this: we assembled a team of systems engineers, security engineers, and product managers and started discussing risk mitigation strategies and next steps. We also had a call with the researchers, who kindly attended, presented their conclusions, and answered questions from our teams.

Unfortunately, at this point, this research does not include actual code that we can use to reproduce the claims or the effectiveness and accuracy of the described side-channel attack. However, we think that the paper has theoretical merit, that it provides enough detail and explanations, and that the risks are not negligible.

We decided to incorporate the first mitigation suggestion in the paper: including random padding to each message to hide the actual length of tokens in the stream, thereby complicating attempts to infer information based solely on network packet size.

Workers AI, our inference product, is now protected

With our inference-as-a-service product, anyone can use the Workers AI platform and make API calls to our supported AI models. This means that we oversee the inference requests being made to and from the models. As such, we have a responsibility to ensure that the service is secure and protected from potential vulnerabilities. We immediately rolled out a fix once we were notified of the research, and all Workers AI customers are now automatically protected from this side-channel attack. We have not seen any malicious attacks exploiting this vulnerability, other than the ethical testing from the researchers.

Our solution for Workers AI is a variation of the mitigation strategy suggested in the research document. Since we stream JSON objects rather than the raw tokens, instead of padding the tokens with whitespace characters, we added a new property, “p” (for padding) that has a string value of variable random length.

Example streaming response using the SSE syntax:

data: {"response":"portugal","p":"abcdefghijklmnopqrstuvwxyz0123456789a"}
data: {"response":" is","p":"abcdefghij"}
data: {"response":" a","p":"abcdefghijklmnopqrstuvwxyz012"}
data: {"response":" southern","p":"ab"}
data: {"response":" European","p":"abcdefgh"}
data: {"response":" country","p":"abcdefghijklmno"}
data: {"response":" located","p":"abcdefghijklmnopqrstuvwxyz012345678"}

This has the advantage that no modifications are required in the SDK or the client code, the changes are invisible to the end-users, and no action is required from our customers. By adding random variable length to the JSON objects, we introduce the same network-level variability, and the attacker essentially loses the required input signal. Customers can continue using Workers AI as usual while benefiting from this protection.

One step further: AI Gateway protects users of any inference provider

We added protection to our AI inference product, but we also have a product that proxies requests to any provider — AI Gateway. AI Gateway acts as a proxy between a user and supported inference providers, helping developers gain control, performance, and observability over their AI applications. In line with our mission to help build a better Internet, we wanted to quickly roll out a fix that can help all our customers using text generation AIs, regardless of which provider they use or if they have mitigations to prevent this attack. To do this, we implemented a similar solution that pads all streaming responses proxied through AI Gateway with random noise of variable length.

Our AI Gateway customers are now automatically protected against this side-channel attack, even if the upstream inference providers have not yet mitigated the vulnerability. If you are unsure if your inference provider has patched this vulnerability yet, use AI Gateway to proxy your requests and ensure that you are protected.

Conclusion

At Cloudflare, our mission is to help build a better Internet – that means that we care about all citizens of the Internet, regardless of what their tech stack looks like. We are proud to be able to improve the security of our AI products in a way that is transparent and requires no action from our customers.

We are grateful to the researchers who discovered this vulnerability and have been very collaborative in helping us understand the problem space. If you are a security researcher who is interested in helping us make our products more secure, check out our Bug Bounty program at hackerone.com/cloudflare.

Интервю на Георги Георгиев от БОЕЦ: „ДАНС е предала пътната карта за Турски поток на Стефан Янев, Кирил Петков не е знаел“

2024-03-14 Николай Марченко

Post Syndicated from Николай Марченко original https://bivol.bg/boec-geoergiev-turskipotok.html

четвъртък 14 март 2024

Пътната карта за газопровода „Турски поток 2“ („Балкански поток“) е открита от Държавна агенция „Национална сигурност“ (ДАНС) още през 2021 г. Това сподели в интервю за „Биволъ“ председателят на гражданското…

Съвременната алтернативна турска музика е отделен жанр със собствена уникалност и ритъм,

През 1934 г., по време на културните промени на Ататюрк, турската музика в радиостанциите е забранена.

Първото истинско западно влияние върху турската музика е на „Бийтълс“,

Съвременната турска музика вече е част от европейската и световната сцена. И не се бори за нищо – просто е.

The importance of submarine cables

Kafka cluster overview

Kafka as an architectural pillar

Why we migrated to Amazon MSK

How we migrated to Amazon MSK

Kafka topic migration strategy

Challenges and lessons learned

Challenge 1: Offset discrepancies

Challenge 2: Slow data migration

Challenge 3: Lack of detailed process documentation

Final setup and next steps

Conclusion.

About the Authors

Disaster Recovery Challenges

Insufficient Resources

Simplicity vs. Comprehensive Coverage: Keeping Up With Scale

The LTO Fallacy

Embracing Cloud Readiness

Cloud Storage Considerations for Media Teams

Cloud Backup in Action

The Road Ahead

Preparing for the change

Embracing change to move towards a better Internet

Overview

Standardizing events using Amazon EventBridge

Publishing CloudEvents using Amazon EventBridge

Mapping existing events into CloudEvents using input transformers

Transforming events into CloudEvents using API destinations

Sending binary CloudEvents using API destinations

Sending structured CloudEvents using API destinations

Conclusion

How does the side-channel attack work?

Mitigating LLM side-channel attacks

What we did

Workers AI, our inference product, is now protected

One step further: AI Gateway protects users of any inference provider

Conclusion

The collective thoughts of the interwebz