Is a Rainbow Rising over Sofia and Bulgaria?

Post Syndicated from Светла Енчева original https://www.toest.bg/izgryava-li-duga-nad-sofia-i-bulgaria/

On June 22, Sofia Pride will take place for the 17th consecutive year. As always, the largest event in defense of the rights of LGBTI+ people in Bulgaria is accompanied by debates “for” and “against”. Once again there is an anti-parade. Two, in fact, because a split has opened in the ranks of what has for years been the main “anti-event”. As a result, the so-called “March for the Family” was left somewhat orphaned and took place a week before Pride. On the day of Sofia Pride there will be a “Procession for the Family”, which is expected to draw more participants and at which former punk icon Milena Slavova has promised to sing.

By family, both the “March” and the “Procession” mean the heterosexual family of a man, a woman, and their children. According to the organizers, this is the “traditional family”, to which homosexual relationships, and LGBTI+ people more generally, are a threat.

This year, however, public activity around Pride goes beyond the usual fruitless “for” and “against” arguments. An event in the Sofia Municipal Council (SOS) did not become headline news, but it is an important precedent for LGBTI+ people. That precedent became possible because another one, a double precedent at that, had arrived almost unnoticed. What is this about?

Two declarations. And a recipe for pink unicorns

“Recipe for Pink Unicorns” is a song by the band Cool Den. Several groups in the SOS also mixed up a recipe, topped with a large plush pink unicorn that settled on one municipal councilor’s chair.

On May 7, “Blue Bulgaria” circulated a declaration on Facebook in support of the “Procession for the Family” and called on its sympathizers to join it. The declaration says the coalition stands “behind the cause of the family, faith, and the fight against the demographic crisis” because it is a “right-wing conservative formation”. It also says that “there is no bigger problem for Bulgaria than the demographic crisis, and the fight against it must be a priority of every future government”.

“Blue Bulgaria” is a coalition of seven parties that tried (unsuccessfully) to extend to the national level the success of “Blue Sofia”, which is part of the coalition in the local elections. In the SOS, “Blue Sofia” is represented by three municipal councilors: Vili Lilkov, Ivan Sotirov, and Ivaylo Yonkov. The declaration was circulated on the last pre-election day on which political campaigning is permitted.

On June 13, the group of municipal councilors from “We Continue the Change”, “Democratic Bulgaria”, and “Save Sofia” issued its own declaration regarding the “Procession for the Family” and Sofia Pride. It was read by attorney Hristo Koparanov, who, together with Gergin Borisov (both from “Save Sofia”), initiated and submitted it. This is the first time that the largest group in the SOS has declared support for Sofia Pride and, more broadly, for the rights of LGBTI+ people.

The declaration calls the opposition between Pride and the “Procession for the Family” a “false dilemma”, because “the family is not a mathematical formula” of mother, father, and their children: single parents and their children, as well as partners without children, for example, are also families. And “everyone deserves a dignified life”.

“That is why, when for years thousands of citizens have come out to tell us that they feel wronged, that they live in fear, that they are deprived of rights, we are obliged to hear them,” the declaration says, and it mentions cases of homophobic violence. It also stresses that Pride is not only for LGBTI+ people but is about “affirming the rights and dignity of everyone who is different from the majority, whether racially, ethnically, or on any other ground”.

All families are valuable. We all have families: some of us are born into them, others choose them, still others create them,

the declaration concludes.

While Koparanov is reading from the rostrum, members of the BSP group place a plush unicorn on his seat. One could take this gesture as mockery and be offended, but the attorney prefers to react positively. He posts a selfie with the unicorn, thanking his colleagues “for the lovely gift”. Koparanov interprets it as a sign that “BSP have finally positioned themselves as a European left-wing party” and that “their results in the latest elections have led to catharsis”.

The double precedent

The declaration’s initiators, Hristo Koparanov and Gergin Borisov, have been advocates for the rights of LGBTI+ people for years. That is why Toest spoke with them.

Koparanov is the core of the research team that in 2018 published a legal analysis of 70 laws. According to the study, the Bulgarian state deprives same-sex couples of around 300 rights because it does not recognize their partnership. Recently Borisov recalled a post of his from 2021 in which he tells an invented, but possible and frighteningly realistic, story of a man whose partner is dying in hospital while he can neither obtain information about his condition nor go on living in the home the two of them built.

“Since I published that post with the hospital story, at least 10 people have told me they are leaving [Bulgaria]. 10 people in 3 years. Most were specialists with salaries above 5,000 leva a month,” says Gergin Borisov, and he calculates, in the spirit of the recent study by the Institute for Market Economics on the economic cost of homophobia: “1.8 million leva of economic activity vanished… because it is important to hold referendums against gender ideology.”

“It is a pity that such a declaration has to be read. It is frightening that we live in a time when saying that all families are important and valuable can be a topic on which not everyone agrees,” says Borisov, adding that he wants to live in a normal country and is therefore trying to change it.

I want my acquaintances who gave up waiting to have a reason to come back, because they see that it is becoming a better, more tolerant, and more European place to live.

“The avoidance of the topic of LGBT people’s rights by many politicians is a natural consequence of the toxic environment in Bulgarian politics and among voters,” says Hristo Koparanov. In his view it is understandable, but “this caution means that at any given moment ‘now is not the time’. And the time to raise the topic never comes”. Meanwhile, homophobic voices are growing ever louder.

Asked why he keeps speaking on the topic now that he is a municipal councilor, he replies that the link in the question should actually run the other way:

Entering the Municipal Council should not be a reason to stop talking about the topic but exactly the opposite: a reason to talk about it even more. As a member of the Municipal Council, a councilor must shape the directions in which his city develops and create the vision for it. That is why […] we want to create a safe and accepting city in which everyone can be themselves with dignity.

According to Gergin Borisov, “Save Sofia” is a party that “does not measure topics by what is popular but by what is important”. And what is important is for Sofia to be “a city for everyone, and that includes the LGBTI+ community”. Borisov adds: “I keep speaking on the topic because I am part of this community and the pursuit of equality is mine too.” To surprise that a person in elected public office in Bulgaria can say this, he answers with a smile:

“I have been out for years, and there is no force on the planet that can shove me back into the closet.”

“Speaking openly without fear is very important, but taking real measures is even more important,” says Hristo Koparanov, who, like Borisov, does not need to come out about his sexual orientation, since he does not hide it in the first place. He has experience with speaking without fear, and he knows its price. “For example, after my latest posts and the declaration read in the Municipal Council, I became the target of a serious homophobic reaction, and I have even received threats to my life.”

This particular threat he has sent to the prosecutor’s office:

[Image: the threat sent to the prosecutor’s office]

And in private messages the attorney also receives “wishes” like these:

[Image: private messages received by the attorney]

Koparanov stands up for his rights, but at the same time he keeps responding with good-natured irony to the mockery and rumors about him. “Because if there are no brave politicians to speak out, and if society shows tolerant forbearance toward the homophobic backlash, the silence will go on forever.”

Knowing that the tabloid media spread gossip about him, he posts a current photo on Facebook in which he is with a man at his side, “so they can update their archive, because I am tired of them uploading old photos of us whenever they decide to stir up intrigue”.

Without giving any sign that they realize how great a precedent they are, Hristo Koparanov and Gergin Borisov are probably the first people in elected public office in Bulgaria who are open about their sexual orientation.

Once every ten years or so

An openly homosexual politician appears in Bulgaria once every ten years or so. Few artifacts have survived online from the distant year 2004, when Ivelin Yordanov, then a 27-year-old member of BSP, disclosed his sexual orientation to the media. He was a founder of a queer organization and, it seems, believed that the “Centenarian” (a nickname for BSP) could become a European-style left-wing party.

According to activists Monika Pisankaneva and Stanimir Panayotov, who covered the case at the time, with the young socialist’s coming out the silence on the topic of LGBTI+ people was “shaken once and for all”. But BSP, it seems, tried to hush up the case by not sending Yordanov to a forum on the topic. And the party organ, the newspaper Duma, gave him a right of reply, after which it declared “its task in discussing the complex problem of homosexuality completed”. And the “silence shaken once and for all” returned.

Eleven years later, in 2015, Viktor Lilov, who was then a mayoral candidate from the small liberal party DEOS, came out as homosexual in an interview with the now late Emil Cohen for the site Marginalia. Tatyana Vaksberg noted that it was the first time in Bulgaria that someone running for elected office had declared a non-heterosexual orientation.

The case had wide public resonance, and for years Lilov’s sexuality was almost the only thing he was associated with. It was also one of the reasons for the sentiment against him within DEOS. Two years after Lilov came out, the party, of which he had meanwhile become chair, expelled him on trumped-up charges. A little later DEOS shut itself down, remaining a hollow registration that might eventually be used for the purposes of some new political project. And Viktor Lilov tried to enter politics through other formations as well, without success.

Nine years later, in 2024, Gergin Borisov and Hristo Koparanov are elected members of the SOS. Without hiding, but also without scandal. As something perfectly normal, which in fact it is. And, likewise as something perfectly normal, they are trying to harness their group and the whole Municipal Council to work for equal rights.

I have heard of many cases when [members of parliament] voted a certain way because so-and-so had a cassette, a photo, and so on. After 1990 a small group of people determined Bulgaria’s fate, but I do not know whether they decided on their own or were guided by other considerations because of their hidden sexual orientation.

These words are entirely relevant today (except that there are no longer cassettes but flash drives, social networks, apps, and private messages), yet they were spoken by Ivelin Yordanov when he came out in 2004. Not only for quite a few politicians but also for journalists and other public figures, disclosing their sexual orientation is tantamount to a public lynching. Although the orientation of some of them is an open secret, they continue to endure pressure and humiliation without being able to afford to come out.

Perhaps Hristo Koparanov and Gergin Borisov, who have a four-year mandate to speak from a public rostrum, will at last topple “once and for all” the silence that Monika Pisankaneva and Stanimir Panayotov spoke of 20 years ago. And they will help the rainbow rise under which everyone can be themselves. It is also possible, however, that the stigma will drown out their voices. In another 20 years we will certainly know the answer.

Build multimodal search with Amazon OpenSearch Service

Post Syndicated from Praveen Mohan Prasad original https://aws.amazon.com/blogs/big-data/build-multimodal-search-with-amazon-opensearch-service/

Multimodal search enables both text and image search capabilities, transforming how users access data through search applications. Consider building an online fashion retail store: you can enhance the search experience with a visually appealing application in which customers not only search using text but also upload an image depicting a desired style, and then use that image alongside the input text to find the most relevant items. Multimodal search provides more flexibility in deciding how to find the most relevant information for your search.

To enable multimodal search across text, images, and combinations of the two, you generate embeddings for both text-based image metadata and the image itself. Text embeddings capture document semantics, while image embeddings capture visual attributes that help you build rich image search applications.

Amazon Titan Multimodal Embeddings G1 is a multimodal embedding model that generates embeddings to facilitate multimodal search. These embeddings are stored and managed efficiently using specialized vector stores such as Amazon OpenSearch Service, which is designed to store and retrieve large volumes of high-dimensional vectors alongside structured and unstructured data. By using this technology, you can build rich search applications that seamlessly integrate text and visual information.

Amazon OpenSearch Service and Amazon OpenSearch Serverless support the vector engine, which you can use to store and run vector searches. In addition, OpenSearch Service supports neural search, which provides out-of-the-box machine learning (ML) connectors. These ML connectors enable OpenSearch Service to seamlessly integrate with embedding models and large language models (LLMs) hosted on Amazon Bedrock, Amazon SageMaker, and other remote ML platforms such as OpenAI and Cohere. When you use the neural plugin’s connectors, you don’t need to build additional pipelines external to OpenSearch Service to interact with these models during indexing and searching.

This blog post provides a step-by-step guide for building a multimodal search solution using OpenSearch Service. You will use ML connectors to integrate OpenSearch Service with the Amazon Bedrock Titan Multimodal Embeddings model to infer embeddings for your multimodal documents and queries. This post illustrates the process by showing you how to ingest a retail dataset containing both product images and product descriptions into your OpenSearch Service domain and then perform a multimodal search by using vector embeddings generated by the Titan multimodal model. The code used in this tutorial is open source and available on GitHub for you to access and explore.

Multimodal search solution architecture

We will provide the steps required to set up multimodal search using OpenSearch Service. The following image depicts the solution architecture.

Figure 1: Multimodal search architecture

The workflow depicted in the preceding figure is:

  1. You download the retail dataset from Amazon Simple Storage Service (Amazon S3) and ingest it into an OpenSearch k-NN index using an OpenSearch ingest pipeline.
  2. OpenSearch Service calls the Amazon Bedrock Titan Multimodal Embeddings model to generate multimodal vector embeddings for both the product description and image.
  3. Through an OpenSearch Service client, you pass a search query.
  4. OpenSearch Service calls the Amazon Bedrock Titan Multimodal Embeddings model to generate a vector embedding for the search query.
  5. OpenSearch runs the neural search and returns the search results to the client.

Let’s look at steps 1, 2, and 4 in more detail.

Step 1: Ingestion of the data into OpenSearch

This step involves the following OpenSearch Service features:

  • Ingest pipelines – An ingest pipeline is a sequence of processors that are applied to documents as they’re ingested into an index. Here you use a text_image_embedding processor to generate combined vector embeddings for the image and image description.
  • k-NN index – The k-NN index introduces a custom data type, knn_vector, which allows users to ingest vectors into an OpenSearch index and perform different kinds of k-NN searches. You use the k-NN index to store both the general field data types, such as text, numeric, etc., and specialized field data types, such as knn_vector.

Steps 2 and 4: OpenSearch calls the Amazon Bedrock Titan model

OpenSearch Service uses the Amazon Bedrock connector to generate embeddings for the data. When you send the image and text as part of your indexing and search requests, OpenSearch uses this connector to exchange the inputs with the equivalent embeddings from the Amazon Bedrock Titan model. The highlighted blue box in the architecture diagram depicts the integration of OpenSearch with Amazon Bedrock using this ML-connector feature. This direct integration eliminates the need for an additional component (for example, AWS Lambda) to facilitate the exchange between the two services.

Solution overview

In this post, you will build and run multimodal search using a sample retail dataset. You will use the same multimodal generated embeddings and experiment by running text search only, image search only and both text and image search in OpenSearch Service.

Prerequisites

  1. Create an OpenSearch Service domain. For instructions, see Creating and managing Amazon OpenSearch Service domains. Make sure the following settings are applied when you create the domain, while leaving other settings as default.
    • OpenSearch version is 2.13
    • The domain has public access
    • Fine-grained access control is enabled
    • A master user is created
  2. Set up a Python client to interact with the OpenSearch Service domain, preferably on a Jupyter Notebook interface.
  3. Add model access in Amazon Bedrock. For instructions, see add model access.

Note that you need to refer to the Jupyter Notebook in the GitHub repository to run the following steps using Python code in your client environment. The following sections provide the sample blocks of code that contain only the HTTP request path and the request payload to be passed to OpenSearch Service at every step.
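Because the snippets in this post show only a request path and a payload, the following minimal sketch illustrates one way the two could be combined into an HTTP request from a Python client. It is not the notebook's code: the function name build_request, the endpoint, and the credentials are hypothetical, and it assumes the master user is an internal user that authenticates with HTTP basic authentication. The request is built but not sent.

```python
import base64
import json
import urllib.request


def build_request(endpoint, path, payload, user, password, method="PUT"):
    """Build (but do not send) an HTTP request for an OpenSearch API call."""
    url = f"{endpoint.rstrip('/')}/{path.lstrip('/')}"
    # Basic authentication header; assumes an internal master user.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Hypothetical endpoint, pipeline name, and credentials, for illustration only.
req = build_request(
    "https://my-domain.example.com",
    "_ingest/pipeline/my-pipeline",
    {"description": "A text/image embedding pipeline"},
    user="master-user",
    password="example-password",
)
# Passing req to urllib.request.urlopen would send it to the domain.
```

A dedicated OpenSearch client library would normally handle connection details for you; the sketch only makes the path-plus-payload convention concrete.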

Data overview and preparation

You will be using a retail dataset that contains 2,465 retail product samples that belong to different categories such as accessories, home decor, apparel, housewares, books, and instruments. Each product contains metadata including the ID, current stock, name, category, style, description, price, image URL, and gender affinity of the product. You will be using only the product image and product description fields in the solution.

A sample product image and product description from the dataset are shown in the following image:

Figure 2: Sample product image and description

In addition to the original product image, the textual description of the image provides additional metadata for the product, such as color, type, style, suitability, and so on. For more information about the dataset, visit the retail demo store on GitHub.

Step 1: Create the OpenSearch-Amazon Bedrock ML connector

The OpenSearch Service console provides a streamlined integration process that allows you to deploy an Amazon Bedrock-ML connector for multimodal search within minutes. OpenSearch Service console integrations provide AWS CloudFormation templates to automate the steps of Amazon Bedrock model deployment and Amazon Bedrock-ML connector creation in OpenSearch Service.

  1. In the OpenSearch Service console, navigate to Integrations as shown in the following image and search for Titan multi-modal. This returns the CloudFormation template named Integrate with Amazon Bedrock Titan Multi-modal, which you will use in the following steps.
     Figure 3: Configure domain
  2. Select Configure domain and choose ‘Configure public domain’.
  3. You are automatically redirected to a CloudFormation template stack as shown in the following image, where most of the configuration is pre-populated for you, including the Amazon Bedrock model, the ML model name, and the AWS Identity and Access Management (IAM) role that is used by Lambda to invoke your OpenSearch domain. Update Amazon OpenSearch Endpoint with your OpenSearch domain endpoint and Model Region with the AWS Region in which your model is available.
     Figure 4: Create a CloudFormation stack
  4. Before you deploy the stack by choosing ‘Create Stack’, you need to grant the permissions that the stack requires to create the ML connector. The CloudFormation template creates a Lambda IAM role for you with the default name LambdaInvokeOpenSearchMLCommonsRole, which you can override if you want to choose a different name. You need to map this IAM role as a backend role for the ml_full_access role in the OpenSearch Dashboards Security plugin, so that the Lambda function can successfully create the ML connector. To do so:
    • Log in to OpenSearch Dashboards using the master user credentials that you created as part of the prerequisites. You can find the Dashboards endpoint on your domain dashboard on the OpenSearch Service console.
    • From the main menu, choose Security, Roles, and select the ml_full_access role.
    • Choose Mapped users, Manage mapping.
    • Under Backend roles, add the ARN of the Lambda role (arn:aws:iam::<account-id>:role/LambdaInvokeOpenSearchMLCommonsRole) that needs permission to call your domain.
    • Select Map and confirm the user or role shows up under Mapped users.
     Figure 5: Set permissions in OpenSearch dashboards security plugin
  5. Return to the CloudFormation stack console, select the check box ‘I acknowledge that AWS CloudFormation might create IAM resources with customised names’, and choose ‘Create Stack’.
  6. After the stack is deployed, it creates the Amazon Bedrock-ML connector (ConnectorId) and a model identifier (ModelId).
     Figure 6: CloudFormation stack outputs
  7. Copy the ModelId from the Outputs tab of the CloudFormation stack starting with prefix ‘OpenSearch-bedrock-mm-’ in your CloudFormation console. You will use this ModelId in later steps.

Step 2: Create the OpenSearch ingest pipeline with the text_image_embedding processor

You can create an ingest pipeline with the text_image_embedding processor, which transforms the images and descriptions into embeddings during the indexing process.

In the following request payload, you provide the following parameters to the text_image_embedding processor. Specify which index fields to convert to embeddings, which field should store the vector embeddings, and which ML model to use to perform the vector conversion.

  • model_id (<model_id>) – The model identifier from the previous step.
  • embedding (<vector_embedding>) – The k-NN field that stores the vector embeddings.
  • field_map (<product_description> and <image_binary>) – The field name of the product description and the product image in binary format.

path = "_ingest/pipeline/<bedrock-multimodal-ingest-pipeline>"

# ..
# Replace each <placeholder> string with your own value.
payload = {
    "description": "A text/image embedding pipeline",
    "processors": [
        {
            "text_image_embedding": {
                "model_id": "<model_id>",
                "embedding": "<vector_embedding>",
                "field_map": {
                    "text": "<product_description>",
                    "image": "<image_binary>",
                },
            }
        }
    ],
}

Step 3: Create the k-NN index and ingest the retail dataset

Create the k-NN index and set the pipeline created in the previous step as the default pipeline. Set index.knn to True to enable approximate k-NN search. The vector_embedding field must be mapped as type knn_vector, and its dimension must match the number of dimensions of the vectors that the model produces.

Amazon Titan Multimodal Embeddings G1 lets you choose the size of the output vector (256, 512, or 1024). In this post, you will be using the default 1024-dimensional vectors from the model. You can check the dimension size of the model by selecting ‘Providers’ -> ‘Amazon’ tab -> ‘Titan Multimodal Embeddings G1’ tab -> ‘Model attributes’ in your Bedrock console.

Given the smaller size of the dataset and to bias for better recall, you use the faiss engine with the hnsw algorithm and the default l2 space type for your k-NN index. For more information about different engines and space types, refer to k-NN index.

# Replace the <ingest-pipeline> placeholder with your pipeline name.
payload = {
    "settings": {
        "index.knn": True,
        "default_pipeline": "<ingest-pipeline>",
    },
    "mappings": {
        "properties": {
            "vector_embedding": {
                "type": "knn_vector",
                "dimension": 1024,
                "method": {
                    "engine": "faiss",
                    "space_type": "l2",
                    "name": "hnsw",
                    "parameters": {},
                },
            },
            "product_description": {"type": "text"},
            "image_url": {"type": "text"},
            "image_binary": {"type": "binary"},
        }
    },
}
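To make the l2 space type concrete, the following sketch ranks a few vectors by Euclidean (L2) distance to a query vector. This is an exact, brute-force search written for illustration, not what the hnsw algorithm does internally: hnsw approximates this ranking so that it scales to large indexes. The three-dimensional vectors and the product names are invented; real Titan embeddings have 1024 dimensions by default.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(query, vectors, k):
    """Return the ids of the k vectors closest to the query by L2 distance."""
    ranked = sorted(vectors, key=lambda doc_id: l2_distance(query, vectors[doc_id]))
    return ranked[:k]


# Invented toy embeddings, for illustration only.
toy_vectors = {
    "sandal": [0.9, 0.1, 0.0],
    "sneaker": [0.8, 0.3, 0.1],
    "scarf": [0.1, 0.9, 0.2],
}
nearest = exact_knn([1.0, 0.0, 0.0], toy_vectors, k=2)
print(nearest)
```

A smaller L2 distance means a closer match, so the two footwear vectors rank ahead of the scarf for this query.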

Finally, you ingest the retail dataset into the k-NN index using a bulk request. For the ingestion code, refer to step 7, ‘Ingest the dataset into k-NN index using Bulk request’, in the Jupyter notebook.
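As a rough illustration of what such a bulk request body looks like, the sketch below builds the newline-delimited JSON that the _bulk API expects: one action line followed by one document line per product, with a trailing newline. It is not the notebook's code. The function name build_bulk_body, the index name, and the keys of the product dictionaries are hypothetical; the document field names match the index mapping above, and the image is base64-encoded because that is how a binary field is supplied.

```python
import base64
import json


def build_bulk_body(index_name, products):
    """Build a newline-delimited JSON body for the OpenSearch _bulk API."""
    lines = []
    for product in products:
        # Action line: index this document under the given id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": product["id"]}}))
        # Document line: field names follow the k-NN index mapping.
        lines.append(json.dumps({
            "product_description": product["description"],
            "image_url": product["image_url"],
            "image_binary": base64.b64encode(product["image_bytes"]).decode("ascii"),
        }))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Invented sample product; the bytes stand in for a real image file.
sample_products = [
    {
        "id": "1",
        "description": "Trendy sandals for women",
        "image_url": "https://example.com/sandal.jpg",
        "image_bytes": b"not-a-real-image",
    }
]
bulk_body = build_bulk_body("my-multimodal-index", sample_products)
```

Because the index sets a default pipeline, the text_image_embedding processor would add the vector_embedding field when documents like these are indexed.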

Step 4: Perform multimodal search experiments

Perform the following experiments to explore multimodal search and compare results. For text search, use the sample query “Trendy footwear for women” and set the number of results to 5 (size) throughout the experiments.

Experiment 1: Lexical search

This experiment shows you the limitations of simple lexical search and how the results can be improved using multimodal search.

Run a match query against the product_description field by using the following example query payload:

payload = {
    "query": {
        "match": {
            "product_description": {
                "query": "Trendy footwear for women"
            }
        }
    },
    "size": 5,
}

Results:

Figure 7: Lexical search results

Observation:

As shown in the preceding figure, the first three results refer to a jacket, glasses, and a scarf, which are irrelevant to the query. They were returned because keywords such as “trendy” and “women” appear in both the query, “Trendy footwear for women,” and the product descriptions. Only the last two results, which are footwear items, fulfil the intent of the query.

Experiment 2: Multimodal search with only text as input

In this experiment, you will use the Titan Multimodal Embeddings model that you deployed previously and run a neural search with only “Trendy footwear for women” (text) as input.

In the k-NN vector field (vector_embedding) of the neural query, you pass the model_id, query_text, and k value as shown in the following example. k denotes the number of results returned by the k-NN search.

# Replace the <model_id> placeholder with your ModelId.
payload = {
    "query": {
        "neural": {
            "vector_embedding": {
                "query_text": "Trendy footwear for women",
                "model_id": "<model_id>",
                "k": 5,
            }
        }
    },
    "size": 5,
}

Results:

Figure 8: Results from multimodal search using text

Observation:

As shown in the preceding figure, all five results are relevant because each represents a style of footwear. Additionally, the gender preference from the query (women) is also matched in all the results, which indicates that the Titan multimodal embeddings preserved the gender context in both the query and nearest document vectors.

Experiment 3: Multimodal search with only an image as input

In this experiment, you will use only a product image as the input query.

You will use the same neural query and parameters as in the previous experiment but pass the query_image parameter instead of using the query_text parameter. You need to convert the image into binary format and pass the binary string to the query_image parameter:

Figure 9: Image of a woman’s sandal used as the query input

# Replace the <placeholder> strings with the encoded image and your ModelId.
payload = {
    "query": {
        "neural": {
            "vector_embedding": {
                "query_image": "<query_image_binary>",
                "model_id": "<model_id>",
                "k": 5,
            }
        }
    },
    "size": 5,
}
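The conversion of the query image can be sketched as follows. This is a minimal example, assuming that query_image expects the image bytes as a base64-encoded string; the function name build_neural_query is hypothetical, and the placeholder bytes stand in for the contents of a real image file. The optional text argument adds a query_text entry to the same clause when it is supplied.

```python
import base64


def build_neural_query(model_id, image_bytes, text=None, k=5, field="vector_embedding"):
    """Build a neural query payload from raw image bytes and optional text."""
    clause = {
        # Assumption: the image is passed as a base64-encoded string.
        "query_image": base64.b64encode(image_bytes).decode("ascii"),
        "model_id": model_id,
        "k": k,
    }
    if text is not None:
        clause["query_text"] = text
    return {"query": {"neural": {field: clause}}, "size": k}


# Placeholder bytes; in practice, read them with open(path, "rb").read().
fake_image = b"not-a-real-image"
image_query = build_neural_query("<model_id>", fake_image)
```

Keeping the encoding inside one helper avoids repeating it for every query image you try.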

Results:

Figure 10: Results from multimodal search using an image

Observation:

As shown in the preceding figure, by passing an image of a woman’s sandal, you were able to retrieve similar footwear styles. Though this experiment provides a different set of results compared to the previous experiment, all the results are highly related to the search query. All the matching documents are similar to the searched product image, not only in terms of the product category (footwear) but also in terms of the style (summer footwear), color, and gender affinity of the product.

Experiment 4: Multimodal search with both text and an image

In this last experiment, you will run the same neural query but pass both the image of a woman’s sandal and the text “dark color” as inputs.

Figure 11: Image of a woman’s sandal used as part of the query input

As before, you will convert the image into its binary form before passing it to the query:

# Replace the <placeholder> strings with the encoded image and your ModelId.
payload = {
    "query": {
        "neural": {
            "vector_embedding": {
                "query_image": "<query_image_binary>",
                "query_text": "dark color",
                "model_id": "<model_id>",
                "k": 5,
            }
        }
    },
    "size": 5,
}

Results:

Figure 12: Results of query using text and an image

Observation:

In this experiment, you augmented the image query with a text query to return dark, summer-style shoes. This experiment provided more comprehensive options by taking into consideration both text and image input.

Overall observations

Based on the experiments, all the variants of multimodal search provided more relevant results than a basic lexical search. After experimenting with text-only search, image-only search, and a combination of the two, it’s clear that the combination of text and image modalities provides more search flexibility and, as a result, more specific footwear options to the user.

Clean up

To avoid incurring continued AWS usage charges, delete the Amazon OpenSearch Service domain that you created and delete the CloudFormation stack starting with prefix ‘OpenSearch-bedrock-mm-’ that you deployed to create the ML connector.

Conclusion

In this post, we showed you how to use OpenSearch Service and the Amazon Bedrock Titan Multimodal Embeddings model to run multimodal search using both text and images as inputs. We also explained how the new multimodal processor in OpenSearch Service makes it easier for you to generate text and image embeddings using an OpenSearch ML connector, store the embeddings in a k-NN index, and perform multimodal search.

Learn more about ML-powered search with OpenSearch and set up your multimodal search solution in your own environment using the guidelines in this post. The solution code is also available on the GitHub repo.


About the Authors

Praveen Mohan Prasad is an Analytics Specialist Technical Account Manager at Amazon Web Services and helps customers with proactive operational reviews on analytics workloads. Praveen actively researches how to apply machine learning to improve search relevance.

Hajer Bouafif is an Analytics Specialist Solutions Architect at Amazon Web Services. She focuses on Amazon OpenSearch Service and helps customers design and build well-architected analytics workloads in diverse industries. Hajer enjoys spending time outdoors and discovering new cultures.

Aruna Govindaraju is an Amazon OpenSearch Specialist Solutions Architect and has worked with many commercial and open-source search engines. She is passionate about search, relevancy, and user experience. Her expertise with correlating end-user signals with search engine behavior has helped many customers improve their search experience. Her favourite pastime is hiking the New England trails and mountains.

[$] Adding a JIT compiler to CPython

Post Syndicated from jake original https://lwn.net/Articles/977855/

One of the big-ticket items for the upcoming Python 3.13 release is an experimental just-in-time (JIT) compiler for the language; the other is, of course, the removal of the global interpreter lock (GIL), which is also an experiment. Brandt Bucher is a member of the Faster CPython project, which is working on making the reference implementation of the language faster via a variety of techniques. Last year at PyCon, he gave a talk about the specializing adaptive interpreter; at PyCon 2024 in Pittsburgh, he described the work he and others have been doing to add a copy-and-patch JIT compiler to CPython.

Introducing AWS Glue usage profiles for flexible cost control

Post Syndicated from Noritaka Sekiyama original https://aws.amazon.com/blogs/big-data/introducing-aws-glue-usage-profiles-for-flexible-cost-control/

AWS Glue is a serverless data integration service that enables you to run extract, transform, and load (ETL) workloads on your data in a scalable and serverless manner. One of the main advantages of using a cloud platform is its flexibility; you can provision compute resources when you actually need them. However, with this ease of creating resources comes a risk of spiraling cloud costs when those resources are left unmanaged or without guardrails. As a result, admins need to balance avoiding high infrastructure costs with allowing users to work without unnecessary friction.

To address that, today we are excited to announce the general availability of AWS Glue usage profiles. With AWS Glue usage profiles, admins can create different profiles for various classes of users within the account, such as developers, testers, and product teams. Each profile is a unique set of parameters that can be assigned to different types of users. For example, developers may need more workers and can have a higher number of maximum workers, whereas product teams may need fewer workers and a lower timeout or idle timeout value.

How AWS Glue usage profiles work

An AWS Glue usage profile is a resource identified by an Amazon Resource Name (ARN) for better governance of resources. Admins have the ability to create AWS Glue usage profiles and define default values to be used when a parameter value is not provided. For example, you can create an AWS Glue usage profile with the default number of workers set to 2. When you sign in to the AWS Glue console using the AWS Identity and Access Management (IAM) user associated with the usage profile and create a new job, the initial value configured for the number of workers shows as 2 instead of the service default of 10.

Additionally, you can specify a set of allowed values for validation when a user associated with this profile creates a resource. If the parameter is numeric, admins can define a range of allowed values by specifying minimum and maximum values, instead of a specific set. For example, you can create an AWS Glue usage profile that allows only G.1X worker types. When you sign in to the AWS Glue console using an IAM user associated with this usage profile and create a job with a G.2X worker type, saving it will result in a failure.

Because an AWS Glue usage profile is a resource identified by an ARN, all the default IAM controls apply, including action-based, resource-based, and tag-based authorization. Admins update the IAM policy of users who create AWS Glue resources, granting them read permission on the profiles so that users can view them. To have a profile applied when users make API calls to create AWS Glue resources, admins tag the user or role with glue:UsageProfile as the key and the profile name as the value. AWS Glue validates API requests such as CreateJob, UpdateJob, StartJobRun, and CreateSession against the values specified in the profile and raises appropriate exceptions.
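To make the default-and-validation behavior concrete, here is a minimal Python sketch of how a profile could be applied to a request. It is not the AWS Glue implementation: the `ParameterRule` class, the `resolve` function, the `ProfileViolation` exception, and the parameter names `numberOfWorkers` and `workerType` are all hypothetical names chosen for illustration, and the minimum of 1 worker is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ParameterRule:
    """One parameter of a hypothetical usage profile."""
    default: Optional[Any] = None               # used when the request omits the parameter
    allowed_values: Optional[List[Any]] = None  # explicit set of allowed values
    min_value: Optional[int] = None             # inclusive lower bound (numeric parameters)
    max_value: Optional[int] = None             # inclusive upper bound (numeric parameters)


class ProfileViolation(Exception):
    """Raised when a request value falls outside what the profile allows."""


def resolve(profile_name: str,
            rules: Dict[str, ParameterRule],
            request: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in profile defaults, then validate every governed parameter."""
    resolved = dict(request)  # parameters the profile does not govern pass through
    for name, rule in rules.items():
        value = request.get(name, rule.default)
        if value is None:
            continue  # neither a request value nor a profile default
        if rule.allowed_values is not None and value not in rule.allowed_values:
            raise ProfileViolation(
                f"{name}={value!r} is not allowed in the {profile_name} profile")
        low, high = rule.min_value, rule.max_value
        if (low is not None and value < low) or (high is not None and value > high):
            raise ProfileViolation(
                f"{name}={value!r} is not within the range [{low}, {high}] "
                f"in the {profile_name} profile")
        resolved[name] = value
    return resolved


# Hypothetical profile: few workers, a single worker type.
ANALYST_RULES = {
    "numberOfWorkers": ParameterRule(default=2, min_value=1, max_value=5),
    "workerType": ParameterRule(default="G.1X", allowed_values=["G.1X"]),
}

if __name__ == "__main__":
    print(resolve("analyst", ANALYST_RULES, {}))  # defaults are filled in
    try:
        resolve("analyst", ANALYST_RULES, {"workerType": "G.2X"})
    except ProfileViolation as exc:
        print("rejected:", exc)
```

The sketch keeps defaults and validation in one pass so that a default value is itself checked against the allowed set or range, which catches a misconfigured profile early.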

In the following sections, we demonstrate how to create AWS Glue usage profiles, assign profiles to users, and demonstrate the usage profiles in action.

Create AWS Glue usage profiles

To get started and create AWS Glue usage profiles, complete the following steps:

  1. On the AWS Glue console, choose Cost management in the navigation pane.

Let’s create your first usage profile for your developers.

  1. Choose Create usage profile.
  2. For Usage profile name, enter developer.
  3. Under Customize configurations for jobs, for Number of workers, for Default, enter 20.
  4. For Default worker type, choose G.1X.
  5. For Allowed worker types, choose G.1X, G.2X, G.4X, and G.8X.
  6. For Customize configurations for sessions, configure the same values.
  7. Choose Create usage profile.

Next, create another usage profile for your business analysts, who need fewer workers and a lower timeout or idle timeout value.

  1. Choose Create usage profile.
  2. For Usage profile name, enter analyst.
  3. Under Customize configurations for jobs, for Number of workers, for Default, enter 2. For Maximum, enter 5.
  4. For Default worker type, choose G.1X.
  5. For Allowed worker types, choose only G.1X.
  6. For Timeout, for Default, enter 60. For Maximum, enter 120.
  7. For Customize configurations for sessions, configure the same values.
  8. For Idle timeout, for Default, enter 10. For Maximum, enter 60.
  9. Choose Create usage profile.

You have successfully created two usage profiles.
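If you prefer to script this instead of using the console, the two profiles can be described as plain data first. The following Python sketch only builds the request payloads; it does not call AWS. The payload shape (`Name`, `Configuration`, `JobConfiguration`, `SessionConfiguration`, `DefaultValue`, `AllowedValues`, `MinValue`, `MaxValue`) and the parameter keys (`numberOfWorkers`, `workerType`, `timeout`, `idleTimeout`) are assumptions based on our reading of the CreateUsageProfile API, and the helper names are hypothetical. Verify them against the AWS Glue API reference before use.

```python
import json


def config_object(default=None, allowed=None, minimum=None, maximum=None):
    """Build one parameter entry; values are sent as strings (assumed API shape)."""
    obj = {}
    if default is not None:
        obj["DefaultValue"] = str(default)
    if allowed is not None:
        obj["AllowedValues"] = [str(v) for v in allowed]
    if minimum is not None:
        obj["MinValue"] = str(minimum)
    if maximum is not None:
        obj["MaxValue"] = str(maximum)
    return obj


def build_profile(name, job_params, session_params):
    """Assemble a payload for a hypothetical create-usage-profile request."""
    return {
        "Name": name,
        "Configuration": {
            "JobConfiguration": job_params,
            "SessionConfiguration": session_params,
        },
    }


ALL_WORKER_TYPES = ["G.1X", "G.2X", "G.4X", "G.8X"]

# developer: default of 20 workers, all worker types allowed.
developer_params = {
    "numberOfWorkers": config_object(default=20),
    "workerType": config_object(default="G.1X", allowed=ALL_WORKER_TYPES),
}
developer = build_profile("developer", developer_params, dict(developer_params))

# analyst: fewer workers, one worker type, lower timeouts.
analyst_job_params = {
    "numberOfWorkers": config_object(default=2, maximum=5),
    "workerType": config_object(default="G.1X", allowed=["G.1X"]),
    "timeout": config_object(default=60, maximum=120),
}
analyst_session_params = dict(
    analyst_job_params,
    idleTimeout=config_object(default=10, maximum=60),
)
analyst = build_profile("analyst", analyst_job_params, analyst_session_params)

if __name__ == "__main__":
    print(json.dumps(analyst, indent=2))
    # Assumption: with boto3 installed and credentials configured, the payload
    # could be sent with boto3.client("glue").create_usage_profile(**analyst).
```

Keeping the payloads as data makes them easy to review and version before anything is created in the account.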

Assign usage profiles

Restrictions can only be applied to AWS Glue API calls made by IAM users or roles if the profile is assigned to them. There are two steps that the admin needs to take in order to assign a profile:

  • In IAM, create a tag named glue:UsageProfile on the user or role, with the name of the profile used as the tag value
  • The IAM policy assigned to the user or role needs to be updated to include the glue:GetUsageProfile IAM action permission to read the assigned profile

Follow these steps to create two new users, each assigned a different profile:

  1. On the IAM console, choose Users in the navigation pane.
  2. Choose Create user.
  3. For User name, enter blogDeveloper.
  4. Select Provide user access to the AWS Management Console and I want to create an IAM user.
  5. You can enter a custom password or let one be generated (in the latter case, select Show password so you can use it later to sign in).
  6. Choose Next.
  7. Attach the managed policies AWSGlueConsoleFullAccess and IAMReadOnlyAccess.
  8. Choose Next.
  9. Review the summary and complete the creation.
  10. Note the password for later, choose Return to users list, and then choose the user you just created.
  11. On the Permissions tab, for Add permissions, choose Create inline policy.
  12. In the policy editor, switch to JSON and enter the following policy, replacing the AWS Region, account ID, and usage profile name placeholders. For the usage profile name, use the value developer for the user blogDeveloper and analyst for the user blogAnalyst.
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "glue:GetUsageProfile"
          ],
          "Resource": [
            "arn:aws:glue:<aws region>:<account id>:usageProfile/<usage profile name>"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "iam:PassRole"
          ],
          "Resource": [
            "*"
          ],
          "Condition": {
            "StringLike": {
              "iam:PassedToService": [
                "glue.amazonaws.com"
              ]
            }
          }
        }
      ]
    }

  13. Name the policy GlueUsageProfilePermission and complete the creation.
  14. On the Tags tab, add a new tag with the name glue:UsageProfile and the value developer.

Repeat the steps to create a user named blogAnalyst, and replace the ARN in the policy with arn:aws:glue:<aws region>:<account id>:usageProfile/analyst. Make sure the Region and account ID are populated before updating the policy. For the tag value, specify analyst instead of developer.
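Because the policy is identical for both users apart from the profile name in the ARN, it can help to generate it rather than edit placeholders by hand. The following Python sketch builds the same policy document and the matching tag as plain dictionaries. The function names are hypothetical, and the Region and account ID in the example are placeholder values, not real ones.

```python
import json


def usage_profile_policy(region: str, account_id: str, profile_name: str) -> dict:
    """Return the inline policy from this post with the placeholders filled in."""
    arn = f"arn:aws:glue:{region}:{account_id}:usageProfile/{profile_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets the user or role read its assigned profile.
                "Effect": "Allow",
                "Action": ["glue:GetUsageProfile"],
                "Resource": [arn],
            },
            {
                # Lets the user pass a role, but only to AWS Glue.
                "Effect": "Allow",
                "Action": ["iam:PassRole"],
                "Resource": ["*"],
                "Condition": {
                    "StringLike": {"iam:PassedToService": ["glue.amazonaws.com"]}
                },
            },
        ],
    }


def usage_profile_tag(profile_name: str) -> dict:
    """Return the tag that assigns the profile to a user or role."""
    return {"Key": "glue:UsageProfile", "Value": profile_name}


if __name__ == "__main__":
    # Placeholder Region and account ID; replace with your own.
    policy = usage_profile_policy("us-east-1", "123456789012", "analyst")
    print(json.dumps(policy, indent=2))
    print(usage_profile_tag("analyst"))
```

Generating both the policy and the tag from one profile name reduces the chance that the ARN and the tag value drift apart.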

On the AWS Glue console, navigate to the developer usage profile. You can see that the status has been changed from Not assigned to Assigned.

Lastly, complete the following steps to create two IAM roles for AWS Glue jobs and sessions with the profile.

  1. Create two IAM roles for AWS Glue. Name them GlueServiceRole-developer and GlueServiceRole-analyst.
  2. Configure the following inline policies by replacing the Region, account ID, and usage profile name placeholders. For the usage profile name placeholder, use the value developer for the role GlueServiceRole-developer and analyst for the role GlueServiceRole-analyst.
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "glue:GetUsageProfile"
          ],
          "Resource": [
            "arn:aws:glue:<aws region>:<account id>:usageProfile/<usage profile name>"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "iam:PassRole"
          ],
          "Resource": [
            "*"
          ],
          "Condition": {
            "StringLike": {
              "iam:PassedToService": [
                "glue.amazonaws.com"
              ]
            }
          }
        }
      ]
    }

  3. On the Tags tab for the IAM role, add a new tag with the name glue:UsageProfile and the value developer for GlueServiceRole-developer and analyst for GlueServiceRole-analyst.

Usage profiles in action: Jobs

Now you have two users with different AWS Glue profiles assigned. Let’s test them and see the differences. First, let’s try the user blogDeveloper to see how the profile developer works.

  1. Open the AWS Glue console with the blogDeveloper user.
  2. Choose ETL jobs in the navigation pane and choose Script editor.
  3. Choose Create script.
  4. Choose the Job details tab.

The default value for Requested number of workers is 20, which corresponds to the default setting of the developer profile.

Next, let’s try the user blogAnalyst to see how the profile analyst works.

  1. Open the AWS Glue console with the blogAnalyst user.
  2. Choose ETL jobs in the navigation pane and choose Script editor.
  3. Choose Create script.
  4. Choose the Job details tab.

The default value for Requested number of workers is 2, which corresponds to the default setting of the analyst profile.

Additionally, the default value for Job timeout is 60, which also corresponds to the default setting of the analyst profile.

  1. For Worker type, choose the dropdown menu.

Only G.1X is available; G.2X, G.4X, and G.8X are disabled. This is because the analyst profile allows only G.1X.

  1. For Requested number of workers, enter 20 to simulate invalid input.

You will see the warning message The maximum number of workers cannot exceed 5 for usage profile "analyst".

Now, the user blogAnalyst is attempting to run a job in the account where the number of workers set for the job is 20. However, the maximum number of workers in the profile assigned to this user is 5. When the user tries to run the job, it fails with an error, as shown in the following screenshot.

In this example, we’ve demonstrated how usage profiles manage AWS Glue jobs based on the preconfigured values in the profiles.

Usage profiles in action: Sessions

Next, continue using the user blogAnalyst and try the AWS Glue Studio notebook interface to see how interactive sessions work with usage profiles:

  1. Open the AWS Glue console with the blogAnalyst user.
  2. Choose ETL jobs in the navigation pane and choose Notebook.
  3. For IAM role, choose GlueServiceRole-analyst.
  4. Choose Create notebook.
  5. Wait for the notebook to be ready.

In the second notebook cell, %number_of_workers is set to 2, which corresponds to the default value of the profile analyst.

  1. Update %number_of_workers from 2 to 10 to simulate an invalid access pattern:
    %number_of_workers 10

  2. Run the cell.

You get an error message saying “Provided number of workers is not within the range [1, 5] in the analyst profile.”

This is because the given value of 10 exceeds the maximum number of workers set in the profile assigned to this user.

  1. Update %number_of_workers from 10 to 5 to simulate a valid access pattern:
    %number_of_workers 5

  2. Run the cell.

This time, the session has been successfully created.

Now you have observed how usage profiles manage AWS Glue interactive sessions based on the preconfigured values in the profiles.

Conclusion

This post demonstrated how AWS Glue usage profiles allow you to manage your AWS Glue resources with ease and flexibility.

With AWS Glue usage profiles, you can manage and control resources of different users in order to set your organization’s best practices and save costs. AWS Glue usage profiles serve as a guardrail to prevent unauthorized resource usage from occurring.

Try out the feature for yourself, and leave any feedback or questions in the comments.


About the Authors

Noritaka Sekiyama is a Principal Big Data Architect on the AWS Glue team. He is responsible for building software artifacts to help customers. In his spare time, he enjoys cycling with his road bike.

Gonzalo Herreros is a Senior Big Data Architect on the AWS Glue team, with a background in machine learning and AI.

Keerthi Chadalavada is a Senior Software Development Engineer at AWS Glue. She is passionate about designing and building end-to-end solutions to address customer data integration and analytic needs.

Gal Heyne is a Product Manager for AWS Glue with a strong focus on AI/ML, data engineering, and BI. She is passionate about developing a deep understanding of customers’ business needs and collaborating with engineers to design easy-to-use data products.

NAS vs. Cloud Storage: Which Remote Storage Option Is Best?

Post Syndicated from Vinodh Subramanian original https://www.backblaze.com/blog/nas-vs-cloud-storage-which-solution-fits-your-business-needs/

A decorative image showing a cloud and a NAS device.

If you’re leading IT strategy for a growing enterprise and still weighing network attached storage (NAS) and cloud storage, you’re not alone. And you’re not behind. Even the most seasoned infrastructure pros find themselves re-evaluating their stack as data volumes explode and budgets tighten. Both offer unique benefits, but with overlapping features, it’s easy to see why the choice can be confusing. 

Are you looking for greater control with physical access, as in a local NAS setup? Or is off-site backup, flexibility, and scalability through a cloud service provider more aligned with your needs? With plenty of discussions and debates outlining the pros and cons of one or the other, it can be difficult to determine the best storage solution for your specific needs. 

This guide walks through clear, actionable insights into NAS and cloud storage, addressing your most pressing questions about storage costs, dedicated machines, data sharing, and performance. Whether the focus is cost, scalability, security, or accessibility, this guide will help identify the ideal storage solution for your business.

What is NAS?

NAS, or network attached storage, is a file-level storage system designed specifically to provide centralized and shared disk storage for users on a local area network (LAN). 

Essentially, NAS is a purpose-built computer that operates its own dedicated operating system (OS). It contains one or more storage devices that are configured to create a single shared volume. These storage devices are arranged in a RAID configuration to ensure data redundancy and performance. 

These configurations make NAS ideal for file sharing, data backups, and accessing large files within an organization, making it a cost-effective solution for enterprises that need local storage with physical access.

Many NAS devices, such as Synology NAS or QNAP NAS, come with built-in software for additional functionalities like file syncing, data backups, and offsite backup options to integrate with cloud services.

How does NAS work?

NAS provides access to files using standard network file sharing protocols such as Network File System (NFS) and Server Message Block (SMB). By connecting directly to the local network, NAS allows users to easily store, access, and collaborate on files without overburdening other servers within the network. This separation of file-serving responsibilities helps optimize overall network performance, particularly for high-traffic environments. 

NAS systems are generally managed through a web-based utility accessible over the network, offering an intuitive interface for configuration and maintenance. This interface allows administrators to handle tasks such as user permissions, storage allocation, and data redundancy settings—making it simpler to secure and organize shared files across the network.

Advantages of NAS

NAS offers several advantages including faster data access, easier administration, simplified management, and many others. Here’s a breakdown: 

  • Cost effective: NAS devices typically involve an upfront purchase cost that includes access to applications from the NAS provider, like Synology Hyper Backup or QNAP Hybrid Backup Sync. This greatly reduces ongoing subscription fees, though you may incur costs if you want to expand your storage capacity with high-capacity drives or increase performance with upgrades such as more powerful processors.
  • Data control and security: NAS systems offer extensive control over data storage and security protocols. NAS systems are only accessible on the local network and to user accounts that can be controlled and managed.
  • Performance: NAS provides high-speed access to data over a local network, ensuring quick file retrieval and sharing. NAS devices generally work as fast as the local network allows.
  • Scalable storage: Many NAS systems allow additional drives to be added, providing flexible storage expansion, albeit with the cost of additional drives or device upgrades. Modern NAS devices today offer large storage capacities and advanced features for virtualization and application hosting.
  • Data redundancy: When equipped with RAID configurations, NAS provides redundancy, ensuring data remains accessible even if one or more hard drives fail.
  • Better data management tools: Features such as fully automated backups, deduplication, compression, and encryption enhance data storage efficiency and security. NAS systems also support sync workflows for team collaboration, directory services for user and group management, and services like photo or media management.
  • Compatibility: NAS systems are designed to support different OS environments and are compatible with Windows, Mac, and Linux operating systems. They offer seamless cross-platform access.
  • Remote access options: While primarily local, most NAS devices offer secure remote access through VPN or encrypted connections, allowing authorized users to access files from outside the office network when needed.

Limitations of NAS

While NAS offers numerous advantages for centralized file storage, there are some notable limitations to consider:

  • Initial setup and maintenance: The configuration process can be complex at enterprise scale, and ongoing maintenance may demand external IT support, adding to operational costs.
  • Remote access vulnerabilities: NAS systems can be accessed remotely over the internet, creating a private cloud or hybrid cloud solution. While this offers a significant advantage in using your device, just like anything connected to the internet, it also poses security risks. Bad actors can exploit vulnerabilities and gain remote access to the device. To minimize risk, businesses must ensure proper security configurations, use encrypted connections, regularly update firmware, and restrict access to trusted IPs.
  • Scalability constraints: Although NAS systems allow for storage expansion, they are still limited by the physical capacity of the hardware. Adding storage often involves purchasing high-capacity drives, which can be costly, and for larger expansions, migrating to more powerful NAS devices might be necessary.
  • Data vulnerability: Data stored on a NAS is susceptible to various threats, including hardware failures, natural disasters, theft, and cyber attacks such as ransomware. While RAID configurations offer some level of data redundancy, they do not protect against all forms of data loss. Regular backups and additional security measures are essential to mitigate these risks.
  • Performance overheads: As more users and devices access the NAS, network bandwidth and device performance can become bottlenecks. High demand may reduce access speeds, impact data throughput, and reduce efficiency, especially in larger organizations with extensive data needs.
  • Data recovery challenges: If a NAS drive fails or becomes corrupted, data recovery processes may be complex and require specialized services, which can be costly and time-intensive.

What is cloud storage?

Cloud storage is a model of data storage where data is stored on servers located in off-site locations and accessed via the internet. This setup enables users to store, retrieve, and manage data without requiring local storage infrastructure. There are two main types of cloud: public and private. 

  • Public cloud storage: Hyperscale providers like AWS, Google Cloud, and Azure and specialized cloud providers like Backblaze maintain servers and are responsible for hosting, managing, and securing data. The public cloud is cost-effective and offers scalable storage for multiple users and businesses.
  • Private cloud storage: Typically managed in-house or by a dedicated third-party provider, private cloud storage is reserved for a single organization. For example, a university may maintain data centers for its community. Private clouds offer enhanced control and security, though they often require more complex management.

What’s the diff: Public vs. private cloud

Public cloud storage services are provided by third-party vendors over the public internet, making them accessible to anyone who wants to purchase or lease storage capacity. These services are designed to offer scalability and reliability, often on a pay-as-you-go basis.

Private cloud storage is dedicated to a single organization where an organization utilizes its own servers and data centers to store data within their own network. It can be hosted on-premises or by a third-party provider, but it’s always behind the organization’s firewall. This model is ideal for businesses that require more control over their data and have stringent security and compliance requirements.

Advantages of public cloud

One of the key benefits of public cloud storage is that it eliminates the need for businesses to buy, manage, and operate their own data center infrastructure. This shift allows companies to move from a capital expenditure (CapEx) model to an operational expenditure (OpEx) model, paying only for the storage they need when they need it.

Additionally, cloud storage is elastic, enabling businesses to scale their storage capacity up or down more efficiently and strategically than through tactical hardware investments.

Advantages of private cloud

Private cloud storage allows for customized control and security measures, as organizations have full authority over their data environment. This setup can be highly beneficial for industries with strict data regulations, like finance and healthcare, as it enables better compliance with data privacy laws. 

Additionally, private clouds provide reliable performance since resources are not shared with external users, reducing latency issues and enabling faster data access for internal teams.

Types of cloud storage architecture 

In addition to the elasticity and scalability benefits of cloud storage, you can also combine on-premises storage and different types of public or private cloud storage to uniquely support your business needs. The primary models of cloud storage are:

  • Hybrid cloud storage: A hybrid model combines both public and private cloud storage. This allows an organization to decide which data it wants to store in which cloud. Sensitive data and data that must meet strict compliance requirements may be stored in a private cloud or on-premises while less sensitive data is stored in the public cloud. You could also use hybrid cloud to leverage on-premises storage for performance-sensitive tasks, such as using NAS to edit large media files locally, which are later synced to the cloud. 
  • Multi-cloud storage: A multi-cloud model involves using two or more public cloud storage services from different service providers. This model helps businesses leverage the best features of each cloud service while enhancing data availability and redundancy. For example, some companies use multiple cloud providers to host mirrored copies of their active production data. If one of their public clouds suffers an outage, they have mechanisms in place to direct their applications or websites to failover to a second public cloud.

This flexibility in cloud storage architecture allows businesses to balance performance, cost, and security—ensuring critical data is stored securely while remaining accessible and resilient across multiple environments.

How does cloud storage work?

Cloud storage works by allowing users to upload data, such as files, documents, videos, or images to remote servers via the internet. 

Public cloud storage providers like Amazon, Google, Microsoft, and Backblaze maintain servers in large data centers. The uploaded data can be accessed and managed through web interfaces or APIs, making it highly accessible and flexible. 

Cloud storage offers numerous benefits that can greatly enhance business operations, such as storage space scalability, flexible data sharing options, and built-in data protection through regular backups and client-side encryption. However, there are also a few considerations like data security and storage costs to keep in mind. Next, we’ll look at the advantages and some of the key limitations of cloud-based storage solutions.

Advantages of cloud storage

Cloud storage enables businesses to scale with ease, reduce IT burdens, and access data remotely—offering a reliable, cost-efficient way to manage critical information. Here are some of the advantages of cloud storage:

  • Off-site protection: Cloud storage provides convenient off-site protection for data, ensuring that in the event of a physical disaster (such as fire or flood), data remains safe and accessible from any location. This supports data redundancy and business continuity.
  • Enhanced security: Leading cloud providers invest heavily in advanced security measures—including encryption, multi-factor authentication, Object Lock for immutability, and regular security audits—to protect stored data from unauthorized access and breaches.
  • Scalability: Cloud storage services offer virtually unlimited storage capacity. Businesses can easily scale their storage needs up or down based on demand without needing to invest in physical hardware. 
  • Accessibility: Data stored in the cloud can be accessed from anywhere with an internet connection, facilitating remote work and data sharing across teams and locations. 
  • Lower maintenance: Cloud providers handle all hardware maintenance, software updates, and security patches, reducing the IT burden that managing storage systems places on businesses.
  • Cost efficiency: Many cloud storage solutions operate on a pay-as-you-go model, allowing businesses to pay only for the storage they use, which can be more cost-effective than local NAS or investing in on-premises hardware.

Limitations of cloud storage

While cloud storage offers flexibility and scalability, it also has limitations, such as ongoing costs and internet dependence, that businesses should evaluate carefully.

  • Ongoing costs: Unlike on-premises storage solutions such as NAS, cloud storage operates on a subscription-based pricing model. When evaluating cloud storage, businesses should consider the total cost of ownership, including ongoing fees, and weigh these against the benefits of cloud storage. 
  • Dependence on the internet: Cloud storage relies on a stable internet connection for access and data transfer. Any disruptions in internet connectivity can hinder access to critical files and services, potentially impacting business operations. Ensuring reliable internet service and having contingency plans are crucial for minimizing downtime.
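One way to weigh ongoing fees against an upfront purchase is a simple break-even calculation. The Python sketch below is illustrative only: the function names are hypothetical, the dollar figures in the example are made-up inputs rather than real prices, and the model ignores factors such as drive replacement, staff time, egress fees, and price changes.

```python
import math
from typing import Optional


def cumulative_cost(upfront: float, monthly: float, months: int) -> float:
    """Total spend after a number of months under a flat monthly cost."""
    return upfront + monthly * months


def break_even_month(nas_upfront: float,
                     nas_monthly: float,
                     cloud_monthly: float) -> Optional[int]:
    """First month at which cumulative cloud spend reaches cumulative NAS spend.

    Returns None when the cloud subscription never catches up, that is, when
    its monthly cost is not higher than the NAS running cost.
    """
    monthly_gap = cloud_monthly - nas_monthly
    if monthly_gap <= 0:
        return None
    return math.ceil(nas_upfront / monthly_gap)


if __name__ == "__main__":
    # Made-up inputs: a 3,000 NAS purchase with 20 per month in running costs,
    # compared with a 120 per month cloud subscription.
    month = break_even_month(3000, 20, 120)
    print("break-even month:", month)
    print("NAS total:", cumulative_cost(3000, 20, month))
    print("cloud total:", cumulative_cost(0, 120, month))
```

A short horizon favors the subscription and a long one favors the upfront purchase, so the planning period you choose matters as much as the prices themselves.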

NAS vs cloud storage: A side-by-side comparison

The following table provides a side-by-side comparison of NAS and cloud storage, highlighting key aspects such as cost, scalability, security, and performance. This comparison will help you determine which storage solution best aligns with your business requirements and operational workflows.

Aspect | NAS | Cloud Storage
Storage model | File-level storage within a local network | Data stored on remote servers accessed via the internet
Performance | High-speed access over a local network; optimal for on-premises work | Dependent on internet speed and latency; suitable for global access and remote teams
Scalability | Limited by physical hardware capacity; requires purchasing new devices for expansion | Virtually unlimited scalability; storage expands without additional hardware
Cost | Upfront hardware purchase, ongoing investment to expand capacity | Subscription-based, pay-as-you-go model, often with no upfront hardware investment
Maintenance | Requires in-house IT maintenance, updates, and troubleshooting | Maintenance handled by cloud provider, reducing IT burden
Security | Controlled in-house, local network security; ideal for highly sensitive data | Enhanced by provider with encryption, multi-factor authentication, and security audits
Data redundancy | RAID configurations for local redundancy | Built-in data redundancy and disaster recovery options
Accessibility | Limited to local network access or VPN for remote connections | Accessible from anywhere with an internet connection, supporting remote work and collaboration
Compliance | Greater control for compliance in regulated industries; depends on in-house protocols | Many providers offer compliance with standards like GDPR, HIPAA, and SOC 2, ideal for regulated industries

Hybrid cloud: The best of both worlds

A hybrid cloud solution combines the strengths of both NAS and cloud storage. While NAS offers a centralized location to store and access files, the data stored on the NAS is still vulnerable to data disasters such as floods, fires, or hardware failures. 

By integrating cloud storage with NAS, you create an off-site backup of your NAS data that securely protects your critical data from virtually any data threat. This approach not only mitigates the risk associated with physical damage to your on-premises NAS equipment but also offers the scalability, flexibility and remote accessibility benefits of cloud storage. 

Additionally, this helps you implement 3-2-1 backup protection, where three copies of your data are stored on two different storage media (NAS and cloud) with one copy stored off-site in the cloud, protecting against ransomware, hardware failures, natural disasters, and other data threats.
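The 3-2-1 rule is easy to express as a check over an inventory of copies. The following Python sketch is a minimal illustration; the `DataCopy` class, the `satisfies_3_2_1` function, and the example inventory are hypothetical and stand in for whatever asset records you actually keep.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DataCopy:
    """One copy of a data set (hypothetical inventory record)."""
    label: str
    medium: str    # for example "workstation disk", "NAS", "cloud"
    offsite: bool  # True if stored away from the primary location


def satisfies_3_2_1(copies: Iterable[DataCopy]) -> bool:
    """At least 3 copies, on at least 2 different media, with at least 1 off-site."""
    copies = list(copies)
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    inventory = [
        DataCopy("working files", "workstation disk", offsite=False),
        DataCopy("local backup", "NAS", offsite=False),
        DataCopy("cloud backup", "cloud", offsite=True),
    ]
    print(satisfies_3_2_1(inventory))       # all three conditions hold
    print(satisfies_3_2_1(inventory[:2]))   # only two copies, none off-site
```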

NAS vs. cloud: Which is best for your business?

Choosing between NAS and cloud storage for your business largely depends on your specific use cases and operational needs. NAS provides fast local access, control, and cost efficiency for businesses with stable storage needs and on-premises operations. In contrast, cloud storage offers unparalleled scalability, remote access, and maintenance-free operation, making it ideal for organizations with dynamic storage needs and remote workforces. 

However, many businesses find that a combination of both, known as a hybrid cloud solution, offers the best of both worlds by combining the control of NAS with the scalability of cloud storage. 

Ultimately, the right choice will depend on a thorough evaluation of your business needs and operational workflows. By understanding the strengths and limitations of both NAS and cloud storage, you can make an informed decision that ensures your data is secure, accessible, and available when you need it.

FAQs about NAS and cloud storage

Is cloud storage better than NAS?

The answer depends on your specific business needs. Cloud storage offers scalability, remote access, and minimal maintenance requirements. NAS, on the other hand, provides fast local access and higher control over data management and security settings. Each solution has its strengths, and the best choice will depend on your priorities regarding data security, access, and cost.

Can I use a NAS as a cloud?

Yes, many modern NAS devices come with built-in features that allow them to function similarly to cloud storage, or to connect to a cloud storage provider of your choice. These NAS systems can be accessed remotely over the internet, creating a private cloud or hybrid cloud solution. However, this requires proper configuration, secure settings, and a reliable internet connection to ensure seamless remote access.

Why use NAS instead of a server?

NAS devices are purpose-built for storage, offering simplicity, ease of management, and lower costs compared to traditional servers. While servers are multifunctional and can handle a variety of tasks, they are more complex to set up and maintain. NAS provides a straightforward solution for file sharing, backups, and media streaming without the need for extensive IT infrastructure. This makes NAS an excellent choice for small to medium-sized businesses that primarily need a dedicated storage solution.

Can NAS work without the internet?

Yes, NAS devices are designed to operate within a local area network (LAN) and do not require an internet connection for local access and file sharing. Users can store, access, and collaborate on files within local networks without internet access. However, for remote access or to leverage additional features such as cloud backups, an internet connection is necessary.

The post NAS vs. Cloud Storage: Which Remote Storage Option Is Best? appeared first on Backblaze Blog | Cloud Storage & Cloud Backup

A teacher’s guide to teaching Experience AI lessons

Post Syndicated from Laura James original https://www.raspberrypi.org/blog/a-teachers-guide-to-teaching-experience-ai-lessons/

Today, Laura James, Head of Computing and ICT at King Edward’s School in Bath, UK, shares how Experience AI has transformed how she teaches her students about artificial intelligence. This article will also appear in issue 24 of Hello World magazine, which will be available for free from 1 July and focuses on the impact of technology.

I recently delivered Experience AI lessons to three Year 9 (ages 13–14) classes of about 20 students each with a ratio of approximately 2:3 girls to boys. They are groups of keen pupils who have elected to study computing as an option. The Experience AI lessons are an excellent set of resources.

Everything you need

Part of the Experience AI resources is a series of six lessons that introduce the concepts behind machine learning and artificial intelligence (AI). There are full lesson plans with timings, clear PowerPoint presentations, and activity sheets. There is also an end-of-topic multiple choice assessment provided.

Accompanying these are interesting, well-produced videos that underpin the concepts, all explained by real people who work in the AI industry. Plus, there are helpful videos for the educators, which explain certain parts of the scheme of work — particularly useful for parts that might have been seen as difficult for non-specialist teachers, for example, setting up a project using the Machine Learning for Kids website.

Confidence delivering lessons

The clear and detailed resources meant I felt mostly confident in delivering lessons. The suggested timings were a good guideline, although in some lessons, this did not always go to plan. For example, when the pupils were enjoying investigating websites that produce images generated by a text prompt, they were keen to spend more time on this than was allocated in the lesson plan. In this case, I modified the timings on the fly and set the final task of this lesson as a homework task.

Learning about AI sparked the students’ curiosity, and it triggered a few questions that I could not answer immediately. However, I admitted this was a new area for me, and with some investigation, found answers to many of their extra questions. This shows that AI is an inspiring and important topic for the next generation, and how important it is to add it to the curriculum now, before students form their own, potentially biased, opinions about it.

“I’ve enjoyed actually learning about what AI is and how it works because before I thought it was just a scary computer that thinks like a human.” – Student, King Edward’s School, UK 

Impact on learners

The pupils’ feedback from the series of lessons was overwhelmingly positive. I felt the lessons on bias in data were particularly important. The lesson where they trained their own algorithm to recognise tomatoes and apples was a key one, as it gave students an immediate sense of how a flawed training data set creates bias and can affect the answers from a supposedly intelligent AI tool. I hope this has changed their outlook on AI-generated results and reinforced their critical thinking skills.

Many students are now seeing the influence of AI appearing in more and more tools around them and have mentioned that a career in AI is now something they are interested in.

“I have enjoyed learning about how AI is actually programmed rather than just hearing about how impactful and great it could be.” – Student, King Edward’s School, UK 

Tips for other teachers

Clearly this topic is incredibly important, and the Experience AI series of lessons is an excellent introduction to this for key stage 3 students (ages 11–14). My tips for other educators would be:

  • I delivered these to bright Year 9s and added a few more coding activities from the Machine Learning for Kids website. As these lessons stand, they could be delivered to Year 8s (ages 12–13), but perhaps Year 7s (ages 11–12) might struggle with some of the more esoteric concepts.
  • Before each lesson, ensure you read the content and familiarise yourself with the lesson resources and tools used. The Machine Learning for Kids website can take a little getting used to, but it is a powerful tool that brings to life how machine learning works, and many pupils said this was their favourite part of the lessons.
  • Before the lesson, ensure that the websites that you need to access are unblocked by your school’s firewall!
  • I tried to add a hands-on activity to each lesson, e.g. for Lesson 1, I showed the students Google’s Quick, Draw! game, which they enjoyed and which has a good section on the training data used to train the AI tool to recognise the drawings.
  • We also spent an extra lesson using the brilliant Machine Learning for Kids website and followed the ‘Shoot the bug’ worksheet, which allowed pupils to train an algorithm to learn how to play a simple video game.
  • I also needed to have a weekly homework task, so I would either use part of the activity from the lesson or quickly devise something (e.g. research another use for AI we haven’t discussed/what ethical issues might occur with a certain use of AI). Next year, our department will formalise these to help other teachers who might deliver these lessons to set these tasks more easily.
  • Equally, I needed to have a summative assessment at the end of the topic. I used some of the multiple choice questions that were provided but added some longer-answer questions and made an online assessment to allow me to mark students’ answers more efficiently.

“I have always been fascinated by AI applications and finally finding out how they work and make the decisions they do has been a really cool experience.” – Student, King Edward’s School, UK 

From comments I have had from the students, they really engaged with the lessons and appreciated the opportunity to discuss and explore the topic, which is often associated with ‘deception’ within school. It allowed them to understand the benefits and the risks of AI and, most importantly, to begin to understand how it works ‘under the hood’, rather than see AI as a magical, anthropomorphised entity that is guessing their next move.

“The best part about learning about AI was knowing the dangers and benefits associated and how we can safely use it in our day-to-day life.” – Student, King Edward’s School, UK 

As for my perspective, I really enjoyed teaching this topic, and it has earned its place in the Year 9 scheme of work for next year. 

If you’re interested in teaching the Experience AI Lessons to your students, download the resources for free today at experience-ai.org.

The post A teacher’s guide to teaching Experience AI lessons appeared first on Raspberry Pi Foundation.

[$] BPF tracing performance

Post Syndicated from daroc original https://lwn.net/Articles/978335/

On the final day of the 2024 Linux Storage, Filesystem, Memory Management, and BPF Summit, the BPF track opened with a series of sessions on improving the performance and flexibility of probes and other performance-monitoring tools, in the kernel and in user space. Jiri Olsa led two sessions about different aspects of probes: making the API for BPF programs attached to a probe more flexible, and making user-space probes more efficient.

Plasma 6.1 released

Post Syndicated from corbet original https://lwn.net/Articles/978806/

Version 6.1 of
the Plasma desktop environment has been released.

Plasma 6 hits its stride with version 6.1. While Plasma 6.0 was all
about getting the migration to the underlying Qt 6 frameworks
correct (and what a massive job that was), 6.1 is where developers
start implementing the features that will take your desktop to a new
level.

Enhancements include better remote-desktop support, improved customization,
persistent apps, smoother animation under Wayland, and more; see the
changelog for the full list.

AWS HITRUST Shared Responsibility Matrix v1.4.3 for HITRUST CSF v11.3 now available

Post Syndicated from Mark Weech original https://aws.amazon.com/blogs/security/aws-hitrust-shared-responsibility-matrix-v1-4-3-for-hitrust-csf-v11-3-now-available/

The latest version of the AWS HITRUST Shared Responsibility Matrix (SRM)—SRM version 1.4.3—is now available. To request a copy, choose SRM version 1.4.3 from the HITRUST website.

SRM version 1.4.3 adds support for the HITRUST Common Security Framework (CSF) v11.3 assessments in addition to continued support for previous versions of HITRUST CSF assessments v9.1–v11.2. As with the previous SRM versions v1.4.1 and v1.4.2, SRM v1.4.3 enables users to trace the HITRUST CSF cross-version lineage and inheritability of requirement statements, especially when inheriting from or to v9.x and v11.x assessments.

The SRM is intended to serve as a resource to help customers use the AWS Shared Responsibility Model to navigate their security compliance needs. The SRM provides an overview of control inheritance, and customers also use it to perform the control scoring inheritance functions for organizations that use AWS services.

Using the HITRUST certification, you can tailor your security control baselines to a variety of factors—including, but not limited to, regulatory requirements and organization type. As part of their approach to security and privacy, leading organizations in a variety of industries have adopted the HITRUST CSF.

AWS doesn’t provide compliance advice, and customers are responsible for determining compliance requirements and validating control implementation in accordance with their organization’s policies, requirements, and objectives. You can deploy your environments on AWS and inherit our HITRUST CSF certification, provided that you use only in-scope services and apply the controls detailed on the HITRUST website.

What this means for our customers

The new AWS HITRUST SRM version 1.4.3 has been tailored to reflect both the Cross Version ID (CVID) and Baseline Unique ID (BUID) in the CSF object so that you can select the correct control for inheritance even if you’re still using an older version of the HITRUST CSF for your own assessment. As an additional benefit, the AWS HITRUST Inheritance Program also supports the control inheritance of AWS cloud-based workloads for new HITRUST e1 and i1 assessment types, in addition to the validated r2-type assessments offered through HITRUST.

For additional details on the AWS HITRUST program, see our HITRUST CSF page.

At AWS, we’re committed to helping you achieve and maintain the highest standards of security and compliance. We value your feedback and questions. Contact the AWS HITRUST team at AWS Compliance Support. If you have feedback about this post, submit comments in the Comments section below.

Mark Weech

Mark is the Program Manager for the AWS HITRUST Security Assurance Program. He has over 10 years of experience in the healthcare industry holding director-level IT and security positions both within hospital facilities and enterprise-level positions supporting greater than 30,000 user healthcare environments. Mark has been involved with HITRUST as both an assessor and validated entity for over 10 years.

Security updates for Tuesday

Post Syndicated from corbet original https://lwn.net/Articles/978804/

Security updates have been issued by Debian (php7.3), Fedora (galera, ghostscript, and mariadb), Mageia (cups, iperf, and libndp), Oracle (firefox and flatpak), Red Hat (container-tools:rhel8, Firefox, firefox, and flatpak), SUSE (booth, bouncycastle, firefox, ghostscript, less, libaom, openssl-1_1, openssl-3, podman, python-Authlib, python-requests, python-Werkzeug, webkit2gtk3, and xdg-desktop-portal), and Ubuntu (ghostscript, ruby-rack, ruby2.7, ruby3.0, ruby3.1, ruby3.2, and sssd).

Helpful tools to get started in IoT Assessments

Post Syndicated from Tommy Yowell original https://blog.rapid7.com/2024/06/18/helpful-tools-to-get-started-in-iot-assessments/

The Internet of Things (IoT) can be a daunting field to get into. With so many tools and products on the market, it can be confusing to know where to start. Having performed dozens of IoT assessments, I felt it would be beneficial to compile a basic list of items that are essential for getting started with testing embedded devices. The tools covered in this post are primarily used to interact with the debug interface of embedded devices; however, many of them have multiple functions, from reading data from a memory chip to removing components from the physical circuit board. I would like to note that neither I nor Rapid7 benefit in any way from the sale of any of these products. We honestly believe they are useful tools for any beginner.

1) Serial Debugger

One of the most used items when it comes to IoT testing would be a device used to interface with low-speed interfaces available on embedded devices. Gaining access to the debug interface on embedded devices is the easiest way to get a look under the hood of how the device is operating. One of the most popular and readily available devices on the market currently would be the Tigard.

The Tigard is a great open-source tool that has support for all the commonly used interfaces you might encounter on modern-day embedded devices. It has support for Universal Asynchronous Receiver-Transmitter (UART), Joint Test Action Group (JTAG), Serial Peripheral Interface (SPI), Inter-Integrated Circuit (I2C), and Serial Wire Debug (SWD) connections. This device allows you to connect to various serial consoles or even extract the contents of commonly found flash memory chips. It is powered by a USB-C connection and also has the ability to select commonly used voltage supplies to power components when needed.

Link: https://www.crowdsupply.com/securinghw/tigard

2) PCBite Probes

A tool that saves a ton of time when it comes to connecting to serial interfaces and on-board components is a set of PCBite Probes. Without these probes, you would often have to resort to soldering on header pins or trying to attach to onboard components using probe connectors.

The starter-level probe set includes 4 hands-free probes, a set of PCB holders, a magnetic base, and accessories. Oftentimes embedded devices contain small components on the circuit board that are not easily accessible due to size requirements. These probes allow quick, solder-free connections to be made to embedded devices. All you need to do is position the spring-loaded probes on areas of the circuit board and connect the included Dupont wires to either a logic analyzer or a serial debugger to interface with the target device. The included circuit board holders are a nice touch to ensure the circuit board is kept firmly in position while working.

Link: https://sensepeek.com/pcbite-20

3) Rework Station

While working with embedded devices, there might be scenarios you run into that involve removing small components from the embedded device for offline analysis. There are many options for rework stations out on the internet, all with various levels of price and functionality. A model that hits the sweet spot of price and functionality is the Aoyue 968A+ Professional SMD Digital Hot Air Rework Station.

This rework station includes a number of tools to make any reworking job easy in one simple package. It includes a soldering iron, hot air rework gun, vacuum pickup tool, and a fume extractor. There are many times when performing embedded testing that it is necessary to either solder wires onto connections or remove components from the board for data extraction. The 70 watt soldering iron and 550 watt hot air gun provide plenty of power for quick soldering jobs and component rework.

Link: https://www.amazon.com/Aoyue-968A-Digital-Rework-Station/dp/B006FA481G?th=1

4) Logic Analyzer

Another important tool to have on hand when testing embedded devices is a logic analyzer. Many times, you will find that the debug port on an embedded device is not labeled on the circuit board. That is when a logic analyzer comes in handy to identify what various components on the board are without unnecessary guesswork. Logic analyzers are used to decode signals found on the board to identify and decode protocols such as UART, SPI, and I2C. There are many out on the market, but the sweet spot for price and functionality would be the Saleae Logic 8.

Saleae offers many different models of logic analyzers that all come in at different price points. Typically, the base model, which supports 8 channels at a max speed of 100 MS/s, is sufficient for most needs; however, Saleae does offer additional models that support a larger number of channels at higher speeds. Saleae includes the Logic 2 software, which allows you to seamlessly interact with the device, identify protocols, and decode signals on the board.

Link: https://usd.saleae.com/products/saleae-logic-8
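
To give a feel for what "decoding a protocol" means in the logic analyzer section above, here is a minimal sketch of decoding UART from raw logic samples. It is not how the Logic 2 software works internally. It assumes the common 8N1 framing (one start bit, eight data bits sent least-significant bit first, no parity, one stop bit), an idle-high line, and a known number of samples per bit. The function names `decode_uart_8n1` and `encode_uart_8n1` are hypothetical.

```python
def encode_uart_8n1(data, samples_per_bit):
    """Build a list of 0/1 samples for the given bytes (used to make test input)."""
    samples = [1] * samples_per_bit                 # line idles high
    for byte in data:
        bits = [0]                                  # start bit
        bits += [(byte >> b) & 1 for b in range(8)] # data bits, LSB first
        bits += [1]                                 # stop bit
        for bit in bits:
            samples.extend([bit] * samples_per_bit)
    samples.extend([1] * samples_per_bit)           # trailing idle
    return samples


def decode_uart_8n1(samples, samples_per_bit):
    """Decode bytes from 0/1 samples by reading the middle of each bit period."""
    decoded = []
    i = 0
    n = len(samples)
    while i < n:
        if samples[i] == 1:          # still idle, keep looking for a start bit
            i += 1
            continue
        center = i + samples_per_bit // 2           # middle of the start bit
        stop_center = center + 9 * samples_per_bit  # middle of the stop bit
        if stop_center >= n:
            break                                   # incomplete frame at the end
        value = 0
        for bit in range(8):
            if samples[center + (bit + 1) * samples_per_bit]:
                value |= 1 << bit
        if samples[stop_center] == 1:
            decoded.append(value)
            i = stop_center          # resume the search from the stop bit
        else:
            i += 1                   # framing error: simplistic resync
    return bytes(decoded)


capture = encode_uart_8n1(b"Hi", samples_per_bit=4)
print(decode_uart_8n1(capture, samples_per_bit=4))
```

In practice the baud rate is usually unknown at first, so a real workflow measures the shortest pulse in the capture to estimate the bit period before decoding.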

As we’ve explored in this blog post, there are many options on the market for conducting detailed analysis of embedded devices. Many of the tools out there are available at different price points and offer various levels of functionality and ease of interacting and interfacing with embedded devices. The goal of this guide is not to provide a comprehensive list of all available options, but to cover the basic tools used to begin your IoT journey.

Rethinking Democracy for the Age of AI

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/06/rethinking-democracy-for-the-age-of-ai.html

There is a lot written about technology’s threats to democracy. Polarization. Artificial intelligence. The concentration of wealth and power. I have a more general story: The political and economic systems of governance that were created in the mid-18th century are poorly suited for the 21st century. They don’t align incentives well. And they are being hacked too effectively.

At the same time, the cost of these hacked systems has never been greater, across all human history. We have become too powerful as a species. And our systems cannot keep up with fast-changing disruptive technologies.

We need to create new systems of governance that align incentives and are resilient against hacking … at every scale. From the individual all the way up to the whole of society.

For this, I need you to drop your 20th century either/or thinking. This is not about capitalism versus communism. It’s not about democracy versus autocracy. It’s not even about humans versus AI. It’s something new, something we don’t have a name for yet. And it’s “blue sky” thinking, not even remotely considering what’s feasible today.

Throughout this talk, I want you to think of both democracy and capitalism as information systems. Socio-technical information systems. Protocols for making group decisions. Ones where different players have different incentives. These systems are vulnerable to hacking and need to be secured against those hacks.

We security technologists have a lot of expertise in both secure system design and hacking. That’s why we have something to add to this discussion.

And finally, this is a work in progress. I’m trying to create a framework for viewing governance. So think of this more as a foundation for discussion, rather than a road map to a solution. And I think by writing, and what you’re going to hear is the current draft of my writing—and my thinking. So everything is subject to change without notice.

OK, so let’s go.

We all know about misinformation and how it affects democracy. And how propagandists have used it to advance their agendas. This is an ancient problem, amplified by information technologies. Social media platforms that prioritize engagement. “Filter bubble” segmentation. And technologies for honing persuasive messages.

The problem ultimately stems from the way democracies use information to make policy decisions. Democracy is an information system that leverages collective intelligence to solve political problems. And then to collect feedback as to how well those solutions are working. This is different from autocracies that don’t leverage collective intelligence for political decision making. Or have reliable mechanisms for collecting feedback from their populations.

Those systems of democracy work well, but have no guardrails when fringe ideas become weaponized. That’s what misinformation targets. The historical solution for this was supposed to be representation. This is currently failing in the US, partly because of gerrymandering, safe seats, only two parties, money in politics and our primary system. But the problem is more general.

James Madison wrote about this in 1787, where he made two points. One, that representatives serve to filter popular opinions, limiting extremism. And two, that geographical dispersal makes it hard for those with extreme views to participate. It’s hard to organize. To be fair, these limitations are both good and bad. In any case, current technology—social media—breaks them both.

So this is a question: What does representation look like in a world without either filtering or geographical dispersal? Or, how do we avoid polluting 21st century democracy with prejudice, misinformation and bias? Things that impair both the problem solving and feedback mechanisms.

That’s the real issue. It’s not about misinformation, it’s about the incentive structure that makes misinformation a viable strategy.

This is problem No. 1: Our systems have misaligned incentives. What’s best for the small group often doesn’t match what’s best for the whole. And this is true across all sorts of individuals and group sizes.

Now, historically, we have used misalignment to our advantage. Our current systems of governance leverage conflict to make decisions. The basic idea is that coordination is inefficient and expensive. Individual self-interest leads to local optimizations, which results in optimal group decisions.

But this is also inefficient and expensive. The U.S. spent $14.5 billion on the 2020 presidential, senate and congressional elections. I don’t even know how to calculate the cost in attention. That sounds like a lot of money, but step back and think about how the system works. The economic value of winning those elections is so great because that’s how you impose your own incentive structure on the whole.

More generally, the cost of our market economy is enormous. For example, $780 billion is spent world-wide annually on advertising. Many more billions are wasted on ventures that fail. And that’s just a fraction of the total resources lost in a competitive market environment. And there are other collateral damages, which are spread non-uniformly across people.

We have accepted these costs of capitalism—and democracy—because the inefficiency of central planning was considered to be worse. That might not be true anymore. The costs of conflict have increased. And the costs of coordination have decreased. Corporations demonstrate that large centrally planned economic units can compete in today’s society. Think of Walmart or Amazon. If you compare GDP to market cap, Apple would be the eighth largest country on the planet. Microsoft would be the tenth.

Another effect of these conflict-based systems is that they foster a scarcity mindset. And we have taken this to an extreme. We now think in terms of zero-sum politics. My party wins, your party loses. And winning next time can be more important than governing this time. We think in terms of zero-sum economics. My product’s success depends on my competitors’ failures. We think zero-sum internationally. Arms races and trade wars.

Finally, conflict as a problem-solving tool might not give us good enough answers anymore. The underlying assumption is that if everyone pursues their own self interest, the result will approach everyone’s best interest. That only works for simple problems and requires systemic oppression. We have lots of problems—complex, wicked, global problems—that don’t work that way. We have interacting groups of problems that don’t work that way. We have problems that require more efficient ways of finding optimal solutions.

Note that there are multiple effects of these conflict-based systems. We have bad actors deliberately breaking the rules. And we have selfish actors taking advantage of insufficient rules.

The latter is problem No. 2: What I refer to as “hacking” in my latest book: “A Hacker’s Mind.” Democracy is a socio-technical system. And all socio-technical systems can be hacked. By this I mean that the rules are either incomplete or inconsistent or outdated—they have loopholes. And these can be used to subvert the rules. This is Peter Thiel subverting the Roth IRA to avoid paying taxes on $5 billion in income. This is gerrymandering, the filibuster, and must-pass legislation. Or tax loopholes, financial loopholes, regulatory loopholes.

In today’s society, the rich and powerful are just too good at hacking. And it is becoming increasingly impossible to patch our hacked systems. Because the rich use their power to ensure that the vulnerabilities don’t get patched.

This is bad for society, but it’s basically the optimal strategy in our competitive governance systems. Their zero-sum nature makes hacking an effective, if parasitic, strategy. Hacking isn’t a new problem, but today hacking scales better—and is overwhelming the security systems in place to keep hacking in check. Think about gun regulations, climate change, opioids. And complex systems make this worse. These are all non-linear, tightly coupled, unrepeatable, path-dependent, adaptive, co-evolving systems.

Now, add into this mix the risks that arise from new and dangerous technologies such as the internet or AI or synthetic biology. Or molecular nanotechnology, or nuclear weapons. Here, misaligned incentives and hacking can have catastrophic consequences for society.

This is problem No. 3: Our systems of governance are not suited to our power level. They tend to be rights based, not permissions based. They’re designed to be reactive, because traditionally there was only so much damage a single person could do.

We do have systems for regulating dangerous technologies. Consider automobiles. They are regulated in many ways: driver’s licenses + traffic laws + automobile regulations + road design. Compare this to aircraft. Much more onerous licensing requirements, rules about flights, regulations on aircraft design and testing and a government agency overseeing it all day-to-day. Or pharmaceuticals, which have very complex rules surrounding everything around researching, developing, producing and dispensing. We have all these regulations because this stuff can kill you.

The general term for this kind of thing is the “precautionary principle.” When random new things can be deadly, we prohibit them unless they are specifically allowed.

So what happens when a significant percentage of our jobs are as potentially damaging as a pilot’s? Or even more damaging? When one person can affect everyone through synthetic biology. Or where a corporate decision can directly affect climate. Or something in AI or robotics. Things like the precautionary principle are no longer sufficient. Because breaking the rules can have global effects.

And AI will supercharge hacking. We have created a series of non-interoperable systems that actually interact and AI will be able to figure out how to take advantage of more of those interactions: finding new tax loopholes or finding new ways to evade financial regulations. Creating “micro-legislation” that surreptitiously benefits a particular person or group. And catastrophic risk means this is no longer tenable.

So these are our core problems: misaligned incentives leading to too effective hacking of systems where the costs of getting it wrong can be catastrophic.

Or, to put more words on it: Misaligned incentives encourage local optimization, and that’s not a good proxy for societal optimization. This encourages hacking, which now generates greater harm than at any point in the past because the amount of damage that can result from local optimization is greater than at any point in the past.

OK, let’s get back to the notion of democracy as an information system. It’s not just democracy: Any form of governance is an information system. It’s a process that turns individual beliefs and preferences into group policy decisions. And, it uses feedback mechanisms to determine how well those decisions are working and then makes corrections accordingly.

Historically, there are many ways to do this. We can have a system where no one’s preference matters except the monarch’s or the nobles’ or the landowners’. Sometimes the stronger army gets to decide—or the people with the money.

Or we could tally up everyone’s preferences and do the thing that at least half of the people want. That’s basically the promise of democracy today, at its ideal. Parliamentary systems are better, but only in the margins—and it all feels kind of primitive. Lots of people write about how informationally poor elections are at aggregating individual preferences. It also results in all these misaligned incentives.
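
As a toy illustration of how much information a simple tally discards, the sketch below runs the same ranked ballots through two textbook rules: plurality (count first choices only) and a Borda count (score every position). Neither rule is proposed in the talk. They are swapped in here only to show that the aggregation method, not just the preferences, determines the outcome. The ballots and function names are invented for the example.

```python
from collections import Counter


def plurality_winner(ballots):
    """Winner by first choices only; the rest of each ranking is ignored."""
    firsts = Counter(ballot[0] for ballot in ballots)
    return firsts.most_common(1)[0][0]


def borda_winner(ballots):
    """Winner by Borda count: with n options, a ballot gives n-1 points
    to its first choice, n-2 to its second, and so on down to 0."""
    scores = Counter()
    for ballot in ballots:
        n = len(ballot)
        for rank, option in enumerate(ballot):
            scores[option] += n - 1 - rank
    return scores.most_common(1)[0][0]


# Invented electorate of nine voters ranking three options.
ballots = (
    [["A", "C", "B"]] * 4
    + [["B", "C", "A"]] * 3
    + [["C", "B", "A"]] * 2
)

print(plurality_winner(ballots))  # A: most first choices (4 of 9)
print(borda_winner(ballots))      # C: ranked first or second by every voter
```

Option A wins the plurality tally even though five of nine voters rank it last, while option C, acceptable to everyone, wins once the full rankings are counted.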

I realize that democracy serves different functions. Peaceful transition of power, minimizing harm, equality, fair decision making, better outcomes. I am taking for granted that democracy is good for all those things. I’m focusing on how we implement it.

Modern democracy uses elections to determine who represents citizens in the decision-making process. And all sorts of other ways to collect information about what people think and want, and how well policies are working. These are opinion polls, public comments to rule-making, advocating, lobbying, protesting and so on. And, in reality, it’s been hacked so badly that it does a terrible job of executing on the will of the people, creating further incentives to hack these systems.

To be fair, the democratic republic was the best form of government that mid 18th century technology could invent. Because communications and travel were hard, we needed to choose one of us to go all the way over there and pass laws in our name. It was always a coarse approximation of what we wanted. And our principles, values, conceptions of fairness; our ideas about legitimacy and authority have evolved a lot since the mid 18th century. Even the notion of optimal group outcomes depended on who was considered in the group and who was out.

But democracy is not a static system, it’s an aspirational direction. One that really requires constant improvement. And our democratic systems have not evolved at the same pace that our technologies have. Blocking progress in democracy is itself a hack of democracy.

Today we have much better technology that we can use in the service of democracy. Surely there are better ways to turn individual preferences into group policies. Now that communications and travel are easy. Maybe we should assign representation by age, or profession or randomly by birthday. Maybe we can invent an AI that calculates optimal policy outcomes based on everyone’s preferences.

Whatever we do, we need systems that better align individual and group incentives, at all scales. Systems designed to be resistant to hacking. And resilient to catastrophic risks. Systems that leverage cooperation more and conflict less. And are not zero-sum.

Why can’t we have a game where everybody wins?

This has never been done before. It’s not capitalism, it’s not communism, it’s not socialism. It’s not current democracies or autocracies. It would be unlike anything we’ve ever seen.

Some of this comes down to how trust and cooperation work. When I wrote “Liars and Outliers” in 2012, I wrote about four systems for enabling trust: our innate morals, concern about our reputations, the laws we live under and security technologies that constrain our behavior. I wrote about how the first two are more informal than the last two. And how the last two scale better, and allow for larger and more complex societies. They enable cooperation amongst strangers.

What I didn’t appreciate is how different the first and last two are. Morals and reputation are both old biological systems of trust. They’re person to person, based on human connection and cooperation. Laws—and especially security technologies—are newer systems of trust that force us to cooperate. They’re socio-technical systems. They’re more about confidence and control than they are about trust. And that allows them to scale better. Taxi driver used to be one of the country’s most dangerous professions. Uber changed that through pervasive surveillance. My Uber driver and I don’t know or trust each other, but the technology lets us both be confident that neither of us will cheat or attack each other. Both drivers and passengers compete for star rankings, which align local and global incentives.

In today’s tech-mediated world, we are replacing the rituals and behaviors of cooperation with security mechanisms that enforce compliance. And innate trust in people with compelled trust in processes and institutions. That scales better, but we lose the human connection. It’s also expensive, and becoming even more so as our power grows. We need more security for these systems. And the results are much easier to hack.

But here’s the thing: Our informal human systems of trust are inherently unscalable. So maybe we have to rethink scale.

Our 18th century systems of democracy were the only things that scaled with the technology of the time. Imagine a group of friends deciding where to have dinner. One is kosher, one is a vegetarian. They would never use a winner-take-all ballot to decide where to eat. But that’s a system that scales to large groups of strangers.

Scale matters more broadly in governance as well. We have global systems of political and economic competition. On the other end of the scale, the most common form of governance on the planet is socialism. It’s how families function: people work according to their abilities, and resources are distributed according to their needs.

I think we need governance that is both very large and very small. Our catastrophic technological risks are planetary-scale: climate change, AI, internet, bio-tech. And we have all the local problems inherent in human societies. We have very few problems anymore that are the size of France or Virginia. Some systems of governance work well on a local level but don’t scale to larger groups. But now that we have more technology, we can make other systems of democracy scale.

This runs headlong into historical norms about sovereignty. But that’s already becoming increasingly irrelevant. The modern concept of a nation arose around the same time as the modern concept of democracy. But constituent boundaries are now larger and more fluid, and depend a lot on context. It makes no sense that the decisions about the “drug war”—or climate migration—are delineated by nation. The issues are much larger than that. Right now there is no governance body with the right footprint to regulate Internet platforms like Facebook. Which has more users world-wide than Christianity.

We also need to rethink growth. Growth only equates to progress when the resources necessary to grow are cheap and abundant. Growth is often extractive. And at the expense of something else. Growth is how we fuel our zero-sum systems. If the pie gets bigger, it’s OK that we waste some of the pie in order for it to grow. That doesn’t make sense when resources are scarce and expensive. Growing the pie can end up costing more than the increase in pie size. Sustainability makes more sense. And a metric more suited to the environment we’re in right now.

Finally, agility is also important. Back to systems theory, governance is an attempt to control complex systems with complicated systems. This gets harder as the systems get larger and more complex. And as catastrophic risk raises the costs of getting it wrong.

In recent decades, we have replaced the richness of human interaction with economic models. Models that turn everything into markets. Market fundamentalism scaled better, but the social cost was enormous. A lot of how we think and act isn’t captured by those models. And those complex models turn out to be very hackable. Increasingly so at larger scales.

Lots of people have written about the speed of technology versus the speed of policy. To relate it to this talk: Our human systems of governance need to be compatible with the technologies they’re supposed to govern. If they’re not, eventually the technological systems will replace the governance systems. Think of Twitter as the de facto arbiter of free speech.

This means that governance needs to be agile. And able to quickly react to changing circumstances. Imagine a court saying to Peter Thiel: “Sorry. That’s not how Roth IRAs are supposed to work. Now give us our tax on that $5B.” This is also essential in a technological world: one that is moving at unprecedented speeds, where getting it wrong can be catastrophic and one that is resource constrained. Agile patching is how we maintain security in the face of constant hacking—and also red teaming. In this context, both journalism and civil society are important checks on government.

I want to quickly mention two ideas for democracy, one old and one new. I’m not advocating for either. I’m just trying to open you up to new possibilities. The first is sortition. These are citizen assemblies brought together to study an issue and reach a policy decision. They were popular in ancient Greece and Renaissance Italy, and are increasingly being used today in Europe. The only vestige of this in the U.S. is the jury. But you can also think of trustees of an organization. The second idea is liquid democracy. This is a system where everybody has a proxy that they can transfer to someone else to vote on their behalf. Representatives hold those proxies, and their vote strength is proportional to the number of proxies they have. We have something like this in corporate proxy governance.

Both of these are algorithms for converting individual beliefs and preferences into policy decisions. Both of these are made easier through 21st century technologies. They are both democracies, but in new and different ways. And while they’re not immune to hacking, we can design them from the beginning with security in mind.

This points to technology as a key component of any solution. We know how to use technology to build systems of trust. Both the informal biological kind and the formal compliance kind. We know how to use technology to help align incentives, and to defend against hacking.

We talked about AI hacking; AI can also be used to defend against hacking, finding vulnerabilities in computer code, finding tax loopholes before they become law and uncovering attempts at surreptitious micro-legislation.

Think back to democracy as an information system. Can AI techniques be used to uncover our political preferences and turn them into policy outcomes, get feedback and then iterate? This would be more accurate than polling. And maybe even elections. Can an AI act as our representative? Could it do a better job than a human at voting the preferences of its constituents?

Can we have an AI in our pocket that votes on our behalf, thousands of times a day, based on the preferences it infers we have? Or maybe based on the preferences it infers we would have if we read up on the issues and weren’t swayed by misinformation? It’s just another algorithm for converting individual preferences into policy decisions. And it certainly solves the problem of people not paying attention to politics.

But slow down: This is rapidly devolving into technological solutionism. And we know that doesn’t work.

A general question to ask here is when do we allow algorithms to make decisions for us? Sometimes it’s easy. I’m happy to let my thermostat automatically turn my heat on and off or to let an AI drive a car or optimize the traffic lights in a city. I’m less sure about an AI that sets tax rates, or corporate regulations or foreign policy. Or an AI that tells us that it can’t explain why, but strongly urges us to declare war—right now. Each of these is harder because they are more complex systems: non-local, multi-agent, long-duration and so on. I also want any AI that works on my behalf to be under my control. And not controlled by a large corporate monopoly that allows me to use it.

And learned helplessness is an important consideration. We’re probably OK with no longer needing to know how to drive a car. But we don’t want a system that results in us forgetting how to run a democracy. Outcomes matter here, but so do mechanisms. Any AI system should engage individuals in the process of democracy, not replace them.

So while an AI that does all the hard work of governance might generate better policy outcomes, there is social value in a human-centric political system, even if it is less efficient. And more technologically efficient preference collection might not be better, even if it is more accurate.

Procedure and substance need to work together. There is a role for AI in decision making: moderating discussions, highlighting agreements and disagreements, helping people reach consensus. But it is an independent good that we humans remain engaged in, and in charge of, the process of governance.

And that value is critical to making democracy function. Democratic knowledge isn’t something that’s out there to be gathered: It’s dynamic; it gets produced through the social processes of democracy. The term of art is “preference formation.” We’re not just passively aggregating preferences, we create them through learning, deliberation, negotiation and adaptation. Some of these processes are cooperative and some of these are competitive. Both are important. And both are needed to fuel the information system that is democracy.

We’re never going to remove conflict and competition from our political and economic systems. Human disagreement isn’t just a surface feature; it goes all the way down. We have fundamentally different aspirations. We want different ways of life. I talked about optimal policies. Even that notion is contested: optimal for whom, with respect to what, over what time frame? Disagreement is fundamental to democracy. We reach different policy conclusions based on the same information. And it’s the process of making all of this work that makes democracy possible.

So we actually can’t have a game where everybody wins. Our goal has to be to accommodate plurality, to harness conflict and disagreement, and not to eliminate it. While, at the same time, moving from a player-versus-player game to a player-versus-environment game.

There’s a lot missing from this talk. Like what these new political and economic governance systems should look like. Democracy and capitalism are intertwined in complex ways, and I don’t think we can recreate one without also recreating the other. My comments about agility lead to questions about authority and how that interplays with everything else. And how agility can be hacked as well. We haven’t even talked about tribalism in its many forms. In order for democracy to function, people need to care about the welfare of strangers who are not like them. We haven’t talked about rights or responsibilities. What is off limits to democracy is a huge discussion. And Buterin’s trilemma also matters here: that you can’t simultaneously build systems that are secure, distributed, and scalable.

I also haven’t given a moment’s thought to how to get from here to there. Everything I’ve talked about—incentives, hacking, power, complexity—also applies to any transition systems. But I think we need to have unconstrained discussions about what we’re aiming for. If for no other reason than to question our assumptions. And to imagine the possibilities. And while a lot of the AI parts are still science fiction, they’re not far-off science fiction.

I know we can’t clear the board and build a new governance structure from scratch. But maybe we can come up with ideas that we can bring back to reality.

To summarize, the systems of governance we designed at the start of the Industrial Age are ill-suited to the Information Age. Their incentive structures are all wrong. They’re insecure and they’re wasteful. They don’t generate optimal outcomes. At the same time we’re facing catastrophic risks to society due to powerful technologies. And a vastly constrained resource environment. We need to rethink our systems of governance; more cooperation and less competition and at scales that are suited to today’s problems and today’s technologies. With security and precautions built in. What comes after democracy might very well be more democracy, but it will look very different.

This feels like a challenge worthy of our security expertise.

This text is the transcript from a keynote speech delivered during the RSA Conference in San Francisco on April 25, 2023. It was previously published in Cyberscoop. I thought I posted it to my blog and Crypto-Gram last year, but it seems that I didn’t.

Enhancing Network Synergy: rConfig’s Native Integration with Zabbix

Post Syndicated from Stephen Stack original https://blog.zabbix.com/enhancing-network-synergy-rconfigs-native-integration-with-zabbix/28283/

Native integration between two leading open-source tools, Zabbix for network monitoring and rConfig for configuration management, delivers substantial benefits to organizations. On one side, Zabbix offers a platform that maintains a Single Source of Truth for network device inventories. It provides real-time monitoring, problem detection, alerting, and other critical features that are essential for day-to-day operations, ensuring smooth and reliable network connectivity crucial for business continuity.

On the other side, there’s rConfig, renowned for its robust and reliable network automation, configuration backup, and compliance management. Integrating rConfig with Zabbix enhances its capabilities, allowing for seamless Device Inventory synchronization. This union not only simplifies the management of network configurations but also introduces more advanced Network Automation Platform features. Together, they form a powerhouse toolset that streamlines network management tasks, reduces operational overhead, and boosts overall network performance, making it easier for businesses to focus on growth and innovation without being hindered by network reliability concerns.

Optimizing Network Management with Unified Inventory

At rConfig, we are deeply embedded with our customers, and our main mission is to work with them to solve their real-world problems. One significant challenge that consistently surfaces – both from client feedback and our own experiences – is managing and accurately locating a trusted and reliable central network inventory. This challenge brings to the forefront a classic dilemma in Enterprise Architecture circles: In our scenario of network inventory, which system ought to act as the System of Record, and which should function as the System of Engagement to optimize interactions with records for various purposes, such as Network Management Systems (NMS) and Network Configuration Management (NCM)?

Enterprise Architecture circles illustrating systems of record, insight and engagement. Credit: Sharon Moore - https://samoore.me/

At rConfig, from a product perspective, we’ve chosen to focus on what we do best and love most: Network Configuration Management. Therefore, integrating with an upstream Network Management System (NMS) that can act as the System of Record for network device inventory was a logical step for us. Given that many of our customers also use Zabbix for network operations, it was a natural choice to begin our integration journey with them. Our platforms are highly complementary, which streamlines the integration process and enhances our ability to serve our customers better. This strategic decision allows us to offer a seamless and efficient management solution that not only meets current needs but also scales to address future challenges in network management.

Enhanced Integration Through ETL

You might be wondering how this integration works and whether it’s straightforward or challenging to set up. Setting up the integration between rConfig and Zabbix is relatively straightforward, but, as with any complex data-driven system, it requires careful planning and diligence to ensure that the data flow between the systems is fully optimized and automated. This is where ETL (Extract, Transform, Load) plays a crucial role. ETL is a process that involves extracting data from the Zabbix API in its raw form, transforming it into a format that rConfig can readily process and validate, and then loading it into the rConfig production database. This process also handles any data conflicts and updates.
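As a rough illustration of the extract, transform, load pattern described above, here is a minimal Python sketch. It is not rConfig's implementation: the sample data, the target field names (`source_id`, `name`, `ip`, `vendor`) and the in-memory "database" are all assumptions made for the example. Only the raw field names (`hostid`, `host`, `interfaces`, `tags`) follow the Zabbix host object.

```python
from __future__ import annotations

from typing import Any

# Hypothetical sample of what an extract step might return.
RAW_HOSTS = [
    {"hostid": "10101", "host": "core-sw-01",
     "interfaces": [{"ip": "192.0.2.10"}],
     "tags": [{"tag": "vendor", "value": "cisco"}]},
    {"hostid": "10102", "host": "edge-rtr-01",
     "interfaces": [{"ip": "192.0.2.20"}],
     "tags": [{"tag": "vendor", "value": "juniper"}]},
]


def extract() -> list[dict[str, Any]]:
    """Stand-in for the API call: return copies of the raw host records."""
    return [dict(host) for host in RAW_HOSTS]


def transform(raw: dict[str, Any]) -> dict[str, str]:
    """Map one raw host to a flat device record (target fields are assumed).

    Assumes the host has at least one interface.
    """
    tags = {t["tag"]: t["value"] for t in raw.get("tags", [])}
    return {
        "source_id": raw["hostid"],
        "name": raw["host"],
        "ip": raw["interfaces"][0]["ip"],
        "vendor": tags.get("vendor", "unknown"),
    }


def load(devices: dict[str, dict[str, str]], record: dict[str, str]) -> str:
    """Upsert keyed on the source id, so a re-sync updates instead of duplicating."""
    action = "updated" if record["source_id"] in devices else "created"
    devices[record["source_id"]] = record
    return action


def run_etl(devices: dict[str, dict[str, str]]) -> list[str]:
    """Run all three stages and report what happened to each record."""
    return [load(devices, transform(raw)) for raw in extract()]
```

Keying the load step on the upstream host id is one simple way to handle the conflicts and updates mentioned above: running the pipeline twice leaves two devices, not four.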

The advantages of using ETL are significant, enhancing data quality and making the data more accessible, thereby enabling rConfig to analyze information more effectively and make well-informed, data-driven decisions. At rConfig, our user interface is designed to aid in the development and troubleshooting of features, though we’re also fond of using the CLI for those who prefer it. Below is a screenshot from our lab showing the end-to-end ETL process with Zabbix in action. It illustrates the steps rConfig takes to connect to Zabbix, extract, validate, transform and map the data, load it to staging, and finally, move it to the production environment for a small set of devices.

While the screenshot below displays just a few devices as a sample integration in our lab, the most extensive integration we’ve achieved in a production environment with this new rConfig feature involved syncing a single Zabbix instance with over 5,000 host/device records. This highlights its efficiency and reliability in a real-world environment.

Screenshot of rConfig Zabbix Integration on the Command Line

Going Deeper: Understanding the Integration Process

To grasp the integration process more clearly, let’s dive into the details that will help you understand how to set everything up before we automate the task. Our documentation website, docs.rconfig.com, provides comprehensive details, and our YouTube channel features a great demonstration video of the entire process.

Initial Setup: The first step involves configuring rConfig to connect and authenticate with the Zabbix API. This setup is managed through the Configuration page in the rConfig user interface. During this phase, you can also apply filters to select specific Zabbix tags or host groups, refining exactly which host records you want to synchronize.
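For readers curious what such a filtered query could look like at the API level, the sketch below builds a Zabbix `host.get` JSON-RPC request restricted by host group and tag. It is an illustration only and says nothing about how rConfig issues its requests. The helper name `build_host_get_request`, the example URL and the token are hypothetical, and the `Authorization: Bearer` header assumes a Zabbix version that accepts API tokens that way (older versions expect an `auth` field in the request body instead).

```python
import json
import urllib.request


def build_host_get_request(base_url, api_token, group_ids=None, tags=None):
    """Build (but do not send) a host.get request filtered by groups and tags."""
    params = {
        "output": ["hostid", "host", "name", "status"],
        "selectInterfaces": ["ip", "dns", "port", "type"],
        "selectTags": "extend",
    }
    if group_ids:
        params["groupids"] = list(group_ids)
    if tags:
        params["evaltype"] = 0  # 0 = And/Or across the tag conditions
        params["tags"] = [
            {"tag": name, "value": value, "operator": 1}  # 1 = exact match
            for name, value in tags.items()
        ]
    payload = {"jsonrpc": "2.0", "method": "host.get", "params": params, "id": 1}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api_jsonrpc.php",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json-rpc",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately keeps the filter logic easy to inspect and test without a live Zabbix server.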

Screenshot of Zabbix Configuration page in rConfig V7 professional

Data Extraction and Validation: Once the connection is established, rConfig extracts host records in raw JSON format. This stage involves validating the data to ensure that the correct tags and data mappings are in place.
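The kind of check this stage performs can be sketched as follows. The rules here (required keys, a parseable IP address on every interface, and a required tag named `rconfig_template`) are assumptions chosen for illustration, not rConfig's actual validation rules.

```python
import ipaddress

REQUIRED_KEYS = ("hostid", "host", "interfaces")


def validate_host(raw, required_tag="rconfig_template"):
    """Return a list of problems found in one raw host record (empty if valid)."""
    errors = []
    for key in REQUIRED_KEYS:
        if not raw.get(key):
            errors.append(f"missing {key}")
    for iface in raw.get("interfaces") or []:
        try:
            ipaddress.ip_address(iface.get("ip", ""))
        except ValueError:
            errors.append(f"invalid ip {iface.get('ip')!r}")
    tag_names = {t.get("tag") for t in raw.get("tags") or []}
    if required_tag not in tag_names:
        errors.append(f"missing tag {required_tag}")
    return errors


def partition_hosts(raw_hosts, required_tag="rconfig_template"):
    """Split records into (valid, rejected); rejected keeps the reasons."""
    valid, rejected = [], []
    for raw in raw_hosts:
        errors = validate_host(raw, required_tag)
        if errors:
            rejected.append((raw, errors))
        else:
            valid.append(raw)
    return valid, rejected
```

Keeping the rejection reasons alongside each rejected record makes it easier to correct tags or mappings on the Zabbix side before the next extract.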

Screenshot of Zabbix Raw Host Extract page in rConfig V7 Professional

Staging for Review: After validation, the data is loaded into a staging table. This allows for a thorough review to confirm that the mapped rConfig data fields are correct, ensuring that the newly imported devices are associated with the appropriate connection templates, categories, and tags.

Screenshot of Zabbix host staging table in rConfig V7 Professional

Final Loading: The final step involves transferring the staged devices to the main production devices table. After this transfer, the staging table is cleared. The devices then appear in the main device table, marked with a special icon indicating that they are synced through integration.
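The staging-to-production pattern described in the last two steps can be illustrated with a small SQLite sketch. The table and column names (`devices`, `devices_staging`, `synced`) are hypothetical and say nothing about rConfig's real schema. The `synced` flag stands in for the marker that identifies integration-managed devices, and the upsert syntax assumes SQLite 3.24 or newer.

```python
import sqlite3


def create_schema(conn):
    """Create a hypothetical production table and its staging counterpart."""
    conn.executescript("""
        CREATE TABLE devices (
            source_id TEXT PRIMARY KEY,
            name      TEXT NOT NULL,
            ip        TEXT NOT NULL,
            synced    INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE devices_staging (
            source_id TEXT PRIMARY KEY,
            name      TEXT NOT NULL,
            ip        TEXT NOT NULL
        );
    """)


def promote_staging(conn):
    """Move every staged row into production, then clear the staging table."""
    moved = conn.execute("SELECT COUNT(*) FROM devices_staging").fetchone()[0]
    with conn:  # one transaction: commit on success, roll back on error
        # "WHERE true" is needed so SQLite can parse ON CONFLICT after a SELECT.
        conn.execute("""
            INSERT INTO devices (source_id, name, ip, synced)
            SELECT source_id, name, ip, 1 FROM devices_staging WHERE true
            ON CONFLICT(source_id) DO UPDATE SET
                name = excluded.name,
                ip = excluded.ip,
                synced = 1
        """)
        conn.execute("DELETE FROM devices_staging")
    return moved
```

Doing the copy and the clear in a single transaction means a failure part-way through leaves the staged rows in place for another attempt.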

Screenshot of Zabbix host fully loaded to production devices table in rConfig V7 Professional 

Seamless Operational Integration: Once the devices are loaded into the production table, they are automatically incorporated into standard rConfig scheduled tasks, automations, or any other rConfig feature that utilizes the device data (like categories and tags). This integration facilitates a seamless operational workflow between the platforms. Users can even access these devices directly in Zabbix from within the rConfig UI, streamlining operations management.

After all the above steps are completed and the initial setup is done, future loads run automatically on a schedule using the rConfig Task manager.

Screenshot of rConfig Device detail view for a Zabbix integrated host

This detailed setup and validation process ensures that the integration between rConfig and Zabbix is not only effective but also enhances the functionality and efficiency of managing network devices across platforms.

Case Study: Enhancing Network Management for a Las Vegas Entertainment Organization

  1. Challenge: A prominent Las Vegas entertainment organization faced significant difficulties in managing the diverse and complex network that supports their extensive operations, including gaming, security, and hospitality services. The primary issues were outdated network inventories and inefficient management of network configurations across numerous devices, leading to operational disruptions and security vulnerabilities.
  2. Solution: To address these challenges, the organization implemented the integration of rConfig with Zabbix, focusing on automating and centralizing the network management process. This solution aimed to synchronize network device inventories across the organization’s extensive operations, ensuring accurate and real-time data availability.
  3. Implementation: The integration process began with setting up Zabbix to continuously monitor and gather data from network devices across different venues and services. This data was then extracted, standardized, and loaded into rConfig, where it could be used for automated configuration management and backup. The setup also included sophisticated mapping and validation to ensure all data transferred between Zabbix and rConfig was accurate and relevant.

Benefits:

  • Improved Network Reliability: The automated synchronization of network inventories reduced the frequency of network failures and minimized downtime, which is crucial in the high-stakes environment of Las Vegas entertainment.
  • Enhanced Security: With more accurate and timely network data, the organization could better identify and respond to security threats, protecting sensitive information and ensuring the safety of both guests and operations.
  • Operational Efficiency: The IT team was able to shift their focus from routine network maintenance to strategic initiatives that enhanced overall business operations, including integrating new technologies and improving guest experiences.
  • Scalability: The integration provided a scalable solution that could accommodate future expansion, whether adding new devices or incorporating new technologies or venues into the network.

Outcome: The implementation of the rConfig and Zabbix integration dramatically transformed the organization’s network management capabilities. The IT department noted a substantial reduction in the manpower and time required for routine maintenance, while operational uptime improved significantly. The organization now enjoys a robust, streamlined network management system that supports its dynamic environment, ensuring that both guests and staff benefit from reliable and secure network services.

This case study highlights the power of effective network management solutions in supporting complex operations and enhancing business efficiency and security within the entertainment industry.

Conclusion: Forging Ahead with Innovative Partnerships

In conclusion, the Zabbix platform stands out as a cornerstone in network monitoring, renowned for its extensive capabilities in real-time monitoring, problem detection, and alerting. Its robust architecture not only supports a broad range of network environments but also offers the flexibility and scalability necessary for today’s diverse technological landscapes. The platform’s ability to provide detailed and accurate network insights is crucial for organizations aiming to maintain optimal operational continuity and security.

The integration of Zabbix with rConfig, a globally reliable and robust network configuration management (NCM) solution, enhances these benefits significantly, creating a synergistic relationship that leverages the strengths of both platforms. For customers and partners, this integration means not only smoother and more efficient network management but also the assurance that they are supported by two of the leading solutions in the industry. Together, Zabbix and rConfig deliver a comprehensive network management experience that drives efficiency, reduces costs, and ensures a higher level of network reliability and security, positioning them as indispensable tools in the toolkit of any organization serious about its network infrastructure.

About rConfig

rConfig is an industry leader in network configuration management and automation. Founded in 2010 and based in Ireland, rConfig has been at the forefront of delivering innovative solutions that simplify the complexities of network management. Our software is designed to be both powerful and user-friendly, making it an ideal choice for IT professionals across a variety of sectors, including education, government, manufacturing, and large global enterprises.

With the capability to manage tens of thousands of devices, rConfig offers robust functionalities such as automated config backups, compliance management, and network automation. Our platform is vendor-agnostic, which allows seamless integration with a diverse range of network devices and systems, from traditional IT to IoT and OT environments. This flexibility ensures that our clients can manage all aspects of their network configurations, regardless of the underlying technology.

rConfig is committed to continuous innovation and customer-centric solutions, with industry-first solutions such as API backups and our Script Integration Engine. Our native integration with platforms like Zabbix exemplifies our dedication to enhancing network management through strategic partnerships. This collaboration not only streamlines operations but also amplifies the benefits provided, ensuring that our customers have access to the most advanced tools in the industry.

 

The post Enhancing Network Synergy: rConfig’s Native Integration with Zabbix appeared first on Zabbix Blog.