Rapid7’s Ciara Cullinan Recognized as Community Trailblazer in Belfast Awards Program

2024-03-14 Rapid7

Post Syndicated from Rapid7 original https://blog.rapid7.com/2024/03/14/rapid7s-ciara-cullinan-recognized-as-community-trailblazer-in-belfast-awards-program/

At the 2024 Women Who Code She Rocks Awards, Rapid7 Software Engineer II Ciara Cullinan was recognized with their ‘Community Trailblazer’ award.

According to Women Who Code, “This award celebrates the efforts of someone who brings people together and creates genuine connections in our tech community. Whether this is online or in-person, this person demonstrates exceptional commitment to building a thriving and inclusive community.”

When it comes to building community, Ciara is a true champion who is consistently looking for ways to establish and grow meaningful connections among her team, across the organization, and in the local tech industry. Whether it’s encouraging engagement in various Slack channels with ‘water cooler’ questions and ice breakers, or driving Rapid7’s sponsorship of Women Techmakers, she’s proactively seeking out ways to bring people together while growing her own network in the process.

“I think a lot of times – and especially for women – we focus on perfection in our work. We can be hesitant to share things until we have it 100% figured out ourselves. However, when we are able to build strong personal connections with our colleagues, or even others in the industry, the bravery to put something forward or ask for feedback comes much easier. That connection opens up the door to have honest conversations, share ideas, and provide feedback. This is where we can work together to drive impact and grow our skills, which lead to rewarding career experiences and growth.”

In addition to her role as an engineer, Ciara is an active member of Rapid7 Women. Rapid7 Women is an employee resource group that aims to support, enable, and empower all employees identifying as women to bring their best, true selves to work every day through community, action, and activism. Ciara actively contributes to this mission by helping build global and local initiatives for the group. As mentioned in her nomination submission, “Ciara collaborates with colleagues from around the globe, in different business units and roles to build a Women program that caters to supporting not only Women identifying individuals, but also seeks to educate allies on how to be a culture contributor exhibiting inclusive leadership traits.”

Ciara also highlights the importance of bringing more women into the tech industry, and how organizations like Women Who Code can make a difference. “In my role I am one of two women on the team. As technology continues to evolve and things like Artificial Intelligence become part of our everyday life, it’s important to get more women involved in the field to combat any implicit bias in the things that are being built. Bringing more diverse perspectives into a team can also help drive innovation and help organizations work through challenges more efficiently. Awards and programs like this help showcase what’s possible for the next generation of women, allowing them to see and then realize the potential a career in tech could hold for them.”

To learn more about Women Who Code’s Belfast community, visit their website.

To learn more about Rapid7’s culture, and our Rapid Impact Groups, visit our careers page.

[$] The first half of the 6.9 merge window

2024-03-14 corbet

Post Syndicated from corbet original https://lwn.net/Articles/965141/

As of this writing, just over 4,900 non-merge changesets have been pulled
into the mainline for the 6.9 release. This work includes the usual array
of changes all over the kernel tree; read on for a summary of the most
significant work merged during the first part of the 6.9 merge window.

NIS2 Directive Compliance Checklist for Companies

2024-03-14 Editor

Post Syndicated from Editor original https://nebosystems.eu/nis2-compliance-checklist-guide/


In response to the evolving cybersecurity threats, the European Union has introduced the Network & Information System (NIS2) Directive, setting a new standard for cybersecurity measures across member states. Understanding and complying with these requirements is crucial for organizations operating within the EU.

This checklist is designed to help companies understand whether they are affected by the NIS2 Directive (Directive (EU) 2022/2555) and need to comply with its cybersecurity requirements. Answering these questions will provide an initial assessment of your company’s obligations under the Directive.

Section 1: Company Size and Type

1. Is your company considered a medium-sized enterprise or larger according to the EU definition? (More than 50 employees and an annual turnover or balance sheet exceeding €10 million)

2. Does your company operate in the digital infrastructure sector, including as a DNS service provider, TLD name registry, or cloud computing service provider?

3. Is your company a small enterprise or micro-enterprise that plays a key role in society, the economy, or within specific sectors or types of service? (Consider if your services are critical even if your company is small.)

Section 2: Sector-Specific Questions

4. Is your company involved in any of the following sectors?

Energy

Transport

Banking

Financial Market Infrastructure

Health sector

Drinking water

Digital infrastructure

Public administration

Space

None of the above

5. Does your company provide essential services within these sectors that, if disrupted, would have a significant impact on societal or economic activities?

Section 3: Operational Impact

6. Does your company rely heavily on network and information systems for the provision of your services?

7. In the event of a cybersecurity incident, could your company’s services be significantly disrupted, leading to substantial financial loss or societal impact?

Section 4: Exclusions

8. Is your company’s primary activity related to national security, public security, defense, or law enforcement? (Note: If only marginally related, you might still fall under the Directive.)

9. Is your company a public administration entity that predominantly carries out activities in the areas of national security, public security, defense, or law enforcement?

Section 5: Additional Considerations

10. Has your company been previously identified as an operator of essential services under the NIS Directive or any national legislation related to cybersecurity?

11. Is your company part of the supply chain for critical services in any of the sectors identified in question 4?

Conclusion

Questions 1, 2, or 3 (Company Size and Type): If you answered “Yes” to any of these, your company falls within the scope of the NIS2 Directive due to its size, operation within digital infrastructure, or significant role despite being a small or microenterprise. Next Steps: Assess specific obligations under the NIS2 Directive and begin implementing necessary cybersecurity measures and reporting mechanisms.
Question 4 (Sector Involvement): A “Yes” response indicates your company operates in a sector directly affected by the NIS2 Directive. Next Steps: Identify sector-specific cybersecurity requirements and engage with sector regulators or national cybersecurity authorities for guidance.
Question 5 (Provision of Essential Services): If “Yes,” your services are crucial, making compliance with the NIS2 Directive imperative to ensure service continuity and security. Next Steps: Prioritize establishing a comprehensive risk management framework and incident response plan as per NIS2 requirements.
Questions 6 and 7 (Operational Impact): Affirmative answers highlight your reliance on network and information systems and potential significant impacts from cybersecurity incidents. Next Steps: Strengthen your cybersecurity infrastructure, focusing on resilience and rapid incident response capabilities.
Questions 8 and 9 (Exclusions): If you answered “Yes,” your company might be excluded due to its primary focus on national security or law enforcement. However, marginal involvement doesn’t grant exclusion. Next Steps: Clarify your exclusion status with legal experts and, if applicable, review your cybersecurity practices to ensure they’re adequate for your operational needs.
Question 10 (Previous Identification as Essential Service Operator): A “Yes” answer suggests your company was already under obligations similar to those in the NIS2 Directive, which will likely continue or expand under the new directive. Next Steps: Update your cybersecurity and compliance strategies to align with NIS2 enhancements and consult with authorities for transitional requirements.
Question 11 (Part of the Supply Chain for Critical Services): Answering “Yes” indicates your role in the supply chain could bring you under the NIS2 Directive’s purview, especially with its increased focus on supply chain security. Next Steps: Evaluate your cybersecurity practices in the context of supply chain integrity, collaborate with your partners to understand your shared responsibilities, and implement any necessary security and reporting enhancements.

Please note that this checklist provides a preliminary assessment, and the specific obligations under the NIS2 Directive may vary based on national transposition and interpretation by regulatory authorities.


General Advice

Regardless of your answers, it’s advisable for all companies, especially those operating within or closely related to critical sectors, to adopt robust cybersecurity measures. The evolving cybersecurity landscape and the interconnected nature of digital services mean that comprehensive security practices are essential for resilience against cyber threats.

For companies potentially falling under the NIS2 Directive, consider the following steps:

Review and Update Security Policies: Ensure that your cybersecurity policies are up-to-date and align with the best practices.
Engage with Regulatory Authorities: Reach out to your national cybersecurity authority or sector-specific regulatory bodies to clarify your status under the NIS2 Directive and to obtain guidance on compliance.
Consult Legal and Cybersecurity Experts: Seek advice from professionals specializing in cybersecurity law and technical security measures to ensure that your company meets all legal obligations and effectively mitigates cyber risks.
Implement a Compliance Plan: Develop or update your cybersecurity compliance plan to address the requirements of the NIS2 Directive, focusing on risk management, incident reporting, supply chain security, and other relevant areas.

Remember, even if your company is not directly affected by the NIS2 Directive, adopting its principles can enhance your cybersecurity posture and potentially offer a competitive advantage by demonstrating a commitment to security to your clients and partners.

Ready to ensure your company is NIS2 compliant? Contact Nebosystems today for expert NIS2 compliance consulting. Our team is dedicated to helping you navigate these regulations, ensuring your cybersecurity measures are robust and compliant. Explore our NIS2 Compliance Cybersecurity Solutions for more information on how we can assist.

Reference: NIS2 Directive (Directive (EU) 2022/2555). EUR-Lex.

Security updates for Thursday

2024-03-14 jake

Post Syndicated from jake original https://lwn.net/Articles/965470/

Security updates have been issued by Debian (chromium and openvswitch), Fedora (chromium, python-multipart, thunderbird, and xen), Mageia (java-17-openjdk and screen), Red Hat (.NET 7.0, .NET 8.0, kernel-rt, kpatch-patch, postgresql:13, and postgresql:15), Slackware (expat), SUSE (glibc, python-Django, python-Django1, sudo, and vim), and Ubuntu (expat, linux-ibm, linux-ibm-5.4, linux-oracle, linux-oracle-5.4, linux-lowlatency, linux-raspi, python-cryptography, texlive-bin, and xorg-server).

Upcoming Let’s Encrypt certificate chain change and impact for Cloudflare customers

2024-03-14 Dina Kozlov

Post Syndicated from Dina Kozlov original https://blog.cloudflare.com/upcoming-lets-encrypt-certificate-chain-change-and-impact-for-cloudflare-customers

Let’s Encrypt, a publicly trusted certificate authority (CA) that Cloudflare uses to issue TLS certificates, has been relying on two distinct certificate chains. One is cross-signed with IdenTrust, a globally trusted CA that has been around since 2000, and the other is Let’s Encrypt’s own root CA, ISRG Root X1. Since Let’s Encrypt launched, ISRG Root X1 has been steadily gaining its own device compatibility.

On September 30, 2024, Let’s Encrypt’s certificate chain cross-signed with IdenTrust will expire. To proactively prepare for this change, on May 15, 2024, Cloudflare will stop issuing certificates from the cross-signed chain and will instead use Let’s Encrypt’s ISRG Root X1 chain for all future Let’s Encrypt certificates.

The change in the certificate chain will impact legacy devices and systems, such as Android devices version 7.1.1 or older, as those exclusively rely on the cross-signed chain and lack the ISRG X1 root in their trust store. These clients may encounter TLS errors or warnings when accessing domains secured by a Let’s Encrypt certificate.

According to Let’s Encrypt, more than 93.9% of Android devices already trust the ISRG Root X1 and this number is expected to increase in 2024, especially as Android releases version 14, which makes the Android trust store easily and automatically upgradable.

We looked at the data ourselves and found that 2.96% of all Android requests come from devices that will be affected by the change. In addition, only 1.13% of all requests from Firefox come from affected versions, which means that most (98.87%) of the requests from Firefox on Android will not be impacted.

Preparing for the change

If you’re worried about the change impacting your clients, there are a few things that you can do to reduce the impact of the change. If you control the clients that are connecting to your application, we recommend updating the trust store to include the ISRG Root X1. If you use certificate pinning, remove or update your pin. In general, we discourage all customers from pinning their certificates, as this usually leads to issues during certificate renewals or CA changes.
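If you control the client, one way to check a trust store is to look for a root whose subject common name is "ISRG Root X1". The following is a minimal Python sketch, not a definitive audit: the function names are hypothetical, and it inspects only the certificates that Python's `ssl` module reports as loaded, which on some platforms excludes CAs that are read lazily from a directory.

```python
import ssl

ISRG_ROOT_X1_CN = "ISRG Root X1"


def common_names(ca_certs):
    """Yield the subject commonName of each certificate dict.

    `ca_certs` uses the structure returned by ssl.SSLContext.get_ca_certs():
    each entry has a "subject" tuple of relative distinguished names, and
    each of those is a tuple of (field, value) pairs.
    """
    for cert in ca_certs:
        for rdn in cert.get("subject", ()):
            for field, value in rdn:
                if field == "commonName":
                    yield value


def trusts_isrg_root_x1(ca_certs):
    """Return True if any certificate in the list is ISRG Root X1."""
    return ISRG_ROOT_X1_CN in set(common_names(ca_certs))


def default_store_trusts_isrg_root_x1():
    """Check the platform default trust store as seen by Python.

    Certificates in a CA directory may not be listed until they are used,
    so a False result is a prompt to investigate, not a verdict.
    """
    context = ssl.create_default_context()
    return trusts_isrg_root_x1(context.get_ca_certs())
```

Calling `default_store_trusts_isrg_root_x1()` on a client machine gives a quick first indication; devices that do not run Python need their own platform-specific check.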

If you experience issues with the Let’s Encrypt chain change, and you’re using Advanced Certificate Manager or SSL for SaaS on the Enterprise plan, you can choose to switch your certificate to use Google Trust Services as the certificate authority instead.

For more information, please refer to our developer documentation.

While this change will impact a very small portion of clients, we support the shift that Let’s Encrypt is making as it supports a more secure and agile Internet.

Embracing change to move towards a better Internet

Looking back, there were a number of challenges that slowed down the adoption of new technologies and standards that helped make the Internet faster, more secure, and more reliable.

For starters, before Cloudflare launched Universal SSL, free certificates were not attainable. Instead, domain owners had to pay around $100 to get a TLS certificate. For a small business, this is a big cost and without browsers enforcing TLS, this significantly hindered TLS adoption for years. Insecure algorithms have taken decades to deprecate due to lack of support of new algorithms in browsers or devices. We learned this lesson while deprecating SHA-1.

Supporting new security standards and protocols is vital for us to continue improving the Internet. Over the years, big and sometimes risky changes were made in order for us to move forward. The launch of Let’s Encrypt in 2015 was monumental. Let’s Encrypt allowed every domain to get a TLS certificate for free, which paved the way to a more secure Internet, with now around 98% of traffic using HTTPS.

In 2014, Cloudflare launched elliptic curve digital signature algorithm (ECDSA) support for Cloudflare-issued certificates and made the decision to issue ECDSA-only certificates to free customers. This boosted ECDSA adoption by pressing clients and web operators to make changes to support the new algorithm, which provided the same (if not better) security as RSA while also improving performance. In addition to that, modern browsers and operating systems are now being built in a way that allows them to constantly support new standards, so that they can deprecate old ones.

For us to move forward in supporting new standards and protocols, we need to make the Public Key Infrastructure (PKI) ecosystem more agile. By retiring the cross-signed chain, Let’s Encrypt is pushing devices, browsers, and clients to support adaptable trust stores. This allows clients to support new standards without causing a breaking change. It also lays the groundwork for new certificate authorities to emerge.

Today, one of the main reasons why there’s a limited number of CAs available is that it takes years for them to become widely trusted, that is, without cross-signing with another CA. In 2017, Google launched a new publicly trusted CA, Google Trust Services, that issued free TLS certificates. Even though they launched a few years after Let’s Encrypt, they faced the same challenges with device compatibility and adoption, which caused them to cross-sign with GlobalSign’s CA. We hope that, by the time GlobalSign’s CA comes up for expiration, almost all traffic is coming from a modern client and browser, meaning the change impact should be minimal.

Sending and receiving CloudEvents with Amazon EventBridge

2024-03-14 David Boyne

Post Syndicated from David Boyne original https://aws.amazon.com/blogs/compute/sending-and-receiving-cloudevents-with-amazon-eventbridge/

Amazon EventBridge helps developers build event-driven architectures (EDA) by connecting loosely coupled publishers and consumers using event routing, filtering, and transformation. CloudEvents is an open-source specification for describing event data in a common way. Developers can publish CloudEvents directly to EventBridge, filter and route them, and use input transformers and API Destinations to send CloudEvents to downstream AWS services and third-party APIs.

Overview

Event design is an important aspect of any event-driven architecture, yet developers often overlook it when building their systems. This leads to unwanted side effects like exposing implementation details, lack of standards, and version incompatibility.

Without event standards, it can be difficult to integrate events or streams of messages between systems, brokers, and organizations. Each system has to understand the event structure or rely on custom-built solutions for versioning or validation.

CloudEvents is a specification for describing event data in common formats to provide interoperability between services, platforms, and systems using Cloud Native Computing Foundation (CNCF) projects. As CloudEvents is a CNCF graduated project, many third-party brokers and systems adopt this specification.

Using CloudEvents as a standard format to describe events makes integration easier and you can use open-source tooling to help build event-driven architectures and future proof any integrations. EventBridge can route and filter CloudEvents based on common metadata, without needing to understand the business logic within the event itself.
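To illustrate routing on common metadata, here is a simplified Python sketch of exact-value matching in the style of an EventBridge event pattern. It assumes the CloudEvents attributes are carried inside the event's detail field; the `matches` function and `ORDER_PLACED_PATTERN` are hypothetical names, and real event patterns support many more operators than this sketch does.

```python
def matches(pattern, event):
    """Simplified EventBridge-style matching: exact values only.

    Each leaf in `pattern` is a list of accepted values; nested dicts
    descend into nested event fields. Operators such as prefix, numeric,
    or exists matching are deliberately left out.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical rule: route on CloudEvents attributes only, without
# looking at the business payload stored under "data".
ORDER_PLACED_PATTERN = {
    "detail": {
        "specversion": ["1.0"],
        "type": ["OrderPlaced"],
        "source": ["myapp.orders-service"],
    }
}
```

Because the pattern names only envelope attributes, the same rule keeps working when the payload under data changes shape.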

CloudEvents supports two implementation modes, structured mode and binary mode, and a range of protocols including HTTP, MQTT, AMQP, and Kafka. When publishing events to an EventBridge bus, you can structure events as CloudEvents and route them to downstream consumers. You can use input transformers to transform any event into the CloudEvents specification. Events can also be forwarded to public APIs using EventBridge API destinations, which support both structured and binary mode encodings, enhancing interoperability with external systems.

Standardizing events using Amazon EventBridge

When publishing events to an EventBridge bus, EventBridge uses its own event envelope and represents events as JSON objects. EventBridge requires that you define top-level fields, such as detail-type and source. You can use any event/payload in the detail field.

This example event shows an OrderPlaced event from the orders-service that is unstructured without any event standards. The data within the event contains the order_id, customer_id and order_total.

{
  "version": "0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c57",
  "account": "1234567890",
  "time": "2023-05-23T11:38:46Z",
  "region": "us-east-1",
  "detail-type": "OrderPlaced",
  "source": "myapp.orders-service",
  "resources": [],
  "detail": {
    "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    }
  }
}

Publishers may also choose to add an additional metadata field along with the data field within the detail field to help define a set of standards for their events.

{
  "version": "0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
  "account": "1234567890",
  "time": "2023-05-23T12:38:46Z",
  "region": "us-east-1",
  "detail-type": "OrderPlaced",
  "source": "myapp.orders-service",
  "resources": [],
  "detail": {
    "metadata": {
      "idempotency_key": "29d2b068-f9c7-42a0-91e3-5ba515de5dbe",
      "correlation_id": "dddd9340-135a-c8c6-95c2-41fb8f492222",
      "domain": "ORDERS",
      "time": "1707908605"
    },
    "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    }
  }
}

This additional event information helps downstream consumers, improves debugging, and can manage idempotency. While this approach offers practical benefits, it duplicates solutions that are already solved with the CloudEvents specification.
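As a rough illustration of how a consumer might use such a metadata field, the Python sketch below skips events whose idempotency_key has already been seen. The `IdempotentConsumer` class is hypothetical and keeps its state in memory only; a real consumer would most likely need a durable store.

```python
class IdempotentConsumer:
    """Process each event at most once, keyed by detail.metadata.idempotency_key."""

    def __init__(self, handler):
        self._handler = handler
        # In-memory only: state is lost on restart and not shared
        # between consumer instances.
        self._seen = set()

    def handle(self, event):
        """Return True if the event was processed, False if it was a duplicate."""
        key = event["detail"]["metadata"]["idempotency_key"]
        if key in self._seen:
            return False
        self._handler(event["detail"]["data"])
        # Record the key only after the handler succeeds, so a failed
        # attempt can be retried.
        self._seen.add(key)
        return True
```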

Publishing CloudEvents using Amazon EventBridge

When publishing events to EventBridge, you can use CloudEvents structured mode. A structured-mode message is where the entire event (attributes and data) is encoded in the message body, according to a specific event format. A binary-mode message is where the event data is stored in the message body, and event attributes are stored as part of the message metadata.

CloudEvents has a list of required fields but also offers flexibility with optional attributes and extensions. CloudEvents also offers a way to implement idempotency: the combination of id and source must uniquely identify an event, so it can be used as the idempotency key in downstream implementations.

{
  "version": "0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
  "account": "1234567890",
  "time": "2023-05-23T12:38:46Z",
  "region": "us-east-1",
  "detail-type": "OrderPlaced",
  "source": "myapp.orders-service",
  "resources": [],
  "detail": {
    "specversion": "1.0",
    "id": "bba4379f-b764-4d90-9fb2-9f572b2b0b61",
    "source": "myapp.orders-service",
    "type": "OrderPlaced",
    "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    },
    "time": "2024-01-01T17:31:00Z",
    "dataschema": "https://us-west-2.console.aws.amazon.com/events/home?region=us-west-2#/registries/discovered-schemas/schemas/myapp.orders-service%40OrderPlaced",
    "correlationid": "dddd9340-135a-c8c6-95c2-41fb8f492222",
    "domain": "ORDERS"
  }
}

By incorporating the required fields, the OrderPlaced event is now CloudEvents compliant. The event also contains optional and extension fields for additional information. Optional fields such as dataschema can be useful for brokers and consumers to retrieve a URI path to the published event schema. This example event references the schema in the EventBridge schema registry, so downstream consumers can fetch the schema to validate the payload.
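A publisher could assemble such an event along the following lines. This is a sketch under stated assumptions: `make_cloudevent`, `to_put_events_entry`, and `idempotency_key` are hypothetical helper names, the bus name is a placeholder, and the entry keys follow the shape of the EventBridge PutEvents request.

```python
import json
import uuid
from datetime import datetime, timezone

REQUIRED_ATTRIBUTES = ("specversion", "id", "source", "type")


def make_cloudevent(source, event_type, data, **extensions):
    """Build a CloudEvents 1.0 envelope as a plain dict."""
    event = {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        "time": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "data": data,
    }
    # Extension attributes (for example correlationid or domain) sit
    # next to the standard attributes.
    event.update(extensions)
    return event


def to_put_events_entry(event, bus_name="default"):
    """Wrap a CloudEvent in the entry shape used by the PutEvents API."""
    missing = [name for name in REQUIRED_ATTRIBUTES if not event.get(name)]
    if missing:
        raise ValueError(f"missing required CloudEvents attributes: {missing}")
    return {
        "EventBusName": bus_name,
        "Source": event["source"],
        "DetailType": event["type"],
        "Detail": json.dumps(event),
    }


def idempotency_key(event):
    """The (source, id) pair uniquely identifies a CloudEvent."""
    return (event["source"], event["id"])
```

With the AWS SDK for Python, the resulting entry could then be sent with `boto3.client("events").put_events(Entries=[entry])`; credentials, region, and error handling are left out of this sketch.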

Mapping existing events into CloudEvents using input transformers

When you define a target in EventBridge, input transformations allow you to modify the event before it reaches its destination. Input transformers are configured per target, allowing you to convert events when your downstream consumer requires the CloudEvents format and you want to avoid duplicating information.

Input transformers allow you to map EventBridge fields, such as id, region, detail-type, and source, into corresponding CloudEvents attributes.

This example shows how to transform any EventBridge event into CloudEvents format using input transformers, so the target receives the required structure.

{
  "version": "0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
  "account": "1234567890",
  "time": "2024-01-23T12:38:46Z",
  "region": "us-east-1",
  "detail-type": "OrderPlaced",
  "source": "myapp.orders-service",
  "resources": [],
  "detail": {
    "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
    "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
    "order_total": "120.00"
  }
}

Using this input transformer and input template, EventBridge transforms the event schema into the CloudEvents specification for downstream consumers.

Input transformer for CloudEvents:

{
  "id": "$.id",
  "source": "$.source",
  "type": "$.detail-type",
  "time": "$.time",
  "data": "$.detail"
}

Input template for CloudEvents:

{
  "specversion": "1.0",
  "id": "<id>",
  "source": "<source>",
  "type": "<type>",
  "time": "<time>",
  "data": <data>
}

This example shows the event payload that is received by downstream targets, which is mapped to the CloudEvents specification.

{
  "specversion": "1.0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
  "source": "myapp.orders-service",
  "type": "OrderPlaced",
  "time": "2024-01-23T12:38:46Z",
  "data": {
    "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
    "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
    "order_total": "120.00"
  }
}
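To make the mapping easier to follow, the Python sketch below imitates the two steps locally: it resolves each input path against the event, then fills the matching placeholder in the template. It is only an approximation for experimenting with templates, not EventBridge's implementation. The `resolve` and `apply_input_transformer` names are hypothetical, only simple dotted paths are supported, and string values are assumed to contain no characters that need JSON escaping.

```python
import json
import re

INPUT_PATHS = {
    "id": "$.id",
    "source": "$.source",
    "type": "$.detail-type",
    "time": "$.time",
    "data": "$.detail",
}

INPUT_TEMPLATE = """{
  "specversion": "1.0",
  "id": "<id>",
  "source": "<source>",
  "type": "<type>",
  "time": "<time>",
  "data": <data>
}"""


def resolve(path, event):
    """Resolve a simple dotted path such as "$.detail-type" or "$.detail"."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    value = event
    for part in path[2:].split("."):
        value = value[part]
    return value


def apply_input_transformer(input_paths, template, event):
    """Fill <name> placeholders in the template, then parse the result as JSON."""
    def substitute(match):
        value = resolve(input_paths[match.group(1)], event)
        # Strings are inserted raw because the template supplies the
        # quotes; objects and arrays are serialized as JSON.
        return value if isinstance(value, str) else json.dumps(value)

    return json.loads(re.sub(r"<([A-Za-z0-9_-]+)>", substitute, template))
```

Applied to the OrderPlaced event shown earlier, this yields the same CloudEvents payload that the downstream target receives.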

For more information on using input transformers with CloudEvents, see this pattern on Serverless Land.

Transforming events into CloudEvents using API destinations

EventBridge API destinations allows you to trigger HTTP endpoints based on matched rules to integrate with third-party systems using public APIs. You can route events to APIs that support the CloudEvents format by using input transformations and custom HTTP headers to convert EventBridge events to CloudEvents. API destinations now supports custom content-type headers. This allows you to send structured or binary CloudEvents to downstream consumers.

Sending binary CloudEvents using API destinations

When sending binary CloudEvents over HTTP, you must use the HTTP binding specification and set the necessary CloudEvents headers. These headers tell the downstream consumer that the incoming payload uses the CloudEvents format. The body of the request is the event itself.

CloudEvents headers are prefixed with ce-. You can find the list of headers in the HTTP protocol binding documentation.

This example shows the headers for a binary event:

POST /order HTTP/1.1 
Host: webhook.example.com
ce-specversion: 1.0
ce-type: OrderPlaced
ce-source: myapp.orders-service
ce-id: bba4379f-b764-4d90-9fb2-9f572b2b0b61
ce-time: 2024-01-01T17:31:00Z
ce-dataschema: https://us-west-2.console.aws.amazon.com/events/home?region=us-west-2#/registries/discovered-schemas/schemas/myapp.orders-service%40OrderPlaced
ce-correlationid: dddd9340-135a-c8c6-95c2-41fb8f492222
ce-domain: ORDERS
Content-Type: application/json; charset=utf-8

This example shows the body for a binary event:

{
  "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
  "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
  "order_total": "120.00"
}
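The split between headers and body can be sketched in Python as follows. This assumes the event is held as a plain dict and follows the HTTP protocol binding in giving every context attribute, extensions included, a ce- prefix, with datacontenttype mapped to Content-Type. The `to_binary_http` name is hypothetical, and the header-value encoding rules from the binding are not implemented here.

```python
import json


def to_binary_http(event):
    """Split a CloudEvent dict into HTTP headers and body (binary mode)."""
    headers = {}
    for name, value in event.items():
        if name == "data":
            continue  # the payload travels in the body, not in a header
        if name == "datacontenttype":
            headers["Content-Type"] = value
        else:
            headers[f"ce-{name}"] = str(value)
    # Assume a JSON payload when the event does not declare a content type.
    headers.setdefault("Content-Type", "application/json; charset=utf-8")
    body = json.dumps(event.get("data"))
    return headers, body
```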

For more information when using binary CloudEvents with API destinations, explore this pattern available on Serverless Land.

Sending structured CloudEvents using API destinations

To support structured mode with CloudEvents, you must specify the content-type as application/cloudevents+json; charset=UTF-8, which tells the API consumer that the payload of the event is adhering to the CloudEvents specification.

POST /order HTTP/1.1
Host: webhook.example.com
Content-Type: application/cloudevents+json; charset=utf-8

{
    "specversion": "1.0",
    "id": "bba4379f-b764-4d90-9fb2-9f572b2b0b61",
    "source": "myapp.orders-service",
    "type": "OrderPlaced",      
    "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    },
    "time": "2024-01-01T17:31:00Z",
    "dataschema": "https://us-west-2.console.aws.amazon.com/events/home?region=us-west-2#/registries/discovered-schemas/schemas/myapp.orders-service%40OrderPlaced",
    "correlationid": "dddd9340-135a-c8c6-95c2-41fb8f492222",
    "domain":"ORDERS"
}
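For comparison with binary mode, a structured-mode request can be sketched in Python as below: the whole event goes into the body and only the content type announces the format. Both function names are hypothetical, and the consumer-side check is a simplification that looks at the media type only.

```python
import json

STRUCTURED_CONTENT_TYPE = "application/cloudevents+json; charset=utf-8"


def to_structured_http(event):
    """Encode the whole CloudEvent (attributes and data) in the request body."""
    headers = {"Content-Type": STRUCTURED_CONTENT_TYPE}
    return headers, json.dumps(event)


def is_structured_cloudevent(headers):
    """Consumer-side check: does the Content-Type announce a structured CloudEvent?"""
    content_type = headers.get("Content-Type", "")
    # Ignore parameters such as charset and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/cloudevents+json"
```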

Conclusion

Carefully designing events plays an important role when building event-driven architectures to integrate producers and consumers effectively. The open-source CloudEvents specification helps developers to standardize integration processes, simplifying interactions between internal systems and external partners.

EventBridge allows you to use a flexible payload structure within an event’s detail property to standardize events. You can publish structured CloudEvents directly onto an event bus in the detail field and use payload transformations to allow downstream consumers to receive events in the CloudEvents format.

EventBridge simplifies integration with third-party systems using API destinations. Using the new custom content-type headers with input transformers to modify the event structure, you can send structured or binary CloudEvents to integrate with public APIs.

For more serverless learning resources, visit Serverless Land.

What People with Mortgages Should Expect After the Introduction of the Euro

2024-03-14 VassilKendov

Post Syndicated from VassilKendov original https://kendov.com/%D0%BA%D0%B0%D0%BA%D0%B2%D0%BE-%D0%B4%D0%B0-%D0%BE%D1%87%D0%B0%D0%BA%D0%B2%D0%B0%D1%82-%D1%85%D0%BE%D1%80%D0%B0%D1%82%D0%B0-%D1%81-%D0%B8%D0%BF%D0%BE%D1%82%D0%B5%D0%BA%D0%B8-%D1%81%D0%BB%D0%B5%D0%B4/

The full recording of the video with Rosi Deneva and Vassil Kendov is available in the paid section of the Kendov.com Patreon channel.

Expected changes

– Interest rates on loans will rise and converge with European levels (between 6% and 7.5%)
– Lending will contract under pressure from the ECB and the BNB
– Property prices will diverge between new and old construction


The post What People with Mortgages Should Expect After the Introduction of the Euro appeared first on Kendov.com.

Mitigating a token-length side-channel attack in our AI products

2024-03-14 Celso Martinho

Post Syndicated from Celso Martinho original https://blog.cloudflare.com/ai-side-channel-attack-mitigated

Since the discovery of CRIME, BREACH, TIME, LUCKY-13 etc., length-based side-channel attacks have been considered practical. Even though packets were encrypted, attackers were able to infer information about the underlying plaintext by analyzing metadata like the packet length or timing information.

Cloudflare was recently contacted by a group of researchers at Ben Gurion University who wrote a paper titled “What Was Your Prompt? A Remote Keylogging Attack on AI Assistants” that describes “a novel side-channel that can be used to read encrypted responses from AI Assistants over the web”.
The Workers AI and AI Gateway team collaborated closely with these security researchers through our Public Bug Bounty program, discovering and fully patching a vulnerability that affects LLM providers. You can read the detailed research paper here.

Since being notified about this vulnerability, we’ve implemented a mitigation to help secure all Workers AI and AI Gateway customers. As far as we could assess, there was no outstanding risk to Workers AI and AI Gateway customers.

How does the side-channel attack work?

In the paper, the authors describe a method in which they intercept the stream of a chat session with an LLM provider, use the network packet headers to infer the length of each token, extract and segment their sequence, and then use their own dedicated LLMs to infer the response.

The two main requirements for a successful attack are an AI chat client running in streaming mode and a malicious actor capable of capturing network traffic between the client and the AI chat service. In streaming mode, the LLM tokens are emitted sequentially, introducing a token-length side-channel. Malicious actors could eavesdrop on packets via public networks or within an ISP.

An example request vulnerable to the side-channel attack looks like this:

curl -X POST \
https://api.cloudflare.com/client/v4/accounts/<account-id>/ai/run/@cf/meta/llama-2-7b-chat-int8 \
  -H "Authorization: Bearer <Token>" \
  -d '{"stream":true,"prompt":"tell me something about portugal"}'

Let’s use Wireshark to inspect the network packets on the LLM chat session while streaming:

The first packet has a length of 95 and corresponds to the token “Port” which has a length of four. The second packet has a length of 93 and corresponds to the token “ug” which has a length of two, and so on. By removing the likely token envelope from the network packet length, it is easy to infer how many tokens were transmitted and their sequence and individual length just by sniffing encrypted network data.
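The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' tooling: the 91-byte envelope is an assumption derived from the two packets described (95 - 4 and 93 - 2), and the function name is hypothetical.

```python
# Hypothetical illustration: recover token lengths from observed packet sizes.
# ENVELOPE_BYTES is an assumption derived from the two packets described in the
# text (95 - 4 = 91 and 93 - 2 = 91); a real capture would need calibration.
ENVELOPE_BYTES = 91


def infer_token_lengths(packet_lengths, envelope=ENVELOPE_BYTES):
    """Return the inferred token length for each observed packet length.

    Packets no larger than the envelope are skipped, since they cannot
    carry a token under this simple model.
    """
    return [length - envelope for length in packet_lengths if length > envelope]


observed = [95, 93]  # packet lengths mentioned in the text
print(infer_token_lengths(observed))  # [4, 2]
```

The output is only a sequence of lengths. Turning that sequence back into text is the part that, according to the paper, requires a dedicated LLM.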

Since the attacker needs the sequence of individual token lengths, this vulnerability only affects text generation models using streaming. This means that AI inference providers that use streaming, the most common way of interacting with LLMs, are potentially vulnerable, Workers AI included.

This method requires that the attacker is on the same network or in a position to observe the communication traffic, and its accuracy depends on knowing the target LLM’s writing style. In ideal conditions, the researchers claim that their system “can reconstruct 29% of an AI assistant’s responses and successfully infer the topic from 55% of them”. It’s also important to note that unlike other side-channel attacks, in this case the attacker has no way of evaluating their prediction against the ground truth. That means that we are as likely to get a sentence with near perfect accuracy as we are to get one where the only things that match are conjunctions.

Mitigating LLM side-channel attacks

Since this type of attack relies on the length of tokens being inferred from the packet, it can be just as easily mitigated by obscuring token size. The researchers suggested a few strategies to mitigate these side-channel attacks, the simplest of which is padding the token responses with noise of random length so that token lengths, and therefore responses, cannot be inferred from the packets. We immediately added the mitigation to our own inference product, Workers AI, and we also wanted to help customers secure their LLMs regardless of where they are running them, so we added it to our AI Gateway as well.

As of today, all users of Workers AI and AI Gateway are now automatically protected from this side-channel attack.

What we did

Once we got word of this research work and how exploiting the technique could potentially impact our AI products, we did what we always do in situations like this: we assembled a team of systems engineers, security engineers, and product managers and started discussing risk mitigation strategies and next steps. We also had a call with the researchers, who kindly attended, presented their conclusions, and answered questions from our teams.

Unfortunately, at this point, this research does not include actual code that we can use to reproduce the claims or the effectiveness and accuracy of the described side-channel attack. However, we think that the paper has theoretical merit, that it provides enough detail and explanations, and that the risks are not negligible.

We decided to incorporate the first mitigation suggestion in the paper: adding random padding to each message to hide the actual length of tokens in the stream, thereby complicating attempts to infer information based solely on network packet size.

Workers AI, our inference product, is now protected

With our inference-as-a-service product, anyone can use the Workers AI platform and make API calls to our supported AI models. This means that we oversee the inference requests being made to and from the models. As such, we have a responsibility to ensure that the service is secure and protected from potential vulnerabilities. We immediately rolled out a fix once we were notified of the research, and all Workers AI customers are now automatically protected from this side-channel attack. We have not seen any malicious attacks exploiting this vulnerability, other than the ethical testing from the researchers.

Our solution for Workers AI is a variation of the mitigation strategy suggested in the research document. Since we stream JSON objects rather than the raw tokens, instead of padding the tokens with whitespace characters, we added a new property, “p” (for padding) that has a string value of variable random length.

Example streaming response using the SSE syntax:

data: {"response":"portugal","p":"abcdefghijklmnopqrstuvwxyz0123456789a"}
data: {"response":" is","p":"abcdefghij"}
data: {"response":" a","p":"abcdefghijklmnopqrstuvwxyz012"}
data: {"response":" southern","p":"ab"}
data: {"response":" European","p":"abcdefgh"}
data: {"response":" country","p":"abcdefghijklmno"}
data: {"response":" located","p":"abcdefghijklmnopqrstuvwxyz012345678"}

This has the advantage that no modifications are required in the SDK or the client code, the changes are invisible to the end-users, and no action is required from our customers. By adding padding of random, variable length to the JSON objects, we introduce the same network-level variability, and the attacker essentially loses the required input signal. Customers can continue using Workers AI as usual while benefiting from this protection.
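As a rough sketch of the idea, the padding scheme could look like the following in Python. This is not Cloudflare's implementation: the function names, the padding alphabet, and the 40-character upper bound are all assumptions chosen for illustration. Only the "response" and "p" property names and the SSE "data: " prefix come from the example above.

```python
import json
import secrets
import string

# Assumed values for illustration only; the real service's alphabet and
# maximum padding length are not stated in the text.
ALPHABET = string.ascii_lowercase + string.digits
MAX_PADDING = 40


def pad_event(token: str, max_padding: int = MAX_PADDING) -> str:
    """Wrap a token in an SSE data line with a random-length "p" property."""
    pad_len = secrets.randbelow(max_padding) + 1  # 1..max_padding characters
    padding = "".join(secrets.choice(ALPHABET) for _ in range(pad_len))
    return "data: " + json.dumps({"response": token, "p": padding})


def read_event(line: str) -> str:
    """Client side: parse the JSON payload and ignore the padding."""
    payload = json.loads(line[len("data: "):])
    return payload["response"]


if __name__ == "__main__":
    for token in ["portugal", " is", " a", " southern"]:
        line = pad_event(token)
        # The line length now varies independently of the token length.
        print(len(line), repr(read_event(line)))
```

A client that only reads the "response" property behaves exactly as before, which is why no SDK change is needed under this scheme.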

One step further: AI Gateway protects users of any inference provider

We added protection to our AI inference product, but we also have a product that proxies requests to any provider: AI Gateway. AI Gateway acts as a proxy between a user and supported inference providers, helping developers gain control, performance, and observability over their AI applications. In line with our mission to help build a better Internet, we wanted to quickly roll out a fix that can help all our customers using text generation AIs, regardless of which provider they use or whether that provider has mitigations to prevent this attack. To do this, we implemented a similar solution that pads all streaming responses proxied through AI Gateway with random noise of variable length.

Our AI Gateway customers are now automatically protected against this side-channel attack, even if the upstream inference providers have not yet mitigated the vulnerability. If you are unsure whether your inference provider has patched this vulnerability yet, use AI Gateway to proxy your requests and ensure that you are protected.

Conclusion

At Cloudflare, our mission is to help build a better Internet – that means that we care about all citizens of the Internet, regardless of what their tech stack looks like. We are proud to be able to improve the security of our AI products in a way that is transparent and requires no action from our customers.

We are grateful to the researchers who discovered this vulnerability and have been very collaborative in helping us understand the problem space. If you are a security researcher who is interested in helping us make our products more secure, check out our Bug Bounty program at hackerone.com/cloudflare.

Interview with Georgi Georgiev of BOEC: “DANS handed the Turkish Stream roadmap to Stefan Yanev; Kiril Petkov did not know”

2024-03-14 Nikolay Marchenko

Post Syndicated from Nikolay Marchenko original https://bivol.bg/boec-geoergiev-turskipotok.html

Thursday, 14 March 2024

The roadmap for the “Turkish Stream 2” (“Balkan Stream”) gas pipeline was discovered by the State Agency for National Security (DANS) as early as 2021. This was shared in an interview with Bivol by the chairman of the civic…

Automakers Are Sharing Driver Data with Insurers without Consent

2024-03-14 Bruce Schneier

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/03/automakers-are-sharing-driver-data-with-insurers-without-consent.html

Kashmir Hill has the story:

Modern cars are internet-enabled, allowing access to services like navigation, roadside assistance and car apps that drivers can connect to their vehicles to locate them or unlock them remotely. In recent years, automakers, including G.M., Honda, Kia and Hyundai, have started offering optional features in their connected-car apps that rate people’s driving. Some drivers may not realize that, if they turn on these features, the car companies then give information about how they drive to data brokers like LexisNexis [who then sell it to insurance companies].

Automakers and data brokers that have partnered to collect detailed driving data from millions of Americans say they have drivers’ permission to do so. But the existence of these partnerships is nearly invisible to drivers, whose consent is obtained in fine print and murky privacy policies that few read.

State Medical Boards: Last Week Tonight with John Oliver (HBO)

2024-03-14 LastWeekTonight

Post Syndicated from LastWeekTonight original https://www.youtube.com/watch?v=jVIYbgVks7E

Digital forgeries are hard

2024-03-14 Matthew Garrett

Post Syndicated from Matthew Garrett original https://mjg59.dreamwidth.org/69507.html

Closing arguments in the trial between various people and Craig Wright over whether he’s Satoshi Nakamoto are wrapping up today, amongst a bewildering array of presented evidence. But one utterly astonishing aspect of this lawsuit is that expert witnesses for both sides agreed that much of the digital evidence provided by Craig Wright was unreliable in one way or another, generally including indications that it wasn’t produced at the point in time it claimed to be. And it’s fascinating reading through the subtle (and, in some cases, not so subtle) ways that that’s revealed.

One of the pieces of evidence entered is screenshots of data from Mind Your Own Business, a business management product that’s been around for some time. Craig Wright relied on screenshots of various entries from this product to support his claims around having controlled a meaningful number of bitcoin before he was publicly linked to being Satoshi. If these were authentic then they’d be strong evidence linking him to the mining of coins before Bitcoin’s public availability. Unfortunately the screenshots themselves weren’t contemporary – the metadata shows them being created in 2020. This wouldn’t fundamentally be a problem (it’s entirely reasonable to create new screenshots of old material), as long as it’s possible to establish that the material shown in the screenshots was created at that point. Sadly, well.

One part of the disclosed information was an email that contained a zip file that contained a raw database in the format used by MYOB. Importing that into the tool allowed an audit record to be extracted – this record showed that the relevant entries had been added to the database in 2020, shortly before the screenshots were created. This was, obviously, not strong evidence that Craig had held Bitcoin in 2009. This evidence was reported, and was responded to with a couple of additional databases that had an audit trail that was consistent with the dates in the records in question. Well, partially. The audit record included session data, showing an administrator logging into the database in 2011 and then, uh, logging out in 2023, which is rather more consistent with someone changing their system clock to 2011 to create an entry, and switching it back to present day before logging out. In addition, the audit log included fields that didn’t exist in versions of the product released before 2016, strongly suggesting that the entries dated 2009-2011 were created in software released after 2016. And even worse, the order of insertions into the database didn’t line up with calendar time – an entry dated before another entry may appear in the database afterwards, indicating that it was created later. But even more obvious? The database schema used for these old entries corresponded to a version of the software released in 2023.
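The insertion-order check is easy to picture in code. The following Python sketch is purely illustrative and has nothing to do with MYOB's actual database format or the expert's tooling: the row ids, dates, and function name are invented, and a flagged row is only a hint worth a closer look, since bookkeeping software can legitimately accept late entries.

```python
from datetime import date


def find_out_of_order(entries):
    """Flag rows whose claimed date is earlier than a row inserted before them.

    `entries` is a list of (row_id, claimed_date) tuples in insertion order.
    Hypothetical helper for illustration; not taken from any real tool.
    """
    suspicious = []
    latest_seen = None
    for row_id, claimed in entries:
        if latest_seen is not None and claimed < latest_seen:
            # Inserted after a row with a later date: worth investigating.
            suspicious.append(row_id)
        else:
            latest_seen = claimed
    return suspicious


# Invented sample data: row 3 claims 2009 but was inserted after a 2011 row.
rows = [
    (1, date(2009, 3, 1)),
    (2, date(2011, 6, 15)),
    (3, date(2009, 8, 20)),
    (4, date(2011, 7, 1)),
]
print(find_out_of_order(rows))  # [3]
```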

This is all consistent with the idea that these records were created after the fact and backdated to 2009-2011, and that after this evidence was made available further evidence was created and backdated to obfuscate that. In an unusual turn of events, during the trial Craig Wright introduced further evidence in the form of a chain of emails to his former lawyers that indicated he had provided them with login details to his MYOB instance in 2019 – before the metadata associated with the screenshots. The implication isn’t entirely clear, but it suggests that either they had an opportunity to examine this data before the metadata suggests it was created, or that they faked the data? So, well, the obvious thing happened, and his former lawyers were asked whether they received these emails. The chain consisted of three emails, two of which they confirmed they’d received. And they received a third email in the chain, but it was different to the one entered in evidence. And, uh, weirdly, they’d received a copy of the email that was submitted – but they’d received it a few days earlier. In 2024.

And again, the forensic evidence is helpful here! It turns out that the email client used associates a timestamp with any attachments, which in this case included an image in the email footer – and the mysterious time travelling email had a timestamp in 2024, not 2019. This was created by the client, so was consistent with the email having been sent in 2024, not being sent in 2019 and somehow getting stuck somewhere before delivery. The date header indicates 2019, as do encoded timestamps in the MIME headers – consistent with the mail being sent by a computer with the clock set to 2019.

But there’s a very weird difference between the copy of the email that was submitted in evidence and the copy that was located afterwards! The first included a header inserted by gmail that included a 2019 timestamp, while the latter had a 2024 timestamp. Is there a way to determine which of these could be the truth? It turns out there is! The format of that header changed in 2022, and the version in the email is the new version. The version with the 2019 timestamp is anachronistic – the format simply doesn’t match the header that gmail would have introduced in 2019, suggesting that an email sent in 2022 or later was modified to include a timestamp of 2019.

This is by no means the only indication that Craig Wright’s evidence may be misleading (there’s the whole argument that the Bitcoin white paper was written in LaTeX when general consensus is that it’s written in OpenOffice, given that’s what the metadata claims), but it’s a lovely example of a more general issue.

Our technology chains are complicated. So many moving parts end up influencing the content of the data we generate, and those parts develop over time. It’s fantastically difficult to generate an artifact now that precisely corresponds to how it would look in the past, even if we go to the effort of installing an old OS on an old PC and setting the clock appropriately (are you sure you’re going to be able to mimic an entirely period appropriate patch level?). Even the version of the font you use in a document may indicate it’s anachronistic. I’m pretty good at computers and I no longer have any belief I could fake an old document.

(References: this Dropbox, under “Expert reports”, “Patrick Madden”. Initial MYOB data is in “Appendix PM7”, further analysis is in “Appendix PM42”, email analysis is “Sixth Expert Report of Mr Patrick Madden”)


Before the election, or how one becomes a presidential candidate in the US

2024-03-14 Yoanna Elmi

Post Syndicated from Yoanna Elmi original https://www.toest.bg/predi-izborite-ili-kak-se-stava-kandidat-prezident-v-usa/

The US elections, part I. The mechanisms of the vote


Whether you are walking the streets of Washington, D.C., or strolling through small resort towns like Ocean City in Maryland, you can buy a T-shirt with the face of either Joe Biden or Donald Trump, and from one and the same vendor. Those flying out of John F. Kennedy in New York, meanwhile, can take home souvenir chocolates bearing the likeness of the 45th or the 46th American president. Behind this lies not only commercial ingenuity, the American ability to turn everything into a product, or yet another election year, but also the deep divisions that supposedly tear at the social fabric in America and yet are an everyday reality to which everyone has grown accustomed.

The US is a key partner of Bulgaria, a world power, and a major party to an increasingly heated military conflict, so the presidential election coming up in November understandably concerns not only Bulgarians but the world. At the same time, an entire ocean separates us from the US, and digital proximity to the American world is illusory: for many Europeans, for example, the electoral process in America is dark matter. In Bulgaria, American politics is often viewed either through short news items and sketches or through the distorted interpretations of disinformation sources, such as those that earlier this year announced the start of a civil war in the US that never took place.

But the fact that America is a favorite topic of anti-democratic propaganda in Bulgaria does not mean there are no very real and serious problems across the ocean. Understanding them is not only a matter of our interest in politics. It can help us better understand how democracy works, and also what can happen when the mechanisms rust and are not replaced with new ones. That is why, in a series of articles, we will look at some of the most frequently asked questions about American elections and untangle why it is actually so hard to make yourself a democracy. We start from the very beginning.

The primaries

American elections are a complex process with more steps than standard European elections. Campaigns often begin in the preceding year, with multiple candidates from each party. By tradition, the party of the sitting president does not put forward other candidates if he has served only one term, that is, if he is entitled to one more. That is why the current president, Joe Biden, was not challenged by a fellow party member. Since the 1970s, no sitting president has been defeated in this way during the primaries.

This year, however, in some states a record number of Democratic voters chose the “I support no one” option during the primaries. This is the result of a civic campaign that uses the protest vote as punishment for US policy in the conflict between Israel and Palestine. Since the Hamas terrorist attack of 7 October 2023 and Israel’s subsequent response (supported by the US), more than 30,000 Palestinian civilians have been killed to date. The Palestinian territories are in a humanitarian crisis, and evidence is mounting of the disproportionate use of force by Israel against the civilian population. The conflict in the Middle East is feeding divisions within the Democratic Party and may erode support for the current president.

In an election year, any event, whether a domestic or a foreign conflict, can tip the scales one way or the other. At the same time, however, the US is seeing a so-called hardening of the electorate, which we will describe in detail in upcoming articles. This phenomenon produces hard electoral cores that support their candidates no matter what, even when it comes to numerous crimes, as with the Republican candidate Donald Trump, for some of which he has already been convicted.

What is the Electoral College, and is it democratic?

In 2016 Donald Trump won the election even though a larger share of Americans voted for Hillary Clinton. How did this happen, and could it be repeated in 2024?

Democracy in the US is not direct, as it is in many European countries. In direct democracies, the ballots are counted and the totals are converted into percentages of support for each party. But when Americans vote, they are in fact choosing whom certain people from their state will choose.

Sounds complicated? Probably because it is. These certain people are part of the Electoral College, written into the American Constitution by the founders as a compromise between electing the president directly by popular vote and having Congress appoint him. It is because of the Electoral College that five times in US history a president who did not win the most votes has nonetheless won the election.

The College has 538 members. The president is elected if he receives at least 270 of those 538 votes. The main argument in favor of the Electoral College is that delegated representatives could make a better decision than the masses, avoiding a potential tyranny of the majority over the minority or being misled by populists and demagogues.

Of course, nothing guarantees that the members of the Electoral College are more educated or more competent than the average voter; today they are simply party appointees whom the laws of some states oblige to abide by the will of the voters. In the 2016 election a record number of electors did not do so. Some of the Democratic electors, for example, voted for Bernie Sanders instead of Hillary Clinton, because the primaries had split the party’s supporters into two wings.

Defenders of the Electoral College argue that it ensures proportional representation of the entire population of the States. Otherwise, candidates would concentrate their efforts on the large population centers, where the most voters are, and would completely ignore the needs of the central, more sparsely populated states.

But nowadays the needs and interests of voters are not determined by the population or the size of their state. Divisions over major issues such as climate change, the right to abortion, or the Second Amendment are live in cities and small towns alike, and most states have a relatively diverse mix of Republican and Democratic supporters, as well as unaffiliated voters.

A growing number of experts are adamant that the Electoral College is undemocratic and should be abolished. The process has never been especially popular among voters either: data from the fall of 2023 show that 65% of Americans want a direct vote in presidential elections. Instead of having an equalizing effect among the states, the Electoral College creates so-called swing states. These change over the years, but the idea is that they receive disproportionate attention from candidates, who know that the electors’ votes in the remaining states are guaranteed to them. This also refutes the strongest argument in favor of the Electoral College.

However, since the College is usually defended by those in whose favor it works, the Republican Party has for years blocked its abolition, which Democrats are calling for. This is yet another fault line between the two parties, one that has become hostage to political interests rather than to heeding the wishes of the majority.

Different visualizations of the 2016 election. The map gives a wrong impression of political sentiment in the country, and not only because the states of Middle America are far more sparsely populated than the coastal states. *Screenshot from VOX*

What are party caucuses, and what is Super Tuesday?

The US also has so-called party caucuses: various groups made up of members of the respective party. Their role in nominating candidates is key in the primaries, which culminate in so-called Super Tuesday (this year it fell in early March). After Super Tuesday, the last day on which members of a given party in a given state vote for their preferred nominee, each party’s presidential candidate for the general election is clear.

Party caucuses appeared in 1800 and were made up of members of Congress who nominated presidential candidates. The criticism of them is that they fail to guarantee the separation of powers, i.e. the principle that the three branches, executive (the president), legislative (Congress, comprising the Senate and the House of Representatives), and judicial (the Supreme Court and the courts), must be independent of one another. In 1812, for example, President James Madison was pressured by members of Congress, who made it a condition that he declare war on the United Kingdom if he wanted them to renominate him.

The tension generated by this system peaked in 1824, when none of the candidates nominated by the party caucuses managed to win enough votes and Congress de facto had to appoint a president. The crisis of the time and the appointment of the less popular candidate led to the collapse of the first two-party system in the US and the emergence of today’s two parties, though not in the form in which we know them now.

Party caucuses remained highly undemocratic until the 1960s. Until then, the primaries were seen more as a test of whether the candidate chosen by a party would do well in the general election. Since 1968 the Democratic Party has been continually changing how its party caucuses function, trying to make them ever more democratic. This is a long (and certainly boring for the reader) process of innovations, repeal of innovations and their replacement with others, codifying what is forbidden and what is permitted in the process, and so on. The situation is similar among the Republicans, who give the states more freedom to set the rules for their own party caucuses.

What do these complex mechanisms show us?

If all this has given you a headache, you are not alone. Beyond the classic separation of powers, the entire US system of government is built from similar micro checks and balances, which sometimes work and sometimes take away more freedoms than they give, in the good cases only temporarily. The Electoral College compromise is just one of many built into the very foundations of the state. These compromises between often mutually exclusive positions provide the much-needed balance in which some freedom can thrive, including the most important one: the freedom to change mechanisms already in place.

It is an illusion to think that democracy guarantees fully efficient government or some kind of utopia. On the contrary, democracy is more chaotic, takes more time, and often involves pressure and struggle between particular groups in order to reach the best decision for all. Nor is democracy synonymous with freedom and equality; it means only an equal chance to strive for them, and a state that effectively protects us both from itself and from ourselves.

Up North: The Land of the Inuit

2024-03-14 Svetla Stoyanova

Post Syndicated from Svetla Stoyanova original https://www.toest.bg/stranata-na-inuitite/


I woke up in the night and the sun was shining as if it were a bright afternoon. I wondered how a person does not go mad here, from the constant light or from the constant darkness during the rest of the year. How do Greenlanders manage to live in this very peculiar place on the planet?

The Inuit inhabit the most remote and northernmost regions of the world and are probably one of the ethnic groups best adapted to the Arctic climate and conditions. Through their specific skills and way of life they have learned to survive at temperatures as low as –40°C.

The name Eskimos, by which they are better known in Bulgaria, probably originates from another ethnic group in North America, which called the Inuit “those who eat raw meat”. The negative connotation in the name has persisted over the centuries, creating an overall demeaning stereotype of the group. The native population of Greenland refers to itself as Inuit, which translates as “people”.

In the past the lands of Greenland were inhabited by other peoples as well. The earliest was the Saqqaq, dating back to before 2500 BC, followed by the Dorset, around 400 BC, and then the Thule, whose descendants are today’s Inuit, who migrated from Canada across the frozen surface of the ocean during the Little Ice Age of the Middle Ages. In the 10th century the Vikings also reached Greenland by ship. They stayed there for several centuries until they disappeared, perhaps because of the critical cooling of the climate.

In the 18th century the Danish-Norwegian missionary Hans Egede was sent to Greenland to find the Vikings and convert them. He set out with his family and several more ships and settled in present-day Nuuk, which he named Godthåb, meaning “Good Hope”. Hans Egede found no Vikings but made efforts to convert the local population.

The church in Ilimanaq © Svetla Stoyanova

Quite naturally, Egede encountered certain difficulties with cross-language understanding when preaching Christianity. An amusing example is the translation of the sentence “Give us our daily bread!”. Bread was as unknown in Greenland as Greenlanders were to Europe at the time, and it was hard to grasp what was meant. Egede believed he had found a fitting equivalent in their speech for the symbolic force of the word bread, because while eating the locals often said mammak and pointed at the food. So he changed the prayer to: “Give us our daily mammak!”. Later, however, after yet more puzzled reactions, he established that mammak was simply an exclamation: “How delicious!”. He later decided to use the word for seal meat, since that was on the table as often as bread was in Europe.

The Greenlandic language is part of the Eskimo-Aleut language family. It is polysynthetic, meaning that an entire sentence often merges into one very long word. The earliest structured written records of the language are in the first Greenlandic grammar, written by Poul Egede, the son of the missionary Hans Egede, who, unlike his father, learned Greenlandic well and was able to communicate with the locals.

Living in Greenland, it was important for me, too, to communicate with the locals. I wanted to learn more about them, their way of life, and their worldview. That is probably why one of my first encounters with them went like this:

I was walking down the street along the edge of town when a small red car passed me in the opposite direction, driven by a Greenlander who smiled at me. A little later the car came by again, this time going my way, and stopped. I decided the man might be offering me a ride into town and got in. I asked him whether there was a bookshop nearby, but he did not understand me. We talked in a mixture of his broken English and Danish with a Greenlandic accent and my Danish with Norwegian pronunciation. Smiling and pleased, he had evidently decided to show me around town. I asked whether he had a family, and the answer was: yes, just a father and a brother. It crossed my mind that it might have been too bold to simply get into his car like that, but I explained where I worked and trusted that everything would be fine. He drove around the town, stopping at places to proudly show me his two boats, his house, and finally his sled and his dogs, which he clearly cherished: thirty in all, twenty of them still puppies. As we walked toward them through the grass and mud he offered me his hand, but I just shook my head with a smile. This seemed to embarrass him, but he laughed. Once he had fed the dogs, I signaled that I really had to be going. He understood and reluctantly drove me back. He had shown me his most prized possessions: his house, boats, car, sled, and dogs.

Интересен е животът на инуитите, които са съумели толкова добре да се приспособят към екстремните условия на живот в Арктика. По техните земи няма нито плодове, нито зеленчуци и единствените източници на храна са дивите животни – северни елени, тюлени, моржове, китове, птици и риби. В миналото хората са използвали специални харпуни за лов на тюлени, а китовете хващали с по-големи лодки, наречени умиак. През лятото семействата осигурявали прехраната си с лъкове и стрели и живеели в юрти, покрити с кожи от заловените животни. Зимували в обли къщи, направени от блокове сняг, познати като иглу, или в полуподземни къщи, построени от камъни и кожи върху пòкривна конструкция от трупи или кости от кит. В днешно време харпуните и лъковете се заменят с пушки, а юртите и иглуто – с дървени къщи.

За улова на всички животни е определена квота, като например ограничението за лов на китове е два броя годишно за целия остров.

Когато бях на гости у едно семейство инуити, стопанинът гордо показа огромния си фризер, с който явно всеки уважаващ себе си гренландец разполага и който беше пълен с полуразфасовано месо от най-различни животни. Жената пък извади големи пликове със замразени боровинки, които беше събирала през лятото. Различните видове боровинки са единствените плодове, растящи на острова.

Друго типично за ежедневието на инуитите са шейните с кучешки впрягове. Те са основният начин на придвижване през зимата, което предполага, че глутницата трябва да бъде хранена целогодишно. Шейните с кучешки впряг обикновено са гордост за семейството. Мъжете ги правят по свой размер и участват в ежегодни надбягвания. Един познат гренландец ми разказа, че мечтае да има свой кучешки впряг, но разрастването и обучението на глутницата отнемало години. Част от младите инуити като него през лятото работят като екскурзоводи, а през зимата – във фабрики за риба. Желанието му беше вместо това да заработва в бъдеще като гид с кучешкия си впряг. Притежанието на кучета също така означавало и по-висок статус в обществото. В днешно време обаче все повече гренландци ги заменят със снегомобили – все пак тях не е нужно да ги хранят с риба и еленско месо през цялата година.

Друга инуитска традиция е карането на каяк. Традиционно той се прави от дълги и тънки дъски и тюленски кожи. Благодарение на изключителната си лекота и маневреност е идеален за лов на тюлени. Заради тези качества каякът бързо се разпространява сред полярните изследователи, които неведнъж посещават инуитските племена и черпят знания и опит от техните изключителни методи за оцеляване. Малко известен факт е, че думата каяк, използвана в повечето европейски езици, е именно инуитска. Веднъж дори станах свидетелка на ежегодното първенство по каяк в Гренландия, което включваше дисциплини по бързина, издръжливост и специални умения, като впечатляващото преобръщане на 360 градуса с каяка във водата. Освен това беше задължително каяците да са измайсторени от самите състезатели.

Инуитите държат на традициите, но обществото все повече се модернизира с развитието на инфраструктурата и туризма. Има много работни места в предприятия за риба и скариди благодарение на големия износ в последните години. Съвременното гренландско общество е пръснато по градове и села. Връзки между градовете няма, освен по море, по лед или по въздух. Тоест пътища между селищата не съществуват, а коли може да се видят само в най-големите градове. За извънградско пътуване се използват най-вече лодки или фериботи. Един мой познат разказваше как всяко лято нямал търпение да потегли отново на 18-часовото си плаване с моторната си лодка на юг по западното крайбрежие, за да посети своите роднини.

В селото на име Илиманак, в което прекарах няколко месеца, живеят петдесетина гренландци. Пристигайки, човек се изкачва по стръмен кей, около който се клатушкат навързани множество лодки. Досами кея се вижда магазин и в него се продава всичко – от мляко и ябълки до въдици и пушки. За няколкото деца в селото е учредено училище с местна учителка. Съвсем наблизо е и селската баня, служеща едновременно за перално помещение, фитнес зала и кафене. Банята работи с купони за душа, които служител перфорира преди всяко къпане. Често същата сграда, изпълняваща множество функции, се превръща в място за спонтанни срещи.

Характерна черта за гренландските къщи е, че са боядисани в ярки и весели цветове – червено, розово, синьо, тюркоазено, жълто.

Някои от тях са окичени с ловни трофеи, като еленски рога, или с козината на бели мечки. Пътеките между къщите са частично циментирани, а наоколо занемарени стоят недовършени проекти и неизхвърлени боклуци. Често недалеч от домовете стопаните връзват кучетата си.

В повечето селища от подобен тип няма канализация, защото няма достатъчно средства за инфраструктурата. Водата се извежда по тръба от близкото езеро, минава през моторна помпа и стига по дълъг маркуч до резервоара на всяка къща, после от кухненската чешма изтича от умивалника директно навън, където образува голяма локва. А ако има тръби, те рано или късно извеждат отпадните води направо в океана. Тоалетната чиния прилича на нормална, но в нея се поставят огромни черни пликове, които трябва да се сменят редовно. Използваните пликове, затегнати с тел, се поставят на определено видно място до къщата, така че отговорникът за това да ги вижда и да ги събира своевременно.

Един мой приятел гренландец ми каза, че истинската Гренландия е именно тази, на север от Полярния кръг, точно тук, в малките селища. Защото, който иска да я изживее, трябва да прекара известно време в дивото и извън удобствата, с които толкова сме свикнали.

(Следва продължение.)

Introducing Sunlight, a CT implementation built for scalability, ease of operation, and reduced cost

2024-03-14 Let's Encrypt

Post Syndicated from Let's Encrypt original https://letsencrypt.org/2024/03/14/introducing-sunlight/

Let’s Encrypt is proud to introduce Sunlight, a new implementation of a Certificate Transparency log that we built from the ground up with modern Web PKI opportunities and constraints in mind. In partnership with Filippo Valsorda, who led the design and implementation, we incorporated feedback from the broader transparency logging community, including the Chrome and TrustFabric teams at Google, the Sigsum project, and other CT log and monitor operators. Their insights have been instrumental in shaping the project’s direction.

CT plays an important role in the Web PKI, enhancing the ability to monitor and research certificate issuance. The operation of a CT log, however, faces growing challenges with the increasing volume of certificates. For instance, Let’s Encrypt issues over four million certificates daily, each of which must be logged in two separate CT logs. Our well-established “Oak” log currently holds over 700 million entries, reflecting the significant scale of these challenges.

In this post, we’ll explore the motivation behind Sunlight and how its design aims to improve the robustness and diversity of the CT ecosystem, while also improving the reliability and performance of Let’s Encrypt’s logs.

Bottlenecks from the Database

Let’s Encrypt has been running public CT logs since 2019, and we’ve gotten a lot of operational experience with running them, but it hasn’t been trouble-free. The biggest challenge in the architecture we’ve deployed for our “Oak” log is that the data is stored in a relational database. We’ve scaled that up by splitting each year’s worth of data into a “shard” with its own database, and then later shrinking the shards to cover six months instead of a full year.

The approach of splitting into more and more databases is not something we want to continue doing forever, as the operational burden and costs increase. The current storage size of a CT log shard is between 5 and 10 terabytes. That’s big enough to be concerning for a single database: We previously had a test log fail when we ran into a 16TiB limit in MySQL.

Scaling read capacity up requires large database instances with fast disks and lots of RAM, which are not cheap. We’ve had numerous instances of CT logs becoming overloaded by clients attempting to read all the data in the log, overloading the database in the process. When rate limits are imposed to prevent overloading, clients are forced to slowly crawl the API, diminishing CT’s efficiency as a fast mechanism for detecting mis-issued certificates.

Serving Tiles

Initially, Let’s Encrypt only planned on building a new CT log implementation. However, our discussions with Filippo made us realize that other transparency systems had improved on the original Certificate Transparency design, and we could make our logs even more robust and scalable by changing the read path APIs. In particular, the Go Checksum Database is inspired by Certificate Transparency, but uses a more efficient format for publishing its data as a series of easily stored and cached tiles.

Certificate Transparency logs are a binary tree, with every node containing a hash of its two children. The “leaf” level contains the actual entries of the log: the certificates, appended to the right side of the tree. The top of the tree is digitally signed. This forms a cryptographically verifiable structure called a Merkle Tree, which can be used to check if a certificate is in the tree, and that the tree is append-only.
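
To make the tree structure concrete, here is a minimal Python sketch of how a Merkle tree head could be computed over a list of log entries. It is not Sunlight's code: the 0x00/0x01 domain-separation prefixes and the "split at the largest power of two" rule are taken from RFC 6962 as an assumption about the hashing scheme, and the entry values are placeholders.

```python
import hashlib


def leaf_hash(entry: bytes) -> bytes:
    # Leaves are prefixed with 0x00 so they can never collide with interior nodes.
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Interior nodes are prefixed with 0x01 and hash their two children.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(entries: list[bytes]) -> bytes:
    """Root hash over the entries, splitting at the largest power of two below n."""
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))


# Appending an entry changes the root, which is what the log operator signs.
root_before = merkle_root([b"cert-1", b"cert-2"])
root_after = merkle_root([b"cert-1", b"cert-2", b"cert-3"])
```

A real log never recomputes the whole tree like this; it keeps the intermediate hashes, which is exactly the data the tiles described next make available.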

Sunlight tiles are files containing 256 elements each, either hashes at a certain tree “height” or certificates (or pre-certificates) at the leaf level. Russ Cox has a great explanation of how tiles work on his blog, or you can read the relevant section of the Sunlight specification. Even Trillian, the current implementation of CT we run, uses a subtree system similar to these tiles as its internal storage.
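
As a rough illustration of how a client might locate a hash inside a tile, the sketch below maps a tree position to tile coordinates. The 256-element width comes from the description above; the helper names are hypothetical, and the three-digit path encoding is an assumption borrowed from the Go checksum database tile format, so check the Sunlight specification before relying on it.

```python
TILE_WIDTH = 256  # elements per full tile, as stated above
TILE_HEIGHT = 8   # 2**8 == 256, so one tile spans eight tree levels


def tile_for_hash(tree_level: int, index: int) -> tuple[int, int, int]:
    """Return (tile_level, tile_index, offset) for a stored hash.

    Only tree levels that are multiples of TILE_HEIGHT are stored in tiles;
    the levels in between can be recomputed from a tile's 256 hashes.
    """
    if tree_level % TILE_HEIGHT != 0:
        raise ValueError("hash is not stored directly; recompute it from the tile below")
    return tree_level // TILE_HEIGHT, index // TILE_WIDTH, index % TILE_WIDTH


def encode_tile_index(n: int) -> str:
    """Encode N as three-digit path elements (assumed convention)."""
    groups = [f"{n % 1000:03d}"]
    n //= 1000
    while n > 0:
        groups.append(f"x{n % 1000:03d}")
        n //= 1000
    return "/".join(reversed(groups))


# The hash of leaf number 1000 sits in level-0 tile 3, at offset 232.
coords = tile_for_hash(0, 1000)
```

Because the mapping is pure arithmetic, a client can work out which static files to fetch without asking the log server anything.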

Unlike the dynamic endpoints in previous CT APIs, serving a tree as tiles doesn’t require any dynamic computation or request processing, so we can eliminate the need for API servers. Because the tiles are static, they’re efficiently cached, in contrast with CT APIs like get-proof-by-hash which have a different response for every certificate, so there’s no shared cache. The leaf tiles can also be stored compressed, saving even more storage!

The idea of exposing the log as a series of static tiles is motivated by our desire to scale out the read path horizontally and relatively inexpensively. We can directly expose tiles in cloud object storage like S3, use a caching CDN, or use a webserver and a filesystem.

Object or file storage is readily available, can scale up easily, and costs significantly less than databases from cloud providers. It seemed like the obvious path forward. In fact, we already have an S3-backed cache in front of our existing CT logs, which means we are currently storing our data twice.

Running More Logs

The tiles API improves the read path, but we also wanted to simplify our architecture on the write path. With Trillian, we run a collection of nodes along with etcd for leader election to choose which will handle writing. This is somewhat complex, and we believe the CT ecosystem allows a different tradeoff.

The key realization is that Certificate Transparency is already a distributed system, with clients submitting certificates to multiple logs, and gracefully failing over from any unavailable ones to the others. Each individual log’s write path doesn’t require a highly available leader election system. A simple single-node writer can meet the 99% Service Level Objective required by CT log programs.

The single-node Sunlight architecture lets us run multiple independent logs with the same amount of computing power. This increases the system’s overall robustness, even if each individual log has lower potential uptime. No more leader election needed. We use a simple compare-and-swap mechanism to store checkpoints and prevent accidentally running two instances at once, which could result in a forked tree, but that has much less overhead than leader election.
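
The compare-and-swap idea can be sketched in a few lines of Python. This is an in-memory stand-in rather than Sunlight's implementation: the class, method names, and checkpoint strings are hypothetical, and a real deployment would rely on a storage backend that offers a conditional write.

```python
import threading


class CheckpointStore:
    """In-memory stand-in for a backend that supports compare-and-swap."""

    def __init__(self, initial: bytes = b""):
        self._value = initial
        self._lock = threading.Lock()

    def get(self) -> bytes:
        with self._lock:
            return self._value

    def compare_and_swap(self, expected: bytes, new: bytes) -> bool:
        # Write `new` only if nobody changed the checkpoint since we read `expected`.
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def publish(store: CheckpointStore, last_seen: bytes, new_checkpoint: bytes) -> None:
    if not store.compare_and_swap(last_seen, new_checkpoint):
        # Another writer advanced the log: stop rather than risk a forked tree.
        raise RuntimeError("checkpoint changed underneath us; refusing to continue")


store = CheckpointStore(b"size=10")
publish(store, b"size=10", b"size=12")  # succeeds: our view was current
```

A second instance that still believes the checkpoint is `size=10` would fail its own `publish` call and shut down, which is the accidental double-run case described above.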

No More Merge Delay

One of the goals of CT was to have limited latency for submission to the logs. A design feature called Merge Delay was added to support that. When submitting a certificate to a log, the log can return a Signed Certificate Timestamp (SCT) immediately, with a promise to include it in the log within the log’s Maximum Merge Delay, conventionally 24 hours. While this seems like a good tradeoff to not slow down issuance, there have been multiple incidents and near-misses where a log stops operating with unmerged certificates, missing its maximum merge delay, and breaking that promise.

Sunlight takes a different approach, holding submissions while it batches and integrates certificates in the log, eliminating the merge delay. While this leads to a small latency increase, we think it’s worthwhile to avoid one of the more common CT log failure cases.

It also lets us embed the final leaf index in an extension of our SCTs, bringing CT a step closer to direct client verification of Merkle tree proofs. The extension also makes it possible for clients to fetch the proof of log inclusion from the new static tile-based APIs, without requiring server-side lookup tables or databases.

A Sunny Future

Today’s announcement of Sunlight is just the beginning. We’ve released software and a specification for Sunlight, and have Sunlight CT logs running. Head to sunlight.dev to find resources to get started. We encourage CAs to start test submitting to Let’s Encrypt’s new Sunlight CT logs, CT Monitors and Auditors to add support for consuming Sunlight logs, and the CT programs to consider trusting logs running on this new architecture. We hope Sunlight logs will be made usable for SCTs by the CT programs run by the browsers in the future, allowing CAs to rely on them to meet the browser CT logging requirements.

We’ve gotten positive feedback so far, with comments such as “Google’s TrustFabric team, maintainers of Trillian, are supportive of this direction and the Sunlight spec. We have been working towards the same goal of cacheable tile-based logs for other ecosystems with serverless tooling, and will be folding this into Trillian and ctfe, along with adding support for the Sunlight API.”

If you have feedback on the design, please join in the conversation on the ct-policy mailing list, or in the #sunlight channel on the transparency-dev Slack (invitation to join).

We’d like to thank Chrome for supporting the development of Sunlight, and Amazon Web Services for their ongoing support for our CT log operation. If your organization monitors or values CT, please consider a financial gift of support. Learn more at https://www.abetterinternet.org/sponsor/ or contact us at: [email protected].

[$] LWN.net Weekly Edition for March 14, 2024

2024-03-14 corbet

Post Syndicated from corbet original https://lwn.net/Articles/964623/

The LWN.net Weekly Edition for March 14, 2024 is available.

Anthropic’s Claude 3 Haiku model is now available on Amazon Bedrock

2024-03-14 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/anthropics-claude-3-haiku-model-is-now-available-in-amazon-bedrock/

Last week, Anthropic announced their Claude 3 foundation model family. The family includes three models: Claude 3 Haiku, the fastest and most compact model for near-instant responsiveness; Claude 3 Sonnet, the ideal balanced model between skills and speed; and Claude 3 Opus, the most intelligent offering for top-level performance on highly complex tasks. AWS also announced the general availability of Claude 3 Sonnet in Amazon Bedrock.

Today, we are announcing the availability of Claude 3 Haiku on Amazon Bedrock. The Claude 3 Haiku foundation model is the fastest and most compact model of the Claude 3 family, designed for near-instant responsiveness and seamless generative artificial intelligence (AI) experiences that mimic human interactions. For example, it can read a data-dense research paper on arXiv (~10k tokens) with charts and graphs in less than three seconds.

With Claude 3 Haiku’s availability on Amazon Bedrock, you can build near-instant responsive generative AI applications for enterprises that need quick and accurate targeted performance. Like Sonnet and Opus, Haiku has image-to-text vision capabilities, can understand multiple languages besides English, and boasts increased steerability in a 200k context window.

Claude 3 Haiku use cases
Claude 3 Haiku is smarter, faster, and more affordable than other models in its intelligence category. It answers simple queries and requests with unmatched speed. With its fast speed and increased steerability, you can create AI experiences that seamlessly imitate human interactions.

Here are some use cases for Claude 3 Haiku:

Customer interactions: quick and accurate support in live interactions, translations
Content moderation: catch risky behavior or customer requests
Cost-saving tasks: optimized logistics, inventory management, fast knowledge extraction from unstructured data

To learn more about Claude 3 Haiku’s features and capabilities, visit Anthropic’s Claude on Amazon Bedrock and Anthropic Claude models in the AWS documentation.

Claude 3 Haiku in action
If you are new to using Anthropic models, go to the Amazon Bedrock console and choose Model access on the bottom left pane. Request access separately for Claude 3 Haiku.

To test Claude 3 Haiku in the console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model and select Anthropic as the category and Claude 3 Haiku as the model.

To test more Claude prompt examples, choose Load examples. You can view and run examples specific to Claude 3 Haiku, such as advanced Q&A with citations, crafting a design brief, and non-English content generation.

Using Compare mode, you can also compare the speed and intelligence between Claude 3 Haiku and the Claude 2.1 model using a sample prompt to generate personalized email responses to address customer questions.

By choosing View API request, you can also access the model using code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. Here is a sample of the AWS CLI command:

aws bedrock-runtime invoke-model \
     --model-id anthropic.claude-3-haiku-20240307-v1:0 \
     --body "{\"messages\":[{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\"Write the test case for uploading the image to Amazon S3 bucket\\nCertainly! Here's an example of a test case for uploading an image to an Amazon S3 bucket using a testing framework like JUnit or TestNG for Java:\\n\\n....\"}]}],\"anthropic_version\":\"bedrock-2023-05-31\",\"max_tokens\":2000}" \
     --cli-binary-format raw-in-base64-out \
     --region us-east-1 \
     invoke-model-output.txt

To make an API request with Claude 3, use the new Anthropic Claude Messages API format, which allows for more complex interactions such as image processing. If you currently use the Anthropic Claude Text Completions API, you should migrate to the Messages API.

Here is sample Python code that sends a Messages API request asking the model to describe an image file:

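import json

import boto3

# Bedrock runtime client; assumes AWS credentials and a Region with Claude 3 Haiku access are configured.
bedrock_runtime = boto3.client("bedrock-runtime")


def call_claude_haiku(base64_string):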

    prompt_config = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 4096,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/png",
                            "data": base64_string,
                        },
                    },
                    {"type": "text", "text": "Provide a caption for this image"},
                ],
            }
        ],
    }

    body = json.dumps(prompt_config)

    modelId = "anthropic.claude-3-haiku-20240307-v1:0"
    accept = "application/json"
    contentType = "application/json"

    response = bedrock_runtime.invoke_model(
        body=body, modelId=modelId, accept=accept, contentType=contentType
    )
    response_body = json.loads(response.get("body").read())

    results = response_body.get("content")[0].get("text")
    return results
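
The function above expects the image as a base64 string. A minimal sketch of preparing that input is shown below; the helper names and the file name diagram.png are hypothetical and not part of the original sample.

```python
import base64


def encode_image(image_bytes: bytes) -> str:
    # The Messages API "data" field takes base64 text, not raw bytes.
    return base64.b64encode(image_bytes).decode("utf-8")


def load_image_as_base64(path: str) -> str:
    # Read the file in binary mode so the bytes are passed through unchanged.
    with open(path, "rb") as image_file:
        return encode_image(image_file.read())


# Usage, assuming call_claude_haiku from the sample above is in scope:
# caption = call_claude_haiku(load_image_as_base64("diagram.png"))
```

Note that the sample request declares the media type as image/png, so an image in another format would need a matching media_type value.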

For more code samples with Claude 3, see Get Started with Claude 3 on Amazon Bedrock, Diagrams to CDK/Terraform using Claude 3 on Amazon Bedrock, and Cricket Match Winner Prediction with Amazon Bedrock’s Anthropic Claude 3 Sonnet on Community.aws.

Now available
Claude 3 Haiku is available now in the US West (Oregon) Region with more Regions coming soon; check the full Region list for future updates.

Claude 3 Haiku is the most cost-effective choice. For example, Claude 3 Haiku is up to 68 percent cheaper per 1,000 input/output tokens than Claude Instant, while offering higher levels of intelligence. To learn more, see Amazon Bedrock Pricing.

Give Claude 3 Haiku a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

— Channy

[$] Questions about machine-learning models for Fedora

2024-03-13 jzb

Post Syndicated from jzb original https://lwn.net/Articles/964739/

Kaitlyn Abdo of Fedora’s AI/ML SIG opened an issue with the Fedora Engineering Steering Committee (FESCo) recently that carried a few tricky questions about packaging machine-learning (ML) models for Fedora. Specifically, the SIG is looking for guidance on whether pre-trained weights for PyTorch constitute code or content. And, if the models are released under a license approved by the Open Source Initiative (OSI), does it matter what data the models were trained on? The issue was quickly tossed over to Fedora’s legal mailing list and sparked an interesting discussion about how to handle these items, and a temporary path forward.

Gain insights from historical location data using Amazon Location Service and AWS analytics services

2024-03-13 Alan Peaty

Post Syndicated from Alan Peaty original https://aws.amazon.com/blogs/big-data/gain-insights-from-historical-location-data-using-amazon-location-service-and-aws-analytics-services/

Many organizations around the world rely on the use of physical assets, such as vehicles, to deliver a service to their end-customers. By tracking these assets in real time and storing the results, asset owners can derive valuable insights on how their assets are being used to continuously deliver business improvements and plan for future changes. For example, a delivery company operating a fleet of vehicles may need to ascertain the impact from local policy changes outside of their control, such as the announced expansion of an Ultra-Low Emission Zone (ULEZ). By combining historical vehicle location data with information from other sources, the company can devise empirical approaches for better decision-making. For example, the company’s procurement team can use this information to make decisions about which vehicles to prioritize for replacement before policy changes go into effect.

Developers can use the support in Amazon Location Service for publishing device position updates to Amazon EventBridge to build a near-real-time data pipeline that stores locations of tracked assets in Amazon Simple Storage Service (Amazon S3). Additionally, you can use AWS Lambda to enrich incoming location data with data from other sources, such as an Amazon DynamoDB table containing vehicle maintenance details. Then a data analyst can use the geospatial querying capabilities of Amazon Athena to gain insights, such as the number of days their vehicles have operated in the proposed boundaries of an expanded ULEZ. Because vehicles that do not meet ULEZ emissions standards are subjected to a daily charge to operate within the zone, you can use the location data, along with maintenance data such as age of the vehicle, current mileage, and current emissions standards to estimate the amount the company would have to spend on daily fees.
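
To illustrate the kind of estimate described here, the following Python sketch counts the distinct days on which a vehicle reported a position inside a zone and multiplies by a daily charge. It is only a local approximation of what a geospatial query would compute: the ray-casting test treats coordinates as planar, and the zone polygon, sample positions, and charge value are all made up for illustration.

```python
from datetime import datetime


def point_in_polygon(lon, lat, polygon):
    """Ray casting on planar coordinates; adequate for a small city-sized zone."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Toggle when a horizontal ray from the point crosses edge (j, i).
        if (yi > lat) != (yj > lat) and lon < (xj - xi) * (lat - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside


def estimate_fees(positions, zone, daily_charge):
    """positions: iterable of (sample_time_iso, lon, lat) for one vehicle."""
    days_in_zone = {
        datetime.fromisoformat(ts.replace("Z", "+00:00")).date()
        for ts, lon, lat in positions
        if point_in_polygon(lon, lat, zone)
    }
    # The charge applies per day of operation, not per position update.
    return len(days_in_zone), len(days_in_zone) * daily_charge


# Hypothetical square zone and positions, as (longitude, latitude) pairs.
zone = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
positions = [
    ("2024-03-01T08:00:00Z", 0.5, 0.5),
    ("2024-03-01T09:00:00Z", 0.6, 0.5),
    ("2024-03-02T08:00:00Z", 2.0, 0.5),
    ("2024-03-03T08:00:00Z", 0.1, 0.9),
]
days, fees = estimate_fees(positions, zone, daily_charge=12.5)
```

Two updates on the same day count once, which mirrors a charge that is levied per day rather than per journey.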

This post shows how you can use Amazon Location, EventBridge, Lambda, Amazon Data Firehose, and Amazon S3 to build a location-aware data pipeline, and use this data to drive meaningful insights using AWS Glue and Athena.

Overview of solution

This is a fully serverless solution for location-based asset management. The solution consists of the following interfaces:

IoT or mobile application – A mobile application or an Internet of Things (IoT) device allows the tracking of a company vehicle while it is in use and transmits its current location securely to the data ingestion layer in AWS. The ingestion approach is not in scope of this post. Instead, a Lambda function in our solution simulates sample vehicle journeys and directly updates Amazon Location tracker objects with randomized locations.
Data analytics – Business analysts gather operational insights from multiple data sources, including the location data collected from the vehicles. Data analysts are looking for answers to questions such as, “How long did a given vehicle historically spend inside a proposed zone, and how much would the fees have cost had the policy been in place over the past 12 months?”

The following diagram illustrates the solution architecture.
Architecture diagram

The workflow consists of the following key steps:

The tracking functionality of Amazon Location is used to track the vehicle. Using EventBridge integration, filtered positional updates are published to an EventBridge event bus. This solution uses distance-based filtering to reduce costs and jitter. Distance-based filtering ignores location updates in which devices have moved less than 30 meters (98.4 feet).
Amazon Location device position events arrive on the EventBridge default bus with source: ["aws.geo"] and detail-type: ["Location Device Position Event"]. One rule is created to forward these events to two downstream targets: a Lambda function, and a Firehose delivery stream.
Two different patterns, based on each target, are described in this post to demonstrate different approaches to committing the data to an S3 bucket:
1. Lambda function – The first approach uses a Lambda function to demonstrate how you can use code in the data pipeline to directly transform the incoming location data. You can modify the Lambda function to fetch additional vehicle information from a separate data store (for example, a DynamoDB table or a Customer Relationship Management system) to enrich the data, before storing the results in an S3 bucket. In this model, the Lambda function is invoked for each incoming event.
2. Firehose delivery stream – The second approach uses a Firehose delivery stream to buffer and batch the incoming positional updates, before storing them in an S3 bucket without modification. This method uses GZIP compression to optimize storage consumption and query performance. You can also use the data transformation feature of Data Firehose to invoke a Lambda function to perform data transformation in batches.
AWS Glue crawls both S3 bucket paths, populates the AWS Glue database tables based on the inferred schemas, and makes the data available to other analytics applications through the AWS Glue Data Catalog.
Athena is used to run geospatial queries on the location data stored in the S3 buckets. The Data Catalog provides metadata that allows analytics applications using Athena to find, read, and process the location data stored in Amazon S3.
This solution includes a Lambda function that continuously updates the Amazon Location tracker with simulated location data from fictitious journeys. The Lambda function is triggered at regular intervals using a scheduled EventBridge rule.
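
As a sketch of the rule described in the workflow, the event pattern can be written as a small Python dictionary using the source and detail-type values given above. The matcher function is a hypothetical, heavily simplified stand-in for EventBridge's own matching and is included only to show what the pattern selects.

```python
import json

# Event pattern for the rule, built from the source and detail-type named above.
DEVICE_POSITION_PATTERN = {
    "source": ["aws.geo"],
    "detail-type": ["Location Device Position Event"],
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: each pattern key must list the event's value.

    Real EventBridge patterns also support nested fields, prefixes, and more.
    """
    return all(event.get(key) in allowed for key, allowed in pattern.items())


# The JSON form is what a rule definition would carry as its event pattern.
print(json.dumps(DEVICE_POSITION_PATTERN, indent=2))
```

In the deployed solution the rule and its two targets are defined in the AWS SAM template rather than in application code.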

You can test this solution yourself using the AWS Samples GitHub repository. The repository contains the AWS Serverless Application Model (AWS SAM) template and Lambda code required to try out this solution. Refer to the instructions in the README file for steps on how to provision and decommission this solution.

Visual layouts in some screenshots in this post may look different than those on your AWS Management Console.

Data generation

In this section, we discuss the steps to manually or automatically generate journey data.

Manually generate journey data

You can manually update device positions using the AWS Command Line Interface (AWS CLI) command aws location batch-update-device-position. Replace the tracker-name, device-id, Position, and SampleTime values with your own, and make sure that successive updates are more than 30 meters apart so that an event is placed on the default EventBridge event bus:

aws location batch-update-device-position --tracker-name <tracker-name> --updates "[{\"DeviceId\": \"<device-id>\", \"Position\": [<longitude>, <latitude>], \"SampleTime\": \"<YYYY-MM-DDThh:mm:ssZ>\"}]"
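
Before sending manual updates, it can help to check that two positions really are more than 30 meters apart. The sketch below uses the haversine formula for that; it is a local approximation and not necessarily how Amazon Location measures distance internally, and the sample coordinates are arbitrary.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; an approximation
MIN_DISTANCE_M = 30         # distance-based filtering threshold from this post


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def would_emit_event(previous, current):
    # Positions use the same [longitude, latitude] order as the tracker API.
    return haversine_m(*previous, *current) > MIN_DISTANCE_M


# Roughly 11 meters apart: likely filtered out under the 30 meter rule.
too_close = would_emit_event([-0.1276, 51.5072], [-0.1276, 51.5073])
# Roughly 111 meters apart: far enough to produce an event.
far_enough = would_emit_event([-0.1276, 51.5072], [-0.1276, 51.5082])
```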

Automatically generate journey data using the simulator

The provided AWS CloudFormation template deploys an EventBridge scheduled rule and an accompanying Lambda function that simulates tracker updates from vehicles. This rule is enabled by default, and runs at a frequency specified by the SimulationIntervalMinutes CloudFormation parameter. The data generation Lambda function updates the Amazon Location tracker with a randomized position offset from the vehicles’ base locations.

Vehicle names and base locations are stored in the vehicles.json file. A vehicle’s starting position is reset each day, and base locations have been chosen to give them the ability to drift in and out of the ULEZ on a given day to provide a realistic journey simulation.
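
The randomized offset can be pictured with a short sketch like the one below. It is not the simulator's actual code: the function name, the maximum offset, and the shape of the vehicle entry are assumptions, and the real vehicles.json layout may differ.

```python
import random


def simulate_position(base, max_offset_deg=0.01, rng=random):
    """Return [longitude, latitude] near `base`, offset by up to max_offset_deg."""
    lon, lat = base
    return [
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
    ]


# Hypothetical vehicle entry with a base location as [longitude, latitude].
vehicle = {"name": "vehicle-1", "base": [-0.1276, 51.5072]}

# A seeded generator keeps the simulated journey reproducible in tests.
position = simulate_position(vehicle["base"], rng=random.Random(42))
```

Passing the random generator as a parameter is a small design choice that makes the simulation deterministic when needed while defaulting to the global generator in normal runs.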

You can disable the rule temporarily by navigating to the scheduled rule details on the EventBridge console. Alternatively, change the parameter State: ENABLED to State: DISABLED for the scheduled rule resource GenerateDevicePositionsScheduleRule in the template.yml file. Rebuild and re-deploy the AWS SAM template for this change to take effect.

Location data pipeline approaches

The configurations outlined in this section are deployed automatically by the provided AWS SAM template. The information in this section is provided to describe the pertinent parts of the solution.

Amazon Location device position events

Amazon Location sends device position update events to EventBridge in the following format:

{
    "version":"0",
    "id":"<event-id>",
    "detail-type":"Location Device Position Event",
    "source":"aws.geo",
    "account":"<account-number>",
    "time":"<YYYY-MM-DDThh:mm:ssZ>",
    "region":"<region>",
    "resources":[
        "arn:aws:geo:<region>:<account-number>:tracker/<tracker-name>"
    ],
    "detail":{
        "EventType":"UPDATE",
        "TrackerName":"<tracker-name>",
        "DeviceId":"<device-id>",
        "SampleTime":"<YYYY-MM-DDThh:mm:ssZ>",
        "ReceivedTime":"<YYYY-MM-DDThh:mm:ss.sssZ>",
        "Position":[
            <longitude>, 
            <latitude>
        ]
    }
}

You can optionally specify an input transformation to modify the format and contents of the device position event data before it reaches the target.

Data enrichment using Lambda

Data enrichment in this pattern is facilitated through the invocation of a Lambda function. In this example, we call this function ProcessDevicePosition, and use a Python runtime. A custom transformation is applied in the EventBridge target definition to receive the event data in the following format:

{
    "EventType":<EventType>,
    "TrackerName":<TrackerName>,
    "DeviceId":<DeviceId>,
    "SampleTime":<SampleTime>,
    "ReceivedTime":<ReceivedTime>,
    "Position":[<Longitude>,<Latitude>]
}

You could apply additional transformations, such as the refactoring of Latitude and Longitude data into separate key-value pairs if this is required by the downstream business logic processing the events.
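As an illustration of that kind of refactoring, the following sketch splits the Position array into separate Longitude and Latitude keys inside the function rather than in the input transformer. The helper name split_position is hypothetical and does not appear in the solution code.

```python
def split_position(event):
    """Return a copy of the event with Position replaced by two scalar keys."""
    flattened = {key: value for key, value in event.items() if key != "Position"}
    # Position arrives as [longitude, latitude].
    longitude, latitude = event["Position"]
    flattened["Longitude"] = longitude
    flattened["Latitude"] = latitude
    return flattened


# Hypothetical transformed event with made-up values.
transformed = {
    "EventType": "UPDATE",
    "DeviceId": "vehicle1",
    "SampleTime": "2024-01-01T09:00:00Z",
    "Position": [-0.1276, 51.5072],
}
flat_event = split_position(transformed)
```

Scalar columns can be simpler to reference in downstream queries than an array, at the cost of diverging from the native Amazon Location format.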

The following code demonstrates the Python application logic that is run by the ProcessDevicePosition Lambda function. Error handling has been skipped in this code snippet for brevity. The full code is available in the GitHub repo.

import json
import os
import uuid
import boto3

# Import environment variables from Lambda function.
bucket_name = os.environ["S3_BUCKET_NAME"]
bucket_prefix = os.environ["S3_BUCKET_LAMBDA_PREFIX"]

s3 = boto3.client("s3")

def lambda_handler(event, context):
    # Build the object key as <prefix>/<DeviceId>/<SampleTime>-<uuid>.json.
    key = "%s/%s/%s-%s.json" % (bucket_prefix,
                                event["DeviceId"],
                                event["SampleTime"],
                                str(uuid.uuid4()))
    body = json.dumps(event, separators=(",", ":"))
    body_encoded = body.encode("utf-8")
    s3.put_object(Bucket=bucket_name, Key=key, Body=body_encoded)
    return {
        "statusCode": 200,
        "body": "success"
    }

The preceding code creates an S3 object for each device position event received by EventBridge. The code uses the DeviceId as a prefix to write the objects to the bucket.

You can add additional logic to the preceding Lambda function code to enrich the event data using other sources. The example in the GitHub repo demonstrates enriching the event with data from a DynamoDB vehicle maintenance table.
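The following is a minimal sketch of what such enrichment could look like; it is not the code from the GitHub repo. It assumes a boto3 DynamoDB Table resource whose partition key is DeviceId, and it copies the three maintenance fields used later in this post. The helper names are hypothetical.

```python
from decimal import Decimal

ENRICHMENT_FIELDS = ("MeetsEmissionStandards", "Mileage", "PurchaseDate")


def _to_json_value(value):
    """Convert DynamoDB Decimal numbers so json.dumps can serialize them."""
    if isinstance(value, Decimal):
        return int(value) if value == value.to_integral_value() else float(value)
    return value


def enrich_event(event, table):
    """Return a copy of the event with maintenance fields added, if present.

    table is assumed to be a boto3 DynamoDB Table resource keyed on DeviceId.
    """
    response = table.get_item(Key={"DeviceId": event["DeviceId"]})
    item = response.get("Item")  # Absent when no record exists for the device.
    enriched = dict(event)
    if item:
        for field in ENRICHMENT_FIELDS:
            if field in item:
                enriched[field] = _to_json_value(item[field])
    return enriched
```

Events for devices without a maintenance record pass through unchanged, so the pipeline keeps working while the table is being populated.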

In addition to the prerequisite AWS Identity and Access Management (IAM) permissions provided by the AWS managed policy AWSLambdaBasicExecutionRole, the ProcessDevicePosition function requires permissions to perform the S3 put_object call (the s3:PutObject action) and any other actions required by the data enrichment logic. IAM permissions required by the solution are documented in the template.yml file. The following policy shows the Amazon S3 permissions:

{
    "Version":"2012-10-17",
    "Statement":[
        {
            "Action":[
                "s3:ListBucket"
            ],
            "Resource":[
                "arn:aws:s3:::<S3_BUCKET_NAME>"
            ],
            "Effect":"Allow"
        },
        {
            "Action":[
                "s3:PutObject"
            ],
            "Resource":[
                "arn:aws:s3:::<S3_BUCKET_NAME>/<S3_BUCKET_LAMBDA_PREFIX>/*"
            ],
            "Effect":"Allow"
        }
    ]
}

Data pipeline using Amazon Data Firehose

Complete the following steps to create your Firehose delivery stream:

On the Amazon Data Firehose console, choose Firehose streams in the navigation pane.
Choose Create Firehose stream.
For Source, choose Direct PUT.
For Destination, choose Amazon S3.
For Firehose stream name, enter a name (for this post, ProcessDevicePositionFirehose).
Configure the destination settings with details about the S3 bucket in which the location data is stored, along with the partitioning strategy:
1. Use <S3_BUCKET_NAME> and <S3_BUCKET_FIREHOSE_PREFIX> to determine the bucket and object prefixes.
2. Use DeviceId as an additional prefix to write the objects to the bucket.
Enable Dynamic partitioning and New line delimiter to make sure partitioning is automatic based on DeviceId, and that new line delimiters are added between records in objects that are delivered to Amazon S3.

These are required by AWS Glue to later crawl the data, and for Athena to recognize individual records.
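To illustrate what these two settings achieve, the following sketch groups records by DeviceId and joins them with new line delimiters in plain Python. It is a conceptual model only and does not call Firehose; the real object keys that Firehose writes also contain additional components, and the function name and prefix layout here are assumptions.

```python
import json
from collections import defaultdict


def partition_records(records, prefix):
    """Group records by DeviceId and emit newline-delimited JSON per group."""
    grouped = defaultdict(list)
    for record in records:
        line = json.dumps(record, separators=(",", ":"))
        grouped[record["DeviceId"]].append(line)
    # One object body per device, with one JSON record per line.
    return {
        f"{prefix}/{device_id}/": "\n".join(lines) + "\n"
        for device_id, lines in grouped.items()
    }


def read_records(body):
    """Parse a newline-delimited JSON body back into individual records."""
    return [json.loads(line) for line in body.splitlines() if line]
```

Without the delimiter, consecutive JSON records would be concatenated on a single line, and a reader such as read_records could not separate them.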

Create an EventBridge rule and attach targets

The EventBridge rule ProcessDevicePosition defines two targets: the ProcessDevicePosition Lambda function, and the ProcessDevicePositionFirehose delivery stream. Complete the following steps to create the rule and attach targets:

On the EventBridge console, create a new rule.
For Name, enter a name (for this post, ProcessDevicePosition).
For Event bus, choose default.
For Rule type, select Rule with an event pattern.
For Event source, select AWS events or EventBridge partner events.
For Method, select Use pattern form.
In the Event pattern section, specify AWS services as the source, Amazon Location Service as the specific service, and Location Device Position Event as the event type.
For Target 1, attach the ProcessDevicePosition Lambda function as a target.
We use Input transformer to customize the event that is committed to the S3 bucket.

Configure Input paths map and Input template to organize the payload into the desired format.

The following code is the input paths map:

{
    "EventType": "$.detail.EventType",
    "TrackerName": "$.detail.TrackerName",
    "DeviceId": "$.detail.DeviceId",
    "SampleTime": "$.detail.SampleTime",
    "ReceivedTime": "$.detail.ReceivedTime",
    "Longitude": "$.detail.Position[0]",
    "Latitude": "$.detail.Position[1]"
}

The following code is the input template:

{
    "EventType":<EventType>,
    "TrackerName":<TrackerName>,
    "DeviceId":<DeviceId>,
    "SampleTime":<SampleTime>,
    "ReceivedTime":<ReceivedTime>,
    "Position":[<Longitude>, <Latitude>]
}

For Target 2, choose the ProcessDevicePositionFirehose delivery stream as a target.

This target requires an IAM role that allows one or multiple records to be written to the Firehose delivery stream:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "firehose:PutRecord",
                "firehose:PutRecordBatch"
            ],
            "Resource": [
                "arn:aws:firehose:<region>:<account-id>:deliverystream/<delivery-stream-name>"
            ],
            "Effect": "Allow"
        }
    ]
}
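To check what the input transformer for Target 1 should produce before deploying it, you can approximate its behavior locally. The following sketch is a simplified simulation, not the EventBridge implementation: it resolves only simple paths such as $.detail.Position[0] and substitutes each placeholder with the JSON-encoded value. The function names are hypothetical.

```python
import json
import re

INPUT_PATHS_MAP = {
    "EventType": "$.detail.EventType",
    "TrackerName": "$.detail.TrackerName",
    "DeviceId": "$.detail.DeviceId",
    "SampleTime": "$.detail.SampleTime",
    "ReceivedTime": "$.detail.ReceivedTime",
    "Longitude": "$.detail.Position[0]",
    "Latitude": "$.detail.Position[1]",
}

INPUT_TEMPLATE = (
    '{"EventType":<EventType>,"TrackerName":<TrackerName>,'
    '"DeviceId":<DeviceId>,"SampleTime":<SampleTime>,'
    '"ReceivedTime":<ReceivedTime>,"Position":[<Longitude>, <Latitude>]}'
)


def resolve_path(event, path):
    """Resolve a simple JSONPath made of .name and [index] segments."""
    value = event
    for name, index in re.findall(r"\.([A-Za-z_][\w-]*)|\[(\d+)\]", path):
        value = value[name] if name else value[int(index)]
    return value


def apply_input_transformer(event, paths_map, template):
    """Replace each <Placeholder> with the JSON encoding of its resolved value."""
    def substitute(match):
        return json.dumps(resolve_path(event, paths_map[match.group(1)]))
    return re.sub(r"<(\w+)>", substitute, template)
```

Parsing the result with json.loads yields the payload shape that the ProcessDevicePosition function expects to receive.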

Crawl and catalog the data using AWS Glue

After sufficient data has been generated, complete the following steps:

On the AWS Glue console, choose Crawlers in the navigation pane.
Select the crawlers that have been created, location-analytics-glue-crawler-lambda and location-analytics-glue-crawler-firehose.
Choose Run.

The crawlers will automatically classify the data into JSON format, group the records into tables and partitions, and commit associated metadata to the AWS Glue Data Catalog.

When the Last run statuses of both crawlers show as Succeeded, confirm that two tables (lambda and firehose) have been created on the Tables page.

The solution partitions the incoming location data based on the deviceid field. Therefore, as long as there are no new devices or schema changes, the crawlers don’t need to run again. However, if new devices are added, or a different field is used for partitioning, the crawlers need to run again.
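If you expect to rerun the crawlers regularly, for example after adding devices, you could script the run instead of using the console. The following sketch assumes a boto3 AWS Glue client (created with boto3.client("glue")) is passed in; the function name run_crawlers and the polling interval are assumptions.

```python
import time


def run_crawlers(glue_client, crawler_names, poll_seconds=15):
    """Start each crawler, wait until it is idle, and return its last status."""
    for name in crawler_names:
        glue_client.start_crawler(Name=name)

    statuses = {}
    for name in crawler_names:
        while True:
            crawler = glue_client.get_crawler(Name=name)["Crawler"]
            # A crawler returns to the READY state when its run has finished.
            if crawler["State"] == "READY":
                statuses[name] = crawler.get("LastCrawl", {}).get("Status")
                break
            time.sleep(poll_seconds)
    return statuses


# Example usage (requires AWS credentials):
#   import boto3
#   run_crawlers(boto3.client("glue"), [
#       "location-analytics-glue-crawler-lambda",
#       "location-analytics-glue-crawler-firehose",
#   ])
```

A returned status of SUCCEEDED corresponds to the Last run status shown on the console.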

You’re now ready to query the tables using Athena.

Query the data using Athena

Athena is a serverless, interactive analytics service built to analyze unstructured, semi-structured, and structured data where it is hosted. If this is your first time using the Athena console, follow the instructions to set up a query result location in Amazon S3. To query the data with Athena, complete the following steps:

On the Athena console, open the query editor.
For Data source, choose AwsDataCatalog.
For Database, choose location-analytics-glue-database.
On the options menu (three vertical dots), choose Preview Table to query the content of both tables.

The query displays 10 sample positional records currently stored in the table. The firehose table stores raw, unmodified data from the Amazon Location tracker.

You can now experiment with geospatial queries. The GeoJSON file for the 2021 London ULEZ expansion is part of the repository, and has already been converted into a query compatible with both Athena tables.

Copy and paste the content from the 1-firehose-athena-ulez-2021-create-view.sql file found in the examples/firehose folder into the query editor.

This query uses the ST_Within geospatial function to determine if a recorded position is inside or outside the ULEZ defined by the polygon. A new view called ulezvehicleanalysis_firehose is created with a new column, insidezone, which captures whether the recorded position lies within the zone.
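To build intuition for the containment test, the following sketch implements a basic ray casting check in Python. It is an illustration of the idea only and is not how Athena evaluates ST_Within; in particular, points that lie exactly on the boundary may be classified differently. The square used in the example is a made-up area, not the ULEZ polygon.

```python
def point_in_polygon(longitude, latitude, ring):
    """Return True if the point falls inside a closed ring of (lon, lat) pairs.

    The ring must repeat its first vertex as its last vertex.
    """
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Consider only edges that straddle the point's latitude.
        if (y1 > latitude) != (y2 > latitude):
            # Longitude at which this edge crosses the point's latitude.
            x_cross = x1 + (latitude - y1) * (x2 - x1) / (y2 - y1)
            if longitude < x_cross:
                inside = not inside
    return inside


# Hypothetical square zone, for illustration only.
example_zone = [(-0.2, 51.4), (0.0, 51.4), (0.0, 51.6), (-0.2, 51.6), (-0.2, 51.4)]
inside_zone = point_in_polygon(-0.1, 51.5, example_zone)
outside_zone = point_in_polygon(0.1, 51.5, example_zone)
```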

A simple Python utility is provided, which converts the polygon features found in the downloaded GeoJSON file into ST_Polygon strings based on the well-known text format that can be used directly in an Athena query.
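The conversion itself is straightforward. The following sketch shows one way to turn a GeoJSON Polygon geometry into a well-known text string; it is not the utility from the repository, and the function name is hypothetical.

```python
def geojson_polygon_to_wkt(geometry):
    """Convert a GeoJSON Polygon geometry dict into a WKT POLYGON string."""
    if geometry.get("type") != "Polygon":
        raise ValueError("Only Polygon geometries are handled in this sketch")
    rings = []
    for ring in geometry["coordinates"]:
        # GeoJSON positions are [longitude, latitude, (optional) altitude].
        points = ", ".join(f"{position[0]} {position[1]}" for position in ring)
        rings.append(f"({points})")
    return "POLYGON (" + ", ".join(rings) + ")"


# Hypothetical triangle, for illustration only.
example_geometry = {
    "type": "Polygon",
    "coordinates": [[[-0.2, 51.4], [0.0, 51.4], [0.0, 51.6], [-0.2, 51.4]]],
}
example_wkt = geojson_polygon_to_wkt(example_geometry)
```

The resulting string can be passed to ST_Polygon in an Athena query. MultiPolygon features would need each member polygon converted separately.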

Choose Preview View on the ulezvehicleanalysis_firehose view to explore its content.

You can now run queries against this view to gain overarching insights.

Copy and paste the content from the 2-firehose-athena-ulez-2021-query-days-in-zone.sql file found in the examples/firehose folder into the query editor.

This query establishes the total number of days each vehicle has entered ULEZ, and what the expected total charges would be. The query has been parameterized using the ? placeholder character. Parameterized queries allow you to rerun the same query with different parameter values.

Enter the daily fee amount for Parameter 1, then run the query.

The results display each vehicle, the total number of days spent in the proposed ULEZ, and the total charges based on the daily fee you entered.
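The calculation behind these results can be expressed in a few lines of Python. The following sketch is an equivalent of the logic described above, not a translation of the SQL file; it assumes each row has deviceid, sampletime (an ISO 8601 string), and insidezone columns, and the sample rows and fee are made up.

```python
def days_in_zone(rows, daily_fee):
    """Return {deviceid: (days_in_zone, total_charge)} for the given rows."""
    days_by_device = {}
    for row in rows:
        if row["insidezone"]:
            day = row["sampletime"][:10]  # Keep only the YYYY-MM-DD part.
            days_by_device.setdefault(row["deviceid"], set()).add(day)
    # A vehicle is charged once per distinct day it was seen in the zone.
    return {
        device: (len(days), len(days) * daily_fee)
        for device, days in days_by_device.items()
    }


# Hypothetical rows, for illustration only.
example_rows = [
    {"deviceid": "vehicle1", "sampletime": "2024-01-01T09:00:00Z", "insidezone": True},
    {"deviceid": "vehicle1", "sampletime": "2024-01-01T17:30:00Z", "insidezone": True},
    {"deviceid": "vehicle1", "sampletime": "2024-01-02T09:00:00Z", "insidezone": True},
    {"deviceid": "vehicle2", "sampletime": "2024-01-01T09:00:00Z", "insidezone": False},
]
example_charges = days_in_zone(example_rows, daily_fee=12.5)
```

Multiple positions recorded inside the zone on the same day count as a single chargeable day.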
You can repeat this exercise using the lambda table. Data in the lambda table is augmented with additional vehicle details present in the vehicle maintenance DynamoDB table at the time it is processed by the Lambda function. The solution supports the following fields:

MeetsEmissionStandards (Boolean)
Mileage (Number)
PurchaseDate (String, in YYYY-MM-DD format)

You can also enrich the new data as it arrives.

On the DynamoDB console, find the vehicle maintenance table under Tables. The table name is provided as output VehicleMaintenanceDynamoTable in the deployed CloudFormation stack.
Choose Explore table items to view the content of the table.
Choose Create item to create a new record for a vehicle.
Enter DeviceId (such as vehicle1 as a String), PurchaseDate (such as 2005-10-01 as a String), Mileage (such as 10000 as a Number), and MeetsEmissionStandards (with a value such as False as Boolean).
Choose Create item to create the record.
Duplicate the newly created record with additional entries for other vehicles (such as for vehicle2 or vehicle3), modifying the values of the attributes slightly each time.
Rerun the location-analytics-glue-crawler-lambda AWS Glue crawler after new data has been generated to confirm that the update to the schema with new fields is registered.
Copy and paste the content from the 1-lambda-athena-ulez-2021-create-view.sql file found in the examples/lambda folder into the query editor.
Preview the ulezvehicleanalysis_lambda view to confirm that the new columns have been created.

If errors such as Column 'mileage' cannot be resolved are displayed, either the data enrichment is not taking place, or the AWS Glue crawler has not yet detected updates to the schema.

If the Preview table option is only returning results from before you created records in the DynamoDB table, return the query results in descending order using sampletime (for example, order by sampletime desc limit 100;).

Now we focus on the vehicles that don’t currently meet emissions standards, and order the vehicles in descending order based on the mileage per year (calculated using the latest mileage / age of vehicle in years).
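As a worked example of the mileage per year calculation, the following sketch ranks vehicles that don't meet emission standards. It uses the field names from the vehicle maintenance table; the function names, the 365.25-day year, and the sample values are assumptions.

```python
from datetime import date


def mileage_per_year(mileage, purchase_date, today):
    """Divide the latest mileage by the vehicle's age in years."""
    purchased = date.fromisoformat(purchase_date)  # Expects YYYY-MM-DD.
    age_years = (today - purchased).days / 365.25
    if age_years <= 0:
        raise ValueError("purchase_date must be before today")
    return mileage / age_years


def rank_non_compliant(vehicles, today):
    """Return non-compliant vehicles sorted by descending mileage per year."""
    candidates = [v for v in vehicles if not v["MeetsEmissionStandards"]]
    return sorted(
        candidates,
        key=lambda v: mileage_per_year(v["Mileage"], v["PurchaseDate"], today),
        reverse=True,
    )


# Hypothetical fleet, for illustration only.
example_fleet = [
    {"DeviceId": "vehicle1", "Mileage": 10000, "PurchaseDate": "2005-10-01",
     "MeetsEmissionStandards": False},
    {"DeviceId": "vehicle2", "Mileage": 90000, "PurchaseDate": "2015-06-01",
     "MeetsEmissionStandards": False},
    {"DeviceId": "vehicle3", "Mileage": 50000, "PurchaseDate": "2020-01-01",
     "MeetsEmissionStandards": True},
]
ranked = rank_non_compliant(example_fleet, today=date(2024, 1, 1))
```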

Copy and paste the content from the 2-lambda-athena-ulez-2021-query-days-in-zone.sql file found in the examples/lambda folder into the query editor.

In this example, we can see that out of our fleet of vehicles, five have been reported as not meeting emission standards. We can also see the vehicles that have accumulated high mileage per year, and the number of days spent in the proposed ULEZ. The fleet operator may now decide to prioritize these vehicles for replacement. Because location data is enriched with the most up-to-date vehicle maintenance data at the time it is ingested, you can further evolve these queries to run over a defined time window. For example, you could factor in mileage changes within the past year.

Because the enrichment is applied when each event is ingested, updates to records in the DynamoDB vehicle maintenance table change the data that is subsequently committed to Amazon S3, and the query results change accordingly.

Clean up

Refer to the instructions in the README file to clean up the resources provisioned for this solution.

Conclusion

This post demonstrated how you can use Amazon Location, EventBridge, Lambda, Amazon Data Firehose, and Amazon S3 to build a location-aware data pipeline, and use the collected device position data to drive analytical insights using AWS Glue and Athena. By tracking these assets in real time and storing the results, companies can derive valuable insights on how effectively their fleets are being utilized and better react to changes in the future. You can now explore extending this sample code with your own device tracking data and analytics requirements.

About the Authors

Alan Peaty is a Senior Partner Solutions Architect at AWS. Alan helps Global Systems Integrators (GSIs) and Global Independent Software Vendors (GISVs) solve complex customer challenges using AWS services. Prior to joining AWS, Alan worked as an architect at systems integrators to translate business requirements into technical solutions. Outside of work, Alan is an IoT enthusiast and a keen runner who loves to hit the muddy trails of the English countryside.

Parag Srivastava is a Solutions Architect at AWS, helping enterprise customers with successful cloud adoption and migration. During his professional career, he has been extensively involved in complex digital transformation projects. He is also passionate about building innovative solutions around geospatial aspects of addresses.
